Skip to content

Conversation

@AniPatil101
Copy link

Documentation done on binwidth, bins, breaks and freq+ added a clarifying sentence.

Added section When to Use Each Plot:

  • This new section gives a quick, practical overview of when and why to use each plot type.

Also added Usage Note and Version & Dependency.

  • For cleaner, easier-to-maintain, and more transparent for users

Documentation done on binwidth, bins, breaks and freq+
added a clarifying sentence.
Added @section When to Use Each Plot: This new section gives a quick, practical overview of when and why to use each plot type
Also added Usage Note and Version & Dependency.
For cleaner, easier-to-maintain, and more transparent for users
Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AniPatil101 sorry for the delay in reviewing this. I really appreciate the pull request and the effort you put into this!

I have a few concerns that I put in the review comments because this adds a lot of additional text to an already long documentation page and I think some of the info can already be found elsewhere (but some of it is definitely good to add!). And it would make this documentation page different than the others, so if we end up adding any of these new sections I think it would make sense to add them to all the pages. If you're interested in working on that it could be a project that you chip away at over time (I know there are a lot of doc pages) and then we merge it once it's ready. But no obligation of course!

Anyway, I really appreciate the PR and happy to talk more about this if you end up wanting to keep working on it.

Comment on lines +20 to +40
#' @param binwidth Numeric scalar; passed to the underlying geom/stat used
#' by the plotting function. Behavior by function:
#' * `ppc_hist()`: forwarded to `ggplot2::geom_histogram(binwidth = ...)`.
#' * `ppc_freqpoly()` / `ppc_freqpoly_grouped()`: forwarded to
#' `ggplot2::geom_area(stat = "bin", binwidth = ...)` (i.e. `stat = "bin"`).
#' * `ppc_dots()`: forwarded to `ggdist::stat_dots(binwidth = ...)`.
#' If `NULL`, the default binning behavior of the respective geom/stat is used.
#'
#' @param bins Integer number of bins. Forwarded in the same way as `binwidth`
#' (i.e. to `geom_histogram()`, `geom_area(stat = "bin")`, and `ggdist::stat_dots()`).
#'
#' @param breaks Numeric vector of break positions. Forwarded in the same way
#' as `binwidth`/`bins` (where supported).
#'
#' @param freq Logical. Controls whether the vertical scale shows counts
#' (`freq = TRUE`) or densities/proportions (`freq = FALSE`). This argument
#' is not limited to `ppc_hist()` — it is also used by `ppc_freqpoly()` and
#' `ppc_freqpoly_grouped()` because those functions use `stat="bin"`
#' (implemented via `geom_area(stat = "bin")`). The `freq` value is passed
#' to `set_hist_aes()` and affects the y aesthetic used by `geom_histogram()`
#' and `geom_area(stat="bin")`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been documenting these using shared roxygen templates (see the files in the man-roxygen directory so that we can use them in different R files and reduce the amount of redundant text across the different files in the package. So I'm not sure that we want to have to document them separate in each R file. That said, I do like the extra detail here! So I have mixed feeling about this.

Comment on lines +47 to +56
#' - Requires **ggplot2 (>= 3.4.0)** for the use of modern layering and
#' consistent handling of bin arguments.
#'
#' - Uses **ggdist::stat_dots()** (requires **ggdist >= 3.3.0**).
#'
#' - All `ppc_*` functions depend on **bayesplot** core plotting infrastructure
#' and `ggplot2` grammar of graphics.
#'
#' - Ensure that dependent packages (`ggplot2`, `ggdist`, and `bayesplot`)
#' are loaded or properly imported in your package’s `DESCRIPTION` file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all true, but it should already be enforced by the DESCRIPTION file we have. That is, you can't install bayesplot without ggplot2 >= 3.4.0. Although I do notice that we're missing the ggdist version number, thanks for pointing that out!

I'm not sure all this text is worth adding to the documentation page since we would need to add a section like this to every R file in the package for consistency and this information is already included in the DESCRIPTION file.

Comment on lines +120 to +149
#' @section Usage Note:
#'
#' - The `binwidth`, `bins`, and `breaks` arguments are passed to multiple
#' underlying `ggplot2` and `ggdist` geoms (`geom_histogram()`, `geom_area()`,
#' and `stat_dots()`), which means they affect both histogram and dot plot outputs.
#'
#' - The `freq` argument controls whether the y-axis represents **counts**
#' (`freq = TRUE`) or **densities** (`freq = FALSE`) for both histograms
#' and frequency polygons.
#'
#' - These plots assume numeric `y` values. For categorical data,
#' consider `ppc_bars()` or similar discrete comparison functions.
#'
#' - If a grouping variable is provided, ensure that `y` and `yrep` have
#' matching group structures; otherwise, plots may not align correctly.
#'
#' - For large replicated datasets (`yrep` with > 50 rows), consider
#' summarizing or subsampling to avoid overplotting.
#'
#' Note:
#' The `binwidth`, `bins`, and `breaks` arguments are forwarded not only
#' to `ggplot2::geom_histogram()` (used by `ppc_hist()`), but also to
#' `ggdist::stat_dots()` (used by `ppc_dots()`) and to the `stat = "bin"`
#' layers used by `ppc_freqpoly()` and `ppc_freqpoly_grouped()`
#' (implemented with `ggplot2::geom_area(stat = "bin", ...)`).
#'
#' Similarly, the `freq` argument affects not only histograms but also
#' frequency polygons created with `ppc_freqpoly()` and
#' `ppc_freqpoly_grouped()`, where it determines whether the y-axis
#' represents counts (`freq = TRUE`) or densities (`freq = FALSE`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all good info but I think most of it is clear from the existing documentation and it adds a lot of length to already long documentation. For example, it should already be clear that the freq argument affects multiple different plots because they each have the freq argument. I'm not sure we need to write that out in sentence form.

I guess my main concern here is that users are just going get lost in a sea of text. So we should only put things in sentences/paragraphs that are hard to discern from the existing documentation. Does that make sense? If you disagree feel free to let me know! I really appreciate the effort you put into this.

Comment on lines +102 to +118
#' These plots offer different perspectives on how simulated (`yrep`) and observed (`y`) data compare.
#'
#' \itemize{
#' \item **`ppc_hist()`** — Best for visualizing **overall distribution shapes**.
#' Use this when your data are continuous and you want to see how model predictions align
#' with observed distributions.
#'
#' \item **`ppc_dots()`** — Ideal for **small or discrete datasets**.
#' Displays each simulated dataset as dots, giving an intuitive visual comparison
#' between simulated and observed values.
#'
#' \item **`ppc_freqpoly()`** — Useful for **moderate to large continuous datasets**.
#' Highlights how predicted and observed **frequencies vary** across the range of `y`.
#'
#' \item **`ppc_freqpoly_grouped()`** — When you have a **grouping variable** (e.g., categories),
#' use this to compare frequencies across levels of that variable.
#' }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of having a section like this! But I think this is missing some of the plots on the ppc-distributions page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants