
Compares two data frames/tibbles (or two objects coercible to tibbles, such as matrices) and offers to display any differences in tabular diff format, rendered as HTML.

Usage

show_diff(
  x,
  y,
  ignore_order = FALSE,
  ignore_col_types = FALSE,
  ids = NULL,
  ask = TRUE,
  bypass_rstudio_viewer = FALSE,
  verbose = TRUE,
  max_diffs = 10L,
  diff_text = "{x_lbl} is different from {y_lbl}",
  ask_text = "Do you wish to display the changes in tabular diff format?",
  caption = "{x_lbl} → {y_lbl}",
  ...
)

Arguments

x

The data frame / tibble to check for changes.

y

The data frame / tibble that x should be checked against, i.e. the reference.

ignore_order

Whether or not to ignore the order of columns and rows.

ignore_col_types

Whether or not to ignore differences between similar column types. Currently, if set to TRUE, factors are converted to characters and integers to doubles before the comparison.

ids

A character vector of column names that make up a primary key, if known. If NULL, heuristics are used to find a decent key (or a set of decent keys).

ask

Whether or not to ask interactively if the resulting difference object should be opened in case x and y differ. If FALSE, it will be opened right away. Only relevant if run interactively.

bypass_rstudio_viewer

If this is set to TRUE, x and y actually differ, and ask is set to TRUE, the resulting difference object is opened in the system's default web browser instead of RStudio's built-in viewer. Only relevant if run within RStudio.

verbose

Whether or not to also output the differences detected by pal::is_equal_df() to the console.

max_diffs

The maximum number of differences shown on the console. Only relevant if verbose = TRUE.

diff_text

The text to display on the console in case x and y differ. It is passed to glue::glue() allowing its string interpolation syntax to be used. A character scalar.

ask_text

The text that is displayed when ask = TRUE. Ignored if ask = FALSE. A character scalar.

caption

The caption of the rendered difference object. It is passed to glue::glue() allowing its string interpolation syntax to be used. A character scalar.

...

Further arguments passed on to daff::diff_data(), excluding data, data_ref, ids, ordered, and columns_to_ignore.
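The string interpolation used by diff_text and caption can be tried out directly with glue::glue(). The following is only a sketch: the two label values are hypothetical stand-ins, since how show_diff() derives x_lbl and y_lbl is not described here.

```r
# Hypothetical labels; show_diff() supplies its own values for these names.
x_lbl <- "mtcars_modified"
y_lbl <- "mtcars"

# The default `diff_text` template, interpolated the way glue::glue() does it.
msg <- glue::glue("{x_lbl} is different from {y_lbl}")

# A custom template that could be passed as `diff_text` or `caption`.
custom <- glue::glue("Comparing {x_lbl} against reference {y_lbl}")

print(msg)
print(custom)
```

Any expression that is valid inside glue's curly braces should work in these templates, as long as it only refers to names that are available when the text is interpolated.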

Value

A difference object, invisibly. It could be rendered later using daff::render_diff(), for example.

Details

This function is essentially a convenience wrapper combining pal::is_equal_df(), daff::diff_data() and daff::render_diff(). If it is run non-interactively or with ask = FALSE, the differences are shown right away; otherwise the user is asked on the console.
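The workflow described above can be approximated with daff alone. This is a rough sketch and not the actual implementation of show_diff(): sketch_show_diff() is a hypothetical helper, all.equal() is used as a stand-in for pal::is_equal_df(), and returning NULL for equal inputs is an assumption made for the sketch.

```r
# Hypothetical helper mimicking the compare-then-diff workflow.
sketch_show_diff <- function(x, y, ids = NULL) {
  # Stand-in equality check; show_diff() relies on pal::is_equal_df() instead.
  if (isTRUE(all.equal(x, y))) {
    return(invisible(NULL))
  }

  # `data` is the changed table, `data_ref` the reference it is compared to.
  d <- daff::diff_data(data = x, data_ref = y, ids = ids)

  # Only open the rendered HTML when someone is there to look at it.
  if (interactive()) {
    daff::render_diff(d)
  }

  invisible(d)
}

# Example input: change a single cell of mtcars and diff against the original.
y <- mtcars
x <- mtcars
x$cyl[1] <- 8

d <- sketch_show_diff(x, y)
```

The returned object can be kept and passed to daff::render_diff() later, which mirrors what the Value section says about the object returned by show_diff().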

Note that in tabular diff format, only changes in the column content of x and y are visible, meaning that the following properties and changes therein won't be displayed:

  • column types (e.g. integer vs. double)

  • row names and other attributes
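To illustrate the first point, the following base R sketch builds two small, made-up data frames whose cell values agree while one column differs in type. A diff that only looks at column content has nothing to show for such a pair, even though the objects are not identical.

```r
# Made-up example data: `n` is integer in x and double in y.
x <- data.frame(id = 1:3, n = c(10L, 20L, 30L))
y <- data.frame(id = 1:3, n = c(10, 20, 30))

# The cell values are the same ...
same_values <- all(x$n == y$n)

# ... but the column types are not, so the objects are not identical.
same_objects <- identical(x, y)

print(c(typeof(x$n), typeof(y$n)))
print(same_values)
print(same_objects)
```

In a case like this, the console output controlled by verbose is the place to look for type differences, since it comes from pal::is_equal_df() rather than from the tabular diff.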

See also

Other data frame / tibble functions: open_as_tmp_spreadsheet()

Examples

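if (FALSE) {
# double all values of `cyl` and `gear` greater than 4, then compare the
# modified data (x) against the original `mtcars` (y, the reference)
mtcars |>
  dplyr::mutate(dplyr::across(c(cyl, gear),
                              \(x) dplyr::if_else(x > 4, x * 2, x))) |>
  yay::show_diff(mtcars)
}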