
Compares two cleaned datasets, optionally restricting the new dataset to the date range of the old one (see limit_to_same_date), and generates comparison outputs for numeric, factor, character, binary, date, and other data types. The results are saved to output_dir, and views of the data can optionally be displayed or saved.

Usage

compare_clean_data(
  old_data,
  new_data,
  output_dir,
  final_vars_set,
  date_col,
  limit_to_same_date = TRUE,
  show_views = FALSE,
  save_views = FALSE
)

Arguments

old_data

Dataframe. The old dataset for comparison.

new_data

Dataframe. The new dataset for comparison.

output_dir

Character. Path to the directory where the output files will be saved.

final_vars_set

Character vector. Names of the variables to include in the comparison.

date_col

Character. Name of the column in the datasets representing the date. The function uses this column to filter rows based on the date range.

limit_to_same_date

Logical. Whether to filter the new dataset to match the date range of the old dataset (default: TRUE).

show_views

Logical. Whether to display the data views in the RStudio Viewer (default: FALSE).

save_views

Logical. Whether to save views of the dataframes to disk (default: FALSE).
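
As a rough illustration of what limit_to_same_date = TRUE describes, the base R sketch below restricts a new dataset to the date range spanned by an old one. It is not the function's actual implementation: the toy data, the column name visit_date, and the choice of an inclusive range that drops rows with missing dates are all assumptions made for the example.

```r
# Toy data; the column name "visit_date" is a hypothetical example.
old_data <- data.frame(
  visit_date = as.Date(c("2023-01-01", "2023-02-15", "2023-03-31")),
  score = c(1, 2, 3)
)
new_data <- data.frame(
  visit_date = as.Date(c("2022-12-15", "2023-01-10", "2023-03-31", "2023-04-20")),
  score = c(9, 1, 3, 7)
)
date_col <- "visit_date"

# Date range covered by the old dataset (assumed inclusive at both ends).
old_range <- range(old_data[[date_col]], na.rm = TRUE)

# Keep only rows of the new dataset that fall inside that range;
# rows with a missing date are dropped (an assumption of this sketch).
in_range <- !is.na(new_data[[date_col]]) &
  new_data[[date_col]] >= old_range[1] &
  new_data[[date_col]] <= old_range[2]
new_data_limited <- new_data[in_range, , drop = FALSE]
```

With limit_to_same_date = FALSE, no such restriction would apply and the new dataset would be compared in full.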

Value

A list of dataframes containing the comparison results: numeric, factor, character, binary, date, and other derived datasets. Entries with missing data are omitted.
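
Examples

A call might look like the sketch below. The toy dataframes, the column names (visit_date, age, site), and the temporary output directory are hypothetical. Creating the directory beforehand is a precaution, since this page does not say whether the function creates it. The call is guarded so the snippet still runs when the package providing compare_clean_data() is not attached.

```r
# Hypothetical toy datasets sharing the same columns.
old_data <- data.frame(
  visit_date = as.Date(c("2023-01-01", "2023-02-15")),
  age = c(34, 51),
  site = factor(c("A", "B"))
)
new_data <- data.frame(
  visit_date = as.Date(c("2023-01-01", "2023-02-15", "2023-05-01")),
  age = c(34, 52, 47),
  site = factor(c("A", "B", "A"))
)

# Write outputs to a temporary directory (created here as a precaution).
output_dir <- file.path(tempdir(), "compare_output")
dir.create(output_dir, showWarnings = FALSE, recursive = TRUE)

# Only run the comparison if the function is available in this session.
if (exists("compare_clean_data")) {
  results <- compare_clean_data(
    old_data = old_data,
    new_data = new_data,
    output_dir = output_dir,
    final_vars_set = c("age", "site"),
    date_col = "visit_date",
    limit_to_same_date = TRUE,
    show_views = FALSE,
    save_views = FALSE
  )
  # Inspect which comparison dataframes were returned.
  print(names(results))
}
```

Keeping show_views and save_views at FALSE avoids opening the RStudio Viewer or writing extra files while experimenting.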