This release incorporates feedback from the PhD dissertation reviewers: dr hab. Andrzej Dudek, dr hab. Joanna Landmesser-Rusek, and dr hab. Paweł Andrzej Strzelecki.
- occup_panel dataset, a rotational panel covering 2009Q1 to 2010Q4, used to illustrate id_var direct matching and to exercise the panel-data code path in tests.
- Naive Bayes ("nb") added to the supported ML methods, via e1071 (Suggests).
- summary_c2c() now supports both lm and glm, adds input/shape validation (df_old/df_new, coefficient table checks), and reports the reference distribution used for p-values (t or normal). Added edge-case tests and aligned the regression/panel vignette examples with the current summary_c2c() output and fixed-effects inference guidance.
- cat2cat_ml_run() now reports the Brier score and mean P(true class) in addition to accuracy. A proper scoring rule matters because cat2cat weights are probabilities, not classifications: a model can be accurate and still be poorly calibrated.
- stopifnot() assertions now carry descriptive messages, so failures point the user at the offending argument instead of printing the raw expression.
- cat2cat() ML fallback is now configurable via ml$on_fail ("freq", "naive", "na", "error"), with optional warning control via ml$fail_warn.
  Failed ML weights are now handled explicitly according to this policy instead of always silently falling back to frequency weights.
- cat2cat() ML now accepts factor features in addition to numeric/logical: factor columns listed in ml$features are automatically one-hot encoded using the union of levels observed in ml$data and the target period.
- id_var, aggregated-data workflows, hierarchical-code mappings, and regression/inference after harmonisation into one better-structured advanced reference.
- cat2cat() and cat2cat_agg() trimmed: argument documentation kept, conceptual material moved to the vignettes where it belongs.
- nomnoml diagram that previously mislabelled the base/target sides under forward mapping.
- get_freqs() for r-devel/R CMD check: replaced the as.data.frame(table(...)) conversion with direct data.frame(input = names(tab), Freq = as.integer(tab)) construction to avoid failures when NA appears in table names (row names contain missing values).
- cat2cat_ml_run function to check the performance of the ML models before cat2cat is run with the ml option. The ML models are now more transparent.
- ropensci standards, like a CONTRIBUTING file and testthat version 3.
- cat2cat_agg has replaced the cat_var argument with two new ones, cat_var_old and cat_var_new.
- master to main branch.
- freqs_df argument in the cat2cat function is moved from the data part to the mappings part; the change is backward compatible.
  It is now consistent with the Python cat2cat implementation.
- pkgcheck related fixes, like 80 chars per line.
- data and library calls style.
- cat2cat function.
- dummy_c2c to be backward compatible.
- cat_apply_freq function performance.
- ml argument in the dummy_c2c function is redefined, with shorter names for simpler usage.
- cat2cat ml part now uses cat_var directly for the target (for an update) dataset, not the one from the ml argument list.
- cat2cat validation, if the trans table covers all needed levels.
- ml and data argument in the cat2cat::cat2cat function, two additional arguments each.
- prune_c2c now rescales the weights, so they still sum to one for each subject.
- dummy_c2c to add the default cat2cat columns to a data.frame.
- occup and occup_small datasets have 4 periods now.
- pkgdown reference.
- pkgdown website.
- deparse instead of deparse1.
- tinyverse world, even fewer dependencies.
- cat2cat function, the ml part is assuming that the categorical variable is always named "code".
- cat2cat function.
- randomForest packages to Suggests, they are now loaded on demand.
- occup_small dataset to pass checks in terms of computation time of examples.
- cat2cat function: "knn", "rf", "lda".
- prune_c2c and cross_c2c to improve processing of results.
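
The cat2cat_ml_run() entry above says that a model can be accurate and still be poorly calibrated. The base R sketch below illustrates that point with two made-up probability matrices. The data, the score_probs() helper, and the multiclass Brier definition used here (squared error summed over classes, averaged over observations) are assumptions for illustration; the exact formula used by cat2cat_ml_run() may differ.

```r
# Hypothetical predicted class probabilities (rows = observations,
# columns = candidate categories). Not output of cat2cat_ml_run().
classes <- c("a", "b", "c")
truth <- c("a", "b", "c", "a")

sharp <- matrix(
  c(0.90, 0.05, 0.05,
    0.05, 0.90, 0.05,
    0.05, 0.05, 0.90,
    0.90, 0.05, 0.05),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, classes)
)
vague <- matrix(
  c(0.40, 0.30, 0.30,
    0.30, 0.40, 0.30,
    0.30, 0.30, 0.40,
    0.40, 0.30, 0.30),
  ncol = 3, byrow = TRUE, dimnames = list(NULL, classes)
)

# Hypothetical helper: accuracy, multiclass Brier score, mean P(true class)
score_probs <- function(probs, truth) {
  cls <- colnames(probs)
  # Indicator matrix: 1 where the column is the true class, 0 elsewhere
  onehot <- outer(truth, cls, "==") * 1
  predicted <- cls[max.col(probs, ties.method = "first")]
  c(
    accuracy = mean(predicted == truth),
    brier = mean(rowSums((probs - onehot)^2)),
    mean_p_true = mean(probs[cbind(seq_along(truth), match(truth, cls))])
  )
}

res_sharp <- score_probs(sharp, truth)
res_vague <- score_probs(vague, truth)
print(rbind(sharp = res_sharp, vague = res_vague))
```

Both matrices classify every observation correctly, so accuracy cannot tell them apart, while the Brier score (lower is better) and mean P(true class) do. Since cat2cat uses the probabilities themselves as weights, the second pair of metrics is the more relevant one.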
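
The factor-features entry above mentions one-hot encoding over the union of levels observed in ml$data and the target period. A minimal base R sketch of why the union matters follows. The data frames, the edu column, and the one_hot() helper are hypothetical, and this is not the package's internal implementation.

```r
# Hypothetical data: "edu" is a factor feature whose observed levels
# differ between the training data and the target period.
ml_data <- data.frame(edu = factor(c("primary", "secondary")))
target <- data.frame(edu = factor(c("secondary", "tertiary")))

# Union of the levels observed in either data set
all_levels <- union(levels(ml_data$edu), levels(target$edu))

# Hypothetical helper: one indicator column per level, in a fixed order
one_hot <- function(x, levels, prefix) {
  out <- outer(as.character(x), levels, "==") * 1L
  colnames(out) <- paste0(prefix, "_", levels)
  out
}

X_train <- one_hot(ml_data$edu, all_levels, "edu")
X_target <- one_hot(target$edu, all_levels, "edu")

# Both matrices share the same columns, so a model fitted on X_train
# can be applied to X_target without a column mismatch.
print(colnames(X_train))
print(X_target)
```

Encoding each data set from its own levels would instead produce different column sets (no tertiary column in training, no primary column in the target), which is the mismatch the union avoids.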
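
The prune_c2c entry above states that weights are rescaled so they still sum to one for each subject. The idea can be sketched in base R as below. The column names (id, weight) and the 0.2 cutoff are hypothetical and chosen only for illustration; they are not the columns or pruning methods used by the package.

```r
# Hypothetical replicated observations: each subject (id) has several
# candidate categories with weights that sum to one.
df <- data.frame(
  id = c(1, 1, 1, 2, 2),
  weight = c(0.6, 0.3, 0.1, 0.5, 0.5)
)

# Drop candidates below an illustrative cutoff
kept <- df[df$weight >= 0.2, ]

# Rescale the remaining weights within each subject so they sum to one
kept$weight <- kept$weight / ave(kept$weight, kept$id, FUN = sum)

# Per-subject totals after rescaling
totals <- tapply(kept$weight, kept$id, sum)
print(kept)
print(totals)
```

Without the rescaling step, subject 1 would carry a total weight of 0.9 after pruning and would be under-represented in any weighted summary.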