Package: cat2cat 0.6.1.9000

cat2cat: Handling an Inconsistently Coded Categorical Variable in a Longitudinal Dataset

Unifies an inconsistently coded categorical variable between two time points according to a mapping table. The main rule is to replicate an observation if it could be assigned to more than one category, then to use frequencies or statistical methods to approximate the probability of assignment to each candidate category. This procedure was introduced and implemented in the paper by 'Nasinski', 'Majchrowska', and 'Broniatowska' (2020) <doi:10.24425/cejeme.2020.134747>.
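The replication rule described above can be sketched in base R without the package. This is a toy illustration, not the package's implementation: the data frames `trans_toy` and `old_obs`, the vector `new_counts`, and the column `weight` are all made-up names and values. Each old-coded observation is expanded to one row per candidate new code, and each candidate is weighted by its share of the counts among that observation's candidates.

```r
# Toy data: every name and value here is hypothetical.
# Mapping table: one row per allowed (old, new) code pair.
trans_toy <- data.frame(
  old = c("A", "A", "B"),
  new = c("A1", "A2", "B1"),
  stringsAsFactors = FALSE
)

# Observations recorded under the old coding.
old_obs <- data.frame(
  id = 1:3,
  code = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# How often each new code occurs in the period that uses the new coding.
new_counts <- c(A1 = 30, A2 = 10, B1 = 20)

# Replicate: one row per (observation, candidate new code).
rep_obs <- merge(old_obs, trans_toy, by.x = "code", by.y = "old")

# Weight each candidate by its frequency share within the observation.
rep_obs$freq <- unname(new_counts[rep_obs$new])
rep_obs$weight <- rep_obs$freq / ave(rep_obs$freq, rep_obs$id, FUN = sum)

rep_obs <- rep_obs[order(rep_obs$id, rep_obs$new), ]
print(rep_obs)
```

The weights of each original observation sum to one, so weighted totals computed on the replicated rows still count every observation once.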

Authors: Maciej Nasinski [aut, cre]


# Install 'cat2cat' in R:
install.packages('cat2cat', repos = c('https://polkas.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/polkas/cat2cat/issues

Pkgdown/docs site: https://polkas.github.io

Datasets:
  • occup - Occupational dataset
  • occup_panel - Occupational panel dataset with BAEL-style quarterly rotation
  • occup_small - Occupational dataset - small one
  • trans - Trans dataset containing mappings (transitions) between old (2008) and new (2010) occupational codes. This table could be used to map encodings in both directions.
  • verticals - Verticals dataset
  • verticals2 - Verticals2 dataset
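Using one mapping table in both directions, as noted for `trans`, can be sketched in base R as follows. The table `trans_toy`, its column names `old` and `new`, and its codes are hypothetical and are not taken from the packaged dataset.

```r
# Toy two-column mapping table; the codes are made up for illustration.
trans_toy <- data.frame(
  old = c("1111", "1111", "2222", "3333"),
  new = c("111101", "111102", "222201", "222201"),
  stringsAsFactors = FALSE
)

# Old -> new: candidate new codes for each old code.
old_to_new <- split(trans_toy$new, trans_toy$old)

# New -> old: the same table read in the opposite direction.
new_to_old <- split(trans_toy$old, trans_toy$new)

old_to_new[["1111"]]    # two candidates: the old code was split
new_to_old[["222201"]]  # two candidates: two old codes were merged
```

A code with more than one candidate in the chosen direction is the case where an observation has to be replicated.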


Topics: categories, encoding, encodings, factor, longitudinal, mapping, mappings, panel, transitions

5.18 score | 5 stars | 9 scripts | 608 downloads | 11 exports | 1 dependency

Last updated from: e1f0c6368a. Checks: 9 OK. Indexed: yes.

Target                  Result  Time
linux-devel-x86_64      OK      303
source / vignettes      OK      320
linux-release-x86_64    OK      293
macos-release-arm64     OK      257
macos-oldrel-arm64      OK      288
windows-devel           OK      249
windows-release         OK      263
windows-oldrel          OK      221
wasm-release            OK      128

Exports: cat_apply_freq, cat2cat, cat2cat_agg, cat2cat_ml_run, cross_c2c, dummy_c2c, get_freqs, get_mappings, plot_c2c, prune_c2c, summary_c2c

Dependencies: MASS

Advanced Workflows
ML weights | Multi-period chaining | Optional: passing mappings$freqs_df | Mapping table stability and truncation | Backward chaining | Forward chaining | Adding ML to the chain | Panel data with subject identifiers | Aggregated data and special cases | Regression on replicated data | Building a 4-period harmonised repeated cross-section dataset | Neutral impact demonstration | Fixed effects regression | Choosing the right advanced workflow | Next steps

Last update: 2026-05-17
Started: 2026-05-08

Choosing Weights and Validating ML
Step 1: Understand the competing weight assumptions | If ML fails on some rows: on_fail and fail_warn | Step 2: Check whether conclusions are sensitive to the weight choice | Compare weight methods on the same mapped data | Compare pruning strategies only after comparing full weights | Compare ensemble compositions when no single method dominates | Step 3: Validate whether ML actually improves on simpler baselines | What cat2cat_ml_run() is doing | Minimal validation workflow | Compare multiple ML methods in one run | Inspect per-group diagnostics when methods disagree | Decision rules for interpreting the output

Last update: 2026-05-17
Started: 2026-05-08

Get Started with cat2cat
What is cat2cat? | How to read the documentation | The Problem | The Solution | Value Added of cat2cat | Harmonisation Assumptions and Sensitivity | Three weighting schemes | When each assumption matters | Reducing assumption dependence | When cat2cat won't help | If you need something more advanced | Use Cases | Key Concepts | Direction: Backward vs Forward | ML weights in practice | Weights | Quick Example | Forward mapping example | Naive vs Frequency Weights | Value added: pooled regression across both periods | Diagnostic Plot | Mapping table | Common hierarchical classifications | Truncating mapping tables from hierarchical codes | Learn More

Last update: 2026-05-17
Started: 2022-03-10