
hcruR is an R package that helps health economists and real-world evidence (RWE) analysts estimate and compare healthcare resource utilization (HCRU) from observational healthcare data, such as claims or electronic health records.


🚀 Features

  • Estimate patient-level HCRU by:
    • Domain (inpatient, outpatient, pharmacy, etc.)
    • Time relative to index date (pre/post)
  • Compare HCRU across cohorts
  • Visualize domain-wise HCRU statistics
  • Designed for flexible real-world evidence (RWE) workflows

📥 Installation

You can install the development version of hcruR from GitHub with:

# Option 1: install with devtools
# install.packages("devtools")
devtools::install_github("mumbarkar/hcruR")

# Option 2: install with pak
# install.packages("pak")
pak::pak("mumbarkar/hcruR")

🧪 Example Usage

The example below estimates HCRU from the bundled sample data and then plots the results:

# Load library
library(hcruR)

## Generate an HCRU summary using dplyr (this output can be used to create HCRU plots)

# Load sample data
data(hcru_sample_data)
head(hcru_sample_data)

# Estimate HCRU
hcru_summary <- estimate_hcru(data = hcru_sample_data,
                             cohort_col = "cohort",
                             patient_id_col = "patient_id",
                             admit_col = "admission_date",
                             discharge_col = "discharge_date",
                             index_col = "index_date",
                             visit_col = "visit_date",
                             encounter_id_col = "encounter_id",
                             setting_col = "care_setting",
                             cost_col = "cost_usd",
                             readmission_col = "readmission",
                             time_window_col = "period",
                             los_col = "length_of_stay",
                             custom_var_list = NULL,
                             pre_days = 180,
                             post_days = 365,
                             readmission_days_rule = 30,
                             group_var = "cohort",
                             test = NULL,
                             gt_output = FALSE)

hcru_summary

## Generate an HCRU summary using gtsummary (publication-ready output)

# Estimate HCRU
hcru_summary_gt <- estimate_hcru(data = hcru_sample_data,
                             cohort_col = "cohort",
                             patient_id_col = "patient_id",
                             admit_col = "admission_date",
                             discharge_col = "discharge_date",
                             index_col = "index_date",
                             visit_col = "visit_date",
                             encounter_id_col = "encounter_id",
                             setting_col = "care_setting",
                             cost_col = "cost_usd",
                             readmission_col = "readmission",
                             time_window_col = "period",
                             los_col = "length_of_stay",
                             custom_var_list = NULL,
                             pre_days = 180,
                             post_days = 365,
                             readmission_days_rule = 30,
                             group_var = "cohort",
                             test = NULL,
                             gt_output = TRUE)

hcru_summary_gt

## Generate the HCRU plot for average visits per patient by cohort and time period

p_avg_visit <- plot_hcru(summary_df = hcru_summary$`Summary by settings using dplyr`,
               x_var = "period",
               y_var = "Avg_visits_per_patient",
               cohort_col = "cohort",
               facet_var = "care_setting",
               facet_var_n = 3,
               title = "Per patient average visits by domain and cohort",
               x_lable = "Healthcare Setting (Domain)",
               y_lable = "Average visits",
               fill_lable = "Cohort"
)

p_avg_visit

## Generate the HCRU plot for average costs per patient by cohort and time period

p_avg_cost <- plot_hcru(summary_df = hcru_summary$`Summary by settings using dplyr`,
               x_var = "period",
               y_var = "Avg_cost_per_patient",
               cohort_col = "cohort",
               facet_var = "care_setting",
               facet_var_n = 3,
               title = "Per patient average cost by domain and cohort",
               x_lable = "Healthcare Setting (Domain)",
               y_lable = "Average costs",
               fill_lable = "Cohort"
)

p_avg_cost

🧾 Function Reference

estimate_hcru() estimates healthcare resource utilization (HCRU) from electronic health record data across various care settings (e.g., IP, OP, ED/ER). It provides descriptive summaries of patient counts, encounters, costs, length of stay, and readmission rates for the pre- and post-index periods.
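
To make the pre- and post-index periods concrete, here is a minimal base R sketch of how a visit could be assigned to a window from its date relative to the index date. It is an illustration only, not the package's internal logic: the helper classify_period() is hypothetical, and treating the index date itself as part of the post period is an assumption.

```r
# Hypothetical helper (not part of hcruR): label each visit as "pre", "post",
# or NA (outside both windows) relative to the patient's index date.
classify_period <- function(visit_date, index_date,
                            pre_days = 180, post_days = 365) {
  # Signed offset in days; negative values fall before the index date.
  offset <- as.numeric(as.Date(visit_date) - as.Date(index_date))
  period <- rep(NA_character_, length(offset))
  # Assumption: the index date itself (offset 0) belongs to the post period.
  period[offset >= -pre_days & offset < 0] <- "pre"
  period[offset >= 0 & offset <= post_days] <- "post"
  period
}

# Toy visits for one patient (dates invented for illustration).
visits <- data.frame(
  patient_id = c("P1", "P1", "P1", "P1"),
  index_date = as.Date("2023-01-01"),
  visit_date = as.Date(c("2022-03-01", "2022-10-15", "2023-02-01", "2024-06-01"))
)
visits$period <- classify_period(visits$visit_date, visits$index_date)
print(visits)
```

With the default windows, the first and last visits fall outside both periods and are labelled NA, which is why a record can be excluded from the summary even though it belongs to an included patient.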

Arguments

| Argument | Type | Description | Default |
|---|---|---|---|
| data | data.frame | Input healthcare dataset containing admission, discharge, and visit information. | none |
| cohort_col | character | Name of the column that defines cohort groupings. | "cohort" |
| patient_id_col | character | Name of the column containing unique patient identifiers. | "patient_id" |
| admit_col | character | Name of the column containing admission dates. | "admission_date" |
| discharge_col | character | Name of the column containing discharge dates. | "discharge_date" |
| index_col | character | Name of the column containing the index (diagnosis) date. | "index_date" |
| visit_col | character | Name of the column for visit or claim dates. | "visit_date" |
| encounter_id_col | character | Name of the column containing encounter or claim IDs. | "encounter_id" |
| setting_col | character | Name of the column representing care settings (e.g., IP, OP, ED). | "care_setting" |
| pre_days | numeric | Number of days before index date to include in pre-period. | 180 |
| post_days | numeric | Number of days after index date to include in post-period. | 365 |
| readmission_days_rule | numeric | Number of days to define readmission following a discharge in the IP setting. | 30 |
| gt_output | logical | Whether to generate an additional gtsummary output. | FALSE |
| cost_col | character | Name of the column containing cost information. | "cost_usd" |
| los_col | character | Name of the column for length of stay values. | "length_of_stay" |
| readmission_col | character | Name of the column indicating readmission status. | "readmission" |
| time_window_col | character | Name of the column that categorizes records as pre or post index. | "period" |
| custom_var_list | character[] | Optional list of additional columns to include in summary tables. | NULL |
| group_var | character | Name of the grouping column for stratified summaries. | "cohort" |
| test | list | Optional named list of statistical tests (e.g., list(age = "wilcox.test")). | NULL |
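
The readmission_days_rule argument is easier to picture with a small example. The base R sketch below flags an inpatient stay as a readmission when it starts within 30 days of the same patient's previous discharge. This is one common definition, assumed here for illustration; the package may compute the flag differently, and the helper flag_readmission() is hypothetical.

```r
# Hypothetical helper (not part of hcruR): flag inpatient stays that begin
# within `days_rule` days after the same patient's previous discharge.
flag_readmission <- function(df, days_rule = 30) {
  # Sort by patient and admission date so each stay follows its predecessor.
  df <- df[order(df$patient_id, df$admission_date), ]
  n <- nrow(df)
  # Previous row's discharge date and patient (NA for the first row).
  prev_discharge <- c(as.Date(NA), df$discharge_date[-n])
  prev_patient <- c(NA, df$patient_id[-n])
  same_patient <- !is.na(prev_patient) & prev_patient == df$patient_id
  gap <- as.numeric(df$admission_date - prev_discharge)
  # Assumption: a gap of 0 to `days_rule` days counts as a readmission.
  df$readmission <- same_patient & !is.na(gap) & gap >= 0 & gap <= days_rule
  df
}

# Toy inpatient stays (dates invented for illustration).
stays <- data.frame(
  patient_id = c("P1", "P1", "P2", "P1"),
  admission_date = as.Date(c("2023-01-05", "2023-01-25", "2023-02-01", "2023-06-01")),
  discharge_date = as.Date(c("2023-01-10", "2023-01-28", "2023-02-03", "2023-06-04"))
)
flagged <- flag_readmission(stays)
print(flagged)
```

Only the second P1 stay is flagged: it starts 15 days after the previous discharge, while the June stay falls well outside the 30-day window and the P2 stay has no earlier admission.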

plot_hcru() visualizes HCRU summary statistics for each care setting (domain), grouped by cohort and time window.

Arguments

| Argument | Type | Description |
|---|---|---|
| summary_df | data.frame | Output from the estimate_hcru() function. |
| x_var | character | Column name to plot on the x-axis (default "period"). |
| y_var | character | Column name to plot on the y-axis (default "Cost"). |
| cohort_col | character | Name of the column identifying cohorts (default "cohort"). |
| facet_var | character | Column to generate subplots for (default "care_setting"). |
| facet_var_n | numeric | Number of columns in the facet grid (default 3). |
| title | character | Title of the plot. |
| x_lable | character | Label for the x-axis. |
| y_lable | character | Label for the y-axis. |
| fill_lable | character | Label for the fill legend. |
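
For readers who want to customise a chart beyond these arguments, the sketch below builds a comparable faceted bar chart directly with ggplot2. It is not the implementation of plot_hcru(): the toy summary values and cohort names are invented for illustration, and the column names simply mirror those used in the usage example.

```r
library(ggplot2)

# Toy summary table with made-up values (illustration only); the column names
# mirror those passed to plot_hcru() in the usage example.
toy_summary <- data.frame(
  cohort = rep(c("control", "treatment"), times = 4),
  period = rep(rep(c("pre", "post"), each = 2), times = 2),
  care_setting = rep(c("IP", "OP"), each = 4),
  Avg_cost_per_patient = c(1200, 1500, 900, 1100, 300, 350, 280, 320)
)
# Show "pre" before "post" on the x-axis instead of alphabetical order.
toy_summary$period <- factor(toy_summary$period, levels = c("pre", "post"))

p <- ggplot(toy_summary,
            aes(x = period, y = Avg_cost_per_patient, fill = cohort)) +
  geom_col(position = "dodge") +          # side-by-side bars per cohort
  facet_wrap(~ care_setting, ncol = 3) +  # one panel per care setting
  labs(
    title = "Per patient average cost by domain and cohort",
    x = "Period", y = "Average costs", fill = "Cohort"
  )
print(p)
```

Setting the factor levels explicitly is a small design choice that keeps the pre-index bars to the left of the post-index bars in every panel.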

📊 Sample Data

This package includes a demo dataset for easy testing:

  • hcru_sample_data: 200 patients across 2 cohorts
head(hcru_sample_data)
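
To run the package on your own data, the input needs columns that match the arguments in the function reference. The sketch below builds a tiny data frame with some of the default column names and computes average visits per patient in base R, as a rough picture of the kind of summary involved. All values are invented, the real hcru_sample_data may contain different or additional columns, and the aggregation shown is not the package's own code.

```r
# Toy input using default column names from the function reference.
# All values are invented for illustration.
toy_data <- data.frame(
  patient_id   = c("P1", "P1", "P1", "P2", "P2", "P3"),
  cohort       = c("A", "A", "A", "A", "A", "B"),
  care_setting = c("OP", "OP", "IP", "OP", "OP", "OP"),
  period       = c("pre", "post", "post", "post", "post", "post"),
  encounter_id = paste0("E", 1:6),
  cost_usd     = c(120, 80, 2500, 60, 90, 150)
)

# Step 1: count encounters per patient within cohort, setting, and period.
per_patient <- aggregate(
  encounter_id ~ patient_id + cohort + care_setting + period,
  data = toy_data, FUN = length
)
names(per_patient)[names(per_patient) == "encounter_id"] <- "visits"

# Step 2: average those counts across patients who had at least one visit.
avg_visits <- aggregate(
  visits ~ cohort + care_setting + period,
  data = per_patient, FUN = mean
)
print(avg_visits)
```

Note that this simple average only covers patients with at least one encounter in a given cell; whether patients with zero visits belong in the denominator is an analytic decision the sketch does not settle.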

📚 Vignette

Run the following to access the full walkthrough:

vignette("hcru-analysis", package = "hcruR")

🔬 Use Cases

  • Cost burden studies before/after treatment
  • Resource comparison across patient populations
  • Outcome stratification based on utilization patterns

🛠️ Development

To contribute:

git clone https://github.com/mumbarkar/hcruR.git
cd hcruR

📜 License

This package is licensed under the MIT License.