Introduction
The nemsqar package provides an automated and
reproducible framework for calculating EMS quality measures defined by
the National EMS Quality Alliance (NEMSQA). These measures are widely
used by EMS agencies, trauma systems, quality improvement teams, and
researchers to evaluate performance and support evidence‑based
improvement activities.
This vignette is written for users who are knowledgeable in EMS,
injury epidemiology, or quality improvement, but who may be new to R or
new to calculating NEMSQA measures using R. The focus is to guide you
through each step of the workflow: loading data, preparing it for use
with nemsqar, and running a selected NEMSQA measure.
By the end of this vignette, you will understand:
- The structure and purpose of the example datasets included in
nemsqar
- How to load NEMSQA‑ready tables using synthetic data packaged with
nemsqar
- How to prepare EMS data for measure calculation
- How to run individual NEMSQA measure functions
- How to interpret results generated by the package
The sections that follow walk through the entire process, from loading EMS data in R to producing a standardized performance measure aligned with national reporting expectations.
The Measures
This vignette focuses on one NEMSQA measure implemented in
nemsqar:
- Asthma‑01: Assessment and treatment of patients with suspected asthma
All measures in nemsqar follow the same basic structure.
Each one requires a defined set of NEMSIS‑aligned tables and returns
results in a standardized format. The following sections introduce the
data required to calculate this measure and demonstrate how to run it in
R.
The Data
Before calculating any measure, it is important to understand the
example datasets included with nemsqar. These datasets are
small, synthetic representations of NEMSIS tables. They provide a safe
environment for learning the workflow before applying it to production
EMS data. Because NEMSQA measure logic relies on the NEMSIS data
structure, each measure requires several tables (for example, patient,
response, situation, medications, vitals).
Loading Example Data
The nemsqar package includes synthetic datasets that
mirror the NEMSIS tables required for NEMSQA measure calculation. You
can load them with the data() function. For Asthma‑01, the
following tables are required:
data("nemsqar_patient_scene_table")
data("nemsqar_response_table")
data("nemsqar_situation_table")
data("nemsqar_medications_table")
Each dataset loads into your R environment as a standard data frame.
Inspecting Table Structure
Because many users are new to R, it is important to verify that each dataset loaded correctly and contains the variables required by the measure functions. A few simple commands help confirm this.
Review the structure of a loaded dataset
# Quick overview of column names and data types
dplyr::glimpse(nemsqar_patient_scene_table)
#> Rows: 10,000
#> Columns: 6
#> $ `Incident Patient Care Report Number - PCR (eRecord.01)` <chr> "NyXFBlJfnm-8…
#> $ `Incident Date` <date> 2023-12-20, …
#> $ `Patient Age (ePatient.15)` <dbl> 98, 75, 24, 1…
#> $ `Patient Age Units (ePatient.16)` <chr> "Minutes", "D…
#> $ `Patient Date Of Birth (ePatient.17)` <date> 2023-12-19, …
#> $ `Patient Gender (ePatient.13)` <chr> "Male to Fema…
View the first few rows
# An abbreviated look at the actual data tables
head(nemsqar_patient_scene_table, n = 10)
#> # A tibble: 10 × 6
#> Incident Patient Care Report Number …¹ `Incident Date` Patient Age (ePatien…²
#> <chr> <date> <dbl>
#> 1 NyXFBlJfnm-8333586176 2023-12-20 98
#> 2 XTLCINMLTP-8616021114 2023-08-30 75
#> 3 HfYjlIEQSk-9529756610 2023-03-21 24
#> 4 MOwVDhriyC-5915613206 2023-09-13 115
#> 5 ZCGOtLEPKw-7820135532 2023-02-21 54
#> 6 fEMvUCQCRQ-9052388486 2023-08-23 88
#> 7 VTLPiFWWGd-6806896482 2023-12-09 83
#> 8 YvZbHRTUuK-8780915452 2023-04-06 24
#> 9 DkKIjJSFtA-7499641828 2023-05-07 95
#> 10 CIQMuVGJgS-9144926148 2023-11-13 82
#> # ℹ abbreviated names:
#> # ¹`Incident Patient Care Report Number - PCR (eRecord.01)`,
#> # ²`Patient Age (ePatient.15)`
#> # ℹ 3 more variables: `Patient Age Units (ePatient.16)` <chr>,
#> # `Patient Date Of Birth (ePatient.17)` <date>,
#> # `Patient Gender (ePatient.13)` <chr>
These functions allow you to check column names, data types, and basic record structure. This step is essential because NEMSQA logic depends on specific fields. Incorrect data types (for example, character instead of numeric) will cause measure functions to fail.
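The check described above can also be scripted. The following is a minimal sketch in base R, assuming a small stand-in table; the column names, the expected classes, and the `check_column_types()` helper are all hypothetical and are not part of nemsqar.

```r
# Stand-in table (illustrative column names, not the nemsqar example data)
patient_scene <- data.frame(
  pcr_number    = c("A-001", "A-002", "A-003"),
  incident_date = as.Date(c("2023-01-10", "2023-01-12", "2023-01-14")),
  patient_age   = c("34", "18", "7"),  # stored as text on purpose
  stringsAsFactors = FALSE
)

# Expected class for each column (assumed expectations for this sketch)
expected <- c(
  pcr_number    = "character",
  incident_date = "Date",
  patient_age   = "numeric"
)

# Hypothetical helper: compare the first class of each column to the expected
# class; columns that are absent from the data are reported with actual = NA
check_column_types <- function(data, expected) {
  cols <- names(expected)
  want <- unname(expected)
  actual <- vapply(
    cols,
    function(col) {
      if (col %in% names(data)) class(data[[col]])[1] else NA_character_
    },
    character(1),
    USE.NAMES = FALSE
  )
  data.frame(
    column   = cols,
    expected = want,
    actual   = actual,
    ok       = !is.na(actual) & actual == want,
    stringsAsFactors = FALSE
  )
}

type_report <- check_column_types(patient_scene, expected)
type_report
```

Rows where `ok` is `FALSE` point to columns that need conversion before a measure function is run. Note that integer columns report the class "integer" rather than "numeric", so the expectations may need adjusting for your own data.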
Data types
The example datasets included in nemsqar already use
appropriate data types. When working with real EMS data, you must verify
these types manually. This ensures that the data satisfy validation
requirements and prevents errors during measure calculation.
In practice, EMS data commonly include issues such as:
- Dates stored as character strings
- Numeric values stored as text (for example, "5" instead of 5)
- Empty strings used in place of NA
- Mixed formats within the same column
Below are examples of how to identify and correct these issues before
running any nemsqar measure.
Example: Converting character dates to proper date formats
# Example: incident dates stored as character values
example_data <- data.frame(
  Incident_Date = c("2023-01-10", "01/12/2023", "20230114"),
  stringsAsFactors = FALSE
)
# Convert using lubridate (recommended). parse_date_time() returns a
# date-time (POSIXct), so wrap it in as.Date() to obtain a Date column.
example_data$Incident_Date <- as.Date(
  lubridate::parse_date_time(
    example_data$Incident_Date,
    orders = c("ymd", "mdy", "Ymd")
  )
)
Example: Converting numbers stored as character strings
numeric_example <- data.frame(
  # note that 45 here has whitespace surrounding the value
  Patient_Age_raw = c("34", "18", "07", " 45 ")
)
# Trim whitespace and convert to numeric
numeric_example$Patient_Age <- as.numeric(trimws(
  numeric_example$Patient_Age_raw
))
Example: Replacing empty strings with NA
missing_example <- data.frame(
  eSituation_11 = c("", "R41.82", "", "T14.90")
)
# Replace empty strings with NA
missing_example$eSituation_11 <- dplyr::na_if(missing_example$eSituation_11, "")
missing_example
Why these steps matter
Date fields are used for patient age computation and for time‑based denominators. If these values are stored as character strings or in inconsistent formats, the measure logic will not execute correctly.
Numeric fields such as age, blood pressure, respiratory rate, or dosage must be numeric to satisfy validation checks and ensure appropriate comparisons.
Empty strings cause false exclusions during population filtering,
especially when nemsqar logic expects missing values to be
formally represented as NA.
Ensuring correct data types prior to running any nemsqar function improves reproducibility, reduces debugging time, and allows the measure logic to operate as intended.
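Before converting anything, it can help to audit which text columns are affected. The sketch below uses base R only and a small stand-in data frame; the `audit_text_columns()` helper and the column names are hypothetical and are not provided by nemsqar.

```r
# Stand-in data with the problems described above
raw <- data.frame(
  age        = c("34", " 45 ", "", "07"),
  impression = c("", "R41.82", "", "T14.90"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each character column, count empty strings and
# flag columns whose non-empty values all convert cleanly to numeric
audit_text_columns <- function(data) {
  is_chr <- vapply(data, is.character, logical(1))
  chr_cols <- names(data)[is_chr]
  do.call(rbind, lapply(chr_cols, function(col) {
    x <- trimws(data[[col]])
    non_empty <- x[!is.na(x) & x != ""]
    data.frame(
      column        = col,
      n_empty       = sum(!is.na(x) & x == ""),
      looks_numeric = length(non_empty) > 0 &&
        !anyNA(suppressWarnings(as.numeric(non_empty))),
      stringsAsFactors = FALSE
    )
  }))
}

audit <- audit_text_columns(raw)
audit
```

Columns with `looks_numeric = TRUE` are candidates for `as.numeric()`, and any column with `n_empty > 0` is a candidate for replacing empty strings with NA.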
Dealing with problematic column names
If your datasets already have clean names, you may skip this step.
EMS registry data often contain column names with spaces, punctuation, or special characters. These can make programming in R more difficult. To avoid these issues, it is helpful to standardize column names before running any measures.
Below is a simple reusable function to clean column names by replacing spaces and special characters with underscores.
# Define a reusable column-cleaning function
clean_cols <- function(data) {
  data |>
    dplyr::rename_with(
      .cols = tidyselect::everything(),
      ~ . |>
        gsub(pattern = "\\.|\\(|-|\\s", replacement = "_") |>
        gsub(pattern = "_+", replacement = "_") |>
        gsub(pattern = "\\)", replacement = "")
    )
}
# Apply cleaning to each table
nemsqar_patient_scene_data <- nemsqar_patient_scene_table |> clean_cols()
nemsqar_response_data <- nemsqar_response_table |> clean_cols()
nemsqar_situation_data <- nemsqar_situation_table |> clean_cols()
nemsqar_medications_data <- nemsqar_medications_table |> clean_cols()
# Inspect the cleaned patient/scene table
dplyr::glimpse(nemsqar_patient_scene_data)
#> Rows: 10,000
#> Columns: 6
#> $ Incident_Patient_Care_Report_Number_PCR_eRecord_01 <chr> "NyXFBlJfnm-8333586…
#> $ Incident_Date <date> 2023-12-20, 2023-0…
#> $ Patient_Age_ePatient_15 <dbl> 98, 75, 24, 115, 54…
#> $ Patient_Age_Units_ePatient_16 <chr> "Minutes", "Days", …
#> $ Patient_Date_Of_Birth_ePatient_17 <date> 2023-12-19, 2023-0…
#> $ Patient_Gender_ePatient_13 <chr> "Male to Female, Tr…
Special characters and whitespace are now either removed or replaced with _, so the column names can be referenced in R without backtick quoting.
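One risk of any name-cleaning step is that two different original names collapse to the same cleaned name. The sketch below applies the same three substitutions in base R to a few made-up names and checks for collisions; `clean_name()` and the example names are hypothetical, chosen only to trigger the problem.

```r
# Same substitutions as the column-cleaning function above, applied to a
# character vector of names rather than to a data frame
clean_name <- function(x) {
  x <- gsub("\\.|\\(|-|\\s", "_", x)  # dots, "(", "-", whitespace -> "_"
  x <- gsub("_+", "_", x)             # collapse runs of underscores
  gsub("\\)", "", x)                  # drop closing parentheses
}

# Made-up names: the first two differ only in punctuation
raw_names <- c(
  "Patient Age (ePatient.15)",
  "Patient-Age (ePatient 15)",
  "Incident Date"
)

cleaned <- clean_name(raw_names)

# Cleaned names that occur more than once
collisions <- unique(cleaned[duplicated(cleaned)])
collisions
```

If `collisions` is not empty, rename the affected columns by hand before cleaning so that each column keeps a unique name.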
Understanding Required Inputs
Each NEMSQA measure requires a specific set of input tables. Although
nemsqar can accept a single combined dataset through the
df argument, this approach is not recommended. The
preferred workflow is to supply separate tables using the
*_table arguments (for example,
patient_scene_table, response_table). This
aligns with the NEMSIS structure, where elements such as ePatient,
eScene, eResponse, and eSituation are stored in distinct tables.
In practice, your data should follow this multi‑table structure:
- Patient and scene data stored together (1:1)
- Response data stored separately
- Situation data stored separately
- Medications and vitals stored in their respective tables
Each measure expects a consistent set of these tables. For example:
- Asthma‑01 requires patient, response, situation, and medication tables.
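When tables are supplied separately, it is worth confirming that they describe the same set of records. The sketch below assumes the tables share a record identifier (here a hypothetical `pcr` column standing in for eRecord.01) and uses small stand-in tables; it illustrates the check and does not describe how nemsqar joins tables internally.

```r
# Stand-in tables sharing a record identifier column
patient_scene <- data.frame(
  pcr = c("A-001", "A-002", "A-003"),
  stringsAsFactors = FALSE
)
response <- data.frame(
  pcr = c("A-001", "A-002", "A-004"),
  stringsAsFactors = FALSE
)

# Identifiers present in one table but absent from the other
missing_from_patient  <- setdiff(response$pcr, patient_scene$pcr)
missing_from_response <- setdiff(patient_scene$pcr, response$pcr)

missing_from_patient
missing_from_response
```

Non-empty results suggest that the tables were extracted with different filters or date ranges and should be reconciled before calculating a measure.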
The next sections demonstrate how to supply these inputs to
nemsqar and how to calculate a measure.
Running the NEMSQA Measures Using nemsqar
Once the required tables are loaded, you can calculate your first
measure. Each measure in nemsqar is implemented through a
dedicated function that accepts NEMSIS‑aligned tables and returns
standardized results.
nemsqar workhorse functions
Each measure is built using two core functions:
- A wrapper function named measure_##() (for example, asthma_01())
- A corresponding population function, such as asthma_01_population()
The wrapper function performs two main tasks. First, it calls the population function to identify the population of interest. Then it applies the measure logic to estimate performance. Each NEMSQA measure follows this same pattern.
Running the wrapper function for Asthma‑01
The asthma_01() function requires several NEMSIS‑aligned
tables and column mappings. All arguments shown below are required. Each
column argument identifies the specific NEMSIS field used by the measure
logic. Note that most argument names signal the corresponding NEMSIS
data element. For example, eresponse_05_col corresponds to
eResponse.05 in the NEMSIS data dictionary.
To help you map your own data, the list below shows how several key arguments align with their corresponding NEMSIS elements:
- erecord_01_col -> eRecord.01 (PCR number)
- incident_date_col -> eTimes.03 (Unit Notified by Dispatch Date/Time)
- patient_DOB_col -> ePatient.17 (patient date of birth)
- epatient_15_col -> ePatient.15 (patient age)
- epatient_16_col -> ePatient.16 (age units)
- eresponse_05_col -> eResponse.05 (type of service requested)
- esituation_11_col -> eSituation.11 (primary impression)
- esituation_12_col -> eSituation.12 (secondary impression)
- emedications_03_col -> eMedications.03 (medication administered)
These mappings ensure that each argument references the correct NEMSIS data element when running the measure.
# Run Asthma‑01 without grouping
asthma_01_all <- asthma_01(
patient_scene_table = nemsqar_patient_scene_data,
response_table = nemsqar_response_data,
situation_table = nemsqar_situation_data,
medications_table = nemsqar_medications_data,
erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
incident_date_col = Incident_Date,
patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
epatient_15_col = Patient_Age_ePatient_15,
epatient_16_col = Patient_Age_Units_ePatient_16,
eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03,
confidence_interval = TRUE,
method = "clopper-pearson",
conf.level = 0.95
)
# print the results
asthma_01_all
#> # A tibble: 3 × 8
#> measure pop numerator denominator prop prop_label lower_ci upper_ci
#> <chr> <chr> <int> <int> <dbl> <chr> <dbl> <dbl>
#> 1 Asthma-01 Adults 0 4 0 0% 0 0.602
#> 2 Asthma-01 Peds 3 25 0.12 12% 0.0255 0.312
#> 3 Asthma-01 All 3 29 0.103 10.34% 0.0219 0.274
The output contains one row per population (adults, pediatrics, and all patients). Each row reports the numerator, denominator, proportion, a formatted percentage label, and the lower and upper confidence interval bounds. This structure is consistent across all NEMSQA measures implemented in nemsqar.
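To see where the interval bounds come from, the counts printed above can be passed to base R's `stats::binom.test()`, which computes the exact (Clopper-Pearson) binomial interval. This is a sketch under the assumption that the "clopper-pearson" method in nemsqar corresponds to that standard exact interval; the `exact_ci()` helper is hypothetical.

```r
# Hypothetical helper: exact binomial confidence interval for a proportion,
# computed with stats::binom.test() (Clopper-Pearson)
exact_ci <- function(numerator, denominator, conf_level = 0.95) {
  ci <- stats::binom.test(
    numerator,
    denominator,
    conf.level = conf_level
  )$conf.int
  c(lower = ci[1], upper = ci[2])
}

# Counts taken from the printed Asthma-01 results above
peds_ci   <- exact_ci(3, 25)  # Peds row
adults_ci <- exact_ci(0, 4)   # Adults row

round(peds_ci, 3)
round(adults_ci, 3)
```

The width of the adult interval illustrates why proportions based on very small denominators should be interpreted cautiously.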
Running the asthma_01 wrapper function using grouping
nemsqar allows you to calculate a measure for the entire
dataset or for specific subgroups. Grouping can be useful when you want
to understand performance within meaningful categories, such as age
groups, service types, or impressions. Grouping is implemented using the
.by argument, which follows the same syntax used in
dplyr::summarize().
The example below shows how to run Asthma‑01 grouped
by age units. All required tables and column mappings remain the same;
the only additional argument is .by.
# Run `asthma_01` for a whole dataset, group by age units.
# All core inputs remain the same. Only the .by argument is added.
asthma_01_age <- asthma_01(
patient_scene_table = nemsqar_patient_scene_data,
response_table = nemsqar_response_data,
situation_table = nemsqar_situation_data,
medications_table = nemsqar_medications_data,
erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
incident_date_col = Incident_Date,
patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
epatient_15_col = Patient_Age_ePatient_15,
epatient_16_col = Patient_Age_Units_ePatient_16,
eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03,
confidence_interval = TRUE,
method = "clopper-pearson",
conf.level = 0.95,
# notice here that we use the `.by` argument from `dplyr::summarize` to group
# our analysis
.by = Patient_Age_Units_ePatient_16
)
# print the results
asthma_01_age
#> # A tibble: 10 × 9
#> Patient_Age_Units_ePa…¹ measure pop numerator denominator prop prop_label
#> <chr> <chr> <chr> <int> <int> <dbl> <chr>
#> 1 Years Asthma… Adul… 0 4 0 0%
#> 2 Months Asthma… Peds 1 12 0.0833 8.33%
#> 3 Minutes Asthma… Peds 2 9 0.222 22.22%
#> 4 Hours Asthma… Peds 0 2 0 0%
#> 5 Days Asthma… Peds 0 2 0 0%
#> 6 Months Asthma… All 1 12 0.0833 8.33%
#> 7 Minutes Asthma… All 2 9 0.222 22.22%
#> 8 Hours Asthma… All 0 2 0 0%
#> 9 Years Asthma… All 0 4 0 0%
#> 10 Days Asthma… All 0 2 0 0%
#> # ℹ abbreviated name: ¹Patient_Age_Units_ePatient_16
#> # ℹ 2 more variables: lower_ci <dbl>, upper_ci <dbl>
Grouping is optional. It can reveal differences in performance across patient subpopulations, and it can be applied to any NEMSQA measure using the same .by syntax.
Working with the *_population() functions
Each NEMSQA measure includes a companion *_population()
function. These functions identify the population of interest by
applying the full set of inclusion and exclusion criteria defined by
NEMSQA. They perform all filtering, validation, and intermediate
computations needed to determine which records belong in the measure
denominator.
Each population function returns a list containing
several tibbles that help you examine the population:
- A tibble with counts for each filtering step
- Tibbles for specific populations (for example, adult, pediatric, or
all patients)
- A tibble showing the initial population before any filtering
- A tibble with the full dataset and computed fields
- A tibble summarizing missingness for required columns across all tables
These objects are useful when validating data quality, understanding how records flowed through the NEMSQA criteria, and troubleshooting unexpected measure results. In practice, population functions are most useful when you need to verify which records were included or excluded from the denominator and why. Analysts often use these functions when denominator counts look unexpected, when investigating data quality issues, or when comparing populations across systems or years. They provide a transparent view of how NEMSQA logic was applied to your data.
The example below demonstrates how to use
asthma_01_population() to inspect the population identified
for Asthma‑01.
Using asthma_01_population() to examine the target population
The asthma_01_population() function identifies the
population of interest by applying all NEMSQA inclusion and exclusion
criteria. The function uses the same required tables and column mappings
as asthma_01(), but it does not calculate performance
estimates and does not use confidence interval or grouping arguments.
# Run `asthma_01_population` for a whole dataset
# The code is virtually the same as `asthma_01()`, but we do not use the
# confidence interval arguments, nor the tidy dot `...` arguments for grouping
# or other operations via `dplyr::summarize`
populations_asthma_01 <- asthma_01_population(
patient_scene_table = nemsqar_patient_scene_data,
response_table = nemsqar_response_data,
situation_table = nemsqar_situation_data,
medications_table = nemsqar_medications_data,
erecord_01_col = Incident_Patient_Care_Report_Number_PCR_eRecord_01,
incident_date_col = Incident_Date,
patient_DOB_col = Patient_Date_Of_Birth_ePatient_17,
epatient_15_col = Patient_Age_ePatient_15,
epatient_16_col = Patient_Age_Units_ePatient_16,
eresponse_05_col = Response_Type_Of_Service_Requested_With_Code_eResponse_05,
esituation_11_col = Situation_Provider_Primary_Impression_Code_And_Description_eSituation_11,
esituation_12_col = Situation_Provider_Secondary_Impression_Description_And_Code_List_eSituation_12,
emedications_03_col = Patient_Medication_Given_or_Administered_Description_And_RXCUI_Codes_List_eMedications_03
)
# print structure of the results using `base::summary()`
populations_asthma_01 |> summary()
#> Length Class Mode
#> filter_process 2 tbl_df list
#> adults 16 tbl_df list
#> peds 16 tbl_df list
#> initial_population 16 tbl_df list
#> computing_population 16 tbl_df list
#> missingness 6 tbl_df list
This output provides a structured view of how records were filtered through the NEMSQA criteria. It allows you to inspect the initial population, denominator‑eligible records, age‑specific subgroups, and missingness summaries for required fields.
Examine a summary of counts for the NEMSQA population
The *_population() functions return several tibbles that
summarize how records were filtered into the final population of
interest. One of the most useful is the filter_process
tibble. It shows the number of records remaining after each inclusion or
exclusion step defined by NEMSQA.
Asthma-01 summary of attributes of the target population
# Display counts for each filtering step
populations_asthma_01$filter_process
#> # A tibble: 7 × 2
#> filter count
#> <chr> <int>
#> 1 911 calls 2400
#> 2 Asthma cases 109
#> 3 Beta agonist cases 1482
#> 4 Adults denominator 4
#> 5 Peds denominator 25
#> 6 Initial population 29
#> 7 Total dataset 10000
filter_process is typically the first place to look when values seem off.
Using filter_process from the population functions
Given that this vignette uses synthetic data, the counts may not
reflect realistic populations. However, the workflow remains the same
when working with real EMS data. The values in filter_process represent
distinct record counts at each stage (using
dplyr::distinct() internally). Reviewing these counts,
along with the missingness tibble returned by the population function,
can help diagnose data quality issues and better understand the
composition of the population being evaluated.
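One simple way to review these counts is to express each step as a share of the total dataset. The sketch below re-creates the filter_process counts printed above as a plain data frame, so it runs without nemsqar; the `pct_of_total` column is an illustrative addition and is not part of the package output.

```r
# Counts copied from the filter_process output shown above
filter_process <- data.frame(
  filter = c(
    "911 calls", "Asthma cases", "Beta agonist cases",
    "Adults denominator", "Peds denominator",
    "Initial population", "Total dataset"
  ),
  count = c(2400L, 109L, 1482L, 4L, 25L, 29L, 10000L),
  stringsAsFactors = FALSE
)

# Use the "Total dataset" row as the reference count
total <- filter_process$count[filter_process$filter == "Total dataset"]

# Share of the total dataset remaining at each step, in percent
filter_process$pct_of_total <- round(100 * filter_process$count / total, 2)
filter_process
```

A step whose share is far lower or higher than expected for your system is a reasonable starting point for checking the corresponding source column.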
Common Pitfalls for New R Users
Users who are new to R often encounter several predictable issues when preparing data for NEMSQA measure calculation. The sections below highlight the most common problems and how to avoid them. Addressing these issues before running measures improves reproducibility and reduces debugging time.
Incorrect variable types
Many NEMSQA logic components require numeric fields. If these values
are imported as character strings, the measure functions will fail.
Always verify column types before running a measure and convert them as
needed to meet nemsqar validation requirements.
Unintended name changes
Each function argument must be mapped to the column that holds the
corresponding NEMSIS field. You may name your columns however you prefer, but
the values that originate from eResponse.05 must be
supplied to the eresponse_05_col argument. The function
relies on the data itself, not the literal column name, so mapping a
column to the wrong argument will cause errors.
Missing required tables
Each measure requires a specific set of input tables. If a required table is not provided, the function will return an error. Ensure that all necessary tables are loaded and cleaned before running the measure.
Duplicated records
NEMSQA measures assume that each patient or encounter appears once in
the relevant input tables. Duplicate rows can shift denominator counts,
alter inclusion, or create unintended exclusions. Although
nemsqar includes safeguards to detect some duplication, it
is best practice to review your data for repeated records and to check
for unintended Cartesian joins created during data extraction or table
merging.
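A quick duplicate review can be done in base R before running any measure. The sketch below uses a small stand-in response table with a hypothetical `pcr` identifier column; it shows one possible check and is not the safeguard that nemsqar applies internally.

```r
# Stand-in table in which one record appears twice
response <- data.frame(
  pcr     = c("A-001", "A-002", "A-002", "A-003"),
  service = c("911 Response", "911 Response", "911 Response", "Transfer"),
  stringsAsFactors = FALSE
)

# Rows that exactly repeat an earlier row
dup_rows <- response[duplicated(response), ]

# Identifiers that appear on more than one row
pcr_counts <- table(response$pcr)
repeated_pcrs <- names(pcr_counts[pcr_counts > 1])

# Drop exact duplicate rows, keeping the first occurrence
deduplicated <- unique(response)

dup_rows
repeated_pcrs
nrow(deduplicated)
```

A repeated identifier is not always an error, because some tables may legitimately hold several rows per record. Review repeated identifiers before removing anything other than exact duplicate rows.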
Next Steps
This vignette introduced the core workflow for calculating NEMSQA
measures using nemsqar. After reviewing these examples,
users may wish to expand their analyses by exploring additional
measures, integrating their own EMS datasets, or incorporating these
workflows into automated reporting pipelines. The package reference
documentation provides detailed descriptions of each function, and
additional vignettes will demonstrate multi‑measure workflows,
validation strategies, and integration with reproducible reporting tools
such as Quarto and R Markdown.
