Imports the FCDS data file and applies basic data pre-processing. If the user
provides only the path to the raw FCDS data, this function will by default
save a cached version of the pre-processed data in the default fcds directory
(see fcds_default_data_path() for more information). The user can then call
fcds_load() to load the cached, pre-processed data rather than repeating the
import or needing to locate the original raw data. Data caching can be
disabled by setting output_file = NULL.
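The import-then-load workflow described above might look like the following minimal sketch. The file path is hypothetical, and the guard simply skips the calls when the fcds package or the raw file is not available; only arguments documented on this page are used.

```r
# Hypothetical location of the raw file received from FCDS; replace with your own.
raw_file <- file.path("data-raw", "fcds_raw_data.dat")

if (requireNamespace("fcds", quietly = TRUE) && file.exists(raw_file)) {
  # First run: import, pre-process, and cache in fcds_default_data_path().
  fcds_data <- fcds::fcds_import(raw_file)

  # Later sessions: load the cached, pre-processed data instead of re-importing.
  fcds_data <- fcds::fcds_load()

  # One-off import that does not write a cache file.
  fcds_uncached <- fcds::fcds_import(raw_file, output_file = NULL)
}
```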
fcds_import(
  file,
  output_dir = fcds_default_data_path(),
  output_file = "fcds_%F-%H%M.rds",
  ...,
  keep_original_columns = FALSE,
  fcds_recoding = NULL,
  verbose = TRUE,
  col_types = readr::cols(.default = readr::col_character())
)
Argument | Description |
---|---|
file | The raw FCDS data file. |
output_dir | The location where the cleaned FCDS data files should be located. Set to |
output_file | Name of the file in which to store the cleaned FCDS data. If you don't want to save the cleaned data, set output_file = NULL. |
... | Arguments passed on to |
keep_original_columns | Should the original FCDS columns be kept in the imported data? By default, only the cleaned columns are kept. |
fcds_recoding | The FCDS recoding definition. See the FCDS Recoding section for more information. Set to |
verbose | Prints additional information about the importing process. |
col_types | Passed to |
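The default output_file value, "fcds_%F-%H%M.rds", looks like a strftime-style pattern. Assuming it is expanded with the import time (this page does not confirm that), the resulting cache file name can be previewed with base R. The timestamp below is an arbitrary illustration.

```r
# Arbitrary example timestamp standing in for the time of import.
stamp <- as.POSIXct("2019-03-05 14:07:00", tz = "UTC")

# %F expands to the ISO date (YYYY-MM-DD); %H%M to the hour and minute.
cache_name <- format(stamp, "fcds_%F-%H%M.rds", tz = "UTC")
print(cache_name)
# "fcds_2019-03-05-1407.rds"
```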
A tibble containing the pre-processed FCDS data and, optionally, the
original columns of the raw FCDS data (if keep_original_columns = TRUE).
The pre-processing step provides the following columns.
patient_id: Patient ID Number. NAACCR Item #20. Derived from Patient_ID_N20.
year_group: Year of Diagnosis (5-year group). NAACCR Item #390. Derived from Date_of_Dx_Year_Recoded.
year: Year of Diagnosis (midpoint of 5-year group). NAACCR Item #390. Derived from Date_of_Dx_Year_Recoded.
cancer_status: Cancer Status at time abstract was completed. NAACCR Item #1770. Derived from Cancer_Status_N1770.
cancer_site_group: FCDS Site Group. NAACCR Item #2220. Derived from FCDS_Site_Group.
cancer_site_specific: FCDS Site Group with specific within-group areas. NAACCR Item #2220. Derived from FCDS_Site_Group.
confirmation: Diagnostic Confirmation at first diagnosis. NAACCR Item #490. Derived from Diagnostic_Confirmation_N490.
age_group: FCDS Age Group. NAACCR Item #2220. Derived from FCDS_Age_Group.
race: Race (recoded). NAACCR Item #160. Derived from Race_Recoded.
sex: Sex (recoded). NAACCR Item #220. Derived from Sex_Recoded.
origin: Spanish/Hispanic Origin (recoded). NAACCR Item #190. Derived from Ethnicity_Recoded.
marital_status: Marital Status at diagnosis (recoded). NAACCR Item #150. Derived from Marital_Status_Recoded.
county_name: County Name of patient's primary residence at the time the tumor was diagnosed. NAACCR Item #90. Derived from County_at_DX_N90.
county_fips: County FIPS Code of patient's primary residence at the time the tumor was diagnosed. NAACCR Item #90. Derived from County_at_DX_N90.
state: State of patient's primary residence at the time of diagnosis (recoded). NAACCR Item #80. Derived from Addr_at_DX_State_Recoded.
florida_resident: Patient's primary state of residence was Florida at time of diagnosis. NAACCR Item #80. Derived from Addr_at_DX_State_Recoded.
country: Country of patient's primary residence at time of diagnosis (recoded). NAACCR Item #102. Derived from Addr_at_Dx_Country_Recoded.
birth_country: Country of Birthplace (recoded). NAACCR Item #254. Derived from Birthplace_Country_Recoded.
birth_state: State of Birthplace (recoded). NAACCR Item #254. Derived from Birthplace_State_Abrv_Recoded.
primary_payer: Primary Payer at Diagnosis (recoded). NAACCR Item #630. Derived from Dx_Primary_Payor_Recoded.
cancer_reporting_source: Type of Reporting Source. NAACCR Item #500. Derived from Type_of_Reporting_Source_N500.
cancer_ICDO3_conversion: ICD-O-3 Conversion Flag. NAACCR Item #2116. Derived from ICDO3_Conversion_FL_N2116.
cancer_laterality: Laterality at Diagnosis. NAACCR Item #410. Derived from Laterality_N410.
cancer_grade: Grade, Differentiation, or Cell Lineage Indicator (SEER/CCCR). NAACCR Item #440. Derived from Grade_N440.
cancer_ICDO3_histology: Histologic Type ICD-O-3. NAACCR Item #522. Derived from Histologic_Type_ICDO3_N522.
cancer_ICDO3_behavior: Behavior Code ICD-O-3. NAACCR Item #523. Derived from Behavior_Code_ICDO3_N523.
cancer_ICDO3_morphology: Morphology Code ICD-O-3 (Type and Behavior). NAACCR Item #521. Derived from Histologic_Type_ICDO3_N522, Behavior_Code_ICDO3_N523.
seer_stage_1977: SEER Summary Stage 1977. NAACCR Item #760. Derived from SEER_Summ_Stage_1977_N760.
seer_stage_2000: SEER Summary Stage 2000. NAACCR Item #759. Derived from SEER_Summ_Stage_2000_N759.
seer_stage: SEER Stage from 2000 falling back to 1977. NAACCR Item #759. Derived from seer_stage_1977, seer_stage_2000.
seer_stage_derived_1977: Derivation of SEER Summary Stage 1977. NAACCR Item #3040. Derived from Derived_SS1977_FL_N3040.
seer_stage_derived_2000: Derivation of SEER Summary Stage 2000. NAACCR Item #3050. Derived from Derived_SS2000_FL_N3050.
tobacco_cigarette: Cigarette smoking. NAACCR Item #9965. Derived from FCDS_Tob_Use_Cigarette_N1300.
tobacco_other: Smoking tobacco products other than cigarettes (e.g., pipes, cigars, kreteks). NAACCR Item #9966. Derived from FCDS_Tob_Use_OthSmoke_N1300.
tobacco_smokeless: Smokeless tobacco products (e.g., chewing tobacco, snuff). NAACCR Item #9967. Derived from FCDS_Tob_Use_Smokeless_Tob_N1300.
tobacco_nos: Tobacco NOS, includes use of e-cigarettes and vaporizers. NAACCR Item #9968. Derived from FCDS_Tob_Use_NOS_N1300.
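Two of the columns listed above are computed from other values: year is the midpoint of year_group, and seer_stage prefers the 2000 stage and falls back to the 1977 stage. The base R sketch below illustrates the idea only. The year-group label format ("1981-1985") and the stage labels are assumptions made for illustration, and this is not the package's actual implementation.

```r
# Assumed label format for the 5-year groups; the real labels may differ.
year_group <- c("1981-1985", "1986-1990")

# Midpoint of each group: split on the hyphen and average the two bounds.
bounds <- strsplit(year_group, "-", fixed = TRUE)
year_mid <- vapply(bounds, function(b) mean(as.integer(b)), numeric(1))

# Hypothetical stage values; NA marks a missing stage.
seer_stage_2000 <- c("Localized", NA, "Distant")
seer_stage_1977 <- c("Regional", "Regional", NA)

# Use the 2000 stage where present, otherwise fall back to the 1977 stage.
seer_stage <- ifelse(is.na(seer_stage_2000), seer_stage_1977, seer_stage_2000)
```

Here year_mid evaluates to 1983 and 1988, and seer_stage to "Localized", "Regional", "Distant".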
Users must request the data from FCDS directly, via the FCDS Data Request webpage.
This section will discuss the formatting of the FCDS recoding YAML file.
Other FCDS Import Functions: fcds_cache, fcds_default_data_path, fcds_load, fcds_recoding