Imports the FCDS data file and applies basic data pre-processing. If the user
provides only the path to the raw FCDS data, this function by default saves a
cached version of the pre-processed data in the default fcds data directory
(see fcds_default_data_path() for more information). The user can then
call fcds_load() to load the cached, pre-processed data rather than
repeating the import process or locating the original raw data again.
Data caching can be disabled by setting output_file = NULL.
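The workflow described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the path in `raw_file` is hypothetical, and the guard only lets the calls run when the fcds package is installed and the file exists.

```r
# Hypothetical location of the raw FCDS data file; replace with your own path.
raw_file <- "path/to/raw_fcds_data_file"

if (requireNamespace("fcds", quietly = TRUE) && file.exists(raw_file)) {
  # Import, pre-process, and (by default) cache the cleaned data.
  fcds_data <- fcds::fcds_import(raw_file)

  # Later sessions can load the cached, pre-processed data directly.
  fcds_cached <- fcds::fcds_load()

  # Import without writing a cached copy.
  fcds_uncached <- fcds::fcds_import(raw_file, output_file = NULL)
}
```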
fcds_import(file, output_dir = fcds_default_data_path(), output_file = "fcds_%F-%H%M.rds", ..., keep_original_columns = FALSE, fcds_recoding = NULL, verbose = TRUE, col_types = readr::cols(.default = readr::col_character()))
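The default output_file value, "fcds_%F-%H%M.rds", looks like a date-time format pattern. Assuming it is expanded with strftime-style codes (an assumption, not something this page confirms), a time-stamped file name could be produced in base R like this:

```r
# Assumption: the default pattern is expanded with strftime-style codes
# (%F = ISO date, %H%M = hours and minutes).
pattern <- "fcds_%F-%H%M.rds"

# A fixed, hypothetical timestamp so the result is reproducible.
stamp <- as.POSIXct("2019-03-14 15:09:00", tz = "UTC")

file_name <- format(stamp, pattern, tz = "UTC")
print(file_name)
# "fcds_2019-03-14-1509.rds"
```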
| Argument | Description |
|---|---|
| file | The raw FCDS data file |
| output_dir | The location where the cleaned FCDS data files should be located. Set to |
| output_file | Name of the file to store the cleaned FCDS data file. If you don't want to save the cleaned data, set output_file = NULL |
| ... | Arguments passed on to |
| keep_original_columns | Should the original FCDS columns be kept in the imported data? By default, only the cleaned columns are kept. |
| fcds_recoding | The FCDS recoding definition. See the FCDS Recoding section for more information. Set to |
| verbose | Prints additional information about the importing process |
| col_types | Passed to |
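The default col_types value reads every column as character. The sketch below shows the effect of that column specification using readr::read_csv() on a small temporary file. Both the use of read_csv() and the CSV layout are assumptions made for illustration; the column names come from this page, and the values are invented.

```r
# Write a tiny, made-up CSV file to a temporary location.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Patient_ID_N20,Date_of_Dx_Year_Recoded", "0001,1983"), tmp)

if (requireNamespace("readr", quietly = TRUE)) {
  # Same column specification as the fcds_import() default.
  all_character <- readr::cols(.default = readr::col_character())
  toy <- readr::read_csv(tmp, col_types = all_character)

  # Every column is character, so leading zeros in IDs are preserved.
  str(toy)
}
```

Reading everything as character defers type decisions to the pre-processing step, which avoids silently mangling coded values during import.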
A tibble containing the pre-processed FCDS data and, optionally, the
original columns of the raw FCDS data (if keep_original_columns = TRUE).
The pre-processing step provides the following columns.
patient_id: Patient ID Number.
NAACCR Item #20.
Derived from Patient_ID_N20.
year_group: Year of Diagnosis (5 year group).
NAACCR Item #390.
Derived from Date_of_Dx_Year_Recoded.
year: Year of Diagnosis (midpoint of 5 year group).
NAACCR Item #390.
Derived from Date_of_Dx_Year_Recoded.
cancer_status: Cancer Status at time abstract was completed.
NAACCR Item #1770.
Derived from Cancer_Status_N1770.
cancer_site_group: FCDS Site Group.
NAACCR Item #2220.
Derived from FCDS_Site_Group.
cancer_site_specific: FCDS Site Group with specific within-group areas.
NAACCR Item #2220.
Derived from FCDS_Site_Group.
confirmation: Diagnostic Confirmation at first diagnosis.
NAACCR Item #490.
Derived from Diagnostic_Confirmation_N490.
age_group: FCDS Age Group.
NAACCR Item #2220.
Derived from FCDS_Age_Group.
race: Race (recoded).
NAACCR Item #160.
Derived from Race_Recoded.
sex: Sex (recoded).
NAACCR Item #220.
Derived from Sex_Recoded.
origin: Spanish/Hispanic Origin (recoded).
NAACCR Item #190.
Derived from Ethnicity_Recoded.
marital_status: Marital Status at diagnosis (recoded).
NAACCR Item #150.
Derived from Marital_Status_Recoded.
county_name: County Name of patient's primary residence at the time tumor was diagnosed.
NAACCR Item #90.
Derived from County_at_DX_N90.
county_fips: County FIPS Code of patient's primary residence at the time tumor was diagnosed.
NAACCR Item #90.
Derived from County_at_DX_N90.
state: State of patient's primary residence at the time of diagnosis (recoded).
NAACCR Item #80.
Derived from Addr_at_DX_State_Recoded.
florida_resident: Patient's primary state of residence was Florida at time of diagnosis.
NAACCR Item #80.
Derived from Addr_at_DX_State_Recoded.
country: Country of patient's primary residence at time of diagnosis (recoded).
NAACCR Item #102.
Derived from Addr_at_Dx_Country_Recoded.
birth_country: Country of Birthplace (recoded).
NAACCR Item #254.
Derived from Birthplace_Country_Recoded.
birth_state: State of Birthplace (recoded).
NAACCR Item #254.
Derived from Birthplace_State_Abrv_Recoded.
primary_payer: Primary Payer at Diagnosis (recoded).
NAACCR Item #630.
Derived from Dx_Primary_Payor_Recoded.
cancer_reporting_source: Type of Reporting Source.
NAACCR Item #500.
Derived from Type_of_Reporting_Source_N500.
cancer_ICDO3_conversion: ICD-O-3 Conversion Flag.
NAACCR Item #2116.
Derived from ICDO3_Conversion_FL_N2116.
cancer_laterality: Laterality at Diagnosis.
NAACCR Item #410.
Derived from Laterality_N410.
cancer_grade: Grade, Differentiation, or Cell Lineage Indicator (SEER/CCCR).
NAACCR Item #440.
Derived from Grade_N440.
cancer_ICDO3_histology: Histologic Type ICD-O-3.
NAACCR Item #522.
Derived from Histologic_Type_ICDO3_N522.
cancer_ICDO3_behavior: Behavior Code ICD-O-3.
NAACCR Item #523.
Derived from Behavior_Code_ICDO3_N523.
cancer_ICDO3_morphology: Morphology Code ICD-O-3 (Type and Behavior).
NAACCR Item #521.
Derived from Histologic_Type_ICDO3_N522, Behavior_Code_ICDO3_N523.
seer_stage_1977: SEER Summary Stage 1977.
NAACCR Item #760.
Derived from SEER_Summ_Stage_1977_N760.
seer_stage_2000: SEER Summary Stage 2000.
NAACCR Item #759.
Derived from SEER_Summ_Stage_2000_N759.
seer_stage: SEER Stage from 2000 falling back to 1977.
NAACCR Item #759.
Derived from seer_stage_1977, seer_stage_2000.
seer_stage_derived_1977: Derivation of SEER Summary Stage 1977.
NAACCR Item #3040.
Derived from Derived_SS1977_FL_N3040.
seer_stage_derived_2000: Derivation of SEER Summary Stage 2000.
NAACCR Item #3050.
Derived from Derived_SS2000_FL_N3050.
tobacco_cigarette: Cigarette smoking.
NAACCR Item #9965.
Derived from FCDS_Tob_Use_Cigarette_N1300.
tobacco_other: Smoking tobacco products other than cigarettes (e.g., pipes, cigars, kreteks).
NAACCR Item #9966.
Derived from FCDS_Tob_Use_OthSmoke_N1300.
tobacco_smokeless: Smokeless tobacco products (e.g., chewing tobacco, snuff).
NAACCR Item #9967.
Derived from FCDS_Tob_Use_Smokeless_Tob_N1300.
tobacco_nos: Tobacco NOS, includes use of e-cigarettes and vaporizers.
NAACCR Item #9968.
Derived from FCDS_Tob_Use_NOS_N1300.
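A few of the columns above are derived from other columns: year is the midpoint of year_group, seer_stage prefers the 2000 stage and falls back to the 1977 stage, and cancer_ICDO3_morphology combines histology and behavior. The base R sketch below illustrates those three derivations on toy data. The values, the "start-end" label format for year_group, and the "histology/behavior" notation are assumptions for illustration and may not match what the package produces.

```r
# Toy data with hypothetical values; real FCDS labels may differ.
toy <- data.frame(
  year_group = c("1981-1985", "2006-2010", "2011-2015"),
  seer_stage_1977 = c("Localized", "Regional", NA),
  seer_stage_2000 = c(NA, "Distant", "Localized"),
  cancer_ICDO3_histology = c("8140", "8070", "9680"),
  cancer_ICDO3_behavior = c("3", "2", "3"),
  stringsAsFactors = FALSE
)

# year: midpoint of the 5 year group, assuming a "start-end" label.
bounds <- strsplit(toy$year_group, "-", fixed = TRUE)
toy$year <- vapply(bounds, function(b) mean(as.numeric(b)), numeric(1))

# seer_stage: use the 2000 stage, falling back to 1977 when it is missing.
toy$seer_stage <- ifelse(
  is.na(toy$seer_stage_2000),
  toy$seer_stage_1977,
  toy$seer_stage_2000
)

# cancer_ICDO3_morphology: histology and behavior joined as "type/behavior".
toy$cancer_ICDO3_morphology <- paste(
  toy$cancer_ICDO3_histology,
  toy$cancer_ICDO3_behavior,
  sep = "/"
)

print(toy[, c("year", "seer_stage", "cancer_ICDO3_morphology")])
```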
Users must request the data from FCDS directly, via the FCDS Data Request webpage.
This section will discuss the format of the FCDS recoding YAML file.
Other FCDS Import Functions: fcds_cache, fcds_default_data_path, fcds_load, fcds_recoding