A Web Application and Software Package for Analysis and Exploration of the FCDS Florida Cancer Registry Data
Understanding and adapting to the specific needs of cancer patients within Moffitt’s catchment area, and more broadly across Florida, is critical to achieving Moffitt’s IMPACT 2028 goals [1] and CCSG Center of Excellence status. To this end, the Florida Statewide Cancer Registry [2] provides a comprehensive database of cancer diagnoses throughout Florida via the Florida Cancer Data System (FCDS). The goal of this work is to provide a consistent and broadly accessible interface to FCDS resources. We present a web application and a software package, fcds, that clean and harmonize FCDS cancer registry data so that highly specific regional cancer incidence data can be integrated into research analyses of Moffitt patient information.
The primary output of this work is a software package for the R statistical computing environment [3] that facilitates analysis and use of FCDS cancer registry data with a clear and consistent data structure, as advocated in Tidy Data by Hadley Wickham [4].
library(tidyverse)
library(fcds)
# Import the raw FCDS data file (needed only once)
fcds <- fcds_import("STAT_dataset_2018.dat")
# Load previously imported data
fcds <- fcds_load()
# Count prostate cancer cases after 1985 among men in Moffitt's catchment area
fcds_moffitt_prostate <- fcds %>%
  filter(cancer_site_group == "Prostate Gland", year > 1985) %>%
  filter_age_groups(age_gt = 20) %>%
  count_fcds(sex = "Male", moffitt_catchment = TRUE)
sex | county_name | year_group | year | age_group | n |
---|---|---|---|---|---|
Male | Charlotte | 1986-1990 | 1988 | 50 - 54 | 1 |
Male | Charlotte | 1986-1990 | 1988 | 55 - 59 | 12 |
Male | Charlotte | 1986-1990 | 1988 | 60 - 64 | 57 |
Male | Charlotte | 1986-1990 | 1988 | 65 - 69 | 126 |
Male | Charlotte | 1986-1990 | 1988 | 70 - 74 | 137 |
# Age-adjust the counts, then average over each 5-year group
fcds_moffitt_prostate <- fcds_moffitt_prostate %>%
  complete_age_groups(age_gt = 20) %>%
  age_adjust() %>%
  mutate(n = n / 5, rate = rate / 5)  # mean yearly count and rate
sex | county_name | year_group | year | n | population | rate |
---|---|---|---|---|---|---|
Male | Charlotte | 1986-1990 | 1988 | 107.0 | 38144 | 137.3968 |
Male | Charlotte | 1991-1995 | 1993 | 160.6 | 48412 | 160.8875 |
Male | Charlotte | 1996-2000 | 1998 | 212.4 | 53639 | 187.3476 |
Male | Charlotte | 2001-2005 | 2003 | 214.4 | 60336 | 174.3292 |
Male | Charlotte | 2006-2010 | 2008 | 226.4 | 64059 | 176.2285 |
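As a rough illustration of what an age-adjusted rate represents, the sketch below applies direct standardization in base R: age-specific rates are weighted by each age group's share of a standard population. This is a minimal sketch under stated assumptions, not the implementation of age_adjust(). The counts, populations, and standard population are hypothetical values chosen for easy arithmetic, and the per-100,000 scaling is likewise an assumption.

```r
# Hypothetical inputs for three age groups (illustrative only, not FCDS data)
cases <- data.frame(
  age_group  = c("50 - 54", "55 - 59", "60 - 64"),
  n          = c(10, 30, 60),           # observed cases
  population = c(10000, 10000, 10000),  # population at risk
  std_pop    = c(500, 300, 200)         # hypothetical standard population
)

# Age-specific rate in each group
cases$rate <- cases$n / cases$population

# Weight of each group: its share of the standard population
cases$weight <- cases$std_pop / sum(cases$std_pop)

# Directly standardized (age-adjusted) rate per 100,000
adjusted_rate <- sum(cases$rate * cases$weight) * 1e5

# Crude rate per 100,000, for comparison
crude_rate <- sum(cases$n) / sum(cases$population) * 1e5

print(c(adjusted = adjusted_rate, crude = crude_rate))
```

With these hypothetical inputs the adjusted rate is 260 per 100,000, against a crude rate of about 333, because the standard population places more weight on the younger groups, where rates are lower.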
# Map the mean yearly age-adjusted rates, one panel per 5-year group
fcds_map(fcds_moffitt_prostate) +
  facet_wrap(~ year_group) +
  ggtitle("Prostate Cancer in Moffitt Catchment Area",
          "Age-Adjusted Rates, Men Aged 20+") +
  labs(fill = "Mean Yearly\nAge-Adjusted Rate") +
  theme(legend.position = "bottom")
The fcds R package also powers FCDS Explorer, an easy-to-use web application for Moffitt researchers. Built with the R Shiny framework [5], FCDS Explorer provides features for summarizing and presenting the FCDS data.
[1] “IMPACT 2028 building for the next 10 years.” https://moffitt.org/media/9330/moffitt-annual-report-2018.pdf.
[2] “Florida cancer data system.” https://fcds.med.miami.edu/inc/welcome.shtml.
[3] R Core Team, “R: A language and environment for statistical computing.” R Foundation for Statistical Computing, Vienna, Austria, 2018 [Online]. Available: https://www.R-project.org/
[4] H. Wickham, “Tidy data,” Journal of Statistical Software, Articles, vol. 59, no. 10, pp. 1–23, 2014 [Online]. Available: https://www.jstatsoft.org/v059/i10
[5] W. Chang, J. Cheng, J. Allaire, Y. Xie, and J. McPherson, “Shiny: Web application framework for R.” 2018 [Online]. Available: https://CRAN.R-project.org/package=shiny