Common Excel Functions in R

The following table is from How to Use R With Excel by Alyssa Columbus, a start-to-finish guide on replacing Excel with R. You can also download a printable version.

Tasks

  1. Filter two sheets data sets to include only patients that appear in both.

  2. Create a grouping variable by extracting part of a string from a text field.

  3. Extract the first or last N characters from a string.

  4. Count the number of unique entries in a column and compare to the overall total number of rows.

  5. Format dates: extract the date portion of a date-time and remove outlier or missing dates.

  6. Add several columns together into a Total column.

Data

I’ve generated fake example data to be used for the above tasks. There are two data sets stored as sheets in the file data-export-example.xlsx.

Example Data

Diagnoses

MRN Patient ID Birth Date Admission Date ICD10 Code
101914 PATIENT-00035984 11/10/92 12:00:00 AM 2001-03-27 C83.50 Lymphoblastic (diffuse) lymphoma, unspecified site
101914 PATIENT-00035984 11/10/92 12:00:00 AM 2004-04-22 C16.0 Malignant neoplasm of cardia
105705 PATIENT-00012113 9/1/24 12:00:00 AM 2011-10-05 C19 Malignant neoplasm of rectosigmoid junction
105705 PATIENT-00012113 9/1/24 12:00:00 AM 2012-11-19 C34.11 Malignant neoplasm of upper lobe, right bronchus or lung
105705 PATIENT-00012113 9/1/24 12:00:00 AM 2016-11-22 C43.8 Malignant melanoma of overlapping sites of skin
106355 PATIENT-00057620 8/9/10 12:00:00 AM 2000-04-30 C06.1 Malignant neoplasm of vestibule of mouth

Therapies

MRN Patient ID Admission Date Therapy Date PX2 CODE_TYPE DESCRIPTION CHEMOCAT
467156 PATIENT-00037449 2000-01-04 2000-02-06 C9423 HCPC DACARBAZINE, BRAND NAME, 100 MG DACARBAZINE
467156 PATIENT-00037449 2000-01-04 2000-02-20 C9414 HCPC ETOPOSIDE, ORAL, BRAND NAME, 50 MG ETOPOSIDE
242023 PATIENT-00070776 2000-04-16 2000-06-11 J2675 HCPC PROGESTERONE PROGESTERONE
242023 PATIENT-00070776 2000-04-16 2000-06-25 C9420 HCPC CYCLOPHOSPHAMIDE, BRAND NAME, 100 MG CYCLOPHOSPHAMIDE
106355 PATIENT-00057620 2000-04-30 2000-05-18 S0179 HCPC MEGESTROL ACETATE, ORAL, 20MG MEGESTROL
769022 PATIENT-00037390 2000-07-14 2000-08-15 S0187 HCPC TAMOXIFEN CITRATE, ORAL, 10MG TAMOXIFEN

Package Spotlight: writexl

writexl is a relatively new package that can be used to write data frames from R into Excel .xlsx files.

You can write an Excel file from a data frame in the same way you use write_csv().

You can also save multiple data frames into a single Excel file as different sheets by using a named list.

Package Spotlight: esquisse

esquisse is an RStudio addin the lets you build ggplot2 plots using a graphical interface. Learn more at https://github.com/dreamRs/esquisse/.