✖ code unverified
<environment: 0x562e0b14c590> 🔒
Parent: <environment: package:teal.data>
Bindings:
- IRIS: [data.frame]
- MTCARS: [data.frame]
Topic 5: Deeper dive into how to provide and pre-process data in teal.modules.clinical (15 minutes)
How to provide data to a teal application? - code samples
The data argument in teal::init() is where you specify all the datasets your application will use. There are several ways to provide data to teal:
- using teal_data()
- using cdisc_data() for clinical data
Option 1: Using teal_data()
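A minimal sketch of this option, assuming the built-in iris and mtcars datasets are the ones being bundled. The names on the left-hand side (IRIS, MTCARS) become the dataset names inside the object; no code is supplied here, so the object is expected to print as unverified.

```r
library(teal.data)

# Bundle two built-in datasets into one teal_data object.
# Argument names (IRIS, MTCARS) become the dataset names.
my_data <- teal_data(
  IRIS = iris,
  MTCARS = mtcars
)

# Printing shows the verification status and the stored bindings
print(my_data)
```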
Option 2: Adding code for reproducibility
One of the key features of teal.data is its ability to track the code used to generate data:
# Create teal_data with explicit code tracking
my_data <- teal_data()
# Add datasets with code
my_data <- within(my_data, {
  # Load and prepare iris data
  IRIS <- iris
  IRIS$Species <- as.factor(IRIS$Species)
  # Load and prepare mtcars data
  MTCARS <- mtcars
  MTCARS$cyl <- as.factor(MTCARS$cyl)
  MTCARS$gear <- as.factor(MTCARS$gear)
})
# View the tracked code
get_code(my_data) |>
  cat()
IRIS <- iris
IRIS$Species <- as.factor(IRIS$Species)
MTCARS <- mtcars
MTCARS$cyl <- as.factor(MTCARS$cyl)
MTCARS$gear <- as.factor(MTCARS$gear)
Option 3: Using cdisc_data() for clinical data
For clinical trial data following CDISC standards, teal.data provides the specialized cdisc_data() function that automatically handles common clinical data relationships.
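A minimal sketch of this option, assuming the ADSL and ADAE datasets come from the pharmaverseadam package (as in the rest of this topic). Passing upper-case CDISC names to the constructor is what lets cdisc_data() attach the default join keys.

```r
library(teal.data)
library(pharmaverseadam)

# Pass the datasets directly to the constructor, using CDISC dataset names
clinical_data <- cdisc_data(
  ADSL = pharmaverseadam::adsl,
  ADAE = pharmaverseadam::adae
)

# Printing shows the verification status and the stored bindings
print(clinical_data)
```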
✖ code unverified
<environment: 0x562e0f570a80> 🔒
Parent: <environment: package:pharmaverseadam>
Bindings:
- ADAE: [tbl_df]
- ADSL: [tbl_df]
Introduction to teal.data in the context of the CDISC data standard
The teal.data package serves as the foundation for all data operations in teal. It provides a structured way to:
- Store multiple datasets in a single object
- Track the code used to create or modify data
- Define relationships between datasets
When you create a teal_data object, teal does several things behind the scenes:
- Data storage - Your datasets are stored within the object
- Code tracking - The creation process is recorded for reproducibility
- Metadata creation - Information about datasets and their relationships is stored
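The three pieces listed above can each be inspected on a small object. This is a sketch only; demo_data is an illustrative name, and names() on a teal_data object is assumed to be available (recent teal.data versions).

```r
library(teal.data)

# Build a small object so that the creation code is tracked
demo_data <- within(teal_data(), {
  IRIS <- iris
})

names(demo_data)      # data storage: which datasets are held
get_code(demo_data)   # code tracking: how they were created
join_keys(demo_data)  # metadata: relationships (none defined here)
```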
Creating cdisc_data objects with explicit code
# More explicit approach with within()
library(teal.data)
library(teal)
library(pharmaverseadam)
clinical_data <- cdisc_data()
clinical_data <- within(clinical_data, {
  # Dataset names are upper case so that they match the default CDISC join keys
  ADSL <- pharmaverseadam::adsl
  ADAE <- pharmaverseadam::adae
})
# within() does not add join keys, so assign them explicitly
join_keys(clinical_data) <- teal.data::default_cdisc_join_keys[c("ADSL", "ADAE")]
# View the structure and join keys
print(clinical_data)
print(join_keys(clinical_data))
Introduction to the concept of keys in teal.data
join_keys define how datasets relate to each other, which is crucial for:
- Proper filtering across related datasets (filtering on the parent dataset also filters the matching entries in the child dataset)
- Correct data merging in modules (modules use the join keys to merge datasets together)
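To make the two points above concrete, here is a plain base R illustration of what "joining on a key" means. It does not use teal and is not how teal implements filtering or merging internally; the patients and visits tables and the patient_id column are made up for the example.

```r
# Parent table: one row per patient
patients <- data.frame(
  patient_id = c(1L, 2L, 3L),
  treatment = c("A", "B", "A")
)

# Child table: several rows per patient
visits <- data.frame(
  patient_id = c(1L, 1L, 2L, 3L),
  visit_num = c(1L, 2L, 1L, 1L)
)

# Merging: the key column tells us which rows belong together
merged <- merge(visits, patients, by = "patient_id")

# Filtering: a filter on the parent propagates to the child through the key
kept_patients <- patients[patients$treatment == "A", ]
kept_visits <- visits[visits$patient_id %in% kept_patients$patient_id, ]
```

In teal, the join keys carry exactly this "which column links which tables" information, so that the filter panel and the modules can do the equivalent work for you.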
Automatic join key detection with cdisc_data()
One of the key advantages of cdisc_data() is automatic join key detection:
A join_keys object containing foreign keys between 1 datasets:
ADSL: [STUDYID, USUBJID]
An empty join_keys object.
Notice the difference between the two printed outputs above. When a dataset is passed directly to the cdisc_data() constructor, the constructor adds the matching join_keys automatically. In the second example the dataset was added with within(), so the object did not receive join keys automatically and they have to be assigned explicitly.
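A sketch of code that would produce this kind of comparison, assuming ADSL from pharmaverseadam; the object names direct and deferred are illustrative and not taken from the workshop material.

```r
library(teal.data)
library(pharmaverseadam)

# Dataset passed directly to the constructor: default CDISC keys are attached
direct <- cdisc_data(ADSL = pharmaverseadam::adsl)
print(join_keys(direct))

# Dataset added afterwards with within(): the join keys stay empty
deferred <- within(cdisc_data(), {
  ADSL <- pharmaverseadam::adsl
})
print(join_keys(deferred))
```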
Understanding join key output
library(pharmaverseadam)
library(teal.data)
library(teal)
# Create example with multiple relationships
complex_data <- cdisc_data(
  ADSL = pharmaverseadam::adsl,
  ADAE = pharmaverseadam::adae,
  ADEX = pharmaverseadam::adex
)
join_keys(complex_data) <- teal.data::default_cdisc_join_keys[c("ADSL", "ADAE", "ADEX")]
# Understanding the output:
# - Each unindented line names a dataset and its primary keys in brackets
# - Each indented line shows a relationship to another dataset and the joining columns
# - `<--` lists a child of the dataset; `-->` points to the dataset's parent
# - ADSL is typically the parent (subject-level data)
# - `--*` marks an implicit join key, created automatically when
#   teal.data detects a possible join through a "middleman" (shared parent) dataset
# Print and interpret join keys
cat("Join Keys Structure:\n")
print(join_keys(complex_data))
Join Keys Structure:
A join_keys object containing foreign keys between 3 datasets:
ADSL: [STUDYID, USUBJID]
  <-- ADAE: [STUDYID, USUBJID]
  <-- ADEX: [STUDYID, USUBJID]
ADAE: [STUDYID, USUBJID, ASTDTM, AETERM, AESEQ]
  --> ADSL: [STUDYID, USUBJID]
  --* (implicit via parent with): ADEX
ADEX: [STUDYID, USUBJID, PARCAT1, PARAMCD, AVISITN, ASTDTM, EXSEQ]
  --> ADSL: [STUDYID, USUBJID]
  --* (implicit via parent with): ADAE
Creating a custom teal.data object with custom user-defined keys - code samples
Sometimes you need to create or modify join keys manually, especially when working with non-standard data structures or when you need custom relationships.
Manual join keys creation
# Create a teal_data object without automatic join detection
my_data <- teal_data(
  PATIENTS = data.frame(
    patient_id = 1:100,
    age = sample(18:80, 100, replace = TRUE),
    treatment = sample(c("A", "B"), 100, replace = TRUE)
  ),
  VISITS = data.frame(
    patient_id = rep(1:100, each = 4),
    visit_num = rep(1:4, 100),
    visit_date = seq.Date(as.Date("2024-01-01"), by = "week", length.out = 400),
    measurement = rnorm(400, 100, 15)
  ),
  EVENTS = data.frame(
    patient_id = sample(1:100, 200, replace = TRUE),
    event_type = sample(c("AE", "CM", "EX"), 200, replace = TRUE),
    event_date = sample(seq.Date(as.Date("2024-01-01"), as.Date("2024-12-31"), by = "day"), 200)
  )
)
# Manually define join keys
join_keys(my_data) <- join_keys(
  join_key("PATIENTS", "VISITS", "patient_id"),
  join_key("PATIENTS", "EVENTS", "patient_id")
)
# View the join keys
print("Manual join keys:")
print(join_keys(my_data))
[1] "Manual join keys:"
A join_keys object containing foreign keys between 3 datasets:
PATIENTS: [no primary keys]
  <-- VISITS: [patient_id]
  <-- EVENTS: [patient_id]
VISITS: [no primary keys]
  --> PATIENTS: [patient_id]
  --* (implicit via parent with): EVENTS
EVENTS: [no primary keys]
  --> PATIENTS: [patient_id]
  --* (implicit via parent with): VISITS
Overwriting automatic join keys in cdisc_data
Sometimes the automatic join key detection in cdisc_data() doesn't match your specific needs. Here's how to override the default behavior:
library(pharmaverseadam)
library(teal)
library(teal.data)
custom_keys <- join_keys(
  # Primary keys for each dataset
  join_key("ADSL", "ADSL", c("STUDYID", "USUBJID")),
  join_key("ADAE", "ADAE", c("STUDYID", "USUBJID", "AESEQ")),
  join_key("ADCM", "ADCM", c("STUDYID", "USUBJID", "CMSEQ")),
  # Relationships between datasets
  join_key("ADSL", "ADAE", c("STUDYID", "USUBJID")),
  join_key("ADSL", "ADCM", c("STUDYID", "USUBJID")),
  # Custom: Allow direct relationship between ADAE and ADCM
  # This might be useful for analyzing AEs and concomitant medications together
  join_key("ADAE", "ADCM", c("STUDYID", "USUBJID"))
)
custom_data <- cdisc_data(
  ADSL = pharmaverseadam::adsl,
  ADAE = pharmaverseadam::adae,
  ADCM = pharmaverseadam::adcm
)
# Override the automatic join keys
join_keys(custom_data) <- custom_keys
# View the custom join keys
cat("\nCustom join keys:\n")
print(join_keys(custom_data))
Custom join keys:
A join_keys object containing foreign keys between 3 datasets:
ADSL: [STUDYID, USUBJID]
  <-- ADAE: [STUDYID, USUBJID]
  <-> ADCM: [STUDYID, USUBJID]
ADAE: [STUDYID, USUBJID, AESEQ]
  --> ADSL: [STUDYID, USUBJID]
  <-- ADCM: [STUDYID, USUBJID]
ADCM: [STUDYID, USUBJID, CMSEQ]
  <-> ADSL: [STUDYID, USUBJID]
  --> ADAE: [STUDYID, USUBJID]
🛠️ Exercise
- load teal.data
- create a teal_data object named basic_data that bundles the built-in iris and mtcars datasets
- print the teal_data object and notice the verification status printed in the console
- try to verify that the code you provided to teal_data actually reproduces the data you stored inside the object
- inspect the resulting object with print() and get_code()
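One possible way to approach this exercise is sketched below. It assumes that the code argument of teal_data() and the verify() function from teal.data are used for the verification step; the code string is an assumption chosen to match the stored datasets.

```r
library(teal.data)

# Store the datasets together with the code that is claimed to create them
basic_data <- teal_data(
  IRIS = iris,
  MTCARS = mtcars,
  code = "IRIS <- iris
MTCARS <- mtcars"
)

# Not yet verified: the code has not been checked against the stored data
print(basic_data)

# verify() re-runs the code and compares the result with the stored objects;
# it raises an error if they differ
basic_data <- verify(basic_data)
print(basic_data)

# Inspect the tracked code
cat(get_code(basic_data))
```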
🛠️ Exercise
- use teal_data() together with within() (or an equivalent approach) to construct an object named tracked_data.
- make at least one transformation to each dataset (e.g., convert a column to a factor, create a derived variable).
- confirm that get_code(tracked_data) records the transformation steps.
🛠️ Exercise
- load the pharmaverseadam package and pull the datasets adsl, adae, and adtte (available in pharmaverseadam as adtte_onco).
- build a single teal_data object named adam_manual containing those three datasets.
- define custom join keys that mimic the automatic CDISC relationships:
  - ADSL -> ADAE on c("STUDYID", "USUBJID")
  - ADSL -> ADTTE on c("STUDYID", "USUBJID")
  - Optionally add subject-level self keys for each table.
- assign the custom keys with join_keys(adam_manual) <- ....
- confirm the structure by printing the join keys.
library(pharmaverseadam)
library(teal.data)
library(teal)
# Dataset names are upper case so that they match the names used in the join keys
adam_manual <- teal_data(ADSL = adsl, ADAE = adae, ADTTE = adtte_onco)
custom_join_keys <- join_keys(
  join_key("ADSL", "ADAE", c("STUDYID", "USUBJID")),
  join_key("ADSL", "ADTTE", c("STUDYID", "USUBJID"))
)
join_keys(adam_manual) <- custom_join_keys
print(join_keys(adam_manual))
🛠️ Exercise
- use one of the applications you defined in the exercises, or one of the example applications shown during the workshop
- debug any issues with join keys
- use the Show R Code button to verify that the application returns code you can use to reproduce the output
library(pharmaverseadam)
library(teal.data)
library(teal)
library(teal.modules.clinical)
adam_manual <- teal_data(ADSL = adsl, ADAE = adae, ADTTE = adtte_onco)
adam_manual <- within(adam_manual, {
  ADSL$ARM <- as.factor(ADSL$ARM)
  ADSL$ARMCD <- as.factor(ADSL$ARMCD)
})
custom_join_keys <- join_keys(
  join_key("ADSL", "ADAE", c("STUDYID", "USUBJID")),
  join_key("ADSL", "ADTTE", c("STUDYID", "USUBJID")),
  join_key("ADSL", "ADSL", c("STUDYID", "USUBJID"))
)
join_keys(adam_manual) <- custom_join_keys
print(join_keys(adam_manual))
app <- init(
  data = adam_manual,
  modules = modules(
    tm_t_events(
      label = "Adverse Event Table",
      dataname = "ADAE",
      arm_var = choices_selected(c("ARM", "ARMCD"), "ARM"),
      llt = choices_selected(
        choices = variable_choices("ADAE", c("AETERM", "AEDECOD")),
        selected = c("AEDECOD")
      ),
      hlt = choices_selected(
        choices = variable_choices("ADAE", c("AEBODSYS", "AESOC")),
        selected = "AEBODSYS"
      ),
      add_total = TRUE,
      event_type = "adverse event",
      sort_criteria = "alpha",
      pre_output = shiny::div("Who won the most at the poker last night?")
    )
  )
)
shinyApp(app$ui, app$server)
References
- teal documentation
- teal.modules.general documentation
- teal.modules.clinical documentation
- teal.data documentation