YoVDO

Building a Unified Cancer Immunotherapy Data Library

Offered By: Strange Loop Conference via YouTube

Tags

Strange Loop Conference Courses
Data Analysis Courses
Clojure Courses
Data Integrity Courses
Data Management Courses
ETL Pipelines Courses

Course Description

Overview

Explore the development of CANDEL (Cancer Data and Evidence Library), a database system built on Datomic that unifies cancer immunotherapy research data. Learn how this approach addresses the challenge of managing the rapidly growing volume of data in cancer research. Discover how CANDEL's expressive schema maps diverse representations into a common form and captures evolving concepts in molecular and cancer biology. Understand the configurable, data-driven ETL pipeline, written in Clojure, that simplifies data loading. See how data scientists can run Datalog queries directly from R analysis environments, and how Datomic's immutable history supports reproducible analysis in large-scale collaborations.

Gain insight into how CANDEL is applied to translational analysis at the Parker Institute for Cancer Immunotherapy, and into its potential to accelerate cures for cancer. Delve into the technical aspects of the system, including the metamodel, import validation, and query capabilities. Examine real-world applications, such as the PRINCE clinical trial in pancreatic cancer, and see how improved data management and analysis enable faster progress in cancer immunotherapy research.
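
To make the idea of immutable history and reproducible analysis more concrete, here is a minimal Python sketch. It is not Datomic's API and not CANDEL's schema: the `Datom` tuple, the `as_of` and `sample_ids_at` helpers, and the `sample/id` and `sample/timepoint` attribute names are all hypothetical, chosen only to illustrate how an append-only log of facts lets the same query return the same answer when pinned to an earlier point in time.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    """One fact in an append-only log (illustrative, not Datomic's real type)."""
    entity: int
    attribute: str
    value: Any
    tx: int       # transaction number; larger means later
    added: bool   # True = assertion, False = retraction


# Hypothetical log. Nothing is ever updated in place; corrections are appended.
LOG = [
    Datom(1, "sample/id", "S-001", 1, True),
    Datom(1, "sample/timepoint", "baseline", 1, True),
    Datom(2, "sample/id", "S-002", 2, True),
    Datom(2, "sample/timepoint", "baseline", 2, True),
    # Transaction 3 corrects sample 2: retract the old value, assert a new one.
    Datom(2, "sample/timepoint", "baseline", 3, False),
    Datom(2, "sample/timepoint", "week-4", 3, True),
]


def as_of(log, tx):
    """Return the (entity, attribute, value) facts that hold as of transaction tx."""
    facts = set()
    visible = (d for d in log if d.tx <= tx)
    # Within a transaction, apply retractions (added=False) before assertions.
    for d in sorted(visible, key=lambda d: (d.tx, d.added)):
        fact = (d.entity, d.attribute, d.value)
        if d.added:
            facts.add(fact)
        else:
            facts.discard(fact)
    return facts


def sample_ids_at(log, tx, timepoint):
    """Find sample ids with the given timepoint, as of transaction tx."""
    facts = as_of(log, tx)
    entities = {e for (e, a, v) in facts if a == "sample/timepoint" and v == timepoint}
    return sorted(v for (e, a, v) in facts if a == "sample/id" and e in entities)


if __name__ == "__main__":
    print(sample_ids_at(LOG, 2, "baseline"))  # view before the correction
    print(sample_ids_at(LOG, 3, "baseline"))  # view after the correction
```

Because the log is only ever appended to, an analysis that records the transaction number it ran against can be rerun later against exactly the same view of the data, even after corrections have been loaded.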

Syllabus

Intro
CANDEL: The CANcer Data & Evidence Library
The Parker Institute Network
History of Cancer Treatment
Types Of Immunotherapy
What does the data scientist's job look like?
Physically and conceptually fragmented data
Design requirements
Talk outline
Datomic: Getting the Data Right
Datomic: Getting More Realistic
Better Aspects of Both Worlds
Getting Data into Datomic
Metamodel (Schema Annotations as Data)
Config Structure Derived from Metamodel and Schema
Import Validation
pret Command Line Workflow
The CANDEL Schema
The PRINCE clinical trial in Pancreatic Cancer
Importing the PRINCE dataset into CANDEL with pret
Steps of a data import
The end result of data import
Datomic Query via R: Tool Support
Datomic Query in Raw EDN
Spec Parser for Datalog over Plain JSON
Query Comparison
Enabling downstream applications
Plotting and standard analyses enabled by CANDEL
Circulating Tumor DNA: Mutant KRAS
CANDEL enables faster progress in cancer immunotherapy research


Taught by

Strange Loop Conference
