YoVDO

Dat - An Open Source Tool for Sharing and Collaborating on Data

Offered By: JSConf via YouTube

Tags

JSConf Courses, Data Management Courses, Data Streaming Courses, Data Pipelines Courses

Course Description

Overview

Explore an impromptu presentation on Dat, an open-source tool for data sharing and collaboration, delivered at JSFest Oakland 2014. Discover how Dat aims to revolutionize data management in the same way Git transformed source control. Learn about reproducible science, data sharing challenges, and the analogy between current data practices and pre-Git source control methods. Dive into practical examples using npm data, and explore concepts like data pipelines, dependency management, and data streaming. Gain insights into tools such as gasket for cross-platform pipeline management and datscript for experimental pipeline configuration. Understand advanced features including branches, checkout functionality, multi-master replication, database synchronization, and registry capabilities.

Syllabus

Intro
Max Ogden @maxogden
dat is an open source tool for sharing and collaborating on data
we are grant funded and 100% open source
reproducible science
analogy time: let's talk about source control
life before git
i want to fix a bug in cool-project
1. somehow get a zip of cool-project 2. unpack and edit a file 3. email the file back
claim: currently data sharing is a mess
email csv files
we want to do for data what git did for source code
a data set we can all relate to: npm
calculate how big npm is using dat
transform the npm data using bulk-markdown-to-png
bionode bioinformatics tools on npm
data pipelines, dependency management, data streaming
gasket is a cross platform pipeline manager
datscript is an experimental pipeline config language
branches, dat checkout 3b2d98V3, multi master replication, sync to databases, registry


Taught by

JSConf

Related Courses

Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Data Analysis with Python
IBM via Coursera
Intro to TensorFlow 日本語版
Google Cloud via Coursera
TensorFlow on Google Cloud - Français
Google Cloud via Coursera
Freedom of Data with SAP Data Hub
SAP Learning