How Companies Are Using Tachyon, a Memory Centric Distributed Storage
Offered By: Open Data Science via YouTube
Course Description
Overview
Explore the Tachyon distributed storage system in this ODSC West 2015 conference talk by Haoyuan Li. Learn how memory-centric storage addresses bottlenecks in big data processing and enables reliable, memory-speed file sharing across cluster frameworks such as Apache Spark, MapReduce, and Flink. Discover Tachyon's key features, including Hadoop compatibility, fault tolerance, and its role as the default off-heap storage option in Spark. Gain insight into real-world use cases from companies running Tachyon in production. The talk covers star use cases, SAS and Spark implementations, SSD integration, new features, common misconceptions, configuration options, policies, transparent naming, and the unified namespace. Understand how Tachyon fits into the Berkeley Data Analytics Stack and how widely it has been adopted across institutions. The talk concludes with information on how to get involved in this open-source project.
Syllabus
Introduction
Star Use Case
SAS Use Case
Spark Use Case
SSD Use Case
New Features
Common Misconceptions
Configuration Options
Policies
Transparent Naming
Unified Namespace
Additional Features
How to get involved
Taught by
Open Data Science
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera