Data Management Tools on Databricks
Offered By: Pluralsight
Course Description
Overview
This course will teach you some of the fundamental techniques to store, manage, and process data using the Databricks platform.
Data is at the heart of Databricks, and managing it well is a crucial skill for any user of the platform. In this course, Data Management Tools on Databricks, you'll learn to load, configure, and access data using the UI, the dbutils library, and a Spark application. First, you'll explore the Databricks File System (DBFS), how it is implemented as a layer above object storage, and how it can be accessed using the Databricks web UI and the Databricks API. You'll also look at the dbutils library, from its use in file system operations to setting up widgets in a notebook. Next, you'll delve into the management of structured data in Databricks by creating and using managed (Delta) tables and external tables, seeing the features available for each, how they are similar, and where they differ. Finally, you'll turn to consuming and analyzing data from a Spark application built in a notebook, and get a glimpse of the metrics and graphs available for tracking executions and resources within Databricks. When you are finished with this course, you'll have the knowledge and skills in data management and processing on Databricks to store and access data securely and efficiently on this platform.
Syllabus
- Course Overview (2 mins)
- Working with the Databricks File System (41 mins)
- Creating and Managing Databases and Tables (23 mins)
- Processing Data with Apache Spark (18 mins)
Taught by
Kishan Iyer
Related Courses
- Coding the Matrix: Linear Algebra through Computer Science Applications (Brown University via Coursera)
- كيف تفكر الآلات - مقدمة في تقنيات الحوسبة (King Fahd University of Petroleum and Minerals via Rwaq (رواق))
- Datascience et Analyse situationnelle : dans les coulisses du Big Data (IONIS via IONIS)
- Data Lakes for Big Data (EdCast)
- 統計学Ⅰ:データ分析の基礎 (ga014) (University of Tokyo via gacco)