Databricks Data Science and Engineering: Basic Tools and Infrastructure
Offered By: Pluralsight
Course Description
Overview
This course will teach you how to make the best use of Databricks assets such as notebooks, clusters, and repos to simplify the development and management of your big data applications.
Building robust, high-performing big data applications requires a well-configured environment. On the Databricks platform, this means setting up assets such as clusters, notebooks, and repos so that you get the most out of the platform and keep development and analysis work as smooth as possible. In this course, Databricks Data Science and Engineering: Basic Tools and Infrastructure, you'll explore exactly how this can be accomplished. First, you'll create and then make use of clusters, tables, files, and notebooks, and see how all of these can be combined to build and run a simple application. Next, you'll move on to Databricks repos, which record changes to notebooks and related files in a workspace and can be linked with an external Git repository; you'll delve into how this linking is performed and how file additions, modifications, and removals are made and viewed in a repo. Finally, you'll turn to jobs, which represent the execution of a task on Databricks: how job executions can be configured and scheduled, and how notifications can be sent at various stages of a job. When you're finished with this course, you'll have the skills and knowledge of Databricks resources such as clusters, notebooks, repos, and jobs, and of their configuration, to create a Databricks environment that is optimized for building and running applications and helps you get the most out of your data.
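To give a sense of what the job-configuration topic in the final module involves, the sketch below creates a scheduled job with a failure notification through the Databricks Jobs REST API. It is a minimal illustration only; the workspace URL, token, notebook path, cluster sizing, and email address are all hypothetical placeholders, and the course itself walks through these settings in the Databricks UI.

```python
import requests

# Hypothetical workspace URL and access token; substitute your own values.
WORKSPACE_URL = "https://example-workspace.cloud.databricks.com"
TOKEN = "dapiXXXXXXXXXXXX"

# Minimal job definition: one notebook task on a small new cluster,
# run on a daily schedule, with an email notification on failure.
job_spec = {
    "name": "nightly-etl-example",
    "tasks": [
        {
            "task_key": "run_etl_notebook",
            "notebook_task": {"notebook_path": "/Repos/demo/etl/main"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # every day at 02:00
        "timezone_id": "UTC",
    },
    "email_notifications": {"on_failure": ["data-team@example.com"]},
}

# Create the job via the Jobs API (2.1) and print the resulting job ID.
response = requests.post(
    f"{WORKSPACE_URL}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
response.raise_for_status()
print("Created job with ID:", response.json()["job_id"])
```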
Syllabus
- Course Overview (2 mins)
- Managing Databricks Workspace Assets (41 mins)
- Developing Applications Using Notebooks (38 mins)
- Configuring and Managing Job Executions (25 mins)
Taught by
Kishan Iyer
Related Courses
- Data Processing with Azure (LearnQuest via Coursera)
- Mejores prácticas para el procesamiento de datos en Big Data (Coursera Project Network via Coursera)
- Data Science with Databricks for Data Analysts (Databricks via Coursera)
- Azure Data Engineer con Databricks y Azure Data Factory (Coursera Project Network via Coursera)
- Curso Completo de Spark con Databricks (Big Data) (Coursera Project Network via Coursera)