YoVDO

Power of Electric Snakes! PySpark for Big Data and Data Science

Offered By: PASS Data Community Summit via YouTube

Tags

PASS Data Community Summit Courses, Data Science Courses, Python Courses, SQL Courses, Data Transformation Courses, Big Data Analytics Courses, Logistic Regression Courses, RDDs Courses

Course Description

Overview

Explore PySpark for Big Data and Data Science in this 57-minute conference talk from PASS Data Community Summit. The talk covers Big Data analytics with Spark and Python, including essential tasks such as creating RDDs and DataFrames, transforming columns, and generating aggregations. It explains the differences between RDDs and DataFrames, introduces aggregation by key, and shows how to replace unknown fields and create dummy variables. It also covers SQL statements, column selection, and for loops in PySpark. The machine learning portion uses Spark ML for classification, transformation, and evaluation, compares logistic regression results, and shows how sparse vectors give an efficient data representation. Whether you are new to Big Data or looking to expand your skill set, the talk serves as an introduction to PySpark's capabilities in data science and analytics.

Syllabus

Introduction
Data
RDD vs DataFrame
What is RDD
Aggregate by Key
Describe
Replacing unknown fields
Dummy variables
For loops
SQL statements
Column selection statements
Putting it all together
Labeling
Vectors
Sparse Vector
Spark ML
Classifier
Transform
Evaluation
Logistic Regression
Comparing Results


Taught by

PASS Data Community Summit

Related Courses

Doing More with Less - The Challenges Ahead for Every Data Professional
PASS Data Community Summit via YouTube
Build a Modern Data Strategy and Put Your Data to Work
PASS Data Community Summit via YouTube
Transform Your Data Estate
PASS Data Community Summit via YouTube
Azure SQL and SQL Server 2022 - Intelligent Database Futures
PASS Data Community Summit via YouTube
SQL Server in Azure Virtual Machines Reimagined
PASS Data Community Summit via YouTube