High Volume PDF Text Extraction Using Python Open-Source Tools

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses, Big Data Courses, Python Courses, Unstructured Data Courses

Course Description

Overview

Explore high-volume PDF text extraction techniques using Python open-source tools in this EuroPython 2023 conference talk. Learn about the importance of extracting information from large volumes of PDF documents for corporate decision-making and long-term forecasting. Discover how to tackle the challenges of processing unstructured data and integrating OCR capabilities. Gain insights into achieving top-tier performance and maximum extraction detail using an open-source toolset designed for Big Data scenarios. Understand the "need for speed" in text extraction and how to effectively recreate structured information from millions of pages of documents.

Syllabus

High Volume PDF Text Extraction using Python Open-Source Tools — Harald Lieder


Taught by

EuroPython Conference

Related Courses

Recolección y exploración de datos
Tecnológico de Monterrey via Coursera
Applied Machine Learning
Microsoft via edX
Creating an Analytical Dataset
Udacity
NoSQL Database Systems
Arizona State University via Coursera
Foundations of mining non-structured medical data
EIT Digital via Coursera