Ingesting 35 Million Hotel Images with Python in the Cloud
Offered By: EuroPython Conference via YouTube
Course Description
Overview
Learn how Skyscanner built a distributed Python architecture to process and manage 35 million hotel images in the cloud. Explore the challenges and solutions involved in creating an incremental image processing pipeline that discards poor-quality and duplicate images while optimizing for mobile devices. Discover the tools and techniques used, including Pillow, ImageHash, Kombu, and Boto, to handle tasks such as ingesting partner-provided images, detecting and removing poor-quality images and duplicates, resizing images, and scaling to finish within time constraints. Gain insights into the technical stack, triggering mechanisms, downloading processes, fingerprinting techniques, and methods for choosing the best images in this conference talk from EuroPython 2016.
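To make the fingerprinting and duplicate-removal steps concrete, here is a minimal, dependency-free sketch of an average-hash style fingerprint compared by Hamming distance. It is not the talk's implementation: the listing names Pillow and ImageHash as the tools used, and does not state which hash they chose. The function names, the `max_distance` threshold, and the tiny 2x2 sample grids are all assumptions for illustration; a real pipeline would first load each image and downscale it to a small grayscale grid.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: each image is already reduced to a small grid of
# 0-255 grayscale values (a real pipeline would do this with Pillow).
Pixels = Sequence[Sequence[int]]


def average_hash(pixels: Pixels) -> int:
    """Build a bit fingerprint: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def deduplicate(images: Dict[str, Pixels], max_distance: int = 1) -> List[str]:
    """Keep the first image of every group of near-identical fingerprints."""
    kept: List[Tuple[str, int]] = []
    for name, pixels in images.items():
        fingerprint = average_hash(pixels)
        # An image is new only if it is far from every fingerprint kept so far.
        if all(hamming_distance(fingerprint, seen) > max_distance
               for _, seen in kept):
            kept.append((name, fingerprint))
    return [name for name, _ in kept]


# Hypothetical sample data: "brighter" is "original" with every pixel
# lightened slightly, "different" has the opposite light/dark pattern.
original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]
different = [[200, 10], [30, 220]]

unique = deduplicate({"a": original, "b": brighter, "c": different})
```

Because the hash only records whether each pixel is above the mean, a uniform brightness change leaves the fingerprint unchanged, so `b` is dropped as a duplicate of `a` while `c` is kept. Comparing every new fingerprint against all kept ones is quadratic, so this sketch shows the idea only and would need an index or bucketing to approach the scale described in the talk.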
Syllabus
Intro
Processing 35 million images
Tech stack
Triggering
Downloading images
Fingerprinting
Duplication
Duplicator
Guarantee
Choosing the best images
Picking the best image
Resize images
Final result
Taught by
EuroPython Conference
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Digital Signal Processing
École Polytechnique Fédérale de Lausanne via Coursera
Creative, Serious and Playful Science of Android Apps
University of Illinois at Urbana-Champaign via Coursera