What Can We Learn from 750 Billion GitHub Events and 42 TB of Code
Offered By: Devoxx via YouTube
Course Description
Overview
Explore the vast world of GitHub data in this 42-minute Devoxx conference talk. Dive into an analysis of 750 billion GitHub events and 42 TB of code to gain valuable insights into software development trends, open-source community dynamics, and effective project management strategies. Learn how to leverage this massive dataset to guide project design decisions, measure community health, and understand coding patterns over time. Discover techniques for running static code analysis at scale, evaluating the impact of social media on project popularity, and identifying the most effective ways to request changes. Gain a deeper understanding of your project's audience by examining who starred it and their other interests. Through live on-stage analysis, uncover fascinating insights about coding preferences, engagement patterns, and geographical distribution of contributions. Whether you're a developer, project manager, or data enthusiast, this talk offers a unique perspective on the collaborative nature of software development and the power of big data analysis in the open-source ecosystem.
Syllabus
Intro
Who wants to analyze GitHub
GitHub Stars
Not all projects are equal
What else are they interested in
Thank you and stars
Engagement
Text analysis
Size and countries
Top projects by country
Looking at code
Prototool
Java
Requesting features
Code analysis numerically
Conclusion
Taught by
Devoxx
Related Courses
DCO042 - Python For Informatics
University of Michigan via Independent
Corpus Linguistics: Method, Analysis, Interpretation
Lancaster University via FutureLearn
日本中世の自由と平等 (ga001)
University of Tokyo via gacco
"A Study in Scarlet" by Doyle: BerkeleyX Book Club
University of California, Berkeley via edX
"A Room with a View" by Forster: BerkeleyX Book Club
University of California, Berkeley via edX