get_gh_commits.py
Je Sian Keith Herman August 05, 2023 #Python #Scripting #Data Extraction
A Python script for exporting a GitHub user's daily commit counts to a CSV file.
# requirements.txt
# Install with `pip install requests beautifulsoup4 pandas`
requests
beautifulsoup4
pandas
"""
Get GitHub commit data for a given username by scraping the GitHub
contribution calendar heatmap for each year of activity.

This function returns a list of dictionaries with the following keys:
- date: The date of the commit
- commits: The number of commits on that date

Parameters
----------
github_username : str
    The GitHub username to get commit data for.

Returns
-------
list
    A list of dictionaries with the commit data.
"""
# Import the requests and BeautifulSoup libraries
# Parse the HTML and save it to a BeautifulSoup object
# Select the "year" links
# Get the page links for each year
# Get the commit data for each year
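The first steps (fetch the profile page, parse it, select the "year" links, and build a page link for each year) might look roughly like the sketch below. The function names `get_year_urls` and `absolute_year_urls`, the base URL constant, and the `a.js-year-link` CSS selector are assumptions made for illustration; they are not taken from the original script, and the selector may not match GitHub's current markup.

```python
from urllib.parse import urljoin

# Assumption: profile pages live directly under this base URL.
GITHUB_BASE = "https://github.com"


def absolute_year_urls(hrefs, base=GITHUB_BASE):
    """Turn the (possibly relative) hrefs of the "year" links into absolute URLs."""
    return [urljoin(base, href) for href in hrefs]


def get_year_urls(github_username):
    """Fetch a profile page and return one URL per year of activity."""
    # Imported inside the function, as the comments in the script suggest.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(f"{GITHUB_BASE}/{github_username}", timeout=30)
    response.raise_for_status()

    # Parse the HTML and save it to a BeautifulSoup object
    soup = BeautifulSoup(response.text, "html.parser")

    # Assumption: the "year" links carry the class "js-year-link".
    year_links = soup.select("a.js-year-link")

    return absolute_year_urls(
        link["href"] for link in year_links if link.has_attr("href")
    )
```

Keeping the URL handling in its own small function means it can be checked without any network access, for example by passing it a hand-written list of hrefs.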
# Get the page HTML for a given year
# Get all the "rect" elements in the calendar heatmap
# Get the commit data for each day
# Check if the rect element has "data-date" and "data-count" attributes
# Check if the date is in the past
# Append the commit data to the list of commits
# Return the list of commit dictionaries
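The per-day checks described above could be sketched as follows. This is only an illustration under stated assumptions: `day_record` and `commits_from_html` are hypothetical names, the dates are assumed to be in ISO format (`YYYY-MM-DD`), and "in the past" is interpreted here as "not later than today".

```python
from datetime import date


def day_record(attrs, today=None):
    """Build one {"date", "commits"} dictionary from a rect's attributes.

    Returns None when an attribute is missing or the date lies in the future.
    """
    today = today or date.today()

    # Check that the rect element has both attributes
    if "data-date" not in attrs or "data-count" not in attrs:
        return None

    # Assumption: dates are ISO formatted; skip days that have not happened yet
    day = date.fromisoformat(attrs["data-date"])
    if day > today:
        return None

    return {"date": attrs["data-date"], "commits": int(attrs["data-count"])}


def commits_from_html(html, today=None):
    """Collect the daily records from the HTML of one year's page."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    commits = []
    # Get all the "rect" elements and keep the ones that pass the checks
    for rect in soup.find_all("rect"):
        record = day_record(rect.attrs, today)
        if record is not None:
            commits.append(record)
    return commits
```

Passing `today` explicitly makes the date check reproducible; leaving it out falls back to the current date.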
# Get the GitHub username from the user
# Get the commit data and save it to a pandas DataFrame
# Save the commit data to a CSV file
# Check if the script is being run directly and run the main function
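A minimal sketch of the main routine outlined above is shown below. The names `csv_filename`, `save_commits`, and `main`, the `<username>_commits.csv` naming scheme, and the sorting step are assumptions for illustration. The `get_commits` parameter stands in for the scraping function described in the docstring, so the sketch does not depend on how that function is implemented.

```python
def csv_filename(github_username):
    """Hypothetical naming scheme for the output file."""
    return f"{github_username}_commits.csv"


def save_commits(commits, csv_path):
    """Load the list of dictionaries into a DataFrame and write it to CSV."""
    import pandas as pd

    df = pd.DataFrame(commits, columns=["date", "commits"])
    # Parse the dates so the rows can be sorted chronologically
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values("date").reset_index(drop=True)
    df.to_csv(csv_path, index=False)
    return df


def main(get_commits):
    """Ask for a username, collect the commit data, and save it."""
    # Get the GitHub username from the user
    username = input("GitHub username: ").strip()

    # Get the commit data and save it to a CSV file via a pandas DataFrame
    commits = get_commits(username)
    path = csv_filename(username)
    save_commits(commits, path)
    print(f"Wrote {len(commits)} rows to {path}")
```

In a script, `main` would typically be called under an `if __name__ == "__main__":` guard so that importing the module does not prompt for input.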