A Python script that automatically gathers metadata for all repositories in a GitHub organization and exports it to a color-coded Excel spreadsheet for easy viewing and analysis.
- Fetches all repositories in an organization
- Collects key details:
  - Repo visibility, name, and description
  - Date created and last updated
  - Creator and top 4 contributors (Unknown creator means the repository was either transferred or forked; None (<GitHub Username>) means no full name is attached to that GitHub account)
  - Number of stars
  - README, license, .gitignore, CITATION.cff, and package requirements (requirements.txt, environment.yaml, etc.) presence
  - Website Reference, Dataset, Paper Associated, and DOI for GitHub Repo presence
  - Number of branches
- Exports everything to an Excel file (<org>_repo_info.xlsx)
- Highlights “No” fields with red cell colors
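
The export and highlighting steps above could look roughly like the sketch below. It is not taken from export_repos.py: the helper names, the column names in the usage note, and the red shade `FFC7CE` are all assumptions for illustration, and it assumes openpyxl is the Excel library in use.

```python
from typing import Dict, List, Tuple

# Assumed shade of red; the colour the real script uses is not stated in this README.
RED_FILL_HEX = "FFC7CE"


def build_rows(repos: List[Dict[str, str]], columns: List[str]) -> List[List[str]]:
    """Return a header row followed by one row per repository."""
    return [columns] + [[repo.get(col, "") for col in columns] for repo in repos]


def cells_to_highlight(rows: List[List[str]]) -> List[Tuple[int, int]]:
    """Return 1-based (row, column) positions of data cells whose value is "No"."""
    return [
        (r, c)
        for r, row in enumerate(rows[1:], start=2)  # row 1 is the header
        for c, value in enumerate(row, start=1)
        if value == "No"
    ]


def write_workbook(rows: List[List[str]], path: str) -> None:
    """Write the rows to an .xlsx file and fill every "No" cell with red."""
    # Imported here so the helpers above can be used without openpyxl installed.
    from openpyxl import Workbook
    from openpyxl.styles import PatternFill

    fill = PatternFill(start_color=RED_FILL_HEX, end_color=RED_FILL_HEX, fill_type="solid")
    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)
    for r, c in cells_to_highlight(rows):
        sheet.cell(row=r, column=c).fill = fill
    workbook.save(path)
```

For example, `write_workbook(build_rows(repos, ["Name", "README", "License"]), "myorg_repo_info.xlsx")` would write one row per repository, where `repos` is a list of dictionaries with "Yes"/"No" values and `myorg` is a placeholder organization name.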
- Clone this repository:
      git clone https://github.com/Imageomics/repo-exporter.git
      cd repo-exporter
- Install Python dependencies:
      pip install -r requirements.txt
- Run the script:
      python export_repos.py
- Enter your GitHub Personal Access Token
To create a token with permissions for both private and public repositories (read-only access to public repositories is enabled by default, without administrator approval):
- Go to github.com/settings/personal-access-tokens
- Click Generate new token → Fine-grained token
- Under Resource owner, select the organization you want to access.
- Under Repository access, choose All repositories.
- Under Permissions, select Repositories and set:
  - Metadata -> Read-only
  - Contents -> Read-only
  - Administration -> Read-only
- Click Generate token and copy it (make sure to store it somewhere safe for future use).
Note: The token must be approved by the organization administrator before it can access private repositories.
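
One way to check that an approved token can see private repositories is to list the organization's repositories through the GitHub REST API and count the private ones. The sketch below uses only the standard library and is not taken from export_repos.py; the function names are hypothetical, and it assumes the standard `GET /orgs/{org}/repos` endpoint with `Link`-header pagination.

```python
import json
import re
import urllib.request
from typing import Dict, List, Optional

API_ROOT = "https://api.github.com"


def auth_headers(token: str) -> Dict[str, str]:
    """Build request headers for a personal access token."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from a Link header, or None on the last page."""
    if not link_header:
        return None
    match = re.search(r'<([^>]+)>;\s*rel="next"', link_header)
    return match.group(1) if match else None


def list_org_repos(org: str, token: str) -> List[dict]:
    """Fetch every repository in the organization, following pagination."""
    url: Optional[str] = f"{API_ROOT}/orgs/{org}/repos?type=all&per_page=100"
    repos: List[dict] = []
    while url:
        request = urllib.request.Request(url, headers=auth_headers(token))
        with urllib.request.urlopen(request) as response:
            repos.extend(json.load(response))
            url = next_page_url(response.headers.get("Link"))
    return repos


def count_private(repos: List[dict]) -> int:
    """Count repositories flagged as private in the API response."""
    return sum(1 for repo in repos if repo.get("private"))
```

Calling `count_private(list_org_repos("my-org", token))`, with `my-org` as a placeholder and `token` read from a prompt such as `getpass.getpass()`, should return a non-zero number once the token is approved, provided the organization has private repositories. A result of zero may mean the approval is still pending.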