A self-hosted, AI-powered job tracking system that automatically scrapes GitHub job lists, filters opportunities by your eligibility, monitors application status via Gmail, and syncs everything to Google Sheets.
- GitHub Job Scraping — Automatically monitors GitHub job list repositories (markdown tables with job links)
- AI-Powered Extraction — Uses GPT-4o-mini or Gemini to extract job requirements from any job page format
- Smart Filtering — Filters jobs by class standing, graduation date, work authorization, and season/year
- Fit Scoring — Scores jobs 0-100 based on skills, location, salary, and company preferences
- Gmail Status Tracking — Detects application confirmations, OAs, interviews, offers, and rejections
- Google Sheets UI — All data in a familiar spreadsheet interface with status colors
- Discord Notifications — Instant alerts when dream company jobs are found
- Privacy First — Runs entirely on your machine, no data sent to third parties
Two separate services, no conflicts:
- Job Scraper Service: creates new job rows (scrape_jobs.py)
- Job Update Service: updates existing rows with status changes (check_gmail.py)
- Jobs are automatically scraped from GitHub lists, filtered by your eligibility, and added to Google Sheets with extracted details like salary, location, and fit scores.
- Get instant Discord notifications on your phone when dream company jobs are added to your sheet.
- As you apply and receive responses, the app parses your Gmail to detect confirmations, OA invites, interviews, and offers—updating your sheet automatically.
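The first step above, reading job rows out of a GitHub markdown table, can be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the function name `extract_job_rows`, the assumption that the company sits in the first column, and the two supported link styles are all hypothetical.

```python
import re

# Matches a markdown link [text](url) or an HTML href="url".
LINK_RE = re.compile(r'\[[^\]]*\]\((https?://[^)\s]+)\)|href="(https?://[^"]+)"')
# Matches the |---|---| separator row under a table header.
SEPARATOR_RE = re.compile(r'^\|?\s*:?-{3,}')


def extract_job_rows(markdown: str) -> list[dict]:
    """Return one {"company", "url"} dict per table row that contains a link.

    Assumption: the company name is in the first column of the table.
    """
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|") or SEPARATOR_RE.match(line):
            continue  # not a table row, or the header separator
        match = LINK_RE.search(line)
        if match is None:
            continue  # header row, or a row without a job link
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append({"company": cells[0], "url": match.group(1) or match.group(2)})
    return rows
```

Rows without a link (including the header row) are skipped, so the function can be run on a whole README rather than on a pre-trimmed table.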
- Python 3.10+
- Git
- Google Cloud Project with:
  - Google Sheets API enabled
  - Gmail API enabled
  - OAuth 2.0 credentials (Desktop app)
- AI API Key (one of):
git clone https://github.com/yourusername/apply-potato.git
cd apply-potato
# Create virtual environment
python -m venv venv
# Activate (Windows)
venv\Scripts\activate
# Activate (macOS/Linux)
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install Playwright browser
playwright install chromium
- Go to Google Cloud Console
- Create a new project (or use existing)
- Enable Google Sheets API and Gmail API
- Go to APIs & Services → Credentials
- Create OAuth 2.0 Client ID (Desktop app)
- Download the JSON file
- Save it as auth/credentials.json
- Create a new Google Sheet
- Copy the Sheet ID from the URL:
https://docs.google.com/spreadsheets/d/{THIS_IS_THE_SHEET_ID}/edit
The header row will be created automatically on first run.
# Copy example config
cp .env.example .env
# Edit .env with your values
Required settings:
# API Keys
OPENAI_API_KEY=sk-... # or GEMINI_API_KEY
AI_PROVIDER=openai # or gemini
# Google
GOOGLE_CREDENTIALS_PATH=./auth/credentials.json
GOOGLE_SHEET_ID=your_sheet_id_here
# Your Profile
USER_NAME=Your Name
USER_EMAIL=your.email@gmail.com
USER_CLASS_STANDING=Junior # Freshman/Sophomore/Junior/Senior (blank if graduated)
USER_GRADUATION_DATE=May 2028
USER_WORK_AUTHORIZATION=US Citizen
USER_TARGET_JOB_TYPE=Internship
USER_TARGET_SEASON_YEAR=Summer 2026
python setup_wizard.py
This will:
- Verify your configuration
- Test Google OAuth (opens browser for authentication)
- Validate API keys
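As a rough idea of what verifying the configuration could involve, the sketch below reports which of the required settings listed earlier are missing. It is not the wizard's actual code: `missing_settings`, `REQUIRED_KEYS`, and `PROVIDER_KEYS` are hypothetical names, and treating `USER_CLASS_STANDING` as optional follows the "blank if graduated" note.

```python
import os

# Settings from the "Required settings" example that must be non-empty.
# USER_CLASS_STANDING is left out because it may be blank for graduates.
REQUIRED_KEYS = [
    "GOOGLE_CREDENTIALS_PATH",
    "GOOGLE_SHEET_ID",
    "USER_NAME",
    "USER_EMAIL",
    "USER_GRADUATION_DATE",
    "USER_WORK_AUTHORIZATION",
    "USER_TARGET_JOB_TYPE",
    "USER_TARGET_SEASON_YEAR",
]

# Which API key each AI_PROVIDER value needs.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def missing_settings(env) -> list[str]:
    """Return the names of settings that are absent or blank in `env`."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    provider = env.get("AI_PROVIDER", "").strip().lower()
    if provider not in PROVIDER_KEYS:
        missing.append("AI_PROVIDER")
    elif not env.get(PROVIDER_KEYS[provider], "").strip():
        missing.append(PROVIDER_KEYS[provider])
    return missing


if __name__ == "__main__":
    # Assumes the .env values have already been loaded into the environment.
    print(missing_settings(os.environ))
```

Passing a plain mapping instead of reading `os.environ` directly keeps the check easy to test with a dictionary.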
# Run once (test) - processes 5 NEW jobs (skips duplicates/previously filtered)
python scrape_jobs.py --limit 5
# Run on schedule (every 30 min)
python scrape_jobs.py --scheduled
# Clear filtered cache (use after changing your profile - graduation date, class standing, etc.)
python scrape_jobs.py --clear-filtered
# Run once (test)
python check_gmail.py
# Run on schedule (every 10 min)
python check_gmail.py --scheduled
All settings are in .env. Key sections:
USER_CLASS_STANDING=Junior # Your current class standing
USER_GRADUATION_DATE=May 2028 # Expected graduation
USER_MAJOR=Computer Science # Your major(s), comma-separated
USER_GPA=3.7 # Your GPA
USER_WORK_AUTHORIZATION=US Citizen # Work auth status
USER_TARGET_JOB_TYPE=Internship # Internship, Full-Time, or Both
USER_TARGET_SEASON_YEAR=Summer 2026 # Target start date
USER_PREFERRED_LOCATIONS=NYC,SF,Remote
USER_SKILLS=Python,Java,React,SQL
USER_TARGET_COMPANIES=Google,Meta,Apple # Dream companies (for notifications)
Hard Filters (binary pass/fail):
- Class standing requirement
- Graduation timeline
- Season/year match
- Work authorization
Soft Scoring (0-100 points):
- Company + job category match: 30 pts
- Skills match: 20 pts
- Major match: 20 pts
- Location preference: 10 pts
- Salary match: 10 pts
- GPA match: 10 pts
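The two stages above can be sketched as a pass/fail gate followed by a weighted sum. The weights come from the list above; everything else is an assumption for illustration, including the names `passes_hard_filters` and `fit_score` and the idea of giving partial credit as a fraction between 0 and 1. The project's own scoring module may work differently.

```python
# Point weights taken from the soft-scoring list; they add up to 100.
WEIGHTS = {
    "company_category": 30,
    "skills": 20,
    "major": 20,
    "location": 10,
    "salary": 10,
    "gpa": 10,
}


def passes_hard_filters(checks: dict[str, bool]) -> bool:
    """A job is eligible only if every hard check passed."""
    return all(checks.values())


def fit_score(matches: dict[str, float]) -> int:
    """Sum each weight times its match fraction (clamped to 0..1).

    Partial credit is an assumption of this sketch; categories that are
    missing from `matches` contribute nothing.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        fraction = min(max(matches.get(name, 0.0), 0.0), 1.0)
        total += weight * fraction
    return round(total)
```

For example, a job that matches on company and location and on half of the listed skills would score 30 + 10 + 10 = 50 under these assumptions.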
# GitHub repos to scrape (comma-separated)
# Format: owner/repo@branch (branch defaults to "main" if not specified)
GITHUB_REPOS=owner/repo@branch,another-owner/another-repo@branch
SCRAPE_INTERVAL_MINUTES=30 # How often to check for new jobs
GMAIL_CHECK_INTERVAL_MINUTES=10 # How often to check email
JOB_AGE_LIMIT_DAYS=7 # Ignore jobs older than this
Get instant alerts when jobs from your dream companies are found.
- Create a Discord server (or use existing)
- Go to Server Settings → Integrations → Webhooks
- Create a webhook, copy the URL
- Add to .env:
DISCORD_ENABLED=true
DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/xxx/yyy
DREAM_COMPANY_MATCH_THRESHOLD=80 # Fuzzy match sensitivity (0-100)
- New job alerts: when a job from Google, Meta, Apple, etc. is found
- Status updates: when an application moves to OA, interview, offer, or rejection
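To give a feel for what a 0-100 fuzzy match threshold means, here is a small sketch using the standard library's `difflib`. The README does not say which matching library or algorithm the project uses, so this is an assumed stand-in, and the helper names `match_score` and `is_dream_company` are hypothetical.

```python
from difflib import SequenceMatcher


def match_score(a: str, b: str) -> float:
    """Similarity of two names on a 0-100 scale, ignoring case and outer spaces."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_dream_company(company: str, targets: list[str], threshold: float = 80) -> bool:
    """True if `company` is close enough to any name in `targets`."""
    return any(match_score(company, target) >= threshold for target in targets)
```

With this measure, a higher threshold demands a closer spelling match: a one-letter typo such as "Gogle" still clears 80 against "Google", while an unrelated name does not.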
Install as a background service that starts automatically:
# Install services (Windows: WinSW, macOS: Launch Agent)
python install_service.py
# Check status
python install_service.py --status
# Uninstall
python install_service.py --uninstall
apply-potato/
├── scrape_jobs.py # Main job scraping script
├── check_gmail.py # Gmail monitoring script
├── setup_wizard.py # First-time setup
├── install_service.py # Service installer
├── src/ # Source modules
│ ├── config.py # Configuration loader
│ ├── github_parser.py # GitHub markdown parser
│ ├── scraper.py # Playwright web scraper
│ ├── ai_extractor.py # AI job extraction
│ ├── filters.py # Hard eligibility filters
│ ├── scoring.py # Soft fit scoring
│ ├── deduplication.py # URL-based dedup
│ ├── sheets.py # Google Sheets API
│ ├── gmail.py # Gmail API
│ ├── email_classifier.py # AI email classification
│ ├── email_filters.py # Email privacy filters
│ └── notifications.py # Discord notifications
├── prompts/ # AI prompt templates
├── scripts/ # Manual test scripts
├── tests/ # Unit tests
└── auth/ # OAuth credentials (git-ignored)
# Run all unit tests
pytest tests/ -v
# Run specific test file
pytest tests/test_filters.py -v
# Test full pipeline with a URL
python scripts/test_e2e.py https://jobs.lever.co/company/job-id
# Test with GitHub jobs
python scripts/test_e2e.py --from-github --count 5
# Test AI extraction on saved content
python scripts/test_ai_extractor.py --file scraped_content/job.txt
See scripts/README.md for detailed testing documentation.
pip install playwright
playwright install chromium
- Make sure auth/credentials.json exists
- Delete auth/gmail_token.json and auth/sheets_token.json to re-authenticate
- Check that your OAuth consent screen is configured
- Check logs/scrape.log for errors
- Verify GitHub repos in .env are correct
- Test with:
python scripts/test_e2e.py --from-github --parse-only
If you changed your graduation date, class standing, or work authorization, previously filtered jobs won't be re-evaluated. Clear the cache:
python scrape_jobs.py --clear-filtered
- Check your API key is valid
- Check logs/scrape.log for API errors
- Try with LOG_LEVEL=DEBUG for more detail
- Increase RENDER_DELAY_SECONDS for slow-loading sites (e.g., Workday)
- Make sure USER_EMAIL matches your Gmail account
- Check logs/gmail.log for errors
- Verify emails are in the Primary inbox (not Promotions/Social)
- Test with:
python check_gmail.py --reprocess
- OpenAI/Gemini: Increase RETRY_BASE_DELAY_SECONDS
- Google Sheets: Built-in retry handles this automatically
Jobs progress through these statuses:
| Status | Meaning | Trigger |
|---|---|---|
| New | Just discovered | Job scraping |
| Applied | Application submitted | Gmail confirmation |
| OA | Online assessment | Gmail OA invite |
| Phone | Phone screen scheduled | Gmail phone invite |
| Technical | Technical interview | Gmail tech invite |
| Offer | Offer received | Gmail offer letter |
| Rejected | Application rejected | Gmail rejection |
| Ghosted | No response (manual) | User sets manually |



