A desktop GUI application for automatically scraping and downloading product images from anime figurine company websites.
- Desktop GUI built with Tkinter
- Multi-company support: Good Smile Company and Kotobukiya
- URL domain validation: Ensures URLs match the selected company
- Async image downloads with configurable concurrency (up to 5 parallel downloads)
- Automatic retry logic for failed downloads (up to 3 attempts)
- Real-time progress updates in the GUI
- OOP architecture using Factory Pattern for easy extensibility
- Automatic file organization with sanitized naming
- Error handling for network issues, invalid URLs, and missing images
The application follows Object-Oriented Programming principles with a modular design:
ScraperParser (Orchestrator)
↓
ScraperFactory (Factory Pattern)
↓
BaseScraper (Abstract Base Class)
├─ GoodSmileScraper
└─ KotobukiyaScraper
↓
ImageDownloader (Async Downloads)
- ScraperParser: Main orchestrator that validates inputs, coordinates scraping, and manages downloads
- ScraperFactory: Creates appropriate scraper instances based on company type
- BaseScraper: Abstract base class defining the scraper interface
- Company Scrapers: Implement company-specific scraping logic
- ImageDownloader: Handles async image downloads with retry logic
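The class relationships above can be sketched as follows. This is an illustrative stand-in, not the project's actual code: the method bodies and the `create` entry point are placeholders.

```python
from abc import ABC, abstractmethod

class BaseScraper(ABC):
    """Abstract interface every company scraper implements."""

    @abstractmethod
    def get_product_name(self, html: str) -> str: ...

    @abstractmethod
    def get_image_urls(self, html: str) -> list[str]: ...

class GoodSmileScraper(BaseScraper):
    """Placeholder implementation; the real class parses the page."""

    def get_product_name(self, html: str) -> str:
        return "placeholder"

    def get_image_urls(self, html: str) -> list[str]:
        return []

class ScraperFactory:
    # Maps a company key to its scraper class; new companies register here.
    _scrapers = {"goodsmile": GoodSmileScraper}

    @classmethod
    def create(cls, company: str) -> BaseScraper:
        try:
            return cls._scrapers[company]()
        except KeyError:
            raise ValueError(f"Unsupported company: {company}") from None
```

Keeping the registry as a class-level dictionary is what lets the GUI enumerate supported companies without hard-coding them.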
- Python 3.11 or higher
- pip (Python package manager)
- Clone or download this repository
- Install dependencies: pip install -r requirements.txt

The project includes a comprehensive test suite using Python's unittest framework.
Run all test suites with a single command: python run_tests.py
This will automatically discover and run all tests in the src/tests/ directory.
- TestBugFixes: Validates bug fixes (factory duplicates, aliases, DPI awareness)
- TestURLValidation: Tests URL domain validation for each company
- TestImageFiltering: Verifies filtering of social media and UI images
Run a specific test file:
python src/tests/test_fixes.py
python src/tests/test_url_validation.py
python src/tests/test_image_filtering.py

Run with different verbosity:
python run_tests.py -v    # Verbose output
python run_tests.py -q    # Quiet output

Launch the application with python run.py.
- Enter Product URL: Paste the product page URL from the manufacturer's website
- Select Company: Choose from the dropdown (Good Smile or Kotobukiya)
- The scraper will validate that the URL domain matches the selected company
- Good Smile accepts: goodsmile.info, goodsmileus.com, goodsmilecompany.com
- Kotobukiya accepts: kotobukiya.co.jp
- Set Output Directory: Specify where to save images (default: downloads/)
- Click "Start Scraping": The application will:
  - Validate your inputs and URL domain
  - Fetch the product page
  - Extract product name and images
  - Download all images to <output_dir>/<product_name>/
  - Display real-time progress
- View Results: Check the status window for download statistics
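The domain check from step 2 could look roughly like this. The function name and dictionary are illustrative (the actual validation code may be organized differently); the accepted domains are the ones listed above:

```python
from urllib.parse import urlparse

# Accepted domains per company, as listed above.
ALLOWED_DOMAINS = {
    "Good Smile": {"goodsmile.info", "goodsmileus.com", "goodsmilecompany.com"},
    "Kotobukiya": {"kotobukiya.co.jp"},
}

def url_matches_company(url: str, company: str) -> bool:
    """Return True if the URL's host is an accepted domain (or a subdomain)."""
    host = urlparse(url).netloc.lower().split(":")[0]  # drop any port
    return any(host == d or host.endswith("." + d)
               for d in ALLOWED_DOMAINS.get(company, ()))
```

Matching on the suffix with a leading dot accepts `www.goodsmile.info` while rejecting look-alikes such as `evilgoodsmile.info`.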
Downloaded images are saved as:
<output_directory>/<sanitized_product_name>/
├─ sanitized_product_name_001.jpg
├─ sanitized_product_name_002.jpg
└─ ...
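One way the sanitized, zero-padded names above could be produced, assuming a simple character-replacement scheme (the project's actual sanitization rules may differ):

```python
import re

def sanitize_name(name: str) -> str:
    """Replace characters that are unsafe in file names with underscores."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", name).strip().strip(".")
    return re.sub(r"\s+", "_", cleaned) or "product"

def image_filename(product: str, index: int, ext: str = "jpg") -> str:
    """Build the zero-padded file name shown in the layout above."""
    return f"{sanitize_name(product)}_{index:03d}.{ext}"
```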
ScrapeMei/
├── run.py # Application entry point
├── build.py # Standard build script (single .exe)
├── build_folder.py # Folder distribution build
├── build.bat # Windows build wrapper
├── run_tests.py # Test runner
├── requirements.txt # Python dependencies
├── README.md # This file
├── QUICKSTART.md # Quick start guide
├── src/logic/ # Main source code
│ ├── main.py # Application launcher
│ ├── gui.py # GUI components (AnimeScraperGUI)
│ ├── parser.py # Scraping orchestrator (ScraperParser)
│ ├── downloader.py # Async image downloader
│ ├── utils.py # Utility functions
│ └── scraper/ # Scraper package
│ └── ... # base, factory, and company scrapers
├── src/tests/ # Unit tests
│ └── ...
├── logs/ # Application logs
│ └── scraper.log
└── downloads/ # Default download directory
└── <product_name>/
The application is designed for easy extensibility. To add support for a new company:
- Create a new scraper class inheriting from BaseScraper
- Implement the get_product_name() and get_image_urls() methods
- Register it in the ScraperFactory._scrapers dictionary
The new company will automatically appear in the GUI dropdown.
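Following those steps, a new scraper might look like the sketch below. Everything here is hypothetical: the company, the regex-based parsing (the real scrapers use BeautifulSoup), and the base-class stand-in, which is included only to keep the example self-contained.

```python
import re
from abc import ABC, abstractmethod

class BaseScraper(ABC):
    """Minimal stand-in for the project's abstract base class."""
    @abstractmethod
    def get_product_name(self, html: str) -> str: ...
    @abstractmethod
    def get_image_urls(self, html: str) -> list[str]: ...

class ExampleCompanyScraper(BaseScraper):
    """Hypothetical scraper; real implementations parse with BeautifulSoup."""

    def get_product_name(self, html: str) -> str:
        m = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.S)
        return m.group(1).strip() if m else "unknown_product"

    def get_image_urls(self, html: str) -> list[str]:
        return re.findall(r'<img[^>]+src="([^"]+)"', html)

# Step 3: registration is what surfaces the company in the GUI dropdown.
# ScraperFactory._scrapers["examplecompany"] = ExampleCompanyScraper
```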
To create a standalone executable, use the provided build scripts:
- python build.py: Creates a single ScrapeMei.exe file in the dist/ folder.
- python build_folder.py: Creates a dist/ScrapeMei/ folder with the .exe and all dependencies. More reliable on some systems.
- build.bat: Windows batch wrapper for the standard build.
- python build.py --debug: Creates an .exe with a visible console window to diagnose errors.
The build script will:
- Clean previous build artifacts
- Automatically read dependencies from requirements.txt
- Generate hidden imports for PyInstaller
- Bundle the application with PyInstaller
- Create executable file(s) in the dist/ folder
- Verify the build was successful
Note: The build script automatically parses requirements.txt and includes all dependencies in the executable. When you add a new package to requirements.txt, it will automatically be bundled in the next build.
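The requirements-to-hidden-imports step could be sketched like this. The override table is an assumption, needed because some PyPI names differ from their import names (beautifulsoup4 imports as bs4) and some build-only tools should not be bundled:

```python
import re

# PyPI name -> import name; None means "do not bundle". Illustrative only.
IMPORT_NAME_OVERRIDES = {"beautifulsoup4": "bs4", "pyinstaller": None}

def hidden_imports(requirements: str) -> list[str]:
    """Turn requirements.txt lines into names for --hidden-import flags."""
    modules = []
    for line in requirements.splitlines():
        line = line.split("#")[0].strip()          # drop comments and blanks
        if not line:
            continue
        # Strip version specifiers, extras, and markers: "aiohttp==3.9" -> "aiohttp"
        pkg = re.split(r"[<>=!~\[;]", line)[0].strip().lower()
        mod = IMPORT_NAME_OVERRIDES.get(pkg, pkg.replace("-", "_"))
        if mod:
            modules.append(mod)
    return modules
```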
The executable is standalone and does not require Python to be installed on the target system.
"Failed to start python embedded interpreter"
- Try the folder distribution: python build_folder.py
- Build in debug mode: python build.py --debug
- Disable antivirus temporarily or add dist/ to exclusions
- Install Visual C++ Redistributables
- Run the .exe as administrator
"ModuleNotFoundError" when running .exe
- Add the missing package to requirements.txt
- Rebuild with python build.py
# Single file
pyinstaller --onefile --windowed --name ScrapeMei --noupx run.py
# Folder distribution
pyinstaller --windowed --name ScrapeMei --noupx run.py

- requests: HTTP requests for fetching web pages
- beautifulsoup4: HTML parsing and element selection
- lxml: Fast HTML/XML parser backend
- aiohttp: Async HTTP client for parallel downloads
- tkinter: GUI framework (included with Python)
- pyinstaller: Creating standalone executables
- Supports up to 100 images per product efficiently
- Configurable parallel downloads (default: 5 concurrent)
- Async I/O for non-blocking downloads
- Automatic URL deduplication
- Exponential backoff for retries
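Putting the points above together (semaphore-bounded concurrency, URL deduplication, exponential backoff), a simplified sketch of the download loop might look like this. It is not the actual downloader.py, and the helper names are illustrative:

```python
import asyncio
from pathlib import Path

import aiohttp

MAX_CONCURRENT = 5   # default parallelism noted above
MAX_ATTEMPTS = 3     # retry budget noted above

def dedupe(urls):
    """Drop duplicate URLs while preserving order."""
    return list(dict.fromkeys(urls))

def backoff_delay(attempt: int) -> float:
    """Exponential backoff: 1s, 2s, 4s for attempts 0, 1, 2."""
    return float(2 ** attempt)

async def download_one(session, semaphore, url: str, dest: Path) -> bool:
    """Download one image under the concurrency limit, retrying on failure."""
    async with semaphore:
        for attempt in range(MAX_ATTEMPTS):
            try:
                async with session.get(url) as resp:
                    resp.raise_for_status()
                    dest.write_bytes(await resp.read())
                    return True
            except aiohttp.ClientError:
                await asyncio.sleep(backoff_delay(attempt))
    return False

async def download_all(urls, out_dir: Path):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        tasks = [
            download_one(session, semaphore, url, out_dir / f"img_{i:03d}.jpg")
            for i, url in enumerate(dedupe(urls), start=1)
        ]
        return await asyncio.gather(*tasks)
```

The semaphore caps in-flight requests at five while `asyncio.gather` keeps the GUI's event loop free to post progress updates.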
This project is provided as-is for educational and personal use.
Version: 1.0.0 (MVP)
Last Updated: March 2026