TobyCyan/scrape-mei
Anime Figurine Image Scraper

A desktop GUI application for automatically scraping and downloading product images from anime figurine company websites.

Features

  • Desktop GUI built with Tkinter
  • Multi-company support: Good Smile Company and Kotobukiya
  • URL domain validation: Ensures URLs match the selected company
  • Async image downloads with configurable concurrency (up to 5 parallel downloads)
  • Automatic retry logic for failed downloads (up to 3 attempts)
  • Real-time progress updates in the GUI
  • OOP architecture using Factory Pattern for easy extensibility
  • Automatic file organization with sanitized naming
  • Error handling for network issues, invalid URLs, and missing images

Architecture

The application follows Object-Oriented Programming principles with a modular design:

ScraperParser (Orchestrator)
    ↓
ScraperFactory (Factory Pattern)
    ↓
BaseScraper (Abstract Base Class)
    ├─ GoodSmileScraper
    └─ KotobukiyaScraper
    ↓
ImageDownloader (Async Downloads)

Core Components

  • ScraperParser: Main orchestrator that validates inputs, coordinates scraping, and manages downloads
  • ScraperFactory: Creates appropriate scraper instances based on company type
  • BaseScraper: Abstract base class defining the scraper interface
  • Company Scrapers: Implement company-specific scraping logic
  • ImageDownloader: Handles async image downloads with retry logic
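The factory flow can be sketched as follows. Class and method names mirror the component list above, but the bodies are simplified stand-ins, not the project's actual implementation:

```python
from abc import ABC, abstractmethod

class BaseScraper(ABC):
    """Interface every company scraper must implement."""

    @abstractmethod
    def get_product_name(self, html: str) -> str: ...

    @abstractmethod
    def get_image_urls(self, html: str) -> list[str]: ...

class GoodSmileScraper(BaseScraper):
    def get_product_name(self, html: str) -> str:
        return "stub-product"   # the real scraper parses the page with BeautifulSoup
    def get_image_urls(self, html: str) -> list[str]:
        return []               # the real scraper returns product image URLs

class ScraperFactory:
    # Maps the company name shown in the GUI to its scraper class
    _scrapers = {"Good Smile": GoodSmileScraper}

    @classmethod
    def create(cls, company: str) -> BaseScraper:
        try:
            return cls._scrapers[company]()
        except KeyError:
            raise ValueError(f"Unsupported company: {company!r}")
```

Because the factory holds a plain dictionary, ScraperParser never needs to know which concrete scraper class it is driving.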

Installation

Prerequisites

  • Python 3.11 or higher
  • pip (Python package manager)

Setup

  1. Clone or download this repository

  2. Install dependencies:

pip install -r requirements.txt

Testing

The project includes a comprehensive test suite using Python's unittest framework.

Running All Tests

Run all test suites with a single command:

python run_tests.py

This will automatically discover and run all tests in the src/tests/ directory.

Test Suites

  • TestBugFixes: Validates bug fixes (factory duplicates, aliases, DPI awareness)
  • TestURLValidation: Tests URL domain validation for each company
  • TestImageFiltering: Verifies filtering of social media and UI images

Running Individual Tests

Run a specific test file:

python src/tests/test_fixes.py
python src/tests/test_url_validation.py
python src/tests/test_image_filtering.py

Run with different verbosity:

python run_tests.py -v      # Verbose output
python run_tests.py -q      # Quiet output

Usage

Running the Application

python run.py

Using the GUI

  1. Enter Product URL: Paste the product page URL from the manufacturer's website
  2. Select Company: Choose from the dropdown (Good Smile or Kotobukiya)
    • The scraper will validate that the URL domain matches the selected company
    • Good Smile accepts: goodsmile.info, goodsmileus.com, goodsmilecompany.com
    • Kotobukiya accepts: kotobukiya.co.jp
  3. Set Output Directory: Specify where to save images (default: downloads/)
  4. Click "Start Scraping": The application will:
    • Validate your inputs and URL domain
    • Fetch the product page
    • Extract product name and images
    • Download all images to <output_dir>/<product_name>/
    • Display real-time progress
  5. View Results: Check the status window for download statistics
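The domain check in step 2 can be sketched like this, using the accepted domains listed above. The `COMPANY_DOMAINS` mapping and function name are illustrative, not the project's actual identifiers:

```python
from urllib.parse import urlparse

COMPANY_DOMAINS = {
    "Good Smile": {"goodsmile.info", "goodsmileus.com", "goodsmilecompany.com"},
    "Kotobukiya": {"kotobukiya.co.jp"},
}

def url_matches_company(url: str, company: str) -> bool:
    """Return True if the URL's host belongs to the selected company."""
    host = urlparse(url).netloc.lower()
    host = host.split(":")[0].removeprefix("www.")  # drop port and leading www.
    # Accept the domain itself or any subdomain of it
    return any(host == d or host.endswith("." + d)
               for d in COMPANY_DOMAINS.get(company, set()))
```

Matching on the full host (rather than a substring) avoids false positives such as `notgoodsmile.info`.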

File Naming Convention

Downloaded images are saved as:

<output_directory>/<sanitized_product_name>/
    ├─ sanitized_product_name_001.jpg
    ├─ sanitized_product_name_002.jpg
    └─ ...
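Sanitization in this spirit might look like the following sketch (the project's actual utility functions may differ):

```python
import re

def sanitize_name(name: str) -> str:
    """Replace filesystem-unsafe characters and collapse whitespace."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", name)           # Windows-illegal chars
    return re.sub(r"\s+", "_", cleaned.strip().strip("."))

def image_filename(product: str, index: int, ext: str = "jpg") -> str:
    # Zero-padded index matches the _001, _002, ... convention above
    return f"{sanitize_name(product)}_{index:03d}.{ext}"
```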

Project Structure

ScrapeMei/
├── run.py                       # Application entry point
├── build.py                     # Standard build script (single .exe)
├── build_folder.py              # Folder distribution build
├── build.bat                    # Windows build wrapper
├── run_tests.py                 # Test runner
├── requirements.txt             # Python dependencies
├── README.md                    # This file
├── QUICKSTART.md                # Quick start guide
├── src/logic/                   # Main source code
│   ├── main.py                  # Application launcher
│   ├── gui.py                   # GUI components (AnimeScraperGUI)
│   ├── parser.py                # Scraping orchestrator (ScraperParser)
│   ├── downloader.py            # Async image downloader
│   ├── utils.py                 # Utility functions
│   └── scraper/                 # Scraper package
│       └── ...                  # base, factory, and company scrapers
├── src/tests/                   # Unit tests
│   └── ...
├── logs/                        # Application logs
│   └── scraper.log
└── downloads/                   # Default download directory
    └── <product_name>/

Adding New Companies

The application is designed for easy extensibility. To add support for a new company:

  1. Create a new scraper class inheriting from BaseScraper
  2. Implement get_product_name() and get_image_urls() methods
  3. Register in ScraperFactory._scrapers dictionary

The new company will automatically appear in the GUI dropdown.
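The three steps can be sketched as below. "Alter" is a hypothetical example company, and the `BaseScraper`/`ScraperFactory` stand-ins are simplified versions of the real classes in src/logic/scraper/:

```python
from abc import ABC, abstractmethod

class BaseScraper(ABC):
    @abstractmethod
    def get_product_name(self, html: str) -> str: ...
    @abstractmethod
    def get_image_urls(self, html: str) -> list[str]: ...

class ScraperFactory:
    _scrapers: dict = {}

    @classmethod
    def create(cls, company: str) -> BaseScraper:
        return cls._scrapers[company]()

# Steps 1 and 2: subclass BaseScraper and implement both required methods
class AlterScraper(BaseScraper):
    def get_product_name(self, html: str) -> str:
        return "stub-product"   # real code would parse the product page
    def get_image_urls(self, html: str) -> list[str]:
        return []

# Step 3: register the class so the factory (and GUI dropdown) can find it
ScraperFactory._scrapers["Alter"] = AlterScraper
```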

Packaging as .exe (Windows)

To create a standalone executable, use the provided build scripts:

Option 1: Single-File Executable (Recommended)

python build.py

Creates a single ScrapeMei.exe file in the dist/ folder.

Option 2: Folder Distribution (Most Reliable)

python build_folder.py

Creates a dist/ScrapeMei/ folder with the .exe and all dependencies. More reliable on some systems.

Option 3: Batch File Wrapper

build.bat

Windows batch wrapper for the standard build.

Option 4: Debug Build

python build.py --debug

Creates an .exe with a visible console window to diagnose errors.

What the Build Script Does

The build script will:

  • Clean previous build artifacts
  • Automatically read dependencies from requirements.txt
  • Generate hidden imports for PyInstaller
  • Bundle the application with PyInstaller
  • Create executable file(s) in the dist/ folder
  • Verify the build was successful

Note: The build script automatically parses requirements.txt and includes all dependencies in the executable. When you add a new package to requirements.txt, it will automatically be bundled in the next build.
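Parsing requirements.txt into package names can be sketched as below. Note this is illustrative only: the real build script may additionally map distribution names to import names (e.g. beautifulsoup4 imports as bs4) before passing them to PyInstaller as hidden imports:

```python
import re
from pathlib import Path

def read_requirements(path: str = "requirements.txt") -> list[str]:
    """Parse package names, skipping comments, blank lines, and version pins."""
    names = []
    for line in Path(path).read_text().splitlines():
        line = line.split("#")[0].strip()   # drop inline comments
        if not line:
            continue
        # "beautifulsoup4>=4.12" -> "beautifulsoup4"
        names.append(re.split(r"[<>=!~\[;]", line)[0].strip())
    return names
```

Each resulting name would then be passed to PyInstaller via `--hidden-import <name>`.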

The executable is standalone and does not require Python to be installed on the target system.

Troubleshooting Build Issues

"Failed to start python embedded interpreter"

  1. Try the folder distribution: python build_folder.py
  2. Build in debug mode: python build.py --debug
  3. Disable antivirus temporarily or add dist/ to exclusions
  4. Install Visual C++ Redistributables
  5. Run the .exe as administrator

"ModuleNotFoundError" when running .exe

  • Add the missing package to requirements.txt
  • Rebuild with python build.py

Manual Build (Advanced)

# Single file
pyinstaller --onefile --windowed --name ScrapeMei --noupx run.py

# Folder distribution
pyinstaller --windowed --name ScrapeMei --noupx run.py

Technical Details

Dependencies

  • requests: HTTP requests for fetching web pages
  • beautifulsoup4: HTML parsing and element selection
  • lxml: Fast HTML/XML parser backend
  • aiohttp: Async HTTP client for parallel downloads
  • tkinter: GUI framework (included with Python)
  • pyinstaller: Creating standalone executables

Performance

  • Supports up to 100 images per product efficiently
  • Configurable parallel downloads (default: 5 concurrent)
  • Async I/O for non-blocking downloads
  • Automatic URL deduplication
  • Exponential backoff for retries
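The concurrency, deduplication, and backoff behavior can be sketched with the standard library alone. The real ImageDownloader fetches via aiohttp; here the `fetch` coroutine is injected so the sketch stays self-contained, and the function names are illustrative:

```python
import asyncio

async def download_with_retry(fetch, url: str, retries: int = 3) -> bytes:
    """Try fetch(url) up to `retries` times with exponential backoff (1s, 2s)."""
    for attempt in range(retries):
        try:
            return await fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)

async def download_all(fetch, urls, concurrency: int = 5) -> list[bytes]:
    sem = asyncio.Semaphore(concurrency)   # at most 5 downloads in flight
    unique = list(dict.fromkeys(urls))     # deduplicate, preserving order

    async def bounded(url: str) -> bytes:
        async with sem:
            return await download_with_retry(fetch, url)

    return await asyncio.gather(*(bounded(u) for u in unique))
```

With aiohttp, `fetch` would be a thin wrapper around `session.get(url)` that raises on HTTP errors and returns the response body.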

License

This project is provided as-is for educational and personal use.


Version: 1.0.0 (MVP)
Last Updated: March 2026
