EuroCV

Convert resumes (PDF/DOCX) to Europass XML/JSON format with CLI, Python API, and HTTP service support.

Features

📄 Multi-format support: Parse PDF and DOCX resumes with auto-detection
🤖 Smart Extractors: LinkedIn PDF, Generic PDF (with Dutch support), and DOCX extractors
🌍 Multi-language: Native support for English and Dutch resumes
🇪🇺 Europass compliant: Validates against official Europass schemas
🐍 Python API: Use as a library in your Python projects
💻 CLI tool: Command-line interface for batch processing
🌐 HTTP service: FastAPI-based REST API
🐳 Docker ready: Containerized for easy deployment
🔒 AVG/GDPR compliant: Stateless processing, no data retention
🌍 Multi-locale: Support for multiple date/number formats (including Dutch)

Installation

From PyPI (coming soon)

pip install eurocv

From source

git clone https://github.com/emiel/eurocv.git
cd eurocv
pip install -e .

With OCR support

pip install eurocv[ocr]
# Also requires tesseract-ocr system package

Docker

docker pull ghcr.io/emiel/eurocv:latest

Usage

CLI

# Convert single file
eurocv convert resume.pdf --out output.json

# Convert with XML output
eurocv convert resume.docx --out-json output.json --out-xml output.xml

# Batch conversion
eurocv batch resumes/*.pdf --out-dir output/ --parallel 4

# Dutch locale and no photo (GDPR-friendly)
eurocv convert cv.pdf --locale nl-NL --no-photo --out output.json

Python API

from eurocv import convert_to_europass

# Simple conversion
europass_json = convert_to_europass("resume.pdf")

# With options
europass_json = convert_to_europass(
    "resume.pdf",
    locale="nl-NL",
    include_photo=False,
    output_format="json"
)

# Get both JSON and XML
result = convert_to_europass("resume.pdf", output_format="both")
print(result.json)
print(result.xml)

Docker

# Convert a file
docker run --rm -v $PWD:/data ghcr.io/emiel/eurocv \
  convert /data/resume.pdf --out /data/output.json

# Batch processing
docker run --rm -v $PWD:/data ghcr.io/emiel/eurocv \
  batch /data/resumes/*.pdf --out-dir /data/output

HTTP Service

Start the server:

# Using CLI
eurocv serve --host 0.0.0.0 --port 8000

# Using Docker
docker run -p 8000:8000 ghcr.io/emiel/eurocv serve

# Using uvicorn directly
uvicorn eurocv.api.main:app --host 0.0.0.0 --port 8000

API endpoints:

# Convert a resume
curl -X POST http://localhost:8000/convert \
  -F "file=@resume.pdf" \
  -F "locale=nl-NL" \
  -F "include_photo=false"

# Validate Europass JSON
curl -X POST http://localhost:8000/validate \
  -H "Content-Type: application/json" \
  -d @europass.json

# Get Europass JSON Schema
curl http://localhost:8000/schema > europass-schema.json

# Health check
curl http://localhost:8000/healthz

# Interactive API docs
open http://localhost:8000/docs

API Schema & Type Generation

The API provides a fully-typed JSON Schema for the Europass CV format. Use this for client code generation:

# Download the schema
curl http://localhost:8000/schema > europass-schema.json

# Generate TypeScript types
npx quicktype europass-schema.json -o europass.ts

# Generate Python types
datamodel-codegen --input europass-schema.json --output europass_types.py

# Generate Go types
quicktype europass-schema.json -o europass.go --lang go

# Generate Java classes
json schema2pojo --source europass-schema.json --target java-gen/

Benefits:

Type-safe clients: Auto-generate types for any language
IDE autocomplete: Full IntelliSense support
Validation: Validate responses against official schema
Documentation: Self-describing API

Example TypeScript usage:

import { EuropassCVResponse } from './europass';

async function convertResume(file: File): Promise<EuropassCVResponse> {
  const response = await fetch('http://localhost:8000/convert', {
    method: 'POST',
    body: formData
  });
  return await response.json(); // Fully typed!
}

// IDE knows all fields:
const firstName = result.data.LearnerInfo.Identification.PersonName.FirstName;

Architecture

eurocv/
├── core/
│   ├── extract/      # PDF/DOCX parsing
│   ├── map/          # Resume → Europass mapping
│   └── validate/     # Schema validation
├── cli/              # CLI interface (Typer)
├── api/              # HTTP service (FastAPI)
└── schemas/          # Europass XML/JSON schemas

Development

Setup

# Clone repository
git clone https://github.com/emiel/eurocv.git
cd eurocv

# Install with dev dependencies
pip install -e ".[dev,ocr]"

# Run tests
pytest

# Format code
black src/ tests/
ruff check src/ tests/

# Type checking
mypy src/

Running locally

# CLI
python -m eurocv.cli.main convert test.pdf --out output.json

# API server
uvicorn eurocv.api.main:app --reload

Docker build

# Build image
docker build -t eurocv:local .

# Run
docker run --rm -v $PWD:/data eurocv:local convert /data/test.pdf

Privacy & Compliance (AVG/GDPR)

Stateless processing: No data is stored on disk
No photo by default: Use --no-photo flag to exclude photos (GDPR-friendly)
Local processing: Run in your own infrastructure
Encrypted storage: Use --store=encrypted only when necessary

Europass Resources

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions:

GitHub Issues: https://github.com/emiel/eurocv/issues
Email: [your-email]

Roadmap

Support for JSON Resume format
Enhanced OCR with layout analysis
Support for more input formats (LinkedIn, etc.)
AI-powered field extraction
Europass PDF rendering
Multi-language support

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github		.github
docs		docs
examples		examples
scripts		scripts
src/eurocv		src/eurocv
tests		tests
-		-
.cursorrules		.cursorrules
.dockerignore		.dockerignore
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Dockerfile.dev		Dockerfile.dev
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
MYPY_ENFORCEMENT.md		MYPY_ENFORCEMENT.md
Makefile		Makefile
PROJECT_SUMMARY.md		PROJECT_SUMMARY.md
PR_BODY.md		PR_BODY.md
PR_DESCRIPTION.md		PR_DESCRIPTION.md
QUICKSTART.md		QUICKSTART.md
README.md		README.md
SCHEMA_COMPLIANCE.md		SCHEMA_COMPLIANCE.md
TODO.md		TODO.md
codecov.yml		codecov.yml
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
ruff.toml		ruff.toml
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EuroCV

Features

Installation

From PyPI (coming soon)

From source

With OCR support

Docker

Usage

CLI

Python API

Docker

HTTP Service

API Schema & Type Generation

Architecture

Development

Setup

Running locally

Docker build

Privacy & Compliance (AVG/GDPR)

Europass Resources

License

Contributing

Support

Roadmap

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

Bermpje/EuroCV

Folders and files

Latest commit

History

Repository files navigation

EuroCV

Features

Installation

From PyPI (coming soon)

From source

With OCR support

Docker

Usage

CLI

Python API

Docker

HTTP Service

API Schema & Type Generation

Architecture

Development

Setup

Running locally

Docker build

Privacy & Compliance (AVG/GDPR)

Europass Resources

License

Contributing

Support

Roadmap

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages