Enhanced black-and-white image and video colorization using Zhang et al.'s deep learning algorithm, with object-aware processing for custom recolorization, facial feature correction, and color-bleeding prevention.
- Overview
- The Problem with Original Zhang Algorithm
- Our Solutions
- Visual Results & Comparisons
- Technical Architecture
- Installation
- Usage Guide
- Project Structure
- Citation
- License
- Acknowledgments
- Contact & Support
- Star History
- Roadmap
- Statistics
- Related Projects
- Further Reading
This project extends the groundbreaking Zhang et al. colorization algorithm with three critical enhancements that solve major real-world problems encountered when colorizing historical photographs, portraits, and complex scenes.
The Zhang et al. (2016) colorization network is a CNN-based deep learning model trained on over 1.3 million images from ImageNet. It predicts plausible color information (ab channels in Lab color space) from grayscale images (L channel). While revolutionary, it has several limitations in practical applications.
After extensive testing with real historical photos, vintage portraits, and complex scenes, we identified five critical problems with the original implementation:
- Color Bleeding Across Object Boundaries - Colors leak from one object to another
- Inconsistent Colorization - Same objects receive different colors within the image
- Facial Feature Miscoloring - Eye whites take skin tone, lips appear colorless
- Incomplete Coverage - Some regions remain grayscale or poorly colored
- No User Control - Cannot manually adjust colors of specific objects
This project provides three specialized Python implementations that systematically address these issues.
Problem: Colors from one object "bleed" or "leak" into adjacent objects across object boundaries.
Example: In horse images, neck colors spread into the background. In my image in Prague, clothing colors contaminate skin regions, and facial skin color bleeds into the background.
Root Cause: The Zhang model processes the entire image holistically without understanding object boundaries.
Problem: Objects of the same type receive different colors in different parts of the image. Some regions remain completely grayscale.
Example: A person's jacket may be half-colored and half-grayscale, or a car's tires remain black while the body is colored. These issues are visible in my image in Prague.jpg and racing car.jpg as colorized by the original Zhang algorithm.
Root Cause: Limited context window and lack of global semantic understanding.
Problem: Eye sclera (whites of eyes) take on skin tone instead of white. Lips appear pale or colorless instead of natural pink/rose.
Example: In portrait photos, eyes look unnatural with beige/tan whites, and lips blend with surrounding skin.
Root Cause: Zhang model treats all facial skin uniformly without anatomical awareness.
Problem: Cannot manually specify colors for specific objects (e.g., "make this tie red instead of blue").
Root Cause: Original implementation is fully automatic with no interactive capabilities.
File: object_detection_colorization.py
How it works:
- Use semantic segmentation (DeepLabV3+ or YOLOv8) to detect individual objects
- Isolate each object with its mask
- Colorize each object independently using Zhang model
- Intelligently blend overlapping regions
- Fill gaps with background colorization
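The blending and gap-filling steps above can be sketched as a masked accumulate-and-average pass. This is a minimal NumPy sketch; `blend_object_colorizations` and its inputs are illustrative names, not the project's actual API:

```python
import numpy as np

def blend_object_colorizations(ab_crops, masks, background_ab):
    """Accumulate per-object ab predictions, average overlaps,
    and fall back to a whole-image pass for uncovered pixels.

    ab_crops: list of HxWx2 float arrays (per-object Zhang output,
              already resized to full-image coordinates)
    masks:    list of HxW boolean object masks
    background_ab: HxWx2 ab prediction from a whole-image pass
    """
    h, w = masks[0].shape
    ab_sum = np.zeros((h, w, 2), dtype=np.float32)
    counts = np.zeros((h, w), dtype=np.float32)

    for ab, mask in zip(ab_crops, masks):
        ab_sum[mask] += ab[mask]   # only this object's pixels contribute
        counts[mask] += 1.0

    covered = counts > 0
    out = background_ab.astype(np.float32).copy()           # gap fill
    out[covered] = ab_sum[covered] / counts[covered, None]  # average overlaps
    return out
```

Because each object's ab map is zero outside its mask, colors cannot cross segmentation boundaries; overlaps are simply averaged rather than letting one object overwrite another.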
Key Benefits:
- Complete elimination of color bleeding across object boundaries
- Each object maintains color integrity
- Proper handling of overlapping objects
- Automatic background completion
Supported Objects: 80+ categories including people, animals, vehicles, furniture, nature elements, and more (COCO dataset classes).
File: facial_feature_enhancement.py
How it works:
- Detect faces using Dlib's frontal face detector
- Locate 68 facial landmarks (eye contours, lip contours)
- Extract eye sclera masks (excluding pupils/iris)
- Extract lip region masks
- Apply Zhang colorization as base
- Override eye sclera with natural white (Lab: a≈0, b≈7)
- Override lip region with natural rose/pink (Lab: a≈45, b≈25)
Key Benefits:
- Authentic white eye sclera (prevents skin tone bleeding)
- Natural pink/rose lip coloration
- Maintains all other Zhang colorization features
- Works with portraits from any era
Technical Details:
- Eye Sclera Color: Neutral white with slight warm yellow undertone
- Lip Color: Rose/pink with red dominance and warm undertone
- Transition: Gaussian smoothing for natural blending
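The override-and-blend step can be sketched as a per-pixel linear blend in Lab space. This is a sketch, not the project's exact code: the function name is illustrative, and it assumes the mask has already been blurred (e.g. with cv2.GaussianBlur, as noted above) into a soft [0, 1] float map:

```python
import numpy as np

# Target chroma values from the text above (Lab a/b channels)
SCLERA_AB = (0.0, 7.0)    # neutral white, slight warm undertone
LIP_AB    = (45.0, 25.0)  # rose/pink; (25, 15) is a subtler default

def override_region_ab(ab, mask, target_ab):
    """Blend a fixed (a, b) target into a masked facial region.

    ab:   HxWx2 float array of Zhang-predicted chroma
    mask: HxW float array in [0, 1]; pre-blurred so the
          transition into surrounding skin stays smooth
    """
    target = np.asarray(target_ab, dtype=np.float32)
    alpha = mask[..., None]                      # HxWx1 for broadcasting
    return ab * (1.0 - alpha) + target * alpha   # linear blend per pixel
```

Applying this once with the sclera mask and `SCLERA_AB`, then again with the lip mask and `LIP_AB`, leaves all other Zhang-predicted colors untouched.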
File: custom_recolorization.py
How it works:
- Perform object detection and colorization
- Analyze chroma intensity to identify "colorizable regions"
- Generate visual map showing which areas can be recolored
- Allow user to manually override colors for specific objects
- Respect colorizable region boundaries (won't color glass, metal, etc.)
Key Benefits:
- Full control over object colors
- Intelligent detection of colorizable vs. non-colorizable regions
- Interactive UI for color selection
- Percentage metrics for each object's colorizable area
Advanced Feature - Colorizable Region Detection:
Uses chroma threshold analysis to identify regions where Zhang applied meaningful color:
chroma = sqrt(a² + b²) # Color intensity in Lab space
colorizable = chroma > threshold # default: 5.0

This prevents trying to color inherently achromatic objects (glass, chrome, white objects).
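A minimal sketch of how a custom color override can respect the colorizable map (the function name and the target-(a, b) input are illustrative assumptions; the actual script takes an RGB pick and converts it internally, e.g. via cv2.cvtColor on a 1x1 pixel):

```python
import numpy as np

def recolor_object(ab, object_mask, colorizable, target_ab):
    """Override one object's chroma while respecting colorizable regions.

    Only pixels that are both inside the object mask AND flagged as
    colorizable (chroma > threshold) receive the new color, so glass,
    chrome, and other achromatic surfaces keep their neutral look.
    """
    out = ab.astype(np.float32).copy()
    editable = object_mask & colorizable   # boolean intersection
    out[editable] = np.asarray(target_ab, dtype=np.float32)
    return out
```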
Each of these images shows the complete transformation process from original grayscale to final colorization:
What this shows: [Original B&W] → [Object Detection/Segmentation] → [Final Colorization]
Key improvements:
- Natural white eye sclera (not skin-toned)
- Proper pink lip coloration
- Clear object boundaries (face, hair, clothing, background)
- Uniform skin tone across entire face
What this shows: [Original B&W] → [Object Detection] → [Colorized Output]
Key improvements:
- Authentic eye whites with natural warmth
- Rose-toned lips with proper saturation
- Consistent hair coloration
What this shows: [Original B&W] → [Segmentation Mask] → [Colorized Result]
Key improvements:
- CRITICAL FIX: Color bleeding eliminated - horse neck color no longer leaks to background
- Sharp boundary preservation between horse and landscape
- Proper sky colorization
- Natural horse and ground tones
What this shows: [Original B&W] → [Multi-Object Detection] → [Final Result]
Key improvements:
- All people properly detected and colored
- No color contamination between people and background
- No color bleeding between face and collar and background
- All body parts colored
What this shows: [Original B&W] → [Object Detection with Labels] → [Colorized with Custom Tie Color]
Demonstrates:
- Custom recolorization capability - tie changed from automatic color to custom color
- Person detection with clothing segmentation
- Suit jacket uniform coloring
- Face, tie, and suit properly separated
These comparison images highlight specific problems solved by our enhancements:
Left (Zhang Original): Eye whites have beige/tan skin tone, lips are pale/colorless
Right (Our Enhancement): Natural white eyes with subtle warmth, pink/rose lips
Arrows highlight: Eye sclera correction (cyan), Lip coloration (red)
Left (Zhang Original): Eyes appear unhealthy with skin-toned whites, lips too pale
Right (Our Enhancement): Bright, natural eye whites, vibrant pink lips
Technical Achievement: Facial landmark detection with 68-point model for precise masking
Left (Zhang Original): Horse neck color bleeds heavily into background sky/landscape
Right (Our Enhancement): Clean separation between horse and background
Critical Improvement: Object-by-object processing maintains semantic boundaries
Left (Zhang Original):
- Hands uncolored (remain grayscale)
- People in background not properly colored
- Color bleeding around head contours
- Jacket coloring inconsistent
Right (Our Enhancement):
- Hands roughly 90% colored
- Background people properly colorized
- Sharp boundaries around all people
Left (Zhang Original):
- People poorly colored or incomplete
- Tire rubber not properly colored
- Inconsistent coloring throughout scene
Right (Our Enhancement):
- All people clearly detected and uniformly colored
- Tire rubber properly handled
- Consistent color application across entire scene
- Car body, wheels, and background properly separated
What this demonstrates:
Left Panel (Zhang Original): Automatic colorization - tie appears dark blue/black
Middle Panel (Custom - Red Tie): User manually changed tie to red
Right Panel (Custom - Blue Tie): User manually changed tie to blue
Interactive Feature: Users can:
- View detected objects with IDs
- See colorizable region percentages
- Override specific object colors (RGB input)
- Preview changes in real-time
Note: Colors are chosen to match grayscale intensity while being more saturated for demonstration clarity.
Input: Grayscale Image (L channel)
↓
┌────────────────────────────────────────┐
│ SEGMENTATION MODULE │
│ (DeepLabV3+ / YOLOv8) │
│ → Detects Objects & Boundaries │
└────────────────┬───────────────────────┘
↓
┌────────────────────────────────────────┐
│ ZHANG COLORIZATION MODULE │
│ (Per-Object Processing) │
│ → Predicts ab channels (Lab space) │
└────────────────┬───────────────────────┘
↓
┌────────────────────────────────────────┐
│ FACIAL ENHANCEMENT MODULE (Optional) │
│ (Dlib 68-point landmarks) │
│ → Corrects eyes & lips │
└────────────────┬───────────────────────┘
↓
┌────────────────────────────────────────┐
│ COLOR BLENDING & GAP FILLING │
│ → Merges objects, fills background │
└────────────────┬───────────────────────┘
↓
Output: Colorized Image (RGB)
Architecture: Convolutional Neural Network (CNN) with VGG-style backbone
Input: L channel (lightness) from Lab color space
Output: ab channels (color information)
Quantization: 313 discrete color bins for stable training
Model Files Required:
- colorization_deploy_v2.prototxt - Network architecture (Caffe format)
- colorization_release_v2.caffemodel - Pre-trained weights (~129 MB)
- pts_in_hull.npy - Quantized color cluster centers
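Loading these three files with OpenCV's dnn module follows the pattern of the well-known OpenCV colorization sample: read the Caffe network, then inject the 313 quantized cluster centers into the `class8_ab` layer and a rebalancing constant into `conv8_313_rh` (both layer names come from the prototxt). A hedged sketch; paths and function names are assumptions:

```python
import numpy as np

def prepare_cluster_centers(pts_in_hull):
    """Reshape the 313 quantized (a, b) centers, stored as (313, 2),
    into the (2, 313, 1, 1) blob expected by the 1x1 conv layer."""
    return pts_in_hull.transpose().reshape(2, 313, 1, 1).astype(np.float32)

def load_zhang_net(proto_path, weights_path, pts_path):
    """Load the Caffe model and inject the color cluster centers."""
    import cv2  # imported lazily so the helper above stays NumPy-only
    net = cv2.dnn.readNetFromCaffe(proto_path, weights_path)
    pts = prepare_cluster_centers(np.load(pts_path))
    net.getLayer(net.getLayerId("class8_ab")).blobs = [pts]
    net.getLayer(net.getLayerId("conv8_313_rh")).blobs = [
        np.full((1, 313), 2.606, dtype=np.float32)  # rebalancing term
    ]
    return net
```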
Color Space: Lab vs RGB
We use Lab color space instead of RGB because:
- L (Lightness): Preserved from original grayscale (0-100)
- a (Green-Red axis): -128 to +127
- b (Blue-Yellow axis): -128 to +127
Advantages:
- Separates luminance from chrominance
- More perceptually uniform than RGB
- Natural for colorization (only predict a,b; keep L)
Option A: DeepLabV3+ (Default)
- Model: ResNet-101 backbone with Atrous Spatial Pyramid Pooling
- Dataset: PASCAL VOC (21 classes)
- Classes: person, car, cat, dog, horse, bird, bottle, chair, etc.
- Advantages: No additional installation, works out-of-the-box
Option B: YOLOv8-X (Optional, Recommended)
- Model: YOLOv8-X instance segmentation
- Dataset: COCO (80 classes)
- Classes: All PASCAL VOC classes + 60 more (motorcycle, airplane, tie, umbrella, etc.)
- Advantages: More precise masks, more object categories
- Requirement:
pip install ultralytics
Library: Dlib (C++ library with Python bindings)
Model: 68-point facial landmark detector
Landmarks Used:
- Eyes: Points 36-47 (12 points total, 6 per eye)
- Lips: Points 48-67 (20 points for outer + inner contours)
Sclera Mask Generation:
- Create polygon from 6 eye landmark points
- Calculate eye center (mean of points)
- Estimate pupil radius (15% of eye width)
- Remove circular pupil region from mask
- Apply Gaussian blur for smooth transitions
Color Application (Lab Space):
- Sclera: a=0 (neutral), b=7 (slight warm yellow)
- Lips: a=45 (strong red), b=25 (warm undertone). Note: a=25, b=15 are the more natural default values
Purpose: Identify which regions Zhang successfully colored (vs. naturally achromatic regions)
Algorithm:
# Calculate chroma (color intensity)
chroma = sqrt(a² + b²)
# Threshold to find colored regions
colorizable = chroma > threshold # default: 5.0
# Morphological operations to clean mask
kernel = ellipse(5x5)
mask = close(mask, kernel) # Fill small gaps
mask = open(mask, kernel) # Remove small noise

Threshold Values:
- 3.0: Sensitive - detects subtle colors
- 5.0: Moderate - default, balanced
- 10.0: Strict - only strong colors
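A NumPy-only sketch of the chroma test (the morphological cleanup with cv2.morphologyEx is omitted to keep it self-contained; the function name is illustrative):

```python
import numpy as np

def colorizable_mask(ab, threshold=5.0):
    """Flag pixels where Zhang applied meaningful chroma.

    ab: HxWx2 float array of predicted a/b channels.
    In the full pipeline the boolean mask is then cleaned with a
    morphological close (fill small gaps) and open (remove speckles).
    """
    chroma = np.sqrt(ab[..., 0] ** 2 + ab[..., 1] ** 2)  # color intensity
    return chroma > threshold
```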
For each detected object:
1. Extract object mask from segmentation
2. Add padding (20px) around object for context
3. Crop region: image[y_min:y_max, x_min:x_max]
4. Convert crop to Lab space
5. Resize to 224x224 (Zhang input size)
6. Extract L channel, normalize (L -= 50)
7. Run Zhang network → predict ab channels
8. Resize ab back to crop dimensions
9. Apply object mask (zero out non-object pixels)
10. Accumulate into full-image ab map
Handle overlaps:
- Count how many objects contribute to each pixel
- Average colors where objects overlap
Fill gaps:
- Run Zhang on full image for background
- Fill pixels not covered by any object

- Python: 3.7 or higher
- OS: Windows, macOS, or Linux
- GPU: Optional (CUDA-capable for faster processing)
- RAM: Minimum 8GB recommended
git clone https://github.com/YOUR_USERNAME/Zhang-Colorization-Enhanced.git
cd Zhang-Colorization-Enhanced

pip install -r requirements.txt

Required packages:
opencv-python>=4.5.0
numpy>=1.19.0
torch>=1.9.0
torchvision>=0.10.0
Pillow>=8.0.0
Optional packages:
dlib>=19.22.0 # For facial feature enhancement
ultralytics # For YOLOv8 segmentation (better than DeepLabV3+)
Installing Dlib (can be tricky):
On Ubuntu/Debian:
sudo apt-get install cmake
sudo apt-get install libboost-all-dev
pip install dlib

On macOS:
brew install cmake
brew install boost
pip install dlib

On Windows:
pip install cmake
pip install dlib
# If fails, download pre-built wheel from:
# https://github.com/sachadee/Dlib

You need to download 3 files (~132 MB total) and place them in the models/ directory.
python automatically_download_Zhang_models.py

This script will:
- Download all 3 Zhang model files automatically
- Optionally download facial landmark model (99 MB)
- Show progress bars
- Verify file integrity
See detailed instructions in models/README.md
Quick manual download:
- colorization_deploy_v2.prototxt (4 KB)
  https://raw.githubusercontent.com/richzhang/colorization/caffe/colorization/models/colorization_deploy_v2.prototxt
- colorization_release_v2.caffemodel (129 MB)
  http://eecs.berkeley.edu/~rich.zhang/projects/2016_colorization/files/demo_v2/colorization_release_v2.caffemodel
- pts_in_hull.npy (3 KB)
  https://github.com/richzhang/colorization/raw/caffe/colorization/resources/pts_in_hull.npy
Save all files to the models/ folder.
Only needed if using facial feature enhancement.
cd models/
wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
bunzip2 shape_predictor_68_face_landmarks.dat.bz2

Or download manually from: http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
python automatically_download_Zhang_models.py

Expected output:
✓ models/colorization_deploy_v2.prototxt (0.00 MB)
✓ models/colorization_release_v2.caffemodel (128.99 MB)
✓ models/pts_in_hull.npy (0.00 MB)
✓ models/shape_predictor_68_face_landmarks.dat (99.37 MB) [Optional]
✓ All required model files are present!
Best for: Complex scenes with multiple objects, outdoor photos, group photos
cd src
python object_detection_colorization.py

Interactive prompts:
Select image number (default: 1): 1
Enable YOLOv8 segmentation? (y/n, default: n): n
What it does:
- Detects all objects in the image
- Colorizes each object independently
- Prevents color bleeding across boundaries
- Saves output to colorized_output/
Output files:
- object_by_object_[filename].jpg - Final colorized image
- detected_objects_[filename].jpg - Visualization of detected objects
Python API:
from object_detection_colorization import ObjectByObjectColorizer
# Initialize
colorizer = ObjectByObjectColorizer(
zhang_model_dir="models",
use_yolo=False # Set True if YOLOv8 installed
)
# Colorize
colorized, debug_info = colorizer.colorize("your_image.jpg")
# Save
cv2.imwrite("output.jpg", colorized)
# Debug info
print(f"Detected {debug_info['object_count']} objects")
print(f"Classes: {debug_info['detected_classes']}")

Best for: Portraits, headshots, historical photos of people
cd src
python facial_feature_enhancement.py

Interactive prompts:
Select image number (press Enter for first image): 1
What it does:
- Detects faces and facial landmarks
- Applies Zhang colorization as base
- Corrects eye sclera to natural white
- Applies natural pink/rose to lips
- Saves original, annotated, and colorized versions
Output files:
- 1_original_[filename].jpg - Original grayscale
- 2_annotated_[filename].jpg - Shows detected eyes (cyan) and lips (red)
- 3_colorized_[filename].jpg - Final natural colorization
- 4_comparison_[filename].jpg - Side-by-side comparison
Python API:
from facial_feature_enhancement import FacialFeatureColorizer
# Initialize
colorizer = FacialFeatureColorizer(
zhang_model_dir="models",
landmark_predictor_path="models/shape_predictor_68_face_landmarks.dat"
)
# Process
original, annotated, colorized, masks, classes = \
colorizer.process_complete_colorization("portrait.jpg")
# Save
cv2.imwrite("colorized_portrait.jpg", colorized)

Best for: Fine-tuning specific object colors, creative control, specific color requirements
cd src
python custom_recolorization.py

Interactive workflow:
1. Enable YOLOv8 segmentation? (y/n, default=n): n
2. Chroma threshold (3-10, default=5): 5
[Automatic colorization completes]
3. View colorizable regions map
4. See statistics for each object:
0. person - 87.3% colorizable (45,234 pixels)
1. tie - 92.1% colorizable (3,456 pixels)
2. car - 78.9% colorizable (67,890 pixels)
5. Modify object colors? (y/n): y
6. Object ID to recolor (or 'done'): 1
7. Enter RGB values (0-255):
R: 255
G: 0
B: 0
8. Color set to RGB(255, 0, 0)
What it does:
- Performs object detection
- Analyzes which regions are colorizable
- Shows percentage statistics
- Allows manual RGB input for specific objects
- Respects colorizable region boundaries
- Saves both automatic and custom versions
Output files:
- auto_[filename].jpg - Automatic colorization
- regions_[filename].jpg - Colorizable regions visualization
- custom_[filename].jpg - Custom colors applied
Python API:
from custom_recolorization import AdaptiveRecolorizer
# Initialize
colorizer = AdaptiveRecolorizer(
zhang_model_dir="models",
use_yolo=False,
chroma_threshold=5.0
)
# Automatic colorization
result = colorizer.process_interactive_colorization(
"image.jpg",
generate_visualization=True
)
colorized, masks, classes, region_map, visualization = result
# Custom color override
custom_colors = {
0: (255, 0, 0), # Object 0 → Red
2: (0, 0, 255), # Object 2 → Blue
}
custom_result = colorizer.process_interactive_colorization(
"image.jpg",
custom_color_map=custom_colors
)

Process multiple images at once:
from pathlib import Path
import cv2
from object_detection_colorization import ObjectByObjectColorizer
colorizer = ObjectByObjectColorizer()
input_dir = Path("input_images/")
output_dir = Path("colorized_output/")
output_dir.mkdir(exist_ok=True)
for img_path in input_dir.glob("*.jpg"):
print(f"Processing {img_path.name}...")
colorized, _ = colorizer.colorize(str(img_path), verbose=False)
output_path = output_dir / f"colorized_{img_path.name}"
cv2.imwrite(str(output_path), colorized)
print(f"[INFO] Saved to {output_path}")

Enhanced-Zhang-Colorization-with-Object-Aware-Processing/
│
├── README.md # This comprehensive guide
├── requirements.txt # Python dependencies
├── automatically_download_Zhang_models.py # Automatic model downloader
├── verify_models.py # Verify model installation
├── LICENSE # MIT License
│
├── src/ # Source code
│ ├── object_detection_colorization.py # Method 1: Color bleeding prevention
│ ├── facial_feature_enhancement.py # Method 2: Eye & lip correction
│ └── custom_recolorization.py # Method 3: Interactive recoloring
│
├── models/ # Model files (download required)
│ ├── README.md # Detailed download instructions
│ ├── colorization_deploy_v2.prototxt # (Download: 4 KB)
│ ├── colorization_release_v2.caffemodel # (Download: 129 MB)
│ ├── pts_in_hull.npy # (Download: 3 KB)
│ └── shape_predictor_68_face_landmarks.dat # (Download: 99 MB, optional)
│
├── examples/ # Example images and outputs
│ ├── input/ # Sample grayscale images (add your own)
│ └── output/ # Complete pipeline visualizations
│ ├── Combined_object aware_a new zealander vintage lady portrait.jpg
│ ├── Combined_object aware_a young woman vintage portrait.jpg
│ ├── Combined_object aware_horse.jpg
│ ├── Combined_object aware_my image in Prague.jpg
│ ├── Combined_object aware_William Holden actor vintage photo.jpg
│ └── sample_output_running.png
│
└── docs/ # Documentation & comparisons
├── comparison_images/ # Before/after comparisons
│ ├── Comparison_zhang with object aware__eye sclera and Natural lip coloration_a new zelander vintage lady portrait.jpg
│ ├── Comparison_zhang with object aware__eye sclera and Natural lip coloration_a young woman vintage portrait.jpg
│ ├── Comparison_zhang with object aware_prevent color bleeding_a vintage photo from a horse.jpg
│ ├── Comparison_zhang with object aware__Prevent color bleeding and uniform colorization_my image in Prague.jpg
│ ├── Comparison_zhang with object aware__Uniform Colorization and object detection_a vintage racing car.jpg
│ └── Comparison_zhang with object aware_Custom Object Recolorization_William Holden.jpg
│
└── technical_details.md # Deep technical documentation
If you use this enhanced implementation in your research or project, please cite:
@software{zhang_colorization_enhanced_2025,
author = {Hadi Sarhangi Fard},
title = {Zhang Colorization Enhanced: Object-Aware Image Colorization with Facial Feature Correction},
year = {2025},
publisher = {GitHub},
url = {https://github.com/YOUR_USERNAME/Zhang-Colorization-Enhanced},
note = {Advanced implementation of Zhang et al.'s colorization with object detection, facial enhancement, and interactive recolorization}
}

Please also cite the original work this builds upon:
@inproceedings{zhang2016colorful,
title={Colorful Image Colorization},
author={Zhang, Richard and Isola, Phillip and Efros, Alexei A},
booktitle={European Conference on Computer Vision (ECCV)},
year={2016},
organization={Springer}
}

Paper Links:
We welcome contributions! Here's how you can help:
-
New Enhancement Modules
- Hair color detection and correction
- Clothing texture awareness
- Sky/cloud specialized processing
-
Performance Optimizations
- GPU acceleration improvements
- Batch processing enhancements
- Memory usage reduction
-
Additional Segmentation Models
- Mask R-CNN integration
- SAM (Segment Anything Model) support
- Custom training for historical photos
-
UI/UX Improvements
- Web-based interface
- Drag-and-drop functionality
- Real-time preview
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Make your changes
- Add tests if applicable
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 style guide
- Add docstrings to all functions
- Include type hints where appropriate
- Comment complex algorithms
- Update README if adding new features
Issue 1: "Model file not found"
FileNotFoundError: models/colorization_release_v2.caffemodel
Solution: Download the model files using python automatically_download_Zhang_models.py or see models/README.md
Issue 2: "Dlib not installed" warning
[WARNING] dlib library unavailable - facial feature detection disabled
Solution: This is only needed for Method 2 (facial enhancement). Install with:
pip install dlib

If installation fails, see the Dlib installation section above.
Issue 3: Out of memory error
RuntimeError: CUDA out of memory
Solution:
- Use CPU instead: modify the code to use device = torch.device('cpu')
- Process smaller images
- Close other applications
Issue 4: "No objects detected"
[WARNING] No objects detected in image
Solution:
- Image may be too simple or abstract
- Try enabling YOLOv8 (use_yolo=True) for better detection
- Check that the image loaded correctly
Issue 5: Poor colorization results
Colors look washed out or unnatural
Solution:
- Adjust chroma threshold (Method 3): Try values between 3.0-10.0
- Ensure good quality input image (not too compressed)
- Historical photos with heavy damage may need preprocessing
Issue 6: Slow processing
Taking too long to process images
Solution:
- Enable GPU if available
- Use DeepLabV3+ instead of YOLOv8 (faster but less accurate)
- Resize large images before processing:
import cv2
img = cv2.imread('large_image.jpg')
img = cv2.resize(img, (800, 600)) # Resize to smaller dimensions

Tested on: Intel i7-10700K, 32GB RAM, NVIDIA RTX 3070
| Method | Image Size | CPU Time | GPU Time | Objects Detected |
|---|---|---|---|---|
| Object-Aware (DeepLabV3+) | 1024x768 | 8.3s | 2.1s | 5-8 |
| Object-Aware (YOLOv8) | 1024x768 | 12.7s | 3.4s | 12-15 |
| Facial Enhancement | 800x600 | 6.5s | 1.8s | 1-3 faces |
| Custom Recolorization | 1024x768 | 9.1s | 2.3s | 5-8 |
| Original Zhang Only | 1024x768 | 1.2s | 0.4s | N/A |
Notes:
- YOLOv8 detects more objects but takes longer
- Facial enhancement adds ~2-3s for landmark detection
- GPU speeds up by 3-4x on average
-
Image Quality
- Use high-resolution scans (min 800x600)
- Avoid heavily compressed JPEGs
- Ensure good contrast in grayscale
-
Photo Types
- Portraits: Use Method 2 (facial enhancement)
- Complex scenes: Use Method 1 (object-aware)
- Creative control: Use Method 3 (custom recolorization)
- Simple scenes: Any method works
-
Processing Tips
- Test with automatic mode first
- Use YOLOv8 for photos with many objects
- Adjust chroma threshold if too much/little colorization
- For historical photos, consider noise reduction preprocessing
-
Color Accuracy
- Remember: AI predicts plausible colors, not original colors
- For known colors (uniforms, flags), use Method 3 for correction
- Compare with reference photos when available
This project is licensed under the MIT License - see the LICENSE file for details.
You CAN:
- Use commercially
- Modify the code
- Distribute
- Use privately
- Sublicense
You CANNOT:
- Hold authors liable
- Use authors' names for endorsement
You MUST:
- Include original license
- Include copyright notice
- Richard Zhang, Phillip Isola, Alexei A. Efros - Original colorization algorithm and pre-trained models
- UC Berkeley - Research institution supporting the original work
- OpenCV - Computer vision operations
- PyTorch & Torchvision - Deep learning framework and pre-trained models
- Dlib - Facial landmark detection
- Ultralytics - YOLOv8 implementation
- NumPy - Numerical computing
- ImageNet - Training data for Zhang model (1.3M images)
- COCO - Object detection categories (80 classes)
- PASCAL VOC - Semantic segmentation categories (21 classes)
- Stack Overflow community for troubleshooting help
- GitHub contributors and issue reporters
- Reddit communities: r/MachineLearning, r/computervision
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: sarhangifard.hadi@gmail.com (for business inquiries)
- GitHub: https://github.com/Hadifard
- LinkedIn: https://www.linkedin.com/in/hadi-sarhangi-fard-mech-eng/
If you find this project useful, please consider giving it a star!
- Web-based UI interface
- Video colorization support
- Real-time webcam colorization
- Mobile app (iOS/Android)
- Cloud API service
- Hair color detection and correction
- Clothing texture awareness
- Sky/cloud specialized processing
- Batch processing GUI
- Docker container support
- Object-aware colorization
- Facial feature enhancement
- Interactive custom recolorization
- Comprehensive documentation
- Example images and comparisons
- Lines of Code: ~2,500
- Functions: 45+
- Classes: 3 main colorization classes
- Supported Object Categories: 80+ (COCO dataset)
- Supported Image Formats: JPG, PNG, BMP
- Model Size: ~230 MB total (with all models)
- DeOldify - GAN-based colorization
- Colorful Image Colorization (Official) - Original Zhang implementation
- InstColorization - Instance-aware colorization
- ChromaGAN - Adversarial colorization
- [Zhang et al., 2016] Colorful Image Colorization (ECCV)
- [Iizuka et al., 2016] Let there be Color! (SIGGRAPH)
- [Larsson et al., 2016] Learning Representations for Automatic Colorization (ECCV)
- Colorization using Optimization
- Automatic Colorization - Two Minute Papers
- How AI Colorizes Black and White Photos
Made with ❤️ by Hadi Sarhangi Fard
Last Updated: December 2025