CNN that classifies red blood cells as Parasitized (malaria-infected) or Uninfected using microscope images from the NIH dataset.
Accuracy: 95.43% on held-out test data
Takes a 64x64 microscope image of a blood cell and predicts whether the Plasmodium parasite is present. Uses the 27,558-image dataset (balanced classes) from the National Institutes of Health, with 80% of the images used for training and 20% held out for testing.
3-block CNN built with TensorFlow/Keras:
- Block 1: 32 filters (3x3) → ReLU → MaxPool — picks up edges and color spots
- Block 2: 64 filters (3x3) → ReLU → MaxPool — combines into shapes
- Block 3: 128 filters (3x3) → ReLU → MaxPool — recognizes parasite-like structures
Then a dense layer (128 units) with 50% dropout, and a sigmoid output for binary classification.
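The architecture above can be sketched in Keras roughly as follows. This is a minimal sketch, not the exact code from malaria_detection.py: the function name `build_model`, the 3-channel input, `padding="same"`, the 2x2 pool size, and the ReLU activation on the dense layer are assumptions, since the description does not state them. The compile settings follow the training description (Adam, binary cross-entropy).

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_model(input_shape=(64, 64, 3)):
    """Hypothetical helper: 3 conv blocks, dense head, sigmoid output."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Block 1: 32 filters (3x3) -> ReLU -> MaxPool
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),  # padding is an assumption
        layers.MaxPooling2D((2, 2)),                                   # pool size is an assumption
        # Block 2: 64 filters (3x3) -> ReLU -> MaxPool
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        # Block 3: 128 filters (3x3) -> ReLU -> MaxPool
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        # Classifier head: dense layer (128 units) with 50% dropout
        layers.Flatten(),
        layers.Dense(128, activation="relu"),  # activation is an assumption
        layers.Dropout(0.5),
        # Single sigmoid unit for the binary prediction
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_model()
model.summary()
```

Calling `model.summary()` prints the per-layer output shapes, which is a quick way to check how the 64x64 input shrinks through the three pooling steps.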
Trained for 10 epochs with Adam optimizer and binary cross-entropy loss. 80/20 train-test split, stratified to keep class balance.
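To illustrate what the stratified 80/20 split does, here is a standard-library sketch that splits each class separately so both sets keep the same class ratio. The helper `stratified_split` and the seed are hypothetical, and the script itself may well do this differently; in practice scikit-learn's `train_test_split` with the `stratify=` argument does the same job.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Hypothetical helper: return (train_idx, test_idx), split per class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed (an assumption) for a repeatable split
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class to preserve the balance
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Label list with the class counts of the dataset
labels = ["Parasitized"] * 13779 + ["Uninfected"] * 13779
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Because each class is split on its own, the test set ends up with equally many Parasitized and Uninfected cells, so accuracy on it is not skewed toward either class.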
- Test accuracy: 95.43%
- Generates training_history.png (accuracy/loss curves) and predictions.png (sample predictions with color-coded correctness)
Run on Kaggle with GPU enabled:
python malaria_detection.py

It auto-detects the Kaggle dataset path. For local use, put the cell_images/ folder (with Parasitized/ and Uninfected/ subfolders) in the same directory.
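One way such path detection could work is sketched below. This is not the script's actual logic: the function `find_cell_images` and the search order are assumptions. It walks the Kaggle input mount (`/kaggle/input`) and then the current directory, and returns the first folder that contains both class subfolders.

```python
import os

CLASS_DIRS = ("Parasitized", "Uninfected")


def find_cell_images(search_roots=("/kaggle/input", ".")):
    """Hypothetical helper: first directory holding both class subfolders, or None."""
    for root in search_roots:
        # os.walk yields nothing for a root that does not exist,
        # so a missing /kaggle/input is skipped silently on local machines
        for dirpath, dirnames, _ in os.walk(root):
            if all(name in dirnames for name in CLASS_DIRS):
                return dirpath
    return None


data_dir = find_cell_images()
print("Dataset directory:", data_dir)
```

Checking for the two subfolders rather than a fixed path keeps the same code usable on Kaggle and locally.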
- 27,558 images total
- 13,779 Parasitized + 13,779 Uninfected
- Thin blood smear slides, stained and photographed under microscope
- How CNNs extract features hierarchically (edges → shapes → complex patterns)
- Why stratified splits matter for balanced evaluation
- The effect of dropout on reducing overfitting in small-ish datasets
- Image preprocessing and normalization for neural networks
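The preprocessing point can be illustrated with a short sketch: resize each cell image to the 64x64 input size and scale pixel values from 0-255 down to the 0-1 range. The `preprocess` helper and the use of Pillow are assumptions made for illustration; the script may load and resize images another way.

```python
import numpy as np
from PIL import Image


def preprocess(image, size=(64, 64)):
    """Hypothetical helper: resize a PIL image and scale pixels to [0, 1]."""
    resized = image.convert("RGB").resize(size)
    # float32 in [0, 1] keeps inputs on a small, consistent scale for training
    return np.asarray(resized, dtype=np.float32) / 255.0


# Random stand-in for a microscope image (120 rows x 100 columns, RGB)
demo = Image.fromarray(np.random.randint(0, 256, (120, 100, 3), dtype=np.uint8))
x = preprocess(demo)
print(x.shape, x.dtype)
```

Scaling to a fixed range matters because raw 0-255 inputs produce large activations early in training, which tends to make optimization less stable.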