Skip to content
View amramer's full-sized avatar
🏠
Open to work
🏠
Open to work

Block or report amramer

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don't include any personal information such as legal names or email addresses. Markdown supported. This note will be visible to only you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
amramer/README.md

Amr Amer

Machine Learning Engineer • Computer Vision • Multimodal Generative Models

Building intelligent systems for visual understanding, behavior generation, and real-time AI.

Profile Views

Amr Amer Business Card

Portfolio Email LinkedIn


🚀 Featured Projects

Multimodal generative model for generating realistic listener avatars in dyadic conversations, conditioned on personality traits. The model predicts facial expressions, head motion, and upper-body gestures of a listener from the speaker’s audio and motion signals.

personality-aware avatars

  • Tech Stack: PyTorch · Vision-Transformers · VQ-VAE · SMPL-X (PIXIE)-Librosa · OpenCV · CUDA · TensorBoard SLURM · Enroot · Multi-GPU Training (A100)

  • Demo: Thesis Website

  • Paper: Full Thesis Document


End-to-end computer vision pipeline for badminton match analysis that converts badminton match videos into structured performance analytics for players and coaches. The system tracks both players and the shuttlecock, detects shot events, projects motion onto a mini-court representation, and generates a downloadable coach-style performance report.

badminton analysis

  • Tech Stack: Python · PyTorch · OpenCV · YOLO · ByteTrack · Streamlit · SAM · Plotly · Docker · CUDA TrackNet · CI/CD Pipeline · Homography · Body Poses estimation · 3D Motion Trajectories

  • Demo: Live Demo | CV. Pipeline Video | Dashboard Video


This project focuses on semantic segmentation using the BDD100K dataset, a large-scale, diverse dataset for autonomous driving. The main objective is to accurately segment and identify various objects in street scenes, which is important for improving the AI perception vision of autonomous vehicles.

autonmous driving

  • Tech Stack: PyTorch · Fastai · Semantic Segmentation · Hyperparamter Tunning · Weights & Biases

An end-to-end workflow for multi-label brain tumor segmentation from 3D multimodal MRI scans. The project targets segmentation of glioma subregions (tumor core, whole tumor, enhancing tumor) using a 3D SegResNet model.

Medical Imaging

Tech Stack: PyTorch · MONAI · 3D SegResNet · Multi-modal MRI 3D Medical Image Transforms · Sliding Window Inference · Experiment Tracking (W&B)


This repository contains a curated set of Jupyter notebooks demonstrating core computer vision and vision–language capabilities using VLM foundation models. The notebooks progress from offline image understanding tasks to a realtime webcam application that performs image captioning and image classification on live video streams.

Vision-Langauge Models Application

  • Tech Stack: PyTorch · Torchvision · Hugging Face Transformers · Gradio · PIL · VLM · Realtime inference

🎙️ AI Conversational Agent (Ongoing)

This AI Agent is more than just a chatbot. It is real-time voice/text interaction, LLM-powered reasoning, and a digital human conversational model.

Conversation AI Bot

  • Tech Stack: Python · OpenAI GPT · Speech-to-Text · Text-to-Speech · HTML · CSS · JavaScript · Bootstrap · jQuery

🌐 Portfolio

For more details about my work and projects, visit my portfolio website.


🧠 Technical Focus

Area Skills & Tools
Computer Vision Object Detection, Segmentation, Tracking, Video Motion Analysis, Digital Human Models
Models & Frameworks PyTorch, TensorFlow, OpenCV, YOLO, Vision Transformers (ViT), Supervision
Training & Evaluation Transfer Learning, Hyperparameter Optimization, Weights & Biases, TensorBoard
Deployment Docker, AWS (EC2 GPU), Streamlit, Batch & Real-time Inference
Software Engineering Python, C++, Object-Oriented Design, Git, Debugging, Unit Testing

Profile Views

Pinned Loading

  1. amramer.github.io amramer.github.io Public

    Portfolio website showcasing my Master's thesis, computer vision projects, interactive demos, and ML engineering work.

    SCSS 1

  2. Personality-Aware-Non-verbal-Behavior-Generation Personality-Aware-Non-verbal-Behavior-Generation Public

    Thesis Project: A multimodal transformer-based generative model that creates listener avatars conditioned on personality traits to produce realistic non-verbal responses (facial expressions, body a…

    Python 1

  3. Semantic-Segmentation-for-Autonomous-Vehicles Semantic-Segmentation-for-Autonomous-Vehicles Public

    This project focuses on semantic segmentation using the BDD100K dataset, a large-scale, diverse dataset for autonomous driving. The main objective is to accurately segment and identify various obje…

    Jupyter Notebook 5

  4. brain-tumor-segmentation-3D-DeepLearning brain-tumor-segmentation-3D-DeepLearning Public

    3D multi-label brain tumor segmentation from multimodal MRI (FLAIR, T1w, T1gd, T2w) using Decathlon data. End-to-end workflow with 3D SegResNet, Dice loss, and Mean Dice evaluation for glioma sub-r…

    Jupyter Notebook 1

  5. realtime-vision-captioning realtime-vision-captioning Public

    This repository contains a small set of Jupyter notebooks demonstrating key computer vision and vision–language tasks using pretrained models. The final notebook integrates these tasks into a realt…

    Jupyter Notebook 1