PM-0125/README.md

🧠 Pranjul Mishra — Machine Learning Engineer (NLP • RAG • MLOps • Data Pipelines)


I design and build end-to-end ML systems, NLP/RAG pipelines, and production-grade Python/MLOps tooling.
My engineering focus is on reproducibility, structured retrieval, robust data pipelines, and clean, testable ML code.

I specialize in:

  • NLP / RAG systems (structured retrieval, SPARQL reasoning, knowledge graphs)
  • Model reproducibility (MLflow, DVC, deterministic pipelines)
  • Python engineering (packaging, CI/CD, testing, modular design)
  • High-performance data processing (parallel pipelines, QC systems)

I care about building ML systems that are reliable, interpretable, and easy for others to run and extend.


🔭 What I’m Working On

🧬 SvPhaser — High-Performance SV Phasing Pipeline

Parallel, chromosome-aware phasing tool with confidence scoring.
Packaged as a CLI with full tests, CI, and docs.
➡️ https://github.com/SFGLab/SvPhaser

📚 INFERMed — Clinical RAG/NLP System with SPARQL Reasoning

A complete clinical retrieval assistant using DrugBank/FDA/PubChem RDF, multi-stage retrieval, and structured evidence reasoning.
➡️ https://github.com/PM-0125/INFERMed

📡 LOPHOS — Reproducible Data Processing & QC Pipeline

Format-aware parsing, QC metrics, deterministic workflows, and reproducible CLI tooling (pytest/mypy/ruff + CI).
➡️ https://github.com/SFGLab/lophos


🧰 Core Engineering Skills

Languages: Python · C++ · SQL · SPARQL
ML/AI: PyTorch · TensorFlow · XGBoost · scikit-learn · NLP · RAG · Feature Engineering
MLOps: MLflow · DVC · Docker · Conda · GitHub Actions · CI/CD · pytest · mypy · ruff · black
Data/DB: Pandas · NumPy · PostgreSQL · MySQL · RDF Knowledge Graphs · Apache Jena · QLever
Dev/Platforms: Linux · Git/GitLab · VS Code · Google Colab · GCP (Cloud Run / Vertex AI basics)


🌟 Featured Projects (ML / NLP / Data Engineering)

🧬 SvPhaser

Parallel SV phasing with confidence scoring; shipped as a reproducible CLI tool with tests/CI/docs.
➡️ https://github.com/SFGLab/SvPhaser

🧠 INFERMed — Clinical Retrieval System

RAG-style structured retrieval + reasoning over biomedical knowledge graphs.
➡️ https://github.com/PM-0125/INFERMed

📡 LOPHOS — Deterministic Data Processing

High-performance QC pipeline with reproducible environments and automation.
➡️ https://github.com/SFGLab/lophos

🔍 Structural Variant Detection

Algorithmic pipeline integrating read-depth and split-read signals.
➡️ https://github.com/PM-0125/Computational-Genomics/tree/main/Structural_Variant_Detection_Algorithm

🧪 Breast Cancer Survival Analysis

Comparative ML modelling (XGBoost/PCA/IF) on the METABRIC dataset.
➡️ https://github.com/PM-0125/AI_ML_Projects/tree/main/Advanced%20Breast%20Cancer%20Analysis


🎓 Education

  • M.Sc., Computer Science & Information Systems (AI) — Warsaw University of Technology
  • B.Tech., Computer Engineering (AI) — Marwadi University

📺 AI Pathfinders — Latest Videos

I teach ML in a practical, builder-first way.

  • Introduction | Artificial Intelligence | Education For All (Part 1)
  • Decoding AI: AI vs ML vs Deep Learning
  • The Machine Learning Landscape - Part 1


🤝 Contact

📧 Email: pranjulmishra228161@gmail.com
🔗 LinkedIn: https://www.linkedin.com/in/pranjul-mishra/
💻 GitHub: https://github.com/PM-0125

Pinned

  1. AI_ML_Projects (Public)

    Algorithms for automatic essay grading using an attention mechanism, a deep learning technique commonly paired with RNNs.

    Jupyter Notebook 2

  2. Computational-Genomics (Public)

    My findings and work on genomic data analysis.

    Python 1

  3. INFERMed (Public)

    Intelligent Navigator for Fused Evidence-based Retrieval in Medicine

    Python

  4. SFGLab/SvPhaser (Public)

    Optimal Tool to Phase Structural Variants

    Python 1

  5. ashreya2003/Academic-Advisor-using-RAG (Public)

    RAGAdvisor is a simple AI-powered academic advisor bot built using Google Gemini and Retrieval-Augmented Generation (RAG). It helps students by understanding their academic records (in JSON format)…

    CSS

  6. SFGLab/lophos (Public)

    LOPHOS: LOops & Peaks HaplOtype phasing Suite. Allele-specific peak & loop phasing for HiChIP data.

    Python