saurabhhhcodes/geneinsight-platform

🧬 GeneInsight Platform - AI-Enhanced SaaS Edition

Complete SaaS Bioinformatics Platform with LangChain AI, Multi-Tenant Architecture, and Advanced Molecular Analysis

🚀 Latest Major Update - LangChain Integration

New AI Features:

  • 🧠 LangChain AI Assistant: Conversational AI for molecular analysis powered by microsoft/DialoGPT-small
  • 🦠 COVID-19 Specialized Analysis: Expert insights for viral proteins and drug targeting
  • 💬 Context-Aware Chat: Natural language molecular conversations with memory
  • 🎯 Enhanced Docking: AI-powered molecular docking interpretation and optimization
  • 🔬 Automatic Sequence Detection: Paste protein sequences for instant expert analysis
  • 📊 Real-time AI Status: Live LangChain LLM monitoring and capabilities display
  • 🎨 3D + AI Integration: Interactive molecular visualization combined with AI insights

License: MIT · SaaS Ready · Multi-Tenant · Next.js · Spring Boot · Python · Vercel · GitHub

🚀 Live SaaS Platform: [https://geneinsight-platform.vercel.app](https://geneinsight-platform-latest-euwr1g9lf-8packcoders-projects.vercel.app/)
💰 Pricing Plans: View SaaS Pricing
📊 SaaS Dashboard: Multi-tenant with usage analytics & billing management

🤖 LangChain AI Assistant - Revolutionary Molecular Analysis

💬 Conversational AI for Molecular Biology

Experience the future of molecular analysis with our LangChain-powered AI assistant that understands and responds to natural language queries about molecular structures, sequences, and drug interactions.

🧬 What You Can Do:

  • Paste any protein sequence → Get instant expert analysis with domain identification
  • Ask "covid 19" → Get specialized viral protein analysis and drug targeting guidance
  • Query "What does -9.2 kcal/mol mean?" → Get detailed binding affinity explanations
  • Request "show 3d" → Get 3D visualization guidance and molecular insights
  • Inquire about domains → Get functional site analysis and drug targeting potential
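
As a rough illustration of what a figure such as -9.2 kcal/mol can mean, the sketch below converts a binding free energy into a dissociation constant using ΔG = RT ln(Kd). This is only a back-of-the-envelope aid: docking scores approximate ΔG at best, and the function name and the 298.15 K default are assumptions made for this example, not part of the platform.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to Kd (mol/L) via dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


kd = dissociation_constant(-9.2)
print(f"Kd is roughly {kd:.2e} M")  # sub-micromolar for -9.2 kcal/mol
```

More negative scores correspond to smaller Kd values, that is, tighter predicted binding.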

🎯 AI Capabilities:

  • Automatic Sequence Recognition: Detects protein sequences in chat (at least 50 characters, at least 85% amino-acid letters)
  • COVID-19 Expertise: Specialized analysis for viral proteins (spike, main protease, etc.)
  • Context-Aware Responses: Remembers conversation history and provides relevant insights
  • Scientific Explanations: Expert-level interpretations of molecular data
  • Drug Discovery Insights: Binding affinity analysis and optimization suggestions
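
The sequence-recognition rule above (50+ characters, 85% amino acids) can be sketched as a small check. This is a minimal sketch and not the platform's actual detector: the function name, the whitespace stripping, and the use of the 20 standard one-letter codes are assumptions.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard one-letter codes


def looks_like_protein(text: str, min_length: int = 50, min_fraction: float = 0.85) -> bool:
    """Return True when the text is long enough and mostly amino-acid letters."""
    cleaned = "".join(text.split()).upper()  # drop spaces and line breaks
    if len(cleaned) < min_length:
        return False
    hits = sum(1 for ch in cleaned if ch in AMINO_ACIDS)
    return hits / len(cleaned) >= min_fraction


print(looks_like_protein("covid 19"))
print(looks_like_protein("SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNF"))
```

One caveat of a pure letter-frequency rule: A, C, G, and T are also valid amino-acid codes, so a long DNA string passes this check too and would need a separate nucleotide test.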

🔗 LangChain Integration:

  • Model: microsoft/DialoGPT-small for natural language generation
  • Chains: 2 specialized analysis chains (sequence, docking)
  • Memory: Conversational context preservation
  • Hybrid Approach: Rule-based analysis enhanced with LLM insights
  • Real-time Status: Live monitoring of AI capabilities

🌟 What is GeneInsight Platform?

GeneInsight Platform is a complete Software-as-a-Service (SaaS) bioinformatics platform that combines LangChain AI with advanced molecular visualization. It is built for commercial deployment, with multi-tenant architecture, subscription billing, and enterprise-grade features.

🔬 What Does GeneInsight Platform Do?

GeneInsight Platform transforms complex genetic analysis into an intuitive, web-based experience. Here's what researchers, students, and biotechnology professionals can accomplish:

🧬 Genetic Sequence Analysis

  • Upload & Analyze: Simply paste or upload DNA, RNA, or protein sequences in various formats (FASTA, plain text)
  • Instant Results: Get comprehensive analysis including:
    • Nucleotide/Amino Acid Composition: Detailed breakdown of sequence components
    • GC Content Analysis: Critical for PCR design and gene expression studies
    • Open Reading Frame (ORF) Detection: Identify potential protein-coding regions
    • Motif Recognition: Find regulatory elements and binding sites
    • Molecular Properties: Calculate molecular weight, isoelectric point, and stability
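
Two of the analyses listed above, GC content and ORF detection, can be illustrated in a few lines. This is a simplified sketch under stated assumptions (forward strand only, ATG start, standard stop codons, hypothetical function names); it is not the platform's own algorithm.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) slices of forward-strand ORFs, ATG through stop inclusive."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


dna = "CCATGAAATTTTAGGG"
print(gc_content(dna), [dna[s:e] for s, e in find_orfs(dna)])
```

A fuller implementation would also scan the reverse complement and handle ambiguous bases.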

🎯 AI-Enhanced Molecular Docking

  • Protein-Ligand Docking: Advanced molecular docking simulations with AI interpretation
  • Multiple Binding Modes: Generate and analyze multiple ligand binding poses
  • Binding Affinity Calculation: Accurate kcal/mol scoring with AI explanations
  • Drug-Likeness Assessment: Lipinski Rule of Five validation and optimization
  • AI Chat Integration: Ask questions about docking results and get expert insights
  • 3D Visualization: Interactive visualization of docking results with binding sites
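
The Lipinski Rule of Five mentioned above is simple enough to sketch directly. The example assumes the four descriptors have already been computed elsewhere; the `Descriptors` class and function names are hypothetical, and the aspirin values are approximate literature figures used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    molecular_weight: float  # daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> List[str]:
    """List which of the four Rule of Five criteria a compound violates."""
    checks = {
        "molecular weight > 500 Da": d.molecular_weight > 500,
        "logP > 5": d.log_p > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def is_drug_like(d: Descriptors) -> bool:
    """Common reading of the rule: at most one violation is tolerated."""
    return len(lipinski_violations(d)) <= 1


aspirin = Descriptors(molecular_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(aspirin), is_drug_like(aspirin))
```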

🔬 3D Molecular Visualization

  • Interactive 3D Structures: View and manipulate protein structures in real-time
  • Multiple Visualization Modes: Switch between cartoon, surface, stick, and ball-and-stick representations
  • PDB File Support: Import existing protein structures from the Protein Data Bank
  • Structure Prediction: AI-powered prediction of 3D protein structures from sequences
  • AI + 3D Integration: Combine interactive visualization with conversational AI insights
  • High-Quality Exports: Save publication-ready images and structure files

🤖 LangChain AI-Powered Insights

  • Conversational AI Assistant: Natural language molecular analysis with LangChain
  • Automatic Sequence Detection: Paste protein sequences for instant expert analysis
  • COVID-19 Specialized Analysis: Expert insights for viral proteins and drug targets
  • Context-Aware Responses: AI remembers conversation history and provides relevant insights
  • Binding Affinity Explanations: Detailed scientific interpretations of molecular interactions
  • Domain Analysis: Functional site identification with drug targeting potential
  • Structure Prediction: Machine learning models predict 3D protein structures
  • Disease Association: Analyze potential gene-disease relationships
  • Real-time AI Status: Live monitoring of LangChain LLM capabilities

📊 Professional Workflow Management

  • Project Organization: Organize analyses into projects and folders
  • Team Collaboration: Share results with team members and collaborators
  • Export Options: Download results in PDF, CSV, JSON, and FASTA formats
  • Analysis History: Track and revisit previous analyses with version control

💼 Commercial SaaS Features

  • Multi-Tenant Architecture: Secure, isolated workspaces for different organizations
  • Subscription Plans: Flexible pricing from free tier to enterprise solutions
  • Usage Analytics: Real-time tracking of analyses, storage, and team activity
  • API Access: Programmatic access for integration with existing workflows

🎯 Who Benefits from GeneInsight Platform?

  • 🎓 Students: Learn bioinformatics with an intuitive, modern interface
  • 🔬 Researchers: Accelerate genetic analysis with AI-powered tools
  • 🏥 Clinical Labs: Streamline genetic testing workflows
  • 🏢 Biotech Companies: Scale bioinformatics operations with SaaS efficiency
  • 👥 Research Teams: Collaborate seamlessly with shared workspaces

In essence, GeneInsight Platform makes advanced bioinformatics accessible to everyone - from students learning the basics to professional researchers conducting cutting-edge genetic analysis.

🎯 SaaS Business Model

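| Plan | Price | Users | Analyses/Month | Storage | Target Market |
|------|-------|-------|----------------|---------|---------------|
| Free | $0 | 5 | 100 | 1GB | Students & Individual Researchers |
| Pro | $49/mo | 25 | 1,000 | 10GB | Research Teams & Small Labs |
| Enterprise | $199/mo | Unlimited | Unlimited | 100GB | Large Organizations & Institutions |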

Annual billing available with 20% discount
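
To make the discount concrete, the sketch below computes annual prices on the assumption that annual billing is twelve monthly payments with 20% taken off. The README does not spell out how the discount is applied, so treat that formula, and the names used here, as assumptions.

```python
MONTHLY_PRICE_CENTS = {"free": 0, "pro": 4900, "enterprise": 19900}
ANNUAL_DISCOUNT_PERCENT = 20


def annual_price_cents(plan: str) -> int:
    """Twelve months at the monthly rate, minus the annual discount (integer cents)."""
    full_year = MONTHLY_PRICE_CENTS[plan] * 12
    return full_year * (100 - ANNUAL_DISCOUNT_PERCENT) // 100


for plan in MONTHLY_PRICE_CENTS:
    print(plan, f"${annual_price_cents(plan) / 100:.2f} per year")
```

Working in integer cents avoids floating-point rounding in the price arithmetic.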

💼 Key SaaS Features

🏢 Multi-Tenant Architecture

  • Organization Management: Complete tenant isolation with custom branding
  • Team Collaboration: Owner, Admin, Member, Viewer roles with granular permissions
  • User Invitations: Email-based team member invitations with role assignment
  • Data Isolation: Secure tenant separation with organization-scoped data access
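
One way to picture the four roles is as an ordered hierarchy in which each action requires a minimum role. The action names and the role required for each are hypothetical choices for this sketch; the README only names the roles themselves.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    MEMBER = 2
    ADMIN = 3
    OWNER = 4


# Hypothetical mapping of actions to the lowest role allowed to perform them.
REQUIRED_ROLE = {
    "view_results": Role.VIEWER,
    "run_analysis": Role.MEMBER,
    "invite_member": Role.ADMIN,
    "manage_billing": Role.OWNER,
}


def can(role: Role, action: str) -> bool:
    """A role may perform an action if it ranks at or above the required role."""
    return role >= REQUIRED_ROLE[action]


print(can(Role.MEMBER, "run_analysis"), can(Role.MEMBER, "invite_member"))
```

A strictly ordered hierarchy is the simplest model; truly granular permissions would more likely use explicit per-role permission sets.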

💳 Subscription & Billing System

  • 3-Tier Pricing: Free, Pro ($49/mo), Enterprise ($199/mo) with clear value propositions
  • Usage Tracking: Real-time monitoring of analyses, storage, API calls, and team members
  • Automatic Limits: Usage enforcement with smart upgrade prompts and notifications
  • Stripe Integration: Ready for payment processing with subscription management
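
Usage enforcement with upgrade prompts could look roughly like the following. The monthly analysis limits come from the pricing table; the 80% warning threshold, the status strings, and the function name are assumptions for illustration.

```python
from typing import Optional

# Monthly analysis limits per plan; None means unlimited.
ANALYSIS_LIMITS = {"free": 100, "pro": 1000, "enterprise": None}


def check_usage(plan: str, used: int, warn_at: float = 0.8) -> str:
    """Return 'ok', 'warn' (show an upgrade prompt), or 'blocked' (limit reached)."""
    limit: Optional[int] = ANALYSIS_LIMITS[plan]
    if limit is None:
        return "ok"
    if used >= limit:
        return "blocked"
    if used >= warn_at * limit:
        return "warn"
    return "ok"


print(check_usage("free", 50), check_usage("free", 85), check_usage("free", 100))
```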

📊 SaaS Dashboard & Analytics

  • Usage Overview: Real-time usage statistics with progress bars and percentages
  • Billing Information: Current plan, billing cycle, subscription status display
  • Upgrade Prompts: Smart notifications when approaching usage limits
  • Team Analytics: Member activity and organization usage insights

🔬 Core Scientific Features

  • Multi-format Support: Analyze DNA, RNA, and protein sequences with usage tracking
  • Comprehensive Analysis: Nucleotide composition, GC content, ORF detection, motif identification
  • 3D Molecular Visualization: Interactive 3D viewer powered by 3DMol.js
  • Real-time Results: Instant analysis with detailed visualizations and usage metering
  • Export Options: Download results in multiple formats with plan-based limits

📁 Repository Structure

geneinsight-platform/
├── 🌐 Frontend (Next.js)
│   ├── app/                    # Next.js 13+ App Router
│   │   ├── page.tsx           # Landing page
│   │   ├── dashboard/         # SaaS dashboard
│   │   ├── ai-chat/          # LangChain AI chat interface
│   │   ├── docking/          # Molecular docking with 3D viewer
│   │   └── api/              # Next.js API routes
│   ├── components/           # React components
│   │   ├── langchain-chat.tsx # AI chat component
│   │   └── simple-3d-viewer.tsx # 3D molecular viewer
│   └── lib/                  # Utilities and client-side analysis
│
├── 🧠 ML Service (Python + LangChain)
│   ├── langchain_service/    # LangChain integration
│   │   └── molecular_chain.py # Conversational AI chains
│   ├── docking_service/      # Molecular docking algorithms
│   ├── models/              # ML models and utilities
│   ├── requirements.txt     # Python dependencies
│   └── app.py              # Flask application
│
├── ☕ Backend (Java Spring Boot)
│   ├── src/main/java/       # Java source code
│   │   └── com/geneinsight/ # Application packages
│   ├── src/main/resources/  # Configuration files
│   └── pom.xml             # Maven dependencies
│
├── 🐳 Deployment
│   ├── docker-compose.yml   # Multi-service Docker setup
│   ├── Dockerfile          # Main application container
│   ├── Dockerfile.apillon   # Apillon-optimized container
│   ├── apillon.json        # Apillon deployment config
│   ├── deploy-apillon.sh   # Automated Apillon deployment
│   └── vercel.json         # Vercel configuration
│
└── 📚 Documentation
    ├── README.md           # This file
    ├── DEPLOYMENT.md       # Deployment instructions
    └── APILLON_DEPLOYMENT.md # Apillon-specific guide

🏗️ Technical Architecture

🔗 LangChain AI Integration

Frontend (Next.js) → ML Service (Flask) → LangChain → DialoGPT-small
                                      ↓
                              Rule-based Analysis + LLM Enhancement

🧠 AI Components:

  • LangChain Framework: Conversational AI with memory and context
  • Model: microsoft/DialoGPT-small (351MB, CPU optimized)
  • Analysis Chains: 2 specialized chains (sequence analysis, docking analysis)
  • Output Parser: Custom molecular analysis result parser
  • Memory Management: Conversation context preservation
  • Hybrid Approach: Rule-based analysis enhanced with LLM insights
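
The hybrid approach and conversation memory described above can be sketched without any framework: deterministic rules run first, and their findings are passed to a language model along with recent history. This is an illustrative outline only, not the code in molecular_chain.py; the class, the prompt layout, and the `llm` callable (a stand-in for the DialoGPT-backed chain) are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def rule_based_analysis(message: str) -> Optional[str]:
    """Return a deterministic finding when a simple rule matches, else None."""
    cleaned = "".join(message.split()).upper()
    if len(cleaned) >= 50 and sum(c in AMINO_ACIDS for c in cleaned) / len(cleaned) >= 0.85:
        return f"Detected a protein sequence of {len(cleaned)} residues."
    if "covid" in message.lower():
        return "Routing to COVID-19 viral protein analysis."
    return None


class HybridAssistant:
    def __init__(self, llm: Callable[[str], str], max_turns: int = 5):
        self.llm = llm              # any callable mapping a prompt to text
        self.max_turns = max_turns  # how many past turns to include as context
        self.history: List[Tuple[str, str]] = []

    def reply(self, message: str) -> str:
        facts = rule_based_analysis(message)
        context = "\n".join(
            f"User: {u}\nAssistant: {a}" for u, a in self.history[-self.max_turns:]
        )
        prompt = f"{context}\nUser: {message}\nFacts: {facts or 'none'}\nAssistant:"
        answer = self.llm(prompt)
        if facts:
            answer = f"{facts} {answer}"  # rule-based finding leads the response
        self.history.append((message, answer))
        return answer


assistant = HybridAssistant(llm=lambda prompt: "(model text)")
print(assistant.reply("covid 19"))
```

Keeping the rules outside the model means the factual part of a reply does not depend on a small generative model getting the science right.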

🚀 Multi-Technology Stack:

  • Frontend: Next.js 15.2.4 with TypeScript and Tailwind CSS
  • Backend: Java Spring Boot 3.2.0 with PostgreSQL
  • ML Service: Python Flask with LangChain integration
  • AI Models: LangChain + DialoGPT-small for conversational AI
  • 3D Visualization: Canvas-based molecular viewer
  • Deployment: Apillon (Web3) + Vercel (Frontend) + Docker (Full Stack)

📊 System Capabilities:

  • Real-time AI Chat: Conversational molecular analysis
  • Automatic Sequence Detection: 50+ characters, 85% amino acid threshold
  • Context Awareness: AI remembers conversation history
  • COVID-19 Expertise: Specialized viral protein analysis
  • 3D + AI Integration: Interactive visualization with AI insights
  • Production Ready: Scalable architecture with monitoring

🚀 Quick Start Guide

🌐 Option 1: Apillon (Web3 + Full LangChain) - RECOMMENDED

# 1. Clone repository
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Install Apillon CLI and deploy
npm install -g @apillon/cli
apillon auth login
chmod +x deploy-apillon.sh
./deploy-apillon.sh

# 3. Access full LangChain features on Web3
# Frontend: https://geneinsight.apillon.io
# AI Chat: https://geneinsight.apillon.io/ai-chat

🌐 Option 2: Vercel (Enhanced with Edge Functions)

# 1. Clone the repository
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Install dependencies
npm install

# 3. Deploy to Vercel
npm i -g vercel
vercel --prod

🐳 Option 3: Docker (Full Stack) - One Command Setup

# 1. Clone the repository
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Start all services
docker-compose up -d

# Access: Frontend (3000), Backend (8080), ML Service (5000)

🚂 Option 4: Railway.app (Full LangChain)

# 1. Clone repository
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Install Railway CLI
npm install -g @railway/cli

# 3. Deploy to Railway
railway login
railway init
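# Railway builds from ./Dockerfile by default; to use another file such as
# Dockerfile.railway, set the RAILWAY_DOCKERFILE_PATH service variable first
railway up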

# 4. Access full LangChain features
# Frontend: https://your-app.railway.app
# AI Chat: https://your-app.railway.app/ai-chat

🎨 Option 5: Render.com (FREE Alternative)

# 1. Fork repository on GitHub
# 2. Connect to Render.com
# 3. Use render.yaml configuration
# 4. Auto-deploys on git push
# Features: Full LangChain + PostgreSQL

✈️ Option 6: Fly.io (FREE with Docker)

# 1. Clone repository
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Install Fly CLI
curl -L https://fly.io/install.sh | sh

# 3. Deploy to Fly.io
flyctl auth login
flyctl launch
flyctl deploy

🧠 Option 7: Local Development (Full Features)

# 1. Clone and setup
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform

# 2. Install LangChain dependencies
pip install transformers torch langchain-community

# 3. Start ML service with LangChain
cd ml_service
python app.py

# 4. Start frontend (in a second terminal, from the repository root)
npm install && npm run dev

# 5. Access AI Chat: http://localhost:3001/ai-chat

🎯 Testing LangChain Features:

# Test COVID-19 analysis
curl -X POST http://localhost:5000/langchain/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "covid 19"}'

# Test sequence analysis
curl -X POST http://localhost:5000/langchain/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCSGVTFQ"}'
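
The same requests can be issued from Python using only the standard library. This sketch mirrors the curl calls above (same URL, header, and `message` field); the helper names are made up for the example, and the shape of the response is not documented here, so it is printed as raw JSON.

```python
import json
import urllib.request

CHAT_URL = "http://localhost:5000/langchain/chat"


def build_chat_request(message: str, url: str = CHAT_URL) -> urllib.request.Request:
    """Build the POST request that the curl examples send."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and decode the JSON reply (requires the ML service to be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With the ML service running locally:
#   print(send(build_chat_request("covid 19")))
```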

📊 Deployment Platform Comparison

| Platform | Free Tier | LangChain Support | Database | Auto-Deploy | Best For |
|----------|-----------|-------------------|----------|-------------|----------|
| 🌐 Apillon | Free tier available | ✅ Full Support | ✅ PostgreSQL | ✅ GitHub | Web3 Production |
| 🚂 Railway | $5 credit/month | ✅ Full Support | ✅ PostgreSQL | ✅ GitHub | Production |
| 🎨 Render | ✅ Unlimited | ✅ Full Support | ✅ PostgreSQL | ✅ GitHub | Development |
| ✈️ Fly.io | 3 VMs free | ✅ Docker Support | ✅ PostgreSQL | ✅ GitHub | Global Edge |
| 🌐 Vercel | ✅ Unlimited | ✅ Edge Functions | ❌ External | ✅ GitHub | Enhanced Demo |
| 🐳 Local | ✅ Free | ✅ Full Support | ✅ PostgreSQL | ❌ Manual | Development |

🎯 Recommendations:

  • 🌐 Apillon: Best for Web3 deployment with full LangChain features and decentralized hosting
  • 🚂 Railway: Great for production deployment with complete AI functionality
  • 🎨 Render: Excellent for development and testing with unlimited free tier
  • ✈️ Fly.io: Perfect for global deployment with edge computing
  • 🌐 Vercel: Enhanced with Edge Functions for AI simulation and client-side analysis
  • 🐳 Local: Ideal for development and testing all features

🌐 Apillon Web3 Deployment

Why Apillon?

  • ✅ Full Backend Support: Deploy Python + LangChain on Web3
  • ✅ Decentralized Hosting: Censorship-resistant infrastructure
  • ✅ Real LLM: Complete conversational AI functionality
  • ✅ Auto-Scaling: Handles traffic automatically
  • ✅ Database Support: PostgreSQL included
  • ✅ Cost-Effective: Pay for what you use

Quick Apillon Deploy:

# Install Apillon CLI
npm install -g @apillon/cli

# Login and deploy
apillon auth login
chmod +x deploy-apillon.sh
./deploy-apillon.sh

# Your live URLs:
# https://geneinsight.apillon.io
# https://api.geneinsight.apillon.io

Apillon Features:

  • 🧬 Complete AI Platform: All LangChain features working
  • 🦠 COVID-19 Analysis: Expert viral protein insights
  • 🎯 Molecular Docking: Protein-ligand simulations
  • 🔬 3D Visualization: Interactive molecular viewer
  • 📊 Dashboard: SaaS features with subscription management
  • 🌐 Web3 Infrastructure: Decentralized, global distribution

🔌 SaaS API Endpoints

🏢 Organization Management

GET /api/organizations          // Get organization details
POST /api/organizations         // Create new organization
PUT /api/organizations          // Update organization settings

💳 Subscription & Billing

GET /api/subscriptions          // Get subscription status and usage
POST /api/subscriptions         // Upgrade subscription plan
GET /api/usage/track           // Get usage statistics
POST /api/usage/track          // Track usage (automatic)
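
A thin client for the endpoints listed above might validate the method and path before building a request. This is a sketch under assumptions: the bearer-token header is a guess based on the JWT authentication mentioned elsewhere in this README, and the base URL and token in the usage line are placeholders.

```python
import json
import urllib.request
from typing import Optional

# Method/path pairs taken from the endpoint lists above.
ROUTES = {
    ("GET", "/api/organizations"), ("POST", "/api/organizations"), ("PUT", "/api/organizations"),
    ("GET", "/api/subscriptions"), ("POST", "/api/subscriptions"),
    ("GET", "/api/usage/track"), ("POST", "/api/usage/track"),
}


def build_request(base_url: str, method: str, path: str, token: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request for one of the documented routes."""
    if (method, path) not in ROUTES:
        raise ValueError(f"{method} {path} is not a documented endpoint")
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base_url.rstrip("/") + path, data=data,
                                  headers=headers, method=method)


req = build_request("http://localhost:3000", "GET", "/api/subscriptions", token="PLACEHOLDER")
print(req.get_method(), req.full_url)
```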

🛠 Technology Stack

Frontend

  • Next.js 15 with TypeScript for type-safe, performant web application
  • Tailwind CSS for responsive, modern UI design
  • 3DMol.js for hardware-accelerated 3D molecular rendering

Backend

  • Spring Boot 3.2 with Java 17 for enterprise-grade API services
  • MySQL 8.0 with Redis caching for optimal data management
  • JWT Authentication with multi-tenant support

ML Service

  • Python 3.11 with Flask for advanced bioinformatics algorithms
  • Scikit-learn for machine learning capabilities
  • NumPy & Pandas for data processing

SaaS Infrastructure

  • Multi-tenant database design with proper isolation
  • Stripe integration for subscription billing
  • Usage tracking and limits enforcement
  • Role-based access control with organization management

📖 Using the Application

🎯 Getting Started as a User

  1. 🏠 Homepage & Navigation
    • Visit the application URL
    • Navigate: Home, Analyze, Visualize, Pricing, Dashboard

  2. 🧬 Analyzing Sequences
    • Go to /analyze page
    • Input DNA, RNA, or protein sequences
    • Supported formats: FASTA, PDB, plain text
    • View comprehensive results with usage tracking

  3. 🧪 3D Molecular Visualization
    • Go to /visualize page
    • Load structures via PDB ID or file upload
    • Interactive controls: rotate, zoom, pan, reset
    • Multiple rendering styles and color schemes

  4. 📊 Managing Your Organization
    • Access SaaS dashboard for usage analytics
    • Manage team members and permissions
    • Monitor billing and subscription status
    • Upgrade plans as needed

💰 Business Benefits

🚀 For SaaS Entrepreneurs

  • Complete SaaS foundation with proven business model
  • Scalable technology built on modern, cloud-native stack
  • Market validation in the $4.2B bioinformatics market

🏢 For Organizations

  • Cost-effective with pay-as-you-scale pricing
  • Team collaboration with enterprise security
  • Professional support based on subscription tier

🔬 For Researchers

  • Instant access with no software installation
  • Collaborative analysis sharing with team members
  • Always updated with latest features and algorithms

🤝 Contributing

We welcome contributions! Check out our Contributing Guide and look for Good First Issues.

🛠️ Development Setup

# Clone and setup
git clone https://github.com/saurabhhhcodes/geneinsight-platform.git
cd geneinsight-platform
npm install
npm run dev

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • 3DMol.js Team - For excellent 3D molecular visualization capabilities
  • Next.js & Vercel Team - For the amazing React framework and deployment platform
  • Open Source Community - For inspiration, feedback, and contributions
  • Bioinformatics Community - For domain expertise and scientific guidance

🧬 Built with ❤️ for the bioinformatics community

Ready to transform your bioinformatics workflow? 🚀 Get Started | 💰 View Pricing | 📖 Documentation
