Skip to content

NexusAI is a comprehensive AI platform that combines multiple AI capabilities into a single, user-friendly Streamlit dashboard. Built with Python and powered by Groq, OpenAI, Azure OpenAI, and Google Gemini, NexusAI offers chat, image analysis, image generation, text-to-speech, audio transcription, and document chat services.

License

Notifications You must be signed in to change notification settings

Amul-Thantharate/NexusAI

πŸ€– NexusAI - AI-Powered Chat & Image Platform

License Python Streamlit Groq OpenAI Azure OpenAI Google Gemini

NexusAI is a comprehensive AI platform that combines multiple AI capabilities into a single, user-friendly Streamlit dashboard. Built with Python and powered by Groq, OpenAI, Azure OpenAI, and Google Gemini, NexusAI offers chat, image analysis, image generation, text-to-speech, audio transcription, and document chat services.

πŸ†• New UI Features

The updated UI includes:

  1. Home Page - Introduction and project overview with author information
  2. API Key Setup - Dedicated page for configuring all API keys with instructions
  3. Chat - Enhanced chat interface with Groq's LLMs
  4. Image Analysis - Improved image analysis with multimodal models
  5. Image Generation - Create images with OpenAI or Azure OpenAI DALL-E 3
  6. Text-to-Speech - Convert text to natural speech with multiple voice options
  7. Speech-to-Text - Transcribe audio with advanced Whisper models
  8. Document Chat - Chat with your documents using Google Gemini
  9. Thank You Page - Acknowledgment page with author information

✨ Features

  • πŸ’¬ AI Chat: Engage in conversations with advanced LLMs like Llama 3
  • πŸ–ΌοΈ Image Analysis: Upload and analyze images with multimodal AI models
  • 🎨 Image Generation: Create stunning images with OpenAI or Azure OpenAI DALL-E 3
  • πŸ—£οΈ Text-to-Speech: Convert text to natural-sounding speech
  • 🎀 Audio Transcription: Transcribe audio files with Whisper models
  • πŸ“š Document Chat: Chat with your documents using Google Gemini and ChromaDB
  • πŸ“± Responsive UI: Clean, intuitive interface built with Streamlit

πŸš€ Getting Started

Prerequisites

  • Python 3.8 or higher
  • API keys for:
    • Groq (for chat, image analysis, TTS, and transcription)
    • OpenAI (for image generation with DALL-E 3)
    • Azure OpenAI (for alternative image generation with DALL-E 3)
    • Google Gemini (for document chat)

πŸ”‘ API Keys

NexusAI requires API keys to function. You can enter these directly in the application's API Setup page:

Groq API Key

  1. Visit Groq's website and create an account
  2. Navigate to the API section in your dashboard
  3. Generate a new API key
  4. Copy the key and add it to the application

OpenAI API Key

  1. Visit OpenAI's website and create an account
  2. Navigate to the API section in your dashboard
  3. Generate a new API key
  4. Copy the key and add it to the application

Azure OpenAI API Key

  1. Create an Azure account if you don't have one
  2. Set up an Azure OpenAI resource
  3. Deploy a DALL-E 3 model
  4. Get your API key, endpoint, and deployment name
  5. Add these to the application

Google Gemini API Key

  1. Visit Google AI Studio and create an account
  2. Create a new API key
  3. Copy the key and add it to the application

Installation

  1. Clone the repository
git clone https://github.com/yourusername/NexusAI.git
cd NexusAI
  1. Setup and Run (Linux/Mac)
# Make the run script executable
chmod +x run.sh

# Run the application
./run.sh
  1. Setup and Run (Windows PowerShell)
# You may need to set execution policy
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

# Run the application
.\run.ps1

The script will:

  • Create a virtual environment
  • Install required dependencies
  • Launch the Streamlit application

The application will be available at http://localhost:8501 in your web browser.

🧩 Features in Detail

AI Chat

  • Engage with advanced language models
  • Supports Llama 3 (8B and 70B) and Mixtral models
  • Adjustable parameters for temperature and token limits

Image Analysis

  • Upload images in various formats (JPG, PNG, WEBP)
  • Analyze images with multimodal AI models
  • Customizable analysis prompts

Image Generation

  • Generate images with OpenAI DALL-E 3 or Azure OpenAI DALL-E 3
  • Multiple size options (1024x1024, 1792x1024, 1024x1792)
  • Download generated images

Text-to-Speech

  • Convert text to natural-sounding speech
  • Multiple voice options
  • Various output formats (WAV, MP3, AAC, FLAC, PCM)

Audio Transcription

  • Transcribe audio files with Whisper models
  • Support for multiple languages
  • Word-level timestamps

Document Chat

  • Upload and process documents (PDF, TXT, CSV)
  • Chat with your documents using Google Gemini
  • Vector storage with ChromaDB
  • Source attribution for answers

πŸ“Š Application Structure

NexusAI is organized into a modular structure:

NexusAI/
β”œβ”€β”€ main.py                  # Main application with navigation
β”œβ”€β”€ chat_module.py           # Chat functionality
β”œβ”€β”€ image_analysis_module.py # Image analysis functionality
β”œβ”€β”€ image_generation_module.py # Image generation functionality
β”œβ”€β”€ tts_module.py            # Text-to-speech functionality
β”œβ”€β”€ stt_module.py            # Speech-to-text functionality
β”œβ”€β”€ document_chat_module.py  # Document chat interface
β”œβ”€β”€ document_chat.py         # Document chat backend
β”œβ”€β”€ requirements.txt         # Python dependencies
β”œβ”€β”€ run.sh                   # Linux/Mac launcher script
└── run.ps1                  # Windows PowerShell launcher script

πŸ”§ Configuration

NexusAI supports various configuration options through model selection dropdowns and parameter sliders in the UI. Key configurable parameters include:

  • Chat models (Llama 3, Mixtral)
  • Image analysis models
  • Image generation providers (OpenAI or Azure OpenAI)
  • TTS voices and formats
  • Transcription models and languages
  • Temperature and other generation parameters
  • Document processing options

πŸ› οΈ Technical Details

NexusAI is built with:

  • Streamlit: For the web interface
  • Groq API: For chat, image analysis, TTS, and transcription
  • OpenAI API: For image generation with DALL-E 3
  • Azure OpenAI API: For alternative image generation with DALL-E 3
  • Google Gemini API: For document embeddings and chat
  • ChromaDB: For vector storage
  • LangChain: For document processing and retrieval
  • Python libraries: Including Pillow, requests, and dotenv

For a detailed overview of the system architecture, please see the ARCHITECTURE.md document.

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. Make sure to read the Code of Conduct and Contributing Guidelines.

πŸ‘¨β€πŸ’» Authors

  • Amul Thantharate - An AI enthusiast passionate about creating innovative solutions. GitHub | LinkedIn
  • Avanti Nandanwar - A dedicated AI developer contributing to this project. GitHub | LinkedIn

πŸ™ Acknowledgements


⭐ If you find this project useful, please consider giving it a star on GitHub!

About

NexusAI is a comprehensive AI platform that combines multiple AI capabilities into a single, user-friendly Streamlit dashboard. Built with Python and powered by Groq, OpenAI, Azure OpenAI, and Google Gemini, NexusAI offers chat, image analysis, image generation, text-to-speech, audio transcription, and document chat services.

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published