
Project JarvisBox: The AI-Powered Voice Assistant Terminal

Status: In Progress – Alpha Prototype
Created by: Matthew Busel – AI/ML Engineer & Systems Developer


Overview

JarvisBox is a voice-activated, AI-powered, hardware-integrated assistant designed to replicate the functionality of a real-world "JARVIS" system. It combines LLM intelligence, multimodal input/output, and physical expressiveness into a standalone smart terminal. Think ChatGPT… but it talks, listens, learns, automates tasks, and lives on your desk.


⚙️ Features (Planned & In Progress)

Core AI Capabilities

  • GPT-powered voice interaction (ChatGPT API, OpenAI/Local LLMs; see the sketch after this list)
  • Long-term memory & personalization
  • Calendar, weather, and task querying
  • Plugin-style custom agents (file parsing, market scan, etc.)
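
The ask-and-answer core could look like the following minimal sketch, using the OpenAI Python client; the model name, system prompt, and `ask_jarvis` helper are illustrative assumptions, not settled project code:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_jarvis(prompt: str, history: list[dict]) -> str:
    """Send the transcribed utterance plus rolling chat history to the LLM."""
    messages = [{"role": "system",
                 "content": "You are JarvisBox, a concise desk assistant."}]  # placeholder persona
    messages += history
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat-capable model works here
        messages=messages,
    )
    reply = response.choices[0].message.content
    history += [{"role": "user", "content": prompt},
                {"role": "assistant", "content": reply}]
    return reply
```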

Voice Interface

  • Wake-word detection + passive listening ("Hey Jarvis"; see the sketch after this list)
  • Natural TTS (ElevenLabs / Coqui.ai)
  • Speech-to-text with noise cancellation (Whisper, Deepgram, etc.)
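
A sketch of the STT path, assuming the local Whisper route (`openai-whisper`); the substring check is a crude stand-in for a dedicated wake-word engine such as Porcupine or openWakeWord:

```python
import whisper  # pip install openai-whisper

stt = whisper.load_model("base")  # "base" trades accuracy for Pi-class speed

def transcribe(wav_path: str) -> str:
    """Run local speech-to-text on a recorded audio clip."""
    return stt.transcribe(wav_path)["text"].strip()

def heard_wake_word(text: str) -> bool:
    """Toy wake-word check; a real build would use a dedicated engine."""
    return "hey jarvis" in text.lower()
```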

Agent Utilities

  • Local system automation: open apps, write scripts, Git integration (see the sketch after this list)
  • Home automation (Home Assistant / IoT hooks)
  • GitHub repo scaffolding, commit summaries, file watcher
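
One safe way to wire voice intents to local actions is a whitelist of commands; the intents and commands below are made-up examples:

```python
import shlex
import subprocess

# Whitelisted intent -> command mapping; entries here are illustrative only.
COMMANDS = {
    "open browser": "xdg-open https://github.com",
    "git status": "git status --short",
}

def run_intent(intent: str) -> str:
    """Run a recognized phrase's whitelisted command and return its output."""
    cmd = COMMANDS.get(intent)
    if cmd is None:
        return f"No automation bound to {intent!r}"
    result = subprocess.run(shlex.split(cmd), capture_output=True, text=True)
    return result.stdout or result.stderr
```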

On-Device Intelligence (Stretch Goals)

  • Quantized LLM support (Mistral, LLaMA3, Ollama)
  • Offline fallback mode (see the sketch after this list)
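
Ollama serves quantized models over a local HTTP endpoint, so the fallback can be a plain POST; the default model name and timeout below are assumptions:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def ask_local(prompt: str, model: str = "mistral") -> str:
    """Query a quantized local model through Ollama when the cloud API is unreachable."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,  # assumed generous budget for Pi-class hardware
    )
    resp.raise_for_status()
    return resp.json()["response"]
```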

Physical Expressiveness

  • LED ring for emotion/confidence output
  • Servo rotation toward speaker (servo and LED control sketched after this list)
  • OLED face display for expression, waveform, or info
  • Proximity/motion sensing (reacts when you enter the room)
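
A rough sketch of the servo and LED-ring hooks, assuming gpiozero for the servo and an Adafruit NeoPixel ring for the LEDs; the pin numbers and ring size are wiring-dependent assumptions:

```python
from gpiozero import Servo
import board
import neopixel  # pip install adafruit-circuitpython-neopixel

servo = Servo(17)                        # GPIO17 is an assumption; match your wiring
ring = neopixel.NeoPixel(board.D18, 12)  # assumed 12-pixel ring on GPIO18

def face_speaker(direction: float) -> None:
    """Turn toward the estimated speaker direction; -1.0 .. 1.0 spans the sweep."""
    servo.value = max(-1.0, min(1.0, direction))

def show_mood(rgb: tuple[int, int, int]) -> None:
    """Flood the ring with a mood color, e.g. blue = listening, green = confident."""
    ring.fill(rgb)
```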

Hardware Stack

  • Raspberry Pi 5 (or Jetson Nano / local API node)
  • ReSpeaker 2-Mic Array or USB mic
  • Hi-Fi speaker / Bluetooth audio
  • OLED screen (0.96" I2C), optional webcam
  • Servo motor, RGB LED strip, optional NFC tag switch

Tech Stack

  • Python, Node.js (for hardware API bridges)
  • OpenAI / LLaMA (Ollama for local testing)
  • PyTorch, OpenCV, Whisper, Coqui TTS
  • MQTT for hardware comms (bridged to the local API in the sketch after this list)
  • Flask/FastAPI for local API exposure
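
The FastAPI and MQTT pieces can meet in a small bridge like this; the route and topic names (`/led/{color}`, `jarvisbox/led`) are hypothetical:

```python
from fastapi import FastAPI
from paho.mqtt import publish  # pip install paho-mqtt

app = FastAPI()

@app.post("/led/{color}")
def set_led(color: str) -> dict:
    """Publish a hardware command; a Pi-side listener on the topic acts on it."""
    publish.single("jarvisbox/led", payload=color, hostname="localhost")
    return {"ok": True, "color": color}
```

Started with `uvicorn main:app`, this keeps hardware control on the local network without touching the cloud path.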

Roadmap

  • Basic GPT voice interface + STT/TTS
  • Multimodal integration (OLED, LEDs, servo)
  • Local LLM fallback
  • Long-term memory via vector DB (see the sketch after this list)
  • “Mood” switching + personalities
  • Full launch video + demo site
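
For the vector-DB memory item above, a Chroma-based sketch; the collection name and helper functions are assumptions:

```python
import chromadb  # pip install chromadb

client = chromadb.PersistentClient(path="memory_db")  # on-disk store survives reboots
memory = client.get_or_create_collection("conversations")  # assumed collection name

def remember(text: str, uid: str) -> None:
    """Store an utterance; Chroma embeds it with its default embedding function."""
    memory.add(documents=[text], ids=[uid])

def recall(query: str, k: int = 3) -> list[str]:
    """Fetch the k most similar past utterances to ground the next prompt."""
    hits = memory.query(query_texts=[query], n_results=k)
    return hits["documents"][0]
```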

Why This Project?

This project serves as a showcase of:

  • Embedded AI system design
  • End-to-end machine learning integration
  • Edge AI deployment
  • Hardware/software interfacing
  • User-centric product development
