Skip to content

SingularityAI-Dev/Nvidia-CLI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

12 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

NVIDIA Python 3.9+ MIT License Version 7.0.0 Free API Tier

NVIDIA CLI

A Claude Code-style agentic coding assistant for your terminal β€” powered by NVIDIA's free AI endpoints.

No paid API keys. No subscriptions. Just run nv chat.


NVIDIA CLI demo β€” nv chat with /init


Why NVIDIA CLI? β€’ Features β€’ Quick Start β€’ Usage β€’ Architecture β€’ Roadmap β€’ Contributing


πŸ’‘ Why NVIDIA CLI?

Tools like Claude Code and Aider are powerful β€” but they require paid API subscriptions that add up fast.

NVIDIA CLI gives you the same agentic coding experience using NVIDIA's free-tier AI endpoints. If you have an NVIDIA developer account, you can run a full multi-agent coding assistant with persistent memory, installable skills, and file-based identity β€” completely free.

NVIDIA CLI Claude Code Aider
API Cost βœ… Free tier available πŸ’° Paid (Anthropic API) πŸ’° Paid (OpenAI/Anthropic)
Runs in terminal βœ… βœ… βœ…
Persistent memory βœ… Hybrid vector + BM25 ❌ ❌
Installable skills βœ… With security scanning ❌ ❌
Agent identity/soul βœ… File-based ❌ ❌
Multi-agent support βœ… ❌ ❌
Self-hostable βœ… ❌ βœ…

✨ Features

πŸ€– Multi-Agent System

Create and manage multiple AI agents, each with their own configuration, model preferences, and behaviour. Spawn subagents for parallel task execution.

Multi-Agent System Demo

nv agent list              # List all agents
nv agent create mybot      # Create a new agent
nv agent delete mybot      # Remove an agent

πŸ‘€ Soul / Identity System

Give your agents a persistent personality through file-based identity documents. The Soul acts as active middleware, injecting personality and context into every interaction β€” inspired by OpenClaw.

Soul Identity Demo

File Purpose
SOUL.md Core personality principles and values
IDENTITY.md Agent name, emoji, and avatar
USER.md Human preferences and working style
MEMORY.md Curated long-term memories
HEARTBEAT.md Periodic background task definitions

πŸ›‘οΈ Skills System with Security Scanning

Discover, install, and manage agent skills from any source. Every skill is automatically scanned for dangerous patterns before installation β€” safe to run inside automated agentic loops.

Skills Security Demo

nv skill list              # List installed skills
nv skill install <path>    # Install a skill (pip, npm, brew, or git)
nv skill uninstall <name>  # Remove a skill

Skills are auto-discovered via SKILL.md files and scanned for eval, exec, and subprocess abuse before execution.


🧠 Hybrid Memory (Vector + BM25)

Persistent memory that actually finds what you need β€” combining semantic vector search with traditional keyword matching for best-of-both-worlds recall.

nv memory add "Project uses FastAPI with PostgreSQL"
nv memory search "database setup"
  • SQLite-backed β€” no external database required
  • Embedding providers: OpenAI or fully local via sentence-transformers
  • Automatic context injection β€” relevant memories surface in every conversation

πŸ’“ Heartbeat System

Schedule periodic background tasks that run inside your agent's context β€” ideal for maintenance checks, data syncing, or regular reminders.

Heartbeat Demo

nv heartbeat status        # Check all heartbeat task statuses
  • Quiet hours support β€” won't interrupt you at 2am
  • Batch processing for grouped checks

πŸ”€ Permission Modes

Fine-grained control over how the agent interacts with your filesystem:

Mode Behaviour
ask Always confirm before any action (default, safest)
accept_edits Auto-accept file edits, confirm everything else
auto Auto-approve safe operations
never Full dry-run β€” no actions executed

🧩 Available Models

All models run on NVIDIA's API. Free-tier access available at build.nvidia.com.

Alias Model
default deepseek-ai/deepseek-v3.2
nano nvidia/nemotron-nano-12b-v2-vl
llama70 nvidia/llama-3.1-nemotron-70b-instruct
llama8 meta/llama-3.1-8b-instruct

πŸš€ Quick Start

Prerequisites

Installation

# Clone the repository
git clone https://github.com/SingularityAI-Dev/Nvidia-CLI.git
cd Nvidia-CLI

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install
pip install -e .

Set Up Your API Key

# Option 1: Environment variable (recommended)
export NVIDIA_API_KEY="nvapi-your-key-here"

# Option 2: .env file
echo 'NVIDIA_API_KEY=nvapi-your-key-here' > .env

# Option 3: Just run nv chat β€” it will prompt you on first launch
nv chat

Your key is stored in ~/.nv-cli-config/config.json after first setup. You're good to go.


πŸ“– Usage

Start a chat session

nv chat

On first run in a project, use /init to have the agent analyse your codebase and build a context map:

nv> /init
[*] Analysing codebase...
[*] Context saved to .nv/NVIDIA.md

nv> How is authentication handled in this project?

One-shot queries

nv ask "What is the difference between CUDA and OpenCL?"

In-chat slash commands

Command Description
/init Analyse codebase and generate a context file
/add <file> Load a file into the current conversation
/clear Reset conversation and file context
/model <name> Switch AI model mid-session
/skill Manage skills from within chat
/help Show all available commands
/quit Exit with a session summary

Agent management

nv agent create researcher   # Create a specialist agent
nv agent list                # See all your agents
nv config edit               # Edit agent configuration

πŸ—οΈ Architecture

NVIDIA CLI Architecture Diagram

nv_cli/
β”œβ”€β”€ agents/          # ReActAgent loop & subagent orchestration
β”œβ”€β”€ config/          # Configuration dataclasses & validation
β”œβ”€β”€ heartbeat/       # Background task manager & scheduler
β”œβ”€β”€ memory/          # Hybrid search (vector embeddings + BM25)
β”œβ”€β”€ skills/          # Multi-installer (pip/npm/brew/git) & security scanner
β”œβ”€β”€ soul/            # File-based identity loading (OpenClaw-style)
β”œβ”€β”€ tools/           # Built-in tool registry & implementations
└── utils/           # Shared utilities

Key design decisions:

  • OpenAI-compatible SDK β€” Uses NVIDIA's OpenAI-compatible endpoint, so any model on the NVIDIA platform works out of the box
  • ReAct Agent Loop β€” Think β†’ select tool β†’ execute β†’ observe β†’ repeat
  • File-based Identity β€” Agent personality defined in markdown, not hardcoded prompts
  • Modular architecture β€” Each subsystem is fully independent with clean interfaces

πŸ—ΊοΈ Roadmap

  • Plugin marketplace for community skills
  • Multi-agent collaboration workflows
  • Web UI dashboard
  • Voice input/output support
  • RAG pipeline integration
  • Structured outputs with function calling

Have a feature request? Open an issue β€” contributions are very welcome.


🀝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for the full guide.

Quick start for contributors:

# Fork and clone
git clone https://github.com/<your-username>/Nvidia-CLI.git
cd Nvidia-CLI

# Set up dev environment
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Create a branch and submit a PR
git checkout -b feature/your-feature

πŸ“„ License

MIT License β€” see LICENSE for details.


πŸ™ Acknowledgments


If this project saved you money on API bills, consider giving it a ⭐ β€” it helps more than you'd think.

Built with ❀️ using NVIDIA AI β€’ GitHub β€’ Issues

About

Agentic CLI coding tool, using Nvidia LLM Models

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages