
Private-GoPT

Private-GoPT is a RAG (Retrieval-Augmented Generation) server inspired by private-gpt. It allows you to chat with your documents locally, ensuring privacy and data security.

Features

  • Local and Private: Your data stays on your machine.
  • LLM Support:
    • OpenAI-compatible models
    • Google Gemini
  • Vector Store:
    • Qdrant
  • Chat Completion: Context-aware chat completion for interacting with your documents.

Support for additional vector stores and LLM back-ends is on the way.

Running the application

This project uses Docker Compose to manage the necessary services.

Default Embedding Model

To run the application with the default, non-gated embedding model, use the following command:

docker compose -f docker-compose.yml up

Gated Embedding Model

If you need to use a gated embedding model from Hugging Face, you will need to provide your Hugging Face token and specify the model.

  1. Set the EM_MODEL environment variable to the name of the gated model you want to use.
  2. Set the HF_TOKEN environment variable to your Hugging Face Hub token.

Then run the application without the -f flag, so that Docker Compose automatically merges the docker-compose.override.yml file with the base configuration:

docker compose up
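For example, the two variables can be exported in the shell before starting the stack (the model name and token below are placeholders, not real values):

```shell
# Placeholder values -- substitute a real gated model ID and your own token
export EM_MODEL="your-org/your-gated-embedding-model"
export HF_TOKEN="hf_xxxxxxxxxxxxxxxx"
```

Compose passes exported variables through to the containers, assuming the override file maps them into the service environment.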

Migrating from private-gpt

If you have an existing vector database created with private-gpt, you can migrate it to be compatible with private-gopt using the migration script.

The script will update the payload of the points in your Qdrant collection to match the schema expected by this application.

To run the migration, use the following command from the root of the project:

go run ./cmd/migrate --collection <your_collection_name>

You can also specify the Qdrant host, port, and batch size using flags:

  • --host: Qdrant DB host address (default: localhost)
  • --port: Qdrant DB gRPC port number (default: 6334)
  • --batchSize: Number of points processed per batch (default: 1000)

Example

go run ./cmd/migrate --host my-qdrant.local --port 6334 --collection my_private_gpt_collection

Configuration

The application is configured through the settings.toml file and environment variables.

settings.toml

[server]

  • port (integer): The port on which the HTTP server will listen.

[llm]

  • mode (string): The LLM mode to use. Can be "openai" or "gemini".
  • max_new_tokens (integer): The maximum number of new tokens the LLM can generate.
  • context_window (integer): The context window size for the LLM.
  • temperature (float): The temperature for the LLM sampling.
  • model (string): The specific model name to use (e.g., gpt-3.5-turbo, gemini-pro).

[openai]

  • api_base (string): The base URL for the OpenAI-compatible API.
  • request_timeout (integer): The request timeout in seconds.

[gemini]

This section is reserved for Gemini-specific settings; it currently holds no values in settings.toml.
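Putting the sections above together, a complete settings.toml might look like the following (the values are illustrative, not defaults taken from the repository):

```toml
[server]
port = 8080               # HTTP listen port

[llm]
mode = "openai"           # "openai" or "gemini"
max_new_tokens = 512
context_window = 4096
temperature = 0.7
model = "gpt-3.5-turbo"

[openai]
api_base = "https://api.openai.com/v1"
request_timeout = 60      # seconds

[gemini]
# No Gemini-specific settings yet
```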

Environment Variables

  • SECRET: A secret key for the application.
  • OPENAI_API_KEY: Your OpenAI API key (if using OpenAI).
  • GEMINI_API_KEY: Your Gemini API key (if using Gemini).
  • EM_MODEL: The name of the embedding model to use (optional, for gated models).
  • HF_TOKEN: Your Hugging Face Hub token (optional, for gated models).
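A typical shell setup for the OpenAI mode could look like this (values are placeholders; substitute your own secrets):

```shell
# Placeholder values -- replace with your own secrets before starting the server
export SECRET="a-long-random-string"
export OPENAI_API_KEY="sk-your-key-here"
```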
