Commit 201a261

Merge pull request #85 from couchbase-examples/capella-model-services-haystack-example
Capella model services haystack cookbook
2 parents a1a9b89 + 55656da commit 201a261

File tree

8 files changed: +1047 −58 lines changed


capella-model-services/haystack/__frontmatter.__md

Lines changed: 0 additions & 20 deletions
This file was deleted.

capella-model-services/haystack/query_based/RAG_with_Capella_Model_Services_and_Haystack.ipynb

Lines changed: 958 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 22 additions & 0 deletions

@@ -0,0 +1,22 @@
+---
+# frontmatter
+path: "/tutorial-capella-model-services-haystack-rag-with-hyperscale-and-composite-vector-index"
+title: "RAG with Haystack, Capella Model Services and Couchbase Hyperscale & Composite Vector Indexes"
+short_title: "RAG with Haystack, Capella Model Services and Couchbase Vector Indexes"
+description:
+- Learn how to build a semantic search engine using Couchbase Hyperscale and Composite Vector Indexes.
+- This tutorial demonstrates how Haystack integrates Couchbase vector search capabilities with embeddings generated by Capella Model Services.
+- Perform Retrieval-Augmented Generation (RAG) using Haystack with Couchbase and Capella Model Services.
+content_type: tutorial
+filter: sdk
+technology:
+- vector search
+tags:
+- Artificial Intelligence
+- Haystack
+- Hyperscale Vector Index
+- Composite Vector Index
+sdk_language:
+- python
+length: 60 Mins
+---

capella-model-services/haystack/requirements.txt renamed to capella-model-services/haystack/query_based/requirements.txt

File renamed without changes.

capella-model-services/haystack/RAG_with_Couchbase_Capella.ipynb renamed to capella-model-services/haystack/search_based/RAG_with_Capella_Model_Services_and_Haystack.ipynb

Lines changed: 40 additions & 38 deletions

@@ -9,8 +9,9 @@
 "This notebook demonstrates how to build a Retrieval Augmented Generation (RAG) system using:\n",
 "- The TMDB movie dataset\n",
 "- Couchbase as the vector store\n",
+"- Couchbase Search index to enable semantic search \n",
 "- Haystack framework for the RAG pipeline\n",
-"- Capella AI for embeddings and text generation\n",
+"- Capella Model Services for embeddings and text generation\n",
 "\n",
 "The system allows users to ask questions about movies and get AI-generated answers based on the movie descriptions."
 ]
@@ -49,7 +50,6 @@
 "outputs": [],
 "source": [
 "import logging\n",
-"import base64\n",
 "import pandas as pd\n",
 "from datasets import load_dataset\n",
 "from haystack import Pipeline, GeneratedAnswer\n",
@@ -98,13 +98,15 @@
 "\n",
 "### Deploy Models\n",
 "\n",
-"To create the RAG application, use an embedding model for Vector Search and an LLM for generating responses. \n",
-" \n",
-"Capella Model Service lets you create both models in the same VPC as your database. It offers the Llama 3.1 Instruct model (8 Billion parameters) for LLM and the mistral model for embeddings. \n",
+"In order to create the RAG application, we need an embedding model to ingest the documents for Vector Search and a large language model (LLM) for generating the responses based on the context. \n",
+"\n",
+"Capella Model Service allows you to create both the embedding model and the LLM in the same VPC as your database. There are multiple options for both the Embedding & Large Language Models, along with Value Adds to the models.\n",
+"\n",
+"Create the models using the Capella Model Services interface. While creating the model, it is possible to cache the responses (both standard and semantic cache) and apply guardrails to the LLM responses.\n",
 "\n",
-"Use the Capella AI Services interface to create these models. You can cache responses and set guardrails for LLM outputs.\n",
+"For more details, please refer to the [documentation](https://docs.couchbase.com/ai/build/model-service/model-service.html). These models are compatible with the [Haystack OpenAI integration](https://haystack.deepset.ai/integrations/openai).\n",
 "\n",
-"For more details, see the [documentation](https://preview2.docs-test.couchbase.com/ai/get-started/about-ai-services.html#model). These models work with [Haystack OpenAI integration](https://haystack.deepset.ai/integrations/openai)."
+"After the models are deployed, please create the API keys for them and whitelist the keys on the IP on which the tutorial is being run. For more details, please refer to the documentation on [generating the API keys](https://docs.couchbase.com/ai/api-guide/api-start.html#model-service-keys)."
 ]
 },
 {
@@ -115,14 +117,14 @@
 "\n",
 "Enter your Couchbase and Capella AI credentials:\n",
 "\n",
-"CAPELLA_AI_ENDPOINT is the Capella AI Services endpoint found in the models section.\n",
+"CAPELLA_MODEL_SERVICES_ENDPOINT is the Capella Model Services Endpoint found in the models section.\n",
 "\n",
-"> Note that the Capella AI Endpoint requires an additional `/v1` from the endpoint shown on the UI if it is not shown on the UI."
+"> Note that the Capella Model Services Endpoint requires an additional `/v1` from the endpoint shown on the UI if it is not shown on the UI."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -135,11 +137,14 @@
 "CB_BUCKET = input(\"Couchbase Bucket: \") \n",
 "CB_SCOPE = input(\"Couchbase Scope: \")\n",
 "CB_COLLECTION = input(\"Couchbase Collection: \")\n",
-"INDEX_NAME = input(\"Vector Search Index: \")\n",
+"INDEX_NAME = \"vector_search\" # need to be matched with the search index name in the search_index.json file\n",
 "\n",
 "# Get Capella AI endpoint\n",
-"CB_AI_ENDPOINT = input(\"Capella AI Services Endpoint\")\n",
-"CB_AI_ENDPOINT_PASSWORD = base64.b64encode(f\"{CB_USERNAME}:{CB_PASSWORD}\".encode(\"utf-8\")).decode(\"utf-8\")"
+"CAPELLA_MODEL_SERVICES_ENDPOINT = input(\"Enter your Capella Model Services Endpoint: \")\n",
+"LLM_MODEL_NAME = input(\"Enter the LLM name\")\n",
+"LLM_API_KEY = getpass.getpass(\"Enter your Couchbase LLM API Key: \")\n",
+"EMBEDDING_MODEL_NAME = input(\"Enter the Embedding Model name:\")\n",
+"EMBEDDING_API_KEY = getpass.getpass(\"Enter your Couchbase Embedding Model API Key: \")"
 ]
 },
 {
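Since the models are exposed through an OpenAI-compatible API, a quick connectivity check with the plain `openai` client can validate the endpoint and key before the pipeline is built. A hedged sketch reusing the variables gathered in this hunk; the check itself is not part of the notebook:

```python
from openai import OpenAI

# Illustrative sanity check against the Capella Model Services endpoint,
# reusing CAPELLA_MODEL_SERVICES_ENDPOINT and LLM_API_KEY from the cell above.
client = OpenAI(base_url=CAPELLA_MODEL_SERVICES_ENDPOINT, api_key=LLM_API_KEY)
for model in client.models.list():
    print(model.id)  # the deployed model names should appear here
```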
@@ -194,7 +199,7 @@
 " print(f\"Collection '{CB_COLLECTION}' created successfully.\")\n",
 "\n",
 "# Create search index from search_index.json file at scope level\n",
-"with open('fts_index.json', 'r') as search_file:\n",
+"with open('search_index.json', 'r') as search_file:\n",
 " search_index_definition = SearchIndex.from_json(json.load(search_file))\n",
 " \n",
 " # Update search index definition with user inputs\n",
@@ -216,8 +221,8 @@
 " existing_index = scope_search_manager.get_index(search_index_name)\n",
 " print(f\"Search index '{search_index_name}' already exists at scope level.\")\n",
 " except Exception as e:\n",
-" print(f\"Search index '{search_index_name}' does not exist at scope level. Creating search index from fts_index.json...\")\n",
-" with open('fts_index.json', 'r') as search_file:\n",
+" print(f\"Search index '{search_index_name}' does not exist at scope level. Creating search index from search_index.json...\")\n",
+" with open('search_index.json', 'r') as search_file:\n",
 " search_index_definition = SearchIndex.from_json(json.load(search_file))\n",
 " scope_search_manager.upsert_index(search_index_definition)\n",
 " print(f\"Search index '{search_index_name}' created successfully at scope level.\")"
@@ -320,20 +325,20 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
 "embedder = OpenAIDocumentEmbedder(\n",
-" api_base_url=CB_AI_ENDPOINT,\n",
-" api_key=Secret.from_token(CB_AI_ENDPOINT_PASSWORD),\n",
-" model=\"intfloat/e5-mistral-7b-instruct\",\n",
+" api_base_url=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
+" api_key=Secret.from_token(EMBEDDING_API_KEY),\n",
+" model=EMBEDDING_MODEL_NAME,\n",
 ")\n",
 "\n",
 "rag_embedder = OpenAITextEmbedder(\n",
-" api_base_url=CB_AI_ENDPOINT,\n",
-" api_key=Secret.from_token(CB_AI_ENDPOINT_PASSWORD),\n",
-" model=\"intfloat/e5-mistral-7b-instruct\",\n",
+" api_base_url=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
+" api_key=Secret.from_token(EMBEDDING_API_KEY),\n",
+" model=EMBEDDING_MODEL_NAME,\n",
 ")\n"
 ]
 },
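As a smoke test, the text embedder can be run on its own before wiring the pipeline; in Haystack 2.x, `OpenAITextEmbedder.run` returns a dict with an `embedding` key. The query string below is illustrative:

```python
# Embed a sample query and inspect the vector's dimensionality, which must
# match the dims configured in search_index.json.
result = rag_embedder.run(text="A mind-bending movie about dreams")
print(len(result["embedding"]))
```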
@@ -342,19 +347,19 @@
 "metadata": {},
 "source": [
 "# Initialize LLM Generator\n",
-"Configure the LLM generator using Capella AI's endpoint and Llama 3.1 model. This component will generate natural language responses based on the retrieved documents.\n"
+"Configure the LLM generator using Capella Model Services endpoint and LLM model name. This component will generate natural language responses based on the retrieved documents.\n"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
 "llm = OpenAIGenerator(\n",
-" api_base_url=CB_AI_ENDPOINT,\n",
-" api_key=Secret.from_token(CB_AI_ENDPOINT_PASSWORD),\n",
-" model=\"meta-llama/Llama-3.1-8B-Instruct\",\n",
+" api_base_url=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
+" api_key=Secret.from_token(LLM_API_KEY),\n",
+" model=LLM_MODEL_NAME,\n",
 ")"
 ]
 },
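The generator can likewise be exercised directly; `OpenAIGenerator.run` returns completions under a `replies` key. The prompt is illustrative:

```python
# One-off generation call, independent of the RAG pipeline.
response = llm.run(prompt="In one sentence, what is Retrieval-Augmented Generation?")
print(response["replies"][0])
```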
@@ -509,13 +514,13 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Caching in Capella AI Services\n",
+"## Caching in Capella Model Services\n",
 "\n",
-"To optimize performance and reduce costs, Capella AI services employ two caching mechanisms:\n",
+"To optimize performance and reduce costs, Capella Model Services employ two caching mechanisms:\n",
 "\n",
 "1. Semantic Cache\n",
 "\n",
-" Capella AI’s semantic caching system stores both query embeddings and their corresponding LLM responses. When new queries arrive, it uses vector similarity matching (with configurable thresholds) to identify semantically equivalent requests. This prevents redundant processing by:\n",
+" Capella Model Services’ semantic caching system stores both query embeddings and their corresponding LLM responses. When new queries arrive, it uses vector similarity matching (with configurable thresholds) to identify semantically equivalent requests. This prevents redundant processing by:\n",
 " - Avoiding duplicate embedding generation API calls for similar queries\n",
 " - Skipping repeated LLM processing for equivalent queries\n",
 " - Maintaining cached results with automatic freshness checks\n",
@@ -569,13 +574,10 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## LLM Guardrails in Capella AI Services\n",
-"\n",
-"Capella AI services also provide input and response moderation using configurable LLM guardrails. These services can integrate with the LlamaGuard3-8B model from Meta.\n",
-"- Categories to be blocked can be configured during the model creation process.\n",
-"- Helps prevent unsafe or undesirable interactions with the LLM.\n",
-"\n",
-"By implementing caching and moderation mechanisms, Capella AI services ensure an efficient, cost-effective, and responsible approach to AI-powered recommendations."
+"# LLM Guardrails in Capella Model Services\n",
+"Capella Model services also have the ability to moderate the user inputs and the responses generated by the LLM. Capella Model Services can be configured to use the [Llama 3.1 NemoGuard 8B safety model](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety/modelcard) guardrails model from Meta. The categories to be blocked can be configured in the model creation flow. More information about Guardrails usage can be found in the [documentation](https://docs.couchbase.com/ai/build/model-service/configure-guardrails-security.html#guardrails).\n",
+" \n",
+"Here is an example of the Guardrails in action"
 ]
 },
 {
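The "Guardrails in action" cell itself is not shown in this diff. A hedged sketch of what such a probe might look like; the exact refusal wording, error type, and blocked categories depend on the guardrails configuration:

```python
# Illustrative guardrails probe: with a safety model configured, a disallowed
# prompt should come back as a refusal or be rejected by the endpoint.
try:
    result = llm.run(prompt="Describe how to make a dangerous weapon at home.")
    print(result["replies"][0])  # typically a refusal message
except Exception as e:
    print(f"Blocked by guardrails: {e}")
```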
Lines changed: 21 additions & 0 deletions

@@ -0,0 +1,21 @@
+---
+# frontmatter
+path: "/tutorial-capella-model-services-services-haystack-rag-with-search-vector-index"
+title: RAG with Haystack, Capella Model Services and Couchbase Search Vector Index
+short_title: RAG with Haystack, Capella Model Services and Couchbase SVI
+description:
+- Learn how to build a semantic search engine using Couchbase Search Vector Index.
+- This tutorial demonstrates how Haystack integrates Couchbase vector search capabilities with embeddings generated by Capella Model Services.
+- Perform Retrieval-Augmented Generation (RAG) using Haystack with Couchbase and Capella Model Services.
+content_type: tutorial
+filter: sdk
+technology:
+- vector search
+tags:
+- Artificial Intelligence
+- Search Vector Index
+- Haystack
+sdk_language:
+- python
+length: 60 Mins
+---
Lines changed: 6 additions & 0 deletions

@@ -0,0 +1,6 @@
+pandas>=2.1.4
+datasets>=2.14.5
+setuptools>=75.8.0
+couchbase-haystack==2.*
+transformers[torch]>=4.49.0
+tensorflow>=2.18.0

capella-model-services/haystack/fts_index.json renamed to capella-model-services/haystack/search_based/search_index.json

File renamed without changes.
