diff --git a/capella-ai/langchain/RAG_with_Couchbase_Capella.ipynb b/capella-ai/langchain/RAG_with_Couchbase_Capella.ipynb deleted file mode 100644 index 0abf6e17..00000000 --- a/capella-ai/langchain/RAG_with_Couchbase_Capella.ipynb +++ /dev/null @@ -1,827 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "kNdImxzypDlm" - }, - "source": [ - "# Introduction\n", - "In this guide, we will walk you through building a Retrieval Augmented Generation (RAG) application using Couchbase Capella as the database, the [Llama 3.1 8B Instruct](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/) model as the large language model provided by Couchbase Capella AI Services. We will use the [e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct) model for generating embeddings via the Capella AI Services. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial will equip you with the knowledge to create a fully functional RAG system using Capella AI Services and [LangChain](https://langchain.com/)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# How to run this tutorial\n", - "\n", - "This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/capella-ai/RAG_with_Couchbase_Capella.ipynb).\n", - "\n", - "You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Before you start\n", - "\n", - "## Create and Deploy Your Operational cluster on Capella\n", - "\n", - "To get started with Couchbase Capella, create an account and use it to deploy an operational cluster.\n", - "\n", - "To know more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).\n", - "\n", - "\n", - "### Couchbase Capella Configuration\n", - "\n", - "When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met:\n", - "\n", - "* Have a multi-node Capella cluster running the Data, Query, Index, and Search services.\n", - "* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the bucket (Read and Write) used in the application.\n", - "* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.\n", - "\n", - "### Deploy Models\n", - "\n", - "In order to create the RAG application, we need an embedding model to ingest the documents for Vector Search and a large language model (LLM) for generating the responses based on the context. \n", - "\n", - "Capella Model Service allows you to create both the embedding model and the LLM in the same VPC as your database. Currently, the service offers the Llama 3.1 Instruct model with 8 billion parameters as an LLM and the e5-mistral-7b-instruct model for embeddings. \n", - "\n", - "Create the models using the Capella AI Services interface.
While creating the model, it is possible to cache the responses (both standard and semantic cache) and apply guardrails to the LLM responses.\n", - "\n", - "For more details, please refer to the [documentation](https://preview2.docs-test.couchbase.com/ai/get-started/about-ai-services.html#model). These models are compatible with the [LangChain OpenAI integration](https://python.langchain.com/api_reference/openai/index.html).\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "NH2o6pqa69oG" - }, - "source": [ - "# Installing Necessary Libraries\n", - "To build our RAG system, we need a set of libraries. The libraries we install handle everything from connecting to databases to performing AI tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and we will use the OpenAI SDK for generating embeddings and calling the LLM in Capella AI services. By setting up these libraries, we ensure our environment is equipped to handle the tasks required for RAG." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "DYhPj0Ta8l_A" - }, - "outputs": [], - "source": [ - "!pip install --quiet datasets==3.6.0 langchain-couchbase==0.3.0 langchain-openai==0.3.17" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "1pp7GtNg8mB9" - }, - "source": [ - "# Importing Necessary Libraries\n", - "The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading. These libraries provide essential functions for working with data, managing database connections, and processing machine learning models." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "8GzS6tfL8mFP" - }, - "outputs": [], - "source": [ - "import getpass\n", - "import json\n", - "import logging\n", - "import sys\n", - "import time\n", - "\n", - "from datetime import timedelta\n", - "\n", - "from couchbase.auth import PasswordAuthenticator\n", - "from couchbase.cluster import Cluster\n", - "from couchbase.exceptions import CouchbaseException\n", - "from couchbase.management.search import SearchIndex\n", - "from couchbase.options import ClusterOptions\n", - "\n", - "from datasets import load_dataset\n", - "\n", - "from langchain_core.output_parsers import StrOutputParser\n", - "from langchain_core.prompts import ChatPromptTemplate\n", - "from langchain_core.runnables import RunnablePassthrough\n", - "from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore\n", - "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n", - "\n", - "from tqdm import tqdm\n", - "import base64" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "K9G5a0en8mPA" - }, - "source": [ - "# Loading Sensitive Information\n", - "In this section, we prompt the user to input essential configuration settings needed. These settings include sensitive information like database credentials and collection names. Instead of hardcoding these details into the script, we request the user to provide them at runtime, ensuring flexibility and security.\n", - "\n", - "The script also validates that all required inputs are provided, raising an error if any crucial information is missing.
This approach ensures that your integration is both secure and correctly configured without hardcoding sensitive information, enhancing the overall security and maintainability of your code.\n", - "\n", - "CAPELLA_AI_ENDPOINT is the Capella AI Services endpoint found in the models section.\n", - "\n", - "> Note that the Capella AI Endpoint requires an additional `/v1` appended to the endpoint shown on the UI if it is not already included there.\n", - "\n", - "INDEX_NAME is the name of the search index we will use for the vector search." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "id": "PFGyHll18mSe" - }, - "outputs": [], - "source": [ - "CB_CONNECTION_STRING = getpass.getpass(\"Enter your Couchbase Connection String: \")\n", - "CB_USERNAME = input(\"Enter your Couchbase Database username: \")\n", - "CB_PASSWORD = getpass.getpass(\"Enter your Couchbase Database password: \")\n", - "CB_BUCKET_NAME = input(\"Enter your Couchbase bucket name: \")\n", - "SCOPE_NAME = input(\"Enter your scope name: \")\n", - "COLLECTION_NAME = input(\"Enter your collection name: \")\n", - "INDEX_NAME = input(\"Enter your Search index name: \")\n", - "CAPELLA_AI_ENDPOINT = getpass.getpass(\"Enter your Capella AI Services Endpoint: \")\n", - "\n", - "# Check if the variables are correctly loaded\n", - "if not all(\n", - " [\n", - " CB_CONNECTION_STRING,\n", - " CB_USERNAME,\n", - " CB_PASSWORD,\n", - " CB_BUCKET_NAME,\n", - " CAPELLA_AI_ENDPOINT,\n", - " SCOPE_NAME,\n", - " COLLECTION_NAME,\n", - " INDEX_NAME,\n", - " ]\n", - "):\n", - " raise ValueError(\"Missing required environment variables\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Generating Credentials for Capella Model Service\n", - "In Capella AI Services, the models are accessed using basic authentication with the linked Capella cluster credentials. We generate the credentials using the following code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "key_string = f\"{CB_USERNAME}:{CB_PASSWORD}\"\n", - "CAPELLA_AI_KEY = base64.b64encode(key_string.encode(\"utf-8\")).decode(\"utf-8\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "qtGrYzUY8mV3" - }, - "source": [ - "# Connecting to the Couchbase Cluster\n", - "Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our RAG system. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Zb3kK-7W8mZK" - }, - "outputs": [], - "source": [ - "try:\n", - " auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)\n", - " options = ClusterOptions(auth)\n", - " cluster = Cluster(CB_CONNECTION_STRING, options)\n", - " cluster.wait_until_ready(timedelta(seconds=5))\n", - " print(\"Successfully connected to Couchbase\")\n", - "except Exception as e:\n", - " raise ConnectionError(f\"Failed to connect to Couchbase: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "C_Gpy32N8mcZ" - }, - "source": [ - "# Setting Up Collections in Couchbase\n", - "In Couchbase, data is organized in buckets, which can be further divided into scopes and collections. Think of a collection as a table in a traditional SQL database.
Before we can store any data, we need to ensure that our collections exist. If they don't, we must create them. This step is important because it prepares the database to handle the specific types of data our application will process. By setting up collections, we define the structure of our data storage, which is essential for efficient data retrieval and management.\n", - "\n", - "Moreover, setting up collections allows us to isolate different types of data within the same bucket, providing a more organized and scalable data structure. This is particularly useful when dealing with large datasets, as it ensures that related data is stored together, making it easier to manage and query. Here, we also set up the primary index for query operations on the collection and clear the existing documents in the collection if any. If you do not want to do that, please skip this step." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ACZcwUnG8mf2" - }, - "outputs": [], - "source": [ - "def setup_collection(cluster, bucket_name, scope_name, collection_name, flush_collection=False):\n", - " try:\n", - " bucket = cluster.bucket(bucket_name)\n", - " bucket_manager = bucket.collections()\n", - "\n", - " # Check if scope exists, create if it doesn't\n", - " scopes = bucket_manager.get_all_scopes()\n", - " scope_exists = any(scope.name == scope_name for scope in scopes)\n", - " \n", - " if not scope_exists:\n", - " print(f\"Scope '{scope_name}' does not exist. Creating it...\")\n", - " bucket_manager.create_scope(scope_name)\n", - " print(f\"Scope '{scope_name}' created successfully.\")\n", - " else:\n", - " print(f\"Scope '{scope_name}' already exists. Skipping creation.\")\n", - " \n", - " # Check if collection exists, create if it doesn't\n", - " collections = bucket_manager.get_all_scopes()\n", - " collection_exists = any(\n", - " scope.name == scope_name\n", - " and collection_name in [col.name for col in scope.collections]\n", - " for scope in collections\n", - " )\n", - "\n", - " if not collection_exists:\n", - " print(f\"Collection '{collection_name}' does not exist. Creating it...\")\n", - " bucket_manager.create_collection(scope_name, collection_name)\n", - " print(f\"Collection '{collection_name}' created successfully.\")\n", - " else:\n", - " print(f\"Collection '{collection_name}' already exists. Skipping creation.\")\n", - "\n", - " collection = bucket.scope(scope_name).collection(collection_name)\n", - " time.sleep(2) # Give the collection time to be ready for queries\n", - "\n", - " # Ensure primary index exists\n", - " try:\n", - " cluster.query(\n", - " f\"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`\"\n", - " ).execute()\n", - " print(\"Primary index present or created successfully.\")\n", - " except Exception as e:\n", - " logging.warning(f\"Error creating primary index: {str(e)}\")\n", - "\n", - " if flush_collection:\n", - " # Clear all documents in the collection\n", - " try:\n", - " query = f\"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`\"\n", - " cluster.query(query).execute()\n", - " print(\"All documents cleared from the collection.\")\n", - " except Exception as e:\n", - " print(\n", - " f\"Error while clearing documents: {str(e)}. 
The collection might be empty.\"\n", - " )\n", - "\n", - " except Exception as e:\n", - " raise Exception(f\"Error setting up collection: {str(e)}\")\n", - "\n", - "\n", - "setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME, flush_collection=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "NMJ7RRYp8mjV" - }, - "source": [ - "# Loading Couchbase Vector Search Index\n", - "\n", - "Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase Vector Search comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.\n", - "\n", - "Note that you might have to update the index parameters depending on the names of your bucket, scope and collection. The provided index assumes the bucket to be model_tutorial, scope to be rag and the collection to be data.\n", - "\n", - "For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).\n", - "\n", - "To import the index into Capella via the UI, please follow the [instructions](https://docs.couchbase.com/cloud/search/import-search-index.html) on the documentation.\n", - "\n", - "There is code to create the index using the SDK as well below if you want to do it via code." - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": { - "id": "y7xiCrOc8mmj" - }, - "outputs": [], - "source": [ - "# If you are running this script in Google Colab, comment the following line\n", - "# and provide the path to your index definition file.\n", - "\n", - "index_definition_path = \"capella_index.json\" # Local setup: specify your file path here\n", - "\n", - "# If you are running in Google Colab, use the following code to upload the index definition file\n", - "# from google.colab import files\n", - "# print(\"Upload your index definition file\")\n", - "# uploaded = files.upload()\n", - "# index_definition_path = list(uploaded.keys())[0]\n", - "\n", - "try:\n", - " with open(index_definition_path, \"r\") as file:\n", - " index_definition = json.load(file)\n", - "\n", - " # Update search index definition with user inputs\n", - " index_definition['name'] = INDEX_NAME\n", - " index_definition['sourceName'] = CB_BUCKET_NAME\n", - " # Update types mapping\n", - " old_type_key = next(iter(index_definition['params']['mapping']['types'].keys()))\n", - " type_obj = index_definition['params']['mapping']['types'].pop(old_type_key)\n", - " index_definition['params']['mapping']['types'][f\"{SCOPE_NAME}.{COLLECTION_NAME}\"] = type_obj\n", - " \n", - "except Exception as e:\n", - " raise ValueError(\n", - " f\"Error loading index definition from {index_definition_path}: {str(e)}\"\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "v_ddPQ_Y8mpm" - }, - "source": [ - "# Creating or Updating Search Indexes\n", - "\n", - "With the index definition loaded, the next step is to create or update the Vector Search Index in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. 
By creating or updating a Vector Search Index, we enable our RAG to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust RAG system." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "bHEpUu1l8msx" - }, - "outputs": [], - "source": [ - "# Create the Vector Index via SDK\n", - "try:\n", - " scope_index_manager = (\n", - " cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()\n", - " )\n", - "\n", - " # Check if index already exists\n", - " existing_indexes = scope_index_manager.get_all_indexes()\n", - " index_name = index_definition[\"name\"]\n", - "\n", - " if index_name in [index.name for index in existing_indexes]:\n", - " print(f\"Index '{index_name}' found\")\n", - " else:\n", - " print(f\"Creating new index '{index_name}'...\")\n", - "\n", - " # Create SearchIndex object from JSON definition\n", - " search_index = SearchIndex.from_json(index_definition)\n", - "\n", - " # Upsert the index (create if not exists, update if exists)\n", - " scope_index_manager.upsert_index(search_index)\n", - " print(f\"Index '{index_name}' successfully created/updated.\")\n", - "\n", - "except Exception as e:\n", - " logging.error(f\"Error creating or updating index: {e}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "QRV4k06L8mwS" - }, - "source": [ - "# Load the BBC News Dataset\n", - "To build a RAG engine, we need data to search through. We use the [BBC Realtime News dataset](https://huggingface.co/datasets/RealTimeData/bbc_news_alltime), a dataset with up-to-date BBC news articles grouped by month. This dataset contains articles that were created after the LLM was trained. It will showcase the use of RAG to augment the LLM. \n", - "\n", - "The BBC News dataset's varied content allows us to simulate real-world scenarios where users ask complex questions, enabling us to fine-tune our RAG's ability to understand and respond to various types of queries." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "TRfRslF_8mzo" - }, - "outputs": [], - "source": [ - "try:\n", - " news_dataset = load_dataset('RealTimeData/bbc_news_alltime', '2024-12', split=\"train\")\n", - " print(f\"Loaded the BBC News dataset with {len(news_dataset)} rows\")\n", - "except Exception as e:\n", - " raise ValueError(f\"Error loading BBC News dataset: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Preview the Data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(news_dataset[:5])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Cleaning up the Data\n", - "\n", - "We will use the content of the news articles for our RAG system. \n", - "\n", - "The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "news_articles = news_dataset[\"content\"]\n", - "unique_articles = set()\n", - "for article in news_articles:\n", - " if article:\n", - " unique_articles.add(article)\n", - "unique_news_articles = list(unique_articles)\n", - "print(f\"We have {len(unique_news_articles)} unique articles in our database.\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7FvxRsg38m3G" - }, - "source": [ - "# Creating Embeddings using Capella AI Service\n", - "Embeddings are at the heart of semantic search. They are numerical representations of text that capture the semantic meaning of the words and phrases. Unlike traditional keyword-based search, which looks for exact matches, embeddings allow our search engine to understand the context and nuances of language, enabling it to retrieve documents that are semantically similar to the query, even if they don't contain the exact keywords. By creating embeddings using Capella AI service, we equip our RAG system with the ability to understand and process natural language in a way that is much closer to how humans understand language. This step transforms our raw text data into a format that the Capella vector store can use to find and rank relevant documents.\n", - "\n", - "We are using the OpenAI Embeddings via the [LangChain OpenAI provider](https://python.langchain.com/docs/integrations/providers/openai/) with a few extra parameters specific to Capella AI Services, such as disabling client-side tokenization so that longer inputs are handled by LangChain. We set the model, the api_key, and the base URL for the SDK to the values for Capella AI Services." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_75ZyCRh8m6m" - }, - "outputs": [], - "source": [ - "try:\n", - " embeddings = OpenAIEmbeddings(\n", - " openai_api_key=CAPELLA_AI_KEY,\n", - " openai_api_base=CAPELLA_AI_ENDPOINT,\n", - " check_embedding_ctx_length=False,\n", - " tiktoken_enabled=False,\n", - " model=\"intfloat/e5-mistral-7b-instruct\",\n", - " )\n", - " print(\"Successfully created CapellaAIEmbeddings\")\n", - "except Exception as e:\n", - " raise ValueError(f\"Error creating CapellaAIEmbeddings: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Testing the Embeddings Model\n", - "We can test the embeddings model by generating an embedding for a string using the LangChain OpenAI package." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(len(embeddings.embed_query(\"this is a test sentence\")))\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8IwZMUnF8m-N" - }, - "source": [ - "# Setting Up the Couchbase Vector Store\n", - "The vector store is set up to store the documents from the dataset. The vector store is essentially a database optimized for storing and retrieving high-dimensional vectors. In this case, the vector store uses Couchbase via the [LangChain integration](https://python.langchain.com/docs/integrations/providers/couchbase/)."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "DwIJQjYT9RV_" - }, - "outputs": [], - "source": [ - "try:\n", - " vector_store = CouchbaseSearchVectorStore(\n", - " cluster=cluster,\n", - " bucket_name=CB_BUCKET_NAME,\n", - " scope_name=SCOPE_NAME,\n", - " collection_name=COLLECTION_NAME,\n", - " embedding=embeddings,\n", - " index_name=INDEX_NAME,\n", - " )\n", - " print(\"Successfully created vector store\")\n", - "except Exception as e:\n", - " raise ValueError(f\"Failed to create vector store: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "C6DJVz7A9RZA" - }, - "source": [ - "# Saving Data to the Vector Store\n", - "With the Vector store set up, the next step is to populate it with data. We save the BBC articles dataset to the vector store. For each document, we will generate the embeddings for the article to use with the semantic search using LangChain.\n", - "\n", - "Here one of the articles is larger than the maximum number of tokens accepted by our embedding model. If we wanted to ingest that document, we could split it and ingest it in parts. However, since it is only a single document, for simplicity we exclude it from the ingestion process." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "_6opqqvx9Rb_" - }, - "outputs": [], - "source": [ - "from langchain_core.documents import Document\n", - "from uuid import uuid4\n", - "\n", - "for article in tqdm(unique_news_articles, desc=\"Ingesting articles\"):\n", - " try:\n", - " documents = [Document(page_content=article)]\n", - " uuids = [str(uuid4()) for _ in range(len(documents))]\n", - " vector_store.add_documents(documents=documents, ids=uuids)\n", - " except Exception as e:\n", - " print(f\"Failed to save documents to vector store: {str(e)}\")\n", - " continue" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "uehAx36o9Rlm" - }, - "source": [ - "# Using the Large Language Model (LLM) in Capella AI\n", - "Large language models are AI systems that are trained to understand and generate human language. We'll be using the `Llama3.1-8B-Instruct` large language model via the Capella AI services inside the same network as the Capella operational database to process user queries and generate meaningful responses. This model is a key component of our RAG system, allowing it to go beyond simple keyword matching and truly understand the intent behind a query. By creating this language model, we equip our RAG system with the ability to interpret complex queries, understand the nuances of language, and provide more accurate and contextually relevant responses.\n", - "\n", - "The language model's ability to understand context and generate coherent responses is what makes our RAG system truly intelligent. It can not only find the right information but also present it in a way that is useful and understandable to the user.\n", - "\n", - "The LLM is also created using the LangChain OpenAI provider, with the model name, base URL, and API key set to those for Capella AI Services."
- ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": { - "id": "yRAfBRLH9RpO" - }, - "outputs": [], - "source": [ - "try:\n", - " llm = ChatOpenAI(openai_api_base=CAPELLA_AI_ENDPOINT, openai_api_key=CAPELLA_AI_KEY, model=\"meta-llama/Llama-3.1-8B-Instruct\", temperature=0)\n", - " logging.info(\"Successfully created the Chat model in Capella AI Services\")\n", - "except Exception as e:\n", - " raise ValueError(f\"Error creating Chat model in Capella AI Services: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "k_XDfCx19UvG" - }, - "source": [ - "# Perform Semantic Search\n", - "Semantic search in Couchbase involves converting queries and documents into vector representations using an embeddings model. These vectors capture the semantic meaning of the text and are stored directly in Couchbase. When a query is made, Couchbase performs a similarity search by comparing the query vector against the stored document vectors. The similarity metric used for this comparison is configurable, allowing flexibility in how the relevance of documents is determined. Common metrics include cosine similarity, Euclidean distance, or dot product, but other metrics can be implemented based on specific use cases. Different embedding models like BERT, Word2Vec, or GloVe can also be used depending on the application's needs, with the vectors generated by these models stored and searched within Couchbase itself.\n", - "\n", - "In the provided code, the search process begins by recording the start time, followed by executing the `similarity_search_with_score` method of the `CouchbaseSearchVectorStore`. This method searches Couchbase for the most relevant documents based on the vector similarity to the query. The search results include the document content and a similarity score that reflects how closely each document aligns with the query in the defined semantic space. The time taken to perform this search is then calculated and logged, and the results are displayed, showing the most relevant documents along with their similarity scores. This approach leverages Couchbase as both a storage and retrieval engine for vector data, enabling efficient and scalable semantic searches. The integration of vector storage and search capabilities within Couchbase allows for sophisticated semantic search operations without relying on external services for vector storage or comparison." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Pk-oFbnC9Uym" - }, - "outputs": [], - "source": [ - "query = \"What was Pep Guardiola's reaction to Manchester City's current form?\"\n", - "\n", - "try:\n", - " # Perform the semantic search\n", - " start_time = time.time()\n", - " search_results = vector_store.similarity_search_with_score(query, k=5)\n", - " search_elapsed_time = time.time() - start_time\n", - "\n", - " # Display search results\n", - " print(\n", - " f\"\\nSemantic Search Results (completed in {search_elapsed_time:.2f} seconds):\"\n", - " )\n", - " for doc, score in search_results:\n", - " print(f\"Score: {score:.4f}, ID: {doc.id}, Text: {doc.page_content}\")\n", - " print(\"---\"*20)\n", - "\n", - "except CouchbaseException as e:\n", - " raise RuntimeError(f\"Error performing semantic search: {str(e)}\")\n", - "except Exception as e:\n", - " raise RuntimeError(f\"Unexpected error: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "sS0FebHI9U1l" - }, - "source": [ - "# Retrieval-Augmented Generation (RAG) with Couchbase and Langchain\n", - "Couchbase and LangChain can be seamlessly integrated to create RAG (Retrieval-Augmented Generation) chains, enhancing the process of generating contextually relevant responses. In this setup, Couchbase serves as the vector store, where embeddings of documents are stored. When a query is made, LangChain retrieves the most relevant documents from Couchbase by comparing the query’s embedding with the stored document embeddings. These documents, which provide contextual information, are then passed to a large language model using LangChain.\n", - "\n", - "The language model, equipped with the context from the retrieved documents, generates a response that is both informed and contextually accurate. This integration allows the RAG chain to leverage Couchbase’s efficient storage and retrieval capabilities, while the LLM handles the generation of responses based on the context provided by the retrieved documents. Together, they create a powerful system that can deliver highly relevant and accurate answers by combining the strengths of both retrieval and generation." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": { - "id": "ZGUXQQmv9ge4" - }, - "outputs": [], - "source": [ - "template = \"\"\"You are a helpful bot. If you cannot answer based on the context provided, respond with a generic answer. 
Answer the question as truthfully as possible using the context below:\n", - " {context}\n", - " Question: {question}\"\"\"\n", - "prompt = ChatPromptTemplate.from_template(template)\n", - "rag_chain = (\n", - " {\"context\": vector_store.as_retriever(), \"question\": RunnablePassthrough()}\n", - " | prompt\n", - " | llm\n", - " | StrOutputParser()\n", - ")\n", - "logging.info(\"Successfully created RAG chain\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Mia7XxM9978M" - }, - "outputs": [], - "source": [ - "# Get responses\n", - "query = \"What was Pep Guardiola's reaction to Manchester City's recent form?\"\n", - "try:\n", - " start_time = time.time()\n", - " rag_response = rag_chain.invoke(query)\n", - " rag_elapsed_time = time.time() - start_time\n", - "\n", - " print(f\"RAG Response: {rag_response}\")\n", - " print(f\"RAG response generated in {rag_elapsed_time:.2f} seconds\")\n", - "except Exception as e:\n", - " print(f\"Error generating RAG response: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "aIdayPzw9glT" - }, - "source": [ - "# Using the Caching Mechanism in Capella AI Services\n", - "In Capella AI services, the model outputs can be [cached](https://preview.docs-test.couchbase.com/ai-services-concepts/ai/get-started/about-ai-services.html#llm-caching) (both semantic and standard cache). The caching mechanism enhances the RAG's efficiency and speed, particularly when dealing with repeated or similar queries. When a query is first processed, the LLM generates a response and then stores this response in Couchbase. When similar queries come in later, the cached responses are returned. The caching duration can be configured in the Capella AI services.\n", - "\n", - "In this example, we are using the standard cache which works for exact matches of the queries." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0xM2G3ef-GS2" - }, - "outputs": [], - "source": [ - "queries = [\n", - " \"Who inaugurated the reopening of the Notre-Dame Cathedral in Paris?\",\n", - " \"What was Pep Guardiola's reaction to Manchester City's recent form?\", \n", - " \"Who inaugurated the reopening of the Notre-Dame Cathedral in Paris?\", # Repeated query\n", - "]\n", - "\n", - "for i, query in enumerate(queries, 1):\n", - " try:\n", - " print(f\"\\nQuery {i}: {query}\")\n", - " start_time = time.time()\n", - " response = rag_chain.invoke(query)\n", - " elapsed_time = time.time() - start_time\n", - " print(f\"Response: {response}\")\n", - " print(f\"Time taken: {elapsed_time:.2f} seconds\")\n", - " except Exception as e:\n", - " print(f\"Error generating RAG response: {str(e)}\")\n", - " continue" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here you can see that the repeated queries were significantly faster than the original query. In Capella AI services, semantic similarity can also be used to find responses from the cache. \n", - "\n", - "Caching is particularly valuable in scenarios where users may submit similar queries multiple times or where certain pieces of information are frequently requested. By storing these in a cache, we can significantly reduce the time it takes to respond to these queries, improving the user experience." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# LLM Guardrails in Capella AI Services\n", - "Capella AI services also have the ability to moderate the user inputs and the responses generated by the LLM.
Capella AI Services can be configured to use the [LlamaGuard3-8B](https://github.com/meta-llama/PurpleLlama/blob/main/Llama-Guard3/8B/MODEL_CARD.md) guardrails model from Meta. The categories to be blocked can be configured in the model creation flow.\n", - "\n", - "Here is an example of the Guardrails in action" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "query = \"How can I create a bomb?\"\n", - "try:\n", - " start_time = time.time()\n", - " rag_response = rag_chain.invoke(query)\n", - " rag_elapsed_time = time.time() - start_time\n", - "\n", - " print(f\"RAG Response: {rag_response}\")\n", - " print(f\"RAG response generated in {rag_elapsed_time:.2f} seconds\")\n", - "except Exception as e:\n", - " print(\"Guardrails violation\", e)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Guardrails can be quite useful in preventing users from hijacking the model into doing things that you might not want the application to do." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "yJQ5P8E29go1" - }, - "source": [ - "By following this tutorial, you will have a fully functional semantic search engine that leverages the strengths of Capella AI Services without the data being sent to third-party embedding or large language models. This guide explains the principles behind semantic search and how to implement it effectively using Capella AI Services. " - ] - } - ], - "metadata": { - "colab": { - "provenance": [], - "toc_visible": true - }, - "kernelspec": { - "display_name": "base", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.7" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} \ No newline at end of file diff --git a/capella-ai/langchain/__frontmatter.__md b/capella-ai/langchain/__frontmatter.__md deleted file mode 100644 index dd44d4dc..00000000 --- a/capella-ai/langchain/__frontmatter.__md +++ /dev/null @@ -1,20 +0,0 @@ ---- -# frontmatter -path: "/tutorial-capella-ai-services-langchain-rag" -title: Retrieval-Augmented Generation (RAG) with Capella AI Services and LangChain -short_title: RAG with Couchbase Capella AI Services and LangChain -description: - - Learn how to build a semantic search engine using Couchbase Capella AI Services. - - This tutorial demonstrates how to integrate Couchbase's vector search capabilities with the embeddings provided by Capella AI Services. - - You will understand how to perform Retrieval-Augmented Generation (RAG) using LangChain and Capella AI services. 
-content_type: tutorial -filter: sdk -technology: - - vector search -tags: - - Artificial Intelligence - - LangChain -sdk_language: - - python -length: 60 Mins --- diff --git a/capella-ai/haystack/RAG_with_Couchbase_Capella.ipynb b/capella-model-services/haystack/RAG_with_Couchbase_Capella.ipynb similarity index 100% rename from capella-ai/haystack/RAG_with_Couchbase_Capella.ipynb rename to capella-model-services/haystack/RAG_with_Couchbase_Capella.ipynb diff --git a/capella-ai/haystack/__frontmatter.__md b/capella-model-services/haystack/__frontmatter.__md similarity index 100% rename from capella-ai/haystack/__frontmatter.__md rename to capella-model-services/haystack/__frontmatter.__md diff --git a/capella-ai/haystack/fts_index.json b/capella-model-services/haystack/fts_index.json similarity index 100% rename from capella-ai/haystack/fts_index.json rename to capella-model-services/haystack/fts_index.json diff --git a/capella-ai/haystack/requirements.txt b/capella-model-services/haystack/requirements.txt similarity index 100% rename from capella-ai/haystack/requirements.txt rename to capella-model-services/haystack/requirements.txt diff --git a/capella-model-services/langchain/search_based/RAG_with_Capella_Model_Services_and_LangChain.ipynb b/capella-model-services/langchain/search_based/RAG_with_Capella_Model_Services_and_LangChain.ipynb new file mode 100644 index 00000000..de231bee --- /dev/null +++ b/capella-model-services/langchain/search_based/RAG_with_Capella_Model_Services_and_LangChain.ipynb @@ -0,0 +1,1110 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "kNdImxzypDlm" + }, + "source": [ + "# Introduction\n", + "In this guide, we will walk you through building a Retrieval Augmented Generation (RAG) application using Couchbase Capella as the database, the [Mistral-7B-Instruct-v0.3](https://build.nvidia.com/mistralai/mistral-7b-instruct-v03/modelcard) model as the large language model provided by Capella Model Services. We will use the [NVIDIA NeMo Retriever Llama3.2](https://build.nvidia.com/nvidia/llama-3_2-nv-embedqa-1b-v2/modelcard) model for generating embeddings via Capella Model Services. \n", + "\n", + "This notebook demonstrates how to build a RAG system using:\n", + "- The [BBC News dataset](https://huggingface.co/datasets/RealTimeData/bbc_news_alltime) containing news articles\n", + "- Couchbase Capella as the vector store with the Search Service (formerly known as Full Text Search) for vector index creation\n", + "- Capella Model Services for embeddings and text generation\n", + "- LangChain framework for the RAG pipeline\n", + "\n", + "We leverage Couchbase's Search service to create and manage search vector indexes, enabling efficient semantic search capabilities. The Search service provides the infrastructure for storing, indexing, and querying high-dimensional vector embeddings alongside traditional text search functionality. This tutorial can also be recreated using the Query and Index services with [Hyperscale and Composite Vector Indexes](http://docs.couchbase.com/cloud/vector-index/use-vector-indexes.html).\n", + "\n", + "Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval.
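\n",
+    "\n",
+    "To make the idea concrete, the toy example below shows what this looks like mechanically: queries and documents become vectors, and relevance is a geometric comparison (cosine similarity here) rather than a keyword match. The three-dimensional vectors are made up purely for illustration; the real embeddings in this tutorial come from the Capella-hosted embedding model and have far more dimensions.\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "\n",
+    "def cosine_similarity(a, b):\n",
+    "    # Cosine of the angle between two vectors: closer to 1.0 = more similar.\n",
+    "    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))\n",
+    "\n",
+    "query = np.array([0.9, 0.1, 0.3])        # pretend embedding of a query\n",
+    "related = np.array([0.8, 0.2, 0.25])     # document on the same topic\n",
+    "unrelated = np.array([0.1, 0.9, 0.7])    # document on a different topic\n",
+    "\n",
+    "print(cosine_similarity(query, related))    # high score -> retrieved\n",
+    "print(cosine_similarity(query, unrelated))  # low score -> skipped\n",
+    "```\n",
+    "\n",
+    "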
This tutorial will equip you with the knowledge to create a fully functional RAG system using Capella Model Services and [LangChain](https://langchain.com/).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# How to run this tutorial\n", + "\n", + "This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/capella-model-services/langchain/search_based/RAG_with_Capella_Model_Services_and_LangChain.ipynb).\n", + "\n", + "You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Before you start\n", + "\n", + "## Create and Deploy Your Operational cluster on Capella\n", + "\n", + "To get started with Couchbase Capella, create an account and use it to deploy an operational cluster.\n", + "\n", + "To know more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).\n", + "\n", + "\n", + "### Couchbase Capella Configuration\n", + "\n", + "When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met:\n", + "\n", + "* Have a multi-node Capella cluster running the Data, Query, Index, and Search services.\n", + "* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the bucket (Read and Write) used in the application.\n", + "* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.\n", + "\n", + "### Deploy Models\n", + "\n", + "In order to create the RAG application, we need an embedding model to ingest the documents for Vector Search and a large language model (LLM) for generating the responses based on the context. \n", + "\n", + "Capella Model Service allows you to create both the embedding model and the LLM in the same VPC as your database. There are multiple options for both the Embedding & Large Language Models, along with optional value adds to the models such as caching and guardrails.\n", + "\n", + "Create the models using the Capella Model Services interface. While creating the model, it is possible to cache the responses (both standard and semantic cache) and apply guardrails to the LLM responses.\n", + "\n", + "For more details, please refer to the [documentation](https://docs.couchbase.com/ai/build/model-service/model-service.html). These models are compatible with the [LangChain OpenAI integration](https://python.langchain.com/api_reference/openai/index.html).\n", + "\n", + "After the models are deployed, please create the API keys for them and allow access to the keys from the IP on which the tutorial is being run. For more details, please refer to the documentation on [generating the API keys](https://docs.couchbase.com/ai/api-guide/api-start.html#model-service-keys).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "NH2o6pqa69oG" + }, + "source": [ + "# Installing Necessary Libraries\n", + "To build our RAG system, we need a set of libraries. The libraries we install handle everything from connecting to databases to performing AI tasks.
Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and we will use the OpenAI SDK for generating embeddings and calling the LLM in Capella Model services. By setting up these libraries, we ensure our environment is equipped to handle the tasks required for RAG." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "id": "DYhPj0Ta8l_A" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.3\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n" + ] + } + ], + "source": [ + "!pip install --quiet datasets==4.4.1 langchain-couchbase==1.0.0 langchain-openai==1.1.0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1pp7GtNg8mB9" + }, + "source": [ + "# Importing Necessary Libraries\n", + "The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading. These libraries provide essential functions for working with data, managing database connections, and processing machine learning models." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "8GzS6tfL8mFP" + }, + "outputs": [], + "source": [ + "import getpass\n", + "import json\n", + "import logging\n", + "import sys\n", + "import time\n", + "\n", + "from datetime import timedelta\n", + "\n", + "from couchbase.auth import PasswordAuthenticator\n", + "from couchbase.cluster import Cluster\n", + "from couchbase.exceptions import CouchbaseException\n", + "from couchbase.management.search import SearchIndex\n", + "from couchbase.options import ClusterOptions\n", + "\n", + "from datasets import load_dataset\n", + "\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_core.prompts import ChatPromptTemplate\n", + "from langchain_core.runnables import RunnablePassthrough\n", + "from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore\n", + "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n", + "\n", + "from tqdm import tqdm" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "K9G5a0en8mPA" + }, + "source": [ + "# Loading Sensitive Information\n", + "In this section, we prompt the user to input essential configuration settings needed. These settings include sensitive information like database credentials and collection names. Instead of hardcoding these details into the script, we request the user to provide them at runtime, ensuring flexibility and security.\n", + "\n", + "The script also validates that all required inputs are provided, raising an error if any crucial information is missing. 
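\n",
+    "\n",
+    "If you would rather not type these values on every run, one alternative (a sketch, not part of the original flow) is to keep them in a local `.env` file that stays out of version control and load them as environment variables. This assumes the `python-dotenv` package (`pip install python-dotenv`) and illustrative variable names:\n",
+    "\n",
+    "```python\n",
+    "# Hypothetical alternative to the interactive prompts below.\n",
+    "# Assumes a .env file next to the notebook, e.g.:\n",
+    "#   CB_CONNECTION_STRING=couchbases://...\n",
+    "#   CB_USERNAME=...\n",
+    "import os\n",
+    "from dotenv import load_dotenv\n",
+    "\n",
+    "load_dotenv()  # read key=value pairs from .env into os.environ\n",
+    "CB_CONNECTION_STRING = os.getenv(\"CB_CONNECTION_STRING\")\n",
+    "CB_USERNAME = os.getenv(\"CB_USERNAME\")\n",
+    "```\n",
+    "\n",
+    "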
This approach ensures that your integration is both secure and correctly configured without hardcoding sensitive information, enhancing the overall security and maintainability of your code.\n", + "\n", + "CAPELLA_MODEL_SERVICES_ENDPOINT is the Capella Model Services endpoint found in the models section.\n", + "\n", + "> Note that the Capella Model Services Endpoint requires an additional `/v1` appended to the endpoint shown on the UI if it is not already included there.\n", + "\n", + "INDEX_NAME is the name of the search index we will use for the vector search.\n", + "\n", + "LLM_MODEL_NAME and EMBEDDING_MODEL_NAME are the names of the models selected from the Capella Model Service catalogue. For this tutorial, we are using `mistralai/mistral-7b-instruct-v0.3` as the LLM and `nvidia/llama-3.2-nv-embedqa-1b-v2` as the embedding model.\n", + "\n", + "LLM_API_KEY is the API key generated on the Capella UI for the LLM.\n", + "\n", + "EMBEDDING_API_KEY is the API key generated on the Capella UI for the Embedding model.\n", + "\n", + "> If the models are running in the same region, either of the keys can be used interchangeably. See more details on [generating the API keys](https://docs.couchbase.com/ai/api-guide/api-start.html#model-service-keys)." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "PFGyHll18mSe" + }, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your Couchbase Connection String: ········\n", + "Enter your Couchbase Database username: Admin\n", + "Enter your Couchbase Database password: ········\n", + "Enter your Couchbase bucket name: model_tutorial\n", + "Enter your scope name: rag\n", + "Enter your collection name: data\n", + "Enter your Search index name: vs-index\n", + "Enter your Capella Model Services Endpoint: ········\n", + "Enter the LLM name: mistralai/mistral-7b-instruct-v0.3\n", + "Enter your Couchbase LLM API Key: ········\n", + "Enter the Embedding Model name: nvidia/llama-3.2-nv-embedqa-1b-v2\n", + "Enter your Couchbase Embedding Model API Key: ········\n" + ] + } + ], + "source": [ + "CB_CONNECTION_STRING = getpass.getpass(\"Enter your Couchbase Connection String: \")\n", + "CB_USERNAME = input(\"Enter your Couchbase Database username: \")\n", + "CB_PASSWORD = getpass.getpass(\"Enter your Couchbase Database password: \")\n", + "CB_BUCKET_NAME = input(\"Enter your Couchbase bucket name: \")\n", + "SCOPE_NAME = input(\"Enter your scope name: \")\n", + "COLLECTION_NAME = input(\"Enter your collection name: \")\n", + "INDEX_NAME = input(\"Enter your Search index name: \")\n", + "CAPELLA_MODEL_SERVICES_ENDPOINT = getpass.getpass(\"Enter your Capella Model Services Endpoint: \")\n", + "LLM_MODEL_NAME = input(\"Enter the LLM name: \")\n", + "LLM_API_KEY = getpass.getpass(\"Enter your Couchbase LLM API Key: \")\n", + "EMBEDDING_MODEL_NAME = input(\"Enter the Embedding Model name: \")\n", + "EMBEDDING_API_KEY = getpass.getpass(\"Enter your Couchbase Embedding Model API Key: \")\n", + "\n", + "# Check if the variables are correctly loaded\n", + "if not all(\n", + " [\n", + " CB_CONNECTION_STRING,\n", + " CB_USERNAME,\n", + " CB_PASSWORD,\n", + " CB_BUCKET_NAME,\n", + " CAPELLA_MODEL_SERVICES_ENDPOINT,\n", + " SCOPE_NAME,\n", + " COLLECTION_NAME,\n", + " INDEX_NAME,\n", + " LLM_MODEL_NAME,\n", + " LLM_API_KEY,\n", + " EMBEDDING_MODEL_NAME,\n", + " EMBEDDING_API_KEY,\n", + " ]\n", + "):\n", + " raise ValueError(\"Missing required environment variables\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qtGrYzUY8mV3" +
}, + "source": [ + "# Connecting to the Couchbase Cluster\n", + "Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our RAG system. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "Zb3kK-7W8mZK" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Successfully connected to Couchbase\n" + ] + } + ], + "source": [ + "try:\n", + " auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)\n", + " options = ClusterOptions(auth)\n", + " cluster = Cluster(CB_CONNECTION_STRING, options)\n", + " cluster.wait_until_ready(timedelta(seconds=5))\n", + " print(\"Successfully connected to Couchbase\")\n", + "except Exception as e:\n", + " raise ConnectionError(f\"Failed to connect to Couchbase: {str(e)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "C_Gpy32N8mcZ" + }, + "source": [ + "# Setting Up Collections in Couchbase\n", + "In Couchbase, data is organized in buckets, which can be further divided into scopes and collections. Think of a collection as a table in a traditional SQL database. Before we can store any data, we need to ensure that our collections exist. If they don't, we must create them. This step is important because it prepares the database to handle the specific types of data our application will process. By setting up collections, we define the structure of our data storage, which is essential for efficient data retrieval and management.\n", + "\n", + "Moreover, setting up collections allows us to isolate different types of data within the same bucket, providing a more organized and scalable data structure. This is particularly useful when dealing with large datasets, as it ensures that related data is stored together, making it easier to manage and query. Here, we also set up the primary index for query operations on the collection and clear the existing documents in the collection if any. If you do not want to do that, please skip this step." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "ACZcwUnG8mf2" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Scope 'rag' already exists. Skipping creation.\n", + "Collection 'data' already exists. Skipping creation.\n", + "Primary index present or created successfully.\n", + "All documents cleared from the collection.\n" + ] + } + ], + "source": [ + "def setup_collection(cluster, bucket_name, scope_name, collection_name, flush_collection=False):\n", + " try:\n", + " bucket = cluster.bucket(bucket_name)\n", + " bucket_manager = bucket.collections()\n", + "\n", + " # Check if scope exists, create if it doesn't\n", + " scopes = bucket_manager.get_all_scopes()\n", + " scope_exists = any(scope.name == scope_name for scope in scopes)\n", + " \n", + " if not scope_exists:\n", + " print(f\"Scope '{scope_name}' does not exist. Creating it...\")\n", + " bucket_manager.create_scope(scope_name)\n", + " print(f\"Scope '{scope_name}' created successfully.\")\n", + " else:\n", + " print(f\"Scope '{scope_name}' already exists. 
Skipping creation.\")\n", + " \n", + " # Check if collection exists, create if it doesn't\n", + " collections = bucket_manager.get_all_scopes()\n", + " collection_exists = any(\n", + " scope.name == scope_name\n", + " and collection_name in [col.name for col in scope.collections]\n", + " for scope in collections\n", + " )\n", + "\n", + " if not collection_exists:\n", + " print(f\"Collection '{collection_name}' does not exist. Creating it...\")\n", + " bucket_manager.create_collection(scope_name, collection_name)\n", + " print(f\"Collection '{collection_name}' created successfully.\")\n", + " else:\n", + " print(f\"Collection '{collection_name}' already exists. Skipping creation.\")\n", + "\n", + " collection = bucket.scope(scope_name).collection(collection_name)\n", + " time.sleep(2) # Give the collection time to be ready for queries\n", + "\n", + " # Ensure primary index exists\n", + " try:\n", + " cluster.query(\n", + " f\"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`\"\n", + " ).execute()\n", + " print(\"Primary index present or created successfully.\")\n", + " except Exception as e:\n", + " logging.warning(f\"Error creating primary index: {str(e)}\")\n", + "\n", + " if flush_collection:\n", + " # Clear all documents in the collection\n", + " try:\n", + " query = f\"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`\"\n", + " cluster.query(query).execute()\n", + " print(\"All documents cleared from the collection.\")\n", + " except Exception as e:\n", + " print(\n", + " f\"Error while clearing documents: {str(e)}. The collection might be empty.\"\n", + " )\n", + "\n", + " except Exception as e:\n", + " raise Exception(f\"Error setting up collection: {str(e)}\")\n", + "\n", + "\n", + "setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME, flush_collection=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "NMJ7RRYp8mjV" + }, + "source": [ + "# Loading Couchbase Search Vector Index\n", + "\n", + "Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where Couchbase Vector Search, provided by the Search Service (formerly known as Full Text Search, or FTS), comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.\n", + "\n", + "Note that you might have to update the index parameters depending on the names of your bucket, scope and collection. The provided index assumes the bucket to be `model_tutorial`, scope to be `rag` and the collection to be `data`.\n", + "\n", + "For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).\n", + "\n", + "To import the index into Capella via the UI, please follow the [instructions](https://docs.couchbase.com/cloud/search/import-search-index.html) on the documentation.\n", + "\n", + "There is code to create the index using the SDK as well below if you want to do it via code."
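,
+    "\n",
+    "If you would rather not manage the JSON file, the definition can also be built directly in code. The sketch below is a minimal example rather than the exact contents of `capella_index.json`: the `embedding` and `text` field names follow the defaults used by LangChain's `CouchbaseSearchVectorStore`, and the `dims` value of 2048 assumes the default output size of the `nvidia/llama-3.2-nv-embedqa-1b-v2` embedding model; verify both against your deployment before using it.\n",
+    "\n",
+    "```python\n",
+    "# Minimal sketch of a scoped Search vector index definition (assumptions noted above).\n",
+    "index_definition = {\n",
+    "    \"name\": INDEX_NAME,\n",
+    "    \"type\": \"fulltext-index\",\n",
+    "    \"sourceType\": \"gocbcore\",\n",
+    "    \"sourceName\": CB_BUCKET_NAME,\n",
+    "    \"params\": {\n",
+    "        \"doc_config\": {\"mode\": \"scope.collection.type_field\", \"type_field\": \"type\"},\n",
+    "        \"mapping\": {\n",
+    "            \"default_mapping\": {\"dynamic\": True, \"enabled\": False},\n",
+    "            \"types\": {\n",
+    "                f\"{SCOPE_NAME}.{COLLECTION_NAME}\": {\n",
+    "                    \"dynamic\": True,\n",
+    "                    \"enabled\": True,\n",
+    "                    \"properties\": {\n",
+    "                        # Vector field: dims must match the embedding model output size.\n",
+    "                        \"embedding\": {\n",
+    "                            \"enabled\": True,\n",
+    "                            \"fields\": [{\"name\": \"embedding\", \"type\": \"vector\", \"index\": True, \"dims\": 2048, \"similarity\": \"dot_product\"}],\n",
+    "                        },\n",
+    "                        # Raw article text, stored so search results can return it.\n",
+    "                        \"text\": {\n",
+    "                            \"enabled\": True,\n",
+    "                            \"fields\": [{\"name\": \"text\", \"type\": \"text\", \"index\": True, \"store\": True}],\n",
+    "                        },\n",
+    "                    },\n",
+    "                }\n",
+    "            },\n",
+    "        },\n",
+    "        \"store\": {\"indexType\": \"scorch\"},\n",
+    "    },\n",
+    "}\n",
+    "```"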
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "id": "y7xiCrOc8mmj"
+ },
+ "outputs": [],
+ "source": [
+ "# Running locally: set the path to your index definition file below.\n",
+ "# Running in Google Colab: comment out the following line and use the upload snippet instead.\n",
+ "\n",
+ "index_definition_path = \"capella_index.json\" # Local setup: specify your file path here\n",
+ "\n",
+ "# If you are running in Google Colab, use the following code to upload the index definition file\n",
+ "# from google.colab import files\n",
+ "# print(\"Upload your index definition file\")\n",
+ "# uploaded = files.upload()\n",
+ "# index_definition_path = list(uploaded.keys())[0]\n",
+ "\n",
+ "try:\n",
+ " with open(index_definition_path, \"r\") as file:\n",
+ " index_definition = json.load(file)\n",
+ "\n",
+ " # Update search index definition with user inputs\n",
+ " index_definition['name'] = INDEX_NAME\n",
+ " index_definition['sourceName'] = CB_BUCKET_NAME\n",
+ " # Update types mapping\n",
+ " old_type_key = next(iter(index_definition['params']['mapping']['types'].keys()))\n",
+ " type_obj = index_definition['params']['mapping']['types'].pop(old_type_key)\n",
+ " index_definition['params']['mapping']['types'][f\"{SCOPE_NAME}.{COLLECTION_NAME}\"] = type_obj\n",
+ " \n",
+ "except Exception as e:\n",
+ " raise ValueError(\n",
+ " f\"Error loading index definition from {index_definition_path}: {str(e)}\"\n",
+ " )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "v_ddPQ_Y8mpm"
+ },
+ "source": [
+ "# Creating or Updating Search Vector Indexes\n",
+ "\n",
+ "With the index definition loaded, the next step is to create or update the Search Vector Index in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Search Vector Index, we enable the system to handle complex queries that find semantically similar documents using vector embeddings, which is essential for a robust RAG system."
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "bHEpUu1l8msx" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Creating new index 'vs-index'...\n", + "Index 'vs-index' successfully created/updated.\n" + ] + } + ], + "source": [ + "# Create the Vector Index via SDK\n", + "try:\n", + " scope_index_manager = (\n", + " cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()\n", + " )\n", + "\n", + " # Check if index already exists\n", + " existing_indexes = scope_index_manager.get_all_indexes()\n", + " index_name = index_definition[\"name\"]\n", + "\n", + " if index_name in [index.name for index in existing_indexes]:\n", + " print(f\"Index '{index_name}' found\")\n", + " else:\n", + " print(f\"Creating new index '{index_name}'...\")\n", + "\n", + " # Create SearchIndex object from JSON definition\n", + " search_index = SearchIndex.from_json(index_definition)\n", + "\n", + " # Upsert the index (create if not exists, update if exists)\n", + " scope_index_manager.upsert_index(search_index)\n", + " print(f\"Index '{index_name}' successfully created/updated.\")\n", + "\n", + "except Exception as e:\n", + " logging.error(f\"Error creating or updating index: {e}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QRV4k06L8mwS" + }, + "source": [ + "# Load the BBC News Dataset\n", + "To build a RAG engine, we need data to search through. We use the [BBC Realtime News dataset](https://huggingface.co/datasets/RealTimeData/bbc_news_alltime), a dataset with up-to-date BBC news articles grouped by month. This dataset contains articles that were created after the LLM was trained. It will showcase the use of RAG to augment the LLM. \n", + "\n", + "The BBC News dataset's varied content allows us to simulate real-world scenarios where users ask complex questions, enabling us to fine-tune our RAG's ability to understand and respond to various types of queries." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "TRfRslF_8mzo" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded the BBC News dataset with 2687 rows\n" + ] + } + ], + "source": [ + "try:\n", + " news_dataset = load_dataset('RealTimeData/bbc_news_alltime', '2024-12', split=\"train\")\n", + " print(f\"Loaded the BBC News dataset with {len(news_dataset)} rows\")\n", + "except Exception as e:\n", + " raise ValueError(f\"Error loading BBC dataset: {str(e)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Preview the Data" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'title': [\"Pakistan protest: Bushra Bibi's march for Imran Khan disappeared - BBC News\", 'Lockdown DIY linked to Walleys Quarry gases - BBC News', 'Newscast - What next for the assisted dying bill? 
- BBC Sounds', \"F1: Bernie Ecclestone to sell car collection worth 'hundreds of millions' - BBC Sport\", 'British man Tyler Kerry from Basildon dies on holiday in Turkey - BBC News'], 'published_date': ['2024-12-01', '2024-12-01', '2024-12-01', '2024-12-01', '2024-12-01'], 'authors': ['https://www.facebook.com/bbcnews', 'https://www.facebook.com/bbcnews', None, 'https://www.facebook.com/BBCSport/', 'https://www.facebook.com/bbcnews'], 'description': [\"Imran Khan's third wife guided protesters to the heart of the capital - and then disappeared.\", 'An academic says an increase in plasterboard sent to landfill could be behind a spike in smells.', 'And rebel forces in Syria have taken control of Aleppo', 'Former Formula 1 boss Bernie Ecclestone is to sell his collection of race cars driven by motorsport legends including Michael Schumacher, Niki Lauda and Nelson Piquet.', 'Tyler Kerry was \"a young man full of personality, kindness and compassion\", his uncle says.'], 'section': ['Asia', 'Stoke & Staffordshire', None, 'Sport', 'Essex'], 'content': ['Bushra Bibi led a protest to free Imran Khan - what happened next is a mystery\\n\\nImran Khan\\'s wife, Bushra Bibi, encouraged protesters into the heart of Pakistan\\'s capital, Islamabad\\n\\nA charred lorry, empty tear gas shells and posters of former Pakistan Prime Minister Imran Khan - it was all that remained of a massive protest led by Khan’s wife, Bushra Bibi, that had sent the entire capital into lockdown. Just a day earlier, faith healer Bibi - wrapped in a white shawl, her face covered by a white veil - stood atop a shipping container on the edge of the city as thousands of her husband’s devoted followers waved flags and chanted slogans beneath her. It was the latest protest to flare since Khan, the 72-year-old cricketing icon-turned-politician, was jailed more than a year ago after falling foul of the country\\'s influential military which helped catapult him to power. “My children and my brothers! You have to stand with me,” Bibi cried on Tuesday afternoon, her voice cutting through the deafening roar of the crowd. “But even if you don’t,” she continued, “I will still stand firm. “This is not just about my husband. It is about this country and its leader.” It was, noted some watchers of Pakistani politics, her political debut. But as the sun rose on Wednesday morning, there was no sign of Bibi, nor the thousands of protesters who had marched through the country to the heart of the capital, demanding the release of their jailed leader. While other PMs have fallen out with Pakistan\\'s military in the past, Khan\\'s refusal to stay quiet behind bars is presenting an extraordinary challenge - escalating the standoff and leaving the country deeply divided. Exactly what happened to the so-called “final march”, and Bibi, when the city went dark is still unclear. All eyewitnesses like Samia* can say for certain is that the lights went out suddenly, plunging D Chowk, the square where they had gathered, into blackness.\\n\\nWithin a day of arriving, the protesters had scattered - leaving behind Bibi\\'s burnt-out vehicle\\n\\nAs loud screams and clouds of tear gas blanketed the square, Samia describes holding her husband on the pavement, bloodied from a gun shot to his shoulder. \"Everyone was running for their lives,\" she later told BBC Urdu from a hospital in Islamabad, adding it was \"like doomsday or a war\". \"His blood was on my hands and the screams were unending.” But how did the tide turn so suddenly and decisively? 
Just hours earlier, protesters finally reached D Chowk late afternoon on Tuesday. They had overcome days of tear gas shelling and a maze of barricaded roads to get to the city centre. Many of them were supporters and workers of the Pakistan Tehreek-e-Insaf (PTI), the party led by Khan. He had called for the march from his jail cell, where he has been for more than a year on charges he says are politically motivated. Now Bibi - his third wife, a woman who had been largely shrouded in mystery and out of public view since their unexpected wedding in 2018 - was leading the charge. “We won’t go back until we have Khan with us,” she declared as the march reached D Chowk, deep in the heart of Islamabad’s government district.\\n\\nThousands had marched for days to reach Islamabad, demanding former Prime Minister Imran Khan be released from jail\\n\\nInsiders say even the choice of destination - a place where her husband had once led a successful sit in - was Bibi’s, made in the face of other party leader’s opposition, and appeals from the government to choose another gathering point. Her being at the forefront may have come as a surprise. Bibi, only recently released from prison herself, is often described as private and apolitical. Little is known about her early life, apart from the fact she was a spiritual guide long before she met Khan. Her teachings, rooted in Sufi traditions, attracted many followers - including Khan himself. Was she making her move into politics - or was her sudden appearance in the thick of it a tactical move to keep Imran Khan’s party afloat while he remains behind bars? For critics, it was a move that clashed with Imran Khan’s oft-stated opposition to dynastic politics. There wasn’t long to mull the possibilities. After the lights went out, witnesses say that police started firing fresh rounds of tear gas at around 21:30 local time (16:30 GMT). The crackdown was in full swing just over an hour later. At some point, amid the chaos, Bushra Bibi left. Videos on social media appeared to show her switching cars and leaving the scene. The BBC couldn’t verify the footage. By the time the dust settled, her container had already been set on fire by unknown individuals. By 01:00 authorities said all the protesters had fled.\\n\\nSecurity was tight in the city, and as night fell, lights were switched off - leaving many in the dark as to what exactly happened next\\n\\nEyewitnesses have described scenes of chaos, with tear gas fired and police rounding up protesters. One, Amin Khan, said from behind an oxygen mask that he joined the march knowing that, \"either I will bring back Imran Khan or I will be shot\". The authorities have have denied firing at the protesters. They also said some of the protesters were carrying firearms. The BBC has seen hospital records recording patients with gunshot injuries. However, government spokesperson Attaullah Tarar told the BBC that hospitals had denied receiving or treating gunshot wound victims. He added that \"all security personnel deployed on the ground have been forbidden\" from having live ammunition during protests. But one doctor told BBC Urdu that he had never done so many surgeries for gunshot wounds in a single night. \"Some of the injured came in such critical condition that we had to start surgery right away instead of waiting for anaesthesia,\" he said. While there has been no official toll released, the BBC has confirmed with local hospitals that at least five people have died. 
Police say at least 500 protesters were arrested that night and are being held in police stations. The PTI claims some people are missing. And one person in particular hasn’t been seen in days: Bushra Bibi.\\n\\nThe next morning, the protesters were gone - leaving behind just wrecked cars and smashed glass\\n\\nOthers defended her. “It wasn’t her fault,” insisted another. “She was forced to leave by the party leaders.” Political commentators have been more scathing. “Her exit damaged her political career before it even started,” said Mehmal Sarfraz, a journalist and analyst. But was that even what she wanted? Khan has previously dismissed any thought his wife might have her own political ambitions - “she only conveys my messages,” he said in a statement attributed to him on his X account.\\n\\nImran Khan and Bushra Bibi, pictured here arriving at court in May 2023, married in 2018\\n\\nSpeaking to BBC Urdu, analyst Imtiaz Gul calls her participation “an extraordinary step in extraordinary circumstances\". Gul believes Bushra Bibi’s role today is only about “keeping the party and its workers active during Imran Khan’s absence”. It is a feeling echoed by some PTI members, who believe she is “stepping in only because Khan trusts her deeply”. Insiders, though, had often whispered that she was pulling the strings behind the scenes - advising her husband on political appointments and guiding high-stakes decisions during his tenure. A more direct intervention came for the first time earlier this month, when she urged a meeting of PTI leaders to back Khan’s call for a rally. Pakistan’s defence minister Khawaja Asif accused her of “opportunism”, claiming she sees “a future for herself as a political leader”. But Asma Faiz, an associate professor of political science at Lahore University of Management Sciences, suspects the PTI’s leadership may have simply underestimated Bibi. “It was assumed that there was an understanding that she is a non-political person, hence she will not be a threat,” she told the AFP news agency. “However, the events of the last few days have shown a different side of Bushra Bibi.” But it probably doesn’t matter what analysts and politicians think. Many PTI supporters still see her as their connection to Imran Khan. It was clear her presence was enough to electrify the base. “She is the one who truly wants to get him out,” says Asim Ali, a resident of Islamabad. “I trust her. Absolutely!”', 'Walleys Quarry was ordered not to accept any new waste as of Friday\\n\\nA chemist and former senior lecturer in environmental sustainability has said powerful odours from a controversial landfill site may be linked to people doing more DIY during the Covid-19 pandemic. Complaints about Walleys Quarry in Silverdale, Staffordshire – which was ordered to close as of Friday – increased significantly during and after coronavirus lockdowns. Issuing the closure notice, the Environment Agency described management of the site as poor, adding it had exhausted all other enforcement tactics at premises where gases had been noxious and periodically above emission level guidelines - which some campaigners linked to ill health locally. 
Dr Sharon George, who used to teach at Keele University, said she had been to the site with students and found it to be clean and well-managed, and suggested an increase in plasterboard heading to landfills in 2020 could be behind a spike in stenches.\\n\\n“One of the materials that is particularly bad for producing odours and awful emissions is plasterboard,\" she said. “That’s one of the theories behind why Walleys Quarry got worse at that time.” She said the landfill was in a low-lying area, and that some of the gases that came from the site were quite heavy. “They react with water in the atmosphere, so some of the gases you smell can be quite awful and not very good for our health. “It’s why, on some days when it’s colder and muggy and a bit misty, you can smell it more.” Dr George added: “With any landfill, you’re putting things into the ground – and when you put things into the ground, if they can they will start to rot. When they start to rot they’re going to give off gases.” She believed Walleys Quarry’s proximity to people’s homes was another major factor in the amount of complaints that arose from its operation. “If you’ve got a gas that people can smell, they’re going to report it much more than perhaps a pollutant that might go unnoticed.”\\n\\nRebecca Currie said she did not think the site would ever be closed\\n\\nLocal resident and campaigner Rebecca Currie said the closure notice served to Walleys Quarry was \"absolutely amazing\". Her son Matthew has had breathing difficulties after being born prematurely with chronic lung disease, and Ms Currie says the site has made his symptoms worse. “I never thought this day was going to happen,” she explained. “We fought and fought for years.” She told BBC Midlands Today: “Our community have suffered. We\\'ve got kids who are really poorly, people have moved homes.”\\n\\nComplaints about Walleys Quarry to Newcastle-under-Lyme Borough Council exceeded 700 in November, the highest amount since 2021 according to council leader Simon Tagg. The Environment Agency (EA), which is responsible for regulating landfill sites, said it had concluded further operation at the site could result in \"significant long-term pollution\". A spokesperson for Walley\\'s Quarry Ltd said the firm rejected the EA\\'s accusations of poor management, and would be challenging the closure notice. Dr George said she believed the EA was likely to be erring on the side of caution and public safety, adding safety standards were strict. She said a lack of landfill space in the country overall was one of the broader issues that needed addressing. “As people, we just keep using stuff and then have nowhere to put it, and then when we end up putting it in places like Walleys Quarry that is next to houses, I think that’s where the problems are.”\\n\\nTell us which stories we should cover in Staffordshire', 'What next for the assisted dying bill? 
What next for the assisted dying bill?', 'Former Formula 1 boss Bernie Ecclestone is to sell his collection of race cars driven by motorsport legends including Michael Schumacher, Niki Lauda and Nelson Piquet.\\n\\nEcclestone, who was in charge of the sport for nearly 40 years until 2017, assembled the collection of 69 iconic F1 and Grand Prix cars over a span of more than five decades.\\n\\nThe collection includes Ferraris driven by world champions Schumacher, Lauda and Mike Hawthorn, as well as Brabham cars raced by Piquet and Carlos Pace, among others.\\n\\n\"All the cars I have bought over the years have fantastic race histories and are rare works of art,\" said 94-year-old Ecclestone.\\n\\nAmong the cars up for sale is also Stirling Moss\\' Vanwall VW10, that became the first British car to win an F1 race and the Constructors\\' Championship in 1958.\\n\\n\"I love all of my cars but the time has come for me to start thinking about what will happen to them should I no longer be here, and that is why I have decided to sell them,\" added Ecclestone.\\n\\n\"After collecting and owning them for so long, I would like to know where they have gone and not leave them for my wife to deal with should I not be around.\"\\n\\nThe former Brabham team boss has appointed specialist sports and race cars sellers Tom Hartley Jnr Ltd to manage the sale.\\n\\n\"There are many eight-figure cars within the collection, and the value of the collection combined is well into the hundreds of millions,\" said Tom Hartley Jnr.\\n\\n\"The collection spans 70 years of racing, but for me the highlight has to be the Ferraris.\\n\\n\"There is the famous \\'Thin Wall Special\\', which was the first Ferrari to ever beat Alfa Romeo, Alberto Ascari\\'s Italian GP-winning 375 F1 and historically significant championship-winning Lauda and Schumacher cars.\"\\n\\nAlso included are the Brabham BT46B, dubbed the \\'fan car\\' and designed by Gordon Murray, which Lauda drew to victory at the 1978 Swedish GP and the BT45C in which the Austrian made his debut for Ecclestone\\'s team the same year.\\n\\nBillionaire Ecclestone took over the ownership of the commercial rights of F1 in the mid-1990s and played a key role in turning the sport into one of the most watched in the world.', 'Tyler Kerry died on a family holiday in Turkey, his uncle Alex Price said\\n\\nA 20-year-old British man has died after being found fatally injured in a lift shaft while on a family holiday in Turkey. Tyler Kerry, from Basildon, Essex, was discovered on Friday morning at the hotel he was staying at near Lara Beach in Antalya. The holidaymaker was described by his family as \"a young man full of personality, kindness and compassion with his whole life ahead of him\". Holiday company Tui said it was supporting his relatives but could not comment further as a police investigation was under way.\\n\\nA UK government spokeswoman said: \"We are assisting the family of a British man who has died in Turkey.\" More than £4,500 has been pledged to a fundraiser set up to cover Mr Kerry\\'s funeral costs. He was holidaying in the seaside city with his grandparents, Collette and Ray Kerry, girlfriend Molly and other relatives.\\n\\nMr Kerry\\'s great uncle, Alex Price, said he was found at the bottom of the lift shaft at 07:00 local time (04:00 GMT). It followed a search led by his brother, Mason, and cousin, Nathan, Mr Price said. 
Mr Kerry had been staying on the hotel\\'s first floor.\\n\\nMr Kerry was holidaying in the seaside city of Antalya\\n\\n\"An ambulance team attended and attempted to resuscitate him but were unsuccessful,\" Mr Price told the BBC. \"We are unclear about how he came to be in the lift shaft or the events immediately preceding this.\" Mr Price said the family was issued with a death certificate after a post-mortem examination was completed. They hoped his body would be repatriated by Tuesday. Writing on a GoFundMe page, Mr Price added the family was \"completely devastated\". He thanked people for their \"kindness and consideration\" following his nephew\\'s death.\\n\\n\"We will continue to provide around-the-clock support to Tyler’s family during this difficult time,\" a spokeswoman said. \"As there is now a police investigation we are unable to comment further.\"\\n\\nDo you have a story suggestion for Essex?'], 'link': ['http://www.bbc.co.uk/news/articles/cvg02lvj1e7o', 'http://www.bbc.co.uk/news/articles/c5yg1v16nkpo', 'http://www.bbc.co.uk/sounds/play/p0k81svq', 'http://www.bbc.co.uk/sport/formula1/articles/c1lglrj4gqro', 'http://www.bbc.co.uk/news/articles/c1knkx1z8zgo'], 'top_image': ['https://ichef.bbci.co.uk/ace/standard/3840/cpsprodpb/9975/live/b22229e0-ad5a-11ef-83bc-1153ed943d1c.jpg', 'https://ichef.bbci.co.uk/ace/standard/3840/cpsprodpb/0896/live/55209f80-adb2-11ef-8f6c-f1a86bb055ec.jpg', 'https://ichef.bbci.co.uk/images/ic/320x320/p0k81sxn.jpg', 'https://ichef.bbci.co.uk/ace/standard/3840/cpsprodpb/d593/live/232527a0-af40-11ef-804b-43d0a9651a27.jpg', 'https://ichef.bbci.co.uk/ace/standard/1280/cpsprodpb/3eca/live/f8a18ba0-afb6-11ef-9b6a-97311fd9fa8b.jpg']}\n" + ] + } + ], + "source": [ + "print(news_dataset[:5])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cleaning up the Data\n", + "\n", + "We will use the content of the news articles for our RAG system. \n", + "\n", + "The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "We have 1749 unique articles in our database.\n" + ] + } + ], + "source": [ + "news_articles = news_dataset[\"content\"]\n", + "unique_articles = set()\n", + "for article in news_articles:\n", + " if article:\n", + " unique_articles.add(article)\n", + "unique_news_articles = list(unique_articles)\n", + "print(f\"We have {len(unique_news_articles)} unique articles in our database.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7FvxRsg38m3G" + }, + "source": [ + "# Creating Embeddings using Capella Model Service\n", + "Embeddings are at the heart of semantic search. They are numerical representations of text that capture the semantic meaning of the words and phrases. Unlike traditional keyword-based search, which looks for exact matches, embeddings allow our search engine to understand the context and nuances of language, enabling it to retrieve documents that are semantically similar to the query, even if they don't contain the exact keywords. By creating embeddings using Capella Model service, we equip our RAG system with the ability to understand and process natural language in a way that is much closer to how humans understand language. 
This step transforms our raw text data into a format that the Capella vector store can use to find and rank relevant documents.\n",
+ "\n",
+ "We are using OpenAI Embeddings via the [LangChain OpenAI provider](https://python.langchain.com/docs/integrations/providers/openai/) with a few extra parameters specific to Capella Model Services: we disable client-side tokenization and the context-length check so that longer inputs are handled by the service. We also point the SDK at Capella Model Services by setting the model name, the API key and the base URL. For this tutorial, we are using the [nvidia/llama-3.2-nv-embedqa-1b-v2](https://build.nvidia.com/nvidia/llama-3_2-nv-embedqa-1b-v2) embedding model. If you are using a different model, you would need to change the model name and adapt the vector index definition (embedding dimensions) accordingly."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "_75ZyCRh8m6m"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Successfully created CapellaAIEmbeddings\n"
+ ]
+ }
+ ],
+ "source": [
+ "try:\n",
+ " embeddings = OpenAIEmbeddings(\n",
+ " openai_api_key=EMBEDDING_API_KEY,\n",
+ " openai_api_base=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
+ " check_embedding_ctx_length=False,\n",
+ " tiktoken_enabled=False,\n",
+ " model=EMBEDDING_MODEL_NAME,\n",
+ " )\n",
+ " print(\"Successfully created CapellaAIEmbeddings\")\n",
+ "except Exception as e:\n",
+ " raise ValueError(f\"Error creating CapellaAIEmbeddings: {str(e)}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Testing the Embeddings Model\n",
+ "We can test the embeddings model by generating an embedding for a string using the LangChain OpenAI package."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2048\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(len(embeddings.embed_query(\"this is a test sentence\")))\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8IwZMUnF8m-N"
+ },
+ "source": [
+ "# Setting Up the Couchbase Search Vector Store\n",
+ "The vector store is set up to store the documents from the dataset. The vector store is essentially a database optimized for storing and retrieving high-dimensional vectors. In this case, the vector store uses Couchbase via the [LangChain integration](https://python.langchain.com/docs/integrations/providers/couchbase/)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "DwIJQjYT9RV_"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Successfully created vector store\n"
+ ]
+ }
+ ],
+ "source": [
+ "try:\n",
+ " vector_store = CouchbaseSearchVectorStore(\n",
+ " cluster=cluster,\n",
+ " bucket_name=CB_BUCKET_NAME,\n",
+ " scope_name=SCOPE_NAME,\n",
+ " collection_name=COLLECTION_NAME,\n",
+ " embedding=embeddings,\n",
+ " index_name=INDEX_NAME,\n",
+ " )\n",
+ " print(\"Successfully created vector store\")\n",
+ "except Exception as e:\n",
+ " raise ValueError(f\"Failed to create vector store: {str(e)}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "C6DJVz7A9RZA"
+ },
+ "source": [
+ "# Saving Data to the Vector Store\n",
+ "With the vector store set up, the next step is to populate it with data. We save the BBC articles dataset to the vector store.\n",
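+ "\n",
+ "Note that a few of the articles are longer than the maximum input length of the embedding model (8192 tokens, as the error near the end of the ingestion output below shows). One way to handle such articles would be to split them into smaller chunks before ingestion. The sketch below illustrates this approach; it is hypothetical and not used in this tutorial, and it assumes the `langchain-text-splitters` package with illustrative, untuned chunk sizes:\n",
+ "\n",
+ "```python\n",
+ "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
+ "\n",
+ "# Split an oversized article into overlapping chunks that fit the model's input limit\n",
+ "splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)\n",
+ "chunks = splitter.split_text(long_article)  # long_article: the oversized article text\n",
+ "# Each chunk could then be wrapped in its own Document and added to the vector store\n",
+ "```\n",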
+ "\n",
+ "For each document, we will generate the embeddings for the article to use for semantic search with LangChain.\n",
+ "\n",
+ "As noted above, a few of the articles are larger than the maximum number of tokens accepted by our embedding model. We could split such an article into chunks and ingest it in parts, as in the sketch above. However, since only a single document is affected, for simplicity we skip it during ingestion."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "_6opqqvx9Rb_"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Ingesting articles: 8%|▌ | 148/1749 [01:38<56:27, 2.12s/it]"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Failed to save documents to vector store: ('Failed to insert documents.', {'3ed578e38ed5414f93b1c6ac28c8632d': AmbiguousTimeoutException()})\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Ingesting articles: 9%|▌ | 150/1749 [01:42<59:14, 2.22s/it]"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Failed to save documents to vector store: ('Failed to insert documents.', {'db07b65a35324d91b5e2ace2b20589c0': AmbiguousTimeoutException()})\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Ingesting articles: 9%|▍ | 151/1749 [01:45<1:05:09, 2.45s/it]"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Failed to save documents to vector store: ('Failed to insert documents.', {'f449ec9922c043889d96864f7556bf68': AmbiguousTimeoutException()})\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Ingesting articles: 98%|█████▉| 1721/1749 [13:26<00:10, 2.71it/s]"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Failed to save documents to vector store: Error code: 400 - {'error': {'message': 'Non-successful response received from model service', 'type': 'model_service_unknown_error', 'param': {'response': {'detail': {}, 'message': 'Input length 14848 exceeds maximum allowed token size 8192', 'object': 'error', 'type': 'invalid_request_error'}, 'status_code': 400}, 'code': 'model_service_unknown_error'}}\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Ingesting articles: 100%|██████| 1749/1749 [13:38<00:00, 2.14it/s]\n"
+ ]
+ }
+ ],
+ "source": [
+ "from langchain_core.documents import Document\n",
+ "\n",
+ "for article in tqdm(unique_news_articles, desc=\"Ingesting articles\"):\n",
+ " try:\n",
+ " documents = [Document(page_content=article)]\n",
+ " vector_store.add_documents(documents=documents)\n",
+ " except Exception as e:\n",
+ " print(f\"Failed to save documents to vector store: {str(e)}\")\n",
+ " continue"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uehAx36o9Rlm"
+ },
+ "source": [
+ "# Using the Large Language Model (LLM) in Capella Model Services\n",
+ "Large language models are AI systems that are trained to understand and generate human language. We'll be using the [mistralai/mistral-7b-instruct-v0.3](https://build.nvidia.com/mistralai/mistral-7b-instruct-v03) large language model via Capella Model Services, inside the same network as the Capella operational database, to process user queries and generate meaningful responses. This model is a key component of our RAG system, allowing it to go beyond simple keyword matching and truly understand the intent behind a query. 
By creating this language model, we equip our RAG system with the ability to interpret complex queries, understand the nuances of language, and provide more accurate and contextually relevant responses.\n",
+ "\n",
+ "The language model's ability to understand context and generate coherent responses is what makes our RAG system truly intelligent. It can not only find the right information but also present it in a way that is useful and understandable to the user.\n",
+ "\n",
+ "The LLM is also created using the LangChain OpenAI provider, configured with the model name, the endpoint URL and the API key for Capella Model Services."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {
+ "id": "yRAfBRLH9RpO"
+ },
+ "outputs": [],
+ "source": [
+ "try:\n",
+ " llm = ChatOpenAI(openai_api_base=CAPELLA_MODEL_SERVICES_ENDPOINT, openai_api_key=LLM_API_KEY, model=LLM_MODEL_NAME, temperature=0)\n",
+ " logging.info(\"Successfully created the Chat model in Capella Model Services\")\n",
+ "except Exception as e:\n",
+ " raise ValueError(f\"Error creating Chat model in Capella Model Services: {str(e)}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "AIMessage(content='I don\\'t have real-time data or the ability to follow live events. However, Pep Guardiola, the manager of Manchester City, has expressed his usual balance of optimism and desire for improvement. Even though City has faced some challenges in the 2021/2022 season, he continues to emphasize the need for patience, hard work, and a focus on continuous improvement.\\n\\nIn a press conference, Guardiola noted, \"In football, you have to have patience. When I arrived, we were fifth and I said, \\'okay, we are not far away.\\' Now, we are not far away again.\" He also added, \"We have to find our best level, and when we find it, we are going to remain for a long time at the top.\"\\n\\nWhile the team has experienced ups and downs, Guardiola maintains his belief in the players and their ability to turn things around.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 200, 'prompt_tokens': 21, 'total_tokens': 221, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'mistralai/mistral-7b-instruct-v0.3', 'system_fingerprint': None, 'id': 'chat-2a85fc50bf92483998c62d10b02cea01', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019aeeb6-c69e-7e00-85c0-53aaa75562b2-0', usage_metadata={'input_tokens': 21, 'output_tokens': 200, 'total_tokens': 221, 'input_token_details': {}, 'output_token_details': {}})"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "llm.invoke(\"What was Pep Guardiola's reaction to Manchester City's current form?\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "k_XDfCx19UvG"
+ },
+ "source": [
+ "# Perform Semantic Search\n",
+ "Semantic search in Couchbase involves converting queries and documents into vector representations using an embeddings model. These vectors capture the semantic meaning of the text and are stored directly in Couchbase. When a query is made, Couchbase performs a similarity search by comparing the query vector against the stored document vectors. The similarity metric used for this comparison is configurable, allowing flexibility in how the relevance of documents is determined.\n",
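+ "\n",
+ "For intuition, the cosine similarity between two vectors (one of the common metrics discussed next) can be computed as in this standalone, illustrative sketch, which is independent of the tutorial's pipeline:\n",
+ "\n",
+ "```python\n",
+ "import math\n",
+ "\n",
+ "def cosine_similarity(a, b):\n",
+ "    # dot(a, b) / (||a|| * ||b||); 1.0 means same direction, 0.0 means unrelated\n",
+ "    dot = sum(x * y for x, y in zip(a, b))\n",
+ "    norm_a = math.sqrt(sum(x * x for x in a))\n",
+ "    norm_b = math.sqrt(sum(x * x for x in b))\n",
+ "    return dot / (norm_a * norm_b)\n",
+ "\n",
+ "print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # ~0.707\n",
+ "```\n",
+ "\n",
+ "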
Common metrics include cosine similarity, Euclidean distance, or dot product, but other metrics can be implemented based on specific use cases. Different embedding models like BERT, Word2Vec, or GloVe can also be used depending on the application's needs, with the vectors generated by these models stored and searched within Couchbase itself.\n", + "\n", + "In the provided code, the search process begins by recording the start time, followed by executing the `similarity_search_with_score` method of the [CouchbaseSearchVectorStore](https://couchbase-ecosystem.github.io/langchain-couchbase/usage.html#couchbase-search-vector-store). This method searches Couchbase for the most relevant documents based on the vector similarity to the query. The search results include the document content and a similarity score that reflects how closely each document aligns with the query in the defined semantic space. The time taken to perform this search is then calculated and logged, and the results are displayed, showing the most relevant documents along with their similarity scores. This approach leverages Couchbase as both a storage and retrieval engine for vector data, enabling efficient and scalable semantic searches. The integration of vector storage and search capabilities within Couchbase allows for sophisticated semantic search operations without relying on external services for vector storage or comparison." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "id": "Pk-oFbnC9Uym" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Semantic Search Results (completed in 1.97 seconds):\n", + "Score: 0.5085, ID: b1164b81a6614e45a93c1460e6e0a8a0, Text: 'We have to find a way' - Guardiola vows to end relegation form\n", + "\n", + "This video can not be played To play this video you need to enable JavaScript in your browser. 'Worrying' and 'staggering' - Why do Manchester City keep conceding?\n", + "\n", + "Manchester City are currently in relegation form and there is little sign of it ending. Saturday's 2-1 defeat at Aston Villa left them joint bottom of the form table over the past eight games with just Southampton for company. Saints, at the foot of the Premier League, have the same number of points, four, as City over their past eight matches having won one, drawn one and lost six - the same record as the floundering champions. And if Southampton - who appointed Ivan Juric as their new manager on Saturday - get at least a point at Fulham on Sunday, City will be on the worst run in the division. Even Wolves, who sacked boss Gary O'Neil last Sunday and replaced him with Vitor Pereira, have earned double the number of points during the same period having played a game fewer. They are damning statistics for Pep Guardiola, even if he does have some mitigating circumstances with injuries to Ederson, Nathan Ake and Ruben Dias - who all missed the loss at Villa Park - and the long-term loss of midfield powerhouse Rodri. Guardiola was happy with Saturday's performance, despite defeat in Birmingham, but there is little solace to take at slipping further out of the title race. He may have needed to field a half-fit Manuel Akanji and John Stones at Villa Park but that does not account for City looking a shadow of their former selves. That does not justify the error Josko Gvardiol made to gift Jhon Duran a golden chance inside the first 20 seconds, or £100m man Jack Grealish again failing to have an impact on a game. 
There may be legitimate reasons for City's drop off, whether that be injuries, mental fatigue or just simply a team coming to the end of its lifecycle, but their form, which has plunged off a cliff edge, would have been unthinkable as they strolled to a fourth straight title last season. \"The worrying thing is the number of goals conceded,\" said ex-England captain Alan Shearer on BBC Match of the Day. \"The number of times they were opened up because of the lack of protection and legs in midfield was staggering. There are so many things that are wrong at this moment in time.\"\n", + "\n", + "This video can not be played To play this video you need to enable JavaScript in your browser. Man City 'have to find a way' to return to form - Guardiola\n", + "\n", + "Afterwards Guardiola was calm, so much so it was difficult to hear him in the news conference, a contrast to the frustrated figure he cut on the touchline. He said: \"It depends on us. The solution is bring the players back. We have just one central defender fit, that is difficult. We are going to try next game - another opportunity and we don't think much further than that. \"Of course there are more reasons. We concede the goals we don't concede in the past, we [don't] score the goals we score in the past. Football is not just one reason. There are a lot of little factors. \"Last season we won the Premier League, but we came here and lost. We have to think positive and I have incredible trust in the guys. Some of them have incredible pride and desire to do it. We have to find a way, step by step, sooner or later to find a way back.\" Villa boss Unai Emery highlighted City's frailties, saying he felt Villa could seize on the visitors' lack of belief. \"Manchester City are a little bit under the confidence they have normally,\" he said. \"The second half was different, we dominated and we scored. Through those circumstances they were feeling worse than even in the first half.\"\n", + "\n", + "Erling Haaland had one touch in the Villa box\n", + "\n", + "There are chinks in the armour never seen before at City under Guardiola and Erling Haaland conceded belief within the squad is low. He told TNT after the game: \"Of course, [confidence levels are] not the best. We know how important confidence is and you can see that it affects every human being. That is how it is, we have to continue and stay positive even though it is difficult.\" Haaland, with 76 goals in 83 Premier League appearances since joining City from Borussia Dortmund in 2022, had one shot and one touch in the Villa box. His 18 touches in the whole game were the lowest of all starting players and he has been self critical, despite scoring 13 goals in the top flight this season. Over City's last eight games he has netted just twice though, but Guardiola refused to criticise his star striker. He said: \"Without him we will be even worse but I like the players feeling that way. I don't agree with Erling. He needs to have the balls delivered in the right spots but he will fight for the next one.\"\n", + "------------------------------------------------------------\n", + "Score: 0.4821, ID: 6cd3049bd4264c2eacdafba6441f108e, Text: 'I am not good enough' - Guardiola faces daunting and major rebuild\n", + "\n", + "This video can not be played To play this video you need to enable JavaScript in your browser. 
'I am not good enough' - Guardiola says he must find a 'solution' after derby loss\n", + "\n", + "Pep Guardiola says his sleep has suffered during Manchester City's deepening crisis, so he will not be helped by a nightmarish conclusion to one of the most stunning defeats of his long reign. Guardiola looked agitated, animated and on edge even after City led the Manchester derby through Josko Gvardiol's 36th-minute header, his reaction to the goal one of almost disdain that it came via a deflected cross as opposed to in his purist style. He sat alone with his eyes closed sipping from a water bottle before the resumption of the second half, then was denied even the respite of victory when Manchester United gave this largely dismal derby a dramatic conclusion it barely deserved with a remarkable late comeback. First, with 88 minutes on the clock, Matheus Nunes presented Amad Diallo with the ball before compounding his error by flattening the forward as he made an attempt to recover his mistake. Bruno Fernandes completed the formalities from the penalty spot. Worse was to come two minutes later when Lisandro Martinez's routine long ball caught City's defence inexplicably statuesque. Goalkeeper Ederson's positioning was awry, allowing the lively Diallo to pounce from an acute angle to leave Guardiola and his players stunned. It was the latest into any game, 88 minutes, that reigning Premier League champions had led then lost. It was also the first time City had lost a game they were leading so late on. And in a sign of City's previous excellence that is now being challenged, they have only lost four of 105 Premier League home games under Guardiola in which they have been ahead at half-time, winning 94 and drawing seven. Guardiola delivered a brutal self-analysis as he told Match of the Day: \"I am not good enough. I am the boss. I am the manager. I have to find solutions and so far I haven't. That's the reality. \"Not much else to say. No defence. Manchester United were incredibly persistent. We have not lost eight games in two seasons. We can't defend that.\"\n", + "\n", + "Manchester City manager Pep Guardiola in despair during the derby defeat to Manchester United\n", + "\n", + "Guardiola suggested the serious renewal will wait until the summer but the red flags have been appearing for weeks in the sudden and shocking decline of a team that has lost the aura of invincibility that left many opponents beaten before kick-off in previous years. He has had stated City must \"survive\" this season - whatever qualifies as survival for a club of such rich ambition - but the quest for a record fifth successive Premier League title is surely over as they lie nine points behind leaders Liverpool having played a game more. Their Champions League aspirations are also in jeopardy after another loss, this time against Juventus in Turin. City's squad has been allowed to grow too old together. The insatiable thirst for success seems to have gone, the scales of superiority have fallen away and opponents now sense vulnerability right until the final whistle, as United did here. The manner in which United were able, and felt able, to snatch this victory drove right to the heart of how City, and Guardiola, are allowing opponents to prey on their downfall. 
Guardiola has every reason to cite injuries, most significantly to Rodri and also John Stones as well as others, but this cannot be used an excuse for such a dramatic decline in standards, allied to the appearance of a soft underbelly that is so easily exploited. And City's rebuild will not be a quick fix. With every performance, every defeat, the scale of what lies in front of Guardiola becomes more obvious - and daunting. Manchester City's fans did their best to reassure Guardiola of their faith in him with a giant Barcelona-inspired banner draped from the stands before kick-off emblazoned with his image reading \"Més que un entrenador\" - \"More Than A Coach\". And Guardiola will now need to be more than a coach than at any time in his career. He will have the finances but it will be done with City's challengers also strengthening. Kevin de Bruyne, 34 in June, lasted 68 minutes here before he was substituted. Age and injuries are catching up with one of the greatest players of the Premier League era and he is unlikely to be at City next season. Mateo Kovacic, who replaced De Bruyne, is also 31 in May. Kyle Walker, 34, is being increasingly exposed. His most notable contribution here was an embarrassing collapse to the ground after the mildest head-to-head collision with Rasmus Hojlund. Ilkay Gundogan, another 34-year-old and a previous pillar of Guardiola's great successes, no longer has the legs or energy to exert influence. This looks increasingly like a season too far following his return from Barcelona. Flaws are also being exposed elsewhere, with previously reliable performers failing to hit previous standards. Phil Foden scored 27 goals and had 12 assists when he was Premier League Player of the Season last term. This year he has just three goals and two assists in 18 appearances in all competitions. He has no goals and just one assist in 11 Premier League games. Jack Grealish, who came on after 77 minutes against United, has not scored in a year for Manchester City, his last goal coming in a 2-2 draw against Crystal Palace on 16 December last year. He has, in the meantime, scored twice for England. Erling Haaland is also struggling as City lack creativity and cutting edge. He has three goals in his past 11 Premier League games after scoring 10 in his first five. And in another indication of City's impotence, and their reliance on Haaland, defender Gvardiol's goal against United was his fourth this season, making him their second highest scorer in all competitions behind the Norwegian striker, who has 18. Goalkeeper Ederson, so reliable for so long, has already been dropped once this season and did not cover himself in glory for United's winner. Guardiola, with that freshly signed two-year contract, insists he \"wants it\" as he treads on this alien territory of failure. He will be under no illusions about the size of the job in front of him as he placed his head in his hands in anguish after yet another damaging and deeply revealing defeat. City and Guardiola are in new, unforgiving territory.\n", + "------------------------------------------------------------\n", + "Score: 0.4687, ID: 3d9d78ae1de04ae89cc0ce23eb156552, Text: Manchester City boss Pep Guardiola has won 18 trophies since he arrived at the club in 2016\n", + "\n", + "Manchester City boss Pep Guardiola says he is \"fine\" despite admitting his sleep and diet are being affected by the worst run of results in his entire managerial career. 
In an interview with former Italy international Luca Toni for Amazon Prime Sport before Wednesday's Champions League defeat by Juventus, Guardiola touched on the personal impact City's sudden downturn in form has had. Guardiola said his state of mind was \"ugly\", that his sleep was \"worse\" and he was eating lighter as his digestion had suffered. City go into Sunday's derby against Manchester United at Etihad Stadium having won just one of their past 10 games. The Juventus loss means there is a chance they may not even secure a play-off spot in the Champions League. Asked to elaborate on his comments to Toni, Guardiola said: \"I'm fine. \"In our jobs we always want to do our best or the best as possible. When that doesn't happen you are more uncomfortable than when the situation is going well, always that happened. \"In good moments I am happier but when I get to the next game I am still concerned about what I have to do. There is no human being that makes an activity and it doesn't matter how they do.\" Guardiola said City have to defend better and \"avoid making mistakes at both ends\". To emphasise his point, Guardiola referred back to the third game of City's current run, against a Sporting side managed by Ruben Amorim, who will be in the United dugout at the weekend. City dominated the first half in Lisbon, led thanks to Phil Foden's early effort and looked to be cruising. Instead, they conceded three times in 11 minutes either side of half-time as Sporting eventually ran out 4-1 winners. \"I would like to play the game like we played in Lisbon on Sunday, believe me,\" said Guardiola, who is facing the prospect of only having three fit defenders for the derby as Nathan Ake and Manuel Akanji try to overcome injury concerns. If there is solace for City, it comes from the knowledge United are not exactly flying. Their comeback Europa League victory against Viktoria Plzen on Thursday was their third win of Amorim's short reign so far but only one of those successes has come in the Premier League, where United have lost their past two games against Arsenal and Nottingham Forest. Nevertheless, Guardiola can see improvements already on the red side of the city. \"It's already there,\" he said. \"You see all the patterns, the movements, the runners and the pace. He will do a good job at United, I'm pretty sure of that.\"\n", + "\n", + "Guardiola says skipper Kyle Walker has been offered support by the club after the City defender highlighted the racial abuse he had received on social media in the wake of the Juventus trip. \"It's unacceptable,\" he said. \"Not because it's Kyle - for any human being. \"Unfortunately it happens many times in the real world. It is not necessary to say he has the support of the entire club. It is completely unacceptable and we give our support to him.\"\n", + "------------------------------------------------------------\n", + "Score: 0.4646, ID: d782ce88158d40d7b48808167dcbcba1, Text: 'Self-doubt, errors & big changes' - inside the crisis at Man City\n", + "\n", + "Pep Guardiola has not been through a moment like this in his managerial career. Manchester City have lost nine matches in their past 12 - as many defeats as they had suffered in their previous 106 fixtures. At the end of October, City were still unbeaten at the top of the Premier League and favourites to win a fifth successive title. Now they are seventh, 12 points behind leaders Liverpool having played a game more. 
It has been an incredible fall from grace and left people trying to work out what has happened - and whether Guardiola can make it right. After discussing the situation with those who know him best, I have taken a closer look at the future - both short and long term - and how the current crisis at Man City is going to be solved.\n", + "\n", + "Pep Guardiola's Man City have lost nine of their past 12 matches\n", + "\n", + "Guardiola has also been giving it a lot of thought. He has not been sleeping very well, as he has said, and has not been himself at times when talking to the media. He has been talking to a lot of people about what is going on as he tries to work out the reasons for City's demise. Some reasons he knows, others he still doesn't. What people perhaps do not realise is Guardiola hugely doubts himself and always has. He will be thinking \"I'm not going to be able to get us out of this\" and needs the support of people close to him to push away those insecurities - and he has that. He is protected by his people who are very aware, like he is, that there are a lot of people that want City to fail. It has been a turbulent time for Guardiola. Remember those marks he had on his head after the 3-3 draw with Feyenoord in the Champions League? He always scratches his head, it is a gesture of nervousness. Normally nothing happens but on that day one of his nails was far too sharp so, after talking to the players in the changing room where he scratched his head because of his usual agitated gesturing, he went to the news conference. His right-hand man Manel Estiarte sent him photos in a message saying \"what have you got on your head?\", but by the time Guardiola returned to the coaching room there was hardly anything there again. He started that day with a cover on his nose after the same thing happened at the training ground the day before. Guardiola was having a footballing debate with Kyle Walker about positional stuff and marked his nose with that same nail. There was also that remarkable news conference after the Manchester derby when he said \"I don't know what to do\". That is partly true and partly not true. Ignore the fact Guardiola suggested he was \"not good enough\". He actually meant he was not good enough to resolve the situation with the group of players he has available and with all the other current difficulties. There are obviously logical explanations for the crisis and the first one has been talked about many times - the absence of injured midfielder Rodri. You know the game Jenga? When you take the wrong piece out, the whole tower collapses. That is what has happened here. It is normal for teams to have an over-reliance on one player if he is the best in the world in his position. And you cannot calculate the consequences of an injury that rules someone like Rodri out for the season. City are a team, like many modern ones, in which the holding midfielder is a key element to the construction. So, when you take Rodri out, it is difficult to hold it together. There were Plan Bs - John Stones, Manuel Akanji, even Nathan Ake - but injuries struck. The big injury list has been out of the ordinary and the busy calendar has also played a part in compounding the issues. However, one factor even Guardiola cannot explain is the big uncharacteristic errors in almost every game from international players. Why did Matheus Nunes make that challenge to give away the penalty against Manchester United? Jack Grealish is sent on at the end to keep the ball and cannot do that. 
There are errors from Walker and other defenders. These are some of the best players in the world. Of course the players' mindset is important, and confidence is diminishing. Wrong decisions get taken so there is almost panic on the pitch instead of calm. There are also players badly out of form who are having to play because of injuries. Walker is now unable to hide behind his pace, I'm not sure Kevin de Bruyne is ever getting back to the level he used to be at, Bernardo Silva and Ilkay Gundogan do not have time to rest, Grealish is not playing at his best. Some of these players were only meant to be playing one game a week but, because of injuries, have played 12 games in 40 days. It all has a domino effect. One consequence is that Erling Haaland isn't getting the service to score. But the Norwegian still remains City's top-scorer with 13. Defender Josko Gvardiol is next on the list with just four. The way their form has been analysed inside the City camp is there have only been three games where they deserved to lose (Liverpool, Bournemouth and Aston Villa). But of course it is time to change the dynamic.\n", + "\n", + "Guardiola has never protected his players so much. He has not criticised them and is not going to do so. They have won everything with him. Instead of doing more with them, he has tried doing less. He has sometimes given them more days off to clear their heads, so they can reset - two days this week for instance. Perhaps the time to change a team is when you are winning, but no-one was suggesting Man City were about to collapse when they were top and unbeaten after nine league games. Some people have asked how bad it has to get before City make a decision on Guardiola. The answer is that there is no decision to be made. Maybe if this was Real Madrid, Barcelona or Juventus, the pressure from outside would be massive and the argument would be made that Guardiola has to go. At City he has won the lot, so how can anyone say he is failing? Yes, this is a crisis. But given all their problems, City's renewed target is finishing in the top four. That is what is in all their heads now. The idea is to recover their essence by improving defensive concepts that are not there and re-establishing the intensity they are known for. Guardiola is planning to use the next two years of his contract, which is expected to be his last as a club manager, to prepare a new Manchester City. When he was at the end of his four years at Barcelona, he asked two managers what to do when you feel people are not responding to your instructions. Do you go or do the players go? Sir Alex Ferguson and Rafael Benitez both told him that the players need to go. Guardiola did not listen because of his emotional attachment to his players back then and he decided to leave the Camp Nou because he felt the cycle was over. He will still protect his players now but there is not the same emotional attachment - so it is the players who are going to leave this time. It is likely City will look to replace five or six regular starters. Guardiola knows it is the end of an era and the start of a new one. Changes will not be immediate and the majority of the work will be done in the summer. But they are open to any opportunities in January - and a holding midfielder is one thing they need. In the summer City might want to get Spain's Martin Zubimendi from Real Sociedad and they know 60m euros (£50m) will get him. 
He said no to Liverpool last summer even though everything was agreed, but he now wants to move on and the Premier League is the target. Even if they do not get Zubimendi, that is the calibre of footballer they are after. A new Manchester City is on its way - with changes driven by Guardiola, incoming sporting director Hugo Viana and the football department.\n",
+ "------------------------------------------------------------\n",
+ "Score: 0.4344, ID: 5255b258163847d8b4ae45575f85ccd1, Text: What will Trump do about Syria? What will Trump do about Syria?\n",
+ "------------------------------------------------------------\n"
+ ]
+ }
+ ],
+ "source": [
+ "query = \"What was Pep Guardiola's reaction to Manchester City's current form?\"\n",
+ "\n",
+ "try:\n",
+ " # Perform the semantic search\n",
+ " start_time = time.time()\n",
+ " search_results = vector_store.similarity_search_with_score(query, k=5)\n",
+ " search_elapsed_time = time.time() - start_time\n",
+ "\n",
+ " # Display search results\n",
+ " print(\n",
+ " f\"\\nSemantic Search Results (completed in {search_elapsed_time:.2f} seconds):\"\n",
+ " )\n",
+ " for doc, score in search_results:\n",
+ " print(f\"Score: {score:.4f}, ID: {doc.id}, Text: {doc.page_content}\")\n",
+ " print(\"---\"*20)\n",
+ "\n",
+ "except CouchbaseException as e:\n",
+ " raise RuntimeError(f\"Error performing semantic search: {str(e)}\")\n",
+ "except Exception as e:\n",
+ " raise RuntimeError(f\"Unexpected error: {str(e)}\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sS0FebHI9U1l"
+ },
+ "source": [
+ "# Retrieval-Augmented Generation (RAG) with Couchbase and LangChain\n",
+ "Couchbase and LangChain integrate seamlessly to build Retrieval-Augmented Generation (RAG) chains. In this setup, Couchbase serves as the vector store where document embeddings are kept. When a query is made, LangChain retrieves the most relevant documents from Couchbase by comparing the query’s embedding with the stored document embeddings, and passes those documents to a large language model as context.\n",
+ "\n",
+ "Equipped with the context from the retrieved documents, the language model generates a response that is both informed and contextually accurate. The RAG chain thus combines Couchbase’s efficient storage and retrieval capabilities with the LLM’s generation capabilities, delivering relevant and accurate answers by pairing the strengths of retrieval and generation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {
+ "id": "ZGUXQQmv9ge4"
+ },
+ "outputs": [],
+ "source": [
+ "template = \"\"\"You are a helpful bot. If you cannot answer based on the context provided, respond with a generic answer. 
Answer the question as truthfully as possible using the context below:\n",
+ " {context}\n",
+ " Question: {question}\"\"\"\n",
+ "prompt = ChatPromptTemplate.from_template(template)\n",
+ "rag_chain = (\n",
+ " {\"context\": vector_store.as_retriever(), \"question\": RunnablePassthrough()}\n",
+ " | prompt\n",
+ " | llm\n",
+ " | StrOutputParser()\n",
+ ")\n",
+ "logging.info(\"Successfully created RAG chain\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {
+ "id": "Mia7XxM9978M"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "RAG Response: Pep Guardiola has expressed concern and frustration about Manchester City's recent form. He said, \"I am not good enough. I am the boss. I am the manager. I have to find solutions and so far I haven't. That's the reality.\" He mentioned that the team's poor defense and lack of confidence are causing them issues. He also mentioned that they have not lost eight games in two seasons and it is unacceptable. However, he stated that he wants to find a solution and trusts in the players to turn things around.\n",
+ "RAG response generated in 6.17 seconds\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Generate a RAG response for a query\n",
+ "query = \"What was Pep Guardiola's reaction to Manchester City's recent form?\"\n",
+ "try:\n",
+ " start_time = time.time()\n",
+ " rag_response = rag_chain.invoke(query)\n",
+ " rag_elapsed_time = time.time() - start_time\n",
+ "\n",
+ " print(f\"RAG Response: {rag_response}\")\n",
+ " print(f\"RAG response generated in {rag_elapsed_time:.2f} seconds\")\n",
+ "except Exception as e:\n",
+ " print(\"Error occurred:\", e)"
+ ]
+ },
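+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The retriever created by `vector_store.as_retriever()` returns the top four matching documents by default. As an optional sketch using the standard LangChain `search_kwargs` parameter, you can control how many documents are passed to the LLM as context (the `rag_chain_top3` name below is purely illustrative):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional sketch: tune how many documents the retriever returns.\n",
+ "# search_kwargs is part of the standard LangChain retriever API.\n",
+ "retriever = vector_store.as_retriever(search_kwargs={\"k\": 3})\n",
+ "\n",
+ "rag_chain_top3 = (\n",
+ " {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
+ " | prompt\n",
+ " | llm\n",
+ " | StrOutputParser()\n",
+ ")\n",
+ "\n",
+ "# Usage: rag_chain_top3.invoke(\"your question\")"
+ ]
+ },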
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aIdayPzw9glT"
+ },
+ "source": [
+ "# Using the Caching Mechanism in Capella Model Services
+ "In Capella Model Services, model outputs can be [cached](https://docs.couchbase.com/ai/build/model-service/configure-value-adds.html#caching) using either a standard or a semantic cache. Caching improves the efficiency and speed of a RAG application, particularly for repeated or similar queries. When a query is first processed, the LLM generates a response, which is stored in Couchbase. When a matching query comes in later, the cached response is returned instead of calling the LLM again. The caching duration can be configured in Capella Model Services.\n",
+ "\n",
+ "In this example, we are using the standard cache, which works for exact matches of the queries."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {
+ "id": "0xM2G3ef-GS2"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "Query 1: Who inaugurated the reopening of the Notre Dame Cathedral in Paris?\n",
+ "Response: French President Emmanuel Macron inaugurated the reopening of the Notre-Dame Cathedral in Paris.\n",
+ "Time taken: 4.06 seconds\n",
+ "\n",
+ "Query 2: What was Pep Guardiola's reaction to Manchester City's recent form?\n",
+ "Response: Pep Guardiola has expressed concern and frustration about Manchester City's recent form. He said, \"I am not good enough. I am the boss. I am the manager. I have to find solutions and so far I haven't. That's the reality.\" He mentioned that the team's poor defense and lack of confidence are causing them issues. He also mentioned that they have not lost eight games in two seasons and it is unacceptable. However, he stated that he wants to find a solution and trusts in the players to turn things around.\n",
+ "Time taken: 2.90 seconds\n",
+ "\n",
+ "Query 3: Who inaugurated the reopening of the Notre Dame Cathedral in Paris?\n",
+ "Response: French President Emmanuel Macron inaugurated the reopening of the Notre-Dame Cathedral in Paris.\n",
+ "Time taken: 2.52 seconds\n"
+ ]
+ }
+ ],
+ "source": [
+ "queries = [\n",
+ " \"Who inaugurated the reopening of the Notre Dame Cathedral in Paris?\",\n",
+ " \"What was Pep Guardiola's reaction to Manchester City's recent form?\",\n",
+ " \"Who inaugurated the reopening of the Notre Dame Cathedral in Paris?\", # Repeated query\n",
+ "]\n",
+ "\n",
+ "for i, query in enumerate(queries, 1):\n",
+ " try:\n",
+ " print(f\"\\nQuery {i}: {query}\")\n",
+ " start_time = time.time()\n",
+ " response = rag_chain.invoke(query)\n",
+ " elapsed_time = time.time() - start_time\n",
+ " print(f\"Response: {response}\")\n",
+ " print(f\"Time taken: {elapsed_time:.2f} seconds\")\n",
+ " except Exception as e:\n",
+ " print(f\"Error generating RAG response: {str(e)}\")\n",
+ " continue"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Here you can see that the repeated query was answered significantly faster than the original one. In Capella Model Services, semantic similarity can also be used to find responses in the cache.\n",
+ "\n",
+ "Caching is particularly valuable when users submit similar queries multiple times or when certain pieces of information are frequently requested. Serving these from a cache significantly reduces response times and improves the user experience."
+ ]
+ },
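+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If semantic caching is enabled for your model, a paraphrase of an earlier query can also be served from the cache. The cell below is a small sketch for trying that out; the paraphrased query wording is illustrative, and the timing you observe depends on how caching is configured for your model:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Sketch: with semantic caching enabled in Capella Model Services,\n",
+ "# a paraphrased query may also be answered from the cache.\n",
+ "# The query wording below is illustrative.\n",
+ "paraphrased_query = \"Who led the reopening ceremony of Notre Dame Cathedral in Paris?\"\n",
+ "\n",
+ "start_time = time.time()\n",
+ "response = rag_chain.invoke(paraphrased_query)\n",
+ "elapsed_time = time.time() - start_time\n",
+ "\n",
+ "print(f\"Response: {response}\")\n",
+ "print(f\"Time taken: {elapsed_time:.2f} seconds\")"
+ ]
+ },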
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# LLM Guardrails in Capella Model Services\n",
+ "Capella Model Services can also moderate user inputs and the responses generated by the LLM. The service can be configured to use the [Llama 3.1 NemoGuard 8B content safety model](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety/modelcard) from NVIDIA as a guardrail, and the categories to block can be selected in the model creation flow. More information about guardrails usage can be found in the [documentation](https://docs.couchbase.com/ai/build/model-service/configure-guardrails-security.html#guardrails).\n",
+ "\n",
+ "Here is an example of the guardrails in action:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Guardrails violation Error code: 422 - {'error': {'message': 'Error processing user prompt due to guardrail violation', 'type': 'guardrail_violation_error', 'param': {'guardrail_model': 'nvidia/llama-3.1-nemoguard-8b-content-safety', 'is_safe': False, 'violations': ['Criminal Planning/Confessions', 'Weapons']}, 'code': 'guardrail_violation_error'}}\n"
+ ]
+ }
+ ],
+ "source": [
+ "query = \"How can I create a bomb?\"\n",
+ "try:\n",
+ " start_time = time.time()\n",
+ " rag_response = rag_chain.invoke(query)\n",
+ " rag_elapsed_time = time.time() - start_time\n",
+ "\n",
+ " print(f\"RAG Response: {rag_response}\")\n",
+ " print(f\"RAG response generated in {rag_elapsed_time:.2f} seconds\")\n",
+ "except Exception as e:\n",
+ " print(\"Guardrails violation\", e)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Guardrails are useful for preventing users from hijacking the model into doing things that your application is not meant to do."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "yJQ5P8E29go1"
+ },
+ "source": [
+ "By following this tutorial, you will have a fully functional semantic search engine that leverages the strengths of Capella Model Services, without your data being sent to third-party embedding or large language models. This guide has covered the principles behind semantic search and how to implement them effectively using Capella Model Services and Couchbase vector search."
+ ] + } + ], + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/capella-ai/langchain/capella_index.json b/capella-model-services/langchain/search_based/capella_index.json similarity index 98% rename from capella-ai/langchain/capella_index.json rename to capella-model-services/langchain/search_based/capella_index.json index 88a91902..15b42d1b 100644 --- a/capella-ai/langchain/capella_index.json +++ b/capella-model-services/langchain/search_based/capella_index.json @@ -39,7 +39,7 @@ "enabled": true, "fields": [ { - "dims": 4096, + "dims": 2048, "index": true, "name": "embedding", "similarity": "dot_product", diff --git a/capella-model-services/langchain/search_based/frontmatter.md b/capella-model-services/langchain/search_based/frontmatter.md new file mode 100644 index 00000000..ad498396 --- /dev/null +++ b/capella-model-services/langchain/search_based/frontmatter.md @@ -0,0 +1,20 @@ +--- +# frontmatter +path: "/tutorial-capella-model-services-langchain-rag-with-search-vector-index" +title: RAG with Capella Model Services, LangChain and Couchbase Search Vector Index +short_title: RAG with Capella Model Services, LangChain and Search Vector Index +description: + - Learn how to build a semantic search engine using models from Capella Model Services and Couchbase Search Vector Index. + - You will understand how to perform Retrieval-Augmented Generation (RAG) using LangChain and Capella Model Services. +content_type: tutorial +filter: sdk +technology: + - vector search +tags: + - Artificial Intelligence + - LangChain + - Search Vector Index +sdk_language: + - python +length: 60 Mins +--- diff --git a/capella-ai/llamaindex/RAG_with_Couchbase_Capella.ipynb b/capella-model-services/llamaindex/RAG_with_Couchbase_Capella.ipynb similarity index 100% rename from capella-ai/llamaindex/RAG_with_Couchbase_Capella.ipynb rename to capella-model-services/llamaindex/RAG_with_Couchbase_Capella.ipynb diff --git a/capella-ai/llamaindex/README.md b/capella-model-services/llamaindex/README.md similarity index 100% rename from capella-ai/llamaindex/README.md rename to capella-model-services/llamaindex/README.md diff --git a/capella-ai/llamaindex/__frontmatter.__md b/capella-model-services/llamaindex/__frontmatter.__md similarity index 100% rename from capella-ai/llamaindex/__frontmatter.__md rename to capella-model-services/llamaindex/__frontmatter.__md diff --git a/capella-ai/llamaindex/fts_index.json b/capella-model-services/llamaindex/fts_index.json similarity index 100% rename from capella-ai/llamaindex/fts_index.json rename to capella-model-services/llamaindex/fts_index.json