21 changes: 0 additions & 21 deletions huggingface/gsi/frontmatter.md

This file was deleted.

File renamed without changes.
23 changes: 23 additions & 0 deletions huggingface/query_based/frontmatter.md
@@ -0,0 +1,23 @@
---
# frontmatter
path: "/tutorial-huggingface-couchbase-vector-search-with-hyperscale-or-composite-vector-index"
alt_paths: ["/tutorial-huggingface-couchbase-vector-search-with-hyperscale-vector-index", "/tutorial-huggingface-couchbase-vector-search-with-composite-vector-index"]
title: Using Hugging Face Embeddings with Couchbase Hyperscale and Composite Vector Index
short_title: Hugging Face with Couchbase Hyperscale & Composite Index
description:
- Learn how to generate embeddings using Hugging Face and store them in Couchbase.
- This tutorial demonstrates how to use Couchbase's vector search capabilities with Hugging Face embeddings using Hyperscale and Composite Vector Indexes.
- You'll understand how to perform high-performance vector search to find relevant documents based on similarity.
content_type: tutorial
filter: sdk
technology:
- vector search
tags:
- Hyperscale Vector Index
- Composite Vector Index
- Artificial Intelligence
- Hugging Face
sdk_language:
- python
length: 30 Mins
---
Contributor

The code blocks need their output included so that the test section actually shows the performance.
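
One way the requested output could be produced is to time the search call a few times and print a short summary. This is only a sketch: `placeholder_search` is a hypothetical stand-in for whatever search call the notebook's test section makes, and the helper name `time_calls` is not from the notebook.

```python
import time
from statistics import mean


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return each call's duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def placeholder_search():
    # Hypothetical stand-in: replace with the notebook's actual vector search call.
    sum(i * i for i in range(10_000))


timings = time_calls(placeholder_search, repeats=5)
print(f"runs: {len(timings)}, mean: {mean(timings):.2f} ms, max: {max(timings):.2f} ms")
```

Running the cell and committing the notebook with this printed summary would keep the numbers visible in the rendered tutorial.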

Large diffs are not rendered by default.

File renamed without changes.
@@ -1,18 +1,18 @@
---
# frontmatter
path: "/tutorial-huggingface-couchbase-vector-search-with-fts"
title: Using Hugging Face Embeddings with Couchbase Vector Search using FTS Service
short_title: Hugging Face with Couchbase Vector Search using FTS Service
path: "/tutorial-huggingface-couchbase-vector-search-with-search-vector-index"
title: Using Hugging Face Embeddings with Couchbase Search Vector Index
short_title: Hugging Face with Couchbase Search Vector Index
description:
- Learn how to generate embeddings using Hugging Face and store them in Couchbase.
- This tutorial demonstrates how to use Couchbase's vector search capabilities with Hugging Face embeddings.
- You'll understand how to perform vector search to find relevant documents based on similarity using FTS Service.
- You'll understand how to perform vector search to find relevant documents based on similarity using Search Vector Index.
content_type: tutorial
filter: sdk
technology:
- vector search
tags:
- FTS
- Search Vector Index
- Artificial Intelligence
- Hugging Face
sdk_language:
@@ -5,19 +5,23 @@
"id": "4c60986a",
"metadata": {},
"source": [
"# Introduction\n",
"## Introduction\n",
"\n",
"In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, [Hugging Face](https://huggingface.co/) as the AI-powered embedding Model. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using the GSI index, please take a look at [this.](https://developer.couchbase.com//tutorial-huggingface-couchbase-vector-search-with-global-secondary-index)"
"In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [Hugging Face](https://huggingface.co/) as the AI-powered embedding model. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval.\n",
"\n",
"This tutorial uses Couchbase's **Search Vector Index** for vector similarity search. For more information on vector indexes, see the [Couchbase Vector Index Documentation](https://docs.couchbase.com/cloud/vector-index/use-vector-indexes.html).\n",
"\n",
"This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using Hyperscale or Composite Vector Indexes, please take a look at [this tutorial](https://developer.couchbase.com/tutorial-huggingface-couchbase-vector-search-with-hyperscale-or-composite-vector-index)."
]
},
{
"cell_type": "markdown",
"id": "6178e6b3",
"metadata": {},
"source": [
"# How to run this tutorial\n",
"## How to Run This Tutorial\n",
"\n",
"This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/huggingface/fts/hugging_face.ipynb).\n",
"This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/huggingface/search_based/hugging_face.ipynb).\n",
"\n",
"You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment."
]
@@ -27,9 +31,9 @@
"id": "ef73d80c",
"metadata": {},
"source": [
"# Before you start\n",
"## Before You Start\n",
"\n",
"## Create and Deploy Your Free Tier Operational cluster on Capella\n",
"### Create and Deploy Your Free Tier Operational Cluster on Capella\n",
"\n",
"To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.\n",
"\n",
@@ -48,12 +52,12 @@
"id": "77308721",
"metadata": {},
"source": [
"# Install necessary libraries"
"## Install Necessary Libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "208a54a1",
"metadata": {},
"outputs": [],
@@ -66,7 +70,7 @@
"id": "9470f9e3-311b-45c8-81c3-baa5fe0995d2",
"metadata": {},
"source": [
"# Imports"
"## Imports"
]
},
{
@@ -98,8 +102,9 @@
"id": "041a3edf-f5f7-43e1-99b9-b775e94fbfe6",
"metadata": {},
"source": [
"# Prerequisites\n",
"In order to run this tutorial, you will need access to a Couchbase Cluster with Full Text Search service either through Couchbase Capella or by running it locally and have credentials to acces a collection on that cluster:"
"## Prerequisites\n",
"\n",
"In order to run this tutorial, you will need access to a Couchbase Cluster with Search Service enabled either through Couchbase Capella or by running it locally, and have credentials to access a collection on that cluster:"
]
},
{
@@ -126,7 +131,8 @@
"id": "15edfec2-64bd-4ba1-b072-4fadacddb01a",
"metadata": {},
"source": [
"# Couchbase Connection\n",
"## Couchbase Connection\n",
"\n",
"In this section, we first need to create a `PasswordAuthenticator` object that would hold our Couchbase credentials:"
]
},
@@ -182,8 +188,13 @@
"id": "625881d5-39e2-44ed-bbca-0db67e98f765",
"metadata": {},
"source": [
"# Creating Couchbase Vector Search Index\n",
"In order to store generated with Hugging Face embeddings onto a Couchbase Cluster, a vector search index needs to be created first. We included a sample index definition that will work with this tutorial in a file named `huggingface_index.json` located in the folder with this tutorial. The definition can be used to create a vector index using Couchbase server web console, on more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html). Please note that the index is configured for documents from bucket `hugginface`, scope `_default` and collection `huggingface` and you will have to edit `source` and document type name in the index definition file if your collection, scope or bucket names are different.\n",
"## Creating Couchbase Search Vector Index\n",
"\n",
"In order to store Hugging Face-generated embeddings onto a Couchbase Cluster, a Search Vector Index needs to be created first. We included a sample index definition that will work with this tutorial in a file named `huggingface_index.json` located in the folder with this tutorial.\n",
"\n",
"The definition can be used to create a Search Vector Index using the Couchbase Server Web Console. For more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html).\n",
"\n",
"Please note that the index is configured for documents from bucket `huggingface`, scope `_default` and collection `huggingface`. You will need to edit the `source` and document type name in the index definition file if your collection, scope, or bucket names are different.\n",
"\n",
"Here, our code verifies the existence of the index and will throw an exception if the index has not been found:"
]
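
The existence check described in this cell can be sketched as below. This is a minimal illustration and not the notebook's actual code: `require_index` is a hypothetical helper, the index name `huggingface_index` is assumed from the definition file name, and the list of names is hard-coded here where a real run would obtain it from the cluster (for example through the SDK's search index manager).

```python
def require_index(available_names, index_name):
    """Return index_name if the cluster reports it; otherwise raise."""
    if index_name not in set(available_names):
        raise RuntimeError(
            f"Search Vector Index '{index_name}' not found; "
            "import the index definition before running the tutorial"
        )
    return index_name


# Hypothetical names; a real run would list the indexes defined on the cluster.
names = ["huggingface_index", "other_index"]
checked = require_index(names, "huggingface_index")
print(f"Found index: {checked}")
```

Failing fast here avoids upserting documents that would never be picked up by a missing index.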
@@ -213,7 +224,7 @@
"id": "d71a7207-54d1-44fd-aa9d-d361b42d2c96",
"metadata": {},
"source": [
"# Hugging Face Initialization"
"## Hugging Face Initialization"
]
},
{
@@ -240,8 +251,9 @@
"id": "c0d8e261-d670-4c40-8037-3d4e3084c360",
"metadata": {},
"source": [
"# Embedding Documents\n",
"After initializing Hugging Face transformers library, it can be used to generate vector embeddings for user input or predefined set of phrases. Here, we're generating 2 embeddings for contained in the array strings:"
"## Embedding Documents\n",
"\n",
"After initializing the Hugging Face transformers library, it can be used to generate vector embeddings for user input or a predefined set of phrases. Here, we're generating embeddings for the strings contained in the array:"
]
},
{
@@ -266,8 +278,9 @@
"id": "80814e90-699f-4201-8cd3-7ef8adab9966",
"metadata": {},
"source": [
"# Storing Embeddings in Couchbase\n",
"Generated embeddings are then stored as vector fields inside documents that can contain additional information about the vector, including the original text. The documents are then upserted onto the couchbase cluster:"
"## Storing Embeddings in Couchbase\n",
"\n",
"Generated embeddings are then stored as vector fields inside documents that can contain additional information about the vector, including the original text. The documents are then upserted onto the Couchbase cluster:"
]
},
{
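
The document shape described in this cell might look like the sketch below. The field names `text` and `vector`, the UUID keys, and the helper `build_documents` are assumptions for illustration; the real field names have to match whatever the index definition maps. The vectors are made-up 3-dimensional values, not real model output.

```python
import uuid


def build_documents(texts, vectors, text_field="text", vector_field="vector"):
    """Pair each text with its embedding and key the document by a random UUID."""
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")
    documents = {}
    for text, vector in zip(texts, vectors):
        documents[str(uuid.uuid4())] = {
            text_field: text,
            vector_field: [float(x) for x in vector],
        }
    return documents


texts = ["Couchbase Server is a multipurpose database", "Hugging Face hosts embedding models"]
vectors = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]  # made-up values for illustration
docs = build_documents(texts, vectors)

# With a connected collection, each document could then be stored with:
#     for key, doc in docs.items():
#         collection.upsert(key, doc)
```

Keeping the original text next to the vector lets a search hit be displayed without a second lookup.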
@@ -291,8 +304,9 @@
"id": "f11a0d98-bcf5-4fe4-b602-6e8a23edf95e",
"metadata": {},
"source": [
"# Searching For Embeddings\n",
"After the documents are upserted onto the cluster, their vector fields will be added into previously imported vector index. Later, new embeddings can be added or used to perform a similarity search on the previously added documents:"
"## Searching For Embeddings\n",
"\n",
"After the documents are upserted onto the cluster, their vector fields will be added to the previously imported Search Vector Index. Later, new embeddings can be added or used to perform a similarity search on the previously added documents:"
]
},
{
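
To make "similarity search" concrete, here is a brute-force sketch of ranking stored vectors against a query vector. It is not how the Search Vector Index works internally and does not call Couchbase at all; it assumes cosine similarity as the metric (the index definition may use a different one), and the stored items are made-up 3-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, items, k=1):
    """Return the k (score, text) pairs with the highest similarity to the query."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up stored documents: (text, embedding)
stored = [
    ("doc about databases", [1.0, 0.0, 0.0]),
    ("doc about models", [0.0, 1.0, 0.0]),
]
best = top_k([0.9, 0.1, 0.0], stored, k=1)
print(best[0][1])
```

A vector index answers the same question approximately and without scanning every stored vector, which is what makes it practical at scale.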