Commit fe18943

Updates according to latest docs
1 parent 201a261 commit fe18943

8 files changed

+143
-317
lines changed

huggingface/gsi/frontmatter.md

Lines changed: 0 additions & 21 deletions
This file was deleted.
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+---
+# frontmatter
+path: "/tutorial-huggingface-couchbase-vector-search-with-hyperscale-or-composite-vector-index"
+alt_paths: ["/tutorial-huggingface-couchbase-vector-search-with-hyperscale-vector-index", "/tutorial-huggingface-couchbase-vector-search-with-composite-vector-index"]
+title: Using Hugging Face Embeddings with Couchbase Hyperscale and Composite Vector Index
+short_title: Hugging Face with Couchbase Hyperscale & Composite Index
+description:
+- Learn how to generate embeddings using Hugging Face and store them in Couchbase.
+- This tutorial demonstrates how to use Couchbase's vector search capabilities with Hugging Face embeddings using Hyperscale and Composite Vector Indexes.
+- You'll understand how to perform high-performance vector search to find relevant documents based on similarity.
+content_type: tutorial
+filter: sdk
+technology:
+- vector search
+tags:
+- Hyperscale Vector Index
+- Composite Vector Index
+- Artificial Intelligence
+- Hugging Face
+sdk_language:
+- python
+length: 30 Mins
+---

huggingface/gsi/hugging_face.ipynb renamed to huggingface/query_based/hugging_face.ipynb

Lines changed: 80 additions & 270 deletions
Large diffs are not rendered by default.
Lines changed: 5 additions & 5 deletions
@@ -1,18 +1,18 @@
 ---
 # frontmatter
-path: "/tutorial-huggingface-couchbase-vector-search-with-fts"
-title: Using Hugging Face Embeddings with Couchbase Vector Search using FTS Service
-short_title: Hugging Face with Couchbase Vector Search using FTS Service
+path: "/tutorial-huggingface-couchbase-vector-search-with-search-vector-index"
+title: Using Hugging Face Embeddings with Couchbase Search Vector Index
+short_title: Hugging Face with Couchbase Search Vector Index
 description:
 - Learn how to generate embeddings using Hugging Face and store them in Couchbase.
 - This tutorial demonstrates how to use Couchbase's vector search capabilities with Hugging Face embeddings.
-- You'll understand how to perform vector search to find relevant documents based on similarity using FTS Service.
+- You'll understand how to perform vector search to find relevant documents based on similarity using Search Vector Index.
 content_type: tutorial
 filter: sdk
 technology:
 - vector search
 tags:
-- FTS
+- Search Vector Index
 - Artificial Intelligence
 - Hugging Face
 sdk_language:

huggingface/fts/hugging_face.ipynb renamed to huggingface/search_based/hugging_face.ipynb

Lines changed: 35 additions & 21 deletions
@@ -5,19 +5,23 @@
 "id": "4c60986a",
 "metadata": {},
 "source": [
-"# Introduction\n",
+"## Introduction\n",
 "\n",
-"In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, [Hugging Face](https://huggingface.co/) as the AI-powered embedding Model. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using the GSI index, please take a look at [this.](https://developer.couchbase.com//tutorial-huggingface-couchbase-vector-search-with-global-secondary-index)"
+"In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [Hugging Face](https://huggingface.co/) as the AI-powered embedding model. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval.\n",
+"\n",
+"This tutorial uses Couchbase's **Search Vector Index** for vector similarity search. For more information on vector indexes, see the [Couchbase Vector Index Documentation](https://docs.couchbase.com/cloud/vector-index/use-vector-indexes.html).\n",
+"\n",
+"This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using Hyperscale or Composite Vector Indexes, please take a look at [this tutorial](https://developer.couchbase.com/tutorial-huggingface-couchbase-vector-search-with-hyperscale-or-composite-vector-index)."
 ]
 },
 {
 "cell_type": "markdown",
 "id": "6178e6b3",
 "metadata": {},
 "source": [
-"# How to run this tutorial\n",
+"## How to Run This Tutorial\n",
 "\n",
-"This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/huggingface/fts/hugging_face.ipynb).\n",
+"This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/huggingface/search_based/hugging_face.ipynb).\n",
 "\n",
 "You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment."
 ]
@@ -27,9 +31,9 @@
 "id": "ef73d80c",
 "metadata": {},
 "source": [
-"# Before you start\n",
+"## Before You Start\n",
 "\n",
-"## Create and Deploy Your Free Tier Operational cluster on Capella\n",
+"### Create and Deploy Your Free Tier Operational Cluster on Capella\n",
 "\n",
 "To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with a environment where you can explore and learn about Capella with no time constraint.\n",
 "\n",
@@ -48,12 +52,12 @@
 "id": "77308721",
 "metadata": {},
 "source": [
-"# Install necessary libraries"
+"## Install Necessary Libraries"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "id": "208a54a1",
 "metadata": {},
 "outputs": [],
@@ -66,7 +70,7 @@
 "id": "9470f9e3-311b-45c8-81c3-baa5fe0995d2",
 "metadata": {},
 "source": [
-"# Imports"
+"## Imports"
 ]
 },
 {
@@ -98,8 +102,9 @@
 "id": "041a3edf-f5f7-43e1-99b9-b775e94fbfe6",
 "metadata": {},
 "source": [
-"# Prerequisites\n",
-"In order to run this tutorial, you will need access to a Couchbase Cluster with Full Text Search service either through Couchbase Capella or by running it locally and have credentials to acces a collection on that cluster:"
+"## Prerequisites\n",
+"\n",
+"In order to run this tutorial, you will need access to a Couchbase Cluster with Search Service enabled either through Couchbase Capella or by running it locally, and have credentials to access a collection on that cluster:"
 ]
 },
 {
@@ -126,7 +131,8 @@
 "id": "15edfec2-64bd-4ba1-b072-4fadacddb01a",
 "metadata": {},
 "source": [
-"# Couchbase Connection\n",
+"## Couchbase Connection\n",
+"\n",
 "In this section, we first need to create a `PasswordAuthenticator` object that would hold our Couchbase credentials:"
 ]
 },
@@ -182,8 +188,13 @@
 "id": "625881d5-39e2-44ed-bbca-0db67e98f765",
 "metadata": {},
 "source": [
-"# Creating Couchbase Vector Search Index\n",
-"In order to store generated with Hugging Face embeddings onto a Couchbase Cluster, a vector search index needs to be created first. We included a sample index definition that will work with this tutorial in a file named `huggingface_index.json` located in the folder with this tutorial. The definition can be used to create a vector index using Couchbase server web console, on more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html). Please note that the index is configured for documents from bucket `hugginface`, scope `_default` and collection `huggingface` and you will have to edit `source` and document type name in the index definition file if your collection, scope or bucket names are different.\n",
+"## Creating Couchbase Search Vector Index\n",
+"\n",
+"In order to store Hugging Face-generated embeddings onto a Couchbase Cluster, a Search Vector Index needs to be created first. We included a sample index definition that will work with this tutorial in a file named `huggingface_index.json` located in the folder with this tutorial.\n",
+"\n",
+"The definition can be used to create a Search Vector Index using Couchbase server web console. For more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html).\n",
+"\n",
+"Please note that the index is configured for documents from bucket `huggingface`, scope `_default` and collection `huggingface`. You will need to edit the `source` and document type name in the index definition file if your collection, scope, or bucket names are different.\n",
 "\n",
 "Here, our code verifies the existence of the index and will throw an exception if the index has not been found:"
 ]
@@ -213,7 +224,7 @@
 "id": "d71a7207-54d1-44fd-aa9d-d361b42d2c96",
 "metadata": {},
 "source": [
-"# Hugging Face Initialization"
+"## Hugging Face Initialization"
 ]
 },
 {
@@ -240,8 +251,9 @@
 "id": "c0d8e261-d670-4c40-8037-3d4e3084c360",
 "metadata": {},
 "source": [
-"# Embedding Documents\n",
-"After initializing Hugging Face transformers library, it can be used to generate vector embeddings for user input or predefined set of phrases. Here, we're generating 2 embeddings for contained in the array strings:"
+"## Embedding Documents\n",
+"\n",
+"After initializing the Hugging Face transformers library, it can be used to generate vector embeddings for user input or a predefined set of phrases. Here, we're generating embeddings for the strings contained in the array:"
 ]
 },
 {
@@ -266,8 +278,9 @@
 "id": "80814e90-699f-4201-8cd3-7ef8adab9966",
 "metadata": {},
 "source": [
-"# Storing Embeddings in Couchbase\n",
-"Generated embeddings are then stored as vector fields inside documents that can contain additional information about the vector, including the original text. The documents are then upserted onto the couchbase cluster:"
+"## Storing Embeddings in Couchbase\n",
+"\n",
+"Generated embeddings are then stored as vector fields inside documents that can contain additional information about the vector, including the original text. The documents are then upserted onto the Couchbase cluster:"
 ]
 },
 {
@@ -291,8 +304,9 @@
 "id": "f11a0d98-bcf5-4fe4-b602-6e8a23edf95e",
 "metadata": {},
 "source": [
-"# Searching For Embeddings\n",
-"After the documents are upserted onto the cluster, their vector fields will be added into previously imported vector index. Later, new embeddings can be added or used to perform a similarity search on the previously added documents:"
+"## Searching For Embeddings\n",
+"\n",
+"After the documents are upserted onto the cluster, their vector fields will be added to the previously imported Search Vector Index. Later, new embeddings can be added or used to perform a similarity search on the previously added documents:"
 ]
 },
 {
File renamed without changes.
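
Reviewer note: the notebook's code cells are not rendered in this diff, so the store-then-search flow that the renamed markdown cells describe may be hard to picture. The sketch below is a minimal, pure-Python illustration of that idea only. It is not taken from the notebook, does not call the Couchbase SDK or Hugging Face, and is not how a Search Vector Index is implemented. The field names `text` and `vector`, the helper names, and the three-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_document(text, vector):
    # Hypothetical document shape: the embedding is stored in a vector field
    # next to the original text, as the "Storing Embeddings" cell describes.
    return {"text": text, "vector": list(vector)}


def search(documents, query_vector, k=1):
    # Brute-force similarity search: score every document, keep the top k.
    # A real vector index avoids scanning every document.
    scored = [
        (cosine_similarity(doc["vector"], query_vector), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings standing in for model output (assumed values).
documents = [
    make_document("Couchbase Server is a NoSQL database", [0.9, 0.1, 0.0]),
    make_document("Hugging Face hosts embedding models", [0.1, 0.9, 0.2]),
]
query_vector = [0.2, 0.8, 0.1]

top = search(documents, query_vector, k=1)
print(top[0][1]["text"])
```

With these toy vectors the second document ranks first, because its vector points in nearly the same direction as the query vector.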
