feat: Add Text Embeddings Inference (TEI) provider support#60

Merged
jayscambler merged 5 commits into main from
jay/cfos-46-add-text-embeddings-inference-tei-support-to-embeddings
Jul 2, 2025

Conversation

@jayscambler
Contributor

Summary

Adds support for Hugging Face's Text Embeddings Inference (TEI) server as an embedding provider in ContextFrame, providing high-performance, self-hosted embeddings for 100+ open-source models.

Changes

  • ✨ New TEIProvider class implementing the EmbeddingProvider interface
  • 🔧 Updated factory function to support provider_type="tei"
  • 📚 Added comprehensive documentation in embedding providers guide
  • 🧪 Unit tests with mocks for TEI functionality
  • 💡 Complete example demonstrating TEI usage patterns
  • 📦 Added httpx as optional dependency for lightweight HTTP client

Features

  • Local and Remote Support: Works with TEI servers running locally or remotely
  • Authentication: Bearer token support for secured instances
  • Health Checks: Built-in server health monitoring
  • Error Handling: Automatic retries with exponential backoff
  • Configuration: Flexible timeout, truncation, and normalization options
  • Minimal Dependencies: Only requires httpx (25KB) for HTTP communication
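
The retry behavior listed above can be sketched roughly as follows. This is an illustrative implementation, not the provider's actual code; the `max_retries` and `base_delay` parameters (and the generic `request` callable) are assumptions for the sketch:

```python
import time


def backoff_delays(max_retries: int, base_delay: float) -> list:
    """Delay before each retry: base_delay doubled on every attempt."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(request, max_retries: int = 3, base_delay: float = 0.5):
    """Call request() (e.g. a closure around an HTTP POST to the TEI
    server), retrying on failure with exponential backoff and
    re-raising after the final attempt."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return request()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(delays[attempt])
```

In the real provider the `request` closure would presumably wrap an `httpx` call against the configured `api_base`.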

Example Usage

from contextframe.embed import create_embedder

# Local TEI server
embedder = create_embedder(
    model="BAAI/bge-large-en-v1.5",
    provider_type="tei",
    api_base="http://localhost:8080"
)

# Embed documents
results = embedder.embed_batch(["Document 1", "Document 2"])

Benefits

  • 🚀 Performance: Flash Attention, ONNX optimization, dynamic batching
  • 🔐 Privacy: Self-hosted solution for sensitive data
  • 🎯 Flexibility: Supports any Sentence Transformer or BERT-based model
  • 📊 Production Ready: Built-in metrics, monitoring, health checks
  • 💻 Hardware Support: GPU acceleration (CUDA) and CPU optimizations

Testing

The implementation includes comprehensive unit tests using mocks. Note: the current test suite has NumPy compatibility issues unrelated to this PR; they will be addressed separately.
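
A mock-based unit test along these lines avoids needing a live TEI server. The request/response shape below follows TEI's `/embed` API (a JSON body with `inputs`, returning one vector per input), but `embed_batch` here is a minimal stand-in, not the provider's actual method, and the exact tests in this PR may differ:

```python
from unittest.mock import MagicMock


def embed_batch(client, texts):
    """Minimal stand-in for the provider's batch call: POST the texts
    to TEI's /embed endpoint and return one vector per input."""
    response = client.post("/embed", json={"inputs": texts, "truncate": True})
    response.raise_for_status()
    return response.json()


def test_embed_batch_posts_inputs_and_returns_vectors():
    # Mock the HTTP client so no server is required.
    client = MagicMock()
    client.post.return_value.json.return_value = [[0.1, 0.2], [0.3, 0.4]]

    vectors = embed_batch(client, ["Document 1", "Document 2"])

    # The mock records the call, so we can assert on the request shape.
    client.post.assert_called_once_with(
        "/embed", json={"inputs": ["Document 1", "Document 2"], "truncate": True}
    )
    assert vectors == [[0.1, 0.2], [0.3, 0.4]]
```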

Related

Docker Setup

# GPU deployment
docker run --gpus all -p 8080:80 -v $PWD/data:/data \
  ghcr.io/huggingface/text-embeddings-inference:1.7 \
  --model-id BAAI/bge-large-en-v1.5

# CPU deployment
docker run -p 8080:80 -v $PWD/data:/data \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
  --model-id BAAI/bge-large-en-v1.5
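
The setup guide added in this PR also covers Docker Compose. A minimal compose file equivalent to the GPU command above might look like this (the service name and volume path are illustrative):

```yaml
services:
  tei:
    image: ghcr.io/huggingface/text-embeddings-inference:1.7
    command: --model-id BAAI/bge-large-en-v1.5
    ports:
      - "8080:80"
    volumes:
      - ./data:/data
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```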

- Fix pylance dependency (was incorrectly 'lance' in v0.1.1)
- Fix 38 failing integration tests with API corrections
- Add 'member_of' to valid relationship types for collections
- Fix custom metadata string validation issues
- Implement Lance v0.30.0 vector search bug workaround
- Fix UUID property access and len() usage on datasets
- Improve error messages with field context and helpful hints

feat: Add new features for better developer experience
- Add full-text search index creation (create_fts_index method)
- Add UUID override support at creation time
- Add auto-indexing option for full-text search
- Enhance create_scalar_index with index type support
- Reorganize tests into unit/ and integration/ structure

docs: Update documentation and changelog
- Add comprehensive CHANGELOG entry for v0.1.2
- Add migration guide (docs/migration/api-changes-v012.md)
- Add API improvements roadmap (docs/roadmap/api-improvements-v02.md)
- Update API reference documentation

BREAKING CHANGE: Replaced LlamaIndex text splitter with semantic-text-splitter
- Add TEIProvider class for high-performance embedding inference
- Support both local and remote TEI server instances
- Add httpx as optional dependency for lightweight HTTP client
- Update factory function to support provider_type='tei'
- Add comprehensive documentation and examples
- Include unit tests with mocks for TEI functionality
- Support for authentication, retries, and health checks

TEI provides optimized inference for 100+ open-source models with:
- Flash Attention and dynamic batching
- GPU/CPU hardware acceleration
- Production-ready monitoring and metrics
- Self-hosted deployment for data privacy

Implements CFOS-46
@linear

linear bot commented Jul 2, 2025

- Add comprehensive TEI setup guide covering hardware requirements, installation methods, and troubleshooting
- Include Docker, Docker Compose, and Kubernetes deployment examples
- Add security considerations and performance tuning tips
- Document NumPy 2.x compatibility issues with PyArrow
- Link from main embedding providers doc to setup guide
- Upgrade PyArrow from 14.0.2 to >=17.0.0 for better NumPy compatibility
- Pin NumPy to 1.x series (numpy>=1.24,<2) to avoid NumPy 2.x issues
- Resolves 'numpy.core.multiarray failed to import' errors
- Fixes test environment and development workflows
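
Assuming the project declares its dependencies in pyproject.toml, the pins described above would look roughly like the following (section layout is illustrative, not copied from the repo):

```toml
[project]
dependencies = [
    "pyarrow>=17.0.0",  # upgraded from 14.0.2 for NumPy compatibility
    "numpy>=1.24,<2",   # stay on NumPy 1.x to avoid NumPy 2.x import errors
]

[project.optional-dependencies]
tei = ["httpx"]         # lightweight HTTP client for the TEI provider
```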
…embeddings-inference-tei-support-to-embeddings
@jayscambler jayscambler merged commit e3fc1bf into main Jul 2, 2025