First implementation of the ReRanker endpoint. #190

kalaspuffar · 2025-08-18T08:29:13Z

This PR follows the suggestion in danny-avila/LibreChat#9102.

It adds a /rerank endpoint that uses open-source models to rerank documents. The endpoint takes a query to rank against and the documents to rank. You can also specify k, the number of results to return, along with configuration for the model and keys used to run the operation.

All available configuration options can be found at https://github.com/AnswerDotAI/rerankers, which this endpoint is a thin wrapper around.

Test call:

curl -s http://localhost:8000/rerank \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_JWT_TOKEN' \
  -d '{
    "query": "I love you",
    "docs": ["I hate you", "I really like you"],
    "k": 5
  }'

Expected response:

[{"text":"I really like you","score":-1.537894606590271},{"text":"I hate you","score":-4.30911111831665}]
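For anyone who would rather call the endpoint from Python than curl, here is a minimal sketch using only the standard library. The helper names `build_rerank_request` and `parse_rerank_response` are made up for this example and are not part of the PR; the URL and token are the same placeholders as in the test call above.

```python
import json
import urllib.request


def build_rerank_request(base_url, token, query, docs, k=4):
    """Build (but do not send) a POST request for the /rerank endpoint."""
    payload = {"query": query, "docs": docs, "k": k}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def parse_rerank_response(body):
    """Turn the JSON array returned by /rerank into (text, score) tuples."""
    return [(item["text"], item["score"]) for item in json.loads(body)]


def rerank(base_url, token, query, docs, k=4):
    """Send the request; this needs a running rag_api instance."""
    req = build_rerank_request(base_url, token, query, docs, k)
    with urllib.request.urlopen(req) as resp:
        return parse_rerank_response(resp.read())
```

Calling `rerank("http://localhost:8000", "YOUR_JWT_TOKEN", "I love you", ["I hate you", "I really like you"], k=5)` against a running instance should return the documents ordered by score, in the same shape as the expected response above.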

I realized that sending the model with each call is not the right approach: the model should be loaded once to improve performance. You can now configure it through the environment of the rag_api repository.

SIMPLE_RERANKER_MODEL_NAME = "mixedbread-ai/mxbai-rerank-large-v1"
SIMPLE_RERANKER_MODEL_TYPE = "cross-encoder"
#SIMPLE_RERANKER_MODEL_NAME = "ms-marco-MiniLM-L-12-v2"
#SIMPLE_RERANKER_MODEL_NAME = "flashrank"
#SIMPLE_RERANKER_MODEL_TYPE = "colbert"
SIMPLE_RERANKER_LANG = ""
SIMPLE_RERANKER_API_PROVIDER = ""
SIMPLE_RERANKER_API_KEY = ""
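A rough sketch of how these variables could be read into one settings object. `RerankerSettings` is a hypothetical helper, not code from this PR, and it assumes that only the model name is required and that blank values such as `SIMPLE_RERANKER_LANG = ""` mean "not configured".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _clean(value: Optional[str]) -> Optional[str]:
    """Treat missing and blank values the same way: as 'not configured'."""
    value = (value or "").strip()
    return value or None


@dataclass(frozen=True)
class RerankerSettings:  # hypothetical helper, not part of the PR
    model_name: str
    model_type: Optional[str] = None
    lang: Optional[str] = None
    api_provider: Optional[str] = None
    api_key: Optional[str] = None

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RerankerSettings":
        # Assumption: the model name is the only mandatory setting.
        model_name = _clean(env.get("SIMPLE_RERANKER_MODEL_NAME"))
        if model_name is None:
            raise RuntimeError("SIMPLE_RERANKER_MODEL_NAME must be set")
        return cls(
            model_name=model_name,
            model_type=_clean(env.get("SIMPLE_RERANKER_MODEL_TYPE")),
            lang=_clean(env.get("SIMPLE_RERANKER_LANG")),
            api_provider=_clean(env.get("SIMPLE_RERANKER_API_PROVIDER")),
            api_key=_clean(env.get("SIMPLE_RERANKER_API_KEY")),
        )
```

Passing the mapping explicitly, for example `RerankerSettings.from_env({"SIMPLE_RERANKER_MODEL_NAME": "flashrank"})`, keeps the parsing testable without touching the real process environment.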

kalaspuffar · 2025-08-19T07:04:28Z

Force push was due to black linting.

All done! ✨ 🍰 ✨
1 file reformatted, 1 file left unchanged.

Copilot

Pull request overview

This PR adds a new /rerank endpoint to enable document reranking using open source models via the rerankers library. The implementation allows users to submit a query and a list of documents to be reranked based on relevance, with optional control over the number of top results returned.

Key Changes:

- Added rerankers library dependencies with transformers and flashrank support
- Implemented /rerank endpoint that accepts queries and documents for reranking
- Configured Docker Compose with NVIDIA runtime and HuggingFace cache volume for model support

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| requirements.txt | Added rerankers library with transformers and flashrank extras for document reranking functionality |
| docker-compose.yaml | Added NVIDIA runtime support and HuggingFace cache volume mount to support GPU-accelerated model inference |
| app/routes/document_routes.py | Implemented reranker instance initialization and `/rerank` endpoint handler with document processing logic |
| app/models.py | Added `QueryMultipleDocs` Pydantic model to define request schema for the rerank endpoint |

Copilot · 2025-11-28T16:25:42Z

app/models.py

+class QueryMultipleDocs(BaseModel):
+    query: str
+    docs: List[str]
+    k: int = 4


The QueryMultipleDocs model only accepts List[str] for the docs field, but the endpoint implementation at lines 717-726 also handles dictionary objects with "text", "doc_id", and "metadata" fields. The model definition should use Union[str, dict] to accurately reflect the accepted input types:

from typing import List, Union

class QueryMultipleDocs(BaseModel):
    query: str
    docs: List[Union[str, dict]]
    k: int = 4
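To make the mixed input concrete, here is a small standard-library sketch of what the handler's loop does with `str` and `dict` entries. `NormalizedDoc` and `normalize_docs` are illustrative names only; the PR itself builds `ReRankDocument` objects, and rejecting other entry types with a `TypeError` is an assumption added for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class NormalizedDoc:  # hypothetical stand-in for the document type used in the PR
    text: str
    doc_id: Union[int, str]
    metadata: Dict[str, Any] = field(default_factory=dict)


def normalize_docs(docs: List[Union[str, dict]]) -> List[NormalizedDoc]:
    """Convert plain strings and dict entries into one uniform shape."""
    normalized = []
    for i, d in enumerate(docs):
        if isinstance(d, str):
            # Plain strings get their list position as the document id.
            normalized.append(NormalizedDoc(text=d, doc_id=i))
        elif isinstance(d, dict):
            normalized.append(
                NormalizedDoc(
                    text=d.get("text", ""),
                    doc_id=d.get("doc_id", i),
                    metadata=d.get("metadata") or {},
                )
            )
        else:
            # Assumption: anything else is a client error.
            raise TypeError(f"docs[{i}] must be a str or dict, got {type(d).__name__}")
    return normalized
```

With `List[Union[str, dict]]` in the request model, a body such as `["a", {"text": "b", "doc_id": "x"}]` would pass validation and reach this kind of loop.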

Copilot · 2025-11-28T16:25:44Z

app/routes/document_routes.py

+@router.post("/rerank")
+async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):
+    try:
+        docs = []
+        for i, d in enumerate(body.docs):
+            if isinstance(d, str):
+                docs.append(ReRankDocument(text=d, doc_id=i))
+            else:
+                docs.append(
+                    ReRankDocument(
+                        text=d.get("text", ""),
+                        doc_id=d.get("doc_id", i),
+                        metadata=d.get("metadata", {}) or {},
+                    )
+                )
+
+        top_k = body.k
+
+        results = reRankerInstance.rank(query=body.query, docs=docs)
+        items = results.top_k(top_k) if top_k else results
+
+        return [
+            {"text": getattr(r.document, "text", None), "score": r.score} for r in items
+        ]
+    except Exception as e:
+        logger.error(
+            "Error in reranking documents | Query: %s | Error: %s | Traceback: %s",
+            body.query,
+            str(e),
+            traceback.format_exc(),
+        )
+        raise HTTPException(status_code=500, detail=str(e))


The new /rerank endpoint lacks test coverage. Consider adding tests similar to other endpoints in the codebase (e.g., test_query_embeddings_by_file_id) to verify:

- Successful reranking with string documents
- Reranking with dict documents containing metadata
- Handling of the k parameter (top_k filtering)
- Error handling for invalid inputs
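As a starting point, the ordering and k handling could be exercised without loading a model. The sketch below is only an approximation: `rerank_texts` and `overlap_score` are hypothetical, the toy word-overlap scorer stands in for the real reranker, and the real tests would go through the FastAPI route instead. The `k` logic mirrors the handler's `results.top_k(top_k) if top_k else results`, so `k=0` returns everything.

```python
from typing import Callable, Dict, List, Union


def overlap_score(query: str, doc: str) -> float:
    """Toy scorer: number of lowercase words shared by query and doc."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def rerank_texts(
    query: str,
    docs: List[str],
    k: int = 4,
    score: Callable[[str, str], float] = overlap_score,
) -> List[Dict[str, Union[str, float]]]:
    """Rank docs by descending score and keep the top k (all when k is 0)."""
    if not docs:
        raise ValueError("docs list cannot be empty")
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    items = ranked[:k] if k else ranked
    return [{"text": d, "score": score(query, d)} for d in items]


DOCS = ["green pear", "red apple pie", "red car"]


def test_orders_by_descending_score():
    texts = [r["text"] for r in rerank_texts("red apple", DOCS, k=3)]
    assert texts == ["red apple pie", "red car", "green pear"]


def test_k_limits_results():
    assert len(rerank_texts("red apple", DOCS, k=2)) == 2


def test_k_zero_or_too_large_returns_all():
    assert len(rerank_texts("red apple", DOCS, k=0)) == 3
    assert len(rerank_texts("red apple", DOCS, k=5)) == 3


def test_empty_docs_raises():
    try:
        rerank_texts("red apple", [])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

The functions are named so that pytest would pick them up if this were dropped into a test module.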

Copilot · 2025-11-28T16:25:44Z

app/routes/document_routes.py

+reRankerInstance = Reranker(
+    model_name=os.getenv("SIMPLE_RERANKER_MODEL_NAME"),
+    model_type=os.getenv("SIMPLE_RERANKER_MODEL_TYPE"),
+    lang=os.getenv("SIMPLE_RERANKER_LANG"),
+)


The new reranker environment variables (SIMPLE_RERANKER_MODEL_NAME, SIMPLE_RERANKER_MODEL_TYPE, SIMPLE_RERANKER_LANG) are not documented in the README.md. These should be added to the "Environment Variables" section to help users configure the reranker feature properly.

Copilot · 2025-11-28T16:25:45Z

app/routes/document_routes.py

        raise HTTPException(status_code=500, detail=str(e))

+@router.post("/rerank")
+async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):


The /rerank endpoint is missing a docstring to describe its purpose, parameters, and return value. Other endpoints in the codebase (e.g., /text at line 746) include docstrings. Consider adding one:

@router.post("/rerank")
async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):
    """
    Rerank documents based on relevance to a query using a reranking model.

    Args:
        request: The FastAPI request object
        body: Contains query string, list of documents, and optional k value

    Returns:
        List of ranked documents with their scores
    """

Suggested change

-async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):
+async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):
+    """
+    Rerank documents based on relevance to a query using a reranking model.
+    Args:
+        request: The FastAPI request object.
+        body: QueryMultipleDocs object containing the query string, list of documents, and optional k value.
+    Returns:
+        List of dictionaries, each containing the document text and its relevance score.
+    """

Copilot · 2025-11-28T16:25:45Z

docker-compose.yaml

      - DB_PORT=5432
    ports:
      - "8000:8000"
+    runtime: nvidia


The runtime: nvidia configuration is added unconditionally, but this will cause failures on systems without NVIDIA GPUs or the NVIDIA Container Runtime. Consider:

- Making this conditional based on an environment variable
- Documenting that GPU support is required for reranking
- Or using a docker-compose override file for GPU-specific configurations

Example:

runtime: ${DOCKER_RUNTIME:-runc} # defaults to runc, override with nvidia when needed

Suggested change

-    runtime: nvidia
+    runtime: ${DOCKER_RUNTIME:-runc}

Copilot · 2025-11-28T16:25:46Z

app/routes/document_routes.py


+@router.post("/rerank")
+async def rerank_documents_by_query(request: Request, body: QueryMultipleDocs):
+    try:


The endpoint doesn't validate that the docs list is non-empty before processing. If an empty list is provided, the reranker may behave unexpectedly. Consider adding validation:

if not body.docs:
    raise HTTPException(status_code=400, detail="docs list cannot be empty")

Suggested change

-    try:
+    try:
+        if not body.docs:
+            raise HTTPException(status_code=400, detail="docs list cannot be empty")

Copilot · 2025-11-28T16:25:46Z

app/routes/document_routes.py

 from app.utils.health import is_health_ok

 router = APIRouter()
+reRankerInstance = Reranker(


The variable name reRankerInstance uses unconventional camelCase naming. Python convention (PEP 8) recommends using lowercase with underscores for variable names. Consider renaming to reranker_instance or reranker for better consistency with the codebase (e.g., vector_store, logger).

Suggested change

-reRankerInstance = Reranker(
+reranker_instance = Reranker(

Copilot · 2025-11-28T16:25:46Z

app/routes/document_routes.py

+reRankerInstance = Reranker(
+    model_name=os.getenv("SIMPLE_RERANKER_MODEL_NAME"),
+    model_type=os.getenv("SIMPLE_RERANKER_MODEL_TYPE"),
+    lang=os.getenv("SIMPLE_RERANKER_LANG"),
+)




The Reranker instance is initialized at module import time with environment variables that may not be set or may be None. This could cause initialization failures or runtime errors when the module loads. Consider:

- Adding validation to ensure required environment variables are set
- Providing sensible defaults for optional parameters
- Or deferring initialization until first use with lazy loading

For example:

model_name = os.getenv("SIMPLE_RERANKER_MODEL_NAME")
if not model_name:
    raise ValueError("SIMPLE_RERANKER_MODEL_NAME environment variable must be set")

Suggested change

-reRankerInstance = Reranker(
-    model_name=os.getenv("SIMPLE_RERANKER_MODEL_NAME"),
-    model_type=os.getenv("SIMPLE_RERANKER_MODEL_TYPE"),
-    lang=os.getenv("SIMPLE_RERANKER_LANG"),
-)
+@lru_cache(maxsize=1)
+def get_reranker_instance():
+    model_name = os.getenv("SIMPLE_RERANKER_MODEL_NAME")
+    model_type = os.getenv("SIMPLE_RERANKER_MODEL_TYPE")
+    lang = os.getenv("SIMPLE_RERANKER_LANG")
+    if not model_name:
+        raise RuntimeError("SIMPLE_RERANKER_MODEL_NAME environment variable must be set")
+    if not model_type:
+        raise RuntimeError("SIMPLE_RERANKER_MODEL_TYPE environment variable must be set")
+    if not lang:
+        raise RuntimeError("SIMPLE_RERANKER_LANG environment variable must be set")
+    return Reranker(
+        model_name=model_name,
+        model_type=model_type,
+        lang=lang,
+    )
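One thing to watch with the suggestion above: the sample configuration in the PR description sets `SIMPLE_RERANKER_LANG = ""`, so a strict `if not lang` check would reject it. A possible variant, sketched here with a stub class so it runs without downloading a model, treats blank optional values as unset. `StubReranker` and `get_reranker` are hypothetical names, and whether only the model name should be mandatory is an assumption.

```python
import os
from functools import lru_cache


class StubReranker:  # hypothetical stand-in for the real Reranker, counts constructions
    instances = 0

    def __init__(self, model_name, model_type=None, lang=None):
        StubReranker.instances += 1
        self.model_name = model_name
        self.model_type = model_type
        self.lang = lang


@lru_cache(maxsize=1)
def get_reranker():
    """Create the reranker on first use and reuse it for every later call."""
    model_name = os.getenv("SIMPLE_RERANKER_MODEL_NAME")
    if not model_name:
        # lru_cache does not cache exceptions, so this is re-checked on each
        # call until the variable is set.
        raise RuntimeError("SIMPLE_RERANKER_MODEL_NAME environment variable must be set")
    return StubReranker(
        model_name=model_name,
        # Assumption: blank optional values mean "use the default".
        model_type=os.getenv("SIMPLE_RERANKER_MODEL_TYPE") or None,
        lang=os.getenv("SIMPLE_RERANKER_LANG") or None,
    )
```

The handler would then call `get_reranker().rank(...)` instead of using a module-level instance, so importing the module no longer loads the model or fails on missing configuration.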

kalaspuffar force-pushed the reranker branch from 0b3c63e to c1267d3 on August 19, 2025 07:02

This was referenced Nov 19, 2025

Adding code to call the rag_api simple reranker. danny-avila/agents#33

Open

feat: Adding code to call the rag_api simple reranker. danny-avila/LibreChat#10574

Open

kalaspuffar added 2 commits November 19, 2025 22:27

First implementation of the ReRanker endpoint.

8782c8e

After linting.

b0fbc78

kalaspuffar force-pushed the reranker branch from 1af1023 to b0fbc78 on November 19, 2025 21:29

Fixing rebase issue.

278e770

danny-avila requested a review from Copilot November 28, 2025 16:21

Copilot started reviewing on behalf of danny-avila November 28, 2025 16:21

Copilot finished reviewing on behalf of danny-avila November 28, 2025 16:24
