Retriever Service Index Error #90

@GTS-AI-Infra-Lab-SotaS

Hello authors,
While reproducing re-search, I found that the Retriever Service keeps reporting the following error:

Begin faiss searching...
End faiss searching
INFO: 10.44.101.107:48620 - "POST /batch_search HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/applications.py", line 1134, in __call__
await super().__call__(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 125, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 111, in app
response = await f(request)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 391, in app
raw_response = await run_endpoint_function(
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 290, in run_endpoint_function
return await dependant.call(**values)
File "/data1/zzc/ReCall-re-search/scripts/serving/retriever_serving.py", line 90, in batch_search
results = retriever_list[retriever_idx].batch_search(query, top_n, return_score)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 70, in wrapper
results, scores = func(self, query=query, num=num, return_score=True)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 101, in wrapper
results, scores = func(self, query=query, num=num, return_score=True)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 208, in batch_search
return self._batch_search(*args, **kwargs)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 424, in _batch_search
results = load_docs(self.corpus, flat_idxs)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/utils.py", line 149, in load_docs
results = [corpus[int(idx)] for idx in doc_idxs]
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/utils.py", line 149, in <listcomp>
results = [corpus[int(idx)] for idx in doc_idxs]
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2876, in __getitem__
return self._getitem(key)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2857, in _getitem
pa_subtable = query_table(self._data, key, indices=self._indices)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 612, in query_table
_check_valid_index_key(key, size)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 552, in _check_valid_index_key
raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
IndexError: Invalid key: 339337 is out of bounds for size 14406
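
One possible reading of the last frames: the FAISS search returned a row id (339337) that is larger than the number of rows in the loaded corpus (14406), so `corpus[int(idx)]` in `load_docs` fails. The following is only a minimal sketch of a bounds check that would make this visible before the lookup. The function names `find_out_of_range_ids` and `load_docs_checked` are hypothetical and are not part of FlashRAG; the corpus is assumed to behave like a plain sequence.

```python
from typing import Iterable, List, Sequence


def find_out_of_range_ids(doc_idxs: Iterable[int], corpus_size: int) -> List[int]:
    """Return the ids that cannot index a corpus with `corpus_size` rows.

    Negative ids are reported as well, since FAISS uses -1 as a
    placeholder when it finds fewer neighbours than requested.
    """
    return [int(i) for i in doc_idxs if not 0 <= int(i) < corpus_size]


def load_docs_checked(corpus: Sequence, doc_idxs: Iterable[int]) -> list:
    """Hypothetical variant of `load_docs` that fails with a clearer message."""
    idxs = [int(i) for i in doc_idxs]
    bad = find_out_of_range_ids(idxs, len(corpus))
    if bad:
        raise ValueError(
            f"{len(bad)} retrieved id(s) fall outside the corpus "
            f"(size {len(corpus)}), e.g. {bad[0]}; the index and the corpus "
            "may not belong together"
        )
    return [corpus[i] for i in idxs]
```

With the numbers from the traceback, `find_out_of_range_ids([339337], 14406)` reports the offending id instead of raising deep inside `datasets`.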

The parameters I used to start the Retriever service are as follows:

[Image]

The index path was downloaded from https://www.modelscope.cn/datasets/hhjinjiajie/FlashRAG_Dataset/file/view/master?id=47985&status=2&fileName=retrieval_corpus%252Fwiki18_100w_e5_index.zip

The corpus path was downloaded from https://huggingface.co/datasets/RUC-NLPIR/FlashRAG_datasets/tree/main/retrieval-corpus
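
Because the index and the corpus come from two different download locations, one thing that may be worth checking is whether they describe the same set of passages. The sketch below compares the number of vectors in the index with the number of records in the corpus. It assumes the corpus is a JSON Lines file with one passage per line, and that the vector count is read separately, for example from the `ntotal` attribute of the object returned by `faiss.read_index`. The helper names are hypothetical.

```python
from typing import Iterable


def count_jsonl_records(lines: Iterable[str]) -> int:
    """Count non-empty lines, assuming one JSON record per line."""
    return sum(1 for line in lines if line.strip())


def sizes_consistent(index_ntotal: int, corpus_rows: int) -> bool:
    """True when the index holds exactly one vector per corpus row."""
    return index_ntotal == corpus_rows


def describe(index_ntotal: int, corpus_rows: int) -> str:
    """Summarise the comparison in one human-readable line."""
    if sizes_consistent(index_ntotal, corpus_rows):
        return f"index and corpus both contain {corpus_rows} entries"
    return (
        f"index has {index_ntotal} vectors but corpus has {corpus_rows} rows; "
        "they were probably not built from the same file"
    )
```

A possible use, with the path as a placeholder: open the corpus with `open(corpus_path, encoding="utf-8")`, pass the file object to `count_jsonl_records`, and compare the result with the index's `ntotal`. A corpus of 14406 rows paired with an index that returns ids above 339337 would be reported as a mismatch.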

The training parameters are as follows:

bash train.sh \
    --train_batch_size 8 \
    --ppo_mini_batch_size 8 \
    --apply_chat True \
    --prompt_template_name re_search_template_sys \
    --actor_model_path /data1/Qwen2.5-7B-chat \
    --search_url http://10.44.101.107:6324 \
    --project_name re-search \
    --experiment_name test1 \
    --nnodes 1 \
    --n_gpus_per_node 4 \
    --save_freq 5 \
    --test_freq 5 \
    --total_epochs 1 \
    --wandb_api_key None \
    --save_path /data1/zzc/ReCall-re-search/ckpts \
    --train_files /data1/zzc/ReCall-re-search/data/musique/train.parquet \
    --test_files /data1/zzc/ReCall-re-search/data/musique/test.parquet

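Both the training data and the test data are the musique dataset, processed as described in the README.

Could you take a look at this problem when you have time and tell us how to solve it?
Thank you.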
