Description
Hello authors,
While reproducing ReSearch, I found that the Retriever Service keeps raising the following error:
Begin faiss searching...
End faiss searching
INFO: 10.44.101.107:48620 - "POST /batch_search HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/applications.py", line 1134, in __call__
await super().__call__(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 716, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 736, in app
await route.handle(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/routing.py", line 290, in handle
await self.app(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 125, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 111, in app
response = await f(request)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 391, in app
raw_response = await run_endpoint_function(
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/fastapi/routing.py", line 290, in run_endpoint_function
return await dependant.call(**values)
File "/data1/zzc/ReCall-re-search/scripts/serving/retriever_serving.py", line 90, in batch_search
results = retriever_list[retriever_idx].batch_search(query, top_n, return_score)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 70, in wrapper
results, scores = func(self, query=query, num=num, return_score=True)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 101, in wrapper
results, scores = func(self, query=query, num=num, return_score=True)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 208, in batch_search
return self._batch_search(*args, **kwargs)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/retriever.py", line 424, in _batch_search
results = load_docs(self.corpus, flat_idxs)
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/utils.py", line 149, in load_docs
results = [corpus[int(idx)] for idx in doc_idxs]
File "/data1/zzc/ReCall-re-search/src/flashrag/retriever/utils.py", line 149, in <listcomp>
results = [corpus[int(idx)] for idx in doc_idxs]
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2876, in __getitem__
return self._getitem(key)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2857, in _getitem
pa_subtable = query_table(self._data, key, indices=self._indices)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 612, in query_table
_check_valid_index_key(key, size)
File "/root/anaconda3/envs/re-search/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 552, in _check_valid_index_key
raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
IndexError: Invalid key: 339337 is out of bounds for size 14406
The parameters I used to start the Retriever service are as follows:
The index path points to the file downloaded from https://www.modelscope.cn/datasets/hhjinjiajie/FlashRAG_Dataset/file/view/master?id=47985&status=2&fileName=retrieval_corpus%252Fwiki18_100w_e5_index.zip
The corpus path points to the file downloaded from https://huggingface.co/datasets/RUC-NLPIR/FlashRAG_datasets/tree/main/retrieval-corpus
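The IndexError in the traceback means the index returned document id 339337 while the loaded corpus has only 14406 rows, which suggests the index and the corpus may not have been built from the same data. A minimal sketch of a sanity check follows. It is not part of the ReSearch or FlashRAG code: `out_of_range_ids` and `sizes_match` are hypothetical helper names, and the batch of ids is made up except for 339337, which comes from the log.

```python
from typing import Iterable, List


def out_of_range_ids(doc_ids: Iterable[int], corpus_size: int) -> List[int]:
    """Return the retrieved ids that cannot be looked up in a corpus of this size."""
    return [int(i) for i in doc_ids if not 0 <= int(i) < corpus_size]


def sizes_match(index_size: int, corpus_size: int) -> bool:
    """True when the index holds exactly one vector per corpus row."""
    return index_size == corpus_size


# Values taken from the traceback: the corpus has 14406 rows and the
# index returned id 339337.
corpus_size = 14406
retrieved_ids = [12, 339337, 14405]  # hypothetical batch; only 339337 is from the log

print(out_of_range_ids(retrieved_ids, corpus_size))

# An index that returns id 339337 must hold at least 339338 vectors.
# In a real check, index_size would be read from the FAISS index
# (`index.ntotal`) and corpus_size from `len(corpus)`.
min_index_size = 339337 + 1
print(sizes_match(min_index_size, corpus_size))
```

If the two sizes differ, the corpus file and the index file were most likely produced from different corpora, and the lookup in `load_docs` will fail for any id beyond the corpus length.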
The training parameters are as follows:
bash train.sh \
    --train_batch_size 8 \
    --ppo_mini_batch_size 8 \
    --apply_chat True \
    --prompt_template_name re_search_template_sys \
    --actor_model_path /data1/Qwen2.5-7B-chat \
    --search_url http://10.44.101.107:6324 \
    --project_name re-search \
    --experiment_name test1 \
    --nnodes 1 \
    --n_gpus_per_node 4 \
    --save_freq 5 \
    --test_freq 5 \
    --total_epochs 1 \
    --wandb_api_key None \
    --save_path /data1/zzc/ReCall-re-search/ckpts \
    --train_files /data1/zzc/ReCall-re-search/data/musique/train.parquet \
    --test_files /data1/zzc/ReCall-re-search/data/musique/test.parquet
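Both the training data and the test data are the MuSiQue dataset, processed as described in the README.
Could you please take a look at this problem when you have time and tell us how to resolve it?
Thank you.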