How to handle large tool responses? #33156
Lin-jun-xiang asked this question in Q&A (unanswered)
I'm currently developing an agent where the tool response can sometimes be extremely large (tens of thousands of tokens).
Right now, I always add it directly to the conversation, but that makes the next round of dialogue very slow, because the entire history, including a massive number of tool tokens, has to be fed back to the LLM. That said, it's still better than not storing the tool response in the history at all. What suggestions do you have for storing and using these long-context tool responses?
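For concreteness, here is a minimal sketch of what "add it directly to the conversation" means in my case. The model name and the dummy `search_logs` tool are placeholders for my actual setup, and `langchain_core` / `langchain_openai` are assumed:

```python
# Sketch of the current approach: the raw tool output (possibly tens of
# thousands of tokens) is stored verbatim as a ToolMessage, and the whole
# history is re-sent on every turn. Model name and tool are placeholders.
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def search_logs(query: str) -> str:
    """Dummy tool standing in for one that returns a huge payload."""
    return "log line about X\n" * 20_000


llm = ChatOpenAI(model="gpt-4o").bind_tools([search_logs])  # placeholder model

history = [HumanMessage("Find everything about X in the logs")]
ai_msg = llm.invoke(history)
history.append(ai_msg)

for call in ai_msg.tool_calls:                     # run each requested tool call
    raw_output = search_logs.invoke(call["args"])  # full, uncompressed output
    history.append(ToolMessage(content=raw_output, tool_call_id=call["id"]))

# The next turn re-feeds the entire history, huge ToolMessage included,
# which is what makes every later round slow and expensive.
reply = llm.invoke(history + [HumanMessage("Now summarize the errors")])
```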
I first tried using the ReAct agent and adding only the agent's responses to the history, without the tool responses. That runs quickly, but the agent isn't intelligent enough in multi-turn conversations. So I switched to also adding the tool responses as ToolMessages to the conversation history. That makes the agent a bit smarter, but it results in extremely long response delays and massive costs.
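Roughly what that second variant looks like, using LangGraph's prebuilt ReAct agent; the tool body and model name are again placeholders:

```python
# Sketch of the second variant: a prebuilt ReAct agent whose full message
# history (ToolMessages included) is carried between turns.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def search_logs(query: str) -> str:
    """Dummy tool standing in for one that returns a huge payload."""
    return "log line about X\n" * 20_000


agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools=[search_logs])

history: list = []  # persisted across turns


def chat(user_input: str) -> str:
    global history
    result = agent.invoke({"messages": history + [("user", user_input)]})
    history = result["messages"]  # every ToolMessage is kept: smarter, but slow and costly
    return result["messages"][-1].content
```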
Additionally, I've tried summarizing and compressing oversized tool responses with an LLM before adding them to the history, but the compression step itself takes a long time, which significantly increases the overall latency.
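That compression attempt looks roughly like this: a second model summarizes any tool output above a size threshold before it is stored as a ToolMessage. The threshold value and the summarizer model are placeholders; the extra LLM call per oversized output is exactly where the added latency comes from:

```python
# Sketch of the compression attempt: summarize oversized tool outputs with a
# second LLM before storing them as ToolMessages. Threshold and summarizer
# model are placeholders for my real setup.
from langchain_core.messages import ToolMessage
from langchain_openai import ChatOpenAI

summarizer = ChatOpenAI(model="gpt-4o-mini")  # placeholder: a smaller/cheaper model
MAX_CHARS = 8_000                             # placeholder threshold


def compress_tool_output(raw_output: str, tool_call_id: str) -> ToolMessage:
    """Summarize the tool output only when it exceeds the threshold."""
    if len(raw_output) <= MAX_CHARS:
        return ToolMessage(content=raw_output, tool_call_id=tool_call_id)

    summary = summarizer.invoke(
        "Summarize the following tool output, preserving all concrete facts, "
        "identifiers, and numbers a later question might need:\n\n" + raw_output
    ).content
    return ToolMessage(content=summary, tool_call_id=tool_call_id)
```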