[BugFix] fixing stream interval > 1 will cause tool call bug #31778
+157
−41
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Purpose
Fix the issue #31501
What is wrong
Summary
When the
stream-intervalis big enough to contain{full_tool_call}<tool_end_token>in onedelta_text, the parser will fail withTool: None, Args: ''When the
stream-intervalcontain{part_of_tool_call}<tool_end_token>in onedelta_text, the parse will withTool: list_directory, Args: '{}'Detailed tracing
In the test python script, the tool call had a total of 20 tokens (including start and end token).
Let's see what happens when interval is set to 20,
countersin the log stands for these fourand here is the log
There is a total of 2 iterations, in the 1st one, the
delta_textis just a<tool_call>. In the 2nd one, it is actually\n{"name": "list_directory", "arguments": {"dir": "/src"}}\n</tool_call>The\ncharacter makes the content ofdelta_textappear on 3 lines but actually they are in the samedelta_textBecause in the 1st iteration, we never updated
self.prev_tool_call_arr, so on the 2nd iteration, we saw the message "attempting to close tool call, but no tool call" This is an easy fix, that we just need to append an empty{}to it whenever we create a new tool call. This will help us pass the first null check in# case -- the current tool call is being closed.But remember our interval value (20) can contain all 19 tokens, which means we are getting the tool call content at the same time as the</tool_call>token. Therefore,self.prev_tool_call_arrwas not yet updated, anddiffwould be empty. Then we fall through toLuckily, when there is </tool_call> in the
delta_text,tool_call_portionwill always give good results (you can verify it yourself). Because we are on only the 2nd iteration where the 1st one only starts the call, we have not set the function name yet. So we enterif not self.current_tool_name_sent:Here is another oversight: we didn't expect function name and</tool_call>can be inside onedelta_text, so we never actually handle the tool call arguments in this section. Easy fix right? Just add them and update the states properly! But what about partial arguments? LuckilyDeltaFunctionCallhandles partial arguments pretty well and we don't have to worry about them here.So far, we have talked about the case for
delta_textcontains{full_tool_call}<tool_end_token>. Now we need to consider the case for{partial_tool_call}<tool_end_token>.Here is the new log
At this point, you should be familiar with this message "attempting to close tool call, but no tool call". Briefly: 1st iteration returns None, 2nd iteration returns function name without specifying arguments in
if not self.current_tool_name_sent:, and 3rd iteration findsprev_tool_call_arrwas never properly initialized. These are all old info we already know. The part I want to point out is suppose we properly initializedprev_tool_call_arrwith an empty{}. What happens next?The 3rd iteration becomes this
Notice the part
finding ": {"dir": "/src"}} in {"dir": "/src"}}. Because the 2delta_textare{"name": "list_directory", "argumentsand": {"dir": "/src"}}. The 3rd iterationdelta_textactually has structural JSON characters":which made the check fail and returns None. After some analysis, I think the best way to fix this is to always usecur_arguments_jsondirectly rather than creatingarguments_delta. I believe the original idea of havingarguments_deltais to avoid issues withjson.dumpsfunction giving autocompleted JSON and that's why we want to get the raw JSON from thedelta_text. But honestly, we will eventually overwrite incomplete JSON anyways in laterDeltaMessage. Autocompleted JSON shouldn't be a problem here. Let me know if you have any other suggestions.Test Plan
E2E test:
./tests/v1/e2e/test_streaming_tool_calls.pyManual test:
Running this command with different
stream-intervalvalues [1, 8, 9, 10, 18, 19, 20] on my local machine.This model only requires less than 8GB VRAM. You should be able to run it on your local machine!
and then run this python script from ktsaou (special thanks)
Test Result
Before the fix:
--stream-interval 20''--stream-interval 1(default)'{"dir": "src"}'After the fix:
--stream-interval 1(default)'{"dir": "src"}'--stream-interval 8(default)'{"dir": "src"}'--stream-interval 9(default)'{"dir": "src"}'--stream-interval 10(default)'{"dir": "src"}'--stream-interval 18(default)'{"dir": "src"}'--stream-interval 19(default)'{"dir": "src"}'--stream-interval 20(default)'{"dir": "src"}'Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.