[BugFix] fixing stream interval > 1 will cause tool call bug #31778

MrIceCreamMan · 2026-01-06T06:09:49Z

Purpose

Fix the issue #31501

What is wrong

Summary

When the stream interval is large enough for a single delta_text to contain {full_tool_call}<tool_end_token>, the parser fails with Tool: None, Args: ''
When a single delta_text contains {part_of_tool_call}<tool_end_token>, the parser ends up with Tool: list_directory, Args: '{}'

Detailed tracing

In the test Python script, the tool call has a total of 20 tokens (including the start and end tokens).
Let's see what happens when the interval is set to 20. The counters in the log stand for these four values:

            prev_tool_start_count = previous_text.count(self.tool_call_start_token)
            prev_tool_end_count = previous_text.count(self.tool_call_end_token)
            cur_tool_start_count = current_text.count(self.tool_call_start_token)
            cur_tool_end_count = current_text.count(self.tool_call_end_token)

and here is the log

This parser is called
INFO 01-04 21:42:33 [hermes_tool_parser.py:200] delta_text: <tool_call>
INFO 01-04 21:42:33 [hermes_tool_parser.py:201] delta_token_ids: [151657]
INFO 01-04 21:42:33 [hermes_tool_parser.py:216] counters: 0, 0, 1, 0
INFO 01-04 21:42:33 [hermes_tool_parser.py:264] Starting on a new tool 0
INFO 01-04 21:42:33 [hermes_tool_parser.py:327] Parsed tool call None

This parser is called
INFO 01-04 21:42:34 [hermes_tool_parser.py:200] delta_text: 
INFO 01-04 21:42:34 [hermes_tool_parser.py:200] {"name": "list_directory", "arguments": {"dir": "/src"}}
INFO 01-04 21:42:34 [hermes_tool_parser.py:200] </tool_call>
INFO 01-04 21:42:34 [hermes_tool_parser.py:201] delta_token_ids: [198, 4913, 606, 788, 330, 1607, 14846, 497, 330, 16370, 788, 5212, 3741, 788, 3521, 3548, 95642, 151658, 151645]
INFO 01-04 21:42:34 [hermes_tool_parser.py:216] counters: 1, 0, 1, 1
INFO 01-04 21:42:34 [hermes_tool_parser.py:228] tool_call_end_token in delta_text
INFO 01-04 21:42:34 [hermes_tool_parser.py:282] attempting to close tool call, but no tool call

There are 2 iterations in total. In the 1st, delta_text is just <tool_call>. In the 2nd, it is \n{"name": "list_directory", "arguments": {"dir": "/src"}}\n</tool_call>. The \n characters make the content of delta_text appear on 3 lines in the log, but it is all one delta_text.
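The four counters in this log can be reproduced offline. The following is a minimal sketch, not the parser's code: the helper name `counters` is hypothetical, and it assumes `current_text` is simply `previous_text` plus `delta_text`.

```python
TOOL_CALL_START = "<tool_call>"
TOOL_CALL_END = "</tool_call>"


def counters(previous_text: str, delta_text: str) -> tuple[int, int, int, int]:
    """Return (prev_start, prev_end, cur_start, cur_end) for one streaming step."""
    current_text = previous_text + delta_text  # assumption: text only ever grows
    return (
        previous_text.count(TOOL_CALL_START),
        previous_text.count(TOOL_CALL_END),
        current_text.count(TOOL_CALL_START),
        current_text.count(TOOL_CALL_END),
    )


# The two deltas seen with --stream-interval 20 in the log above.
deltas = [
    "<tool_call>",
    '\n{"name": "list_directory", "arguments": {"dir": "/src"}}\n</tool_call>',
]

previous_text = ""
for delta_text in deltas:
    print(counters(previous_text, delta_text))
    previous_text += delta_text
```

The second step is the interesting one: the end token shows up while the previous text holds only the start token, so the parser has to open, fill, and close the tool call within a single delta.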

Because the 1st iteration never updated self.prev_tool_call_arr, the 2nd iteration logs "attempting to close tool call, but no tool call". This is an easy fix: append an empty {} to it whenever we create a new tool call, which gets us past the first null check in # case -- the current tool call is being closed. But remember that our interval value (20) can hold all 19 remaining tokens, so the tool call content arrives at the same time as the </tool_call> token. Therefore self.prev_tool_call_arr has not yet been updated with real content, and diff is empty. Then we fall through to

current_tool_call = (
    partial_json_parser.loads(tool_call_portion or "{}", flags)
    if tool_call_portion
    else None
)
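The effect of the empty {} placeholder can be illustrated with a toy state object. This is only a sketch: `ToyParserState`, `start_new_tool`, and `can_close_tool` are hypothetical names, and the emptiness check is an assumed shape of the null check that logs "attempting to close tool call, but no tool call".

```python
class ToyParserState:
    """Toy stand-in for the parser's streaming state (not the vLLM class)."""

    def __init__(self, init_placeholder: bool) -> None:
        self.init_placeholder = init_placeholder
        self.current_tool_id = -1
        self.prev_tool_call_arr: list[dict] = []

    def start_new_tool(self) -> None:
        self.current_tool_id += 1
        if self.init_placeholder:
            # The proposed fix: reserve a slot as soon as the tool call starts.
            self.prev_tool_call_arr.append({})

    def can_close_tool(self) -> bool:
        # Assumed null check: closing is refused when nothing was recorded.
        return len(self.prev_tool_call_arr) > 0


for init_placeholder in (False, True):
    state = ToyParserState(init_placeholder)
    state.start_new_tool()  # 1st delta: "<tool_call>"
    # 2nd delta already carries "</tool_call>", so closing is attempted now.
    print(init_placeholder, state.can_close_tool())
```

Without the placeholder, the close attempt is refused on the second delta. With it, the parser gets past the check and continues to the parsing step.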

Luckily, when </tool_call> is in delta_text, tool_call_portion always gives good results (you can verify it yourself). Because this is only the 2nd iteration, and the 1st only started the call, we have not sent the function name yet, so we enter if not self.current_tool_name_sent:. Here is another oversight: we did not expect the function name and </tool_call> to arrive in the same delta_text, so this branch never handles the tool call arguments. Easy fix, right? Just add them and update the state properly! But what about partial arguments? Luckily, DeltaFunctionCall handles partial arguments pretty well, so we don't have to worry about them here.
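To make the missing step concrete, here is a simplified sketch of emitting the name and the arguments together when both arrive in one delta. It uses the standard `json` module instead of `partial_json_parser`, returns a plain dict instead of a `DeltaFunctionCall`, and the helper names `tool_call_portion` and `first_delta` are hypothetical.

```python
import json

TOOL_CALL_START = "<tool_call>"
TOOL_CALL_END = "</tool_call>"


def tool_call_portion(current_text: str) -> str:
    """Return the text between the last start token and the end token."""
    after_start = current_text.split(TOOL_CALL_START)[-1]
    return after_start.split(TOOL_CALL_END)[0].strip()


def first_delta(current_text: str) -> dict:
    """Build the first delta for a tool call, including arguments if present."""
    call = json.loads(tool_call_portion(current_text))
    delta = {"name": call["name"], "arguments": ""}
    if "arguments" in call:
        # The step the name-sending branch skipped: send the arguments too.
        delta["arguments"] = json.dumps(call["arguments"], ensure_ascii=False)
    return delta


current_text = (
    "<tool_call>"
    '\n{"name": "list_directory", "arguments": {"dir": "/src"}}\n'
    "</tool_call>"
)
print(first_delta(current_text))
```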

So far we have covered the case where delta_text contains {full_tool_call}<tool_end_token>. Now we need to consider the case of {partial_tool_call}<tool_end_token>.

Here is the new log

This parser is called
INFO 01-06 00:34:21 [hermes_tool_parser.py:200] delta_text: <tool_call>
INFO 01-06 00:34:21 [hermes_tool_parser.py:201] delta_token_ids: [151657]
INFO 01-06 00:34:21 [hermes_tool_parser.py:214] counters: 0 0 1 0
INFO 01-06 00:34:21 [hermes_tool_parser.py:265] Starting on a new tool 0
INFO 01-06 00:34:21 [hermes_tool_parser.py:325] Parsed tool call None
This parser is called
INFO 01-06 00:34:21 [hermes_tool_parser.py:200] delta_text: 
INFO 01-06 00:34:21 [hermes_tool_parser.py:200] {"name": "list_directory", "arguments
INFO 01-06 00:34:21 [hermes_tool_parser.py:201] delta_token_ids: [198, 4913, 606, 788, 330, 1607, 14846, 497, 330, 16370]
INFO 01-06 00:34:21 [hermes_tool_parser.py:214] counters: 1 0 1 0
INFO 01-06 00:34:21 [hermes_tool_parser.py:325] Parsed tool call {'name': 'list_directory'}
This parser is called
INFO 01-06 00:34:21 [hermes_tool_parser.py:200] delta_text: ": {"dir": "/src"}}
INFO 01-06 00:34:21 [hermes_tool_parser.py:200] </tool_call>
INFO 01-06 00:34:21 [hermes_tool_parser.py:201] delta_token_ids: [788, 5212, 3741, 788, 3521, 3548, 95642, 151658, 151645]
INFO 01-06 00:34:21 [hermes_tool_parser.py:214] counters: 1 0 1 1
INFO 01-06 00:34:21 [hermes_tool_parser.py:229] tool_call_end_token in delta_text
INFO 01-06 00:34:21 [hermes_tool_parser.py:282] attempting to close tool call, but no tool call
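The chunk boundaries in both logs are consistent with a simple batching rule: the first token is emitted alone, and the rest are flushed in groups of the interval size. The sketch below simulates that rule on the token IDs from the logs. The rule is an assumption inferred from these logs, not taken from vLLM's code, and `chunk_tokens` is a hypothetical helper.

```python
def chunk_tokens(token_ids: list[int], interval: int) -> list[list[int]]:
    """Split tokens into deltas: first token alone, then groups of `interval`."""
    if not token_ids:
        return []
    chunks = [token_ids[:1]]
    for i in range(1, len(token_ids), interval):
        chunks.append(token_ids[i : i + interval])
    return chunks


# The 20 token IDs of the tool call, copied from the logs above.
TOKEN_IDS = [
    151657, 198, 4913, 606, 788, 330, 1607, 14846, 497, 330,
    16370, 788, 5212, 3741, 788, 3521, 3548, 95642, 151658, 151645,
]

for interval in (10, 20):
    sizes = [len(chunk) for chunk in chunk_tokens(TOKEN_IDS, interval)]
    print(interval, sizes)
```

A group size of 10 reproduces the three deltas of the second log, and a group size of 20 reproduces the two deltas of the first.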

By now the message "attempting to close tool call, but no tool call" should look familiar. Briefly: the 1st iteration returns None, the 2nd returns the function name without any arguments from if not self.current_tool_name_sent:, and the 3rd finds that prev_tool_call_arr was never initialized. None of this is new. What I want to point out is what happens once we properly initialize prev_tool_call_arr with an empty {}.

The 3rd iteration becomes this

INFO 01-06 00:46:31 [hermes_tool_parser.py:200] delta_text: ": {"dir": "/src"}}
INFO 01-06 00:46:31 [hermes_tool_parser.py:200] </tool_call>
INFO 01-06 00:46:31 [hermes_tool_parser.py:201] delta_token_ids: [788, 5212, 3741, 788, 3521, 3548, 95642, 151658, 151645]
INFO 01-06 00:46:31 [hermes_tool_parser.py:214] counters: 1 0 1 1
INFO 01-06 00:46:31 [hermes_tool_parser.py:229] tool_call_end_token in delta_text
INFO 01-06 00:46:31 [hermes_tool_parser.py:326] Parsed tool call {'name': 'list_directory', 'arguments': {'dir': '/src'}}
INFO 01-06 00:46:31 [hermes_tool_parser.py:372] Trying to parse current tool call with ID 0
INFO 01-06 00:46:31 [hermes_tool_parser.py:388] diffing old arguments: None
INFO 01-06 00:46:31 [hermes_tool_parser.py:389] against new ones: {'dir': '/src'}
INFO 01-06 00:46:31 [hermes_tool_parser.py:429] finding ": {"dir": "/src"}} in {"dir": "/src"}}

Notice the line finding ": {"dir": "/src"}} in {"dir": "/src"}}. The two delta_text values are {"name": "list_directory", "arguments and ": {"dir": "/src"}}, so the 3rd iteration's delta_text begins with the structural JSON characters ": which are not part of the arguments. The substring check therefore fails and the parser returns None. After some analysis, I think the best fix is to always use cur_arguments_json directly rather than building arguments_delta. I believe arguments_delta was introduced to avoid json.dumps emitting autocompleted JSON, which is why the raw JSON was taken from delta_text. But incomplete JSON gets overwritten by a later DeltaMessage anyway, so autocompleted JSON should not be a problem here. Let me know if you have other suggestions.
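The failing substring check can be reproduced with the two strings from the log. This is a simplified sketch rather than the parser's actual logic: `arguments_delta_by_search` and `arguments_delta_direct` are hypothetical helpers that contrast searching for the raw delta with serializing the parsed arguments directly.

```python
import json
from typing import Optional


def arguments_delta_by_search(delta_text: str, haystack: str) -> Optional[str]:
    """Locate the raw delta inside the serialized arguments, or give up."""
    idx = haystack.rfind(delta_text)
    if idx < 0:
        return None  # the delta straddles a boundary, so nothing is emitted
    return haystack[: idx + len(delta_text)]


def arguments_delta_direct(cur_arguments: dict) -> str:
    """Serialize the parsed arguments without consulting delta_text."""
    return json.dumps(cur_arguments, ensure_ascii=False)


# Both strings are copied from the "finding ... in ..." log line.
delta_text = '": {"dir": "/src"}}'
haystack = '{"dir": "/src"}}'

print(arguments_delta_by_search(delta_text, haystack))
print(arguments_delta_direct({"dir": "/src"}))
```

The search gives up because the delta begins with `": `, which belongs to the enclosing tool call object rather than to the arguments. Serializing the parsed arguments does not depend on where the chunk boundary fell.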

Test Plan

E2E test: ./tests/v1/e2e/test_streaming_tool_calls.py

Manual test:
I ran this command with different --stream-interval values [1, 8, 9, 10, 18, 19, 20] on my local machine.
This model requires less than 8 GB of VRAM, so you should be able to run it on your local machine too!

vllm serve Qwen/Qwen2.5-1.5B-Instruct \
--stream-interval 18 \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--max-model-len 2048

Then run this Python script from ktsaou (special thanks):

#!/usr/bin/env python3
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Reproduction script for vLLM issue #31501"""

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

tools = [
    {
        "type": "function",
        "function": {
            "name": "list_directory",
            "description": "List contents of a directory.",
            "parameters": {
                "type": "object",
                "properties": {
                    "dir": {"type": "string", "description": "Directory path"}
                },
                "required": ["dir"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    messages=[
        {
            "role": "system",
            "content": "Use the list_directory tool when asked about files.",
        },
        {"role": "user", "content": "List files in the src folder"},
    ],
    tools=tools,
    stream=True,
)

# Accumulate streamed tool call
accumulated_args = ""
tool_name = None

for chunk in response:
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        for tc in delta.tool_calls:
            if tc.function.name:
                tool_name = tc.function.name
            if tc.function.arguments:
                accumulated_args += tc.function.arguments

print(f"Tool: {tool_name}, Args: {accumulated_args!r}")
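The accumulation loop in the script can also be exercised without a running server. The sketch below feeds it hand-built deltas. `accumulate` is a hypothetical helper, and the `SimpleNamespace` objects only mimic the `function.name` and `function.arguments` attributes the script reads. They are not the OpenAI client's types.

```python
from types import SimpleNamespace


def accumulate(tool_call_deltas) -> tuple:
    """Merge streamed tool call deltas into (tool_name, accumulated_args)."""
    tool_name = None
    accumulated_args = ""
    for tc in tool_call_deltas:
        if tc.function is None:
            continue  # defensive: skip deltas without a function payload
        if tc.function.name:
            tool_name = tc.function.name
        if tc.function.arguments:
            accumulated_args += tc.function.arguments
    return tool_name, accumulated_args


def fake_delta(name=None, arguments=None):
    """Build a stand-in for one streamed tool call delta."""
    return SimpleNamespace(function=SimpleNamespace(name=name, arguments=arguments))


# Name and arguments in one delta (large interval) vs. split across deltas.
combined = [fake_delta("list_directory", '{"dir": "src"}')]
split = [fake_delta("list_directory"), fake_delta(arguments='{"dir": '), fake_delta(arguments='"src"}')]

print(accumulate(combined))
print(accumulate(split))
```

Both sequences should accumulate to the same name and argument string, which is the client-side behavior the test results below check for across interval values.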

Test Result

Before the fix:

Setting	Output	Status
`--stream-interval 20`	Tool: None, Args: `''`	BUG
`--stream-interval 1` (default)	Tool: list_directory, Args: `'{"dir": "src"}'`	OK

After the fix:

Setting	Output	Status
`--stream-interval 1` (default)	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 8`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 9`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 10`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 18`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 19`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK
`--stream-interval 20`	Tool: list_directory, Args: `'{"dir": "src"}'`	OK

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>

gemini-code-assist

Code Review

This pull request addresses a critical bug in the hermes_tool_parser that occurs when stream-interval > 1, causing tool call parsing to fail. The changes correctly handle scenarios where a single streaming delta contains multiple parts of a tool call, such as the function name and arguments, or partial arguments and the end token. The fixes include proper state initialization for new tool calls, preventing premature parsing termination, and correcting the logic for handling argument chunks. The changes are well-reasoned and supported by detailed testing, and I find them to be a solid improvement to the parser's robustness.

chaunceyjiang

Could you add an e2e test?

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>

fixing stream interval > 1 will cause tool call bug

0f866f7

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>

MrIceCreamMan requested review from aarnphm and chaunceyjiang as code owners January 6, 2026 06:09

gemini-code-assist bot reviewed Jan 6, 2026

chaunceyjiang reviewed Jan 6, 2026

chaunceyjiang self-assigned this Jan 6, 2026

added e2e test to verify the fix

cb39860

Signed-off-by: Zhuohao Yang <zy242@cornell.edu>

mergify bot added the v1 label Jan 7, 2026

MrIceCreamMan changed the title ~~fixing stream interval > 1 will cause tool call bug~~ [BugFix] fixing stream interval > 1 will cause tool call bug Jan 7, 2026

Merge branch 'main' into zy242-fix-stream-interval

073ed9d
