fix(openai): tool_calls dropped when content chunk precedes tool deltas in stream #1611
Open
STHITAPRAJNAS wants to merge 2 commits into langfuse:main from
Conversation
…tart

When CallbackHandler.on_chain_start fires at the root of a chain (parent_run_id is None), propagate_attributes was called without a trace_name, so the trace name was determined by whichever internal node's on_chain_start happened to fire first. On LangGraph resume (e.g. after a human-in-the-loop interrupt) that node is often an internal subgraph whose name is "", which produces a blank trace name.

The fix passes span_name — the name already computed from the serialized runnable and kwargs — as trace_name to propagate_attributes. This ensures the trace name is always pinned to the root chain's name regardless of execution order on resume.

As a companion change, _parse_langfuse_trace_attributes now also reads a langfuse_trace_name key from LangChain metadata, consistent with the existing langfuse_session_id / langfuse_user_id / langfuse_tags pattern. When present, metadata langfuse_trace_name takes priority over the computed span_name. The key is also added to the strip-list in _strip_langfuse_keys_from_dict so it does not leak into observation metadata.

Fixes langfuse#1602
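The selection and stripping behaviour described above can be sketched as follows. This is a hedged stand-in, not the actual handler code: `pick_trace_name` and `strip_langfuse_keys` are hypothetical names introduced here for illustration; only the `langfuse_*` key names come from the change itself.

```python
# Sketch of the trace-name priority and metadata-stripping logic
# (hypothetical helpers, not the real Langfuse internals).

LANGFUSE_CONTROL_KEYS = {
    "langfuse_session_id",
    "langfuse_user_id",
    "langfuse_tags",
    "langfuse_trace_name",  # newly added by this change
}

def pick_trace_name(metadata, span_name):
    """Metadata langfuse_trace_name wins; otherwise pin the root span's name."""
    metadata = metadata or {}
    return metadata.get("langfuse_trace_name") or span_name

def strip_langfuse_keys(metadata):
    """Drop langfuse_* control keys so they don't leak into observation metadata."""
    return {k: v for k, v in metadata.items() if k not in LANGFUSE_CONTROL_KEYS}
```

With this ordering, a resumed LangGraph run whose first firing node has an empty name can no longer blank out the trace name: the root chain's computed span name is the fallback, and an explicit `langfuse_trace_name` in metadata overrides both.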
…as in stream
get_response_for_chat() built its return value with a Python `or` chain:

    return completion["content"] or (completion["tool_calls"] and {...}) or None
Models like Qwen and DeepSeek emit a non-empty content chunk (often "\n\n"
or a brief reasoning prefix) before streaming the tool-call deltas. Because
a non-empty string is truthy, the `or` chain short-circuited at the content
branch and returned just the whitespace string, silently discarding all
accumulated tool_call data.
Fix: check tool_calls first. When tool_calls are present, return them as the
primary output. If the content is non-whitespace (e.g. a genuine reasoning
preamble) it is included alongside the tool_calls rather than dropped.
The function_call (legacy OpenAI format) and plain content paths are
unchanged.
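The corrected ordering can be sketched like this, again against a simplified completion shape (a hedged stand-in, not the exact patched code):

```python
# Sketch of the fixed logic: tool_calls first, content kept only when meaningful.

def get_response_fixed(completion):
    content = completion.get("content") or ""
    tool_calls = completion.get("tool_calls")
    if tool_calls:
        # Tool calls are the primary output.
        result = {"tool_calls": tool_calls}
        if content.strip():
            # A genuine reasoning preamble is preserved, pure whitespace is not.
            result["content"] = content
        return result
    if completion.get("function_call"):
        # Legacy OpenAI function_call path, behaviour unchanged.
        return {"function_call": completion["function_call"]}
    return content or None

# Whitespace-only content no longer masks the tool calls:
print(get_response_fixed({"content": "\n\n", "tool_calls": [{"id": "c1"}]}))
# {'tool_calls': [{'id': 'c1'}]}

# A real preamble is kept alongside them:
print(get_response_fixed({"content": "Checking weather.", "tool_calls": [{"id": "c1"}]}))
# {'tool_calls': [{'id': 'c1'}], 'content': 'Checking weather.'}
```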
Fixes langfuse/langfuse#12490
Problem

When using the Langfuse-wrapped OpenAI client with models that emit a content chunk before streaming tool-call deltas (Qwen, DeepSeek, and other reasoning models often send `"\n\n"` or a short preamble first), the generation output logged by Langfuse shows only the content string — all tool_call data is silently dropped.

Root cause: `get_response_for_chat()` inside `_extract_streamed_openai_response` used a Python `or` chain to decide what to return. Because `"\n\n"` is a truthy string, the expression short-circuited at the first branch and returned the whitespace string, discarding the accumulated `tool_calls`.

Fixes #12490 (tracked in langfuse/langfuse#12490).
Fix

Check `tool_calls` before `content`. When tool calls are present they become the primary output. Non-whitespace content (e.g. a genuine reasoning preamble) is preserved alongside the tool calls rather than being discarded. The `function_call` (legacy format) and plain-content paths are unchanged.

Tests
Added `tests/test_openai_streaming_unit.py` — 9 unit tests, no API calls:

- `test_tool_calls_not_dropped_when_whitespace_content_precedes_them` — `"\n\n"` before tool deltas
- `test_whitespace_only_content_not_included_in_result`
- `test_meaningful_content_preserved_alongside_tool_calls`
- `test_non_whitespace_content_before_tool_calls_preserves_both`
- `test_plain_text_response_returned_as_string`
- `test_empty_stream_returns_none`
- `test_tool_calls_returned_without_content`
- `test_multiple_tool_calls_all_returned`
- `test_function_call_returned_when_no_tool_calls`

Disclaimer: Experimental PR review
Greptile Summary

This PR bundles two independent fixes: a bug fix for the OpenAI streaming path where `tool_calls` were silently dropped when a whitespace content chunk preceded tool-call deltas, and a new feature for the LangChain callback handler that allows users to pass `langfuse_trace_name` via metadata to control the propagated trace name.

- Bug fix (`langfuse/openai.py`): the original `get_response_for_chat()` used a Python `or` chain that short-circuited at the first truthy value, so a `"\n\n"` content chunk (common from Qwen/DeepSeek) caused all accumulated `tool_calls` to be discarded. The fix checks `tool_calls` first and preserves non-whitespace content alongside them when it is meaningful.
- Feature (`langfuse/langchain/CallbackHandler.py`): `_parse_langfuse_trace_attributes` now extracts `langfuse_trace_name` from metadata; `on_chain_start` forwards it to `propagate_attributes()` with `span_name` as a fallback via `or`; `_strip_langfuse_keys_from_dict` strips the key to prevent it leaking into stored metadata.
- Tests: new unit-test files (`tests/test_openai_streaming_unit.py` and `tests/test_langchain_callback_unit.py`) mock away all external API calls.
- Process note: the `langfuse_trace_name` feature is a non-trivial separate change that is only briefly mentioned. Consider splitting unrelated changes across separate PRs for easier review and bisectability.

Confidence Score: 5/5
Safe to merge — both fixes are correct, logically sound, and thoroughly covered by unit tests with no P0 or P1 issues found.
All findings are P2 (process/style). The OpenAI or-chain fix correctly prioritises tool_calls over content and handles the whitespace-only edge case cleanly. The LangChain trace_name feature wires correctly into the existing propagate_attributes API (which already accepts trace_name). Comprehensive unit tests cover the primary bug, edge cases, and regression scenarios.
No files require special attention — all four changed files are clean and well-tested.
Important Files Changed
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Stream chunk received] --> B{resource.type == 'chat'?}
    B -- No --> C[Accumulate completion text]
    B -- Yes --> D[Extract delta]
    D --> E{delta has content?}
    E -- Yes --> F[Append to completion content]
    E -- No --> G{delta has function_call?}
    G -- Yes --> H[Accumulate function_call]
    G -- No --> I{delta has tool_calls?}
    I -- Yes --> J[Accumulate tool_calls list]
    F --> L[get_response_for_chat]
    H --> L
    J --> L
    L --> M{completion tool_calls non-empty?}
    M -- Yes --> N[Return dict with tool_calls + non-whitespace content]
    M -- No --> O{completion function_call?}
    O -- Yes --> P[Return dict with function_call]
    O -- No --> Q[Return content or None]
```