fix(openai): tool_calls dropped when content chunk precedes tool deltas in stream#1611

Open
STHITAPRAJNAS wants to merge 2 commits into langfuse:main from STHITAPRAJNAS:fix/streaming-tool-calls-dropped-after-content-chunk

Conversation

@STHITAPRAJNAS commented Apr 4, 2026

Problem

When using the Langfuse-wrapped OpenAI client with models that emit a content chunk before streaming tool-call deltas (Qwen, DeepSeek, and other reasoning models often send "\n\n" or a short preamble first), the generation output logged by Langfuse shows only the content string — all tool_call data is silently dropped.

Root cause: `get_response_for_chat()` inside `_extract_streamed_openai_response` used a Python `or` chain to decide what to return:

return (
    completion["content"]                  # "\n\n" is truthy → short-circuits here
    or (completion["function_call"] and …)
    or (completion["tool_calls"] and …)    # never reached
    or None
)

Because "\n\n" is a truthy string, the expression short-circuited at the first branch and returned the whitespace string, discarding the accumulated tool_calls.
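The short-circuit is easy to reproduce in isolation. A minimal sketch, assuming an accumulator dict of the same shape as in the snippet above (the shape is illustrative, not the exact internal structure):

```python
# Minimal repro of the short-circuit, independent of Langfuse internals.
completion = {
    "content": "\n\n",  # whitespace preamble chunk emitted before tool deltas
    "function_call": None,
    "tool_calls": [{"name": "get_weather", "arguments": "{}"}],
}

# The buggy `or` chain: "\n\n" is a non-empty string, hence truthy,
# so evaluation stops at the first branch.
result = (
    completion["content"]
    or (completion["tool_calls"] and {"tool_calls": completion["tool_calls"]})
    or None
)

assert result == "\n\n"  # the accumulated tool_calls never reach the output
```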

Fixes langfuse/langfuse#12490.

Fix

Check tool_calls before content. When tool calls are present they become the primary output. Non-whitespace content (e.g. a genuine reasoning preamble) is preserved alongside the tool calls rather than being discarded. The function_call (legacy format) and plain-content paths are unchanged.

if completion["tool_calls"]:
    result = {
        "role": "assistant",
        "tool_calls": [{"function": data} for data in completion["tool_calls"]],
    }
    if completion["content"] and completion["content"].strip():
        result["content"] = completion["content"]
    return result

if completion["function_call"]:
    return {"role": "assistant", "function_call": completion["function_call"]}

return completion["content"] or None

Tests

Added tests/test_openai_streaming_unit.py — 9 unit tests, no API calls:

| Test | Scenario |
| --- | --- |
| test_tool_calls_not_dropped_when_whitespace_content_precedes_them | Primary bug: "\n\n" before tool deltas |
| test_whitespace_only_content_not_included_in_result | Leading whitespace is omitted from output |
| test_meaningful_content_preserved_alongside_tool_calls | Real preamble text is kept with tool_calls |
| test_non_whitespace_content_before_tool_calls_preserves_both | Multi-chunk preamble + tools |
| test_plain_text_response_returned_as_string | No regression on plain content |
| test_empty_stream_returns_none | Empty stream |
| test_tool_calls_returned_without_content | Pure tool call, no content |
| test_multiple_tool_calls_all_returned | Multiple sequential tool calls |
| test_function_call_returned_when_no_tool_calls | Legacy function_call path |
9 passed in 1.54s
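A self-contained sketch of what the primary regression test could look like. The `extract` helper is a stand-in that repeats the fixed selection logic from the Fix section; the real tests live in tests/test_openai_streaming_unit.py and their internals may differ:

```python
def extract(completion):
    # Stand-in mirroring the fixed get_response_for_chat() branch order:
    # tool_calls first, then legacy function_call, then plain content.
    if completion["tool_calls"]:
        result = {
            "role": "assistant",
            "tool_calls": [{"function": d} for d in completion["tool_calls"]],
        }
        if completion["content"] and completion["content"].strip():
            result["content"] = completion["content"]
        return result
    if completion["function_call"]:
        return {"role": "assistant", "function_call": completion["function_call"]}
    return completion["content"] or None


def test_tool_calls_not_dropped_when_whitespace_content_precedes_them():
    completion = {
        "content": "\n\n",  # whitespace preamble streamed before the tool deltas
        "function_call": None,
        "tool_calls": [{"name": "get_weather", "arguments": "{}"}],
    }
    out = extract(completion)
    assert out["tool_calls"][0]["function"]["name"] == "get_weather"
    assert "content" not in out  # whitespace-only content is omitted
```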

Disclaimer: Experimental PR review

Greptile Summary

This PR bundles two independent fixes: a bug fix for the OpenAI streaming path where tool_calls were silently dropped when a whitespace content chunk preceded tool-call deltas, and a new feature for the LangChain callback handler that allows users to pass langfuse_trace_name via metadata to control the propagated trace name.

  • OpenAI streaming (langfuse/openai.py): The original get_response_for_chat() used a Python or chain that short-circuited at the first truthy value — so a "\n\n" content chunk (common from Qwen/DeepSeek) caused all accumulated tool_calls to be discarded. The fix checks tool_calls first and preserves non-whitespace content alongside them when it is meaningful.
  • LangChain callback (langfuse/langchain/CallbackHandler.py): _parse_langfuse_trace_attributes now extracts langfuse_trace_name from metadata; on_chain_start forwards it to propagate_attributes() with span_name as a fallback via or; _strip_langfuse_keys_from_dict strips the key to prevent it leaking into stored metadata.
  • Both changes are covered by new, self-contained unit tests (tests/test_openai_streaming_unit.py and tests/test_langchain_callback_unit.py) that mock away all external API calls.
  • Scope note: The PR title and description focus on the OpenAI fix; the LangChain langfuse_trace_name feature is a non-trivial separate change only briefly mentioned. Consider splitting unrelated changes across separate PRs for easier review and bisectability.

Confidence Score: 5/5

Safe to merge — both fixes are correct, logically sound, and thoroughly covered by unit tests with no P0 or P1 issues found.

All findings are P2 (process/style). The OpenAI or-chain fix correctly prioritises tool_calls over content and handles the whitespace-only edge case cleanly. The LangChain trace_name feature wires correctly into the existing propagate_attributes API (which already accepts trace_name). Comprehensive unit tests cover the primary bug, edge cases, and regression scenarios.

No files require special attention — all four changed files are clean and well-tested.

Important Files Changed

| Filename | Overview |
| --- | --- |
| langfuse/openai.py | Rewrites get_response_for_chat() to check tool_calls before content, fixing the or-chain short-circuit that silently dropped tool_calls when whitespace content preceded them in a stream |
| langfuse/langchain/CallbackHandler.py | Adds langfuse_trace_name metadata key support: extracted in _parse_langfuse_trace_attributes, forwarded to propagate_attributes with span_name fallback, and stripped from stored metadata |
| tests/test_openai_streaming_unit.py | New unit tests covering the primary whitespace-before-tool-calls bug and edge cases (pure content, pure tool-calls, multiple tools, legacy function_call) — no real API calls |
| tests/test_langchain_callback_unit.py | New unit tests for langfuse_trace_name parsing, on_chain_start propagation priority, and _strip_langfuse_keys_from_dict — no real API calls |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Stream chunk received] --> B{resource.type == 'chat'?}
    B -- No --> C[Accumulate completion text]
    B -- Yes --> D[Extract delta]
    D --> E{delta has content?}
    E -- Yes --> F[Append to completion content]
    E -- No --> G{delta has function_call?}
    G -- Yes --> H[Accumulate function_call]
    G -- No --> I{delta has tool_calls?}
    I -- Yes --> J[Accumulate tool_calls list]
    F --> L[get_response_for_chat]
    H --> L
    J --> L
    L --> M{completion tool_calls non-empty?}
    M -- Yes --> N[Return dict with tool_calls + non-whitespace content]
    M -- No --> O{completion function_call?}
    O -- Yes --> P[Return dict with function_call]
    O -- No --> Q[Return content or None]
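The accumulation phase of the flowchart can be sketched as a small loop. This is a simplified model: field names follow the OpenAI streaming chunk format, but real tool-call deltas arrive as fragments merged by index, which is elided here:

```python
def accumulate(deltas):
    # Fold streamed deltas into one completion dict, mirroring the
    # branches in the flowchart above (content / function_call / tool_calls).
    completion = {"content": "", "function_call": None, "tool_calls": []}
    for delta in deltas:
        if delta.get("content"):
            completion["content"] += delta["content"]
        elif delta.get("function_call"):
            completion["function_call"] = delta["function_call"]
        elif delta.get("tool_calls"):
            completion["tool_calls"].extend(delta["tool_calls"])
    return completion


# The problematic stream shape: a whitespace content chunk, then tool deltas.
c = accumulate([
    {"content": "\n\n"},
    {"tool_calls": [{"function": {"name": "get_weather", "arguments": "{}"}}]},
])
assert c["content"] == "\n\n" and len(c["tool_calls"]) == 1
```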

…tart

When CallbackHandler.on_chain_start fires at the root of a chain
(parent_run_id is None), propagate_attributes was called without a
trace_name, so the trace name was determined by whichever internal node's
on_chain_start happened to fire first. On LangGraph resume (e.g. after a
human-in-the-loop interrupt) that node is often an internal subgraph whose
name is "", which produces a blank trace name.

The fix passes span_name — the name already computed from the serialized
runnable and kwargs — as trace_name to propagate_attributes. This ensures
the trace name is always pinned to the root chain's name regardless of
execution order on resume.

As a companion change, _parse_langfuse_trace_attributes now also reads a
langfuse_trace_name key from LangChain metadata, consistent with the
existing langfuse_session_id / langfuse_user_id / langfuse_tags pattern.
When present, metadata langfuse_trace_name takes priority over the
computed span_name. The key is also added to the strip-list in
_strip_langfuse_keys_from_dict so it does not leak into observation
metadata.

Fixes langfuse#1602
…as in stream

get_response_for_chat() built its return value with a Python `or` chain:

    return completion["content"] or (completion["tool_calls"] and {...}) or None

Models like Qwen and DeepSeek emit a non-empty content chunk (often "\n\n"
or a brief reasoning prefix) before streaming the tool-call deltas. Because
a non-empty string is truthy, the `or` chain short-circuited at the content
branch and returned just the whitespace string, silently discarding all
accumulated tool_call data.

Fix: check tool_calls first. When tool_calls are present, return them as the
primary output. If the content is non-whitespace (e.g. a genuine reasoning
preamble) it is included alongside the tool_calls rather than dropped.
The function_call (legacy OpenAI format) and plain content paths are
unchanged.

Fixes langfuse/langfuse#12490

@claude bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.
