Description
Provider (if applicable)
All providers (confirmed: google)
Model (if applicable)
google:gemini-embedding-001
Bug Description
ReqLLM.embed/3 never emits the [:req_llm, :token_usage] telemetry event, unlike generate_object/generate_text. This means embedding calls are invisible to telemetry handlers — costs and token counts are silently dropped.
Reproduction Code
:telemetry.attach("test", [:req_llm, :token_usage], fn _, m, _, _ -> IO.inspect(m, label: "usage") end, nil)
# generate_text fires the event:
ReqLLM.generate_text("google:gemini-2.5-flash", "hello")
# => usage: %{tokens: %{input: 8, output: 12, ...}, total_cost: 0.000012}
# embed never fires the event:
ReqLLM.embed("google:gemini-embedding-001", "hello world")
# => (silence; no telemetry)
Expected Behavior
[:req_llm, :token_usage] fires after every embed/3 call, consistent with all other ReqLLM API calls.
Actual Behavior
No telemetry event. Embedding costs are invisible to any [:req_llm, :token_usage] handler.
Root Cause
embed/3 in lib/req_llm/embedding.ex calls Req.request(request) directly without attaching Step.Usage, unlike the chat/object pipelines, which call ReqLLM.Step.Usage.attach(req, model) before executing:
# Current (no telemetry):
{:ok, request} <- provider_module.prepare_request(:embedding, model, text, provider_opts),
{:ok, %Req.Response{...} = response} <- Req.request(request),
# Fix:
{:ok, request} <- provider_module.prepare_request(:embedding, model, text, provider_opts),
{:ok, %Req.Response{...} = response} <- request |> ReqLLM.Step.Usage.attach(model) |> Req.request(),
Note on main branch
We also looked at the unreleased return_usage: true feature on main (added in #444). It reads usage from response.private[:req_llm][:usage], which is populated only by Step.Usage, so return_usage: true will also always return nil until this is fixed.
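As a hedged regression-test sketch (ExUnit; the handler id, timeout, and return-value pattern are assumptions, and a configured Google API key is required), this should fail on current releases and pass once Step.Usage is attached in the embedding pipeline:

```elixir
defmodule ReqLLM.EmbedTelemetryTest do
  use ExUnit.Case, async: false

  test "embed/3 emits [:req_llm, :token_usage]" do
    parent = self()

    # Forward the telemetry measurements to the test process.
    :telemetry.attach(
      "embed-usage-test",
      [:req_llm, :token_usage],
      fn _event, measurements, _metadata, _config ->
        send(parent, {:token_usage, measurements})
      end,
      nil
    )

    on_exit(fn -> :telemetry.detach("embed-usage-test") end)

    # Hits the live API, so a Google key must be configured.
    {:ok, _result} = ReqLLM.embed("google:gemini-embedding-001", "hello world")

    # Fails today (no event); should pass with the fix above.
    assert_receive {:token_usage, %{tokens: _}}, 5_000
  end
end
```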
Environment
- ReqLLM: 1.6.0
- Elixir: 1.19.1
- OTP: 28