Releases: DataDog/dd-trace-py
4.7.1
Estimated end-of-life date, accurate to within three months: 06-2027
See the support level definitions for more information.
Bug Fixes
- CI Visibility: This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
- Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
- Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
- CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
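Why a worker crash loses buffered events can be illustrated with plain Python: anything held in memory until an exit hook runs is dropped when the process ends through `os._exit`, which skips `atexit` handlers. The sketch below is not ddtrace code. The in-memory buffer, the `flush` function, and the `run_worker` helper are hypothetical stand-ins, and the only assumption is that events are buffered in-process and written out at exit unless they are flushed eagerly.

```python
import subprocess
import sys
import textwrap

# Hypothetical worker: buffers "events" in memory and flushes them at exit.
WORKER = textwrap.dedent(
    """
    import atexit, os, sys

    buffer = []

    def flush():
        # Stand-in for sending buffered test events to a backend.
        for event in buffer:
            print(event, flush=True)
        buffer.clear()

    atexit.register(flush)
    eager = sys.argv[1] == "eager"

    for i in range(3):
        buffer.append(f"event-{i}")
        if eager:
            flush()  # eager flushing: nothing stays buffered

    os._exit(1)  # simulated crash: atexit handlers do not run
    """
)


def run_worker(mode: str) -> list[str]:
    """Run the hypothetical worker and return the events it emitted."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER, mode],
        capture_output=True,
        text=True,
    )
    return proc.stdout.split()


lazy_events = run_worker("lazy")    # flushed only at exit, so nothing survives
eager_events = run_worker("eager")  # flushed after every event
```

In the lazy mode every event is still in the buffer when the simulated crash happens, which is the situation eager flushing is meant to avoid.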
4.6.8
Estimated end-of-life date, accurate to within three months: 06-2027
See the support level definitions for more information.
Bug Fixes
- CI Visibility: This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
- Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
- Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
4.5.10
Estimated end-of-life date, accurate to within three months: 06-2027
See the support level definitions for more information.
Bug Fixes
- CI Visibility: This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
- Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
4.8.0rc4
Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.
Upgrade Notes
- ray
- `ray.job.submit` spans are removed. Ray job submission outcome is now reported on the existing `ray.job` span through `ray.job.submit_status`.
Deprecation Notes
- LLM Observability
- Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.
- tracing
- The `pin` parameter in `ddtrace.contrib.dbapi.TracedConnection`, `ddtrace.contrib.dbapi.TracedCursor`, and `ddtrace.contrib.dbapi_async.TracedAsyncConnection` is deprecated and will be removed in version 5.0.0. To manage configuration of DB tracing, use integration configuration and environment variables.
- `DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED` is deprecated and will be removed in 5.0.0. Use `DD_TRACE_INFERRED_SPANS_ENABLED` instead. The old environment variable continues to work but emits a `DDTraceDeprecationWarning` when set.
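Before upgrading, the environment variable rename in the deprecation notes can be checked with a small script. This is a minimal sketch, not part of ddtrace: `find_deprecated_settings` is a hypothetical helper, the built-in `DeprecationWarning` stands in for `DDTraceDeprecationWarning`, and it assumes the value can be carried over unchanged to the new variable.

```python
import os
import warnings

OLD_VAR = "DD_TRACE_INFERRED_PROXY_SERVICES_ENABLED"  # deprecated name
NEW_VAR = "DD_TRACE_INFERRED_SPANS_ENABLED"           # replacement name


def find_deprecated_settings(environ=os.environ):
    """Return migration hints for the given environment mapping.

    Hypothetical pre-upgrade check; not a ddtrace API.
    """
    hints = []
    if OLD_VAR in environ:
        # Assumption: the old value is valid as-is for the new variable.
        hints.append(
            f"{OLD_VAR} is deprecated; set {NEW_VAR}={environ[OLD_VAR]} instead"
        )
        # ddtrace emits its own DDTraceDeprecationWarning; the built-in
        # DeprecationWarning is used here only as a stand-in.
        warnings.warn(hints[-1], DeprecationWarning, stacklevel=2)
    return hints
```

Calling `find_deprecated_settings()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to run in CI against a deployment manifest.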
New Features
- profiling
- Thread sub-sampling is now supported. This lets you set a maximum number of threads whose stacks are captured at each sampling interval, which can reduce the CPU overhead of the Stack Profiler.
- ASM
- Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The `ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail` class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2, available since `litellm>=1.46.1`.
- azure_cosmos
- Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.
- CI Visibility
- Adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set `DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true` for agentless setups, or `DD_LOGS_INJECTION=true` when using the Datadog Agent.
- llama_index
- Adds APM tracing and LLM Observability support for `llama-index-core>=0.11.0`. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.
- tracing
- Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set `OTEL_TRACES_EXPORTER=otlp` to send spans to an OTLP endpoint instead of the Datadog Agent.
- mysql
- This introduces tracing support for `mysql.connector.aio.connect` in the MySQL integration.
- LLM Observability
- Adds support for enabling and disabling LLMObs via Remote Configuration.
- Introduces a `decorator` tag to LLM Observability spans that are traced by a function decorator.
- Experiments accept a `pydantic_evals` `ReportEvaluator` as a summary evaluator when its `evaluate` return annotation is exactly `ScalarResult`. The scalar `value` is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full `ReportAnalysis` union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further documentation can be found in our documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).

Example:

    from pydantic_evals.evaluators import EqualsExpected
    from pydantic_evals.evaluators import ReportEvaluator
    from pydantic_evals.evaluators import ReportEvaluatorContext
    from pydantic_evals.reporting.analyses import ScalarResult

    from ddtrace.llmobs import LLMObs

    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )

    class TotalCasesEvaluator(ReportEvaluator):
        def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
            return ScalarResult(
                title='Total Cases',
                value=len(ctx.report.cases),
                unit='cases',
            )

    def my_task(input_data, config):
        return input_data["output"]

    equals_expected = EqualsExpected()
    summary_evaluator = TotalCasesEvaluator()

    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task,
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[summary_evaluator],
        description="<EXPERIMENT_DESCRIPTION>."
    )

    result = experiment.run()
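The "return annotation is exactly `ScalarResult`" rule for report evaluators can be pictured with a small annotation check. This is only a sketch of one way such a check could work, not the SDK's actual logic: `ScalarResult` and `TableResult` below are local stand-in classes, and `returns_exactly_scalar` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Union, get_type_hints


@dataclass
class ScalarResult:      # stand-in for pydantic_evals' ScalarResult
    title: str
    value: float


@dataclass
class TableResult:       # hypothetical second analysis type
    title: str


def returns_exactly_scalar(evaluator) -> bool:
    """True only if evaluate() is annotated to return ScalarResult itself."""
    hints = get_type_hints(evaluator.evaluate)
    return hints.get("return") is ScalarResult


class TotalCases:
    def evaluate(self, ctx) -> ScalarResult:
        return ScalarResult(title="Total Cases", value=len(ctx))


class BroadReport:
    # A union annotation is broader than ScalarResult, so it is rejected
    # even though this implementation happens to return a scalar.
    def evaluate(self, ctx) -> Union[ScalarResult, TableResult]:
        return ScalarResult(title="Total Cases", value=len(ctx))


accepted = returns_exactly_scalar(TotalCases())
rejected = returns_exactly_scalar(BroadReport())
```

The check looks at the declared annotation rather than the runtime return value, which is why an evaluator with a union return type needs to be rewritten as a class-based or function summary evaluator.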
Bug Fixes
- profiling
- Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.
- A rare crash that could occur post-fork in fork-based applications has been fixed.
- A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.
- A rare crash occurring when profiling asyncio code with many tasks or deep call stacks has been fixed.
- internal
- Fix a potential internal thread leak in fork-heavy applications.
- This fix resolves an issue where a `ModuleNotFoundError` could be raised at startup in Python environments without the `_ctypes` extension module.
- A crash that could occur post-fork in fork-heavy applications has been fixed.
- A crash has been fixed.
- LLM Observability
- Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. `invoke_agent`) were incorrectly appearing as siblings of their SDK parent span (e.g. `call_agent`) rather than being nested under it.
- Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.
- Fixes `model_name` and `model_provider` reported on AWS Bedrock LLM spans as the `model_id` full model identifier value (e.g., `"amazon.nova-lite-v1:0"`) and `"amazon_bedrock"` respectively. Bedrock spans' `model_name` and `model_provider` now correctly match backend pricing data, which enables features including cost tracking.
- Fixes an issue where deferred tools (`defer_loading=True`) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
- Fixes an issue where deeply nested tool schemas in Anthropic and OpenAI integrations were not yet supported. The Anthropic and OpenAI integrations now check each tool's schema depth at extraction time. If a tool's schema exceeds the maximum allowed depth, the schema is truncated.
- CI Visibility
- This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
- This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
- Code Security (IAST)
- This fix resolves a thread-safety issue in the IAST taint tracking context that could cause vulnerability detection to silently stop working under high concurrency in multi-threaded applications.
- Fixes a missing `return` in the IAST taint tracking `add_aspect` native function that caused redundant work when only the right operand of a string concatenation was tainted.
- celery
- Removes an unnecessary warning log about a missing span when using `Task.replace()`.
- django
- Fixes `RuntimeError: coroutine ignored GeneratorExit` that occurred under ASGI with async views and async middleware hooks on Python 3.13+. Async view methods and middleware hooks are now correctly detected and awaited instead of being wrapped with sync bytecode wrappers.
- ray
- This fix resolves an issue where Ray integration spans could use an incorrect service name when the Ray job name was set after instrumentation initialization.
- Other
- Fixed a race condition with internal periodic threads that could have caused a rare crash when forking.
- Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
- tracing
- Fixes the `svc.auto` process tag attribution logic. The tag now correctly reflects the auto-detected service name derived from the script or module entrypoint, matching the service name the tracer would assign to spans.
- This fix resolves an issue where applications started with `python -m <module>` could report `entrypoint.name` as `-m` in process tags.
- Fixed an issue where `network.client.ip` and `http.client_ip` span tags were missing when client IP collection was enabled and the request had no headers.
- litellm
- Fix missing LLMObs spans when routing requests through a litellm proxy. Proxy requests were incorrectly suppressed and resulted in empty or missing LLMObs spans. Proxy requests for OpenAI models are now always handled by the litellm integration....
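The tool-schema depth check listed under the LLM Observability fixes for this release can be pictured as a depth measurement followed by truncation. The sketch below rests on stated assumptions: the limit of 3, the `"[truncated]"` marker, and both helper names are hypothetical, since these notes do not give the real limit or how truncation is represented.

```python
MAX_DEPTH = 3  # hypothetical limit; the real value is not stated in these notes


def schema_depth(node) -> int:
    """Depth of nested dicts/lists; a scalar has depth 0."""
    if isinstance(node, dict):
        return 1 + max((schema_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((schema_depth(v) for v in node), default=0)
    return 0


def truncate_schema(node, max_depth: int = MAX_DEPTH):
    """Copy `node`, replacing containers nested past `max_depth` with a marker."""
    if not isinstance(node, (dict, list)):
        return node
    if max_depth == 0:
        return "[truncated]"
    if isinstance(node, dict):
        return {k: truncate_schema(v, max_depth - 1) for k, v in node.items()}
    return [truncate_schema(v, max_depth - 1) for v in node]


# Made-up tool schema with several levels of nesting.
tool_schema = {
    "type": "object",
    "properties": {
        "filter": {
            "type": "object",
            "properties": {"tags": {"type": "array", "items": {"type": "string"}}},
        }
    },
}

safe_schema = truncate_schema(tool_schema)
```

Truncating on a copy leaves the original tool definition untouched, so only what is recorded in span metadata is shortened.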
4.7.0
Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.
Upgrade Notes
- profiling
- This compiles the lock profiler's hot path to C via Cython, reducing per-operation overhead. At the default 1% capture rate, lock operations are ~49% faster for both contended and uncontended workloads. At 100% capture, gains are ~15-19%. No configuration changes are required.
- openfeature
- The minimum required version of `openfeature-sdk` is now 0.8.0 (previously 0.6.0). This is required for the `finally_after` hook to receive evaluation details for metrics tracking.
API Changes
- openfeature
- Flag evaluations for non-existent flags now return `Reason.ERROR` with `ErrorCode.FLAG_NOT_FOUND` instead of `Reason.DEFAULT` when configuration is available but the flag is not found. The previous behavior (`Reason.DEFAULT`) is preserved when no configuration is loaded. This aligns Python with other Datadog SDK implementations.
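The three cases in the openfeature API change can be laid out as a small decision function. This is an illustrative sketch, not the provider's implementation: the `Reason` and `ErrorCode` enums are local stand-ins for the openfeature types, `resolve` is a hypothetical helper, and reporting `TARGETING_MATCH` for a flag that is found is a simplifying assumption.

```python
from enum import Enum
from typing import Optional


class Reason(Enum):          # stand-in for openfeature's Reason
    DEFAULT = "DEFAULT"
    ERROR = "ERROR"
    TARGETING_MATCH = "TARGETING_MATCH"


class ErrorCode(Enum):       # stand-in for openfeature's ErrorCode
    FLAG_NOT_FOUND = "FLAG_NOT_FOUND"


def resolve(flag_key, default, config: Optional[dict]):
    """Return (value, reason, error_code) for a flag lookup."""
    if config is None:
        # No configuration loaded yet: the previous behavior is preserved.
        return default, Reason.DEFAULT, None
    if flag_key not in config:
        # Configuration is available but the flag does not exist.
        return default, Reason.ERROR, ErrorCode.FLAG_NOT_FOUND
    # Simplification: any flag that is found counts as a match.
    return config[flag_key], Reason.TARGETING_MATCH, None
```

Code that previously treated `Reason.DEFAULT` as "flag missing" may need to also handle `Reason.ERROR` with `ErrorCode.FLAG_NOT_FOUND` after upgrading.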
New Features
- mlflow
- Adds a request header provider (auth plugin) for MLFlow. If the environment variables `DD_API_KEY`, `DD_APP_KEY` and `DD_MODEL_LAB_ENABLED` are set, HTTP requests to the MLFlow tracking server will include the `DD-API-KEY` and `DD-APPLICATION-KEY` headers. #16685
- ai_guard
- Calls to evaluate now block if blocking was enabled for the service in the AI Guard UI. The `block` parameter now defaults to `True`; pass `block=False` to disable this behavior.
- This updates the AI Guard API client to return Sensitive Data Scanner (SDS) results in the SDK response.
- This introduces AI Guard support for Strands Agents. The Plugin API requires `strands-agents>=1.29.0`; the HookProvider works with any version that exposes the hooks system.
- azure_durable_functions
- Add tracing support for Azure Durable Functions. This integration traces durable activity and entity functions.
- profiling
- This adds process tags to profiler payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- runtime metrics
- This adds process tags to runtime metrics tags. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- remote configuration
- This adds process tags to remote configuration payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- dynamic instrumentation
- This adds process tags to debugger payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- crashtracking
- This adds process tags to crash tracking payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- data streams monitoring
- This adds process tags to Data Streams Monitoring payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- database monitoring
- This adds process tags to Database Monitoring SQL service hash propagation. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- Stats computation
- This adds process tags to stats computation payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- LLM Observability
- Adds support for capturing `stop_reason` and `structured_output` from the Claude Agent SDK integration.
- Adds support for user-defined dataset record IDs. Users can now supply an optional `id` field when creating dataset records via `Dataset.append()`, `Dataset.extend()`, `create_dataset()`, or `create_dataset_from_csv()` (via the new `id_column` parameter). If no `id` is provided, the SDK generates one automatically.
- Experiment tasks can now optionally receive dataset record metadata as a third `metadata` parameter. Tasks with the existing `(input_data, config)` signature continue to work unchanged.
- This introduces `RemoteEvaluator`, which allows users to reference LLM-as-Judge evaluations configured in the Datadog UI by name when running local experiments. For more information, see the documentation: https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide/#using-managed-evaluators
- This adds cache creation breakdown metrics for the Anthropic integration. When making Anthropic calls with prompt caching, `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens` metrics are now reported, distinguishing between 5-minute and 1-hour prompt caches.
- Adds support for reasoning and extended thinking content in Anthropic, LiteLLM, and OpenAI-compatible integrations. Anthropic thinking blocks (`type: "thinking"`) are now captured as `role: "reasoning"` messages in both streaming and non-streaming responses, as well as in input messages for tool use continuations. LiteLLM now extracts `reasoning_output_tokens` from `completion_tokens_details` and captures `reasoning_content` in output messages for OpenAI-compatible providers.
- `LLMJudge` now forwards any extra `client_options` to the underlying provider client constructor. This allows passing provider-specific options such as `base_url`, `timeout`, `organization`, or `max_retries` directly through `client_options`.
- Dataset records' tags can now be operated on with 3 new `Dataset` methods: `dataset.add_tags`, `dataset.remove_tags`, and `dataset.replace_tags`. All 3 new methods accept an int indicating the zero-based index of the record to operate on, and a list of strings in `key:value` format representing the tags. For example, if the tag `"env:prod"` exists on the first record of the dataset `ds`, calling `ds.remove_tags(0, ["env:prod"])` updates the local state of the dataset record so that the `"env:prod"` tag is removed.
- Change experiment execution to run evaluators immediately after each record's task completes instead of batching all tasks first. Experiment spans and evaluation metrics are now posted incrementally as records complete rather than waiting until the end. This improves progress visibility and preserves partial results if a run fails midway.
- Adds support for Pydantic AI evaluations in LLM Observability Experiments by allowing users to pass a Pydantic evaluation (which inherits from `Evaluator`) to an LLM Observability Experiment.

Example:

    from pydantic_evals.evaluators import EqualsExpected

    from ddtrace.llmobs import LLMObs

    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )

    def my_task(input_data, config):
        return input_data["output"]

    def my_summary_evaluator(inputs, outputs, expected_outputs, evaluators_results):
        return evaluators_results["Correctness"].count(True)

    equals_expected = EqualsExpected()

    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task,
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[my_summary_evaluator],  # optional, used to summarize the experiment results
        description="<EXPERIMENT_DESCRIPTION>."
    )

    result = experiment.run()
- tracer
- This introduces API endpoint discovery support for Tornado applications. HTTP endpoints are now automatically collected at application startup and reported via telemetry, bringing Tornado in line with Flask, FastAPI, and Django.
- This adds process tags to trace payloads. To deactivate this feature, set `DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED=false`.
- Adds instrumentation support for `mlflow>=2.11.0`. See the [mlflow](https://ddtrace.readthedocs.io/en/stable/integrations.html#mlflow) documentation for more information.
- Adds process tags to client-side stats payloads.
- aiohttp
- Fixed an issue where spans captured an incomplete URL (e.g. `/status/200`) when `aiohttp.ClientSession` was initialized with a `base_url`. The span now records the fully-resolved URL (e.g. `http://host:port/status/200`), matching aiohttp's internal behaviour.
- pymongo
- Add a new configuration option called `DD_TRACE_MONGODB_OBFUSCATION` to control whether `mongodb.query` is obfuscated. Resource names always remain normalized regardless of the value. To preserve raw `mongodb.query` values, pair with `DD_APM_OBFUSCATION_MONGODB_ENABLED=false` on the Datadog Agent. See the Datadog trace obfuscation documentation.
- google_cloud_pubsub
- Add tracing support for the `google-cloud-pubsub` library. Instruments `PublisherClient.publish()` and `SubscriberClient.subscribe()` to generate spans for message publishing and consuming, with optional distributed trace context propagation via message attributes. Use `DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_ENABLED` to control context propagation (default: `True`) and `DD_GOOGLE_CLOUD_PUBSUB_PROPAGATION_AS_SPAN_LINKS` to attach propagated context as span links instead of re-parenting subscriber spans under the producer trace (default: `False`).
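The optional third `metadata` task parameter described under LLM Observability for this release implies that two-argument and three-argument tasks are both callable by the same runner. One way to support that is signature inspection, sketched below. How the SDK actually detects the extra parameter is not stated in these notes; `call_task` and both example tasks are hypothetical.

```python
import inspect


def call_task(task, input_data, config, metadata):
    """Pass `metadata` only to tasks that declare a third positional parameter."""
    params = [
        p for p in inspect.signature(task).parameters.values()
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    if len(params) >= 3:
        return task(input_data, config, metadata)
    return task(input_data, config)


def legacy_task(input_data, config):
    # Existing two-argument signature keeps working unchanged.
    return input_data["question"].upper()


def metadata_task(input_data, config, metadata):
    # New three-argument signature also receives the record's metadata.
    return f'{input_data["question"]} [{metadata["source"]}]'


record = {"question": "ping"}
meta = {"source": "unit-test"}

legacy_out = call_task(legacy_task, record, {}, meta)
metadata_out = call_task(metadata_task, record, {}, meta)
```

Dispatching on the declared signature means existing tasks need no changes, which matches the backward-compatibility statement in the note.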
Bug Fixes
- AAP
- Fix multipart request body parsing to preserve all values when the same field name appears multiple times. Previously, only the last value was kept for duplicate keys in `multipart/form-data` bodies, which could allow an attacker to bypass WAF inspection by hiding a malicious value among safe ones.
- Fixes a minor issue where the ASGI middleware used the framew...
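The multipart parsing fix in this release comes down to how repeated field names are stored. The sketch below uses plain Python containers to contrast the two strategies; it is not the AAP parser, and the field names and payload are made up for illustration.

```python
from collections import defaultdict

# Decoded (field name, value) pairs as they might appear in a
# multipart/form-data body; the payload itself is made up.
fields = [
    ("comment", "hello"),
    ("comment", "<script>alert(1)</script>"),
    ("comment", "thanks"),
]

# Last-value-wins: earlier values for the same key are silently dropped.
last_wins = {}
for name, value in fields:
    last_wins[name] = value

# Preserve-all: every value stays visible to later inspection.
preserve_all = defaultdict(list)
for name, value in fields:
    preserve_all[name].append(value)

hidden = "<script>alert(1)</script>"
seen_by_last_wins = hidden in last_wins.values()
seen_by_preserve_all = any(hidden in values for values in preserve_all.values())
```

With last-value-wins storage the suspicious value never reaches whatever inspects the parsed body, while the list-based storage exposes all three values.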
4.5.9
Estimated end-of-life date, accurate to within three months: 06-2027
See the support level definitions for more information.
Bug Fixes
- Fixes an issue where internal background threads could cause crashes or instability in applications that fork (e.g. Gunicorn, uWSGI) or during Python shutdown. Affected applications could experience intermittent crashes or hangs on exit.
- CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
4.8.0rc3
Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.
Deprecation Notes
- LLM Observability
- Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.
- tracing
- The `pin` parameter in `ddtrace.contrib.dbapi.TracedConnection`, `ddtrace.contrib.dbapi.TracedCursor`, and `ddtrace.contrib.dbapi_async.TracedAsyncConnection` is deprecated and will be removed in version 5.0.0. To manage configuration of DB tracing, use integration configuration and environment variables.
New Features
- ASM
- Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The `ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail` class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2, available since `litellm>=1.46.1`.
- azure_cosmos
- Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.
- CI Visibility
- Adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set `DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true` for agentless setups, or `DD_LOGS_INJECTION=true` when using the Datadog Agent.
- llama_index
- Adds APM tracing and LLM Observability support for `llama-index-core>=0.11.0`. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.
- tracing
- Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set `OTEL_TRACES_EXPORTER=otlp` to send spans to an OTLP endpoint instead of the Datadog Agent.
- LLM Observability
- Introduces a `decorator` tag to LLM Observability spans that are traced by a function decorator.
- Experiments accept a `pydantic_evals` `ReportEvaluator` as a summary evaluator when its `evaluate` return annotation is exactly `ScalarResult`. The scalar `value` is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full `ReportAnalysis` union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further documentation can be found in our documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).

Example:

    from pydantic_evals.evaluators import EqualsExpected
    from pydantic_evals.evaluators import ReportEvaluator
    from pydantic_evals.evaluators import ReportEvaluatorContext
    from pydantic_evals.reporting.analyses import ScalarResult

    from ddtrace.llmobs import LLMObs

    dataset = LLMObs.create_dataset(
        dataset_name="<DATASET_NAME>",
        description="<DATASET_DESCRIPTION>",
        records=[RECORD_1, RECORD_2, RECORD_3, ...]
    )

    class TotalCasesEvaluator(ReportEvaluator):
        def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
            return ScalarResult(
                title='Total Cases',
                value=len(ctx.report.cases),
                unit='cases',
            )

    def my_task(input_data, config):
        return input_data["output"]

    equals_expected = EqualsExpected()
    summary_evaluator = TotalCasesEvaluator()

    experiment = LLMObs.experiment(
        name="<EXPERIMENT_NAME>",
        task=my_task,
        dataset=dataset,
        evaluators=[equals_expected],
        summary_evaluators=[summary_evaluator],
        description="<EXPERIMENT_DESCRIPTION>."
    )

    result = experiment.run()
Bug Fixes
- profiling
- Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.
- A rare crash that could occur post-fork in fork-based applications has been fixed.
- A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.
- internal
- Fix a potential internal thread leak in fork-heavy applications.
- This fix resolves an issue where a `ModuleNotFoundError` could be raised at startup in Python environments without the `_ctypes` extension module.
- A crash that could occur post-fork in fork-heavy applications has been fixed.
- LLM Observability
- Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. `invoke_agent`) were incorrectly appearing as siblings of their SDK parent span (e.g. `call_agent`) rather than being nested under it.
- Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.
- Fixes `model_name` and `model_provider` reported on AWS Bedrock LLM spans as the `model_id` full model identifier value (e.g., `"amazon.nova-lite-v1:0"`) and `"amazon_bedrock"` respectively. Bedrock spans' `model_name` and `model_provider` now correctly match backend pricing data, which enables features including cost tracking.
- Fixes an issue where deferred tools (`defer_loading=True`) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
- CI Visibility
- This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
- This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
- Code Security (IAST)
- This fix resolves a thread-safety issue in the IAST taint tracking context that could cause vulnerability detection to silently stop working under high concurrency in multi-threaded applications.
4.8.0rc2
Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.
Deprecation Notes
- LLM Observability
- Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See the [LLM Observability external evaluation documentation](https://docs.datadoghq.com/llm_observability/evaluations/external_evaluations) for more information.
New Features
-
ASM
- Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The
ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrailclass can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2 available sincelitellm>=1.46.1.
- Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The
-
azure_cosmos
- Add tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.
-
CI Visibility
- adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set
DD_AGENTLESS_LOG_SUBMISSION_ENABLED=truefor agentless setups, orDD_LOGS_INJECTION=truewhen using the Datadog Agent.
- adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set
-
llama_index
- Adds APM tracing and LLM Observability support for
llama-index-core>=0.11.0. Traces LLM calls, query engines, retrievers, embeddings, and agents. See the llama_index documentation for more information.
- Adds APM tracing and LLM Observability support for
-
tracing
- Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set
OTEL_TRACES_EXPORTER=otlpto send spans to an OTLP endpoint instead of the Datadog Agent.
- Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set
- LLM Observability
- Introduces a `decorator` tag to LLM Observability spans that are traced by a function decorator.
- Experiments accept a `pydantic_evals` `ReportEvaluator` as a summary evaluator when its `evaluate` return annotation is exactly `ScalarResult`. The scalar `value` is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full `ReportAnalysis` union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further documentation can be found in our documentation [here](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide).
Example:
```python
from pydantic_evals.evaluators import EqualsExpected
from pydantic_evals.evaluators import ReportEvaluator
from pydantic_evals.evaluators import ReportEvaluatorContext
from pydantic_evals.reporting.analyses import ScalarResult

from ddtrace.llmobs import LLMObs

dataset = LLMObs.create_dataset(
    dataset_name="<DATASET_NAME>",
    description="<DATASET_DESCRIPTION>",
    records=[RECORD_1, RECORD_2, RECORD_3, ...]
)


class TotalCasesEvaluator(ReportEvaluator):
    # The return annotation is exactly ScalarResult, so this evaluator
    # qualifies as a summary evaluator.
    def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
        return ScalarResult(
            title='Total Cases',
            value=len(ctx.report.cases),
            unit='cases',
        )


def my_task(input_data, config):
    return input_data["output"]


equals_expected = EqualsExpected()
summary_evaluator = TotalCasesEvaluator()

experiment = LLMObs.experiment(
    name="<EXPERIMENT_NAME>",
    task=my_task,
    dataset=dataset,
    evaluators=[equals_expected],
    summary_evaluators=[summary_evaluator],
    description="<EXPERIMENT_DESCRIPTION>."
)

result = experiment.run()
```
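The environment variables named in the CI Visibility and tracing entries above can be gathered in one place before launching a test run. The following is a minimal sketch, assuming a hypothetical helper named `build_ddtrace_env`; only the variable names and values come from the notes above, and the helper is not part of ddtrace.

```python
import os


def build_ddtrace_env(agentless, otlp_export=False, base=None):
    """Return a copy of the environment with the settings named in the notes.

    Hypothetical helper for illustration; not part of ddtrace.
    """
    env = dict(os.environ if base is None else base)
    if agentless:
        # Variable the notes name for agentless setups.
        env["DD_AGENTLESS_LOG_SUBMISSION_ENABLED"] = "true"
    else:
        # Variable the notes name when the Datadog Agent is in use.
        env["DD_LOGS_INJECTION"] = "true"
    if otlp_export:
        # Send spans to an OTLP endpoint instead of the Datadog Agent.
        env["OTEL_TRACES_EXPORTER"] = "otlp"
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the test command, which keeps the settings out of the parent process environment.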
Bug Fixes
- profiling
- Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.
- A rare crash that could occur post-fork in fork-based applications has been fixed.
- A bug in Lock Profiling that could cause crashes when trying to access attributes of custom Lock subclasses (e.g. in Ray) has been fixed.
- internal
- Fix a potential internal thread leak in fork-heavy applications.
- This fix resolves an issue where a `ModuleNotFoundError` could be raised at startup in Python environments without the `_ctypes` extension module.
- A crash that could occur post-fork in fork-heavy applications has been fixed.
- LLM Observability
- Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. `invoke_agent`) were incorrectly appearing as siblings of their SDK parent span (e.g. `call_agent`) rather than being nested under it.
- Fixes multimodal OpenAI chat completion inputs being rendered as raw iterable objects in LLM Observability traces. Multimodal content parts (text, image, audio) are now properly materialized and formatted as readable text.
- Fixes `model_name` and `model_provider` being reported on AWS Bedrock LLM spans as the full `model_id` identifier (e.g., `"amazon.nova-lite-v1:0"`) and `"amazon_bedrock"`, respectively. Bedrock spans' `model_name` and `model_provider` now correctly match backend pricing data, which enables features including cost tracking.
- Fixes an issue where deferred tools (`defer_loading=True`) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
- CI Visibility
- This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
- This fix resolves an issue where a failure response from the `/search_commits` endpoint caused the git metadata upload to fall back to sending the full 30-day commit history instead of aborting. This fallback could trigger cascading write load on the backend. The upload now aborts when `search_commits` fails, matching the behavior when the `/packfile` upload itself fails.
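The abort-on-failure behavior described for the git metadata upload can be sketched as a small control flow. This is an illustration under stated assumptions, not ddtrace's implementation: `upload_git_metadata`, its two callables, and the convention that a failed search returns `None` are all hypothetical.

```python
from typing import Callable, List, Optional


def upload_git_metadata(
    search_commits: Callable[[List[str]], Optional[List[str]]],
    upload_packfile: Callable[[List[str]], bool],
    local_commits: List[str],
) -> bool:
    """Upload only the commits the backend is missing; abort on failure.

    All names are hypothetical stand-ins, not ddtrace internals.
    `search_commits` returns the commits the backend already knows,
    or None when the request fails.
    """
    known = search_commits(local_commits)
    if known is None:
        # Abort instead of falling back to sending the full history.
        return False
    known_set = set(known)
    missing = [sha for sha in local_commits if sha not in known_set]
    if not missing:
        return True  # nothing to upload
    # A failed packfile upload is reported the same way: False.
    return upload_packfile(missing)
```

Returning early when the search fails means the packfile upload is never attempted with an unfiltered commit list, which is the write load the fix is meant to avoid.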
4.6.7
Estimated end-of-life date, accurate to within three months: 06-2027
See the support level definitions for more information.
Bug Fixes
- LLM Observability: Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. `invoke_agent`) were incorrectly appearing as siblings of their SDK parent span (e.g. `call_agent`) rather than being nested under it.
- CI Visibility: This fix resolves an issue where pytest-xdist worker crashes (`os._exit`, SIGKILL, segfault) caused buffered test events to be lost. To enable eager flushing, set `DD_TRACE_PARTIAL_FLUSH_MIN_SPANS=1`.
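Why a flush threshold of 1 protects against abrupt worker crashes can be shown with a toy model. The sketch below is an assumption about the general mechanism of threshold-based flushing, not a description of ddtrace's actual writer; `EventBuffer` and `events_surviving_crash` are hypothetical names.

```python
class EventBuffer:
    """Toy buffer that flushes once `flush_min` events are pending.

    Hypothetical model for illustration; not ddtrace's writer.
    """

    def __init__(self, flush_min):
        self.flush_min = flush_min
        self.pending = []
        self.sent = []

    def add(self, event):
        self.pending.append(event)
        if len(self.pending) >= self.flush_min:
            # Threshold reached: move everything pending to "sent".
            self.sent.extend(self.pending)
            self.pending.clear()


def events_surviving_crash(flush_min, events):
    """Return the events already sent when the worker dies abruptly."""
    buf = EventBuffer(flush_min)
    for event in events:
        buf.add(event)
    # A hard crash (os._exit, SIGKILL) discards whatever is still pending.
    return buf.sent
```

In this model, a threshold of 1 sends every event as soon as it is recorded, so nothing is left in memory to lose; a larger threshold leaves up to `flush_min - 1` events unsent at the moment of the crash.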
4.8.0rc1
Estimated end-of-life date, accurate to within three months: 05-2027
See the support level definitions for more information.
Deprecation Notes
- LLM Observability
- Removes support for the RAGAS integration. As an alternative, if you have RAGAS evaluations, you can manually submit these evaluation results. See LLM Observability external evaluation documentation for more information.
New Features
- ASM
- Adds a LiteLLM proxy guardrail integration for Datadog AI Guard. The `ddtrace.appsec.ai_guard.integrations.litellm.DatadogAIGuardGuardrail` class can be registered as a custom guardrail in the LiteLLM proxy to evaluate requests and responses against AI Guard security policies. Requires the LiteLLM proxy guardrails API v2, available since `litellm>=1.46.1`.
- azure_cosmos
- Adds tracing support for Azure CosmosDB. This integration traces CRUD operations on CosmosDB databases, containers, and items.
- CI Visibility
- Adds automatic log correlation and submission so that test logs appear alongside their corresponding test run in Datadog. Set `DD_AGENTLESS_LOG_SUBMISSION_ENABLED=true` for agentless setups, or `DD_LOGS_INJECTION=true` when using the Datadog Agent.
- tracing
- Adds support for exporting traces in OTLP HTTP/JSON format via libdatadog. Set `OTEL_TRACES_EXPORTER=otlp` to send spans to an OTLP endpoint instead of the Datadog Agent.
- LLM Observability
- Introduces a `decorator` tag to LLM Observability spans that are traced by a function decorator.
- Experiments accept a `pydantic_evals` `ReportEvaluator` as a summary evaluator when its `evaluate` return annotation is exactly `ScalarResult`. The scalar `value` is recorded as the summary evaluation. Report evaluators that declare a broader analysis return type (for example the full `ReportAnalysis` union) are not accepted as summary evaluators; use a class-based or function summary evaluator instead. Examples and further documentation can be found in our documentation here.
Example:
```python
from pydantic_evals.evaluators import EqualsExpected
from pydantic_evals.evaluators import ReportEvaluator
from pydantic_evals.evaluators import ReportEvaluatorContext
from pydantic_evals.reporting.analyses import ScalarResult

from ddtrace.llmobs import LLMObs

dataset = LLMObs.create_dataset(
    dataset_name="<DATASET_NAME>",
    description="<DATASET_DESCRIPTION>",
    records=[RECORD_1, RECORD_2, RECORD_3, ...]
)


class TotalCasesEvaluator(ReportEvaluator):
    # The return annotation is exactly ScalarResult, so this evaluator
    # qualifies as a summary evaluator.
    def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
        return ScalarResult(
            title='Total Cases',
            value=len(ctx.report.cases),
            unit='cases',
        )


def my_task(input_data, config):
    return input_data["output"]


equals_expected = EqualsExpected()
summary_evaluator = TotalCasesEvaluator()

experiment = LLMObs.experiment(
    name="<EXPERIMENT_NAME>",
    task=my_task,
    dataset=dataset,
    evaluators=[equals_expected],
    summary_evaluators=[summary_evaluator],
    description="<EXPERIMENT_DESCRIPTION>."
)

result = experiment.run()
```
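The rule that a report evaluator qualifies only when its `evaluate` return annotation is exactly `ScalarResult` can be illustrated with the standard library. This is a sketch of how such a check could be written, not ddtrace's actual check; `ScalarResult` and `TableResult` below are local stand-in classes, and `is_summary_compatible` is a hypothetical name.

```python
from typing import Union, get_type_hints


class ScalarResult:
    """Stand-in for the pydantic_evals ScalarResult type."""


class TableResult:
    """Stand-in for some other analysis result type."""


def is_summary_compatible(evaluator_cls):
    """True only when `evaluate` is annotated to return exactly ScalarResult."""
    hints = get_type_hints(evaluator_cls.evaluate)
    # An identity check rejects unions and missing annotations alike.
    return hints.get("return") is ScalarResult


class TotalCases:
    def evaluate(self, ctx) -> ScalarResult:
        return ScalarResult()


class BroadAnalysis:
    def evaluate(self, ctx) -> Union[ScalarResult, TableResult]:
        return ScalarResult()


class Unannotated:
    def evaluate(self, ctx):
        return ScalarResult()
```

Under this check, `TotalCases` is accepted while `BroadAnalysis` and `Unannotated` are rejected, even though all three return a `ScalarResult` at runtime. The decision depends on the declared annotation, not on the returned value.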
Bug Fixes
- profiling
- Fixes lock profiling samples not appearing in the Thread Timeline view for events collected on macOS.
- internal
- Fix a potential internal thread leak in fork-heavy applications.
- This fix resolves an issue where a `ModuleNotFoundError` could be raised at startup in Python environments without the `_ctypes` extension module.
- A crash that could occur post-fork in fork-heavy applications has been fixed.
- LLM Observability
- Fixes incorrect span hierarchy in LLMObs traces when using the ddtrace SDK alongside OTel-based instrumentation (e.g. Strands Agents). OTel gen_ai spans (e.g. `invoke_agent`) were incorrectly appearing as siblings of their SDK parent span (e.g. `call_agent`) rather than being nested under it.
- Fixes `model_name` and `model_provider` being reported on AWS Bedrock LLM spans as the full `model_id` identifier (e.g., `"amazon.nova-lite-v1:0"`) and `"amazon_bedrock"`, respectively. Bedrock spans' `model_name` and `model_provider` now correctly match backend pricing data, which enables features including cost tracking.
- Fixes an issue where deferred tools (`defer_loading=True`) in Anthropic and OpenAI integrations caused LLMObs span payloads to include full tool descriptions and JSON schemas for every tool in a large catalog. Deferred tool definitions now have their description and schema stripped from span metadata, with only the tool name preserved.
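The deferred-tool fix above amounts to reducing each deferred tool definition to its name before it is recorded. A minimal sketch of that idea follows, assuming tool definitions are plain dictionaries with `name`, `description`, and `input_schema` keys; the helper `strip_deferred_tools` and the key names other than `defer_loading` are hypothetical, not ddtrace internals.

```python
from typing import Any, Dict, List


def strip_deferred_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the name for tools flagged with defer_loading=True.

    Hypothetical helper; the dictionary layout is assumed for illustration.
    """
    cleaned = []
    for tool in tools:
        if tool.get("defer_loading") is True:
            # Deferred tool: drop description and schema, keep the name.
            cleaned.append({"name": tool.get("name")})
        else:
            # Regular tool: keep the full definition (shallow copy).
            cleaned.append(dict(tool))
    return cleaned
```

With a large tool catalog where most entries are deferred, this keeps the recorded metadata proportional to the number of tool names rather than to the size of every schema.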