
docs: add RAIL Score evaluation cookbook — content + agent safety scoring#2766

Open
SumitVermakgp wants to merge 3 commits into langfuse:main from SumitVermakgp:feat/rail-score-evaluation

SumitVermakgp commented Apr 1, 2026

Summary

Adds a cookbook notebook demonstrating how to evaluate LLM outputs and agent tool calls with RAIL Score and push dimension scores to Langfuse traces.
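As a rough illustration of what "pushing dimension scores to a trace" can look like, the sketch below turns a dictionary of per-dimension results into one score payload per dimension. It is not taken from the notebook: the function `build_score_payloads`, the `rail_` name prefix, and the payload keys are all hypothetical, and the dimension names are simply the eight listed in this description. No RAIL Score or Langfuse call is made here; the payloads are plain dictionaries that would still have to be handed to whichever score-creation call your Langfuse SDK version provides.

```python
from typing import Dict, List, Optional

# The eight dimensions named in this PR description, as snake_case identifiers.
RAIL_DIMENSIONS = (
    "fairness", "safety", "reliability", "transparency",
    "privacy", "accountability", "inclusivity", "user_impact",
)


def build_score_payloads(
    trace_id: str,
    dimension_scores: Dict[str, float],
    explanations: Optional[Dict[str, str]] = None,
    prefix: str = "rail_",  # hypothetical naming convention
) -> List[dict]:
    """Build one numeric score payload per evaluated dimension.

    `explanations` stands in for deep-mode output: when a dimension has an
    explanation, it is attached as the score comment.
    """
    explanations = explanations or {}
    unknown = set(dimension_scores) - set(RAIL_DIMENSIONS)
    if unknown:
        raise ValueError(f"Unknown dimensions: {sorted(unknown)}")

    payloads = []
    # Iterate in the canonical order so output is stable across runs.
    for dim in RAIL_DIMENSIONS:
        if dim not in dimension_scores:
            continue
        payloads.append({
            "trace_id": trace_id,
            "name": f"{prefix}{dim}",
            "value": float(dimension_scores[dim]),
            "data_type": "NUMERIC",
            "comment": explanations.get(dim),  # None when no explanation exists
        })
    return payloads


payloads = build_score_payloads(
    "trace-123",
    {"safety": 9.0, "fairness": 7.5},
    explanations={"safety": "No harmful content."},
)
```

Keeping payload construction separate from the network call makes the mapping easy to unit test and lets the same function serve both inline and batch scoring.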

What this cookbook covers

  • Content evaluation: inline and batch scoring of LLM outputs across 8 responsible AI dimensions (fairness, safety, reliability, transparency, privacy, accountability, inclusivity, user impact)
  • Deep mode: per-dimension explanations attached as score comments
  • Agent tool-call evaluation: pre-execution risk assessment (ALLOW/FLAG/BLOCK) pushed to observation-level scores
  • Agent session tracking: cumulative risk scores and pattern detection across multi-tool workflows
  • Human review integration: flagging low-scoring traces with needs_human_review boolean scores for Annotation Queue routing
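The agent and human-review items above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration rather than the notebook's code: the risk scale (0.0 to 1.0), the thresholds `flag_at` and `block_at`, the review threshold of 5.0, and the names `decide`, `AgentSession`, and `needs_human_review` (only the `needs_human_review` score name and the ALLOW/FLAG/BLOCK labels come from this description).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

ALLOW, FLAG, BLOCK = "ALLOW", "FLAG", "BLOCK"


def decide(risk: float, flag_at: float = 0.4, block_at: float = 0.7) -> str:
    """Map a pre-execution risk value to a decision (thresholds are hypothetical)."""
    if risk >= block_at:
        return BLOCK
    if risk >= flag_at:
        return FLAG
    return ALLOW


@dataclass
class AgentSession:
    """Tracks per-tool risk across a multi-tool workflow."""
    calls: List[Tuple[str, float]] = field(default_factory=list)

    def record(self, tool_name: str, risk: float) -> str:
        # Store the call, then return the decision for this single call.
        self.calls.append((tool_name, risk))
        return decide(risk)

    @property
    def cumulative_risk(self) -> float:
        return sum(risk for _, risk in self.calls)


def needs_human_review(dimension_scores: Dict[str, float], threshold: float = 5.0) -> bool:
    """True if any dimension falls below the (hypothetical) review threshold."""
    return any(score < threshold for score in dimension_scores.values())


session = AgentSession()
first = session.record("search_docs", 0.25)   # low risk
second = session.record("send_email", 0.5)    # medium risk
review = needs_human_review({"safety": 3.0, "fairness": 8.0})
```

In a real integration, the decision string would be attached to the tool-call observation and the boolean to the trace, so that low-scoring traces can be routed to an annotation queue.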

Changes

  • Added cookbook/evaluation_with_rail_score.ipynb
  • Added route entry in cookbook/_routes.json

Links

Add cookbook notebook demonstrating RAIL Score integration with Langfuse
for 8-dimension responsible AI evaluation of LLM outputs and agent tool
calls. Covers inline scoring, batch evaluation, deep mode explanations,
agent tool-call risk assessment, session tracking, and human review
queue integration.

@claude claude bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.


vercel bot commented Apr 1, 2026

@SumitVermakgp is attempting to deploy a commit to the langfuse Team on Vercel.

A member of the Team first needs to authorize it.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Apr 1, 2026

CLAassistant commented Apr 1, 2026

CLA assistant check
All committers have signed the CLA.

@dosubot dosubot bot added the docs label Apr 1, 2026
Replace all em-dash characters with colons or hyphens for consistency
with project style conventions.
@jannikmaierhoefer jannikmaierhoefer self-requested a review April 2, 2026 14:21