DAG Evaluator #114

Open
superdosh wants to merge 4 commits into main from dag-evaluator

@superdosh superdosh commented Apr 3, 2026

Building blocks for building DAG-like evaluators. You can see how it's used here: https://github.com/mlcommons/modelplane-flights/pull/6

github-actions bot commented Apr 3, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@superdosh superdosh marked this pull request as ready for review April 3, 2026 18:45
@superdosh superdosh requested a review from a team as a code owner April 3, 2026 18:45
@superdosh superdosh requested review from bkorycki and bollacker April 3, 2026 18:45

@bkorycki bkorycki left a comment

This looks great! I really appreciate all the comments. It made the code easy to follow.

Sorry for all the comments. They are mostly clarification questions and nitpicky suggestions.

arbiter = MyArbiter("Arbiter", routes_true=[VIOLATING], routes_false=[NONVIOLATING])

dag = (
    EvaluatorDAG("refusal_evaluator", outputs=[NONVIOLATING, VIOLATING])

The refusal is just the first component in the example right? So I think "safety_evaluator" might be a better name.

class EvaluatorDAG:
    """DAG of EvaluatorNodes.

    Usage:

I appreciate this documentation


if node.name in self._all_names():
    raise ValueError(
        f"A different node named {node.name!r} is already registered."

Ooh what does !r do?
Also maybe it would be more precise to say "a different node or output type..."
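
On the `!r` question: in an f-string, `!r` is a conversion flag that formats the value with `repr()` instead of `str()`, so a string value appears quoted in the resulting message. A minimal sketch, where the node name is made up purely for illustration:

```python
# Hypothetical node name, used only to illustrate the conversion flag.
name = "Arbiter"

# Without a conversion flag, the value is formatted with str().
plain = f"A different node named {name} is already registered."
# With !r, the value is formatted with repr(), which quotes strings.
quoted = f"A different node named {name!r} is already registered."

print(plain)   # A different node named Arbiter is already registered.
print(quoted)  # A different node named 'Arbiter' is already registered.
```

The quoted form makes empty strings and stray whitespace in a name visible in the error text.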


Build:
- _predecessors: dict mapping node name to list of parent node names (for context during execution)
- _root_nodes: list of node names with no incoming routes (starting points)

@bkorycki bkorycki Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be an example where someone would need more than 1 root node?
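
As a rough illustration of the two structures named in the docstring, the sketch below derives predecessors and root nodes from a plain routing table. The `routes` dict, the helper name, and the node names are all hypothetical and not taken from the PR. A graph with two independent entry checks feeding one arbiter is one way more than one root node could arise.

```python
# Hypothetical routing table: node name -> names it routes to.
# Assumes every routed-to name also appears as a key.
routes = {
    "RefusalCheck": ["Arbiter"],
    "ToxicityCheck": ["Arbiter"],
    "Arbiter": ["violating", "nonviolating"],
    "violating": [],
    "nonviolating": [],
}


def build_predecessors(routes):
    """Map each node name to the list of its parent node names."""
    predecessors = {name: [] for name in routes}
    for parent, children in routes.items():
        for child in children:
            predecessors[child].append(parent)
    return predecessors


predecessors = build_predecessors(routes)
# Root nodes are the nodes with no incoming routes.
root_nodes = [name for name, parents in predecessors.items() if not parents]

print(root_nodes)  # ['RefusalCheck', 'ToxicityCheck']
```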

        routes: Optional[list[str | Output]] = None,
    ) -> None:
        self.name = name
        self.routes_true = routes_true or []

Why not just make [] the default?
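
One likely reason for the `None` default (an assumption, not something the author confirms in this thread): Python evaluates default argument values once, when the function is defined, so a literal `[]` default is shared by every call that omits the argument. A minimal sketch with hypothetical class names:

```python
from typing import Optional


class SharedDefaultNode:
    # Pitfall: the same list object is reused as the default on every call.
    def __init__(self, name: str, routes_true: list[str] = []) -> None:
        self.name = name
        self.routes_true = routes_true


class SafeDefaultNode:
    # A None sentinel gives each instance its own fresh list.
    def __init__(self, name: str, routes_true: Optional[list[str]] = None) -> None:
        self.name = name
        self.routes_true = routes_true or []


a, b = SharedDefaultNode("a"), SharedDefaultNode("b")
a.routes_true.append("violating")
print(b.routes_true)  # ['violating'] (b sees the mutation made through a)

c, d = SafeDefaultNode("c"), SafeDefaultNode("d")
c.routes_true.append("violating")
print(d.routes_true)  # []
```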

    def __init__(
        self,
        name: str,
        routes_true: Optional[list[str | Output]] = None,

@bkorycki bkorycki Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would something need to route to multiple nodes?

root_nodes = [n for n in self._nodes if in_degree[n] == 0]
queue = collections.deque(root_nodes)
ordered: list[str] = []
while queue:

holy leetcode flashbacks
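
The loop in the snippet above looks like the start of Kahn's algorithm for topological sorting. Below is a self-contained sketch of that technique, assuming a plain `routes` dict in place of the PR's node objects. The function name and the example graph are illustrative and are not the PR's code.

```python
import collections


def topological_order(routes: dict[str, list[str]]) -> list[str]:
    """Return node names in topological order using Kahn's algorithm.

    Assumes every routed-to name also appears as a key in `routes`.
    Raises ValueError if the graph contains a cycle.
    """
    # Count incoming edges for every node.
    in_degree = {node: 0 for node in routes}
    for children in routes.values():
        for child in children:
            in_degree[child] += 1

    # Start from the nodes that have no incoming edges.
    queue = collections.deque(n for n in routes if in_degree[n] == 0)
    ordered: list[str] = []
    while queue:
        node = queue.popleft()
        ordered.append(node)
        # Removing `node` frees any child whose last parent it was.
        for child in routes[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    # Nodes on a cycle never reach in-degree zero, so they are never emitted.
    if len(ordered) != len(routes):
        missing = set(routes) - set(ordered)
        raise ValueError(f"Graph contains a cycle. Missing nodes: {missing}")
    return ordered


order = topological_order({"a": ["b", "c"], "b": ["c"], "c": []})
print(order)  # ['a', 'b', 'c']
```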

if len(ordered) != len(self._nodes):
    # missing nodes
    missing = set(self._nodes) - set(ordered)
    raise ValueError(f"Graph contains a cycle. Missing nodes: {missing}")

Maybe some unit tests for this validation code would be good.
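
A sketch of what such tests might look like, using `unittest` against a small stand-in validator, since the PR's actual validation API is not shown in this thread. `validate_acyclic` and the example graphs are hypothetical. Real tests would build an `EvaluatorDAG` and call its own validation instead.

```python
import unittest


def validate_acyclic(routes: dict[str, list[str]]) -> None:
    """Stand-in validator: raise ValueError if `routes` contains a cycle.

    Assumes every routed-to name also appears as a key in `routes`.
    """
    remaining = {node: set(children) for node, children in routes.items()}
    while remaining:
        # Nodes with no remaining outgoing edges can be removed safely.
        leaves = [n for n, children in remaining.items() if not children]
        if not leaves:
            raise ValueError(
                f"Graph contains a cycle. Missing nodes: {set(remaining)}"
            )
        for leaf in leaves:
            del remaining[leaf]
        for children in remaining.values():
            children.difference_update(leaves)


class ValidateAcyclicTest(unittest.TestCase):
    def test_accepts_dag(self):
        # Should return without raising.
        validate_acyclic({"a": ["b"], "b": ["c"], "c": []})

    def test_rejects_cycle(self):
        with self.assertRaisesRegex(ValueError, "cycle"):
            validate_acyclic({"a": ["b"], "b": ["a"]})

    def test_rejects_self_loop(self):
        with self.assertRaises(ValueError):
            validate_acyclic({"a": ["a"]})
```

Saved to a file, these can be run with `python -m unittest`.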


# check all terminal nodes are Output nodes
terminal_nodes = [n for n in self._nodes if not all_routes.get(n)]
for terminal in terminal_nodes:

It's confusing to me that a terminal node can either be an Output object or an Arbiter object which ... routes to Output object(s)?

        traversed_edges: Optional[set[tuple[str, str]]] = None,
        final_output: Optional[Output] = None,
    ):
        """Render the DAG as a PNG image. In a Jupyter notebook the image is displayed inline.

fancy!
