From f218c70248bf8dfbca09a788c6c2bcb184f72fff Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Wed, 1 Oct 2025 21:06:33 +0000
Subject: [PATCH 1/2] docs: Add link to GitHub issue for test command

Co-authored-by: sam
---
 fern/03-reference/baml-cli/test.mdx | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fern/03-reference/baml-cli/test.mdx b/fern/03-reference/baml-cli/test.mdx
index cd8bb6f9fa..98b957b8d9 100644
--- a/fern/03-reference/baml-cli/test.mdx
+++ b/fern/03-reference/baml-cli/test.mdx
@@ -1,5 +1,9 @@
 The `test` command runs BAML function tests defined in your BAML files. It provides comprehensive testing capabilities including filtering, parallel execution, and various output formats.
 
+
+For more details about the test command implementation and feature requests, see [GitHub issue #1772](https://github.com/BoundaryML/baml/issues/1772).
+
+
 ## Usage
 
 ```
From 0cc898e50109ec985ced6f233c6658655ab70689 Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Thu, 2 Oct 2025 00:36:19 +0000
Subject: [PATCH 2/2] Refactor test command documentation for clarity

Co-authored-by: sam
---
 fern/03-reference/baml-cli/test.mdx | 70 ++++++++++++++++++++++++-----
 1 file changed, 59 insertions(+), 11 deletions(-)

diff --git a/fern/03-reference/baml-cli/test.mdx b/fern/03-reference/baml-cli/test.mdx
index 98b957b8d9..6ed9bcac69 100644
--- a/fern/03-reference/baml-cli/test.mdx
+++ b/fern/03-reference/baml-cli/test.mdx
@@ -1,9 +1,5 @@
 The `test` command runs BAML function tests defined in your BAML files. It provides comprehensive testing capabilities including filtering, parallel execution, and various output formats.
 
-
-For more details about the test command implementation and feature requests, see [GitHub issue #1772](https://github.com/BoundaryML/baml/issues/1772).
-
-
 ## Usage
 
 ```
@@ -26,13 +22,33 @@ baml-cli test [OPTIONS]
 
 ## Description
 
-The `test` command performs the following actions:
+The `test` command executes BAML function tests and validates their outputs against defined assertions. It provides a comprehensive testing framework for LLM-based functions.
+
+### How It Works
+
+1. **Discovery**: Scans your BAML source directory for test definitions
+2. **Filtering**: Applies include/exclude patterns to select which tests to run
+3. **Execution**: Runs tests in parallel with configurable concurrency
+4. **Validation**: Evaluates assertions (`@@assert`) and checks (`@@check`) against function outputs
+5. **Reporting**: Displays real-time progress and detailed results with pass/fail status
+
+### Test Execution Flow
+
+When you run tests:
+- Tests execute concurrently (default: 10 parallel tests)
+- Progress updates appear in real-time showing running/completed tests
+- Failed tests and assertions are displayed immediately
+- You can cancel execution anytime with Ctrl+C
+- Final summary shows all test results grouped by function
 
-1. Discovers and parses all test cases defined in BAML files
-2. Applies include/exclude filters to select which tests to run
-3. Executes tests in parallel (configurable concurrency)
-4. Reports results with detailed output and assertions
-5. Supports various output formats and CI integration
+### Assertions vs Checks
+
+Tests can include two types of validations:
+
+- **`@@assert`**: Must pass for the test to succeed. If an assertion fails, the test fails immediately.
+- **`@@check`**: Used for validation that needs human review. Failing checks mark the test as "needs evaluation" but don't fail it outright.
+
+Both use Jinja expressions and can access the function's output via `this`.
 
 ## Test Filtering
 
@@ -176,7 +192,12 @@ baml-cli test --list -i "Extract*::" -x "*::TestSlow*"
 
 ## Test Definition
 
-Tests are defined in BAML files using the `test` block syntax:
+Tests are defined in BAML files using the `test` block syntax. Each test specifies:
+- The function(s) to test
+- Input arguments to pass to the function
+- Assertions and/or checks to validate the output
+
+### Basic Test Example
 
 ```baml
 function ExtractResume(resume: string) -> Resume {
@@ -194,6 +215,33 @@ test TestBasicResume {
 }
 ```
 
+### Using Assertions and Checks
+
+```baml
+test TestResumeWithValidation {
+  functions [ExtractResume]
+  args {
+    resume "Jane Smith\nSenior Engineer at TechCorp\n5 years experience"
+  }
+  // Assertions must pass for the test to succeed
+  @@assert({{ this.name != null }})
+  @@assert({{ this.years_experience > 0 }})
+
+  // Checks flag issues for human review without failing the test
+  @@check({{ this.years_experience == 5 }})
+  @@check({{ "TechCorp" in this.company }})
+}
+```
+
+### Test Output
+
+When tests run, you'll see:
+- **PASSED**: All assertions passed
+- **FAILED**: One or more assertions failed (shows which assertion and why)
+- **NEEDS EVAL**: All assertions passed, but one or more checks failed (needs human review)
+- Test execution time and function/test names
+- Detailed error messages for failures
+
 ## Related Commands
 
 - [`baml dev`](./dev) - Development server with hot reload for interactive testing
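
Review note: the PASSED / FAILED / NEEDS EVAL rules added under "Test Output" can be read as a simple precedence order, with failed assertions taking priority over failed checks. The Python sketch below only illustrates that reading of the documentation text. It is not taken from the `baml-cli` source, and the names `Status`, `Outcome`, and `classify` are hypothetical. It also assumes that a test with no assertions and no checks counts as passed, which the patch does not state.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PASSED = "PASSED"
    FAILED = "FAILED"
    NEEDS_EVAL = "NEEDS EVAL"


@dataclass
class Outcome:
    # Hypothetical container: one boolean per @@assert / @@check in a test block.
    assertions: list[bool] = field(default_factory=list)
    checks: list[bool] = field(default_factory=list)


def classify(outcome: Outcome) -> Status:
    # Any failed assertion fails the test, regardless of the checks.
    if not all(outcome.assertions):
        return Status.FAILED
    # All assertions passed, but a failed check asks for human review.
    if not all(outcome.checks):
        return Status.NEEDS_EVAL
    # Assumption: nothing failed (or nothing was declared), so the test passes.
    return Status.PASSED
```

For example, `classify(Outcome(assertions=[True], checks=[False]))` yields `Status.NEEDS_EVAL`, matching the "all assertions passed, but one or more checks failed" wording in the patch.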