Skip to content

Conversation

@vector-of-bool
Copy link
Contributor

Refer: CDRIVER-6107

This changeset introduces the ability to apply label/tags to test cases, and exposes those as CTest labels.

This also includes new general mutable string and array/vector handling utilities, and refactors a lot of our test suite to use those.

Changes:

  • 6d908aa supports a because clause on all mlib_check assertions. This message is included in the output when the program terminates, and acts as inline documentation for why a particular assertion is present.
  • d636e33 Brings mstr, the data-owning counterpart to mstr_view. This piece has been in a stash for a while, and was brought out as a better way to handle mutable strings.
  • 9675ae5 Adds vec.t.h, a generic template header that is used to define vector types. Basically:
    • #define a type T, and then #include <mlib/vec.t.h>, and it will define a T_vec type that acts as a dynamically sized contiguous array of T objects.
    • Additional macros can be defined to add copy/destroy semantics to the vector objects.
  • f97883e Adds mstr_trim for removing whitespace from string views.
  • 47629d7 Refactors a lot of our test suite code to use mtsr and vec.t.h types. Also adds mlib/str_vec.h which just defines the mstr_vec for the very common "array of strings" type.
  • b39b795 Adds support for specifying tags/labels associated with test cases:
    • The tags are specified within the same string as the test name. (There was an attempt to add a second string parameter for tags, but this was extremely cumbersome to update all of the test cases across the codebase to pass a second empty string, especially with all the different signatures for adding test cases. A future refactor may want to consolidate our many "TestSuite_Add" functions/macros.)
    • After the test case name, place one or more spaces, and then a list of [bracketed][tags].
    • To support exporting of tags for LoadTests.cmake, the --tests-cmake switch will print CMake code that defines all the test cases. I wanted to use JSON to emit the test cases, but CMake's JSON handling functionality is incredibly slow for very large JSON blobs, so directly emitting CMake code was chosen instead.
    • LoadTests.cmake will evaluate the emitted CMake code and use it to call add_test for all the test cases, as well as apply the test case labels and fixtures.
    • If a test has a tag [uses:foo], then LoadTests.cmake will add a FIXTURES_REQUIRED of mongoc/fixtures/foo. This currently only applies to the IMDS tests, but will eventually be used for other test case fixtures.

This change allows for test cases to declare any number of associated
"tags". The tags are specified after the test case name string as a
list of bracketed tags, mimicking Catch2's syntax.

The LoadTests.cmake script has been modified to apply test case's declared
tags as CTest labels. This also gives us the ability to apply test
fixture requirements granularly on only tests that declare their
requirement (via a tag).
@vector-of-bool vector-of-bool marked this pull request as ready for review October 20, 2025 23:26
@vector-of-bool vector-of-bool requested a review from a team as a code owner October 20, 2025 23:26
@vector-of-bool vector-of-bool requested review from kevinAlbs and rcsanchez97 and removed request for rcsanchez97 October 20, 2025 23:26
@kevinAlbs kevinAlbs requested a review from eramongodb October 22, 2025 16:32
Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial feedback. TIL RESOURCE_LOCK, this is fantastic. I am also very excited about the mlib_str and mlib_vec APIs.


Some benchmarks, averaged across 10 runs ('~' is standard deviation), testing only /bson/* (438 tests, all safe to execute in parallel) with --no-fork (/bson/value/big_copy is skipped):

  • test-libmongoc (current): 0.79 secs (~0.05)
  • test-libmongoc (PR): 0.82 secs (~0.04)
  • ctest: 4.56 secs (~0.75)
  • ctest -j 0: 1.57 secs (~0.12)

It is surprising to me that ctest with parallelism is almost twice as slow as the single-threaded test-libmongoc. We may need to explore improving the test registration and filtering setup (in a followup PR is fine) to avoid this performance overhead as much as possible (the eager test registration skips implemented by this PR in TestSuite_Add*() do not seem to be sufficient).


Just a note that the current output of ctest --print-labels is:

All Labels:
  ipv6
  lock:fake-kms
  lock:live-server
  test-libmongoc-generated
  timeout:10
  timeout:20
  timeout:30
  uses:fake_kms_provider_server
  uses:simple-http-server-18000

Update: "json" and "slow" have been removed.


Some questions:

  • What motivated the addition of [timeout:N] to certain tests and not others? Is the default timeout of 10 seconds too aggressive (e.g. perhaps 60 secs may be preferable)?
  • What is the purpose of the [json] and [slow] tags, both of which are only used by JSON tests?

@vector-of-bool
Copy link
Contributor Author

@eramongodb I'm unfortunately aware of the big slowness when running tests via CTest. The issue is indeed caused by the overhead of spawning the subprocess which then needs to register all the test cases, even if it only needs one of them.

I'm hoping in a future change, after more adoption of CTest, to speed it up significantly by simplifying the startup process, but for now it runs fast enough that I don't worry about it.

re: [slow][json] tags: Those were leftover from an earlier attempt to handle timeouts and live tests. I've removed them now.

re: [timeout:N] I wanted to have a much stricter timeout on tests, and only exempt those that were really really slow, and have tests declare their own timeout in-situ, so [timeout:N] was added as a way to do that. It may be sufficient to just set the default timeout really high, but I wanted to catch tests that were slow and specifically call them out in-source. This also gives the ability to select tests that don't need a big timeout (i.e. will run very fast) with ctest --label-exclude timeout:.

Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor feedback remaining; otherwise, LGTM!

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow review. The string and vector utilities look very useful!

benchmarks

How is ctest expected to be invoked? I see a bigger difference locally:

./cmake-build/src/libmongoc/test-libmongoc --no-fork -d --match "/bson/*"         # ~1 second
ctest --test-dir cmake-build -R "^mongoc/bson/*" --output-on-failure --parallel   # ~8 seconds

Granted, ctest is not running in CI yet. So I am OK with addressing performance as future work.


/**
* @brief Resize an existing or null `mstr`, without initializing any of the
* added content other than the null terminator. This operation is potientially
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* added content other than the null terminator. This operation is potientially
* added content other than the null terminator. This operation is potentially

return false;
}
// Try to (re)allocate the region
char *data = (char *)realloc(str->data, alloc_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calls to realloc and free ignore functions set by bson_mem_set_vtable. bson_mem_set_vtable is used by the PHP driver.

I expect mlib is designed not to depend on libbson. Consider adding a function to set custom allocators from bson_mem_set_vtable.

* @brief Replace the content of the given string, attempting to reuse the buffer
*
* @param inout Pointer to a valid or null `mstr` to be replaced
* @param s The new string contents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param s The new string contents
* @param s The new string contents. Must not be a view referring to data in `inout`.

Suggest documenting self-assignment is not supported. I expect the mstr_resize_for_overwrite invalidates views to inout.

mlib_check(n_delete <= n_chars_avail_to_delete);
// Compute the new string length
size_t new_len = str->len;
// This should never fail, because we should try to delete more chars than we have
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This should never fail, because we should try to delete more chars than we have
// This should never fail, because we should never try to delete more chars than we have

*
* @param str The string object to be updated
* @param splice_pos The position at which to do the splice
* @param n_delete The number of characters to delete at `splice_pos`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param n_delete The number of characters to delete at `splice_pos`
* @param n_delete The number of characters to delete at `splice_pos`. Must not delete beyond string length.

Suggest documenting to alert callers of possible assert. Alternatively consider clamping the delete count:

if (n_delete > n_chars_avail_to_delete) {
   // Clamp the number of chars to delete to the number that are available
   n_delete = n_chars_avail_to_delete;
}

That appears consistent with Javascript's splice, but I do not see an immediate use-case.

* @param str The string object to be updated
* @param splice_pos The position at which to do the splice
* @param n_delete The number of characters to delete at `splice_pos`
* @param insert A string to be inserted at `split_pos` after chars are deleted
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param insert A string to be inserted at `split_pos` after chars are deleted
* @param insert A string to be inserted at `split_pos` after chars are deleted. Must not be a view referring to data in `str`

* @brief Append a string to the end of some other string.
*
* @param str The string to be modified
* @param suffix The suffix string to be appended onto `*str`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param suffix The suffix string to be appended onto `*str`
* @param suffix The suffix string to be appended onto `*str`. Must not be a view referring to data in `str`

Suggest documenting, since this may surprise callers:

mstr s = mstr_copy_cstring("foo");
mstr_append(&s, s); // use-after-free

*
* @param str The string object to be updated
* @param needle The non-empty needle string to be searched for.s
* @param sub The string to be inserted in place of each `needle`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param sub The string to be inserted in place of each `needle`
* @param sub The string to be inserted in place of each `needle`. Must not be a view referring to data in `str`

* will be destroyed and the overall copy will fail.
*
* To add a trival copying function, define `VecCopyElement` to
* `VecTrivialCopyElement`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VecTrivialCopyElement appears unused in this file (aside from Intellisense supporting macros). Is this comment out-dated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of VecTrivialCopyElement to be used by includers, like this:

#define T int
#define VecCopyElement VecTrivialCopyElement 
#include "vec.th"

list_select(labels SELECT "^uses:(.*)$" REPLACE "mongoc/fixtures/\\1" OUT fixtures)

# For any "lock:..." labels, add a resource lock with the corresponding name
list_select(labels SELECT "^lock:(.*)$" REPLACE "\\1" OUT lock)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
list_select(labels SELECT "^lock:(.*)$" REPLACE "\\1" OUT lock)
list_select(labels SELECT "^lock:(.*)$" REPLACE "\\1" OUT locks)

For consistency with labels and fixtures since there may be more than one lock.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants