Add an XCB-based backend #426

jholveck · 2025-11-07T01:06:31Z

Changes proposed in this PR

This adds a ctypes-based module to use XCB.

The mss.linux.MSS constructor is now a factory, which takes an optional backend keyword parameter.
The previous mss.linux implementation has now been designated as the "xlib" backend. Per Ease X11 maintenance and thread safety by using XCB or xcffib #425, this will have "legacy" status, and new functionality like XShmGetImage won't be implemented there.
A new XCB-based backend, named "getimage", has been added. This is functionally the same as the existing "xlib" backend, in almost all respects, but has a different implementation.

Notable improvement: I previously was seeing periodic thread failures in test_thread_safety. In one scenario (Docker container, Python 3.9, Xvfb), I tested 84% failures in test_thread_safety. With the XCB-based backend, I saw no failures. (See also #251 and the test failure I mentioned in #421.)

The performance is, in my brief testing, the same as the existing xlib implementation. This PR is about making clearer code and improving thread safety, to pave the way for future improvements.

Fixes Ease X11 maintenance and thread safety by using XCB or xcffib #425

It is very important to keep up to date tests and documentation.

Tests added/updated
Documentation updated

Is your code right?

./check.sh passed

@BoboTiG

This version works fine, but needs improvements, especially to the testing. What has happened: * A ctypes-based module to use XCB has been added. * The mss.linux.MSS constructor is now a factory, which takes an optional `backend` keyword parameter. * The previous mss.linux implementation has now been designated as the "xlib" backend. Per BoboTiG#425, this will have "legacy" status, and new functionality like XShmGetImage won't be implemented there. * A new XCB-based backend, named "getimage", has been added. This is functionally the same as the existing "xlib" backend, in almost all respects, but has a different implementation. * The GNU/Linux-specific tests that could easily be made to work with the new backend have been adjusted. Notable improvement: I previously was seeing periodic thread failures in `test_thread_safety`. In one scenario (Docker container, Python 3.9, Xvfb), I tested 84% failures in `test_thread_safety`. With the XCB-based backend, I saw no failures. (See also BoboTiG#251 and the test failure I mentioned in BoboTiG#421.) The performance is, in my brief testing, the same as the existing xlib implementation. This PR is about making clearer code and improving thread safety, to pave the way for future improvements. What's left to do (some before merging, some possibly later): * The way the libraries are initialized needs to be rethought; I'll want to talk with @BoboTiG about that. * Many tests in `test_gnu_linux` used monkey-patches that are specific to the Xlib implementation. I've arranged for those to only use the Xlib implementation; I need to add corresponding tests for the XCB implementation. * The tests outside of `test_gnu_linux` need to test both backends, as well as the different backends we may add to other OSs. I need to add the right Pytest code to parameterize these. * This code introduces new automatic memory management aspects that need tests assigned to them. * The bulk of the `xcb.py` code is just hand-translations of the [XCB protocol specification XML files](https://gitlab.freedesktop.org/xorg/proto/xcbproto). Hand-translation can be error-prone (see BoboTiG#418). Since the XCB specs are in XML, then the bulk of this code can be auto-generated, like xcffib does. This can be done manually as needed, before shipping the distribution, so end users won't have to do this. (See BoboTiG#425 for a discussion of why we're not using xcffib.) * Even if we don't auto-generate the code, some of the repetitive parts can still be hoisted into higher-order functional programming, to reduce the risk of making errors in copying. * There are some parts that I worsened, to satisfy ruff. I want to revisit those. * The proof-of-concept code in `xcb.py:main` includes some additional functionality should be integrated into the MSS class, either in this PR or in separate branches. We also should delete that code before merging. * Additional testing on the X11 visual, to make sure it's really in the format we expect. (This is probably the only part we'll merge in with this PR.) * Support for capturing an individual window, without needing its location. * Identifying whether the alpha channel is meaningful or not. (This is only relevant with individual-window capture; there aren't many screens that can turn transparent!) Fixes: BoboTiG#425

jholveck · 2025-11-07T01:10:04Z

As you can see, this is currently a draft PR. I'm just putting it out to show the work in progress.

The GNU/Linux-specific tests that could easily be made to work with the new backend have been adjusted, but there's more to be done still.

The way the libraries are initialized needs to be rethought; I'll want to talk with @BoboTiG about that.
Many tests in test_gnu_linux used monkey-patches that are specific to the Xlib implementation. I've arranged for those to only use the Xlib implementation; I need to add corresponding tests for the XCB implementation.
The tests outside of test_gnu_linux need to test both backends, as well as the different backends we may add to other OSs. I need to add the right Pytest code to parameterize these.
I need to fix the (currently-failing) tests on other OS's, of course. That will come with how I handle the backend parameterization.
This code introduces new automatic memory management aspects that need tests assigned to them.
The bulk of the xcb.py code is just hand-translations of the XCB protocol specification XML files. Hand-translation can be error-prone (see X11 typing and error message improvements #418). Since the XCB specs are in XML, then the bulk of this code can be auto-generated, like xcffib does. This can be done manually as needed, before shipping the distribution, so end users won't have to do this. (See Ease X11 maintenance and thread safety by using XCB or xcffib #425 for a discussion of why we're not using xcffib.)
Even if we don't auto-generate the code, some of the repetitive parts can still be hoisted into higher-order functional programming, to reduce the risk of making errors in copying.
There are some parts that I worsened, to satisfy ruff. I want to revisit those.
The proof-of-concept code in xcb.py:main includes some additional functionality should be integrated into the MSS class, either in this PR or in separate branches. We also should delete that code before merging.
- Additional testing on the X11 visual, to make sure it's really in the format we expect. (This is probably the only part we'll merge in with this PR.)
- Support for capturing an individual window, without needing its location.
- Identifying whether the alpha channel is meaningful or not. (This is only relevant with individual-window capture; there aren't many screens that can turn transparent!)

jholveck · 2025-11-07T01:20:44Z

The way the libraries are initialized needs to be rethought; I'll want to talk with @BoboTiG about that.

So, @BoboTiG, let me get your thoughts on this.

The Xlib-based code will re-load libX11 etc. each time an MSS instance is constructed. (Under the hood, the library doesn't actually get reloaded; the same library that was previously loaded is reused (see dlopen(3)). However, a new CDLL instance with all-new _Funcptrs is created, so the .argtypes etc. that you've set are new.)

In my current version of the code, the library is loaded and initialized (i.e., argtypes etc. are set) only once, and that is reused by every future MSS instance.

Do you have a strong preference for one or the other?

BoboTiG · 2025-11-07T01:51:45Z

Wow, that's quite something you've done here! That's really good work! 🥂

About ruff, it is here to help us. When it is more a pain than it should be, you can workaround the rules either by adding specific skips for xcb.py like done in

python-mss/pyproject.toml

Line 181 in 7cb3dcc

"src/tests/*" = [

, or skip specific rules in the code directly using noqa: xxx.
I do not like such a tool to force us changing our core implementation. If you have better code, then use it and tell ruff to ignore it.

About the backend name, I would advocate for xlib and xcb. We can support getimage as it is in the code right now, but for end users I would not promote it. It's like adding xgetimage as an alias to xlib.

About testing, maybe could we add parametrized tests "by default"? I mean, maybe pytest allowes us to create a factory to inject parameters, so that we would only need to set backends to test in one place, then the tests runner will deduplicate everything automatically for all backends? Not sure we want that, just asking.

I have not much more to say about the PR, this is very interesting, and I'm eager to see where it will go at the end on the performances level + thread-safety levels.

BoboTiG · 2025-11-07T01:56:35Z

In my current version of the code, the library is loaded and initialized (i.e., argtypes etc. are set) only once, and that is reused by every future MSS instance.

Do you have a strong preference for one or the other?

Not really. The later seems better.
If this does not introduce more tricky errors with the errors handler then go with your preference.

Some tests now take a "backend" parameter. A fixture will iterate that through the backends implemented for that OS. Many tests called mss(display=os.getenv("DISPLAY")). Rather than add "backend" parameters to each of them, and maybe need to make other changes in the future, these now take an "mss_impl" fixture. This is a callable that will return the appropriate MSS object. One reason for this is so that we might add a pyvirtualdisplay or similar mechanism to these functions, when it's available. This may enable some tests to run in conditions they might not otherwise. It also may help standardize some of the testing.

jholveck · 2025-11-07T03:02:23Z

I've added some test parameterization.

I now have a backend fixture, and an mss_impl fixture. The backend fixture simply iterates through the backends available for that OS. The mss_impl fixture will give you a callable that will, effectively, run mss(*args, display=os.environ["DISPLAY"], backend=backend, **kwargs), where it's iterating through the backends.

In other words, you can replace this:

    with mss.mss(display=os.environ["DISPLAY"]):
        ....

with this:

    with mss_impl():
        ....

For the cases that need more fine-grained work, then I've used the backend fixture directly instead.

I see that some tests use display=os.environ["DISPLAY"], some don't give a display argument, and some use pyvirtualdisplay to set up an Xvfb and use that. Is there a particular reason for the differences, or is that a historical accident?

I can change mss_impl to use pyvirtualdisplay if it's available, and let the program use the default display (which should be os.environ["DISPLAY"] on Linux) if pyvirtualdisplay isn't installed. I'm not really sure if there's a reason for using display=os.environ["DISPLAY"] in any case, except a test to specifically see that the display= parameter is properly honored if it's correct.

src/mss/linux/xgetimage.py

BoboTiG · 2025-11-07T03:20:49Z

Is there a particular reason for the differences, or is that a historical accident?

Right now, I can't tell. I can be to workaround CI issues, as well as testing different DISPLAY envars found in user environments.

BoboTiG · 2025-11-07T03:23:04Z

About backend names, I was more proposing to have:

"xlib" (+ "default" now, which makes sense)
"xcb"

And get rid of xgetimage and getimage since that's not really backend names, it's too specific.

WDYT?

jholveck · 2025-11-07T03:27:10Z

About the backend name, I would advocate for xlib and xcb. We can support getimage as it is in the code right now, but for end users I would not promote it. It's like adding xgetimage as an alias to xlib.

Here's my thought about the backend name.

Here's the backends I anticipate having:

Xlib, with GetImage (existing code)
XCB, with GetImage (this code)
XCB, with ShmGetImage
XCB, with XComposite

As an aside, the name of the X11 protocol message is GetImage, the name of the Xlib function is XGetImage, and the name of the XCB function is xcb_get_image. In all these cases, though, it's often just called "XGetImage" in conversation.

The reason I envision three different XCB-based backends is because they apply under different circumstances.

GetImage is the slowest, but it's the only one that always is applicable. In particular, it's the only one that works with remote connections, such as over ssh.
XComposite is the fastest by far, but only can capture single windows, and pretty much only makes sense for capturing to a GPU buffer.
ShmGetImage is the best for full-screen capture or capturing to a CPU buffer on a local machine, a very common use case.

I figure the factory can dispatch to the correct backend automatically, based on what the user has requested.

Now, with these four backends, the naming that made the most sense to me was to name the XCB-based backends (which will be the preferred backends) based on the mechanism they're using. Since we're also keeping an Xlib-based backend, but not adding new mechanisms to that, then it seemed like "xlib" was the most sensible name for it.

Does that make sense to you?

BoboTiG · 2025-11-07T03:31:26Z

Ah, you're right! That makes more sense to me now.

jholveck · 2025-11-07T03:44:14Z

Glad to hear it! Sorry I wasn't more clear about my reasoning before. I've waffled a lot about the backend naming, and I'm happy to change things up before we merge. That's the point of this being a draft PR!

One downside here is that the user will need to choose these options at the time they create the MSS object. Since each of these also has resources they will want to store in the object and reuse (shared memory buffers for XShmGetImage, offscreen Pixmaps and CUDA contexts for XComposite), that still seems reasonable.

When it comes time to add single-window capture, then that also will need to be chosen at the object's creation time: otherwise, the factory couldn't choose between XComposite and XShmGetImage.

Of course, a user could always keep multiple MSS objects around, and there's little downside to that. But I'm not sure how you envision the API working, with regard to the scope of the MSS object.

The MSS object, when it's created, will now ensure that the X11 visual's format is exactly what we think it is: 32bpp, as a 24-bit BGRx or 32-bit BGRA image, in that order, with no extra scanline padding, color indexing, etc. This is true of all modern X servers in real-world situations, but it's not guaranteed. For instance, VNC-based servers can be run in depth 16 with RGB565 formats. The error messages for protocol-level errors (such as bad coordinates) are only available if the xcb-util-errors library is installed. (That's libxcb-errors.so.0, in the libxcb-errors0 package on Debian/Ubuntu.) This isn't widely installed, and is mostly useful to developers, so it's optional. The connection-level errors (like the hint to check $DISPLAY) are available everywhere.

BoboTiG · 2025-11-07T07:03:44Z

For now, if "xlib" is kept as default, then users won't have anything to change. Later, if XCB becomes the default, we should use ShmGetImage since it seems the best for all cases (except onver SSH, in that case, we will update the documentation to tell users to use xlib).

Backends will be for users wanting more performances, and knowing what they want.

I do not see upcoming issues with this path, maybe am I missing something? We should not try to be too smart in the factory, but use the safest defaults "for the mass" I would say.

jholveck · 2025-11-07T07:33:08Z

Yes, I do want the library to do the right thing without the user making choices. MSS's ease of use is one of its greatest assets!

I think the factory can still make intelligent default choices without problems. The decision between XShmGetImage and XComposite is pretty simple: capturing a single window to the GPU? Use XComposite. Otherwise? Use XShmGetImage. OBS Studio actually labels these two as "Single window" and "Whole desktop" (or something like that).

I've been planning on having the XShmGetImage backend automatically fall back to XGetImage if it can't use XShmGetImage. It's difficult to detect in advance that you can't use XShmGetImage, but it's easy to switch if the call fails, and remember that decision. My first prototypes testing XShmGetImage in MSS did just that.

So it should be easy to have the factory make intelligent default choices and "just work". The user can specify a backend if they know of particular needs, but that'd be an advanced option, as you suggest.

I'm not worried about any of this being a problem. I just wanted to make sure I understood your concept of the MSS object. The user wouldn't be able to create one MSS object at the beginning and keep alternating between capturing to the CPU in one grab() call, the GPU in another, or do other things like that. That's why I wanted to make sure the API would be okay with you. It sounds like you're good with that concept, so I'll go ahead and finish up this code, and then start working on XShmGetImage.

Test the cases of missing libraries when using XCB. Add fixtures to reset the XCB library loader after each testcase. Add fixtures to check for Xlib errors after each testcase.

jholveck · 2025-11-14T07:20:54Z

Just to give you an update:

I'm working on a version that generates the XCB function calls from the XML definitions. It's still pretty rough - I'm still simplifying it and cleaning things up - but you can see that on jholveck:feat-xcb-codegen. The code generator and (for the moment) the XML definitions are in src/xcbproto, the generated code is in src/mss/linux/xcbgen.py, and the MSS class that uses it is in src/mss/linux/xgetimage.py. As you can see, the "business logic" - the actual MSS implementation - is quite short and simple.

I'm trying to decide whether to go with this direction or not. The code generator alone is longer than the generated code, or the hand-written version (I'm still simplifying the code generator, though). But we can be confident that the generated code is correct, and not miscopied or something. Also, as we add more X11 features (XShmGetImage, XComposite and Render, and later Present), we don't have to change the generator: we just say what additional functions we want to pull in.

We can check in the generated code, and even the protocol definitions, to ensure consistency.

Do you have a strong preference about whether we go with generated or hand-written code? I think the generated version is probably going to be better long-term, but we can go either way.

BoboTiG · 2025-11-14T12:46:14Z

The idea would be that we generare everything only once when we want to change anything in the C API, correct? That does not incur a performance penality when calling python -m mss, mss, or when using the module from another Python script?

The "generate" version is more code, but it will prevent any misuse on our side. It may improve future development speed also.
I see only good thing for now. Can you spot any potential downside?

jholveck · 2025-11-15T04:23:44Z

The idea would be that we generare everything only once when we want to change anything in the C API, correct? That does not incur a performance penality when calling python -m mss, mss, or when using the module from another Python script?

Exactly.

The "generate" version is more code, but it will prevent any misuse on our side. It may improve future development speed also. I see only good thing for now. Can you spot any potential downside?

I agree. It's just more code for now.

Some people just don't like generated code, or have particular ways they want it to work, so I was checking with you. I'll proceed along that path! Thanks!

BoboTiG · 2025-11-15T11:54:20Z

Where do XML come from? Should we sync them from an upstream repository before calling the generator?

EDIT: xproto.xml seem to be from https://cgit.freedesktop.org/xcb/proto/tree/src/xproto.xml
We might want a small shell script to sync those file, but it's maybe too much since they won't change in years. Forget the sync part :)

jholveck · 2025-11-16T12:27:40Z

That's the one, but its official location has moved (as noted in the README in that repo). These days, the current location for that file is https://gitlab.freedesktop.org/xorg/proto/xcbproto/-/blob/master/src/xproto.xml

But yes, that's the repo that all the XML files come from. I intend to add relevant notes to the README about that.

It would be also possible to make the xorg/proto (where those XML files live) a git submodule instead of vendoring them in. We really only need to update them if we're adding new features that get released in the future; the existing protocol specs are stable and forwards-compatible. I thought it might be more trouble than it's worth to make it a submodule, but haven't really thought carefully about it yet.

BoboTiG · 2025-11-16T15:57:52Z

Vendoring seems a good option, I guess.

jholveck · 2025-11-18T00:06:52Z

One possibility, if you're uncomfortable with just copying the files in, would be to use git subtree. It can bring the xcbproto repo (which includes all the XML files we need) into the MSS repo, along with metadata to show the commit reference we've merged from, so we can pull in updates more easily.

Another alternative would be to use git subrepo. This is not yet an official part of git, but (from what I understand) was written by the author of git subtree to fix some of its shortcomings. Notably, it includes metadata that shows the origin that the subtree came from.

Both of these give the tooling to manage the xcbproto repo as a subtree in the MSS repo, and without the complexity of git submodule. The files will also be part of the MSS repo, so it'll be clear what versions we're using, without needing to configure git submodule. But, these tools will bring in the entire xcbproto repo, when we only need a few files.

Since we'll rarely need to update these protocol definitions, and won't be making changes to them, I don't think it's necessary to use tooling like this. But I can see the elegance of letting git know that these files come from somewhere different, if you'd prefer to.

BoboTiG · 2025-11-18T04:27:53Z

Keep it simple: no need for more :)

jholveck added 2 commits November 6, 2025 18:45

Ruff cleanups

fd91e62

Small bug fixes to get the tests back to doing meaningful testing

5a33e8c

BoboTiG reviewed Nov 7, 2025

View reviewed changes

src/mss/linux/xgetimage.py Outdated Show resolved Hide resolved

jholveck added 3 commits November 6, 2025 23:37

Merge branch 'BoboTiG:main' into feat-xcb

90f613f

Move the library globals to a smart container class

606badb

Improve testing

163ec2d

Test the cases of missing libraries when using XCB. Add fixtures to reset the XCB library loader after each testcase. Add fixtures to check for Xlib errors after each testcase.

Use generated code to interface with XCB instead of hand-written code

7f3317d

jholveck added 2 commits November 18, 2025 20:29

Merge branch 'main' into feat-xcb

536d3d9

Fix bug in multithreaded initialization guard

a33f6e0

Uh oh!

Add an XCB-based backend #426

Are you sure you want to change the base?

Add an XCB-based backend #426

Uh oh!

Conversation

jholveck commented Nov 7, 2025

Changes proposed in this PR

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

BoboTiG commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BoboTiG commented Nov 7, 2025

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

Uh oh!

BoboTiG commented Nov 7, 2025

Uh oh!

BoboTiG commented Nov 7, 2025

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

BoboTiG commented Nov 7, 2025

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

BoboTiG commented Nov 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jholveck commented Nov 7, 2025

Uh oh!

jholveck commented Nov 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

BoboTiG commented Nov 14, 2025

Uh oh!

jholveck commented Nov 15, 2025

Uh oh!

BoboTiG commented Nov 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jholveck commented Nov 16, 2025

Uh oh!

BoboTiG commented Nov 16, 2025

Uh oh!

jholveck commented Nov 18, 2025

Uh oh!

BoboTiG commented Nov 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

BoboTiG commented Nov 7, 2025 •

edited

Loading

BoboTiG commented Nov 7, 2025 •

edited

Loading

jholveck commented Nov 14, 2025 •

edited

Loading

BoboTiG commented Nov 15, 2025 •

edited

Loading