Conversation

bettinaheim (Collaborator) commented Oct 22, 2025

Summary of changes:

  • Added additional validation and type conversions for arguments and return values when necessary
  • Added proper error messages to indicate when a copy of a reference type is required
  • Added proper error messages for assignments that are not supported
  • Added a copy helper function for lists, np.arrays, tuples, and dataclasses to create deep and shallow copies

Bug Fixes

  • Fixed an issue where kernels would not be found in certain cases (because the analysis did not add them or dependencies were not correctly accumulated)
  • Fixed incorrect errors when building certain nested expressions (e.g. constructing a list within a call)
  • Fixed various issues with passing nested containers across device kernels
  • Added missing overloads for some gate invocations and kernel calls
  • Fixed inconsistencies in variable assignments
  • Added support for chaining item and attribute access and fixed related issues with item assignments
  • Fixed an issue where lists and arrays were not properly copied when a copy constructor was invoked
  • Added a comprehensive error for cases that are not yet fully supported and previously resulted in a segfault
  • Fixed a bug that caused Boolean operations with more than two operands to be evaluated incorrectly
  • Fixed an issue that caused a crash when building a list from a range in some cases

The rest of this description contains a breakdown of the PR content with explanations for developers.
The PR focuses on the Python bridge and its use of data types. It intentionally does not modify anything related to host-device data transfer, the symbol table, or the representation of callables and states, since the upcoming revisions by @schweitzpgi will change those substantially.

Changes to ast_bridge.py:

Four main areas were revised:

  1. The value stack during IR construction
  2. Loops
  3. Calls
  4. Assignments

Value stack

The previous implementation used a single deque to propagate values across node visits. This is problematic, since it loses the association of values with specific nodes. Correspondingly, the previous implementation would fail for (some) nested expressions (e.g. a list built inside a call expression). It was also highly problematic because a failure to produce a value would go undetected and a previously pushed value might be used instead, leading to incorrect code (either failing to compile, or worse, compiling but not matching the application code).

This PR hence introduces a proper PyStack to propagate MLIR values across visits. For each node visit, a frame is pushed onto that stack, and the new implementation validates that each node produces the correct number of values. To make the detection of incorrect or incorrectly processed code more robust, each node now pushes a single value.
The relevant pieces of code to look at are the PyStack class and the definition of visit. The override for generic_visit is removed and no longer used. Related changes (a small illustrative sketch of the stack discipline follows the list):

  • Calls to 'range' and 'enumerate' in visit_Call push a vector.
  • The processing of KrausChannels is removed from visit_Name and visit_Attribute and is instead implemented directly inside apply_noise in visit_Call; the use of KrausChannels is only ever valid as part of a call (in both the previous and the new implementation)
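
As an illustration of the intended discipline, here is a minimal, simplified sketch in plain Python. Only the names PyStack and visit come from the PR; the real class in ast_bridge.py tracks MLIR values and carries more bookkeeping, so treat this as a conceptual sketch rather than the actual implementation.

import ast

class PyStack:
    # one frame per node visit; each frame collects the values that visit produces
    def __init__(self):
        self._frames = []
    def push_frame(self):
        self._frames.append([])
    def push_value(self, value):
        self._frames[-1].append(value)
    def pop_frame(self, expected):
        frame = self._frames.pop()
        if len(frame) != expected:
            raise RuntimeError(f"expected {expected} value(s), got {len(frame)}")
        if self._frames:                   # hand the produced values to the parent visit
            self._frames[-1].extend(frame)
        return frame

class Bridge(ast.NodeVisitor):
    def __init__(self):
        self.stack = PyStack()
    def visit(self, node):
        self.stack.push_frame()            # a fresh frame for every node visit
        super().visit(node)
        return self.stack.pop_frame(expected=1)   # each node must produce one value
    def visit_Constant(self, node):
        self.stack.push_value(node.value)  # a leaf node pushes exactly one value

# e.g. Bridge().visit(ast.parse("42", mode="eval").body) returns [42]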

Loops

In the previous implementation, all loops were indiscriminately marked as invariant. This was incorrect, e.g. for visit_For, where the for-loop may contain break statements. The new implementation defines a general helper for loop creation (createForLoop), as well as createMonotonicForLoop and createInvariantForLoop, which call into it. These are used in the implementation of visit_For, visit_While, visit_Compare (for 'In/NotIn' comparisons), visit_List_Comp, in various Call expressions, as well as in migrateLists (explained further below).
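
The actual helpers build MLIR, which is out of scope here; the following toy sketch only illustrates the delegation pattern described above. The Loop stand-in and the exact meaning attached to the flags (invariant: fixed trip count and no break; monotonic: the induction variable moves in one direction but the body may break) are assumptions for illustration, not the bridge's API.

from dataclasses import dataclass

@dataclass
class Loop:
    # stand-in for the loop operation the bridge would build in the IR
    start: int
    stop: int
    step: int
    is_invariant: bool   # assumed: no break possible, trip count known up front
    is_monotonic: bool   # assumed: induction variable only moves in one direction

def create_for_loop(start, stop, step, *, is_invariant=False, is_monotonic=False):
    # shared helper: all loop forms funnel through here
    return Loop(start, stop, step, is_invariant, is_monotonic)

def create_invariant_for_loop(start, stop, step):
    return create_for_loop(start, stop, step, is_invariant=True, is_monotonic=True)

def create_monotonic_for_loop(start, stop, step):
    # e.g. a for-loop over range(...) whose body may contain a break statement
    return create_for_loop(start, stop, step, is_monotonic=True)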

Calls

The previous implementation performed very few checks on the correctness of arguments to calls. This PR largely refactors the entire implementation of visit_Call so that most of the cases call into shared helper functions defined at the beginning of visit_Call. These helpers ensure that (with a few exceptions that I haven't refactored) all arguments to calls are checked for correctness and that type conversions are applied when appropriate. There are two general helpers that are also used in other parts of the code base: changeOperandToType and __groupValues. The former is responsible for all type conversions throughout the bridge. The latter is used to process a list of AST nodes and validate that we have an appropriate number of them. In addition to the nodes to process, it takes a list[int | tuple[int, int]] argument indicating how many values to expect and how to group them. See the doc comment for more details.
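
To make the grouping argument concrete, here is a hedged sketch of the idea behind __groupValues, using a simplified stand-in named group_values. Interpreting a tuple entry as an accepted (min, max) range and the greedy splitting are assumptions for illustration; the doc comment in the code defines the actual semantics.

def group_values(values, grouping):
    # Split a flat list of produced values into groups, validating counts.
    # An int entry means "exactly that many values"; a (min, max) tuple is
    # assumed here to mean an accepted range. Splitting is greedy for simplicity.
    groups, idx = [], 0
    for spec in grouping:
        low, high = (spec, spec) if isinstance(spec, int) else spec
        remaining = len(values) - idx
        take = min(high, remaining)
        if take < low:
            raise ValueError(f"expected at least {low} value(s), got {remaining}")
        groups.append(values[idx:idx + take])
        idx += take
    if idx != len(values):
        raise ValueError(f"{len(values) - idx} unexpected value(s)")
    return groups

# e.g. group_values(["a", "b", "c"], [1, (1, 2)]) -> [["a"], ["b", "c"]]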

Assignments

The changes to assignments make up the largest part of the changes (and tests). For one, the new implementation treats all of the following as assignments: assignments as part of visit_Assign and visit_AugAssign, the definition of function arguments, and the definition of loop iteration variables. This change also introduces a clear set of rules for the use of pointers. The outline given here only applies to the values produced to represent objects in the Python source code, not to the internal representations of data types (e.g. pointers to data arrays contained in a vector). The distinction is important since the internal representation of certain data types should be opaque to the Python bridge for the sake of encapsulation.
In the new implementation, the only data type that is always passed by pointer is a State. I won't elaborate on states any further; the support for them is currently in a somewhat inconsistent state and needs to be reexamined after the merge of the Python compiler changes. I hence left it as is in this PR. All other data types are passed by value (both in the previous and in the new implementation); the paragraphs further below elaborate on how the new implementation deals with reference types in Python. Aside from State objects (and internal representations), pointers are exclusively used to represent variables. When a variable is created, a pointer to a stack slot is created and pushed to the symbol table. When a variable is used, its current value is loaded and pushed to the value stack. This leads to a clean and consistent handling of all value types in Python: they behave as they should and are not subject to any restrictions in their use. The subsequent paragraphs discuss Python reference types and quantum types, for which we have to impose certain restrictions to ensure that any code that successfully compiles matches the expected Python behavior.
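
For classical value types, the effect is that kernels read like ordinary Python. A minimal sketch of the intended behavior, assuming the usual @cudaq.kernel decorator syntax (illustrative only):

import cudaq

@cudaq.kernel
def classical_values() -> int:
    total = 0              # creates a stack slot; its pointer goes into the symbol table
    for i in range(4):     # the loop iteration variable is treated as an assignment too
        total = total + i  # load the current value, add, store back into the same slot
    return total           # reading the variable loads its current value (6 here)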

Quantum Types and Measurements

Quantum types (qubits, qvectors, quantum structs) and measurement results are stored as values - not pointers! - in the symbol table (both in the previous and in the new implementation). This necessarily requires that we impose restrictions on assignments to variables of these types. While these restrictions are likely desirable for quantum types, we could reexamine them once we introduce a proper type distinction between boolean values and measurements throughout the stack. As it is currently, storing them as values is needed to enable sampling with explicit measurements (to an extent - the current support is incomplete and full support requires the type distinction).

The restriction for assignments to variables of these types specifically is that values of these types cannot be assigned to variables in a parent scope, whether directly or indirectly (meaning item assignment). A direct assignment is impossible to support with the current representation in the symbol table, since there is no way to conditionally update the value in the parent scope depending on whether the child scope was executed. We could lift this restriction in a future version, e.g. by leveraging phi nodes. An indirect assignment (i.e. assigning to an item of a vector or struct in the parent scope) would technically be possible for measurements but would lose the information of where the value came from - I left it as is and don't support that; to be reexamined in a future version. For quantum values, any item assignments are intentionally forbidden by design of the CUDA-Q language. Other than that, no restrictions exist for assignments to variables in the same scope.
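
A hypothetical snippet illustrating the restriction (assuming the usual in-kernel mz and qubit allocation syntax; the exact diagnostics are defined in the bridge):

import cudaq

@cudaq.kernel
def parent_scope_update():
    q = cudaq.qubit()
    res = mz(q)        # measurement result stored as a value in the symbol table
    if res:
        res = mz(q)    # expected to be rejected: assigns to a measurement variable in the parent scope
    other = mz(q)      # fine: assignment to a variable created in the same scope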

Python Reference Types

The reference types in Python that are currently supported within kernels are lists, numpy arrays, and dataclasses.

NOTE: Much like States, Callables were left as is in this PR, with the expectation to reexamine them after the Python compiler changes are merged. Left as is specifically means that they are currently stored as values (not pointers) in the symbol table and are subject to the same restrictions for assignments as discussed for vectors.

Lists and numpy arrays:

Lists and numpy arrays share the same representation in the IR. Everything stated for lists also applies to numpy arrays.
Lists are represented as stdvec objects in the IR. Their internal representation contains a pointer to a data array and an integer indicating their size. This ensures that even though we pass stdvec objects by value across function boundaries, they indeed follow reference behavior. There are two issues with this representation (not modified in this PR) that are left for consideration to revise in a future version:

  1. stdvec objects are also stored as values (not pointers) in the symbol table, and both their data pointer and their size are immutable as far as I saw (there is no IR expression to update the size of a stdvec without constructing a new one). As such, we have similar issues as for quantum types when it comes to direct assignments to variables in the parent scope. Indirect assignments to items of variables in the parent scope do not suffer from this and work as expected.
  2. [only if the kernel returns lists]: The memory to store the data of an stdvec is allocated on the stack. When we return an stdvec, we hence need to make sure to copy that memory to the heap. This copy is inconsistent with the behavior one would expect from Python in the case where the returned list was in fact passed as an argument (aside from the data in that case being caller allocated). After careful consideration of all options, the best path seems to be to keep track of vectors that come from arguments (see the container item restrictions below) and give an error when we return one that was passed as an argument (more details are in comments in the code; a sketch of the user-facing behavior follows this list). In practice, I believe this is not encountered all that often, and in most cases the performance after optimization should be the same.
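
A sketch of the expected user-facing behavior under that rule (illustrative only; list .copy inside kernels is the helper added in this PR):

import cudaq

@cudaq.kernel
def passthrough(values: list[int]) -> list[int]:
    # returning `values` directly is expected to raise an error, since the
    # caller-allocated data would otherwise have to be copied to the heap implicitly
    return values.copy()   # the explicit copy makes the new allocation visible in the source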

Classical Dataclasses:

In a future version, it may be nice to have a similar representation for dataclasses as we have for vectors (a value encapsulating a pointer). Keeping the representation as it was, and passing classical structs by value across function boundaries, there are two main kinds/sources of restrictions we need to impose. These restrictions ensure that any valid code behaves as one would expect Python code to behave.

  1. We cannot assign an lvalue to a variable in the parent scope (directly or indirectly). This is because the current representation makes it impossible to ensure that the two handles (the lvalue that is assigned and the lvalue we are assigning to) indeed access the same data. For the same reason, we impose that a dataclass lvalue cannot be used as a container item (see the container item restrictions below).
  2. The new implementation imposes additional checks for dataclasses passed as function arguments to force that the copy we effectively perform when assigning them to an argument is explicit in the Python code. Specifically, we require an explicit .copy in the Python code when directly or indirectly assigning a dataclass function argument to a local variable (see the example following this list).
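
For example, assuming a plain Python @dataclass is accepted by the bridge (illustrative only; the .copy helper on dataclasses is the one added in this PR):

from dataclasses import dataclass
import cudaq

@dataclass
class Params:
    theta: float
    phi: float

@cudaq.kernel
def uses_params(p: Params) -> float:
    local = p.copy()    # explicit .copy required when binding a dataclass argument to a local
    local.theta = 0.0   # mutating the copy cannot affect the caller's object
    return local.theta + p.phi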

NOTE: Tuples are value types in Python. While they share the same representation as dataclasses (structs in the IR), we forbid assignments to tuple items (matching Python behavior) and therefore do not otherwise need to impose any restrictions on them.

Restrictions on Container Items:

For the reasons outlined above, and in more detail in code comments and tests, we impose the following restrictions for any container items:

  • Dataclass lvalues cannot be used as container items - an explicit .copy must be made to store them in the container.
  • [only if the kernel returns lists]: Lists passed as function arguments (or as an item of a function argument) cannot be used as container items - an explicit .copy must be made to store them in the container.
  • Container items cannot be pointers - i.e. containers cannot contain States (since this is the only type for which we represent a Python value as a pointer). This restriction was already made in the previous implementation, and I kept any behavior related to States and Callables unmodified.

Currently supported containers in the Python bridge are tuples, dataclasses, and lists/numpy arrays. The restrictions above are enforced by a common helper function (__validate_container_entry) that is called whenever a container is created, i.e. during item assignments, list comprehensions, constructor calls, copy constructors, and container literals.
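
A hypothetical snippet showing how the container-item rules are expected to surface in user code (illustrative only; assuming a plain @dataclass is accepted by the bridge):

from dataclasses import dataclass
import cudaq

@dataclass
class Pair:
    a: float
    b: float

@cudaq.kernel
def build_container(p: Pair) -> float:
    # a dataclass lvalue must be copied explicitly to be stored in a container;
    # a freshly constructed value can be stored directly
    items = [p.copy(), Pair(0.5, 1.0)]
    return items[0].a + items[1].b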

The most relevant pieces of code to look at to understand the changes with regards to assignments are visit_Assign, visit_Name, and visit_Return, as well as the tests in test_assignments.py.

Signed-off-by: Bettina Heim <heimb@outlook.com>

bettinaheim (Collaborator, Author) commented Nov 19, 2025

/ok to test 8de65f1

bettinaheim changed the title from "Update Python bridge to properly handled nested expressions and other fixes" to "Python bridge revision" on Nov 20, 2025

lmondada (Collaborator) commented Nov 21, 2025

I agree that .copy() is a very useful shorthand. It also feels the "most natural" to me.

Perhaps unfortunately, the "most pythonic" would be to support the use of the copy and deepcopy free functions that can be imported from the copy module.

import copy

v = ['a', 'b']
deep_v = ['a', ['a', 'b']]
w = copy.copy(v)
deep_w = copy.deepcopy(deep_v)

Given that we are parsing the AST, the fact that these functions are imported from the copy module makes this syntax an annoying complication, because of the different ways they could be imported and used:

  1. import copy followed by copy.copy or copy.deepcopy
  2. from copy import copy, deepcopy, followed by the unprefixed copy(v) or deepcopy(v)
  3. from copy import *
  4. worst of all, renaming: from copy import copy as shallow_copy

Supporting just 1. seems the simplest thing to do and is reasonable (even without checking for existence of the import). However, leaning into this Python convention without supporting at least 2. (which is how I'd use it) is likely to surprise users...

As a different point to consider, numpy provides both the free function and a .copy() method:

  • on the one hand, the numpy docs page for the free function encourages the use of the .copy() method instead (screenshot of the numpy documentation omitted)
  • on the other hand, there is no support for deep copying; that page states that Python's copy.deepcopy function should be used for that... (screenshot omitted)

bettinaheim added the "breaking change" label (Change breaks backwards compatibility) on Nov 21, 2025

bettinaheim (Collaborator, Author) replied:

> (quoting @lmondada's comment above)

Summarizing a brief offline discussion: Matching the Python library functions for specific data types would require storing the Python data type along with the MLIR values during bridge construction. We definitely want to stick with the premise that "if it looks like Python code it should behave like Python code" for valid code that compiles. However, I think we can take a bit of liberty with additional functionality (i.e. adding member functions, e.g. for lists, that are not usually defined for Python lists). The rationale is that when we compile Python, we are mapping Python data types to other types that match Python behavior but may have a slightly different set of methods defined for convenience.

bettinaheim and others added 5 commits November 21, 2025 16:04

bettinaheim added the "bug fix" label (To be listed under Bug Fixes in the release notes) on Nov 21, 2025

bettinaheim (Collaborator, Author) commented Nov 21, 2025

/ok to test 1f1cfe8

bettinaheim (Collaborator, Author) commented Nov 26, 2025

/ok to test 1f1cfe8

copy-pr-bot bot commented Nov 26, 2025

/ok to test 1f1cfe8

@bettinaheim, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

1tnguyen (Collaborator) left a comment

LGTM 💯

bettinaheim merged commit 0cf5d85 into NVIDIA:main on Nov 26, 2025 (12 of 13 checks passed)