Merged
8 changes: 8 additions & 0 deletions .github/workflows/build.yml
@@ -165,13 +165,21 @@ jobs:
free-threading:
- false
- true
interpreter:
- switch-case
exclude:
# Skip Win32 on free-threaded builds
- { arch: Win32, free-threading: true }
include:
# msvc::musttail is currently only supported on x64,
# and only supported on 3.15+.
- { arch: x64, free-threading: false, interpreter: tail-call }
- { arch: x64, free-threading: true, interpreter: tail-call }
uses: ./.github/workflows/reusable-windows.yml
with:
arch: ${{ matrix.arch }}
free-threading: ${{ matrix.free-threading }}
interpreter: ${{ matrix.interpreter }}

build-windows-msi:
# ${{ '' } is a hack to nest jobs under the same sidebar category.
14 changes: 11 additions & 3 deletions .github/workflows/reusable-windows.yml
@@ -12,6 +12,10 @@ on:
required: false
type: boolean
default: false
interpreter:
description: Which interpreter to build (switch-case or tail-call)
required: true
type: string

env:
FORCE_COLOR: 1
@@ -20,7 +24,7 @@ env:

jobs:
build:
- name: Build and test (${{ inputs.arch }})
+ name: Build and test (${{ inputs.arch }}, ${{ inputs.interpreter }})
runs-on: ${{ inputs.arch == 'arm64' && 'windows-11-arm' || 'windows-2025-vs2026' }}
timeout-minutes: 60
env:
@@ -33,9 +37,12 @@ jobs:
if: inputs.arch != 'Win32'
run: echo "::add-matcher::.github/problem-matchers/msvc.json"
- name: Build CPython
+ # msvc::musttail is not supported for debug builds, so we have to
+ # switch to release.
run: >-
.\\PCbuild\\build.bat
- -e -d -v
+ -e -v
+ ${{ inputs.interpreter == 'switch-case' && '-d' || '--tail-call-interp -c Release' }}
-p "${ARCH}"
${{ fromJSON(inputs.free-threading) && '--disable-gil' || '' }}
shell: bash
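The flag selection in the reworked build step can be sketched in Python. This is only an illustration of the expression logic shown in the diff: the helper name `build_args` is hypothetical, and the workflow itself assembles the command through GitHub Actions expressions, not Python.

```python
def build_args(arch: str, interpreter: str, free_threading: bool) -> list[str]:
    """Hypothetical mirror of the 'Build CPython' step's flag selection."""
    args = [r".\PCbuild\build.bat", "-e", "-v"]
    if interpreter == "switch-case":
        # Switch-case interpreter keeps the debug build.
        args.append("-d")
    else:
        # Any other value falls through to the tail-call branch, which
        # needs a release build per the comment in the diff.
        args += ["--tail-call-interp", "-c", "Release"]
    args += ["-p", arch]
    if free_threading:
        args.append("--disable-gil")
    return args


if __name__ == "__main__":
    print(" ".join(build_args("x64", "tail-call", True)))
```

Note that, as in the workflow expression, any `interpreter` value other than `switch-case` selects the tail-call flags.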
@@ -45,6 +52,7 @@ jobs:
run: >-
.\\PCbuild\\rt.bat
-p "${ARCH}"
- -d -q --fast-ci
+ -q --fast-ci
+ ${{ inputs.interpreter == 'switch-case' && '-d' || '' }}
${{ fromJSON(inputs.free-threading) && '--disable-gil' || '' }}
shell: bash
35 changes: 0 additions & 35 deletions .github/workflows/tail-call.yml
@@ -23,41 +23,6 @@ env:
LLVM_VERSION: 21

jobs:
windows:
name: ${{ matrix.target }}
runs-on: ${{ matrix.runner }}
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
include:
- target: x86_64-pc-windows-msvc/msvc
architecture: x64
runner: windows-2025-vs2026
build_flags: ""
run_tests: true
- target: x86_64-pc-windows-msvc/msvc-free-threading
architecture: x64
runner: windows-2025-vs2026
build_flags: --disable-gil
run_tests: false
steps:
- uses: actions/checkout@v6
with:
persist-credentials: false
- uses: actions/setup-python@v6
with:
python-version: '3.11'
- name: Build
shell: pwsh
run: |
./PCbuild/build.bat --tail-call-interp ${{ matrix.build_flags }} -c Release -p ${{ matrix.architecture }}
- name: Test
if: matrix.run_tests
shell: pwsh
run: |
./PCbuild/rt.bat -p ${{ matrix.architecture }} -q --multiprocess 0 --timeout 4500 --verbose2 --verbose3

macos:
name: ${{ matrix.target }}
runs-on: ${{ matrix.runner }}
45 changes: 45 additions & 0 deletions Doc/library/profiling.sampling.rst
@@ -1003,6 +1003,47 @@ at the top indicate functions that consume significant time either directly
or through their callees.


Differential flame graphs
~~~~~~~~~~~~~~~~~~~~~~~~~

Differential flame graphs compare two profiling runs to highlight where
performance changed. This helps identify regressions introduced by code changes
and validate that optimizations achieved their intended effect::

# Capture baseline profile
python -m profiling.sampling run --binary -o baseline.bin script.py

# After modifying code, generate differential flamegraph
python -m profiling.sampling run --diff-flamegraph baseline.bin -o diff.html script.py

The visualization draws the current profile with frame widths showing current
time consumption, then applies color to indicate how each function changed
relative to the baseline.

**Color coding**:

- **Red**: Functions consuming more time (regressions). Lighter shades indicate
modest increases, while darker shades show severe regressions.

- **Blue**: Functions consuming less time (improvements). Lighter shades for
modest reductions, darker shades for significant speedups.

- **Gray**: Minimal or no change.

- **Purple**: New functions not present in the baseline.

Frame colors indicate changes in **direct time** (time when the function was at
the top of the stack, actively executing), not cumulative time including callees.
Hovering over a frame shows comparison details including baseline time, current
time, and the percentage change.

Some call paths may disappear entirely between profiles. These are called
**elided stacks** and occur when optimizations eliminate code paths or certain
branches stop executing. If elided stacks are present, a toggle appears that
lets you switch between the main differential view and an elided-only view
that shows just the removed paths (colored purple).
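The color rules described above can be sketched as a small classifier. This is an illustrative sketch, not the profiler's implementation: the function name `classify_frames`, the dictionary inputs (function name mapped to direct time), and the 10% tolerance for "minimal change" are all assumptions made for the example.

```python
def relative_change(before: float, now: float) -> float:
    """Fractional change in direct time relative to the baseline."""
    if before == 0:
        return 0.0 if now == 0 else float("inf")
    return (now - before) / before


def classify_frames(baseline, current, tolerance=0.10):
    """Return (colors, elided) for a hypothetical differential view.

    baseline/current map function names to direct time; tolerance is an
    assumed cutoff for "minimal or no change".
    """
    colors = {}
    for name, now in current.items():
        if name not in baseline:
            colors[name] = "purple"  # new, not present in the baseline
            continue
        change = relative_change(baseline[name], now)
        if abs(change) <= tolerance:
            colors[name] = "gray"    # minimal or no change
        elif change > 0:
            colors[name] = "red"     # more direct time: regression
        else:
            colors[name] = "blue"    # less direct time: improvement
    # Present in the baseline but gone from the current profile.
    elided = sorted(set(baseline) - set(current))
    return colors, elided


if __name__ == "__main__":
    baseline = {"parse": 40.0, "render": 30.0, "log": 10.0, "legacy": 20.0}
    current = {"parse": 60.0, "render": 15.0, "log": 10.5, "cache": 14.5}
    print(classify_frames(baseline, current))
```

The sketch compares direct time only, matching the statement that frame colors ignore time spent in callees.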


Gecko format
------------

@@ -1488,6 +1529,10 @@ Output options

Generate self-contained HTML flame graph.

.. option:: --diff-flamegraph <baseline.bin>

Generate a differential flame graph comparing the current run to a baseline binary profile.

.. option:: --gecko

Generate Gecko JSON format for Firefox Profiler.
10 changes: 8 additions & 2 deletions Doc/library/xml.etree.elementtree.rst
@@ -691,7 +691,7 @@ Functions
.. versionadded:: 3.2


- .. function:: SubElement(parent, tag, attrib={}, **extra)
+ .. function:: SubElement(parent, tag, /, attrib={}, **extra)

Subelement factory. This function creates an element instance, and appends
it to an existing element.
@@ -705,6 +705,9 @@ Functions
.. versionchanged:: 3.15
*attrib* can now be a :class:`frozendict`.

.. versionchanged:: next
*parent* and *tag* are now positional-only parameters.


.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
xml_declaration=None, default_namespace=None, \
@@ -880,7 +883,7 @@ Element Objects
:noindex:
:no-index:

- .. class:: Element(tag, attrib={}, **extra)
+ .. class:: Element(tag, /, attrib={}, **extra)

Element class. This class defines the Element interface, and provides a
reference implementation of this interface.
@@ -893,6 +896,9 @@ Element Objects
.. versionchanged:: 3.15
*attrib* can now be a :class:`frozendict`.

.. versionchanged:: next
*tag* is now a positional-only parameter.


.. attribute:: tag

3 changes: 2 additions & 1 deletion Include/internal/pycore_dict.h
@@ -138,13 +138,14 @@ extern PyObject *_PyDict_LoadBuiltinsFromGlobals(PyObject *globals);

/* Consumes references to key and value */
PyAPI_FUNC(int) _PyDict_SetItem_Take2(PyDictObject *op, PyObject *key, PyObject *value);
PyAPI_FUNC(int) _PyDict_SetItem_Take2_KnownHash(PyDictObject *op, PyObject *key, PyObject *value, Py_hash_t hash);
extern int _PyDict_SetItem_LockHeld(PyDictObject *dict, PyObject *name, PyObject *value);
// Export for '_asyncio' shared extension
PyAPI_FUNC(int) _PyDict_SetItem_KnownHash_LockHeld(PyDictObject *mp, PyObject *key,
PyObject *value, Py_hash_t hash);
// Export for '_asyncio' shared extension
PyAPI_FUNC(int) _PyDict_GetItemRef_KnownHash_LockHeld(PyDictObject *op, PyObject *key, Py_hash_t hash, PyObject **result);
- extern int _PyDict_GetItemRef_KnownHash(PyDictObject *op, PyObject *key, Py_hash_t hash, PyObject **result);
+ PyAPI_FUNC(int) _PyDict_GetItemRef_KnownHash(PyDictObject *op, PyObject *key, Py_hash_t hash, PyObject **result);
extern int _PyDict_GetItemRef_Unicode_LockHeld(PyDictObject *op, PyObject *key, PyObject **result);
PyAPI_FUNC(int) _PyObjectDict_SetItem(PyTypeObject *tp, PyObject *obj, PyObject **dictptr, PyObject *name, PyObject *value);

3 changes: 2 additions & 1 deletion Include/internal/pycore_long.h
@@ -64,7 +64,8 @@ PyAPI_FUNC(void) _PyLong_ExactDealloc(PyObject *self);
# error "_PY_NSMALLPOSINTS must be greater than or equal to 257"
#endif

- #define _PY_IS_SMALL_INT(val) ((val) >= 0 && (val) < 256 && (val) < _PY_NSMALLPOSINTS)
+ #define _PY_IS_SMALL_INT(val) \
+ (-_PY_NSMALLNEGINTS <= (val) && (val) < _PY_NSMALLPOSINTS)
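The effect of the macro change can be checked with a small Python model of both predicates. The constant values below are assumptions for illustration (the diff only guarantees that `_PY_NSMALLPOSINTS` is at least 257), and the function names are hypothetical.

```python
NSMALLNEGINTS = 5    # assumed value of _PY_NSMALLNEGINTS
NSMALLPOSINTS = 257  # assumed value; the header requires >= 257


def is_small_int_old(val: int) -> bool:
    """Model of the removed macro: non-negative and below 256."""
    return val >= 0 and val < 256 and val < NSMALLPOSINTS


def is_small_int_new(val: int) -> bool:
    """Model of the added macro: the full small-int range."""
    return -NSMALLNEGINTS <= val < NSMALLPOSINTS


if __name__ == "__main__":
    # Values where the two predicates disagree.
    changed = [v for v in range(-10, 300)
               if is_small_int_old(v) != is_small_int_new(v)]
    print(changed)
```

Under these assumed constants, the new predicate additionally accepts the small negative values and 256, which the old one rejected.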

// Return a reference to the immortal zero singleton.
// The function cannot return NULL.