
Use WAL with SQLite cache, fix close #21154

Open
hauntsaninja wants to merge 4 commits into python:master from hauntsaninja:sqlitewal

Conversation

@hauntsaninja
Collaborator

@hauntsaninja hauntsaninja commented Apr 3, 2026

This is the more modern way to manage concurrency with SQLite. Relevant to the current discussion, it means concurrent mypy runs using the cache will wait for each other rather than fail.

SQLite also claims this is significantly faster, but I haven't yet done a good profile (if you are profiling this, note that WAL is a persistent setting, so you will want to delete the cache). This might also allow removing the PRAGMA synchronous=OFF.

Finally, I also explicitly close the connection in main. This is relevant to this change because it forces checkpointing of the WAL, which keeps reads fast, reduces disk space, and means the cache.db remains a single self-contained file under regular use.
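For illustration, here is a minimal standalone sketch of the behaviour described above, using Python's sqlite3 module. The path and table schema are made up for this example and are not mypy's actual cache code:

```python
import os
import sqlite3
import tempfile

# Hypothetical cache location for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "cache.db")

conn = sqlite3.connect(db_path)
# journal_mode=WAL is a persistent, database-level setting: it is
# recorded in the file and survives across connections (hence the note
# above about deleting the cache before profiling).
conn.execute("PRAGMA journal_mode=WAL")
# A busy timeout makes a concurrent writer wait for the lock (up to the
# timeout) instead of failing immediately with "database is locked".
conn.execute("PRAGMA busy_timeout=30000")  # milliseconds

conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, data BLOB)")
conn.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", ("a.py", b"cache data"))
conn.commit()

# Closing the last connection checkpoints the WAL and removes the
# cache.db-wal sidecar file, so cache.db stays a single self-contained
# file under regular use.
conn.close()
```

After conn.close(), no cache.db-wal file remains next to cache.db, and any later connection inherits WAL mode automatically.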

Fixes #21136, see also discussion in #13916

For what it's worth, I feel there are many legitimate uses of concurrent mypy. At work, we often share cache between multiple projects. At home, I often end up having parallel runs with a debugger while working on mypy (although this PR just makes those ones hang waiting for the lock lol)

@hauntsaninja hauntsaninja marked this pull request as draft April 3, 2026 01:30
@github-actions
Contributor

github-actions bot commented Apr 3, 2026

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@ilevkivskyi
Member

For what it's worth, I feel there are many legitimate uses of concurrent mypy

Can you give some more concrete examples? Also, the problem is, as I mentioned in the other issue, that even if there are such uses, the current incremental logic was not designed for this, and it would be tricky to guarantee correctness.

@ilevkivskyi
Member

Anyway, I am not strongly against this per se. As I mentioned in #13916 (comment), IMO the key point is to give a loud warning; the exact best-effort semantics are not as important then.

Also, to be clear, although this will make the SQLite cache behave like the FS cache, it will not solve other concurrency-related crashes like #14521 or #18473, while disabling the cache completely would fix those.

@hauntsaninja
Collaborator Author

Sure, I can talk a little more about the work use case. We have a monorepo with lots of first-party projects. These often have similar dependency graphs, and sharing the cache across them helps: e.g. import torch is slow if you analyse it cold, but now it will always be warm, and you avoid ending up with gigabytes of duplicated mypy cache everywhere.

Note this behaviour will still be a little different from the FS cache (and so might reduce the likelihood of those issues). Once a connection has started writing, other connections will block until it commits at the end of the build. If we do want the fallback behaviour to match the FS cache, we should set isolation_level=None so each write is its own transaction.
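As a hedged sketch of the isolation_level=None alternative (the path and schema are made up for illustration, not mypy's cache code):

```python
import os
import sqlite3
import tempfile

# Hypothetical cache location for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "cache.db")

# isolation_level=None puts sqlite3 in autocommit mode: the module no
# longer issues an implicit BEGIN, so each statement commits on its own
# instead of one transaction holding the write lock until the end of
# the build.
writer = sqlite3.connect(db_path, isolation_level=None)
writer.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB)")
writer.execute("INSERT INTO files VALUES (?, ?)", ("a.py", b"cache data"))

# No writer.commit() has been called, yet a second connection can
# already see the row, because the INSERT was its own transaction.
reader = sqlite3.connect(db_path)
row = reader.execute("SELECT data FROM files WHERE path = ?", ("a.py",)).fetchone()
```

With this setting the write lock is held only per statement, which is closer to how the FS cache lets concurrent runs interleave; the trade-off is more commits, which is presumably where the performance concern below comes in.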

@hauntsaninja hauntsaninja marked this pull request as ready for review April 3, 2026 21:06
@ilevkivskyi
Member

These often have similar dependency graphs

OK, in such cases parallel mypy invocations will likely work, unless you use different options. But even in such cases I would give a warning, to make it clear that we can't guarantee correctness in general, and users can do this at their own risk.

Once a connection has started writing, this will block until the connection commits at the end of the build

And how exactly will this work in the case of parallel type checking? In that case we want the workers to be reading/writing ~freely (because the coordinator is already making sure they are scheduled in a way that guarantees correctness).

If we do want the fallback behaviour to match FS cache, we should set isolation_level=None so each write is its own transaction

This seems like a better option, but I vaguely remember trying it at some point and not liking it in terms of performance, though it may well be that I did something wrong. In general, it would be good to have performance measurements for parallel checking (say, on self-check with a cold cache) for both of these options vs the status quo. This is arguably a niche use case, and I don't want to sacrifice performance for everyone else because of it.

@ilevkivskyi
Member

ilevkivskyi commented Apr 4, 2026

Btw couple notes on performance measurements for parallel checking:

  • Always use a compiled version, and run self-check from outside the mypy directory; otherwise Python will find the local (interpreted) version of mypy.build_worker.worker.
  • Do ~twice as many runs as you normally would, since time variance for parallel checking is much higher.
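A hypothetical measurement harness along these lines (not part of mypy; the command is a stand-in for the real compiled self-check invocation) would repeat the run and report median and spread rather than a single number:

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd: list[str], n: int) -> list[float]:
    """Run cmd n times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in command; substitute the compiled mypy self-check invocation,
# run from outside the mypy directory. Use more repetitions than usual
# since parallel-checking times are noisy.
runs = time_runs([sys.executable, "-c", "pass"], 5)
print(f"median={statistics.median(runs):.3f}s  stdev={statistics.stdev(runs):.3f}s")
```

Reporting the stdev alongside the median makes it harder to over-interpret a lucky or unlucky parallel run.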



Development

Successfully merging this pull request may close these issues.

Mypy 1.20 crashes on concurrent runs
