Use type checking to detect invalid mutants #468

Draft: Otto-AA wants to merge 1 commit into main from type-checking

Otto-AA commented Feb 14, 2026

Fixes #467

What works so far:

  • runs custom type checker on ./mutants/
  • parses errors from JSON output (pyright, pyrefly)
  • maps error lines to mutants
  • disables mutants
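The "map error lines to mutants" step can be sketched roughly as follows. This is only an illustration, not the code in this PR: `MutantSpan`, `mutants_with_type_errors`, and the mutant names are hypothetical, and it assumes the type checker's JSON output has already been reduced to `(path, line)` pairs.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MutantSpan:
    """Hypothetical record: where one mutant lives in a generated file."""
    name: str
    path: str
    first_line: int  # inclusive
    last_line: int   # inclusive


def mutants_with_type_errors(
    errors: Iterable[tuple[str, int]],
    spans: Iterable[MutantSpan],
) -> set[str]:
    """Return the names of mutants whose line range contains a reported error."""
    spans = list(spans)
    invalid: set[str] = set()
    for path, line in errors:
        for span in spans:
            if span.path == path and span.first_line <= line <= span.last_line:
                invalid.add(span.name)
    return invalid
```

Errors that fall outside every mutant's range are ignored here; whether they should instead abort the run is a separate design question.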

Big TODOs:

  • keep the original typing of class properties (pyright, at least, types self.x as str | None when one method assigns self.x = 'foo' and another assigns self.x = None)
  • keep the original typing of mutated methods (with def foo(*args, **kwargs): return _trampoline..., pyright can no longer use the signature of foo to infer types at call sites such as x = foo(a), which can break type checking elsewhere). Also see Preserve original signature in trampoline #465

I think for the class properties problem, we would need to define the mutated methods outside of the class and dynamically either add them or overwrite the original method.
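A minimal sketch of that idea, assuming a hypothetical helper `install_mutant` and a hypothetical mutant name `reset__mutant_1` (neither is part of this PR). The mutant body lives outside the class, so its assignment should no longer contribute to how the class's attributes are inferred; whether a given type checker then reports the assignment as an error is an assumption that needs to be verified per checker.

```python
from typing import Any, Callable


class Foo:
    def __init__(self) -> None:
        self.x = "foo"

    def reset(self) -> None:
        self.x = "foo"


# Mutated variant of Foo.reset, defined at module level instead of in the class.
def reset__mutant_1(self: Foo) -> None:
    self.x = None  # the mutation under test; ideally flagged by the type checker


def install_mutant(cls: type, method_name: str, mutant: Callable[..., Any]) -> None:
    """Overwrite (or add) a method on the class at runtime."""
    setattr(cls, method_name, mutant)
```

At runtime, `install_mutant(Foo, "reset", reset__mutant_1)` makes every `Foo` instance use the mutated body, while the class definition seen by the type checker stays unchanged.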

To keep the original types of mutated methods, using libcst to copy the signature should be fine.
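For illustration, here is one way the signature copy could look. Note that this sketch uses the standard-library `ast` module rather than libcst, purely to stay dependency-free; `trampoline_with_signature` and the `_dispatch` callee are hypothetical names, not the project's actual trampoline.

```python
import ast


def trampoline_with_signature(source: str, dispatcher: str = "_dispatch") -> str:
    """Return a forwarding stub that keeps the original parameters and annotations."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a single function definition")

    params = func.args
    # Positional parameters are forwarded positionally.
    call_args: list[ast.expr] = [
        ast.Name(id=p.arg, ctx=ast.Load()) for p in params.posonlyargs + params.args
    ]
    if params.vararg is not None:
        call_args.append(
            ast.Starred(value=ast.Name(id=params.vararg.arg, ctx=ast.Load()), ctx=ast.Load())
        )
    # Keyword-only parameters are forwarded by name, **kwargs is unpacked.
    keywords = [
        ast.keyword(arg=p.arg, value=ast.Name(id=p.arg, ctx=ast.Load()))
        for p in params.kwonlyargs
    ]
    if params.kwarg is not None:
        keywords.append(
            ast.keyword(arg=None, value=ast.Name(id=params.kwarg.arg, ctx=ast.Load()))
        )

    call = ast.Call(func=ast.Name(id=dispatcher, ctx=ast.Load()), args=call_args, keywords=keywords)
    func.body = [ast.Return(value=call)]  # replace the body, keep the signature
    return ast.unparse(func)
```

Unlike libcst, `ast.unparse` does not preserve comments or formatting, which is one reason libcst is likely the better fit for generated mutant files.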

Otto-AA force-pushed the type-checking branch 3 times, most recently from 2863520 to 00778dc, on February 15, 2026 10:24

Otto-AA commented Feb 15, 2026

At least for the small E2E test it works well with pyrefly: 🧙

image

I'll try to get it working with pyrefly on a medium-sized project first, and only later check whether pyright and others can also be supported. I think for pyright we would need to relax some type-checking rules.

I also need to debug why the tests fail in CI: it seems pyrefly outputs JSON plus normal text in CI, but only JSON locally.
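Until the cause is found, one defensive option is to pull the first JSON document out of mixed output. This is a sketch with a hypothetical helper name, and it makes no assumption about pyrefly's actual output layout beyond "some text, then a JSON object or array".

```python
import json
from typing import Any


def extract_json_payload(output: str) -> Any:
    """Return the first JSON object or array embedded in mixed tool output."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(output):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `index` and
            # tolerates trailing non-JSON text after it.
            value, _end = decoder.raw_decode(output, index)
        except json.JSONDecodeError:
            continue  # e.g. a "[INFO]" log prefix; keep scanning
        return value
    raise ValueError("no JSON document found in output")
```

This can misfire if a log line happens to contain valid JSON before the real payload, so it is a workaround rather than a fix.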

Otto-AA commented Feb 16, 2026

On my sample repo it worked for pyrefly and mypy, but not for pyright and ty.

The difference is that, in the following example, they infer these types for self.x inside c:

  • pyrefly: int
  • mypy: int
  • ty: Unknown | Literal[2, "a"]
  • pyright: int | str

from typing import reveal_type

class Foo:
    def __init__(self):
        self.x = 2

    def some_mutant(self):
        self.x = 'a'

    def c(self) -> int:
        reveal_type(self.x)
        return self.x

Thus, pyrefly and mypy infer the type of self.x from its first assignment (and report an error in some_mutant, because str is not assignable to int). Pyright and ty infer the union of all assignments (plus Unknown for ty?), so some_mutant does not produce an error; instead it changes the type of self.x, which can break previously valid typing in other methods such as c.

I can't currently think of a way to fix this, other than copying the whole class per mutant. I won't implement that for now, so this feature will likely be added only for pyrefly and mypy at first.
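For reference, "copying the whole class per mutant" could look roughly like the sketch below. It uses the standard-library `ast` module, and `class_copy_with_mutant` and the `Foo__mutant_1` naming are hypothetical. Each copy contains exactly one mutated method, so a union-inferring checker sees that mutant's assignments in isolation from the other mutants (though it would still merge them with the original methods of the copied class).

```python
import ast
import textwrap


def class_copy_with_mutant(class_source: str, mutated_method_source: str, new_name: str) -> str:
    """Return a renamed copy of the class with one method swapped for its mutant."""
    cls = ast.parse(textwrap.dedent(class_source)).body[0]
    mutant = ast.parse(textwrap.dedent(mutated_method_source)).body[0]
    if not isinstance(cls, ast.ClassDef) or not isinstance(mutant, ast.FunctionDef):
        raise ValueError("expected one class definition and one method definition")

    cls.name = new_name
    # Swap in the mutant for the method with the same name; keep everything else.
    cls.body = [
        mutant if isinstance(node, ast.FunctionDef) and node.name == mutant.name else node
        for node in cls.body
    ]
    return ast.unparse(cls)
```

The obvious cost is output size: a class with many mutants would be duplicated once per mutant.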

Remaining TODOs:

  • unit test error parsing
  • revisit error handling/messages
  • hide filtered mutants in the browser TUI
  • sample some of the filtered mutants in my sample repo to check if it works as expected
  • wait for the next pyrefly release, which fixes the pyrefly JSON output in GitHub Actions

Successfully merging this pull request may close these issues.

Type hints/checking to reduce mutants