Pytifex — Automated Differential Testing for Python Type Checkers

Python 3.12+ | License: MIT

Pytifex automatically discovers disagreements between Python type checkers by mining real bugs from type checker repositories, generating targeted test cases with an LLM, and establishing ground truth through multi-tiered evaluation.

How It Works

 Mine bugs from        Generate code         Run 4 type           Evaluate which
 GitHub issues    →    variations via    →    checkers on      →   checker is
 (mypy, ty, ...)       Gemini LLM            each example         correct
  1. Mine — Fetch real bug reports (false positives, false negatives) from mypy, pyrefly, ty, and pyright GitHub repositories
  2. Mutate — Use the bugs as seeds for Gemini to generate new code targeting similar edge cases
  3. Test — Run mypy, pyrefly, zuban, and ty on each generated example; keep only disagreements
  4. Evaluate — Determine which checker is correct using runtime crash detection, Hypothesis testing, PEP spec matching, and AST analysis
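
As a rough illustration of steps 3 and 4, the sketch below shows one way the disagreement filter and the runtime crash tier could look. It is not Pytifex's actual code: `CheckerResult`, `is_disagreement`, `raises_type_related_error`, and `checkers_supported_by_crash` are hypothetical names, the verdicts in the example are made up for demonstration, and how each checker's output becomes a boolean verdict is left out. It assumes that a `TypeError` or `AttributeError` at runtime counts as evidence for the checkers that reported an error, and that a clean run proves nothing either way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckerResult:
    """Hypothetical record of one checker's verdict on one example."""
    checker: str
    reported_error: bool


def is_disagreement(results: list[CheckerResult]) -> bool:
    """True when some checkers report an error and others do not."""
    return len({r.reported_error for r in results}) > 1


def raises_type_related_error(source: str) -> bool:
    """Run the example and report whether it fails with a type-related error.

    Assumption: only TypeError and AttributeError are treated as type-related.
    Generated code should be run in a sandbox; exec() here is for illustration.
    """
    try:
        exec(compile(source, "<example>", "exec"), {"__name__": "__example__"})
    except (TypeError, AttributeError):
        return True
    except Exception:
        # Other failures (e.g. ZeroDivisionError) say nothing about types.
        return False
    return False


def checkers_supported_by_crash(
    source: str, results: list[CheckerResult]
) -> list[str]:
    """Names of checkers whose reported error is backed by a runtime crash.

    Returns an empty list when the code runs cleanly, because the absence of
    a crash does not show that the code is type-safe.
    """
    if not raises_type_related_error(source):
        return []
    return [r.checker for r in results if r.reported_error]


# Illustrative input only: these verdicts are invented, not measured.
example = 'x: int = "a"\nx + 1\n'
results = [
    CheckerResult("mypy", True),
    CheckerResult("pyrefly", True),
    CheckerResult("zuban", True),
    CheckerResult("ty", False),
]

if is_disagreement(results):
    print(checkers_supported_by_crash(example, results))
```

The crash tier is deliberately one-sided: it can confirm a reported error but cannot clear one, which is why the later tiers (Hypothesis testing, PEP spec matching, AST analysis) are still needed for examples that run without failing.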

Type Checkers Tested

Checker   Version
mypy      1.19.0
pyrefly   0.44.2
zuban     0.3.0
ty        0.0.1-alpha.32

Quick Start

pip install pytifex

export GEMINI_API_KEY=your_key

uv run pytifex

Note: Pytifex is a research tool developed for a senior comprehensive project. It implements a bug-seeded mutation methodology for proactively finding type checker bugs before users encounter them.

Documentation