This directory contains miscellaneous scripts that are useful for working on or with mathlib. When adding a new script, please make sure to document it here, so other readers have a chance to learn about it as well!
Installation scripts
install_debian.sh
,install_macos.sh
Installation scripts referenced from the leanprover community install pages. https://leanprover-community.github.io/install/debian.html https://leanprover-community.github.io/install/macos.html If these web pages are deprecated or removed, we should remove these scripts.
Repository analysis and reporting
-
user_activity_report.py
Generates a comprehensive report of all users with repository access and their last commit activity. Shows username, age of last commit, and access level, sorted by commit recency (most recent first).Features:
- Fetches repository collaborators and organization members via GitHub API
- Intelligent caching: user lists (24h TTL) and commit data (6h TTL) for performance
- Access level filtering:
--admin
(admin users only),--write
(write+ access) - Single user analysis:
--user USERNAME
for debugging specific users - Result limiting:
--limit N
for testing with smaller datasets - Inactive user cleanup:
--remove N
generates (but doesn't execute) gh commands to remove write access from non-admin users inactive for more than N days - Fallback to contributors API if collaborators access is restricted (
--contributors-only
)
Caching: Results cached in
scripts/users_cache.json
andscripts/commits_cache.json
(automatically added to .gitignore). Cache saved after each commit lookup to prevent data loss.Requirements:
gh
(GitHub CLI) installed and authenticated (gh auth login
).
Tools for manual maintenance
-
fix_unused.py
Bulk processing of unused variable warnings, replacing them with_
. -
add_deprecations.sh
is a text-based script that automatically adds deprecation statements. It assumes that the only difference between master and the current status of the PR consists of renames. More precisely, any change on a line that contains a declaration name and is not a rename, will likely confuse the script. -
create_deprecated_modules.lean
defines the#create_deprecated_modules
command that automatically generates thedeprecated_module
entries, gathering information fromgit
. The expectation is that this will be expanded to a fully automated process that happens in CI. -
migrate_to_fork.py
Helps contributors migrate from having direct write access to the main repository to using a fork-based workflow. This comprehensive script automates the entire migration process:- Validates the current branch (prevents migration of system branches like master, nightly-testing)
- Checks GitHub CLI installation/authentication with OS-specific installation instructions
- Creates or syncs a fork of mathlib4 automatically
- Sets up git remotes correctly (
upstream
for leanprover-community/mathlib4,origin
for user's fork) - Detects already-completed migration steps and skips them for efficiency
- Migrates the current branch to the fork with proper upstream tracking
- Intelligently handles existing PRs (migrates main repo PRs to fork-based PRs, detects existing fork PRs)
- Uses fast delete/re-add approach for remote operations to avoid slow branch tracking updates
- Provides comprehensive status reporting and next steps guidance
Run with
python3 scripts/migrate_to_fork.py
(interactive) orpython3 scripts/migrate_to_fork.py -y
(auto-accept). Requires GitHub CLI (gh
) installed and authenticated. Safe to run multiple times.
-
githelper.py
The subcommandgithelper.py fix
helps contributors fix their git repository setup by step-by-step converting it from its current state to a well-defined target state. The target state mostly matches the state after of a freshly cloned fork (gh repo clone <fork>
) and looks like this:- The remote
upstream
points toleanprover-community/mathlib4
- The remote
origin
points to the contributor's own fork - The
gh
default repo points toleanprover-community/mathlib4
master
s remote isupstream
but its pushRemote isorigin
Other subcommands to automate git-related actions may be added in the future.
- The remote
Analyzing Mathlib's import structure
unused_in_pole.sh
(followed by an optional<target>
, defaulting toMathlib
) callslake exe pole --loc --to <target>
to compute the longest pole to a given target module, and then feeds this intolake exe unused
to analyze transitively unused imports. Generatesunused.md
containing a markdown table showing the unused imports, and suggestslake exe graph
commands to visualize the largest "rectangles" of unused imports.
CI workflow
lake-build-with-retry.sh
Runslake build
on a target untillake build --no-build
succeeds. Used in the main build workflows.mk_all.lean
run vialake exe mk_all
, regenerates the import-only filesMathlib.lean
,Mathlib/Tactic.lean
,Archive.lean
andCounterexamples.lean
lint-style.lean
,lint-style.py
,print-style-errors.sh
style linters, written in Python and Lean. Run vialake exe lint-style
. Medium-term, the latter two scripts should be rewritten and incorporated inlint-style.lean
.lint-bib.sh
normalize the BibTeX filedocs/references.bib
usingbibtool
.yaml_check.py
,check-yaml.lean
Sanity checks forundergrad.yaml
,overview.yaml
,100.yaml
and1000.yaml
.lean-pr-testing-comments.sh
Generate comments and labels on a Lean or Batteries PR after CI has finished on a*-pr-testing-NNNN
branch.update_nolints_CI.sh
Update thenolints.json
file to remove unneeded entries. Automatically run once a week.assign_reviewers.py
is used to automatically assign a reviewer to each stale github PR on the review queue. This script downloads a .json file with proposed assignments and makes the corresponding github API calls.bench_summary.lean
Convert data retrieved from the speed center into a shorter, more accessible format, and post a comment with this summary on github.declarations_diff.sh
Attempts to find which declarations have been removed and which have been added in the current PR with respect tomaster
, and posts a comment on github with the result.autolabel.lean
is the Lean script in charge of automatically adding at-
label on eligible PRs. Autolabelling is inferred by which directories the current PR modifies.
Managing nightly-testing and bump branches
-
create-adaptation-pr.sh
implements some of the steps in the workflow described at https://leanprover-community.github.io/contribute/tags_and_branches.html#mathlib-nightly-and-bump-branches Specifically, it will:- merge
master
intobump/v4.x.y
- create a new branch from
bump/v4.x.y
, calledbump/nightly-YYYY-MM-DD
- merge
nightly-testing
into the new branch - open a PR to merge the new branch back into
bump/v4.x.y
- announce the PR on zulip
- finally, merge the new branch back into
nightly-testing
, if conflict resolution was required.
If there are merge conflicts, it pauses and asks for help from the human driver.
- merge
-
merge-lean-testing-pr.sh
takes a PR numberNNNN
as argument, and attempts to merge the branchlean-pr-testing-NNNN
intomaster
. It will resolve conflicts inlean-toolchain
,lakefile.lean
, andlake-manifest.json
. If there are more conflicts, it will bail.
Managing downstream repos
downstream_repos.yml
contains basic information about significant downstream repositories.downstream-tags.py
is a script to check whether a given tag exists on the downstream repositories listed indownstream_repos.yml
.
Managing and tracking technical debt
technical-debt-metrics.sh
Prints information on certain kind of technical debt in Mathlib. This output is automatically posted to zulip once a week.long_file_report.sh
Prints the list of the 10 longest Lean files inMathlib
. This output is automatically posted to zulip once a week.
Mathlib tactics
polyrith_sage.py
,polyrith_sage_helper.py
are required forpolyrith
to communication with the Sage server.
Data files with linter exceptions
nolints.json
contains exceptions for allenv_linter
s in mathlib. For permanent and deliberate exceptions, add a@[nolint lintername]
in the .lean file instead.nolints_prime_decls.txt
contains temporary exceptions for thedocPrime
linter
Both of these files should tend to zero over time; please do not add new entries to these files. PRs removing (the need for) entries are welcome.
API surrounding CI
parse_lake_manifest_changes.py
compares two versions oflake-manifest.json
to report dependency changes in Zulip notifications. Used by theupdate_dependencies_zulip.yml
workflow to show which dependencies were updated, added, or removed, with links to GitHub diffs.update_PR_comment.sh
is a script that edits an existing message (or creates a new one). It is used by thePR_summary
workflow to maintain an up-to-date report with a searchable history.get_tlabel.sh
extracts thet-
label that a PR has (assuming that there is exactly one). It is used by themaintainer_merge
family of workflows to dispatchmaintainer merge
requests to the appropriate topic on zulip.count-trans-deps.py
,import-graph-report.py
andimport_trans_difference.sh
produce various summaries of changes in transitive imports that thePR_summary
message incorporates.zulip_emoji_reactions.py
is called- every time a
bors d
,bors merge
orbors r
comment is added to a PR, - whenever bors merges a PR,
- whenever a PR is closed or reopened
- whenever a PR is labelled or unlabelled with
awaiting-author
ormaintainer-merge
It looks through all zulip posts containing a reference to the relevant PR and will post or update an emoji reaction corresponding to the current PR state to the message. This reaction is ✌️ (:peace_sign:
) for delegated,:bors:
for PRs sent to bors,:merge
for merged PRs, ✍️ (:writing:
) for PRs awaiting-author, 🔨 (:hammer:
) for maintainer-merged PRs and:closed-pr:
for closed PRs. PRs which were migrated to a fork (as indicated by themigrated-to-fork
label) additionally receive a reaction ... (skip_forward
). Two of these are custom emojis configured on zulip.
- every time a
late_importers.sh
is the main script used by thelatest_import.yml
action: it formats thelinter.minImports
output, summarizing the data in a table. See the module docs oflate_importers.sh
for further details.maintainer_merge_message.sh
contains a shell script that produces the Zulip message for amaintainer merge
/maintainer delegate
comment.
Docker images
docker_build.sh
builds thelean4
,gitpod4
, andgitpod4-blueprint
Docker images. These are used by some CI workflows, as well as places such as Gitpod.docker_push.sh
first runsdocker_build.sh
, and then pushes the images to Docker Hub, appropriately tagged with the date on which the images were built. This should be re-run after breaking changes toelan
, so that CI and Gitpod have access to updated versions ofelan
.