This Month in Our Test Infra: January and February 2025
This is a quick summary of the changes in the test infrastructure for the rust-lang/rust repository1 for January and February 20252.
As usual, if you encounter bugs or UX issues when using our test infrastructure, please file an issue. Bugs and papercuts can't be fixed if we don't know about them!
Thanks to everyone who contributed to our test infra!
Highlights
ci.py is now a proper citool Rust crate
The old ci.py Python script used to orchestrate CI jobs was unmaintainable. Any changes to the python script risked bringing down the entire queue or bypass testing entirely. There was practically no test coverage. CI UX improvements were hard to implement and difficult to review.
So, Jakub decided enough was enough and rewritten src/ci/github-actions/ci.py as src/ci/citool, a proper Rust CLI tool. This allowed the job definitions to be properly parsed and handled, and also enabled adding unit tests. It also allowed improving some error messages. Furthermore, it improved the UX of running the CI jobs locally (on Linux). Consult the rustc-dev-guide docs in rust-lang/rust for updated running instructions (at the time of writing, this hasn't been synced back to rustc-dev-guide yet).
try-jobs now supports glob patterns for job names
As part of CI efficiency efforts, many CI jobs have been split into multiple jobs (e.g. x86_64-apple-{1,2}) to balance between runner capability and build/test times. This had an unfortunate side effect of making it more difficult to know which job name you need to specify to run the tests you want in custom try jobs.
https://github.com/rust-lang/rust/pull/138307 permits the contributor to write glob patterns to match on job names (up to 20 matching jobs, see next highlight). For instance, if you wanted to run all msvc-related jobs as part of try-jobs, you no longer have to specify a whole bunch of e.g. try-job: x86_64-msvc-1, try-job: x86_64-msvc-2, try-job: dist-x86_64-msvc, try-job: i686-msvc-1, try-job: i686-msvc-2. Instead, you are now able to write
try-job: `*msvc*`
Which will expand to match against (and thus run) all of x86_64-msvc-{1,2}, i686-msvc-{1,2} and dist-x86_64-msvc.
Note the backticks (`) surrounding the glob pattern. This is needed to prevent GitHub UI from interpreting the glob pattern as (possibly mismatched) markdown italics markup. The backticks will be ignored by CI tooling.
Max custom try-job job limit is now 20 (instead of 10)
You can now run up to 20 custom try-jobs instead of the previous limit of 10.
The Makefile-based run-make test infrastructure has been retired
Almost 8 years ago, astute early contributors noticed that the Makefile-based run-make tests were both hard-to-run and hard-to-write. It was proposed that we switch run-make tests from Makefile to rust for multiple motivations, such as:
- Make it more accessible for contributors.
Makefilesyntax (with bash intertwined) and semantics is its own source of bugs and footguns, and is a frequent source of frustrations. - Reduce dependency on external tools (especially external bin tools) where feasible and beneficial.
- Become less platform-dependent.
- Avoid having to deal with different flavors of
makes (GNU make of various versions,nmake) that are (subtly) incompatible with each other3. - Make it possible to not have to use some kind of Unix-compatibility layer (e.g. MSYS) to run the test suite on Windows natively (be it MSVC or mingw).
In 2023, after consultation with multiple contributors, we converged on a new run-make test infrastructure that has two key components:
- Each
run-maketest consists of a test recipe,rmake.rs. This is the main test file. - The test recipe has access to a test support library called
run-make-support. The support library centralizes common helpers that differentrun-maketests use. It also allows re-exporting useful ecosystem crates for use by tests, such asobjectorregex. Ecosystem crates make it possible forrmake.rstests to perform more precise checks than the text-based manipulations mostMakefile-based tests use.
The most important difference here is perhaps improved accessibility to Rust contributors. The rmake.rs tests are just ordinary Rust programs. This means the contributor does not need to be constantly fighting all the Makefile and shell syntax quirks (the multitude of quoting styles, interpolation, etc.) or behavioral quirks (e.g. pipefail)4.
There are 200+ run-make tests, so we couldn't port them all in one go. Instead, the approach taken was:
- The legacy
Makefile-basedrun-maketest infra co-existed with the newrmake.rs-basedrun-maketest infra. Which test infra was used depended upon whether the test directory containedMakefileorrmake.rs. - We maintained a quest-like tracking issue that exhaustively listed all the
Makefile-basedrun-maketests that needed to be ported, and tracked their migration progress. Contributors were invited to claim specific tests that they wanted to help port.- This divided the workload between many contributors to make this migration possible. This is still mentored if the contributor needed assistance or wanted to discuss the approach, such as if they wanted to run the test against specific
try-jobs. - Through a mentored Google Summer of Code (GSoC) 2024 project, @Oneirical worked on porting a majority of the
run-maketests. You can read their final GSoC report here. - Many maintainers also helped with infrastructure, reviews, testing and providing suggestions, and also authoring migration PRs themselves.
- Thanks to everyone who helped in this effort!
- This divided the workload between many contributors to make this migration possible. This is still mentored if the contributor needed assistance or wanted to discuss the approach, such as if they wanted to run the test against specific
- Adopt a migration process that was not a naive 1-to-1 port. Where possible, contributors tried to improve the tests to:
- Become well documented, by linking to relevant context, references, discussions, implementation history and issues where suitable. Many
Makefileversions of the tests did not have any test descriptions. There was a lot of git archaeology involved in figuring out what the tests were trying to test in the first place. - Actually test what the test wanted to test. For example,
tests/run-make/translationdid not test what it wanted to test because theMakefiledidn't setSHELL=/bin/bash -o pipefail. - Become more precise and less fragile. Quite a few of
run-maketests were able to make use of the excellentobjectcrate to perform structured analysis on binaries (for symbols and debuginfo), as opposed trying to do text grepping on human-readable textual output of bin tools (likeobjdumpornm, where the CLI interface and textual output format can also be different between platforms).
- Become well documented, by linking to relevant context, references, discussions, implementation history and issues where suitable. Many
The migration effort took around a year, until we were finally able to declare all Makefile-based run-make tests ported, and thus we were able to retire the legacy Makefile-based test infrastructure in early 2025.
Of course, the new test infrastructure isn't all sunshine and rainbows. There are still issues, desired improvements and test UX papercuts that await to be addressed. However, like the overall test infra, they can be and will be improved over time.
Bootstrap test and build step metrics are now available in GitHub job summaries
https://github.com/rust-lang/rust/pull/137077 implemented postprocessing logic for bootstrap test and build metrics to convert them into GitHub job summaries.

Notable changes
This section is intended to be like "compatibility notes" but for human test writers.
rustc-based (ToolRustc) tools have unified staging handling
Tools that wants to use a locally built rustc previously inconsistently implemented their own staging logic in their tool and test steps. This caused a lot of confusion as different ToolRustc tools (and their tests) handled the staging differently; some had unnecessarily builds while others were seemingly "off by one stage". There were hacks in various places to "chop off" or "increment" stages separately. To make this situation more maintainable, https://github.com/rust-lang/rust/pull/137215 unifies ToolRustc tool staging logic.
Notably, ./x test without args and ./x test src/tools/{cargo,clippy}, where possible, now default to stage 2. Previously, ./x test src/tools/{cargo,clippy} without explicit test stage configuration corresponded to --stage 1 but they actually required building stage 2 rustc anyway. Bootstrap will now warn if you try to specify a test stage < 2 when testing these two tools (that they don't necessarily work against stage 1 rustc is an pre-existing issue).
Additionally, the previous ./x build $rustc_tool --stage 0 invocation (not std or bootstrap tools) is now equivalent to ./x build $rustc_tool --stage 1. Before https://github.com/rust-lang/rust/pull/137215, stages for rustc tools in build flows were incremented by inconsistent adjustments, and when --stage N was specified on the ./x build $rustc_tool invocation it would build stage N+1 rustc. Now, ./x build $rustc_tool --stage N will produce a rustc-tool using stage N rustc.
Consult the new Writing tools in Bootstrap chapter for further clarification on picking a correct bootstrap tool mode.
run-make-support and rmake.rs is now fixed to be built with stage 0 compiler
See https://github.com/rust-lang/rust/pull/137373 and https://github.com/rust-lang/rust/pull/137537.
Previously, run-make-support and rmake.rs was mistakenly built with top-stage compiler, but this was wrong. run-make-support and rmake.rs should be built with the stage 0 compiler (they are test infra and needs to be reliable regardless of the possibly broken stage > 0 compiler under test). This caused a few rmake.rs tests to accidentally be using unstable features in the test recipes themselves, which will cause issues for beta/stable backports/bumps, and will also cause issues for out-of-tree codegen backends like rustc_codegen_cranelift that needs to run run-make tests at stage 0.
The docs are also updated to explicitly clarify that run-make-support and rmake.rs may not use unstable features.
core and alloc unit tests are now located in separate coretests and alloctests packages respectively
Having std tests in the same package as a std crate has issues such as
causing the test to have a dependency on a locally built standard library crate, while also indirectly depending on it through
libtest
https://github.com/rust-lang/rust/pull/135937 moves core tests and https://github.com/rust-lang/rust/pull/136642 moves alloc tests into separate packages that does not depend on core to prevent the duplicate crates problem, even when compiler flags don't match between sysroot build and test build.
Other parts of std still has this problem. This is part of an on-going effort to make std tests more robust and more buildable by custom codegen backends.
PR listing
Improvements
- compiletest and test suites: Implement
needs-subprocessdirective, and cleanup a bunch of tests to useneeds-{subprocess,threads} - compiletest: Add directives to ignore
arm-unknown-*targets - compiletest: Add
{ignore,only}-rustc_abi-x86-sse2directives - run-make: Port
split-debuginfoto rmake.rs - library tests: Put the
coreunit tests in a separatecoretestspackage - library tests: Put the
allocunit tests in a separatealloctestspackage - bootstrap, library tests: Various
coretestsimprovements - CI: Rewrite the
ci.pyscript in Rust - bootstrap: Stabilize stage management for rustc tools
- CI, citool: Add post-merge analysis CI workflow
- CI, citool: Postprocess bootstrap metrics into GitHub job summary
- CI, citool: Increase the max. custom try jobs requested to
20 - CI, citool: Allow specifying glob patterns for try jobs
Fixes
- compiletest: Remove a footgun-y feature / relic of the past from the compiletest DSL.5
- compiletest: Perform deeper compiletest path normalization for
$TEST_BUILD_DIRto account for compare-mode/debugger cases, and normalize long type file filename hashes - compiletest: compiletest should not inherit all host
RUSTFLAGS - bootstrap, compiletest, run-make-support and run-make tests: Compile
run-make-supportandrun-maketests with the bootstrap compiler - compiletest and run-make tests: Prevent
rmake.rsfrom using unstable features, and fix 3 run-make tests that currently do - compiletest and run-make tests: Include
stage0-sysrootlibstd dylib in recipe dylib search path - bootstrap: Fix
x test --stage 1 ui-fulldepson macOS (until the next beta bump) - bootstrap: Add build step log for
run-make-support - bootstrap: Use stage 2 on
cargoandclippytests when possible - CI, citool: Handle empty test suites in GitHub job summary report
Cleanups
- compiletest: Add erroneous variant to
string_enums conversions error - compiletest: Cleanup
is_rustdoclogic and remove a useless path join inrustdoc-jsonruntest logic - compiletest: Feed stage number to compiletest directly
- compiletest: Make the distinction between sources root vs test suite sources root in compiletest less confusing
- compiletest: Make the distinction between root build directory vs test suite specific build directory in compiletest less confusing
- compiletest: Retire the legacy
Makefile-basedrun-maketest infra - bootstrap and compiletest: Use
size_of_valfrom the prelude instead of imported - bootstrap: Clean up code related to the
rustdoc-jstest suite - tests: Remove generic
//@ ignore-{wasm,wasm32,emscripten}in tests
Documentation updates
Note that since rustc-dev-guide became a josh subtree in rust-lang/rust, some doc updates are made alongside the rust-lang/rust PR themselves.
- CI, citool: Fix docker run-local docs
- rustc-dev-guide: Document how to find the configuration used in CI
- rustc-dev-guide: Fix outdated
rustdoc-jstest suite name - rustc-dev-guide: Rewrite section on executing Docker tests
- rustc-dev-guide: Remove "Port run-make tests from Make to Rust" tracking issue from Recurring work
- rustc-dev-guide: compiletest directives:
ignore-stage0andonly-stage0do not exist - rustc-dev-guide: Clean
--blesstext
-
The test infra here refers to the test harness compiletest and supporting components in our build system bootstrap. This test infra is used mainly by rustc and rustdoc. Other tools like cargo, miri or rustfmt maintain their own test infra. ↩
-
I may or may not have forgotten about the January issue last month. Oops. ↩
-
The test suite even had to maintain behavioral tests for
grepbecause there are different flavors ofgrepthat are incompatible with each other and had different CLI interfaces / behavior. ↩ -
During the porting process, we found multiple tests that had varying degree of brokenness due to hard to notice
Makefileand shell quirks. ↩ -
this person is a goober who left a
FIXMEcomment to remind themselves to fix this in a follow-up but forgot to follow-up. ↩