This Development-cycle in Cargo: 1.92
This is a summary of what has been happening around Cargo development for the last 6 weeks which is approximately the merge window for Rust 1.92.
Plugin of the cycle
Cargo can't be everything to everyone, if for no other reason than the compatibility guarantees it must uphold. Plugins play an important part in the Cargo ecosystem and we want to celebrate them.
Our plugin for this cycle is cargo-wizard which can optimize your project for build times, runtime performance, or binary size.
Thanks to Kobzol for the suggestion!
Please submit your suggestions for the next post.
Implementation
Build performance guide
On Zulip, Kobzol floated the idea of a build performance guide being added to the Cargo book. The first thing we needed to work out was how to handle having small reviewable chunks while having enough content to justify the document. We decided to hold off on merging anything until the start of this development cycle. The guide was introduced in #15970.
Ideally, this guide wouldn't be needed. In some cases, there are steps we can take to obsolete a section, like providing a meaningful unused dependency warning (#15813) rather than suggesting tools that try to guess what dependencies are unused. In some cases, builds are slower by default as we try to balance several competing needs. However, even in those cases, we can evaluate whether we have the right balance or if there is another way to meet multiple needs (e.g. #15931). We decided to link out to this content to help raise awareness of these efforts to track them or participate.
Going forward, we are going to need to figure out how to balance what optimizations to include and how to talk about them. How do we vet that an optimization is actually beneficial? How much of an improvement is worth mentioning? How niche or tricky of an optimization is worth including? We dealt a little bit with this when adding documentation about linkers (#15991) because some platforms already have fast linkers and making linking slightly faster than that is very different than switching from a slow linker to a faster one.
We're tracking further progress on this effort at #16119.
Cargo Script
Update from 1.86
epage posted the stabilization report for the Rust frontmatter syntax, the first step towards stabilizing Cargo script. Cargo's frontmatter parser was also updated to better match rustc's whitespace handling (#15975) and error messages (#15952, #15972).
build-dir (docs), which split out of target-dir in Cargo 1.91, was modeled off of Cargo script but implemented independently. In #16073, Cargo script switched to using build-dir = "{cargo-cache-home}/build/{workspace-path-hash}" which is proposed to be the new build-dir default eventually (#16147). However, this did lead to issues with memfd (#16110) which still needs discussion.
To match the semantics of build-dir being for internals, Cargo script's Cargo.lock was moved into build-dir (#16087).
In preparing to stabilize Cargo script, the Cargo team talked through some of the open questions.
In #12870,
novafacing requested a way to get the script's original path.
CARGO_MANIFEST_PATH was previously added but didn't meet everyone's needs.
Nemo157 pointed out that ideally CLI parsers report the script, not the binary, in usage examples.
There isn't really a way for libraries like clap to detect and workaround this,
requiring hacks on the user end.
They suggested Cargo override arg[0] which is what CLI parsers use for usage examples.
When we discussed this as a team, we were interested in people being able to get both pieces of information, the binary and the source.
We were also concerned about platform support for setting arg[0] and current_exe.
Granted, shebangs are also not supported on every platform.
Python and Ruby report arg[0] as the script but they have more control over the behavior.
In the end, we decided on setting arg[0] where possible, on a best-effort basis.
We will leave current_exe untouched to serve as the way to access the binary path.
We would be open to people contributing support for more platforms,
likely through contributing it to std.
Setting of arg[0] was implemented in #16027.
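To make the split concrete, here is a minimal sketch of how a script could read both pieces of information using only std. The helper names `invoked_as` and `binary_path` are made up for illustration. Whether the first argument is actually the script path depends on the best-effort platform support described above.

```rust
use std::env;
use std::io;
use std::path::PathBuf;

/// The name the program was invoked as (what this post calls `arg[0]`).
/// For a Cargo script this is the script path only where Cargo was able
/// to set it; otherwise it is whatever the OS passed along.
fn invoked_as() -> Option<String> {
    env::args_os()
        .next()
        .map(|arg| arg.to_string_lossy().into_owned())
}

/// The path of the compiled binary that is actually running.
/// Cargo leaves this untouched, so it remains the way to find the binary.
fn binary_path() -> io::Result<PathBuf> {
    env::current_exe()
}

fn main() {
    match invoked_as() {
        Some(name) => println!("invoked as: {name}"),
        None => println!("invoked as: <unavailable>"),
    }
    match binary_path() {
        Ok(path) => println!("binary:     {}", path.display()),
        Err(err) => println!("binary:     <unavailable: {err}>"),
    }
}
```

CLI parsers that build usage examples from the first argument would pick up the script name where it is set, with no change needed on the user's end.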
Cargo scripts do not support every manifest field, especially for the initial release. A long-standing open question has been whether the manifest fields should be validated with an allowlist or a denylist. The concern is if a new field gets added, should we err on the side of it being supported or not? Forgetting to update the Cargo script allowlist on the release of a new feature is a poor experience. On the other hand, forgetting to update the denylist could mean we commit to a feature that we shouldn't support. The ideal solution is to rely on the type system to ensure we exhaustively cover the manifest fields. If that wasn't possible, we would have erred on the side of an allowlist. Thankfully, the implementation had already been updated to make it easy to rely on the type system for this. The validation logic was changed in #16026.
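As a rough illustration of what relying on the type system can look like, the sketch below destructures a struct without a `..` rest pattern, so adding a field forces whoever adds it to decide how scripts should treat it. The `Manifest` type, its fields, and the choice of which fields are rejected are all hypothetical and far simpler than Cargo's real manifest types. This is not the code from #16026.

```rust
/// Hypothetical, heavily simplified manifest. Not Cargo's real type.
#[derive(Default)]
struct Manifest {
    package: Option<String>,
    dependencies: Option<Vec<String>>,
    workspace: Option<String>,
    lib: Option<String>,
}

/// Validate a manifest embedded in a script.
fn validate_script_manifest(manifest: &Manifest) -> Result<(), String> {
    // No `..` here: if a new field is added to `Manifest`, this pattern
    // stops compiling until the field is explicitly allowed or rejected.
    let Manifest {
        package: _,      // allowed in this sketch
        dependencies: _, // allowed in this sketch
        workspace,
        lib,
    } = manifest;

    // Which fields are rejected is an assumption made for illustration.
    if workspace.is_some() {
        return Err("`workspace` is not supported in this sketch".to_string());
    }
    if lib.is_some() {
        return Err("`lib` is not supported in this sketch".to_string());
    }
    Ok(())
}

fn main() {
    let ok = Manifest::default();
    println!("{:?}", validate_script_manifest(&ok));

    let bad = Manifest {
        workspace: Some("..".to_string()),
        ..Default::default()
    };
    println!("{:?}", validate_script_manifest(&bad));
}
```

The compile error takes the place of remembering to update a list, which is the failure mode both the allowlist and the denylist share.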
A cargo script's file name gets turned into a package.name but not every script name is a valid package.name. So far, Cargo has sanitized the file name into being a valid package.name. But valid according to whom? General Cargo commands, cargo new, or crates.io? So far, the cargo new rules had been implemented.
This is important to decide upfront because the sanitization results are visible through the binary's name, cargo metadata, and --message-format json.
As we stepped through each cargo new rule,
we found they were becoming less relevant through other efforts in Cargo, changes in Windows, etc.
We decided to do the bare minimum sanitization needed for general Cargo commands.
During the implementation of #16120, epage felt it was premature to freely allow names that would collide with directory names from build-dir being overlaid with target-dir. Users can now move build-dir out in Rust 1.91 (#15833). Changing this to be the default in Cargo is still under discussion (#16147) and users could still move it back. Instead of sanitizing to avoid conflicts with build-dir content, epage let this fall back to existing validation rules that will error for now.
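The shape of that decision can be sketched as follows: sanitize only what must be sanitized, and return an error instead of renaming when the result would collide with a directory name. Everything here is an assumption made for illustration, including the helper name `package_name_from_script`, the character rule, and the `RESERVED` list. Cargo's actual rules live in #16120 and its existing validation code.

```rust
use std::path::Path;

/// Illustrative list of names that would collide with directories;
/// an assumption for this sketch, not Cargo's authoritative list.
const RESERVED: &[&str] = &["build", "deps", "examples", "incremental"];

/// Hypothetical helper: derive a package name from a script's file name.
fn package_name_from_script(path: &Path) -> Result<String, String> {
    let stem = path
        .file_stem()
        .and_then(|s| s.to_str())
        .ok_or_else(|| format!("no usable file name in `{}`", path.display()))?;

    // Minimal sanitization (assumed rule): keep ASCII alphanumerics,
    // `-`, and `_`, and replace everything else with `-`.
    let name: String = stem
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '-' || c == '_' {
                c
            } else {
                '-'
            }
        })
        .collect();

    // Do not rename around collisions; report an error instead.
    if RESERVED.contains(&name.as_str()) {
        return Err(format!("`{name}` conflicts with a build directory name"));
    }
    Ok(name)
}

fn main() {
    for script in ["scripts/hello world.rs", "my.script.rs", "deps.rs"] {
        println!("{script}: {:?}", package_name_from_script(Path::new(script)));
    }
}
```

Erroring keeps the door open: a name that is rejected today can be allowed later without changing what any existing script's binary is called.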
Public dependencies
Update from 1.76
While this feature is largely blocked on the lint within rustc, this was further refined in Cargo.
jneem experimented with Cargo rewriting the lint to provide Cargo-specific context in #16002.
sadmac7000 changed cargo add's version auto-selection to evaluate public dependencies in case the user intends to use them together (#15966).
JohnScience proposed cargo tree --edges no-external as a way to see only local packages
(#16043).
We have this today in --depth workspace though maybe we could improve parts of our documentation about this.
However, this got us to re-evaluate --depth public which walks through all public dependencies and no further (inspired by --depth workspace). Would this be better served as --edges public? The flag was originally added to help in analysing the lint's current behavior (rust#119428). Most --edges values opt in to specific edge types, while this would instead be applying a filter across edge types. The only other exception is no-proc-macros.
We decided that we were comfortable adding more edge filters and decided to change this (#16081).
Build-dir layout
Update from 1.90
Cargo's caches have traditionally been organized around the role they fulfil
with .fingerprint/ housing the state for rebuild-detection for all packages
while deps/ stores the build artifacts.
This makes calling rustc easy, just pass it deps/ and it will figure out what files need to be loaded.
However, mixing intermediate artifacts together like this has downsides:
- if we were to GC the content, we'd need to track individual files for a build unit (#5026)
- it is difficult to coordinate more granular locks (#4282)
- it is more difficult to cache build unit artifacts across projects (#5931)
- it requires Cargo to make the file names unique (except on Windows) (#8332)
- it causes file collisions on Windows (#8794)
- it leads to bugs where project binaries can shadow system or Rust toolchain binaries on Windows because we have to put deps/ in PATH for linking (#7919)
The layout for intermediate build artifacts is an implementation detail which we can change. #15010 proposes changing the layout to be centered on the build unit the files belong to, rather than the role of the files. We have a single folder to track for GC, locking, and caching. A unique hash will be kept in the parent directory's name, allowing us to reduce collisions of files and shadowing of binaries on Windows. This new layout was implemented in #15947.
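The difference between the two organizing principles can be sketched with a few path computations. The directory and file names below are hypothetical and only loosely modeled on the description above. They are not the layout implemented in #15947, which remains an implementation detail.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a build unit: a package name plus a unique hash.
struct BuildUnit {
    name: &'static str,
    hash: &'static str,
}

/// Role-centric layout: one unit's files are scattered across
/// directories that are shared with every other unit.
fn role_centric(unit: &BuildUnit) -> Vec<PathBuf> {
    let id = format!("{}-{}", unit.name, unit.hash);
    vec![
        PathBuf::from(".fingerprint").join(&id),
        PathBuf::from("deps").join(format!("lib{id}.rlib")),
    ]
}

/// Unit-centric layout: everything for a unit sits under one directory
/// whose path carries the hash, so file names need not be unique.
fn unit_root(unit: &BuildUnit) -> PathBuf {
    PathBuf::from(unit.name).join(unit.hash)
}

fn unit_centric(unit: &BuildUnit) -> Vec<PathBuf> {
    let root = unit_root(unit);
    vec![
        root.join("fingerprint"),
        root.join("deps").join(format!("lib{}.rlib", unit.name)),
    ]
}

fn main() {
    let unit = BuildUnit { name: "serde", hash: "0123abcd" };
    println!("role-centric: {:?}", role_centric(&unit));
    println!("unit-centric: {:?}", unit_centric(&unit));
    println!("single root to GC, lock, or cache: {:?}", unit_root(&unit));
}
```

In the unit-centric sketch, garbage collecting, locking, or caching a unit means operating on `unit_root` alone, while the role-centric sketch requires tracking each file separately.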
There is a catch: many tools in the ecosystem depend on the layout.
The reason ranger-ross added support for the new build-dir was to serve as an easy way for projects to test if they rely on internals of Cargo.
We can punt on finding alternative solutions to these projects,
but that means each time we change the layout of the build-dir,
there is an ecosystem cost.
Turns out, we might want to change it multiple times.
The build-dir is subdivided by <profile>/<platform>/
but that is mostly beneficial for locking purposes.
If we had a new locking scheme
(#4282),
we could reduce path lengths on Windows
and allow intermediate artifact reuse between profiles and even platforms (e.g. build script builds).
As I said earlier, the locking scheme is also blocked on the new layout.
We either have to implement and stabilize them together or have two transitions.
It doesn't stop there.
A new locking scheme may benefit from us moving away from mutable intermediate artifacts,
which could balloon disk usage as each build for each edit of your source would have a distinct artifact.
That in turn would benefit from aggressive GC of the intermediate artifacts,
which is also blocked on the new layout.
As a team, we discussed this tricky path towards stabilization of the new layout.
After talking through the interaction between these different features, we leaned towards doing one layout change without blocking on any other work and evaluating how that goes to see how we should handle further layout changes.
It would be great if crater could identify projects impacted by changing the layout. It may not help us when it is a build process extracting build.rs generated artifacts or when running the tool being built. There may be some -sys crate situations it might identify.
Later, ehuss posted on Zulip some preliminary investigations into how projects might be relying on the build-dir layout.
In addition to this type of inspection,
we could change the layout on nightly-only to help identify impacted projects.
We are using build-dir as an opt-in for people both to evaluate changing its location and to smoke test a new layout.
Even once we change the build-dir location (#16147), users will be able to opt out.
Should we do something similar for the new layout itself?
If we made the flag a proper config,
this would give the build-dir layout more of a semblance of stability than is meant.
This is also a maintenance burden.
Supporting the two layouts already complicates things and has limited our changes to the new layout.
Supporting the old layout for any period of time will likely require all features built on top of it to be conditioned on it until we are able to drop the old layout.
A temporary environment variable to toggle the behavior may work.
At this point, it is on epage and ranger-ross to come up with a concrete transition plan.
Misc
- epage continued their work on migrating existing Cargo messages to annotate-snippets (#15944) in #15942, #15943, #15945. jneem joined in, posting #16035, #16066, #16113, #16126 (Update from 1.90).
- weihanglo posted initial support for structured, persistent logging for Cargo for performing build analysis (#16150)
- weihanglo cleaned up the config logic in preparation for config-include to have an optional key for not erroring on not-present configs.
- Muscraft proposed adopting typos for spell checking Cargo (#16122)
Focus areas without progress
These are areas of interest for Cargo team members with no reportable progress for this development-cycle.
Ready-to-develop:
Planning:
- Disabling of default features
- RFC #3416: features metadata
- RFC #3487: visibility (visibility)
- RFC #3486: deprecation
- Unstable features
- Pre-RFC: Global, mutually exclusive features
- RFC #3553: Cargo SBOM Fragment
- OS-native config/cache directories (ie XDG support)
How you can help
If you have ideas for improving cargo, we recommend first checking our backlog and then exploring the idea on Internals.
If there is a particular issue that you are wanting resolved that wasn't discussed here, some steps you can take to help move it along include:
- Summarizing the existing conversation (examples: Better support for docker layer caching, Change in Cargo.lock policy, MSRV-aware resolver)
- Document prior art from other ecosystems so we can build on the work others have done and make something familiar to users, where it makes sense
- Document related problems and solutions within Cargo so we see if we are solving at the right layer of abstraction
- Building on those posts, propose a solution that takes into account the above information and cargo's compatibility requirements (example)
We are available to help mentor people for S-accepted issues on Zulip and you can talk to us in real-time during Contributor Office Hours. If you are looking to help with one of the bigger projects mentioned here and are just starting out, fixing some issues will help you familiarize yourself with the process and expectations, making things go more smoothly. If you'd like to tackle something without a mentor, the expectations will be higher on what you'll need to do on your own.