Skip to content

fix(download): avoid shared partial cache files#4911

Closed
cachebag wants to merge 1 commit into
rust-lang:mainfrom
cachebag:main
Closed

fix(download): avoid shared partial cache files#4911
cachebag wants to merge 1 commit into
rust-lang:mainfrom
cachebag:main

Conversation

@cachebag

@cachebag cachebag commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

After going back and forth a bit for a few hours and perusing the history of this issue, I figured it would at the least, make sense to change cached component downloads such that they use unique partial filenames instead of sharing <hash>.partial across processes.

This fixes the specific download-cache race seen in #4910, where parallel auto-installs can fail while renaming a shared partial file. It is an alternative to the locking approach in #4606, using unique partial paths instead of file locks.

This does not attempt to solve the broader concurrency problem in #988, since concurrent toolchain transactions can still race after downloads complete. I don't think I want to pretend to know where to start there, though I think this is a decent start and at the least covers the issue reported in #4910.

Existing legacy <hash>.partial files are still claimed for resume, so interrupted downloads remain recoverable.

Partially Closes #4910

After going back and forth a bit for a few hours and perusing the history
of this issue, I figured it would at the least, make sense to change
cached component downloads such that they use unique partial filenames
instead of sharing `<hash>.partial` across processes.

This fixes the specific download-cache race seen in rust-lang#4910, where parallel
auto-installs can fail while renaming a shared partial file. It is an
_alternative_ to the locking approach in rust-lang#4606, using unique partial
paths instead of file locks.

This does not attempt to solve the broader concurrency problem in rust-lang#988,
since concurrent toolchain transactions can still race after downloads
complete. I don't think I want to pretend to know where to start there,
though I think this is a decent start and at the least covers the issue
reported in rust-lang#4910.

Existing legacy `<hash>.partial` files are still claimed for resume,
so interrupted downloads remain recoverable.

Partially Closes rust-lang#4910
Comment thread src/dist/download.rs
.unwrap_or("_");
target_file.with_file_name(format!(
"{file_name}.{}.partial",
utils::raw::random_string(16)

@rami3l rami3l Jun 16, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cachebag Thanks for looking into this!

However, I believe you are trying to solve the problem at the wrong level. If I have under stood your proposed solution correctly, even if you have avoided shared partial cache files, the unpacking will still clash because they will unpack to the same positions.

The current logic has essentially prevented rustup from picking up partial downloads (the network failure case was only a special one, but the more generally case would be that it has to recover from interruption or even kill).

The right behavior we would like to implement is that when rustup tries to download a toolchain, it will ensure that all downloading/extracting actions of essentially the same toolchain will be mutually exclusive. I actually am working on a plan for this.

View changes since the review

@cachebag cachebag Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh. Okay that makes sense...yeah I guess in retrospect, an ad-hoc change like this isn't very conducive..

Thanks and let me know if you need anything. Happy to help

@cachebag cachebag closed this Jun 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

component download failed under parallel

2 participants