fix(download): avoid shared partial cache files#4911
Conversation
After going back and forth a bit for a few hours and perusing the history of this issue, I figured it would at the least, make sense to change cached component downloads such that they use unique partial filenames instead of sharing `<hash>.partial` across processes. This fixes the specific download-cache race seen in rust-lang#4910, where parallel auto-installs can fail while renaming a shared partial file. It is an _alternative_ to the locking approach in rust-lang#4606, using unique partial paths instead of file locks. This does not attempt to solve the broader concurrency problem in rust-lang#988, since concurrent toolchain transactions can still race after downloads complete. I don't think I want to pretend to know where to start there, though I think this is a decent start and at the least covers the issue reported in rust-lang#4910. Existing legacy `<hash>.partial` files are still claimed for resume, so interrupted downloads remain recoverable. Partially Closes rust-lang#4910
| .unwrap_or("_"); | ||
| target_file.with_file_name(format!( | ||
| "{file_name}.{}.partial", | ||
| utils::raw::random_string(16) |
There was a problem hiding this comment.
@cachebag Thanks for looking into this!
However, I believe you are trying to solve the problem at the wrong level. If I have under stood your proposed solution correctly, even if you have avoided shared partial cache files, the unpacking will still clash because they will unpack to the same positions.
The current logic has essentially prevented rustup from picking up partial downloads (the network failure case was only a special one, but the more generally case would be that it has to recover from interruption or even kill).
The right behavior we would like to implement is that when rustup tries to download a toolchain, it will ensure that all downloading/extracting actions of essentially the same toolchain will be mutually exclusive. I actually am working on a plan for this.
There was a problem hiding this comment.
Ahh. Okay that makes sense...yeah I guess in retrospect, an ad-hoc change like this isn't very conducive..
Thanks and let me know if you need anything. Happy to help
After going back and forth a bit for a few hours and perusing the history of this issue, I figured it would at the least, make sense to change cached component downloads such that they use unique partial filenames instead of sharing
<hash>.partialacross processes.This fixes the specific download-cache race seen in #4910, where parallel auto-installs can fail while renaming a shared partial file. It is an alternative to the locking approach in #4606, using unique partial paths instead of file locks.
This does not attempt to solve the broader concurrency problem in #988, since concurrent toolchain transactions can still race after downloads complete. I don't think I want to pretend to know where to start there, though I think this is a decent start and at the least covers the issue reported in #4910.
Existing legacy
<hash>.partialfiles are still claimed for resume, so interrupted downloads remain recoverable.Partially Closes #4910