fix(remote): remove -U flag from auto-injected sagemaker requirements install #5922

Open
nileshpatil6 wants to merge 1 commit into aws:master from nileshpatil6:fix/remote-decorator-drop-pip-upgrade-flag

@nileshpatil6

Fixes #5872

What happened

The @remote (and @step) decorator injects sagemaker>=3.2.0,<4.0.0 into a temporary requirements file, then installs it with:

pip install -r <requirements_file> -U

The -U flag forces pip to upgrade sagemaker to the latest version within that range, even when a compatible version is already installed. The logs show this clearly:

Requirement already satisfied: sagemaker<4.0.0,>=3.2.0 ... (3.5.0)
Collecting sagemaker<4.0.0,>=3.2.0 ...
Found existing installation: sagemaker 3.5.0
Uninstalling sagemaker-3.5.0: Successfully uninstalled sagemaker-3.5.0
Successfully installed sagemaker-3.11.0

After the forced upgrade, the container runs sagemaker 3.11.0 and tries to deserialize a payload that the 3.5.0 client serialized in a different format, which raises DeserializationError.
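
To make the difference concrete, here is a minimal sketch of the two command lines involved. The helper `build_pip_install_cmd` and its parameters are hypothetical and are not taken from the sagemaker source; only `pip install -r <file>` and the `-U` (`--upgrade`) flag are real pip options.

```python
import shlex


def build_pip_install_cmd(requirements_file, python_executable="python", upgrade=False):
    """Build a pip install command line (hypothetical helper, for illustration only)."""
    parts = [python_executable, "-m", "pip", "install", "-r", requirements_file]
    if upgrade:
        # With -U, pip upgrades every listed package to the newest version
        # that satisfies the specifier, even if one is already installed.
        parts.append("-U")
    # Quote each part so paths with spaces survive a shell invocation.
    return " ".join(shlex.quote(part) for part in parts)


old_cmd = build_pip_install_cmd("requirements.txt", upgrade=True)
new_cmd = build_pip_install_cmd("requirements.txt")
```

Without `-U`, pip treats an installed sagemaker 3.5.0 as already satisfying `sagemaker>=3.2.0,<4.0.0` and leaves it in place, so the container keeps the version that matches the client.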

Fix

Remove the -U flag from _install_requirements_txt and _install_req_txt_in_conda_env in both sagemaker-core and sagemaker-train. The version constraint (>=3.2.0,<4.0.0) already guarantees a compatible version will be present; there is no reason to force an upgrade.
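
As a rough illustration of why the installed version can stay, the sketch below checks versions against the injected range. It is a simplification that assumes plain `X.Y.Z` version strings; pip itself evaluates full PEP 440 specifiers, and the function names here are hypothetical.

```python
def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints so that 3.11.0 sorts after 3.5.0."""
    return tuple(int(part) for part in text.split("."))


def satisfies_injected_range(installed, lower="3.2.0", upper="4.0.0"):
    """Return True if `installed` lies in the range >=lower,<upper."""
    return parse_version(lower) <= parse_version(installed) < parse_version(upper)


# Both the pre-installed and the upgraded version are inside the range,
# so the specifier alone never requires replacing 3.5.0 with 3.11.0.
already_installed_ok = satisfies_injected_range("3.5.0")
upgraded_ok = satisfies_injected_range("3.11.0")
```

Since both versions pass the check, the jump from 3.5.0 to 3.11.0 seen in the logs comes from `-U`, not from the version constraint.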

Files changed

  • sagemaker-core/src/sagemaker/core/remote_function/runtime_environment/runtime_environment_manager.py
  • sagemaker-train/src/sagemaker/train/remote_function/runtime_environment/runtime_environment_manager.py

Testing

Existing unit tests pass (47 passed in test_runtime_environment_manager.py for sagemaker-core). No test changes are needed because the tests mock _run_shell_cmd and only verify that it is called, which still holds.
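
A stricter regression test could also inspect the command string passed to the mocked shell call. The sketch below is self-contained and uses a hypothetical stand-in class; the method signature and the exact command format are assumptions, not copied from runtime_environment_manager.py.

```python
from unittest import mock


class FakeRuntimeEnvironmentManager:
    """Hypothetical stand-in for the real manager class (assumed shape)."""

    def _run_shell_cmd(self, cmd):
        raise RuntimeError("tests are expected to mock this method")

    def _install_requirements_txt(self, local_path, python_executable):
        # Assumed command format, with the -U flag already removed.
        cmd = f"{python_executable} -m pip install -r {local_path}"
        self._run_shell_cmd(cmd)


def check_no_upgrade_flag():
    """Run the install with a mocked shell call and return the command used."""
    manager = FakeRuntimeEnvironmentManager()
    with mock.patch.object(manager, "_run_shell_cmd") as run_cmd:
        manager._install_requirements_txt("/tmp/requirements.txt", "python")
    run_cmd.assert_called_once()
    cmd = run_cmd.call_args[0][0]  # first positional argument of the call
    tokens = cmd.split()
    assert "-U" not in tokens and "--upgrade" not in tokens
    return cmd


observed_cmd = check_no_upgrade_flag()
```

Asserting on the command tokens, rather than only on the call itself, would catch the flag being reintroduced later.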

The @remote decorator injects sagemaker>=3.2.0,<4.0.0 into a
requirements file and installs it with pip install -r ... -U. The -U
flag forces pip to upgrade sagemaker to the latest version within the
range even if a compatible version is already installed. This created a
version mismatch between the client (which serialized the function at
version 3.5.0) and the container (which deserialized at 3.11.0),
causing DeserializationError.

Remove the -U flag from _install_requirements_txt and
_install_req_txt_in_conda_env in both sagemaker-core and sagemaker-train.
The version constraint already ensures compatibility; forced upgrades are
not needed and actively harmful when the serialization format changes
between minor versions.

Fixes aws#5872

Signed-off-by: nileshpatil6 <technil6436@gmail.com>
Development

Successfully merging this pull request may close these issues.

@remote decorator forces sagemaker upgrade via pip install -U, causing DeserializationError