feat: make conda-pypi mappings configurable #6333

Merged: tdejager merged 36 commits into prefix-dev:main from tdejager:feat-additive-conda-pypi-map on Jun 23, 2026

Conversation

@tdejager (Contributor) commented Jun 10, 2026

Description

This PR makes the conda↔PyPI name mapping configurable in both directions:

  • workspace conda-pypi-map now supports per-channel overlay/replacement mappings, inline entries, URL/file locations, hard disables, and an explicit same-name heuristic
  • pixi-build-python now supports pypi-conda-map overrides before the PyPI→conda mapping service

The forward and reverse mappings intentionally have different shapes:

  • conda-pypi-map is per channel and maps conda package names → one or more PyPI names because it decides which installed conda packages satisfy PyPI requirements/PURLs.
  • pypi-conda-map is a flat map of PyPI package name → one conda package name or false because the build backend converts each Python requirement into at most one conda recipe dependency.
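The two shapes can be sketched as Python type aliases. This is only an illustration: the alias and function names are hypothetical and are not Pixi's actual types, and the sample entries are taken from the examples in this description (grouped under conda-forge here for brevity).

```python
from typing import Literal, Optional, Union

# Hypothetical aliases, for illustration only (not Pixi's actual types).
# Forward map: channel -> conda name -> one or more PyPI names, or False.
PypiNames = Union[str, list[str], Literal[False]]
CondaPypiMap = dict[str, dict[str, PypiNames]]

# Reverse map: PyPI name -> exactly one conda name, or False to skip it.
PypiCondaMap = dict[str, Union[str, Literal[False]]]

conda_pypi_map: CondaPypiMap = {
    "conda-forge": {
        "pytorch": "torch",
        "airflow": ["airflow", "apache-airflow"],
        "not-on-pypi": False,
    },
}

pypi_conda_map: PypiCondaMap = {"torch": "pytorch", "my-internal-pkg": False}


def explicit_pypi_names(channel: str, conda_name: str) -> Optional[list[str]]:
    """Return the project-defined PyPI names for a conda package.

    None means "no project entry" (a miss); an empty list means the
    package was explicitly marked as not on PyPI with `false`.
    """
    entries = conda_pypi_map.get(channel, {})
    if conda_name not in entries:
        return None
    value = entries[conda_name]
    if value is False:
        return []
    return [value] if isinstance(value, str) else list(value)
```

Keeping a miss (None) distinct from an explicit `false` (empty list) matters because only a miss is allowed to fall back to other mapping sources.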

Example:

[workspace.conda-pypi-map]
# Additive overlay: project entries win, misses use Pixi's default mapping data.
conda-forge = { mapping = { pytorch = "torch", not-on-pypi = false } }

# Replace Pixi's default mapping data for this channel. Same-name guessing is separate.
my-company = { location = "https://internal.example.com/map.json", mapping-mode = "replace" }

# Hard-disable PyPI name derivation for this channel.
internal = false

[package.build.config]            # pixi-build-python
ignore-pypi-mapping = false
pypi-conda-map = { torch = "pytorch", my-internal-pkg = false }

Common configurations:

# Fix a few names, otherwise let Pixi do its best.
[workspace.conda-pypi-map]
conda-forge = { mapping = { pytorch = "torch", not-on-pypi = false } }

# Avoid default mapping lookups while keeping the conda-forge same-name heuristic.
# This is the explicit replacement for the deprecated `conda-pypi-map = {}`.
[workspace]
conda-pypi-map = { conda-forge = { mapping-mode = "replace" } }

# Treat a mapping file as the source of truth: no default mapping data and no same-name guesses.
[workspace.conda-pypi-map]
conda-forge = {
  location = "mapping.json",
  mapping-mode = "replace",
  same-name-heuristic = false,
}

# Enable the same-name heuristic for a non-conda-forge/private channel.
[workspace.conda-pypi-map]
my-internal = { mapping-mode = "replace", same-name-heuristic = true }

Concretely this PR:

  1. Adds table entries for conda-pypi-map channels with:

    • location
    • inline mapping
    • mapping-mode = "overlay" | "replace"
    • same-name-heuristic = true | false
  2. Keeps bare string entries as shorthand for an additive overlay location.

  3. Supports multiple PyPI names for one conda package (airflow = ["airflow", "apache-airflow"]).

  4. Makes false consistent at its scope:

    • conda-pypi-map = false disables all PyPI name derivation globally
    • <channel> = false disables all PyPI name derivation for that channel
    • mapping.package = false means that package is explicitly not on PyPI
  5. Makes the same-name heuristic explicit:

    • default enabled for conda-forge
    • default disabled for other channels
    • can be enabled for any configured channel
  6. Preserves the legacy behavior of conda-pypi-map = {} as a soft-deprecated spelling for “avoid default mapping lookups but keep conda-forge same-name guessing”. The explicit replacement is:

    [workspace]
    conda-pypi-map = { conda-forge = { mapping-mode = "replace" } }
  7. Caches URL mapping locations via standard HTTP cache semantics: remote fetches go through the existing http-cache middleware (CacheMode::Default), which honors the server's Cache-Control/ETag. A freshly fetched mapping is reused on later solves without network access, a stale one is revalidated cheaply, and a previously fetched copy is reused when a refresh fails (unless the server marked the response no-store/must-revalidate) so offline solves keep working. No bespoke per-entry TTL knob is needed.

  8. Adds pypi-conda-map to pixi-build-python, including target-specific per-key merge behavior.

  9. Improves diagnostics for offline/firewalled mapping failures and HTML/GitHub-blob mapping URLs.

  10. Cleans up pypi_mapping::resolvers naming (PrefixHash, PrefixCompressed, ProjectDefined, SameName).
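The lookup order implied by the list above (hard disables first, then project entries, then default mapping data in overlay mode, then the same-name heuristic) can be sketched roughly as follows. This is a simplified model under stated assumptions, not Pixi's Rust implementation: `ChannelConfig`, `resolve_pypi_names`, and the flat `default_data` dictionary are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChannelConfig:
    """Hypothetical model of one conda-pypi-map channel table."""
    mapping: dict = field(default_factory=dict)
    mapping_mode: str = "overlay"  # "overlay" or "replace"
    same_name_heuristic: Optional[bool] = None  # None: use the channel default


def resolve_pypi_names(channel, conda_name, config, default_data):
    """Return the PyPI names derived for a conda package (possibly empty)."""
    if config is False:  # conda-pypi-map = false: global hard disable
        return []
    entry = config.get(channel, ChannelConfig())
    if entry is False:  # <channel> = false: per-channel hard disable
        return []
    if conda_name in entry.mapping:  # project entries win
        value = entry.mapping[conda_name]
        if value is False:  # explicitly not on PyPI: no fallback
            return []
        return [value] if isinstance(value, str) else list(value)
    # A miss: overlay mode consults the default mapping data, replace skips it.
    if entry.mapping_mode == "overlay" and conda_name in default_data:
        return list(default_data[conda_name])
    # Same-name heuristic: on by default for conda-forge only.
    same_name = entry.same_name_heuristic
    if same_name is None:
        same_name = channel == "conda-forge"
    return [conda_name] if same_name else []


# Assumed stand-in for Pixi's default mapping data.
default_data = {"somepkg": ["some-pkg"], "not-on-pypi": ["ignored"]}
config = {
    "conda-forge": ChannelConfig(mapping={"pytorch": "torch", "not-on-pypi": False}),
    "my-company": ChannelConfig(mapping_mode="replace"),
    "internal": False,
}
```

With this configuration, `pytorch` on conda-forge resolves to `torch`, `not-on-pypi` resolves to nothing even though the stand-in default data has an entry for it, and an unlisted package on `my-company` resolves to nothing because replace mode skips the default data and the heuristic is off for that channel.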

Behavior changes

| Existing manifest | Before | After | To get the old behavior |
| --- | --- | --- | --- |
| conda-forge = "mapping.json" | Exclusive: only packages in the file got PURLs. | Additive overlay: project entries win, misses use Pixi's default mapping data and then same-name if enabled. | conda-forge = { location = "mapping.json", mapping-mode = "replace", same-name-heuristic = false } |
| A mapping for channel A, while also using channel B | Configuring any project mapping suppressed the conda-forge same-name heuristic globally. | Channel B behaves as if no mapping were configured. | Configure B explicitly, e.g. B = false to disable it. |
| conda-pypi-map = {} | Avoided default mapping lookups while still allowing conda-forge same-name guessing. | Same behavior, but deprecated with a warning. | conda-pypi-map = { conda-forge = { mapping-mode = "replace" } } |
| conda-pypi-map = false | Disabled project/default lookups but still allowed conda-forge same-name guessing. | Hard-disable: no PyPI name derivation at all. | Use { conda-forge = { mapping-mode = "replace" } } if you only wanted no network/default lookups. |
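One way to read the shorthand spellings in this table is as sugar for explicit per-channel tables. The sketch below expands them on that assumption. The function name `normalize_conda_pypi_map` is hypothetical, the manifest is modeled as plain Python values, and whether Pixi normalizes internally in this way is not stated in this PR.

```python
import warnings


def normalize_conda_pypi_map(value):
    """Expand shorthand conda-pypi-map values into explicit channel tables."""
    if value is False:
        return False  # global hard disable stays as-is
    if value == {}:
        # Legacy spelling: no default lookups, conda-forge same-name kept.
        warnings.warn(
            "conda-pypi-map = {} is deprecated; use "
            'conda-forge = { mapping-mode = "replace" } instead',
            DeprecationWarning,
        )
        return {"conda-forge": {"mapping-mode": "replace"}}
    explicit = {}
    for channel, entry in value.items():
        if entry is False:
            explicit[channel] = False  # per-channel hard disable
        elif isinstance(entry, str):
            # A bare string is shorthand for an additive overlay location.
            explicit[channel] = {"location": entry, "mapping-mode": "overlay"}
        else:
            # Tables default to overlay unless mapping-mode says otherwise.
            explicit[channel] = {"mapping-mode": "overlay", **entry}
    return explicit
```

For example, `normalize_conda_pypi_map({"conda-forge": "mapping.json"})` yields an overlay table with that location, which is the "After" behavior in the first row of the table.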

How Has This Been Tested?

Automated/unit/integration coverage includes:

  • manifest parsing and snapshot diagnostics for the new mapping-mode, same-name-heuristic, false/null/list mapping values, and empty-map deprecation
  • project mapping conversion and duplicate-channel validation
  • overlay/replacement mapping behavior
  • explicit package false preventing fallback
  • channel/global hard-disable behavior
  • legacy conda-pypi-map = {} behavior
  • same-name heuristic defaulting to conda-forge and explicit enablement for another channel
  • remote mapping caching: a second, independent solve reuses the on-disk HTTP cache without any network access, and an uncached remote mapping whose fetch fails is a hard error
  • HTML/GitHub-blob parse hint
  • pypi-conda-map override/skip/marker/merge behavior
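The pypi-conda-map override and skip behavior covered by the last item can be pictured with a small sketch. It assumes that target-specific entries take precedence over base entries during the per-key merge, which this PR does not spell out. `merge_maps`, `to_conda_dependency`, and the `service` callable are hypothetical names, and PyPI name normalization and markers are left out.

```python
from typing import Callable, Optional, Union

Override = Union[str, bool]  # a conda name, or False to skip the dependency


def merge_maps(base: dict, target: dict) -> dict:
    """Per-key merge; target-specific keys are assumed to win."""
    merged = dict(base)
    merged.update(target)
    return merged


def to_conda_dependency(
    pypi_name: str,
    overrides: dict,
    service: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Map one Python requirement to at most one conda dependency."""
    if pypi_name in overrides:
        value = overrides[pypi_name]
        # False means "skip": emit no conda dependency for this requirement.
        return None if value is False else value
    # No override: fall through to the (stand-in) mapping service.
    return service(pypi_name)


base = {"torch": "pytorch", "my-internal-pkg": False}
target = {"torch": "pytorch-gpu"}  # illustrative target-specific override
merged = merge_maps(base, target)
```

The override is consulted before the service, so `torch` never reaches the service at all, and `my-internal-pkg` produces no recipe dependency.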

Commands run locally:

cargo test -p pypi_mapping
cargo test -p pixi_manifest conda_pypi_map
cargo test -p pixi_core conda_pypi_map
cargo test -p pixi --test integration_rust conda_pypi_map_tests

Schema files were regenerated from schema/model.py and validated with pixi run -e schema test-schema.

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.

Tools: Claude Code / pi coding agent

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added sufficient tests to cover my changes
  • I have verified that changes that impact the JSON schema have been made in schema/model.py

tdejager added 10 commits June 10, 2026 12:25
…eplace modes, inline mappings and cache-ttl

- conda-pypi-map now accepts false (global disable), and per-channel
  values can be a bare string, false, or a table with location,
  inline mapping entries, mode = "extend"|"replace" and cache-ttl.
- BREAKING: bare location strings now use the additive extend mode
  (overlay over the prefix.dev chain) instead of the exclusive
  replace mode. The old behavior is available via mode = "replace".
- conda-pypi-map = {} is soft-deprecated in favor of false and emits
  a deprecation warning.
- pixi_core wires the manifest entries into the per-channel mapping
  configuration; inline keys are lowercased to match normalized conda
  names, and cache-ttl is validated to require an http(s) location.
- Move all mapping/purl tests from solve_group_tests.rs into a new
  conda_pypi_map_tests.rs integration module.
- Split CondaPypiMapEntry::Map into CondaPypiMapSpec with a dedicated
  MappingLocationSpec { location, cache_ttl } so the TTL is structurally
  tied to the location source it applies to.
- Clarify in the Disabled doc comments that the offline conda-forge
  verbatim fallback still applies when lookups are disabled.
- Deduplicate the offline help text into a shared MAPPING_OFFLINE_HELP
  const used by both the prefix.dev and project-defined fetch errors, and
  mention pointing at a custom mapping location (with cache-ttl) as an
  escape hatch.
- Document why the TTL cache cannot reuse the http-cache middleware
  (header-driven freshness, client-global max_ttl, no stale-on-error).
- Docs: add a parselmouth raw-URL pinning recipe (and a note that blob
  URLs serve HTML).
- pixi_toml: add a custom_error(message, span) constructor and use it for
  the conda-pypi-map validation errors.
- pixi_core: extract the conda-pypi-map manifest conversion out of
  workspace/mod.rs into a workspace::conda_pypi_map module with named,
  unit-testable functions (incl. the channel-membership validation).
- pixi_core: classify mapping locations with rattler_lock::UrlOrPath
  instead of hand-rolled starts_with checks; file:// urls normalize to
  paths and non-http(s) remote schemes are rejected with a clear error.
- pypi_mapping: make the per-record fallback policy explicit with a
  Fallback enum (PrefixThenVerbatim | Verbatim | None) instead of a
  mutable suppression flag.
- pixi-build-python: dedupe the requirement version conversion into
  convert_requirement_version, shared by the user-map and service paths.
- test: pin that a mapping for one channel no longer suppresses the
  verbatim fallback for records from other, unmapped channels (online).
- TTL cache: treat a future mtime (clock skew) as age zero instead of
  making the cached copy invisible to the freshness check and the stale
  fallback; write cache files atomically via tempfile + persist; unit
  tests for the age computation.
- pypi-conda-map: an invalid conda name in an override now falls through
  to the mapping service instead of silently dropping the dependency.
- Split the offline help text: failures fetching a user-configured
  location now suggest checking the URL / adding cache-ttl instead of
  the firewall-framed prefix.dev advice; clearer HTTP status error.
- Warn when a mapping location uses plain http://, since a tampered
  mapping influences dependency resolution.
- Encode the manifest-mode to MappingMode conversion in a documented
  convert_mode function (a From impl is impossible: neither crate
  depends on the other, so the orphan rule forces it into pixi_core).
- Error wording: cache-ttl duration errors show example values;
  cache-ttl-without-location message no longer implies location must be
  a URL; {} deprecation help reworded; stale Disabled doc hedge fixed;
  duplicated doc comment removed.
- Docs: warning box now also covers the verbatim-fallback scope change
  for unmapped channels; cache-ttl docs state the no-cache hard-failure;
  inline-mapping example no longer reuses 'pytorch' as both channel and
  package name.
- New tests: mixed-case inline keys, cache-ttl on a local path rejected,
  file:// table-form location works (pins UrlOrPath normalization),
  Skip entries with markers, vacuous purls assertion fixed, unit tests
  for parse_mapping_location/convert_entry; re-documented what the
  fresh-cache TTL test actually pins (cache layout + no network).
- typos: reword 'mis-mapped' in the conda/PyPI concepts page.
- basedpyright: no implicit string concatenation in the new
  schema/model.py field descriptions (schema output unchanged).
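The blob-URL note in the commits above (blob URLs serve HTML rather than the raw file) suggests a diagnostic along these lines. This is a hedged sketch, not the check Pixi performs: `looks_like_html` and `raw_url_hint` are hypothetical helpers, and the rewrite assumes a ref without slashes.

```python
from typing import Optional
from urllib.parse import urlsplit


def looks_like_html(body: str) -> bool:
    """Cheap check for an HTML page where JSON was expected."""
    head = body.lstrip()[:256].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def raw_url_hint(url: str) -> Optional[str]:
    """Suggest the raw.githubusercontent.com form of a GitHub blob URL."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: OWNER/REPO/blob/REF/PATH...
    if parts.netloc == "github.com" and len(segments) >= 5 and segments[2] == "blob":
        owner, repo, _, ref, *path = segments
        return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"
    return None
```

A caller could run `looks_like_html` on a response that fails to parse as JSON and, if `raw_url_hint` returns a URL, include it in the error message.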
nichmor and others added 10 commits June 12, 2026 12:44
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@tdejager tdejager marked this pull request as ready for review June 12, 2026 14:07
@tdejager tdejager added the test:extra_slow Run the extra slow tests label Jun 12, 2026
tdejager and others added 2 commits June 12, 2026 16:11
…-pypi-map

# Conflicts:
#	Cargo.lock
#	crates/pixi_manifest/Cargo.toml
@ruben-arts ruben-arts added the breaking Breaks something in the api or config label Jun 15, 2026
@ruben-arts (Contributor)

What is the reason for the addition of the cache-ttl on the remote mapping?

@ruben-arts (Contributor)

I've done some user testing and it all seems to work as designed. I have one question about conda-pypi-map = false: I was expecting no mapping at all, e.g. if the same package is requested from both PyPI and conda, it's just going to get both packages.

I believe there is currently no way to get that except by making an empty map and assigning it as replace. Was this behavior already existing?

@nichmor (Contributor) commented Jun 15, 2026

> What is the reason for the addition of the cache-ttl on the remote mapping?

We want to cache the requested mapping rather than fetch it multiple times during the solve. For this reason, we thought it would be useful to let users choose how "fresh" the mapping should be, so we added this config option.

@tdejager (Contributor, Author)

> > What is the reason for the addition of the cache-ttl on the remote mapping?
>
> We want to cache the requested mapping rather than fetch it multiple times during the solve. For this reason, we thought it would be useful to let users choose how "fresh" the mapping should be, so we added this config option.

@ruben-arts for reference the compressed mapping for conda-forge on GitHub which you might use is about 1MB :)

Rename the additive channel mode to `mapping-mode = "overlay"` and add a per-channel `same-name-heuristic` switch. The heuristic now defaults to enabled for conda-forge and disabled elsewhere, but can be explicitly enabled for any channel.

Treat `conda-pypi-map = false` and `<channel> = false` as hard disables, while preserving the legacy empty-map behavior as a deprecated no-default-lookup conda-forge same-name configuration. Allow mapping-mode-only entries to express empty replacement mappings explicitly.

Update schema, docs, and integration coverage for the revised hierarchy, and clean up pypi_mapping resolver naming.
@tdejager tdejager changed the title feat: make conda-pypi-map additive and add pypi-conda-map build overrides feat: make conda-pypi mappings configurable Jun 16, 2026
tdejager and others added 2 commits June 16, 2026 15:02
Remote mapping fetches already pass through the http-cache middleware
(CacheMode::Default), and the real mapping hosts (prefix.dev, parselmouth
on raw.githubusercontent.com) send Cache-Control + ETag. Rely on that
entirely instead of a bespoke per-entry cache-ttl plus a hand-rolled
mtime/stale-on-error file cache:

- Remove the cache-ttl manifest option, its parsing/validation and snapshots.
- Collapse the now single-field MappingLocationSpec to a plain String.
- Delete the custom project-defined file cache; fetch straight through the
  cache-aware client. The middleware handles freshness, ETag revalidation,
  and use-stale-on-error (conditional_fetch serves the cached copy when a
  refresh fails, unless the response is no-store/must-revalidate).
- Drop now-unused deps (humantime, filetime, rattler_digest dev-dep).

Verified that a second, independent client reuses the on-disk HTTP cache
without any network access (test_remote_mapping_reused_from_cache_offline).
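The freshness, revalidation, and stale-on-error behavior described in this commit can be modeled as two small decisions. The sketch below is a simplified model of standard HTTP cache semantics under stated assumptions. It is not the http-cache middleware's API, and `CachedResponse`, `plan_request`, and `body_after_failed_refresh` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    """Hypothetical view of a cached mapping response."""
    body: bytes
    age_seconds: float
    max_age_seconds: float  # freshness lifetime from Cache-Control
    no_store: bool = False
    must_revalidate: bool = False


def plan_request(cached: Optional[CachedResponse]) -> str:
    """Decide what to do before touching the network."""
    if cached is None:
        return "fetch"  # nothing cached: a full fetch is required
    if cached.age_seconds < cached.max_age_seconds:
        return "use-cache"  # fresh: no network access needed
    return "revalidate"  # stale: conditional request, e.g. using the ETag


def body_after_failed_refresh(cached: Optional[CachedResponse]) -> Optional[bytes]:
    """Return the stale body to reuse, or None when the failure is a hard error."""
    if cached is None or cached.no_store or cached.must_revalidate:
        return None
    return cached.body
```

Under this model an uncached mapping whose fetch fails has nothing to fall back on, which matches the hard error described in the test coverage list.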
@tdejager tdejager force-pushed the feat-additive-conda-pypi-map branch from 0db381f to 665ed56 Compare June 17, 2026 07:54
- test_mapping_location asserted same-name=true for a non-conda-forge
  channel (https://prefix.dev/test-channel) via extend(); the heuristic
  correctly defaults to false there, so build the expected value explicitly.
- Remove the now-unused tempfile dependency from pypi_mapping (flagged by
  cargo-shear after the custom stale-cache file writer was deleted).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tdejager tdejager force-pushed the feat-additive-conda-pypi-map branch from 665ed56 to d539cae Compare June 17, 2026 08:16
@tdejager tdejager requested a review from baszalmstra June 17, 2026 10:03
@tdejager tdejager merged commit 2344205 into prefix-dev:main Jun 23, 2026
39 checks passed
Labels

breaking: Breaks something in the api or config
test:extra_slow: Run the extra slow tests

4 participants