Add support for handling zarr urls #7113

Draft
stephenworsley wants to merge 3 commits into
SciTools:main from
stephenworsley:zarr_url_support

Conversation

@stephenworsley

Contributor

Closes #6967

@stephenworsley stephenworsley marked this pull request as draft May 28, 2026 13:10
@codecov

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 30.76923% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.10%. Comparing base (28762c4) to head (beb1109).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/iris/loading.py | 0.00% | 3 Missing and 1 partial ⚠️ |
| lib/iris/io/__init__.py | 25.00% | 2 Missing and 1 partial ⚠️ |
| lib/iris/fileformats/netcdf/saver.py | 50.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7113      +/-   ##
==========================================
- Coverage   90.13%   90.10%   -0.04%     
==========================================
  Files          91       91              
  Lines       24957    24968      +11     
  Branches     4684     4688       +4     
==========================================
+ Hits        22496    22498       +2     
- Misses       1683     1689       +6     
- Partials      778      781       +3     

@pp-mo

pp-mo commented Jun 10, 2026

Member

Some Notes

Since @pp-mo and @stephenworsley discussed this yesterday, here's my take on some of the things we found...

  • I pointed out that I'm not too sure about the approach here, because I think it undermines the intent of the "schemes"

    • if done more rigorously, I think it should be possible to supply a URL to a remote Zarr dataset, similar to the OPeNDAP protocol supported by the netCDF file loader
      • .. but I don't know if there's a server solution that actually supports such URL-based remote access?
    • .. and admittedly the whole framework of "schemes" is a bit clunky:
      notably, in the case of netCDF files, we can access from a filepath, a URL, or an open dataset ...
  • we also found that saving to Zarr by this means currently fails for lazy data streaming (though it is OK for all-real data)

    • since we added "deferred writes" for netCDF (Lazy netcdf saves #5191), our standard approach is to create a variable without writing its data. We then close the file, and the write proxies re-open it in modify ("r+") mode to fill in a chunk of data.
    • .. it looks like this simply doesn't work with netCDF4-python any more if it is writing Zarr:
      when such a file is re-opened, no such variable exists (the metadata is not saved?)
    • .. it's not clear, though, whether this is an essential limitation of Zarr itself (can the controlling metadata not exist when there are no recorded chunks for a given array?), or just a problem with the netCDF4-python implementation.
    • we could, however, fix this by giving the 'lazy stream' operation an extra mode control, to be turned on specifically for Zarr ..
      This would be a "third way" for this code to function: the 'store' operation is just a da.store, applied directly while the original file is being created, which is exactly what the code used to do before Lazy netcdf saves #5191.
      N.B. I already found in Lazy netcdf saves #5191 that you can't create a write proxy and just 'da.store' it immediately (at least for a process-based/distributed scheduler?), since an attempt to re-open the file with write privilege will then fail (the file is already, or "still", open for write)
  • we need to discuss whether this makes the whole approach too awkward for supporting Zarr via this route
    @ukmo-ccbunney to reconsider versus the alternative mechanisms in Zarr I/O #6961?
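
As a rough illustration of the "extra mode control" idea above, here is a minimal sketch of how a saver might pick between the existing deferred-write path and a direct `da.store` path. Everything in it is an assumption for discussion: `WriteMode`, `is_zarr_target` and `choose_write_mode` are hypothetical names, not part of Iris, and the detection rule assumes the target is given as an NCZarr-style URL with a `#mode=...` fragment (for example `#mode=nczarr,file`), which may not match how this PR actually identifies Zarr targets.

```python
from enum import Enum
from urllib.parse import urlsplit


class WriteMode(Enum):
    """Hypothetical switch for how lazy data gets streamed to the output."""

    # Create variables, close the file, then let write proxies re-open it
    # in "r+" mode to fill in data (the post-#5191 behaviour described above).
    DEFERRED = "deferred"
    # Stream the data with da.store while the original file is still open
    # (the pre-#5191 behaviour, proposed here for Zarr only).
    DIRECT = "direct"


def is_zarr_target(target: str) -> bool:
    """Guess whether a save target is an NCZarr-style URL.

    Assumption: Zarr targets carry a "#mode=..." URL fragment whose
    comma-separated values include "zarr" or "nczarr".
    """
    fragment = urlsplit(target).fragment
    for item in fragment.split("&"):
        key, _, value = item.partition("=")
        if key == "mode":
            modes = {m.strip().lower() for m in value.split(",")}
            if modes & {"zarr", "nczarr"}:
                return True
    return False


def choose_write_mode(target: str) -> WriteMode:
    """Use the direct path for Zarr targets, the deferred path otherwise."""
    return WriteMode.DIRECT if is_zarr_target(target) else WriteMode.DEFERRED


if __name__ == "__main__":
    for example in ("/tmp/out.nc", "file:///tmp/out.zarr#mode=nczarr,file"):
        print(example, "->", choose_write_mode(example).value)
```

A switch like this would only select the path; it would not by itself address the N.B. above about write proxies failing to re-open a file that is still open for write, so the direct path would presumably have to write into the already-open variables rather than through proxies.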

Development

Successfully merging this pull request may close these issues.

Implicit Zarr support via nczarr
