19 Jun 17:49

MasterJH5574

c7ba073

v0.25.0 Latest

Introduction

Since the last release, the TVM community has delivered the following improvements.

The main tags are listed below (bold text marks areas with a lot of progress): Relax, Frontend, TIR, Runtime, etc.

Please visit the full listing of commits for a complete view: v0.24.0...v0.25.0.

Community

None.

RFCs

None.

Arith

#19604 - [REFACTOR][TIR] Phase out ControlFlowGraph, NarrowPredicateExpression, and rename Simplify to StmtSimplify
#19638 - [REFACTOR] Phase out arith/scalable_expression; arith no longer proves over scalable vectors
#19670 - Memoize IntervalSet variable relaxation to avoid exponential blowup
#19669 - Gate canonical-simplify LT Case 2 on extra scale == +1
#19675 - Make Analyzer a tvm-ffi Object
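
As a rough illustration of the idea named in #19670 (memoizing a recursive relaxation so that shared subexpressions are visited only once), the sketch below computes interval bounds over a small expression DAG. It is not TVM's implementation: `Var`, `Add`, `relax`, and `build_shared_chain` are hypothetical names, and only the memoization pattern is the point.

```python
from dataclasses import dataclass


# Hypothetical expression nodes; eq=False keeps identity-based comparison.
@dataclass(frozen=True, eq=False)
class Var:
    name: str


@dataclass(frozen=True, eq=False)
class Add:
    lhs: object
    rhs: object


def relax(expr, bounds, memo=None, counter=None):
    """Return (lo, hi) for expr, given bounds as {var_name: (lo, hi)}.

    Results are cached per node identity, so a node reachable through
    many paths is evaluated once instead of once per path.
    """
    if memo is None:
        memo = {}
    key = id(expr)
    if key in memo:
        return memo[key]
    if counter is not None:
        counter[0] += 1  # counts cache misses, i.e. real evaluations
    if isinstance(expr, Var):
        result = bounds[expr.name]
    else:
        l_lo, l_hi = relax(expr.lhs, bounds, memo, counter)
        r_lo, r_hi = relax(expr.rhs, bounds, memo, counter)
        result = (l_lo + r_lo, l_hi + r_hi)
    memo[key] = result
    return result


def build_shared_chain(depth):
    """Build a DAG where each level references the previous node twice."""
    node = Var("x")
    for _ in range(depth):
        node = Add(node, node)
    return node
```

Without the cache, a chain of depth `d` built this way would be traversed along `2**d` paths; with it, the number of evaluations grows linearly with the number of distinct nodes.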

BugFix

#19502 - [TIR] Skip bool-typed expressions in CSE
#19497 - [Relax] Fix scatter_elements and scatter_nd CUDA compilation
#19498 - [Relax][ONNX] Resolve param Vars in Concat to handle mixed Shape/Tensor inputs
#19511 - [Relax][Torch] Honor multi-axis dims in torch.flip converter
#19512 - [Relax][Torch] Honor correction in std/var converter
#19514 - [S-TIR] Wrap bare scalar bodies in DefaultGPUSchedule to avoid root-block crash
#19527 - [Relax]: handle ONNX ScatterElements reduction
#19535 - [Fix][Relax]: ONNX Clip NaN bounds and preserve input NaN (ORT parity)
#19554 - [Fix][CI]: remove astral-sh/setup-uv from lint workflow
#19557 - [Fix][Relax] Lower bool prod as logical all
#19567 - [Target][LLVM] Use libm for asin/acos instead of buggy inline Taylor
#19568 - [Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern
#19619 - [Vulkan][CodeGen] Change OpControlBarrier to AcquireRelease
#19643 - [Fix] Stabilize layer_norm variance computation with two-pass reduction
#19650 - [Fix][Relax] Support ND batched matmul chains in AdjustMatmulOrder pass
#19683 - [Fix] CommReduce could handle 0-dim data
#19779 - [Fix] nn.attention support dynamic batch_size
#19808 - [Fix] Revert C++20-only lambda captures for C++17 build
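
For readers unfamiliar with the numerical issue behind #19643, the following plain-Python sketch contrasts a one-pass variance, computed as E[x²] − E[x]², with a two-pass variance that first computes the mean and then averages the squared deviations. The function names and sample data are illustrative assumptions; the actual TVM change is not reproduced here.

```python
def variance_one_pass(xs):
    """Population variance via E[x^2] - E[x]^2.

    Subtracting two large, nearly equal numbers can lose most of the
    significant digits when the mean is large relative to the spread.
    """
    n = len(xs)
    mean = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    return mean_sq - mean * mean


def variance_two_pass(xs):
    """Population variance via mean of squared deviations.

    Pass 1 computes the mean; pass 2 accumulates (x - mean)^2, which
    keeps the intermediate values small.
    """
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


# Values with a large common offset and a small spread.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

print("one-pass:", variance_one_pass(data))
print("two-pass:", variance_two_pass(data))
```

On inputs like `data`, the two-pass form recovers the variance of the small deviations, while the one-pass form is exposed to cancellation error in the final subtraction.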

CI

#19629 - Remove tvm-lint from tvm-bot
#19656 - Add cibw-based wheel publishing to PyPI
#19659 - Wheel publishing follow-ups
#19665 - Derive the version from Git tags via setuptools_scm
#19664 - Reformat the macOS repair-wheel-command as a multiline script
#19697 - Target apache-tvm for PyPI wheel publishing
#19775 - Merge PR against its target branch instead of main (#19712)
#19685 - Remove PyPI-only tag ref guard from wheel publishing
#19703 - Pin actions by version tag, trim wheel perms
#19706 - [Tests] Fix s_tir tests using removed T.block API in TIRx script
#19700 - Fix release verification script
#19704 - [Tests] Skip test modules cleanly when optional deps are missing
#19713 - Fix CI script test subprocess environment
#19724 - [Tests][Disco] Skip CCL tests when runtime support is absent
#19725 - [Tests][Relax] Gate multi-GPU VM test on three devices
#19726 - [Tests][Hexagon] Lazily import pytest plugin dependencies
#19730 - [Tests][NNAPI] Skip tests cleanly when remote environment is unavailable
#19729 - [Tests][S-TIR] Fix stale MetaSchedule sketch expectations and migrate let binds to T.let
#19715 - [Tests] Remove test_runtime_ndarray (covered by tvm-ffi)
#19731 - [Script][Tests] Fix dialect redirect module re-execution and stray category-less tirx.intrin_test op
#19735 - [S-TIR][Tests] Fix transform test failures after TIRx bringup
#19740 - [Tests] Check WebGPU volatile allreduce annotation structurally
#19746 - [Tests] Fix flaky popen pool executor test
#19738 - Align cuda-python with PyTorch cuda-bindings
#19745 - [Tests][LLVM] Gate stepvector intrinsic rename on LLVM 20
#19751 - [S-TIR][Tests] Mark test_cp_async_in_if_then_else as xfail
#19737 - Run s_tir/transform tests in the python-unittest stage
#19754 - Updated cibw to 4.1.0
#19752 - [Tests][AArch64] Make SVE codegen assertions robust across LLVM versions
#19761 - Drop redundant cmake/ninja install from the Linux wheel CUDA sidecar
#19777 - [Tests] Modernize test gating
#19786 - [Tests] Make TargetCreation.DeduplicateKeys host-agnostic on AArch64
#19787 - [Tests] Replace remaining requires_* helpers with standard pytest
#19793 - Pin GitHub Actions to SHA for ASF INFRA compliance
#19798 - Remove Jenkins PR linter step
#19800 - [Tests][Refactor] Remove unused testing helpers

Docs

#19606 - Reorganize development guide content
#19720 - Clarify loading serialized artifacts requires a trusted source
#19782 - [CI] Bump tlcpack-sphinx-addon to restore search result summaries
#19788 - Modernize test-gating documentation

Frontend

#19590 - [ONNX] Add RMSNormalization converter for ONNX opset 23
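
To make the operator in #19590 concrete, here is a minimal reference sketch of RMS normalization on a 1-D vector: each element is divided by the root mean square of the vector (with a small epsilon added under the square root) and then multiplied by a per-element scale. The function name `rms_normalize` is hypothetical, the `epsilon=1e-5` default is assumed to mirror the ONNX operator's default, and this is not the converter code from the PR.

```python
import math


def rms_normalize(x, scale, epsilon=1e-5):
    """Reference RMS normalization over a 1-D list of floats.

    y[i] = x[i] / sqrt(mean(x^2) + epsilon) * scale[i]
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
    return [v * inv_rms * s for v, s in zip(x, scale)]


# With unit scale and epsilon=0, the output has a mean square of 1.
y = rms_normalize([3.0, 4.0], [1.0, 1.0], epsilon=0.0)
print(y)
```

A real converter would also need to handle the normalization axis and tensor shapes, which this sketch leaves out.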

Hexagon

#19747 - [Tests] Clean up stale hexagon tests
#19796 - [REFACTOR] Phase out Hexagon app and test wrappers

LLVM

#19716 - [Codegen] Accept splat form in VLA broadcast test
#19744 - [Codegen][Tests] Gate +v9a vscale_range expectation on LLVM version

Relax

#19495 - [Frontend] Add ParameterList and ParameterDict containers
#19491 - [Frontend][TFLite] Add segment operator mappings
#19499 - [Frontend][TFLite] Add tests coverage for SPACE_TO_BATCH_ND and BATCH_TO_SPACE_ND
#19516 - [TFLite] Add gather frontend expected IRModule tests
#19488 - [PyTorch] Fix segfault in from_exported_program when model uses index_put_ with tuple output
#19523 - [Frontend][TFLite] Add Conv3D support
#19525 - [ONNX] Normalize negative indices before the take call for Gather operator
#19530 - [Frontend] Add TFLite Frontend Support for CONV_3D_TRANSPOSE
#19536 - [Frontend][TFLite] Add initial StableHLO builtin operator support
#19547 - [ONNX] Set max_output_boxes_per_class default value to 0 for NonMaxSuppression
#19515 - [ONNX] Add ONNX Backend Tests for systematic frontend coverage
#19566 - [ONNX] Prevent Div divide-by-zero crashes
#19573 - [ONNX] Fix TopK scalar K extraction in from_onnx
#19587 - [Frontend][TFLite] Support StableHLO region-based ops and multi-subgraph models
[#19588](#1...

16 Jun 22:55

MasterJH5574

v0.25.0.rc1

c7ba073

v0.25.0.rc1 Pre-release

What's Changed

[CI] Merge PR against its target branch instead of main (#19712) by @MasterJH5574 in #19775
[RELEASE] Backport main to prepare v0.25.0.rc1 by @MasterJH5574 in #19774
[v0.25.0] Backport recent main to prepare v0.25.0.rc1 by @MasterJH5574 in #19792
[CMAKE] Revert build baseline to C++17 by @MasterJH5574 in #19805
[Fix] Revert C++20-only lambda captures for C++17 build by @MasterJH5574 in #19808

Full Changelog: v0.25.0.rc0...v0.25.0.rc1

Contributors

MasterJH5574

08 Jun 20:11

MasterJH5574

v0.25.0.rc0

5ec6844

v0.25.0.rc0 Pre-release

What's Changed

[release][Dont Squash] Update version to 0.24.0 and 0.25.0.dev on main branch by @ysh329 in #19446
[Relax][Frontend] Add ParameterList and ParameterDict containers by @mshr-h in #19495
[Relax][Frontend][TFLite] Add segment operator mappings by @Aharrypotter in #19491
[BUGFIX][TIR] Skip bool-typed expressions in CSE by @tqchen in #19502
[Relax][Frontend][TFLite] Add tests coverage for SPACE_TO_BATCH_ND and BATCH_TO_SPACE_ND by @rknastenka in #19499
[BugFix][Relax] Fix scatter_elements and scatter_nd CUDA compilation by @as4230 in #19497
[BugFix][Relax][ONNX] Resolve param Vars in Concat to handle mixed Shape/Tensor inputs by @swjng in #19498
[Web] Add support for OPFS by @akaashrp in #19494
[BugFix][Relax][Torch] Honor multi-axis dims in torch.flip converter by @swjng in #19511
[BugFix][Relax][Torch] Honor correction in std/var converter by @swjng in #19512
[BugFix][S-TIR] Wrap bare scalar bodies in DefaultGPUSchedule to avoid root-block crash by @swjng in #19514
[Relax][TFLite] Add gather frontend expected IRModule tests by @weicheng-hsu in #19516
[Relax][PyTorch] Fix segfault in from_exported_program when model uses index_put_ with tuple output by @cchung100m in #19488
[Relax][Frontend][TFLite] Add Conv3D support by @weicheng-hsu in #19523
[REFACTOR][IR] Remove dead AttrFunctor template by @tqchen in #19528
[Relax][ONNX] Normalize negative indices before the take call for Gather operator by @cchung100m in #19525
[Relax][Frontend] Add TFLite Frontend Support for CONV_3D_TRANSPOSE by @weicheng-hsu in #19530
[TIR] Add cooperative_tensor builtins and metal.cooperative_tensor storage scope by @oraluben in #19423
[Relax][Frontend][TFLite] Add initial StableHLO builtin operator support by @Aharrypotter in #19536
[Contrib] Fix CUDA contrib build after FFI/header cleanups by @MasterJH5574 in #19539
[BugFix][Relax]: handle ONNX ScatterElements reduction by @THINKER-ONLY in #19527
[Fix][Relax]: ONNX Clip NaN bounds and preserve input NaN (ORT parity) by @ConvolutedDog in #19535
[Fix][CI]: remove astral-sh/setup-uv from lint workflow by @ConvolutedDog in #19554
[Relax][ONNX] Set max_output_boxes_per_class default value to 0 for NonMaxSuppression by @cchung100m in #19547
[Relax][ONNX] Add ONNX Backend Tests for systematic frontend coverage by @Aharrypotter in #19515
[Fix][Relax] Lower bool prod as logical all by @ConvolutedDog in #19557
[Relax][ONNX] Prevent Div divide-by-zero crashes by @cchung100m in #19566
[TIRx] Bringup TIRx Infrastructure by @spectrometerHBH in #19581
[BugFix][Target][LLVM] Use libm for asin/acos instead of buggy inline Taylor by @swjng in #19567
[RFC][CodeGen][CUDA]: Gate fast math intrinsic lowering behind target option by @ConvolutedDog in #19565
[TVMScript] Handle undefined functions when dumping IRModule by @ConvolutedDog in #19583
[BugFix][Target][LLVM] Route sinh/cosh/atan/asinh/erf through libm extern by @swjng in #19568
[Relax][ONNX] Fix TopK scalar K extraction in from_onnx by @javierdejesusda in #19573
[Relax][Frontend][TFLite] Support StableHLO region-based ops and multi-subgraph models by @Aharrypotter in #19587
[ONNX] Add RMSNormalization converter for ONNX opset 23 by @q55180514 in #19590
[BUILD] Modularize device runtime into per-backend DSOs by @tqchen in #19594
[Relax] Normalize negative concat axis in ReorderPermuteDimsAfterConcat by @cchung100m in #19588
[RPC][Tracker] Bound msg_size to MAX_TRACKER_MSG_BYTES to prevent unbounded buffer growth by @bl4cksku11 in #19586
[CodeGen][CUDA] Move fast math intrinsic lowering option to PassContext by @tlopex in #19596
[IR] Add annotations to Call nodes by @tlopex in #19597
[REFACTOR][RELAX] Fold CalleeCollector into relax DeadCodeElimination by @tqchen in #19603
[Relax][Frontend][TFLite] Support quantized TFLite import via QDQ decomposition by @Aharrypotter in #19538
Fix PytestUnknownMarkWarning: Unknown pytest.mark.adreno_clml by @cchung100m in #19602
[REFACTOR][IR] Cleanup attrs.h: drop NullValue, AttrsNodeReflAdapter, legacy BaseAttrsNode methods by @tqchen in #19607
[Docs] Reorganize development guide content by @tlopex in #19606
[REFACTOR] Move src/ir/script_printer.cc to src/script/printer/ by @tqchen in #19611
[REFACTOR][IR] Phase out src/ir/structural_{hash,equal}.cc to tvm-ffi by @tqchen in #19613
[REFACTOR][IR] Inline ApplyPassToFunction into relax decompose_ops, delete the util by @tqchen in #19612
[REFACTOR][TIR][ARITH] Phase out ControlFlowGraph, NarrowPredicateExpression, and rename Simplify to StmtSimplify by @tqchen in #19604
[REFACTOR][IR] Phase out class Integer and class Bool in Attrs and PassConfig by @tqchen in #19614
[CMAKE][RUNTIME] Link tvm_rpc with all backend runtime libraries by @cbalint13 in #19617
[REFACTOR][IR] attrs.h follow-up cleanup: drop legacy vtable / rename / phase out AttrFieldInfo by @tqchen in #19615
[REFACTOR][TIR] Tie AnnotateDeviceRegions/SplitHostDevice/LowerDeviceKernelLaunch together by @tqchen in #19605
[Relax][Frontend][TFLite] Support control-flow multi-subgraph operators by @Aharrypotter in #19616
[Relax][Frontend][TFLite] Add UNIDIRECTIONAL_SEQUENCE_RNN converter by @LudovicoYIN in #19601
[IR] Rename Call annotations to attrs by @tlopex in #19618
[REFACTOR][RUNTIME] Phase out tvm::runtime::regex_match by @tqchen in #19620
[REFACTOR][RUNTIME] Remove leftover microTVM/CRT crumbs by @tqchen in #19622
[REFACTOR][RUNTIME] Relocate nvtx.h to tvm/support/cuda and make it header-only by @tqchen in #19621
[REFACTOR][PYTHON] Lift compiler/CLI/process modules from tvm.contrib to tvm.support by @tqchen in #19624
[REFACTOR][IR][FFI] Bump tvm-ffi (+ SEqHashDef migration) and phase out tvm/ir/repr.h by @tqchen in #19627
[REFACTOR][IR] Inline ReplaceGlobalVars into AttachGlobalSymbol by @tqchen in #19625
[BugFix][Vulkan][CodeGen] Change OpControlBarrier to AcquireRelease by @kistenklaus in #19619
[REFACTOR][RUNTIME] Structural reorganization: locality moves for thread_map, texture, minrpc, disco, contrib by @tqchen in #19628
[REFACTOR][PYTHON] Consolidate derived_object into tvm.ir.utils by @tqchen in #19630
[CI] Remove tvm-lint from tvm-bot by @yongwww in #19629
[REFACTOR][SCRIPT] tvmscript streamline: lift printer.h, restore one-way dep, migrate dialect config to extra_config by @tqchen in #19631
[REFACTOR][ARITH] Phase out arith/scalable_expression; arith no longer proves over scalable vectors by @tqchen in #19638
[Relax][Frontend][TFLite] Add REDUCE_WINDOW support by @THINKER-ONLY in #19637
[Relax][Frontend][TFLite] Add RNN converter by @LudovicoYIN in #19632
[REFACTOR][IR] Delete class Bool and class Integer boxed-type wrappers by @tqchen in #19636
[Relax][Frontend][TFLite] Add LSTM and SVDF converter by @LudovicoYIN in #19633
[Relax][Frontend][TFLite] Add TFLite Resource Variable and Static Hashtable Import Support by @Aharrypotter in #19639
[TIRx] Fix stale Simplify import in lowering test by @tlopex in #19642
[Relax][Frontend][TFLite] Support sequence LSTM and RNN operators by @LudovicoYIN in #19634
[Relax][Frontend][TFLite] Support STABLEHLO_WHILE by @Aharrypotter in #19646
[Fix] Stabilize layer_norm variance computation with two-pass reduction by @ConvolutedDog in #19643
...

Contributors

tomayac, tqchen, and 24 other contributors

09 May 01:20

ysh329

v0.24.0

af3e4ba

Apache TVM v0.24.0