[In Progress] ONNX weight replacement #4957
Running a baked program in-process
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``create_program_with_weights`` deliberately does **not** finalize the program.
Finalizing uploads literal data to the device, which is wasted work if you only
intend to ``save`` the result (the bytes would be serialized after a redundant
host-to-device round trip).

The baked program therefore is not yet runnable on the device. The portable way
to make it runnable is to save it and load it back: loading a compiled MXR
finalizes it automatically, allocating device buffers and uploading the baked
literals.

.. code-block:: cpp

   auto baked = migraphx::create_program_with_weights(prog, "weights_v1", t);
   migraphx::save(baked, "model_v1.mxr"); // serialize the baked program

   auto runnable = migraphx::load("model_v1.mxr"); // finalized on load
   auto outputs = runnable.eval(params);

.. note::

   The underlying core library has a ``program::finalize(const target&)`` method
   that finalizes a baked or loaded program in place, but it is **not** exposed
   through the C or C++ API wrappers (``migraphx.h`` / ``migraphx.hpp``). From
   C++, use the save/load round trip above. From Python the method *is* exposed
   (see below) if you want to avoid touching disk.
Looking for feedback on this
Regressions detected 🔴
* No develop baseline was found for this PR's branch point; compared against the latest available develop run instead.
    return gpu::allocate_gpu(s);
}

void target::lower_baked_literals(module& m) const
There should be a function that returns passes to lower the literals.
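One way to read this suggestion, as a minimal sketch only: instead of a target method that mutates a module directly, the target hands back a list of passes and the caller decides how to run them. Every name below (`instruction`, `module`, `pass`, `get_lower_literal_passes`, `run_passes`, the `device_literal` op name) is a hypothetical stand-in for illustration, not the MIGraphX API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real IR types.
struct instruction
{
    std::string op; // e.g. "@literal" for a host-side literal
};

struct module
{
    std::vector<instruction> instructions;
};

// A pass is just a named transformation over one module.
struct pass
{
    std::string name;
    std::function<void(module&)> apply;
};

// Hypothetical target hook: returns the lowering passes rather than
// applying them, so the caller can feed them to its own pass runner.
std::vector<pass> get_lower_literal_passes()
{
    pass lower;
    lower.name  = "lower_literals";
    lower.apply = [](module& m) {
        for(auto& ins : m.instructions)
        {
            if(ins.op == "@literal")
                ins.op = "device_literal"; // placeholder name for the lowered op
        }
    };
    return {lower};
}

// Minimal pass runner: applies each pass to the module in order.
void run_passes(module& m, const std::vector<pass>& passes)
{
    for(const auto& p : passes)
    {
        assert(p.apply != nullptr);
        p.apply(m);
    }
}
```

With this shape, the call site would become something like `run_passes(m, get_lower_literal_passes())`, and the same pass list can be reused for any module.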
}

if(result.is_compiled())
    t.lower_baked_literals(*mm);
You need to run the pass manager just in case there are weights in the submodules.
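The concern can be illustrated with a toy model, assuming a program that stores its modules in a name-to-module map: lowering only the main module leaves literals in submodules untouched, while visiting every module catches them. The types and function names here (`program`, `lower_literals`, `lower_main_only`, `lower_all_modules`, `count_host_literals`) are hypothetical and do not reflect the real pass manager interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the real IR types.
struct instruction
{
    std::string op;
};

struct module
{
    std::vector<instruction> instructions;
};

struct program
{
    std::unordered_map<std::string, module> modules;
};

// Rewrites host literals in a single module.
void lower_literals(module& m)
{
    for(auto& ins : m.instructions)
    {
        if(ins.op == "@literal")
            ins.op = "device_literal"; // placeholder name for the lowered op
    }
}

// What the reviewed code effectively does: only the main module is lowered.
void lower_main_only(program& p) { lower_literals(p.modules.at("main")); }

// What the comment asks for: every module, including submodules, is visited.
void lower_all_modules(program& p)
{
    for(auto& entry : p.modules)
        lower_literals(entry.second);
}

// Counts literals that are still host-side anywhere in the program.
std::size_t count_host_literals(const program& p)
{
    std::size_t n = 0;
    for(const auto& entry : p.modules)
        for(const auto& ins : entry.second.instructions)
            if(ins.op == "@literal")
                ++n;
    return n;
}
```

If a submodule (for example, the body of an `if` branch) holds a baked weight, `lower_main_only` leaves `count_host_literals` nonzero, whereas `lower_all_modules` brings it to zero.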
std::unordered_map<std::string, module> modules;
std::vector<context> contexts;
std::vector<target> targets;
std::unordered_map<std::string, external_data_info> external_weight_map;
I don't think we should store this here. This can be stored in the IR directly by creating an onnx_external_weights op.
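A rough sketch of what storing this in the IR could look like, under stated assumptions: each deferred weight becomes an instruction whose op carries the external-data descriptor, so the side map on the program is no longer needed and the information can be recovered by scanning the module. The op is spelled `onnx_external_weights` here; the `location`/`offset`/`length` fields follow the ONNX external-data keys, but the real `external_data_info` layout is not shown in this PR excerpt, and all types and helpers below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical descriptor; field names borrowed from ONNX external data.
struct external_data_info
{
    std::string location;   // file name relative to the model directory
    std::size_t offset = 0; // byte offset into the file
    std::size_t length = 0; // number of bytes to read
};

// Hypothetical instruction: the descriptor lives on the op itself.
struct instruction
{
    std::string op;
    std::string name;
    external_data_info external;
};

struct module
{
    std::vector<instruction> instructions;
};

// Builds an instruction representing a weight that stays external.
instruction make_external_weight(const std::string& name,
                                 const std::string& location,
                                 std::size_t offset,
                                 std::size_t length)
{
    instruction ins;
    ins.op                = "onnx_external_weights";
    ins.name              = name;
    ins.external.location = location;
    ins.external.offset   = offset;
    ins.external.length   = length;
    return ins;
}

// Recovers the external weights by scanning the IR, replacing the side map.
std::vector<const instruction*> find_external_weights(const module& m)
{
    std::vector<const instruction*> result;
    for(const auto& ins : m.instructions)
    {
        if(ins.op == "onnx_external_weights")
            result.push_back(&ins);
    }
    return result;
}
```

The design point is that anything that serializes or transforms the IR then carries the weight descriptors along automatically, rather than having to keep a separate map in sync.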
}

program
create_program_with_weights(const program& prog, const std::string& base_dir, const target& t)
This function is very ONNX-specific and should be moved to the onnx module. It should be named replace_onnx_external_weights.
std::string external_data_path = "";
/// When true, external-data initializers become parameters instead of literals,
/// enabling runtime weight swapping without re-parsing
bool external_weights_as_parameters = false;
This should be named keep_weights_external or defer_external_weights.
Motivation
The goal is to parameterize the weights of ONNX models so that a user can quickly swap out weights without recompiling the program from scratch.
Technical Details
A doc with a rough outline of the proposed changes is included.
Changelog Category
Add a CHANGELOG.md entry for any option other than Not Applicable