[AMD] Update QuarkQuantization pass for Quark 0.12 #2532
Open
thpereir wants to merge 1 commit into
Contributor
Pull request overview
Updates Olive’s QuarkQuantization pass to align with AMD Quark 0.12 behavior and APIs across both ONNX and Torch backends.
Changes:
- Updated Torch quantization integration to use Quark 0.12 APIs (`preprocess_for_quantization`) and removed now-obsolete patch reversion.
- Updated ONNX calibration dataloader construction to pass `model_path`/`io_config`, filtering non-model-input columns (to avoid Quark 0.12 invalid input errors).
- Raised the ONNX-side minimum Quark version check to `>=0.12.0` and updated log/docs to reference 0.12.
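The column filtering mentioned in the changes above can be illustrated with a minimal sketch. This is not Olive's or Quark's actual code: the helper names `filter_batch_to_model_inputs` and `calibration_batches`, the `pixel_values` input name, and the sample data are all hypothetical, and batches are assumed to be plain dictionaries keyed by column name.

```python
from typing import Any, Dict, Iterable, Iterator, List


def filter_batch_to_model_inputs(
    batch: Dict[str, Any], input_names: Iterable[str]
) -> Dict[str, Any]:
    """Keep only the entries whose key is a declared model input name."""
    allowed = set(input_names)
    return {name: value for name, value in batch.items() if name in allowed}


def calibration_batches(
    dataset: Iterable[Dict[str, Any]], input_names: List[str]
) -> Iterator[Dict[str, Any]]:
    """Yield calibration batches with non-input columns (e.g. labels) dropped."""
    for batch in dataset:
        yield filter_batch_to_model_inputs(batch, input_names)


# Hypothetical dataset: each batch carries a label column named "class"
# that is not an input of the model.
dataset = [
    {"pixel_values": [0.1, 0.2], "class": 3},
    {"pixel_values": [0.3, 0.4], "class": 7},
]

# Hypothetical model input names, as they might be read from an io_config.
filtered = list(calibration_batches(dataset, ["pixel_values"]))
```

With the label column removed before the batch reaches the reader, a strict reader that rejects unknown input names has nothing to complain about.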
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `olive/passes/quark_quantizer/torch/quark_torch_quantization.py` | Migrates Torch quantization flow to Quark 0.12 APIs and adjusts the quantize/freeze/export sequence accordingly. |
| `olive/passes/quark_quantizer/quark_quantization.py` | Updates Quark 0.12 messaging/version gating and fixes ONNX calibration dataloader creation to filter non-input columns. |
Collaborator
We can also add a version pin here: https://github.com/microsoft/Olive/blob/main/olive/olive_config.json#L601-L619
Fix the ONNX calibration path: Quark 0.12 rejects extra inputs in the calibration data reader, so pass model_path/io_config to filter out dataset labels (e.g. `class`) that are not model inputs. Update the Torch path for the Quark 0.12 API: replace the removed `revert_model_patching` and the deprecated `prepare_for_moe_quant` (which now raises on transformers>=5) with `preprocess_for_quantization`. Bump the required amd-quark version to >=0.12.0.
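The version requirement can be sketched as an explicit check that raises `ValueError` with an upgrade hint. This is an illustrative sketch only, not the code in this PR: `check_quark_version`, `parse_release`, and the message wording are hypothetical, and the parser compares only the leading numeric release segment (pre-release and local suffixes are ignored, which a real implementation would more likely delegate to a dedicated version-parsing library).

```python
import re
from importlib import metadata
from typing import Optional, Tuple

MIN_QUARK_VERSION = (0, 12, 0)


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '0.12.0+rocm' -> (0, 12, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    release = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as '0.12' so tuple comparison treats them as 0.12.0.
    padding = (0,) * max(0, len(MIN_QUARK_VERSION) - len(release))
    return release + padding


def check_quark_version(installed: Optional[str] = None) -> None:
    """Raise ValueError if amd-quark is missing or older than the minimum."""
    if installed is None:
        try:
            installed = metadata.version("amd-quark")
        except metadata.PackageNotFoundError:
            raise ValueError(
                "amd-quark is not installed; install amd-quark>=0.12.0."
            ) from None
    if parse_release(installed) < MIN_QUARK_VERSION:
        raise ValueError(
            f"Found amd-quark {installed}, but amd-quark>=0.12.0 is required. "
            "Please upgrade amd-quark."
        )


# Passing the version explicitly keeps the check testable without the package.
check_quark_version("0.12.0")
```

Checking the installed distribution metadata up front means an outdated install fails with a message that names the required version, rather than failing later on a missing symbol.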
Author
Good call, updated.
Describe your changes
Updates `QuarkQuantization` to work with AMD Quark 0.12, which introduced two breaking API changes:

ONNX path (calibration reader fix):
Quark 0.12 rejects calibration inputs that are not model inputs. Olive was forwarding the full dataset batch (including label columns like `class`) to the reader, causing `INVALID_ARGUMENT: Invalid input name` during calibration; the workflow would complete with exit code 0 but produce no output model. Fixed by passing `model_path`/`io_config` to `create_calibration_dataloader()` so non-input columns are filtered out, matching the pattern already used by `OnnxQuantization`.

Torch path (API update):
- `revert_model_patching` was removed in Quark 0.12 (patching is now reverted internally during export). Removed the import and call.
- `prepare_for_moe_quant` is deprecated and raises on `transformers>=5`. Replaced with `preprocess_for_quantization`.

Version gating:
Both ONNX and Torch paths now gate on `amd-quark>=0.12.0` with a clear `ValueError` so users get an actionable error message instead of a cryptic `ImportError`.

Checklist before requesting a review
Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running `lintrunner -a`

Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
The `QuarkQuantization` pass now requires `amd-quark>=0.12.0`. Users on 0.11.x will receive a clear `ValueError` with an upgrade prompt. The ONNX calibration path and Torch path are updated to work correctly with the 0.12 API.

(Optional) Issue link