OLive

Commit graph

Author	SHA1	Message	Date
Xiaoyu	a27150ccb4	Add generate_config_file option to cli (#1568 ) ## Describe your changes Add generate_config_file option to cli. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-23 17:47:25 -08:00
Xiaoyu	0b46917a81	Add options document (#1566 ) ## Describe your changes - Add options page. - Update document for recent changes. - Rename `API reference` to `Reference`. - Remove Engine, Metrics, Evaluators, Resource Path API page. - Remove advanced users page. - Add subsections to Pass page. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-23 16:03:46 -08:00
Jambay Kinley	fb0d2284f1	OnnxStaticQuantization: Remove `quant_preprocess=False` default for qnn ep (#1565 ) ## Describe your changes `quant_preprocess` was originally set to `False` by default because I thought it might not be compatible with QNN EP. However, it is needed for most models for the quantization to work properly: - The quantizer needs a shape inferred model to quantize some tensors. - Model quality is bad if constants are not made into initializers (see #1552). ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-23 16:02:30 -08:00
Zac	28beb3f5c6	Fix: Use correct cache directory for file logging (#1564 ) ## Describe your changes ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-22 19:38:09 -08:00
Jambay Kinley	ab72d6905e	Offline QuaRot Implementation (#1556 ) ## Describe your changes - Previous QuaRot pass is replaced with our own implementation of the pass which only performs the offline weight rotation. - The online hadamard rotation parts are not relevant to us since it involves reimplementing the model architecture or updating them dynamically to add the input/kv rotation functions. Moreover, these are not compatible with onnx export. - This pass does not do any quantization. The rotated output model should be subsequently quantized using GPTQ and/or QDQ passes. - All usage of quarot from the examples and cli are removed. New examples and cli options will be added once E2E validation of Rotate -> GPTQ -> QDQ workflows is complete. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-22 15:30:38 -08:00
Jambay Kinley	e510074cfe	Add huggingface `ModelWrapper` (#1555 ) ## Describe your changes - Implemented a new huggingface `ModelWrapper` that acts as an interface with huggingface models. It keeps maps for different model types that allow the user to get model attributes and submodules. - All code using the previous mappings directly has been updated accordingly. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-21 18:04:45 -08:00
Jambay Kinley	cdb8693b38	Hf data preprocessing: Truncate dataset if `max_samples` is provided (#1561 ) ## Describe your changes - For huggingface data preprocessing (except text-gen which has its own logic), truncate the data before tokenization if `max_samples` is provided. - There is no need to tokenize and process the whole dataset if only a subset is going to be used. This is useful for large datasets where the tokenized data might be too large to fit in memory. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-21 17:37:44 -08:00
shaahji	be1278ca4f	Validate pass config before instantiating the pass (#1553 ) ## Validate pass config before instantiating the pass * Use of search point should be limited to engine logic only. Rest of the Olive implementation should receive a validated configuration to use. * Validation is for complete configuration and not merely for a search point. * Fixed a few issues related to use of BasePassConfig vs. FullPassConfig. * Add local caching to OlivePackageConfig for loaded modules. * Renamed a few variables in engine logic to be explicit about use of pass config vs. pass run config. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-21 14:45:42 -08:00
Jambay Kinley	92f431e812	GraphSurgeries: Add pass to olive_config.json, Resolve output_model_path (#1560 )	2025-01-21 11:58:35 -08:00
Xiaoyu	3d6e7317a7	Set onnxoptimizer.optimize as default for peepholeoptimizer (#1550 ) ## Describe your changes Set onnxoptimizer.optimize as default for peepholeoptimizer ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-15 15:31:25 -08:00
shaahji	6c2b14b81c	Consolidate pass configs (#1547 ) ## Consolidate pass configs Move AbstractPassConfig into pass_config.py and rename PassConfigBase to BasePassConfig. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-14 21:30:59 -08:00
shaahji	c1e1365c75	Fix graph vertex ordering during topological ordering (#1546 ) ## Fix graph vertex ordering during topological ordering Previous implementation returned vertices in reverse order even when there are no dependencies in the graph. Iterate the vertices in reverse order so input order can be retained. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-13 22:28:50 -08:00
shaahji	d27a07852f	Move module class variables from implementation to olive_config.json (#1545 ) ## Move module class variables from implementation to olive_config.json This prevents loading the module pre-emptively even when it isn't intended to be used. Also, after module gets imported, set the variable back on the class so rest of the implementation doesn't get impacted. TODO: Pass::run_on_target should be removed but code paths in tests circumvent loading the olive package config. Follow up change to implement pass search will fix the issue. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-13 22:28:37 -08:00
Xiaoyu	dcc0cb9905	Migrate pipeline linux test to docker image (#1529 ) ## Describe your changes - Migrate pipeline linux test to docker image. - Skip failed tests for further investigation. Potential fail reasons: - python 3.10 - onnxruntime 1.20 - docker in docker ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2025-01-09 10:56:34 -08:00
Xiaoyu	c631ba2b61	Add onnxscript to peephole pass (#1530 ) ## Describe your changes - Add onnxscript to peephole pass - Remove self-implemented constant folding. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-20 14:06:06 -08:00
Xiaoyu	0e328a50f4	update images to ubuntu 22.04 & fix integ test (#1532 ) ## Describe your changes - Update images to ubuntu 22.04. - GPU image may need to be updated to cuda 12.x. - Fix integ test to unblock pipeline. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-20 12:51:58 -08:00
Ti-Tai Wang	f372b89a69	Support `conversion_dtype` in ONNXConversion pass and `dynamic_shapes` on torch.onnx.export(..., dynamo=True) (#1511 ) ## Describe your changes (1) Introduce `dynamic_shapes`, which is a crucial parameter to torch.onnx.export(..., dynamo=True). (2) Enable `dynamic` boolean control in the config (3) Actually apply `torch_dtype` to model inputs. NOTE: 1. Some automation is intentionally not supported by the dynamic_shapes field due to the complexity of generating `dynamic_shapes`, and torch.onnx.export(dynamo=True) actually supports converting `dynamic_axes` to `dynamic_shapes` (limited). 2. To follow JSON rules, `dynamic_shapes` requires users to provide a list of [dim_name(str), min(int), max(int)]. This information will later be used to compose `torch.export.Dim(dim_name, min=min, max=max)` ([detail](https://pytorch.org/docs/stable/export.html#expressing-dynamism)). 3. `dynamic_shapes` follows the tree structure of the model inputs. For example, if the model input is a nested tuple, then the `dynamic_shapes` should be a nested tuple, instead of a dictionary. 4. The `kv_cache` support of `dynamic_shapes` is limited in terms of the variation of model signatures, implementations, and inputs. Users are encouraged to provide full kv cache. ## Checklist before requesting a review - [x] Add unit tests for this change. - [x] Make sure all tests can pass. - [x] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [x] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-18 10:00:42 -08:00
Jambay Kinley	68b1ee72be	CaptureSplitInfo: Separate split for memory intensive module (#1520 )	2024-12-16 09:11:37 -08:00
Xiaoyu	72616c42ab	Add constant folding and onnxoptimizer to peephole optimizer (#1516 ) ## Describe your changes Add constant folding and remove initializer from input to peephole optimizer. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-12 15:57:07 -08:00
Xiaoyu	dc6700880d	Add ReplaceErfWithTanh to GraphSurgeries (#1521 ) ## Describe your changes Add ReplaceErfWithTanh to GraphSurgeries ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-11 20:03:06 -08:00
Xiaoyu	9191ba61e0	Add OnnxIODataTypeConverter & remove Float32Converter (#1519 ) ## Describe your changes Add OnnxIODataTypeConverter & remove Float32Converter ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-10 14:45:51 -08:00
Xiaoyu	aa21017238	Add ReorderInputs to GraphSurgeries Pass (#1518 ) ## Describe your changes Add ReorderInputs to GraphSurgeries Pass ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-09 19:25:08 -08:00
Jambay Kinley	4e4a625e49	SplitModel: Keep QDQ nodes together (#1515 )	2024-12-06 16:52:51 -08:00
Xiaoyu	9c562e6c97	Add graph surgeries pass (#1507 ) ## Describe your changes Add graph surgeries pass. Add surgeon: - RenameInputs - RenameOutputs - InferShapes - RemoveShapes - ReorderInputs - ZeroOutInput - RemoveInputs - ExposeOutputs - ExposeQuantizedOutput ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-06 15:03:03 -08:00
Xiaoyu	cf322550b3	Fix shared cache bug (#1513 ) ## Describe your changes Fix shared cache bug ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link https://github.com/microsoft/Olive/issues/1509	2024-12-03 11:32:24 -08:00
Xiaoyu	b3809d1a8d	Save output model to output_dir (#1430 ) ## Describe your changes Save output model to output_dir ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-12-03 11:32:11 -08:00
Xiaoyu	6a74990726	Fix doc link (#1494 ) ## Describe your changes Fix doc link ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-25 15:25:08 -08:00
Xiaoyu	a0822a2a2c	Fix RUFF format (#1495 ) ## Describe your changes Fix RUFF format ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-22 16:21:37 -08:00
Jambay Kinley	0db2d72b00	CLI: Use getattr on conditional args, Check handler type. Debug log in `get_attr` (#1486 )	2024-11-20 13:19:04 -08:00
Jambay Kinley	d1499ab07c	HfModelHandler: Call `save_metadata` after model is saved (#1485 )	2024-11-20 13:18:52 -08:00
Xiaoyu	f47265f377	Fix several bugs & add Bert inc example to pipeline (#1492 ) ## Describe your changes Fix Olive bugs & Add Bert inc examples to example pipeline. - `batch_size` is needed for Inc dataloader. Set default size to 1. - Custom eval func doesn't have batch_size as input. - For `QuantizationAwareTraining` pass, `train_data_config` is not required if user provides `training_loop_func`. - Latest transformers package will automatically save trained model as safetensors format. Add `save_safetensors` as false to train argument. - Some passes may have nested data_config in its config. Update auto-fill data_config logic to achieve this. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-18 14:03:15 -08:00
Jambay Kinley	f11061a174	Update main version to 0.8.0.dev0 (#1489 ) ## Describe your changes We have already done some `0.7.x` releases, so updating the dev version to `0.8.0.dev0`. This way it's ahead of the official releases and we can differentiate dev builds from official releases. Similar to how transformers versions their dev branch. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-14 13:02:19 -08:00
anujj	52af325ede	Update TensorRT Model Optimizer documentation and dependencies (#1483 ) ## Describe your changes ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR.	2024-11-13 14:26:25 -08:00
Xiaoyu	3e44c97806	Remove static key from quantize cli (#1481 ) ## Describe your changes `static` quantization is not supported by cli now. Removing `static` key from template. Otherwise, cli doc will have this option. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-12 12:34:01 -08:00
Justin Chu	ddf98d0aaa	Set to use `dynamo=True` only with PyTorch 2.6 temporarily (#1479 ) ## Describe your changes For the Olive release, in the ONNX export pass, since when `dynamo=True` the dynamic_shapes argument is not provided properly, we temporarily change the check to require torch 2.6 to enable the new `dynamo=True` logic pass. This way when a user has torch 2.5 Olive will still use the old `dynamo_export` logic and function without errors. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link - https://github.com/microsoft/Olive/issues/1478	2024-11-11 23:55:13 -08:00
Jambay Kinley	61876e211f	AWQ: Patch for mismatched devices in RotaryEmbedding (#1480 )	2024-11-11 23:00:55 -08:00
Jambay Kinley	b7405973cb	OnnxModelOptimizer: Rename to OnnxPeepholeOptimizer, handle failed shape inference (#1477 )	2024-11-11 13:45:36 -08:00
shaahji	1363be501d	Add option to use dynamo exporter for onnx conversion pass (#1476 ) ## Add option to use dynamo exporter for onnx conversion pass ## Checklist before requesting a review - [ ] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-11 13:21:43 -08:00
Jambay Kinley	1ff2f40815	MatMulNBitsToQDQ: Support `INT4` elem type, per-axis quantization (#1468 ) ## Describe your changes - `use_int4` parameter added to use `INT4` elem type instead of `UINT4`. Quantized data is converted to int4 by first casting from uint8 to int8 and then subtracting 8. - We expect INT4 to only be used for symmetric quantized models but it appears to work fine for asymmetric quantized models so there is no restriction on it. - `add_zero_point` parameter to force adding zero points to the DQ node even though they are all zeros. - If the number of K-blocks is 1, we assume it to be per-axis quantization and create the DQ node as such: - no `block_size` attribute - scales and zero points are 1-D - `axis` is opposite that of block-wise quantization ## Checklist before requesting a review - [x] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-11 12:46:52 -08:00
Jambay Kinley	4009defbc9	HfModel: Disable `use_exllama` by default for GPTQ models (#1474 ) ## Describe your changes The default value for `use_exllama` in transformers is `True`. However, exllama model cannot be loaded on cpu (for model export) and doesn't have a backward pass implemented for finetuning. Since the main use for gptq quantized model in Olive is for export and finetuning, we should disable `use_exllama` by default. User can provide `use_exllama=True` as part of the loading args if they want to enable exllama for inference, etc. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-11 12:46:40 -08:00
devang-ml	cad03e2807	auto-opt: `--use_ort_genai` option, fix fp16 transformers optimizer (#1473 ) ## Describe your changes - Use ONNX adapter format as the default format - Provide --use_ort_genai option to generate genai config via auto-opt CLI - Correct the fp16 parameter in transformers optimizer config from `use_fp16` to `float16` ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link --------- Co-authored-by: Jambay Kinley <jambaykinley@microsoft.com>	2024-11-08 17:37:27 -08:00
Jambay Kinley	aa031633f8	Expose accelerator memory and use in `CaptureSplitInfo` (#1467 ) ## Describe your changes - Expose `memory` in `AcceleratorSpec` and `AcceleratorConfig`. - Removed the unused fields in accelerator spec to avoid confusion. - CaptureSplitInfo pass now uses this field instead of a config parameter. ## Checklist before requesting a review - [x] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-08 16:49:48 -08:00
Jambay Kinley	81037d7d03	`auto-opt`: Add model splitting options, Update no-search rules (#1466 ) ## Describe your changes - Add new options to enable model splitting in `auto-opt` command. Only used for no-search model since model evaluation for split model is undefined. Some other rule changes: - Add `--use_qdq_encoding` option to make this optional - Remove pytorch and model splitting passes when input model is onnx model - Remove optimizer and matmul4 passes when model builder is used since the model is already optimized and in the expected precision - Keep/remove passes that are only needed for specific EPs. - Onnx conversion is always in fp32 since fp16 conversion doesn't work for all models. We instead use the transformers optimizer pass to do the conversion to fp16 afterwards. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link Fixes #1464	2024-11-07 15:56:56 -08:00
Jambay Kinley	74873a54a8	Model Splitting: Add cost model, Simplify split logic (#1456 ) ## Describe your changes - In addition to `num_splits`, `CaptureSplitInfo` pass now supports cost model. The cost model is a csv containing the cost per component of the model. Currently, the only cost supported is the number of bytes. - The split decision is made based on the `max_memory` config parameter. This config parameter is temporary and will be moved into the accelerator spec. The required changes are bigger than the scope of this PR so it will be handled in a follow up PR. - Added `generate-cost-model` CLI to generate cost models for huggingface models. Also added some pre-generated cost models for phi and llama models under `assets/cost_models`. - `include_all_nodes` option has been removed from `SplitModel` pass. It always includes all nodes. This also made the logic simpler. - OnnxDAG: - Updated logic to handle unused inputs/outputs for nodes that are left as `""`. Found in contrib operators commonly. - Handle unnamed nodes. Node names are optional in the onnx spec. ## Checklist before requesting a review - [x] Add unit tests for this change. - [ ] Make sure all tests can pass. - [x] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link Fixes: #1459	2024-11-07 13:00:52 -08:00
Jambay Kinley	e6b3e99ce4	SharedCache: Fix incorrect calls (#1465 ) ## Describe your changes ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link Fixes: #1463	2024-11-07 12:40:38 -08:00
shaahji	3ce5035b28	Issue #1460 : Don't run transformer optimizer pass when using MB (#1462 ) ## Issue #1460: Don't run transformer optimizer pass when using MB The pass and MB aren't compatible. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link Fixes: #1460	2024-11-07 11:14:57 -08:00
shaahji	0bfc4dfdfa	Fix an order of operation issue causing AttributeError (#1455 ) ## Fix an order of operation issue causing AttributeError Github Issue: 1449, 1454 ## Checklist before requesting a review - [ ] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-11-06 18:00:36 -08:00
anujj	16ffab8314	Update NV-ModelOPT INT4 quantization (#1441 ) 1. Update modelopt integration in Olive 2. Add phi3 example 3. Remove the old bert model example ## Checklist before requesting a review - [x] Add unit tests for this change. - [x] Make sure all tests can pass. - [x] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR.	2024-11-01 12:24:27 -07:00
Jambay Kinley	ad72625b7f	GPTQ: Don't save zero bias with the checkpoint, MB: Config save bugfix (#1445 ) ## Describe your changes - GPTQ creates and saves zero bias for all quantized modules. This is fixed on main but there hasn't been a release with this fix yet. - This causes unnecessary warning while loading the quantized model for unused bias. - Model exported using MB might also have the bias even though they are just zeros. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [ ] Make sure all tests can pass. - [ ] Update documents if necessary. - [ ] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link Fixes #1450	2024-11-01 10:10:47 -07:00
shaahji	07afaae3e9	Update olive_config and auto-opt to use precision, providers and accelerators (#1440 ) ## Update olive_config and auto-opt to use precision, providers and accelerators * Implement providers, accelerators and precisions for pass in olive_config.json. Each pass module config now also requires listing supported providers, accelerators and precisions. * Update auto-opt cli to use a pre-defined order of passes when search is disabled. The list of passes is selected based on the user's choice of precision for the output model. ## Checklist before requesting a review - [ ] Add unit tests for this change. - [x] Make sure all tests can pass. - [ ] Update documents if necessary. - [x] Lint and apply fixes to your code by running `lintrunner -a` - [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes. - [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR. ## (Optional) Issue link	2024-10-31 11:34:44 -07:00

781 commits