Update: our team will evaluate this further before opening the migration up to more people in the community.
Context:
Previously we used AffineQuantizedTensor for many of our use cases, including int4, float8, intx, and floatx. It introduces some complicated abstractions, such as Layout; people have said it is hard to understand, and there are many indirections in the code.
In an effort to simplify the code base and make it easier to contribute to, we have been adding new features with a different structure in mind. We now want to organize tensors by "dtype" and "packing format": for example, we'll have Int4PreshuffledTensor, Int8Tensor, and Float8Tensor instead of AffineQuantizedTensor with multiple layouts (see the usage sketch after the links below).
Please check out our updated docs for the new tensor subclass organization and the design guide:
- quantization overview: https://docs-preview.pytorch.org/pytorch/ao/2723/quantization_overview.html
- contributor guide: https://docs-preview.pytorch.org/pytorch/ao/2723/contributor_guide.html
- Examples of tensor subclasses following new design: https://github.com/pytorch/ao/tree/main/torchao/quantization/quantize_/workflows
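For illustration, here is a minimal usage sketch of the new structure from the user side. `quantize_` and `Int4WeightOnlyConfig` are existing torchao names, but the exact kwargs, packing formats, and resulting subclass vary by version, so treat this as a sketch and check the docs linked above for the authoritative API:

```python
# Minimal sketch, assuming a recent torchao build with the new tensor
# subclasses and a CUDA device; config kwargs may differ across versions.
import torch
from torchao.quantization import quantize_, Int4WeightOnlyConfig

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to(torch.bfloat16).cuda()

# Under the new design, the quantized weight is a tensor subclass named by
# dtype + packing format (e.g. Int4PreshuffledTensor, Int8Tensor, Float8Tensor)
# rather than an AffineQuantizedTensor parameterized by a Layout object.
quantize_(model, Int4WeightOnlyConfig(group_size=128))

# Depending on the torchao version, this prints one of the new Int4*Tensor
# subclasses (older versions still produce AffineQuantizedTensor).
print(type(model[0].weight))
```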
List of things to migrate:
INT8
- [move to prototype] https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/block_sparse_layout.py @jainapurva
- [migrate] https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/plain_layout.py @namgyu-youn introduce new int8 quantization API #3241
- [move to prototype] https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/semi_sparse_layout.py @namgyu-youn Introduce new W8A8-FP-CSR quantization API #3258 (no need to migrate to new tensor structure)
[migration done, TODO: delete old path after all migration is done] INT4 weight only
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/int4_cpu_layout.py @Xia-Weiwen https://github.com/pytorch/ao/blob/main/torchao/quantization/quantize_/workflows/int4/int4_opaque_tensor.py
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/int4_xpu_layout.py @liangan1 Add Int4PlainInt32Tensor #2845
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/marlin_sparse_layout.py @liangel-02 Int4 sparse marlin tensor #2771
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/tensor_core_tiled_layout.py @jerryzh168 Add Int4TilePackedTo4dTensor #2791
- HQQ support for tensor core tiled layout @jerryzh168 Add hqq support for Int4TilePackedTo4dTensor #2912
[move to prototype] INT4 weight + int8 activation
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/cutlass_int4_packed_layout.py
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/dyn_int8_act_int4_wei_cpu_layout.py
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/marlin_qqq_tensor.py
UINTx Weight Only
- [move to prototype or migrate (check with Hicham)] https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/gemlite_layout.py
- [move to prototype] https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/uintx_layout.py
[migration done, TODO: delete old path after all migration is done] Int8DynamicActivationIntxWeightConfig
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/packed_linear_int8_dynamic_activation_intx_weight_layout.py @metascroy Introduce IntxOpaqueTensor to replace PackedInt8DynamicActivationIntxWeightLayout in AQT #2742
- https://github.com/pytorch/ao/blob/main/torchao/dtypes/uintx/q_dq_layout.py @metascroy Add IntxUnpackedTensor #2732
FP8
- [migrate] https://github.com/pytorch/ao/blob/main/torchao/dtypes/floatx/cutlass_semi_sparse_layout.py @namgyu-youn Introduce new W8A8-FP-CSR quantization API #3258 and @bbeckca [WIP] Move float8 cutlass sparse layout to Float8SemiSparseTensor #3182
FPx