what
A Python package for LLM quantization based on GPTQ.
By the time v1.0.0 is officially released, AutoGPTQ will be able to serve as an extensible and flexible quantization backend that supports all GPTQ-like methods and automatically quantizes LLMs written in PyTorch.
features
- Python API
- quantize PyTorch models
- load GPTQ quantized models
- supported evaluation tasks: LanguageModelingTask, SequenceClassificationTask, and TextSummarizationTask
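To give a rough idea of what quantizing model weights involves, here is a minimal, library-free sketch. It is not AutoGPTQ's API and not the GPTQ algorithm itself: it shows plain round-to-nearest quantization of one weight group to 4 bits, the simpler baseline that GPTQ-like methods improve on by compensating for rounding error. The function names `quantize_group` and `dequantize_group` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float, int]:
    """Round-to-nearest quantization of one weight group (illustrative only)."""
    qmax = (1 << bits) - 1  # largest integer code, e.g. 15 for 4 bits
    w_min, w_max = min(weights), max(weights)
    # A constant group would give a zero scale, so fall back to 1.0.
    scale = (w_max - w_min) / qmax or 1.0
    # Integer code that represents the real value 0.0, clamped to the valid range.
    zero_point = max(0, min(qmax, round(-w_min / scale)))
    # Map each weight to the nearest integer code and clamp it.
    codes = [max(0, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_group(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floating-point weights from integer codes."""
    return [(c - zero_point) * scale for c in codes]


if __name__ == "__main__":
    group = [-0.8, -0.12, 0.0, 0.33, 0.7]
    codes, scale, zero_point = quantize_group(group, bits=4)
    restored = dequantize_group(codes, scale, zero_point)
    for original, approx in zip(group, restored):
        print(f"{original:+.2f} -> {approx:+.2f}")
```

Each weight is stored as a small integer plus a shared scale and zero point per group, which is where the memory saving comes from. The reconstruction error per weight in this example stays within half of the scale.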
properties
- not all models are supported
- not all evaluation tasks are supported