tags: llm, quantization, read
what: A weight-only post-training quantization method that achieves SoTA performance in extreme compression (≤ 4 bits per weight)
application: I-quant in llama.cpp was inspired by QuIP
source: SOTA 2-bit quants by ikawrakow · Pull Request #4773 · ggerganov/llama.cpp · GitHub
resources: repo https://github.com/Cornell-RelaxML/quip-sharp
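To make "weight-only post-training quantization at ≤ 4 bits" concrete, here is a minimal round-to-nearest sketch of the general idea: weights are mapped to a small integer grid with a per-row scale, while activations stay in float. This is an illustrative baseline only, not QuIP itself (QuIP adds incoherence processing via random orthogonal transforms plus adaptive rounding on top of ideas like this); `quantize_weights_rtn` is a hypothetical helper name.

```python
import numpy as np

def quantize_weights_rtn(w: np.ndarray, bits: int = 2) -> np.ndarray:
    """Naive round-to-nearest weight-only quantization with a per-row scale.

    Illustrative sketch of low-bit weight quantization; QuIP's actual
    method is considerably more sophisticated.
    """
    levels = 2 ** bits  # e.g. 4 representable values for 2-bit
    # Per-row scale so each row's largest weight lands inside the grid.
    scale = np.abs(w).max(axis=1, keepdims=True) / (levels / 2 - 0.5)
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero rows
    # Round to the nearest grid point, clip to the signed integer range.
    q = np.clip(np.round(w / scale), -(levels // 2), levels // 2 - 1)
    return q * scale  # dequantized weights used at inference time

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
w_q = quantize_weights_rtn(w, bits=2)
```

At 2 bits each row of `w_q` can take at most four distinct values, which is exactly why naive rounding degrades badly in this regime and why methods like QuIP are needed.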