PyTorch 8-Bit Weight Quantizer

This tool demonstrates quantization of neural network weights to INT8 precision. It implements a custom W8A16LinearLayer that stores weights as 8-bit integers and performs the forward pass with 16-bit activations.
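Below is a minimal sketch of such a layer, assuming symmetric per-output-channel quantization of the weights; the class name matches the description above, but the internals (buffer names, the `quantize` helper, and the default dtype) are illustrative rather than the tool's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class W8A16LinearLayer(nn.Module):
    """Linear layer with INT8 weights and 16-bit activations (sketch)."""

    def __init__(self, in_features, out_features, bias=True, dtype=torch.bfloat16):
        super().__init__()
        # INT8 weights and per-channel scales are registered as buffers,
        # since they are not trained by the optimizer.
        self.register_buffer(
            "int8_weights",
            torch.randint(-128, 127, (out_features, in_features), dtype=torch.int8),
        )
        self.register_buffer("scales", torch.ones(out_features, dtype=dtype))
        if bias:
            self.register_buffer("bias", torch.zeros(1, out_features, dtype=dtype))
        else:
            self.bias = None

    def quantize(self, weights):
        # Symmetric per-channel quantization: scale each output row so that
        # its largest absolute value maps to 127.
        w_fp32 = weights.clone().to(torch.float32)
        scales = w_fp32.abs().max(dim=-1).values / 127
        scales = scales.to(weights.dtype)
        self.int8_weights = torch.round(w_fp32 / scales.unsqueeze(1)).to(torch.int8)
        self.scales = scales

    def forward(self, x):
        # Dequantize on the fly: cast INT8 weights to the activation dtype,
        # rescale per channel, then apply a standard linear transform.
        w = self.int8_weights.to(x.dtype) * self.scales.unsqueeze(1)
        out = F.linear(x, w)
        if self.bias is not None:
            out = out + self.bias
        return out
```

Storing weights in INT8 roughly quarters the memory footprint relative to FP32 while keeping activations and accumulation in 16-bit precision.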

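A hypothetical usage example, under the same assumptions: quantize the weights of an existing float linear layer and run a 16-bit input through the sketched module (the 512-wide shapes are illustrative).

```python
import torch
import torch.nn as nn

# A regular bfloat16 linear layer whose weights we want to compress.
fp_layer = nn.Linear(512, 512, bias=False, dtype=torch.bfloat16)

# Quantize its weights into the INT8 layer sketched above.
q_layer = W8A16LinearLayer(512, 512, bias=False, dtype=torch.bfloat16)
q_layer.quantize(fp_layer.weight)

x = torch.randn(1, 512, dtype=torch.bfloat16)
print(q_layer(x).shape)  # torch.Size([1, 512])
```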