The fast and the curious: load testing Llama 3.1 with vLLM

Nov 5, 2024

Figure: Throughput Scaling of Llama 3.1 8B Under Various Quantization Methods on an NVIDIA A6000

Figure: Throughput Scaling of Llama 3.1 70B Under Various Quantization Methods on 4 x NVIDIA A6000