LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Paper • 2208.07339 • Published • 5

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Paper • 2210.17323 • Published • 8

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Paper • 2211.10438 • Published • 6

QLoRA: Efficient Finetuning of Quantized LLMs
Paper • 2305.14314 • Published • 56

Ahmad Khan
ahkhan

AI & ML interests
None yet

Recent Activity
updated a Space about 1 month ago: ahkhan/trackio
published a Space about 1 month ago: ahkhan/trackio
updated a collection 3 months ago: Quantization Reading-List

Organizations
None yet