Training LLMs over Neurally Compressed Text
Paper: arXiv 2404.03626 • Published
Table 5: Transformers struggle to learn Arithmetic Coding. In the sequence-to-sequence setting, a model that learns AC compression/decompression should …
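As a rough illustration of the mapping such a model would have to learn, here is a minimal float-based arithmetic coder. This is a toy sketch under stated assumptions (short inputs, a fixed hand-picked symbol distribution), not the paper's implementation; practical arithmetic coders use integer arithmetic with bit-level renormalization and, in the paper's setting, an LLM-supplied adaptive model.

```python
# Toy arithmetic coder: encodes a message to a single number in [0, 1)
# by repeatedly narrowing an interval according to symbol probabilities.
# Assumption: this fixed distribution stands in for the model's predictions.
PROBS = {"a": 0.5, "b": 0.3, "c": 0.2}

def cum_ranges(probs):
    """Map each symbol to its sub-interval of [0, 1)."""
    lo, ranges = 0.0, {}
    for sym, p in probs.items():
        ranges[sym] = (lo, lo + p)
        lo += p
    return ranges

def encode(msg, probs):
    """Narrow [0, 1) once per symbol; return a point inside the final interval."""
    ranges = cum_ranges(probs)
    lo, hi = 0.0, 1.0
    for sym in msg:
        span = hi - lo
        s_lo, s_hi = ranges[sym]
        lo, hi = lo + span * s_lo, lo + span * s_hi
    return (lo + hi) / 2

def decode(code, n, probs):
    """Recover n symbols by locating the code within successive sub-intervals."""
    ranges = cum_ranges(probs)
    out, lo, hi = [], 0.0, 1.0
    for _ in range(n):
        span = hi - lo
        x = (code - lo) / span
        for sym, (s_lo, s_hi) in ranges.items():
            if s_lo <= x < s_hi:
                out.append(sym)
                lo, hi = lo + span * s_lo, lo + span * s_hi
                break
    return "".join(out)

msg = "abcab"
code = encode(msg, PROBS)
assert decode(code, len(msg), PROBS) == msg
```

The interval-narrowing updates here are simple, but a transformer asked to emulate them end-to-end must track a high-precision running state across the whole sequence, which is one intuition for why the task in Table 5 is hard.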