DocOwl2 requires flash attention with no way to disable it
#1, opened by nicozck
DocOwl2 cannot be loaded without flash_attn because the compressor implementation uses flash attention unconditionally.
As a result, DocOwl2 cannot run on many non-NVIDIA devices. Please consider adding an option to enable or disable flash attention.
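To illustrate the request, here is a minimal sketch of how such an option could be resolved. It is not taken from the DocOwl2 code: the function names, the `use_flash_attn` parameter, and the backend labels `"flash_attn"` and `"eager"` are all hypothetical. The only real assumption is that the package is importable under the name `flash_attn`.

```python
import importlib.util
from typing import Optional


def flash_attn_available() -> bool:
    """Return True if the flash_attn package can be found in this environment."""
    return importlib.util.find_spec("flash_attn") is not None


def resolve_attention_backend(use_flash_attn: Optional[bool] = None) -> str:
    """Pick an attention backend (hypothetical helper, not part of DocOwl2).

    use_flash_attn=None  -> use flash attention only if it is installed
    use_flash_attn=True  -> require flash attention, fail loudly if missing
    use_flash_attn=False -> always use the plain (eager) attention path
    """
    if use_flash_attn is False:
        return "eager"
    if flash_attn_available():
        return "flash_attn"
    if use_flash_attn is True:
        raise ImportError(
            "use_flash_attn=True was requested, but flash_attn is not installed."
        )
    # Auto mode on a machine without flash_attn: fall back instead of crashing.
    return "eager"


if __name__ == "__main__":
    print(resolve_attention_backend())       # depends on the environment
    print(resolve_attention_backend(False))  # always "eager"
```

With something like this, the compressor could branch on the returned label and only import flash_attn inside the `"flash_attn"` branch, so the model would still load on devices where flash_attn cannot be installed.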
