overfitting
LTX-Video now supports up to 60 seconds, but when actually generating 60-second clips the model seems to reproduce memorized training data, such as stock-video watermarks and video game HUDs.
prompt: "The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim."
length: 60 seconds at 12 FPS
result:
Inference uses diffusers, with the model quantized to NF4. If there are any fixes for this other than increasing resolution, increasing FPS, or using a higher-precision quantization, please provide them.
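For anyone trying to reproduce this, here is a rough sketch of how the length setting translates into a frame count. It assumes the usual LTX-Video constraint that the number of frames has the form 8k + 1; the helper name `ltx_num_frames` is made up for illustration and is not part of diffusers.

```python
def ltx_num_frames(seconds: float, fps: int) -> int:
    """Return the frame count of the form 8*k + 1 closest to seconds * fps.

    Assumption: the model expects num_frames = 8*k + 1 (k >= 1).
    """
    target = round(seconds * fps)          # raw frame count, e.g. 60 * 12 = 720
    k = max(1, round((target - 1) / 8))    # nearest valid multiple of 8
    return 8 * k + 1


print(ltx_num_frames(60, 12))  # 721
```

So a 60-second clip at 12 FPS would be requested as roughly 721 frames under that assumption.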