Token limit?

#1
by jujutechnology - opened

Hi. The model only outputs approximately 16 seconds of audio no matter how long the text is. Is this just a limitation of this Space or is the model not able to do longer form text?

NineNineSix org

This model was pre-trained on audio clips of up to 15 seconds, which is fine for streaming but not well suited to generating long sentences. On longer inputs it may become unstable.

Is there a streaming implementation example somewhere? If not, could you share high-level instructions on how to go about it? Thank you for the incredible work.
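
One possible workaround, given the 15-second training limit mentioned above, is to split long text into short chunks and synthesize them one at a time, yielding audio as each chunk finishes. The sketch below is only an illustration and not the authors' method: `synthesize` is a hypothetical callable standing in for whatever text-to-audio function you use, and the 30-word budget is an assumed, conservative guess at what fits in under 15 seconds of speech.

```python
import re
from typing import Callable, Iterator, List


def split_into_chunks(text: str, max_words: int = 30) -> List[str]:
    """Group sentences into chunks of at most max_words words each."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the budget is hard-split by word count.
        while len(words) > max_words:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        # Start a new chunk if this sentence would overflow the current one.
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def stream_speech(
    text: str,
    synthesize: Callable[[str], bytes],  # hypothetical: text in, audio bytes out
    max_words: int = 30,                 # assumed budget, tune for your model
) -> Iterator[bytes]:
    """Yield audio chunk by chunk so playback can begin before the full text is done."""
    for chunk in split_into_chunks(text, max_words):
        yield synthesize(chunk)
```

A caller would loop over `stream_speech(long_text, my_tts_function)` and play or append each piece as it arrives. Splitting on sentence boundaries keeps each request short, though joins between chunks may still be audible.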
