loop utterance
#2, opened by vinnitu
How do I handle long files? Something like this, from the non-ONNX version:

import gigaam

model = gigaam.load_model("ctc", use_flash=False)
recognition_result = model.transcribe_longform("long_example.wav")
# Each utterance carries its text and its (start, end) boundaries
for utterance in recognition_result:
    transcription = utterance["transcription"]
    start, end = utterance["boundaries"]
    print(f"[{gigaam.format_time(start)} - {gigaam.format_time(end)}]: {transcription}")

Is something similar possible here?
You can use VAD (voice activity detection). There's an example at the link:
https://github.com/istupakov/onnx-asr?tab=readme-ov-file#vad
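
For reference, a rough sketch of what that loop might look like with onnx-asr and VAD. It follows the pattern in the linked README (load_vad, with_vad, recognize), but several things here are assumptions: the model name "gigaam-v2-ctc", that each result exposes start, end and text attributes, and the format_time helper, which is a local stand-in and not part of onnx-asr. Check the README for the current API before relying on it.

```python
from types import SimpleNamespace


def format_time(seconds: float) -> str:
    """Local helper (not part of onnx-asr): format seconds as MM:SS.ss."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


def format_segments(segments) -> list[str]:
    """Build one '[start - end]: text' line per segment.

    Assumes each segment has .start, .end (seconds) and .text attributes.
    """
    return [
        f"[{format_time(seg.start)} - {format_time(seg.end)}]: {seg.text}"
        for seg in segments
    ]


def transcribe_long_file(path: str):
    """Split a long recording with VAD and recognize each speech segment.

    Assumption: onnx-asr is installed and the model name below is available.
    The import is kept inside the function so the helpers above work without it.
    """
    import onnx_asr

    vad = onnx_asr.load_vad("silero")
    model = onnx_asr.load_model("gigaam-v2-ctc").with_vad(vad)
    return model.recognize(path)


# Offline demo with stand-in segments, so no model download is needed.
demo = [
    SimpleNamespace(start=0.0, end=2.5, text="hello"),
    SimpleNamespace(start=75.5, end=80.0, text="world"),
]
for line in format_segments(demo):
    print(line)
```

With a real file, the loop would be `for line in format_segments(transcribe_long_file("long_example.wav")): print(line)`.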