Google Cloud Speech: word start times

I'm looking at using Google Cloud Speech to transcribe long-form narrated audio files, and I need to know the start time of each phrase in the audio file. Is there a way to do this with Google Cloud Speech? I'm currently using transcribe_async.py. Thanks.

2 Answers

1

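Google Cloud Speech cannot do this. If this information is important to you, you may need to look at other ASR systems. I know that offline, non-hosted ASR systems such as Kaldi and CMU Sphinx will give you this information. I don't know whether any hosted ASR system can provide it.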

Answered on 2024-05-06T12:58:48+08:00
0

You can get the (approximate) start and end time of each word, measured from the beginning of the audio track, by setting the enableWordTimeOffsets option to true: https://cloud.google.com/speech/docs/async-time-offsets

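Note that the start time of the first word of the transcript is always 0 and, as far as I can tell, the start time of each word is the same as the end time of the previous word, even if there is a pause between them.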
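
For illustration, here is a rough sketch of what this looks like at the JSON level of the REST API. It is not taken from transcribe_async.py: the helper names (build_request, parse_offset, word_timings), the gs:// URI and the sample response are all made up for the example, and it assumes the response reports startTime and endTime as strings such as "1.300s". No network call is made; the sketch only builds the request body and reads the offsets out of a response that has already been decoded.

```python
def build_request(gcs_uri, language_code="en-US"):
    """Build a request body with word time offsets switched on.

    Hypothetical helper. Depending on the audio format you may also
    need "encoding" and "sampleRateHertz" in the config.
    """
    return {
        "config": {
            "languageCode": language_code,
            "enableWordTimeOffsets": True,
        },
        "audio": {"uri": gcs_uri},
    }


def parse_offset(value):
    """Convert an offset string such as '1.300s' to seconds as a float."""
    return float(value.rstrip("s"))


def word_timings(response):
    """Return (word, start_seconds, end_seconds) for every word.

    Only the first alternative of each result is read.
    """
    timings = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for info in alternatives[0].get("words", []):
            timings.append((
                info["word"],
                parse_offset(info.get("startTime", "0s")),
                parse_offset(info.get("endTime", "0s")),
            ))
    return timings


# Hand-written sample in the assumed response shape, not real API output.
SAMPLE_RESPONSE = {
    "results": [{
        "alternatives": [{
            "transcript": "hello world",
            "words": [
                {"word": "hello", "startTime": "0s", "endTime": "0.500s"},
                {"word": "world", "startTime": "0.500s", "endTime": "1.300s"},
            ],
        }]
    }]
}

if __name__ == "__main__":
    for word, start, end in word_timings(SAMPLE_RESPONSE):
        print(f"{word}: {start:.3f}s to {end:.3f}s")
```

Since the question asks about phrases rather than words, one option is to take the start time of the first word of each phrase from this list.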

Answered on 2024-05-06T12:58:48+08:00
