ASR systems that have recently become available on the market, such as Whisper, seem to provide highly accurate transcription of speech. But how do these systems perform under atypical conditions, for example with dialects, or with speech from children, the elderly, or non-native speakers of Dutch? What happens when there are multiple speakers, crosstalk, and background noise? And what should you do if you want to transcribe very large amounts of speech data? What is the best way to handle this in a more (infra)structural way?
In this seminar, we will show examples from different application areas and discuss practical, operational, and strategic aspects.
For whom
Researchers, teachers, and support staff from various disciplines interested in the application of automatic speech recognition.