Snow, J. S. (2018). Automatic Detection and Correction of Errors in Video Tutorial Transcripts. Retrieved from http://purl.flvc.org/fsu/fd/2018_Sp_Snow_fsu_0071N_14574
Speech-to-text technology continues to become more accessible and more accurate. But while automated personal assistants and transcription software are improving, they still falter in many cases that involve domain-specific terminology from fields such as medicine and technology. If spoken clearly and concisely, some terms can be transcribed correctly, but confounding variables such as accents can make a correct transcription of a word or phrase unlikely. To fix these errors in domain-specific terminology, I propose a post-processing step that finds and corrects them. To find the mistakes, I trained a language model on relevant data: in this test case, Stack Overflow posts that were tagged with Java-related keywords. The model is used to find words in automated transcripts that are likely mis-transcribed. Using an n-gram model and a long short-term memory (LSTM) neural network trained on 80% of the manual audio transcriptions, replacements for the mis-transcribed words are suggested. The accuracy of these suggestions is measured by comparing the corrections against the manual transcription that corresponds to the automated transcription in the test case. The results are mixed: they reveal an issue with the language model, but they also show that the neural network can suggest corrections for technical words with almost 70% accuracy.
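The detection step described above can be illustrated with a minimal sketch. The abstract does not state which kind of language model was used for detection, so this example assumes a bigram model with add-one smoothing; the toy corpus, the threshold, and all function names are hypothetical and stand in for the Stack Overflow training data and the thesis's actual implementation.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        # "<s>" marks the sentence start so the first word has a context.
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def flag_suspect_words(transcript, unigrams, bigrams, threshold):
    """Return words whose probability given the previous word is below threshold."""
    tokens = ["<s>"] + transcript.lower().split()
    suspects = []
    for prev, word in zip(tokens, tokens[1:]):
        if bigram_probability(prev, word, unigrams, bigrams) < threshold:
            suspects.append(word)
    return suspects


# Hypothetical in-domain corpus (assumption: stands in for Java-tagged posts).
corpus = [
    "declare a string variable in java",
    "a string variable holds text",
    "the array list stores a string",
]
unigrams, bigrams = train_bigram_model(corpus)

# Hypothetical automated transcript in which "string" was heard as "spring".
suspects = flag_suspect_words(
    "declare a spring variable in java", unigrams, bigrams, threshold=0.1
)
print(suspects)
```

In this sketch the word that follows a mis-transcription may also be flagged, because its preceding context was never seen in training. A real system would need a far larger corpus and a tuned threshold, and would pass the flagged positions to a separate suggestion model.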