System.Speech Pronunciation Training App

  • Question

  • I want to develop an app for evaluating the pronunciation of non-native English speakers on words and short phrases (up to 5 words). The user would receive feedback on the accuracy of the pronunciation, perhaps with further feedback on which words/phonemes are poorly pronounced.

    I have reviewed the following content:

    https://stackoverflow.com/questions/16746850/finding-pronunciation-correctness

    https://stackoverflow.com/questions/2854087/is-it-possible-to-use-windows-speech-recognition-engine-in-a-word-pronunciation/2909005#2909005

    https://stackoverflow.com/questions/20770593/speech-to-phoneme-in-net

    https://stackoverflow.com/questions/49519428/how-to-get-pronunciation-phonemes-corresponding-to-a-word-using-c

    From Eric Brown's comments, a project using System.Speech will need to be in C++, and follow these steps:

    1. Generate pronunciations for the target word/phrase as phonemes using ISpEnginePronunciation::GetPronunciations. This will generate one or more variants of how the word/phrase might be pronounced;

    2. Generate the recognised phonemes from the user using a dictation grammar;

    3. Run a comparison between each of the pronunciation variants of the target word/phrase and the recognised phonemes. This is the nub of the issue.

    It has been suggested that the Levenshtein distance might be used to make the comparison. This appears to be a string matching algorithm and doesn't account, for instance, for the similarity between the d and t sounds in English.
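To make the comparison in step 3 concrete, here is a minimal sketch of an unweighted Levenshtein distance applied to phoneme sequences rather than characters. Python is used only for brevity. The phoneme labels and the two variants for "either" are hand-written stand-ins for what ISpEnginePronunciation::GetPronunciations and a dictation grammar might return; no SAPI call is made, and `levenshtein` and `best_variant` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple


def levenshtein(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Unit-cost edit distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference phoneme
                            curr[j - 1] + 1,      # insert a spoken phoneme
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def best_variant(variants: Sequence[List[str]],
                 spoken: Sequence[str]) -> Tuple[int, List[str]]:
    """Pick the reference variant closest to what was spoken."""
    best = min(variants, key=lambda v: levenshtein(v, spoken))
    return levenshtein(best, spoken), best


# Illustrative data only (assumed, not produced by SAPI).
either_variants = [["iy", "dh", "er"], ["ay", "dh", "er"]]
bed = ["b", "eh", "d"]

print(best_variant(either_variants, ["ay", "dh", "er"]))
print(levenshtein(bed, ["b", "eh", "t"]), levenshtein(bed, ["b", "eh", "k"]))
```

With unit costs, saying "bet" or "beck" for "bed" both score 1, which is the limitation described above: a near miss (d/t) is penalised the same as an unrelated sound.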

    I did see one comment from Eric Brown that "The SAPI Phonetic Alphabet Reference can help you here, as it breaks down the consonants & vowels into features"

    I think I found that here:

    https://documentation.help/Microsoft-Speech-Platform-SDK-11/ccbe7d0a-2aa5-4987-98c5-b65089bac8c3.htm

    It does not seem to provide what I'm looking for: a method of assessing the similarity in pronunciation.

    My questions:

    1. Am I on the right track with the description above?

    2. If so, how can I use the SAPI Phonetic Alphabet Reference to calculate the "distance" between a target pronunciation and a spoken one?

    3. What would be the limitations of this method? Would it be useful for long phrases of more than a few words?
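One possible way to approach questions 2 and 3 is to replace the unit substitution cost with one derived from articulatory features. The sketch below is an illustration under stated assumptions, not the SAPI method: the `FEATURES` table is hand-built from standard voicing/place/manner descriptions of a few English consonants rather than copied from the SAPI Phonetic Alphabet Reference, and the function names, the equal feature weights, and the insertion/deletion cost of 1.0 are all hypothetical choices.

```python
from typing import Sequence

# Assumed feature table: (voiced, place, manner). Hand-written for illustration.
FEATURES = {
    "p": (False, "bilabial", "stop"),
    "b": (True, "bilabial", "stop"),
    "t": (False, "alveolar", "stop"),
    "d": (True, "alveolar", "stop"),
    "k": (False, "velar", "stop"),
    "g": (True, "velar", "stop"),
    "s": (False, "alveolar", "fricative"),
    "z": (True, "alveolar", "fricative"),
}


def substitution_cost(a: str, b: str) -> float:
    """Fraction of features that differ; 1.0 when a phoneme is not in the table."""
    if a == b:
        return 0.0
    fa, fb = FEATURES.get(a), FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # no feature data: fall back to the plain Levenshtein cost
    return sum(x != y for x, y in zip(fa, fb)) / len(fa)


def weighted_distance(ref: Sequence[str], hyp: Sequence[str],
                      indel: float = 1.0) -> float:
    """Edit distance where substitutions are priced by feature difference."""
    prev = [j * indel for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        curr = [i * indel]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + indel,
                            curr[j - 1] + indel,
                            prev[j - 1] + substitution_cost(r, h)))
        prev = curr
    return prev[-1]


def pronunciation_score(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Map the distance to 0..1 by dividing by the longer sequence length."""
    longest = max(len(ref), len(hyp))
    if longest == 0:
        return 1.0
    return max(0.0, 1.0 - weighted_distance(ref, hyp) / longest)


bed = ["b", "eh", "d"]
print(pronunciation_score(bed, ["b", "eh", "t"]))  # d -> t differs in voicing only
print(pronunciation_score(bed, ["b", "eh", "k"]))  # d -> k differs in voicing and place
```

In this sketch d/t costs 1/3 while d/k costs 2/3, so "bet" scores higher than "beck" against "bed". Dividing by the longer sequence length keeps scores comparable across phrase lengths, but a single error is diluted in a longer phrase, so scoring word by word may be more informative there.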

    Thanks in advance,

    Ozs

    Friday, May 24, 2019 3:24 PM

All replies

  • Hi Oztromboli,

    Thank you for posting here.

    For your question, you could check the link below.

    https://www.codeproject.com/Articles/363410/AISpeech-API-ASDK-Tutorial-A-spoken-English-assess

    I hope this would be helpful.

    Best Regards,

    Wendy


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, May 27, 2019 6:11 AM
  • Hi Wendy

    Thanks very much for your response. The article you provided uses the AISpeech API for Apache Flex. I was really after a pronunciation checker under .NET using System.Speech, because the parent app is developed in this.

    Regards

    OzS

    Monday, May 27, 2019 6:36 AM
  • Correction, SAPI, not System.Speech.

    Wednesday, May 29, 2019 3:42 AM
  • Hi Oztromboli,

    For System.Speech, I could not find project source code for this. However, the link below describes a way to do that with System.Speech.

    https://stackoverflow.com/questions/16746850/finding-pronunciation-correctness

    Best Regards,

    Wendy



    Monday, June 3, 2019 3:00 AM
  • Hi Wendy

    Thanks, I had seen this reference. In fact it was the first link that I listed in my original question.

    I need someone who is familiar with using the ISpEnginePronunciation::GetPronunciations method to generate reference phonemes, and also with a method to compare these with spoken phonemes to generate a quality score.

    Regards

    OzS

    Monday, June 3, 2019 7:54 PM
  • Hi Oztromboli,

    Sorry about that. For C++, the Visual C++ forum would be able to give you more help.

    Best Regards,

    Wendy



    Wednesday, June 5, 2019 5:46 AM
  • Hello,

    I think you would do better to ask directly in a forum that concentrates solely on developing with the MS Speech API: https://social.msdn.microsoft.com/Forums/en-US/home?forum=SpeechService

    Regards, Guido

    Thursday, June 6, 2019 6:14 AM