Bing Voice Recognition API Output

  • Question

  • I'm seeing some odd behavior where the voice recognition API sometimes returns words and sometimes returns numbers. For example, given the audio input "fifteen dollars and seven cents", it sometimes returns "fifteen dollars and seven cents" and other times "$15.07". Is there a way to ensure that one form or the other is always returned? Or will I need to parse and convert words to numbers on my own?

    Thanks,
    -Mike

    Tuesday, September 6, 2016 9:21 PM

All replies

  • Hi Mike,

    Did you make any headway on this issue? We are running into exactly the same issue.

    Thanks

    Tuesday, September 13, 2016 6:34 AM
  • We ended up having to write a parser to convert lexical responses to numbers. It seems this all started happening about 10 days ago. Another user posted this question on Stack Overflow - "http://stackoverflow.com/questions/39325602/getting-different-results-via-bing-speech-recognition-api-beta-for-same-audio".

    Sorry for wrapping that link in quotes. Apparently you can't post links in responses until your account is verified (which I'm pretty sure I did).
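
    For anyone needing the same workaround, here is a minimal sketch of a words-to-numbers parser in Python. It is not the parser described above. The function names (`words_to_int`, `words_to_decimal`, `currency_to_decimal`) are made up for illustration, and it only handles simple English phrases like the two examples in this thread.

```python
from decimal import Decimal

# Vocabulary for a small English number parser (illustrative, not exhaustive).
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: (i + 2) * 10 for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}
SCALES = {"thousand": 1000, "million": 1000000}


def words_to_int(words):
    """Convert a list of number words, e.g. ['ninety', 'two'], to an int."""
    total = 0
    current = 0
    for word in words:
        if word == "and":
            continue  # filler word, as in "one hundred and five"
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError("unrecognized number word: %r" % word)
    return total + current


def words_to_decimal(text):
    """Parse phrases such as 'ninety two point six four' into a Decimal."""
    words = text.lower().replace("-", " ").split()
    if "point" not in words:
        return Decimal(words_to_int(words))
    idx = words.index("point")
    fraction_words = words[idx + 1:]
    # After "point", every word must be a single spoken digit.
    if not fraction_words or any(UNITS.get(w, 10) > 9 for w in fraction_words):
        raise ValueError("expected single digits after 'point'")
    digits = "".join(str(UNITS[w]) for w in fraction_words)
    return Decimal("%d.%s" % (words_to_int(words[:idx]), digits))


def currency_to_decimal(text):
    """Parse phrases such as 'fifteen dollars and seven cents'."""
    dollars = cents = 0
    pending = []
    for word in text.lower().replace("-", " ").split():
        if word in ("dollar", "dollars"):
            dollars, pending = words_to_int(pending), []
        elif word in ("cent", "cents"):
            cents, pending = words_to_int(pending), []
        else:
            pending.append(word)
    if pending:
        raise ValueError("trailing words without a currency unit")
    return Decimal(dollars) + Decimal(cents) / 100


print(currency_to_decimal("fifteen dollars and seven cents"))
print(words_to_decimal("ninety two point six four"))
```

    Using `Decimal` rather than `float` keeps currency amounts exact. Anything outside this small vocabulary raises `ValueError`, so the caller can fall back to the raw text.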

    • Edited by marwine Tuesday, September 13, 2016 5:41 PM
    Tuesday, September 13, 2016 5:40 PM
  • Same issue
    Wednesday, September 14, 2016 1:05 AM
  • Same here: the 'lexical' and 'name' responses have been identical since roughly one and a half to two weeks ago. According to Microsoft's documentation, the 'name' response should contain the text after inverse text normalization (which would be 92.64 in my example below). It seems the 'name' result now contains only the text from before inverse text normalization.

    Here's an example JSON response I received:

    "name":"ninety two point six four","lexical":"ninety two point six four"

    Doc link:

    https://www.microsoft.com/cognitive-services/en-us/Speech-api/documentation/API-Reference-REST/BingVoiceRecognition#3-voice-recognition-responses
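
    A quick way to detect this case in code is to compare the two fields. This is a rough Python sketch that assumes each result is a flat JSON object with the 'name' and 'lexical' keys shown above. The helper name `normalization_applied` is hypothetical, and the real response may nest these fields differently.

```python
import json


def normalization_applied(result):
    """Return True when 'name' differs from 'lexical'.

    Assumption: identical fields (ignoring case and surrounding whitespace)
    mean inverse text normalization was not applied to 'name'.
    """
    name = result.get("name", "").strip().lower()
    lexical = result.get("lexical", "").strip().lower()
    return name != lexical


# The response fragment from this post, wrapped in braces to make valid JSON.
sample = json.loads(
    '{"name":"ninety two point six four","lexical":"ninety two point six four"}'
)

if normalization_applied(sample):
    print("use 'name' as is:", sample["name"])
else:
    print("'name' was not normalized; parse 'lexical':", sample["lexical"])
```

    When the check returns False, the caller can hand the 'lexical' text to its own words-to-numbers parser instead of relying on 'name'.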

    Friday, September 16, 2016 10:12 PM