Microsoft Bopomofo Mis-Conversion Report Tool

  • Question

  • The above tool popped up on my screen yesterday, and I am not sure whether I should send it back. The list you sent me reads like "The Thousand Nights and a Night" (the Arabian Nights) to me.

    First of all, I am not sure this is the right forum for this. I don't know what your project is or what you are trying to accomplish. I assume you are building something for your NLP work. Also, as you know, phonetics is just one branch of linguistics.

    Secondly, I assume from the word Bopomofo in the tool's name that your team knows what the input characters are. What I don't understand is how you can say that one word is the correct conversion result and another is merely the converted character for the same input. A single input, that is, a single pronunciation, may correspond to many words in natural language. Bopomofo is just a set of pronunciation symbols, like the International Phonetic Alphabet (IPA) for English. A phonetic spelling by itself may have no normal form in grammar or semantics.

    Thirdly, I am not a linguist, but I know both languages and 注音符號 (Bopomofo). To me, it doesn't make sense to analyze pronunciation characters in order to build your NLP system or knowledge base. Language itself should be self-sufficient.

    Let me give you an example, taken from the list you sent me.

    The input ㄗˋ may correspond to 17 words: 自, 字, 恣, 漬, and so on. You say that 自 is the converted character and 字 is the correct conversion result for the input ㄗˋ. Could you explain what you mean?
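
    To make the point concrete, here is a minimal Python sketch. It assumes the reading in question is ㄗˋ, which 自, 字, 恣, and 漬 all share. The table `CANDIDATES` and both function names are my own hypothetical illustration, not anything taken from your tool, and the table lists only the four characters I quoted rather than all 17.

```python
# Hypothetical, tiny reading-to-candidates table.
# A real IME dictionary is far larger; only the four characters quoted above are listed.
CANDIDATES = {
    "ㄗˋ": ["自", "字", "恣", "漬"],
}


def candidates_for(reading: str) -> list[str]:
    """Return every character in the table that shares the given Bopomofo reading."""
    return CANDIDATES.get(reading, [])


def is_valid_conversion(reading: str, char: str) -> bool:
    """A character is a valid conversion if it is one of the reading's candidates."""
    return char in candidates_for(reading)


print(candidates_for("ㄗˋ"))
print(is_valid_conversion("ㄗˋ", "自"), is_valid_conversion("ㄗˋ", "字"))
```

    Both 自 and 字 pass the check, so the reading alone cannot tell you which one is "correct".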

    The next example is "造紐約(矽)". I believe the original sentence was "打造紐約矽谷". You cannot use the three characters "造紐約" alone to determine whether the next character is 矽 or 系. 打造紐約矽谷 means "to build New York into a second Silicon Valley". 矽谷 is an object complement, so it has to be a noun. We rarely use 系 as a noun in Chinese, except when it means a department.
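
    Here is a rough Python sketch of what I mean by needing more context. It is only an illustration under my own assumptions: `LEXICON` is a hypothetical three-word list and `pick_candidate` is a name I made up, so this is not a description of how your conversion engine works.

```python
# Hypothetical mini-lexicon of known words; a real system would use a large dictionary.
LEXICON = {"矽谷", "系統", "科系"}


def pick_candidate(prev_text: str, candidates: list[str], next_char: str) -> list[str]:
    """Keep the candidates that form a known word with a neighbouring character.

    If no candidate is supported by the context, return all of them (undecided).
    """
    kept = []
    for c in candidates:
        forms_word_right = (c + next_char) in LEXICON
        forms_word_left = bool(prev_text) and (prev_text[-1] + c) in LEXICON
        if forms_word_right or forms_word_left:
            kept.append(c)
    return kept or list(candidates)


# With only the left context "造紐約", both candidates remain.
print(pick_candidate("造紐約", ["矽", "系"], ""))
# Once the following character 谷 is known, only 矽 (as in 矽谷) remains.
print(pick_candidate("造紐約", ["矽", "系"], "谷"))
```

    In this sketch the choice is settled by the following character 谷, not by the three characters before it.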

    Your opinions are welcome if you are interested in this topic. Thanks.



    Wednesday, September 14, 2016 6:57 PM