Cannot extract from text file with ANSI encoding

    Question

  • I cannot extract the data from a tab-delimited text file that is encoded in ANSI (with some special characters in it). I tried all the encodings available for Extractors.Text, but nothing works. Is there any workaround for this?

    Here are some of the characters that cause the error.

    Tuesday, August 23, 2016 8:26 AM

Answers

  • By default, UTF-8 is expected; ANSI is not supported. The supported encodings, which you can set with the Encoding property, are listed here: https://msdn.microsoft.com/en-us/library/azure/mt621366.aspx#Anchor_0
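
    As an illustration of why such a file is rejected, here is a minimal Python sketch (not U-SQL). It assumes "ANSI" means the Windows-1252 code page, which is only the common case on Western-European Windows systems; the helper name decodes_as_utf8 is made up for this example. A character such as é is stored as the single byte 0xE9 in Windows-1252, and that byte on its own is not a valid UTF-8 sequence.

```python
# Assumption: "ANSI" here means Windows-1252 (Python codec name "cp1252").
# "caf\u00e9" is the word "cafe" with an acute accent on the final e.
ansi_bytes = "caf\u00e9".encode("cp1252")   # the accented e becomes the byte 0xE9


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(ansi_bytes))                     # False
print(decodes_as_utf8("caf\u00e9".encode("utf-8")))    # True
print(decodes_as_utf8(b"plain ASCII"))                 # True
```

    Plain ASCII text is identical in both encodings, which is why only the rows with special characters trigger the problem.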

    I also had some ANSI-encoded files that I changed to UTF-8 using Notepad++. You can open many files and convert them all at once: http://stackoverflow.com/questions/7256049/notepad-converting-ansi-encoded-file-to-utf-8
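
    If you would rather script the conversion than use an editor, the same re-encoding step can be sketched in Python. This is only a sketch under assumptions: it treats "ANSI" as Windows-1252 (pass a different source_encoding if your files use another code page), and the function names, the "*.txt" pattern, and the folder layout are hypothetical.

```python
from pathlib import Path
from typing import List


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode one text file from source_encoding to UTF-8."""
    # newline="" disables newline translation, so CRLF/LF line endings
    # and the tab delimiters are written back exactly as they were read.
    with open(src, "r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)


def convert_folder(src_dir: Path, dst_dir: Path, pattern: str = "*.txt",
                   source_encoding: str = "cp1252") -> List[Path]:
    """Convert every file matching pattern in src_dir; return the new paths."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(src_dir.glob(pattern)):
        dst = dst_dir / src.name
        convert_to_utf8(src, dst, source_encoding)
        written.append(dst)
    return written
```

    For example, convert_folder(Path("ansi_input"), Path("utf8_output")) writes UTF-8 copies into a separate folder and leaves the originals untouched. Python's cp1252 codec raises UnicodeDecodeError on the few byte values that code page leaves undefined (such as 0x81), which is a useful hint that the file is actually in some other encoding.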


    Jorg Klein's Microsoft Business Intelligence Blog

    Tuesday, August 23, 2016 10:20 AM

All replies

  • In addition to Jorg's reply, if you cannot change the encoding of the data to UTF-8, you can write your own custom extractor that uses the C# capabilities to handle ANSI code pages.

    Also, feel free to add your vote to this user feedback item.


    Michael Rys

    Tuesday, August 23, 2016 7:56 PM
    Moderator