Strange characters when importing a CSV file

  • Question

  • Hi,

    I have a problem: I have a strange CSV file. The first row of the table is a comment, the second row is the table header, and the remaining rows are the order items; the columns hold the attributes of each order, such as ID, code, shipper, etc.

    When creating a flat file schema with the wizard, I select my CSV file as the instance file. At the step where I have to select the portion of data that defines the record... surprise! I see only strange characters like this: "6>@%&f�+��! `�e���֫`�{�K�T,���o8u����+�`�ӡ���`��B". What is wrong in my procedure? Thanks to everyone who can help.

    Monday, September 3, 2012 2:32 PM

Answers

  • Maybe I solved it...

    My file was actually named <filename>.csv.xlsx. I copied its entire text, opened a new blank Excel sheet, and pasted all the text into it.

    Finally, I saved the "new document" as .csv (the icon changed), and the import with the BizTalk wizard seems to work now.

    Tuesday, September 4, 2012 12:17 PM

All replies

  • Did you verify that the code page is set to UTF-8 (65001) in the first step, where you select the .csv file?

    If this post answers your question, please mark it as such. If this post is helpful, click 'Vote as helpful'.


    • Edited by Madhu_A Monday, September 3, 2012 2:39 PM
    Monday, September 3, 2012 2:39 PM
  • When importing flat files and converting them to XML, the disassembler needs to know which code page the flat file uses. You can choose the "Code page" in a dropdown list on the same wizard page where you choose the "Instance file".
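
    To illustrate why the code page matters, here is a minimal Python sketch (not BizTalk code). The sample row is made up for illustration; it only shows that the same bytes read with the wrong code page come out as strange characters.

```python
# Sample rows with one non-ASCII character; the content is hypothetical.
text = "ID;code;shipper\n1;A1;Müller\n"

# Bytes as they would sit on disk if the file was saved as UTF-8.
raw = text.encode("utf-8")

# Reader uses the matching code page (UTF-8, 65001): text comes back intact.
right = raw.decode("utf-8")

# Reader assumes Windows-1252 instead: the two bytes of "ü" become two characters.
wrong = raw.decode("cp1252")

print(right)
print(wrong)  # the name shows up as "MÃ¼ller"
```

    A mismatch like this usually damages only the non-ASCII characters. If the whole preview is unreadable, the file is probably not plain text at all.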

    Morten la Cour

    Monday, September 3, 2012 2:39 PM
  • When importing, I had already chosen the code page for the flat file (and as far as I remember, BizTalk picks the encoding automatically, doesn't it?).

    I then tried this: I copied all the contents of the flat file into a plain .txt file (so the encoding stays the same!).

    With that file, the wizard works fine, and when I validate the instance, it succeeds.

    Why is that?

    Monday, September 3, 2012 3:00 PM
  • If you renamed the .csv file to .txt, the encoding remains the same; if you selected all the text from the .csv file and pasted it into a .txt file, the encoding may have changed. If your .csv file included a BOM (byte order mark) at the beginning of the file, the flat file wizard probably defaulted to the appropriate code page by recognizing the encoding from the BOM. If you look at the .csv file in a hex editor, you can see whether a BOM exists. Some example BOM values:

    Unicode (UTF-16, little endian): FF FE

    UTF-8: EF BB BF

    For additional information regarding byte order marks see: http://en.wikipedia.org/wiki/Byte_order_mark
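
    If no hex editor is at hand, the same check can be sketched in a few lines of Python. This is only an illustration: the helper names detect_bom and detect_bom_in_file are made up here, and the BOM constants come from Python's standard codecs module.

```python
import codecs

# Known BOMs, longest first so UTF-32 LE (FF FE 00 00) is not
# mistaken for UTF-16 LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None


def detect_bom_in_file(path):
    """Read the first four bytes of a file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

    For example, detect_bom(b"\xef\xbb\xbfID,code") returns "utf-8", while a file that starts directly with text returns None.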


    David Downing... If this answers your question, please Mark as the Answer. If this post is helpful, please vote as helpful.


    Tuesday, September 4, 2012 4:54 AM
  • Thanks David,

    With a hex editor I found this in my file:

    FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 52 00 6F 00 6F 00 74 00 20 00 45 00 6E 00 74 00 72 00 79 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 16 00 05 01 FF FF FF FF FF FF FF FF 02 00 00 00 20 08 02 00 00 00 00 00 C0 00 00 00 00 00 00 46 00 00 00 00 00 00 00 00 00 00 00 00 90 D1 ED B3 70 8A CD 01 FE FF FF FF 00 00 00 00 00 00 00 00 57 00 6F 00 72 00 6B 00 62 00 6F 00 6F 00 6B 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

    So, as you mentioned, my file seems to be affected by a BOM.

    The attached file shows what I see in the first step of the wizard: how can I resolve this?

    Thanks to all.

    Tuesday, September 4, 2012 8:20 AM
  • Maybe I solved it...

    My file was actually named <filename>.csv.xlsx. I copied its entire text, opened a new blank Excel sheet, and pasted all the text into it.

    Finally, I saved the "new document" as .csv (the icon changed), and the import with the BizTalk wizard seems to work now.

    Tuesday, September 4, 2012 12:17 PM
  • I believe you did. The .xlsx file does not contain a BOM, however; the bytes you posted are .xlsx-specific formatting information.
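
    As a rough Python sketch of how to confirm this, the two readable runs in the posted hex dump decode as UTF-16 (little endian) text, and the names that come out are typical of an Excel workbook container, not of CSV data or a BOM. The helper looks_like_workbook is hypothetical; the two signatures it checks are the usual ZIP and OLE compound file headers. The posted dump does not include the start of the file, so the signature check is shown only as an assumption about what to look for.

```python
# Two runs copied from the hex dump posted earlier in the thread.
root_entry = bytes.fromhex("52 00 6F 00 6F 00 74 00 20 00 45 00 6E 00 74 00 72 00 79 00")
workbook = bytes.fromhex("57 00 6F 00 72 00 6B 00 62 00 6F 00 6F 00 6B 00")

print(root_entry.decode("utf-16-le"))  # Root Entry
print(workbook.decode("utf-16-le"))    # Workbook

# File signatures (first bytes) of the two common Excel containers.
ZIP_MAGIC = b"PK\x03\x04"                             # ZIP archive, used by .xlsx
OLE_MAGIC = bytes.fromhex("D0 CF 11 E0 A1 B1 1A E1")  # OLE compound file, legacy Office


def looks_like_workbook(head: bytes) -> bool:
    """Return True if the leading bytes match a ZIP or OLE container signature."""
    return head.startswith(ZIP_MAGIC) or head.startswith(OLE_MAGIC)
```

    A file that passes this check is a workbook regardless of its extension, and has to be saved as CSV from Excel before the flat file wizard can read it.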

    David Downing... If this answers your question, please Mark as the Answer. If this post is helpful, please vote as helpful.

    Tuesday, September 4, 2012 5:47 PM