none
Bad type conversion from CSV file to DataTable RRS feed

  • Question

  • Hi,
    I have a CSV file with header as first values.
    I can open the file with Excel and the data are shown like this:

    Title    Category  Quantity Price

    Title1      12              1           0,2

    Title2      11              2           0,2

    Title3      21              2           0,3

     

    Now I would load the data in memory in a DataTable with typed DataColumn.
    The snippet code is:

    // Create the connection to CSV file

    private const string CONNECTIONSTRING = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + CSVFOLDER +"\\;" +"Extended Properties='Text;HDR=Yes;FMT=CSVDelimited'";

    private const string CSVFOLDER="CSVFOLDER";

    // CSV fileName

    string fileName=”C:\\..”; 
     
    string csvFolder = Path.GetDirectoryName(FileName);

     connection = new OleDbConnection(CONNECTIONSTRING.Replace(CSVFOLDER,csvFolder));

     

    // Load data in memory in a DataTable

    OleDbDataAdapter csvAdapter;

        csvAdapter = new OleDbDataAdapter("SELECT * FROM [" + Path.GetFileName(fileName) + "]", connection);

     

    DataTable CSVDataTable = new DataTable(Path.GetFileNameWithoutExtension(fileName));

     

        try

        {

         csvAdapter.Fill(CSVDataTable);

        }

        catch (Exception ex)

        {

         throw new Exception(String.Format("Error loading CSV file in {0}.\n{1}", Path.GetFileName(fileName), ex.Message));

        }

     

    The DataTable is filled by CSVAdapter but typed Columns (and consequently values) are not as excepted.
    Title is String, Quantity and Category are translated as Int32. It’s OK.
    But surprise, Price is converted to DateTime and the first value is 30/12/1899 0.02.00.

    If Price would be a series of 0,99, Price Column was translated as Double with the correct value.

     

    Could I control the conversion anyway?

     

    Best regards,
    Mauro



    Mauro
    Friday, July 31, 2009 7:01 PM

Answers

  • It's really hard.
    I try everything:
    keep off .csv in header filename in schema.ini and putting again
    copy in other folder both filename and schema.ini (the filename in header is the same of the file)
    keep off the backslash in connectionString after the Folder Data Source
    seeing if the last access time in schema.ini is changed (No change but I don't if sufficient to say that is not read)
    Always Price as Time (spent..)

    Reading file with TextFieldReader could be a solution but I would have data typed.

    Thanks Paul and VMazur

    Mauro


    Mauro
    • Marked as answer by MauroGv Wednesday, August 5, 2009 6:52 PM
    Tuesday, August 4, 2009 2:39 PM

All replies

  • The best way to ensure that the values are being imported correctly is to define a schema.ini file for your text file(s):

    http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx


    Paul ~~~~ Microsoft MVP (Visual Basic)
    Friday, July 31, 2009 8:18 PM
  • Hi Paul,
    thanks for your reply.
    I added a schema.ini file in the same directory of my CSV file.
    This is my schema.ini:

    [FileName.csv]
    ColNameHeader=True
    Format=CSVDelimited
    MaxScanRows=0
    CharacterSet=OEM
    Price="Price" Double

    I run again my code but the Price column remains as Data.
    I'm not sure if the syntax of the column is correct.
    Have I to specify all columns or what I want to change?

    Mauro


    Mauro
    Friday, July 31, 2009 9:34 PM
  • You need to define datatypes for all the columns, not just the one you are going to use. You schema.ini file shjuld look like

    [FileName.csv]
    ColNameHeader=True
    Format=CSVDelimited
    MaxScanRows=0
    CharacterSet=OEM
    Col1=Title Text

    Col2=Category Long
    Col3=Quantity Long
    Col4=Price Double

    Val Mazur (MVP) http://www.xporttools.net
    Monday, August 3, 2009 2:54 AM
    Moderator
  • Hi Val Mazur,
    thanks for your reply.
    It seems so easy but my schema.ini is completly ignored.
    The data are written in this way:

    Title;Category;Quantity;Price
    Title1;12;1;0,2
    Title2;11;2;0,2
    Title3;21;2;0,3

    It is CSV with different separator (;)
    and I tried to modify the schema.ini as below but the result was the same

    Schema.ini and filename.csv are in the same directory.


    [filename.csv]
    ColNameHeader=True
    Format=Delimited() 
    MaxScanRows=0
    CharacterSet=OEM
    Col1=Title Text
    Col2=Category Long
    Col3=Quantity Long
    Col4=Price Double

    (or Format=Delimited(;)  no change the result)

    When I fill data in DataTable I've still got Price as DateTime

    Is it possible?

    Mauro


    Mauro
    Monday, August 3, 2009 9:36 AM
  • I used the following (Currency instead of Double):

    [filename.csv]
    ColNameHeader=True
    Format=Delimited(;) 
    MaxScanRows=0
    CharacterSet=OEM
    Col1=Title Text
    Col2=Category Long
    Col3=Quantity Long
    Col4=Price Currency

    You may have to format the currency value but all the information is there.
    Paul ~~~~ Microsoft MVP (Visual Basic)
    Monday, August 3, 2009 12:34 PM
  • BTW, you could also probably just import the Price (Col4) as Text so you don't lose any parts of the value. But if you need to perfom any calculations then you will need to convert it to a numeric data type.
    Paul ~~~~ Microsoft MVP (Visual Basic)
    Monday, August 3, 2009 4:21 PM
  • Thanks again Paul,
    I tried all the possibility but DateTime in Price still remains.
    I haven't more idea, probably the schema.ini file is not read.
    Mauro


    Mauro
    Monday, August 3, 2009 8:29 PM
  • Did you put the schema.ini file in the same folder as the text file(s)?
    Paul ~~~~ Microsoft MVP (Visual Basic)
    Monday, August 3, 2009 8:43 PM
  • Unfortunately... yes.


    Mauro
    Monday, August 3, 2009 9:24 PM
  • And the file name in [filename.csv] matches the actual file name?
    Val Mazur (MVP) http://www.xporttools.net
    Tuesday, August 4, 2009 10:30 AM
    Moderator
  • Another option would be to use the TextFieldParser Class:

            Dim TextFileReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\Documents and Settings\...\My Documents\My Database\Text\List.txt")
    
            TextFileReader.TextFieldType = FileIO.FieldType.Delimited
            TextFileReader.SetDelimiters(";")
    
            Dim CurrentRow As String()
            While Not TextFileReader.EndOfData
                Try
                    CurrentRow = TextFileReader.ReadFields()
                    Console.WriteLine(CurrentRow(0).ToString)
                    Console.WriteLine(CurrentRow(1).ToString)
                    Console.WriteLine(CurrentRow(2).ToString)
                    Console.WriteLine(CurrentRow(3).ToString)
                Catch ex As _
                Microsoft.VisualBasic.FileIO.MalformedLineException
                    MsgBox("Line " & ex.Message & _
                    "is not valid and will be skipped.")
                End Try
                Console.WriteLine()
            End While
            TextFileReader.Dispose()

    Paul ~~~~ Microsoft MVP (Visual Basic)
    Tuesday, August 4, 2009 1:03 PM
  • It's really hard.
    I try everything:
    keep off .csv in header filename in schema.ini and putting again
    copy in other folder both filename and schema.ini (the filename in header is the same of the file)
    keep off the backslash in connectionString after the Folder Data Source
    seeing if the last access time in schema.ini is changed (No change but I don't if sufficient to say that is not read)
    Always Price as Time (spent..)

    Reading file with TextFieldReader could be a solution but I would have data typed.

    Thanks Paul and VMazur

    Mauro


    Mauro
    • Marked as answer by MauroGv Wednesday, August 5, 2009 6:52 PM
    Tuesday, August 4, 2009 2:39 PM