none
Importing CSV File Data Loses Text Formatting RRS feed

  • Question

  • Good morning, all -

    I have a little VBA script to input the contents of a .csv file for later manipulation. I used to do this manually - well, still do, until I get this thing working - and this is supposed to automate part of the process.

    The problem is, I have a numeric field in the data that needs to be imported as text, because it is important to maintain leading zeros.

    The code I use is...

    Set wsDestSheet = ActiveWorkbook.Sheets("Buffer")
    sPath = "\\server\Bunch_O_Folders\"
    sFileName = "Import_File.csv"
    sConnection = sPath & sFileName
    
    With wsDestSheet.QueryTables.Add(Connection:="TEXT;" & sConnection, Destination:=wsDestSheet.Range("A1"))
       .TextFileParseType = xlDelimited
       .TextFileCommaDelimiter = True
       .Refresh
    End With

    This imports fine, but sets the cells to General format, except for a couple that were set to Date.

    So, I pre-formatted the columns that need to be text to that format and ran the import code again.

    No dice.

    What it did was insert the imported data, rather than just sort of paste it in. So, what happened was, the columns were all shifted to the right with the imported data inserted in front of them.

    Alright, said I, then let's try this, and I added code to the Macro prior to the import code to set the specific columns to text using...

    Columns("H:J").Select
    Selection.NumberFormat = "@"

    ...and tried that.

    This time, the columns did as they should and stayed text - they didn't shift to the right like with an insert - but, the data still lost the leading zeros.

    Seems to me that something in the import process is considering columns with numeric data to be numeric before it gets to the worksheet. So, the leading zeros are being dropped out during the data's trip through the ether to its destination in Excel. Once it arrives, it is in text format, but, having lost the leading zeros in the transition, the data is corrupted.

    All that being said, how can I get the data to import to Excel such that the fields that have numbers with leading zeros will maintain them?

    From another task I'm working on, I have another way of importing data that defines the format of each field as a part of the statement...

    ActiveWorkbook.Queries.Add …

    ...but - at least during debugging - every time I run a test, it creates a new query for the workbook. Which makes sense, I figure, because the .Add is there. I don't know how to do it without it adding a new query every time, which is why I used the above code that isn't maintaining formats. And, since this is something that will run every day, that load of extra queries is a-gonna get huge over time.

    So, is there a way to import data from a .csv file into Excel that won't create new queries every time it's run and will maintain formatting?

    Thanx in advance for any assistance!


    • Edited by Adam Quark Thursday, June 13, 2019 4:48 PM Clarity
    Thursday, June 13, 2019 4:47 PM

Answers

  • Hi Adam

    I have had a similar problem with leading zeros.  It appears to be Excel confusing data types and relying on the top rows to sort out which to adopt, despite any formatting of the column as text.  The workaround that I use is to make two dummy rows at the top of the worksheet with the relevant column formatted as text and containing "ggggggg" in the relevant cells.  (I use "g"s in case Excel decides it is hexadecimal).  These rows can either be hidden or can be deleted in code once Excel has been persuaded that they are text and not numeric.

    I got some good advice about this from experts in this forum a few years ago at this address https://social.msdn.microsoft.com/Forums/en-US/4e9125a2-b14d-4c7c-9864-75329c2a1e2b/ms-word-table-numbers-shown-in-scientific-format?forum=isvvba  

    Hope this sorts the leading zeros bit out.  Can't help on the queries bit I'm afraid

    Andy C

    Thanx for the reply, Andy! But, I finally got it to work and solved both issues with the same answer.

    It involves the second option query I mentioned above, the one that kept adding a new query to the workbook with every execution.

    After an afternoon of searching, I came across a very simple answer to that problem - embarrassingly simple, actually. All I needed was a snippet of code before running the import code that would just delete all queries in the workbook, like so;

    For Each qQuery In ActiveWorkbook.Queries
       qQuery.Delete
    Next qQuery

    I mean, like duh, right?

    The prior problem, the one where I needed to import with text formatting on those numeric fields that needed to be text was, like I said, to use the code I mentioned using the ActiveWorkbook.Queries.Add routine. That method let me select a format specification for each field being imported. But, I was having a problem with it because when I modified it to specify "text" for the special numeric fields, it would always toss a "Blank Look" error at me (1004).

    If I ran it unmodified, it worked great - lost the leading zeros, but did the job, otherwise. Use the changed version and it would crash.

    So, I put the code into a text editor and dissected it. Here's the original code;

    '*****  Import csv data
    
        ActiveWorkbook.Queries.Add Name:="New Data", Formula:= _
            "let" & Chr(13) & "" & Chr(10) & "    Source = Csv.Document(File.Contents(""c:\Import.csv""),[Delimiter="","", Columns=24, Encoding=65001, QuoteStyle=QuoteStyle.None])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Source,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}, {""Column4"", type text}, {" & _
            """Column5"", type text}, {""Column6"", type text}, {""Column7"", type text}, {""Column8"", Int64.Type}, {""Column9"", type text}, {""Column10"", type text}, {""Column11"", type date}, {""Column12"", type text}, {""Column13"", type text}, {""Column14"", type date}, {""Column15"", type date}, {""Column16"", type text}, {""Column17"", Int64.Type}, {""Column18"", Int64." & _
            "Type}, {""Column19"", Int64.Type}, {""Column20"", type text}, {""Column21"", type text}, {""Column22"", type text}, {""Column23"", type date}, {""Column24"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
        ActiveWorkbook.Worksheets.Add
        With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""New Data"";Extended Properties=""""" _
            , Destination:=Range("$A$1")).QueryTable
            .CommandType = xlCmdSql
            .CommandText = Array("SELECT * FROM [New Data]")
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .PreserveColumnInfo = True
            .ListObject.DisplayName = "New Data"
            .Refresh BackgroundQuery:=False
        End With

    I changed the field for Column 8 from {""Column8"", Int64.Type} to {""Column8"", text}.

    Lo and behold, I found that was a mistake.  Took me forever to discover that I was missing the "type" identifier that had to go before "text."

    So, I made that change, et voila! The code, she works! Had I known that error code 1004 was Excel just throwing up its arms saying, "I dunno," I might have discovered the answer earlier.

    But, now it's working and I'm all happy.

    Thanx again for the reply, Andy!


    • Marked as answer by Adam Quark Tuesday, June 18, 2019 7:57 PM
    • Edited by Adam Quark Tuesday, June 18, 2019 7:58 PM
    Tuesday, June 18, 2019 7:56 PM

All replies

  • Hi Adam,
    re:  keep leading zeros

    Usually works...
    Format worksheet as 00000 (zip codes).
    Change quantity of zeros as needed.
    Hope your numbers are all the same length.
    '---

    Custom_Functions add-in (19 new functions)
    Download from MediaFire...

    Friday, June 14, 2019 12:55 AM
  • Hi Adam,
    re:  keep leading zeros

    Usually works...
    Format worksheet as 00000 (zip codes).
    Change quantity of zeros as needed.
    Hope your numbers are all the same length.
    '---

    Custom_Functions add-in (19 new functions)
    Download from MediaFire...

    Thanx, NLtL, but that's not an option. The length of the values vary and not all of them have leading zeros. So, "1234" is not the same as "001234" and neither of them are the same as "0001234."

    We get data from different vendors and each has their own ID methodology.

    But, thanx again for your reply!

    Friday, June 14, 2019 4:19 PM
  • Hi Adam

    I have had a similar problem with leading zeros.  It appears to be Excel confusing data types and relying on the top rows to sort out which to adopt, despite any formatting of the column as text.  The workaround that I use is to make two dummy rows at the top of the worksheet with the relevant column formatted as text and containing "ggggggg" in the relevant cells.  (I use "g"s in case Excel decides it is hexadecimal).  These rows can either be hidden or can be deleted in code once Excel has been persuaded that they are text and not numeric.

    I got some good advice about this from experts in this forum a few years ago at this address https://social.msdn.microsoft.com/Forums/en-US/4e9125a2-b14d-4c7c-9864-75329c2a1e2b/ms-word-table-numbers-shown-in-scientific-format?forum=isvvba  

    Hope this sorts the leading zeros bit out.  Can't help on the queries bit I'm afraid

    Andy C

    Tuesday, June 18, 2019 6:54 AM
  • Hi Adam

    I have had a similar problem with leading zeros.  It appears to be Excel confusing data types and relying on the top rows to sort out which to adopt, despite any formatting of the column as text.  The workaround that I use is to make two dummy rows at the top of the worksheet with the relevant column formatted as text and containing "ggggggg" in the relevant cells.  (I use "g"s in case Excel decides it is hexadecimal).  These rows can either be hidden or can be deleted in code once Excel has been persuaded that they are text and not numeric.

    I got some good advice about this from experts in this forum a few years ago at this address https://social.msdn.microsoft.com/Forums/en-US/4e9125a2-b14d-4c7c-9864-75329c2a1e2b/ms-word-table-numbers-shown-in-scientific-format?forum=isvvba  

    Hope this sorts the leading zeros bit out.  Can't help on the queries bit I'm afraid

    Andy C

    Thanx for the reply, Andy! But, I finally got it to work and solved both issues with the same answer.

    It involves the second option query I mentioned above, the one that kept adding a new query to the workbook with every execution.

    After an afternoon of searching, I came across a very simple answer to that problem - embarrassingly simple, actually. All I needed was a snippet of code before running the import code that would just delete all queries in the workbook, like so;

    For Each qQuery In ActiveWorkbook.Queries
       qQuery.Delete
    Next qQuery

    I mean, like duh, right?

    The prior problem, the one where I needed to import with text formatting on those numeric fields that needed to be text was, like I said, to use the code I mentioned using the ActiveWorkbook.Queries.Add routine. That method let me select a format specification for each field being imported. But, I was having a problem with it because when I modified it to specify "text" for the special numeric fields, it would always toss a "Blank Look" error at me (1004).

    If I ran it unmodified, it worked great - lost the leading zeros, but did the job, otherwise. Use the changed version and it would crash.

    So, I put the code into a text editor and dissected it. Here's the original code;

    '*****  Import csv data
    
        ActiveWorkbook.Queries.Add Name:="New Data", Formula:= _
            "let" & Chr(13) & "" & Chr(10) & "    Source = Csv.Document(File.Contents(""c:\Import.csv""),[Delimiter="","", Columns=24, Encoding=65001, QuoteStyle=QuoteStyle.None])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Source,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}, {""Column4"", type text}, {" & _
            """Column5"", type text}, {""Column6"", type text}, {""Column7"", type text}, {""Column8"", Int64.Type}, {""Column9"", type text}, {""Column10"", type text}, {""Column11"", type date}, {""Column12"", type text}, {""Column13"", type text}, {""Column14"", type date}, {""Column15"", type date}, {""Column16"", type text}, {""Column17"", Int64.Type}, {""Column18"", Int64." & _
            "Type}, {""Column19"", Int64.Type}, {""Column20"", type text}, {""Column21"", type text}, {""Column22"", type text}, {""Column23"", type date}, {""Column24"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
        ActiveWorkbook.Worksheets.Add
        With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
            "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""New Data"";Extended Properties=""""" _
            , Destination:=Range("$A$1")).QueryTable
            .CommandType = xlCmdSql
            .CommandText = Array("SELECT * FROM [New Data]")
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .PreserveColumnInfo = True
            .ListObject.DisplayName = "New Data"
            .Refresh BackgroundQuery:=False
        End With

    I changed the field for Column 8 from {""Column8"", Int64.Type} to {""Column8"", text}.

    Lo and behold, I found that was a mistake.  Took me forever to discover that I was missing the "type" identifier that had to go before "text."

    So, I made that change, et voila! The code, she works! Had I known that error code 1004 was Excel just throwing up its arms saying, "I dunno," I might have discovered the answer earlier.

    But, now it's working and I'm all happy.

    Thanx again for the reply, Andy!


    • Marked as answer by Adam Quark Tuesday, June 18, 2019 7:57 PM
    • Edited by Adam Quark Tuesday, June 18, 2019 7:58 PM
    Tuesday, June 18, 2019 7:56 PM