none
CR/LF in CSV file import RRS feed

  • Question

  • I am trying to import a CSV file to a new table but get this error: Index or primary key cannot contain a Null value. However, if I open the file with Excel, then save again as CSV without making any (intentional) changes, it imports with no problem.

    An initial comparison of the before-and-after files with a text editor (Notepad++) indicated that they are the same, but once I showed all characters, it immediately became evident that our old friends CR & LF are in play here. That is, the file that cannot be imported includes only a CR at the end of each line, whereas the file that can be imported includes CR & LF at the end of each line. That is, opening the file in Excel and saving immediately as CSV replaces all solo CRs with CRLFs.

    I have no control over the source file. In the end, this will be scripted using TransferText in VBA, invoking an import spec, so it seems ridiculous to step out of the automated process to have a user open the file and save it immediately before the import.

    So, on to my question: is there a way to force the import process to accept CR with or without LF as a valid record separator in a CSV file? If not, how would I do a search & replace of my text file so that I add an LF after each CR not already having an LF after it? (I know, I probably just have to search the forum for ways to automate editing of text files...but now that I have this all written up here, I will see if someone can point me the right direction first.)

    Quite a brain-teaser, this one is.

    Sunday, November 8, 2015 4:08 PM

Answers

  • Thank you, Imb. That is a good catch. It was my assumption that the line-end characters were included, so I did not test without that. So I just simplified the Input/Output loop to this

    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'add to temp file
            Print #FileTemp, CurrentLine
    Loop

    Note that when I tried using the semicolon, it simply ran all my lines together, with no line breaks. I still need the line breaks between lines, since each line represents a separate record for my subsequent TransferText.

    Hi Brian,

    Yes, that is correct. If you use the semicolon for all lines, there are no line breaks.

    But you could use something like:

    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'add to temp file
            If (Not EOF(FileOriginal) Then
                Print #FileTemp, CurrentLine
            Else
                Print #FileTemp, CurrentLine;
            End If
    Loop

    Imb.

    • Marked as answer by Brian D. Hart Monday, November 9, 2015 5:02 PM
    Monday, November 9, 2015 4:54 PM
  • Hi,

    Here's an old post at UtterAccess, but I thought it might help for starters. Good luck!

    • Marked as answer by Brian D. Hart Monday, November 9, 2015 12:17 AM
    Sunday, November 8, 2015 4:29 PM

All replies

  • Hi,

    Here's an old post at UtterAccess, but I thought it might help for starters. Good luck!

    • Marked as answer by Brian D. Hart Monday, November 9, 2015 12:17 AM
    Sunday, November 8, 2015 4:29 PM
  • Fair enough. I dug out my prior code that I use for writing to text files, combined it with a read of the original text file, and did a rename after writing the corrected file. Note that the CR/LF is inserted automatically when writing to a text file line-by-line. So I just had to strip out all the Chr(10) and Chr(13). This, of course, left an extra CRLF at the end of my text file; however, this did not pose a problem for my import. (Otherwise, I would have had to figure out which was the last line in my Output and leave off the CRLF--which I was too lazy to do once I realized that the superfluous CRLF at the end of the file caused me no problems.

    Here is my code. I had to solve a couple of interesting challenges preventing errors, such as ensuring that the temporary file does not exist before creating a new one and ensuring that the original file is deleted before renaming the temporary file as the original file.

    Private Sub ReformatFile_Click()
    'dimension variables
        Dim FilePath As String
        Dim FileName As String
        Dim FileNameTemp As String
        Dim FilePathName As String
        Dim FilePathNameTemp As String
        Dim CurrentLine As String
        Dim FileOriginal As Long
        Dim FileTemp As Long
    'set values
        FilePath = "C:\Blah\Blah\Blah"
        FileName = "BlahBlahBlah.csv"
        FilePathName = FilePath & "\" & FileName
        FileNameTemp = "BlahBlahBlahTemp.csv"
        FilePathNameTemp = FilePath & "\" & FileNameTemp
    'open file for reading
        FileOriginal = FreeFile
        Open FilePathName For Input As #FileOriginal
    'delete temporary file if it exists
        On Error Resume Next 'do not error if DataFile does not exist
            Kill FilePathNameTemp
        On Error GoTo 0 'in my real code, I return to my error handler here
    'wait until temporary file is deleted before proceeding
            Do While Len(Dir(FilePathNameTemp)) > 0
            Loop
    'open file for writing
        FileTemp = FreeFile
        Open FilePathNameTemp For Output As #FileTemp
    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'remove CR
            CurrentLine = Replace(CurrentLine, Chr(10), "")
        'remove LF
            CurrentLine = Replace(CurrentLine, Chr(13), "")
    'CRLF is inserted automatically as the code writes each new line
        'add to temp file
            Print #FileTemp, CurrentLine
    Loop
    'close files
        Close FileOriginal
        Close FileTemp
    'delete original file
        On Error Resume Next 'do not error if DataFile does not exist
            Kill FilePathName
        On Error GoTo 0 'in my real code, I return to my error handler here
    'wait until original file is deleted before proceeding
            Do While Len(Dir(FilePathName)) > 0
            Loop
    'rename temporary file
        Name FilePathNameTemp As FilePathName
    End Sub

    Monday, November 9, 2015 12:23 AM
  • Hi. Congratulations on figuring it out and thanks for posting your solution. Good luck with your project.
    Monday, November 9, 2015 1:05 AM
  • Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'remove CR
            CurrentLine = Replace(CurrentLine, Chr(10), "")
        'remove LF
            CurrentLine = Replace(CurrentLine, Chr(13), "")
    'CRLF is inserted automatically as the code writes each new line
        'add to temp file
            Print #FileTemp, CurrentLine
    Loop
    

    Hi Brian,

    I wonder why you replace the Chr(10) and Chr(13) from the CurrentLine.

    As far as my experience goes, CurrentLine does not contain the line break characters any more.

    A small detail is that when you use

            Print #FileTemp, CurrentLine;

    with a semicolon at the end, the CRLF at the end of the line is suppressed.

    Imb.

    Monday, November 9, 2015 2:09 PM
  • Thank you, Imb. That is a good catch. It was my assumption that the line-end characters were included, so I did not test without that. So I just simplified the Input/Output loop to this

    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'add to temp file
            Print #FileTemp, CurrentLine
    Loop

    Note that when I tried using the semicolon, it simply ran all my lines together, with no line breaks. I still need the line breaks between lines, since each line represents a separate record for my subsequent TransferText.

    Monday, November 9, 2015 4:26 PM
  • Thank you, Imb. That is a good catch. It was my assumption that the line-end characters were included, so I did not test without that. So I just simplified the Input/Output loop to this

    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'add to temp file
            Print #FileTemp, CurrentLine
    Loop

    Note that when I tried using the semicolon, it simply ran all my lines together, with no line breaks. I still need the line breaks between lines, since each line represents a separate record for my subsequent TransferText.

    Hi Brian,

    Yes, that is correct. If you use the semicolon for all lines, there are no line breaks.

    But you could use something like:

    Do While Not EOF(FileOriginal)
        'read first line
            Line Input #FileOriginal, CurrentLine
        'add to temp file
            If (Not EOF(FileOriginal) Then
                Print #FileTemp, CurrentLine
            Else
                Print #FileTemp, CurrentLine;
            End If
    Loop

    Imb.

    • Marked as answer by Brian D. Hart Monday, November 9, 2015 5:02 PM
    Monday, November 9, 2015 4:54 PM
  • Thank you again, Imb.

    I did not understand at first that EOF occurred after the last line is read, instead assuming it would not occur until after the loop was completed. But because it occurs after the Input, this modification does, very neatly, omit the final (and superfluous) CRLF.

    Monday, November 9, 2015 5:02 PM