locked
How to find what is illegal character in xml file RRS feed

  • Question

  • Hello ,

    i am using this code to read a xml file.

    declare @xml xml

    SELECT 
    @xml= CAST(x AS XML)
    FROM 
    OPENROWSET(BULK 'E:\test_data.xml',SINGLE_BLOB) AS T(x) 

    select @xml

    But I am getting the below error while reading.

    Msg 9420, Level 16, State 1, Line 3
    XML parsing: line 6925104, character 537, illegal xml character

    I can not open the file with NotePad/Notepad++ as the file is more than 2GB.

    how can i find the illegal character in the xml that is causing the issue. how to debug this issue.

    Thanks in advance for any help.

    Friday, October 17, 2014 2:01 PM

Answers

  • http://www.xmlvalidation.com/

    http://msdn.microsoft.com/en-us/library/ms171878.aspx


    T-SQL Articles
    T-SQL e-book by TechNet Wiki Community
    T-SQL blog

    Friday, October 17, 2014 2:35 PM
  • Looks like you will have to read a C# program (or whatever language you ar comfortable with) that reads and discards line, until it comes to line 6925104 and then inspects character 537. And possibly replaces the evil bastard with something else.

    You could also read the file into a two-column table, where one column is an identity column and the other is an nvarchar(MAX) and bulk-load the file. But it would be more difficult to reconstruct the XML document from tne table.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, October 17, 2014 9:21 PM

All replies

  • http://www.xmlvalidation.com/

    http://msdn.microsoft.com/en-us/library/ms171878.aspx


    T-SQL Articles
    T-SQL e-book by TechNet Wiki Community
    T-SQL blog

    Friday, October 17, 2014 2:35 PM
  • Looks like you will have to read a C# program (or whatever language you ar comfortable with) that reads and discards line, until it comes to line 6925104 and then inspects character 537. And possibly replaces the evil bastard with something else.

    You could also read the file into a two-column table, where one column is an identity column and the other is an nvarchar(MAX) and bulk-load the file. But it would be more difficult to reconstruct the XML document from tne table.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Friday, October 17, 2014 9:21 PM