Load, parse and select nodes from an unknown XML document

    Question

  • I am aware of namespaces and how to add them to a namespace manager etc.
    This is fine in an environment which I fully control.
    However, I am now involved in a scenario where I pass XML data through our system and on to a central validating hub. This data can have any number and kind of namespaces, totally unknown to me. If it is malformed, the central hub refuses it, and I send this information back to the originator. So far, so good.
    Problems arise when the contents of the well-formed XML do not meet the requirements of the central hub. The central hub sends back an XPath string pointing to the error. If the original document used prefixes, then the XPath string does too; if the original had no prefixes, then the XPath string has none either.
    My job now is to display the original text and highlight the error. In other words, I wish to parse a document without knowing what is in it, and then apply the XPath string to find the error location.
    It would have been nice if the loaded XmlDocument could have auto-detected the namespaces. This is disappointment number 1.
    So, I have to read the document with an XmlReader first to extract the namespaces? Or use Regex?
    Then I add the namespaces to an XmlDocument and load the thing all over again?
    Then, oh dear, a namespace is declared with an empty prefix and SelectSingleNode will not give any results = disappointment number 2.
    So I manipulate the received XPath string and add my own prefixes? Wow. Is there no simple idiot method for this?
    I somehow suspect that the central hub is using a very primitive parser; surely they are not going through the same pain as I am? It would be great if somebody out there had some tips for me. This doesn't seem to me to be such a wild scenario...

    TIA Elisabeth

    Thursday, January 16, 2014 4:17 PM

Answers

  • OK, so I am answering this myself with the hack I have devised; maybe others will find it useful.

    The main purpose of namespaces is to make the tags unique, right? They would still be unique without the ":" symbol (unless you are extremely unlucky and the document also contains constructs such as a:Apple as well as aApple, in which case you are sunk...).
    So, I created the following hack:

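        Imports System.Text.RegularExpressions
        Imports System.Xml

        Public Function XpathHacker(XMLString As String, XPathstring As String) As XmlNode
            Dim xDoc As New XmlDocument
            ' Remove the ":" after each prefix in the XPath (a:Apple becomes aApple)
            XPathstring = Regex.Replace(XPathstring, "(/?\w*):", "$1")
            ' Remove the ":" after each element prefix in the XML
            XMLString = Regex.Replace(XMLString, "<(/?\w*):", "<$1")
            ' Remove each xmlns declaration together with its quoted value,
            ' so that the "/>" of a self-closing tag survives
            XMLString = Regex.Replace(XMLString, "\s+xmlns(:\w+)?\s*=\s*(""[^""]*""|'[^']*')", "")
    
            ' Note: prefixed attributes (e.g. xsi:type) are not handled and make LoadXml throw
            xDoc.LoadXml(XMLString)
            ' Returns Nothing if the XPath matches no node
            Return xDoc.SelectSingleNode(XPathstring)
        End Function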
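
    For comparison, here is a rough sketch of the other route mentioned in the question: keep the namespaces and register whatever xmlns declarations the loaded document contains, so that a prefixed XPath from the hub can be applied unchanged. BuildNamespaceManager and its defaultPrefix parameter are hypothetical names, not part of the hack above. It assumes each prefix is bound to a single URI throughout the document (otherwise the last declaration wins), and an unprefixed XPath against a default namespace would still need the invented prefix added to each step.

```vbnet
Imports System.Xml

Public Module NamespaceSketch
    ' Hypothetical helper: registers every xmlns declaration found in the document.
    ' defaultPrefix is an invented prefix to stand in for default namespaces (xmlns="...").
    Public Function BuildNamespaceManager(xDoc As XmlDocument,
                                          defaultPrefix As String) As XmlNamespaceManager
        Dim mgr As New XmlNamespaceManager(xDoc.NameTable)
        ' "//*" selects every element, whatever namespace it is in
        For Each el As XmlElement In xDoc.SelectNodes("//*")
            ' In the DOM, namespace declarations appear among the element's attributes
            For Each attr As XmlAttribute In el.Attributes
                ' Skip empty values (xmlns="" undeclarations) and the reserved xml prefix
                If attr.Value = "" OrElse attr.LocalName = "xml" Then Continue For
                If attr.Prefix = "xmlns" Then
                    mgr.AddNamespace(attr.LocalName, attr.Value)   ' xmlns:p="uri"
                ElseIf attr.Name = "xmlns" Then
                    mgr.AddNamespace(defaultPrefix, attr.Value)    ' xmlns="uri"
                End If
            Next
        Next
        Return mgr
    End Function
End Module
```

    With this in place, the lookup would be xDoc.SelectSingleNode(XPathstring, BuildNamespaceManager(xDoc, "d")), where "d" is an arbitrary choice of prefix.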
    Cheers,
    Elisabeth

    Friday, January 17, 2014 9:41 AM
  • Hello,

    Glad to hear that you have found a solution.

    If you want to ignore the namespaces, try the local-name() XPath function, as below:

    string setParameterName = doc.XPathSelectElement("/*[local-name() = 'bookstore']/*[local-name() = 'book']/*[local-name() = 'title']").Value;

    Matching on local-name() makes the XPath ignore namespaces altogether.
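
    To apply the same idea to an XPath string received from elsewhere, something along these lines might work. It is only a sketch: ToLocalNameXPath is a hypothetical helper, and it assumes a simple absolute path made of element steps with optional positional predicates (for example /a:bookstore/b:book[2]/title). Steps after an explicit axis (child::), wildcards, node tests such as text(), and attribute steps are left as they are, and a "/" inside a predicate or string literal would confuse it.

```vbnet
Imports System.Text.RegularExpressions

Public Module LocalNameSketch
    ' Hypothetical helper: rewrites each element step that directly follows a "/"
    ' into *[local-name()='name'], dropping any prefix.
    Public Function ToLocalNameXPath(xpath As String) As String
        ' (?<=/)                    the step must directly follow a slash
        ' (?:[A-Za-z_][\w.-]*:)?    optional prefix, discarded
        ' ([A-Za-z_][\w.-]*)        local name, captured as $1
        ' (?![\w.:-]|\s*\()         not a partial name, an axis (child::) or a node test (text())
        Dim stepPattern As String =
            "(?<=/)(?:[A-Za-z_][\w.-]*:)?([A-Za-z_][\w.-]*)(?![\w.:-]|\s*\()"
        Return Regex.Replace(xpath, stepPattern, "*[local-name()='$1']")
    End Function
End Module
```

    For the path above this gives /*[local-name()='bookstore']/*[local-name()='book'][2]/*[local-name()='title'], which XmlDocument.SelectSingleNode accepts without a namespace manager. Note that the positional predicate now counts all children with that local name, so the result only matches the original XPath if no two namespaces share a local name at that level.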

    If I have misunderstood, please let me know.

    Regards.



    Friday, January 17, 2014 9:59 AM
