Parse xml?

    Question

  • If an XML file held a load of data in this format:

    <resources>
        <items>
             < entry id =" 1 " flags =" 0 " > Hello there </ entry >
             < entry id =" 2 " flags =" 0 " > Goodbye </ entry >
             < entry id ="3 " flags =" 0 " > Welcome! </ entry >
        </items>
    </resources>

    And I wanted to return an id by locating the phrase, e.g.:

    int GetIDByPhrase("Goodbye")

    How is this achievable?

    Tuesday, October 06, 2009 2:20 AM

Answers

  • Hi ,
      I hope your XML is well-formed. I used the file below, and the code by Brian should work for you.
    <?xml version="1.0" encoding="utf-8" ?>
    <resources>
      <items>
        <entry id ="1" flags ="0" >Hello there</entry >
        <entry id ="2" flags ="0" >Goodbye</entry >
        <entry id ="3" flags ="0" >Welcome!</entry >
      </items>
    </resources>
    

    Note: there are no extra spaces inside the tags or attribute values.

    C# code

    private void ParseXML()
    {
    	XmlDocument doc = new XmlDocument();
    	doc.Load(@"C:\XMLFile1.xml");
    	//Get the single node 
    	XmlNode node = doc.SelectSingleNode("/resources/items/entry[.='Goodbye']");
    	if (node != null)
    	{
    		Console.WriteLine("{0}", node.Attributes["id"].Value);
    	}
    	//Get All nodes
    	XmlNodeList nodes = doc.SelectNodes("/resources/items/entry");
    	foreach (XmlNode mynode in nodes)
    	{
    		Console.WriteLine("{0}", mynode.Attributes["id"].Value);
    	}
    }
    If it does not work, let us know about any errors or runtime exceptions.

    I hope it helps :-)
    • Marked as answer by w1z8yte Tuesday, October 06, 2009 4:16 PM
    Tuesday, October 06, 2009 7:17 AM

All replies

  • If we make some reasonable assumptions, this is pretty straightforward.  First, make your XML valid and get rid of all the extra spacing:

    <?xml version="1.0" encoding="utf-8" ?>
    <resources>
      <items>
        <entry id ="1" flags ="0" >Hello there</entry >
        <entry id ="2" flags ="0" >Goodbye</entry >
        <entry id ="3" flags ="0" >Welcome!</entry >
      </items>
    </resources>

    Now, use this XPath query:

            private static int GetIDFromXml()
            {
                XmlDocument doc = new XmlDocument();
                doc.Load(@"C:\Users\Brian\Documents\Visual Studio 2008\Projects\TestApps\ConsoleApplication1\samplexml.xml");
                XmlNode target = doc.SelectSingleNode("/resources/items/entry[.='Goodbye']");
                int result = int.Parse(target.Attributes["id"].Value);
                return result;
            }
            static void Main(string[] args)
            {
                Console.WriteLine(GetIDFromXml());
            }
    And like magic...
    • Proposed as answer by P.Brian.Mackey Tuesday, October 06, 2009 4:53 PM
    Tuesday, October 06, 2009 2:50 AM
  • First, load your XML into an XmlDocument. Then use the SelectSingleNode method to return an XmlNode object, using this XPath expression:

    "resources/items/entry[text()='Goodbye']/@id"

    So your function looks something like this:

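        private int GetIDByPhrase(string text)
        {
            System.Xml.XmlDocument xdoc = new System.Xml.XmlDocument();

            xdoc.LoadXml("<resources><items><entry id =\"1\" flags = \"0\">Hello there</entry><entry id =\"2\" flags =\"0\">Goodbye</entry><entry id =\"3\" flags =\"0\" >Welcome!</entry></items></resources>");

            // XPath 1.0 has no escape sequence for quotes inside a string literal, so doubling
            // an apostrophe does not work. Wrap the phrase in whichever quote character it does
            // not contain (a phrase containing both kinds would need concat()).
            string literal = text.Contains("'") ? "\"" + text + "\"" : "'" + text + "'";
            System.Xml.XmlNode node = xdoc.SelectSingleNode("resources/items/entry[text()=" + literal + "]/@id");

            return Convert.ToInt32(node.Value);
        }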
    Note, however:

    1. This code is inefficient because it loads the XML into the XmlDocument on every call. Either pass in the document or store a reference to it in a field, so it is loaded only once.

    2. There is no null check and no check that the id attribute is numeric, so this code throws an exception if the phrase is not found or if the id attribute contains non-numeric characters.
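
    A minimal sketch of how both notes could be addressed, assuming the caller loads the document once and passes it in. The names PhraseLookup and TryGetIdByPhrase are made up for illustration, and the phrase is assumed to contain no apostrophe because it is placed inside a single-quoted XPath literal.

```csharp
using System.Xml;

public static class PhraseLookup
{
    // Returns the id of the first <entry> whose text equals the phrase,
    // or null when there is no match or the id is not a valid integer.
    // Assumption: the phrase contains no apostrophe.
    public static int? TryGetIdByPhrase(XmlDocument doc, string phrase)
    {
        XmlNode node = doc.SelectSingleNode(
            "/resources/items/entry[.='" + phrase + "']/@id");
        if (node == null)
        {
            return null; // phrase not found, or the entry has no id attribute
        }

        int id;
        // int.TryParse reports failure instead of throwing on non-numeric ids.
        return int.TryParse(node.Value, out id) ? id : (int?)null;
    }
}
```

    The caller can then tell "not found" apart from a real id by checking HasValue on the result.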



    Tuesday, October 06, 2009 2:58 AM
  • Isn't there an efficient way to load the XML into memory and then, whenever I want to parse data from it, use the version stored in memory?
    Tuesday, October 06, 2009 3:52 AM
  • Yes,

    Load the XML into an XmlDocument (like both samples show), and then use that document each time. You can do that either by passing the document into the function, e.g.:

    private void DoLookups()
    {
        System.Xml.XmlDocument xdoc = new System.Xml.XmlDocument();

        xdoc.LoadXml("<resources><items><entry id =\"1\" flags = \"0\">Hello there</entry><entry id =\"2\" flags =\"0\">Goodbye</entry><entry id =\"3\" flags =\"0\" >Welcome!</entry></items></resources>");

        // Pass the already-loaded document to every lookup.
        int i = GetIDByPhrase(xdoc, "Goodbye");
        i = GetIDByPhrase(xdoc, "Hello there");
        //etc.
    }

    private int GetIDByPhrase(System.Xml.XmlDocument xdoc, string text)
    {
        // XPath 1.0 cannot escape quotes, so delimit the phrase with the quote character it does not contain.
        string literal = text.Contains("'") ? "\"" + text + "\"" : "'" + text + "'";
        System.Xml.XmlNode node = xdoc.SelectSingleNode("resources/items/entry[text()=" + literal + "]/@id");

        return Convert.ToInt32(node.Value);
    }

    Or keep the XmlDocument in a class-level variable (field), e.g.:

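    using System;
    using System.Xml;

    public class SomeClass
    {
        private XmlDocument _XmlData;

        public SomeClass(string xml)
        {
            _XmlData = new XmlDocument();
            _XmlData.LoadXml(xml); // parsed once, reused by every lookup
        }

        public void DoLookups()
        {
            int i = GetIDByPhrase("Goodbye");
            i = GetIDByPhrase("Hello there");
            //etc.
        }

        // No document parameter is needed: the lookup reads the _XmlData field.
        private int GetIDByPhrase(string text)
        {
            // XPath 1.0 cannot escape quotes, so delimit the phrase with the quote character it does not contain.
            string literal = text.Contains("'") ? "\"" + text + "\"" : "'" + text + "'";
            XmlNode node = _XmlData.SelectSingleNode("resources/items/entry[text()=" + literal + "]/@id");

            return Convert.ToInt32(node.Value);
        }
    }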
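
    Splicing a phrase into an XPath string needs some care with quotes, because XPath 1.0 string literals have no escape sequence. One possible helper, sketched here with made-up names (XPathText, ToLiteral, GetIdByPhrase), picks a delimiter the phrase does not contain and falls back to concat() when the phrase holds both kinds of quote:

```csharp
using System.Text;
using System.Xml;

public static class XPathText
{
    // Turns an arbitrary string into an XPath 1.0 string expression.
    public static string ToLiteral(string value)
    {
        if (value.IndexOf('\'') < 0) return "'" + value + "'";   // no apostrophe inside
        if (value.IndexOf('"') < 0) return "\"" + value + "\""; // no double quote inside

        // Both quote kinds present: concat('part', "'", 'part', ...)
        StringBuilder sb = new StringBuilder("concat(");
        string[] parts = value.Split('\'');
        for (int i = 0; i < parts.Length; i++)
        {
            if (i > 0) sb.Append(", \"'\", ");
            sb.Append("'").Append(parts[i]).Append("'");
        }
        return sb.Append(")").ToString();
    }

    // Returns the id for the phrase, or -1 when it is missing or not numeric.
    public static int GetIdByPhrase(XmlDocument doc, string phrase)
    {
        XmlNode node = doc.SelectSingleNode(
            "/resources/items/entry[.=" + ToLiteral(phrase) + "]/@id");
        int id;
        return node != null && int.TryParse(node.Value, out id) ? id : -1;
    }
}
```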

    Tuesday, October 06, 2009 3:55 AM
  • Hi,
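       Remember that you are using DOM parsing, which loads the whole document into memory. If the XML is very large, you might end up with an out-of-memory exception. In that case, use streaming parsing with XmlReader (a forward-only reader, similar in spirit to SAX). Repeated lookups with XmlReader are slower than querying a DOM that is already in memory, because each lookup rescans the file, but it is memory-efficient, so the size of the XML won't be a problem.

    You can always write the code so that it switches between the two types of parsing based on the size of the file.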
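
    A rough sketch of the XmlReader route, assuming each entry element contains only plain text. StreamingLookup, GetIdByPhrase and Scan are illustrative names, and -1 is an arbitrary "not found" value.

```csharp
using System.Xml;

public static class StreamingLookup
{
    // Opens the file and scans it front to back; nothing is kept in memory.
    public static int GetIdByPhrase(string path, string phrase)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            return Scan(reader, phrase);
        }
    }

    // Returns the id of the first <entry> whose text equals the phrase,
    // or -1 when no entry matches or the id is not numeric.
    public static int Scan(XmlReader reader, string phrase)
    {
        string currentId = null; // id of the <entry> currently being read, if any
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    if (reader.Name == "entry" && !reader.IsEmptyElement)
                        currentId = reader.GetAttribute("id");
                    break;
                case XmlNodeType.Text:
                    if (currentId != null && reader.Value == phrase)
                    {
                        int id;
                        return int.TryParse(currentId, out id) ? id : -1;
                    }
                    break;
                case XmlNodeType.EndElement:
                    if (reader.Name == "entry") currentId = null;
                    break;
            }
        }
        return -1;
    }
}
```

    Every call rereads the file from the start, so this trades lookup speed for a small memory footprint.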

    Hope it helps

    Thanks
    Tuesday, October 06, 2009 4:23 AM
  • Hmmm ok I've loaded the xml in my form's load event:

    Variables Class:
    
    private static XPathDocument _resources;
    public static XPathDocument resources
    {
        set { _resources = value; }
        get { return _resources; }
    }
    
    Main Form:
    
    private void Form_Load(object sender, EventArgs e)
    {
        Variables.resources = new XPathDocument(string.Format("{0}resources.xml", Variables.dirData));
    }
    
    public static int GetItemIDByPhrase(string phrase)
    {
        try
        {
            XmlNode target = Variables.resources.SelectSingleNode(string.Format("/resources/items/entry[.='{0}']", phrase));
            int result = int.Parse(target.Attributes["id"].Value);
            return result;
        }
        catch { return -1; }
    }

    I get error:

    Error    1    'System.Xml.XPath.XPathDocument' does not contain a definition for 'SelectSingleNode' and no extension method 'SelectSingleNode' accepting a first argument of type 'System.Xml.XPath.XPathDocument' could be found (are you missing a using directive or an assembly reference?)
    Tuesday, October 06, 2009 4:24 AM
  • SelectSingleNode is a method of XmlDocument (inherited from XmlNode). Don't use XPathDocument if you only need simple parsing; use XmlDocument.
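
    A minimal sketch of that suggestion: a static holder typed as XmlDocument, so that SelectSingleNode is available and the file is parsed only once. The names Variables, Resources and ItemLookup are illustrative, and the phrase is assumed to contain no apostrophe.

```csharp
using System.Xml;

public static class Variables
{
    private static XmlDocument _resources;

    // Holds the parsed document so the file is read only once.
    public static XmlDocument Resources
    {
        get { return _resources; }
        set { _resources = value; } // "value" is the object being assigned
    }
}

public static class ItemLookup
{
    // Returns the id for the phrase, or -1 when it is missing or not numeric.
    // Assumption: Variables.Resources has been set before the first call.
    public static int GetItemIDByPhrase(string phrase)
    {
        XmlNode target = Variables.Resources.SelectSingleNode(
            string.Format("/resources/items/entry[.='{0}']", phrase));
        if (target == null || target.Attributes["id"] == null) return -1;

        int id;
        return int.TryParse(target.Attributes["id"].Value, out id) ? id : -1;
    }
}
```

    In the form's load handler you would create an XmlDocument, call Load with the file path, and assign it to Variables.Resources. An XPathDocument can also be queried, but only through the XPathNavigator returned by CreateNavigator().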
    Tuesday, October 06, 2009 4:47 AM
  • I'm just confused, all I want to do is load the xml into memory and then parse the data when needed.
    Tuesday, October 06, 2009 4:51 AM
  • Use the XmlDocument class from the System.Xml namespace (C:\Windows\Microsoft.NET\Framework\v2.0.50727\System.Xml.dll). Please make sure the casing of the class name matches exactly.

    Tuesday, October 06, 2009 4:58 AM
  • I still don't get it lol.
    • Proposed as answer by Vic Vega Tuesday, October 06, 2009 6:16 AM
    • Unproposed as answer by w1z8yte Tuesday, October 06, 2009 6:24 AM
    Tuesday, October 06, 2009 5:55 AM
  • By mistake I clicked on "Propose as answer" :-)
    Tuesday, October 06, 2009 6:16 AM
  • Hi ,
     Do you have System.Xml in your references? Can you try creating a new project and trying again?

    Tuesday, October 06, 2009 6:22 AM
  • I don't have a problem finding the assembly, I have a problem getting it to work the way I want, where I store the xml into memory for easier reading, and then use that stored version for parsing.
    • Proposed as answer by Vic Vega Tuesday, October 06, 2009 6:26 AM
    • Unproposed as answer by w1z8yte Tuesday, October 06, 2009 3:25 PM
    Tuesday, October 06, 2009 6:24 AM
  • Hi,
     My mouse is really bad and it clicked "Propose" again :-)

    Now I get your problem. Is your XML a physical file on disk, an embedded resource, or created in memory?

    If it is a physical file, you can load it with the Load method of XmlDocument.
    If it is in memory, you can use LoadXml as suggested by Yort.
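
    A short sketch of the two cases, with ResourceLoader, FromFile and FromString as made-up names:

```csharp
using System.Xml;

public static class ResourceLoader
{
    // XML stored in a file on disk: Load takes a path (or URL).
    public static XmlDocument FromFile(string path)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(path);
        return doc;
    }

    // XML already held in a string: LoadXml takes the markup itself.
    public static XmlDocument FromString(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }
}
```

    Either way the result is an in-memory XmlDocument that can be kept around and queried as often as needed.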

    Thanks
    PKR

    Tuesday, October 06, 2009 6:31 AM
  • It's an XML file, a pretty huge one too (850 KB). I still can't get it to work without errors.
    Tuesday, October 06, 2009 6:38 AM
  • Great thanks that worked, now how would I parse that same file if I wanted to do it vice versa, so get the phrase by using the id?
    Tuesday, October 06, 2009 3:25 PM
  • That is a new question about a new topic.  LINQ.

    http://msdn.microsoft.com/en-us/library/bb308960.aspx#xlinqoverview_topic3a
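
    As a rough sketch of where the LINQ to XML route could lead for the reverse lookup (phrase by id), assuming the same resources/items/entry layout as in this thread. ReverseLookup and GetPhraseById are made-up names.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ReverseLookup
{
    // Returns the text of the first <entry> whose id attribute equals the
    // given id, or null when no such entry exists.
    public static string GetPhraseById(XDocument doc, int id)
    {
        return doc.Descendants("entry")
                  // A missing id attribute converts to null; Trim tolerates padded values.
                  .Where(e => ((string)e.Attribute("id") ?? "").Trim() == id.ToString())
                  .Select(e => e.Value)
                  .FirstOrDefault();
    }
}
```

    The document would be loaded once with XDocument.Load(path) and then reused for every call.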

    Forum protocol says to start a new thread.  It's okay to post a link to this thread.  In fact, I recommend it.

    Please close the thread by marking the most helpful reply as "Answer". 

    Thanks ahead of time.
    Tuesday, October 06, 2009 4:08 PM
    Moderator