How to iterate through an xml file selecting certain nodes using XDocument

  • Question

  • Hi,

    I'm reading through a rather large XML file. My aim is to take certain node values and add them to my database. Here is a sample of the XML:

    <EVENT>
      <EVENT_DETAIL>
        <EVENT_HEADING>
           <EVENT_HEADER_1>
              <VENUE>Main Track</VENUE>
              <CIRCUIT>2</CIRCUIT>
              <EVENT_TIME>13:05</EVENT_TIME>
              <PARTY>12</PARTY>
           </EVENT_HEADER_1>
           <EVENT_HEADER_2>
              <EVENT_TITLE>BEST OF THE BEST</EVENT_TITLE>
              <DISTANCE>100m</DISTANCE>
           </EVENT_HEADER_2>
        </EVENT_HEADING>
        <PLAYER>
           <PLAYER_HEADER_1>
              <NO>1</NO>
              <NAME>D.Jones</NAME>
              <WEIGHT>70k</WEIGHT>
           </PLAYER_HEADER_1>
           <PLAYER_HEADER_2>
              <COACH>A Wright</COACH>
              <COACH_2>K Naidoo</COACH_2>
              <COACH_YEAR>2017</COACH_YEAR>
              <OWNER>Mr Roy Johnson</OWNER>
              <ASSISTANT>S Davies</ASSISTANT>
              <COMMENT>Running late</COMMENT>
           </PLAYER_HEADER_2>
        </PLAYER>
        <PLAYER>
        <PLAYER>
        <PLAYER>
        <PLAYER>
        <PLAYER>
        <PLAYER>
      </EVENT_DETAIL>
    </EVENT>

    There are multiple <EVENT> elements in the XML file. I want to get <VENUE>, <CIRCUIT>, <EVENT_TIME>, <EVENT_TITLE>, <DISTANCE>, <NO>, <NAME>, <WEIGHT>, <COACH>, <OWNER> and <ASSISTANT>. There are also multiple <PLAYER> elements within an event.

    I used the following for each node I wanted:

    var venue= doc.Root.Descendants("VENUE").Select(c => c.Value);

    This gathered all the values, and I did the same for each node that I wanted, but I'm not sure this is the right approach. I imagined capturing all the nodes I wanted from the first instance of <EVENT>, adding them to a list, then iterating through the list and uploading to the DB. On the next instance of <EVENT> I would instantiate a new list and repeat.

    I'm not sure how to go about this; any examples or pointers would be really appreciated.


    CuriousCoder


    Wednesday, January 16, 2019 10:37 AM

Answers

  • The problem with your approach is that you'll get a list of all the venues but you won't have any context for the other information. I assume that multiple events can have the same venue and other information. So if you want to associate the venue with the player information, you need to grab them together; otherwise you may have multiple events with the same venue but different players. If you just want all the unique values, then grabbing each one separately would be fine, but I suspect you need to know that player X goes with venue Y.
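
    To make the loss of context concrete, here is a small sketch. The XML is made up and deliberately flatter than the real file, and the FlatQueryDemo and AllValues names are only for illustration. It shows that document-wide Descendants calls return separate flat lists with nothing tying a name to its event.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FlatQueryDemo
{
    // Collects the value of every element with the given name, anywhere in the document.
    public static List<string> AllValues(XDocument doc, string name)
    {
        return doc.Root.Descendants(name).Select(x => x.Value).ToList();
    }

    public static void Main()
    {
        // Made-up, simplified XML: two events at the same venue with different players.
        var doc = XDocument.Parse(
            "<EVENTS>" +
            "<EVENT><VENUE>Main Track</VENUE>" +
            "<PLAYER><NAME>D.Jones</NAME></PLAYER>" +
            "<PLAYER><NAME>B.Smith</NAME></PLAYER></EVENT>" +
            "<EVENT><VENUE>Main Track</VENUE>" +
            "<PLAYER><NAME>C.Brown</NAME></PLAYER></EVENT>" +
            "</EVENTS>");

        var venues = AllValues(doc, "VENUE");
        var names = AllValues(doc, "NAME");

        // Two venues and three names come back, but nothing records
        // which name belongs to which event.
        Console.WriteLine(string.Join(", ", venues));
        Console.WriteLine(string.Join(", ", names));
    }
}
```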

    You should grab the events first and then grab the data you need from each event. Then save the event data to your DB. XPath is great for this kind of thing, but you can also use LINQ to XML. Here's one way of doing it.

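    using System.Linq;
    using System.Xml.Linq;
    using System.Xml.XPath;   //Needed for the XPathSelectElement(s) extension methods

    var doc = XDocument.Load("test.xml");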
    
    //Get the events
    var eventElements = doc.XPathSelectElements("/EVENTS/EVENT");
    
    //Get the data for each event, should use a strong type here to make it easier but I'll use
    //an anonymous type since this is an example
    var items = from e in eventElements
                    select new {
                        Venue = e.XPathSelectElement("EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_1/VENUE")?.Value,
                        Circuit = e.XPathSelectElement("EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_1/CIRCUIT")?.Value,
                        Time = e.XPathSelectElement("EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_1/EVENT_TIME")?.Value,
    
                        Title = e.XPathSelectElement("EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_2/EVENT_TITLE")?.Value,
                        Distance = e.XPathSelectElement("EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_2/DISTANCE")?.Value,
                        Players = from p in e.XPathSelectElements("EVENT_DETAIL/PLAYER") 
                                select new {                                          
                                    Id = p.XPathSelectElement("PLAYER_HEADER_1/NO")?.Value,
                                    Name = p.XPathSelectElement("PLAYER_HEADER_1/NAME")?.Value,
                                    Weight = p.XPathSelectElement("PLAYER_HEADER_1/WEIGHT")?.Value,
                                    Coach = p.XPathSelectElement("PLAYER_HEADER_2/COACH")?.Value,
                                    Owner = p.XPathSelectElement("PLAYER_HEADER_2/OWNER")?.Value,
                                    Assistant = p.XPathSelectElement("PLAYER_HEADER_2/ASSISTANT")?.Value,
                                }
                    };
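
    For comparison, the same idea can be written with plain LINQ to XML (Element and Elements) instead of XPath. This is only a sketch, not the code from the answer: the EventQuery class and DescribeEvents method are hypothetical names, the inline XML is a trimmed copy of the sample, and it assumes an <EVENTS> root element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class EventQuery
{
    // Hypothetical helper: returns one "venue: player, player" line per <EVENT>.
    public static List<string> DescribeEvents(XDocument doc)
    {
        var lines = new List<string>();

        // Elements("EVENT") returns only the direct children of the root.
        foreach (var e in doc.Root.Elements("EVENT"))
        {
            var detail = e.Element("EVENT_DETAIL");
            var header1 = detail?.Element("EVENT_HEADING")?.Element("EVENT_HEADER_1");

            // The explicit (string) conversion yields null when the element is missing.
            var venue = (string)header1?.Element("VENUE");

            // Players are read relative to this event, so they stay associated with it.
            var players = (detail?.Elements("PLAYER") ?? Enumerable.Empty<XElement>())
                .Select(p => (string)p.Element("PLAYER_HEADER_1")?.Element("NAME"));

            lines.Add(venue + ": " + string.Join(", ", players));
        }

        return lines;
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<EVENTS>" +
            "<EVENT><EVENT_DETAIL>" +
            "<EVENT_HEADING><EVENT_HEADER_1><VENUE>Main Track</VENUE></EVENT_HEADER_1></EVENT_HEADING>" +
            "<PLAYER><PLAYER_HEADER_1><NAME>D.Jones</NAME></PLAYER_HEADER_1></PLAYER>" +
            "</EVENT_DETAIL></EVENT>" +
            "<EVENT><EVENT_DETAIL>" +
            "<EVENT_HEADING><EVENT_HEADER_1><VENUE>Main Track 2</VENUE></EVENT_HEADER_1></EVENT_HEADING>" +
            "<PLAYER><PLAYER_HEADER_1><NAME>B.Smith</NAME></PLAYER_HEADER_1></PLAYER>" +
            "</EVENT_DETAIL></EVENT>" +
            "</EVENTS>");

        // Prints "Main Track: D.Jones" and then "Main Track 2: B.Smith".
        foreach (var line in DescribeEvents(doc))
        {
            Console.WriteLine(line);
        }
    }
}
```

    Because each lookup starts from the current event element, the venue and its players are always read together.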

    Note that this code uses the null-conditional (?.) operator, which requires C# 6 or later. With an older compiler it won't compile, and you'll need to switch to a conditional expression.
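
    As a rough sketch of that fallback for compilers older than C# 6, the null check can be moved into a small helper that uses a conditional expression. The XmlHelpers class and ValueOrNull method are names made up for this example.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XmlHelpers
{
    // Hypothetical helper: gives the same result as parent.XPathSelectElement(path)?.Value
    // without relying on the ?. operator.
    public static string ValueOrNull(XElement parent, string path)
    {
        XElement match = parent.XPathSelectElement(path);
        return match != null ? match.Value : null;
    }

    public static void Main()
    {
        XElement ev = XElement.Parse(
            "<EVENT><EVENT_DETAIL><EVENT_HEADING><EVENT_HEADER_1>" +
            "<VENUE>Main Track</VENUE>" +
            "</EVENT_HEADER_1></EVENT_HEADING></EVENT_DETAIL></EVENT>");

        // An existing element returns its text.
        Console.WriteLine(ValueOrNull(ev, "EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_1/VENUE"));

        // A missing element returns null instead of throwing.
        Console.WriteLine(ValueOrNull(ev, "EVENT_DETAIL/EVENT_HEADING/EVENT_HEADER_1/CIRCUIT") == null);
    }
}
```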

    Also note that the XPath version above is going to be a little slow against very large XML files because you're continually resolving XPath queries to the same elements. For a utility app this should be fine. For a more commercial app, I prefer to create helper methods that "parse" out the pieces I need. This gives me more flexibility to make adjustments. For example, maybe the event time needs to be formatted differently, or I need to link the coach to something else in the system. Doing all that inside the LINQ expression would be overwhelming, so ParseEventHeader/ParsePlayer-style methods make it easier to maintain. And, as I mention in the code, I'm using an anonymous type to keep the sample short. If you are passing this data on to another method, then you'll want to define strong types to make it easier to use.
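
    A minimal sketch of what such helper methods and strong types might look like follows. The EventInfo, Player and EventParser types are hypothetical, only a subset of the fields is shown, and the <EVENTS> root element is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical strong types covering a subset of the fields.
public class Player
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Coach { get; set; }
}

public class EventInfo
{
    public string Venue { get; set; }
    public string Title { get; set; }
    public List<Player> Players { get; set; }
}

public static class EventParser
{
    public static List<EventInfo> ParseEvents(XDocument doc)
    {
        return doc.Root.Elements("EVENT").Select(e => ParseEvent(e)).ToList();
    }

    static EventInfo ParseEvent(XElement e)
    {
        // Resolve each container element once instead of re-walking the full path per field.
        var detail = e.Element("EVENT_DETAIL");
        var heading = detail?.Element("EVENT_HEADING");
        var header1 = heading?.Element("EVENT_HEADER_1");
        var header2 = heading?.Element("EVENT_HEADER_2");

        return new EventInfo
        {
            Venue = (string)header1?.Element("VENUE"),
            Title = (string)header2?.Element("EVENT_TITLE"),
            Players = (detail?.Elements("PLAYER") ?? Enumerable.Empty<XElement>())
                .Select(p => ParsePlayer(p))
                .ToList()
        };
    }

    static Player ParsePlayer(XElement p)
    {
        var header1 = p.Element("PLAYER_HEADER_1");
        var header2 = p.Element("PLAYER_HEADER_2");

        return new Player
        {
            Id = (string)header1?.Element("NO"),
            Name = (string)header1?.Element("NAME"),
            Coach = (string)header2?.Element("COACH")
        };
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<EVENTS><EVENT><EVENT_DETAIL>" +
            "<EVENT_HEADING>" +
            "<EVENT_HEADER_1><VENUE>Main Track</VENUE></EVENT_HEADER_1>" +
            "<EVENT_HEADER_2><EVENT_TITLE>BEST OF THE BEST</EVENT_TITLE></EVENT_HEADER_2>" +
            "</EVENT_HEADING>" +
            "<PLAYER>" +
            "<PLAYER_HEADER_1><NO>1</NO><NAME>D.Jones</NAME></PLAYER_HEADER_1>" +
            "<PLAYER_HEADER_2><COACH>A Wright</COACH></PLAYER_HEADER_2>" +
            "</PLAYER>" +
            "</EVENT_DETAIL></EVENT></EVENTS>");

        foreach (var ev in ParseEvents(doc))
        {
            Console.WriteLine(ev.Venue + " / " + ev.Title + " / players: " + ev.Players.Count);
        }
    }
}
```

    Keeping the per-field logic in ParseEvent and ParsePlayer gives one obvious place for later changes such as reformatting the event time, and the typed result can be handed directly to whatever code writes to the database.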

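    Finally, note that your XML is invalid: the extra <PLAYER> tags are never closed, and a file with multiple <EVENT> elements needs a single root element. Here's the XML I used.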

    <?xml version="1.0" encoding="utf-8" ?>
    <EVENTS>
      <EVENT>
        <EVENT_DETAIL>
          <EVENT_HEADING>
            <EVENT_HEADER_1>
              <VENUE>Main Track</VENUE>
              <CIRCUIT>2</CIRCUIT>
              <EVENT_TIME>13:05</EVENT_TIME>
              <PARTY>12</PARTY>
            </EVENT_HEADER_1>
            <EVENT_HEADER_2>
              <EVENT_TITLE>BEST OF THE BEST</EVENT_TITLE>
              <DISTANCE>100m</DISTANCE>
            </EVENT_HEADER_2>
          </EVENT_HEADING>
          <PLAYER>
            <PLAYER_HEADER_1>
              <NO>1</NO>
              <NAME>D.Jones</NAME>
              <WEIGHT>70k</WEIGHT>
            </PLAYER_HEADER_1>
            <PLAYER_HEADER_2>
              <COACH>A Wright</COACH>
              <COACH_2>K Naidoo</COACH_2>
              <COACH_YEAR>2017</COACH_YEAR>
              <OWNER>Mr Roy Johnson</OWNER>
              <ASSISTANT>S Davies</ASSISTANT>
              <COMMENT>Running late</COMMENT>
            </PLAYER_HEADER_2>
          </PLAYER>
        </EVENT_DETAIL>
      </EVENT>
      <EVENT>
        <EVENT_DETAIL>
          <EVENT_HEADING>
            <EVENT_HEADER_1>
              <VENUE>Main Track 2</VENUE>
              <CIRCUIT>3</CIRCUIT>
              <EVENT_TIME>13:05</EVENT_TIME>
              <PARTY>12</PARTY>
            </EVENT_HEADER_1>
            <EVENT_HEADER_2>
              <EVENT_TITLE>BEST OF THE BEST 2</EVENT_TITLE>
              <DISTANCE>100m</DISTANCE>
            </EVENT_HEADER_2>
          </EVENT_HEADING>
          <PLAYER>
            <PLAYER_HEADER_1>
              <NO>1</NO>
              <NAME>B.Smith</NAME>
              <WEIGHT>70k</WEIGHT>
            </PLAYER_HEADER_1>
            <PLAYER_HEADER_2>
              <COACH>C.Jones</COACH>
              <COACH_2>K Naidoo</COACH_2>
              <COACH_YEAR>2017</COACH_YEAR>
              <OWNER>Mr Roy Johnson</OWNER>
              <ASSISTANT>S Davies</ASSISTANT>
              <COMMENT>Running late</COMMENT>
            </PLAYER_HEADER_2>
          </PLAYER>
        </EVENT_DETAIL>
      </EVENT>
    </EVENTS>


    Michael Taylor http://www.michaeltaylorp3.net

    • Marked as answer by CuriousCoder15 Thursday, January 17, 2019 11:51 AM
    • Unmarked as answer by CuriousCoder15 Thursday, January 17, 2019 12:04 PM
    • Marked as answer by CuriousCoder15 Thursday, January 17, 2019 12:13 PM
    Wednesday, January 16, 2019 3:15 PM
    Moderator

All replies

  • Thank you very much for your help and explanation, I will work through this now.

    CuriousCoder

    Wednesday, January 16, 2019 3:46 PM
  • Hi CuriousCoder15,

    Thank you for posting here.

    Based on your description, you want to get all the values that you mentioned from the XML.

    For convenience, I used the xml provided by CoolDadTx. You could try the following code.

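    using System;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            // Load the document once instead of once per element name
            var doc = XDocument.Load(@"test.xml");

            string[] names = { "VENUE", "CIRCUIT", "EVENT_TIME", "EVENT_TITLE", "DISTANCE", "NO", "NAME", "WEIGHT", "COACH", "OWNER", "ASSISTANT" };
            foreach (var name in names)
            {
                PrintValues(doc, name);
            }

            Console.ReadKey();
        }

        // Prints the value of every element with the given name, anywhere in the document
        static void PrintValues(XDocument doc, string name)
        {
            var result = from i in doc.Descendants(name)
                         select i.Value;
            foreach (var item in result)
            {
                Console.WriteLine(item);
            }
        }
    }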

    I hope this is helpful.

    Best regards,

    Jack


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, January 17, 2019 8:52 AM
    Moderator