Best way for Reading XML file...

  • Question

  • Hi...

    What is the best and fastest way to read an XML file?

    I have the following XML file (the actual file will be very large):

    <Company>
            <Employee ID="1" >
              <Name First="ABC" Middle="SSS" Surname="XYZ" />
              <Designation>Engineer</Designation>
              <Department>Service</Department>
            </Employee>
            <Employee ID="2" >
              <Name First="XXX" Middle="FFF" Surname="www" />
              <Designation>Engineer</Designation>
              <Department>Service</Department>
            </Employee>
            <Employee ID="3" >
              <Name First="HJK" Middle="QWE" Surname="DSFD" />
              <Designation>Engineer</Designation>
              <Department>Sales</Department>
            </Employee>
            <Employee ID="4" >
              <Name First="ADSF" Middle="HJH" Surname="TTYT" />
              <Designation>Engineer</Designation>
              <Department>Service</Department>
            </Employee>
            <Employee ID="5" >
              <Name First="VVV" Middle="JJK" Surname="MBN" />
              <Designation>Engineer</Designation>
              <Department>Marketing</Department>
            </Employee>
    </Company>

    I have a 'CompanyClass' and I want to create an object of this class.
    I also have an EmployeeClass; the Company class will have a list of EmployeeClass objects.
    The Employee class will have properties for Name, Designation and Department.

    How can I read this file? What is the fastest and best way to read this XML file?


    Thanks in advance,
    IamHuM

    Tuesday, July 14, 2009 4:30 AM

Answers

  • Hi!

    It's relatively easy with Linq2xml.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            // Loads the whole document into memory, then projects each <Employee> into an object.
            XDocument xdoc = XDocument.Load("test.xml");
            Company c = new Company
            {
                Employees = new List<Employee>(from e in xdoc.Descendants("Employee")
                                               select new Employee
                                               {
                                                   ID = (string)e.Attribute("ID"),
                                                   Department = (string)e.Element("Department"),
                                                   Designation = (string)e.Element("Designation"),
                                                   // Assumes every <Employee> has a <Name> child.
                                                   Name = new EmployeeName
                                                   {
                                                       First = (string)e.Element("Name").Attribute("First"),
                                                       Middle = (string)e.Element("Name").Attribute("Middle"),
                                                       Surname = (string)e.Element("Name").Attribute("Surname")
                                                   }
                                               })
            };
            Console.Read();
        }
    }
    
    public class Employee
    {
        string id;
        public string ID
        {
            get { return id; }
            set { id = value; }
        }
    
        EmployeeName name;
        public EmployeeName Name
        {
            get { return name; }
            set { name = value; }
        }
    
        string designation;
        public string Designation
        {
            get { return designation; }
            set { designation = value; }
        }
    
        string department;
        public string Department
        {
            get { return department; }
            set { department = value; }
        }
    }
    
    public class EmployeeName
    {
        string first;
        public string First
        {
            get { return first; }
            set { first = value; }
        }
    
        string middle;
        public string Middle
        {
            get { return middle; }
            set { middle = value; }
        }
    
        string surname;
        public string Surname
        {
            get { return surname; }
            set { surname = value; }
        }
    }
    
    
    public class Company
    {
        List<Employee> employees;
        public List<Employee> Employees
        {
            get { return employees; }
            set { employees = value; }
        }
    }
    Although this works, you should try using an XmlSerializer too (but then the classes need a little tailoring).
    Hope this helps!

    David
    • Proposed as answer by David Fulop Tuesday, July 14, 2009 7:31 AM
    • Marked as answer by Figo Fei Wednesday, July 15, 2009 3:45 AM
    • Unmarked as answer by IamHuM Wednesday, July 15, 2009 1:18 PM
    • Marked as answer by Figo Fei Monday, July 20, 2009 3:32 AM
    Tuesday, July 14, 2009 7:31 AM
  • You don't hold all the data; you stream it in, as with a StreamReader, and evaluate each node as it arrives.

    //Example

    using System;
    using System.Xml;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Dispose the reader so the file handle is released.
                using (XmlTextReader rdr = new XmlTextReader("1.xml"))
                {
                    while (rdr.Read())
                    {
                        // GetAttribute returns null when ID is absent, so compare it directly.
                        if (rdr.NodeType == XmlNodeType.Element
                            && rdr.Name == "Employee"
                            && rdr.GetAttribute("ID") == "5")
                        {
                            rdr.ReadToDescendant("Name");
                            Console.WriteLine(rdr.GetAttribute("First"));
                            Console.WriteLine(rdr.GetAttribute("Middle"));
                            Console.WriteLine(rdr.GetAttribute("Surname"));
                            // ReadString stops on the end tag, so the following
                            // ReadToNextSibling does not step over <Department>.
                            rdr.ReadToNextSibling("Designation");
                            Console.WriteLine(rdr.ReadString());
                            rdr.ReadToNextSibling("Department");
                            Console.WriteLine(rdr.ReadString());
                            break;
                        }
                    }
                }
                Console.ReadLine();
            }
        }
    }


    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com

    • Edited by JohnGrove Wednesday, July 15, 2009 3:07 PM
    • Marked as answer by Figo Fei Monday, July 20, 2009 3:32 AM
    Wednesday, July 15, 2009 2:49 PM

All replies

  • Hi IamHuM,

    How i can read this file... Fast & best way to read this XML file...?

    IMHO "fast" and "best" contradict each other; realistically they are two sides of the same coin.

    Anyhow, I think what you need is XML serialization. Have a look at the link below.

    http://msdn.microsoft.com/en-us/library/ms950721.aspx
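
    As a rough sketch of what classes tailored for XML serialization could look like: the types below are hypothetical (CompanyDto, EmployeeDto, NameDto and CompanyLoader do not come from this thread) and are annotated so that XmlSerializer maps them onto the XML in the question. It assumes the root element is spelled Company; the XmlRoot name has to match whatever the file actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class shapes; the attributes map them onto the question's XML.
[XmlRoot("Company")]
public class CompanyDto
{
    // A flat run of <Employee> elements directly under the root.
    [XmlElement("Employee")]
    public List<EmployeeDto> Employees = new List<EmployeeDto>();
}

public class EmployeeDto
{
    [XmlAttribute("ID")] public string ID;
    [XmlElement("Name")] public NameDto Name;
    [XmlElement("Designation")] public string Designation;
    [XmlElement("Department")] public string Department;
}

public class NameDto
{
    [XmlAttribute("First")] public string First;
    [XmlAttribute("Middle")] public string Middle;
    [XmlAttribute("Surname")] public string Surname;
}

public static class CompanyLoader
{
    // Deserializes the whole document into one object graph.
    public static CompanyDto Load(TextReader input)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(CompanyDto));
        return (CompanyDto)serializer.Deserialize(input);
    }
}
```

    Calling CompanyLoader.Load(new StreamReader("test.xml")) would return the populated object. XmlSerializer builds the complete object graph in memory, so for a very large file it carries the same memory cost as any other load-everything approach.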


    Yet to be discovered!
    Tuesday, July 14, 2009 7:40 AM
  • hi,

            Thanks for the replies.

            My first option is to use .NET 2.0, so I can't use LINQ, since LINQ is not supported by .NET 2.0.
            So what is the better way to do this in .NET 2.0?

            In case I convert my application to .NET 3.0 or higher, can you please tell me how I can solve the following problems?

                1. Suppose I have a more complicated XML file; how can I use LINQ queries for it?
                           (Here I have only 3 levels of nodes. If there are more levels, can I do the same as shown above?)

                 2. For more complicated XML files and class structures, how can I use LINQ efficiently and easily?


    My apologies for the delayed reply.


    Thanks in advance,
    IamHuM

           

    Wednesday, July 15, 2009 1:18 PM
  • //Example

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                XElement element = XElement.Load("1.xml");
                XElement id5 = element.Descendants()
                    .Where(i => (String)i.Attribute("ID") == "5")
                    .First();
                Console.WriteLine(id5);
                Console.ReadLine();
            }
        }
    }
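
    On the question of deeper nesting: a LINQ to XML query can chain one from clause per level. A minimal sketch, assuming a hypothetical layout Company/Division/Team/Employee with a Name attribute on Division and Team (neither level exists in the original file, and NestedQuery is an illustrative name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NestedQuery
{
    // Returns "Division/Team/Surname" for every employee in the hypothetical layout.
    public static List<string> Paths(XElement company)
    {
        return (from division in company.Elements("Division")
                from team in division.Elements("Team")
                from emp in team.Elements("Employee")
                select (string)division.Attribute("Name") + "/" +
                       (string)team.Attribute("Name") + "/" +
                       // Elements/Attributes avoids a null reference if <Name> is missing.
                       (string)emp.Elements("Name").Attributes("Surname").FirstOrDefault())
               .ToList();
    }
}
```

    Each additional level is just another from clause, so the query grows with the depth of the document but keeps the same shape.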


    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Wednesday, July 15, 2009 1:38 PM
  • "Fast" is relative. How big is your file?

    Big file = XmlReader.
    Small file = XmlDocument
    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Wednesday, July 15, 2009 1:40 PM
  • //using .NET 2.0

    using System;
    using System.Xml;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // XmlDocument loads the entire file into memory as a DOM tree.
                XmlDocument doc = new XmlDocument();
                doc.Load("1.xml");
                // XPath lookup; returns null when no employee has ID 5.
                XmlNode node = doc.SelectSingleNode("//Employee[@ID='5']");
                if (node != null)
                {
                    Console.WriteLine(node["Name"].Attributes["First"].InnerText);
                    Console.WriteLine(node["Name"].Attributes["Middle"].InnerText);
                    Console.WriteLine(node["Name"].Attributes["Surname"].InnerText);
                    Console.WriteLine(node["Designation"].InnerText);
                    Console.WriteLine(node["Department"].InnerText);
                }
                Console.ReadLine();
            }
        }
    }


    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Wednesday, July 15, 2009 2:08 PM
  • hi,

         Thanks for the reply.

          My file can contain data for more than 100,000 employees. Each employee will have its own data.
          The XML file will have a minimum of 10 levels of nodes. I can't use a database; I have to use an XML file.

          What is a good way to read this file?


    Thanks in advance,
    IamHuM
    Wednesday, July 15, 2009 2:09 PM
  • Then you definitely want to use the XmlReader/XmlWriter.

    From "Beginning Xml with C# 2008, P.61"

    "DOM-based parsers are best suited to modifying XML documents that are small. However, with huge XML documents, DOM access can pose problems in terms of memory footprint and performance. In such cases, an alternative must be adopted so that we can read and write XML documents without these limitations.......The .NET Framework provides a class called XmlReader...."

    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Wednesday, July 15, 2009 2:10 PM
  • Yes, XmlReader is the way to go, but then you face the problem of holding all this data in memory, assuming you create Employee objects as you read each Employee element in the xml; potentially you could end up with 1 million instantiated Employee objects, which could kill your system. You probably need to be offloading onto some other type of storage, so your code is a way of converting between xml and some other storage format, e.g., a database. As you read each Employee element, you "deserialize" it into an Employee object which you then save off (e.g., to a database) and release the memory used by it.
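
    A minimal sketch of that read-one, hand-off, release pattern, written to stay within .NET 2.0. EmployeeRecord and EmployeeStream are hypothetical names, and the flat record is only an assumption about what needs to be kept; what the caller does with each record (for example, saving it to a database) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical flat record; not the thread's Employee class.
public class EmployeeRecord
{
    public string ID, First, Middle, Surname, Designation, Department;
}

public static class EmployeeStream
{
    // Yields one record per <Employee>; this method never holds more than one.
    public static IEnumerable<EmployeeRecord> Read(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.ReadToFollowing("Employee"))
            {
                yield return ReadOne(reader);
            }
        }
    }

    private static EmployeeRecord ReadOne(XmlReader reader)
    {
        EmployeeRecord e = new EmployeeRecord();
        // ReadSubtree confines reading to this <Employee>; once it is closed,
        // the outer reader sits on the matching end tag.
        using (XmlReader sub = reader.ReadSubtree())
        {
            sub.Read();                     // move onto <Employee>
            e.ID = sub.GetAttribute("ID");
            sub.Read();                     // step to the first child node
            while (!sub.EOF)
            {
                if (sub.NodeType != XmlNodeType.Element)
                {
                    sub.Read();
                }
                else if (sub.Name == "Name")
                {
                    e.First = sub.GetAttribute("First");
                    e.Middle = sub.GetAttribute("Middle");
                    e.Surname = sub.GetAttribute("Surname");
                    sub.Read();
                }
                else if (sub.Name == "Designation")
                {
                    // Already advances past </Designation>, so no extra Read().
                    e.Designation = sub.ReadElementContentAsString();
                }
                else if (sub.Name == "Department")
                {
                    e.Department = sub.ReadElementContentAsString();
                }
                else
                {
                    sub.Skip();             // ignore elements this sketch does not map
                }
            }
        }
        return e;
    }
}
```

    A caller would loop with foreach (EmployeeRecord e in EmployeeStream.Read(new StreamReader("1.xml"))) and save or discard each record inside the loop, so that processed records become eligible for garbage collection.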
    Wednesday, July 15, 2009 2:44 PM
  • hi,


              Thanks...
              I have put together some code. Can you tell me if this is the correct way to read the complete XML file?


    /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
               
    // Load the whole file into an XmlDocument.
    XmlDocument doc = new XmlDocument();
    using (XmlReader reader = XmlReader.Create(filePath, readerSettings))
    {
        doc.Load(reader);
    }

    XmlNode companyXmlNode = doc.SelectSingleNode("Company"); // get the Company node

    Company company = new Company();

    company.ReadConfigXml(companyXmlNode); // pass the Company node to the Company class for reading


    // Inside the Company class

    public void ReadConfigXml(XmlNode companyXmlNode)
    {
        foreach (XmlNode node in companyXmlNode.ChildNodes)
        {
            Employee employee = new Employee();

            employee.ReadConfigXml(node); // pass each Employee node to the Employee class for reading

            // Then add to list...
        }
    }
    
    /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////



    Thanks for the help once again,
    IamHuM
    Wednesday, July 15, 2009 2:45 PM
  • ok...

            I have to use the data I read elsewhere in my application. So when the file has been read, I should have a list of employees in a Company class, which I will then use.

           I think this is not possible with your code; you are just reading the data and displaying it. Am I correct?

           Please suggest...

    Thanks for help,

    IamHuM

    Wednesday, July 15, 2009 3:25 PM
  • I think David answered that using .NET 3.5, just modify his example and couple it with mine. Don't let us do all your work for you. You certainly have enough to make a decision now.


    John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
    Wednesday, July 15, 2009 4:04 PM
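
    Coupling the two approaches, as suggested above, might look roughly like this under .NET 3.5: an XmlReader walks the file, XNode.ReadFrom materializes one Employee element at a time, and each element is projected in the style of David's query. EmployeeData, CompanyData and HybridLoader are hypothetical names, and only three fields are mapped to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical target classes; the thread's own Employee/Company would work the same way.
public class EmployeeData
{
    public string ID { get; set; }
    public string Surname { get; set; }
    public string Department { get; set; }
}

public class CompanyData
{
    public List<EmployeeData> Employees { get; set; }
}

public static class HybridLoader
{
    // Streams <Employee> elements as small XElement fragments, one at a time.
    public static IEnumerable<XElement> StreamEmployees(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Employee")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Builds the Company object with its list of employees.
    public static CompanyData Load(TextReader input)
    {
        return new CompanyData
        {
            Employees = StreamEmployees(input)
                .Select(e => new EmployeeData
                {
                    ID = (string)e.Attribute("ID"),
                    Surname = (string)e.Elements("Name").Attributes("Surname").FirstOrDefault(),
                    Department = (string)e.Element("Department")
                })
                .ToList()
        };
    }
}
```

    The finished list still holds every employee in memory; what this avoids is keeping the full XML tree alongside it.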