Parsing HTML and getting the value of an element based on its ID

  • Question

  • I'm executing an HttpWebRequest that grabs the HTML from a given URL. I'm receiving the HTML as expected, but I'm not sure of the best way to get the value of an element based on its ID.

    Here's the HTML I receive:

    <HTML><HEAD><TITLE>Device Status</TITLE></HEAD><BODY><div id="deviceValue">00</div></BODY></HTML>

    I want to get 00 from the element with the ID "deviceValue".

    I assume there has to be a super simple way to do this, since people have been parsing HTML for decades, but I can't find such a simple solution.

    Any ideas?

    Thanks!

    Thursday, January 23, 2020 11:07 PM

Answers

  • I just created my own function to parse the value. I'm really surprised that C# doesn't have a basic library for HTML parsing.

    Here's what I came up with...

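            private int GetValueByID(string html, string id)
            {
                int response = -1;

                // Matches only this exact markup; extra attributes or spacing are not handled.
                string searchFor = $"<div id=\"{id}\">";
                int index = html.IndexOf(searchFor, StringComparison.OrdinalIgnoreCase);

                // IndexOf returns -1 when nothing is found; 0 is a valid match position.
                if (index >= 0)
                {
                    int valueIndex = index + searchFor.Length;

                    // The value is expected to be two characters; guard against truncated HTML.
                    if (valueIndex + 2 <= html.Length)
                    {
                        string value = html.Substring(valueIndex, 2);

                        // Base 16: the two characters are read as a hexadecimal number.
                        response = Convert.ToInt32(value, 16);
                    }
                }

                return response;
            }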

    And then the method is called like this...

                    int parsedValue = GetValueByID(html: htmlResponse, id: "deviceValue");
    
                    if (parsedValue > -1)
                    {
                        // Value found
                    }

    I pass along the HTML response and I decide which ID to look for.

    • Marked as answer by T Gregory Tuesday, January 28, 2020 7:07 PM
    Tuesday, January 28, 2020 7:02 PM

All replies

  • Hello,

    Take a look at the HTML Agility Pack:

    https://stackoverflow.com/questions/40559406/how-to-get-value-of-data-url-of-a-specific-div-with-htmlagilitypack
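
    For the HTML in the question, a minimal sketch with the HTML Agility Pack might look like the following. It assumes the HtmlAgilityPack NuGet package is installed; the class name DeviceStatusParser and the method GetInnerTextById are made up for illustration and are not part of the library.

```csharp
using System;
using HtmlAgilityPack; // assumption: the "HtmlAgilityPack" NuGet package is referenced

public static class DeviceStatusParser
{
    // Hypothetical helper: returns the trimmed inner text of the element
    // with the given id, or null when no such element exists.
    public static string GetInnerTextById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Note the library's spelling: GetElementbyId, with a lower-case "b".
        HtmlNode node = doc.GetElementbyId(id);
        return node?.InnerText.Trim();
    }
}
```

    Calling DeviceStatusParser.GetInnerTextById(htmlResponse, "deviceValue") hands back the element's text as a string, so any conversion to a number is left to the caller.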


    Karen Payne

    Friday, January 24, 2020 12:47 AM
    Moderator
  • If you are confident that it appears only once, you can just search for it with IndexOf, using a case-insensitive comparison. This assumes the HTML is fairly static and you don't need anything else from it. However, any change in spacing or attributes will throw off an exact match, so you might do better to find `deviceValue` first, then the next `>`, and then the closing `</div>` tag. Everything in between is the value.
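
    A rough sketch of that three-step search, using only string methods from the base class library, could look like this. The names HtmlScan and GetDivTextById are hypothetical, and the code assumes the id attribute is written with double quotes and that the element is a div with no nested div inside it.

```csharp
using System;

public static class HtmlScan
{
    // Hypothetical helper: returns the text between the opening tag that
    // carries the given id and the next "</div", or null when not found.
    public static string GetDivTextById(string html, string id)
    {
        // 1. Find the id attribute, ignoring case (assumes a double-quoted value).
        string marker = "id=\"" + id + "\"";
        int idIndex = html.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (idIndex < 0) return null;

        // 2. Find the '>' that ends the opening tag.
        int tagEnd = html.IndexOf('>', idIndex + marker.Length);
        if (tagEnd < 0) return null;

        // 3. Find the closing tag; everything in between is the value.
        int closeStart = html.IndexOf("</div", tagEnd + 1, StringComparison.OrdinalIgnoreCase);
        if (closeStart < 0) return null;

        return html.Substring(tagEnd + 1, closeStart - tagEnd - 1).Trim();
    }
}
```

    Because the search starts from the id rather than from an exact opening tag, extra attributes or whitespace inside the tag do not break it, but it is still plain text matching and not a real HTML parser.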

    If you need more complex parsing, or the HTML is fairly dynamic, then using the HTML Agility Pack (HAP), as Karen mentioned, is the slower but safer option.


    Michael Taylor http://www.michaeltaylorp3.net

    Friday, January 24, 2020 2:58 PM
    Moderator