URL (DownloadUrl in the code) should come from a textbox on the form. The logic for counting words is so that I can do my search programmatically and count instances of <a href etc. Can anyone help?

All replies

  • Hi lts,

    Thank you for posting here.

    Based on your description, you want to count the instances of <a href in a page.

    You could refer to the following code.

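    using System;
    using System.Collections.Generic;
    using HtmlAgilityPack;

    class Program
    {
        static void Main(string[] args)
        {
            // Load the test page; HtmlWeb.Load also accepts an http or https URL.
            HtmlWeb hw = new HtmlWeb();
            HtmlDocument doc = hw.Load("D:\\test.html");

            List<string> m = ExtractAllAHrefTags(doc);
            Console.WriteLine(m.Count);

            Console.ReadKey();
        }

        // Returns the href value of every <a> element that has an href attribute.
        private static List<string> ExtractAllAHrefTags(HtmlDocument htmlSnippet)
        {
            List<string> hrefTags = new List<string>();

            // SelectNodes returns null (not an empty collection) when nothing matches,
            // so guard against it before iterating.
            HtmlNodeCollection links = htmlSnippet.DocumentNode.SelectNodes("//a[@href]");
            if (links == null)
            {
                return hrefTags;
            }

            foreach (HtmlNode link in links)
            {
                hrefTags.Add(link.Attributes["href"].Value);
            }

            return hrefTags;
        }
    }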

    Html:

    <!DOCTYPE html>
    <html>
    <head> 
    <meta charset="utf-8"> 
    <title>Hello</title> 
    </head>
    <body>
    
    <p>
    <a href="/index.html">text</a> This is a link</p>
    
    <p><a href="//www.microsoft.com/">txt</a> this is a another link </p>
    
    </body>
    </html>

    Result: the program prints 2, because the test page contains two <a> elements with an href attribute.
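
    The question asks for the URL to come from a textbox rather than a hard-coded path. The following is only a minimal sketch of that, assuming HtmlAgilityPack is referenced in the project. The names LinkCounter, CountLinksInHtml, and CountLinksAtUrl are hypothetical and do not come from the original post.

    ```csharp
    using System;
    using HtmlAgilityPack;

    static class LinkCounter
    {
        // Hypothetical helper: counts <a> elements that carry an href attribute
        // in an HTML string that is already in memory.
        public static int CountLinksInHtml(string html)
        {
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(html);
            return CountLinks(doc);
        }

        // Hypothetical helper: downloadUrl is assumed to be the textbox text.
        public static int CountLinksAtUrl(string downloadUrl)
        {
            // Reject empty input and anything that is not an absolute http(s) URL.
            if (!Uri.TryCreate(downloadUrl, UriKind.Absolute, out Uri uri) ||
                (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
            {
                throw new ArgumentException(
                    "Expected an absolute http or https URL.", nameof(downloadUrl));
            }

            HtmlDocument doc = new HtmlWeb().Load(uri.AbsoluteUri);
            return CountLinks(doc);
        }

        private static int CountLinks(HtmlDocument doc)
        {
            // SelectNodes returns null when nothing matches, so treat that as zero.
            HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
            return links?.Count ?? 0;
        }
    }
    ```

    In a form, a button click handler could then call LinkCounter.CountLinksAtUrl(txtUrl.Text.Trim()), where txtUrl is an assumed name for the textbox, and show the returned count in a label.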

    Best Regards,

    Jack


    Wednesday, July 17, 2019 8:45 AM
    Moderator
  • If you want to load HTML and then find elements in it, you'll want to use an HTML parser. Simply "counting" strings won't work, because the text of an HTML element could be embedded in a comment, string literal, script, etc. Use a parser like HtmlAgilityPack to parse the HTML. Questions related to using it should be posted in their support forums.
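
    To illustrate why plain string counting is unreliable, here is a small sketch that uses only the .NET base class library. The NaiveCount class, the CountOccurrences helper, and the sample markup are made up for this illustration.

    ```csharp
    using System;

    static class NaiveCount
    {
        // Counts non-overlapping, case-insensitive occurrences of needle in text.
        public static int CountOccurrences(string text, string needle)
        {
            int count = 0;
            int index = 0;
            while ((index = text.IndexOf(needle, index, StringComparison.OrdinalIgnoreCase)) >= 0)
            {
                count++;
                index += needle.Length;
            }
            return count;
        }

        static void Main()
        {
            // Only the first <a href ...> below is an actual link element.
            // The second sits inside a comment, the third inside a script string.
            string html =
                "<p><a href=\"/index.html\">real link</a></p>" +
                "<!-- <a href=\"/old.html\">commented out</a> -->" +
                "<script>var s = '<a href=\"/fake.html\">';</script>";

            // Prints 3, because every textual occurrence is counted.
            Console.WriteLine(CountOccurrences(html, "<a href"));
        }
    }
    ```

    The string search reports three matches even though the page has a single link, which is the kind of overcount a parser is meant to avoid.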

    Questions related to working with Windows Forms should be posted in the Windows Forms forum.


    Michael Taylor http://www.michaeltaylorp3.net

    Wednesday, July 17, 2019 2:03 PM
    Moderator