LINQ query nullreference exception unhandled

  • Question

  • User1691596021 posted

    Hi,

    I am trying to do some screen scraping with HtmlAgilityPack (http://www.codeplex.com/htmlagilitypack) and LINQ. Even if you haven't used that library, I think you can answer this anyway.

    I used a StringBuilder to build a simple HTML page with two divs, each containing some text. First I tried to get both divs and display their inner text, and that worked fine. My query looked like this:

    List<HtmlNode> divList = (from HtmlNode node in doc.DocumentNode.SelectNodes("//div")
                              where node.Name == "div"
                              select node).ToList();


    and then I had a foreach loop:

    foreach (HtmlNode node in divList)
        Console.WriteLine(node.InnerText);


    All of this trivial stuff worked just fine. Then I added a class called "withclass" to one of the divs and changed the where clause to node.Name == "div" && node.Attributes["class"].Value == "withclass". This works fine IF both divs have a class attribute; when they have different class values, the where clause picks only the one I specified. But if only one of the divs has a class attribute, I get an unhandled NullReferenceException (Object reference not set to an instance of an object.). I guess that is because the query tries to read the value of the class attribute, which doesn't exist on one of the divs. So I added node.Attributes["class"].Value != null to the where clause, but the problem remains.

    I thought that a LINQ query like this wouldn't mind if some of the HtmlNode objects lacked, for example, a class attribute, and that the where clause only decides which HtmlNodes meet my criteria. But I guess I am wrong?


    I would be really grateful if someone could shed some light on this!


    Best Regards

    Thomas

    Tuesday, February 23, 2010 6:06 PM

Answers

  • User-1179452826 posted

    where node.Name == "div" && node.Attributes["class"] != null && node.Attributes["class"].Value == "yourClass"

     

    That should work. 
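
    To spell out why the extra check in the question did not help: node.Attributes["class"] is itself null when a div has no class attribute, so reading .Value on it throws before any comparison happens. Testing the attribute for null first works because && short-circuits. Below is a minimal sketch of the whole thing, assuming the HtmlAgilityPack package is referenced; the class name DivFilter, the method name DivsWithClass and the sample HTML are made up for illustration and are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class DivFilter
{
    // Returns every <div> whose class attribute equals className exactly.
    public static List<HtmlNode> DivsWithClass(HtmlDocument doc, string className)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection divs = doc.DocumentNode.SelectNodes("//div");
        if (divs == null)
            return new List<HtmlNode>();

        return (from HtmlNode node in divs
                // Check the attribute object first; && stops evaluating here
                // for divs without a class attribute, so .Value is never
                // read on a null reference.
                where node.Attributes["class"] != null
                   && node.Attributes["class"].Value == className
                select node).ToList();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body>" +
                     "<div class=\"withclass\">first</div>" +
                     "<div>second</div>" +
                     "</body></html>");

        // Only the first div carries the class, so only its text is printed.
        foreach (HtmlNode node in DivsWithClass(doc, "withclass"))
            Console.WriteLine(node.InnerText);
    }
}
```

    The node.Name == "div" test is left out here because the "//div" XPath already restricts the result to div elements. If you prefer a single call, node.GetAttributeValue("class", "") == className should behave the same way, since it returns the supplied default when the attribute is missing.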

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Wednesday, February 24, 2010 1:22 AM