Using the HtmlAgilityPack to parse HTML and update the database

  • Question

  • User-352524747 posted

    I read mikesdotnetting's post (http://www.mikesdotnetting.com/article/273/using-the-htmlagilitypack-to-parse-html-in-asp-net) and I am trying to parse a table that is nested inside another table, which is itself inside another table, on a webpage.

    How do I get the data from the following cells?

    3rd <td> on 3rd <tr>

    3rd <td> on 4th <tr>

    3rd <td> on 5th <tr>

    3rd <td> on 6th <tr>

    3rd <td> on 14th <tr>

    3rd <td> on 15th <tr>

    The HTML code is below:

    <table class="tabcontent" cellspacing="0" cellpadding="3" width="100%" border="1" frame="VOID" rules="ALL" bordercolor="#D3DAD3">
        <tbody>
            <tr>
                <td colspan="5">
                    <b>&nbsp;Last update :</b>&nbsp;12:17:33&nbsp;04.06.2015
                </td>
            </tr>
            <tr>
                <td colspan="2">&nbsp;</td>
                <td align="right" colspan="3" valign="top">all/fx</td>
            </tr>
            <tr>
                <td nowrap="">USD</td>
                <td nowrap="">USD</td>
                <td align="right" nowrap="">124.27</td>
                <td align="right" nowrap="">-2.51</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">EUR</td>
                <td nowrap="">EUR</td>
                <td align="right" nowrap="">140.91</td>
                <td align="right" nowrap="">-0.02</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">GBP</td>
                <td nowrap="">GBP</td>
                <td align="right" nowrap="">191.69</td>
                <td align="right" nowrap="">-1.72</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">CHF</td>
                <td nowrap="">CHF</td>
                <td align="right" nowrap="">133.73</td>
                <td align="right" nowrap="">-1.69</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">JPY</td>
                <td nowrap="">JPY</td>
                <td align="right" nowrap="">100.17</td>
                <td align="right" nowrap="">-1.77</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">AUD</td>
                <td nowrap="">AUD</td>
                <td align="right" nowrap="">96.22</td>
                <td align="right" nowrap="">-2.34</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">CAD</td>
                <td nowrap="">CAD</td>
                <td align="right" nowrap="">99.84</td>
                <td align="right" nowrap="">-2.05</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">SEK</td>
                <td nowrap="">SEK</td>
                <td align="right" nowrap="">15.04</td>
                <td align="right" nowrap="">+0.04</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">NOK</td>
                <td nowrap="">NOK</td>
                <td align="right" nowrap="">16.08</td>
                <td align="right" nowrap="">-0.12</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">DKK</td>
                <td nowrap="">DKK</td>
                <td align="right" nowrap="">18.90</td>
                <td align="right" nowrap="">+0.02</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">SDR</td>
                <td nowrap="">SDR</td>
                <td align="right" nowrap="">173.63</td>
                <td align="right" nowrap="">-2.87</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">Gold(OZ 1)</td>
                <td nowrap="">XAU</td>
                <td align="right" nowrap="">146938.39</td>
                <td align="right" nowrap="">-3,566.79</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
            <tr>
                <td nowrap="">Silver(OZ 1)</td>
                <td nowrap="">XAG</td>
                <td align="right" nowrap="">2036.70</td>
                <td align="right" nowrap="">-64.00</td>
                <td width="20" align="center">
                    <img src="#">
                </td>
            </tr>
        </tbody>
    </table>
    Friday, June 5, 2015 5:08 AM

Answers

  • User-821857111 posted

    You need to get a reference to the table, which you can do using its CSS class if you haven't already. Then obtain the collection of rows and drill into them using the ElementAt method. ElementAt is zero-based, so index 2 is the third item:

    // doc is assumed to be an HtmlDocument that already holds the page's HTML
    var table = doc.DocumentNode.Descendants("table")
        .First(t => t.GetAttributeValue("class", "").Equals("tabcontent"));
    var rows = table.Descendants("tr");
    var thirdCellInThirdRow = rows.ElementAt(2).Descendants("td").ElementAt(2);
    
    <div>@thirdCellInThirdRow.InnerText</div>
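
    To collect all six cells listed in the question rather than just one, the same lookup can be wrapped in a loop. The following is a minimal sketch and has not been run against the live page: the RateTableReader class and its ReadCells method are hypothetical names, and it assumes the row positions on the real page match the sample HTML in the question.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using HtmlAgilityPack;

    // Hypothetical helper, not part of HtmlAgilityPack.
    public static class RateTableReader
    {
        // Returns the trimmed text of one column, keyed by row number.
        // Row and column numbers are 1-based to match the wording of the question.
        public static Dictionary<int, string> ReadCells(
            string html, string tableClass, int column, params int[] rowNumbers)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // Locate the table by its class attribute, as in the snippet above.
            var table = doc.DocumentNode.Descendants("table")
                .FirstOrDefault(t => t.GetAttributeValue("class", "") == tableClass);
            if (table == null)
            {
                throw new InvalidOperationException(
                    "No table with class '" + tableClass + "' was found.");
            }

            // Descendants also returns rows of any tables nested inside this one;
            // the sample table has none, so document order equals row order.
            var rows = table.Descendants("tr").ToList();

            var result = new Dictionary<int, string>();
            foreach (var rowNumber in rowNumbers)
            {
                if (rowNumber < 1 || rowNumber > rows.Count)
                {
                    continue; // row is missing, e.g. the page layout changed
                }

                // Elements returns only the direct <td> children of the row.
                var cells = rows[rowNumber - 1].Elements("td").ToList();
                if (column < 1 || column > cells.Count)
                {
                    continue; // row has fewer cells than expected
                }

                // DeEntitize converts entities such as &nbsp; into characters.
                var text = HtmlEntity.DeEntitize(cells[column - 1].InnerText);
                result[rowNumber] = text.Trim();
            }
            return result;
        }
    }
    ```

    With the sample HTML from the question, RateTableReader.ReadCells(html, "tabcontent", 3, 3, 4, 5, 6, 14, 15) should return the third-column text of the USD, EUR, GBP, CHF, gold and silver rows. Rows that cannot be found are left out of the dictionary, so check for each key before saving a value. If you convert the strings to numbers, pass CultureInfo.InvariantCulture to decimal.Parse so the "." is always read as the decimal separator.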

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, June 5, 2015 7:07 AM