Extract data from HTML page

  • Question

  • Hi all,

    I am working on shipment tracking and want to extract data from the page source below. Thanks.

     

    <div class="tableborder" id="table8335200020" name="tableable">
    <table border="0" summary="Summary of table content">
    <thead class="tophead">
    <tr>
    <th scope="col" axis="length" class="icon" style="width:5%">
    <div class="iconImg" id="summaryicon"></div>
    <img src="/img/modules/icon_tclp_icon_blank1.jpg" alt="" class="iconImg" id="trackContact8335200020"/>
    <!-- <div dojoType="dijit.Tooltip" connectId="trackContact8335200020" ></div> -->
    </th>
    <th scope="col" axis="length" style="width:32% ;text-align:left ">
    <span class="air_waybill">Waybill: 8335200020</span>
    <span class="error" id="summarystatus"></span>
    </th>
    <th colspan="2" scope="col" axis="length" style="width:37% ;text-align:left">
    <div id="summarydatetime"></div>
    Origin Service Area:
    <span class="font_normal"><img src="/img/common/arrow.gif"/>&nbsp;<a id="orginURL4" href='#'>GATWICK - UK</a></span>
    <br/>
    Destination Service Area:
    <span class="font_normal"><img src="/img/common/arrow.gif"/>&nbsp;<a id="destinationURL4" href='#'>LONDON-HEATHROW - UK</a>
    </span>
    </th>
    <th scope="col" axis="length" class="lastChild" style="width:26%; ;text-align:left">
    </th>
    </tr>
    </thead>
    <thead>
    <tr>
    <td colspan="5" class="emptyRow"></td>
    </tr>
    <tr>
    <th scope="col" colspan="2" axis="length" style="width: 40% ;text-align:left">Friday, September 16, 2011 </th>
    <th scope="col" axis="length" style="width: 30% ;text-align:left ">Location</th>
    <th scope="col" axis="length" style="width: 9%;text-align:left">Time</th>
    <th scope="col" axis="length" class="lastChild" style="width: 25% ;text-align:left">&nbsp;</th>
    </tr>
    </thead>
    <tbody>
    <tr>
    <td class="lastRow" style="width: 5% ;text-align:left">1</td>
    <td class="lastRow" style="text-align:left">Shipment information received</td>
    <td class="lastRow" style="text-align:left">
    GATWICK - UK
    </td>
    <td class="lastRow">5:46 PM</td>
    <td class="lastChild lastRow">
    <!--start contentteaser -->
    <div class="dhl">
    <div>
    <div class="clearAll">&nbsp;</div>
    </div>
    </div>
    <!--end contentteaser -->
    </td>
    </tr>
    </tbody>
    <!-- Sudeep Start change section 4 -->
    <!-- Sudeep end change section 4 -->
    </table>
    </div>

     

     


    ayaz
    Wednesday, October 26, 2011 12:11 AM

Answers

  • Hi

    You can do it in a couple of ways.

    1. Using HTML Objects

    For this you need to add references to the following libraries:

    • Microsoft Internet Controls
    • Microsoft HTML Object Library

    2. Web Query

    I feel this is the easier way. Excel can pull the formatted data into a worksheet as an external query table, and from there it is easy to grab the necessary data.

    Cheers

    Shasur

     

     

     


    http://www.vbadud.blogspot.com http://www.dotnetdud.blogspot.com
    Wednesday, October 26, 2011 1:34 PM
  • If the page is available from a queryable source, typically an online web page, either of Shasur's suggestions should work well; the web query is probably the easiest.

    If you are receiving the page as source HTML, you could save it to a text file with an *.html extension, open it in Excel, and go from there.

    Peter Thornton

    Wednesday, October 26, 2011 3:13 PM
    Moderator
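
For anyone who receives the page as a raw HTML string and is not working inside Excel, the same "walk the HTML objects" idea can be sketched in Python with only the standard library. This is an illustration, not something the replies above proposed: the names `SummaryParser` and `parse_summary` are made up for the example, `SAMPLE` is a trimmed copy of the markup from the question, and matching the link ids by prefix assumes the trailing digit in `orginURL4` / `destinationURL4` can vary between pages.

```python
from html.parser import HTMLParser

# Trimmed copy of the markup posted in the question.
SAMPLE = """
<th><span class="air_waybill">Waybill: 8335200020</span></th>
<th>
Origin Service Area:
<span class="font_normal"><a id="orginURL4" href='#'>GATWICK - UK</a></span>
<br/>
Destination Service Area:
<span class="font_normal"><a id="destinationURL4" href='#'>LONDON-HEATHROW - UK</a></span>
</th>
"""


class SummaryParser(HTMLParser):
    """Collects the waybill number and the origin/destination link text."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        element_id = attrs.get("id") or ""
        if tag == "span" and attrs.get("class") == "air_waybill":
            self._current = "waybill"
        elif tag == "a" and element_id.startswith("orginURL"):
            # "orgin" is the page's own spelling of the id.
            self._current = "origin"
        elif tag == "a" and element_id.startswith("destinationURL"):
            self._current = "destination"

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()

    def handle_endtag(self, tag):
        if tag in ("span", "a"):
            self._current = None


def parse_summary(html):
    """Return a dict with whichever of waybill/origin/destination were found."""
    parser = SummaryParser()
    parser.feed(html)
    parser.close()
    fields = parser.fields
    # The span text is "Waybill: 8335200020"; keep only the number.
    if "waybill" in fields:
        fields["waybill"] = fields["waybill"].split(":", 1)[-1].strip()
    return fields


print(parse_summary(SAMPLE))
```

Keying on the `class` and `id` attributes rather than on element positions means the sketch keeps working if the site adds or reorders columns, but it will silently return fewer fields if those attribute names change.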

All replies

  • Hi Peter and Shasur

    Thanks for the help. I will dig into this and get back to this forum.

    Are there any examples?

    Regards


    • Edited by Ayaz_H Thursday, October 27, 2011 1:52 PM
    Thursday, October 27, 2011 1:49 PM
  • You would need to give a clearer idea of what you are doing before we can consider which of the three suggestions you now have is likely to be the easiest or most useful. However, both the web query and saving the HTML string to a file and reopening it in Excel are very easy and take only a few manual steps (which could of course be done programmatically if required), so they should be simple for you to do without examples.

    Try those first and see if either meets your needs.

    Peter Thornton

    Thursday, October 27, 2011 2:30 PM
    Moderator
  • Hi Peter

    I didn't explain it properly earlier, so let me explain further here. I am working on a freight management module in our software. The idea is to track all parcels by sending queries to different tracking websites. With the correct tracking number I get a reply back from the tracking website, but I need to extract the data from the HTML source code.

    Regards

     


    ayaz
    Monday, October 31, 2011 9:27 AM
  • The web query approach would probably be easiest for what you describe. Have you tried it? If so, and it was not suitable, explain what problems you had with it.

    Peter Thornton

    Monday, October 31, 2011 11:25 AM
    Moderator
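
If the freight module has to parse the returned HTML itself rather than go through Excel, the tracking rows in the question's table can be read directly from the `<tbody>`. The sketch below is in Python only because the thread does not say what language the software uses. `CheckpointParser`, `parse_checkpoints`, and the four column names are hypothetical, and the column order (number, status, location, time) is assumed from the single sample row in the question.

```python
from html.parser import HTMLParser

# Reduced copy of the table from the question (styles and comments removed).
SAMPLE_ROWS = """
<table>
<thead><tr><th>Friday, September 16, 2011 </th><th>Location</th><th>Time</th></tr></thead>
<tbody>
<tr>
<td class="lastRow">1</td>
<td class="lastRow">Shipment information received</td>
<td class="lastRow">
GATWICK - UK
</td>
<td class="lastRow">5:46 PM</td>
<td class="lastChild lastRow"><div class="dhl"><div><div class="clearAll">&nbsp;</div></div></div></td>
</tr>
</tbody>
</table>
"""


class CheckpointParser(HTMLParser):
    """Collects the text of every <td> in every <tbody> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_tbody = False
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._in_tbody = True
        elif self._in_tbody and tag == "tr":
            self._row = []
        elif self._row is not None and tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "tbody":
            self._in_tbody = False


def parse_checkpoints(html):
    """Return one dict per tracking row; column names are assumed, not from the site."""
    parser = CheckpointParser()
    parser.feed(html)
    parser.close()
    keys = ("number", "status", "location", "time")
    # zip() stops at the shortest input, which drops the empty trailing teaser cell.
    return [dict(zip(keys, row)) for row in parser.rows if len(row) >= len(keys)]


print(parse_checkpoints(SAMPLE_ROWS))
```

The date ("Friday, September 16, 2011") sits in a `<thead>` row and is not captured by this sketch; a fuller version would need to pair each group of rows with the header that precedes it.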