Scrape a web page, get the results, and create a table

Question
-
User-909867351 posted
Hi,
I want to get some values from a tracking website, and I use the following code:
String url = "http://www.cttexpresso.pt/feapl_2/app/open/cttexpresso/objectSearch/objectSearch.jspx?objects=RH306695573PT";
var doc = new HtmlAgilityPack.HtmlDocument();
HtmlAgilityPack.HtmlNode.ElementsFlags["br"] = HtmlAgilityPack.HtmlElementFlag.Empty;
doc.OptionWriteEmptyNodes = true;
var webRequest = HttpWebRequest.Create(url);
Stream stream = webRequest.GetResponse().GetResponseStream();
doc.Load(stream);
stream.Close();
string testDivSelector = "//table[@class='full-width']";
var divString = doc.DocumentNode.SelectSingleNode(testDivSelector).InnerHtml.ToString();
Response.Write(divString);
I get the correct result: https://registos.programamos.pt/lectt.aspx
My problem:
I need to create a table from these results, like this:
Hora Estado Motivo Local
segunda-feira, 7 Janeiro 2019
15:57 Entregue ANTERO DE QUENTAL (P.DELGADA)
08:51 Disponível para levantamento ANTERO DE QUENTAL (P.DELGADA)
sexta-feira, 4 Janeiro 2019
17:47 Expedição Nacional 9500 PONTA DELGADA
What's the best option for that?
Thank you
Wednesday, January 9, 2019 3:55 PM
All replies
-
User-943250815 posted
If the nested table can help you, use Descendants. The sample below includes the Recetor column.
If the Recetor column should not be part of the result, you can remove the last column and adjust the colspans, or keep using Descendants to get the TR and TD elements, read the cell values, and construct the table you want.
If you are working with Web Forms, add a Literal control to the page:
Dim url As String = "http://www.cttexpresso.pt/feapl_2/app/open/cttexpresso/objectSearch/objectSearch.jspx?objects=RH306695573PT"
Dim zHTML As New HtmlAgilityPack.HtmlDocument()
Dim Selector As String = "//table[@class='full-width']"
Dim webRequest = System.Net.HttpWebRequest.Create(url)
Dim stream As System.IO.Stream = webRequest.GetResponse().GetResponseStream()
zHTML.Load(stream)
stream.Close()
Dim OuterTbl As HtmlNode = zHTML.DocumentNode.SelectSingleNode(Selector)
Dim InnerTbl As HtmlNode = OuterTbl.Descendants("table")(0)
Literal1.Text = InnerTbl.OuterHtml
Wednesday, January 9, 2019 6:57 PM -
User-909867351 posted
Hi
When I convert it to C#, I get an error:
string url = "http://www.cttexpresso.pt/feapl_2/app/open/cttexpresso/objectSearch/objectSearch.jspx?objects=RH306695573PT";
HtmlAgilityPack.HtmlDocument zHTML = new HtmlAgilityPack.HtmlDocument();
string Selector = "//table[@class='full-width']";
var webRequest = System.Net.HttpWebRequest.Create(url);
System.IO.Stream stream = webRequest.GetResponse().GetResponseStream();
zHTML.Load(stream);
stream.Close();
HtmlNode OuterTbl = zHTML.DocumentNode.SelectSingleNode(Selector);
HtmlNode InnerTbl = OuterTbl.Descendants("table")(0);
Literal1.Text = InnerTbl.OuterHtml;
I get the error on this line:
HtmlNode InnerTbl = OuterTbl.Descendants("table")(0);
The message is: Method name expected
Any help?
Thursday, January 10, 2019 9:33 AM -
User-943250815 posted
Oops, my bad. I had HtmlAgilityPack imported (the equivalent of using in C#).
Just add the HtmlAgilityPack prefix:
Dim url As String = "http://www.cttexpresso.pt/feapl_2/app/open/cttexpresso/objectSearch/objectSearch.jspx?objects=RH306695573PT"
Dim zHTML As New HtmlAgilityPack.HtmlDocument()
Dim Selector As String = "//table[@class='full-width']"
Dim webRequest = System.Net.HttpWebRequest.Create(url)
Dim stream As System.IO.Stream = webRequest.GetResponse().GetResponseStream()
zHTML.Load(stream)
stream.Close()
Dim OuterTbl As HtmlAgilityPack.HtmlNode = zHTML.DocumentNode.SelectSingleNode(Selector)
Dim InnerTbl As HtmlAgilityPack.HtmlNode = OuterTbl.Descendants("table")(0)
Literal1.Text = InnerTbl.OuterHtml
Thursday, January 10, 2019 11:51 AM -
User-909867351 posted
Hi
I already have HtmlAgilityPack imported:
using HtmlAgilityPack;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class lectt : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        String url = "http://www.cttexpresso.pt/feapl_2/app/open/cttexpresso/objectSearch/objectSearch.jspx?objects=RH306695573PT";
        HtmlAgilityPack.HtmlDocument zHTML = new HtmlAgilityPack.HtmlDocument();
        string Selector = "//table[@class='full-width']";
        var webRequest = System.Net.HttpWebRequest.Create(url);
        System.IO.Stream stream = webRequest.GetResponse().GetResponseStream();
        zHTML.Load(stream);
        stream.Close();
        HtmlNode OuterTbl = zHTML.DocumentNode.SelectSingleNode(Selector);
        HtmlNode InnerTbl = OuterTbl.Descendants("table")(0);
        Literal1.Text = InnerTbl.OuterHtml;
    }
}
Thursday, January 10, 2019 12:00 PM -
User-943250815 posted
mariolopes,
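Descendants returns an IEnumerable, and in C# the (0) after it is read as a method call rather than an index, which is why the compiler complains. You need to take the first item explicitly (ElementAt comes from System.Linq, which you already import).
So replace
HtmlNode InnerTbl = OuterTbl.Descendants("table")(0);
by
HtmlNode InnerTbl = OuterTbl.Descendants("table").ElementAt(0);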
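The fix can be tried without calling the tracking site. The sketch below is only an illustration: the inline HTML is a made-up stand-in for the real page, and FirstTableDemo, SampleHtml and FirstInnerTable are hypothetical names. It assumes the HtmlAgilityPack NuGet package is referenced, and it uses FirstOrDefault instead of ElementAt(0) so that a missing nested table gives null rather than an exception.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class FirstTableDemo
{
    // Made-up markup standing in for the real tracking page (assumption).
    public const string SampleHtml =
        "<table class='full-width'><tr><td>" +
        "<table id='inner'><tr><td>15:57</td><td>Entregue</td></tr></table>" +
        "</td></tr></table>";

    // Returns the first <table> nested inside the outer table, or null if none is found.
    public static HtmlNode FirstInnerTable(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode outer = doc.DocumentNode.SelectSingleNode("//table[@class='full-width']");
        if (outer == null)
        {
            return null;
        }

        // Descendants returns IEnumerable<HtmlNode>, which has no indexer in C#,
        // so the first element is taken through LINQ.
        return outer.Descendants("table").FirstOrDefault();
    }

    public static void Main()
    {
        HtmlNode inner = FirstInnerTable(SampleHtml);
        Console.WriteLine(inner == null ? "no nested table" : inner.OuterHtml);
    }
}
```

ElementAt(0) and First() behave the same on a non-empty sequence. The difference only shows when the page layout changes and no nested table exists: both throw, while FirstOrDefault returns null.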
- Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
Thursday, January 10, 2019 12:34 PM -
User-909867351 posted
Thank you
That solved my question. I'll keep working on this project because I want the resulting table to be responsive (with Bootstrap), and I think I have to create another (Bootstrap) table with the data from this one. I think that will be the best option!
I have to read each row of this table and create another one.
My problem is that the page has two tables with the same class, full-width, and I need to get only the last one.
Any ideas?
Thank you again
Thursday, January 10, 2019 12:57 PM -
User-943250815 posted
As I said, you can keep using Descendants to get the rows, and from the rows get the cells.
The hardest part is understanding correctly how the HTML was constructed and dealing with it.
The same applies to getting the last table in a single shot: you can query as you already do, or construct the full XPath to reach it. This is only needed because the page's programmer is not using IDs.
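For the last of several tables that share a class, one option is to select all matches and take the final one. This is a minimal sketch, not the only way to do it: the markup is invented for the example, LastTableDemo and LastFullWidthTable are hypothetical names, and it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

public static class LastTableDemo
{
    // Made-up markup with two tables sharing the same class (assumption).
    public const string SampleHtml =
        "<table class='full-width'><tr><td>search form</td></tr></table>" +
        "<table class='full-width'><tr><td>tracking events</td></tr></table>";

    // Returns the last table with class "full-width" in document order, or null if none exists.
    public static HtmlNode LastFullWidthTable(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table[@class='full-width']");
        if (tables == null)
        {
            return null;
        }

        // HtmlNodeCollection is a list, so it can be indexed directly.
        return tables[tables.Count - 1];
    }

    public static void Main()
    {
        HtmlNode last = LastFullWidthTable(SampleHtml);
        Console.WriteLine(last == null ? "no matching table" : last.InnerText);
    }
}
```

Note that @class='full-width' only matches when the attribute is exactly that string. If the real page adds other classes to the table, the XPath would need contains(@class, 'full-width') instead.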
To get Rows and Cells:
For Each Row In InnerTbl.Descendants("tr")
    For Each Cell In Row.Descendants("td")
        Dim CellValue As String = Cell.InnerText
    Next
Next
Thursday, January 10, 2019 1:29 PM
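The row and cell loop translates to C# in the same way, and the values can be written straight into a Bootstrap table. The sketch below is only one possible shape: BootstrapTableDemo and ToBootstrapTable are hypothetical names, the sample markup is invented, and it assumes the HtmlAgilityPack package plus Bootstrap's table, table-striped and table-responsive classes on the target page.

```csharp
using System;
using System.Net;
using System.Text;
using HtmlAgilityPack;

public static class BootstrapTableDemo
{
    // Made-up markup standing in for the scraped inner table (assumption).
    public const string SampleHtml =
        "<table><tr><td>15:57</td><td>Entregue</td></tr></table>";

    // Reads every row and cell of the first table in the HTML
    // and rebuilds them as a responsive Bootstrap table.
    public static string ToBootstrapTable(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode source = doc.DocumentNode.SelectSingleNode("//table");
        if (source == null)
        {
            return string.Empty;
        }

        var sb = new StringBuilder();
        sb.Append("<div class=\"table-responsive\"><table class=\"table table-striped\">");

        foreach (HtmlNode row in source.Descendants("tr"))
        {
            sb.Append("<tr>");
            // Elements only returns direct children, so cells of nested tables are not mixed in.
            foreach (HtmlNode cell in row.Elements("td"))
            {
                // Decode entities from the source, trim, then encode again for safe output.
                string text = HtmlEntity.DeEntitize(cell.InnerText).Trim();
                sb.Append("<td>").Append(WebUtility.HtmlEncode(text)).Append("</td>");
            }
            sb.Append("</tr>");
        }

        sb.Append("</table></div>");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToBootstrapTable(SampleHtml));
    }
}
```

In the Web Forms page from this thread, the returned string could be assigned to Literal1.Text. The sketch skips th cells and ignores colspan, so the date rows on the real page would need extra handling.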