browser automation

    Question

  •  

    Hi,
    I'd like to have a program that navigates to http://www.handelsblatt.com/News/def...ymbol=FLUK.NWX
    selects "Times and Sales" from the "Darstellung" menu, clicks "aktualisieren", and copies the new table to a file.
    I'm still hoping I can deal with most of the steps, but I have no clue how to select from the dropdown menu.
    I'm using VB.NET 2005 Express.
    I'd really appreciate any kind of help.
    Thank you!
    Wednesday, November 28, 2007 11:07 AM

Answers

  •  d.j.t wrote:

    And I'm not sure what you want to tell me with:

    e.g. Dim WithEvents Button1 As Button

    Then, at the top of the code view (e.g. Form1.vb), Button1 will be displayed in the Object Browser ComboBox, and all events corresponding to Button1 will be displayed in the Event Browser ComboBox.
    Do I need to insert this code even though I added a button?

    You said an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types". The error has to do with WithEvents, so I only gave that as an extra reference. You can ignore it.

    Back to the topic: please drag and drop a Button control named Button1 onto your Form.

    In that case you have to click the button to perform the tasks. That is indeed a restriction.

    OK, please adopt this idea instead: keep using the WebBrowser1_DocumentCompleted event, but add a Boolean variable as a switch to ensure the tasks are performed only once.

    Code Block

    Public Class Form1

        Dim march As Boolean  ' Switch that lets the tasks run only once

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            march = True  ' Initialize the switch as True

            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use the WebBrowser control to load the web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            ' Check the switch state
            If march = True Then
                ' Part 2: Automatically select the specified option from the ComboBox
                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
                For Each curElement As HtmlElement In theElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                        curElement.SetAttribute("Value", "0")
                    End If
                Next

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
                For Each curElement As HtmlElement In theWElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    ' Part 3: Automatically check the CheckBox
                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                        curElement.SetAttribute("Checked", "True")
                    ' Part 4: Automatically click the button
                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                        curElement.InvokeMember("click")
                    End If
                Next

                ' Save the page body as an .htm file
                Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
                w.Write(WebBrowser1.Document.Body.InnerHtml)
                w.Close()

                march = False  ' Task accomplished: turn the switch off
            End If
        End Sub

    End Class

    Wednesday, December 05, 2007 11:34 AM
  • Dominik: "What happens there is (while working fine most of the time) that SOMETIMES the first table is copied, the one that was displayed when first browsing to the page, before doing the selections and refreshing. So to me it seems as if the script doesn't wait for the DocumentCompleted event any more. But only sometimes! Sometimes the correct table is copied, sometimes not. I don't understand this! (Actually, I never fully understood the DocumentCompleted event thing.) The only explanation I have is that the old computer is too slow... I'm frustrated!"

    Hi Dominik,

    In Part 6 you are extracting the javascript immediately after automatically clicking the More button without waiting for the next webpage to load with new data:

    Code Snippet
    ' Part 6: Automatically click Continue link
    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
    For Each curElement As HtmlElement In hrefElementCollection
        Dim controlName As String = curElement.GetAttribute("id").ToString
        If controlName.Contains("LBtn_More") Then
            curElement.InvokeMember("Click")
        End If
    Next
    extract()


    The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new webpage has loaded. After clicking the button in Part 4 we have to wait for the next DocumentCompleted event, which tells us that the next webpage has loaded with new data. The same applies after clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

    Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub


    But the If statements need to be refined a bit because DocumentCompleted fires twice per page (once for the page banner and once for the default page containing the javascript data that we want):

    Code Snippet
    If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
    .
    .
    .
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then


    The second problem is that you are using a 12-hour clock without specifying a.m. or p.m. when generating the filename, so there is potential for overwriting old files or appending new data to an old file:

    Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")


    Use a 24-hour clock instead, with capital HH:

    Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
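
    To see the collision concretely, here is a minimal sketch. The module name, the MakeStamp helper and the two sample times are made up for this illustration and are not taken from Dominik's program:

```vbnet
Imports System
Imports System.Globalization

Module TimestampDemo
    ' Hypothetical helper: formats a timestamp for use in a file name.
    Function MakeStamp(ByVal t As DateTime, ByVal pattern As String) As String
        Return t.ToString(pattern, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim morning As New DateTime(2008, 1, 29, 3, 15, 0)    ' 03:15 a.m.
        Dim afternoon As New DateTime(2008, 1, 29, 15, 15, 0) ' 03:15 p.m.

        ' 12-hour "hh": both times produce the same string, so the file names collide.
        Console.WriteLine(MakeStamp(morning, "yyyyMMddhhmmss"))   ' 20080129031500
        Console.WriteLine(MakeStamp(afternoon, "yyyyMMddhhmmss")) ' 20080129031500

        ' 24-hour "HH": the two strings differ.
        Console.WriteLine(MakeStamp(morning, "yyyyMMddHHmmss"))   ' 20080129031500
        Console.WriteLine(MakeStamp(afternoon, "yyyyMMddHHmmss")) ' 20080129151500
    End Sub
End Module
```

    With "hh" a file written at 3:15 in the afternoon gets the same name as one written at 3:15 in the morning of the same day; with "HH" it does not.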


    The other bugs I pointed out were "features" that I had introduced myself when converting from VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.

    • Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.
    Tuesday, January 29, 2008 10:24 AM
  • > Is it really necessary to test e.Url.AbsoluteUri = ..., given that the URL stays the same throughout the whole procedure?

    It's essential because the URL DOESN'T stay the same throughout the whole procedure: the webpage contains a link to a banner page, and that page also triggers the procedure after it loads. I've added a MessageBox to show these two URLs. It's this double firing that causes the first table to be extracted in your script (i.e. the table we want to ignore).

    I've also added an If statement that returns when the banner URL completes (it's a bit neater than the If tests I wrote before).

    And I've added the Me.Close():

    Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted:  " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 Then
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub
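
    The effect of the URL test on the counter can be tried out without a WebBrowser control. The following is only a sketch: MainPageCounter and the banner address are invented for the illustration (the real banner URL is whatever the MessageBox shows), and the sequence of events is simulated by hand:

```vbnet
Imports System

Module LoadCounterDemo
    ' Hypothetical helper: counts DocumentCompleted events for the main
    ' page only and ignores every other URL (e.g. the banner page).
    Class MainPageCounter
        Private ReadOnly _mainUrl As String
        Private _count As Integer = 0

        Sub New(ByVal mainUrl As String)
            _mainUrl = mainUrl
        End Sub

        ' Returns 0 for URLs that are not the main page, otherwise the
        ' number of times the main page has completed so far.
        Function Register(ByVal completedUrl As String) As Integer
            If Not String.Equals(completedUrl, _mainUrl, StringComparison.Ordinal) Then
                Return 0
            End If
            _count += 1
            Return _count
        End Function
    End Class

    Sub Main()
        Dim mainUrl As String = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX"
        Dim bannerUrl As String = "http://example.com/banner" ' placeholder, not the real banner URL
        Dim counter As New MainPageCounter(mainUrl)

        ' Simulated event order: each page load fires once for the banner and once for the main page.
        Console.WriteLine(counter.Register(bannerUrl)) ' 0 -> ignore
        Console.WriteLine(counter.Register(mainUrl))   ' 1 -> first table: Parts 2 to 4
        Console.WriteLine(counter.Register(bannerUrl)) ' 0 -> ignore
        Console.WriteLine(counter.Register(mainUrl))   ' 2 -> new table: Part 5 and onwards
    End Sub
End Module
```

    Without the URL test the banner events would be counted too, and the second event would already be treated as the new table.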
    • Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.
    Wednesday, January 30, 2008 2:42 PM
  • I originally limited the document_completed count to 10 tables to avoid an endless loop in case there was a problem parsing the DateTime from the webpage (the document_completed < 11 test in the code below). Without that limit you'll have the cybercops after you for a suspected DoS attack.

    Here's the ultimate bug-free code (until you find the next one):

    Code Snippet
    Dim previous_last_datetime As DateTime

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted:  " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = seite) Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then
            previous_last_datetime = last_datetime
            Part5() ' Extract javascript and update last_datetime
            If previous_last_datetime > last_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub
    • Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.
    Friday, February 01, 2008 7:04 PM

All replies

  • Hi d.j.t,

    Your question relates to automated web testing.

    The website you mentioned is a German website.

    Here is an introduction to WatiN, a web application testing framework for .NET.

    It lets you emulate real users interacting with your website by automating Internet Explorer, which gives you an easy way to automate tests.

    http://blogs.charteris.com/blogs/edwardw/archive/2007/07/16/watin-web-application-testing-in-net-introduction.aspx

    http://watin.sourceforge.net/

    See the documents above for the main ideas of web automation testing.

    Basic features:

      • Automates all major HTML elements
      • Find elements by multiple attributes

      How to locate elements

      Creating test scripts in most cases involves finding an HTML element and either causing it to fire an event, setting its value, or asserting its expected value.

      In order to perform an action against an element you must first obtain a reference to it. This can be done in three different ways:

      1. By the element's id (if it has one)
      2. By a regular expression that matches the element's id
      3. By its class attribute

      Regards,

      Martin

    Friday, November 30, 2007 8:53 AM
  • Hi d.j.t,

     

    I think I have worked it out.

    We can locate and access the elements of a webpage loaded in a WebBrowser control. In your case, you want to select an option from a ComboBox, check a CheckBox and click a Button.

    1. The Darstellung ComboBox element and the Times & Sales option:

    <SELECT class=wp1-input id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_DD_Step name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step> <OPTION value=0 selected>Times &amp; Sales</OPTION>

    2. The Kapitalmaßnahmen einbeziehen Checkbox element:

    <INPUT id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_CBx_CapitalMeasures type=checkbox name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures>

    3. The Aktualisieren Button element:

    <INPUT id=ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_IBtn_Refresh1 title=Aktualisieren type=image alt=Aktualisieren src="http://bc2.handelsblatt.com/hbi/images/wp1/wp1_refresh.gif" align=right name=ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1>

    This code performs the above steps automatically:

    Code Block

    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use the WebBrowser control to load the web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            ' Part 2: Automatically select the specified option from the ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", "0")
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", "True")
                ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The button element exposes a click method, which we invoke here.
                    curElement.InvokeMember("click")
                End If
            Next
        End Sub

    End Class

     

     

    Similar issue: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2456794&SiteID=1

     

    Best regards,

    Martin

    Monday, December 03, 2007 4:00 AM
  •  d.j.t wrote:

    ... and copies the new table to a file.

     

    To achieve the task, here are two suggestions:

    1.

    Code Block

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' After automatically clicking the button,
        ' append the following code to save the webpage as an .htm file
        Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
        w.Write(WebBrowser1.Document.Body.InnerHtml)
        w.Close()
    End Sub

     

    2. Check this thread for details: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2468541&SiteID=1

    You need to use Add Reference... -> COM tab, find "Microsoft CDO For Windows 2000 Library" and "Microsoft ActiveX Data Objects 2.5 Library", and add them to your project.

    Code Block

    Imports ADODB
    Imports CDO

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' After automatically clicking the button,
        ' append the following code to save the webpage as an .mht file
        SavePage(WebBrowser1.Url.ToString(), "c:\table.mht")
    End Sub

    Private Sub SavePage(ByVal Url As String, ByVal FilePath As String)
        ' Build an MHTML message from the page at the given URL
        Dim iMessage As CDO.Message = New CDO.Message
        iMessage.CreateMHTMLBody(Url, CDO.CdoMHTMLFlags.cdoSuppressObjects, "", "")

        ' Write the message to a text stream and save it to disk
        Dim adodbstream As ADODB.Stream = New ADODB.Stream
        adodbstream.Type = ADODB.StreamTypeEnum.adTypeText
        adodbstream.Charset = "US-ASCII"
        adodbstream.Open()
        iMessage.DataSource.SaveToObject(adodbstream, "_Stream")
        adodbstream.SaveToFile(FilePath, ADODB.SaveOptionsEnum.adSaveCreateOverWrite)
        adodbstream.Close()
    End Sub

     

    Monday, December 03, 2007 4:34 AM
  • Hi Martin
    Your first reply is great! Thanks a lot!


    1. I just have one problem with the first task: when executing, the selection of the combo box and check box works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a WebBrowser element from the toolbox in Form1.)

    2. With the extraction I unfortunately had problems too:
    "'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures"
    Naming the Private Sub "WebBrowser1_DocumentCompleted2" worked; I hope I can just do that...
    But anyway, this only helped with the first solution, which only creates an HTML file of the complete website (or at least parts of it). I need something that I can easily import into a database, such as .txt (with the cells separated by tabs and line breaks) or .xls.
    So I tried the second solution (not really knowing what the output would be in that case, maybe more or less the same), but after renaming the sub there was still the error: "Value of type 'System.Uri' cannot be converted to 'string'"
    But if the exported file will contain more than the pure table data (as I expect), that problem doesn't really matter.

    If you have an idea how to deal with one of these problems, especially the first, I'd appreciate it if you could post it.

    My project has made enormous progress thanks to you!
    Monday, December 03, 2007 2:59 PM
  • Hi d.j.t,

     

    1. "'Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs)' has multiple definitions with identical signatures"
    Naming the Private Sub "WebBrowser1_DocumentCompleted2" worked; I hope I can just do that...

    -> You should place both parts of the code (the automation part and the save-page part) in the single WebBrowser1_DocumentCompleted event handler. Don't rename it to WebBrowser1_DocumentCompleted2.

    Code Block

    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use the WebBrowser control to load the web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            ' Part 2: Automatically select the specified option from the ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", "0")
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", "True")
                ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The button element exposes a click method, which we invoke here.
                    curElement.InvokeMember("click")
                End If
            Next

            ' After automatically clicking the button,
            ' save the webpage as an .htm file
            Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            w.Write(WebBrowser1.Document.Body.InnerHtml)
            w.Close()
        End Sub

    End Class

     

    2. So I tried the second solution (not really knowing what the output would be in that case, maybe more or less the same), but after renaming the sub there was still the error: "Value of type 'System.Uri' cannot be converted to 'string'"
    -> Please change it to WebBrowser1.Url.ToString. I have modified my third post.

        This solution saves the entire web page as an .mht file containing all text and images. It does not seem to be what you are looking for.

    Tuesday, December 04, 2007 2:41 AM
  • 3. I just have one problem with the first task: when executing, the selection of the combo box and check box works perfectly fine, but the "aktualisieren" button is clicked endlessly. I'd like to stop that. (I used a WebBrowser element from the toolbox in Form1.)
    -> Cause: Clicking the button to retrieve data refreshes and reloads the current page, so the WebBrowser1_DocumentCompleted event fires again each time, and the handler clicks the button again.

    Solution: You can place that code in the Button1_Click event.

    Code Block

    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use the WebBrowser control to load the web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            MessageBox.Show("Complete loading webpage") ' Optional code
        End Sub

        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            ' Part 2: Automatically select the specified option from the ComboBox
            Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
            For Each curElement As HtmlElement In theElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then
                    curElement.SetAttribute("Value", "0")
                End If
            Next

            Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
            For Each curElement As HtmlElement In theWElementCollection
                Dim controlName As String = curElement.GetAttribute("name").ToString
                ' Part 3: Automatically check the CheckBox
                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then
                    curElement.SetAttribute("Checked", "True")
                ' Part 4: Automatically click the button
                ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then
                    ' The button element exposes a click method, which we invoke here.
                    curElement.InvokeMember("click")
                End If
            Next

            ' Save the page body as an .htm file
            Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")
            w.Write(WebBrowser1.Document.Body.InnerHtml)
            w.Close()
        End Sub

    End Class

     

    4. But I need something that I can easily import into a database, such as .txt (with the cells separated by tabs and line breaks) or .xls.

    But if the exported file will contain more than the pure table data (as I expect), that problem doesn't really matter.
    -> You need to retrieve the part of the HTML code (<Table>...</Table>) that contains the table data. Here are some references:

    1) Using the HTML Parser to parse HTML code

       http://www.developer.com/net/csharp/article.php/10918_2230091_2

    2) See the similar issue; you can use regular expressions to extract part of the HTML code.

       .NET Development » Regular Expressions Forum

     

    I'm glad to hear that you have made enormous progress. Cheers!

    Best regards,

    Martin

    Tuesday, December 04, 2007 3:47 AM
  • Hi Martin
    I tried to use the Button1_Click event, but an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types"
    Nevertheless, when executing it, the same endless clicking of the refresh button happened...
    Thanks for your efforts!
    Dominik
    Wednesday, December 05, 2007 9:56 AM
  • I'm just working on the extraction.
    - The first link is related to C#... can I just change the language?
    - The similar issue seems to be exactly what I want, but no complete code is provided.
    - The regular expressions thing (I apologize for this noob question): what is that?
    Dominik
    Wednesday, December 05, 2007 10:32 AM
  •  d.j.t wrote:
    Hi Martin
    I tried to use the Button1_Click event, but an error occurred: "Handles clause requires a WithEvents variable defined in the containing type or one of its base types"

    Please drag and drop a Button control named Button1 directly onto your Form.

    Reference: WithEvents keyword

    http://msdn2.microsoft.com/en-us/library/aty3352y(VS.80).aspx

    Specifies that one or more declared member variables refer to an instance of a class that can raise events.

    e.g. Dim WithEvents Button1 As Button

        Then, at the top of the code view (e.g. Form1.vb), Button1 will be displayed in the Object Browser ComboBox, and all events corresponding to Button1 will be displayed in the Event Browser ComboBox.
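
    To show what WithEvents does without needing a form, here is a minimal console sketch. The Bell and Listener classes are invented for the illustration; Bell merely stands in for a control such as Button:

```vbnet
Imports System

Module WithEventsDemo
    ' Hypothetical stand-in for a control: a class that can raise an event.
    Class Bell
        Public Event Rang(ByVal sender As Object, ByVal e As EventArgs)

        Public Sub Ring()
            RaiseEvent Rang(Me, EventArgs.Empty)
        End Sub
    End Class

    Class Listener
        ' WithEvents is what allows Bell1 to appear in a Handles clause.
        ' Without it the compiler reports "Handles clause requires a WithEvents variable ...".
        Public WithEvents Bell1 As New Bell()
        Public RingCount As Integer = 0

        Private Sub Bell1_Rang(ByVal sender As Object, ByVal e As EventArgs) Handles Bell1.Rang
            RingCount += 1
        End Sub
    End Class

    Sub Main()
        Dim l As New Listener()
        l.Bell1.Ring()
        l.Bell1.Ring()
        Console.WriteLine(l.RingCount) ' 2
    End Sub
End Module
```

    When a Button is dropped onto a form, the designer generates the corresponding WithEvents declaration, which is why the Handles Button1.Click clause compiles once the control exists.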
    Wednesday, December 05, 2007 10:38 AM
  • Well, I could have known it had something to do with a button on the form... sorry :-/
    But now I'm really confused, because now I have to click the button to perform the tasks.
    And I'm not sure what you want to tell me with:

    e.g. Dim WithEvents Button1 As Button

        Then, at the top of the code view (e.g. Form1.vb), Button1 will be displayed in the Object Browser ComboBox, and all events corresponding to Button1 will be displayed in the Event Browser ComboBox.


    Do I need to insert this code even though I added a button?

    Is there a possibility of solving the repetition problem by adding something like the following (in plain English) to the code you first recommended?
    "and if the value of the combobox is not equal to 0"
    Wednesday, December 05, 2007 11:01 AM
  • Thank you! That's exactly what I was trying to do (but lack of experience prevented me from doing so)! First task accomplished!

    So there remains the second task of extracting the table. Even though, after you helped me so much, I'm a bit embarrassed to ask: did you see my questions concerning your links (regarding extraction) (Tuesday, 10:32 PM)?

     

    Wednesday, December 05, 2007 3:47 PM
  •  d.j.t wrote:
    I'm just working on the extraction.
    - The first link is related to C# ... can I just change the language?
    - The similar issue seems to be exactly what I want, but there is no complete code provided.
    - The regular expressions thing (I apologize for this noob question): what is that?
    dominik


    Yes, I see the second task of extracting the table.

    Regular expressions can be used to extract parts of the HTML code.

    You need to import the System.Text.RegularExpressions namespace (Imports System.Text.RegularExpressions).

    I suggest posting this task to the Regular Expressions forum for quicker and better responses.

       .NET Development » Regular Expressions Forum

    Please remember to point out the html page:
    http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX

    Also point out the table you want to extract data from, as below:

    Code Block
    <TABLE cellSpacing=0 cellPadding=0 width="100%" border=0>

    <TBODY>

    <TR>

    <TH class=wp1-header colSpan=6>Historische Daten </TH></TR>

    <TR>

    <TH class=wp1-header>Datum</TH>

    <TH class=wp1-header>Eröffnung</TH>

    <TH class=wp1-header>Hoch</TH>

    <TH class=wp1-header>Tief</TH>

    <TH class=wp1-header>Schluss</TH>

    <TH class=wp1-header>Volumen</TH></TR>

    <TR>

    <TD class=wp1-line1 align=middle>05.12.07 15:23</TD>

    <TD class=wp1-line1 align=right>57,60</TD>

    <TD class=wp1-line1 align=right>59,90</TD>

    <TD class=wp1-line1 align=right>57,60</TD>

    <TD class=wp1-line1 align=right>59,90</TD>

    <TD class=wp1-line1 align=right>3.753</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>04.12.07 18:29</TD>

    <TD class=wp1-line2 align=right>57,90</TD>

    <TD class=wp1-line2 align=right>58,10</TD>

    <TD class=wp1-line2 align=right>57,27</TD>

    <TD class=wp1-line2 align=right>57,50</TD>

    <TD class=wp1-line2 align=right>4.730</TD>

    <TR>

    <TD class=wp1-line1 align=middle>03.12.07 18:57</TD>

    <TD class=wp1-line1 align=right>58,50</TD>

    <TD class=wp1-line1 align=right>58,75</TD>

    <TD class=wp1-line1 align=right>57,39</TD>

    <TD class=wp1-line1 align=right>57,85</TD>

    <TD class=wp1-line1 align=right>10.219</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>30.11.07 14:43</TD>

    <TD class=wp1-line2 align=right>57,95</TD>

    <TD class=wp1-line2 align=right>58,75</TD>

    <TD class=wp1-line2 align=right>57,95</TD>

    <TD class=wp1-line2 align=right>58,46</TD>

    <TD class=wp1-line2 align=right>12.249</TD>

    <TR>

    <TD class=wp1-line1 align=middle>29.11.07 14:52</TD>

    <TD class=wp1-line1 align=right>58,45</TD>

    <TD class=wp1-line1 align=right>58,75</TD>

    <TD class=wp1-line1 align=right>58,00</TD>

    <TD class=wp1-line1 align=right>58,00</TD>

    <TD class=wp1-line1 align=right>1.532</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>28.11.07 14:17</TD>

    <TD class=wp1-line2 align=right>57,70</TD>

    <TD class=wp1-line2 align=right>58,23</TD>

    <TD class=wp1-line2 align=right>57,58</TD>

    <TD class=wp1-line2 align=right>58,23</TD>

    <TD class=wp1-line2 align=right>1.540</TD>

    <TR>

    <TD class=wp1-line1 align=middle>27.11.07 16:08</TD>

    <TD class=wp1-line1 align=right>58,60</TD>

    <TD class=wp1-line1 align=right>58,92</TD>

    <TD class=wp1-line1 align=right>57,30</TD>

    <TD class=wp1-line1 align=right>57,60</TD>

    <TD class=wp1-line1 align=right>7.683</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>26.11.07 14:09</TD>

    <TD class=wp1-line2 align=right>58,30</TD>

    <TD class=wp1-line2 align=right>59,00</TD>

    <TD class=wp1-line2 align=right>58,30</TD>

    <TD class=wp1-line2 align=right>58,90</TD>

    <TD class=wp1-line2 align=right>5.321</TD>

    <TR>

    <TD class=wp1-line1 align=middle>23.11.07 19:10</TD>

    <TD class=wp1-line1 align=right>57,15</TD>

    <TD class=wp1-line1 align=right>57,74</TD>

    <TD class=wp1-line1 align=right>57,15</TD>

    <TD class=wp1-line1 align=right>57,50</TD>

    <TD class=wp1-line1 align=right>8.880</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>22.11.07 19:48</TD>

    <TD class=wp1-line2 align=right>57,60</TD>

    <TD class=wp1-line2 align=right>57,60</TD>

    <TD class=wp1-line2 align=right>56,51</TD>

    <TD class=wp1-line2 align=right>56,51</TD>

    <TD class=wp1-line2 align=right>9.393</TD>

    <TR>

    <TD class=wp1-line1 align=middle>21.11.07 19:23</TD>

    <TD class=wp1-line1 align=right>58,30</TD>

    <TD class=wp1-line1 align=right>58,80</TD>

    <TD class=wp1-line1 align=right>56,90</TD>

    <TD class=wp1-line1 align=right>57,00</TD>

    <TD class=wp1-line1 align=right>7.971</TD></TR>

    <TR>

    <TD class=wp1-line2 align=middle>20.11.07 15:12</TD>

    <TD class=wp1-line2 align=right>58,05</TD>

    <TD class=wp1-line2 align=right>58,80</TD>

    <TD class=wp1-line2 align=right>57,07</TD>

    <TD class=wp1-line2 align=right>58,80</TD>

    <TD class=wp1-line2 align=right>5.601</TD>

    <TR>

    <TD class=wp1-line1 align=middle>19.11.07 15:23</TD>

    <TD class=wp1-line1 align=right>58,70</TD>

    <TD class=wp1-line1 align=right>59,35</TD>

    <TD class=wp1-line1 align=right>57,60</TD>

    <TD class=wp1-line1 align=right>57,95</TD>

    <TD class=wp1-line1 align=right>6.562</TD>

    </TR>

    </TBODY>

    </TABLE>
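
As a rough illustration of what the regular expressions approach could look like for a table of this shape, here is a minimal sketch. The module name TableSketch, the function ExtractRows and both patterns are assumptions made for illustration, not code from the Regular Expressions forum. It only handles simple <TD ...>text</TD> cells like the ones above and skips the <TH> header rows.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TableSketch
    ' Hypothetical helper: returns one String array per table row that contains <TD> cells.
    Function ExtractRows(ByVal html As String) As List(Of String())
        Dim rows As New List(Of String())
        ' Split on the opening <TR> tag so rows without a closing </TR> are still handled.
        For Each chunk As String In Regex.Split(html, "<TR[^>]*>", RegexOptions.IgnoreCase)
            Dim cells As New List(Of String)
            For Each m As Match In Regex.Matches(chunk, "<TD[^>]*>(.*?)</TD>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
                cells.Add(m.Groups(1).Value.Trim())
            Next
            ' Header rows (<TH> only) produce no cells and are skipped.
            If cells.Count > 0 Then rows.Add(cells.ToArray())
        Next
        Return rows
    End Function

    Sub Main()
        ' Sample input copied from the first data row of the table above.
        Dim html As String = "<TR><TD class=wp1-line1 align=middle>05.12.07 15:23</TD>" & _
                             "<TD class=wp1-line1 align=right>57,60</TD></TR>"
        For Each row As String() In ExtractRows(html)
            Console.WriteLine(String.Join(vbTab, row))
        Next
    End Sub
End Module
```

In the form, the input could be WebBrowser1.Document.Body.InnerHtml; regular expressions are fragile against markup changes, so treat this as a starting point only.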


    By the way, you can convert C# code to VB.NET code with this Code Translator tool.


    Thursday, December 06, 2007 3:16 AM
  • Hi Martin!
    Well, there is one last question (even though others might follow :-) that fits in this topic: how do I click the "weiter" button at the bottom of the table? I tried to do it the same way as clicking "refresh":

      _________________________________________________________________________

    Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

                For Each curElement As HtmlElement In theWElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    

                        'Part 4: Automatically click the button

                If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

                        curElement.InvokeMember("click")                                                             


    I tried to find the tag name and the attribute for the "weiter" link, but it didn't work with what I found: "a" instead of "input" and "id" instead of "name".
    </td>
    	<td align="right"><a id="ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" class="wp1-more" href="javascript:__doPostBack('ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$LBtn_More','')">Weiter&gt;&gt;</a></td>
    </tr>
    Once more I hope you can provide help. 
    Thanks Dominik
    Thursday, December 06, 2007 12:13 PM
  • The following is the complete code.

    Please check Part 5: Automatically click Continue link. ("weiter" translates to "Continue".)

    Code Block

    Public Class Form1

        Dim march As Boolean  ' Switch: True until the tasks have run once

     

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

            march = True  ' Initialize the switch as True

            WebBrowser1.Dock = DockStyle.Fill

            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use WebBrowser control to load web page

            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")

        End Sub

     

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

            ' Check the switch state

            If march = True Then

                'Part 2: Automatically select specified option from ComboBox

                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

                For Each curElement As HtmlElement In theElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

                        curElement.SetAttribute("Value", 0)

                    End If

                Next

     

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

                For Each curElement As HtmlElement In theWElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    'Part 3: Automatically check the CheckBox

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

                        curElement.SetAttribute("Checked", True)

                        'Part 4: Automatically click the button

                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

                        curElement.InvokeMember("click")

                    End If

                Next

     

                'Dim w As IO.StreamWriter = New IO.StreamWriter("C:\Table.htm")

                'w.Write(WebBrowser1.Document.Body.InnerHtml)

                'w.Close()

     

                march = False  ' If accomplish the task, change the switch to False.

     

            Else   ' If march = False, don't need to perform above tasks, directly continue to click "Continue" link.

                'Part 5: Automatically click Continue link

                Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

                For Each curElement As HtmlElement In hrefElementCollection

                    Dim controlName As String = curElement.GetAttribute("id").ToString

                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

                        curElement.InvokeMember("Click")

                    End If

                Next

            End If

        End Sub

    End Class
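
Since the link has an id, a shorter way to reach it may be HtmlDocument.GetElementById, which avoids looping over every "a" element. This is only a sketch under the assumption that the id stays the same between page loads; the helper name ClickById is hypothetical.

```vbnet
Imports System.Windows.Forms

Module ClickSketch
    ' Hypothetical helper: clicks the element with the given id, if it exists.
    ' Returns True when the element was found and the click was invoked.
    Function ClickById(ByVal doc As HtmlDocument, ByVal elementId As String) As Boolean
        If doc Is Nothing Then Return False
        Dim element As HtmlElement = doc.GetElementById(elementId)
        If element Is Nothing Then Return False
        element.InvokeMember("click")
        Return True
    End Function
End Module
```

Inside WebBrowser1_DocumentCompleted it could be called as ClickById(WebBrowser1.Document, "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More"); the return value tells you whether a "weiter" link was present on the page at all.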

     

     

    Friday, December 07, 2007 3:22 AM
  • Hi Martin,

    Thanks for the reference to the other forum, it was quite useful: somebody there could provide assistance!

    I have to extend the question above:

    This program is meant to be launched each day to copy the data, but due to holidays that won't always be possible. And sometimes the data doesn't all fit onto one page (the tables on the site are limited to 100 rows). That's why I am thinking about a loop in the final part: after selecting, refreshing and copying, I'd like to have the "weiter" (next page) link clicked and the copying done again and again until a certain past date appears in the table.

    Like this:

    1. Do selections and refresh.
    2. Extract.
    3. Click "weiter" (next page) (so far my question above) IF THE LAST DATE IN THE TABLE IS NOT MORE THAN x DAYS AGO (click link if: last_date_in_table > todays_date - x).
    4. Then go back to step 2.

    I'd be fine if x could be a variable, selected in a form when starting the program, but that should be rather easy then.

    Thanks for your commitment

    Dominik


    edit: i just noticed your answer to my last question. many thanks!
    Friday, December 07, 2007 1:33 PM
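
The stopping rule from step 3 of the post above (click the link only if last_date_in_table > todays_date - x) can be sketched as a small function. The names PagingSketch and ShouldLoadNextPage are hypothetical, and the date format "dd.MM.yy HH:mm" is an assumption based on rows such as "05.12.07 15:23" in the table shown earlier; a different view of the page may use a different format.

```vbnet
Imports System
Imports System.Globalization

Module PagingSketch
    ' Hypothetical helper: True while the oldest row on the current page is still
    ' newer than (currentDay - maxAgeDays), i.e. the next page should be loaded.
    Function ShouldLoadNextPage(ByVal lastDateInTable As DateTime, ByVal currentDay As DateTime, ByVal maxAgeDays As Integer) As Boolean
        Return lastDateInTable > currentDay.AddDays(-maxAgeDays)
    End Function

    Sub Main()
        ' Assumed format, taken from the sample table rows.
        Dim lastDate As DateTime = DateTime.ParseExact("05.12.07 15:23", "dd.MM.yy HH:mm", CultureInfo.InvariantCulture)
        Dim currentDay As DateTime = New DateTime(2007, 12, 7)
        Console.WriteLine(ShouldLoadNextPage(lastDate, currentDay, 5))
        Console.WriteLine(ShouldLoadNextPage(lastDate, currentDay, 1))
    End Sub
End Module
```

In the real program currentDay would be DateTime.Today and maxAgeDays the x chosen in the form at startup.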
  • Hi, with that code (thanks for it) the repetition at the end is happening again. I introduced a second switch and changed the final part to avoid this:

            Else   

                If marchb = True Then

                    'Part 5: Automatically click Continue link

                    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

                    For Each curElement As HtmlElement In hrefElementCollection

                        Dim controlName As String = curElement.GetAttribute("id").ToString

                        If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

                             curElement.InvokeMember("Click")

                           

                         End If

    'insert extraction once again

     marchb = False  ' missing: if date as specified

                    Next

                End If
           End If

        End Sub 

    End Class


    The task with the Date remains.
    I really appreciate your advice!
    Friday, December 07, 2007 2:08 PM
  • Hi martin,

    - At the regular expressions forum I was given a lot of help, but one problem remains: I inserted the extraction where I had planned it, but it seems it happens too fast: the extracted table is the one displayed before refreshing. I hoped a pause of a few seconds, or another switch set after the new table is completely loaded, would do the trick, but my attempts have not been successful yet.

    - And another little thing: up to now the extracted table is saved to a "fix-named" file. As this program will run often, I'd like to have a changing date component and (for several pages a day) a counter in the filename.

    This is the complete code:

    Hi, OK, now I am puzzled once more: I finally tried the exporting, but it exported the first table, the one that is displayed before the selection from the comboboxes is done (but I need the table that is displayed after the combobox selection). What's wrong? Please have a look at my complete code. Thank you:


    Imports System.IO

    Imports System.Text.RegularExpressions

    Public Class Form1

     

        Dim lastDate As DateTime

        Dim marchb As Boolean

     

        Dim march As Boolean  ' Switch: True until the tasks have run once

     

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

     

            march = True  ' Initialize the switch as True

            marchb = True

     

     

            WebBrowser1.Dock = DockStyle.Fill

     

            Me.WindowState = FormWindowState.Maximized

     

            ' Part 1: Use WebBrowser control to load web page   

     

            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")

     

        End Sub


        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

     

            ' Check the switch state

     

            If march = True Then

     

                'Part 2: Automatically select specified option from ComboBox

     

                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

     

                For Each curElement As HtmlElement In theElementCollection

     

                    Dim controlName As String = curElement.GetAttribute("name").ToString

     

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

     

                        curElement.SetAttribute("Value", 0)

     

                    End If

     

                Next

     

     

                'Part 2,5: Automatically select specified option from ComboBox

     

                Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

     

                For Each curElement As HtmlElement In the2ElementCollection

     

                    Dim controlName As String = curElement.GetAttribute("name").ToString

     

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then

     

                        curElement.SetAttribute("Value", 100)

     

                    End If

     

                Next

     

     

     

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

     

                For Each curElement As HtmlElement In theWElementCollection

     

                    Dim controlName As String = curElement.GetAttribute("name").ToString

     

                    'Part 3: Automatically check the CheckBox

     

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

     

                        curElement.SetAttribute("Checked", True)

     

                        'Part 4: Automatically click the button

     

                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

     

                        curElement.InvokeMember("click")

     

                    End If

     

                Next

     

                'part 5 export

                'java skript

     

                Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

     

                Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

     

     

     

                For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

     

                    rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

     

                Next

     

     

                ' export to txt

                march = False  ' If accomplish the task, change the switch to False.

                lastDate = Nothing

     

                Dim lastDateStr As String = Nothing

     

                Dim separator As String = vbTab

     

                Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export.txt")

     

                    For Each row As String() In rows

     

                        sw.WriteLine(String.Join(separator, row))

                        lastDateStr = row(0)

     

                    Next

     

                End Using

     

     

                If lastDateStr IsNot Nothing Then

     

                    lastDate = DateTime.Parse(lastDateStr)

     

                End If

     

     

     

            Else   ' If march = False, don't need to perform above tasks, directly click Continue link.

     

                If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - dont think that already works

     

                    'Part 6 Automatically click Continue link

                    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

     

                    For Each curElement As HtmlElement In hrefElementCollection

     

                        Dim controlName As String = curElement.GetAttribute("id").ToString

     

                        If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

     

                            curElement.InvokeMember("Click")

                            'extract again... yet to be inserted

     

                        End If

                        marchb = False

                    Next

                End If

     

     

            End If

     

        End Sub

     

    End Class


    Wednesday, December 12, 2007 9:17 AM
  • Hi d.j.t, 

     

    Welcome back!

    I'm glad to hear that you got much help from Regular Expressions forum.

     

    "but it seems it happens too fast: the extracted table is the one displayed before refreshing."

    ->         'Delay 2 seconds

                System.Threading.Thread.Sleep(2000) 

                'Call sub to extract

                ExportTableData()           

    "And another little thing: up to now the extracted table is saved to a "fix-named" file. As this program will run often, I'd like to have a changing date component and (for several pages a day) a counter in the filename."

     

    ->    'Add the current DateTime to the file name to identify it
          '"MM" is the month and "HH" the 24-hour clock; lowercase "mm" means minutes

            Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")

            Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")
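
For the counter part of the question, one option is to build the file name from the run date plus a page number that is increased after every exported page. A minimal sketch follows; the helper name BuildExportPath, the name pattern and the folder "C:\Export" are all assumptions for illustration.

```vbnet
Imports System
Imports System.IO

Module FileNameSketch
    ' Hypothetical helper: builds a name like "export_20071213_003.txt" in the given folder.
    Function BuildExportPath(ByVal folder As String, ByVal runDate As DateTime, ByVal pageNumber As Integer) As String
        ' "MM" is the month; lowercase "mm" would be the minutes.
        Dim name As String = String.Format("export_{0:yyyyMMdd}_{1:000}.txt", runDate, pageNumber)
        Return Path.Combine(folder, name)
    End Function

    Sub Main()
        ' Assumed folder; replace it with the real export directory.
        Console.WriteLine(BuildExportPath("C:\Export", New DateTime(2007, 12, 13), 3))
    End Sub
End Module
```

A class-level Integer in Form1, incremented after each export, could serve as pageNumber, so several pages on the same day never overwrite each other.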

    Thursday, December 13, 2007 2:58 AM
  • This is the complete code. The modified parts are marked in bold.

    Code Block

    Imports System.IO

    Imports System.Text.RegularExpressions

     

    Public Class Form1

        Dim lastDate As DateTime

        Dim marchb As Boolean

        Dim march As Boolean  ' Set a switch

     

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

            march = True  ' Initialize the switch as True

            marchb = True

            WebBrowser1.Dock = DockStyle.Fill

            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use WebBrowser control to load web page   

            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")

        End Sub

     

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

     

            ' Check the switch state

            If march = True Then

                'Part 2: Automatically select specified option from ComboBox

                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

                For Each curElement As HtmlElement In theElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

                        curElement.SetAttribute("Value", 0)

                    End If

                Next

     

                'Part 2,5: Automatically select specified option from ComboBox

                Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

     

                For Each curElement As HtmlElement In the2ElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then

                        curElement.SetAttribute("Value", 100)

                    End If

                Next

     

     

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

                For Each curElement As HtmlElement In theWElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    'Part 3: Automatically check the CheckBox

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

                        curElement.SetAttribute("Checked", True)

                        'Part 4: Automatically click the button

                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

                        curElement.InvokeMember("click")

                    End If

                Next

                march = False  ' If accomplish the task, change the switch to False.

     

                'Delay 2 seconds

                System.Threading.Thread.Sleep(2000)

                'Call sub to extract

                ExportTableData()

     

            Else   ' If march = False, don't need to perform above tasks, directly click Continue link.

                If marchb = True And lastDate = Today.AddDays(1) Then ' something like that - dont think that already works

     

                    'Part 6 Automatically click Continue link

                    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

                    For Each curElement As HtmlElement In hrefElementCollection

                        Dim controlName As String = curElement.GetAttribute("id").ToString

                        If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

                            curElement.InvokeMember("Click")

     

                            'Delay 2 seconds

                            System.Threading.Thread.Sleep(2000)

                            'Call sub to extract again

                            ExportTableData()

     

                        End If

                        marchb = False

                    Next

                End If

            End If

        End Sub

    ' To be continued...

     

     

    Thursday, December 13, 2007 3:10 AM
  • Code Block

    ' Continued

     

    ' The extraction code is moved into its own method so it can be called from several places.

        Public Sub ExportTableData()

            'part 5 export

            'java script

            Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

                rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

            Next

     

            ' export to txt

            lastDate = Nothing

            Dim lastDateStr As String = Nothing

            Dim separator As String = vbTab

     

            'Add the current DateTime to the file name ("MM" = month, "HH" = 24-hour clock; "mm" means minutes)

            Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")

            Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")

                For Each row As String() In rows

                    sw.WriteLine(String.Join(separator, row))

                    lastDateStr = row(0)

                Next

            End Using

     

            If lastDateStr IsNot Nothing Then

                lastDate = DateTime.Parse(lastDateStr)

            End If

        End Sub

     

    End Class

     

     

    Thursday, December 13, 2007 3:13 AM
  •  

    Thanks for all those answers!!!! Just Great! i hope that with this i can finally finish my task! Loads of thanks!
    Thursday, December 13, 2007 10:58 AM
  • Hi Martin,

    Finally I have complete working code doing exactly what I want. Big thanks to you! I still have some questions, but they are mere "cosmetics".

    - With that code the first table is copied twice. I don't really understand why...

    - Can it easily be arranged that the user doesn't notice anything of the script's execution once it is started? I mean no window, no sounds...

    - I'd like the program to be used not only for one stock but for several (up to 100). So I could just change the address in the first sub and create an executable program for each stock, then write a few lines that make all those programs run. I think this should even be possible at the same time?
    Well, of course it would be more elegant if I didn't need to create so many single programs. Is there a conveniently easy way to do this in the script?

    Thanks! Dominik

    PS: Script in next post... can't post it in color (don't ask me why, the forum always refuses to accept it (unknown error)).

     



    Friday, December 14, 2007 1:23 PM
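
A possible explanation for the duplicate, judging from the script posted below: in the same DocumentCompleted call that runs extract() for the first time and sets marchc = False, the following "If marchc = False And lastDate > ..." block also runs. It clicks "weiter" and immediately calls extract() again while the first page is still in the browser. One way to avoid this overlap is to keep a single step variable and perform exactly one action per DocumentCompleted call. The sketch below shows only that decision logic; the ScrapeStep enum and the Advance function are hypothetical names, and wiring them to the WebBrowser control is left out.

```vbnet
Imports System

Module StepSketch
    Enum ScrapeStep
        SelectAndRefresh   ' first load: set the options and click refresh
        ExtractPage        ' a result page has loaded: extract it
        Done               ' cutoff date reached: stop
    End Enum

    ' Hypothetical helper: returns the step to run on the NEXT DocumentCompleted call.
    ' For ExtractPage the caller extracts first and passes the last date it found.
    Function Advance(ByVal current As ScrapeStep, ByVal lastDate As DateTime, ByVal cutoff As DateTime) As ScrapeStep
        Select Case current
            Case ScrapeStep.SelectAndRefresh
                ' Nothing is extracted here; the refreshed page arrives with the next event.
                Return ScrapeStep.ExtractPage
            Case ScrapeStep.ExtractPage
                If lastDate > cutoff Then
                    ' Caller clicks "weiter"; the new page is extracted on the next event.
                    Return ScrapeStep.ExtractPage
                End If
                Return ScrapeStep.Done
            Case Else
                Return ScrapeStep.Done
        End Select
    End Function

    Sub Main()
        Dim cutoff As DateTime = New DateTime(2007, 12, 12)
        Console.WriteLine(Advance(ScrapeStep.SelectAndRefresh, Nothing, cutoff).ToString())
        Console.WriteLine(Advance(ScrapeStep.ExtractPage, New DateTime(2007, 12, 14, 9, 30, 0), cutoff).ToString())
        Console.WriteLine(Advance(ScrapeStep.ExtractPage, New DateTime(2007, 12, 11, 17, 0, 0), cutoff).ToString())
    End Sub
End Module
```

Because every page is extracted only in the event raised after it has loaded, this design would also make the Thread.Sleep delay unnecessary. DocumentCompleted can fire more than once per page (for example for frames), so a real handler would probably need an additional check before advancing.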
  • Imports System.IO

    Imports System.Text.RegularExpressions

    Public Class Form1

        Dim lastDate As DateTime

        Dim marchb As Boolean

        Dim marchc As Boolean

        Dim march As Boolean  ' Switch: True until the tasks have run once

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load

            march = True  ' Initialize the switch as True

            marchc = True


            WebBrowser1.Dock = DockStyle.Fill

            Me.WindowState = FormWindowState.Maximized


            ' Part 1: Use WebBrowser control to load web page    

            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=EAD.ETR")

        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

            ' Check the switch state

            If march = True Then

                'Part 2: Automatically select specified option from ComboBox

                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

                For Each curElement As HtmlElement In theElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Step" Then

                        curElement.SetAttribute("Value", 0)

                    End If

                Next


                'Part 2,5: Automatically select specified option from ComboBox

                Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")

                For Each curElement As HtmlElement In the2ElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$DD_Lines" Then

                        curElement.SetAttribute("Value", 100)

                    End If

                Next



                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")

                For Each curElement As HtmlElement In theWElementCollection

                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    'Part 3: Automatically check the CheckBox

                    If controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$CBx_CapitalMeasures" Then

                        curElement.SetAttribute("Checked", True)

                        'Part 4: Automatically click the button

                    ElseIf controlName = "ctl00$ctl00$ctl16$ctl00$WP1Quotes$ctl03$IBtn_Refresh1" Then

                        curElement.InvokeMember("click")

                        march = False  ' If accomplish the task, change the switch to False.

                    End If

                Next



            Else

                If marchc = True And march = False Then   ' If march = False, don't need to perform above tasks, directly click Continue link.

                    'part 5 export
                    extract()

                    marchc = False


                End If

            End If



            If marchc = False And lastDate > Today.AddDays(-2) Then ' im not sure if that works

                'Part 6 Automatically click Continue link

                Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

                For Each curElement As HtmlElement In hrefElementCollection

                    Dim controlName As String = curElement.GetAttribute("id").ToString

                    If controlName = "ctl00_ctl00_ctl16_ctl00_WP1Quotes_ctl03_LBtn_More" Then

                        curElement.InvokeMember("Click")

                    End If

                Next
                extract()

                'ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-2) Then : Close() 'just good to know...
            End If

        End Sub
        Public Sub extract()
            Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)

                rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))

            Next


            ' export to txt

            lastDate = Nothing

            Dim lastDateStr As String = "0"

            Dim separator As String = vbTab

            Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

            Using sw As StreamWriter = File.CreateText("M:\Dominik\Handelsblattskript\Testfergebnisse\export" & currentDataTime & ".txt")

                For Each row As String() In rows

                    sw.WriteLine(String.Join(separator, row))

                    lastDateStr = row(0)

                Next

            End Using
            If lastDateStr IsNot "0" Then

                lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
                System.Threading.Thread.Sleep(1000)
            End If
        End Sub


    End Class

    Friday, December 14, 2007 1:30 PM
  • "im not sure if that works"

    Try this:

    Code Snippet
    Public Class Form1
        Dim document_completed As Integer
        Dim last_datetime As DateTime
        Dim earliest_datetime As DateTime
        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized
            Part1() ' Use WebBrowser control to load web page
        End Sub
        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            document_completed = document_completed + 1
            If document_completed = 1 Then ' First table
                Part2() ' Automatically select specified option from ComboBox
                Part3() ' Automatically check the CheckBox
                Part4() ' Automatically click the Button
            ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
                Part5() ' Extract javascript and update last_datetime
                If last_datetime > earliest_datetime Then
                    Part6() ' Click Continue Button
                End If
            End If
        End Sub
        Private Sub Part1()
            ' Part 1: Use WebBrowser control to load web page
            document_completed = 0
            last_datetime = DateTime.Now
            earliest_datetime = last_datetime.AddDays(-2)
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        End Sub
        Private Sub Part2()
            ' Part 2: Automatically select specified option from ComboBox
        End Sub
        Private Sub Part3()
            ' Part 3: Automatically check the CheckBox
        End Sub
        Private Sub Part4()
            ' Part 4: Automatically click the Button
        End Sub
        Private Sub Part5()
            ' Part 5: Extract javascript and update last_datetime
        End Sub
        Private Sub Part6()
            ' Part 6: Click Continue Button
        End Sub
    End Class
    • Edited by Tim Mathias Wednesday, October 14, 2009 6:25 PM Reformatted code snippet.
    Friday, January 25, 2008 6:06 AM
  • Not forgetting Part 7 from this thread http://forums.microsoft.com/msdn/showpost.aspx?postid=2514450&siteid=1&sb=0&d=1&at=7&ft=11&tf=0&pageid=2

    Code Snippet
    If last_datetime > earliest_datetime Then
        Part6() ' Click Continue Button
    Else
        Me.Close() ' Part 7: Close programme
    End If
    • Edited by Tim Mathias Wednesday, October 14, 2009 6:10 PM Reformatted code snippet.
    Friday, January 25, 2008 6:22 AM
  • Hi Dominik,

    I found a couple of bugs in Part 5 when I tried it out in C++ (I'm a C++ man not a VB one). I've highlighted the important changes in bold (namely -- 24 hour clock, closed the output file immediately after writing to it, and parsing a 15 character substring for the last datetime). (I've also used GetElementById to get straight to the point.)

    With the original version, ParseExact threw an exception every time, leaving the output file open and empty. Maybe this is what is causing you stability issues with VB.

    Code Snippet
    void Part1 ()
    {
        Trace::WriteLine ("Part 1");

        // Part 1: Use WebBrowser control to load web page
        document_completed = 0;
        last_datetime = DateTime::Now;
        earliest_datetime = last_datetime.AddDays (-2.0);
        webBrowser1->DocumentCompleted += gcnew WebBrowserDocumentCompletedEventHandler (this, &Form1::DocumentCompleted);
        webBrowser1->Navigate ("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX");
    }

    void Part2 ()
    {
        Trace::WriteLine ("Part 2");

        // Part 2: Automatically select specified option from ComboBox
        HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step");
        el->SetAttribute ("value", "0");
    }

    void Part3 ()
    {
        Trace::WriteLine ("Part 3");

        // Part 3: Automatically check the CheckBox
        HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures");
        el->SetAttribute ("checked", "true");
    }

    void Part4 ()
    {
        Trace::WriteLine ("Part 4");

        // Part 4: Automatically click the button
        HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1");
        el->InvokeMember ("click");
    }

    void Part5 ()
    {
        Trace::WriteLine ("Part 5");

        // Part 5: Extract javascript and update last_datetime
        try
        {
            ArrayList ^rows = gcnew ArrayList ();
            Regex ^pattern = gcnew Regex ("(?<=myl\\+=\\')([^\\\\]+(?:\\\\t))+([^\\\\]+(?=\\\\r\\\\n'))");
            Trace::WriteLine ("Part 5: pattern = " + pattern);
            MatchCollection ^matches = pattern->Matches (webBrowser1->DocumentText);
            Trace::WriteLine ("Part 5: matches->Count = " + matches->Count);
            array <String^> ^tab = { gcnew String ("\\t") };
            for (int i = 0; i < matches->Count; i++)
            {
                Trace::WriteLine (matches [i]->Value);
                rows->Add (String::Join ("\t", matches [i]->Value->Split (tab, StringSplitOptions::None)));
                Trace::WriteLine (rows [i]);
            }
            String ^current_datetime = DateTime::Now.ToString ("yyyyMMddHHmmss"); // 24 hour clock
            StreamWriter ^file = gcnew StreamWriter ("BrowserAutomation" + current_datetime + ".txt");
            for (int i = 0; i < rows->Count; i++)
            {
                file->WriteLine (rows [i]);
            }
            file->Close ();

            String ^str_last_datetime = (String ^) rows [rows->Count - 1];
            Trace::WriteLine ("str_last_datetime = " + str_last_datetime);
            last_datetime = DateTime::ParseExact (str_last_datetime->Substring (0, 15), "dd.MM. HH:mm:ss", System::Globalization::CultureInfo::CreateSpecificCulture ("de-de"));
            Trace::WriteLine ("last_datetime = " + last_datetime);
        }
        catch (Exception ^e)
        {
            Trace::WriteLine ("Part 5: " + e->Message);
        }
    }

    void Part6 ()
    {
        Trace::WriteLine ("Part 6");

        // Part 6: Click Continue Button
        HtmlElement ^el = webBrowser1->Document->GetElementById ("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More");
        el->InvokeMember ("click");
    }
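
    For readers following along in VB.NET rather than C++, the extraction and date-parsing steps of Part 5 might look roughly like the sketch below. It assumes the page's script contains lines of the form myl+='...\r\n' with literal \t separators (as the regular expression in this thread implies) and that each row starts with a 15-character timestamp such as "30.01. 17:15:08". The module name, function names and sample string are made up for illustration; this is not the code from either poster.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Module ParseSketch
    ' Pulls the rows out of script text of the form myl+='a\tb\tc\r\n'
    ' and splits each row on the literal two-character sequence \t.
    Function ExtractRows(ByVal pageText As String) As String()()
        Dim pattern As String = "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
        Dim result As New List(Of String())
        For Each m As Match In Regex.Matches(pageText, pattern)
            result.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
        Next
        Return result.ToArray()
    End Function

    ' Parses the leading 15 characters ("dd.MM. HH:mm:ss") of a field.
    ' The page gives no year, so ParseExact fills in the current year.
    Function ParseRowTimestamp(ByVal field As String) As DateTime
        Dim german As CultureInfo = CultureInfo.CreateSpecificCulture("de-DE")
        Return DateTime.ParseExact(field.Substring(0, 15), "dd.MM. HH:mm:ss", german)
    End Function

    Sub Main()
        ' Hypothetical sample; in the real program this would be WebBrowser1.DocumentText.
        Dim sample As String = "myl+='30.01. 17:15:08\t47,80\tHandel\t1.000\r\n';"
        Dim rows As String()() = ExtractRows(sample)
        Dim stamp As DateTime = ParseRowTimestamp(rows(0)(0))
        Console.WriteLine(rows(0).Length)
        Console.WriteLine(stamp.ToString("dd.MM. HH:mm:ss"))
    End Sub
End Module
```

    Keeping the parsing in small functions like these makes it possible to try them on a saved copy of the page source without driving the WebBrowser control at all.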
    • Edited by Tim Mathias Wednesday, October 14, 2009 6:20 PM Reformatted code snippet.
    Friday, January 25, 2008 10:37 PM
  • Hi

    Thanks for your posts, but as this is my first script and my programming experience is near zero, I don't know how to translate your script to VB.NET. Or do you propose switching to C++? I've only used VB.NET up to now.

    Nevertheless, I made some changes to my code (namely, I put AddDays(-1) everywhere I had different numbers before) and now it seems to work.

    This program is supposed to run on an old Win2000 SP4 computer that is not used for anything else, so nobody can interfere. But although everything worked fine on the (more or less new) Win XP computer on which I wrote it, it does not work as well on the old Win2000 SP4 computer.
    What happens there is that, while it works fine most of the time, SOMETIMES the first table is copied: the one displayed when first browsing to the page, before making the selections and refreshing. So it seems to me as if the script no longer waits for the DocumentCompleted event. But only sometimes! Sometimes the correct table is copied as well, sometimes not. I don't understand this! (Actually, I never fully understood the DocumentCompleted event.) The only explanation I can think of is that the old computer is too slow... I'm frustrated!

    Does anyone have an idea why this could be?

    I'll post the whole code once again...

    Thanks Dominik
    Monday, January 28, 2008 2:22 PM
  • Imports System.IO
    Imports System.Text.RegularExpressions

    Public Class Form1

        Dim lastDate As DateTime
        Dim marchc As Boolean
        Dim march As Boolean ' set 2 switches

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            Me.Visible = False
            march = True ' Initialize the switches as True
            marchc = True

            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Maximized

            ' Part 1: Use WebBrowser control to load web page
            WebBrowser1.Navigate("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=SAP.ETR")
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted

            ' Determine the switch state
            'Me.Visible = False ' makes no difference
            If march = True Then

                ' Part 2: Automatically select specified option from ComboBox
                Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
                For Each curElement As HtmlElement In theElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    If controlName.Contains("DD_Step") Then
                        curElement.SetAttribute("Value", 0)
                    End If
                Next

                ' Part 2.5: Automatically select specified option from ComboBox
                Dim the2ElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("select")
                For Each curElement As HtmlElement In the2ElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString
                    If controlName.Contains("DD_Lines") Then
                        curElement.SetAttribute("Value", 100)
                    End If
                Next

                Dim theWElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("input")
                For Each curElement As HtmlElement In theWElementCollection
                    Dim controlName As String = curElement.GetAttribute("name").ToString

                    ' Part 3: Automatically check the CheckBox
                    If controlName.Contains("CBx_CapitalMeasures") Then
                        curElement.SetAttribute("Checked", True)

                        ' Part 4: Automatically click the button
                    ElseIf controlName.Contains("IBtn_Refresh1") Then
                        curElement.InvokeMember("click")
                        march = False ' Once the task is accomplished, change switch 1 to False.
                    End If
                Next

            Else

                If marchc = True And march = False Then ' If march = False, the tasks above are not needed; go straight to the export.

                    ' Part 5: export
                    extract()
                    marchc = False

                End If

            End If

            If marchc = False And lastDate > Today.AddDays(-1) Then ' I'm not sure if that works

                ' Part 6: Automatically click Continue link
                Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
                For Each curElement As HtmlElement In hrefElementCollection
                    Dim controlName As String = curElement.GetAttribute("id").ToString
                    If controlName.Contains("LBtn_More") Then
                        curElement.InvokeMember("Click")
                    End If
                Next
                extract()

                ' Part 7: close program
            ElseIf lastDate > "01.01.0001" And lastDate < Today.AddDays(-1) Then
                Me.Close()
            End If

        End Sub

        ' Sub to extract
        Public Sub extract()
            Dim rows As New System.Collections.ObjectModel.Collection(Of String())()

            Dim pattern As String = "(?<=" + Regex.Escape("myl+='") + ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

            For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                rows.Add(m.Value.Split(New String() {"\t"}, StringSplitOptions.None))
            Next

            ' Export to txt
            lastDate = Nothing
            Dim lastDateStr As String = "0"
            Dim separator As String = vbTab
            Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")

            Using sw As StreamWriter = File.CreateText("C:\abc\def\etr" & currentDataTime & ".txt")
                For Each row As String() In rows
                    sw.WriteLine(String.Join(separator, row))
                    lastDateStr = row(0)
                Next
            End Using

            If lastDateStr IsNot "0" Then
                lastDate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
                System.Threading.Thread.Sleep(2000)
            End If
        End Sub

    End Class

    Monday, January 28, 2008 2:23 PM
  • Dominik: "What happens there is that, while it works fine most of the time, SOMETIMES the first table is copied: the one displayed when first browsing to the page, before making the selections and refreshing. So it seems to me as if the script no longer waits for the DocumentCompleted event. But only sometimes! Sometimes the correct table is copied as well, sometimes not. I don't understand this! (Actually, I never fully understood the DocumentCompleted event.) The only explanation I can think of is that the old computer is too slow... I'm frustrated!"

    Hi Dominik,

    In Part 6 you are extracting the javascript immediately after automatically clicking the More button without waiting for the next webpage to load with new data:

    Code Snippet
    'Part 6 Automatically click Continue link
    Dim hrefElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
    For Each curElement As HtmlElement In hrefElementCollection
        Dim controlName As String = curElement.GetAttribute("id").ToString
        If controlName.Contains("LBtn_More") Then
            curElement.InvokeMember("Click")
        End If
    Next
    extract()


    The code in my first post on this thread fixes that problem. The DocumentCompleted event fires when a new webpage loads. After clicking the button in Part 4 we have to wait for the next DocumentCompleted, which tells us that the next webpage has loaded with new data. The same applies after clicking the More button in Part 6 (see: http://msdn2.microsoft.com/en-us/library/system.windows.forms.webbrowser.documentcompleted.aspx):

    Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Click Continue Button
            End If
        End If
    End Sub


    But the If statements need to be refined a bit because DocumentCompleted fires twice per page (once for the page banner and once for the default page containing the javascript data that we want):

    Code Snippet
    If (document_completed < 3) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
        ' ...
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then


    The second problem is that you are using a 12 hour clock without specifying a.m. or p.m. when generating the filename so there is potential for overwriting old files or appending new data to an old file:

    Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")


    Use a 24 hour clock instead using capital HH:

    Code Snippet
    Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddHHmmss")
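
    To see why the lowercase hh matters, here is a small sketch that formats two times twelve hours apart with both format strings. The module name, the Stamp helper and the sample dates are invented for the illustration; only the two format strings come from the posts above.

```vbnet
Imports System
Imports System.Globalization

Module FileNameClockSketch
    ' Formats a timestamp for use in a file name (culture-independent).
    Function Stamp(ByVal t As DateTime, ByVal format As String) As String
        Return t.ToString(format, CultureInfo.InvariantCulture)
    End Function

    Sub Main()
        Dim morning As New DateTime(2008, 1, 29, 1, 30, 0)    ' 01:30
        Dim afternoon As New DateTime(2008, 1, 29, 13, 30, 0) ' 13:30

        ' 12-hour clock: both times produce the same string, so the file names collide.
        Console.WriteLine(Stamp(morning, "yyyyMMddhhmmss") = Stamp(afternoon, "yyyyMMddhhmmss"))

        ' 24-hour clock: the strings differ, so each run gets its own file.
        Console.WriteLine(Stamp(morning, "yyyyMMddHHmmss") = Stamp(afternoon, "yyyyMMddHHmmss"))
    End Sub
End Module
```

    The first comparison prints True and the second prints False.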


    The other bugs I pointed out were "features" that I had introduced myself when converting from VB to C++ (I was a bit unfamiliar with the Using statement) so you can ignore these.

    • Edited by Tim Mathias Wednesday, October 14, 2009 6:03 PM Reformatted code snippets.
    Tuesday, January 29, 2008 10:24 AM
  • Hi Tim,
    Thanks for your comprehensive explanations! I think with the structure you are advising, it should work a lot better than what I had before.
    One thing I still don't understand is why my script extracts not only the "old table" but also the new one... but that doesn't matter.

    At first I wondered whether this would allow no more than 10 tables:
    ElseIf document_completed > 1 And document_completed < 11 Then ' Second to tenth tables
    But I see this part needs to be changed to what you wrote, so this restriction drops out:
    ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then

    Is it really necessary to test
    e.Url.AbsoluteUri = ...  given that the URL stays the same throughout the whole procedure?

    Well, as I am doing this alongside my studies, I can't implement all your advice right now, but I'll do so soon and report my progress!

    Thanks a lot! Dominik

    Wednesday, January 30, 2008 10:23 AM
  • Hi, I just tried it and it works fine! Only the Me.Close part is missing, but I have no time left now and will continue next Friday. Thanks a lot!!!!!! Dominik
    Wednesday, January 30, 2008 11:10 AM
  • > Is it exactly necessary to mention e.Url.AbsoluteUri = ...  because the url stays the same throughout the whole procedure?

    It's essential, because the URL DOESN'T stay the same throughout the whole procedure: the webpage contains a link to a banner page, and the DocumentCompleted handler is also called after that banner loads. I've added a MessageBox to show these two URLs. It's this double event that causes the first table to be extracted in your script (i.e. the table we want to ignore).

    I've also added an If statement that returns when the banner URL completes (it's a bit neater than the former If tests I wrote).

    And I've added the Me.Close()

    Code Snippet
    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted:  " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 Then
            Part5() ' Extract javascript and update last_datetime
            If last_datetime > earliest_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub
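
    The early-return test can also be pulled out into a small function that compares Uri objects instead of strings, which makes it easy to check in isolation. This is only a sketch: IsTargetPage, the module name and the banner address are hypothetical, and the real banner URL is whatever the MessageBox above reports.

```vbnet
Imports System

Module UrlFilterSketch
    ' True only when the completed document is the page we navigated to.
    Function IsTargetPage(ByVal completedUrl As Uri, ByVal targetUrl As Uri) As Boolean
        Return completedUrl IsNot Nothing AndAlso completedUrl.Equals(targetUrl)
    End Function

    Sub Main()
        Dim target As New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
        ' Hypothetical stand-in for the banner page that also raises DocumentCompleted.
        Dim banner As New Uri("http://www.example.com/banner.html")

        Console.WriteLine(IsTargetPage(target, target)) ' the quotes page: handle it
        Console.WriteLine(IsTargetPage(banner, target)) ' the banner: ignore it
    End Sub
End Module
```

    Inside the event handler the call would be something like If Not IsTargetPage(e.Url, target) Then Return, with target stored in a field when navigation starts.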
    • Edited by Tim Mathias Wednesday, October 14, 2009 5:38 PM Reformatted code snippet.
    Wednesday, January 30, 2008 2:42 PM
  • Thanks a lot! I think I now understand the DocumentCompleted structure better!

    I'll test this script, but I think there is still one problem:


    If the last date in the table is yesterday, the script will click "more/next table" ("weiter") to get the next table. Sometimes, though, there is no further information [because the intraday data I need is kept for only 5 days or so]. Then clicking "more/next table" loads the same table again, as there is no next table. In that case the program will endlessly repeat reloading and extracting that table.
    [With my data this is extremely unlikely, but yesterday it happened for the first time in 2 weeks, so I got the same file a thousand times and the script (the former one) ran for about 12 hours until it crashed.]

    My idea for solving this was to save the last date for one turn so that the next time we can check whether the last date has changed. So we need the last date of the previous and the pre-previous table.
    It can probably be done more easily, so don't continue reading if you have an easy solution.

    EDIT: I found an easier way, so don't read the second snippet:
    EDIT 2: I tried it on http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX and it didn't work entirely correctly: it produced the same file twice with this link (but still better than infinite times! ;)

        Dim previouslastdate As DateTime

        Private Sub Form1_Load
            ...
            previouslastdate = DateTime.Now.AddDays(-1000)
            WebBrowser1.Dock = DockStyle.Fill
            Me.WindowState = FormWindowState.Minimized
            Part1() ' Use WebBrowser control to load web page
        End Sub

        Private Sub WebBrowser1_DocumentCompleted...

            document_completed = document_completed + 1

            If (document_completed < 3) And (e.Url.AbsoluteUri = Seite) Then ' First table
                Part2() ' Automatically select specified option from ComboBox
                Part3() ' Automatically check the CheckBox
                Part4() ' Automatically click the Button

            ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = Seite) Then ' Second to xth tables
                previouslastdate = lastdate
                Part5() ' Extract javascript and update last_datetime
                If lastdate > earliest_datetime And lastdate <> previouslastdate Then
                    Part6() ' Click Continue Button
                Else
                    Me.Close() ' Part 7: Close programme
                End If
            End If

        End Sub



    But anyway, my original idea was to save the last date every second time into a new variable. I wanted to determine whether it is the second time by counting the DocumentCompleted events: as I understand it, we get this event 4 times within 2 turns.
    So here is the code... I just didn't know how to determine whether a value is an integer...


    Insert in the Part 5 sub:
    ...
    Dim checkdate1 As DateTime
    Dim checkdate2 As DateTime

    lastDate = Nothing
            Dim lastDateStr As String = "0"
            Dim separator As String = vbTab
            Dim currentDataTime As String = DateTime.Now.ToString("yyyyMMddhhmmss")
            Using sw As StreamWriter = File.CreateText(Pfad & currentDataTime & ".txt")
                For Each row As String() In rows
                    sw.WriteLine(String.Join(separator, row))
                    lastDateStr = row(0)
                Next
            End Using

            If lastDateStr IsNot "0" Then
                lastdate = DateTime.ParseExact(lastDateStr, "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
                If document_completed / 4 gives an integer Then
                    checkdate1 = lastdate
                    checkdate2 = 0
                Else
                    checkdate2 = lastdate
                    checkdate1 = 0
                End If
                System.Threading.Thread.Sleep(2000)
            End If
    ...

    and insert in the document completed sub

    ...
            ElseIf (document_completed > 2) And (e.Url.AbsoluteUri = "http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX") Then ' Second to xth tables
                Part5() ' Extract javascript and update last_datetime
                If lastdate > earliest_datetime And document_completed / 4 gives an integer And checkdate2 <> lastdate Then
                    Part6() ' Click Continue Button
                ElseIf lastdate > earliest_datetime And document_completed / 4 does not give an integer And checkdate1 <> lastdate Then
                    Part6() ' Click Continue Button
                Else
                    Me.Close() ' Part 7: Close programme
                End If
            End If

    ...
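
    On the open point above, testing whether document_completed / 4 "gives an integer": one common way to express this in VB.NET is the Mod operator, which returns the remainder of an integer division, so a remainder of 0 means the count is an exact multiple of 4. A minimal sketch, assuming document_completed is an Integer; the module and function names are made up for illustration:

```vbnet
Imports System

Module DivisibilitySketch
    ' True when count divides by 4 without remainder.
    Function IsMultipleOfFour(ByVal count As Integer) As Boolean
        Return count Mod 4 = 0
    End Function

    Sub Main()
        For document_completed As Integer = 1 To 8
            Console.WriteLine(document_completed.ToString() & ": " & IsMultipleOfFour(document_completed).ToString())
        Next
    End Sub
End Module
```

    With this, the pseudocode condition could be written as If document_completed Mod 4 = 0 Then.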
    Friday, February 01, 2008 2:08 PM
  • I did originally limit the document_completed count to 10 tables to avoid an infinite repeat in case there was a problem parsing the DateTime from the webpage (see the document_completed < 11 test in the snippet below). Otherwise you'll have the cybercops after you for a suspected DoS attack.

    Here's the ultimate bug-free code (until you find the next one):

    Code Snippet
    Dim previous_last_datetime As DateTime

    Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        MessageBox.Show("DocumentCompleted:  " & e.Url.AbsoluteUri)
        If Not (e.Url.AbsoluteUri = seite) Then
            Return
        End If
        document_completed = document_completed + 1
        If document_completed = 1 Then ' First table
            Part2() ' Automatically select specified option from ComboBox
            Part3() ' Automatically check the CheckBox
            Part4() ' Automatically click the Button
        ElseIf document_completed > 1 And document_completed < 11 Then
            previous_last_datetime = last_datetime
            Part5() ' Extract javascript and update last_datetime
            If previous_last_datetime > last_datetime Then
                Part6() ' Automatically click Continue Button
            Else
                Me.Close() ' Part 7: Close programme
            End If
        End If
    End Sub
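
    The stop condition in this handler can be looked at on its own. The sketch below assumes, as the comparison above implies, that each further table ends at an earlier timestamp than the one before it, so an unchanged timestamp means the site served the same table again. ShouldClickMore, its parameters and the sample values are hypothetical names chosen for the illustration.

```vbnet
Imports System

Module PaginationSketch
    ' Returns True when it makes sense to click "More" again:
    ' the new table reaches further back in time and the page limit is not used up.
    Function ShouldClickMore(ByVal previousLast As DateTime, ByVal currentLast As DateTime, _
                             ByVal pagesSeen As Integer, ByVal maxPages As Integer) As Boolean
        Dim tableAdvanced As Boolean = currentLast < previousLast
        Return tableAdvanced AndAlso pagesSeen < maxPages
    End Function

    Sub Main()
        Dim previous As New DateTime(2008, 1, 30, 17, 15, 8)

        Console.WriteLine(ShouldClickMore(previous, previous.AddHours(-3), 2, 10))  ' older rows arrived
        Console.WriteLine(ShouldClickMore(previous, previous, 3, 10))               ' same table again
        Console.WriteLine(ShouldClickMore(previous, previous.AddHours(-3), 10, 10)) ' page limit reached
    End Sub
End Module
```

    Keeping the page limit as a second, independent condition means a parsing problem cannot turn into an endless reload loop.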
    • Edited by Tim Mathias Wednesday, October 14, 2009 5:30 PM Reformatted code snippet.
    Friday, February 01, 2008 7:04 PM
  • I've had a deeper look at the website's pagination problem. I've separated the reading of the table rows from the writing of the table rows -- Part5A and Part5B. I've also added a new variable -- more_data -- to test whether the next table is really more data or just a repeat of the last table. If you want you can also add a time limit to this test -- earliest_datetime -- as we had before.

    Currently (at the time of writing this post) there's still a mysterious problem with that particular website with a double entry:

    30.01. 17:15:08 47,80 Handel 1.000
    30.01. 17:15:08 47,70 Handel 1.000

    If you select 20 lines per page the latter of these entries disappears.

     

    Here's the code:

    Code Snippet
    Imports System.IO
    Imports System.Text.RegularExpressions

    Public Class Form1

        Dim seite As Uri                   ' Page to load; also used to filter DocumentCompleted events
        Dim document_completed As Integer  ' Number of completed loads of seite
        Dim last_datetime As DateTime      ' Timestamp of the last row accepted so far
        Dim rows As ArrayList              ' Combined table rows from all pages
        Dim more_data As Boolean           ' False once a page brings no older rows

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            Trace.WriteLine(vbCrLf & vbCrLf & "Form1_Load")
            Me.WindowState = FormWindowState.Maximized
            Part1() ' Use WebBrowser control to load web page
        End Sub

        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            Trace.WriteLine(vbCrLf & "WebBrowser1_DocumentCompleted url = " & e.Url.ToString)
            If (e.Url <> seite) Then
                Return ' Ignore banner page load
            End If
            document_completed = document_completed + 1
            Trace.WriteLine(vbCrLf & "document_completed = " & document_completed & vbCrLf)
            If document_completed = 1 Then ' First table
                Trace.WriteLine(vbCrLf & "Section A" & vbCrLf)
                Part2() ' Automatically select specified options from ComboBoxes
                Part3() ' Automatically check the CheckBox
                Part4() ' Automatically click the Button
            ElseIf more_data AndAlso document_completed < 11 Then
                Trace.WriteLine(vbCrLf & "Section B" & vbCrLf)
                Part5A() ' Read javascript table rows and update more_data
                If more_data Then
                    Part6() ' Automatically click More Button
                Else
                    Part5B() ' Write combined table rows to file
                    Close() ' Part 7: Close programme
                End If
            Else
                Trace.WriteLine("Too many tables.")
                Part5B() ' Write combined table rows to file
                Close() ' Part 7: Close programme
            End If
        End Sub

        Private Sub Part1()
            ' Part 1: Use WebBrowser control to load web page
            Trace.WriteLine("Part1: Use WebBrowser control to load web page")
            seite = New Uri("http://www.handelsblatt.com/News/default.aspx?_p=200023&_t=wp1_quoteshistory&wp1_symbol=FLUK.NWX")
            document_completed = 0
            last_datetime = DateTime.Now
            rows = New ArrayList
            more_data = True
            WebBrowser1.Dock = DockStyle.Fill
            WebBrowser1.Navigate(seite)
        End Sub

        Private Sub Part2()
            ' Part 2: Automatically select specified options from ComboBoxes
            Trace.WriteLine("Part2: Automatically select specified options from ComboBoxes")
            Try
                ' Part 2A: Times & Sales
                Dim el1 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Step")
                el1.SetAttribute("value", "0")

                ' Part 2B: 100 lines
                Dim el2 As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_DD_Lines")
                el2.SetAttribute("value", "100")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part2: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part3()
            ' Part 3: Automatically check the CheckBox
            Trace.WriteLine("Part3: Automatically check the CheckBox")
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_CBx_CapitalMeasures")
                el.SetAttribute("checked", "true")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part3: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part4()
            ' Part 4: Automatically click the Button
            Trace.WriteLine("Part4: Automatically click the Button")
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_IBtn_Refresh1")
                el.InvokeMember("click")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part4: " & e.Message)
                Close()
            End Try
        End Sub

        Private Sub Part5A()
            ' Part 5A: Read javascript table rows and update more_data
            Trace.WriteLine("Part5A: Read javascript table rows and update more_data")
            Try
                Dim new_rows As New ArrayList
                ' Match the text after myl+=' : fields separated by a literal \t,
                ' ending just before a literal \r\n and the closing quote.
                Dim pattern As String = "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"
                Dim separator As String = vbTab
                For Each m As Match In Regex.Matches(WebBrowser1.DocumentText, pattern)
                    ' "\t" is the two characters backslash and t as found in the page source
                    new_rows.Add(String.Join(separator, m.Value.Split(New String() {"\t"}, StringSplitOptions.None)))
                    Trace.WriteLine(new_rows(new_rows.Count - 1))
                Next
                ' Throws (and is caught below) if the page contained no rows
                Dim str_new_last_datetime As String = CStr(new_rows(new_rows.Count - 1))
                Dim new_last_datetime As DateTime
                new_last_datetime = DateTime.ParseExact(str_new_last_datetime.Substring(0, 15), "dd.MM. HH:mm:ss", System.Globalization.CultureInfo.CreateSpecificCulture("de-de"))
                If (new_last_datetime < last_datetime) Then
                    Trace.WriteLine("Adding " & new_rows.Count & " new row(s) to combined rows.")
                    rows.AddRange(new_rows)
                    last_datetime = new_last_datetime
                Else
                    Trace.WriteLine("Skipping new row(s).")
                    more_data = False
                End If
            Catch e As Exception
                Trace.WriteLine("ERROR: Part5A: " & e.Message)
                ' Stop paging; the caller then saves any accrued data and closes.
                ' (Saving and closing here as well would let the caller still click "More".)
                more_data = False
            End Try
        End Sub

        Private Sub Part5B()
            ' Part 5B: Write combined table rows to file
            Trace.WriteLine("Part5B: Write combined table rows to file")
            If rows.Count > 0 Then
                Try
                    Dim current_datetime As String = DateTime.Now.ToString("yyyyMMddHHmmss") ' 24 hour clock
                    Trace.WriteLine("Writing " & rows.Count & " row(s) to file...")
                    Using sw As StreamWriter = File.CreateText("BrowserAutomation" & current_datetime & ".txt")
                        For Each row As String In rows
                            sw.WriteLine(row)
                        Next
                    End Using
                    Trace.WriteLine("Done.")
                Catch e As Exception
                    Trace.WriteLine("ERROR: Part5B: " & e.Message)
                    Close()
                End Try
            Else
                Trace.WriteLine("No data to write.")
            End If
        End Sub

        Private Sub Part6()
            ' Part 6: Automatically click More Button
            Trace.WriteLine("Part 6: Automatically click More Button")
            System.Threading.Thread.Sleep(2000) ' Pause between requests (blocks the UI thread)
            Try
                Dim el As HtmlElement = WebBrowser1.Document.GetElementById("ctl00_ctl00_ctl17_ctl00_WP1Quotes_ctl04_LBtn_More")
                el.InvokeMember("click")
            Catch e As Exception
                Trace.WriteLine("ERROR: Part6: " & e.Message)
                Part5B() ' Save any accrued data
                Close()
            End Try
        End Sub

    End Class
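    To make the row extraction in Part5A easier to follow, here is a minimal console sketch that runs the same regular expression and the same timestamp parsing on a hard-coded string. The sample text is only an assumption about what the page source looks like (JavaScript lines of the form myl+='...' with literal \t and \r\n sequences, inferred from the pattern itself); the module and function names are made up for this illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RowExtractionSketch

    ' Same pattern as in Part5A. VB strings have no escape sequences, so "\\"
    ' reaches the regex engine as \\ and matches one literal backslash.
    ' Reading: directly after myl+=' take one or more fields that each end in
    ' a literal \t, then a last field that is followed by a literal \r\n'.
    Private ReadOnly RowPattern As String = _
        "(?<=" & Regex.Escape("myl+='") & ")([^\\]+(?:\\t))+([^\\]+(?=\\r\\n'))"

    ' Returns one tab-separated string per table row found in pageSource.
    Public Function ExtractRows(ByVal pageSource As String) As List(Of String)
        Dim result As New List(Of String)
        For Each m As Match In Regex.Matches(pageSource, RowPattern)
            ' "\t" is two characters here (backslash, t), not a real tab.
            Dim fields() As String = m.Value.Split(New String() {"\t"}, StringSplitOptions.None)
            result.Add(String.Join(vbTab, fields))
        Next
        Return result
    End Function

    ' Parses the leading "dd.MM. HH:mm:ss" timestamp (15 characters) of a row.
    ' The text carries no year, so ParseExact fills in the current year.
    Public Function ParseRowTime(ByVal row As String) As DateTime
        Return DateTime.ParseExact(row.Substring(0, 15), "dd.MM. HH:mm:ss", _
                                   CultureInfo.CreateSpecificCulture("de-DE"))
    End Function

    Sub Main()
        ' Assumed shape of the page's JavaScript; values taken from the rows quoted above.
        Dim sample As String = _
            "myl+='30.01. 17:15:08\t47,80\tHandel\t1.000\r\n';" & vbCrLf & _
            "myl+='30.01. 17:15:08\t47,70\tHandel\t1.000\r\n';"
        For Each row As String In ExtractRows(sample)
            Console.WriteLine(row & "  -> " & ParseRowTime(row).ToString("HH:mm:ss"))
        Next
    End Sub

End Module
```

    In the form code, the timestamp of the last row on each page is compared with the one from the previous page; paging continues only while it keeps getting older, which is how the program notices that the "More" button has stopped delivering new rows.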
    • Edited by Tim Mathias Wednesday, October 14, 2009 5:22 PM Reformatted code snippet.
    Monday, February 04, 2008 10:48 AM
  • Hi Tim

    Thanks for both your posts! I implemented the first one and it worked.

    The missing lines on the website: we probably can't do anything about that, but I hope that shouldn't matter.

    Your second post looks really scary, though. The commands you use are totally different! I'd like to understand all of it, but at the moment I just have no time, as I am studying, exams are next week, and then I'll be away for a while.
    But thanks anyway! If what I have so far doesn't work, I'll check it out!

    Thanks for all your help, I appreciate it a lot!

    Dominik
    Monday, February 11, 2008 3:15 PM
  • Hello d.j.t,

     

    Considering that many developers in this forum ask how to automate a web page via WebBrowser, rotate or flip images, my team has created a code sample for this frequently asked programming task in Microsoft All-In-One Code Framework. You can download the code samples at:

    VBWebBrowserAutomation: http://bit.ly/VBWebBrowserAutomation

    CSWebBrowserAutomation: http://bit.ly/CSWebBrowserAutomation

    With these code samples, we hope to reduce developers’ efforts in solving the frequently asked programming tasks. If you have any feedback or suggestions for the code samples, please email us: onecode@microsoft.com.

    ------------

    The Microsoft All-In-One Code Framework (http://1code.codeplex.com) is a free, centralized code sample library driven by developers' needs. Our goal is to provide typical code samples for all Microsoft development technologies, and reduce developers' efforts in solving typical programming tasks.

    Our team listens to developers’ pains in MSDN forums, social media and various developer communities. We write code samples based on developers’ frequently asked programming tasks, and allow developers to download them with a short code sample publishing cycle. Additionally, our team offers a free code sample request service. This service is a proactive way for our developer community to obtain code samples for certain programming tasks directly from Microsoft.

    Thanks

    Microsoft All-In-One Code Framework

    Thursday, March 24, 2011 10:22 AM