Programmatically waiting for the WebBrowser control to complete loading a page

  • Question

  • I'm having trouble using the WebBrowser control in a Windows Form. I need to open a page with it (which I can do) and then progress to a series of pages with links from the first one (which I can't do very well) in order to gather information by searching the page for text and certain elements.

    At present, the program runs a function to scan the first page and has trouble finding the HtmlDocument object from the WebBrowser (i.e. it raises an exception saying the HtmlDocument is null). I have managed to work around the problem by raising a MessageBox straight after the WebBrowser.Navigate method is run. As soon as a MessageBox is raised, the ReadyState of the WebBrowser moves from 'Uninitialized' to a later state.

    Unfortunately, this option would require me to sit at my computer and press OK on the MessageBox dialog for every page that is visited, which undermines the point of automatically searching a series of pages. I've tried solving this problem using the DocumentCompleted event; however, this causes another problem, which is that the event gets fired twice (despite my adding the event handler only once).

    Does anyone know how to wait until the WebBrowser is completely done loading the page before proceeding? Preferably by doing it programmatically from a function outside of the WebBrowser event handlers. I've tried using BeginInvoke and EndInvoke, as well as WaitOne through the IAsyncResult object. Unfortunately, neither seems to work.


    Saturday, November 29, 2008 2:16 AM

Answers

  • I managed to devise a workaround, although not a perfect one.

    I found an article on CodeProject that helped. There's a piece of C# code on it, and here's the link:
    http://www.codeproject.com/KB/miscctrl/CsMsgBoxTimeOut.aspx

    Basically, it's a MessageBox that times out and disappears if the user doesn't press "OK" within a set time.
    It can be called using: MessageBoxEx.Show("Message here", timeoutValue);

    When you call WebBrowser.Navigate, control has to pass from the calling code back to the WebBrowser before the page starts loading. The WebBrowser has a ReadyState property, which starts off as "Uninitialized" and goes through several other states until it gets to "Complete". In my original program I had a while loop that kept iterating until the ReadyState became "Complete". Unfortunately, the state never changed, because the WebBrowser control could not move forward while the calling code never yielded to it.

    However, if you use the timeout message box in the code from the link above, the WebBrowser will start loading while the MessageBox is waiting for user input or for the timeout period to expire. When it expires, the WebBrowser will have moved forward from the "Uninitialized" ReadyState.

    After this, I changed the while loop to display another timed-out message box (with a shorter timeout period). By this stage the page will be past "Uninitialized" but not at "Complete" yet (mostly, my program ends up at the "Interactive" stage, which is where the page hasn't fully loaded but the user can interact with it).

    The only real problem so far is that the program generates a MessageBox every few seconds, so it might interrupt the user while doing other work, or just make a beeping noise, which gets annoying after the billionth time.

    I hope this helps anyone who had the same problem I did, but cannot rely on the DocumentCompleted callback to maintain control over how the program interacts with the page.

    (NOTE: I expect that another solution would be to call some other mechanism similar to the MessageBox which waits for a response from somewhere else, i.e. a mechanism that makes the calling code stop completely so that the WebBrowser can kick in. That way you could avoid the MessageBox popping up.)
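
    A minimal sketch of such a mechanism, assuming a Windows Forms application: the modal MessageBox most likely helps only because it keeps window messages flowing on the UI thread, and Application.DoEvents() does the same without showing anything. The names PageWaiter, NavigateAndWait and timeoutMs are hypothetical, and this is an illustration under those assumptions rather than a tested solution.

    ```vbnet
    Imports System.Diagnostics
    Imports System.Threading
    Imports System.Windows.Forms

    ' Hypothetical helper: navigates, then keeps the UI message loop running
    ' until DocumentCompleted reports a finished page or the timeout elapses.
    ' It must be used on the UI thread that owns the WebBrowser control.
    Public Class PageWaiter
        Private ReadOnly _browser As WebBrowser
        Private _completed As Boolean

        Public Sub New(ByVal browser As WebBrowser)
            _browser = browser
            AddHandler _browser.DocumentCompleted, AddressOf OnDocumentCompleted
        End Sub

        Private Sub OnDocumentCompleted(ByVal sender As Object, _
                                        ByVal e As WebBrowserDocumentCompletedEventArgs)
            ' The event can fire once per frame, so only accept it
            ' when the control as a whole reports Complete.
            If _browser.ReadyState = WebBrowserReadyState.Complete Then _completed = True
        End Sub

        Public Function NavigateAndWait(ByVal url As String, _
                                        ByVal timeoutMs As Integer) As Boolean
            _completed = False
            Dim timer As Stopwatch = Stopwatch.StartNew()
            _browser.Navigate(url)

            Do Until _completed
                If timer.ElapsedMilliseconds > timeoutMs Then Return False
                ' Process pending window messages so the control can make progress;
                ' a bare loop on ReadyState never gives it the chance.
                Application.DoEvents()
                Thread.Sleep(10) ' avoid spinning the CPU at 100%
            Loop

            Return True
        End Function
    End Class
    ```

    A caller would create one PageWaiter per WebBrowser, call NavigateAndWait with a generous timeout, and only read WebBrowser.Document when it returns True. Note that DoEvents lets other UI events (button clicks, timers) run in the middle of the wait, so the calling code has to tolerate being re-entered.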
    Thursday, December 4, 2008 7:26 AM

All replies

  • You need to use the DocumentCompleted event of the WebBrowser.

    Saturday, November 29, 2008 3:15 AM
  • As I already wrote, I've used the DocumentCompleted event. It doesn't solve the problem. Either I put the handling code into the method I attached to the WebBrowser, or I have the program halt by bringing up a message box.

    In the first case, the attached method fires irregularly and/or often messes up by jumping from one page to another before the page has been looked at by the program. In the latter case, I'd have to pause the program and click 'OK' on the message box every time it comes up, which ruins the flow of the program.

    So I still need to know if anyone knows a way to make sure that the WebBrowser is done loading. I've tried spin-waiting on the ReadyState, but it never gets past Uninitialized if I do that. I've also tried setting the ReadyState from the method attached to the DocumentCompleted event. No joy with either of those two.
    Sunday, November 30, 2008 12:39 PM
  • cablehead wrote:
    You need to use the DocumentCompleted event of the WebBrowser.

    I have run into a similar problem.

    I can successfully handle the DWebBrowserEvents2::DocumentComplete event in my IE extension. But I do not understand how I can use this to determine if the current web page is finished loading.

    DWebBrowserEvents2::DocumentComplete is called numerous times for a web page. The documentation for this event says that it is called for each frame when a page is loaded.

    How am I to determine when the last call occurs? Am I supposed to check the URL parameter of DWebBrowserEvents2::DocumentComplete to determine whether it matches IWebBrowser2::LocationURL, and conclude that the web page is finished loading when they are the same? The documentation says that the URL parameter may not match the URL specified in an IWebBrowser2::Navigate or IWebBrowser2::Navigate2 call. What do I check to determine that the page is actually finished loading?

    Furthermore, my check may determine that a document is finished loading while another document has started loading, in which case I need to wait for the latest document to finish loading in the browser window. While there is the DWebBrowserEvents2::DocumentComplete event, there seems to be no corresponding event which tells me that another document has started loading. In other words, how do I determine whether or not a document is in the process of being loaded at all?

    I know there must be a simple solution to this problem, so any further help you can give would be most helpful.
    Wednesday, December 3, 2008 9:49 PM
  • I had the same problem with the DocumentComplete part as well. I simply checked the URL and ran a different piece of subcode depending on which page it was looking at. I also had a variable in the code which would be changed to tell the DocumentCompleted function which page it was expecting.

    In the DocumentCompleted callback, I placed some if/else statements to handle the URL of the page that is being completed. Part of my program has been collecting information from a series of pages that link off a main page. This has worked alright because the page is either looking at the main page or one of the sub pages. In either case it either looks at the main page to find the links on it (having decided that it is looking at the main page by comparing the URL with what the main page is expected to have) or it gets the information the program is designed to look for on the sub pages (once again, having decided that it is looking at the sub page by comparing the URL with what the sub page is expected to have). In this case, it isn't too hard.
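
    Roughly, the if/else arrangement described above could look like the following. This is only a sketch: PageDispatcher, ExpectedPage and the MainPageUrl address are made-up placeholders, not the names or URLs from my actual program.

    ```vbnet
    Imports System
    Imports System.Diagnostics
    Imports System.Windows.Forms

    Public Class PageDispatcher
        ' Which kind of page the program expects to finish loading next.
        Private Enum ExpectedPage
            Main
            SubPage
        End Enum

        ' Placeholder address for illustration only.
        Private Const MainPageUrl As String = "http://example.com/main"

        Private _expected As ExpectedPage = ExpectedPage.Main

        Public Sub Attach(ByVal browser As WebBrowser)
            AddHandler browser.DocumentCompleted, AddressOf OnDocumentCompleted
        End Sub

        Private Sub OnDocumentCompleted(ByVal sender As Object, _
                                        ByVal e As WebBrowserDocumentCompletedEventArgs)
            Dim isMain As Boolean = String.Equals(e.Url.AbsoluteUri, MainPageUrl, _
                                                  StringComparison.OrdinalIgnoreCase)

            If _expected = ExpectedPage.Main AndAlso isMain Then
                ' Main page finished: this is where its links would be collected.
                Debug.WriteLine("Main page loaded: " & e.Url.AbsoluteUri)
                _expected = ExpectedPage.SubPage
            ElseIf _expected = ExpectedPage.SubPage AndAlso Not isMain Then
                ' Sub page finished: this is where the information would be extracted.
                Debug.WriteLine("Sub page loaded: " & e.Url.AbsoluteUri)
            End If
            ' Anything else (an unexpected URL) is ignored.
        End Sub
    End Class
    ```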

    When I started to work on the next part of the program which will proceed from the sub page to another sub page, the DocumentCompleted function starts behaving unexpectedly. It ends up skipping the second subpage and loading another one. I suspect it might be because my program isn't filtering out the unwanted Documents (which are mostly advertisements or one of several other sub sections in frames). I've figured that it should be possible to simply wait for the page somehow.
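
    One way to filter out those unwanted documents can be sketched as follows. It assumes that a frame's completion reports the frame's own address in e.Url while WebBrowser.Url holds the top-level address; redirects, or frames that share the top-level URL, could defeat the check. FrameFilter and IsTopLevelCompletion are hypothetical names.

    ```vbnet
    Imports System
    Imports System.Windows.Forms

    Module FrameFilter
        ' Returns True only when the completed document appears to be the
        ' top-level page rather than an advertisement or another frame.
        Public Function IsTopLevelCompletion(ByVal browser As WebBrowser, _
                ByVal e As WebBrowserDocumentCompletedEventArgs) As Boolean
            If browser.Url Is Nothing OrElse e.Url Is Nothing Then Return False

            ' Uri.Equals ignores the #fragment part when comparing addresses.
            Return e.Url.Equals(browser.Url) AndAlso _
                   browser.ReadyState = WebBrowserReadyState.Complete
        End Function
    End Module
    ```

    The DocumentCompleted handler would call IsTopLevelCompletion first and return immediately when it yields False, so the page-handling code runs only once per navigated page.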

    Unfortunately, the way the thread of the WebBrowser control works is different from the usual threading. There seems to be a need to break the execution of the main thread in order for the WebBrowser's separate thread to even start running. So far, I've found that the only way to get that break to occur is for a message box to pop up waiting for the user to press "OK". If I press "OK" too soon, the page will mess up because it's not ready to continue yet.

    I'd very much like to know how to determine when the last one occurs. It would be great if there was a way to pause the function that initiates the WebBrowser.Navigate call long enough for (a) the DocumentCompleted function to start and (b) the DocumentCompleted function to have definitely ended. Does anyone know? Or have some clues?

    (I get the funny feeling not many people would know as controls like the WebBrowser are usually used to imitate the behaviour of a standard Web Browser and not to do anything like automatically navigate pages).
    • Proposed as answer by sbakazmi Friday, July 3, 2009 6:38 PM
    Thursday, December 4, 2008 3:13 AM
  • Hi,

    Try the webbrowser's ready state:

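    Private Sub WebBrowser1_DocumentCompleted( _
        ByVal sender As Object, _
        ByVal e As WebBrowserDocumentCompletedEventArgs _
    ) Handles WebBrowser1.DocumentCompleted
        ' DocumentCompleted can fire once per frame. Checking ReadyState is meant
        ' to skip those early calls and react only when the whole page is done.
        If Me.WebBrowser1.ReadyState = WebBrowserReadyState.Complete Then
            MsgBox("Document and frames loaded!")
        End If
    End Sub
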
    for more information check
    this link

    hope this helps..
    Friday, July 3, 2009 6:41 PM
  • Hi
    I am working on a similar project. To be honest, the same project.
    I have used the DocumentCompleted event, and it worked fine for me. I just created a method that navigates the WebBrowser control to the next link and does all the other work, and simply called that method in the DocumentCompleted event handler. Hope this will help you.
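
    A rough sketch of that arrangement, assuming the links to visit are already known and are held in a queue. LinkCrawler and its members are made-up names for illustration, and the top-level check is an assumption about how frames report their URL.

    ```vbnet
    Imports System.Collections.Generic
    Imports System.Diagnostics
    Imports System.Windows.Forms

    Public Class LinkCrawler
        Private ReadOnly _browser As WebBrowser
        Private ReadOnly _pending As New Queue(Of String)()

        Public Sub New(ByVal browser As WebBrowser, ByVal urls As IEnumerable(Of String))
            _browser = browser
            For Each url As String In urls
                _pending.Enqueue(url)
            Next
            AddHandler _browser.DocumentCompleted, AddressOf OnDocumentCompleted
        End Sub

        ' Kicks off the chain by loading the first queued page.
        Public Sub Start()
            NavigateNext()
        End Sub

        Private Sub NavigateNext()
            If _pending.Count > 0 Then
                _browser.Navigate(_pending.Dequeue())
            End If
        End Sub

        Private Sub OnDocumentCompleted(ByVal sender As Object, _
                                        ByVal e As WebBrowserDocumentCompletedEventArgs)
            ' Ignore completions that look like frames or a still-loading page,
            ' otherwise the queue would advance more than once per page.
            If _browser.ReadyState <> WebBrowserReadyState.Complete Then Return
            If _browser.Url Is Nothing OrElse Not e.Url.Equals(_browser.Url) Then Return

            Dim doc As HtmlDocument = _browser.Document
            If doc IsNot Nothing Then
                ' The real program would search the page here.
                Debug.WriteLine("Finished: " & doc.Title)
            End If

            NavigateNext()
        End Sub
    End Class
    ```

    Because every step runs from the event handler, the calling code returns right after Start() and the form's normal message loop drives the rest, so no message box or wait loop is needed.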
    Sunday, November 1, 2009 8:16 PM