HtmlAgilityPack: Stuck trying to understand HtmlWeb.Load NetworkCredential

  • Question

  • User-224188520 posted

    Hey All,

    I've been trying to understand this for a good portion of today and seem to be making very little progress. I've tried a bunch of different code snippets from other resources, but to no avail. I am using RoadKill Wiki for some guides and walk-throughs for work, and I want to pull excerpts from the wiki into another software program. The wiki is not public, so a login and password are required. That is where I am stumped.

    Here is my code that works fine if the wiki is set to public.

    Dim wikiURL As String = "http://wiki.mysite.com/wiki/2"
    Dim str As New StringBuilder
    
    Dim web As New HtmlWeb
    web.UserAgent = "My App Traffic"
    Dim htmldoc As New HtmlDocument
    htmldoc = web.Load(wikiURL)
    
    For Each node As HtmlNode In htmldoc.DocumentNode.SelectNodes("//div[@id='container']")
        str.Append(node.InnerHtml)
    Next
    
    Return str.ToString()

    I know HtmlWeb.Load has an overload like this:

    web.Load(url As String, method As String, proxy As WebProxy, credentials As NetworkCredential)

    I just don't understand the proper syntax for it. Do I need a proxy (and what would that actually be if I did), and how do I pass a user ID and password to the credentials?

    Any help would be great!

    Regards,

    Tuesday, January 6, 2015 2:17 PM

All replies

  • User-224188520 posted

    Maybe the above is not that clear.

    I am just looking to figure out how to properly pass a userId and Password to a site using the HtmlAgilityPack.

    Thanks,

    Wednesday, January 7, 2015 2:40 PM
  • User-224188520 posted

    I've tried this:

        Private Shared Function getWikiArticles(ByVal id As String) As String
            Dim wikiURL As String = "http://wiki.mywiki.com/wiki/" & id
            Dim str As New StringBuilder
    
            Dim web As New HtmlWeb
            web.UserAgent = "My Site Query"
    
            Dim htmldoc As New HtmlDocument
    
            Dim address As Uri = New Uri("http://wiki.mywiki.com")
            Dim myProxy As New WebProxy(address)
    
            htmldoc = web.Load(wikiURL, "POST", myProxy, New NetworkCredential("login", "password"))
    
            For Each node As HtmlNode In htmldoc.DocumentNode.SelectNodes("//div[@id='container']")
                str.Append(node.InnerHtml)
            Next
    
            Return str.ToString()
        End Function

    I get an "Object reference not set to an instance of an object" error when looping through the nodes of the page, so apparently the expected content is not coming back from the page.
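
    For reference, SelectNodes in HtmlAgilityPack returns Nothing rather than an empty collection when the XPath matches nothing, which is a common source of this error in a For Each loop. A minimal sketch of a guard follows; ExtractContainer and the sample HTML strings are hypothetical and not part of the code above.

    ```vbnet
    Imports System
    Imports System.Text
    Imports HtmlAgilityPack

    Module NodeGuardSketch
        ' Returns the inner HTML of every div with id="container", or an empty
        ' string when the page has no such element (for example, a login page).
        Function ExtractContainer(ByVal html As String) As String
            Dim doc As New HtmlDocument()
            doc.LoadHtml(html)

            ' SelectNodes yields Nothing, not an empty collection, when nothing matches.
            Dim nodes As HtmlNodeCollection =
                doc.DocumentNode.SelectNodes("//div[@id='container']")
            If nodes Is Nothing Then
                Return String.Empty
            End If

            Dim sb As New StringBuilder()
            For Each node As HtmlNode In nodes
                sb.Append(node.InnerHtml)
            Next
            Return sb.ToString()
        End Function

        Sub Main()
            ' Hypothetical inputs: one page with the container, one without.
            Console.WriteLine(ExtractContainer("<div id='container'><p>Hi</p></div>"))
            Console.WriteLine(ExtractContainer("<p>Please log in</p>").Length)
        End Sub
    End Module
    ```

    With a guard like this, a missing container shows up as an empty result instead of an exception, which makes it easier to tell "wrong page returned" apart from a bug in the loop.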

    Is there anyone out there who can lend me a helping hand with this? Any little tidbit of information will do.

    Thursday, January 8, 2015 11:40 AM
  • User-224188520 posted

    This is where I am at now.

        Public Shared Function getArticles() As String
            Dim urlString As String = "http://wiki.mysite.com/wiki/2"
            Dim Username As String = "userName"
            Dim Password As String = "password"
            Dim str As New StringBuilder
    
    
            Dim web As New HtmlWeb()
            web.UseCookies = True
    
            Dim wp As New System.Net.WebProxy("23.91.123.44:80", False)
    
            wp.UseDefaultCredentials = False
            Dim nc As NetworkCredential = New NetworkCredential(Username, Password)
            Dim htmldoc As HtmlDocument = web.Load(urlString, "GET", wp, nc)
    
            For Each node As HtmlNode In htmldoc.DocumentNode.SelectNodes("//div[@id='container']")
                str.Append(node.InnerHtml)
            Next
    
            Return str.ToString()
        End Function

    All that happens is that I am redirected to the login page; there are no errors.
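
    A redirect to the login page usually suggests the site uses a form-based (cookie) login rather than HTTP authentication; NetworkCredential only answers HTTP authentication challenges such as Basic or NTLM, so it would not help in that case. Assuming that is what is happening here, a rough sketch of logging in first and then reusing the cookie could look like the following. The login URL, the "email" and "password" field names, and FetchWithLogin itself are all assumptions for illustration; the real values have to be read from the wiki's own login form, and a site that requires an anti-forgery token would need an extra step to fetch it.

    ```vbnet
    Imports System
    Imports System.IO
    Imports System.Net
    Imports System.Text
    Imports HtmlAgilityPack

    Module FormLoginSketch
        ' ASSUMPTION: placeholder login URL; check the wiki's actual login form.
        Private Const LoginUrl As String = "http://wiki.mysite.com/user/login"

        Function FetchWithLogin(ByVal pageUrl As String,
                                ByVal user As String,
                                ByVal pass As String) As HtmlDocument
            ' One cookie container shared by both requests.
            Dim cookies As New CookieContainer()

            ' 1. POST the login form so the server can issue its auth cookie.
            '    ASSUMPTION: the field names "email" and "password" are guesses.
            Dim body As Byte() = Encoding.UTF8.GetBytes(
                "email=" & Uri.EscapeDataString(user) &
                "&password=" & Uri.EscapeDataString(pass))
            Dim login As HttpWebRequest = CType(WebRequest.Create(LoginUrl), HttpWebRequest)
            login.Method = "POST"
            login.ContentType = "application/x-www-form-urlencoded"
            login.CookieContainer = cookies
            login.ContentLength = body.Length
            Using s As Stream = login.GetRequestStream()
                s.Write(body, 0, body.Length)
            End Using
            login.GetResponse().Close()

            ' 2. Request the protected page, sending the cookies from step 1.
            Dim req As HttpWebRequest = CType(WebRequest.Create(pageUrl), HttpWebRequest)
            req.CookieContainer = cookies
            Dim doc As New HtmlDocument()
            Using resp As WebResponse = req.GetResponse()
                Using rs As Stream = resp.GetResponseStream()
                    doc.Load(rs)
                End Using
            End Using
            Return doc
        End Function
    End Module
    ```

    The returned HtmlDocument can then be queried with the same SelectNodes call as before. No proxy is involved in this sketch; a WebProxy would only be needed if outbound traffic has to pass through one.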

    Thursday, January 8, 2015 8:26 PM