Scraping data from web


  • Hi, i m a beginner in the VBA coding and i m trying to extract a piece of information from the web. 

    Here is the Website:

    I m trying to extract the Classname = "TableNumBlack" (13)

    The information is the bookrunner name "Ample Orient Capital, Wealth Link Securities"

    The following is my code:

    Sub Runner()
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
        Dim i As Integer
        Dim x As Integer
        Dim k As Integer
        x = InputBox("initial:")
        k = InputBox("final:")
        On Error Resume Next
        For i = x To k
        'Do Until Cells(i + 1, 2) = ""
            Dim cellv As Integer
            cellv = Cells(i + 1, 7).Value
            'MsgBox cellv
            Dim IE As Object
            Set IE = CreateObject("InternetExplorer.Application")
            IE.Visible = False
            IE.navigate "" & cellv & "&type=listing"
            Do While IE.Busy Or _
            IE.readyState <> 4
            Application.Wait (Now + TimeValue("0:00:3"))

            Dim Doc As HTMLDocument
            Set Doc = IE.document
            Dim Name As String
            Nsme = Trim(Doc.getElementsByClassName("TableNumBlack")(13).innerText)
            MsgBox Name
            Cells(i + 1, 10) = Name
            Name = ""
            Set IE = Nothing
         Next i
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic

    MsgBox ("Done!")

    End Sub

    But i m cannot extract the information. The msg box returns with nothing. Anyone can help me on this? 

    Thank you so much!

    Thursday, January 04, 2018 12:54 AM

All replies

  • I don't have much luck with Microsoft Internet Controls.  I use SeleniumBasic and use xPaths.  Here is what it looks like with SeleniumBasic.  The release is dated but IE works fine. 

    Dim drv As Selenium.WebDriver
    Dim ele As Selenium.WebElement
    Sub SelectElement()
       '' Add reference to selenium type library
      Set drv = New Selenium.IEDriver
      Dim xPath As String
      drv.Get ""
      drv.Wait (3000)
      xPath = "(//td[@class='TableNumBlack'])[14]"
      If GetElement(xPath) Then
        Debug.Print ele.text
      End If
    End Sub
    Function GetElement(xPath As String) As Boolean
      On Error GoTo Handler
      text = ""
      Set ele = drv.FindElementByXPath(xPath)
      text = ele.text
      GetElement = True
      Exit Function
      GetElement = False
    End Function

    The Immediate window has 'Ample Orient Capital, Wealth Link Securities'

    Thursday, January 04, 2018 2:51 PM
  • See Introduction to Web Site Scraping. It has C# code but I hope the concepts help. That sample began as VBA code for Access about ten years ago. I have found my original code and there is not much to help here except don't do the following.

    Do While IE.Busy Or _
    IE.readyState <> 4

    Instead use the DocumentComplete event and process the document there. The DocumentComplete event will fire asynchronously so I hope you can process the document asynchronously. I don't know if you can do the IE.Quit from within the DocumentComplete but probably you can. It might be better to do this in an application such as Access so you can use a WebBrowser but I can understand not doing that if there is a reason to not.

    Sam Hobbs

    Friday, January 05, 2018 2:27 AM