none
Finding Proper Names in a Cell of Text RRS feed

  • Question

  • I’m looking for a way to find a proper name within a cell of text. I have a scrape from a webcrawl that contains the body of websites and often times the metadata for author isn’t actually the person who authored the article. What I think might work is, using VBA in excel, if I could find any occurrences of two, three or four words in a row that start with a capital letter. Once they have been found they would be put in the adjacent cell.

     

    Example:

     Column A----------------------------------------------------------------Column B
    This would be the text. Alan Smith. Blah blah. ------------------Alan Smith
    More text. Blah blah. Dr. T. G. Gordon.-----------------------------Dr. T. G. Gordon

     Any Ideas?


    Tuesday, March 5, 2013 4:04 PM

All replies

  • This is not that easy.

    If you suggested first capital letter from 1st line:

    This Alan Smith. Blah ----------------------no -------------Alan Smith
    More Blah Dr. T. G. Gordon.--------------no--------------Dr. T. G. Gordon

    I think you should to lookup and create some names database.


    Oskar Shon, Office System MVP

    Press if Helpful; Answer when a problem solved

    Tuesday, March 5, 2013 7:44 PM
    Answerer
  • Public Function FullName(r As Range) As String
    FullName = ""
    Dim col, col2 As Collection
    Dim L As Long
    Dim V As String
    
    Set col = New Collection
    Set col2 = New Collection
    ary = Split(r.Value, " ")
    
    For L = 0 To UBound(ary)
    
        If UCase(Left(ary(L), 1)) = Left(ary(L), 1) Then
            col.Add ary(L)
        Else
        
            If col.Count = 1 Then
                col.Remove 1
            Else
            End If
            
            If col.Count >= 2 Then
            
            For t = col.Count To 1 Step -1
            col2.Add col(t)
            
            col.Remove col.Count
            Next t
            
            End If
            If col.Count >= 2 Then Exit For
        End If
        
    Next L
    
    If col.Count = 1 Then
        col.Remove 1
    End If
    If col.Count = 0 Then Exit Function
    For t = col2.Count To 1 Step -1
        If t > 10 Then Exit For
        FullName = FullName & " " & col2.Item(t)
    Next t
    
    For t = 1 To col.Count
        If t > 10 Then Exit For
        FullName = FullName & " " & col.Item(t)
    Next t
    End Function

    Here's what I found, it works great. Hope it can help someone else too!
    • Edited by hiimdoug Wednesday, March 6, 2013 2:54 PM
    Wednesday, March 6, 2013 2:53 PM
  • If you use this function in worksheet you shoult add to code 

    Application.Volatile

    on the begining

    anyway this function do not recognize name with "This would be the text. Alan Smith. Blah blah."


    Oskar Shon, Office System MVP

    Press if Helpful; Answer when a problem solved

    Wednesday, March 6, 2013 4:36 PM
    Answerer