Contains whole words

  • Question

  • The following code checks whether a string contains particular words and, if it does, removes them. The problem is that it also removes characters that it shouldn't. For example:

    If user_input3 contains a word like lo(ok), it removes the trailing characters ok. How can I modify the code to remove whole words only?

    Dim remove_words As List(Of String) = New List(Of String)(New String() _
        {"OK", "Ok", "ok", "Yes", "No", "Hm", "Yeah", "uh huh"})

    For Each Word As String In remove_words
        If user_input3.Contains(String.Concat(Word)) Then
            user_input3 = Regex.Replace(user_input3, Word, "")
            user_input3 = Regex.Replace(user_input3, start_line, "")
            user_input3 = Regex.Replace(user_input3, end_line, "")
        End If
    Next
    Friday, April 23, 2010 4:50 PM

Answers

  • I think it's even easier with LINQ:

    Public Class Form1
      Private remove_words As List(Of String) = New List(Of String)(New String() {"OK", "Ok", "ok", "Yes", "No", "Hm", "yeah"})
      Private User_Input3 As String = "Hm Looks like i found a solution OK.No,It is Not the ok one i am after.Yes it is"
      Private Sub Form1_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
        Label1.Text = User_Input3
      End Sub
      Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Split on spaces, periods and commas, keep the words that are not in remove_words,
        ' then rejoin the kept words with spaces (the original punctuation is not restored).
        Dim s = (From n In User_Input3.Split(" "c, "."c, ","c) Where Not remove_words.Contains(n) Select n).ToArray
        User_Input3 = String.Join(ChrW(32), s)
        Label1.Text = User_Input3
      End Sub
    End Class

    Asgar
    • Marked as answer by Pilot_ Saturday, April 24, 2010 11:49 PM
    Saturday, April 24, 2010 7:26 AM

All replies

  • Nope, it doesn't work... from the looks of it, it still removes the ok in lo(ok).
    Friday, April 23, 2010 7:49 PM
  • Your script's logic is flawed. What makes a "whole" word? The answer is the spaces between words.

    Thus, the brute-force method would be to split the input string (the sentence) on spaces to separate it into words, then check each word individually against your list of words, removing words as needed. Then join the array back together using spaces to form the sentence with the appropriate words removed. You may need to place the valid words into another array to prevent multiple spaces when joining.

    Here is a similar word-based issue and solution in VBScript:

    http://networkadminkb.com/vbslibrary/Knowledge%20Base/Components/Strings/GetFirstLetters.aspx
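
The split, filter and join method described above can be sketched roughly as follows. This is only a minimal illustration, not code from the linked article: the module name and the RemoveWordsBySpace helper are hypothetical, and the case-insensitive HashSet is an assumption added so that "OK", "Ok" and "ok" need only one entry.

```vbnet
Imports System
Imports System.Collections.Generic

Module SplitFilterJoinSketch

    ' Hypothetical helper: removes every space-delimited word found in wordsToRemove.
    Function RemoveWordsBySpace(ByVal input As String, ByVal wordsToRemove As IEnumerable(Of String)) As String
        ' Case-insensitive lookup (an assumption, not part of the original post).
        Dim lookup As New HashSet(Of String)(wordsToRemove, StringComparer.OrdinalIgnoreCase)
        Dim kept As New List(Of String)

        ' RemoveEmptyEntries avoids empty strings when the input has repeated spaces.
        For Each word As String In input.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
            If Not lookup.Contains(word) Then kept.Add(word)
        Next

        ' Joining only the kept words prevents double spaces in the result.
        Return String.Join(" ", kept.ToArray())
    End Function

    Sub Main()
        ' "look" survives because only whole tokens are compared.
        Console.WriteLine(RemoveWordsBySpace("OK I am ready to look", New String() {"ok", "yes"}))
    End Sub

End Module
```

Because the split is on spaces only, a token such as "ok." (with trailing punctuation) would not match "ok" in this sketch.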

    Friday, April 23, 2010 8:11 PM
  • If you mean to remove every whole word that contains any entry from the remove_words list, then Gunner999 is right.

    Try this code:

    Public Class Form1
      Private remove_words As List(Of String) = New List(Of String)(New String() {"OK", "Ok", "ok", "Yes", "No", "Hm", "yeah"})
      Private User_Input3 As String = "Hmmm Looks like i found a solution.It is Not the solution i am after."
      Private Sub Form1_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
        Label1.Text = User_Input3
      End Sub
      Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    
        RemoveSomeWords(User_Input3)
        Label1.Text = User_Input3
    
      End Sub
    
      Private Sub RemoveSomeWords(ByRef s As String)
    
        Dim str As String = Array.Find(Of String)(s.Split(ChrW(32)), New Predicate(Of String)(AddressOf FindMatch))
    
        If Not str = String.Empty Then
          User_Input3 = User_Input3.Replace(str, String.Empty)
          RemoveSomeWords(s)
        End If
    
      End Sub
      Private Function FindMatch(ByVal str As String) As Boolean
    
        For Each w As String In remove_words
          If str.Contains(w) Then Return True
        Next
    
        Return False
    
      End Function
    
    End Class

    Asgar
    Friday, April 23, 2010 8:42 PM
  • Or, if you want to remove whole words only (exact matches rather than words that merely contain an entry), you can replace the FindMatch method with the method below. You can also add a '.' to the characters to split by.

     Private Function FindMatch(ByVal str As String) As Boolean
    
      ' Exact comparison: only a word equal to a list entry counts as a match.
      For Each w As String In remove_words
       If str = w Then Return True
      Next
    
      Return False
    
     End Function
    

     


    Asgar
    Friday, April 23, 2010 10:10 PM
  • Omie, your code looks promising, but what if you have a sentence like "OK I am ready"? The current code cannot handle that. A word can be at the beginning of a sentence (so there is no space before the word) or later in it (so there is a space before and after it). You have demonstrated how to deal with the second case but not the first.
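
One way to cover both cases, a word at the very start of the sentence and a word surrounded by spaces, is a regex word boundary. The following is a minimal sketch only, not code from this thread: the module name and the RemoveWholeWords helper are hypothetical, and ignoring case is an assumption.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordBoundarySketch

    ' Hypothetical helper: removes whole-word occurrences of each entry, ignoring case.
    Function RemoveWholeWords(ByVal input As String, ByVal wordsToRemove As String()) As String
        Dim result As String = input
        For Each word As String In wordsToRemove
            ' \b matches at a word boundary, including the start and end of the string.
            ' Regex.Escape guards against entries that contain regex metacharacters.
            Dim pattern As String = "\b" & Regex.Escape(word) & "\b"
            result = Regex.Replace(result, pattern, String.Empty, RegexOptions.IgnoreCase)
        Next
        ' Collapse the runs of spaces left behind by removed words, then trim the ends.
        Return Regex.Replace(result, " {2,}", " ").Trim()
    End Function

    Sub Main()
        ' The leading "OK" and trailing "ok" are removed; the "ok" inside "look" is not.
        Console.WriteLine(RemoveWholeWords("OK I am ready to look ok", New String() {"ok", "yes", "uh huh"}))
    End Sub

End Module
```

Since \b also matches next to punctuation, an entry followed by a comma or full stop is removed as well, and the punctuation itself stays in the string.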
    Friday, April 23, 2010 11:56 PM
  • Hi Pilot_ ,

    Okay then give this a try with one button on a FORM.

    There was an incorrect assumption earlier in this thread that all words are separated by spaces.

    The code will even put back any characters that are used to separate the words, such as a space, a full stop, a comma, a semi-colon, a colon, a question mark? Even an exclamation mark!! I have even included the apostrophe, as in the word isn't, as well as the double quote mark; "Should you quote me!!" LOL!!

    My method of approach: split out all the words, collect all the non-alphabetic characters, rebuild the sentence after the unrequired words are removed, and add the non-alphabetic characters back in.

    This code is not 100% perfect, as I tried adding one or two smileys to a STRING and it didn't like it!!

    I hope it is sufficient for your needs though?

    Anyway, try this with one button on a FORM please.>>

     

     

    Option Strict On
    Public Class Form1
    
     Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    
      Dim removeTheseWords As List(Of String) = New List(Of String)(New String() _
     {"OK", "Ok", "ok", "Yes", "No", "Hm", "yeah", "uh", "huh"})
    
      Dim test1 As String = "Hmmm ok Looks like i got solution.It is Not the ok one i am after ok but yes noon is hotter but uh huh no one believes"
    
      Dim result As String = RemoveWords(test1, removeTheseWords)
      MessageBox.Show(result)
    
      Dim test2 As String = """Ok,I am ready!!""" & " Said John!! Great, isn't it??!! LOL!!"
      result = RemoveWords(test2, removeTheseWords)
      MessageBox.Show(result)
    
     End Sub
    
     Private Function RemoveWords(ByVal FromThisString As String, ByVal WithThisListOfWords As List(Of String)) As String
    
      Dim splitChars() As Char = " ""!£$%^&*()_+{}:@~|<>?€[];'#\,./0123456789".ToCharArray
      Dim sentenceWordsArray() As String = FromThisString.Split(splitChars)
      Dim sentenceList As New List(Of String)
      sentenceList = sentenceWordsArray.ToList
    
      Dim otherChars As String = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
      Dim otherCharsList As New List(Of Char)
      For index As Integer = 0 To FromThisString.Length - 1
       If otherChars.Contains(FromThisString.Substring(index, 1)) = False Then
        otherCharsList.Add(Convert.ToChar(FromThisString.Substring(index, 1)))
       End If
      Next
    
      Dim newWordList As New List(Of String)
    
      For Each word As String In sentenceList
       If WithThisListOfWords.Contains(word) = False Then
        newWordList.Add(word)
       End If
      Next
    
      Dim nextChar As Char
      Dim sb As New System.Text.StringBuilder
      sentenceWordsArray = newWordList.ToArray
      For index As Integer = 0 To sentenceWordsArray.GetUpperBound(0)
       If index <= otherCharsList.Count - 1 Then nextChar = otherCharsList(index)
       sb.Append(nextChar & sentenceWordsArray(index))
      Next
    
      Return sb.ToString
    
     End Function
    
    End Class
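
As an alternative way to keep the separator characters in place, the removal can be done without splitting and rebuilding the sentence at all. The sketch below is an assumption-laden illustration rather than a drop-in replacement for the code above: the module name and the RemoveWordsKeepSeparators helper are hypothetical, a "word" is assumed to be a run of ASCII letters, and matching is assumed to be case-insensitive.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module KeepSeparatorsSketch

    ' Hypothetical helper: blanks out listed words and leaves every other character untouched.
    Function RemoveWordsKeepSeparators(ByVal input As String, ByVal wordsToRemove As IEnumerable(Of String)) As String
        Dim lookup As New HashSet(Of String)(wordsToRemove, StringComparer.OrdinalIgnoreCase)
        ' Each run of ASCII letters is treated as one word; everything else is a separator
        ' that the regex never matches, so it is copied to the result unchanged.
        Return Regex.Replace(input, "[A-Za-z]+", _
            Function(m As Match) If(lookup.Contains(m.Value), String.Empty, m.Value))
    End Function

    Sub Main()
        ' Prints: ,I am ready!! ,look.
        Console.WriteLine(RemoveWordsKeepSeparators("Ok,I am ready!! Yes,look.", New String() {"ok", "yes"}))
    End Sub

End Module
```

A multi-word entry such as "uh huh" would never match in this sketch, because each match is a single run of letters; it would need to be listed as "uh" and "huh", as in the code above.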

     

     

    Regards,

    John

     

    <edit>

    P.S. After further testing I think I should remove the section of code that adds the non-alphabetic characters back in.

    </edit>

     


    Please see this thread for Vb.Net learning links.>> http://social.msdn.microsoft.com/Forums/en/vbgeneral/thread/549c8895-6780-42f8-878f-2138214fdeb4
    Saturday, April 24, 2010 6:05 AM
  • OK, try this:

    Public Class Form1
      Private remove_words As List(Of String) = New List(Of String)(New String() {"OK", "Ok", "ok", "Yes", "No", "Hm", "yeah"})
      Private User_Input3 As String = "Hm Looks like i found a solution OK.No,It is Not the ok one i am after.Yes it is"
      Private Sub Form1_Load(ByVal sender As Object, ByVal e As EventArgs) Handles MyBase.Load
        Label1.Text = User_Input3
      End Sub
      Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    
        RemoveSomeWords(User_Input3)
        Label1.Text = User_Input3
    
      End Sub
    
      Private Sub RemoveSomeWords(ByRef s As String)
    
        Dim StringArrray() As String = s.Split(New Char() {" "c, "."c, ","c}, StringSplitOptions.RemoveEmptyEntries)
        Dim st As String = Array.Find(Of String)(StringArrray, New Predicate(Of String)(AddressOf FindMatch))
    
        If Not st = String.Empty Then
    
          StringArrray.SetValue(String.Empty, Array.IndexOf(StringArrray, st))
          User_Input3 = String.Join(ChrW(32), StringArrray)
          RemoveSomeWords(s)
    
        End If
    
      End Sub
      Private Function FindMatch(ByVal s As String) As Boolean
    
        For Each w As String In remove_words
          If w = s Then Return True
        Next
    
        Return False
    
      End Function
    
    End Class

    Asgar
    Saturday, April 24, 2010 6:12 AM