Search a string for the most frequently used words

Question
-
User-760585380 posted
I need to create a method which looks through a large string of text and determines which words (apart from words like "a", "and", "the") are the most frequently used. Ideally, I would like to determine the top 3 most frequently used words in a string of text. Is this possible? If so, any ideas on how it can be achieved?
FYI the string is stored within the VB code, and not in a database.
Wednesday, February 17, 2010 11:35 AM
Answers
-
User-758443495 posted
You should create a dictionary of common words (such as "a", "the", etc.) and split the sentence into words using a space delimiter. Then compare the contents of the words array with the common-words dictionary and remove any items that match.
Then keep the remaining words in storage such as a database or an XML file. Each time the user searches again, populate another dictionary from the stored words and compare your final words array with that collection: if a word was already stored, add one to its count; if the word is new, store it with a count of 1.
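The counting step described above could be sketched roughly as follows. This is only an illustration held in memory, not the poster's actual code: the module name `WordCountSketch`, the function `CountWords`, and the sample sentence are all made up for the example, and the database/XML persistence is left out.

```vb
Imports System
Imports System.Collections.Generic

Module WordCountSketch
    ' Hypothetical helper: counts each word in text that is not in commonWords.
    Function CountWords(ByVal text As String, _
                        ByVal commonWords As ICollection(Of String)) As Dictionary(Of String, Integer)
        ' Case-insensitive keys, so "Cat" and "cat" share one counter.
        Dim counts As New Dictionary(Of String, Integer)(StringComparer.OrdinalIgnoreCase)

        For Each word As String In text.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
            ' Skip the common words.
            If commonWords.Contains(word) Then Continue For

            ' TryGetValue leaves current at 0 when the word has not been seen yet.
            Dim current As Integer = 0
            counts.TryGetValue(word, current)
            counts(word) = current + 1
        Next

        Return counts
    End Function

    Sub Main()
        ' Assumed common-word list; extend as needed.
        Dim commonWords As New HashSet(Of String)(New String() {"a", "and", "the"}, _
                                                  StringComparer.OrdinalIgnoreCase)

        Dim counts = CountWords("the cat and the dog and the cat", commonWords)

        For Each pair In counts
            Console.WriteLine("{0} - {1}", pair.Key, pair.Value)
        Next
    End Sub
End Module
```

With the sample sentence, "cat" is counted twice and "dog" once. To persist counts between searches, as suggested above, the dictionary would be loaded from and saved back to whatever storage is chosen.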
- Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
Wednesday, February 17, 2010 12:14 PM -
User-1179452826 posted
Enter LINQ :)
' Requires Imports System.Linq
Dim str As String = "adam riely adam phil adam phil commodore admiral"
' " "c is a Char literal, so this also compiles with Option Strict On
Dim array As String() = str.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
' Group identical words, order by frequency, keep the top 3
Dim grouped = (From a In array Group By Key = a Into Group _
               Order By Group.Count() Descending _
               Select New With {.Str = Key, .Count = Group.Count()}).Take(3)
For Each g In grouped
    Console.WriteLine("{0} - {1}", g.Str, g.Count)
Next
- Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
Wednesday, February 17, 2010 12:38 PM -
User-1179452826 posted
Whoops...forgot the common words criterion. Try this:
' Requires Imports System.Linq
Dim str As String = "adam riely adam phil adam phil the the a commodore the admiral"
' " "c is a Char literal, so this also compiles with Option Strict On
Dim array As String() = str.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
' Words to leave out of the count
Dim commonWords = New String() {"a", "an", "the"}
' Filter out common words, group the rest, order by frequency, keep the top 3
Dim grouped = (From a In array Where Not commonWords.Contains(a.ToLower()) _
               Group By Key = a Into Group _
               Order By Group.Count() Descending _
               Select New With {.Str = Key, .Count = Group.Count()}).Take(3)
For Each g In grouped
    Console.WriteLine("{0} - {1}", g.Str, g.Count)
Next
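For real-world text, splitting on spaces leaves punctuation attached to words ("adam," and "adam" count separately), and grouping on the raw word is case-sensitive. A possible variation, offered only as a sketch, is to pull the words out with a regular expression and lower-case them before grouping. The pattern `[A-Za-z']+`, the module name `TopWordsSketch`, and the sample text are assumptions for illustration, not part of the original answer.

```vb
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module TopWordsSketch
    Sub Main()
        ' Sample input with punctuation and mixed case (made up for this sketch).
        Dim input As String = "Adam, adam! Phil and the admiral. Phil?"
        Dim commonWords = New String() {"a", "an", "and", "the"}

        ' Treat each run of letters/apostrophes as a word and normalise the case.
        Dim words = From m In Regex.Matches(input, "[A-Za-z']+").Cast(Of Match)() _
                    Select m.Value.ToLowerInvariant()

        ' Same query shape as above: filter, group, order by frequency, take 3.
        Dim top = (From w In words _
                   Where Not commonWords.Contains(w) _
                   Group By Key = w Into Group _
                   Order By Group.Count() Descending _
                   Select New With {.Word = Key, .Count = Group.Count()}).Take(3)

        For Each t In top
            Console.WriteLine("{0} - {1}", t.Word, t.Count)
        Next
    End Sub
End Module
```

Here "adam" and "phil" are each counted twice regardless of case or trailing punctuation. The simple letter-based pattern ignores digits and non-English letters, so it would need adjusting for other kinds of text.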
- Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
Wednesday, February 17, 2010 12:42 PM
All replies
-
User-760585380 posted
Thanks this seems to work!!
Wednesday, February 17, 2010 1:08 PM