Fuzzy search in database
- Hi all!
I have a task that involves fuzzy searching in a database.
The database is assumed to be SQL Server.
The goal is to be able to search a large database using approximate input strings.
I have a couple of questions to start my research on this task.
Is it realistic to implement fuzzy searching in a database myself, or is it likely to be too tricky?
Or is it more reasonable to use a ready-made tool?
Thanks!
Answers
- For the full-text search algorithm, search for Okapi BM25. For edit distance, search for "Levenshtein edit distance". Here is a sample:
http://www.merriampark.com/ld.htm
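For anyone who wants to see roughly what Okapi BM25 scoring looks like before reading further, here is a minimal Python sketch. It is an illustration only, not how SQL Server ranks full-text results. The function name `bm25_scores`, the pre-tokenized input, the defaults `k1=1.2` and `b=0.75`, and the "+1 inside the log" IDF variant are all assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Return one BM25 score per document.

    documents is a list of token lists (tokenization is assumed to be done
    elsewhere). k1 and b are commonly used defaults, not tuned values.
    """
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            f = term_freq[term]
            if f == 0:
                continue  # a term missing from the document adds nothing
            n = doc_freq[term]
            # IDF variant with +1 inside the log so it never goes negative.
            idf = math.log((n_docs - n + 0.5) / (n + 0.5) + 1.0)
            # Length normalization: longer documents are penalized via b.
            norm = f + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * f * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    ["fuzzy", "search", "in", "sql", "server"],
    ["sql", "server", "replication"],
    ["full", "text", "search"],
]
scores = bm25_scores(["fuzzy", "search"], docs)
```

Note that BM25 ranks documents by exact term matches; it does not tolerate misspellings on its own, which is why it is usually paired with something like edit distance for fuzzy matching.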
looking for a book on SQL Server replication? http://www.nwsu.com/0974973602.html
looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X
looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
- Marked As Answer by Jian Kang MSFT, Moderator, Wednesday, October 21, 2009 10:03 AM
All Replies
- SQL Server does not really support fuzzy searching. FREETEXT is sometimes considered to be fuzzy, but it is not really.
You will need to implement something like Levenshtein edit distance for this.
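As a rough illustration of the edit distance idea, here is a minimal Python sketch of Levenshtein distance using the standard two-row dynamic programming approach. It is not T-SQL and not tuned for a large table. The names `levenshtein` and `fuzzy_filter`, the case-insensitive comparison, and the `max_distance=2` default are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(query, candidates, max_distance=2):
    """Return candidates within max_distance of query, closest first.

    This scans every candidate, so it only suits small lists; a real
    database search would need an index to narrow the candidates first.
    """
    scored = [(levenshtein(query.lower(), c.lower()), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]


matches = fuzzy_filter("levenstein", ["Levenshtein", "replication", "fuzzy"], 1)
```

The full scan in `fuzzy_filter` is the part that does not scale: each comparison costs time proportional to the product of the two string lengths, so on a large table the candidate set has to be reduced before the distance is computed.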
- Thanks, Hilary
To continue the discussion, please help me figure out where to start.
My simplified task is to find several algorithms (by "algorithm" I mean both the edit distance calculation and, more importantly, the text indexing algorithm), then select the most appropriate one by performance and implement it.
Any links to those algorithms and their descriptions would be appreciated.
Thanks in advance,
Dmitry


