Need Advice

  • Question

  • Hi,

    I'm developing my own version of full-text search. The search tokens are attributed. My aim is to build a result set ordered by rank, where the rank comes from intersecting the attributes that hold the reference IDs of the source records. The parser implements a complex algorithm in the tokenizing phase, since the source may contain typographic mistakes or different abbreviations that need to be handled. The rank evaluation, by comparison, is simpler.

    I may have a huge set of pairs, each made of a symbol (for a token) and a reference ID (of a source record), that are used for the intersection (as the full-text index).
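
    To make the idea concrete, here is a minimal sketch of one possible reading of "rank by intersection": each query symbol yields a list of RefIDs, and a record ranks higher the more symbols it matches. The function name rank_by_overlap and the tie-breaking rule are assumptions made for illustration, not the actual ranking code.

```python
from collections import Counter


def rank_by_overlap(ref_id_lists):
    """Rank RefIDs by how many of the per-symbol lists contain them.

    ref_id_lists: one list of RefIDs per query symbol (hypothetical input shape).
    Returns (ref_id, match_count) tuples, best match first; ties are broken
    by ascending RefID, which is an arbitrary choice for this sketch.
    """
    counts = Counter()
    for ref_ids in ref_id_lists:
        # set() so a RefID repeated within one symbol's list counts only once
        counts.update(set(ref_ids))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for ref_id, hits in rank_by_overlap([[2, 5, 7], [5, 7, 9], [7]]):
        print(ref_id, hits)
```

    A strict intersection would keep only the RefIDs whose count equals the number of query symbols; the counting form above also keeps partial matches.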

    In version 0.1, I used a flat file to store those pairs (sorted) and another flat file that holds each unique symbol and the offset of its starting position in the first file (an index file). This allows me to read the RefIDs as a block into an internal array that is eventually used in the intersection phase.
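
    The two-file layout described above might be sketched roughly as follows. This is an illustration only: it assumes each RefID is stored as a 4-byte little-endian unsigned integer, keeps a count next to each offset, stores only the RefIDs in the first structure, and uses in-memory bytes instead of real files. The names build_files and read_block are hypothetical.

```python
import struct
from itertools import groupby
from operator import itemgetter

REF_ID_SIZE = 4  # assumed: 4-byte little-endian unsigned RefIDs


def build_files(pairs):
    """Build the postings data and its index from (symbol, ref_id) pairs.

    Returns (postings, index) where postings is a bytes object holding the
    RefIDs grouped by symbol and sorted within each group, and index maps
    symbol -> (byte offset into postings, number of RefIDs).
    """
    postings = bytearray()
    index = {}
    # sorted(set(...)) orders by symbol, then RefID, and drops duplicate pairs
    for symbol, group in groupby(sorted(set(pairs)), key=itemgetter(0)):
        ref_ids = [ref_id for _, ref_id in group]
        index[symbol] = (len(postings), len(ref_ids))
        postings += struct.pack(f"<{len(ref_ids)}I", *ref_ids)
    return bytes(postings), index


def read_block(postings, index, symbol):
    """Read all RefIDs for one symbol as a single block."""
    offset, count = index.get(symbol, (0, 0))
    return list(struct.unpack_from(f"<{count}I", postings, offset))


if __name__ == "__main__":
    pairs = [("cat", 7), ("dog", 3), ("cat", 2), ("dog", 9), ("cat", 5)]
    postings, index = build_files(pairs)
    print(read_block(postings, index, "cat"))
```

    With real files, the same offset and count would drive a seek followed by one read of count * REF_ID_SIZE bytes.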

    Now I have decided to port this structure to a relational database (RDB).

    I'm looking for advice on choosing a database.

    I would like to access the database over TCP/IP, since my server and the database may be hosted on different boxes. An ODBC connection may restrict me from using some of the APIs provided by the database (performance is my major concern).

    Unfortunately I haven't been able to study and explore the MS SQL features. Are there APIs provided for native code? Or is there any approach to retrieve large data from the database in blocks of a defined size?

    I found MySQL to be somewhat suitable for me because of the APIs it provides. But I would like to know about other databases' features before committing to MySQL.

    I can share brief details of my application architecture with anyone who may be interested.

    Thanks in advance

    MTT 

    Monday, September 25, 2006 10:42 AM