locked
Problem in fetching data from sys.dm_fts_index_keywords_by_document table for the first time RRS feed

  • Question

  • Hello,

    Am storing a word's contents with full text index into a table in SQL SERVER 2008. The keyword is then retrieved using the code

    SELECT display_term FROM sys.dm_fts_index_keywords_by_document ( DB_ID(), tableid) Where mycondition

    Am executing a stored procedure from my application which saves the word document details into a table and again invokes a stored procedure right after the above save( containing the above code) inorder to retrieve the keyword. I find that during the first time when a fetch occurs from  sys.dm_fts_index_keywords_by_document, the keyword is not retrieved from the table though the data is present. When i execute the same for the second time, data is retrieved. I am not able to figure out the reason. I confirmed that in both the cases i have given same condition to retrieve a particular keyword. Is there any performance specific issues related to sys.dm_fts_index_keywords_by_document Please help me.

    Thanks,

    Arudi

    Friday, March 16, 2012 6:58 AM

All replies

  • The FTS might not have had a change to index the document the first time you run it.  Fulltext indexing is an asynchronous process.  You can use sys.dm_fts_index_population to see when population is complete.

    By the way, what is the use case for retrieving a keyword list from a Word doc right after you insert it?  I'm curious.

    Friday, March 16, 2012 4:23 PM
  • Hello Arudi,

    I propose you that a moderator to move your thread towards the Database Engine Forum where ( i think ) that it would be more suitable placed. A move by a moerator will keep all the posts of this thread and will have the advantage that you will not have to recreate this thread. Moreover, your thread will interest more people in the Database Engine Forum

    Have a nice day


    Mark Post as helpful if it provides any help.Otherwise,leave it as it is.

    Sunday, March 18, 2012 10:10 PM
  • Yes, i would like to get this thread moved to the appropriate forum.

    Thanks


    Arudi

    Monday, March 19, 2012 12:00 PM
  • Hello,

    I think i am confused. Before retreiving from sys.dm_fts_index_keywords_by_document, I tried retrieving the status and status description through sys.dm_fts_index_population. I found that in both the cases ie ( during the first time , when word is saved but data not fetched from  sys.dm_fts_index_keywords_by_document and second time when data is fetched) , the status description is  displayed as "Starting".  Then how the data is fetched in the second Save.  Is it because by the second save, the population has completed and again restarted. I went through the logs and its mentioned that  "Full-text Auto population completed for table or indexed view" after the first save. Let me know if  my understanding is wrong.

    Use case here is am trying to do a preceding wild card search for data saved in MS office documents. Preceding wild card search requires data to be stored in reverse format. Hence while inserting the data to table with full text index, in my Insert trigger am invoking a stored procedure to reverse the keywords. And that is done by fetching the keyword from sys.dm_fts_index_keywords_by_document . And this is where am facing the issue. 

    Let me know how this can be handled.

    Thanks


    Arudi

    Monday, March 19, 2012 12:28 PM
  • If you are using a trigger to do this, it is highly likely that you are running your query before the indexer is finished.  You may want to poll the DMV every few seconds until the status is "7" or "11".  If this table contains any other columns that might be updated apart from the word document, you will want to only do this when the documents are different.  But this requires doing a binary comparison of two varbinary columns, each row, so that isn't fantastic either.  The best approach here may be to have the trigger drop edited rows onto a Service Broker queue, then have the queue worker job look through a change table (after setting up Change Data Capture) to process tables asynchronously so you don't cause updates to slow to a crawl.  No approach is perfect, so you'll have to make a judgement call on what suits you best.
    Monday, March 19, 2012 2:41 PM