cannot create index on the view the view contains a self join


  • Why can I not create an index on a view that contains a self join? 


    Please don't give me the answer a view cannot reference another view as   that is not the question.    This is join of two tables.      


    I have built a table to support full text searching as Microsoft FTS does not support within X words at this time. 


    The table FTS1 has the following columns

    sID          int   (represents a document)

    wordPos  int

    word       char(40)

    with a cluster index on docID, wordPoc 


    I limit the within x words to a maximum of 50.


    the query looks like this: 

    select top 40 fts1a.*, fts1b.*, (fts1b.wordpos - fts1a.wordpos) as 'separation'

    from fts1 as fts1a

    join fts1 as fts1b

    on fts1a.sID = fts1b.sID

    and fts1a.wordPos < fts1b.wordPos

    and (fts1b.wordPos - fts1a.wordPos) <= 50

    and fts1a.word <> fts1b.word

    where fts1a.word = 'water'

    and fts1b.word = 'system'


    created this view:

    select fts1a.sid as 'sID'

    , fts1a.wordPos as 'word1pos'

    , fts1b.wordPos as 'word2pos'

    , fts1a.word as 'word1'

    , fts1b.word as 'word2'

    ,(fts1b.wordpos - fts1a.wordpos) as 'separation'

    from fts1 as fts1a

    join fts1 as fts1b

    on fts1a.sID = fts1b.sID

    and fts1a.wordPos < fts1b.wordPos

    and (fts1b.wordPos - fts1a.wordPos) <= 50

    and fts1a.word <> fts1b.word


    Clearly I want to create an index on the view to speed up the queries.  Currently getting 10 second response with over 36 million words indexed.


    My question why can I not create an indexed view on a self join? 


    I understand a self join has limitations but this self join does exactly what I need it to do.   Microsoft are you trying to save users from themselves or is there an inherent reason a self join cannot be supported on an indexed view?  


    Is there a work around?



    Thursday, December 11, 2008 1:48 AM

All replies

  • I asked this same question a week ago to one of the SQL Developers at Microsoft who offered to send me further information on a academic paper that wasn't from Microsoft, but covered problems related with creating an index view if I wanted.  I wasn't that interested, but I can ask him for a reference if you need one.


    Basically it boils down to this, the ability to maintain a transactional "indexed" view requires these restrictions.  When there is a change to the base table (insert/update/delete) this has to also update the view.  The most efficient way to do this transactionally is to place restrictions on the indexed view that allow predicatable cost efficient updates to occur.  Some really good examples were provided to me by the Microsoft Dev Team member as well as other MVP's for why this self join restriction exists.  Before I provide them, I need to ensure that none of them are covered by a NDA, but rest assured that limitations are required to allow for efficiency.  Even in Oracle, Materialized Views have similar restrictions and it has to do with efficiency there as well.


    One thing to consider:


    Try yourself to write the needed queries to insert new rows into the view, update the rows of the view, and delete the rows of the view you provided.  It is highly complex, and would be a very expensive operation to perform.  Now for your particular implementation it might not happen often, but you aren't the only person using SQL.  I might have indexed views on highly transactional systems where inserts/updates/deletes happen very often.

    Thursday, December 11, 2008 2:52 AM
  • Thanks Johnathan,


    No I don't think I am good enough to write the code to maintain a view but 90% of what SQL does I could not write.  I still don't buy why a self join is any harder.  I could implement two identical tables, populate them with identical data, write the idential view, and index that view.   If I can index a view on two identical tables then why can't I index a view on a self join?    


    A software developer will relase features based on demand and perceived value of that feature.   I hope that I have provide a complelling example of the need for indexed view on a self join.   I am indexing some documents with 1/2 million words but I only need to search on words within 50 of each other.   An index on the 50 delta combination is a very very valuable index and I just don't buy that it is more expensive or complex because of a self join.     


    If FTS provided a within X words feature I would not need to write this.   According to Microsoft this feature just missed SQL 2008 and is high on the list for the next release of FTS.    I know this is not an FTS topic but another important feature for FTS is the character position of the hit and length.   This is used to highlight the hits on either a rich text box or HTML.   For legal discovery within X and highlighting are essential.

    Thursday, December 11, 2008 3:04 PM
  • Gymmie,
    This guy posted a word around.
    Warning: it isn't pretty

    Tuesday, February 24, 2009 11:38 PM
  • Hi Jonathan,
    I'm stuck with this problem too and I'm curious about the reason for this limitation.
    I would really appreciate if you could point me to the academic paper you mention.

    Thanks a lot,
    Tuesday, February 24, 2009 11:42 PM
  • Carlos.V said:

    This guy posted a word around.
    Warning: it isn't pretty

    That post is not applicable to this problem which is why I didn't post it here.  That post shows how to work around a double join to a table which isn't really a self join, but for complexity in maintenance reasons is excluded from Indexed Views.

    If you'd like some resources to read up on the complexities have a look at some of the patents that have been created for their maintenance:
    Jonathan Kehayias
    Please click the Mark as Answer button if a post solves your problem!
    Wednesday, February 25, 2009 4:54 AM