none
store pdf in sql server RRS feed

  • Question

  • Hi,
    I am wrighting a business application for a university and they would like to be able to add a pdf file to the data for a person and be able to search that files.  I am currently designing the database in SQL Server 2008 and was wondering the best way to store the pdf files.  Should I just put it into a iamge(data type) column or another document in SQL Server?   What are the best practrices for this and can you point me to documentation that would help with designing and implementing such a task?
    Monday, April 6, 2009 7:37 AM

Answers

  • Hi,
    as far as I know the best way to search pdf's in ms sql 2008 is usng the full text search (http://msdn.microsoft.com/en-us/library/ms142571.aspx).

    In MS SQL 2008 you have the option of using varbinary(MAX) for BLOB(binary large objects) which allows you more smooth work than image.

    You'll need to create full text index on the collumn you're going to search(http://msdn.microsoft.com/en-us/library/ms187317.aspx).

    In order to search in pdf's you'll also need a third party iFilters for pdf files, since thay are not included in the server (iFilters for almost all other file types are included in MS SQL server). Search the adobe site for the latest version, but have in mind that there are 32 and 64 bit versions of the filters, depending on the version of your ms sql server. To see which filetypes are included in ms sql installation use :
    SELECT *
    FROM sys.fulltext_document_types
    
    When you install the iFilters on the server you'll need to call the following querry in order to load the filters in the full text search engine:
    EXEC sys.sp_fulltext_service 'load_os_resources', 1;
    GO
    
    EXEC sys.sp_fulltext_service 'update_languages', NULL;
    
    Than you can search the file content using CONTAINS or CONTAINSTABLE this way
    SELECT [ID],[Name],[FileContent]
    FROM [MyDatabase].[dbo].[Files]
    INNER JOIN 
    CONTAINSTABLE ([MyDatabase].[dbo].[Files], 
    ([Name], [FileContent]), 
    'ISABOUT( FORMSOF (INFLECTIONAL, Here goes your searched text) WEIGHT(0.9))', 
    language 'English') AS res
    ON res.[key]=[ID]
    
    If have questions you're welcome to write here or send me a personal mail.
    Monday, April 6, 2009 8:57 AM
  • Monday, April 6, 2009 8:07 AM

All replies

  • Monday, April 6, 2009 8:07 AM
  • Hi,
    as far as I know the best way to search pdf's in ms sql 2008 is usng the full text search (http://msdn.microsoft.com/en-us/library/ms142571.aspx).

    In MS SQL 2008 you have the option of using varbinary(MAX) for BLOB(binary large objects) which allows you more smooth work than image.

    You'll need to create full text index on the collumn you're going to search(http://msdn.microsoft.com/en-us/library/ms187317.aspx).

    In order to search in pdf's you'll also need a third party iFilters for pdf files, since thay are not included in the server (iFilters for almost all other file types are included in MS SQL server). Search the adobe site for the latest version, but have in mind that there are 32 and 64 bit versions of the filters, depending on the version of your ms sql server. To see which filetypes are included in ms sql installation use :
    SELECT *
    FROM sys.fulltext_document_types
    
    When you install the iFilters on the server you'll need to call the following querry in order to load the filters in the full text search engine:
    EXEC sys.sp_fulltext_service 'load_os_resources', 1;
    GO
    
    EXEC sys.sp_fulltext_service 'update_languages', NULL;
    
    Than you can search the file content using CONTAINS or CONTAINSTABLE this way
    SELECT [ID],[Name],[FileContent]
    FROM [MyDatabase].[dbo].[Files]
    INNER JOIN 
    CONTAINSTABLE ([MyDatabase].[dbo].[Files], 
    ([Name], [FileContent]), 
    'ISABOUT( FORMSOF (INFLECTIONAL, Here goes your searched text) WEIGHT(0.9))', 
    language 'English') AS res
    ON res.[key]=[ID]
    
    If have questions you're welcome to write here or send me a personal mail.
    Monday, April 6, 2009 8:57 AM
  • Hi,

    I know this is an old thread, but hope you can help me.

     

    I have set up SQL to store PDF BLOBs.

    I have created a full-text index for the VARBINARY column storing the BLOB.

    I have installed the Adobe iFilter for searching PDF BLOBs with full-text index.

    However, when I run the following query...

    SELECT [FileID],[FileDesc],[PubDate]
    FROM Compliance_Updates
    INNER JOIN 
    CONTAINSTABLE ([Compliance_Updates],([FileDesc], [FileData]), 'ISABOUT( FORMSOF (INFLECTIONAL, Mortgage) WEIGHT(0.9))',language 'English') AS res
    ON res.[key]=[FileID] 

    I don't get any results?

    I know the PDF BLOB contains the word 'Mortgage', so why don't I get any results?

    Your help is appreciated.

    Craig. 


    To err is human, to really balls things up you need Microsoft!

    Wednesday, November 6, 2013 10:42 AM