locked
Storing video frames in SQL Server 2008 RRS feed

  • Question

  • Hi,

    I'm working on an application where we get several MJPEG video streams and need to process each frame.
    The processing generates meta-data for each frame.

    I then need to store the frames and their meta-data, with playback capabilities.
    The worst case scenario is an average of 100KB per frame, 30fps, stored for 30 days.
    I'm aiming at staying within the limitations of SQL Server Express (1 cpu, 1GB RAM, and most of all 4GB database).

    I was leaning towards storing each frame as a seperate JPEG in a varbinary FILESTREAM, and the FILESTREAM data size is not counted for the database size, so that's ok.
    The problem is the identifying meta-data. This would require a 16-bytes GUID + an 8-byte timestamp + some unknown-size pointer to the file (sp_help returns a length of -1 for this column, which makes no sense to me). It amounts to at least 1.78GB per video stream in the aforementioned worst case scenario, so the mere identifiers would already limit me to 2 video streams.

    My questions are :

    1) Do you have any advice on limiting the size of the database ?

    2) What is the length of a varbinary FILESTREAM column ?

    3) Will I get good playback performance by storing each frame image seperately...
        a) like this :

    CREATE TABLE [dbo].[Videos](
        [ID] [smallint] IDENTITY(1,1) NOT NULL,
        [Name] [nvarchar](50) NOT NULL)
    
    CREATE TABLE [dbo].[Frames](
        [ID] [uniqueidentifier] ROWGUIDCOL NOT NULL,
        [VideoID] [smallint] NOT NULL,
        [Timestamp] [datetime] NOT NULL,
        [Image] [varbinary](max) FILESTREAM NULL
        /*other meta-data*/)

        b) by concatenating back the frames in an MJPEG, and finding the frame image using the timestamp, like this :

    CREATE TABLE [dbo].[Videos](
        [ID] [smallint] IDENTITY(1,1) NOT NULL,
        [GUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,
        [Name] [nvarchar](50) NOT NULL,
        [MJPEG] [varbinary](max) FILESTREAM NULL)
    
    CREATE TABLE [dbo].[Frames](
        [ID] [int] IDENTITY(1,1) NOT NULL, /* optional, I could have VideoID and Timestamp as a composite primary key ? */
        [VideoID] [smallint] NOT NULL,
        [Timestamp] [datetime] NOT NULL
        /*other meta-data*/)

        c) by using a normal varbinary BLOB (and getting a non-free edition of SQL Server) ?


    Thanks for any answer !
    • Edited by antbj Monday, May 4, 2009 10:30 AM
    Monday, May 4, 2009 10:22 AM

Answers

  • There is another possible alternative, and that would be to put each video in its own database.  In this scenario, you would have 2 databases that the application itself requires, a master metadata database that holds the information about the current video's available to the application, and the database name specific to each video, and a template database that is copied into a new database for each video project.  Then in your application tier, when an end user opens a specific video project, it changes database contexts to the appropriate database for that project and works against that database.  This can be a management nightmare if you have many projects running and active at once, but it is actually a fairly common method of managing projects similar to yours.  For example QualityCenter by HP (https://h10078.www1.hp.com/cda/hpms/display/main/hpms_content.jsp?zn=bto&cp=1-11-127-24_4000_100__) which is used for software QA testing manages projects in this manner.  This would give you 4GB per project which sounds like it would be sufficient.
    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
    • Marked as answer by antbj Monday, May 4, 2009 3:03 PM
    Monday, May 4, 2009 12:37 PM

All replies

  • There is another possible alternative, and that would be to put each video in its own database.  In this scenario, you would have 2 databases that the application itself requires, a master metadata database that holds the information about the current video's available to the application, and the database name specific to each video, and a template database that is copied into a new database for each video project.  Then in your application tier, when an end user opens a specific video project, it changes database contexts to the appropriate database for that project and works against that database.  This can be a management nightmare if you have many projects running and active at once, but it is actually a fairly common method of managing projects similar to yours.  For example QualityCenter by HP (https://h10078.www1.hp.com/cda/hpms/display/main/hpms_content.jsp?zn=bto&cp=1-11-127-24_4000_100__) which is used for software QA testing manages projects in this manner.  This would give you 4GB per project which sounds like it would be sufficient.
    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
    • Marked as answer by antbj Monday, May 4, 2009 3:03 PM
    Monday, May 4, 2009 12:37 PM
  • Hi Jonathan,

    Thanks for your reply. I suppose that to programmatically clone the template database (in SQL Server 2008), I would have to make a stored procedure calling BACKUP DATABASE then RESTORE DATABASE. Or is there a better practice for that?
    Monday, May 4, 2009 2:24 PM
  • Actually, you could do it programmatically with SMO.  See:

    http://blogs.msdn.com/benjones/archive/2008/09/21/making-a-database-clone-using-smo.aspx

    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
    Monday, May 4, 2009 2:33 PM
  • Thanks, that answers my first question.
    I'm still wondering about questions 2 and 3 though.
    Monday, May 4, 2009 3:03 PM
  • I wasn't ignoring this, I had to ask for some additional information after researching the answer for #2, because I don't know enough about FileStream to answer your question.  For #2, according to Paul Randal, you can use the DATALENGTH() function to get the size of the data in a filestream column of a row.  He also suggested that you could use an additional int/bigint column to store the size of the data in and then you don't have to make the function call to retrieve the size.  As for #3, I couldn't begin to tell you what the best design table structure wise is going to be for your specific project.  You might consider building a small video and testing the retrieval rates to validate that they are going to meet your needs before actually commiting yourself to them large scale.  On the surface I don't see a problem with how you are thinking your design through though.  Based on what you have said, I wouldn't choose 3b as an option because the sizes that you are talking about dealing with will be faster coming out of filestream based on the tests and performance comparisons I have read.
    Jonathan Kehayias
    http://sqlblog.com/blogs/jonathan_kehayias/
    http://www.sqlclr.net/
    Please click the Mark as Answer button if a post solves your problem!
    Monday, May 4, 2009 7:47 PM
  • Thanks Jonathan for your answers.

    I actually meant the size taken up by the varbinary column in the database itself, as opposed to the length of the data it points to.
    But apparently that must be 0, the database engine must be using the 16-bit GUID to locate the file, and therefore does not need to store anything in the varbinary column.
    Tuesday, May 5, 2009 9:30 AM