locked
Manipulating large amounts of data RRS feed

  • Question

  • I have large amount of data that i want to put in database or xml file in order to read, edit and delete "records". Each record has about 25 "members", some of the members represents binary files. I can use .dat file, it's much easier, but if i want to update it, i should delete the existing .dat file first, then write the updated file to the file system, ofcourse that's not a big problem. There are 4000-5000 records. The total size of all records would be nearly 4-5 gb. So, which is better for such task - sql database or xml file? Or use .dat file?
    Monday, May 25, 2009 4:16 PM

Answers

  • SQL Server will scale better than XML files for something like this.

    However, I'd recommend leaving your .dat files externally on the file system, and storing a reference (filename) to the .dat file in the data table in SQL.  This will make querying very quick (5000 records is tiny for SQL server, but can be somewhat large for XML - or will end up held in memory).

    As for your writing the dat file issue - my recommendation would be to write a ~new~ .dat file, then update the DB, and finally, delete the old one.  This gives you a way to do the update in a more atomic, safe way.  If you delete your .dat file on disk first, then have an exception while creating the new one, you may lose data...


    Reed Copsey, Jr. - http://reedcopsey.com
    • Proposed as answer by JohnGrove Monday, May 25, 2009 4:24 PM
    • Marked as answer by Harry Zhu Tuesday, June 2, 2009 6:43 AM
    Monday, May 25, 2009 4:23 PM

All replies

  • SQL Server will scale better than XML files for something like this.

    However, I'd recommend leaving your .dat files externally on the file system, and storing a reference (filename) to the .dat file in the data table in SQL.  This will make querying very quick (5000 records is tiny for SQL server, but can be somewhat large for XML - or will end up held in memory).

    As for your writing the dat file issue - my recommendation would be to write a ~new~ .dat file, then update the DB, and finally, delete the old one.  This gives you a way to do the update in a more atomic, safe way.  If you delete your .dat file on disk first, then have an exception while creating the new one, you may lose data...


    Reed Copsey, Jr. - http://reedcopsey.com
    • Proposed as answer by JohnGrove Monday, May 25, 2009 4:24 PM
    • Marked as answer by Harry Zhu Tuesday, June 2, 2009 6:43 AM
    Monday, May 25, 2009 4:23 PM

  • For me, i will go with XML, but based on the size of the data you should use SQL. I know most people will choose SQL over others. If my data will be grouped into different table, then i will use SQL. The reason i dont like SQL is that too much security and always crash. Another thing, if you are comfortable with SQL then go for it, because it more advanced when you dealing with data storage.

    kaymaf


    I hope this helps, if that is what you want, just mark it as answer so that we can move on
    Monday, May 25, 2009 4:30 PM
  • I see. But i cannot use local database with LINQ, when i try to drag table into the object relational designer, VS throws this error: "the selected object(s) use an unsupported data provider". Why?
    Monday, May 25, 2009 6:06 PM
  • > "However, I'd recommend leaving your .dat files externally on the file system, and storing a reference (filename) to the .dat file in the data table in SQL"

    SQL Server has a new feature (FILESTREAM) that makes this unnecessary; LOBs can be stored within the file system external to the DB, but still queried and processed directly by T-SQL code.

    Michael Asher
    Monday, May 25, 2009 6:24 PM
  • masher2:

    That only works in SQL Server 2008, though, right?  I wasn't sure which version of SQL server the poster was using...
    Reed Copsey, Jr. - http://reedcopsey.com
    Monday, May 25, 2009 6:36 PM
  • A RDBMS like Oracle or SQL Server would be more suited than an xml file since you have 4-5 GB of data.

    Ganesh Ranganathan
    [Please mark the post as answer if you find it helpful]
    Monday, May 25, 2009 6:36 PM
  • Reed,

    Right; it's a 2008 feature.
    Michael Asher
    Monday, May 25, 2009 6:38 PM