C# Compile office docs from WSS DB data

  • Question

  • This code was designed to extract the documents from a Windows SharePoint Services database. It runs, but the data within each file does not extract correctly. Does anyone have expertise on how this data should be reassembled in C# to recover these documents?

    // BEGIN SPDBEX.CS
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    namespace spdbex
    {
        class Program
        {
            static void Main(string[] args)
            {
                // SharePoint content DB connection string
                // (verbatim string, so the backslash is not read as an escape)
                string dbConnString =
                    @"Server=SERVER\SQLEXPRESS;" +
                    "Database=WSS_Content_DBNAME;Trusted_Connection=True;";

                // create a DB connection; "using" closes it even on error
                using (SqlConnection con = new SqlConnection(dbConnString))
                {
                    con.Open();

                    // the query to grab all the files
                    SqlCommand com = con.CreateCommand();
                    com.CommandText = "SELECT ad.SiteId, ad.Id, ad.DirName," +
                        " ad.LeafName, ads.Content" +
                        " FROM AllDocs ad, AllDocStreams ads" +
                        " WHERE ad.SiteId = ads.SiteId" +
                        " AND ad.Id = ads.Id" +
                        " AND ads.Content IS NOT NULL" +
                        " ORDER BY DirName";

                    // execute query
                    using (SqlDataReader reader = com.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            // grab the file's directory and name
                            string dirName = (string)reader["DirName"];
                            string leafName = (string)reader["LeafName"];

                            // create directory for the file if it doesn't yet exist
                            // (an empty DirName means the current directory)
                            if (dirName.Length > 0 && !Directory.Exists(dirName))
                            {
                                Directory.CreateDirectory(dirName);
                                Console.WriteLine("Creating directory: " + dirName);
                            }

                            // create a filestream to spit out the file
                            using (FileStream fs = new FileStream(
                                Path.Combine(dirName, leafName),
                                FileMode.Create, FileAccess.Write))
                            using (BinaryWriter writer = new BinaryWriter(fs))
                            {
                                // depending on the speed of your network,
                                // you may want to change the buffer size (it's in bytes)
                                int bufferSize = 1000000;
                                long startIndex = 0;
                                long retval = 0;
                                byte[] outByte = new byte[bufferSize];

                                // grab the file out of the db one chunk
                                // (of size bufferSize) at a time
                                do
                                {
                                    // column 4 is ads.Content
                                    retval = reader.GetBytes(4, startIndex, outByte, 0,
                                        bufferSize);
                                    // advance by the number of bytes actually read
                                    startIndex += retval;
                                    writer.Write(outByte, 0, (int)retval);
                                    writer.Flush();
                                } while (retval == bufferSize);
                            }

                            // the using blocks above have closed the file
                            Console.WriteLine("Finished writing file: " + leafName);
                        }
                    }
                }
            }
        }
    }
    // END SPDBEX.CS

    The documents export, but the resulting files are damaged/corrupt. When I do get a text-only view of a file, it still looks like the raw WSS 3.0 content data stored in the SharePoint tables.

    For example, a Word document (2007/2010 version):

    504b0304140006000800000021000baee64eeb010000e9090000130008025b436f6e74656e745f54797065735d2e786d6c20a2040228a000020000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    The Word documents do at least get the correct filenames and file formats, but the raw data doesn't get assembled correctly, and the result is a corrupt, unusable document.
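One observation that may help narrow this down: Word 2007/2010 files are ZIP packages, and a ZIP archive normally starts with the bytes 50 4B 03 04 ("PK"). The dump above starts with exactly those bytes, followed by the entry name [Content_Types].xml. The sketch below is only a sanity check, not a fix. It assumes the dump is a hex rendering of the Content column, and the `OoxmlCheck` class and its methods are hypothetical helpers that are not part of SPDBEX.CS. It decodes the first bytes of the dump and tests for the ZIP signature.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original SPDBEX.CS.
static class OoxmlCheck
{
    // ZIP local file header signature: "PK" 0x03 0x04.
    static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Convert a hex string such as "504b0304" into the bytes it represents.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.");
        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // True if the data begins with the ZIP local file header signature.
    public static bool StartsWithZipSignature(byte[] data)
    {
        if (data.Length < ZipSignature.Length) return false;
        for (int i = 0; i < ZipSignature.Length; i++)
            if (data[i] != ZipSignature[i]) return false;
        return true;
    }

    static void Main()
    {
        // First 49 bytes of the dump quoted above.
        string hex = "504b0304140006000800000021000baee64eeb010000e9090000" +
                     "130008025b436f6e74656e745f54797065735d2e786d6c";
        byte[] data = HexToBytes(hex);

        // prints "True"
        Console.WriteLine(StartsWithZipSignature(data));
        // The entry name starts at offset 30 and is 19 bytes long;
        // prints "[Content_Types].xml"
        Console.WriteLine(Encoding.ASCII.GetString(data, 30, 19));
    }
}
```

If the same check is run against the first four bytes of an exported file and fails, the bytes on disk differ from what the dump shows. For instance, the file may contain the hex text itself rather than the decoded bytes. That would be worth ruling out before looking elsewhere.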

    Any help is greatly appreciated...

    Best Regards,


    Steve Kline
    Microsoft Certified IT Professional: Server Administrator
    Microsoft Certified Product Specialist
    Microsoft Certified Network Product Specialist
    This posting is "as is" without warranties and confers no rights.

    • Edited by Steve Kline Monday, August 30, 2010 3:32 AM
    • Moved by Mike Dos Zhang Monday, August 30, 2010 1:28 PM sql server question (From:Visual C# Language)
    • Moved by edhickey Monday, August 30, 2010 3:22 PM (From:SQL Server Modeling)
    Friday, August 27, 2010 2:17 PM

All replies

  • In case you are wondering: the vendor that holds our backups (30-day retention) could not recover a clean version of this database. The database was backed up, but the backup of this particular database file was dirty. Again, help is appreciated.

    Best Regards,


    Steve Kline
    Friday, August 27, 2010 2:31 PM
  • Is there no one who can help with this issue?
    Steve Kline
    Monday, August 30, 2010 3:33 AM
  • Hi Steve,

    You've asked this in the SQL Server Modeling Forum, so we don't have the expertise here to help you.  I will move this to a Forum where I hope you can get an answer to your question.

    Ed

    Monday, August 30, 2010 3:21 PM
  • Thank you, Ed.
    Steve Kline
    Monday, August 30, 2010 8:05 PM
  • Hello Steve,

    Is your issue solved? Are you able to extract documents from the WSS DB correctly?

    I have a similar situation. Please let me know if you have found a final solution.

    Thursday, September 2, 2010 8:36 AM
  • I have the same problem... Any suggestions?

    Please reply here and notify me by email if you have any ideas: hobbe_h@hotmail.com

    Tuesday, October 19, 2010 11:49 AM