locked
How to use GZIP compression with C# SqlBulkCopy method for a table in SQL database RRS feed

  • Question

  • Hello!

    * I use: Enterprise core edition SQL server

    I have an SQL question. I know SQL question normally should be asked in: "https://docs.microsoft.com/en-us/answers/questions"

    But I think this question is more of a C# question, so I try this out here.

    I am trying to understand how compression work and would like to compress the whole table using the: GZIP Algorithm. I have looked at this very well written link about this: https://www.mssqltips.com/sqlservertip/5709/using-compress-and-decompress-in-sql-server-to-save-disk-space/

    Below code that I have now only inserts data into the table uncompressed. As seen which is very important, I use the "SqlBulkCopy" function to insert data in bulk which is much more effcient, instead of inserting one row at a time.

    I will show step by step of how I do this and my question simply will be, how can I instead insert all those values for ALL the columns using the GZIP algorithm to save a lot of space. As seen in the link the GZIP only takes up 6% of the space.

    1. Create the table: table123

    CREATE TABLE [dbo].[table123] (
         [_DateTime]        SMALLDATETIME DEFAULT (getdate()) NOT NULL,
         [_DayNr]           TINYINT       DEFAULT ((0)) NOT NULL,
         [_CategoryNbr]     TINYINT       DEFAULT ((0)) NOT NULL,
         [_FeatureNbr]      SMALLINT      DEFAULT ((-1)) NOT NULL,
         [_Value]           FLOAT (53)                       NULL,
         [_Bool]            BIT                              NULL,
         CONSTRAINT [PK_table123] PRIMARY KEY CLUSTERED ([_DayNr] ASC, [_DateTime] ASC, [_CategoryNbr] ASC, [_FeatureNbr] ASC),
         CONSTRAINT [UC_table123] UNIQUE NONCLUSTERED ([_FeatureNbr] ASC, [_DateTime] ASC)
     );

    2. I will call this function that in its name tells what it does: writeTo_table123_in_Gzip_Compressed_Format

    a) I will here then first, create a "DataTable"(15000 rows) with values by calling the function: createDataTable

    b) I will open a connection to the SQL database: SqlConnection conn

    c)  I will insert bulk Records into the: "table123" using: objbulk.WriteToServer(tbl); 

    How would it in c) be possible to insert all this to "table123" using the GZIP algorithm method?

            void writeTo_table123_in_Gzip_Compressed_Format()
            {
                //Get "DataTable" with values
                DataTable tbl = new DataTable();
                List<String> columnNameLIST = new List<String>();
                createDataTable(out tbl, out columnNameLIST);
    
    
                //Open SQL connection and write this "DataTable" in bulk to "table123" in the database
                using (SqlConnection conn = new SqlConnection(GetConnectionString()))
                {
                    conn.Open();
                    SqlBulkCopy objbulk = new SqlBulkCopy(conn); //create object of SqlBulkCopy which help to insert  
                    objbulk.BatchSize = 0;  //numbers of rows in each batch to write to server. zero if no value has been set.
                    objbulk.BulkCopyTimeout = 1200;  //seconds for timeout
                    objbulk.DestinationTableName = "table123";  //assign Destination table name
                    for (int i = 0; i < columnNameLIST.Count; i++)
                    {
                        objbulk.ColumnMappings.Add(columnNameLIST[i], columnNameLIST[i]);
                    }
    
                    //Now dump tbl to: "table123" in database
                    objbulk.WriteToServer(tbl); //insert bulk Records into DataBase.  
                }
            }
            void createDataTable(out DataTable tbl, out List<String> columnNameLIST)
            {
                columnNameLIST = new List<String>();
                tbl = new DataTable(); //Add all values to table now! (Add the column names first)
                columnNameLIST.Add("_DateTime"); columnNameLIST.Add("_DayNr"); columnNameLIST.Add("_CategoryNbr"); columnNameLIST.Add("_FeatureNbr"); columnNameLIST.Add("_Value"); columnNameLIST.Add("_Bool");
                for (int i = 0; i < columnNameLIST.Count; i++)
                {
                    DataColumn column = new DataColumn();
                    if (i == 0) { column.AllowDBNull = false; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(DateTime); } //SMALLDATETIME (_DateTime)
                    if (i == 1) { column.AllowDBNull = false; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(byte); } //TINYINT (_DayNr)
                    if (i == 2) { column.AllowDBNull = false; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(byte); } //TINYINT (_CategoryNbr)
                    if (i == 3) { column.AllowDBNull = false; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(Int16); } //SMALLINT (_FeatureNbr)
                    if (i == 4) { column.AllowDBNull = true; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(double); } //FLOAT(53) (_Value)
                    if (i == 5) { column.AllowDBNull = true; column.ColumnName = columnNameLIST[i]; column.DataType = typeof(bool); } //BIT (_Bool)
                    tbl.Columns.Add(column);
                }
                //Add all rows with values for all columns
                DateTime datetime = DateTime.Now; Int16 _featurenr = -1;
                for (int i = 0; i < 30000; i++) //Each row has a DateTime update: "201005011830"
                {
                    DataRow dr = tbl.NewRow(); 
                    dr["_DateTime"] = datetime;
                    dr["_DayNr"] = Convert.ToByte(datetime.Day);
                    dr["_CategoryNbr"] = (byte)10;
    
                    _featurenr++;
                    dr["_FeatureNbr"] = _featurenr; 
                    dr["_Value"] = 12.2549874239;
                    dr["_Bool"] = true;
    
                    tbl.Rows.Add(dr);
                    datetime = datetime.AddMinutes(1);
                }
            }
            public String GetConnectionString() { return "Data Source=WIN-S5P46PFR8ST;Initial Catalog=C:\\AI\\ARTIFICIAL INTELLIGENCE\\FEATURESDATABASE1.MDF;Integrated Security=True;Connect Timeout=3600"; }










    • Edited by Silvers11 Tuesday, November 10, 2020 1:20 AM
    Monday, November 9, 2020 7:01 PM

Answers

  • Hi Silvers11,

    Thank you for posting here.

    At first I tried to find a way to use SqlBulkCopy with the COMPRESS function in SQL server, but it seems that this method does not exist.

    So I can only try to compress the string first, and then write the compressed string to the database.

    Although it is not as efficient as the compression in the link, it will be significantly improved.

    Code for compression and decompression:

        public static class Extensions
        {
            public static byte[] Compress(this string uncompressedString)
            {
                var bytes = Encoding.UTF8.GetBytes(uncompressedString);
    
                using (MemoryStream memory = new MemoryStream())
                {
                    using (GZipStream gzip = new GZipStream(memory,
                        CompressionMode.Compress, true))
                    {
                        gzip.Write(bytes, 0, bytes.Length);
                    }
                    return memory.ToArray();
                }
            }
    
            public static string Unzip(this byte[] bytes)
            {
                using (MemoryStream memory = new MemoryStream())
                {
                    using (GZipStream gzip = new GZipStream(memory,
                        CompressionMode.Compress, true))
                    {
                        gzip.Write(bytes, 0, bytes.Length);
                    }
                    return Encoding.UTF8.GetString(memory.ToArray());
                }
            }
        }

    Best Regards,

    Timon


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Tuesday, November 10, 2020 6:11 AM