none
Saving HTML from MemoryStream into a Excel file. RRS feed

  • Question

  • I have a XSLT transformed HTML data in MemoryStream (in C#). I am trying to convert this to an Excel format before emailing, preferably conversion happens all in memory again without saving to local disk. I can worry about the email attachment part later. Can anyone point me to a sample on how I could do the conversion from HTML to Excel format either through OpenXML or with Office.Interop.Excel.

    The HTML data is well formed and I could manually do the conversion by opening the html in Excel application and do a Save As to save it in xlsx format (Office 2010), no problem. I also tried to simply change the .html extension to .xlsx, but then excel complains about opening it.

    What's the best way to automate the manual SaveAs action so that I could use the same html data in Excel format? I understand that I could create a separate .xslt for directly converting my XML into Excel format. But, that'll be too many .xslt to maintain. I'm trying to find the hack to let Excel do the work for me.

    Thank you for any and all pointers in advance!

    Monday, August 11, 2014 10:08 PM

Answers

  • Hi,

    According to your description, I suspect that you have an HTML file and you want to convert it to be an Excel file programmatically.

    We can achieve the goal with Excel Object Model (Office.Interop.Excel). You could open the HTML file and then save it as an Excel file with Open and SaveAs method.

    Here is a sample to save as an *.xlsx file for your reference.

    using Excel=Microsoft.Office.Interop.Excel;
    
    private void ConvertHTMLToExcel()
    {
        string pathHTML = @"C:\Documents\HTMLPage.html";
        string pathExcel = @"C:\Documents\HTMLPage.xlsx";
        Excel.Application app = new Excel.Application();
        Excel.Workbook wb= app.Workbooks.Open(pathHTML);
        wb.SaveAs(pathExcel, Excel.XlFileFormat.xlOpenXMLWorkbook);
        wb.Close();
        app.Quit();
    }

    Since Open XML SDK is used to operate the Open XML format files such as Word document, Excel spreadsheet or PowerPoint presentation, I'm afraid we cannot operate an HTML file and convert it to be an Excel spreadsheet with Open XML SDK.


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Tuesday, August 12, 2014 9:01 AM
    Moderator
  • Below method is final code that takes bytes[] as html and returns xlsx 

    public static byte[] DoConvertXlDataToOpenXml(byte[] data, 
    FileInfo fileInfo)
            {
                ExcelInterop.Application excelApp = null;
                ExcelInterop.Workbooks workBooks = null;
                ExcelInterop.Workbook workBook = null;
                FileInfo tempFile = null;
                FileInfo convertedTempFile = null;
    
                try
                {
                    //Stream the file to temporary location, overwrite if exists
                    tempFile = new FileInfo(Path.ChangeExtension(Path.Combine(Path.GetTempFileName()), fileInfo.Extension));
    
                    using (var destStream = new FileStream(tempFile.FullName, FileMode.Create, FileAccess.Write))
                    {
                        destStream.Write(data, 0, data.Length);
                    }
    
                    //open original
                    excelApp = new ExcelInterop.Application();
                    excelApp.Visible = false;
                    excelApp.DisplayAlerts = false;
    
                    workBooks = excelApp.Workbooks;
                  
                    workBook = workBooks.Open(tempFile.FullName);
    
                    convertedTempFile = new FileInfo(Path.ChangeExtension(Path.GetTempFileName(), "XLSX"));
    
                    //Save as XLSX
                    excelApp.Application.ActiveWorkbook.SaveAs(
                         convertedTempFile.FullName
                         , Microsoft.Office.Interop.Excel.XlFileFormat.xlOpenXMLWorkbook
                         , ConflictResolution: ExcelInterop.XlSaveConflictResolution.xlLocalSessionChanges);
    
                    excelApp.Application.ActiveWorkbook.Close();
    
                    return File.ReadAllBytes(convertedTempFile.FullName);
                }
                catch (Exception)
                {
                    throw;
                }
                finally
                {
                    if (workBooks != null)
                        Marshal.ReleaseComObject(workBooks);
    
                    if (workBook != null)
                        Marshal.ReleaseComObject(workBook);
    
                    if (excelApp != null)
                        Marshal.ReleaseComObject(excelApp);
    
                    if (tempFile != null && tempFile.Exists)
                        tempFile.Delete();
    
                    if (convertedTempFile != null && convertedTempFile.Exists)
                    {
                        convertedTempFile.Delete();
                    }
                }
            }


    • Marked as answer by MohanMia Friday, August 15, 2014 1:54 PM
    Friday, August 15, 2014 1:54 PM

All replies

  • Hi,

    According to your description, I suspect that you have an HTML file and you want to convert it to be an Excel file programmatically.

    We can achieve the goal with Excel Object Model (Office.Interop.Excel). You could open the HTML file and then save it as an Excel file with Open and SaveAs method.

    Here is a sample to save as an *.xlsx file for your reference.

    using Excel=Microsoft.Office.Interop.Excel;
    
    private void ConvertHTMLToExcel()
    {
        string pathHTML = @"C:\Documents\HTMLPage.html";
        string pathExcel = @"C:\Documents\HTMLPage.xlsx";
        Excel.Application app = new Excel.Application();
        Excel.Workbook wb= app.Workbooks.Open(pathHTML);
        wb.SaveAs(pathExcel, Excel.XlFileFormat.xlOpenXMLWorkbook);
        wb.Close();
        app.Quit();
    }

    Since Open XML SDK is used to operate the Open XML format files such as Word document, Excel spreadsheet or PowerPoint presentation, I'm afraid we cannot operate an HTML file and convert it to be an Excel spreadsheet with Open XML SDK.


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Tuesday, August 12, 2014 9:01 AM
    Moderator
  • Many thanks for the sample code. I looked at this Workbooks.Open(). but it only takes FileName as a string. but I'm trying to load the input HTML from a MemoryStream object. I would like to avoid storing that MemoryStream to disk and again read from disk with this Open() method. 

    Is there any way to directly process this MemoryStream and convert directly to .xlsx format in MemortyStream with SavaAs kind of method?

    Thanks

    Tuesday, August 12, 2014 2:20 PM
  • Hello Mohan,

    The Excel object model doesn't provide such methods.

    Tuesday, August 12, 2014 3:07 PM
  • Thanks Eugene for the reply. I figured I have no choice but to store html to disk and read it back.

    When I did try though, getting the exception with HRESULT: 0x800A03EC on the SaveAs method. Here's how to reproduce it.

    steps to reproduce the behavior

    1. Save this text

    <html><head></head><body><center><h1>Truck / Facility Reconciliation</h1></center></body></html>

    as Test.html in C:\

    2. after making reference to Excel interop like this,

    using Excel = Microsoft.Office.Interop.Excel;

    Try this code

    var app = new Excel.Application();
    Excel.Workbook wb = null;
    
    	       try
                   {
                        wb = app.Workbooks.Open(@"c:\test.html");
                        wb.SaveAs(@"c:\test.xlsx", Excel.XlFileFormat.xlOpenXMLWorkbook);
                        //wb.SaveCopyAs(@"c:\test.xlsx");
                        
                    }
                    catch (Exception ex)
                    {
                        //_logger.Error(ex);
                    }
                    finally
                    {
                        wb.Close();
                        app.Quit();
                    }         
               

    I always get the exception on SaveAs no matter which fileformat I choose or even not mentioning the fileformat.

    Any ideas?





    • Marked as answer by MohanMia Wednesday, August 13, 2014 1:47 PM
    • Edited by MohanMia Wednesday, August 13, 2014 1:51 PM Updated the sample code to make it work
    • Unmarked as answer by MohanMia Friday, August 15, 2014 1:54 PM
    Tuesday, August 12, 2014 5:05 PM
  • Hi,

    I think the line with SaveAs method in your code is wrong since the second argument is "Excel.XlFileFormat.xlOpenDocumentSpreadsheet". You can refer to the XlFileFormat Enumeration to learn to specify the file format when saving the worksheet. 

    In my sample, the second argument should be "Excel.XlFileFormat.xlOpenXMLWorkbook" since I will save the file as an *.xlsx file. So I suggest you editing it.


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.

    Wednesday, August 13, 2014 3:28 AM
    Moderator
  • Thanks Luna Zhang.

    I thought I did try OpenXMLWorkbook format too. Apparently I didn't. Now it worked. I'm able to save it as xlsx. there was a forum post I was reading which suggested to change a lot of security settings using DCOMCNFG for this to work which would  been a nightmare.

    Wednesday, August 13, 2014 1:50 PM
  • Below method is final code that takes bytes[] as html and returns xlsx 

    public static byte[] DoConvertXlDataToOpenXml(byte[] data, 
    FileInfo fileInfo)
            {
                ExcelInterop.Application excelApp = null;
                ExcelInterop.Workbooks workBooks = null;
                ExcelInterop.Workbook workBook = null;
                FileInfo tempFile = null;
                FileInfo convertedTempFile = null;
    
                try
                {
                    //Stream the file to temporary location, overwrite if exists
                    tempFile = new FileInfo(Path.ChangeExtension(Path.Combine(Path.GetTempFileName()), fileInfo.Extension));
    
                    using (var destStream = new FileStream(tempFile.FullName, FileMode.Create, FileAccess.Write))
                    {
                        destStream.Write(data, 0, data.Length);
                    }
    
                    //open original
                    excelApp = new ExcelInterop.Application();
                    excelApp.Visible = false;
                    excelApp.DisplayAlerts = false;
    
                    workBooks = excelApp.Workbooks;
                  
                    workBook = workBooks.Open(tempFile.FullName);
    
                    convertedTempFile = new FileInfo(Path.ChangeExtension(Path.GetTempFileName(), "XLSX"));
    
                    //Save as XLSX
                    excelApp.Application.ActiveWorkbook.SaveAs(
                         convertedTempFile.FullName
                         , Microsoft.Office.Interop.Excel.XlFileFormat.xlOpenXMLWorkbook
                         , ConflictResolution: ExcelInterop.XlSaveConflictResolution.xlLocalSessionChanges);
    
                    excelApp.Application.ActiveWorkbook.Close();
    
                    return File.ReadAllBytes(convertedTempFile.FullName);
                }
                catch (Exception)
                {
                    throw;
                }
                finally
                {
                    if (workBooks != null)
                        Marshal.ReleaseComObject(workBooks);
    
                    if (workBook != null)
                        Marshal.ReleaseComObject(workBook);
    
                    if (excelApp != null)
                        Marshal.ReleaseComObject(excelApp);
    
                    if (tempFile != null && tempFile.Exists)
                        tempFile.Delete();
    
                    if (convertedTempFile != null && convertedTempFile.Exists)
                    {
                        convertedTempFile.Delete();
                    }
                }
            }


    • Marked as answer by MohanMia Friday, August 15, 2014 1:54 PM
    Friday, August 15, 2014 1:54 PM