none
OpenXML SDK 2.5 Excel File Corruption RRS feed

  • Question

  • I've come across a problem in using the OpenXML SDK 2.5 where the spreadsheet cannot be opened without losing data when you fill cells past column z (all on the same row). The Issue is that in the xml for the sheet, column references AA, AB, AC, etc sort BEFORE column B in the cell list. The consequence of this is that the data for cells A, AA, AB, AC, etc is written, but the data for the cells B to Z is stripped out of the recovered spreadsheet.

    Create a blank spreadsheet (.xlsx), and, using the SDK, put data into cells A1, B1, C1, AA1, AB1, AC1. Save the file.

    Try to open the spreadsheet in excel and receive the message that the file is essentially corrupt. Recover the spreadsheet and see that there is data in cells A1, AA1, AB1 and AC1, but that in B1 and C1 is missing.

    If you go into the xml internals of the spreadsheet and move the structures for cells AA and beyond to after that for column C (in this case) and then reopen the file, it's fine.

    I am currently working around this by starting my spreadsheet in column AA.

    Mike

    Tuesday, May 31, 2016 12:51 AM

All replies

  • Hi Mike,

    >>Create a blank spreadsheet (.xlsx), and, using the SDK, put data into cells A1, B1, C1, AA1, AB1, AC1. Save the file.

    It would be helpful if you could share us your code, and then we could try to reproduce your issue.

    I suggest you try Open XML SDK 2.5 Productivity Tool for Microsoft Office, open the file and click Validate to check whether there is any validation error.

    Best Regards,

    Edward


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.


    Tuesday, May 31, 2016 6:42 AM
  • Hi Edward,

    Thanks for your reply.

    Files posted to the following link:

    https://onedrive.live.com/redir?resid=5AA6707DF8135167!2544&authkey=!AIalpM4FYhMWB7s&ithint=folder%2c

    There's a blank spreadsheet and some code

    Mike

    Friday, June 3, 2016 12:08 AM
  • Hi Mike,

    Based on your code, it seems you want to write string to AA6, am I right? If so, I suggest you try below code:

     // Given a document name and text, 
            // inserts a new work sheet and writes the text to cell "A1" of the new worksheet.
    
            public  void InsertText(string docName, string text)
            {
                // Open the document for editing.
                using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(docName, true))
                {
                    // Get the SharedStringTablePart. If it does not exist, create a new one.
                    SharedStringTablePart shareStringPart;
                    if (spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>().Count() > 0)
                    {
                        shareStringPart = spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();
                    }
                    else
                    {
                        shareStringPart = spreadSheet.WorkbookPart.AddNewPart<SharedStringTablePart>();
                    }
    
                    // Insert the text into the SharedStringTablePart.
                    int index = InsertSharedStringItem(text, shareStringPart);
    
                    // Insert a new worksheet.
                    WorksheetPart worksheetPart = InsertWorksheet(spreadSheet.WorkbookPart);
    
                    // Insert cell A1 into the new worksheet.
                    Cell cell = InsertCellInWorksheet("AA", 1, worksheetPart);
    
                    // Set the value of cell A1.
                    cell.CellValue = new CellValue(index.ToString());
                    cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
    
                    // Save the new worksheet.
                    worksheetPart.Worksheet.Save();
                }
            }
    
            // Given text and a SharedStringTablePart, creates a SharedStringItem with the specified text 
            // and inserts it into the SharedStringTablePart. If the item already exists, returns its index.
            private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)
            {
                // If the part does not contain a SharedStringTable, create one.
                if (shareStringPart.SharedStringTable == null)
                {
                    shareStringPart.SharedStringTable = new SharedStringTable();
                }
    
                int i = 0;
    
                // Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
                foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())
                {
                    if (item.InnerText == text)
                    {
                        return i;
                    }
    
                    i++;
                }
    
                // The text does not exist in the part. Create the SharedStringItem and return its index.
                shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
                shareStringPart.SharedStringTable.Save();
    
                return i;
            }
    
            // Given a WorkbookPart, inserts a new worksheet.
            private static WorksheetPart InsertWorksheet(WorkbookPart workbookPart)
            {
                // Add a new worksheet part to the workbook.
                WorksheetPart newWorksheetPart = workbookPart.AddNewPart<WorksheetPart>();
                newWorksheetPart.Worksheet = new Worksheet(new SheetData());
                newWorksheetPart.Worksheet.Save();
    
                Sheets sheets = workbookPart.Workbook.GetFirstChild<Sheets>();
                string relationshipId = workbookPart.GetIdOfPart(newWorksheetPart);
    
                // Get a unique ID for the new sheet.
                uint sheetId = 1;
                if (sheets.Elements<Sheet>().Count() > 0)
                {
                    sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
                }
    
                string sheetName = "Sheet" + sheetId;
    
                // Append the new worksheet and associate it with the workbook.
                Sheet sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = sheetName };
                sheets.Append(sheet);
                workbookPart.Workbook.Save();
    
                return newWorksheetPart;
            }
    
            // Given a column name, a row index, and a WorksheetPart, inserts a cell into the worksheet. 
            // If the cell already exists, returns it. 
            private static Cell InsertCellInWorksheet(string columnName, uint rowIndex, WorksheetPart worksheetPart)
            {
                Worksheet worksheet = worksheetPart.Worksheet;
                SheetData sheetData = worksheet.GetFirstChild<SheetData>();
                string cellReference = columnName + rowIndex;
    
                // If the worksheet does not contain a row with the specified row index, insert one.
                Row row;
                if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0)
                {
                    row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();
                }
                else
                {
                    row = new Row() { RowIndex = rowIndex };
                    sheetData.Append(row);
                }
    
                // If there is not a cell with the specified column name, insert one.  
                if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0)
                {
                    return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();
                }
                else
                {
                    // Cells must be in sequential order according to CellReference. Determine where to insert the new cell.
                    Cell refCell = null;
                    foreach (Cell cell in row.Elements<Cell>())
                    {
                        if (string.Compare(cell.CellReference.Value, cellReference, true) > 0)
                        {
                            refCell = cell;
                            break;
                        }
                    }
    
                    Cell newCell = new Cell() { CellReference = cellReference };
                    row.InsertBefore(newCell, refCell);
    
                    worksheet.Save();
                    return newCell;
                }
            }

    You could refer the link below for more information.

    # How to: Insert text into a cell in a spreadsheet document (Open XML SDK)
    https://msdn.microsoft.com/en-us/library/office/cc861607.aspx?f=255&MSPPError=-2147217396

    Best Regards,

    Edward


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.


    Friday, June 3, 2016 6:17 AM
  • I found the sample code included in the Open XML SDK documentation causes an error error when writing beyond column 'Z' (e.g. AA1, AB1, AC1, etc).  The faulty sample code causes an error to occur when attempting to read the spreadsheet file in Excel.

    Cells must be sequentially ordered.  However, this requirement is not met by executing sample code included in the Open XML SDK documentation.  Specifically, the faulty code is as follows (and can be found in the above code provided by Edward):

    // the following code does not correctly order cells beyond column 'Z'
    if (string.Compare(cell.CellReference.Value, cellReference, true) > 0) { refCell = cell; break; }

    What I do is replace the faulty code shown above with that shown next and which uses the function 'ColumnIndex' (see comment 'small is beautiful' by Luc Vandenbroucke):

    // this code makes certain cells beyond column 'Z' are sequentially ordered
    var xCell = this.ColumnIndex( cell.CellReference.Value ); var xCellReference = this.ColumnIndex( cellReference); if( xCellReference > xCell ) { refCell = cell; break; }


    Ian



    • Edited by IB00 Thursday, January 31, 2019 2:04 AM
    Thursday, January 31, 2019 1:59 AM