Answered Excel 2010 VSTO - Finding "Hidden" Styles.

  • Wednesday, March 30, 2011 9:52 PM
     
     

    We have an ongoing issue with styles becoming duplicated ("Corrupted") over time.
    We can easily duplicate this issue using Excel XP -> 2010.
    The issue involves copying/pasting data between two workbooks that are opened in different Excel instances.

    Anyway - this isn't the issue at this point - the issue now comes down to me cleaning the duplicate/corrupted styles.
    I have been able to do most of it myself; however, it isn't complete.

    By using ActiveWorkbook.Styles.Count, I can see how many styles are in the Workbook.
    I can then cycle through tem all and delete any that are not being used.

    The issue is, there are some styles that are not being displayed to my VB.Net code.
    I can confirm this by opening up the XLSX file in WinRAR and navigating to xl\styles.xml

    I now see thousands of entries such as:
    <cellStyle name="40% - Accent1" xfId="2670" builtinId="31" hidden="1" customBuiltin="1" />

    So it seems to me that the "hidden="1"" would hide it from the GUI; however, there has to be some way programatically to view that in VB.Net.
    My next option if it can't be done would be to determine if there is a way in VB.Net to open up the Styles.xml file directly and do a search for any nodes with both hidden="1" and customBuiltin="1" and just delete the entire node.

    Any thoughts/suggestions/fixes would be greatly appreciated!

Answers

  • Monday, April 04, 2011 11:34 PM
     
     Answered

    The issue is that VBA or even VB.Net (what I'm using, code is very similiar though) can see anything with the "hidden" attribute marked as true.

    Luckily, I was able to find out how to do this using "Open XML SDK 2.0" to open the style.xml file and search for any styles with the Hidden attribute set to anything other than "nothing" and set it back to "nothing."
    http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.cellstyle.hidden.aspx

    The Excel file that I'm currently working on, after setting those styles to be visible, had over 60,000 unused duplicate styles.

    (4k was a limit in Excel 2003, and after 4k you would lose formatting.  With Excel 2007 - 2010, the same corruption is there and grows exponentially.  If you create a new document and then open a corrupt document with 4k corrupt styles and copy from it and paste into the new document - new document has 4k corrupt styles.  If you copy another cell and paste it, it goes up to 8k corrupt styles).  So, as time goes on these things grow huge.

    I'm finally about done!  I started with a 1.1MB file and have reduced it to about 300KB.
    (150KB by checking used ranges and removing data validation from one sheet that had validation covering the entire sheet in random cells...)
    (~520KB by removing the 47k unused styles that were visible.)
    (Another ~130 or so by removing the 13k unused hidden styles.)

All Replies

  • Thursday, March 31, 2011 2:05 AM
     
     


    Give this free excel add-in (Formats & Styles) a try... http://excelusergroup.org/media/p/4861.aspx
    '---
    Jim Cone
    Portland, Oregon USA
    http://www.mediafire.com/PrimitiveSoftware
    (free and commercial excel programs)

  • Thursday, March 31, 2011 3:04 PM
     
      Has Code

    I gave it a try.
    At first, I had high hopes because it took over twice as long to run than my code did on the same Excel file...
    Sadly, when all wasy complete it deleted the same number of unused styles as mine did (about 47,000) and still left thousands that fit the following style (if I look under xl/styles.xml)

    <cellStyle xfId="3470" name="Heading 1" customBuiltin="1" hidden="1" builtinId="16"/>
    

    I've tried professional tools as well (such as ASAPUtilities) and it has the same issue.

    For whatever reason, any style that is tagged as "hidden" and "customBuiltin" is not seen by any of these utilities - that includes me stepping through my code to see if I could find them...

    Any other ideas? 

  • Monday, April 04, 2011 7:43 PM
     
     

    I think I determined why mine was faster, and that is that VSTO programming seems to be just a tad more efficient than .vba files?

    I tried my code in .vba and it was much slower...

    Anyway - this still seems to be the one thing holding me up.  If no one has any ideas - is this something you'd be willing to spend any more time helping on Jim?  I can provide an example, I just need to scrub it for any information related to our company first.

    Thanks,

  • Monday, April 04, 2011 10:54 PM
     
     

    G,

    I don't speak VSTO so my help would not be such.
    My Formats & Styles program, when deleting unused styles, skips over any built-in styles and leaves them in place.  Simple vba code should be able to remove all but the "Normal" style...

    '---
    Sub TryStyleRemoval()
    Dim s As Style
    On Error Resume Next
    For Each s In ActiveWorkbook.Styles
        s.Delete
    Next
    End Sub
    '---

    If you have actually removed 47,000 styles from the workbook, that would be record according to what I have seen or heard.  If the workbook is still corrupted then maybe another approach is warrented.
    '---
    Jim Cone
    Portland, Oregon USA
    http://www.mediafire.com/PrimitiveSoftware
    (free and commercial excel programs)



    • Edited by Jim Cone Tuesday, April 05, 2011 12:51 AM
    •  
  • Monday, April 04, 2011 11:34 PM
     
     Answered

    The issue is that VBA or even VB.Net (what I'm using, code is very similiar though) can see anything with the "hidden" attribute marked as true.

    Luckily, I was able to find out how to do this using "Open XML SDK 2.0" to open the style.xml file and search for any styles with the Hidden attribute set to anything other than "nothing" and set it back to "nothing."
    http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.cellstyle.hidden.aspx

    The Excel file that I'm currently working on, after setting those styles to be visible, had over 60,000 unused duplicate styles.

    (4k was a limit in Excel 2003, and after 4k you would lose formatting.  With Excel 2007 - 2010, the same corruption is there and grows exponentially.  If you create a new document and then open a corrupt document with 4k corrupt styles and copy from it and paste into the new document - new document has 4k corrupt styles.  If you copy another cell and paste it, it goes up to 8k corrupt styles).  So, as time goes on these things grow huge.

    I'm finally about done!  I started with a 1.1MB file and have reduced it to about 300KB.
    (150KB by checking used ranges and removing data validation from one sheet that had validation covering the entire sheet in random cells...)
    (~520KB by removing the 47k unused styles that were visible.)
    (Another ~130 or so by removing the 13k unused hidden styles.)

  • Saturday, May 07, 2011 4:31 PM
     
     

    Gymratz, you nailed it for Excel 2007/2010: OpenXML is the way to go to deal with this.  Fast and 100% reliable.

    1. VBA perf is slower if you use foreach or loop through styles starting from 1 to total count using native Excel object model: styles collection gets re-indexed on each iteration. You can speed up VBA based cleaning algorithms by looping backwards from highest index to 1:

    stylesCount = ActiveWorkbook.Styles.Count
     For i = stylesCount To 1 Step -1
        Set s = ActiveWorkbook.Styles(i)

        If Not s.BuiltIn Then
            s.Delete
        End If
       
    Next

    2.  I'm running a blog where I have Open XML based Excel files cleanup tools and lots of discussions on the subject of styles related corruption:

    http://sergeig888.wordpress.com/2009/10/13/sharing-useful-utilities

    3. Jim, 47,000 IS NOT A RECORD.  The threshold in Excel 2007/2010 is 65,534 for total unique custom styles count.  Open XML based files will go into corrupted state on file save when they hit 65,535 custom style count and drop all cell formatting on next file open. So, a file with 47,000 custom styles still has nice buffer before it breaks.

     

  • Wednesday, February 22, 2012 9:55 AM
     
     

    ... Luckily, I was able to find out how to do this using "Open XML SDK 2.0" to open the style.xml file and search for any styles with the Hidden attribute set to anything other than "nothing" and set it back to "nothing."
    http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.cellstyle.hidden.aspx ...

    Dear Gymratz,
    could you please post a complite VBA code block which opens and parces style.xml within from Excel macro?

  • Saturday, February 25, 2012 3:03 PM
     
     

    I've found it! If anybody needs - go to a greate Jan Karel Pieterse's article  "Editing elements in an OpenXML file using VBA" here: http://www.jkp-ads.com/Articles/Excel2007FileFormat02.asp

  • Friday, May 11, 2012 4:50 PM
     
     

    Gymratz - i'm dealing with a similar issue, huge excel file with a styles corruption problem. by the description of your solve, it sounds like this problem of mine might be resolved as well. only problem is, i have no idea how to operate "Open XML SDK 2.0"...not really sure what it is, even... could you lend me a hand??

    thanks!