none
Way to temporarily persist 8 million values stumbled upon RRS feed

  • General discussion

  • In the course of this thread, i realized a method i've long used may be an undocumented way to persist millions of values temporarily (values are lost when the workbook is closed), not on a worksheet.

    It's controversial, and replies claim i'm either wrong, or doing something dangerous. 

    You decide. I suggest you head straight down to the post marked "My Position Summary".

    (note, this is a 'Discussion', not a 'Question')

    ======

    Hi

    I create a very long array formula, and store it in a defined name.

    When i check the refers-to of the name in the name manager, or in VBA, i don't get the entire array.

    But, if i write the array to sheet, i still get the whole array.

    • Example, load a sheet with RANDBETWEEN(1,1000), from [A1:GZ200]. That's about 200x200 = 40,000 values. (pic 1)
    • In VBA, do: x = [A1:GZ200]. Now x contains a 200x200 2D array.
    • In VBA, do: names.add "test", x. Now "test" contains the entire array.
    • Open Excel Name Manager, and view "test". You cannot see the whole array in the list, or the formula box. Notice the 'refers to' column truncates in a different place than the formula box (pic 2).
    • In VBA immediate pane, do: ?names("test").refersto. Again, you will not get the whole array (pic 3). Notice it truncates at a different place than Name Manager.
    • Now switch to a blank sheet.
    • In VBA, do: [A1:GZ200] = [test]. You'll see the entire array get loaded into the sheet.

    This is just strange, and interesting, and useful (ie, useful that the whole array is still in there). I welcome any relevant thoughts/comments.

    Pic 1:

    Pic 2:

    Pic 3:

    thx!














    • Edited by johny w Sunday, July 10, 2016 5:09 AM
    Thursday, June 23, 2016 8:59 PM

All replies

  • As you say NameManager can neither display nor edit such a long array, not much more than about 2000, though in Excel 2003 less than 256

    The name itself can store a bit more but with a limit of about 8200, so the array in your example name will be truncated.

    Note in a hard coded array in a Name, like your example, columns are separated with commas and rows with semi colons and all inside the curley brackets. You can write the array sting in code with that format and apply to a name. But be sure the entire array including separators, brackets and the = is well under 8200 (in 2007) if you want to write a string for the refersto

    arr = myRange.Value is not directly related. You could have writen the refersto with the Value property rather than via the array variable. However it is indeed the best way to read and particularly write to large ranges

    myPresizedRange.Value = arr

    Friday, June 24, 2016 12:16 PM
    Moderator
  • The name itself can store a bit more but with a limit of about 8200, so the array in your example name will be truncated.

    You could have writen the refersto with the Value property rather than via the array variable. However it is indeed the best way to read and particularly write to large ranges

    But in my example, i show that my 40,000 cell array is not getting truncated when written back to the sheet-- only truncated in the Name Manager display. Can you test? i'm on 2010.

    It appears the limitation is not on number of cells in the array, but on number of characters in the array. The limits you mention are characters, correct?

    Also, why do you say it's "best" to pass a range through an array before writing to a name? Wouldn't it be faster to say:

    names.add "test", [A1:GZ200]

    instead of:

    myArray = [A1:GZ200]
    Names.Add "test", myArray

    cheers




    • Edited by johny w Friday, June 24, 2016 5:49 PM
    Friday, June 24, 2016 5:46 PM
    • In VBA, do: x = [A1:GZ200]. Now x contains a 200x200 2D array.
    • In VBA, do: names.add "test", x. Now "test" contains the entire array.
    • In VBA, do: [A1:GZ200] = x. You'll see the entire array get loaded into the sheet.

    I agree with Peter and your conclusion is wrong.

    The name "test" does not contain an array, RefersTo is a String.
    And when you get it back from the name, you are not able to store it back into the sheet, especially not as array.

    Make a new file, copy the code below into a regular module, run sub Setup.
    After that look into the sheet, then run sub Test.
    As you see the immediate window shows 8200, that's the limit of the string length.
    And there is no data written back, I get a RTE 7 out of memory.

    Andreas.

    Option Explicit
    
    Sub Setup()
      Dim Arr(1 To 200, 1 To 200)
      Dim i As Long, j As Long, k As Long
      For i = 1 To UBound(Arr)
        For j = 1 To UBound(Arr, 2)
          k = k + 1
          Arr(i, j) = k
        Next
      Next
      Range("A1").Resize(200, 200) = Arr
      Names.Add "test", Arr
    End Sub
    
    Sub Test()
      Dim Arr
      Arr = Names("test").RefersTo
      Debug.Print Len(Arr)
      Worksheets.Add
      Range("A1").Resize(200, 200).Value = Arr
    End Sub

    Friday, June 24, 2016 10:01 PM
  • Indeed Names are limited by the length of the string in the refersto.

    In your example you stored the array in the variable x, and wrote the values to the second sheet from x. Although using the evaluate brackets [] is quicker to type, better in full -

    x = Worksheets(1).Range("A1:gz200").Value
    Worksheets(2).Range("A1:gz200").Value = x

    Between those lines you happened to use x to create a Name, which took as much of x as it could but truncated it. Make your name as you did before and -

    debug.? Len(names("test").Refersto)

    With 41600 cells of mainly 3 digits, with the commas you'd expect a length of roughly 41600 x (3 + 1 digits  & commas), about 160k characters, but how many do you get?

    It's not normally a good idea to store hard-coded values as in your example in a Name, better to store in cells. FWIW that approach (and much more) can be used to dissociate Chart source data from cells to make permanent copy of a chart. But that's another subject.

    I didn't say "best to pass a range through and array before writing to a name". The point I tried to make is when processing or copying cell values it's faster to read to an array in one go first, do whatever processing to the array, and dump to a range in one go from an array. Reading and particularly writing cells individually is slow.

    Friday, June 24, 2016 10:07 PM
    Moderator
  • "Although using the evaluate brackets [] is quicker to type, better in full"

    Why?

    "It's not normally a good idea to store hard-coded values as in your example in a Name, better to store in cells."

    Why?

    "Reading and particularly writing cells individually is slow."

    Did you think i was suggesting writing cells individually?

    Andreas, this thread does not involve looping-- it involves creating the array from a worksheet range. Your results may differ with an array built using a loop. 

    Peter, i don't know if you've tested my scenario, so i'll share more details. This method stores a 200 x 200 element array in a defined name, and writes it out to a sheet. No values are lost. 

    1. On Sheet1, enter RANDBETWEEN(1,100) in all cells, A1:GZ200. That's about 40,000 cells. 

    2. copy and paste values, so we're working with static values.

    3. Enter and execute this code in a module:

    Sub test()
              Dim aTable
              aTable = [Sheet1!A1:GZ200]
              Names.Add "bigTable", aTable
              [Sheet2!A1:GZ200] = [bigTable]
              Names("bigTable").Delete
    End Sub

    4. Compare Sheet1 and Sheet2. You'll see that all 40,000 values are identical on both sheets. The defined name held all values.

    Important: If you don't delete the bigTable name, then the workbook will fail to save, reporting "not saved" and "errors detected". If you then delete the big name, you can then save. That's interesting, valuable info. That's because the amount of data stored in the name is so huge. 

    I was curious to know what is the largest defined-name we can have, and still be able to save the workbook. I tried a 100x100 array, and that was still too big-- could not save. Then tried 50x50, and then i could save the workbook:

    Sub test()
              Dim aTable
              aTable = [Sheet1!A1:AZ50]
              Names.Add "medTable", aTable
              
              Dim aCopy
              aCopy = [medTable]
              [Sheet2!A1:AZ50] = aCopy
              
              ' cleanup
              ' Names("medTable").Delete
    End Sub

    The next experiment would be to find if we can store multiple 50x50 defined-names, and still save, or if they are cumulative. I'll try that someday.

    • Edited by johny w Tuesday, June 28, 2016 8:03 PM
    Tuesday, June 28, 2016 7:45 AM
  • To add to Andreas' comments and in reply to your lastest post

    Using [] calls Excel's 'Evaluate' function to try as best it can to interpret the request and return whatever it deems is appropriate. It is an extremely clever function and normally works as expected. But behind the scenes much more work is required not least calling the additional function.

    Also, although longer to type, if you get in the habit of always heading modules Option Explicit, fully declaring variables, and writing such things as a Range address in full it will save considerably more time than any saved in the shorter typing with [].

    FWIW I often use [], but never for anything I intend to distribute (except certain specialised purposes).

    Various reasons why better to store data in cells, more efficient, editable, Name limits, and no doubt many more. If you only want to store the data temporarily, as your example implies, store it in an array or in a temporary hidden sheet. Otherwise, as I said, the Name cannot store anything like that quantity, errors will be more likely. I take it you didn't try to save/reopen and test the name again. However I did say 'normally' best not, though it can be useful for a few specific reasons to store large 'hard' data in a Name.

    In your first example, ie your OP, you wrote the data from the array variable 'x', not the Name as you did just above with 'bigtable'

    Indeed, your example showed nothing about writing to cells individually, but you asked why I said read/write to an array is more efficient than individually and I explained.


    Tuesday, June 28, 2016 11:25 AM
    Moderator
  • Peter Thornton: "Using [] ...behind the scenes much more work is required not least calling the additional function."

    - So you're saying that performance will degrade with []?

    "Also, although longer to type, if you get in the habit of always heading modules Option Explicit, fully declaring variables, and writing such things as a Range address in full it will save considerably more time than any saved in the shorter typing with []."

    - How will it save time?  

    - Why are you talking about declaring variables? Let's please stay on-topic. Plz respect the OP, and don't use the thread to post general advice on unrelated topics. 

    "reasons why better to store data in cells, more efficient,"

    -- you're saying saving data to a worksheet is more efficient than saving to a defined name? Define "efficient". Do you have any evidence?

    "editable,"

    -- ok, that's important only if you need to edit. Not the topic here. 

    "Name limits,"

    -- My example shows a defined-name can store and retrieve 40,000 elements no-problem. Which limit are you referring to? 

    "and no doubt many more. "

    -- Generic "no doubt many more" comments are not informative. 

    "Name cannot store anything like that quantity"

    -- my example proves you wrong. 

    "it can be useful for a few specific reasons to store large 'hard' data in a Name."

    -- you just contradicted yourself. 

    "I take it you didn't try to save/reopen and test the name again."

    -- i take it you did not read what i wrote about deleting the name before saving. 


    "In your first example, ie your OP, you wrote the data from the array variable 'x', not the Name as you did just above with 'bigtable'"

    -- i believe they are functionally equivalent. However, to satisfy this critique, i've rewritten the VBA to write the data from an array variable. It still works:

    Sub test()
              Dim aTable
              aTable = [Sheet1!A1:GZ200]
              Names.Add "bigTable", aTable
              
              Dim aCopy
              aCopy = [bigTable]
              [Sheet3!A1:GZ200] = aCopy
              
              ' cleanup
              Names("bigTable").Delete
    End Sub


    "your example showed nothing about writing to cells individually, but you asked why I said read/write to an array is more efficient than individually and I explained."

    -- Yep- read/write the entire array OR range in a single statement is more efficient than writing each element individually. That's why i do it that way. 

    -- the most efficient way to copy a range from one place to another is directly:

    [Sheet3!A1:GZ200] =  [Sheet1!A1:GZ200]

    But this thread is not asking "what's the most efficient way to copy a range from one place to another?" This thread is about defined names. They are a tool in the developer's toolbox, to use when appropriate. 

    cheers

    Edit: out of curiosity, i executed this same experiment with a 1,000 x 1,000 cell range. That's 1 million cells. It worked. The defined-name held the whole array, no problem.

    • Edited by johny w Tuesday, June 28, 2016 7:58 PM
    Tuesday, June 28, 2016 5:33 PM
  • First to clear up some of your misunderstandings:

    "it can be useful for a few specific reasons to store large 'hard' data in a Name."-- you just contradicted yourself. 

    My statement stands (perhaps I should have qualified ‘large’ should be within the limits), as does my advice concerning your approach, there is no contradiction

    "Name cannot store anything like that quantity"-- my example proves you wrong. 

    Again, a Name does store that anything like quantity and your example is misleading you into thinking otherwise

    -- the most efficient way to copy a range from one place to another is directly:[Sheet3!A1:GZ200] =  [Sheet1!A1:GZ200]

    The most efficient way is to write the code explicitly including  .Value (or Value2), but not with use of Evaluate

    "In your first example, ie your OP, you wrote the data from the array variable 'x', not the Name as you did just above with 'bigtable'" -- i believe they are functionally equivalent. 

    As used in your first example, you populated x from the sheet, populated the Name with x (or rather attempted to), then populated the second sheet with x. You then went on to suggest it had been populated from the name ‘bigtable’ 

    The only way your example apparently works is by use of Evaluate which attempts to retrieve the data that was stored in memory when the name was created. That might work indefinitely through the session or fail without notice, or cause other problems (eg disable AutoSave if it errors while attempting to save such a name). I don’t know what’s going on behind the scenes with Evaluate here but consider it an undocumented oddity. There are others with Names (and previously with hidden NameSpace) which are useful, but in my view this isn’t one of them.

    Finally, you've asked for clarifications and explanations throughout which I've tried to give you. You've asked for yet more above, then you also instruct me not to answer those questions. I'm not sure what your question is now that has not already been fully answered. 


    Wednesday, June 29, 2016 12:21 PM
    Moderator
  • My Position Summary


    Again, a Name does store that anything like quantity and your example is misleading you into thinking otherwise

    ... 

    As used in your first example, you populated x from the sheet, populated the Name with x (or rather attempted to), then populated the second sheet with x. You then went on to suggest it had been populated from the name ‘bigtable’ 

    ...

    The only way your example apparently works is by use of Evaluate which attempts to retrieve the data that was stored in memory when the name was created. That might work indefinitely through the session or fail without notice, or cause other problems (eg disable AutoSave if it errors while attempting to save such a name). I don’t know what’s going on behind the scenes with Evaluate here but consider it an undocumented oddity. There are others with Names (and previously with hidden NameSpace) which are useful, but in my view this isn’t one of them.

    Usefulness

    You may not find it useful to temporarily persist 4 million values. Others may find a use for it. It's also quite fast (~3 seconds per transfer). 

    Bona Fide

    How is my example is "misleading"? I've successfully repeated the test with a 2,000 x 2,000 range-- 4 million values were stored and retrieved from a defined name. It works. It retrieves. I looped the persist/restore cycle 100 times with 4 million values stored in 2 different names (total 8 million values), and never hit a memory alert.

    Errata

    You're quite correct that, in my first example, i populated the sheet from an array (not a name). I corrected my post. 

    Memory

    "might ...fail without notice"
    -- I've looked for and hit memory limits (see below), but no crashes, hangs, or corruption. The defined-name never got corrupt. Excel demonstrated robust memory protection with this technique. 

    Note, the alerts below do NOT happen with just 8 million values (4 million values ea stored in 2 names), as long as there aren't millions more values or formulas on other sheets. 

    I only started to hit limits when i added millions of formulas to the workbook, or attempted to create a 2nd huge defined-name. 

    After creating the name, there was a sheet with 4 million cells containing values, a defined name with 4 million elements (and while the procedure is executing, a VBA array-variable containing 4 million elements). Then i added a new worksheet, called "RandomSheet", containing a formula in 4 million cells: =RANDBETWEEN(1,100). They all calculated, and nothing crashed. Then i added another sheet, put a formula into A1, and copied it to 4 million cells: =RandomSheet!A1. (as a relative formula, it mirrored the 4 million values on RandomSheet) Then, (with the first sheet still containing 4 million constants, plus the defined-name still containing 4 million elements), all 8 million formulas calculated in <1 sec without error, and nothing crashed. 

    Then I added a new sheet, and pointed some formulas at the huge defined name: =index(bigTable, 1253, 342). I was only able to get about 5 or 10 of these, before receiving this alert:

    So now we know the programmer should not hope to perform formula lookups against huge defined-names directly. But nothing crashed-- the protection is there. I removed the formulas pointing to the defined-name, and recalc'd. Again, all 8 million formulas calc'd with no error.

    Then i went into VBA and attempted to create another defined-name pointing to the 4 million cells on RandomSheet-- VBA said "Excel cannot complete this task with available resources." I clicked "ok" and kept working-- no crash or hang.

    I deleted the huge defined-name, and got another "Excel cannot complete this task with available resources." alert. I clicked OK, and the name got deleted anyway. I was able to save on close, and on re-open everything was just fine, no corruption. Excel appeared to know i was hitting memory limits, and handled them gracefully, without crashing or destroying data. 

    I do get error-alerts by attempting to save without first deleting the huge name, but nothing breaks. As soon as i delete the  huge name, i can save. 

    Next, i deleted all the sheets, and make one sheet with 3,000 x 3,000 values (6 million values). The first persist/restore worked. The 2nd persist/restore gave an "available resources" error, and then threw a VBA debug-error on the line that creates the defined name. I ended the macro, and kept working. (Still no crash, hang, or corruption, though.) Seems on my PC, 6 million is too many-- will have to be satisfied with 4 million ;)

    At no time did i experience any instabilities (that i've experience many times using standard, documented Excel VBA techniques). The workbook feels solid after numerous tests. 

    I don't know what determines those limits, whether Excel, amount of RAM, or something else. I should mention i'm on a PC with 24 CPU's and 12GB RAM. 

    AutoSave

    --You're correct about autoSave-- this happened. A few minutes after the huge name was created, i got an alert that Autosave was being disabled. And nothing broke. Not surprising that it can't autosave, since manual save fails while the huge-name exists. But, it's ok if autosave fails: this book is 80+ MB -- it takes 15-20 seconds to save. With a book that size, i don't want autosave. Excel did me a favor by turning off autosave for me. On re-launching Excel, autosave was restored. 


    Evaluate

    You're quite correct, we must use Evaluate. Thx for discovering that! I didn't even realise we cannot retrieve the array from the .Names collection. It may be undocumented, but that doesn't frighten me. Evaluate has been part of Excel VBA for 20 years, since Excel '97, (and earlier, in 4.0 Macro language, before VBA existed). So i'm confident that it's a permanent part of Excel. 

    Please share those "other undocumented oddities with Names"!

    Code

    I've rewritten to show the data persists even if VBA loses state. 

    Sub persist()
              ' clear target sheet (in case sheet already contains huge dataset, to reduce file-size before upcoming huge dump)
              Sheet2.Cells.Clear
              
              ' put range into array
              Dim aTable, rTable As Range
              Set rTable = Sheet1.UsedRange  ' <-- eg., 4 million cells with data
              aTable = rTable
              
              ' put array into defined name
              Names.Add "bigTable", aTable
    
              ' break VBA state
              End
    End Sub
    
    
    Sub retrieve()
              ' put defined name into array
              Dim aCopy
              aCopy = Sheet1.Evaluate("bigTable")   ' <-- works. use evaluate or sq bracket.
              ' aCopy = Names("bigTable")           ' <-- fails
              
              ' get size
              Dim lCols As Long, lRows As Long
              lRows = UBound(aCopy, 1)
              lCols = UBound(aCopy, 2)
              
              ' get last target cell
              Dim rLastCell As Range
              Set rLastCell = Sheet2.Cells(lRows, lCols)
              
              ' write array to target range
              Dim rTarget As Range
              Set rTarget = Sheet2.Range("A1", rLastCell)
              rTarget = aCopy
              
              ' cleanup
              Names("bigTable").Delete
    End Sub
    
    

    Thx!

    • Edited by johny w Sunday, July 10, 2016 5:14 AM
    Saturday, July 9, 2016 5:12 PM
  • Just passing by for a quick tip regarding "Out of resources" error.

    Change Workbook.ForceFullCalculation to True, save the file and close Excel Application. Reopen the workbook. Now Excel won't upload the calculation chain in the memory. So, less memory will be needed to evaluate formulas.

    The downside is a possible lost on performance.


    http://www.ambienteoffice.com.br - http://www.clarian.com.br

    Tuesday, July 12, 2016 5:03 PM
  • Change Workbook.ForceFullCalculation to True, save the file and close Excel Application. Reopen the workbook. Now Excel won't upload the calculation chain in the memory. So, less memory will be needed to evaluate formulas.


    http://www.ambienteoffice.com.br - http://www.clarian.com.br

    wow, thx Felipe! Always learning new things!

    Friday, July 15, 2016 9:37 PM