none
Randomized audit with two constants? RRS feed

  • Question

  • This subject has been tackled elsewhere but it has been suggested that I simplify my question and post it here for the best results.

    The idea is to find a way to create a simple spreadsheet for internal audit purposes.

    The audit needs to be carried out on N% of data per month, and X% of data per quarter AND each piece of data needs to be audited the same amount of Y times per annum, and this sum would of course depend on N and X, i.e. Y=(8*N+4*X)/100.

    Often Y would end up to be a non-integer, and when that happens the amount of audits carried out per year per a piece of data would be between the nearest two integers - i.e. if Y=4,4 then the amount of audits carried out per year per a piece of data would be between 4 and 5, since naturally you can't check something 4,4 times.

    With such constans, or perhaps constraints, there is more than one viable solution as for which pieces of data are audited in a given month/quarter. And that's the point, really - to have a spreadsheet that would seamlessly randomize this while satisfying these constraints.

    Below for reference purposes are screenshots of viable solutions as per the above, with 0 and 1 entered by hand, where 1 means audit is performed and 0 means audit is not performed. This is only an example and if it's relevant the actual audit will concerns at least 62 pieces of data (or so) and not 15:

    https://onedrive.live.com/redir?resid=8BCC5C37877928DB!3305&authkey=!AJPSTYyiMBArjCU&v=3&ithint=photo%2cpng

    https://onedrive.live.com/redir?resid=8BCC5C37877928DB!3306&authkey=!ABMVqC04LcuTxqM&v=3&ithint=photo%2cpng

    Basically I figured if I this can be done "manually" (a bit like Sudoku, isn't it?) there also has to be a fairly easy way to automate this. What do you think?

    Friday, July 3, 2015 9:53 PM

Answers

  • But I believe that since one contraint is based on the other it would not be possible to reach a non-solvable number of Nth.

    I got it! :-)))

    Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.

    But there are situations where it is not possible to get a solution, e.g.:

    Nth=15, M=25%, Q=50% means you have 64 audits. But the Y-value with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7

    And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.

    But with all other settings where Min/Max audits shows different values are definitely solvleable.

    Andreas.

    Sunday, July 5, 2015 4:43 PM

All replies

  • The audit needs to be carried out on N% of data per month, and X% of data per quarter AND each piece of data needs to be audited the same amount of Y times per annum, and this sum would of course depend on N and X, i.e. Y=(8*N+4*X)/100.

    Basically I figured if I this can be done "manually" (a bit like Sudoku, isn't it?) there also has to be a fairly easy way to automate this. What do you think?

    Well, maybe we can solve that with a brute force method... anyway, I'm interested. But at first I want to see the sheet and the used formulas where you made the screen shots from.

    Please upload your file on an online file hoster like www.dropbox.com and post the download link here.

    And please tell me how do you obtained the numbers 8 and 4 in the calculation above.

    Andreas.

    Saturday, July 4, 2015 5:43 AM
  • Sure! Here it is:

    https://onedrive.live.com/redir?resid=8BCC5C37877928DB!3307&authkey=!AFC0faRuAHPKweE&ithint=file%2cxlsx

    And as for the numbers:

    8 -> months (excl. quarters)

    4 -> quarters

    Let me know if there's anything else I should've added.

    Saturday, July 4, 2015 8:38 AM
  • Here is my test file, there are only 3 (colored) input cells, you can change them as you like.

    After you clicked the Solve button look at the status bar, sometimes it needs a long time to finish, sometimes my code can't find a solution. Press Ctrl-Break in that case and try again.

    It seems that there are situations where a solution is not possible...

    https://dl.dropboxusercontent.com/u/35239054/Samples/1d8a82c8-c0a2-40bf-ab89-e583a29ee557.xlsm

    Andreas.

    Saturday, July 4, 2015 10:26 AM
  • Could you please try it for e.g. Nth=62 (or around that, can be 100 for testing purposes) on your end?

    While your code works magic for smaller numbers - it solved Nth=15 in seconds and generated EXACTLY what was needed, once I go up to Nth=30 or so it seems to perform a crazy amount of runs and then stop - without giving a solution.

    You mentioned that it happened for you as well, but you could obtain a solution if you tried again  - I did follow your advice and tried 8 maybe 10 times each for Nth=62, Nth=45 and Nth=30, to no avail.

    What do you think could be the issue here?




    Sunday, July 5, 2015 11:41 AM
  • Could you please try it for e.g. Nth=62 (or around that, can be 100 for testing purposes) on your end?

    I played around with different settings, up to Nth=100, but as higher the Nth as lower you have to set the % values. Try this settings:

    Nth=100
    per Month 15%
    per Quarter 40%

    You'll get a solution after a few runs (5.000 up to 20.000)

    Then increase % per Month by 1 and you'll see that it takes longer till you get a solution. With 18% per Month you did not get a solution anymore.

    I guess in this case a solution is not possible... but I'm not a mathematician.

    When you try per Month 20% and per Quarter 34% you'll get a solution again.

    Maybe you can ask in a Mathematics Forum if it i possible to calculate if, resp. how many solutions are possible with which input settings.

    EDIT: Don't worry if the status bar stops updating, that is an issue with Excel, the code runs in the background.

    Andreas.


    Sunday, July 5, 2015 12:39 PM
  • I see. I'm guessing here, but to me it looks like it's more of an application / processing unit limitation than a mathematical one - because even for Nth=62 you could enter a solution manually (I did try) and it would satisfy all set contraints - so that means a viable solution exists - but maybe Excel is not able to execute enough runs to find it? I don't know. But I believe that since one contraint is based on the other it would not be possible to reach a non-solvable number of Nth.

    In any case I'd like to thank you very much Andreas - the file you provided comes very near a perfect solution for me, and after I study it more thoroughly perhaps I will be able to "force it" to do some more runs and find solutions for higher Nth :).


    EDIT: I didn't know that - thanks! But BTW the Ctrl+Break doesn't seem to work for me at all - I've had to close the file and re-open it every time the code had seemed to stop.

    Sunday, July 5, 2015 12:52 PM
  • because even for Nth=62 you could enter a solution manually (I did try) and it would satisfy all set contraints - so that means a viable solution exists -

    but maybe Excel is not able to execute enough runs to find it?

    Can you upload the file with the solution on Nth=62? I'll take a look.

    Excel has nothing to do with the solution, it's only my stupid code, just a very rough brute force method.

    Maybe it's helpful when I explain my idea, resp. how it works:

    I start with a field (an array) filled with zeros.

    Step 1: I choose a random slot, check if it contains a 0 and if so, I write a 1 into it.

    Step 2: Calculate the sum for all rows and columns and check if anything is beyond the limits (the max. value for each row and column). If so, continue with Step 1, otherwise undo the last 1.

    Obvious this 2 steps are not enough to get a solution, when you run just that, you get often into an endless loop, because all the 1 that are already set in the field are static and at min. one of them is in the wrong place. So we have to "shuffle".

    Step 3: Count the runs and when we reach (max. audits * 100) runs, I erase the 1st row (fill the 1st row in the field with zeros), and continue with Step 1. If we reach this point again, I erase the 2nd row and so one until I reach the last row and the next time I erase the 1st row again.

    So if a solution exists for N=62 M=30% Q=50%, it is obvious that Step 3 is to bad to make a solution possible.

    The question is, what can we do instead, rather then a "blind erase"? If you have an idea how this puzzle can be solved, describe it with your own words, maybe I can "translate" them into code.

    Andreas.

    Sunday, July 5, 2015 3:22 PM
  • But I believe that since one contraint is based on the other it would not be possible to reach a non-solvable number of Nth.

    I got it! :-)))

    Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.

    But there are situations where it is not possible to get a solution, e.g.:

    Nth=15, M=25%, Q=50% means you have 64 audits. But the Y-value with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7

    And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.

    But with all other settings where Min/Max audits shows different values are definitely solvleable.

    Andreas.

    Sunday, July 5, 2015 4:43 PM
  • But I believe that since one contraint is based on the other it would not be possible to reach a non-solvable number of Nth.

    I got it! :-)))

    Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.

    But there are situations where it is not possible to get a solution, e.g.:

    Nth=15, M=25%, Q=50% means you have 64 audits. But the Y-value with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7

    And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.

    But with all other settings where Min/Max audits shows different values are definitely solvleable.

    Andreas.

    A million thanks Andreas! It does work beautifully now :)

    Thank you also for putting all the useful annotations in the code itself - it does make it easier for me to go through it and better see what it is you actually did :)

    Monday, July 6, 2015 6:48 PM
  • Hi Andreas,

    Today I attempted to tweak the file some more in order to add further constraints - to be exact I now need to have a different sample for each quarter (which effectively increases the number of variables from 2 to 5).

    I modified the code, but even at 1,5 million runs it doesn't seem to arrive at a solution better then <90% match :x

    I'm uploading the updated file here: https://1drv.ms/x/s!AtsoeYc3XMyLpzpYKJhaPQ-D31sV

    Could you perhaps have a quick look?

    I think mostly I am missing to correctly update the Crosscheck part, but then again maybe the culprit is somewhere else. Any hints?

    Miłosz.

    Tuesday, December 13, 2016 11:26 AM
  • Could you perhaps have a quick look?

    I can't download the file:

    Please don't use encryption in your file and try a different file hoster which support direct download links.

    Andreas.

    Wednesday, December 14, 2016 8:13 AM
  • Hmm strange - pretty sure it isn't encrypted and I uploaded it to OneDrive which we used in this very thread :/

    Here is another try: https://1drv.ms/x/s!AtsoeYc3XMyLpzvP-lqUbjzSaIGC

    I tested it and it worked on my end - if it for any reason still doesn't work for you I'll upload it to Dropbox - let me know :).

    Wednesday, December 14, 2016 8:40 AM
  • Here is another try: https://1drv.ms/x/s!AtsoeYc3XMyLpzvP-lqUbjzSaIGC

    Okay, got it, the problem is clear: The calculation of Audits Total in the worksheet is wrong.

    The algorithm itself is unchanged, means it doesn't matter if you use PerQ*4 or PerQ+PerQ2+PerQ3+PerQ4.

    Download my original file again, open it set Nth=50 and Quarter=65% (which is the average of your quarter settings): As you see there is no solution possible with that settings.

    Theoretical max. of Nth=50 is Quarter=63% but as the algorithm is a "random brute force" means it can happen that the code never finds a 100% solution, so you have to live with 98%...

    Andreas.

    Wednesday, December 14, 2016 6:10 PM
  • Well, that is not entirely it :) I managed to obtain a "solution" (after 4727008 runs ;) ) by changing the numbers, but it seems to me that it picks up the settings without even recognizing that the Q1/2/3/4 are supposed to be different.

    Here's a screenshot:

    https://1drv.ms/u/s!AtsoeYc3XMyLpzx7dmlg0Hhe05p3

    And here's the file with the "solution":

    https://1drv.ms/x/s!AtsoeYc3XMyLpz1nUvYkqZC0OJDG

    Thursday, December 15, 2016 5:29 PM
  • but it seems to me that it picks up the settings without even recognizing that the Q1/2/3/4 are supposed to be different.

    Assume you want to split 12 whatever and everybody should have 2 whatever at the end.
    The calculation is 12 / 6 = 2

    Why do you assume that the distribution changes if you calculate 12 / 4 / 1.5 ?

    Andreas.

    Thursday, December 15, 2016 6:53 PM
  • Using your terms, I have 12 people divided in 5 groups and each group gets a different portion from a fixed amount of whatever.

    Lets say we have 1000 pieces of whatever. Here's an example of a correct distribution:

    EXAMPLE 1
    group 1 | 1 person | 78-79
    group 2 | 1 person | 86-87
    group 3 | 1 person | 104-105
    group 4 | 1 person | 173-174
    group 5 | 8 people | 552-560 (69-70 per person)

    But what the macro generates is more like this, regardless of the Q1-Q4 percentage settings:

    EXAMPLE 2
    group 1 = group 2 = group 3 = group 4 | 4 people | 428-432 (107-108 per person)
    group 5 | 8 people | 568-576 (71-72 per person)

    All the while each person in groups 1-4 needs to get a different portion of whatever, like shown in Example 1.

    Let me know if I'm missing your point here...


    Monday, December 19, 2016 10:47 AM
  • Lets say we have 1000 pieces of whatever. Here's an example of a correct distribution:

    That it what you expect, but not what we talked about in the past and furthermore not possible with the algorithm that I wrote for you.

    Andreas.

    Monday, December 19, 2016 3:46 PM
  • Now I'm lost :) so are you saying it cannot be done, or just that the algorithm that you wrote for me needs to be rewritten in order for this to work? Because what I figured is that if we could diffrentiate bewtween M and Q, why can't we differentiate between M, Q1, Q2... etc. too many constraints or what is the obstacle here?

    EDIT: Or are you just saying that this should be a separate topic, since it's too far off from the issue described in the opening post?
    Tuesday, December 20, 2016 1:31 PM
  • why can't we differentiate between M, Q1, Q2... etc. too many constraints or what is the obstacle here?

    Search the code for "Mod 3 = 0", that are the places where you have to differentiate between M, Q1, Q2...

    Andreas.

    Tuesday, December 20, 2016 2:01 PM
  • OK so I take it that it is possible :) let me try to make further adjustments on top of the ones already made and we'll see if it works...

    EDIT: I take that back :))) since you're using immediate if, then you have either true or false, i.e. 2 possible constraints (and not 5...) so I suppose this is not doable. I am very sorry to have wasted your time.


    Tuesday, December 20, 2016 2:14 PM
  • I am very sorry to have wasted your time.

    No problem, we talked about it and finally you understood it, that was my goal. :-)

    Andreas.

    Tuesday, December 20, 2016 3:03 PM