Answered by:
Randomized audit with two constants?
Question

This subject has been tackled elsewhere but it has been suggested that I simplify my question and post it here for the best results.
The idea is to find a way to create a simple spreadsheet for internal audit purposes.
The audit needs to be carried out on N% of data per month, and X% of data per quarter AND each piece of data needs to be audited the same amount of Y times per annum, and this sum would of course depend on N and X, i.e. Y=(8*N+4*X)/100.
Often Y would end up to be a noninteger, and when that happens the amount of audits carried out per year per a piece of data would be between the nearest two integers  i.e. if Y=4,4 then the amount of audits carried out per year per a piece of data would be between 4 and 5, since naturally you can't check something 4,4 times.
With such constans, or perhaps constraints, there is more than one viable solution as for which pieces of data are audited in a given month/quarter. And that's the point, really  to have a spreadsheet that would seamlessly randomize this while satisfying these constraints.
Below for reference purposes are screenshots of viable solutions as per the above, with 0 and 1 entered by hand, where 1 means audit is performed and 0 means audit is not performed. This is only an example and if it's relevant the actual audit will concerns at least 62 pieces of data (or so) and not 15:
https://onedrive.live.com/redir?resid=8BCC5C37877928DB!3305&authkey=!AJPSTYyiMBArjCU&v=3&ithint=photo%2cpng
https://onedrive.live.com/redir?resid=8BCC5C37877928DB!3306&authkey=!ABMVqC04LcuTxqM&v=3&ithint=photo%2cpng
Basically I figured if I this can be done "manually" (a bit like Sudoku, isn't it?) there also has to be a fairly easy way to automate this. What do you think?
Answers

But I believe that since one contraint is based on the other it would not be possible to reach a nonsolvable number of Nth.
I got it! :)))
Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.
But there are situations where it is not possible to get a solution, e.g.:
Nth=15, M=25%, Q=50% means you have 64 audits. But the Yvalue with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7
And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.
But with all other settings where Min/Max audits shows different values are definitely solvleable.
Andreas.
 Marked as answer by MiłoszNiedziela Monday, July 6, 2015 6:38 PM
All replies

The audit needs to be carried out on N% of data per month, and X% of data per quarter AND each piece of data needs to be audited the same amount of Y times per annum, and this sum would of course depend on N and X, i.e. Y=(8*N+4*X)/100.
Basically I figured if I this can be done "manually" (a bit like Sudoku, isn't it?) there also has to be a fairly easy way to automate this. What do you think?Well, maybe we can solve that with a brute force method... anyway, I'm interested. But at first I want to see the sheet and the used formulas where you made the screen shots from.
Please upload your file on an online file hoster like www.dropbox.com and post the download link here.
And please tell me how do you obtained the numbers 8 and 4 in the calculation above.
Andreas.


Here is my test file, there are only 3 (colored) input cells, you can change them as you like.
After you clicked the Solve button look at the status bar, sometimes it needs a long time to finish, sometimes my code can't find a solution. Press CtrlBreak in that case and try again.
It seems that there are situations where a solution is not possible...
https://dl.dropboxusercontent.com/u/35239054/Samples/1d8a82c8c0a240bfab89e583a29ee557.xlsm
Andreas.

Could you please try it for e.g. Nth=62 (or around that, can be 100 for testing purposes) on your end?
While your code works magic for smaller numbers  it solved Nth=15 in seconds and generated EXACTLY what was needed, once I go up to Nth=30 or so it seems to perform a crazy amount of runs and then stop  without giving a solution.
You mentioned that it happened for you as well, but you could obtain a solution if you tried again  I did follow your advice and tried 8 maybe 10 times each for Nth=62, Nth=45 and Nth=30, to no avail.
What do you think could be the issue here?
 Edited by MiłoszNiedziela Sunday, July 5, 2015 11:43 AM

Could you please try it for e.g. Nth=62 (or around that, can be 100 for testing purposes) on your end?
I played around with different settings, up to Nth=100, but as higher the Nth as lower you have to set the % values. Try this settings:
Nth=100
per Month 15%
per Quarter 40%You'll get a solution after a few runs (5.000 up to 20.000)
Then increase % per Month by 1 and you'll see that it takes longer till you get a solution. With 18% per Month you did not get a solution anymore.
I guess in this case a solution is not possible... but I'm not a mathematician.
When you try per Month 20% and per Quarter 34% you'll get a solution again.
Maybe you can ask in a Mathematics Forum if it i possible to calculate if, resp. how many solutions are possible with which input settings.
EDIT: Don't worry if the status bar stops updating, that is an issue with Excel, the code runs in the background.
Andreas.
 Edited by Andreas Killer Sunday, July 5, 2015 12:41 PM

I see. I'm guessing here, but to me it looks like it's more of an application / processing unit limitation than a mathematical one  because even for Nth=62 you could enter a solution manually (I did try) and it would satisfy all set contraints  so that means a viable solution exists  but maybe Excel is not able to execute enough runs to find it? I don't know. But I believe that since one contraint is based on the other it would not be possible to reach a nonsolvable number of Nth.
In any case I'd like to thank you very much Andreas  the file you provided comes very near a perfect solution for me, and after I study it more thoroughly perhaps I will be able to "force it" to do some more runs and find solutions for higher Nth :).
EDIT: I didn't know that  thanks! But BTW the Ctrl+Break doesn't seem to work for me at all  I've had to close the file and reopen it every time the code had seemed to stop.
 Edited by MiłoszNiedziela Sunday, July 5, 2015 12:56 PM

because even for Nth=62 you could enter a solution manually (I did try) and it would satisfy all set contraints  so that means a viable solution exists 
but maybe Excel is not able to execute enough runs to find it?
Can you upload the file with the solution on Nth=62? I'll take a look.
Excel has nothing to do with the solution, it's only my stupid code, just a very rough brute force method.
Maybe it's helpful when I explain my idea, resp. how it works:
I start with a field (an array) filled with zeros.
Step 1: I choose a random slot, check if it contains a 0 and if so, I write a 1 into it.
Step 2: Calculate the sum for all rows and columns and check if anything is beyond the limits (the max. value for each row and column). If so, continue with Step 1, otherwise undo the last 1.
Obvious this 2 steps are not enough to get a solution, when you run just that, you get often into an endless loop, because all the 1 that are already set in the field are static and at min. one of them is in the wrong place. So we have to "shuffle".
Step 3: Count the runs and when we reach (max. audits * 100) runs, I erase the 1st row (fill the 1st row in the field with zeros), and continue with Step 1. If we reach this point again, I erase the 2nd row and so one until I reach the last row and the next time I erase the 1st row again.
So if a solution exists for N=62 M=30% Q=50%, it is obvious that Step 3 is to bad to make a solution possible.
The question is, what can we do instead, rather then a "blind erase"? If you have an idea how this puzzle can be solved, describe it with your own words, maybe I can "translate" them into code.
Andreas.

But I believe that since one contraint is based on the other it would not be possible to reach a nonsolvable number of Nth.
I got it! :)))
Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.
But there are situations where it is not possible to get a solution, e.g.:
Nth=15, M=25%, Q=50% means you have 64 audits. But the Yvalue with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7
And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.
But with all other settings where Min/Max audits shows different values are definitely solvleable.
Andreas.
 Marked as answer by MiłoszNiedziela Monday, July 6, 2015 6:38 PM

But I believe that since one contraint is based on the other it would not be possible to reach a nonsolvable number of Nth.
I got it! :)))
Download the file again, I've changed it. Now you can have a solution up to Nth=100 in a few seconds.
But there are situations where it is not possible to get a solution, e.g.:
Nth=15, M=25%, Q=50% means you have 64 audits. But the Yvalue with that settings is exact 4, means the sum of the rows (Nth=15*4) is only 60! See the cell in H5:I7
And the settings N=100, M=25%, Q=50% seems to have only one solution, so I guess the code needs a very long time (maybe forever) to find it.
But with all other settings where Min/Max audits shows different values are definitely solvleable.
Andreas.
A million thanks Andreas! It does work beautifully now :)
Thank you also for putting all the useful annotations in the code itself  it does make it easier for me to go through it and better see what it is you actually did :)

Hi Andreas,
Today I attempted to tweak the file some more in order to add further constraints  to be exact I now need to have a different sample for each quarter (which effectively increases the number of variables from 2 to 5).
I modified the code, but even at 1,5 million runs it doesn't seem to arrive at a solution better then <90% match :x
I'm uploading the updated file here: https://1drv.ms/x/s!AtsoeYc3XMyLpzpYKJhaPQD31sV
Could you perhaps have a quick look?
I think mostly I am missing to correctly update the Crosscheck part, but then again maybe the culprit is somewhere else. Any hints?
Miłosz.


Hmm strange  pretty sure it isn't encrypted and I uploaded it to OneDrive which we used in this very thread :/
Here is another try: https://1drv.ms/x/s!AtsoeYc3XMyLpzvPlqUbjzSaIGC
I tested it and it worked on my end  if it for any reason still doesn't work for you I'll upload it to Dropbox  let me know :).

Here is another try: https://1drv.ms/x/s!AtsoeYc3XMyLpzvPlqUbjzSaIGC
Okay, got it, the problem is clear: The calculation of Audits Total in the worksheet is wrong.
The algorithm itself is unchanged, means it doesn't matter if you use PerQ*4 or PerQ+PerQ2+PerQ3+PerQ4.
Download my original file again, open it set Nth=50 and Quarter=65% (which is the average of your quarter settings): As you see there is no solution possible with that settings.
Theoretical max. of Nth=50 is Quarter=63% but as the algorithm is a "random brute force" means it can happen that the code never finds a 100% solution, so you have to live with 98%...
Andreas.

Well, that is not entirely it :) I managed to obtain a "solution" (after 4727008 runs ;) ) by changing the numbers, but it seems to me that it picks up the settings without even recognizing that the Q1/2/3/4 are supposed to be different.
Here's a screenshot:
https://1drv.ms/u/s!AtsoeYc3XMyLpzx7dmlg0Hhe05p3
And here's the file with the "solution":

but it seems to me that it picks up the settings without even recognizing that the Q1/2/3/4 are supposed to be different.
Assume you want to split 12 whatever and everybody should have 2 whatever at the end.
The calculation is 12 / 6 = 2Why do you assume that the distribution changes if you calculate 12 / 4 / 1.5 ?
Andreas.

Using your terms, I have 12 people divided in 5 groups and each group gets a different portion from a fixed amount of whatever.
Lets say we have 1000 pieces of whatever. Here's an example of a correct distribution:
EXAMPLE 1
group 1  1 person  7879
group 2  1 person  8687
group 3  1 person  104105
group 4  1 person  173174
group 5  8 people  552560 (6970 per person)But what the macro generates is more like this, regardless of the Q1Q4 percentage settings:
EXAMPLE 2
group 1 = group 2 = group 3 = group 4  4 people  428432 (107108 per person)
group 5  8 people  568576 (7172 per person)All the while each person in groups 14 needs to get a different portion of whatever, like shown in Example 1.
Let me know if I'm missing your point here...
 Edited by MiłoszNiedziela Monday, December 19, 2016 10:51 AM


Now I'm lost :) so are you saying it cannot be done, or just that the algorithm that you wrote for me needs to be rewritten in order for this to work? Because what I figured is that if we could diffrentiate bewtween M and Q, why can't we differentiate between M, Q1, Q2... etc. too many constraints or what is the obstacle here?
EDIT: Or are you just saying that this should be a separate topic, since it's too far off from the issue described in the opening post? Edited by MiłoszNiedziela Tuesday, December 20, 2016 1:42 PM


OK so I take it that it is possible :) let me try to make further adjustments on top of the ones already made and we'll see if it works...
EDIT: I take that back :))) since you're using immediate if, then you have either true or false, i.e. 2 possible constraints (and not 5...) so I suppose this is not doable. I am very sorry to have wasted your time.
 Edited by MiłoszNiedziela Tuesday, December 20, 2016 2:31 PM
