Why use Path.GetRandomFileName instead of Guid.NewGuid().ToString()?

  • Question

  • Path.GetRandomFileName() uses a cryptographic random number generator to produce an 8.3 file name (where the .3 is 'tmp'), though I'm not clear on the probability of it being unique. (How many temp files, to an order of magnitude, need to be created before there is a 50% or greater chance of a collision?)

    On the other hand, Guid.NewGuid() creates a (much longer) unique string, also based on a crypto algorithm, where the probability of a collision over a lifetime of creating GUIDs is known to be so low that, as I understand it, it is safe to assume the string is unique.

    Is Path.GetRandomFileName() preferred for any reason? Is 8.3 still a requirement on many systems, is there a performance concern, or is there some other reason why it is the better way to go?


    Edit: I believe the math shows that you only need to generate about 2 million files before a collision becomes likely. Perhaps someone else can confirm?
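
    The estimate can be checked with the usual birthday-bound approximation, n ≈ sqrt(2 · N · ln 2), where N is the number of equally likely names. Below is a minimal Python sketch (Python only for brevity; the arithmetic does not depend on .NET). The alphabet sizes are assumptions: 36 corresponds to a-z plus 0-9, and 32 to a base-32 encoding. `draws_for_half_collision` is a hypothetical helper name.

```python
import math


def draws_for_half_collision(alphabet_size: int, length: int) -> float:
    """Approximate number of random names drawn before a collision is 50% likely.

    Uses the birthday-bound approximation n ~ sqrt(2 * N * ln 2),
    where N = alphabet_size ** length is the size of the name space.
    """
    space = alphabet_size ** length
    return math.sqrt(2 * space * math.log(2))


# Assumed alphabets: 32 symbols (base-32) and 36 symbols (a-z plus 0-9).
for alphabet_size in (32, 36):
    n = draws_for_half_collision(alphabet_size, 8)
    print(f"{alphabet_size} symbols, 8 random characters: ~{n:,.0f} names")
```

    Under these assumptions, 8 random characters give a 50% collision threshold of roughly 1.2 million names (32 symbols) to just under 2 million names (36 symbols).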
    • Edited by Jacob Pitts Thursday, July 26, 2012 7:05 AM
    Thursday, July 26, 2012 6:51 AM

Answers

  • OK, I made an incorrect assumption, based on the code I was debugging at the time, that you got an 8.3 name where only the first 8 characters were random and the last 3 were 'tmp'. It turns out you actually get 11 random characters, which substantially increases the space. I think the likely threshold for collisions is closer to 475 million with the larger space. I wrote a short program that calls this algorithm and takes the first 8 characters as a substring, and got collisions around the 2M mark, as expected:

    Collision found in 0337913.
    Collision found in 1101434.
    Collision found in 0477350.
    Collision found in 2096446.
    Collision found in 1019310.
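
    A program along these lines might look like the following Python sketch. It does not call Path.GetRandomFileName; it assumes each of the 8 characters carries 5 random bits (a 32-symbol alphabet), so a name is modeled as a 40-bit integer. `draws_until_collision` is a hypothetical helper, and `random.Random` is not a cryptographic generator, which should not matter for a statistical check.

```python
import random

BITS_PER_CHAR = 5  # assumption: 32-symbol alphabet, i.e. 5 random bits per character


def draws_until_collision(length: int, rng: random.Random) -> int:
    """Draw random names of `length` characters until one repeats.

    Each name is modeled as a (BITS_PER_CHAR * length)-bit integer.
    Returns the number of draws made, including the colliding one.
    """
    seen = set()
    while True:
        name = rng.getrandbits(BITS_PER_CHAR * length)
        if name in seen:
            return len(seen) + 1
        seen.add(name)
```

    Calling `draws_until_collision(8, random.Random())` a few times should give counts on the order of a million, in line with the figures above. Every name seen so far is kept in memory, so each run needs room for a few million integers.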

    Based on the speed of Guid.NewGuid() and the larger space, I'm tempted to think that a file name of string.Format("{0:N}.tmp", Guid.NewGuid()) is the better choice: it is quicker to generate and has more reasonable uniqueness.

    • Marked as answer by Mike Feng Friday, July 27, 2012 7:20 AM
    Thursday, July 26, 2012 10:54 PM

All replies

  • Hi,

    I believe that Path.GetRandomFileName() has a more complicated algorithm, because after running it 200,000,000 times I didn't get any duplicates. Another result is that Guid.NewGuid() is somewhere around 5 times faster than Path.GetRandomFileName().

    Everything is a matter of probability...

    Thursday, July 26, 2012 11:19 AM

  • Be aware that Windows limits the full file path to 260 characters by default (MAX_PATH), so using a GUID in the path can overflow that limit much more easily than an 8.3 name.
    Wednesday, March 30, 2016 8:46 PM
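
    To put numbers on the length difference, here is a small Python sketch. It assumes the default MAX_PATH value of 260 characters, which includes the terminating null, and ignores long-path support. `fits_in_max_path` and the sample names are hypothetical; they only reproduce the lengths of a "{0:N}.tmp" GUID name (36 characters) and an 8.3 name (12 characters).

```python
MAX_PATH = 260  # assumed default Windows limit, counting the terminating null


def fits_in_max_path(directory: str, file_name: str, limit: int = MAX_PATH) -> bool:
    """Return True if directory + separator + file_name fits within `limit`."""
    full_length = len(directory.rstrip("\\")) + 1 + len(file_name)
    return full_length + 1 <= limit  # +1 for the terminating null


guid_style_name = "0" * 32 + ".tmp"  # same length as string.Format("{0:N}.tmp", guid)
short_name = "abcdefgh.ijk"          # same length as an 8.3 name

deep_directory = "C:\\" + "d" * 227  # a 230-character directory path
print(len(guid_style_name), fits_in_max_path(deep_directory, guid_style_name))
print(len(short_name), fits_in_max_path(deep_directory, short_name))
```

    The GUID-style name is 24 characters longer, so it only matters for directories that are already within a couple of dozen characters of the limit.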