Does file size affect the speed of a directory listing?
-
07 May 2012 5:41
I have an app that reads a directory to get the filenames in it:
for each File in Directory
(process file)
next file
I want to know whether file size makes a difference to the time it takes to get the filenames.
For example, if one folder contains 10,000 photos of 1 megapixel and another contains 10,000 photos of 8 megapixels, will listing them take the same time?
Also, is this the fastest way to get the listing?
Any help is much appreciated.
All Replies
-
07 May 2012 6:05
No, because you don't read the files, only the directory.
If you want to get the file info, then it is a different case.
Success
Cor
- Marked as Answer by x38class 14 May 2012 7:01
-
07 May 2012 6:05
No.
Yes.
Possibly, depends how you get the listing.
Bear in mind that if you get the listing, Windows might cache it so the second and subsequent gets are faster.
Regards David R
---------------------------------------------------------------
The great thing about Object Oriented code is that it can make small, simple problems look like large, complex ones.
Object-oriented programming offers a sustainable way to write spaghetti code. - Paul Graham.
Every program eventually becomes rococo, and then rubble. - Alan Perlis
The only valid measurement of code quality: WTFs/minute.
-
07 May 2012 7:05
Thanks for replying.
All I want is the filename, so it should take the same time?
-
07 May 2012 7:06
As I said, the listing is
for each File in Directory
(process file)
next file
so I am not sure what you are trying to tell me
-
07 May 2012 7:37
No - file size makes no difference.
Yes - it will take the same time. That follows from the first answer.
Possibly - if you use the Directory class, that's probably the fastest way; if you use some home-brew method, it could be faster or slower. You do not show VB code, just pseudocode. I could assume you meant the Directory class, but then again you might not. :)
Regards David R
-
08 May 2012 4:34
My code is:
Dim gFile as String
For each gFile in System.IO.GetFiles("C:\Xyz,"*.jpg",SearchOption.TopDirectoryOnly)
lstFiles.items.add(gFile)
Next
Is this the class you are referring to, or is there an example you can point me to?
Thanks
-
08 May 2012 5:17
Dim gFile as String
For each gFile in System.IO.GetFiles("C:\Xyz,"*.jpg",SearchOption.TopDirectoryOnly)
lstFiles.items.add(gFile)
Next
That code won't compile for me. Which version of VB are you using? You haven't nominated the type for gFile, so it is not clear whether you are returning a file, a filename or a fileinfo.
The Directory class is described here:
http://msdn.microsoft.com/en-us/library/system.io.directory.aspx
-
08 May 2012 5:21
There is no System.IO.GetFiles method.
-
08 May 2012 6:03
Sorry, I missed out one word:
System.IO.Directory.GetFiles("C:\Xyz", "*.jpg", SearchOption.TopDirectoryOnly)
I have found this info at http://www.codeproject.com/Articles/38959/A-Faster-Directory-Enumerator
Directory.GetFiles method: ~43,860ms
DirectoryInfo.GetFiles method: ~44,000ms
FastDirectoryEnumerator.GetFiles method: ~55ms
FastDirectoryEnumerator.EnumerateFiles method: ~53ms
That is roughly a 830x increase in performance, and more than 2 orders of magnitude! And, the gap only increases as the latency to the PC containing the files increases.
The only problem is that it is in C#. Are any VB versions known?
For Acamar: .NET 2010, Framework 2. My example code had a word missing.
this link http://msdn.microsoft.com/en-us/library/ms143316%28v=vs.80%29.aspx has
'Declaration
Public Shared Function GetFiles ( _
    path As String, _
    searchPattern As String, _
    searchOption As SearchOption _
) As String()
'Usage
Dim path As String
Dim searchPattern As String
Dim searchOption As SearchOption
Dim returnValue As String()
returnValue = Directory.GetFiles(path, searchPattern, searchOption)
How is it used in an application? It seems very similar to my code.
I thought I may have concluded this thread by now, have to go to hospital, so will be back in approx 48 hours hopefully
-
08 May 2012 6:10
For accurate timing, you have to run each method after a reboot of the computer. Change the order in which you run the methods. Are your relative times the same? DirectoryInfo will always be slower because it returns FileInfo objects rather than plain names.
-
14 May 2012 7:02
Thanks to all who contributed. I still have no answer as to which is the fastest way to get a directory listing.
-
14 May 2012 7:19
I have now found this link
http://tom-shelton.net/index.php/2010/01/02/using-extension-methods-and-the-win32-api-to-efficiently-enumerate-the-file-system/
However, it is in C#. The author says the following:
All of the current built in .NET functions - enumerate the entire directory before returning. I have an article on my website on how to implement methods similar to what are being introduced in 4.0. The caveat is that the code is in C# - and seriously, would be much more complicated to do in VB.NET because the lack of iterator support. But, you could take the code and compile it to a dll for use in your vb project - if you don't fancy waiting for 4.0....
Tom Shelton
So my question now is: how do I get the C# code compiled to a DLL, and how do I use it in my app?
I am probably asking too much, as I would need step-by-step instructions. All I know is that there must be a faster way to get a directory listing, but changing my distribution to include .NET Framework 4 is overkill (imagine a user installing an app on XP and finding out they must also install .NET Framework 4; it really is excessive).
-
14 May 2012 7:45
If you want all the files in a directory and its subdirectories, there is very little difference in the times for the various methods. The time to get the files before they are cached is orders of magnitude greater than the time to get them after they are cached. What exactly do you want?
-
15 May 2012 4:32
I want a listing of file names from a selected directory.
Currently on my PC it takes 45 seconds to load 10,000 file names and parse each file name for a specific field.
My code is:
Dim gFile as String
For Each gFile In System.IO.Directory.GetFiles("C:\Xyz", "*.jpg", SearchOption.TopDirectoryOnly)
ParseFile(gFile)
Next
On a CD it takes 8 minutes. I have no idea what my users have in the way of memory, processor, drive type or rotation speed, so I want the time taken to be an absolute minimum; users should not lose confidence in my app because they have to sit and wait. They may have 35,000 files; what then?
-
15 May 2012 4:37
The time-consuming method is most likely ParseFile. GetFiles for a single directory of 35,000 files should take less than 10 seconds. Why is the user waiting?
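One way to check this is to time the listing and the parsing separately. The following is only a minimal sketch: the real ParseFile routine is not shown in this thread, so the ParseFile below is a hypothetical stand-in, and the folder is the "C:\Xyz" used in the earlier posts. Stopwatch is available from Framework 2.0 onwards.

```vb
Imports System
Imports System.Diagnostics
Imports System.IO

Module ListingTimer
    ' Hypothetical stand-in for the poster's ParseFile routine,
    ' whose real body is not shown in the thread.
    Function ParseFile(ByVal fullPath As String) As String
        Return Path.GetFileNameWithoutExtension(fullPath)
    End Function

    Sub Main()
        Dim folder As String = "C:\Xyz"   ' folder name taken from the thread
        If Not Directory.Exists(folder) Then
            Console.WriteLine("Folder not found: " & folder)
            Return
        End If

        ' Time only the directory listing.
        Dim listTimer As Stopwatch = Stopwatch.StartNew()
        Dim files As String() = Directory.GetFiles(folder, "*.jpg", SearchOption.TopDirectoryOnly)
        listTimer.Stop()

        ' Time only the per-file processing.
        Dim parseTimer As Stopwatch = Stopwatch.StartNew()
        For Each gFile As String In files
            ParseFile(gFile)
        Next
        parseTimer.Stop()

        Console.WriteLine("Listing {0} files: {1} ms", files.Length, listTimer.ElapsedMilliseconds)
        Console.WriteLine("Parsing: {0} ms", parseTimer.ElapsedMilliseconds)
    End Sub
End Module
```

If the second figure dominates, the listing method is not the bottleneck. Run it once after a restart and once again afterwards, since a cached listing will be much faster.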
- Edited by JohnWein, Microsoft Community Contributor 15 May 2012 4:39
- Edited by JohnWein, Microsoft Community Contributor 15 May 2012 4:42
- Proposed as Answer by Renee Culver 15 May 2012 6:52
-
15 May 2012 5:40
Thanks, John, for your suggestion. Obviously I had not seen my routine as being the problem; I was expecting the directory search to be the only cause.
I will have to put a progress bar in my app to highlight progress of the parsing.
Thanks for taking the time to make a contribution to myself & others on this site
-
15 May 2012 6:57
By the way, file size does not affect the speed of a listing. The NTFS file system keeps a record for each file in the Master File Table, and each record is usually 1024 bytes.
Renee
"MODERN PROGRAMMING is deficient in elementary ways BECAUSE of problems INTRODUCED by MODERN PROGRAMMING." Me
-
15 May 2012 14:21
I constructed a directory containing 35,000 jpg files (64 x 64 portions of screen shots). GetFiles took 100 milliseconds after a restart.
-
16 May 2012 5:31
Thanks to Renee & John for their further comments.
My searches on the web for fast solutions have found many people interested in this topic.
I hope the discussion on this thread is of some help to others
-
08 June 2012 17:59
There is nothing to discuss. File size in no way affects the amount of time needed to get a listing.
Renee
- Edited by Renee Culver 08 June 2012 17:59
-
08 June 2012 18:22
The reason it takes so long is that GetFiles reads *all* the names before it returns. But if you use the EnumerateFiles or EnumerateFileSystemEntries methods, they return before getting all the names, so you can start processing immediately. From the docs:
"The EnumerateFiles and GetFiles methods differ as follows: When you use EnumerateFiles, you can start enumerating the collection of names before the whole collection is returned; when you use GetFiles, you must wait for the whole array of names to be returned before you can access the array. Therefore, when you are working with many files and directories, EnumerateFiles can be more efficient."
http://msdn.microsoft.com/en-us/library/dd383458.aspx#Y342
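As a rough illustration of the difference, here is a minimal sketch that handles each name as soon as EnumerateFiles yields it. It assumes .NET Framework 4 or later (EnumerateFiles does not exist in Framework 2), and the ProcessAsFound and PrintName helpers are hypothetical names made up for this example; the folder is the "C:\Xyz" from the earlier posts.

```vb
Imports System
Imports System.IO

Module StreamingListing
    ' Hypothetical helper: calls handler for each .jpg as soon as its name
    ' is available, and returns how many files were seen.
    Function ProcessAsFound(ByVal folder As String, ByVal handler As Action(Of String)) As Integer
        Dim count As Integer = 0
        ' EnumerateFiles yields names lazily instead of building the whole array first.
        For Each gFile As String In Directory.EnumerateFiles(folder, "*.jpg", SearchOption.TopDirectoryOnly)
            handler(gFile)
            count += 1
        Next
        Return count
    End Function

    ' Hypothetical per-file action; replace with the real processing.
    Sub PrintName(ByVal fullPath As String)
        Console.WriteLine(Path.GetFileName(fullPath))
    End Sub

    Sub Main()
        Dim folder As String = "C:\Xyz"   ' folder name taken from the thread
        If Not Directory.Exists(folder) Then
            Console.WriteLine("Folder not found: " & folder)
            Return
        End If
        Dim total As Integer = ProcessAsFound(folder, AddressOf PrintName)
        Console.WriteLine("{0} files processed", total)
    End Sub
End Module
```

This does not make the directory read itself any faster; it only lets the first names be processed, or shown to the user, before the last ones have been read.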
-
08 June 2012 19:28
This only applies to cached info. The first time EnumerateFiles or GetFiles is run, each will be IO limited and take approximately the same time. The quickest way to do anything with IO is to do it the second time first.