Thread uses 24% of CPU

  • Question

  • Hi all,

    I am writing a virus removal tool that detects viruses by their MD5 hash.

    I created a scanning thread that lists all loaded modules and computes the MD5 of every file; if a file's MD5 matches a known virus MD5, the file is flagged as a virus.

    The problem is that my tool uses about 24% of the CPU (sometimes more), and the PC hangs.


    I must Win

    Tuesday, September 25, 2012 12:19 PM

All replies

  • How much memory do you have in the computer, and how much cache?  USB Guard is using 11.8G of memory.  That would mean a lot of memory swapping is occurring, with data being paged out to a cache on the hard drive, which is very slow.

    jdweng

    Tuesday, September 25, 2012 12:45 PM
  • @Kosay Hatem:

    There are many possible reasons for a process to consume a high percentage of CPU time. You have to measure and identify the bottleneck. Can you be more specific?

    For example, if the MD5 calculation takes most of the time, that is the area to optimize.

    @ Joel:

    I noticed that you mentioned 11.8G of memory. Are you saying that the 11,828 KB in the screenshot is 11.8 gigabytes of memory? I think it is approximately 11 MB (11.5508 MB, to be precise).
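
    The unit conversion above can be checked in a few lines. This is only a worked example of the arithmetic, assuming the 11,828 figure from the screenshot is in kilobytes of 1024 bytes, as Task Manager reports them.

```python
# Memory figure quoted from the screenshot, in KB (1 KB = 1024 bytes here).
working_set_kb = 11_828

working_set_mb = working_set_kb / 1024   # KB -> MB
working_set_gb = working_set_mb / 1024   # MB -> GB

print(f"{working_set_mb:.4f} MB")  # 11.5508 MB
print(f"{working_set_gb:.4f} GB")  # 0.0113 GB
```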

    Tuesday, September 25, 2012 2:11 PM
  • Have you done any profiling within your code to see which parts of your algorithm take the longest?  I would expect the MD5 hash calculation to take longer as the file size grows.

    Also, are you iterating through the files in a tight loop?  While Windows is pretty good at slicing processor time so that all processes can function, "bad neighbor" applications that run processor-intensive, long-running tight loops can still steal an inordinate amount of processor time.  In this case, you might consider using a short timer (around 10 ms) to drive the file system iteration, which forces some idle time between file checks.

    Another thing: how are you doing the MD5 comparison?  If you keep a list of virus MD5s and walk the entire list for each file's MD5, you may be introducing a big inefficiency.  If the virus MD5 list is long, you will go through the whole list most of the time, since most, if not all, files will not be virus files.  If you are doing it this way, a better idea would be to sort the list of MD5s and use a binary search (or another fast lookup structure) so that each file needs only a handful of comparisons.
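
    A minimal sketch of the lookup idea above, written in Python rather than VB.NET to keep it short. The signature list `KNOWN_BAD` and both function names are hypothetical; the digests are just MD5 hashes of sample byte strings, not real virus signatures. It shows two alternatives to scanning the whole list for every file: a binary search over a sorted list, and a hash set.

```python
import bisect
import hashlib

# Hypothetical signature list: MD5 digests of sample byte strings.
KNOWN_BAD = [hashlib.md5(s).hexdigest()
             for s in (b"sample-a", b"sample-b", b"sample-c")]

# Option 1: sort once, then binary-search each file hash (O(log n) per lookup).
SORTED_BAD = sorted(KNOWN_BAD)

def is_known_bad_sorted(digest: str) -> bool:
    i = bisect.bisect_left(SORTED_BAD, digest)
    return i < len(SORTED_BAD) and SORTED_BAD[i] == digest

# Option 2: a hash set gives average O(1) lookups and needs no sorting.
BAD_SET = frozenset(KNOWN_BAD)

def is_known_bad_set(digest: str) -> bool:
    return digest in BAD_SET
```

    In .NET the second option would probably map to a `HashSet(Of String)`; either way the lookup cost is small compared with hashing the file itself, so this helps only if the signature list is large.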

    Also, some of the hang may be due to your application performing a lot of I/O.  If you are accessing the disk constantly, you are essentially monopolizing the disk, which will stall other applications that need disk access regardless of how much processor time you are using.

    Just some ideas.  Hope they help.

    Tuesday, September 25, 2012 9:46 PM
  • It uses 11,828 KB, not 11.8 GB.

    I tried to skip system files, but some viruses inject themselves into system processes, so I can't skip anything.

    I will try another MD5 calculation algorithm.

    But I have a question: is the CPU usage this high because I use VB.NET? Would it decrease if I used a C++ library for scanning or computing the MD5?

    thank you


    I must Win

    Wednesday, September 26, 2012 9:57 PM
  • Using natively compiled C++ could reduce the time; however, if your algorithm uses the same basic principles, you will likely still have the problem, only for less time.

    Remember that processor utilization is a reading of current CPU usage sampled at intervals.  The best that compiling in C++ could do is reduce the amount of time the processor is pinned, but it won't stop it from being pinned for a time.

    My guess is that there is an inefficiency in the algorithm you are using, whether it is tight loops, excessive I/O, or some other bottleneck.

    The only way to be sure is to profile your algorithm and measure how long each part takes to execute.  Only then can you see where the bottlenecks in your code are.  Also, if you are using tight loops in your algorithm, try to come up with another way to implement it, or at the very least introduce sleeps to break your application's hold on the processor.
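
    One rough way to do the per-stage measurement described above is to accumulate wall-clock time for each step of the scan. The sketch below is in Python for brevity and is not the original poster's code; `timed`, `read_file`, `md5_hex`, and `scan` are hypothetical names. In .NET, `System.Diagnostics.Stopwatch` or a real profiler would play the same role.

```python
import hashlib
import time
from collections import defaultdict

timings = defaultdict(float)  # stage name -> accumulated seconds

def timed(stage, func, *args):
    """Run func(*args) and add its elapsed time to the given stage."""
    start = time.perf_counter()
    result = func(*args)
    timings[stage] += time.perf_counter() - start
    return result

def read_file(path):
    with open(path, "rb") as f:
        return f.read()

def md5_hex(data):
    return hashlib.md5(data).hexdigest()

def scan(paths, bad_hashes):
    """Return the paths whose MD5 is in bad_hashes, timing each stage."""
    hits = []
    for path in paths:
        data = timed("read", read_file, path)
        digest = timed("hash", md5_hex, data)
        if timed("lookup", bad_hashes.__contains__, digest):
            hits.append(path)
    return hits
```

    After a run, comparing `timings["read"]`, `timings["hash"]`, and `timings["lookup"]` shows which stage dominates and therefore which one is worth optimizing first.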

    • Marked as answer by Mike Feng (Moderator) Tuesday, October 9, 2012 8:30 AM
    • Unmarked as answer by Kosay Hatem Sunday, October 14, 2012 2:47 PM
    • Marked as answer by Kosay Hatem Sunday, October 14, 2012 2:47 PM
    Monday, October 1, 2012 8:07 PM
  • If I were a betting man, I'd wager that your machine is a quad-core system.  If you were to run your scan on a single-core system, instead of 24% you'd see it eating 96% of the CPU.

    MD5 calculations are CPU intensive.

    First, I'd suggest running your thread at the lowest priority you can.  That will allow other processes to do their work, and the scan will run when everything else is idle. (Note that your scan times will increase.)

    Second, I'd check whether there is any unnecessary code in the calculation loop. The faster each calculation finishes, the more time, proportionally, your program will spend doing I/O.  Other programs can use the CPU while yours is waiting on I/O operations to complete.

    Finally, if all else fails, adding a hard pause between files will free up time for other applications.
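
    The hard pause between files can be sketched as follows, in Python for brevity. The names `md5_of_file` and `throttled_scan`, the 1 MiB chunk size, and the 10 ms default pause are all assumptions for illustration (10 ms echoes the timer interval suggested earlier in the thread). Reading in chunks also keeps memory use flat on large files.

```python
import hashlib
import time

def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks instead of loading it all at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def throttled_scan(paths, bad_hashes, pause_seconds=0.01):
    """Return paths whose MD5 is in bad_hashes, pausing after each file."""
    hits = []
    for path in paths:
        if md5_of_file(path) in bad_hashes:
            hits.append(path)
        time.sleep(pause_seconds)  # give the CPU back to other programs
    return hits
```

    In VB.NET the pause would likely be a `Thread.Sleep` call, and the priority suggestion corresponds to setting the scan thread's `Priority` to `ThreadPriority.Lowest`. A longer pause lowers average CPU usage at the cost of a slower scan.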


    This signature unintentionally left blank.

    • Marked as answer by Kosay Hatem Sunday, October 14, 2012 2:47 PM
    Wednesday, October 3, 2012 12:53 PM
  • With Task Manager you can display the usage of all the cores (not just one) to check whether a single core is causing the issue.  I suspect the virus is running while your code is running, and your code has to wait for the virus to pause before it can access the files.  I think you may have to run your code at the highest priority to make sure it isn't blocked by the virus itself.


    jdweng

    Wednesday, October 3, 2012 1:01 PM
  • I added a new function that skips big files and system files when computing the MD5.
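
    The poster's function is not shown, so the following is only a guess at what such a filter might look like, in Python for brevity. `should_scan`, the 50 MiB cutoff, and the `skip_dirs` parameter are hypothetical. Note the trade-off raised earlier in the thread: anything skipped here is never checked.

```python
import os

MAX_BYTES = 50 * 1024 * 1024  # hypothetical size cutoff (50 MiB)

def should_scan(path, max_bytes=MAX_BYTES, skip_dirs=()):
    """Return False for files under a skipped directory or above the size limit."""
    full = os.path.normcase(os.path.abspath(path))
    for d in skip_dirs:
        prefix = os.path.normcase(os.path.abspath(d))
        if full == prefix or full.startswith(prefix + os.sep):
            return False
    try:
        return os.path.getsize(full) <= max_bytes
    except OSError:
        return False  # missing or unreadable file: nothing to hash
```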

    thank you


    I must Win

    Sunday, October 14, 2012 2:45 PM