Deleted

All replies

  • Before deciding to run the program on multiple threads, you first have to find out what the bottleneck is that makes it "kinda slow".

    • If the bottleneck is in your network bandwidth, then you will not gain anything from launching multiple threads.
    • If the bottleneck is in the response speed of the remote servers that you are scraping (presuming you are scraping from different sites), then it may be advantageous to launch a separate thread for each server.
    • If the bottleneck is in your CPU speed when processing the data that you have scraped, then making the program run on 100 threads is probably a bad idea, unless you run it on a computer with 100 cores. For CPU-bound work, don't launch more threads than the number of CPU cores.

    Depending on the case, and on how your program processes its data, the way in which you launch your multiple threads will be different. For the "CPU" case, the simplest way is probably a Parallel.ForEach loop that iterates over all the documents that you want to process.
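
    As a rough illustration of the "CPU" case, here is a minimal sketch of such a Parallel.ForEach loop. The DocumentProcessor class and the CountWords step are made-up placeholders for whatever processing your program really does; the only point is the loop itself and the cap of one worker per core.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public static class DocumentProcessor
    {
        // Hypothetical CPU-bound step: a stand-in for your real parsing code.
        public static int CountWords(string document)
        {
            return document.Split(new[] { ' ', '\n', '\t' },
                                  StringSplitOptions.RemoveEmptyEntries).Length;
        }

        // Processes every document in parallel, using at most one worker per core.
        public static int[] ProcessAll(IReadOnlyList<string> documents)
        {
            var results = new int[documents.Count];
            var options = new ParallelOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount
            };

            // Each iteration writes only to its own slot of the array,
            // so no lock is needed here.
            Parallel.ForEach(documents, options, (doc, state, index) =>
            {
                results[index] = CountWords(doc);
            });

            return results;
        }
    }
    ```

    You would call it as DocumentProcessor.ProcessAll(myDocuments), where myDocuments is the list of pages that you have already downloaded.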

    Friday, January 11, 2019 9:47 PM
  • Deleted
    Friday, January 11, 2019 10:30 PM
  • Ok, in that case it may be useful to launch several queries in parallel. Whether it works or not will depend mostly on the behavior of the proxy: if it's throttling "per request", then do send several requests in parallel to improve speed. If it's throttling "per user" then don't bother, it will not go faster if you send several requests.

    There are different ways to launch threads. For example, you could use the Task Parallel Library (TPL), or you could use the ThreadPool. The basic "Thread" class can be used like this:

    using System.Threading;

    Thread t = new Thread(MyMethod);
    t.Start(); // the thread does not run until Start() is called
    // ... do other things here
    t.Join(); // this waits for the thread to finish if it hasn't already

    private void MyMethod()
    {
        // this will be executed in a separate thread
    }
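
    If you prefer the TPL, sending several requests in parallel could look roughly like the sketch below. This uses tasks and a SemaphoreSlim instead of raw threads, which is a different technique from the Thread example. ParallelFetcher, the fetch delegate and the maxConcurrent parameter are names invented for this example; what limit (if any) the proxy tolerates is something you have to find out by testing.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    public static class ParallelFetcher
    {
        // fetch is a hypothetical delegate that downloads a single URL.
        // maxConcurrent caps how many requests are in flight at once.
        public static async Task<string[]> FetchAllAsync(
            IEnumerable<string> urls,
            Func<string, Task<string>> fetch,
            int maxConcurrent)
        {
            using (var gate = new SemaphoreSlim(maxConcurrent))
            {
                var tasks = urls.Select(async url =>
                {
                    await gate.WaitAsync();   // wait for a free slot
                    try
                    {
                        return await fetch(url);
                    }
                    finally
                    {
                        gate.Release();       // free the slot for the next request
                    }
                }).ToList();

                // Results come back in the same order as the input URLs.
                return await Task.WhenAll(tasks);
            }
        }
    }
    ```

    With a shared HttpClient instance named client (an assumption, not something from your code), the delegate could be url => client.GetStringAsync(url). If the proxy throttles "per user", lowering maxConcurrent to 1 gives you the sequential behavior back.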

    Saturday, January 12, 2019 8:16 AM
  • Deleted
    Saturday, January 12, 2019 8:44 AM
  • Yes, you put in "MyMethod" the code that you want to run in a separate thread.

    Be warned that multithreaded programming is very complex and there are many things to keep in mind when doing it. A couple of them are these:

    - If the code that runs in a Thread accesses any resource that is shared by other threads (for example, a variable in memory or a file on disk), then you have to take some precautions, such as applying a lock, to avoid corrupting that resource when two threads access it at the same time.

    - If you are writing a Windows desktop application, be aware that the user interface does NOT apply such locks: its controls are not thread-safe, so the UI can be corrupted if you access it from a thread other than the one that created it. Make sure that your multithreaded code does not access the screen; if it has to, you first need to marshal execution onto the main (UI) thread.

    - This locking of shared resources, if not done properly, can lead to deadlocks or contention, so you need to know and understand what you are doing; it's not a simple matter of "give me an example and I'll copy the code".

    - If it is NOT done, it leads to what are called "race conditions", where the code appears to work most of the time, but produces an unexpected error from time to time in an unpredictable way. These are horribly difficult to debug.
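
    To make the first point a bit more concrete, here is a minimal sketch of protecting a shared variable with a lock. PageCounter and RunDemo are hypothetical names made up for this illustration; treat it as a way to see the pattern, not as code to paste into your scraper.

    ```csharp
    using System;
    using System.Threading;

    public class PageCounter
    {
        private readonly object _sync = new object();
        private int _count;

        // Only one thread at a time can execute the body of the lock.
        public void Increment()
        {
            lock (_sync)
            {
                _count++;
            }
        }

        public int Count
        {
            get { lock (_sync) { return _count; } }
        }

        // Starts several threads that all update the same counter.
        public static int RunDemo(int threadCount, int incrementsPerThread)
        {
            var counter = new PageCounter();
            var threads = new Thread[threadCount];

            for (int i = 0; i < threadCount; i++)
            {
                threads[i] = new Thread(() =>
                {
                    for (int j = 0; j < incrementsPerThread; j++)
                    {
                        counter.Increment();
                    }
                });
                threads[i].Start();
            }

            foreach (var t in threads)
            {
                t.Join(); // wait for every worker before reading the total
            }
            return counter.Count;
        }
    }
    ```

    With the lock in place, RunDemo(4, 100000) returns 400000. Without it, _count++ is a read-modify-write that two threads can interleave, so some increments may be lost, which is exactly the kind of race condition described above. For a plain counter like this one, Interlocked.Increment would be a lighter alternative to a lock.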

    • Proposed as answer by Stanly Fan Monday, January 14, 2019 5:55 AM
    Saturday, January 12, 2019 9:41 AM