Reading .dat files by GPU

  • Question

  • Hello,

    I'm trying to take advantage of a heterogeneous computing system by developing an algorithm that makes the GPU contribute to the process of reading an enormous number of .dat files. I have already programmed this in C++ using traditional reading. What I need to know is whether that is possible with C++ AMP; in other words, can I write C++ AMP code to read this large number of files? If so, where should I start digging?

    Regards,
    Tuesday, April 9, 2013 2:10 AM

All replies

  • Hmm, this sounds like a task that's a rather poor fit for a GPU, as it is an I/O-bound one. Could you provide some more context? You can of course read arbitrary data through C++ AMP (with relative ease), but I see no reason to want to do so based on what you described. You'd just be stuck waiting for the .dat files to be read from the hard drive, or a similarly slow medium, and then pay the additional cost of flipping them over to the GPU and fetching them back for no reason.
    Tuesday, April 9, 2013 3:59 AM
  • I wrote traditional C++ code that reads all the .dat files in a given directory, but that was just a start, as the real test has to deal with a huge amount of data (e.g. 100 GB at least). So one of the solutions that came to my mind is using the GPU to help read data of this size. Since the GPU cannot read directly from the hard drive, I found some suggested ways to minimize the communication overhead, such as page-locked memory, and a lot of discussion around it and other techniques. Do you think such a solution is practical and efficient? The other direction I have is to just leave the CPU reading the files and use the GPU for other tasks, such as performing some calculations on those files. What do you suggest?
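
    (For reference, here is roughly the kind of traditional CPU-only reader I mean, written as a minimal sketch in modern C++ with std::filesystem; the directory path and the per-file work are just placeholders, not my actual code:)

    ```cpp
    // Sketch of the "traditional reading" baseline: walk a directory, load each
    // .dat file into memory on the CPU, and hand the bytes to some per-file work.
    // The directory name and the "work" (summing sizes) are placeholders.
    #include <filesystem>
    #include <fstream>
    #include <iterator>
    #include <vector>
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        namespace fs = std::filesystem;
        std::size_t total = 0;

        for (const auto& entry : fs::directory_iterator("data"))   // placeholder path
        {
            if (entry.path().extension() != ".dat")
                continue;

            std::ifstream in(entry.path(), std::ios::binary);
            std::vector<std::uint8_t> bytes(
                (std::istreambuf_iterator<char>(in)),
                std::istreambuf_iterator<char>());

            total += bytes.size();   // stand-in for the real per-file processing
        }

        std::printf("read %zu bytes\n", total);
    }
    ```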
    Wednesday, April 10, 2013 1:49 AM
  • Nope, it is neither practical nor efficient, IMHO. It makes perfect sense to have the CPU parse your directories and read in the .dat files - you will be waiting on the slowest part of the system, which is I/O, anyway. I am fairly sure that even a single core can saturate whatever your HDD can provide. A GPU is a compute accelerator; it does best when there is sufficient arithmetic intensity involved and when you do extensive processing on some data set. So yes, my suggestion would be to go with your second scenario IF you are doing enough processing per byte to offset the cost of moving large quantities of data across PCI-E.
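
    To make that second scenario concrete, here is a rough, untested sketch: the CPU does the file I/O, and the GPU then runs a placeholder per-element computation over the data through C++ AMP. The file name, the element count, and the math are all stand-ins for whatever your real workload is.

    ```cpp
    // Sketch: CPU reads the .dat file, GPU does the per-element arithmetic.
    // File name, element count, and the computation itself are placeholders.
    #include <amp.h>
    #include <amp_math.h>
    #include <fstream>
    #include <vector>

    int main()
    {
        using namespace concurrency;

        // 1. File I/O stays on the CPU - the GPU cannot touch the disk.
        std::ifstream in("input.dat", std::ios::binary);
        std::vector<float> host(1 << 20);                 // assume the file holds floats
        in.read(reinterpret_cast<char*>(host.data()),
                host.size() * sizeof(float));

        // 2. Wrap the host buffer; it is copied to the accelerator on first use.
        array_view<float, 1> data(static_cast<int>(host.size()), host);

        // 3. Only worth it if there is enough arithmetic per byte transferred.
        parallel_for_each(data.extent, [=](index<1> idx) restrict(amp)
        {
            data[idx] = fast_math::sqrt(data[idx] * data[idx] + 1.0f);
        });

        // 4. Bring the results back across PCI-E to the CPU side.
        data.synchronize();
    }
    ```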
    Wednesday, April 10, 2013 3:10 AM
  • Look at the paper titled "GPUfs" by Silberstein, Ford, Keidar, and Witchel.

    Also, what GPU are you using?

    Friday, June 7, 2013 6:42 PM