R doAzureParallel and chunkSize

  • Question

  • I intend to use Azure Batch to run R in parallel with the doAzureParallel package. I intend to leave the default chunkSize = 1.

    Does it start a new R session for each iteration of the foreach() loop?

    I do want it to start a new R session for each iteration of the foreach() loop. If the answer to the above question is "no", how can I make it restart the R session for each iteration?

    Monday, February 11, 2019 1:08 AM

Answers

  • Hi John,

    Yes, if you set chunkSize to 1, every foreach iteration will instantiate a new R session. Is there a reason why you want to start a new R process for each iteration?

    Thanks,

    Brian


    Brian Hoang

    Tuesday, February 12, 2019 8:26 PM

All replies

  • Can you provide some additional docs or links to help me better understand the question? I don't quite have enough to go off of to provide any suggestions. 
    Monday, February 11, 2019 7:02 PM
    Moderator
  • Hi Micah,

    Here is the link to the documentation of the chunkSize option, with example code.

    https://github.com/Azure/doAzureParallel/blob/master/docs/80-performance-tuning.md#using-the-chunksize-option

    Thank you for your help.

    Tuesday, February 12, 2019 1:20 AM
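
Editor's note: the linked documentation describes chunkSize as a way to group foreach iterations so that each group runs together in a single R session. The sketch below only illustrates that grouping with base R; it is not doAzureParallel's internal code, and chunk_iterations() is a hypothetical helper introduced for this example.

```r
# Illustrative only: group foreach iteration indices into chunks of a given size.
# chunk_iterations() is a hypothetical helper, not a doAzureParallel function.
chunk_iterations <- function(n_iterations, chunk_size = 1L) {
  stopifnot(n_iterations >= 1, chunk_size >= 1)
  idx <- seq_len(n_iterations)
  # Iterations 1..chunk_size form chunk 1, the next chunk_size form chunk 2, ...
  unname(split(idx, ceiling(idx / chunk_size)))
}

# chunkSize = 1: every iteration is its own chunk.
one_per_chunk <- chunk_iterations(10, chunk_size = 1)
length(one_per_chunk)

# chunkSize = 3: ten iterations fall into four chunks.
grouped <- chunk_iterations(10, chunk_size = 3)
lengths(grouped)
```

In doAzureParallel itself, the linked page passes the option to foreach through `.options.azure`, for example `foreach(i = 1:10, .options.azure = list(chunkSize = 3)) %dopar% { ... }`. That call needs a registered Azure Batch cluster, so it is not part of the runnable sketch above.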
  • Could you open an issue directly on that repo and CC me? 

    https://github.com/Azure/doAzureParallel/issues

    My GitHub username is MicahMcKittrick-MSFT.

    I can add some people who should be able to answer the question. 

    Tuesday, February 12, 2019 7:27 PM
    Moderator
  • Thanks, Brian. The reason is that restarting the R session seems to be the only way to clear out R's memory.
    Tuesday, February 12, 2019 11:12 PM
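
Editor's note: to see locally why a fresh session clears memory, the sketch below runs each iteration in a separate Rscript process using base R only. It is not how doAzureParallel schedules work on Azure Batch; run_in_fresh_session() is a hypothetical helper, and the sketch assumes Rscript is installed alongside the running R.

```r
# Hypothetical helper: run one iteration in a brand-new R process
# and return that process's PID.
run_in_fresh_session <- function(i) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # The child allocates a large vector, prints its own PID, then exits,
  # so everything it allocated is released when the process ends.
  expr <- sprintf("big <- numeric(1e6) + %d; cat(Sys.getpid())", i)
  out <- system2(rscript, c("-e", shQuote(expr)), stdout = TRUE)
  as.integer(out)
}

child_pids <- vapply(1:3, run_in_fresh_session, integer(1))

# Every child ran outside the calling session ...
all(child_pids != Sys.getpid())
# ... and the vector it created never existed in the calling session.
exists("big")
```

Because each child process exits after its iteration, nothing it allocated can accumulate in the calling session, which is the same isolation the thread is after with chunkSize = 1.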