Error while running caret_example.R in doAzureParallel GitHub repo

  • Question

  • I'm running an R script that uses caret for k-fold cross-validation with method = "rf" (random forest). The caret example in the doAzureParallel GitHub repo (https://github.com/Azure/doAzureParallel/blob/master/samples/caret/caret_example.R) fails with the same error as my script. The error is:

    ======================================================================
    Id: job20190809200919
    chunkSize: 8
    enableCloudCombine: TRUE
    packages: 
    	caret; 
    errorHandling: stop
    wait: TRUE
    autoDeleteJob: TRUE
    ======================================================================
    Submitting tasks (75/75)
    Submitting merge task. . .
    Job Preparation Status: Package(s) being installed............
    Waiting for tasks to complete. . .
    | Progress: 2.67% (2/75) | Running: 2 | Queued: 71 | Completed: 2 | Failed: 2 |
    Errors have occurred while running the job 'job20190809200919'. Error handling is set to 'stop' and has proceeded to terminate the job. The user will have to handle deleting the job. If this is not the correct behavior, change the errorhandling property to 'pass'  or 'remove' in the foreach object. Use the 'getJobFile' function to obtain the logs. For more information about getting job logs, follow this link: https://github.com/Azure/doAzureParallel/blob/master/docs/90-troubleshooting.md#viewing-files-directly-from-compute-node
    Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
      object 'results' not found
    In addition: Warning messages:
    1: In waitForTasksToComplete(id, jobTimeout, obj$errorHandling) :
      2 task(s) failed while running the job. This caused the job to terminate automatically. To disable this behavior and continue on failure, set .errorHandling='remove | pass' in the foreach loop
    2: In self$client$extractAzureResponse(response, content) :
      Service Unavailable (HTTP 503).

    sessionInfo()

    R version 3.6.1 (2019-07-05)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 18362)
    
    Matrix products: default
    
    Random number generation:
     RNG:     Mersenne-Twister 
     Normal:  Inversion 
     Sample:  Rounding 
     
    locale:
    [1] LC_COLLATE=English_United States.1252 
    [2] LC_CTYPE=English_United States.1252   
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C                          
    [5] LC_TIME=English_United States.1252    
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods  
    [7] base     
    
    other attached packages:
     [1] DAAG_1.22.1           caret_6.0-84          ggplot2_3.2.0        
     [4] lattice_0.20-38       doAzureParallel_0.8.0 iterators_1.0.12     
     [7] foreach_1.4.7         devtools_2.1.0        usethis_1.5.1        
    [10] ROCR_1.0-7            gplots_3.0.1.1       
    
    loaded via a namespace (and not attached):
      [1] tidyselect_0.2.5       lme4_1.1-21           
      [3] robust_0.4-18.1        grid_3.6.1            
      [5] munsell_0.5.0          codetools_0.2-16      
      [7] future_1.14.0          miniUI_0.1.1.1        
      [9] withr_2.1.2            Brobdingnag_1.2-6     
     [11] metaBMA_0.6.1          colorspace_1.4-1      
     [13] rstudioapi_0.10        stats4_3.6.1          
     [15] DescTools_0.99.28      robustbase_0.93-5     
     [17] ggsignif_0.6.0         rcompanion_2.2.2      
     [19] listenv_0.7.0          emmeans_1.4           
     [21] rstan_2.19.2           mnormt_1.5-5          
     [23] MCMCpack_1.4-4         bridgesampling_0.7-2  
     [25] rprojroot_1.3-2        coda_0.19-3           
     [27] vctrs_0.2.0            generics_0.0.2        
     [29] TH.data_1.0-10         metafor_2.1-0         
     [31] ipred_0.9-9            randomForest_4.6-14   
     [33] R6_2.4.0               BayesFactor_0.9.12-4.2
     [35] bitops_1.0-6           reshape_0.8.8         
     [37] logspline_2.1.13       assertthat_0.2.1      
     [39] promises_1.0.1         scales_1.0.0          
     [41] multcomp_1.4-10        nnet_7.3-12           
     [43] ggExtra_0.8            gtable_0.3.0          
     [45] multcompView_0.1-7     globals_0.12.4        
     [47] processx_3.4.1         mcmc_0.9-6            
     [49] sandwich_2.5-1         timeDate_3043.102     
     [51] rlang_0.4.0            MatrixModels_0.4-1    
     [53] EMT_1.1                zeallot_0.1.0         
     [55] splines_3.6.1          TMB_1.7.15            
     [57] lazyeval_0.2.2         ModelMetrics_1.2.2    
     [59] broom_0.5.2            inline_0.3.15         
     [61] reshape2_1.4.3         abind_1.4-5           
     [63] modelr_0.1.5           backports_1.1.4       
     [65] httpuv_1.5.1           tools_3.6.1           
     [67] lava_1.6.6             psych_1.8.12          
     [69] ellipsis_0.2.0.1       RColorBrewer_1.1-2    
     [71] WRS2_1.0-0             sessioninfo_1.1.1     
     [73] ez_4.4-0               Rcpp_1.0.2            
     [75] plyr_1.8.4             jmvcore_1.0.0         
     [77] RCurl_1.95-4.12        purrr_0.3.2           
     [79] ps_1.3.0               prettyunits_1.0.2     
     [81] rpart_4.1-15           pbapply_1.4-1         
     [83] cowplot_1.0.0          zoo_1.8-6             
     [85] LaplacesDemon_16.1.1   haven_2.1.1           
     [87] ggrepel_0.8.1          cluster_2.1.0         
     [89] fs_1.3.1               furrr_0.1.0           
     [91] magrittr_1.5           data.table_1.12.2     
     [93] openxlsx_4.1.0.1       manipulate_1.0.1      
     [95] SparseM_1.77           lmtest_0.9-37         
     [97] mvtnorm_1.0-11         broomExtra_0.0.4      
     [99] sjmisc_2.8.1           matrixStats_0.54.0    
    [101] pkgload_1.0.2          hms_0.5.0             
    [103] mime_0.7               xtable_1.8-4          
    [105] rio_0.5.16             sjstats_0.17.5        
    [107] broom.mixed_0.2.4      readxl_1.3.1          
    [109] gridExtra_2.3          rstantools_1.5.1      
    [111] testthat_2.2.1         compiler_3.6.1        
    [113] tibble_2.1.3           KernSmooth_2.23-15    
    [115] ggstatsplot_0.0.12     crayon_1.3.4          
    [117] minqa_1.2.4            StanHeaders_2.18.1-10 
    [119] htmltools_0.3.6        mgcv_1.8-28           
    [121] mc2d_0.1-18            pcaPP_1.9-73          
    [123] later_0.8.0            tidyr_0.8.3           
    [125] libcoin_1.0-4          rrcov_1.4-7           
    [127] expm_0.999-4           lubridate_1.7.4       
    [129] sjlabelled_1.1.0       jmv_0.9.6.1           
    [131] MASS_7.3-51.4          boot_1.3-22           
    [133] Matrix_1.2-17          car_3.0-3             
    [135] cli_1.1.0              gdata_2.18.0          
    [137] parallel_3.6.1         insight_0.4.1         
    [139] gower_0.2.1            forcats_0.4.0         
    [141] pkgconfig_2.0.2        fit.models_0.5-14     
    [143] coin_1.3-0             foreign_0.8-71        
    [145] skimr_1.0.7            recipes_0.1.6         
    [147] paletteer_0.2.1        ggcorrplot_0.1.3      
    [149] estimability_1.3       prodlim_2018.04.18    
    [151] stringr_1.4.0          callr_3.3.1           
    [153] digest_0.6.20          cellranger_1.1.0      
    [155] nortest_1.0-4          curl_4.0              
    [157] shiny_1.3.2            gtools_3.8.1          
    [159] quantreg_5.51          modeltools_0.2-22     
    [161] rjson_0.2.20           nloptr_1.2.1          
    [163] jsonlite_1.6           nlme_3.1-140          
    [165] carData_3.0-2          groupedstats_0.0.8    
    [167] desc_1.2.0             pillar_1.4.2          
    [169] loo_2.1.0              httr_1.4.1            
    [171] purrrlyr_0.0.5         DEoptimR_1.0-8        
    [173] pkgbuild_1.0.4         survival_2.44-1.1     
    [175] remotes_2.1.0          glue_1.3.1            
    [177] bayestestR_0.2.5       zip_2.0.3             
    [179] rAzureBatch_0.7.0      class_7.3-15          
    [181] stringi_1.4.3          performance_0.3.0     
    [183] rsample_0.0.5          latticeExtra_0.6-28   
    [185] memoise_1.1.0          caTools_1.17.1.2      
    [187] dplyr_0.8.3           

    Friday, August 9, 2019 9:08 PM

Answers

  • Hi rserran,

    I wasn't able to reproduce the issue. Are you using the caret cluster configuration file from the example? It specifies a container image that has caret installed.

    https://github.com/Azure/doAzureParallel/tree/master/samples/caret
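One way to check this before creating the cluster is to read the config file and confirm that it names a container image. The following is only a minimal sketch: `cluster_container_image` is a hypothetical helper, not part of doAzureParallel, and the config values written to the temporary file are placeholders. The image name is copied from this thread, and the `containerImage` field name is assumed to match the sample config linked above.

```r
library(jsonlite)

# Hypothetical helper (not part of doAzureParallel): returns the container
# image named in a cluster config file, or NA if none is set.
cluster_container_image <- function(path) {
  cfg <- fromJSON(path)
  image <- cfg[["containerImage"]]
  if (is.null(image) || !nzchar(image)) NA_character_ else image
}

# Throwaway config for illustration only; every value is a placeholder.
example_config <- list(
  name = "caret-pool",
  vmSize = "Standard_D2_v2",
  maxTasksPerNode = 1,
  poolSize = list(
    dedicatedNodes = list(min = 0, max = 0),
    lowPriorityNodes = list(min = 3, max = 3),
    autoscaleFormula = "QUEUE"
  ),
  containerImage = "jrowden/dcaret"  # image name as quoted in this thread
)
cfg_path <- tempfile(fileext = ".json")
write_json(example_config, cfg_path, auto_unbox = TRUE, pretty = TRUE)

image <- cluster_container_image(cfg_path)
if (is.na(image)) {
  message("No containerImage set; worker nodes may lack caret.")
} else {
  message("Cluster will use container image: ", image)
}
```

If the check passes for your own config file, pass that file to `makeCluster()` and register the result with `registerDoAzureParallel()` as usual.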

    R version 3.6.0 (2019-04-26)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows >= 8 x64 (build 9200)


    Matrix products: default


    Random number generation:
     RNG:     Mersenne-Twister 
     Normal:  Inversion 
     Sample:  Rounding 
     
    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    


    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     


    other attached packages:
    [1] caret_6.0-84          DAAG_1.22.1           ggplot2_3.2.1         lattice_0.20-38       doAzureParallel_0.8.0
    [6] iterators_1.0.10      foreach_1.4.4        


    loaded via a namespace (and not attached):
     [1] tidyselect_0.2.5    reshape2_1.4.3      purrr_0.3.2         splines_3.6.0       colorspace_1.4-1    generics_0.0.2     
     [7] stats4_3.6.0        survival_2.44-1.1   prodlim_2018.04.18  rlang_0.4.0         ModelMetrics_1.2.2  pillar_1.4.2       
    [13] glue_1.3.1          withr_2.1.2         RColorBrewer_1.1-2  plyr_1.8.4          stringr_1.4.0       lava_1.6.6         
    [19] rAzureBatch_0.6.2   timeDate_3043.102   munsell_0.5.0       gtable_0.3.0        recipes_0.1.6       codetools_0.2-16   
    [25] latticeExtra_0.6-28 curl_3.3            class_7.3-15        Rcpp_1.0.1          scales_1.0.0        ipred_0.9-9        
    [31] jsonlite_1.6        mime_0.6            rjson_0.2.20        digest_0.6.18       stringi_1.4.3       dplyr_0.8.3        
    [37] grid_3.6.0          tools_3.6.0         bitops_1.0-6        magrittr_1.5        lazyeval_0.2.2      RCurl_1.95-4.12    
    [43] tibble_2.1.3        randomForest_4.6-14 crayon_1.3.4        pkgconfig_2.0.2     MASS_7.3-51.4       Matrix_1.2-17      
    [49] data.table_1.12.2   lubridate_1.7.4     gower_0.2.1         assertthat_0.2.1    httr_1.4.0          rstudioapi_0.10    
    [55] R6_2.4.0            rpart_4.1-15        nnet_7.3-12         nlme_3.1-139        compiler_3.6.0     

    Thanks,

    Brian


    Brian Hoang

    Tuesday, August 13, 2019 5:54 PM

All replies

  • Hi rserran,

    I tried to reproduce your problem.

    I followed this document to set up the environment. That document walks through a different example, so I ran the example from GitHub that you linked in the question instead.

    I got the same error as you did in RStudio. That error is generic, though.

    I then went into the Azure portal and checked the task's standard error output (stderr.txt):

    running
      '/usr/local/lib/R/bin/R --slave --no-restore --no-save --no-environ --no-restore --no-site-file --file=/mnt/batch/tasks/workitems/job20190813075616/job-1/jobpreparation/wd/worker.R --args 1 8 0 stop'
    
    Warning: namespace ‘doAzureParallel’ is not available and has been replaced
    by .GlobalEnv when processing object ‘’
    Loading required package: lattice
    Loading required package: ggplot2
    loaded caret and set parent environment
    

    It looks like the package is missing.

    Could you open one of the failed tasks and check its error message?

    Please let me know whether it matches the output above.
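If you would rather check from R than from the portal, the job error message itself points to `getJobFile` for fetching task logs. The following is a rough sketch of scanning such a log for unavailable namespaces. `missing_namespaces` is a hypothetical helper, the sample lines are copied from the stderr output above, and the commented-out `getJobFile` call assumes a job id, task id, file name argument order and uses a placeholder task id.

```r
# Hypothetical helper: extracts package names from
# "namespace 'pkg' is not available" warnings in a task's stderr.
missing_namespaces <- function(stderr_lines) {
  # "." on each side matches whichever quote character R printed
  pattern <- "namespace .([A-Za-z0-9.]+). is not available"
  hits <- regmatches(stderr_lines, regexec(pattern, stderr_lines))
  found <- lapply(hits, function(m) if (length(m) == 2) m[2] else NULL)
  as.character(unique(unlist(found)))
}

# Lines taken from the stderr.txt shown above
stderr_lines <- c(
  "Warning: namespace \u2018doAzureParallel\u2019 is not available and has been replaced",
  "by .GlobalEnv when processing object \u2018\u2019",
  "Loading required package: lattice",
  "Loading required package: ggplot2"
)
print(missing_namespaces(stderr_lines))

# With credentials set, the log could be fetched instead of pasted.
# The task id is a placeholder; use the id of a task reported as failed.
# log_text <- doAzureParallel::getJobFile("job20190809200919", "<task-id>", "stderr.txt")
# missing_namespaces(strsplit(log_text, "\n")[[1]])
```

An empty result only means no such warning was found in that file; the failure may still be reported in stdout.txt or another task's log.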

    Tuesday, August 13, 2019 8:15 AM
    Moderator
  • Brian,

    You nailed it! I was using a cluster config file that was similar but not identical: it was missing the jrowden/dcaret container image. With that fixed, I ran the caret example and my own script using ranger, and everything is running as expected.

    So far, the elapsed time with 3 low-priority nodes is better than on my laptop. I'll increase the node count to see whether the improvement is significant.

    Again, thanks!

    Ricardo

    Tuesday, August 13, 2019 7:38 PM