History

Pablo Selem a6e51c964e Feature/container (#153)		2017-11-03 10:06:40 -07:00
README.md	Feature/container (#153 )	2017-11-03 10:06:40 -07:00
montecarlo_cluster.json	Feature/container (#153 )	2017-11-03 10:06:40 -07:00
montecarlo_pricing_simulation.R	Feature/container (#153 )	2017-11-03 10:06:40 -07:00

README.md

Monte Carlo

Monte Carlo simulation is a popular technique for many financial modelling scenarios. In this sample we run multiple pricing simulations for the closing price of a security. Part of the sample shows the speedup: the simulation first runs locally without a parallel backend, and then the same work runs on a cluster in the cloud.
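As a rough illustration of the local, non-parallel baseline described above, the following base R sketch simulates closing prices sequentially and times the run. The parameter values and the function name `simulate_closing_price` are assumptions chosen for illustration; they are not taken from `montecarlo_pricing_simulation.R`.

```r
# Hypothetical parameters, chosen only for illustration.
opening_price <- 100
mean_change <- 1.001   # mean daily multiplicative change
volatility <- 0.01     # standard deviation of the daily change
days <- 1825           # roughly five years of daily moves

# Simulate one price path and return its final (closing) price.
simulate_closing_price <- function() {
  movement <- rnorm(days, mean = mean_change, sd = volatility)
  path <- cumprod(c(opening_price, movement))
  path[length(path)]
}

set.seed(42)  # make the run reproducible

# Run 1000 simulations one after another (no parallel backend) and time them.
elapsed <- system.time(
  closing_prices <- replicate(1000, simulate_closing_price())
)

print(summary(closing_prices))
print(elapsed[["elapsed"]])
```

The elapsed time from a run like this is the figure to compare against the same workload executed on the cluster.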

To speed up the algorithm significantly, experiment with the number of nodes in the cluster and the chunk size for the foreach loop. The chunk size is currently set to 13 because we have 2 nodes with 4 cores each (8 cores in total) and we want to run 100 iterations of the loop. 100 / 8 = 12.5, which rounds up to 13, so we set the chunk size to 13. With 32 cores, we may want to set the chunk size to 4 (100 / 32 ≈ 3.1, rounded up) to spread the work as evenly as possible across all the nodes and improve the total execution time.
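The chunk size arithmetic above can be sketched as a small helper. The function name `chunk_size_for` is hypothetical and is not part of the sample; it simply divides the iterations by the total core count and rounds up.

```r
# Hypothetical helper: iterations per chunk so that each core gets
# roughly one chunk of work.
chunk_size_for <- function(iterations, nodes, cores_per_node) {
  total_cores <- nodes * cores_per_node
  ceiling(iterations / total_cores)
}

# 2 nodes with 4 cores each, 100 iterations: 100 / 8 = 12.5, rounded up.
small_cluster <- chunk_size_for(100, nodes = 2, cores_per_node = 4)

# 32 cores in total (assumed here to be 8 nodes with 4 cores each).
large_cluster <- chunk_size_for(100, nodes = 8, cores_per_node = 4)

print(small_cluster)  # 13
print(large_cluster)  # 4
```

Assuming the sample uses the doAzureParallel backend, the resulting value would typically be handed to the foreach loop through its Azure options, for example `.options.azure = list(chunkSize = small_cluster)`; check the sample script for the exact form it uses.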