Title: Easy Parallel Tools
Description: Utility functions for easy parallelism in R. Includes some reexports from other packages, utility functions for splitting and parallelizing over blocks, and for choosing and setting the number of cores used.
Authors: Florian Privé [aut, cre]
Maintainer: Florian Privé <[email protected]>
License: GPL-3
Version: 0.3.2
Built: 2024-11-14 05:29:46 UTC
Source: https://github.com/privefl/bigparallelr
Check that you are not trying to use too many cores.
assert_cores(ncores)
ncores: Number of cores to check. Make sure it is not larger than …
It also checks if two levels of parallelism are used, i.e. having ncores larger than 1 while a parallel BLAS is enabled by default. You can remove this check by setting options(bigstatsr.check.parallel.blas = FALSE).

We instead recommend that you disable parallel BLAS by default by adding try(bigparallelr::set_blas_ncores(1), silent = TRUE) to your .Rprofile (with an empty line at the end of this file) so that this is set whenever you start a new R session. You can use usethis::edit_r_profile() to open your .Rprofile. For this to be effective, you should restart the R session or run options(default.nproc.blas = NULL) once in the current session.

Then, in a specific R session, you can set a different number of cores to use for matrix computations with bigparallelr::set_blas_ncores(), if you know there is no other level of parallelism involved in your code.
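For instance, a minimal sketch of this recommended setup (the value 4 below is only an illustration of a session-specific choice):

## In your .Rprofile (sketch), disable parallel BLAS by default:
try(bigparallelr::set_blas_ncores(1), silent = TRUE)

## Later, in a session where matrix computations are the only level of
## parallelism, you could allow BLAS to use more cores, e.g.:
## bigparallelr::set_blas_ncores(4)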
## Not run:
assert_cores(2)
## End(Not run)
Number of cores used by BLAS (matrix computations)
get_blas_ncores()

set_blas_ncores(ncores)
ncores: Number of cores to set for BLAS.
get_blas_ncores()
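A small sketch of a save-and-restore pattern around code that is already parallelized at another level (the surrounding code is hypothetical):

old <- get_blas_ncores()   # remember the current BLAS setting
set_blas_ncores(1)         # single-core BLAS to avoid nested parallelism
## ... code that is already parallelized over several cores ...
set_blas_ncores(old)       # restore the previous setting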
This is based on the following rule: use only physical cores, and if you have only physical cores, leave one core for the OS/UI.
nb_cores()
The recommended number of cores to use.
nb_cores()
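A typical sketch is to combine it with assert_cores() when choosing how many cores to use:

ncores <- nb_cores()   # recommended number of cores on this machine
assert_cores(ncores)   # check that we are not trying to use too many cores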
Wrapper around Reduce to add multiple arguments. Useful, e.g., as a .combine function.
plus(...)
...: Multiple arguments to be added together.
This is equivalent to Reduce('+', list(...)).
plus(1:3, 4:6, 1:3)
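As a sketch of why this is convenient for combining block results, plus() adds any number of same-shaped objects, whether passed directly or via do.call() (the matrices are illustrative data):

A <- matrix(1, 2, 2); B <- matrix(2, 2, 2); C <- matrix(3, 2, 2)
plus(A, B, C)                 # elementwise sum of the three matrices
do.call(plus, list(A, B, C))  # same result, starting from a list of results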
Register parallel in functions. Does makeCluster() and registerDoParallel(), then stopCluster() when the function returns.
register_parallel(ncores, ...)
ncores: Number of cores to use. If using only one, then this function uses …
...: Arguments passed on to …
## Not run:
test <- function(ncores) {
  register_parallel(ncores)
  foreach(i = 1:2) %dopar% i
}
test(2)  # only inside the function
foreach(i = 1:2) %dopar% i
## End(Not run)
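A further sketch, assuming the foreach package is installed: the hypothetical parsum() helper below (not part of the package) registers a backend only for its own scope, splits the input with split_vec(), and combines per-block sums with '+':

library(foreach)

## Hypothetical helper: sum a vector in parallel over blocks, with the
## cluster registered (and later stopped) only inside this function.
parsum <- function(x, ncores = nb_cores()) {
  register_parallel(ncores)
  foreach(xi = split_vec(x, nb_split = ncores), .combine = '+') %dopar% sum(xi)
}

parsum(1:1000, ncores = 2)  # same as sum(1:1000)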
Split costs into consecutive blocks, using a greedy algorithm that tries to find blocks of even total cost.
split_costs(costs, nb_split)
costs: Vector of costs (e.g. proportional to computation time).
nb_split: Number of blocks.
A matrix with 4 columns: lower, upper, size and cost.
split_costs(costs = 150:1, nb_split = 3)
split_costs(costs = rep(1, 151), nb_split = 3)
split_costs(costs = 150:1, nb_split = 30)
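A sketch of how the returned bounds might be used, here to check that the three blocks of the first example have roughly even total cost:

costs <- 150:1
blocks <- split_costs(costs, nb_split = 3)
## total cost of each block, using the 'lower' and 'upper' columns
apply(blocks, 1, function(b) sum(costs[b[["lower"]]:b[["upper"]]]))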
Split a length into blocks
split_len(total_len, block_len, nb_split = ceiling(total_len/block_len))
total_len: Length to split.
block_len: Maximum length of each block.
nb_split: Number of blocks. Default uses the other 2 parameters.
A matrix with 3 columns: lower, upper and size.
split_len(10, block_len = 3)
split_len(10, nb_split = 3)
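For instance, a sketch of processing an illustrative vector x in blocks of at most 3 elements, using the returned bounds:

x <- rnorm(10)
blocks <- split_len(length(x), block_len = 3)
## process each block separately (here, just summing the elements of the block)
lapply(seq_len(nrow(blocks)), function(i) {
  sum(x[blocks[i, "lower"]:blocks[i, "upper"]])
})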
A Split-Apply-Combine strategy to parallelize the evaluation of a function.
split_parapply(
  FUN,
  ind,
  ...,
  .combine = NULL,
  ncores = nb_cores(),
  nb_split = ncores,
  opts_cluster = list(),
  .costs = NULL
)
FUN: The function to be applied to each block of indices.
ind: Initial vector of indices that will be split into nb_split blocks.
...: Extra arguments to be passed to FUN.
.combine: Function to combine the results (applied via do.call()).
ncores: Number of cores to use. Default uses nb_cores().
nb_split: Number of blocks. Default uses ncores.
opts_cluster: Optional parameters for clusters, passed as a named list. E.g., you can use …
.costs: Vector of costs (e.g. proportional to computation time) associated with each element of ind.
This function splits indices into parts, then applies a given function to each part, and finally combines the results.
Returns a list of ncores elements, each element being the result of one of the cores, computed on a block. The elements of this list are then combined with do.call(.combine, .) if .combine is not NULL.
## Not run:
str(
  split_parapply(function(ind) {
    sqrt(ind)
  }, ind = 1:10000, ncores = 2)
)
## End(Not run)
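A sketch of the same computation with .combine = 'c', so that the per-block results are concatenated back into a single vector (assuming 2 cores are available):

## Not run:
res <- split_parapply(function(ind) sqrt(ind), ind = 1:10000,
                      .combine = 'c', ncores = 2)
all.equal(res, sqrt(1:10000))  # blocks are consecutive, so this should be TRUE
## End(Not run)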
Split an object into blocks
split_vec(x, block_len, nb_split = ceiling(length(x)/block_len))

split_df(df, block_len, nb_split = ceiling(nrow(df)/block_len))
x: Vector to be divided into groups.
block_len: Maximum length (or number of rows) of each block.
nb_split: Number of blocks. Default uses the other 2 parameters.
df: Data frame to be divided into groups.
A list with the split objects.
split_vec(1:10, block_len = 3)
str(split_df(iris, nb_split = 3))
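Since the result is a plain list, each block can then be processed with the usual apply functions, as in this sketch:

blocks <- split_vec(1:10, block_len = 3)
sapply(blocks, sum)                          # one result per block
sapply(split_df(iris, nb_split = 3), nrow)   # number of rows in each block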