Changes in version 1.6.2 (2025-07-29)
- Check that parameters ind.row/ind.col are not NULL.

Changes in version 1.6.1 (2024-09-09)
- Remove {dplyr} dependency for internal function any_near0().

Changes in version 1.6.0
- Fix conversion from NA_real_ to FBM type integer on new Macs.

Changes in version 1.5.14
- Error when variables with a zero scaling are used in e.g. big_randomSVD() and big_crossprodSelf() (#52).

Changes in version 1.5.13
- Add parameter backingfile to big_crossprodSelf() and big_cor() (#170).

Changes in version 1.5.11
- Make sure not to use two levels of parallelism in big_univLogReg() (#137).

Changes in version 1.5.10
- Check out-of-bounds ind.col in big_prodMat() (#154).

Changes in version 1.5.9
- Add global option FBM.dir (that defaults to tempdir() as before). This can be used to change the default directory used to create FBMs when calling either FBM(), FBM.code256(), as_FBM(), big_copy(), or big_transpose(). Note that, if not using the temporary directory anymore, you must clean up the files you do not want to keep.

Changes in version 1.5.8
- Enable ARMA_64BIT_WORD.

Changes in version 1.5.7
- New strategy for $add_columns().

Changes in version 1.5.6 (2022-02-03)
- Add convenience function as_scaling_fun() to create your own fun.scaling parameters.

Changes in version 1.5.4
- Now automatically discard covariates with no variation in pcor() (with a warning).

Changes in version 1.5.3
- pcor() now returns NAs (instead of 0s) for singular systems.

Changes in version 1.5.0 (2021-03-29)
- Recode some parallel algorithms with OpenMP. For now, functions big_prodVec(), big_cprodVec(), big_colstats() and big_univLinReg() have been recoded.

Changes in version 1.4.0
- Now detects and errors if there is not enough disk space to create an FBM.

Changes in version 1.3.3
- Fix pcor() for singular systems, e.g. when x has all the same values.

Changes in version 1.3.2
- Fix summary() and plot() for old (< v1.3) big_sp_list models.
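
As an illustration of the FBM.dir global option described under version 1.5.9, here is a minimal sketch of redirecting new FBMs to another directory and cleaning up afterwards. The directory name my_fbms and the variable names are assumptions made for this example only; any writable path would do.

```r
library(bigstatsr)

# Hypothetical target directory for this sketch
fbm_dir <- file.path(tempdir(), "my_fbms")
dir.create(fbm_dir, showWarnings = FALSE)

# Change the default directory used when creating FBMs (keep the old value)
old <- options(FBM.dir = fbm_dir)

X <- FBM(10, 5, init = 0)
bk_path <- X$backingfile   # path of the backing file, located under fbm_dir

# When not using the temporary directory, cleanup is up to you:
# remove the files you do not want to keep
file.remove(bk_path)

# Restore the previous default
options(old)
```

Restoring the previous option value at the end keeps later FBM() calls in the default location.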

Changes in version 1.3.1 (2020-11-06)
- Add function pcor() to compute partial correlations.

Changes in version 1.3.0
- Add two options in big_spLinReg() and big_spLogReg(): power_scale for using a different scaling for LASSO, and power_adaptive for using adaptive LASSO (where larger marginal effects are penalized less). See documentation for details.
- big_(c)prodVec() and big_(c)prodMat() (re)gain an ncores parameter. Note that for big_(c)prodMat(), it might be beneficial to use the BLAS parallelism (with bigparallelr::set_blas_ncores()) instead of this parameter, especially when the matrix A is large-ish.

Changes in version 1.2.2 (2020-03-09)
- Function big_colstats() can now be run in parallel (added parameter ncores).

Changes in version 1.2.1
- It is now possible to use C++ FBM accessors without linking to {RcppArmadillo}.

Changes in version 1.2.0
- Functions big_(c)prodMat() and big_(t)crossprodSelf() now use much less memory, and may be faster.
- Add covar_from_df() to convert a data frame with factors/characters to a numeric matrix using one-hot encoding.

Changes in version 1.1.4 (2020-02-01)
- Remove some 'Suggests' dependencies.

Changes in version 1.1.3
- Add a new column $all_conv to output of summary() for big_spLinReg() and big_spLogReg() to check whether all models have stopped because of "no more improvement". Also add a new parameter sort to summary().
- Now warn (enabled by default) if some models may not have reached a minimum when using big_spLinReg() and big_spLogReg().

Changes in version 1.1.1
- Fix the warning "In .self$nrow * .self$ncol : NAs produced by integer overflow".

Changes in version 1.1.0
- Make two different memory-mappings: one that is read-only (using $address) and one where it is possible to write (using $address_rw). This makes it possible to use file permissions to prevent modifying data.
- Also add a new field $is_read_only to be used to prevent modifying data (at least with <-) even when you have write permissions to it.
  Functions creating an FBM now gain a parameter is_read_only.
- Make vector accessors (e.g. X[1:10]) faster.

Changes in version 1.0.0
- Move some code to new packages {bigassertr} and {bigparallelr}.
- big_randomSVD() gains arguments related to matrix-vector multiplication.
- assert_noNA() is faster.

Changes in version 0.9.10
- Add big_increment().

Changes in version 0.9.9
In plot.big_SVD():
- Can now plot many PCA scores (more than two) at once.
- Use coord_fixed() when plotting PCA scores because it is good practice.
- Use log-scale in scree plot to better see small differences in singular values.
- Reexport cowplot::plot_grid() to merge multiple ggplots.

Changes in version 0.9.6
- AUCBoot() is now 6-7 times faster.

Changes in version 0.9.5
- Add parameters center and scale to products.

Changes in version 0.9.3
- Fix a bug in big_univLogReg() for variables with no variation. IRLS was not converging, so glm() was used instead. The problem is that glm() drops dimensions causing singularities, so that the Z-score of the first covariate (or intercept) was used instead of a missing value.

Changes in version 0.9.0
- Use mio instead of boost for memory-mapping.
- Add a parameter base.row to predict.big_sp_list() and automatically detect if needed (as well as for covar.row).
- Possibility to subset a big_sp_list without losing attributes, so that one can access one model (corresponding to one alpha) even if it is not the 'best'.
- Add parameters pf.X and pf.covar in big_sp***Reg() to provide different penalization for each variable (possibly no penalization at all).

Changes in version 0.8.4
- Add %*%, crossprod and tcrossprod operations for 'double' FBMs.

Changes in version 0.8.3
- Now also returns the number of non-zero variables ($nb_active) and the number of candidate variables ($nb_candidate) for each step of the regularization paths of big_spLinReg() and big_spLogReg().
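
As a brief illustration of the matrix operations added for 'double' FBMs in version 0.8.4, the following sketch compares them with the same operations on the in-memory copy X[]. The dimensions and random values are arbitrary choices for this example.

```r
library(bigstatsr)

set.seed(1)
X <- FBM(4, 3, init = rnorm(12))   # the default FBM type is "double"
A <- matrix(rnorm(6), nrow = 3)

XA  <- X %*% A        # product of an FBM with a standard R matrix
XtX <- crossprod(X)   # t(X) %*% X
XXt <- tcrossprod(X)  # X %*% t(X)

# Each result is a standard R matrix matching the in-memory computation
all.equal(XA,  X[] %*% A)
all.equal(XtX, crossprod(X[]))
all.equal(XXt, tcrossprod(X[]))
```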

Changes in version 0.8.0
- Parameters warn and return.all of big_spLinReg() and big_spLogReg() are deprecated; now always return the maximum information. Now provide two methods (summary and plot) to get a quick assessment of the fitted models.

Changes in version 0.7.3
- Check of missing values for input vectors (indices and targets) and matrices (covariables).
- AUC() is now stricter: it accepts only 0s and 1s for target.

Changes in version 0.7.1
- $bm() and $bm.desc() have been added in order to get an FBM as a filebacked.big.matrix. This enables using {bigmemory} functions.

Changes in version 0.7.0
- Type float added.

Changes in version 0.6.2 (2018-08-17)
- big_write added.

Changes in version 0.6.1
- big_read now has a filter argument to filter rows, and argument nrow has been removed because it is now determined when reading the first block of data.
- Removed the save argument from FBM (and others); now, you must use FBM(...)$save() instead of FBM(..., save = TRUE).

Changes in version 0.6.0
- You can now fill an FBM using a data frame. Note that factors will be used as integers.
- Package {bigreadr} has been developed and is now used by big_read.

Changes in version 0.5.0
- There have been some changes regarding how conversion between types is checked. Before, you would get a warning for any possible loss of precision (without actually checking it). Now, any loss of precision due to conversion between types is reported as a warning, and only in this case. If you want to disable this feature, you can use options(bigstatsr.downcast.warning = FALSE), or you can use without_downcast_warning() to disable this warning for one call.

Changes in version 0.4.1
- Change big_read so that it is faster (corresponding vignette updated).

Changes in version 0.4.0
- Possibility to add a "base predictor" for big_spLinReg and big_spLogReg.
- Don't store the whole regularization path (as a sparse matrix) in big_spLinReg and big_spLogReg anymore because it caused major slowdowns.
- Directly average the K predictions in predict.big_sp_best_list.
- Only use the "PSOCK" type of cluster because "FORK" can leave zombies behind. You can change this with options(bigstatsr.cluster.type = "PSOCK").

Changes in version 0.3.4
- Fix a bug in big_spLinReg related to the computation of summaries.
- Now provides function plus to be used as the combine argument in big_apply and big_parallelize instead of '+'.

Changes in version 0.3.3
- Before, this package used only the "PSOCK" type of cluster, which has some significant overhead. Now, it uses the "FORK" type on non-Windows systems. You can change this with options(bigstatsr.cluster.type = "PSOCK"). Note that "PSOCK" is used again as of version 0.4.0.

Changes in version 0.3.2
- You can now provide multiple $\alpha$ values (as a numeric vector) in big_spLinReg and big_spLogReg. One will be chosen by grid-search.

Changes in version 0.3.1
- Fixed a bug in big_prodMat when using a dimension of 1 or 0.

Changes in version 0.3.0
- Package {bigstatsr} is published in Bioinformatics.

Changes in version 0.2.6
- No scaling is used by default for big_crossprod, big_tcrossprod, big_SVD and big_randomSVD (before, there was no default at all).

Changes in version 0.2.4 (2018-01-31)
- Integrate Cross-Model Selection and Averaging (CMSA) directly in big_spLinReg and big_spLogReg, a procedure that automatically chooses the value of the $\lambda$ hyper-parameter.
- Speed up big_spLinReg and big_spLogReg (issue #12).

Changes in version 0.2.3 (2017-11-30)
- Speed up AUC computations.

Changes in version 0.2.0
- No longer use the big.matrix format of package bigmemory.
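
The grid search over several $\alpha$ values (version 0.3.2) and the automatic choice of $\lambda$ through CMSA (version 0.2.4) can be sketched as follows. This uses the interface as it exists in recent versions of the package, not necessarily the one available at the time of those releases; the simulated data, the sizes n and p, and the choice of K = 5 are assumptions made for this example only.

```r
library(bigstatsr)

set.seed(1)
n <- 200
p <- 50
X <- FBM(n, p, init = rnorm(n * p))

# Simulated response (an assumption of this sketch):
# the first 5 variables have an effect, plus Gaussian noise
y <- rowSums(X[, 1:5]) + rnorm(n)

# Several alpha values are provided; one is chosen by grid search.
# Lambda is chosen internally within each of the K folds.
mod <- big_spLinReg(X, y, alphas = c(1, 0.5, 0.1), K = 5)

# One row per alpha value here, with its validation loss
summary(mod)

# Predictions from the best alpha, averaged over the K models
pred <- predict(mod, X)
```

With data this small the fit runs in a few seconds; on real data the number of folds K and the ncores parameter are the usual knobs to adjust.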