R Notes
Tags :: Language
General
Caching intermediate computations as Rdata
inter_fp <- file.path("data_test.Rdata")
## Check if data already exists on disk, otherwise create it and save to disk
if (file.exists(inter_fp)) {
  message(paste(inter_fp, "exists. Skipping"))
  ## NOTE:
  ## load(...) will load /all/ variables from the Rdata into the current
  ## environment. Variables from the Rdata will /overwrite/ variables of the
  ## same name in the current environment.
  load(inter_fp) # Loads variable "fun_sample" to environment
} else {
  ## ... do computation to create intermediate data
  N <- 100
  ## Simulate N samples
  fun_sample <- rnorm(N)
  ## save(...) allows us to only write /specific/ variables to an Rdata file,
  ## instead of all the data in the current environment.
  save(fun_sample, file = inter_fp)
  ## save(...) can also be used to save any number of variables
  ## e.g. save(fun_sample, N, file = inter_fp)
}
## Use loaded intermediate data later!
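## seq_along() keeps the x axis in step with the data; N itself is not
## available when the data was loaded from disk, since only fun_sample is saved
plot(seq_along(fun_sample), fun_sample)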
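The note above warns that load(...) restores objects under their saved names and can overwrite existing variables. One possible alternative, sketched here, is to cache a single object with saveRDS/readRDS, which hands the object back so the caller picks the name. The helper name cache_rds and the file location under tempdir() are hypothetical and only for illustration.

```r
## Hypothetical helper: compute a value once and cache it as an .rds file.
cache_rds <- function(path, compute) {
  if (file.exists(path)) {
    message(path, " exists. Reading cached value")
    return(readRDS(path))
  }
  value <- compute()            # run the expensive step only when needed
  saveRDS(value, file = path)   # stores exactly one object, without its name
  value
}

cache_fp <- file.path(tempdir(), "data_test.rds")  # assumed location
fun_sample <- cache_rds(cache_fp, function() rnorm(100))
## A second call reads the file instead of simulating again
fun_sample_again <- cache_rds(cache_fp, function() rnorm(100))
```

Because readRDS returns the value, nothing in the calling environment is overwritten unless it is assigned explicitly.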
Memory model
Memory Profiling
R code
Profiling or logging memory usage in R is tricky, and is definitely not helped by the lack of anything like C's __FILE__ and __LINE__ macros.
There are a couple of “experimental” (2006) additions to the language to profile memory usage.
Rprof has an option called memory.profiling which will write out memory information: heap sizes, memory in nodes, and the number of calls to Rf_duplicate (the internal C function which copies data). summaryRprof can summarize this output.
- Only available if compiled with --enable-memory-profiling
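A minimal sketch of the Rprof workflow described above. The output file under tempfile() and the busy loop are assumptions made for illustration. What ends up in the summary depends on how R was built, so the summaryRprof call is guarded.

```r
prof_file <- tempfile(fileext = ".out")  # assumed output location

## memory.profiling = TRUE adds memory information to each sample
Rprof(prof_file, memory.profiling = TRUE, interval = 0.01)

## Keep R busy for about half a second so the sampler records something
x <- numeric(0)
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 0.5) {
  x <- c(head(x, 1e5), rnorm(1e4))  # repeated allocation and copying
}

Rprof(NULL)  # stop profiling

## summaryRprof() fails if no samples were recorded, so keep the message instead
prof_summary <- tryCatch(
  summaryRprof(prof_file, memory = "both"),
  error = function(e) conditionMessage(e)
)
print(prof_summary)
```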
tracemem marks an object so that a stack trace will be printed when the object is duplicated, or copied by coercion or arithmetic functions. It is intended for tracking accidental copying of large objects. untracemem will untrace an object, and tracingState controls whether tracing information is printed. tracemem cannot be used on functions, since it uses the same trace bit that trace uses, and it will not work on objects such as environments that are passed by reference.
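A small sketch of tracemem catching a copy. It assumes nothing beyond base R. The capabilities("profmem") check is there because tracemem is unavailable on builds without memory profiling.

```r
if (capabilities("profmem")) {
  x <- rnorm(1e6)
  tracemem(x)      # mark x for tracing
  y <- x           # no copy yet: x and y share the same memory
  y[1] <- 0        # modifying y forces a duplicate, which tracemem reports
  untracemem(x)    # stop tracing x
} else {
  message("This build of R was not compiled with memory profiling")
}
```

On a build with memory profiling, the assignment to y[1] should print a line starting with tracemem[ that shows the old and new addresses.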
If R is compiled with memory profiling, Rprofmem starts and stops a pure memory-use profiler.
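A sketch of starting and stopping Rprofmem, again guarded by capabilities("profmem"). The output file and the 1000-byte threshold are arbitrary choices for illustration.

```r
if (capabilities("profmem")) {
  mem_file <- tempfile(fileext = ".out")  # assumed output location
  ## Start logging; threshold filters out small allocations
  Rprofmem(mem_file, threshold = 1000)
  z <- numeric(1e5)   # roughly 800 kB, well above the threshold
  Rprofmem(NULL)      # stop the profiler
  ## Each line holds an allocation size in bytes followed by the call stack
  print(head(readLines(mem_file)))
} else {
  message("This build of R was not compiled with memory profiling")
}
```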
Enable memory profiling
RUN wget -c https://cran.r-project.org/src/base/R-4/R-4.1.1.tar.gz && \
tar -xf R-4.1.1.tar.gz && \
cd R-4.1.1 && \
./configure --enable-memory-profiling && \
make -j$(nproc) -O && \
make install && \
cd /app && \
rm R-4.1.1.tar.gz && \
rm -r R-4.1.1
Methods to get memory usage
- object.size: gets the memory allocation on an object-by-object basis
- memory.size: Windows specific
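A short example of the object-by-object approach with object.size, plus gc() for an overall view. The vector size is an arbitrary choice. memory.size is left out because it is Windows specific.

```r
x <- numeric(1e6)                         # one million doubles, 8 bytes each
size_bytes <- object.size(x)
print(size_bytes)                         # size in bytes
print(format(size_bytes, units = "MB"))   # human-readable form

## gc() reports the memory currently used by the whole R session
print(gc())
```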
How much performance is lost with profiling enabled at compile time?
- https://stackoverflow.com/a/57756890 On Debian-based (all Unix??) systems, there is very minimal loss. However, the slowdown on Windows is noticeable.
Compiled packages
To enable profiling of compiled C code, the packages need to be installed with specific flags. The -g flag with GCC or Clang will produce debugging information in the operating system's native format.
The compilation variables for Make are set in ~/.R/Makevars. This directory and file might have to be created by you:
mkdir -p ~/.R && touch ~/.R/Makevars
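Set the flag as CXXFLAGS = -g (this covers C++ sources; C sources use CFLAGS = -g instead).
The script can then be run with a CPU profiler preloaded (libprofiler.so together with CPUPROFILE is the gperftools convention):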
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libprofiler.so CPUPROFILE=sample.profile /usr/local/lib/R/bin/exec/R -f main_profile.R
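The Makevars step can also be tried from within R without touching the real ~/.R/Makevars. This is a sketch under the assumption that a throwaway file in tempdir() is acceptable. It relies on the R_MAKEVARS_USER environment variable, which tells R to use the given file in place of ~/.R/Makevars. The package name in the comment is hypothetical.

```r
## Assumption: a throwaway Makevars in tempdir(), not the real ~/.R/Makevars
makevars_fp <- file.path(tempdir(), "Makevars")
writeLines(c("CFLAGS = -g", "CXXFLAGS = -g"), makevars_fp)

## Point R at this file for packages built from source in this session
Sys.setenv(R_MAKEVARS_USER = makevars_fp)

## A source install started from here would pick up the flags, e.g.
## install.packages("somePackage", type = "source")  # hypothetical package name
cat(readLines(makevars_fp), sep = "\n")
```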
Valgrind
Valgrind can be used on Linux with a common CPU type. Test an R script with it like:
R -d "valgrind --tool=memcheck --leak-check=full" --no-save < test.R
This is expected to run ~20x slower than without valgrind. In some cases it will be even slower.
On platforms where valgrind and its headers are installed you can build a version of R with extra instrumentation to help valgrind detect errors in the use of memory allocated from the R heap. The configure option is --with-valgrind-instrumentation=level, where level is 0, 1, or 2.
- Level 0 is the default and does not add anything.
- Level 1 will detect some uses of uninitialized memory and has little impact on speed.
  - Uninitialized memory in some numeric, logical, integer, raw, complex vectors, and in memory allocated by R_alloc
- Level 2 will detect other memory-use bugs but makes R much slower when running under valgrind.
  - Includes using the data sections of R vectors after they are freed
  - Using level 2 with gctorture can be even more effective (and even slower)
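gctorture is switched on from within R, so it can be limited to the suspect piece of code in the script that is run under valgrind. A minimal sketch follows. The wrapper name with_gctorture is hypothetical, and the expression being evaluated is only a placeholder.

```r
## Hypothetical wrapper: evaluate an expression with GC torture enabled.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)               # returns the previous setting
  on.exit(gctorture(old), add = TRUE)  # restore it even if expr fails
  force(expr)                          # the lazy argument is evaluated here
}

## Keep the tortured code small: nearly every allocation triggers a collection
res <- with_gctorture(sum(sqrt(1:10)))
print(res)
```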
Compiling
The option flags available when configuring can be seen with ./configure --help
Configure options
; ./configure --help
`configure' configures R 4.1.1 to adapt to many kinds of systems.
Usage: ./configure [OPTION]... [VAR=VALUE]...
To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE. See below for descriptions of some of the useful variables.
Defaults for the options are specified in brackets.
Configuration:
-h, --help display this help and exit
--help=short display options specific to this package
--help=recursive display the short help of all the included packages
-V, --version display version information and exit
-q, --quiet, --silent do not print `checking ...' messages
--cache-file=FILE cache test results in FILE [disabled]
-C, --config-cache alias for `--cache-file=config.cache'
-n, --no-create do not create output files
--srcdir=DIR find the sources in DIR [configure dir or `..']
Installation directories:
--prefix=PREFIX install architecture-independent files in PREFIX
[/usr/local]
--exec-prefix=EPREFIX install architecture-dependent files in EPREFIX
[PREFIX]
By default, `make install' will install all the files in
`/usr/local/bin', `/usr/local/lib' etc. You can specify
an installation prefix other than `/usr/local' using `--prefix',
for instance `--prefix=$HOME'.
For better control, use the options below.
Fine tuning of the installation directories:
--bindir=DIR user executables [EPREFIX/bin]
--sbindir=DIR system admin executables [EPREFIX/sbin]
--libexecdir=DIR program executables [EPREFIX/libexec]
--sysconfdir=DIR read-only single-machine data [PREFIX/etc]
--sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com]
--localstatedir=DIR modifiable single-machine data [PREFIX/var]
--libdir=DIR object code libraries [EPREFIX/lib]
--includedir=DIR C header files [PREFIX/include]
--oldincludedir=DIR C header files for non-gcc [/usr/include]
--datarootdir=DIR read-only arch.-independent data root [PREFIX/share]
--datadir=DIR read-only architecture-independent data [DATAROOTDIR]
--infodir=DIR info documentation [DATAROOTDIR/info]
--localedir=DIR locale-dependent data [DATAROOTDIR/locale]
--mandir=DIR man documentation [DATAROOTDIR/man]
--docdir=DIR documentation root [DATAROOTDIR/doc/R]
--htmldir=DIR html documentation [DOCDIR]
--dvidir=DIR dvi documentation [DOCDIR]
--pdfdir=DIR pdf documentation [DOCDIR]
--psdir=DIR ps documentation [DOCDIR]
R installation directories:
--libdir=DIR R files to R_HOME=DIR/R [EPREFIX/$LIBnn]
rdocdir=DIR R doc files to DIR [R_HOME/doc]
rincludedir=DIR R include files to DIR [R_HOME/include]
rsharedir=DIR R share files to DIR [R_HOME/share]
X features:
--x-includes=DIR X include files are in DIR
--x-libraries=DIR X library files are in DIR
System types:
--build=BUILD configure for building on BUILD [guessed]
--host=HOST cross-compile to build programs to run on HOST [BUILD]
Optional Features:
--disable-option-checking ignore unrecognized --enable/--with options
--disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no)
--enable-FEATURE[=ARG] include FEATURE [ARG=yes]
--enable-R-profiling attempt to compile support for Rprof() [yes]
--enable-memory-profiling
attempt to compile support for Rprofmem(),
tracemem() [no]
--enable-R-framework[=DIR]
macOS only: build R framework (if possible), and
specify its installation prefix [no,
/Library/Frameworks]
--enable-R-shlib build the shared/dynamic library 'libR' [no]
--enable-R-static-lib build the static library 'libR.a' [no]
--enable-BLAS-shlib build BLAS into a shared/dynamic library [perhaps]
--enable-maintainer-mode
enable make rules and dependencies not useful (and
maybe confusing) to the casual installer [no]
--enable-strict-barrier provoke compile error on write barrier violation
[no]
--enable-prebuilt-html build static HTML help pages [no]
--enable-lto enable link-time optimization [no]
--enable-java enable Java [yes]
--enable-byte-compiled-packages
byte-compile base and recommended packages [yes]
--enable-static[=PKGS] (libtool) build static libraries [default=no]
--enable-shared[=PKGS] (libtool) build shared libraries [default=yes]
--enable-fast-install[=PKGS]
(libtool) optimize for fast installation
[default=yes]
--disable-libtool-lock avoid locking (might break parallel builds)
--enable-long-double use long double type [yes]
--disable-openmp do not use OpenMP
--disable-largefile omit support for large files
--disable-nls do not use Native Language Support
--disable-rpath do not hardcode runtime library paths
Optional Packages:
--with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
--without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
--with-blas use system BLAS library (if available), or specify
it [no]
--with-lapack use system LAPACK library (if available), or specify
it [no]
--with-readline use readline library [yes]
--with-pcre2 use PCRE2 library (if available) [yes]
--with-pcre1 use PCRE1 library (if available and PCRE2 is not)
[yes]
--with-aqua macOS only: use Aqua (if available) [yes]
--with-tcltk use Tcl/Tk (if available), or specify its library
dir [yes]
--with-tcl-config=TCL_CONFIG
specify location of tclConfig.sh []
--with-tk-config=TK_CONFIG
specify location of tkConfig.sh []
--with-cairo use cairo (and pango) if available [yes]
--with-libpng use libpng library (if available) [yes]
--with-jpeglib use jpeglib library (if available) [yes]
--with-libtiff use libtiff library (if available) [yes]
--with-system-tre use system tre library (if available) [no]
--with-valgrind-instrumentation
Level of additional instrumentation for Valgrind
(0/1/2) [0]
--with-system-valgrind-headers
use system valgrind headers (if available) [no]
--with-internal-tzcode use internal time-zone code [no, yes on macOS]
--with-internal-towlower
use internal code for towlower/upper [no, yes on
macOS and Solaris]
--with-internal-iswxxxxx
use internal iswprint etc. [no, yes on macOS,
Solaris and AIX]
--with-internal-wcwidth use internal wcwidth [yes]
--with-recommended-packages
use/install recommended R packages [yes]
--with-ICU use ICU library (if available) [yes]
--with-static-cairo allow for the use of static cairo libraries [no, yes
on macOS]
--with-pic[=PKGS] (libtool) try to use only PIC/non-PIC objects
[default=use both]
--with-aix-soname=aix|svr4|both
(libtool) shared library versioning (aka "SONAME")
variant to provide on AIX, [default=aix].
--with-gnu-ld assume the C compiler uses GNU ld [default=no]
--with-sysroot[=DIR] Search for dependent libraries within DIR (or the
compiler's sysroot if not specified).
--with-x use the X Window System
--with-gnu-ld assume the C compiler uses GNU ld [default=no]
--with-libpth-prefix[=DIR] search for libpth in DIR/include and DIR/lib
--without-libpth-prefix don't search for libpth in includedir and libdir
--with-included-gettext use the GNU gettext library included here [no]
--with-libintl-prefix[=DIR] search for libintl in DIR/include and DIR/lib
--without-libintl-prefix don't search for libintl in includedir and libdir
Some influential environment variables:
R_PRINTCMD command used to spool PostScript files to the printer
R_PAPERSIZE paper size for the local (PostScript) printer
R_BATCHSAVE set default behavior of R when ending a session
MAIN_CFLAGS additional CFLAGS used when compiling the main binary
SHLIB_CFLAGS
additional CFLAGS used when building shared objects
MAIN_FFLAGS additional FFLAGS used when compiling the main binary
SHLIB_FFLAGS
additional FFLAGS used when building shared objects
MAIN_LD command used to link the main binary
MAIN_LDFLAGS
flags which are necessary for loading a main program which will
load shared objects (DLLs) at runtime
CPICFLAGS special flags for compiling C code to be turned into a shared
object.
FPICFLAGS special flags for compiling Fortran code to be turned into a
shared object.
SHLIB_LD command for linking shared objects which contain object files
from a C or Fortran compiler only
SHLIB_LDFLAGS
special flags used by SHLIB_LD
DYLIB_LD command for linking dynamic libraries which contain object files
from a C or Fortran compiler only
DYLIB_LDFLAGS
special flags used for make a dynamic library
CXXPICFLAGS special flags for compiling C++ code to be turned into a shared
object
SHLIB_CXXLD command for linking shared objects which contain object files
from the C++ compiler
SHLIB_CXXLDFLAGS
special flags used by SHLIB_CXXLD
TCLTK_LIBS flags needed for linking against the Tcl and Tk libraries
TCLTK_CPPFLAGS
flags needed for finding the tcl.h and tk.h headers
MAKE make command
TAR tar command
R_BROWSER default browser
R_PDFVIEWER default PDF viewer
BLAS_LIBS flags needed for linking against external BLAS libraries
LAPACK_LIBS flags needed for linking against external LAPACK libraries
LIBnn 'lib' or 'lib64' for dynamic libraries
SAFE_FFLAGS Safe Fortran fixed-form compiler flags for e.g. dlamc.f
r_arch Use architecture-dependent subdirs with this name
DEFS C defines for use when compiling R
JAVA_HOME Path to the root of the Java environment
R_SHELL shell to be used for shell scripts, including 'R'
YACC The `Yet Another Compiler Compiler' implementation to use.
Defaults to the first program found out of: `bison -y', `byacc',
`yacc'.
YFLAGS The list of arguments that will be passed by default to $YACC.
This script will default YFLAGS to the empty string to avoid a
default value of `-d' given by some make applications.
PKG_CONFIG path to pkg-config (or pkgconf) utility
PKG_CONFIG_PATH
directories to add to pkg-config's search path
PKG_CONFIG_LIBDIR
path overriding pkg-config's default search path
CC C compiler command
CFLAGS C compiler flags
LDFLAGS linker flags, e.g. -L<lib dir> if you have libraries in a
nonstandard directory <lib dir>
LIBS libraries to pass to the linker, e.g. -l<library>
CPPFLAGS (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
you have headers in a nonstandard directory <include dir>
CPP C preprocessor
FC Fortran compiler command
FCFLAGS Fortran compiler flags
CXX C++ compiler command
CXXFLAGS C++ compiler flags
CXXCPP C++ preprocessor
OBJC Objective C compiler command
OBJCFLAGS Objective C compiler flags
LT_SYS_LIBRARY_PATH
User-defined run-time library search path.
CXX11 C++11 compiler command
CXX11STD special flag for compiling and for linking C++11 code, e.g.
-std=c++11
CXX11FLAGS C++11 compiler flags
CXX11PICFLAGS
special flags for compiling C++11 code to be turned into a
shared object
SHLIB_CXX11LD
command for linking shared objects which contain object files
from the C++11 compiler
SHLIB_CXX11LDFLAGS
special flags used by SHLIB_CXX11LD
CXX14 C++14 compiler command
CXX14STD special flag for compiling and for linking C++14 code, e.g.
-std=c++14
CXX14FLAGS C++14 compiler flags
CXX14PICFLAGS
special flags for compiling C++14 code to be turned into a
shared object
SHLIB_CXX14LD
command for linking shared objects which contain object files
from the C++14 compiler
SHLIB_CXX14LDFLAGS
special flags used by SHLIB_CXX14LD
CXX17 C++17 compiler command
CXX17STD special flag for compiling and for linking C++17 code, e.g.
-std=c++17
CXX17FLAGS C++17 compiler flags
CXX17PICFLAGS
special flags for compiling C++17 code to be turned into a
shared object
SHLIB_CXX17LD
command for linking shared objects which contain object files
from the C++17 compiler
SHLIB_CXX17LDFLAGS
special flags used by SHLIB_CXX17LD
CXX20 C++20 compiler command
CXX20STD special flag for compiling and for linking C++20 code, e.g.
-std=c++20
CXX20FLAGS C++20 compiler flags
CXX20PICFLAGS
special flags for compiling C++20 code to be turned into a
shared object
SHLIB_CXX20LD
command for linking shared objects which contain object files
from the C++20 compiler
SHLIB_CXX20LDFLAGS
special flags used by SHLIB_CXX20LD
XMKMF Path to xmkmf, Makefile generator for X Window System
Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.
Report bugs to <https://bugs.r-project.org>.
R home page: <https://www.r-project.org>.
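After building, some of the optional features listed above can be checked from within R. A small sketch using capabilities(); the selection of names is an arbitrary subset chosen for illustration.

```r
## Query a few build-time features of the running R
caps <- capabilities(c("profmem", "cairo", "ICU", "long.double", "NLS"))
print(caps)

## profmem is TRUE only if R was configured with --enable-memory-profiling
if (!caps[["profmem"]]) {
  message("Rprofmem() and tracemem() are not available in this build")
}

## The platform and version the build was configured for
print(R.version$platform)
print(R.version.string)
```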
References
- R internal structures
- The R programming language: The good, the bad, and the ugly
- https://rstudio.github.io/r-manuals/r-exts/Debugging.html
- http://faculty.washington.edu/tlumley/tutorials/user-biglm.pdf