Little Programming Note
Some good practices on R, Python and bash programming, as well as a little note for myself.
R
Use .rds instead of .rdata when saving a single object. readRDS() returns the object, so you can assign it to any name you like, whereas load() on an .rdata file restores the object under the name it was saved with. That is inconvenient when you run many replicates of a simulation from one set of core code, because every replicate saves its result under the same object name.
When combining the many .rds files into one final output for performance evaluation, list.files() is often useful. With a little knowledge of regular expression matching, your code can be made both generalizable and precise.
Code should be generalizable: use as few hard-coded parameters as possible. Instead, pass them to the script as command line arguments.
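As a minimal sketch of the .rds workflow described above: each replicate saves its single result with saveRDS(), and the results are later collected with list.files() and a regular expression. The file naming scheme (sim_rep_<number>.rds), the temporary output folder, and the toy result (the mean of ten random numbers) are all assumptions made for illustration.

```r
# Hypothetical output folder; a temporary directory keeps the sketch self-contained.
out_dir <- file.path(tempdir(), "sim_output")
dir.create(out_dir, showWarnings = FALSE)

# Each replicate saves exactly one object to its own .rds file.
n_rep <- 5
for (rep_id in seq_len(n_rep)) {
  set.seed(rep_id)
  result <- mean(rnorm(10))  # toy "simulation result"
  saveRDS(result, file.path(out_dir, sprintf("sim_rep_%d.rds", rep_id)))
}

# An unrelated file that the pattern below should NOT pick up.
saveRDS("notes", file.path(out_dir, "sim_notes.rds"))

# The regular expression matches only files named sim_rep_<digits>.rds.
rds_files <- list.files(out_dir, pattern = "^sim_rep_[0-9]+\\.rds$",
                        full.names = TRUE)

# readRDS() returns the object, so it can be stored under any name we choose.
all_results <- vapply(rds_files, readRDS, numeric(1), USE.NAMES = FALSE)
length(all_results)
```

In a real script, a value such as n_rep would more likely come from commandArgs(trailingOnly = TRUE) than be written into the code.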
Use library() instead of require() when loading packages. require() is essentially equivalent to try(library()): if the package is not installed, it only issues a warning and returns FALSE rather than stopping. The code keeps running, so you only notice the problem much later, when you call a function from the package that was never loaded. library() throws an error immediately instead.
Use relative paths instead of absolute paths. Hard-coded absolute paths make it difficult for your collaborators to replicate your work, as they do not have exactly the same folder organization as you. Instead, use ./ to denote the current folder and ../ to denote its parent folder. For folders further up, use ../../ or more, depending on your need.
sample(N) generates a random permutation of the integers 1 to N.
If a specific level of a factor is to be used as the reference level, use relevel() to redefine the reference level before model fitting.
To obtain the product of a vector with its transpose, which is a square matrix, use tcrossprod().
Explore get() and assign(). They may be useful at some future moment.
When selecting a single column from a matrix, the result is automatically simplified to a plain vector. To keep the two-dimensional shape, use mymatrix[ , i, drop = FALSE]
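Several of the smaller tips above can be checked in a few lines. The following is only a sketch: the seed, the factor levels (drugA, drugB, placebo) and the example vector and matrix are made up for illustration.

```r
set.seed(42)  # arbitrary seed, only to make the run reproducible

# sample(N): a random permutation of the integers 1 to N
perm <- sample(6)
sort(perm)  # every integer from 1 to 6 appears exactly once

# relevel(): make "placebo" the reference (first) level before model fitting
treatment <- factor(c("drugA", "drugB", "placebo", "drugA"))
levels(treatment)                        # alphabetical order by default
treatment <- relevel(treatment, ref = "placebo")
levels(treatment)                        # "placebo" now comes first

# tcrossprod(): product of a vector with its transpose
x <- c(1, 2, 3)
xxt <- tcrossprod(x)                     # same values as x %*% t(x)
dim(xxt)                                 # a 3 x 3 matrix

# drop = FALSE: keep a single column as a one-column matrix
mymatrix <- matrix(1:6, nrow = 3)
dim(mymatrix[, 2])                       # NULL: simplified to a plain vector
dim(mymatrix[, 2, drop = FALSE])         # 3 rows, 1 column
```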
Make good use of the outer() function to avoid writing two nested loops.
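To illustrate the outer() tip, the sketch below builds the same table of pairwise products twice, once with two nested loops and once with a single outer() call. The vectors x and y and the absolute-difference function are arbitrary examples, not taken from any particular analysis.

```r
x <- 1:3
y <- 1:4

# Two nested loops filling a matrix of pairwise products
loop_result <- matrix(NA_real_, nrow = length(x), ncol = length(y))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    loop_result[i, j] <- x[i] * y[j]
  }
}

# The same table in one call; multiplication is the default function of outer()
outer_result <- outer(x, y)

# Any vectorized function of two arguments can be used instead
dist_result <- outer(x, y, function(a, b) abs(a - b))

all(loop_result == outer_result)  # both approaches give the same values
```

Note that the function passed to outer() must be vectorized: it is called once with two long vectors, not once per pair of elements.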
Bash
grep -i keyword *.R searches all R files in the current directory for the keyword string, ignoring case, and prints the matching lines; add -l to list only the names of the files that contain it.
pdftotext myfile.pdf - | wc -w counts how many words are in myfile.pdf