posts by vikram

Sporadically updated, (somewhat) useful stuff.


Images of Brachyistius aletes, courtesy of the California Academy of Sciences

I am happy to share that a study in which we determined how ecological specialization drives convergence in body shape and size among cleaner fishes is now available.

To trace the evolution of body shape, I collected morphological data from ~300 species of fishes. As part of this project, I got the chance to visit several museum collections across the US, including the Smithsonian National Museum of Natural History, the LA County Museum of Natural History, and the California Academy of Sciences.

After these visits, I realized I had missed one species: Brachyistius aletes, a type of surfperch. To my surprise, there were no clear photographs of this species available online - at least my frantic Google Images search yielded nothing.

Fortunately, Dave Catania (Senior Collections Manager of Ichthyology at the CAS) was able to hunt down a set of preserved specimens and snap a few photos for me. Here I’ll post these photos of Brachyistius aletes to make them available to others. Thanks again for sharing these, Dave!

Read more

Run phytools' make.simmap() in parallel

In macroevolutionary studies, we often use stochastic character mapping to infer how a discrete trait may have evolved.

I am grateful that the phytools package allows easy implementation of character mapping via the make.simmap() function. This method uses a Markovian process where we sample character histories in proportion to their posterior probabilities under a given model. So we need to simulate many, many (hundreds, thousands…) potential histories to get meaningful results.

As with any other algorithm that we’d like to run repeatedly, it makes sense to see if parallelization can help us.

Here I provide code to run make.simmap() in parallel. It’s a Windows-friendly approach and, as in my code from another blog post, I make use of parLapply().

Read more

Sample variance at small sample sizes II - distributions

In my previous post (see here), I showed that although sample variance on average gives an unbiased estimate of population variance, it is highly unreliable at extremely small sample sizes. This time, I will focus more closely on the distributions of sample variance. Does sample size affect the distributions of sample variance? And how might this inform how we determine which sample sizes are too small? I'll use one of my favorite new(ish) packages, ggridges, to plot the sets of distributions from one example simulation.

Read more

Dangers of sample variance at small sample size

Sample variance generally gives an unbiased estimate of the true population variance, but that does not mean it provides a reliable estimate of population variance. Here, I show that sample variance itself has high variance at low sample sizes. I run through a variety of empirical simulations that vary population size and population variance to see what general patterns emerge.
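To make the idea concrete, here is a minimal sketch of this kind of simulation. It is not the code from the full post: it assumes draws from a normal distribution with a known variance, and the sample sizes, number of replicates, and variable names are all illustrative choices.

```r
set.seed(1)

pop_var      <- 4                    # assumed "true" population variance
sample_sizes <- c(3, 10, 30, 100)    # illustrative sample sizes
n_reps       <- 5000                 # replicates per sample size

# For each sample size, draw many samples and record the sample variance
sim <- sapply(sample_sizes, function(n) {
  v <- replicate(n_reps, var(rnorm(n, mean = 0, sd = sqrt(pop_var))))
  c(n = n, mean_var = mean(v), sd_var = sd(v))
})

# One row per sample size: average estimate and spread of the estimates
summary_table <- as.data.frame(t(sim))
print(summary_table)
```

In a run like this, mean_var should sit near the assumed population variance at every sample size (unbiasedness), while sd_var should shrink as n grows, which is the unreliability at small n described above.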

Read more

Parallel processing for MCMCglmm in R (Windows-friendly)

Lately, I have been using the MCMCglmm package to run linear mixed models in a Bayesian framework. The documentation is generally very good, but there seems to be relatively little support for using parallel processing (here: using multiple cores on your machine) to efficiently run large numbers of MCMC runs. This is especially true for Windows users, for whom fork-based functions like parallel::mclapply() cannot use more than one core.

I’m happy to share that I have worked out a solution using the parallel package. Basically, I set up a virtual cluster and then use the parallel::parLapply() function to run iterations of MCMCglmm() in parallel.
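The general pattern can be sketched as follows. This is a rough outline rather than the code from the full post: run_chain() is a hypothetical stand-in that fits a simple lm() so the example runs without MCMCglmm installed, and the worker count and seeds are arbitrary.

```r
library(parallel)

# Hypothetical stand-in for one model run; in practice the body would call
# MCMCglmm() with your own formula, data, and priors.
run_chain <- function(seed) {
  set.seed(seed)
  x <- rnorm(200)
  y <- 2 * x + rnorm(200)
  coef(lm(y ~ x))[["x"]]
}

n_workers <- 2                  # arbitrary; often detectCores() - 1
cl <- makeCluster(n_workers)    # socket cluster, so it also works on Windows

# With a real model, load the package on every worker first, e.g.
# clusterEvalQ(cl, library(MCMCglmm))

# Run four independent "chains", always shutting the cluster down afterwards
results <- tryCatch(
  parLapply(cl, 1:4, run_chain),
  finally = stopCluster(cl)
)

slopes <- unlist(results)
print(slopes)
```

Because each worker is a fresh R session, anything the function needs (packages, data objects) has to be loaded or exported to the workers explicitly, for example with clusterEvalQ() or clusterExport().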

(This is a re-post of an entry that appeared on my old blog - see here).

Read more

Check if packages are installed (and install if not) in R

Say you have an R script that you share with others. You may not be sure that each user has installed all the packages the script will require. Using install.packages() would be unnecessary for users who already have the packages and simply need to load them.

Here’s some code that provides an easy way to check whether specific packages are in the default Library. If they are, they’re simply loaded via library(). If any packages are missing, they’re installed (with dependencies) into the default Library and are then loaded.
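One way this could look is sketched below. The helper name load_or_install() is hypothetical, not the function from the full post, and the packages in the usage line are just examples that ship with R, so nothing gets installed when you try it.

```r
# Hypothetical helper: install whatever is missing, then load everything
load_or_install <- function(pkgs) {
  # Packages already present in the library paths
  installed <- rownames(installed.packages())
  missing   <- setdiff(pkgs, installed)

  # Install only what is missing, along with dependencies
  if (length(missing) > 0) {
    install.packages(missing, dependencies = TRUE)
  }

  # Load each package; require() returns TRUE/FALSE instead of stopping
  loaded <- vapply(pkgs, require, logical(1),
                   character.only = TRUE, quietly = TRUE)
  invisible(loaded)
}

# Example usage with packages that ship with R
loaded <- load_or_install(c("stats", "parallel"))
print(loaded)
```

The returned logical vector is named by package, so a script can check all(loaded) and stop early if something failed to load.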

(This is a re-post of an entry that appeared on my old blog - see here).

Read more

Set max DLLs in R (Windows)

On occasion, you may need to adjust the max number of .dll files that R can handle. I first encountered this need when using a high number of packages together.

I’ve had trouble finding this info in the past, so I decided to create this post for others. This works if you are using Windows.

Read more

test post

#tags: #test  #R 

I am using this post to modify an R Markdown template provided by the rmarkdown package. This will be mostly futzing around so I can get familiar with how things work. Some light plagiarism will be involved for the things I don’t feel like changing.

Read more