vignettes/troubleshooting-model-fits.Rmd
troubleshooting-model-fits.Rmd
Automatically finding good initial values for parameters in a nonlinear model (i.e. stats::nls()
) is an art. Given that each of the formulas represented by the model
argument of fit_gaussian_2D()
contains 5 to 7 parameters, stats::nls()
will often encounter singular gradients or step size errors.
Code within fit_gaussian_2D()
will first scan the supplied dataset to guesstimate sensible initial parameters, which hopefully sidesteps these issues. But there is no guarantee this strategy will always work.
This vignette will offer some guidance on what to do when stats::nls()
fails to converge, including the use of optional parameters in fit_gaussian_2D()
that are meant to help you address these issues.
Let’s start by loading gaussplotR
and getting some sample data loaded up:
library(gaussplotR)
## Load the sample data set
data(gaussplot_sample_data)
## The raw data we'd like to use are in columns 1:3
samp_dat <-
gaussplot_sample_data[,1:3]
One common problem is that of singular gradients. I will intentionally comment out the next block of code because running it will produce the singular gradient error, and generating errors in an R Markdown file will prevent its rendering. Please un-comment the example below to see.
## Un-comment this example if you'd like to see a singular gradient error
# gauss_fit_cir <-
# fit_gaussian_2D(samp_dat,
# constrain_amplitude = TRUE,
# method = "circular")
The output from the above example should be:
#> Error in stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 * :
#> singular gradient
#> Called from: stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 *
#> X_sig^2) + ((Y_values - Y_peak)^2)/(2 * Y_sig^2)))), start = c(X_peak = #> _peak_init, >
#> Y_peak = Y_peak_init, X_sig = X_sig_init, Y_sig = Y_sig_init),
#> data = data, trace = verbose, control = list(maxiter = maxiter,
#> ...))
#> Error during wrapup: unimplemented type (29) in 'eval'
#>
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
#> Error during wrapup: INTEGER() can only be applied to a 'integer', not a 'unknown type #> #29'
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
There are a couple tools in gaussplotR
that can help you address this problem.
A good first step is to enable the optional argument print_initial_params
in fit_gaussian_2D()
by setting it to TRUE
. Again, please un-comment this next block, since it will still produce an error:
## Un-comment this example if you'd like to see a singular gradient error
# gauss_fit_cir <-
# fit_gaussian_2D(samp_dat,
# constrain_amplitude = TRUE,
# method = "circular",
# print_initial_params = TRUE)
Though this block of code will not work, you will at least see something helpful at the beginning of the error message:
#> Initial parameters:
#> Amp X_peak Y_peak X_sig Y_sig
#> 25.725293 -2.000000 3.000000 2.482892 2.500000
#> Error in stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 * :
#> singular gradient
#> Called from: stats::nls(response ~ Amp_init * exp(-((((X_values - X_peak)^2)/(2 *
#> X_sig^2) + ((Y_values - Y_peak)^2)/(2 * Y_sig^2)))), start = c(X_peak = #> _peak_init, >
#> Y_peak = Y_peak_init, X_sig = X_sig_init, Y_sig = Y_sig_init),
#> data = data, trace = verbose, control = list(maxiter = maxiter,
#> ...))
#> Error during wrapup: unimplemented type (29) in 'eval'
#>
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
#> Error during wrapup: INTEGER() can only be applied to a 'integer', not a 'unknown type #> #29'
#> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Those first three lines indicate the initial values that were used. Often, singular gradients will arise when initial values for parameters were poorly chosen (sorry!).
What you can do is supply your own set of initial values. To do this, use the optional argument user_init
within fit_gaussian_2D()
. You will need to supply a numeric vector that is of the same length as the number of parameters for your chosen model. The values you supply must be provided in the same order they appear in the Initial parameters message. They do not need to be named; values alone will suffice.
This next example will work. I’ll keep print_initial_params
on too, since it can be nice to see:
## This should run with no errors
gauss_fit_cir_user <-
fit_gaussian_2D(samp_dat,
constrain_amplitude = TRUE,
method = "circular",
user_init = c(25.72529, -2.5, 1.7, 1.3, 1.6),
print_initial_params = TRUE)
#> Initial parameters:
#> Amp X_peak Y_peak X_sig Y_sig
#> 25.72529 -2.50000 1.70000 1.30000 1.60000
gauss_fit_cir_user
#> Model coefficients
#> Amp X_peak Y_peak X_sig Y_sig
#> 25.73 -2.55 1.86 1.27 1.5
#> Model error stats
#> rss rmse deviance AIC R2 R2_adj
#> 906.88 5.02 906.88 228.32 0.54 0.53
#> Fitting methods
#> method amplitude orientation
#> "circular" "constrained" NA
Note that although we are constraining the amplitude, the value of Amp
must still be provided (here it is 25.72529
).
It may take some trial and error to find a set of user_init
values that gets your model to converge. It often makes sense to think about what values are feasible for each parameter. For example, it should be relatively straightforward to think of ranges of possible values for X_peak
and Y_peak
. I often find that finding good initial values for the “spread” parameters (such as X_sig
and Y_sig
) is the tough nut to crack, so I recommend tweaking those parameters first.
nls()
The fit_gaussian_2D()
function also allows you to pass additional control arguments to stats::nls.control()
via the ...
argument.
To put this in more technical terms, arguments supplied to ...
are handled as:stats::nls(control = list(maxiter, ...))
Therefore, if you are interested in changing e.g. minFactor
to 1/2048
:fit_gaussian_2D(data, model, minFactor = 1/2048)
See the Help file for stats::nls.control()
for further guidance on what these control arguments are.
Please also note that that tweaking maxiter
should not be handled via ...
but rather by the maxiter
argument to fit_gaussian_2D()
.
Our scapegoat here is setting constrain_amplitude = TRUE
. Often, when constraining parameters in a nonlinear model, you’ll find yourself in a scenario where the QR decomposition of the gradient matrix is not of full column rank.
Constraining parameters (amplitude or orientation) will lead to poorer-fitting Gaussians anyway, so these features should only be used if you have an a priori reason to do so (see examples in Priebe et al. 20031)
Turning off the constrain_amplitude
constraint alleviates the problem in this particular case:
## This should run with no errors
gauss_fit_cir <-
fit_gaussian_2D(samp_dat,
method = "circular")
gauss_fit_cir
#> Model coefficients
#> Amp X_peak Y_peak X_sig Y_sig
#> 23.18 -2.55 1.81 1.32 1.64
#> Model error stats
#> rss rmse deviance AIC R2 R2_adj
#> 899.61 5 899.61 230.03 0.54 0.53
#> Fitting methods
#> method amplitude orientation
#> "circular" "unconstrained" NA
Hope this helps!
🐢
Priebe NJ, Cassanello CR, Lisberger SG. The neural representation of speed in macaque area MT/V5. J Neurosci. 2003 Jul 2;23(13):5650-61. doi: 10.1523/JNEUROSCI.23-13-05650.2003.↩︎