General Tips

Motion Correction tips

  • Non-rigid motion correction is not always necessary. Rigid motion correction is sometimes sufficient and is significantly faster. Inspect your data before and after rigid motion correction to decide what is best for you. The boolean parameter params.motion.pw_rigid switches between rigid and non-rigid motion correction using the same function, MotionCorrect.motion_correct.

  • When using piecewise rigid motion correction, use parameters that make physical sense. For example, a typical patch size is around 100um x 100um, since motion can often be approximated as rigid within smaller patches (if the imaging is not too slow). Similarly, in typical 2p recordings the maximum allowed shift can be chosen to correspond to 10um. The patch size is given by the sum of the parameters params.motion.strides + params.motion.overlaps. The maximum shift parameter is params.motion.max_shifts. These values are specified in pixels, so make sure you have a rough idea of the spatial resolution of your data. The spatial resolution is set through the parameter params.data.dxy.

  • Motion correction runs in parallel by splitting each file into multiple chunks and processing them concurrently. Make sure that each chunk is not too short by setting the parameter params.motion.num_frames_split. On the other hand, chunks that are too large can negatively impact computation time even with parallelization. If you experience problems with large datasets, try scaling the number of chunks (params.motion.splits_els and params.motion.splits_rig) with the length of your recording (e.g. int((number of total frames)/200)).
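
As a rough illustration of the tips above, the following sketch converts physical sizes into pixel values and scales the number of chunks with the recording length. The helper names, the overlap_fraction argument and its default of 0.25 are assumptions made for this example and are not part of CaImAn; only the keys strides, overlaps, max_shifts and pw_rigid are taken from the text.

```python
def motion_params_in_pixels(dxy_um, patch_um=100.0, max_shift_um=10.0,
                            overlap_fraction=0.25):
    """Convert physical sizes (in um) to pixel-valued motion settings.

    dxy_um is the spatial resolution in um per pixel along each axis.
    overlap_fraction (share of the patch used as overlap) is an assumption
    of this sketch, not a CaImAn default.
    """
    strides, overlaps, max_shifts = [], [], []
    for d in dxy_um:
        patch_px = max(1, round(patch_um / d))        # full patch size in pixels
        overlap_px = int(patch_px * overlap_fraction)  # part shared with neighbours
        strides.append(patch_px - overlap_px)          # strides + overlaps == patch size
        overlaps.append(overlap_px)
        max_shifts.append(max(1, round(max_shift_um / d)))
    return {
        "pw_rigid": True,
        "strides": tuple(strides),
        "overlaps": tuple(overlaps),
        "max_shifts": tuple(max_shifts),
    }


def number_of_chunks(total_frames, frames_per_chunk=200):
    """Scale the number of chunks with the recording length (at least 1)."""
    return max(1, int(total_frames / frames_per_chunk))


if __name__ == "__main__":
    # Example: 2 um per pixel in both directions, 3000 frames in total.
    print(motion_params_in_pixels((2.0, 2.0)))
    print(number_of_chunks(3000))
```

The resulting values could then be passed to the corresponding motion parameters; check them against your own data before relying on them.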

Caiman Online Processing tips

  • Important parameters for online processing are the CNN threshold value params.online.thresh_CNN_noisy, the trace SNR params.online.min_SNR and the number of candidate components to be considered at each timestep params.online.min_num_trial. Lower values for the thresholds (e.g., 1 for params.online.min_SNR and 0.5 for params.online.thresh_CNN_noisy) and/or higher values for params.online.min_num_trial (e.g., 10) can lead to higher recall, although potentially at the expense of lower precision. In general, they are preferable for datasets that are relatively short (e.g., 10000 frames or less). On the other hand, higher threshold values (e.g., 1.5 for params.online.min_SNR and 0.7 for params.online.thresh_CNN_noisy) and/or lower values for params.online.min_num_trial (e.g., 5) will lead to higher precision, although potentially at the expense of lower recall. In general, they are preferable for datasets that are longer (e.g., more than 10000 frames).

  • If your analysis setup allows it, multiple epochs over the data can be very beneficial, especially in the strict regime of high acceptance thresholds.

  • In general, bare initialization can be used most of the time to capture the neuropil activity and a small number of neurons in an initial chunk. For a large FOV with many active neurons, e.g., a plane from a zebrafish dataset, bare initialization can be inadequate. In this case, a proper initialization with cnmf can lead to substantially better results.

  • Spatial downsampling can lead to significant speed gains, often at no expense in terms of accuracy. It can be set through the parameter ds_factor.

  • When using the CNN for screening candidate components, the usage of a GPU can lead to significant computational gains.
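
The threshold guidance for online processing can be written down as a small lookup. This is only a sketch: the function name and the short_limit argument are hypothetical, and the returned numbers are the example values quoted above as starting points, not library defaults.

```python
def suggested_online_thresholds(num_frames, short_limit=10000):
    """Return example starting values for the online acceptance parameters.

    Short recordings (<= short_limit frames) get the permissive, high-recall
    values; longer recordings get the stricter, high-precision values.
    """
    if num_frames <= short_limit:
        return {"min_SNR": 1.0, "thresh_CNN_noisy": 0.5, "min_num_trial": 10}
    return {"min_SNR": 1.5, "thresh_CNN_noisy": 0.7, "min_num_trial": 5}


if __name__ == "__main__":
    print(suggested_online_thresholds(8000))
    print(suggested_online_thresholds(50000))
```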

Caiman Batch processing tips

  • To optimize memory consumption and parallelize computation, we suggest processing in patches (see companion paper). The user should inspect the correlation image and select an appropriate number of neurons per patch. The params.patches['rf'] and params.patches['stride'] parameters control the size of the patches and their overlap. Given the patch size and the correlation image, the user can set an upper bound on the number of neurons per patch. We suggest starting with regions that contain 5-10 neurons.

  • Important parameters for selecting components based on quality are

    • the CNN lower bound and upper threshold params.quality['cnn_lowest'] and params.quality['min_cnn_thr']

    • the trace SNR params.quality['min_SNR']

    • the footprint consistency threshold params.quality['rval_thr']

    Each quality check has a low threshold (rval_lowest (default -1), SNR_lowest (default 0.5), cnn_lowest (default 0.1)) and high threshold (rval_thr (default 0.8), min_SNR (default 2.5), min_cnn_thr (default 0.9)). A component has to exceed ALL low thresholds as well as at least ONE high threshold to be accepted.

The user should explore these parameters around the default to optimize for specific data sets.
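
The acceptance rule described above (exceed ALL low thresholds and at least ONE high threshold) can be sketched as follows. The function and dictionary names are hypothetical, the thresholds are the defaults quoted in the text, and the use of strict comparisons is an assumption based on the word "exceed".

```python
# Default thresholds as quoted in the text above.
LOW = {"rval": -1.0, "SNR": 0.5, "cnn": 0.1}
HIGH = {"rval": 0.8, "SNR": 2.5, "cnn": 0.9}


def is_accepted(rval, snr, cnn, low=LOW, high=HIGH):
    """Apply the two-level rule to the quality scores of one component."""
    scores = {"rval": rval, "SNR": snr, "cnn": cnn}
    exceeds_all_low = all(scores[k] > low[k] for k in scores)
    exceeds_one_high = any(scores[k] > high[k] for k in scores)
    return exceeds_all_low and exceeds_one_high


if __name__ == "__main__":
    print(is_accepted(rval=0.5, snr=3.0, cnn=0.5))   # high SNR, nothing below a low bound
    print(is_accepted(rval=0.9, snr=0.4, cnn=0.95))  # SNR is below its low bound
    print(is_accepted(rval=0.5, snr=1.0, cnn=0.5))   # no high threshold exceeded
```

In this sketch the first component is accepted and the other two are rejected, which shows that a single very low score vetoes a component even when its other scores are excellent.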

1p processing tips

  • For microendoscopic 1p data use CNMF-E’s background model and initialization method by setting center_psf=True, method_init='corr_pnr' and ring_size_factor to some value around 1.5. In this case the spatial and temporal components are updated during the initialization phase, hence use only_init_patch=True.

  • Other important parameters for microendoscopic 1p data are gSig, gSiz, min_corr and min_pnr. gSig specifies the width of a 2D Gaussian kernel that approximates a neuron, and gSiz the average diameter of a neuron, in general 4*gSig+1. To pick the thresholds min_corr and min_pnr you can use caiman.utils.visualization.inspect_correlation_pnr and vary the slider values.

  • Because the background has no high spatial frequency components, it can be spatially downscaled to speed up processing without loss in accuracy, e.g. by a factor of 2 by setting ssub_B=2.

  • The exact background can be returned as a full rank matrix (gnb=-1), or more compactly as parameters of the ring model (gnb=0), or not at all (gnb<-1). Furthermore, the background can also be approximated as a low rank matrix by setting gnb to the desired rank. gnb=0 is usually the desired choice. If you have plenty of RAM and process in patches, gnb=-1 is a good and faster option.

  • The CNMF-E algorithm places high demands on RAM. There is, however, a trade-off between computing time and memory usage when processing in patches. The number of processes n_processes specifies how many patches are processed in parallel, so a higher number decreases computing time but increases RAM usage. If you have insufficient RAM, use a smaller value for n_processes to reduce memory consumption, or disable parallelization entirely by setting dview=None.
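
The 1p settings mentioned above can be collected in one place. The sketch below is an illustration only: the helper names are hypothetical, gSig is treated as a single integer for simplicity, and the resulting dictionary is a plain Python dict whose keys mirror the parameter names in the text rather than a call into CaImAn.

```python
def one_photon_settings(gSig, ring_size_factor=1.5, ssub_B=2, gnb=0):
    """Assemble the CNMF-E related settings discussed in the tips above."""
    return {
        "center_psf": True,
        "method_init": "corr_pnr",
        "only_init_patch": True,
        "ring_size_factor": ring_size_factor,
        "gSig": gSig,
        "gSiz": 4 * gSig + 1,  # average neuron diameter, rule of thumb from the text
        "ssub_B": ssub_B,      # spatial downscaling factor for the background
        "gnb": gnb,
    }


def describe_gnb(gnb):
    """Explain how the background is returned for a given gnb value."""
    if gnb > 0:
        return "low rank approximation of rank %d" % gnb
    if gnb == 0:
        return "parameters of the ring model"
    if gnb == -1:
        return "full rank matrix"
    return "background not returned"


if __name__ == "__main__":
    settings = one_photon_settings(gSig=3)
    print(settings["gSiz"])
    print(describe_gnb(settings["gnb"]))
```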

Deconvolution tips

  • Simultaneous deconvolution and source extraction mostly offers benefits for data with particularly low SNR. In most cases, running source extraction without deconvolution (p=0), followed by deconvolution, will be sufficient.

  • It is generally better to perform some sort of de-trending on the extracted calcium traces prior to deconvolution, to correct for baseline drifts that can result in wrongly deconvolved neural activity. You can use the estimates.detrend_df_f method for that.

  • When using the constrained_foopsi method for deconvolution, the spiking variable S does not directly correspond to the number of spikes or the spiking probability in each time bin. It merely represents a measure of “deconvolved neural activity” that is loosely proportional to the firing rate of the neuron at each time. To convert it into actual spike counts, some reference point is needed to threshold and quantize this signal. For example, if you deconvolve DF/F traces and know what change in DF/F units a single spike induces, you can use this information to approximate the number of spikes (under certain linearity assumptions).
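
As a minimal sketch of the quantization idea in the last tip, the function below divides the deconvolved activity by an assumed DF/F change per spike and rounds to the nearest integer. The function name and the dff_per_spike argument are hypothetical, and the approach assumes that S is expressed in DF/F units and that spikes add up linearly, which may not hold for your indicator.

```python
def approximate_spike_counts(s, dff_per_spike):
    """Quantize deconvolved activity into approximate spike counts.

    s             : sequence of deconvolved activity values in DF/F units
    dff_per_spike : assumed DF/F change induced by a single spike (> 0)
    """
    if dff_per_spike <= 0:
        raise ValueError("dff_per_spike must be positive")
    # Round half up to the nearest integer and clip negative values to zero.
    return [max(0, int(v / dff_per_spike + 0.5)) for v in s]


if __name__ == "__main__":
    s = [0.0, 0.05, 0.22, 0.41, 0.95]
    print(approximate_spike_counts(s, dff_per_spike=0.2))
```

Values well below dff_per_spike are mapped to zero, so this also acts as a simple threshold on the deconvolved signal.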