Reference

Models

spatiotemporal_model_timeseries(distance_matrix, sa_lambda, sa_inf, ta_delta1s, num_timepoints, sample_rate, highpass_freq, seed=None)

Simulate the spatiotemporal model from Shinn et al (2023)

Parameters:

Name Type Description Default
distance_matrix NxN numpy array

the NxN distance matrix, representing the spatial distance between location of each of the timeseries. This should usually be the output of the distance_matrix_euclidean function.

required
sa_lambda float

the SA-λ parameter

required
sa_inf float

the SA-∞ parameter

required
ta_delta1s length-N list of floats

a list of TA-Δ₁ values, of length N, for generating the timeseries

required
num_timepoints int

length of timeseries to generate

required
sample_rate float

the spacing between timepoints (e.g. TR in fMRI)

required
highpass_freq float

if non-zero, apply a highpass filter above the given frequency. A good default is 0.01 for fMRI timeseries.

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxT numpy array: For each of the N nodes, a timeseries of length T=num_timepoints according to the spatiotemporal model

intrinsic_timescale_sa_model_timeseries(distance_matrix, sa_lambda, sa_inf, ta_delta1s, num_timepoints, sample_rate, highpass_freq, seed=0)

Simulate the intrinsic timescale + spatial autocorrelation model from Shinn et al (2023)

Parameters:

Name Type Description Default
distance_matrix NxN numpy array

the NxN distance matrix, representing the spatial distance between location of each of the timeseries. This should usually be the output of the distance_matrix_euclidean function.

required
sa_lambda float

the SA-λ parameter

required
sa_inf float

the SA-∞ parameter

required
ta_delta1s length-N list of floats

a list of TA-Δ₁ values, of length N, for generating the timeseries

required
num_timepoints int

length of timeseries to generate

required
sample_rate float

the spacing between timepoints (e.g. TR in fMRI)

required
highpass_freq float

if non-zero, apply a highpass filter above the given frequency. A good default is 0.01 for fMRI timeseries.

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

0

Returns:

Type Description

NxT numpy array: For each of the N nodes, a timeseries of length T=num_timepoints according to the intrinsic timescale + spatial autocorrelation model

Surrogate timeseries

eigensurrogate_matrix(cm, seed=None)

Eigensurrogate model, from Shinn et al (2023)

Determine the eigenvalues of the correlation matrix, and then sample a new correlation matrix with the same eigenvalues.

Parameters:

Name Type Description Default
cm NxN numpy array

a correlation matrix

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxN numpy array: a correlation matrix with the same eigenvalues as cm

eigensurrogate_timeseries(cm, N_timepoints, seed=None)

Timeseries from the eigensurrogate model.

Sample timeseries which have the correlation matrix given by the eigensurrogate model. Note that there are many ways to sample timeseries from the eigensurrogate model, but this is the simplest (a multivariate normal distribution).

Parameters:

Name Type Description Default
cm NxN numpy array

a correlation matrix

required
N_timepoints int

the length of the timeseries to sample

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxN_timepoints numpy array: timeseries generated from the eigensurrogate model

phase_randomize(tss, seed=None)

Phase-randomized surrogate timeseries.

Scramble a set of timeseries independently by preserving the amplitudes in Fouries space but randomly sampling new phases from the uniform distribution [0, 2π].

Parameters:

Name Type Description Default
tss NxT numpy array

should be a NxT matrix, where N is the number of timeseries and T is the number of samples in the timeseries.

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxT numpy array: surrogate timeseries of the same shape as tss

zalesky_surrogate(cm, seed=None)

Zalesky matching surrogate, from Zalesky et al (2012)

Generate matrices with identical mean-FC and var-FC. Adapted from code taken from Zalesky et al (2012)

Parameters:

Name Type Description Default
cm NxN numpy array

a correlation matrix

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxN numpy array: a correlation matrix with the same mean and variance as cm

Statistics

long_memory(x, minscale, multivariate=False)

Estimate the long memory coefficient, from Achard and Gannaz (2006)

See Achard and Gannaz (2006) for details of the coefficient and its estimator.

Parameters:

Name Type Description Default
x NxT numpy array

the matrix of timeseries (rows are regions, columns are timepoints)

required
minscale int

the minimum wavelet scale used to perform the estimation. As a rule of thumb, if data are low pass filtered, minscale should be the multiple of nyquist corresponding to the filter frequency, e.g. 2 if filtering is performed at half nyquist.

required
multivariate

the type of estimation to perform, described in Achard and Gannaz (2006). Note that multivariate=True is extremely slow for any reasonably sized correlation matrix.

False

Returns:

Name Type Description
float

The long memory coefficient

Warning

This function is rather fragile: it requires rpy2 to be installed, as well as R, with the multiwave package. It works on my computer but your results may vary. If it doesn't work for you, export your data and use the "multiwave" package directly in R.

spatial_autocorrelation(cm, dist, discretization=1)

Calculate the SA-λ and SA-∞ measures of spatial autocorrelation, defined in Shinn et al (2023)

Parameters:

Name Type Description Default
cm NxN numpy array

NxN correlation matrix of timeseries, where N is the number of timeseries

required
dist NxN numpy array

the NxN distance matrix, representing the spatial distance between location of each of the timeseries. This should usually be the output of the distance_matrix_euclidean function.

required
discretization int

The size of the bins to use when computing the SA parameters. The size of the discretization should ensure that there are a sufficient number of observations in each bin, but also enough total bins to make a meaningful estimation. Try increasing it or decreasing it according to the scale of your data. Data that has values up to around 100 should be fine with the default. Decrease or increase as necessary to get an appropriate estimation.

1

Returns:

Type Description

tuple of floats: tuple of (SA-λ, SA-∞)

temporal_autocorrelation(x)

Compute TA-Δ₁ (lag-1 temporal autocorrelation) from the timeseries.

Parameters:

Name Type Description Default
x

the timeseries of which to compute TA-Δ₁

If x is a single list or one-dimensional numpy array, return the TA-Δ₁ estimate. If x contains nested lists or is a NxT numpy, return a numpy array of length N, giving the TA-Δ₁ of each row of x.

required

Returns:

Type Description

list of floats: The temporal autocorrelation of each timeseries in x is multidimensional, return nested lists in the same shape as the leading dimensions of x.

Note

This is the biased estimator, but it is the default in numpy. This is what we use throughout the manuscript. The purpose of this function is to standardize computing TA-Δ₁.

Power spectrum manipulation

correlated_spectral_sampling(cm, spectra, seed=None)

Generate timeseries with given amplitude spectra and correlation matrices

This implements Correlated Spectral Sampling, as described in Shinn et al (2023).

Parameters:

Name Type Description Default
cm NxN numpy array

The correlation matrix

required
spectra Nxk numpy array

A list of Fourier spectra generated by make_spectrum Each of the N spectra are associated with a row/column of cm.

required
seed int

the random seed. If not specified, it will use the current state of the numpy random number generator.

None

Returns:

Type Description

NxT numpy array: N timeseries of length T. Timeseries i will have a power spectrum given by spectra[i], and will be correlated with the other timeseries with correlations cm[i].

make_spectrum(tslen, sample_rate, alpha, highpass_freq)

Create a 1/f^alpha spectrum.

Parameters:

Name Type Description Default
tslen int

the length of the timeseries represented by the spectrum

required
sample_rate float

the spacing between timepoints (e.g. TR in fMRI)

required
alpha float

the pink noise exponent, between 0 and 2.

required
highpass_freq float

if non-zero, apply a highpass filter above the given frequency. A good default is 0.01 for fMRI timeseries.

required

Returns:

Type Description

length-k numpy array: Return the fourier spectrum (amplitude spectrum)

how_much_noise(spectrum, target_ta)

Determine the standard deviation of noise to add to achieve a target TA-Δ₁.

This function answers the following question: If I generate timeseries with frequency spectrum (amplitude spectrum) spectrum, and then add white noise to the generated timeseries, what should the standard deviation of this white noise be if I want the timeseries to have the TA-Δ₁ coefficient target_ta?

Parameters:

Name Type Description Default
spectrum length-k numpy array

the power spectrum to generate from, e.g., that generated from make_spectrum.

required
target_ta float

the desired TA-Δ₁.

required

Returns:

Name Type Description
float

The standard deviation of white noise to add

ta_to_alpha(tslen, sample_rate, highpass_freq, target_ta)

Compute the (pink noise) alpha which would give, noiseless, the given TA-Δ₁.

Generate timeseries with get_spectrum_ta, i.e. high pass filtered.

Parameters:

Name Type Description Default
tslen int

the length of the timeseries represented by the spectrum

required
sample_rate float

the spacing between timepoints (e.g. TR in fMRI)

required
alpha float

the pink noise exponent, between 0 and 2.

required
highpass_freq float

if non-zero, apply a highpass filter above the given frequency. A good default is 0.01 for fMRI timeseries.

required

Returns:

Name Type Description
float

a value of alpha such that the filtered pink noise with this exponent has TA-Δ₁ coefficient target_ta.

get_spectrum_ta(spectrum)

Given a fourier spectrum, return the expected TA-Δ₁ of a timeseries generated with that spectrum.

Parameters:

Name Type Description Default
spectrum length-k numpy array

the power spectrum to generate from, e.g., that generated from make_spectrum.

required

Returns:

Name Type Description
float

the TA-Δ₁ value that would be expected if a timeseries had the given power spectrum and random phases.

ta_to_alpha_fast(tslen, sample_rate, highpass_freq, target_ta)

Identical to ta_to_alpha, but discretize and cache to increase speed.

See ta_to_alpha for documentation.

Other tools

cosine(x, y)

Cosine similarity

Cosine similarity ranges from -1 to 1 and measures the cosine of the angles between the vectors. This function returns a matrix of the similarity for row vectors.

Parameters:

Name Type Description Default
x list of length N or KxN numpy array

the first vector on which to find cosine similarity

required
y list of length N or MxN numpy array

the second vector on which to find cosine similarity

required

Returns:

Type Description

KxM numpy array: The cosine similarity matrix between x and y. If K and M are 1 or x and y are lists, return a float instead.

fingerprint(subjects, values)

Fingerprinting performance, from Finn et al (2015).

Note

This implementation is slightly different than that of Finn et al (2015). Here, instead of having separate databases, we use all other scans from all other subjects as the "database". Then, we see if the match is from the same subject or different subjects. This means that if there are more than two observations for each subject, there will be more than one "correct" best match. However, it also means that there are many more possible incorrect matches for a given subject than there are in Finn et al (2015).

Parameters:

Name Type Description Default
subjects list or 1xN numpy array

a list of length N giving the subject ID. N should be the total number of observations, e.g., if there are 10 subjects with 3 scans each, N = 30.

required
values Nxk numpy array

numpy matrix, where N is as above and k is the size of the feature on which to perform the fingerprinting. E.g., if we are fingerprinting based on TA-delta1 of each node in a 360 node atlas, k=360.

required

Returns:

Name Type Description
float

The fraction of correct fingerprinting identifications

lin(x, y)

Lin's concordance correlation coefficient

Lin's concordnce ranges from -1 to 1 and achieves a tradeoff between correlation and variance explained. This function returns a matrix of the concordance for row vectors.

Parameters:

Name Type Description Default
x list of length N or KxN numpy array

the first vector on which to find Lin's concordance

required
y list of length N or MxN numpy array

the second vector on which to find Lin's concordance

required

Returns:

Type Description

KxM numpy array: The Lin's concordance matrix between x and y. If K and M are 1 or x and y are lists, return a float instead.

pearson(x, y)

Matrix of Pearson correlations

Pearson correlation ranges from -1 to 1. This function differs from np.corrcoef because it allows you to pass x and y as matrices without computing the correlation between within-matrix rows. For large x and y this is a substantial speed increase. This function returns a matrix of the correlation for row vectors. (Sometimes this operation is mistakenly called "cross-correlation".)

Parameters:

Name Type Description Default
x list of length N or KxN numpy array

the first vector on which to find Pearson correlation

required
y list of length N or MxN numpy array

the second vector on which to find Pearson correlation

required

Returns:

Type Description

KxM numpy array: The Pearson correlation matrix between x and y. If K and M are 1 or x and y are lists, return a float instead.

spearman(x, y)

Matrix of Spearman correlations

Spearman correlation ranges from -1 to 1. This function differs from scipy.stats.spearmanr because it allows you to pass x and y as matrices without computing the correlation between within-matrix rows. For large x and y this is a substantial speed increase. This function returns a matrix of the correlation for row vectors (unlike scipy.stats.spearmanr, which uses column vectors).

Parameters:

Name Type Description Default
x list of length N or KxN numpy array

the first vector on which to find Spearman correlation

required
y list of length N or MxN numpy array

the second vector on which to find Spearman correlation

required

Returns:

Type Description

KxM numpy array: The Spearman correlation matrix between x and y. If K and M are 1 or x and y are lists, return a float instead.