pycea.tl.autocorr
- pycea.tl.autocorr(tdata, keys=None, connect_key='tree_connectivities', method='moran', layer=None, copy=False)
Calculate autocorrelation statistic.
This function computes autocorrelation for one or more variables using either Moran’s I or Geary’s C statistic, based on a specified connectivity graph between observations.
Mathematically, the two statistics are defined as follows:
\[
\begin{aligned}
I &= \frac{N \sum_{i,j} w_{i,j} (x_i - \bar{x})(x_j - \bar{x})}{W \sum_i (x_i - \bar{x})^2} \\
C &= \frac{(N - 1) \sum_{i,j} w_{i,j} (x_i - x_j)^2}{2W \sum_i (x_i - \bar{x})^2}
\end{aligned}
\]
where:
\(N\) is the number of observations,
\(x_i\) is the value of observation i,
\(\bar{x}\) is the mean of all observations,
\(w_{i,j}\) is the weight between observations i and j in the connectivity graph (the tdata.obsp entry named by connect_key), and
\(W = \sum_{i,j} w_{i,j}\).
A Moran’s I value close to 1 indicates strong positive autocorrelation, values near 0 suggest randomness, and negative values indicate negative autocorrelation. Geary’s C behaves inversely: values less than 1 indicate positive autocorrelation, values near 1 suggest randomness, and values greater than 1 indicate negative autocorrelation.
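To make the two formulas and their interpretation concrete, here is a minimal pure-Python sketch. It is not pycea's implementation: the helper names `moran_i`, `geary_c`, and `path_weights` are hypothetical, and the toy path graph (each observation connected to its immediate neighbors) is an assumption standing in for a real connectivity matrix such as `tdata.obsp['tree_connectivities']`.

```python
def moran_i(x, w):
    """Moran's I for values x and a dense weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    total_w = sum(sum(row) for row in w)  # W = sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return n * cross / (total_w * sum(d * d for d in dev))


def geary_c(x, w):
    """Geary's C for values x and a dense weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    total_w = sum(sum(row) for row in w)  # W = sum of all weights
    sq_diff = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return (n - 1) * sq_diff / (2 * total_w * sum((xi - mean) ** 2 for xi in x))


def path_weights(n):
    """Hypothetical toy graph: symmetric unit weights between consecutive observations."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


w = path_weights(4)
smooth = [1.0, 2.0, 3.0, 4.0]       # neighbors have similar values
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbors have dissimilar values

print("smooth:     ", moran_i(smooth, w), geary_c(smooth, w))
print("alternating:", moran_i(alternating, w), geary_c(alternating, w))
```

On this toy graph the smooth signal gives I of about 0.33 and C = 0.3 (positive autocorrelation), while the alternating signal gives I = -1 and C = 1.5 (negative autocorrelation).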
- Parameters:
tdata (TreeData) – TreeData object.
keys (str | Sequence[str] | None (default: None)) – One or more obs.keys(), var_names, obsm.keys(), or obsp.keys() to calculate autocorrelation for. Defaults to all 'var_names'.
connect_key (str (default: 'tree_connectivities')) – tdata.obsp connectivity key specifying set of neighbors for each observation.
method (str (default: 'moran')) – Method to calculate autocorrelation. Options are:
'moran' : Moran’s I autocorrelation.
'geary' : Geary’s C autocorrelation.
layer (str | None (default: None)) – Name of the TreeData object layer to use. If None, tdata.X is used.
copy (Literal[True, False] (default: False)) – If True, returns a DataFrame with autocorrelation.
- Return type:
DataFrame | None
- Returns:
Returns None if copy=False, else returns a DataFrame with columns:
'autocorr' - Moran’s I or Geary’s C statistic.
'pval_norm' - p-value under normality assumption.
'var_norm' - variance of 'autocorr' under normality assumption.
Sets the following fields for each key:
tdata.uns['moranI']: Above DataFrame if method is 'moran'.
tdata.uns['gearyC']: Above DataFrame if method is 'geary'.
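For intuition about what a variance and p-value "under normality assumption" can mean for Moran’s I, the sketch below applies the standard textbook formulas: \(E[I] = -1/(N-1)\) and \(\mathrm{Var}(I) = \frac{N^2 S_1 - N S_2 + 3W^2}{W^2 (N^2 - 1)} - E[I]^2\), with \(S_1 = \tfrac{1}{2}\sum_{i,j}(w_{i,j} + w_{j,i})^2\) and \(S_2 = \sum_i (\sum_j w_{i,j} + \sum_j w_{j,i})^2\). Whether pycea computes 'var_norm' and 'pval_norm' exactly this way is an assumption, as is the one-sided upper-tail p-value; the function name `moran_normality` is hypothetical, and Geary’s C uses a different variance formula not shown here.

```python
import math


def moran_normality(stat, w):
    """Variance and one-sided p-value of Moran's I under the normality assumption.

    Textbook formulas only; not taken from pycea's source.
    """
    n = len(w)
    total_w = sum(sum(row) for row in w)  # W
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n))
    expected = -1.0 / (n - 1)  # E[I] under the null hypothesis
    var = (n * n * s1 - n * s2 + 3 * total_w ** 2) / (total_w ** 2 * (n * n - 1)) - expected ** 2
    z = (stat - expected) / math.sqrt(var)
    pval = 0.5 * math.erfc(z / math.sqrt(2))  # assumed one-sided: P(Z >= z)
    return var, pval


# Toy path graph on 4 observations (symmetric unit weights between neighbors).
w = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
var_norm, pval_norm = moran_normality(1.0 / 3.0, w)  # I = 1/3 for values 1, 2, 3, 4
print(var_norm, pval_norm)
```

With only four observations the normal approximation is crude; the example is meant to show the arithmetic, not to suggest a meaningful test at this sample size.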
Examples
Estimate gene expression heritability using Moran’s I autocorrelation:
>>> import pycea as py
>>> tdata = py.datasets.yang22()
>>> py.tl.tree_neighbors(tdata, n_neighbors=10)
>>> py.tl.autocorr(tdata, connect_key="tree_connectivities", method="moran")