pycea.tl.neighbor_distance

pycea.tl.neighbor_distance(tdata, connect_key=None, dist_key=None, method='mean', key_added='neighbor_distances', copy=False)
Overloads:
  • tdata (td.TreeData), connect_key (str | None), dist_key (str | None), method (_AggregatorFn | _Aggregator), key_added (str), copy (Literal[True, False]) → pd.Series

  • tdata (td.TreeData), connect_key (str | None), dist_key (str | None), method (_AggregatorFn | _Aggregator), key_added (str), copy (Literal[True, False]) → None

Aggregates distance to neighboring observations.

For each observation $i$, this function collects the distances $\{ D_{ij} : j \in \mathcal{N}(i) \}$ to its neighbors (as defined by a binary or weighted connectivity in tdata.obsp[connect_key]) and reduces them to a single value via an aggregation function $g$:

$$d_i = g\big( \{ D_{ij} : j \in \mathcal{N}(i) \} \big)$$

The aggregator $g$ can be the mean, median, sum, min, max, variance, or a user-supplied callable. If an observation has no neighbors, the result for that observation is NaN.
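The reduction described above can be sketched with plain NumPy. This is only an illustration of the formula, not pycea's implementation: the helper name `aggregate_neighbor_distances` and the small dense matrices `D` and `C` are hypothetical, and real `tdata.obsp` matrices may be sparse.

```python
import numpy as np

def aggregate_neighbor_distances(dist, connect, agg=np.mean):
    """Hypothetical helper (not part of pycea): compute d_i = g({D_ij : j in N(i)})."""
    dist = np.asarray(dist, dtype=float)
    # Any nonzero connectivity entry (binary or weighted) marks j as a neighbor of i.
    mask = np.asarray(connect) != 0
    # Start from NaN so observations without neighbors stay NaN.
    out = np.full(dist.shape[0], np.nan)
    for i in range(dist.shape[0]):
        neighbors = dist[i, mask[i]]
        if neighbors.size > 0:
            out[i] = agg(neighbors)
    return out

# Toy pairwise distances between three observations.
D = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 2.0],
              [4.0, 2.0, 0.0]])
# Toy connectivity: observation 0 has neighbors 1 and 2, observation 1 has
# neighbor 0, and observation 2 has no neighbors.
C = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])

result = aggregate_neighbor_distances(D, C, agg=np.mean)
print(result)
```

With these toy inputs, observation 0 averages the distances 1.0 and 4.0, observation 1 has a single neighbor distance of 1.0, and observation 2 yields NaN because it has no neighbors.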

Parameters:
  • tdata (TreeData) – The TreeData object.

  • connect_key (str | None (default: None)) – tdata.obsp connectivity key specifying set of neighbors for each observation.

  • dist_key (str | None (default: None)) – tdata.obsp distances key specifying distances between observations.

  • method (Union[Callable[[ndarray], ndarray | float], Literal['mean', 'median', 'sum', 'min', 'max', 'var']] (default: 'mean')) – Aggregation function used to calculate neighbor distances.

  • key_added (str (default: 'neighbor_distances')) – tdata.obs key to store neighbor distances.

  • copy (Literal[True, False] (default: False)) – If True, returns a Series with neighbor distances.

Returns:

Returns None if copy=False, else returns a Series.

Sets the following fields:

  • tdata.obs[key_added] : Series (dtype float)
    • Neighbor distances for each observation.

Examples

Calculate mean spatial distance to tree neighbors:

>>> import pycea as py
>>> tdata = py.datasets.koblan25()
>>> py.tl.tree_neighbors(tdata, n_neighbors=5, depth_key="time")
>>> py.tl.distance(tdata, key="spatial", connect_key="tree_connectivities")
>>> py.tl.neighbor_distance(tdata, dist_key="spatial_distances", connect_key="tree_connectivities", method="mean")
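Since method also accepts a callable, a custom aggregator might look like the sketch below. The function `upper_quartile` is hypothetical, and the sketch assumes the callable receives one observation's neighbor distances as a 1-D array, which the signature above suggests but does not confirm. The pycea call is left as a comment because it needs the dataset from the example above.

```python
import numpy as np

def upper_quartile(distances: np.ndarray) -> float:
    """Hypothetical custom aggregator: 75th percentile of the neighbor distances."""
    return float(np.percentile(distances, 75))

# Quick check on a toy array of neighbor distances.
value = upper_quartile(np.array([1.0, 2.0, 3.0, 4.0, 5.0]))
print(value)

# Passing it to pycea would then look like this (not run here):
# py.tl.neighbor_distance(
#     tdata,
#     dist_key="spatial_distances",
#     connect_key="tree_connectivities",
#     method=upper_quartile,
# )
```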