pycea.tl.distance

pycea.tl.distance(tdata, key, obs=None, metric='euclidean', metric_kwds=None, sample_n=None, connect_key=None, random_state=None, update=True, key_added=None, copy=False)
Overloads:
  • tdata (td.TreeData), key (str), obs (str | int | Sequence[Any] | None), metric (_MetricFn | _Metric), metric_kwds (Mapping | None), sample_n (int | None), connect_key (str | None), random_state (int | None), update (bool), key_added (str | None), copy (Literal[True, False]) → np.ndarray | sp.sparse.csr_matrix

  • tdata (td.TreeData), key (str), obs (str | int | Sequence[Any] | None), metric (_MetricFn | _Metric), metric_kwds (Mapping | None), sample_n (int | None), connect_key (str | None), random_state (int | None), update (bool), key_added (str | None), copy (Literal[True, False]) → None

Computes distances between observations.

Supports full pairwise distances, distances from a single observation to all others, distances within a specified subset, or distances for an explicit list of pairs. Distances can be computed using a named metric (e.g. "euclidean", "cosine", "manhattan") or a user-supplied callable.
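
The four modes differ only in which pairs of observations are compared. The sketch below illustrates this in plain Python on a toy coordinate table. The names `coords`, `all_pairs`, `one_to_all`, `within_subset`, and `explicit` are hypothetical, and this is not pycea's implementation.

```python
import math
from itertools import combinations

# Hypothetical toy data: observation name -> 2D coordinate.
coords = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (6.0, 8.0)}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.dist(u, v)


# Full pairwise: every unordered pair of observations.
all_pairs = {
    (i, j): euclidean(coords[i], coords[j]) for i, j in combinations(coords, 2)
}

# Single observation to all: one reference point against every observation
# (the reference point itself gets 0.0 in this sketch).
one_to_all = {j: euclidean(coords["a"], coords[j]) for j in coords}

# Subset: pairwise distances restricted to the listed observations.
subset = ["a", "c"]
within_subset = {
    (i, j): euclidean(coords[i], coords[j]) for i, j in combinations(subset, 2)
}

# Explicit pairs: only the pairs that were asked for.
pairs = [("a", "b"), ("b", "c")]
explicit = {(i, j): euclidean(coords[i], coords[j]) for i, j in pairs}
```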

Parameters:
  • tdata (TreeData) – The TreeData object.

  • key (str) – The data to compute distances on. 'X' or any tdata.obsm key is valid.

  • obs (str | int | Sequence[Any] | None (default: None)) –

    The observations to use:

    • If None, pairwise distance for all observations is stored in tdata.obsp.

    • If a string, the distance from that observation to all others is stored in tdata.obs.

    • If a sequence, pairwise distance is stored in tdata.obsp.

    • If a sequence of pairs, distance between pairs is stored in tdata.obsp.

  • metric (Union[Callable[[ndarray, ndarray], float], Literal['braycurtis', 'canberra', 'chebyshev', 'cityblock', 'cosine', 'correlation', 'dice', 'euclidean', 'hamming', 'jaccard', 'kulsinski', 'l1', 'l2', 'mahalanobis', 'minkowski', 'manhattan', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule']] (default: 'euclidean')) – A known metric’s name or a callable that returns a distance.

  • metric_kwds (Mapping | None (default: None)) – Options for the metric.

  • sample_n (int | None (default: None)) – If specified, randomly sample sample_n pairs of observations.

  • connect_key (str | None (default: None)) – If specified, compute distances only between connected observations specified by tdata.obsp['{connect_key}_connectivities'].

  • random_state (int | None (default: None)) – Random seed for sampling.

  • key_added (str | None (default: None)) – Distances are stored in tdata.obsp['{key_added}_distances'] and connectivities in tdata.obsp['{key_added}_connectivities']. Defaults to key.

  • update (bool (default: True)) – If True, updates existing distances instead of overwriting.

  • copy (Literal[True, False] (default: False)) – If True, returns the distances.
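
The way sample_n, connect_key, and random_state interact can be sketched in plain Python. This is an illustration under assumptions, not pycea's code: the helper `sample_pairs` is hypothetical, it draws unordered pairs without replacement, and it takes connectivity as an explicit list of pairs rather than a sparse matrix.

```python
import random
from itertools import combinations


def sample_pairs(names, sample_n, connected=None, random_state=None):
    """Pick up to sample_n unordered pairs, optionally only connected ones.

    Hypothetical helper: `connected` is an iterable of (i, j) pairs standing
    in for the nonzero entries of a connectivity matrix.
    """
    # Candidate pool: all unordered pairs, in a deterministic order.
    candidates = list(combinations(sorted(names), 2))
    if connected is not None:
        # Restrict the pool to connected pairs before sampling.
        allowed = {tuple(sorted(p)) for p in connected}
        candidates = [p for p in candidates if p in allowed]
    # A seeded generator makes the draw reproducible.
    rng = random.Random(random_state)
    return rng.sample(candidates, min(sample_n, len(candidates)))


names = ["a", "b", "c", "d"]
print(sample_pairs(names, sample_n=3, random_state=0))
print(sample_pairs(names, sample_n=1, connected=[("a", "b"), ("c", "d")], random_state=0))
```

Filtering before sampling means the requested number of pairs is drawn from the connected pairs only, rather than sampling first and discarding unconnected pairs afterwards.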

Returns:

Returns None if copy=False, else returns distances.

Sets the following fields:

  • tdata.obsp['{key_added}_distances'] : ndarray/csr_matrix (dtype float) if obs is None or a sequence.
    • Distances between observations.

  • tdata.obsp['{key_added}_connectivities'] : csr_matrix (dtype float) if the distances are sparse.
    • Connectivity between observations.

  • tdata.obs['{key_added}_distances'] : Series (dtype float) if obs is a string.
    • Distance from specified observation to others.

Notes

  • When both connect_key and sample_n are provided, sampling is performed within the connected pairs induced by the connectivity.

  • If you pass a callable metric, it must accept two 1D vectors and return a scalar.
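
A callable metric with the required shape might look like the following minimal sketch. The function name `manhattan_metric` is hypothetical, and the length check is an added safeguard rather than something the source requires.

```python
def manhattan_metric(u, v):
    """Hypothetical custom metric: two 1D vectors in, one scalar out."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Sum of absolute coordinate differences, returned as a plain float.
    return float(sum(abs(a - b) for a, b in zip(u, v)))


print(manhattan_metric([1.0, 2.0, 3.0], [4.0, 0.0, 3.0]))
```

A function of this shape could then be passed as metric=manhattan_metric in place of a metric name.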

Examples

Calculate pairwise spatial distance between all observations:

>>> import pycea as py
>>> tdata = py.datasets.koblan25()
>>> py.tl.distance(tdata, key="spatial")

Calculate spatial distance between closely related observations:

>>> py.tl.tree_neighbors(tdata, n_neighbors=10, depth_key="time")
>>> py.tl.distance(tdata, key="spatial", connect_key="tree_connectivities")

Calculate distance from a single observation to all others:

>>> py.tl.distance(tdata, key="spatial", obs="M3-1-19")