pycea.tl.ancestral_states
- pycea.tl.ancestral_states(tdata, keys, method='mean', missing_state=None, default_state=None, costs=None, keys_added=None, tree=None, copy=False)
- Overloads:
tdata (td.TreeData), keys (str | Sequence[str]), method (str | Callable), missing_state (str | None), default_state (str | None), costs (pd.DataFrame | None), keys_added (str | Sequence[str] | None), tree (str | Sequence[str] | None), copy (Literal[True, False]) → pd.DataFrame
tdata (td.TreeData), keys (str | Sequence[str]), method (str | Callable), missing_state (str | None), default_state (str | None), costs (pd.DataFrame | None), keys_added (str | Sequence[str] | None), tree (str | Sequence[str] | None), copy (Literal[True, False]) → None
Reconstructs ancestral states for an attribute.
This function reconstructs ancestral (internal node) states for categorical or continuous attributes defined on tree observations. Several reconstruction methods are supported, ranging from simple aggregation rules to the Sankoff and Fitch-Hartigan algorithms for discrete character data. A custom aggregation function can also be provided.
For tdata.alignment == "leaves", only leaf node values are used as input and all internal node states are reconstructed. For tdata.alignment == "nodes" or "subset", internal nodes present in tdata.obs with non-missing values are treated as fixed constraints and are not overwritten by reconstruction.
- Parameters:
tdata (TreeData) – TreeData object.
keys (str | Sequence[str]) – One or more obs.keys(), var_names, obsm.keys(), or obsp.keys() to reconstruct.
method (str | Callable (default: 'mean')) – Method to reconstruct ancestral states:
  'mean' : The mean of leaves in subtree.
  'sum' : The sum of leaves in subtree (iterative bottom-up traversal).
  'mode' : The most common value in the subtree.
  'fitch_hartigan' : The Fitch-Hartigan algorithm.
  'sankoff' : The Sankoff algorithm with specified costs.
  Any function that takes a list of values and returns a single value.
missing_state (str | None (default: None)) – The state to consider as missing data.
default_state (str | None (default: None)) – The expected state for the root node.
costs (DataFrame | None (default: None)) – A pd.DataFrame with the costs of changing states (from rows to columns). Only used if method is 'sankoff'.
keys_added (str | Sequence[str] | None (default: None)) – Attribute keys of tdata.obst[tree].nodes where ancestral states will be stored. If None, keys are used.
tree (str | Sequence[str] | None (default: None)) – The obst key or keys of the trees to use. If None, all trees are used.
copy (Literal[True, False] (default: False)) – If True, returns a DataFrame with ancestral states.
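To make the "from rows to columns" orientation of costs concrete, here is a minimal pure-Python sketch of Sankoff parsimony. It is an illustration under stated assumptions, not pycea's implementation: the tree is a hypothetical child-list dict, a nested dict stands in for the pd.DataFrame (costs[row][column] is the cost of changing from the row state to the column state), and the function and variable names are made up for this example.

```python
import math


def sankoff(children, leaf_states, costs, root):
    """Return (minimum total cost, {node: state}) for one character."""
    states = list(costs)
    table = {}  # node -> {state: minimum cost of the subtree if node has that state}

    def up(node):
        kids = children.get(node, [])
        if not kids:
            # A leaf is free in its observed state and impossible in any other.
            table[node] = {s: 0.0 if s == leaf_states[node] else math.inf for s in states}
            return
        for kid in kids:
            up(kid)
        # costs[s][t] is the cost of changing from parent state s to child state t.
        table[node] = {
            s: sum(min(costs[s][t] + table[kid][t] for t in states) for kid in kids)
            for s in states
        }

    def down(node, state):
        assignment[node] = state
        for kid in children.get(node, []):
            best = min(states, key=lambda t: costs[state][t] + table[kid][t])
            down(kid, best)

    up(root)
    assignment = {}
    root_state = min(states, key=lambda s: table[root][s])
    down(root, root_state)
    return table[root][root_state], assignment


# Hypothetical toy tree: root -> (a, b), a -> (l1, l2), b -> (l3, l4).
children = {"root": ["a", "b"], "a": ["l1", "l2"], "b": ["l3", "l4"]}
leaf_states = {"l1": "A", "l2": "A", "l3": "B", "l4": "B"}
# Asymmetric costs: A -> B is cheap (1), B -> A is expensive (3).
costs = {"A": {"A": 0, "B": 1}, "B": {"A": 3, "B": 0}}

total_cost, states_by_node = sankoff(children, leaf_states, costs, "root")
print(total_cost, states_by_node)
```

Because A to B is cheaper than B to A in this toy matrix, the sketch places A at the root and a single A to B change on the branch leading to b.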
- Returns:
Returns None if copy=False, else returns a DataFrame with ancestral states. Sets the following fields for each key:
tdata.obst[tree].nodes[key_added] (float | Object | List[Object]) – Inferred ancestral states. List of states if data was an array.
Examples
Infer the expression of Krt20 and Cd74 based on their mean value in descendant cells:
>>> import pycea as py
>>> tdata = py.datasets.yang22()
>>> py.tl.ancestral_states(tdata, keys=["Krt20", "Cd74"], method="mean")
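Besides the built-in method names, method accepts any function that takes a list of values and returns a single value. The sketch below defines one such function. The name nan_median is hypothetical, and whether missing values reach the callable as None or NaN is an assumption, so the function guards against both.

```python
import math
from statistics import median


def nan_median(values):
    """Median of the non-missing values in a list; NaN if nothing is left."""
    kept = [v for v in values if v is not None and not math.isnan(v)]
    return median(kept) if kept else math.nan


example = nan_median([1.0, 5.0, float("nan"), 2.0])
print(example)

# Assumed usage, with parameter names taken from the signature above:
# py.tl.ancestral_states(tdata, keys="Krt20", method=nan_median, keys_added="Krt20_median")
```

Passing keys_added here would keep the result separate from any states already stored under the original key.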
Reconstruct ancestral character states using the Fitch-Hartigan algorithm:
>>> py.tl.ancestral_states(tdata, keys="characters", method="fitch_hartigan", missing_state=-1)
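For intuition about what a Fitch-Hartigan style reconstruction does with a missing state, here is a minimal pure-Python sketch on a toy tree. It is not pycea's implementation: the child-list tree representation, the function name, and the tie-breaking rule (smallest state) are assumptions made for illustration. On binary trees the bottom-up rule reduces to Fitch's intersection-or-union step.

```python
from collections import Counter


def fitch_hartigan(children, leaf_states, root, missing_state=None):
    """Return {node: state} for one character on a rooted tree."""
    candidates = {}  # node -> set of candidate states

    def up(node):
        kids = children.get(node, [])
        if not kids:
            state = leaf_states[node]
            # A missing leaf places no constraint on its ancestors.
            candidates[node] = set() if state == missing_state else {state}
            return
        counts = Counter()
        for kid in kids:
            up(kid)
            counts.update(candidates[kid])
        # Keep the states supported by the largest number of children.
        top = max(counts.values(), default=0)
        candidates[node] = {s for s, c in counts.items() if c == top}

    def down(node, parent_state):
        options = candidates[node]
        if parent_state in options or not options:
            # Inherit the parent's state when allowed (or when unconstrained).
            states[node] = parent_state
        else:
            states[node] = min(options)  # deterministic tie-break (assumption)
        for kid in children.get(node, []):
            down(kid, states[node])

    up(root)
    states = {}
    down(root, None)
    return states


# Hypothetical toy tree with one missing leaf (-1).
children = {"root": ["a", "b"], "a": ["l1", "l2"], "b": ["l3", "l4"]}
leaf_states = {"l1": 1, "l2": 1, "l3": 2, "l4": -1}

result = fitch_hartigan(children, leaf_states, "root", missing_state=-1)
print(result)
```

In this sketch the missing leaf l4 is ignored on the way up and then filled with its parent's state on the way down. Whether pycea imputes missing leaves the same way is not stated in this reference.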