Prepare datasets for the phylter
function. Detects possible issues,
discards genes if necessary, imputes missing data if any, and reorders row- and col-names.
For internal usage mostly.
Usage
PreparePhylterData(
X,
bvalue = 0,
distance = "patristic",
Norm = "median",
Norm.cutoff = 0.001,
gene.names = NULL,
verbose = TRUE
)
Arguments
- X
A list of phylogenetic trees (phylo object) or a list of distance matrices. Trees can have different number of leaves and matrices can have different dimensions. If this is the case, missing values are imputed.
- bvalue
If X is a list of trees, nodes with a support below 'bvalue' will be collapsed prior to the outlier detection.
- distance
If X is a list of trees, type of distance used to compute the pairwise matrices for each tree. Can be "patristic" (sum of branch lengths separating tips, the default) or nodal (number of nodes separating tips).
- Norm
Should the matrices be normalized and how. If "median", matrices are divided by their median, if "mean" they are divided by their mean, if "none", no normalization if performed. Normalizing ensures that fast-evolving (and slow-evolving) genes are not treated as outliers. Normalization by median is less sensitive to outlier values but can lead to errors if some matrices have a median value of 0. are not considered outliers.
- Norm.cutoff
Value of the median (if Norm="median") or the mean (if Norm="mean") below which matrices are simply discarded from the analysis. This prevents dividing by 0, and getting rid of genes that contain mostly branches of length 0 and are therefore uninformative anyway.
- gene.names
List of gene names used to rename elements in X. If NULL (the default), elements are named 1,2,..,length(X).
- verbose
If TRUE (the default), messages are written during the filtering process to get information of what is happening
Value
A list of class 'phylter' with the 'Initial' (before filtering) and 'Final' (after filtering) states, or a list of class 'phylterinitial' only, if InitialOnly=TRUE.
Examples
data(carnivora)
# transform trees to a named list of matrices with same dimensions
# and identical row and column orders and names
carnivora.clean<-PreparePhylterData(carnivora)
#>
#> Number of Genes: 125
#> Number of Species: 53
#> --------