
Applies a transformation to a dataset from a MultiDataSet object
Source: R/transformation.R
transform_dataset.Rd
Applies a transformation to a dataset from a MultiDataSet object.
Implemented transformations are: Variance Stabilising Normalisation (from the
vsn package), Variance Stabilising Transformation (from the DESeq2
package - only for count data), log-transformation, and appropriate feature-wise
normalisation through the bestNormalize package.
Usage
transform_dataset(
  mo_data,
  dataset,
  transformation,
  return_multidataset = FALSE,
  return_matrix_only = FALSE,
  verbose = TRUE,
  log_base = 2,
  pre_log_function = zero_to_half_min,
  method,
  ...
)
Arguments
- mo_data
A MultiDataSet-class object.
- dataset
Character, name of the dataset to transform.
- transformation
Character, transformation to be applied. Possible values are: vsn, vst-deseq2, logx, best-normalize-auto or best-normalize-manual. See Details.
- return_multidataset
Logical, should a MultiDataSet object with the original data replaced by the transformed data be returned? If FALSE, the output of the function depends on return_matrix_only. Default value is FALSE.
- return_matrix_only
Logical, should only the transformed matrix be returned? If TRUE, the function will return a matrix. If FALSE, the function instead returns a list with the transformed data as well as other information relevant to the transformation. Ignored if return_multidataset is TRUE. Default value is FALSE.
- verbose
Logical, should information about the transformation be printed? Default value is TRUE.
- log_base
Numeric, the base with respect to which logarithms are computed. Default value is 2. Only used if transformation = 'logx'.
- pre_log_function
Function that will be applied to the matrix before the log transformation (e.g. to apply an offset to the values to avoid issues with zeros). Default value is the zero_to_half_min() function. Only used if transformation = 'logx'.
- method
Character, if transformation = 'best-normalize-manual', which normalisation method should be applied. See possible values in transform_bestNormalise_manual(). Ignored for other transformations.
- ...
Further arguments passed to the bestNormalize::bestNormalize() function or the method function from the bestNormalize package.
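To illustrate how log_base and pre_log_function interact when transformation = 'logx', here is a minimal base R sketch. It is not the package's implementation: half_min_offset() is a hypothetical stand-in for zero_to_half_min(), written under the assumption (drawn from the function's name only) that zeros are replaced with half of the smallest non-zero value, and log_transform_sketch() is likewise an illustrative name.

```r
# Hypothetical stand-in for zero_to_half_min(): ASSUMES zeros are replaced
# by half of the smallest non-zero value in the matrix.
half_min_offset <- function(mat) {
  min_nonzero <- min(mat[mat != 0], na.rm = TRUE)
  mat[!is.na(mat) & mat == 0] <- min_nonzero / 2
  mat
}

# Illustrative log transformation: apply the pre-log function first,
# then take logarithms in the requested base.
log_transform_sketch <- function(mat,
                                 log_base = 2,
                                 pre_log_function = half_min_offset) {
  log(pre_log_function(mat), base = log_base)
}

# Toy matrix with made-up values, including zeros.
toy <- matrix(c(0, 4, 8, 2, 0, 16), nrow = 2, byrow = TRUE)

log2_toy <- log_transform_sketch(toy)                  # default base 2
log10_toy <- log_transform_sketch(toy, log_base = 10)  # base 10 instead
```

Without a pre-log step, the zeros in the toy matrix would become -Inf, which is the kind of issue the pre_log_function argument is meant to avoid.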
Value
- if return_multidataset = TRUE: a MultiDataSet::MultiDataSet object, in which the original data for the transformed dataset has been replaced.
- if return_multidataset = FALSE and return_matrix_only = TRUE: a matrix with the transformed data.
- if return_multidataset = FALSE and return_matrix_only = FALSE: a list with two elements, transformed_data containing a matrix of transformed data, and info_transformation containing information about the transformation (depends on the transformation applied).
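The precedence between return_multidataset and return_matrix_only can be summarised with the sketch below. It is an illustration only: shape_output() is a hypothetical helper, a placeholder string stands in for a real MultiDataSet object, and the actual internals of transform_dataset() may differ.

```r
# Hypothetical helper mirroring the documented precedence of the two flags.
shape_output <- function(transformed_data,
                         info_transformation,
                         return_multidataset = FALSE,
                         return_matrix_only = FALSE) {
  if (return_multidataset) {
    # Placeholder: the real function returns a MultiDataSet object here.
    return("MultiDataSet with the transformed dataset")
  }
  if (return_matrix_only) {
    return(transformed_data)
  }
  list(
    transformed_data = transformed_data,
    info_transformation = info_transformation
  )
}

mat <- matrix(1:4, nrow = 2)  # made-up transformed data
res <- shape_output(mat, info_transformation = list(note = "made-up info"))

# With both flags left at FALSE, the result is a two-element list.
names(res)
```

Note that return_matrix_only has no effect once return_multidataset is TRUE, which matches the description of the two arguments.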
Details
Currently implemented transformations and recommendations based on dataset type:
- vsn: Variance Stabilising Normalisation, implemented in the vsn::justvsn() function from the vsn package. This method was originally developed for microarray intensities. This transformation is recommended for microarray, metabolome, chemical or other intensity-based datasets. In practice, applies the transform_vsn() function.
- vst-deseq2: Variance Stabilising Transformation, implemented in the DESeq2::varianceStabilizingTransformation() function from the DESeq2 package. This method is applicable to count data only. This transformation is recommended for RNAseq or similar count-based datasets. In practice, applies the transform_vst() function.
- logx: log-transformation (defaults to log2, but the base can be specified). In practice, applies the transform_logx() function.
- best-normalize-auto: most appropriate normalisation method automatically selected from a number of options, implemented in the bestNormalize::bestNormalize() function from the bestNormalize package. This transformation is recommended for phenotypes that are each measured on different scales (since the transformation method selected will potentially be different across the features), preferably with a reasonable number of features (fewer than 100) to avoid large computation times. In practice, applies the transform_bestNormalise_auto() function.
- best-normalize-manual: applies the same transformation (specified through the method argument) to each feature of a dataset. This transformation is recommended for phenotype data in which the different phenotypes are measured on the same scale.
The different normalisation methods are:
  - "arcsinh_x": data is transformed as log(x + sqrt(x^2 + 1));
  - "boxcox": Box Cox transformation;
  - "center_scale": data is centered and scaled;
  - "exp_x": data is transformed as exp(x);
  - "log_x": data is transformed as log_b(x + a) (a and b either selected automatically per variable or passed as arguments);
  - "orderNorm": Ordered Quantile technique;
  - "sqrt_x": data is transformed as sqrt(x + a) (a selected automatically per variable or passed as argument);
  - "yeojohnson": Yeo-Johnson transformation.
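The closed-form methods listed above can be written out in base R as a quick sanity check of the formulas. This is a sketch on made-up values, not the bestNormalize implementation: the constants a and b are picked by hand here rather than selected automatically, and "center_scale" is assumed to use the mean and standard deviation.

```r
x <- c(0.5, 1, 2, 10)  # made-up positive values

# "arcsinh_x": log(x + sqrt(x^2 + 1)), i.e. the inverse hyperbolic sine
arcsinh_vals <- log(x + sqrt(x^2 + 1))

# "exp_x": exponential of the data
exp_vals <- exp(x)

# "log_x": log_b(x + a), here with hand-picked a = 1 and b = 10
log_vals <- log(x + 1, base = 10)

# "sqrt_x": sqrt(x + a), here with hand-picked a = 0
sqrt_vals <- sqrt(x + 0)

# "center_scale": ASSUMES centring by the mean and scaling by the
# standard deviation
centred <- (x - mean(x)) / sd(x)

# The arcsinh formula agrees with base R's asinh()
all.equal(arcsinh_vals, asinh(x))
```

The remaining methods (Box Cox, Yeo-Johnson and Ordered Quantile) involve parameter estimation or rank-based mapping and are not reproduced in this sketch.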