Plots omics data vs sample covariate — plot_data_covariate

Plots the values of a given set of features against a sample covariate from the samples metadata. Depending on whether the covariate is continuous or discrete, the function generates either a scatterplot or a violin plot.
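As a rough illustration of the two plot types, the sketch below builds each one directly with ggplot2 on a made-up data frame. It is not the package's implementation: the toy data, the helper name `sketch_covariate_plot()`, and the use of `is.numeric()` to choose between the two plot types are all assumptions made for illustration.

```r
library(ggplot2)

# Toy long-format data: one row per (feature, sample) pair.
# All column names and values are hypothetical.
set.seed(1)
toy <- data.frame(
  feature = rep(c("feature_A", "feature_B"), each = 20),
  value = rnorm(40),
  day_on_feed = rep(seq(10, 200, length.out = 20), times = 2),
  feedlot = rep(c("F1", "F2"), times = 20)
)

# Hypothetical helper: picks the plot type from the covariate's type
sketch_covariate_plot <- function(df, covariate) {
  p <- ggplot(df, aes(x = .data[[covariate]], y = value)) +
    facet_wrap(~feature, scales = "free_y")

  if (is.numeric(df[[covariate]])) {
    # Continuous covariate: scatterplot with a smoothing curve and its confidence band
    p +
      geom_point() +
      geom_smooth(method = "loess", formula = y ~ x, se = TRUE)
  } else {
    # Discrete covariate: violin plot with a narrow boxplot and the raw points
    p +
      geom_violin() +
      geom_boxplot(width = 0.2) +
      geom_point()
  }
}

p_continuous <- sketch_covariate_plot(toy, "day_on_feed")
p_discrete <- sketch_covariate_plot(toy, "feedlot")
```

Printing `p_continuous` or `p_discrete` renders the corresponding plot, with one facet per feature.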

Usage

plot_data_covariate(
  mo_data,
  covariate,
  features,
  samples = NULL,
  only_common_samples = FALSE,
  colour_by = NULL,
  shape_by = NULL,
  point_alpha = 1,
  add_se = TRUE,
  add_boxplot = TRUE,
  ncol = NULL,
  label_cols = NULL,
  truncate = NULL
)

Arguments

mo_data: A MultiDataSet::MultiDataSet object.
covariate: Character, name of column in one of the samples metadata tables from mo_data to use as x-axis in the plot.
features: Character vector, the ID of features to show in the plot.
samples: Character vector, the ID of samples to include in the plot. If NULL (default), all samples in the corresponding dataset will be used.
only_common_samples: Logical, whether only samples that are present in all datasets should be plotted. Default value is FALSE.
colour_by: Character, name of column in one of the samples metadata tables from mo_data to use to colour the observations in the plot. Default value is NULL.
shape_by: Character, name of column in one of the samples metadata tables from mo_data to use as shape for the observations in the plot. Default value is NULL.
point_alpha: Numeric between 0 and 1, the opacity of the points in the plot (with 1 = fully opaque, and 0 = fully transparent). Default value is 1.
add_se: Logical, should a confidence interval be drawn around the smoothing curves for numerical covariates? Default value is TRUE.
add_boxplot: Logical, should a boxplot be drawn on top of the points for categorical covariates? Default value is TRUE.
ncol: Integer, number of columns in the faceted plot. Default value is NULL.
label_cols: Character or named list of character, giving for each dataset the name of the column in the corresponding features metadata to use as label. If one value, will be used for all datasets. If list, the names must correspond to the names of the datasets in mo_data. If a dataset is missing from the list or no value is provided, feature IDs will be used as labels. Alternatively, use feature_id to get the feature IDs as labels.
truncate: Integer, width to which the labels should be truncated (to avoid issues with very long labels in plots). If NULL (default value), no truncation will be performed.
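The rules for label_cols can be easier to follow when written out as code. The sketch below uses a hypothetical helper, `resolve_label_cols()`, which is not part of the package; it only mirrors the behaviour described above in base R. Raising an error for names that match no dataset is an assumption, and the dataset and column names are made up.

```r
# Hypothetical helper, not part of the package: resolves `label_cols`
# into one column name per dataset, following the rules described above.
resolve_label_cols <- function(label_cols, dataset_names) {
  # Default: feature IDs are used as labels for every dataset
  res <- setNames(rep("feature_id", length(dataset_names)), dataset_names)

  if (is.null(label_cols)) {
    return(res)
  }

  if (is.character(label_cols) && length(label_cols) == 1) {
    # A single value applies to all datasets
    res[] <- label_cols
    return(res)
  }

  # Named list: names must match dataset names (erroring here is an assumption)
  unknown <- setdiff(names(label_cols), dataset_names)
  if (length(unknown) > 0) {
    stop("Unknown dataset(s) in label_cols: ", paste(unknown, collapse = ", "))
  }

  # Datasets missing from the list keep the feature ID default
  res[names(label_cols)] <- unlist(label_cols)
  res
}

# Dataset and column names below are made up for illustration
datasets <- c("snps", "rnaseq", "metabolome")
labels <- resolve_label_cols(
  list(rnaseq = "gene_name", metabolome = "compound"),
  datasets
)
```

Here `labels` holds one entry per dataset, with the dataset missing from the list falling back to feature IDs.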

Value

A ggplot object.
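Because the return value is a ggplot, it can be adjusted further with the usual ggplot2 `+` syntax. The sketch below uses a toy plot as a stand-in for the function's output; the data, title and axis labels are placeholders.

```r
library(ggplot2)

# Stand-in for the object returned by plot_data_covariate(); built from toy data
p <- ggplot(data.frame(x = 1:5, y = c(2, 4, 3, 5, 6)), aes(x, y)) +
  geom_point()

# Standard ggplot2 customisation applies to any ggplot object
p_custom <- p +
  labs(title = "Feature values against covariate", x = "Covariate", y = "Value") +
  theme_bw() +
  theme(legend.position = "bottom")
```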

Examples

if (FALSE) { # \dontrun{
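## `mo_set` is assumed to be an existing MultiDataSet object
## Selecting at random 3 features from each dataset
## (map() comes from the purrr package)
random_features <- get_features(mo_set) |>
  purrr::map(sample, size = 3, replace = FALSE) |>
  unlist() |>
  unname()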

## Plotting feature values against a discrete sample covariate
plot_data_covariate(
  mo_set,
  "feedlot",
  random_features,
  only_common_samples = TRUE,
  colour_by = "status",
  shape_by = "geno_comp_cluster"
)

## Plotting feature values against a continuous sample covariate
plot_data_covariate(
  mo_set,
  "day_on_feed",
  random_features,
  only_common_samples = TRUE,
  colour_by = "status",
  shape_by = "geno_comp_cluster"
)
} # }