
Creates a list of targets to perform feature preselection on datasets from a MultiDataSet object with sPLS-DA (from the mixOmics package).


  to_keep_props = NULL,
  target_name_prefix = "",
  filtered_set_target_name = NULL,
  multilevel = NULL,
  seed_perf = NULL,
  seed_run = NULL,



Symbol, the name of the target containing the MultiDataSet object.


group
Character, the column name in the samples information data-frame to use as samples group.


to_keep_ns
Named integer vector, the number of features to retain in each dataset to be prefiltered (names should correspond to a dataset name). Each value should be less than the number of features in the corresponding dataset. Set to NULL in order to use to_keep_props instead.


to_keep_props
Named numeric vector, the proportion of features to retain in each dataset to be prefiltered (names should correspond to a dataset name). Each value should be > 0 and < 1. Will be ignored if to_keep_ns is not NULL.


target_name_prefix
Character, a prefix to add to the name of the targets created by this target factory. Default value is "".


filtered_set_target_name
Character, the name of the final target containing the filtered MultiDataSet object. If NULL, a name will automatically be supplied. Default value is NULL.


multilevel
Character vector of length 1 or 3 to be used as information about repeated measurements. See get_input_splsda() for details. Default value is NULL (no repeated measurements).


seed_perf
Named integer vector, the seed to use for the perf_splsda() function for each dataset. The length and names should match those of to_keep_ns or to_keep_props. If not named, the values will be used in order of the datasets in to_keep_ns or to_keep_props. Default value is NULL, i.e. no seed is set.


seed_run
Named integer vector, the seed to use for the run_splsda() function for each dataset. The length and names should match those of to_keep_ns or to_keep_props. If not named, the values will be used in order of the datasets in to_keep_ns or to_keep_props. Default value is NULL, i.e. no seed is set.


...
Further arguments passed to the perf_splsda() function.
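
The way to_keep_ns, to_keep_props and unnamed seed vectors relate to each other can be illustrated with a small base R sketch. The helpers resolve_n_to_keep() and match_seeds() and the feature counts are hypothetical and not part of the package; rounding the proportion up with ceiling() is an assumption, and the package may resolve these values differently.

```r
## Hypothetical helper: number of features to retain per dataset.
## n_features is a named vector of dataset sizes (made-up values below).
resolve_n_to_keep <- function(n_features, to_keep_ns = NULL, to_keep_props = NULL) {
  if (!is.null(to_keep_ns)) {
    ## to_keep_ns takes precedence; to_keep_props is ignored
    stopifnot(all(names(to_keep_ns) %in% names(n_features)))
    stopifnot(all(to_keep_ns < n_features[names(to_keep_ns)]))
    return(to_keep_ns)
  }
  stopifnot(!is.null(to_keep_props))
  stopifnot(all(names(to_keep_props) %in% names(n_features)))
  stopifnot(all(to_keep_props > 0 & to_keep_props < 1))
  ## Assumption: proportions are converted to counts by rounding up
  ceiling(to_keep_props * n_features[names(to_keep_props)])
}

## Hypothetical helper: attach dataset names to an unnamed seed vector,
## using the order of the datasets to be filtered.
match_seeds <- function(seeds, dataset_names) {
  if (is.null(seeds)) {
    return(NULL) ## no seed is set
  }
  stopifnot(length(seeds) == length(dataset_names))
  if (is.null(names(seeds))) {
    names(seeds) <- dataset_names
  }
  stopifnot(setequal(names(seeds), dataset_names))
  seeds[dataset_names]
}

n_features <- c(rnaseq = 20000L, metabolome = 1200L) ## made-up sizes
to_keep_props <- c(rnaseq = 0.25, metabolome = 0.5)

n_keep <- resolve_n_to_keep(n_features, to_keep_props = to_keep_props)
seeds <- match_seeds(c(123L, 456L), names(n_keep))
```

Here n_keep holds one count per dataset to be filtered, and seeds carries the same names in the same order, mirroring the requirement that seed_perf and seed_run match the datasets listed in to_keep_ns or to_keep_props.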


A list of target objects. With target_name_prefix = "" and filtered_set_target_name = NULL, the following targets are created:

  • splsda_spec: generates a grouped tibble where each row corresponds to one dataset to be filtered, with the columns specifying each dataset name, and associated values from to_keep_ns and to_keep_props.

  • individual_splsda_input: a dynamic branching target that runs the get_input_splsda() function for each dataset.

  • individual_splsda_perf: a dynamic branching target that runs the perf_splsda() function for each dataset.

  • individual_splsda_run: a dynamic branching target that runs the run_splsda() function for each dataset, using the results from individual_splsda_perf to guide the number of latent components to construct.

  • filtered_set_slpsda: a target to retain from the original MultiDataSet object only features selected in each sPLS-DA run.
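
The effect of target_name_prefix and filtered_set_target_name on the target names listed above can be sketched in base R. This assumes that the prefix is simply prepended to each default name and that a non-NULL filtered_set_target_name is used as given for the final target; the helper splsda_target_names() is hypothetical and not part of the package.

```r
## Hypothetical helper: names of the targets created by the factory.
splsda_target_names <- function(target_name_prefix = "",
                                filtered_set_target_name = NULL) {
  ## Default names, as listed in the documentation above
  default_names <- c(
    "splsda_spec",
    "individual_splsda_input",
    "individual_splsda_perf",
    "individual_splsda_run",
    "filtered_set_slpsda"
  )

  ## Assumption: the prefix is pasted in front of every default name
  target_names <- paste0(target_name_prefix, default_names)

  ## Assumption: a user-supplied name replaces the final target name as is
  if (!is.null(filtered_set_target_name)) {
    target_names[length(target_names)] <- filtered_set_target_name
  }

  target_names
}

default_targets <- splsda_target_names()
custom_targets <- splsda_target_names("rnaseq_", "mo_set_filtered")
```

A prefix is mainly useful when the factory is called more than once in the same pipeline, since target names must be unique.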


if (FALSE) { # \dontrun{
## in the _targets.R

  ## add code here to load the different datasets

  ## the following target creates a MultiDataSet object from previously
  ## created omics sets (geno_set, trans_set, etc)
    create_multiomics_set(geno_set, trans_set, metabo_set, pheno_set)
    group = "outcome_group",
    to_keep_ns = c("rnaseq" = 1000, "metabolome" = 500),
    filtered_set_target_name = "mo_set_filtered",
    folds = 10 ## example of an argument passed to perf_splsda

  ## Another example using to_keep_props
    group = "outcome_group",
    to_keep_ns = NULL,
    to_keep_props = c("rnaseq" = 0.3, "metabolome" = 0.5),
    filtered_set_target_name = "mo_set_filtered",
    folds = 10 ## example of an argument passed to perf_splsda
} # }