Trains a predictive model of expression for a given gene, using the top mediators as fixed effects, and assesses in-sample performance with cross-validation.

trainExpression(
  geneInt,
  snps,
  snpLocs,
  mediator,
  medLocs,
  covariates,
  dimNumeric,
  qtlFull,
  h2Pcutoff = 0.1,
  numMed = 5,
  seed,
  k,
  cisDist = 5e+05,
  parallel = T,
  prune = F,
  windowSize = 50,
  numSNPShift = 5,
  ldThresh = 0.5,
  cores,
  verbose = T,
  LDMS = F,
  modelDir,
  ldScrRegion = 200,
  snpAnnot = NULL
)

Arguments

geneInt

character, identifier for gene of interest

snps

data frame, SNP dosages

snpLocs

data frame, MatrixEQTL locations for SNPs

mediator

data frame, mediator intensities

medLocs

data frame, MatrixEQTL locations for mediators

covariates

data frame, covariates

qtlFull

data frame, all QTLs (cis and trans) between mediators and genes

h2Pcutoff

numeric, P-value cutoff for heritability

numMed

integer, number of top mediators to include

seed

integer, random seed for splitting

k

integer, number of training-test splits (cross-validation folds)

parallel

logical, TRUE/FALSE to run glmnet in parallel

prune

logical, TRUE/FALSE to LD prune the genotypes

windowSize

integer, window size for PLINK pruning

numSNPShift

integer, number of SNPs by which the window shifts at each step of PLINK pruning

ldThresh

numeric, LD threshold for PLINK pruning

cores

integer, number of parallel cores

outputAll

logical, include mediator information
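
Details

The default cisDist = 5e+05 in the usage suggests a 500 kb cis window around the gene, although its exact role is not described on this page. As a rough illustration only, the sketch below selects SNPs within cisDist of a gene's boundaries from MatrixEQTL-style location tables. The geneLocs table, the cisSnps helper, and all toy values are hypothetical and are not part of the package; the column names follow the usual MatrixEQTL layout (snpid, chr, pos for SNPs and geneid, chr, left, right for genes).

```r
# Toy MatrixEQTL-style location tables (all values are made up).
snpLocs <- data.frame(
  snpid = paste0("rs", 1:6),
  chr   = c(1, 1, 1, 1, 2, 1),
  pos   = c(100000, 900000, 1400000, 1600000, 1000000, 2500000),
  stringsAsFactors = FALSE
)
geneLocs <- data.frame(
  geneid = "GENE1", chr = 1, left = 1000000, right = 1050000,
  stringsAsFactors = FALSE
)

cisDist <- 5e+05  # same default as in the usage above

# Hypothetical helper (an assumption, not a package function):
# return IDs of SNPs on the gene's chromosome that lie within
# cisDist base pairs of the gene's left or right boundary.
cisSnps <- function(geneInt, snpLocs, geneLocs, cisDist) {
  g <- geneLocs[geneLocs$geneid == geneInt, ]
  keep <- snpLocs$chr == g$chr &
    snpLocs$pos >= g$left - cisDist &
    snpLocs$pos <= g$right + cisDist
  snpLocs$snpid[keep]
}

cisIds <- cisSnps("GENE1", snpLocs, geneLocs, cisDist)
print(cisIds)
```

SNPs outside this window, or on other chromosomes, would be candidates for the distal (mediator-driven) part of the model rather than the local part, under the same assumption.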

Value

final model for the gene, along with the cross-validation R-squared (CV R2) and predicted values
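
Examples

To make the roles of seed and k concrete, here is a minimal base-R sketch of k-fold cross-validated R2 on simulated data. It is not the package's implementation: the parallel argument indicates the function fits glmnet models, whereas this sketch uses lm as a simple stand-in. The cvR2 helper and the simulated snp1, snp2, and med1 predictors are hypothetical.

```r
# Hypothetical helper (an assumption, not a package function):
# k-fold cross-validated predictions and R2 for a linear model.
cvR2 <- function(y, X, k, seed) {
  set.seed(seed)  # makes the fold assignment reproducible
  folds <- sample(rep(seq_len(k), length.out = length(y)))
  pred <- numeric(length(y))
  dat <- data.frame(y = y, X)
  for (i in seq_len(k)) {
    test <- folds == i
    # Fit on the training folds, predict the held-out fold
    fit <- lm(y ~ ., data = dat[!test, , drop = FALSE])
    pred[test] <- predict(fit, newdata = dat[test, , drop = FALSE])
  }
  # CV R2 taken here as the squared correlation of observed and predicted
  list(pred = pred, r2 = cor(y, pred)^2)
}

# Simulated data: two SNP-like predictors and one mediator-like predictor
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), n, 3,
            dimnames = list(NULL, c("snp1", "snp2", "med1")))
y <- as.numeric(X %*% c(0.5, -0.3, 0.8) + rnorm(n, sd = 0.5))

res <- cvR2(y, X, k = 5, seed = 1218)
print(res$r2)
```

Calling cvR2 again with the same seed gives identical folds and predictions, which is the practical reason for exposing seed as an argument.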