ClassifyGxT

About

ClassifyGxT is a method for classifying gene-by-treatment (GxT) interactions using Bayesian model selection (BMS). The method is primarily designed for molecular count phenotypes, such as gene expression and chromatin accessibility, although it can be used for other types of phenotypes. It takes as input a list of feature-SNP pairs (or SNPs) with significant GxT interactions that have already been identified by a standard method and assigns posterior probability of different types of GxT interactions to each of the feature-SNP pairs (or SNPs). See Overview below for the types of GxT interactions.

See Get started and Reference for usage instructions.

Publication

Yuriko Harigaya, Nana Matoba, Brandon D. Le, Jordan M. Valone, Jason L. Stein, Michael I. Love*, William Valdar*. “Probabilistic classification of gene-by-treatment interactions on molecular count phenotypes.” doi: https://doi.org/10.1101/2024.08.03.605142 (* These authors contributed equally to this work.)

Overview

In the simplest form, our BMS framework can be described as follows. For n samples indexed by i, we consider a linear model shown below. y_i is the phenotype, g_i is the genotype, and t_i is the indicator variable as to whether the sample i was treated. ε_i is the residual error (noise) and is assumed to follow a Normal distributiion with a zero mean and variance σ². β₀ is the intercept. β_g, β_t, and β_g × t are the genotype, treatment, and GxT interaction effects, respectively. We use the m_g, m_t, and m_g × t indicator variables to specify the inclusion and exclusion of the genotype, treatment, and GxT interaction terms, respectively. The m_j vector refers to the j-th model (j = 1, …, 8). The eight models represent different types of GxT interactions.

To account for the inherent nonlinear relationship between the genotype and the tranformed molecular count phenotype, we use nonlinear regression. The eight models can be formulated in the same way using the m vectors. See our manuscript (Publication) for prior specification, posterior computation, and other details.

Although we primarily use the m vector notations to specify the types of GxT interaction, it is also useful to consider the following nomenclatures.

m vector	Nomenclature
(0,0,0)	No effect
(1,0,0)	Genotype main effect only
(0,1,0)	Treatment main effect only
(1,1,0)	Genotype and treatment main effects only
(0,0,1)	Treatment-induced genotype effect
(1,0,1)	Treatment-altered genotype effect
(0,1,1)	Treatment main effect and treatment-induced genotype effect
(1,1,1)	Genotype and treatment main effects and interaction

Data input

ClassifyGxT requires individual genotype and phenotype data for a list of feature-SNP pairs (or SNPs) with significant GxT interactions that have already been identified by a standard method, such as TensorQTL. See Get started for the format of input data. See Using ClassifyGxT with TensorQTL under Articles for an instruction on formatting input data for ClassifyGxT from typical input/output data for TensorQTL.

Installation

The classifygxt R package can be installed using devtools.

if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("yharigaya/classifygxt")

About

Publication

Overview

Data input

Installation

Links

License

Citation

Developers