Relation to StructLMM model
CellRegMap builds on and extends the structured linear mixed model (StructLMM), proposed by Moore*, Casale* et al. (2018) in the context of population genetics. StructLMM tests for GxE effects across multiple environmental exposures at once, extending traditional interaction models, which can consider only one environment at a time.
However, StructLMM is not designed to deal with repeated or related samples. Thus, it is not well suited to model longitudinal data (where multiple observations from the same individuals are collected over time) or single-cell data (where multiple cells are collected from the same individual), nor can it optimally model population stratification and cryptic relatedness, which have been shown to be prevalent in population genetic data. CellRegMap overcomes this by including an additional random effect term that models relatedness across samples.
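For orientation, the difference between the two models can be sketched as follows. The notation is an assumption of this sketch rather than something defined on this page: g is the genotype vector, W the covariate matrix, Σ the environmental covariance (for example EE^T), and u stands for the additional relatedness random effect. The exact covariance of u is defined in the CellRegMap paper and is deliberately left abstract here.

```latex
% StructLMM (symbols are assumptions of this sketch)
\begin{align}
\mathbf{y} &= \mathbf{W}\boldsymbol{\alpha} + \mathbf{g}\beta_G
  + \mathbf{g}\odot\boldsymbol{\beta}_{G\times E} + \mathbf{e} + \boldsymbol{\varepsilon}, \\
\boldsymbol{\beta}_{G\times E} &\sim \mathcal{N}(\mathbf{0},\, \sigma^2_{G\times E}\boldsymbol{\Sigma}), \quad
\mathbf{e} \sim \mathcal{N}(\mathbf{0},\, \sigma^2_{e}\boldsymbol{\Sigma}), \quad
\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0},\, \sigma^2_{n}\mathbf{I})
\end{align}

% CellRegMap: the same terms plus a random effect u that models relatedness across samples
\begin{align}
\mathbf{y} &= \mathbf{W}\boldsymbol{\alpha} + \mathbf{g}\beta_G
  + \mathbf{g}\odot\boldsymbol{\beta}_{G\times E} + \mathbf{e} + \mathbf{u} + \boldsymbol{\varepsilon}
\end{align}
```

Read this way, dropping u from the second model leaves exactly the first.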
Nevertheless, the original StructLMM can still be run using CellRegMap by setting the repeatedness term hK to None, i.e.:
hK=None
and then running the model similarly to what is described in the usage page, i.e.:
from cellregmap import run_interaction
pv_slmm = run_interaction(y=y, W=W, E=E, G=g, hK=None)[0]
print(f'StructLMM interaction test p-value: {pv_slmm}')
where we note that "E" is used here instead of "C", as this model will typically be applied in population genetics to test for interactions with environmental exposures, as opposed to the cellular contexts generally considered in applications of CellRegMap to scRNA-seq data.
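To try the call end to end, toy inputs can be simulated with NumPy. The following is a minimal sketch, not part of the package: the helper names simulate_inputs and run_structlmm are hypothetical, the array shapes (y as n x 1, W as n x 1, E as n x k, g as n x 1) are assumptions, and the import assumes the package is importable as cellregmap.

```python
import numpy as np


def simulate_inputs(n=30, k=4, seed=0):
    """Simulate toy inputs with the shapes assumed for run_interaction.

    n: number of samples (individuals), k: number of environments.
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal((n, 1))        # phenotype vector, n x 1
    W = np.ones((n, 1))                    # covariates: intercept only
    E = rng.standard_normal((n, k))        # environment matrix, n x k
    # toy SNP coded as 0/1/2 minor-allele counts, stored as floats
    g = rng.binomial(2, 0.3, size=(n, 1)).astype(float)
    return y, W, E, g


def run_structlmm(y, W, E, g):
    """Run the interaction test without a repeatedness term (hK=None)."""
    # Imported lazily so the simulation helper works without the package
    # installed; the import path is an assumption of this sketch.
    from cellregmap import run_interaction

    return run_interaction(y=y, W=W, E=E, G=g, hK=None)[0]


y, W, E, g = simulate_inputs()
print(y.shape, W.shape, E.shape, g.shape)

# With cellregmap installed, the test itself would then be:
# pv_slmm = run_structlmm(y, W, E, g)
```

Because the inputs are random noise, any p-value obtained this way is only a smoke test of the call and carries no biological meaning.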