APSampler: Algorithm for Efficient Searches in the Space of Genetic Variation
In recent years, the number of studies focusing on the genetic basis of common disorders with a complex mode of inheritance, in which multiple genes of small effect are involved, has been steadily increasing. An improved methodology to identify the cumulative contribution of several polymorphous genes would accelerate our understanding of their importance in disease susceptibility and our ability to develop new treatments. A critical bottleneck is the inability of standard statistical approaches, developed for relatively modest predictor sets, to achieve power in the face of the enormous growth in our knowledge of genomics. The inability is due to the combinatorial complexity arising in searches for multiple interacting genes. Similar “curse of dimensionality” problems have arisen in other fields, and Bayesian statistical approaches coupled to Markov chain Monte Carlo (MCMC) techniques have led to significant improvements in understanding. We present here an algorithm, APSampler, for the exploration of potential combinations of allelic variations positively or negatively associated with a disease or with a phenotype. The algorithm relies on the rank comparison of phenotype for individuals with and without specific patterns (i.e., combinations of allelic variants) isolated in genetic backgrounds matched for the remaining significant patterns. It constructs a Markov chain to sample only potentially significant variants, minimizing the potential of large data sets to overwhelm the search.
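To make the rank-comparison idea concrete, here is a minimal Python sketch. It is not the APSampler implementation: it only illustrates how the phenotype of individuals who carry a pattern can be compared by rank with the phenotype of those who do not, using a standard Mann-Whitney U statistic as a stand-in. The data layout (a dict of locus to allele per individual), the function names, and the toy data are all assumptions made for illustration, and the matching of genetic backgrounds and the Markov chain search are left out.

```python
from typing import Dict, List, Sequence

# Hypothetical data layout: one dict per individual, locus name -> allele label.
Genotype = Dict[str, str]
# A pattern is a combination of allelic variants, in the same layout.
Pattern = Dict[str, str]


def carries(genotype: Genotype, pattern: Pattern) -> bool:
    """True if the individual has every allele named in the pattern."""
    return all(genotype.get(locus) == allele for locus, allele in pattern.items())


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_u(genotypes: Sequence[Genotype],
               phenotypes: Sequence[float],
               pattern: Pattern) -> float:
    """Mann-Whitney U of pattern carriers versus non-carriers."""
    ranks = average_ranks(phenotypes)
    carrier_ranks = [r for g, r in zip(genotypes, ranks) if carries(g, pattern)]
    n1 = len(carrier_ranks)
    return sum(carrier_ranks) - n1 * (n1 + 1) / 2


# Toy data, invented for this example only.
genotypes = [
    {"A": "1", "B": "2"},
    {"A": "1", "B": "2"},
    {"A": "1", "B": "1"},
    {"A": "2", "B": "2"},
    {"A": "2", "B": "1"},
]
phenotypes = [5.0, 4.0, 1.0, 2.0, 3.0]
pattern = {"A": "1", "B": "2"}

u = rank_sum_u(genotypes, phenotypes, pattern)
print(u)  # 6.0
```

In this toy data set the two carriers have the two highest phenotype values, so U reaches its maximum of 2 × 3 = 6. A value near half of that maximum would suggest no rank difference between the groups.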
The source code and the software documentation are available at Google Code under the MIT Artistic License.
Here is a more detailed description of the model for multiallelic association that we use in this work. A description of the validation framework is coming soon.
The APSampler algorithm was originally presented in our Genetics, 2005 publication.
Several studies, e.g., BMC Med Genet 2006, Mol Biol (Mosk) 2008, Mol Biol (Mosk) 2009, and Pharmacogenomics (2009), have used APSampler.
The project is currently run in collaboration with the Laboratory of Bioinformatics, Research Institute for Genetics and Selection of Industrial Microorganisms, Moscow, RF, and with the Department of Molecular Biology and Biotechnology, Russian State Medical University, Moscow, RF.
Feel free to contact us.