Developed in collaboration with Dr. Christoph Lange of Harvard's School of Public Health, the SNP & Variation Suite PBAT Analysis add-on delivers an exclusive and extensive array of advanced statistical routines for the design and analysis of family-based association studies.
The PBAT add-on to SVS streamlines the import of most family-based data formats, including PLINK BED, PED, and TPED files as well as FBAT Pedigree and FBAT Phenotype files. It also makes it easy to join and merge this data once imported, ensuring that all genotype, phenotype, and pedigree data is formatted properly for analysis. Further, SVS supports parallel processing for most PBAT analyses, which significantly speeds up computationally intensive processes.
Pre-Study Power Calculations
SVS PBAT Analysis capabilities for power calculations are a software implementation of the approaches to analytical power calculations for family-based association tests (FBATs) by [Lange 2002a, Lange 2002b, Lange 2002c]. They allow you to assess the power of FBATs for a large variety of designs:
- Dichotomous/binary and continuous traits
- Missing parental information
- Combination of different family-types and different ascertainment conditions
- Combinations of different family-types
- Different genetic models
- Different ascertainment conditions for the first and second proband
- Marker and disease locus are not identical
- Multiple offspring per family
- Verification of all power calculations by Monte-Carlo simulations
Power calculations also allow you to assess the power of non-family-based association test designs for both case/control studies and studies based on quantitative traits.
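To illustrate what verifying a power calculation by Monte-Carlo simulation means, here is a minimal Python sketch for the simplest family-based test, the TDT. It is not PBAT's implementation: the function name `simulated_tdt_power`, the parameter `p_transmit` (the probability that a heterozygous parent transmits the tested allele), and the fixed 5% significance level are all assumptions made for illustration.

```python
import random

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_1DF_CRIT_05 = 3.841


def simulated_tdt_power(n_het_parents, p_transmit, n_reps=2000, seed=1):
    """Estimate TDT power by simulation (illustrative sketch only).

    n_het_parents: number of heterozygous parents in the study (assumed known).
    p_transmit:    assumed probability that such a parent transmits the tested
                   allele; 0.5 corresponds to the null hypothesis.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # b = transmissions of the tested allele, c = non-transmissions.
        b = sum(rng.random() < p_transmit for _ in range(n_het_parents))
        c = n_het_parents - b
        statistic = (b - c) ** 2 / (b + c) if b + c else 0.0
        if statistic > CHI2_1DF_CRIT_05:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    print("null rejection rate:", simulated_tdt_power(200, 0.50))
    print("power at p = 0.65:  ", simulated_tdt_power(200, 0.65))
```

Under the null (`p_transmit = 0.5`) the rejection rate should sit near the nominal 5%, which is a quick sanity check on any simulation of this kind.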
Family-Based SNP Association
SVS incorporates PBAT's comprehensive and powerful tool set for family-based SNP association. PBAT offers a unified approach to the FBAT statistic, a generalization of the transmission disequilibrium test (TDT), to cover different genetic models, tests of different sampling designs, tests involving different disease phenotypes, tests with missing parents and tests of different null hypotheses, all in the same framework.
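As background for the TDT that the FBAT statistic generalizes, the following Python sketch counts allele transmissions from heterozygous parents in complete trios and computes the classical TDT chi-square, (b - c)^2 / (b + c). The genotype coding (0, 1, or 2 copies of the tested allele) and the function names are assumptions for illustration; this is not PBAT's code and it handles only biallelic markers with both parents genotyped.

```python
def tdt_counts(trios):
    """Count transmissions (b) and non-transmissions (c) of the tested allele.

    trios: iterable of (father, mother, child) genotypes, each coded as the
           number of copies of the tested allele (0, 1, or 2).
    """
    b = c = 0
    for father, mother, child in trios:
        n_het = sum(1 for g in (father, mother) if g == 1)
        if n_het == 0:
            continue  # only heterozygous parents are informative
        # Copies of the tested allele that homozygous parents must pass on.
        from_homozygous = sum(g // 2 for g in (father, mother) if g != 1)
        from_het = child - from_homozygous
        if not 0 <= from_het <= n_het:
            continue  # Mendelian inconsistency: skip this trio
        b += from_het
        c += n_het - from_het
    return b, c


def tdt_chi_square(b, c):
    """Classical TDT statistic, approximately chi-square with 1 df under the null."""
    return (b - c) ** 2 / (b + c) if b + c else 0.0


if __name__ == "__main__":
    example = [(1, 0, 1), (1, 0, 1), (1, 2, 2), (1, 1, 2), (1, 0, 0), (0, 0, 0)]
    b, c = tdt_counts(example)
    print(b, c, tdt_chi_square(b, c))
```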
Family-Based CNV Association
PBAT also supports the testing for copy-number variation (CNV) in a family-based setting. All robustness properties of the FBAT approach are maintained as in PBAT for SNP analysis. In addition, all previously-developed FBAT extensions, including FBATs for time-to-onset, multivariate FBATs, and FBAT-testing strategies, can be directly transferred to the analysis of CNVs.
The latest version of Golden Helix PBAT incorporates a novel test that assesses the genotyping quality of individual probands in family-based association studies. Published in PLoS Genetics [Fardo 2009], these tests are "ideally suited as the final layer of quality control filters in the cleaning process of genome-wide association studies." You can also assess Mendelian errors, Hardy-Weinberg equilibrium, and call rates per marker.
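The per-marker checks mentioned above (Mendelian errors, Hardy-Weinberg equilibrium, call rate) can be sketched in a few lines of Python. This does not reproduce the proband-level test of [Fardo 2009] or any SVS function; the function names, the 0/1/2 genotype coding, and the use of `None` for a missing call are assumptions for illustration.

```python
def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from the parents' genotypes.

    Genotypes are coded as the number of copies of one allele (0, 1, or 2).
    """
    gametes = {0: (0,), 1: (0, 1), 2: (1,)}  # alleles each genotype can transmit
    return any(f + m == child for f in gametes[father] for m in gametes[mother])


def call_rate(genotypes):
    """Fraction of non-missing calls for one marker (None marks a missing call)."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square for Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 0.0
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    print(is_mendelian_consistent(0, 0, 1))
    print(call_rate([0, 1, None, 2]))
    print(hwe_chi_square(30, 40, 30))
```

In practice the HWE check is usually run on founders or unrelated individuals only, since related samples violate the independence assumption behind the chi-square approximation.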
Enhanced Extended Pedigree Analysis
Breaking up extended pedigrees into trios, which is a computationally fast strategy, does not take full advantage of the structure of the known extended pedigree. On the other hand, analyzing extended pedigrees as such, which does take full advantage of all the information and is the most powerful option, can be computationally slow when many of the genotypes in a pedigree are missing.
Golden Helix PBAT includes a new hybrid option that identifies clusters of nuclear families in extended pedigrees which are directly linked (i.e. that share a family member) and analyzes such clusters as extended pedigrees. At the same time, clusters that are linked only through two or more family members without genotypic information are broken up into separate extended-pedigree clusters. These clusters are analyzed in the same way that extended pedigrees would be under the original algorithm, but independently of each other.
The extra information provided to the computation of the genetic distribution under the original algorithm by linking together the extended pedigree clusters is minimal, while the effort required to take advantage of this information is disproportionately large. This puts the original algorithm at a severe disadvantage.
Under the new hybrid approach, however, such links between family clusters within extended pedigrees are dropped. The increased statistical power of the original extended-pedigree algorithm is therefore maintained, at close to the computational speed of a pure nuclear-family analysis.
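The clustering idea behind the hybrid option can be sketched with a union-find structure. The Python below is a simplified illustration, not Golden Helix's algorithm: it assumes nuclear families are given as sets of individual IDs and joins two families only when they share at least one genotyped member, so that links running solely through ungenotyped individuals are dropped. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def cluster_nuclear_families(families, genotyped):
    """Group nuclear families into clusters linked by shared genotyped members.

    families:  dict mapping family id -> set of individual ids (assumed layout).
    genotyped: set of individual ids that have genotype data.
    Returns a list of sets of family ids, one set per cluster.
    """
    parent = {fid: fid for fid in families}

    def find(x):
        # Find the cluster representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_family_of = {}
    for fid, members in families.items():
        for person in members:
            if person not in genotyped:
                continue  # ungenotyped members do not link families here
            if person in first_family_of:
                parent[find(fid)] = find(first_family_of[person])
            else:
                first_family_of[person] = fid

    clusters = defaultdict(set)
    for fid in families:
        clusters[find(fid)].add(fid)
    return list(clusters.values())


if __name__ == "__main__":
    families = {
        "F1": {"grandpa", "grandma", "dad"},
        "F2": {"dad", "mom", "kid"},
        "F3": {"aunt", "uncle", "cousin"},
        "F4": {"uncle", "partner", "child"},
    }
    genotyped = {"dad", "mom", "kid", "aunt", "cousin", "partner", "child"}
    print(cluster_nuclear_families(families, genotyped))
```

In this example F1 and F2 form one cluster through the genotyped individual `dad`, while F3 and F4 stay separate because their only shared member, `uncle`, has no genotype data. Each resulting cluster would then be analyzed as its own extended pedigree.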
Screening Based on Conditional Mean Model
The key concept of PBAT's screening technique is the conditional mean model approach [Lange 2002b, Lange 2002c], in which the data space is considered to be partitioned into two independent testing sets. This approach may be described as follows:
- First, find which combinations of phenotypes (as a group) and markers have the highest power when tested not against the actual genotypes, but against those predicted from the parents' genotypes.
- Second, perform the appropriate FBAT test for the selected combinations of phenotypes and markers on the actual genotypes of the patients, both as a group and individually.
This allows one to control the type I error rates and to overcome one of the most important statistical hurdles when analyzing genome-wide association studies - the multiple comparison problem. PBAT's screening methods are only minimally affected by the non-causal SNPs. In addition, they are robust against effects of population stratification and admixture.
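The two steps can be made concrete with a small Python sketch for complete trios and an additive genotype coding (0, 1, or 2 copies of an allele). For such trios the genotype predicted from the parents is the average of the two parental genotypes, and each heterozygous parent contributes a variance of 1/4. The `screening_score` below is a hypothetical stand-in for PBAT's conditional power estimate, and all function names are assumptions; only the second step uses the offspring's actual genotype.

```python
import math


def expected_genotype(father, mother):
    """E[X | parents] for an additively coded offspring genotype."""
    return (father + mother) / 2


def genotype_variance(father, mother):
    """Var[X | parents]: each heterozygous parent contributes 1/4."""
    return 0.25 * sum(1 for g in (father, mother) if g == 1)


def screening_score(trios, phenotypes):
    """Step 1 (hypothetical stand-in for conditional power).

    Absolute covariance between the phenotype and the genotype predicted
    from the parents. Offspring genotypes are deliberately not used.
    """
    n = len(trios)
    predicted = [expected_genotype(f, m) for f, m, _ in trios]
    mean_y = sum(phenotypes) / n
    mean_x = sum(predicted) / n
    return abs(sum((y - mean_y) * (x - mean_x)
                   for y, x in zip(phenotypes, predicted)) / n)


def fbat_z(trios, phenotypes):
    """Step 2: FBAT-type Z statistic on the actual offspring genotypes."""
    mean_y = sum(phenotypes) / len(phenotypes)
    u = var = 0.0
    for (father, mother, child), y in zip(trios, phenotypes):
        t = y - mean_y  # mean-centered phenotype used as the coding
        u += t * (child - expected_genotype(father, mother))
        var += t * t * genotype_variance(father, mother)
    return u / math.sqrt(var) if var > 0 else 0.0


if __name__ == "__main__":
    trios = [(1, 1, 2), (1, 1, 0), (2, 1, 2), (0, 1, 0)]
    phenotypes = [1.0, 0.0, 1.0, 0.0]
    print(screening_score(trios, phenotypes), fbat_z(trios, phenotypes))
```

Because the screening score depends only on parental genotypes and phenotypes, while the test statistic depends on the deviation of the offspring genotype from its parental expectation, ranking markers in the first step does not bias the test in the second.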
Case Study - Robert Kleta, University College London
"SVS is so good that I, myself, without being a computer scientist, could upload the data and actually do a GWAS myself."