Golden Helix SVS
SNP and Variation Suite™ Manual
Version 7.3.0
Copyright ©2000-2010 Golden Helix, Inc.
Acknowledgements
SVS would not exist without the generous contributions of many minds and hearts. We would particularly like to thank
the following people: Alan Menius, Meg Ehm, Dmitri Zaykin, Mike Mosteller, Tony Segreti, Allen Roses and many other
visionary GlaxoSmithKline scientists and managers worldwide; Dr. Douglas Hawkins of the University of Minnesota; Bret
Musser of Merck; Albert Seymour of Pfizer; Dr. Peter Westfall of Texas Tech University; Dr. S. Stanley Young of CGStat
LLC; Dr. Sally John of The University of Manchester; Dr. Chao-Qiang Lai of the Human Nutrition Research Center on
Aging at Tufts University; Steve Dubnoff at Circle Systems, Inc.; our colleagues at INTEC Web & Genome; and
all the helpful folks at Affymetrix. We’d also like to thank the NIH National Institute of General Medical
Sciences for its generous funding support through the SBIR program. Finally, we would like to extend a
big “Thank You” to all our beta testers for the hard work, time and energy they have invested in SVS7.
Trademarks Used
SVS is a registered trademark of Golden Helix, Inc. Affymetrix, GeneChip and the Affymetrix logo are registered trademarks used by Affymetrix, Inc. Microsoft, Microsoft SQL, Transact-SQL, Excel, Access and ODBC are registered trademarks of Microsoft, Inc. Stat/Transfer is a registered trademark of Circle Systems, Inc. Oracle, Oracle PL-SQL and SQL Server are registered trademarks of Oracle, Inc. IBM and DB2 are registered trademarks of IBM. SAS is a registered trademark of SAS, Inc. Sybase is a registered trademark of Sybase, Inc. Any other incidentally used names that are registered trademarks are trademarks of their respective owners.
Contents
1.1 Installation Overview
Installation Under Windows
Installation Under Linux
Installation Under Mac OS X
1.2 Release Notes
New in Version 7.3
New in Version 7.2
Bugs Fixed in Version 7.3
2 Understanding the Interface and Workflow
2.1 General Genetic Association Analysis Workflow
2.2 Interface Overview
2.3 Navigating the Welcome Screen
Getting Started
Community Resources
Support
License Information
Modules
The Menu Bar
Global Product Options
Update
2.4 Project Navigator
Project Navigator Window
Node Change Log Window
Node Annotations Window
Customizing the Project Navigator
File Menu
Tools Menu
Import Menu
Download Menu
Help Menu
Tool Bar
3 Importing Your Data Into A Project
3.1 Importing Data
3.2 Text File
3.3 Third Party File
3.4 PED/TPED/BED File
3.5 Golden Helix DSF File
3.6 Legacy Golden Helix GHD File
3.7 Affymetrix Files
Affymetrix CHP File
Affymetrix CEL Files
Affymetrix CNT Files
Affymetrix CNCHP Files
Affymetrix CYCHP Files
3.8 Illumina DSF File
3.9 Agilent Files
3.10 NimbleGen Data Summary Files
3.11 Importing PBAT Family-Based Data
Preparing Family Data
Import FBAT Pedigree
Import FBAT Phenotype
Import Text Pedigree
Import Text Phenotype
3.12 Import Scripts
HapMap
Illumina Final Report by SNP
Parallele Long File
4 Genetic Marker Maps and Affymetrix Library Files
4.1 Genetic Marker Maps Overview
4.2 Convert Text File into Marker Map DSM Format
4.3 Download Affymetrix Annotation Files
4.4 Managing the MarkerMaps Folder
Importing Genetic Marker Map as Spreadsheet
Removing Files from the MarkerMaps Folder
Moving DSM Files from One Folder to Another
4.5 Applying a Genetic Marker Map to a Spreadsheet
Selecting a Genetic Marker Map
Setting Apply Options
Indicating Direction to Apply Genetic Marker Map
Applying a Different Genetic Marker Map
Dropping a Genetic Marker Map from a Spreadsheet
4.6 Exporting an Applied Genetic Marker Map to a DSM file
4.7 Downloading Affymetrix Library (CDF) Files
5 Spreadsheets
5.1 Spreadsheet Overview
Navigating the Spreadsheet
Special Features of a Pedigree Spreadsheet
Relationships and Dependencies Between Spreadsheets
Row-Information Columns
Row States
Column Headers
Column Data Types
Column States
Genetic Marker Map Information
Saving/Exporting Spreadsheets
Spreadsheet Menus Overview
5.2 Working with a Single Spreadsheet
Copying Spreadsheet Information to Clipboard
Finding Strings or Values in a Spreadsheet
Renaming Column Headers with Genetic Marker Map Information
Recoding Genotypes
Convert to Pedigree Spreadsheet
Row Select Operations
Column Select Operations
Activate By Chromosome
Activate By Column Type
Creating Subset Spreadsheets
Column and Row Spreadsheet Operations
Transposing Spreadsheets
Create Top-Level Spreadsheet
5.3 Editing a Spreadsheet
Spreadsheet Editor Overview
Editing the Row Label Header and Row Labels
Editing Individual Row Labels
Editing Columns
Editing Data
Find/Replace and Regular Expressions
Scripts for the Spreadsheet Editor
5.4 Working with Multiple Spreadsheets
Difference Between Appending or Joining Two Spreadsheets
Appending Spreadsheets
Joining or Merging Spreadsheets
6 Scripting and Other Integrated Statistical Tools
6.1 Integrated Tools Overview
6.2 The Python Shell Window
Using Shell Objects
Using the Directory Command
Getting Help on a Python Command
6.3 The Python Editor Window
Python Editor Overview
Creating A New Script
6.4 Running Scripts
6.5 Obtaining Add-On Scripts
6.6 Scripting Reference
Project Related Commands
General GHI Commands
Commands for Importing Data
Commands Common to All Objects
Commands For User Input
Commands for Accessing Genome Browser Annotation Files
Using Progress or Status Dialogs
Building a Dataset
Building a Marker Map
Commands for Spreadsheet Objects
Analysis with Spreadsheet Objects
Commands for Writing Spreadsheet Editor Scripts
7 Data Quality Assessment
7.1 Quality Control Overview
7.2 Genotype Statistics by Marker
Data Requirements
Processing
Call Rate
Allele Frequencies
Hardy-Weinberg Equilibrium P-Value
Fisher’s Exact Test for HWE P-Value
Signed HWE R
Genotype Count Table(s)
Allele Count Table(s)
7.3 Genotype Filtering by Marker
7.4 Genotype Statistics by Sample
Data Requirements
Processing
Call Rate (fraction not missing)
Hardy-Weinberg Thw P-Value
Output -log10 p-values
Output
Subdivision of Output by Cases vs. Controls
7.5 PBAT Family-Based QC Statistics
Data Requirements
Processing
Computation Parameters
Output
7.6 Principal Component Analysis Overview
Correcting for Stratification
Correcting for Batch Effects and Other Measurement Errors
Correction of Input Data by Principal Component Analysis
7.7 Genotypic Principal Component Analysis
Using the Genotypic Principal Components Analysis Window
7.8 Numeric Principal Component Analysis
Using the Numeric Principal Components Analysis Window
7.9 Correcting for Stratification by Genomic Control
8 Analysis
8.1 Genotype Association Tests
Genotype Association Tests Overview
Genotype Models and Other Genotype Tests
Test Statistics
Missing Values
Multiple Testing Corrections
Principal Components Analysis
Overall Marker Statistics
Using the Genotype Association Test Window
8.2 Haplotype Association Tests
Haplotype Association Tests Overview
Ways of Defining Haplotype Blocks
Association Tests Used with Haplotype Frequencies
How Haplotype Frequencies are Computed
Multiple Testing Correction
Additional Outputs
Haplotype Association Tests Results
8.3 Haplotype Block Detection
8.4 Runs of Homozygosity
Runs of Homozygosity Overview
Using Runs of Homozygosity Window
Association Analysis using ROH Covariates
8.5 Numeric Association Tests
Numeric Association Tests Overview
Tests and Analysis Methods
Note on Missing Values
Multiple Testing Corrections
Principal Components Analysis
Using the Numeric Association Test Window
8.6 Regression Analysis
Performing Analysis
Full Versus Reduced Model Regression Equation
Note on Missing Values
Multiple Testing Corrections
Output and Running the Regression
9 PBAT Family-Based Analysis
9.1 PBAT Family-Based Analysis Overview
9.2 Using PBAT Capabilities Through SVS
9.3 Pre-Study Power Calculation
Summary
Using Pre-Study Power Calculation
Methods Tab (all designs)
Family Design Tab – Binary Traits
Family Design Tab – Continuous Traits
Genetic Model Tab – Family-design Binary Trait
Genetic Model Tab – Family-design Continuous Trait
Genetic Model Tab – Population-design Case/Control Trait
Genetic Model Tab – Population-design Quantitative Trait
Computational Tab – Population-Based Case/Control Trait
Computational Tab – Population-Based Quantitative Trait
PBAT Pre-Study Power Calculation Results
9.4 PBAT Genotype Analysis
Summary
Using PBAT Genotype Analysis
Select Phenotypes
Phenotype and Haplotype Parameters
Test Statistic and Computational
Multiple Processes
Output Spreadsheet
9.5 PBAT CNV Analysis
Summary
Using PBAT CNV Analysis
Select Phenotypes
Phenotype Parameters
Test Statistic and Computational
Multiple Processes
Output Spreadsheet
10 Copy Number Analysis
10.1 Copy Number Analysis Overview
Copy Number Variation
Copy Number Analysis Module (CNAM)
10.2 Preparing Log2 Ratio Data
10.3 Using CNAM Optimal Segmenting
Log2 Ratio Spreadsheet
Selecting Chromosomes
Segmenting Options
Optional Output Files
Excluding Markers
Run Log
10.4 Outputs from CNAM Optimal Segmenting
CNV Covariates Spreadsheet
CNV Segment List Spreadsheet
Segment Run Log
Wiggle Track (WIG) File
10.5 CNV Association Tests
CNV Association Tests Overview
Tests and Analysis Methods
Note on Missing Values
Multiple Testing Corrections
Principal Components Analysis
Using the CNV Association Test Window
10.6 Visualizing Copy Number Analysis Results
Log2 Ratios
CNV Segment Mean Covariates
CNV Segment Means Histogram
Log2 Ratios and CNV Segments Together
11 Visualizing Data
11.1 Types of Plots Available
Numeric Value Plots
Histograms
XY Scatter Plots
LD Plots
Heat Maps
Genome Browser
11.2 Plot Viewer
Navigating The Plot Viewer
11.3 Genome Browser
Obtaining a Genome Browser
Features Unique to a Genome Browser
11.4 Opening the Plot Viewer
From the Plot Menu
From the Tool Bar
From a Column Header Menu
11.5 Using the Graph Control Interface
Using the User Graphs Tree Window
Graph and Item Controls
Adding a Line to a Graph
11.6 Using the Data Console
11.7 Docking, Un-docking or Hiding Plot Viewer Subwindows
Un-docking Plot Viewer Items
Docking Plot Viewer Subwindow
Hiding Plot Viewer Items
11.8 Zooming in the Graph View
11.9 Using LD Plots
LD Plot Modes
Graph Controls Specific to LD Plots
Haplotype Block Sets and LD Graphs
Interactive Block Definition
The Full Domain View in LD Plots
The Data Console in LD Plots
Adding Additional Graphs
11.10 Haplotype Tables
11.11 Heat Maps
Heat Map Plot Modes
Graph Controls Specific to Heat Maps
The Full Domain View in Heat Maps
The Data Console in Heat Map Graphs
Adding Additional Graphs
11.12 Creating Specialized Plots
Multi-Color Scatter Plots for PCA or Gender Analysis
Multi-Color Manhattan Plots
Q-Q Plot or P-P Plot
11.13 Genome Maps
Bundled Genome Maps
Creating a Genome Map
Switching the Genome Map
12 Export Options for Data and Plots
12.1 Exporting Spreadsheet Data
Saving as a Text or Third Party File
Saving as PED/MAP, TPED/TFAM, or BED/BIM/FAM Files
Saving as Either a DSF or GHD File
12.2 Saving or Printing Graphs
Saving Graphs to Image Formats
Saving Graphs to a PDF
Printing Graphs
“Insufficient memory.” Warning
13 Formulas and Theories: The Science Behind SNP and Variation Suite
13.1 General Statistics
General Marker Statistics
Statistics Available for Genotype Association Tests
Statistics for Numeric Association Tests
False Discovery Rate
13.2 Permutation Testing Methodology
13.3 Linear Regression
Genotype or Numeric Association Test – Linear Regression
Multiple Linear Regression Model
13.4 Logistic Regression
Genotype or Numeric Association Test - Logistic Regression
Multiple Logistic Regression
13.5 Haplotype Frequency Estimation Methods
About Haplotype Inference
Case/Control Association with Haplotype Frequencies
Expectation Maximization (EM)
Composite Haplotype Method (CHM)
13.6 Formulas for Principal Component Analysis
Motivation
Technique
Applying the PCA Correction
Formulas for PCA Normalization of Genotypic Data
Further Motivation: Relation to the Variance-Covariance Matrix
Centering by Marker vs. Not Centering by Marker
Centering by Sample vs. Not Centering by Sample
Applying PCA to a Superset of Markers
Applying PCA to a Subset of Samples
13.7 Runs Of Homozygosity (ROH) Algorithm
13.8 Quantile Normalization of Affymetrix CEL Files
13.9 CNAM Optimal Segmentation Algorithm
Overview
Obtaining Segments with a Moving Window
Obtaining Segments without a Moving Window
Copy Number Segmentation within a Sub-Region
Univariate Outlier Removal
Permutation Testing for Verifying Copy Number Segments
Recommended Settings for Univariate Segmentation
Appendices
14 EULA
15 Installing the Third-Party Condor® Package
15.1 Installing Condor® Overview
15.2 Downloading and Using the Installation Wizard
Launching the Condor® Installer
Creating or Joining a Condor® Pool
Execution and Submit Behavior for Jobs
Setting Condor® Host Permission
Finish Up
15.3 Troubleshooting Techniques and Common Issues
Command Utility Commands
Condor® Issues on Windows
16 Extracting Affymetrix Copy Number Data for use in SVS
16.1 Extracting Affymetrix Copy Number Data Overview
16.2 Creating CNT Files using the Affymetrix CNAT Batch Analysis Tool
About Affymetrix CNAT
Creating the CNT Files
16.3 Creating CNCHP Files Using Affymetrix Genotyping Console 2.0
About Affymetrix Genotyping Console
Generating CNCHP Files for the Mapping 100k and Mapping 500k Arrays
Generating CNCHP Files for the Genome Wide SNP 6.0 Array
16.4 Affymetrix CNT File Format
Header Section
Column Names Section
Data Section
Example File
17 Exporting Data from GenomeStudio
17.1 Exporting Data From GenomeStudio Overview
17.2 Exporting Genotype Data using the Final Report
Exporting the Data from GenomeStudio
Importing Data into SVS
17.3 Exporting Data using the SVS DSF Export 4.0 Plug-In
Installation of the Golden Helix, Inc. Plug-In
Exporting DSF Data from GenomeStudio using Plugin version 4.0
Importing the DSF files into SVS
18 Platform Notes
18.1 Microsoft Windows
Memory Usage
19 A Glossary of Terms Used in Genetic Analysis
20 References
References
[Abt 2001] Abt, M., Lim, Y., Sacks, J., Xie, M., and Young, S. S., (2001), ‘A Sequential Approach for Identifying Lead Compounds in Large Chemical Databases’, Statistical Science, 16, 154-168.
[Affymetrix 2007] Affymetrix (2007), ‘CNAT 4.0: Copy Number and Loss of Heterozygosity Estimation Algorithms for the GeneChip® Human Mapping 10/50/100/250/500K Array Set’, Revision Version 1.2
[Biggs 1991] Biggs, D., B. deVille, and E. Suen (1991). ‘A method of choosing multiway partitions for classification and decision trees’, Journal of Applied Statistics 18, 49.
[Bolstad 2003] Bolstad, B.M., Irizarry, R.A., Astrand, M., Speed, T.P. (2003) ‘A Comparison of Normalization Methods for High Density Oligonucleotide Array Data based on Variance and Bias’. Bioinformatics Vol 19 no. 2, p.185–193
[Bolstad 2001] Bolstad, Ben (2001), ‘Probe Level Quantile Normalization of High Density Oligonucleotide Array Data’, Division of Biostatistics, University of California, Berkeley.
[Carlson 2004] Carlson, C., Eberle, M., Rieder, M., Yi, Q., Kruglyak, L., Nickerson, D., (2004), ‘Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analysis Using Linkage Disequilibrium’, Am. J. Hum. Genet. 74, 106–120.
[Chiano 1998] Chiano M. N., Clayton D. G. (1998), ‘Fine genetic mapping using haplotype analysis and the missing data problem.’ Ann. Hum. Genet. 62, 55–60.
[Dempster 1977] Dempster, A. P., Laird, N. M., Rubin D., (1977), ‘Maximum likelihood from incomplete data via the EM algorithm.’ J of the Royal Stat Soc B 39: 1-38.
[Devlin and Roeder 1999] B. Devlin, Kathryn Roeder, ‘Genomic Control for Association Studies’, Biometrics, Vol. 55, No. 4 (Dec., 1999), pp. 997–1004
[Durstenfeld 1964] Durstenfeld, Richard, (July 1964), ‘Algorithm 235: Random permutation’ Communications of the ACM Vol 7 no. 7, p.420.
[Emigh 1980] Emigh, T. H., (1980), ‘Comparison of tests for Hardy-Weinberg Equilibrium’ Biometrics 36: 627–642.
[Excoffier 1995] Excoffier L, Slatkin M (1995) ‘Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.’ Molecular Biology and Evolution 12: 921–927.
[Fallin 2000] Fallin D, Schork NJ (2000) ‘Power of omnibus likelihood ratio test for haplotype-based case-control studies.’ Am J Hum Genet 67(S2): 214 (abstract).
[Fardo 2009] Fardo DW, Ionita-Laza I, Lange C, 2009 On Quality Control Measures in Genome-Wide Association Studies: A Test to Assess the Genotyping Quality of Individual Probands in Family-Based Association Studies and an Application to the HapMap Data. PLoS Genet 5(7): e1000572. doi:10.1371/journal.pgen.1000572
[Gabriel 2002] Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, et al. (2002) ‘The structure of haplotype blocks in the human genome.’ Science 296: 2225–2229.
[The Genome Sequencing Consortium 2001] The Genome Sequencing Consortium (2001 Feb 15). ‘Initial sequencing and analysis of the human genome’. Nature, 409(6822), 860–921.
[Green 1997] Greene, W.H. Econometric Analysis, 3rd Ed. Prentice Hall, NJ, (1997), pp 882–886.
[Hawkins 2002] Hawkins, D. M., (2002). ‘Fitting multiple change-points to data’, Computational Statistics and Data Analysis, 37, 323–341.
[Hawkins 2001] Hawkins, D. M., and Musser, B. J. (2001), ‘Feature selection with nondeterministic recursive partitioning’, Proceedings of the American Statistical Association [CD-ROM] Alexandria, VA: ASA.
[Hawkins 1999] Hawkins, D. M. and Musser, B. J., (1999) ‘One tree or a forest? Alternative dendrographic models’, Computing Science and Statistics, 30, 534–542.
[Hawkins 1997] Hawkins, D. M., Young, S. S., and Rusinko, A., (1997), ‘Analysis of a large structure-activity data set using recursive partitioning’, Quantitative Structure-Activity Relationships, 16, 296–302.
[Hawkins 1995a] Hawkins, D. M. and McKenzie, D. P., (1995). ‘A data-based comparison of some recursive partitioning procedures’, Proceedings, Statistical Computing Section, American Statistical Association, 245–252.
[Hawkins 1995b] Hawkins, D. M. (1995). ‘FIRM: Formal Inference-based Recursive Modeling, release 2.’ Technical Report 546, University of Minnesota, School of Statistics.
[Hawkins 1982] Hawkins, D. M. and G. V. Kass (1982). ‘Automatic interaction detection’. In D. M. Hawkins (Ed.), Topics in Applied Multivariate Analysis. Cambridge University Press.
[Hawkins 1973] Hawkins, D. M. and Merriam, D. F. (1973) ‘Optimal zonation of digitized sequential data’. Jour. Math Geology, v. 5, no. 4, p. 389–395.
[Hawkins 1972] Hawkins, D. M. (1972) ‘On the choice of segments in piecewise approximation’. Jour. Inst. Math. Applications, v. 9, no. 2, p. 250–256.
[Hill 1997] Hill, D. A., L. M. Delaney, and S. Roncal (1997). ‘A Chi-Squared Automatic Interaction Detection (CHAID) analysis of factors determining critical outcomes’, The Journal of Trauma: Injury, Infection and Critical Care 42, 62–66.
[Hooton 1981] Hooton, T. M., Haley, R.W., Culver, D. H., White, J. W., Morgan, W. M., and Carroll, R. J., (1981), ‘The joint associations of multiple risk factors with the occurrence of nosocomial infections’, American Journal of Medicine, 70, 960–970.
[Hosmer and Lemeshow 2000] Hosmer, David W., and Lemeshow, Stanley, Applied Logistic Regression, second edition, John Wiley and Sons, 2000. See pp. 1 – 42 for a discussion of standard error and other related statistics for logistic regressions, with standard error specifically shown on p. 35.
[Horvath 2004] Horvath, S., Xu, X., Lake, S.L., Silverman, E.K., Weiss, S.T. and Laird, N.M. (2004), ‘Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics’, Genet Epidemiol, 26, 61–69.
[Huang 1993] Huang, H. C., T. K. Lin, and P. W. Ngui (1993). ‘Analyzing a mental health survey by Chi-Squared Automatic Interaction Detection’, Annals of the Academy of Medicine 22, 332–337.
[Ionita-Laza 2007] Ionita-Laza, Iuliana, Perry, George H., Raby, Benjamin A., Klanderman, Barbara, Lee, Charles, Laird, Nan M., Weiss, Scott T., and Lange, Christoph, (2007). ‘On the Analysis of Copy-Number Variations in Genome-Wide Association Studies: A Translation of the Family-Based Association Test’, Genetic Epidemiology 32, 1–11.
[Karolchik 2004] Karolchik, D., Hinrichs, A.S., Furey, T.S., et al (2004 Jan 1). ‘The UCSC Table Browser data retrieval tool’. Nucleic Acids Res., 32(Database issue), D493–6.
[Kass 1980] Kass, G. V., (1980), ‘An exploratory technique for investigating large quantities of categorical data’, Applied Statistics, 29, 119–127.
[Kass 1975] Kass, G. V. (1975). ‘Significance testing in, and some extensions of Automatic Interaction Detection.’ Ph. D. thesis, University of the Witwatersrand, Johannesburg.
[Kent 2002a] Kent, W.J. (2002 April). ‘BLAT - the BLAST-like alignment tool’. Genome Res. 12(4), 656–64.
[Kent 2002b] Kent, W.J., Sugnet, C.W., Roskin, K.M., Pringle T.H., Zahler, A.M., Haussler, D. (2002, June). ‘The human genome browser at UCSC’, Genome Res., 12(6), 996–1006.
[Knapp 1999] Knapp, M. (1999), ‘A Note on Power Approximations for the Transmission/Disequilibrium Test.’ Am J Hum Genet 64:1177–1185.
[Lange 2002a] Lange C, DeMeo D, Laird NM (2002) ‘Power and design considerations for a general class of family-based association tests: Quantitative traits.’ Am J Hum Genet 71:1330–1341.
[Lange 2002b] Lange C, Laird NM (2002) ‘On a general class of conditional tests for family-based association studies in genetics: the asymptotic distribution, the conditional power and optimality considerations.’ Genetic Epidemiology 23:165–180.
[Lange 2002c] Lange C, Laird NM (2002) ‘Analytical sample size and power calculations for a general class of family-based association tests: dichotomous traits.’ Am J Hum Genet 71:575–584.
[Lencz 2007] Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan V, Kane JM, Kucherlapati R, Malhotra AK. (in press) ‘Runs of Homozygosity Reveal Highly Penetrant Recessive Loci in Schizophrenia.’ Submitted PNAS, 2007.
[Mehta and Patel 1983] Mehta C, Patel N (1983) J. Am. Stat. Assoc. 78:427–434
[Mehta and Patel 1986] Mehta C, Patel N (1986) ‘FEXACT: a FORTRAN subroutine for Fisher’s exact test on unordered r×c contingency tables’ ACM Transactions on Mathematical Software (TOMS) Volume 12 Issue 2 pp. 154–161.
[Musser 1999] Musser, B. J. (1999) ‘Extensions to Recursive Partitioning’ Ph.D. Thesis, University of Minnesota School of Statistics.
[Nielson 1998] Nielsen D, Ehm M, Weir BS (1998) ‘Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus.’ Am J Hum Genet 63: 1531–1540.
[Patterson 2006] Patterson N, Price AL, Reich D (2006) Population Structure and Eigenanalysis PLoS Genet 2(12): e190. doi:10.1371/journal.pgen.0020190.
[Price 2006] Price, Alkes L., Patterson, Nick J., Plenge, Robert M., Weinblatt, Michael E., Shadick, Nancy A., Reich, David. (2006). ‘Principal Components Analysis Corrects for Stratification in Genome-Wide Association Studies’. Nature Genetics 38, 904–909.
[Rhead 2009] Rhead, B., Karolchik, D., et al (2009 Nov 11). ‘The UCSC Genome Browser database: update 2010’. Nucleic Acids Res. Epub, 38(Database issue), D613–9.
[Storey 2002] Storey, John D. (2002) ‘A direct approach to false discovery rates’, J. R. Statist. Soc. B 64, Part 3, pp. 479–498.
[Weir 1996] Weir BS (1996) ‘Genetic Data Analysis II.’ Sinauer Associates.
[Xie 1993] Xie X, Ott J (1993) ‘Testing linkage disequilibrium between a disease gene and marker loci.’ Am J Hum Genet 53, 1107 (abstract).
[Zaykin 2002] Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG. (2002) ‘Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals.’ Human Heredity, 53:79–91.
[Zaykin 2001] Zaykin DV, Ehm, MG, Weir BS (2001) ‘Evaluating new haplotyping methods for predicting clinical response using dense maps of single nucleotide polymorphisms (SNPs).’ Work in progress. Presented at Bioinformatics Seminar Series, Research Triangle Institute, NC.
[Zaykin 2000] Zaykin DV, Nielsen DM (2000) ‘Hardy-Weinberg disequilibrium (HWD) fine mapping for case-control samples.’ Am J Hum Genet 67: 1238(S).
[Zaykin (unknown)] Zaykin DV, Ehm MG, Weir BS. ‘The composite haplotype method for association mapping of complex traits in out-bred populations.’ Accepted for publication in Genetic Epidemiology.
[Zhao 2000] Zhao JH, Curtis D, Sham PC (2000) ‘Model-free analysis and permutation tests for allelic associations.’ Human Heredity 50: 133–139.