Golden Helix SVS

SNP and Variation Suite Manual

Version 7.3.0

Copyright ©2000-2010 Golden Helix, Inc.

Acknowledgements

SVS would not exist without the generous contributions of many minds and hearts. We would particularly like to thank the following people: Alan Menius, Meg Ehm, Dmitri Zaykin, Mike Mosteller, Tony Segreti, Allen Roses and many other visionary GlaxoSmithKline scientists and managers worldwide; Dr. Douglas Hawkins of the University of Minnesota; Bret Musser of Merck; Albert Seymour of Pfizer; Dr. Peter Westfall of Texas Tech University; Dr. S. Stanley Young of CGStat LLC; Dr. Sally John of The University of Manchester; Dr. Chao-Qiang Lai of the Human Nutrition Research Center on Aging at Tufts University; Steve Dubnoff at Circle Systems, Inc.; our colleagues at INTEC Web & Genome; and all the helpful folks at Affymetrix. We’d also like to thank the NIH National Institute of General Medical Sciences for its generous funding support through the SBIR program. Finally, we would like to extend a big “Thank You” to all our beta testers for the hard work, time and energy they have invested in SVS7.

Trademarks Used

SVS is a registered trademark of Golden Helix, Inc. Affymetrix, GeneChip and the Affymetrix logo are registered trademarks used by Affymetrix, Inc. Microsoft, Microsoft SQL, Transact-SQL, Excel, Access and ODBC are registered trademarks of Microsoft, Inc. Stat/Transfer is a registered trademark of Circle Systems, Inc. Oracle, Oracle PL-SQL and SQL Server are registered trademarks of Oracle, Inc. IBM and DB2 are registered trademarks of IBM. SAS is a registered trademark of SAS, Inc. Sybase is a registered trademark of Sybase, Inc. Any other incidentally used names that are registered trademarks are trademarks of their respective owners.

Contents

1 Installing and Initializing
 1.1 Installation Overview
  Installation Under Windows
  Installation Under Linux
  Installation Under Mac OS X
 1.2 Release Notes
  New in Version 7.3
  New in Version 7.2
  Bugs Fixed in Version 7.3
2 Understanding the Interface and Workflow
 2.1 General Genetic Association Analysis Workflow
 2.2 Interface Overview
 2.3 Navigating the Welcome Screen
  Getting Started
  Community Resources
  Support
  License Information
  Modules
  The Menu Bar
  Global Product Options
  Update
 2.4 Project Navigator
  Project Navigator Window
  Node Change Log Window
  Node Annotations Window
  Customizing the Project Navigator
  File Menu
  Tools Menu
  Import Menu
  Download Menu
  Help Menu
  Tool Bar
3 Importing Your Data Into A Project
 3.1 Importing Data
 3.2 Text File
 3.3 Third Party File
 3.4 PED/TPED/BED File
 3.5 Golden Helix DSF File
 3.6 Legacy Golden Helix GHD File
 3.7 Affymetrix Files
  Affymetrix CHP File
  Affymetrix CEL Files
  Affymetrix CNT Files
  Affymetrix CNCHP Files
  Affymetrix CYCHP Files
 3.8 Illumina DSF File
 3.9 Agilent Files
 3.10 NimbleGen Data Summary Files
 3.11 Importing PBAT Family-Based Data
  Preparing Family Data
  Import FBAT Pedigree
  Import FBAT Phenotype
  Import Text Pedigree
  Import Text Phenotype
 3.12 Import Scripts
  HapMap
  Illumina Final Report by SNP
  Parallele Long File
4 Genetic Marker Maps and Affymetrix Library Files
 4.1 Genetic Marker Maps Overview
 4.2 Convert Text File into Marker Map DSM Format
 4.3 Download Affymetrix Annotation Files
 4.4 Managing the MarkerMaps Folder
  Importing Genetic Marker Map as Spreadsheet
  Removing Files from the MarkerMaps Folder
  Moving DSM Files from One Folder to Another
 4.5 Applying a Genetic Marker Map to a Spreadsheet
  Selecting a Genetic Marker Map
  Setting Apply Options
  Indicating Direction to Apply Genetic Marker Map
  Applying a Different Genetic Marker Map
  Dropping a Genetic Marker Map from a Spreadsheet
 4.6 Exporting an Applied Genetic Marker Map to a DSM file
 4.7 Downloading Affymetrix Library (CDF) Files
5 Spreadsheets
 5.1 Spreadsheet Overview
  Navigating the Spreadsheet
  Special Features of a Pedigree Spreadsheet
  Relationships and Dependencies Between Spreadsheets
  Row-Information Columns
  Row States
  Column Headers
  Column Data Types
  Column States
  Genetic Marker Map Information
  Saving/Exporting Spreadsheets
  Spreadsheet Menus Overview
 5.2 Working with a Single Spreadsheet
  Copying Spreadsheet Information to Clipboard
  Finding Strings or Values in a Spreadsheet
  Renaming Column Headers with Genetic Marker Map Information
  Recoding Genotypes
  Convert to Pedigree Spreadsheet
  Row Select Operations
  Column Select Operations
  Activate By Chromosome
  Activate By Column Type
  Creating Subset Spreadsheets
  Column and Row Spreadsheet Operations
  Transposing Spreadsheets
  Create Top-Level Spreadsheet
 5.3 Editing a Spreadsheet
  Spreadsheet Editor Overview
  Editing the Row Label Header and Row Labels
  Editing Individual Row Labels
  Editing Columns
  Editing Data
  Find/Replace and Regular Expressions
  Scripts for the Spreadsheet Editor
 5.4 Working with Multiple Spreadsheets
  Difference Between Appending or Joining Two Spreadsheets
  Appending Spreadsheets
  Joining or Merging Spreadsheets
6 Scripting and Other Integrated Statistical Tools
 6.1 Integrated Tools Overview
 6.2 The Python Shell Window
  Using Shell Objects
  Using the Directory Command
  Getting Help on a Python Command
 6.3 The Python Editor Window
  Python Editor Overview
  Creating A New Script
 6.4 Running Scripts
 6.5 Obtaining Add-On Scripts
 6.6 Scripting Reference
  Project Related Commands
  General GHI Commands
  Commands for Importing Data
  Commands Common to All Objects
  Commands For User Input
  Commands for Accessing Genome Browser Annotation Files
  Using Progress or Status Dialogs
  Building a Dataset
  Building a Marker Map
  Commands for Spreadsheet Objects
  Analysis with Spreadsheet Objects
  Commands for Writing Spreadsheet Editor Scripts
7 Data Quality Assessment
 7.1 Quality Control Overview
 7.2 Genotype Statistics by Marker
  Data Requirements
  Processing
  Call Rate
  Allele Frequencies
  Hardy-Weinberg Equilibrium P-Value
  Fisher’s Exact Test for HWE P-Value
  Signed HWE R
  Genotype Count Table(s)
  Allele Count Table(s)
 7.3 Genotype Filtering by Marker
 7.4 Genotype Statistics by Sample
  Data Requirements
  Processing
  Call Rate (fraction not missing)
  Hardy-Weinberg Thw P-Value
  Output -log 10 p-values
  Output
  Subdivision of Output by Cases vs. Controls
 7.5 PBAT Family-Based QC Statistics
  Data Requirements
  Processing
  Computation Parameters
  Output
 7.6 Principal Component Analysis Overview
  Correcting for Stratification
  Correcting for Batch Effects and Other Measurement Errors
  Correction of Input Data by Principal Component Analysis
 7.7 Genotypic Principal Component Analysis
  Using the Genotypic Principal Components Analysis Window
 7.8 Numeric Principal Component Analysis
  Using the Numeric Principal Components Analysis Window
 7.9 Correcting for Stratification by Genomic Control
8 Analysis
 8.1 Genotype Association Tests
  Genotype Association Tests Overview
  Genotype Models and Other Genotype Tests
  Test Statistics
  Missing Values
  Multiple Testing Corrections
  Principal Components Analysis
  Overall Marker Statistics
  Using the Genotype Association Test Window
 8.2 Haplotype Association Tests
  Haplotype Association Tests Overview
  Ways of Defining Haplotype Blocks
  Association Tests Used with Haplotype Frequencies
  How Haplotype Frequencies are Computed
  Multiple Testing Correction
  Additional Outputs
  Haplotype Association Tests Results
 8.3 Haplotype Block Detection
 8.4 Runs of Homozygosity
  Runs of Homozygosity Overview
  Using Runs of Homozygosity Window
  Association Analysis using ROH Covariates
 8.5 Numeric Association Tests
  Numeric Association Tests Overview
  Tests and Analysis Methods
  Note on Missing Values
  Multiple Testing Corrections
  Principal Components Analysis
  Using the Numeric Association Test Window
 8.6 Regression Analysis
  Performing Analysis
  Full Versus Reduced Model Regression Equation
  Note on Missing Values
  Multiple Testing Corrections
  Output and Running the Regression
9 PBAT Family-Based Analysis
 9.1 PBAT Family-Based Analysis Overview
 9.2 Using PBAT Capabilities Through SVS
 9.3 Pre-Study Power Calculation
  Summary
  Using Pre-Study Power Calculation
  Methods Tab (all designs)
  Family Design Tab – Binary Traits
  Family Design Tab – Continuous Traits
  Genetic Model Tab – Family-design Binary Trait
  Genetic Model Tab – Family-design Continuous Trait
  Genetic Model Tab – Population-design Case/Control Trait
  Genetic Model Tab – Population-design Quantitative Trait
  Computational Tab – Population-Based Case/Control Trait
  Computational Tab – Population-Based Quantitative Trait
  PBAT Pre-Study Power Calculation Results
 9.4 PBAT Genotype Analysis
  Summary
  Using PBAT Genotype Analysis
  Select Phenotypes
  Phenotype and Haplotype Parameters
  Test Statistic and Computational
  Multiple Processes
  Output Spreadsheet
 9.5 PBAT CNV Analysis
  Summary
  Using PBAT CNV Analysis
  Select Phenotypes
  Phenotype Parameters
  Test Statistic and Computational
  Multiple Processes
  Output Spreadsheet
10 Copy Number Analysis
 10.1 Copy Number Analysis Overview
  Copy Number Variation
  Copy Number Analysis Module (CNAM)
 10.2 Preparing Log2 Ratio Data
 10.3 Using CNAM Optimal Segmenting
  Log2 Ratio Spreadsheet
  Selecting Chromosomes
  Segmenting Options
  Optional Output Files
  Excluding Markers
  Run Log
 10.4 Outputs from CNAM Optimal Segmenting
  CNV Covariates Spreadsheet
  CNV Segment List Spreadsheet
  Segment Run Log
  Wiggle Track (WIG) File
 10.5 CNV Association Tests
  CNV Association Tests Overview
  Tests and Analysis Methods
  Note on Missing Values
  Multiple Testing Corrections
  Principal Components Analysis
  Using the CNV Association Test Window
 10.6 Visualizing Copy Number Analysis Results
  Log2 Ratios
  CNV Segment Mean Covariates
  CNV Segment Means Histogram
  Log2 Ratios and CNV Segments Together
11 Visualizing Data
 11.1 Types of Plots Available
  Numeric Value Plots
  Histograms
  XY Scatter Plots
  LD Plots
  Heat Maps
  Genome Browser
 11.2 Plot Viewer
  Navigating The Plot Viewer
 11.3 Genome Browser
  Obtaining a Genome Browser
  Features Unique to a Genome Browser
 11.4 Opening the Plot Viewer
  From the Plot Menu
  From the Tool Bar
  From a Column Header Menu
 11.5 Using the Graph Control Interface
  Using the User Graphs Tree Window
  Graph and Item Controls
  Adding a Line to a Graph
 11.6 Using the Data Console
 11.7 Docking, Un-docking or Hiding Plot Viewer Subwindows
  Un-docking Plot Viewer Items
  Docking Plot Viewer Subwindow
  Hiding Plot Viewer Items
 11.8 Zooming in the Graph View
 11.9 Using LD Plots
  LD Plot Modes
  Graph Controls Specific to LD Plots
  Haplotype Block Sets and LD Graphs
  Interactive Block Definition
  The Full Domain View in LD Plots
  The Data Console in LD Plots
  Adding Additional Graphs
 11.10 Haplotype Tables
 11.11 Heat Maps
  Heat Map Plot Modes
  Graph Controls Specific to Heat Maps
  The Full Domain View in Heat Maps
  The Data Console in Heat Map Graphs
  Adding Additional Graphs
 11.12 Creating Specialized Plots
  Multi-Color Scatter Plots for PCA or Gender Analysis
  Multi-Color Manhattan Plots
  Q-Q Plot or P-P Plot
 11.13 Genome Maps
  Bundled Genome Maps
  Creating a Genome Map
  Switching the Genome Map
12 Export Options for Data and Plots
 12.1 Exporting Spreadsheet Data
  Saving as a Text or Third Party File
  Saving as PED/MAP, TPED/TFAM, or BED/BIM/FAM Files
  Saving as Either a DSF or GHD File
 12.2 Saving or Printing Graphs
  Saving Graphs to Image Formats
  Saving Graphs to a PDF
  Printing Graphs
  “Insufficient memory.” Warning
13 Formulas and Theories: The Science Behind SNP and Variation Suite
 13.1 General Statistics
  General Marker Statistics
  Statistics Available for Genotype Association Tests
  Statistics for Numeric Association Tests
  False Discovery Rate
 13.2 Permutation Testing Methodology
 13.3 Linear Regression
  Genotype or Numeric Association Test – Linear Regression
  Multiple Linear Regression Model
 13.4 Logistic Regression
  Genotype or Numeric Association Test - Logistic Regression
  Multiple Logistic Regression
 13.5 Haplotype Frequency Estimation Methods
  About Haplotype Inference
  Case/Control Association with Haplotype Frequencies
  Expectation Maximization (EM)
  Composite Haplotype Method (CHM)
 13.6 Formulas for Principal Component Analysis
  Motivation
  Technique
  Applying the PCA Correction
  Formulas for PCA Normalization of Genotypic Data
  Further Motivation: Relation to the Variance-Covariance Matrix
  Centering by Marker vs. Not Centering by Marker
  Centering by Sample vs. Not Centering by Sample
  Applying PCA to a Superset of Markers
  Applying PCA to a Subset of Samples
 13.7 Runs Of Homozygosity (ROH) Algorithm
 13.8 Quantile Normalization of Affymetrix CEL Files
 13.9 CNAM Optimal Segmentation Algorithm
  Overview
  Obtaining Segments with a Moving Window
  Obtaining Segments without a Moving Window
  Copy Number Segmentation within a Sub-Region
  Univariate Outlier Removal
  Permutation Testing for Verifying Copy Number Segments
  Recommended Settings for Univariate Segmentation
Appendices
14 EULA
15 Installing the Third-Party Condor® Package
 15.1 Installing Condor® Overview
 15.2 Downloading and Using the Installation Wizard
  Launching the Condor® Installer
  Creating or Joining a Condor® Pool
  Execution and Submit Behavior for Jobs
  Setting Condor® Host Permission
  Finish Up
 15.3 Troubleshooting Techniques and Common Issues
  Command Utility Commands
  Condor® Issues on Windows
16 Extracting Affymetrix Copy Number Data for use in SVS
 16.1 Extracting Affymetrix Copy Number Data Overview
 16.2 Creating CNT Files using the Affymetrix CNAT Batch Analysis Tool
  About Affymetrix CNAT
  Creating the CNT Files
 16.3 Creating CNCHP Files Using Affymetrix Genotyping Console 2.0
  About Affymetrix Genotyping Console
  Generating CNCHP Files for the Mapping 100k and Mapping 500k Arrays
  Generating CNCHP Files for the Genome Wide SNP 6.0 Array
 16.4 Affymetrix CNT File Format
  Header Section
  Column Names Section
  Data Section
  Example File
17 Exporting Data from GenomeStudio
 17.1 Exporting Data From GenomeStudio Overview
 17.2 Exporting Genotype Data using the Final Report
  Exporting the Data from GenomeStudio
  Importing Data into SVS
 17.3 Exporting Data using the SVS DSF Export 4.0 Plug-In
  Installation of the Golden Helix, Inc. Plug-In
  Exporting DSF Data from GenomeStudio using Plugin version 4.0
  Importing the DSF files into SVS
18 Platform Notes
 18.1 Microsoft Windows
  Memory Usage
19 A Glossary of Terms Used in Genetic Analysis
20 References

References

[Abt 2001]   Abt, M., Lim, Y., Sacks, J., Xie, M., and Young, S. S., (2001), ‘A Sequential Approach for Identifying Lead Compounds in Large Chemical Databases’, Statistical Science, 16, 154-168.

[Affymetrix 2007]   Affymetrix (2007), ‘CNAT 4.0: Copy Number and Loss of Heterozygosity Estimation Algorithms for the GeneChip® Human Mapping 10/50/100/250/500K Array Set’, Revision Version 1.2

[Biggs 1991]   Biggs, D., B. deVille, and E. Suen (1991). ‘A method of choosing multiway partitions for classification and decision trees’, Journal of Applied Statistics 18, 49.

[Bolstad 2003]   Bolstad, B.M., Irizarry, R.A., Astrand, M., Speed, T.P. (2003) ‘A Comparison of Normalization Methods for High Density Oligonucleotide Array Data based on Variance and Bias’. Bioinformatics Vol 19 no. 2, p.185–193

[Bolstad 2001]   Bolstad, Ben (2001), ‘Probe Level Quantile Normalization of High Density Oligonucleotide Array Data’, Division of Biostatistics, University of California, Berkeley.

[Carlson 2004]   Carlson, C., Eberle, M., Rieder, M., Yi, Q., Kruglyak, L., Nickerson, D., (2004), ‘Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analysis Using Linkage Disequilibrium’, Am. J. Hum. Genet. 74, 106–120.

[Chiano 1998]   Chiano M. N., Clayton D. G. (1998), ‘Fine genetic mapping using haplotype analysis and the missing data problem.’ Ann. Hum. Genet. 62, 55–60.

[Dempster 1977]   Dempster, A. P., Laird, N. M., Rubin D., (1977), ‘Maximum likelihood from incomplete data via the EM algorithm.’ J of the Royal Stat Soc B 39: 1-38.

[Devlin and Roeder 1999]   B. Devlin, Kathryn Roeder, ‘Genomic Control for Association Studies’, Biometrics, Vol. 55, No. 4 (Dec., 1999), pp. 997–1004

[Durstenfeld 1964]   Durstenfeld, Richard, (July 1964), ‘Algorithm 235: Random permutation’ Communications of the ACM Vol 7 no. 7, p.420.

[Emigh 1980]   Emigh, T. H., (1980), ‘Comparison of tests for Hardy-Weinberg Equilibrium’ Biometrics 36: 627–642.

[Excoffier 1995]   Excoffier L, Slatkin M (1995) ‘Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population.’ Molecular Biology and Evolution 12: 921–927.

[Fallin 2000]   Fallin D, Schork NJ (2000) ‘Power of omnibus likelihood ratio test for haplotype-based case-control studies.’ Am J Hum Genet 67(S2): 214 (abstract).

[Fardo 2009]   Fardo DW, Ionita-Laza I, Lange C, 2009 On Quality Control Measures in Genome-Wide Association Studies: A Test to Assess the Genotyping Quality of Individual Probands in Family-Based Association Studies and an Application to the HapMap Data. PLoS Genet 5(7): e1000572. doi:10.1371/journal.pgen.1000572

[Gabriel 2002]   Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, et al. (2002) ‘The structure of haplotype blocks in the human genome.’ Science 296: 2225–2229.

[The Genome Sequencing Consortium 2001]   The Genome Sequencing Consortium (2001 Feb 15). ‘Initial sequencing and analysis of the human genome’. Nature, 409(6822), 860–921.

[Green 1997]   Greene, W.H. Econometric Analysis, 3rd Ed. Prentice Hall, NJ, (1997), pp 882–886.

[Hawkins 2002]   Hawkins, D. M., (2002). ‘Fitting multiple change-points to data’, Computational Statistics and Data Analysis, 37, 323–341.

[Hawkins 2001]   Hawkins. D. M., and Musser, B. J. (2001), ‘Feature selection with nondeterministic recursive partitioning’, Proceedings of the American Statistical Association [CD-ROM] Alexandria, VA: ASA.

[Hawkins 1999]   Hawkins, D. M. and Musser, B. J., (1999) ‘One tree or a forest? Alternative dendrographic models’, Computing Science and Statistics, 30, 534–542.

[Hawkins 1997]   Hawkins, D. M., Young, S. S., and Rusinko, A., (1997), ‘Analysis of a large structure-activity data set using recursive partitioning’, Quantitative Structure-Activity Relationships, 16, 296–302.

[Hawkins 1995a]   Hawkins, D. M. and McKenzie, D. P., (1995). ‘A data-based comparison of some recursive partitioning procedures’, Proceedings, Statistical Computing Section, American Statistical Association, 245–252.

[Hawkins 1995b]   Hawkins, D. M. (1995). ‘FIRM: Formal Inference-based Recursive Modeling, release 2.’ Technical Report 546, University of Minnesota, School of Statistics.

[Hawkins 1982]   Hawkins, D. M. and G. V. Kass (1982). ‘Automatic interaction detection’. In D. M. Hawkins (Ed.), Topics in Applied Multivariate Analysis. Cambridge University Press.

[Hawkins 1973]   Hawkins, D. M. and Merriam, D. F. (1973) ‘Optimal zonation of digitized sequential data’. Jour. Math Geology, v. 5, no. 4, p. 389–395.

[Hawkins 1972]   Hawkins, D. M. (1972) ‘On the choice of segments in piecewise approximation’. Jour. Inst. Math. Applications, v. 9, no. 2, p. 250–256.

[Hill 1997]   Hill, D. A., L. M. Delaney, and S. Roncal (1997). ‘A Chi-Squared Automatic Interaction Detection (CHAID) analysis of factors determining critical outcomes’, The Journal of Trauma: Injury, Infection and Critical Care 42, 62–66.

[Hooton 1981]   Hooton, T. M., Haley, R.W., Culver, D. H., White, J. W., Morgan, W. M., and Carroll, R. J., (1981), ‘The joint associations of multiple risk factors with the occurrence of nosocomial infections’, American Journal of Medicine, 70, 960–970.

[Hosmer and Lemeshow 2000]   Hosmer, David W., and Lemeshow, Stanley, Applied Logistic Regression, second edition, John Wiley and Sons, 2000. See pp. 1 – 42 for a discussion of standard error and other related statistics for logistic regressions, with standard error specifically shown on p. 35.

[Horvath 2004]   Horvath, S., Xu, X., Lake, S.L., Silverman, E.K., Weiss, S.T. and Laird, N.M. (2004), ‘Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics’, Genet Epidemiol, 26, 61–69.

[Huang 1993]   Huang, H. C., T. K. Lin, and P. W. Ngui (1993). ‘Analyzing a mental health survey by Chi-Squared Automatic Interaction Detection’, Annals of the Academy of Medicine 22, 332–337.

[Ionita-Laza 2007]   Ionita-Laza, Iuliana, Perry, George H., Raby, Benjamin A., Klanderman, Barbara, Lee, Charles, Laird, Nan M., Weiss, Scott T., and Lange, Christoph, (2007). ‘On the Analysis of Copy-Number Variations in Genome-Wide Association Studies: A Translation of the Family-Based Association Test’, Genetic Epidemiology 32, 1–11.

[Karolchik 2004]   Karolchik, D., Hinrichs, A.S., Furey, T.S., et al (2004 Jan 1). ‘The UCSC Table Browser data retrieval tool’. Nucleic Acids Res., 32(Database issue), D493–6.

[Kass 1980]   Kass, G. V., (1980), ‘An exploratory technique for investigating large quantities of categorical data’, Applied Statistics, 29, 119–127.

[Kass 1975]   Kass, G. V. (1975). ‘Significance testing in, and some extensions of Automatic Interaction Detection.’ Ph. D. thesis, University of the Witwatersrand, Johannesburg.

[Kent 2002a]   Kent, W.J. (2002 April). ‘BLAT - the BLAST-like alignment tool’. Genome Res. 12(4), 656–64.

[Kent 2002b]   Kent, W.J., Sugnet, C.W., Roskin, K.M., Pringle T.H., Zahler, A.M., Haussler, D. (2002, June). ‘The human genome browser at UCSC’, Genome Res., 12(6), 996–1006.

[Knapp 1999]   Knapp, M. (1999), ‘A Note on Power Approximations for the Transmission/Disequilibrium Test.’ Am J Hum Genet 64:1177–1185.

[Lange 2002a]   Lange C, DeMeo D, Laird NM (2002) ‘Power and design considerations for a general class of family-based association tests: Quantitative traits.’ Am J Hum Genet 71:1330–1341.

[Lange 2002b]   Lange C, Laird NM (2002) ‘On a general class of conditional tests for family-based association studies in genetics: the asymptotic distribution, the conditional power and optimality considerations.’ Genetic Epidemiology 23:165–180.

[Lange 2002c]   Lange C, Laird NM (2002) ‘Analytical sample size and power calculations for a general class of family-based association tests: dichotomous traits.’ Am J Hum Genet 71:575–584.

[Lencz 2007]   Lencz T, Lambert C, DeRosse P, Burdick KE, Morgan V, Kane JM, Kucherlapati R, Malhotra AK. (in press) ‘Runs of Homozygosity Reveal Highly Penetrant Recessive Loci in Schizophrenia.’ Submitted PNAS, 2007.

[Mehta and Patel 1983]   Mehta C, Patel N (1983) J. Am. Stat. Assoc. 78:427–434

[Mehta and Patel 1986]   Mehta C, Patel N (1986) ‘FEXACT: a FORTRAN subroutine for Fisher’s exact test on unordered rc contingency tables’ ACM Transactions on Mathematical Software (TOMS) Volume 12 Issue 2 pp. 154–161.

[Musser 1999]   Musser, B. J. (1999) ‘Extensions to Recursive Partitioning’ Ph.D. Thesis, University of Minnesota School of Statistics.

[Nielson 1998]   Nielsen D, Ehm M, Weir BS (1998) ‘Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus.’ Am J Hum Genet 63: 1531–1540.

[Patterson 2006]   Patterson N, Price AL, Reich D (2006) Population Structure and Eigenanalysis PLoS Genet 2(12): e190. doi:10.1371/journal.pgen.0020190.

[Price 2006]   Price, Alkes L., Patterson, Nick J., Plenge, Robert M., Weinblatt, Michael E., Shadick, Nancy A., Reich, David (2006). ‘Principal Components Analysis Corrects for Stratification in Genome-Wide Association Studies’. Nature Genetics 38, 904–909.

[Rhead 2009]   Rhead, B., Karolchik, D., et al (2009 Nov 11). ‘The UCSC Genome Browser database: update 2010’. Nucleic Acids Res. Epub, 38(Database issue), D613–9.

[Storey 2002]   Storey, John D. (2002) ‘A direct approach to false discovery rates’, J. R. Statist. Soc. B 64, Part 3, pp. 479–498.

[Weir 1996]   Weir BS (1996) ‘Genetic Data Analysis II.’ Sinauer Associates.

[Xie 1993]   Xie X, Ott J (1993) ‘Testing linkage disequilibrium between a disease gene and marker loci.’ Am J Hum Genet 53, 1107 (abstract).

[Zaykin 2002]   Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG. (2002) ‘Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals.’ Human Heredity, 53:79–91.

[Zaykin 2001]   Zaykin DV, Ehm, MG, Weir BS (2001) ‘Evaluating new haplotyping methods for predicting clinical response using dense maps of single nucleotide polymorphisms (SNPs).’ Work in progress. Presented at Bioinformatics Seminar Series, Research Triangle Institute, NC.

[Zaykin 2000]   Zaykin DV, Nielsen DM (2000) ‘Hardy-Weinberg disequilibrium (HWD) fine mapping for case-control samples.’ Am J Hum Genet 67: 1238(S).

[Zaykin (unknown)]   Zaykin DV, Ehm MG, Weir BS. ‘The composite haplotype method for association mapping of complex traits in out-bred populations.’ Accepted for publication in Genetic Epidemiology.

[Zhao 2000]   Zhao JH, Curtis D, Sham PC (2000) ‘Model-free analysis and permutation tests for allelic associations.’ Human Heredity 50: 133–139.