Created:
July 2, 2008

User Level:
Intermediate

Products:
HelixTree, CNAM

Step 2. Determine the Appropriate LogR Mean Threshold to Impute Gender

Next, determine the LogR mean threshold that will be used to indicate whether a sample is male or female. Because females have two copies of the X chromosome, their mean X chromosome LogR values should be greater than zero, whereas males, with only one copy, should have mean LogR values less than zero. The opposite holds for the Y chromosome: males should have a mean LogR greater than zero and females a mean LogR less than zero. In both cases, however, a threshold of zero may not always be appropriate, because the quality of the data can shift the threshold left or right.
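The rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not part of HelixTree: the function name call_sex, its parameters, and the "ambiguous" label for samples whose X and Y means disagree are assumptions made for this example, and the default thresholds of zero should be replaced by the values you determine in this step.

```python
def call_sex(x_mean, y_mean, x_threshold=0.0, y_threshold=0.0):
    """Return 'female', 'male', or 'ambiguous' from chromosome X and Y mean LogR.

    Hypothetical helper: the thresholds default to zero only as a starting point.
    """
    x_says_female = x_mean > x_threshold  # two X copies: mean above threshold
    y_says_female = y_mean < y_threshold  # no Y copy: mean below threshold
    if x_says_female and y_says_female:
        return "female"
    if not x_says_female and not y_says_female:
        return "male"
    return "ambiguous"  # X and Y disagree, so the sample needs a closer look


print(call_sex(0.21, -0.85))  # female
print(call_sex(-0.30, 0.15))  # male
print(call_sex(0.18, 0.12))   # ambiguous
```

Samples reported as ambiguous in a sketch like this are the ones worth investigating by hand.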

To complete this step, first download the Row Means Threshold script from our Add-on Scripts Repository.

Save this script in the following directory:
../HelixTree/scriptsHT/user/Spreadsheet/Scripts/

Figure 1. Row Means Threshold window.

Open the X chromosome LogR spreadsheet created in Step 1 and select >Scripts >Row Means Threshold. The dialog window shown in Figure 1 will appear.

At this point you do not yet know the appropriate threshold, so enter any value. Likewise, you can select either Greater than or Less than. Click OK.

You will then be asked if you want to specify a Mean Column header. Select Yes and click OK. In the next window type ‘X’ in the space provided and click OK.

A new spreadsheet will be created (Figure 2) with sample names followed by two columns: LogR means for each sample and a binary column indicating whether the mean LogR is greater than (or less than) the threshold value you entered. For now, ignore the second column.
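For readers who want to see what this output amounts to, here is a minimal Python sketch of the same calculation. It is not the Row Means Threshold script itself: the function name, the dictionary input, the use of None for missing values, and the "Passes threshold" column label are all assumptions for illustration. Each sample is assumed to have at least one non-missing LogR value.

```python
from statistics import mean


def row_means_threshold(rows, threshold, greater_than=True, header="X"):
    """Build a table of per-sample LogR means plus a binary threshold column.

    rows maps a sample name to its list of LogR values (None = missing).
    """
    table = [("Sample", f"{header} Mean", "Passes threshold")]
    for sample, values in rows.items():
        present = [v for v in values if v is not None]  # drop missing markers
        m = mean(present)
        flag = m > threshold if greater_than else m < threshold
        table.append((sample, round(m, 4), int(flag)))
    return table


# Made-up LogR values for two samples across a handful of X chromosome markers.
logr = {
    "S1": [0.25, 0.5, 0.0],
    "S2": [-0.5, -0.25, None, -0.75],
}

for row in row_means_threshold(logr, threshold=0.0):
    print(row)
```

As in the spreadsheet, the first numeric column is what matters at this stage; the binary column only becomes meaningful once a threshold has been chosen.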

Figure 2. Row Means Threshold output.
Figure 3. Tree View.

To determine the threshold value, create a histogram of the means. First, left-click once on the X Mean column header; the column will turn magenta. Next, select >Analysis >Interactive Tree Analysis. This will open the Tree View window with a single node displayed (Figure 3).

Click on this box and select >Visualize Split Data >Histogram. A histogram plot will appear (Figure 4).

Figure 4. Histogram plot of chromosome X LogR means.

In the histogram in Figure 4, a threshold of zero appears appropriate, although a few samples may need further investigation. If one of the two "hills" were to cover 0, a different threshold value, placed in the gap between the hills, may be more appropriate.
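The same visual check can be approximated outside HelixTree. The Python sketch below bins the per-sample means into a text histogram and suggests a threshold at the midpoint of the largest gap between neighboring means. The largest-gap heuristic and both function names are assumptions for illustration, not HelixTree's method; it only behaves sensibly when the means form two well-separated groups, and a single extreme outlier can mislead it, so the histogram should still be inspected by eye. The example values are made up.

```python
def histogram_counts(values, bins=8):
    """Return (bin_start, count) pairs for equal-width bins over the data range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1  # clamp the maximum
    return [(lo + i * step, c) for i, c in enumerate(counts)]


def suggest_threshold(values):
    """Midpoint of the widest gap between consecutive sorted values (heuristic)."""
    ordered = sorted(values)
    gaps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    return max(gaps)[1]


# Made-up X chromosome LogR means: one group below zero, one above.
means = [-0.42, -0.38, -0.35, -0.31, 0.12, 0.15, 0.19, 0.22]

for start, count in histogram_counts(means):
    print(f"{start:6.2f} | {'#' * count}")

print("suggested threshold:", round(suggest_threshold(means), 3))
```

For these example values the widest gap lies between -0.31 and 0.12, so the suggested threshold is about -0.095 rather than exactly zero.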

Close this window and the Tree View window.

Follow the same procedure to determine the Y chromosome threshold.