Association Haplotype Test

This section lets you perform haplotypic association tests.

To run these tests, SNPator needs a list of Haplotypes corresponding to a set of Samples, in addition to the usual information stored in a Study. (Remember that the Genotypes Table stores data without phase information.)

This haplotype list may come from an external source, or it can be computed with the PHASE program, which SNPator runs in the background.

SNPator classifies the Haplotypes provided by PHASE into cases and controls according to the information stored in the Samples Table. It then builds a 2x2 contingency table for each haplotype against all the others pooled together and computes a set of statistics from it.
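The "one haplotype against all the others" tables can be sketched as follows. This is only an illustration, not SNPator's actual code: the function name `contingency_tables`, the `(sample, haplotype)` input pairs and the `"case"`/`"control"` labels are all assumptions made for the example.

```python
from collections import Counter

def contingency_tables(haplotypes, status):
    """Build one 2x2 table per haplotype (hypothetical helper).

    haplotypes: list of (sample_name, haplotype) pairs
    status:     dict mapping sample_name -> "case" or "control"
    Returns {haplotype: ((case_hap, case_others), (control_hap, control_others))}.
    """
    counts = {"case": Counter(), "control": Counter()}
    for sample, hap in haplotypes:
        group = status.get(sample)
        if group in counts:              # samples without a case/control label are skipped
            counts[group][hap] += 1
    totals = {g: sum(c.values()) for g, c in counts.items()}
    tables = {}
    for hap in sorted(set(counts["case"]) | set(counts["control"])):
        a = counts["case"][hap]          # cases carrying this haplotype
        c = counts["control"][hap]       # controls carrying this haplotype
        tables[hap] = ((a, totals["case"] - a), (c, totals["control"] - c))
    return tables

# Toy data: two diploid samples, one case and one control.
haps = [("s1", "CGCGTT"), ("s1", "TCCCCT"), ("s2", "CGCGTT"), ("s2", "CGCGTT")]
status = {"s1": "case", "s2": "control"}
tables = contingency_tables(haps, status)
print(tables["CGCGTT"])
```

Each table has the same layout as the per-haplotype tables printed in the report: rows are Case and Control, columns are the haplotype and "Others".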

The procedure is as follows:



Samples whose value in the selected field matches neither the declared Case value nor the declared Control value are discarded and not used in the analysis.

When a field has more than two possible values, you may want to enter only one of them in either the "Case" or the "Control" box and leave the other box untouched (that is, leave the "[Select Item]" option there). The samples that match your explicit selection are then defined as, say, Cases (or Controls, if that is the box you filled in). All the remaining samples, whatever their values, are used as the other group (Controls, in this example).

If a blank (" ") is selected, it means that some Samples have no value in the field and that those Samples are being selected as cases or controls.
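The selection rules above can be summarised in a small sketch. The function `assign_status` and its arguments are hypothetical names used only for illustration; `None` stands for a box left at "[Select Item]", and an empty string stands for a blank field value.

```python
def assign_status(value, case_value=None, control_value=None):
    """Classify one sample from its field value (illustrative only).

    Returns "case", "control", or None when the sample is discarded.
    """
    if case_value is not None and control_value is not None:
        # Both boxes filled in: any other value is discarded.
        if value == case_value:
            return "case"
        if value == control_value:
            return "control"
        return None
    if case_value is not None:
        # Only "Case" filled in: everything else becomes a control.
        return "case" if value == case_value else "control"
    if control_value is not None:
        # Only "Control" filled in: everything else becomes a case.
        return "control" if value == control_value else "case"
    raise ValueError("select a value in at least one of the two boxes")

print(assign_status("mild", case_value="severe", control_value="healthy"))
print(assign_status("mild", case_value="severe"))
print(assign_status("", control_value=""))
```

With both boxes filled in, a "mild" sample is discarded; with only "Case" set to "severe", the same sample is used as a control; a blank value selected as the Control value makes samples with no value controls.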

- Fisher's exact test.

Performs Fisher's Exact Test on the contingency table in addition to the Chi-square test, which is always done. A combo box lets the user select whether to run it for every haplotype or only for those in which the Chi-square test has Validity=N (see below).

- Odds Ratio - Haldane Correction.

If any cell of the original contingency table is 0, this option tells SNPator to apply the Haldane Correction when computing Odds Ratios and their Confidence Intervals.

- Batch mode.

This is a fundamental time-saving feature. When you select one of the fields in the Samples table, SNPator runs the analysis once for each distinct value in that field, each time using only the samples that have that value. For instance, if your samples are labelled "M" or "W" in the "sex" field, selecting "sex" as the batch mode attribute produces two runs of the analysis, one for men and one for women.
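Conceptually, batch mode splits the samples by the chosen field and repeats the analysis on each group. A minimal sketch, assuming samples are represented as dictionaries with a hypothetical `"name"` key plus one key per Samples table field:

```python
from collections import defaultdict

def batch_groups(samples, field):
    """Group sample names by the value of `field` (illustrative helper)."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[field]].append(sample["name"])
    return dict(groups)

samples = [
    {"name": "S1", "sex": "M"},
    {"name": "S2", "sex": "W"},
    {"name": "S3", "sex": "M"},
]
runs = batch_groups(samples, "sex")
for value, names in runs.items():
    # One independent association analysis would be run per group.
    print(value, names)
```

Each group would then go through the same case/control classification and per-haplotype tests as a normal run.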




The haplotype list can be specified in four different ways:
  1. Using genotypes already stored in SNPator. This option is only valid when all genotypes are haploid, since in this case a set of genotypes is equivalent to a haplotype.
  2. Selecting a PHASE* file previously produced in SNPator and stored in the User Results section. You only have to select the result that you want from the list.
  3. Providing a "report" file (one of the outputs of the PHASE program) stored on your local computer. A "report.bis" file generated by SNPator can also be used in this section. (More information about "report.bis" files is available in the Run PHASE help.)
  4. Providing your own list of Haplotypes in a tabulated format. This list should have the following format:


Sample1    Hap1 Hap2
Sample2    Hap1 Hap2
...          ...  ...

NOTE:
* missing alleles inside a haplotype must be entered as '?'
* haploid samples will have no "Hap2"


All haplotypes containing missing alleles will be discarded and not taken into account when calculating the association test.
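Reading a list in this format and discarding incomplete haplotypes could look like the sketch below. It assumes whitespace-separated columns (tabs or spaces); the function name `parse_haplotype_list` is hypothetical, and SNPator's own parser may behave differently in edge cases.

```python
def parse_haplotype_list(text):
    """Parse 'Sample Hap1 [Hap2]' lines (illustrative only).

    Returns (kept, discarded): kept is a list of (sample, haplotype) pairs,
    discarded is the number of haplotypes dropped because they contain '?'.
    """
    kept, discarded = [], 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip empty lines
        sample, haps = fields[0], fields[1:3]   # haploid samples have one haplotype
        for hap in haps:
            if "?" in hap:                # missing allele: drop this haplotype
                discarded += 1
            else:
                kept.append((sample, hap))
    return kept, discarded

example = "S1\tCGCGTT\tTCCCCT\nS2\tCG?GTT\tCGCGTT\nS3\tTGTGTT\n"
kept, discarded = parse_haplotype_list(example)
print(kept)
print(discarded)
```

In this example S2 contributes only one haplotype to the test, and the haploid sample S3 contributes its single haplotype.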


To perform the association analysis, SNPator has to decide which of the samples entered in the Haplotypes list are cases and which are controls. This information is stored in the Samples Table. If the sample names in the list do not match those entered in SNPator, the analysis cannot be performed or will be incomplete.

A very common problem arises when an active filter excludes some samples, so that their information cannot be retrieved. At this point, therefore, make sure that all the intended samples are being analyzed.

Job               : H_Association
Description       : ff
User              : advanced
Study             : Pruebas_2
Request time      : 2005-10-06 17:48:58
Start time        : 2005-10-06 17:48:58
End time          : 2005-10-06 17:49:03
Ready to use time : 2005-10-06 17:49:06

Filter Information:
    Filter: 0
    Filter description: central
    Filter version: 2


-------------------------------------------------------------------------------
Percentage of samples used: 100.00 %
-------------------------------------------------------------------------------


     Haplotype         N         P      V   ODDS Ratio           CI 95%            Fisher   
-------------------- ------ ----------- -- ------------- ----------------------- -----------
CCCGTT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
CGCGTT                    8   0.0404 *  Y      0.1111             0.01 - 1.13      0.0791   
CGTCCT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
CGTGCT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
CGTGTT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
TCCCCC                    1   0.2268    N      4.5789 +         0.17 - 124.59 +    0.4167   
TCCCCT                    2   0.0805    N      8.5294 +         0.36 - 199.49 +    0.1630   
TCCGCT                    2   0.0805    N      8.5294 +         0.36 - 199.49 +    0.1630   
TCCGTT                    2   0.0805    N      8.5294 +         0.36 - 199.49 +    0.1630   
TCTCTT                    1   0.2268    N      4.5789 +         0.17 - 124.59 +    0.4167   
TGCGTT                    1   0.2268    N      4.5789 +         0.17 - 124.59 +    0.4167   
TGTCCT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
TGTCTT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
TGTGTT                    1   0.3880    N      0.4286 +          0.02 - 11.63 +    1.0000   
                     ------
                         24

    + Haldane correction applied


          | CCCGTT | Others |   Chi Squared:        0.7453
          |--------|--------|   pValue:             0.3880  
  Case    |      0 |     10 |   Odds Ratio:         0.4286 +
  Control |      1 |     13 |   CI 95%:       0.02 - 11.63 +
          |--------|--------|


          | CGCGTT | Others |   Chi Squared:        4.2000
          |--------|--------|   pValue:             0.0404 *
  Case    |      1 |      9 |   Odds Ratio:         0.1111
  Control |      7 |      7 |   CI 95%:        0.01 - 1.13
          |--------|--------|


          | CGTCCT | Others |   Chi Squared:        0.7453
          |--------|--------|   pValue:             0.3880  
  Case    |      0 |     10 |   Odds Ratio:         0.4286 +
  Control |      1 |     13 |   CI 95%:       0.02 - 11.63 +
          |--------|--------|


          | CGTGCT | Others |   Chi Squared:        0.7453
          |--------|--------|   pValue:             0.3880  
  Case    |      0 |     10 |   Odds Ratio:         0.4286 +
  Control |      1 |     13 |   CI 95%:       0.02 - 11.63 +
          |--------|--------|


          | CGTGTT | Others |   Chi Squared:        0.7453
          |--------|--------|   pValue:             0.3880  
  Case    |      0 |     10 |   Odds Ratio:         0.4286 +
  Control |      1 |     13 |   CI 95%:       0.02 - 11.63 +
          |--------|--------|


At the top is the usual header with the job's dates and times, user, study, filters applied and other data.

"Percentage of samples used" is the percentage of samples in the haplotype list that match samples in the Samples Table. This percentage should be 100% if everything is OK. Otherwise, an incorrect filter may have been active when the test was performed, or there may be problems with the spelling of sample names.
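As a rough illustration of how this figure can be checked by hand, the sketch below compares the two sets of names. The function name is hypothetical, and exact, case-sensitive name matching is an assumption, not documented SNPator behaviour.

```python
def percentage_of_samples_used(list_samples, table_samples):
    """Share of haplotype-list samples found in the Samples Table (illustrative)."""
    listed = set(list_samples)
    if not listed:
        return 0.0
    matched = listed & set(table_samples)
    return 100.0 * len(matched) / len(listed)

listed = ["S1", "S2", "S3", "s4"]          # "s4" is misspelled in the list
available = ["S1", "S2", "S3", "S4"]       # names retrievable under the active filter
pct = percentage_of_samples_used(listed, available)
print(f"{pct:.2f} %")
print(sorted(set(listed) - set(available)))   # names to check for spelling or filtering
```

Listing the unmatched names, as in the last line, is usually the quickest way to tell a spelling problem from a filter problem.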

Below that, the association results are printed in several columns, containing:

- Haplotype
The haplotype tested in the 2x2 association test (all other haplotypes are pooled together).

- N
Occurrences of this haplotype in the haplotype list.

- pValue
P-value from the chi-square test applied to the 2x2 contingency table for each haplotype (column "P" in the report).

- Significance
* for pValue<0.05, ** for pValue<0.01

- Validity
Column "V" in the report. Not valid (N) when some expected value in the chi-square contingency table is equal to or below 1; valid (Y) otherwise.

- Odds Ratio
The odds ratio resulting from the association analysis of each haplotype. If the Haldane correction has been used, this value is flagged with a "+" sign.

- CI95
It is the 95% Confidence Interval for the Odds Ratio value obtained before. Here, too, if Haldane correction has been applied it will be flagged with a "+" sign.

- Fisher_p
P-value from Fisher's Exact Test applied to the 2x2 contingency table for each haplotype (column "Fisher" in the report).

- Significance (of Fisher's Exact Test)
* for P-values<0.05, ** for P-values<0.01
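The P, V and Fisher columns can be reproduced for a single table with the sketch below. It is not SNPator's code: it assumes an uncorrected Pearson chi-square with one degree of freedom and a two-sided Fisher test that sums all tables no more probable than the observed one. Those assumptions give the same figures as the sample report for the tables tried here. It also assumes that no row or column total is zero.

```python
from math import comb, erfc, sqrt

def chi_square(table):
    """Return (chi2, p_value, valid) for a 2x2 table ((a, b), (c, d))."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    expected = [r * k / n for r in rows for k in cols]
    chi2 = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    p_value = erfc(sqrt(chi2 / 2))       # upper tail of chi-square with 1 d.f.
    valid = min(expected) > 1            # Validity=N when any expected value <= 1
    return chi2, p_value, valid

def fisher_two_sided(table):
    """Two-sided Fisher exact P-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, c1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, r1 + c1 - n), min(r1, c1)
    probs = {x: comb(c1, x) * comb(n - c1, r1 - x) / comb(n, r1)
             for x in range(lo, hi + 1)}
    limit = probs[a] * (1 + 1e-9)        # tolerance for floating-point ties
    return min(1.0, sum(p for p in probs.values() if p <= limit))

def stars(p_value):
    """Significance flag as used in the report."""
    return "**" if p_value < 0.01 else "*" if p_value < 0.05 else ""

cgcgtt = ((1, 9), (7, 7))                # Case / Control rows from the report above
chi2, p, valid = chi_square(cgcgtt)
print(f"{chi2:.4f} {p:.4f} {stars(p)} {'Y' if valid else 'N'} {fisher_two_sided(cgcgtt):.4f}")
# prints: 4.2000 0.0404 * Y 0.0791
```

For the CCCGTT table (0, 10 / 1, 13) the smallest expected value is 10 x 1 / 24, about 0.42, which is why that row is flagged with Validity=N.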

Finally, for each haplotype, the contingency table with some of the statistics is printed.
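The Odds Ratio and CI 95% figures in these tables can be checked with the sketch below. The help text does not spell out the formulas, so two assumptions are made: the Haldane correction adds 0.5 to every cell when any cell is 0, and the interval is the usual log-based (Woolf) interval with z = 1.96. Under these assumptions the sample report's values are reproduced. The function name is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_ci(table, haldane=True, z=1.96):
    """Return (odds_ratio, ci_low, ci_high, corrected) for ((a, b), (c, d)).

    Without the correction, a table containing a zero cell raises
    ZeroDivisionError or ValueError, because the ratio or its logarithm
    is undefined.
    """
    (a, b), (c, d) = table
    corrected = haldane and 0 in (a, b, c, d)
    if corrected:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))   # assumed Haldane correction
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)           # standard error of log(OR)
    low = exp(log(odds_ratio) - z * se)
    high = exp(log(odds_ratio) + z * se)
    return odds_ratio, low, high, corrected

for name, table in [("CCCGTT", ((0, 10), (1, 13))), ("CGCGTT", ((1, 9), (7, 7)))]:
    ratio, low, high, corrected = odds_ratio_ci(table)
    flag = " +" if corrected else ""
    print(f"{name} {ratio:.4f}{flag} {low:.2f} - {high:.2f}{flag}")
# prints:
# CCCGTT 0.4286 + 0.02 - 11.63 +
# CGCGTT 0.1111 0.01 - 1.13
```

The "+" flag is printed only when the correction was actually applied, matching the report's footnote.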
