The data set on which the analysis is based can be downloaded from:

http://drygin.ccbr.utoronto.ca/~costanzo2009/

File name: sgadata_costanzo2009_rawdata_101120.txt.gz
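As a rough sketch of how the compressed file could be inspected outside Mathematica, the Python snippet below streams a gzipped text file row by row. It assumes the file is tab-delimited, which is not stated here, and the helper name iter_rows is hypothetical. No column layout is assumed; each row is returned as a plain list of strings.

```python
import csv
import gzip


def iter_rows(path):
    """Yield each line of a gzipped text file as a list of strings.

    Assumption: fields are separated by tabs. Quoting is disabled so that
    stray quote characters in the data are kept verbatim.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row
```

Calling list(iter_rows("sgadata_costanzo2009_rawdata_101120.txt.gz")) would load every row into memory; for a large file, iterating lazily is probably preferable.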

In the following we briefly summarize what each Mathematica notebook file (.nb) does.

Categorize genes into arrayonly, queryonly, temperature-sensitive, DAmP, ... Build matrices of double-knockout growth rates and epistatic interactions. Process the data in order to plot epistasis as a function of the mutational effects, then repeat the processing with the traditional definition of epistasis.
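To make the matrix-building step concrete, here is a minimal Python sketch (the notebooks themselves are Mathematica). It assumes that the "traditional definition" refers to the common multiplicative model, in which epistasis is the double-mutant fitness minus the product of the two single-mutant fitnesses, all measured relative to wild type. The function names and the dictionary layout are hypothetical, and the alternative definition used in the notebooks is not reproduced.

```python
def multiplicative_epistasis(f_a, f_b, f_ab):
    """Epistasis under the multiplicative model: f_ab - f_a * f_b."""
    return f_ab - f_a * f_b


def epistasis_matrix(single, double):
    """Build a sparse epistasis matrix.

    single: dict mapping gene -> single-mutant fitness
    double: dict mapping (query, array) -> double-mutant fitness
    Pairs with a missing single-mutant fitness are skipped.
    """
    return {
        (query, array): multiplicative_epistasis(single[query], single[array], f_qa)
        for (query, array), f_qa in double.items()
        if query in single and array in single
    }
```

For example, with single-mutant fitnesses 0.8 and 0.5 and a double-mutant fitness of 0.3, the expected double-mutant fitness is 0.4 and the epistasis is about -0.1.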

Generate the plots of epistasis as a function of the mutational effects.

Repeat the analysis considering only growth rates whose relative experimental uncertainty is less than 5%.
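The 5% cut can be illustrated with a short Python sketch. It assumes each growth rate comes with an absolute uncertainty, so that the relative uncertainty is the ratio of the two; the function name and the (value, sigma) layout are hypothetical.

```python
def keep_precise(measurements, max_relative_error=0.05):
    """Keep entries whose relative uncertainty is below the threshold.

    measurements: dict mapping key -> (value, sigma)
    Entries with a value of zero are dropped, since their relative
    uncertainty is undefined.
    """
    kept = {}
    for key, (value, sigma) in measurements.items():
        if value != 0 and abs(sigma / value) < max_relative_error:
            kept[key] = (value, sigma)
    return kept
```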

Compare the observed distribution of experimental uncertainty with a chi-square distribution.
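One possible way to set up such a comparison is sketched below in Python: an empirical density is estimated from a histogram and set against the analytic chi-square density. Which quantity is compared, how it is normalized, and the number of degrees of freedom are not specified here, so those choices (and both function names) are assumptions of this sketch.

```python
import math


def chi2_pdf(x, k):
    """Chi-square probability density with k degrees of freedom.

    For simplicity the density is returned as 0 for x <= 0.
    """
    if x <= 0:
        return 0.0
    half_k = k / 2.0
    return x ** (half_k - 1) * math.exp(-x / 2.0) / (2.0 ** half_k * math.gamma(half_k))


def histogram_density(samples, bin_width, n_bins):
    """Estimate a density on [0, n_bins * bin_width) from the samples.

    Samples outside the range are not binned but still count toward the
    normalization.
    """
    counts = [0] * n_bins
    for s in samples:
        index = int(s // bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    total = len(samples)
    return [c / (total * bin_width) for c in counts]
```

Evaluating chi2_pdf at the bin centers gives values that can be plotted against the output of histogram_density.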

Generate a mock data set with normally-distributed noise and repeat the analysis.
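A mock data set of this kind could be produced along the following lines (Python sketch, not the notebook code). It assumes the noise is additive with a single standard deviation for all measurements; whether the notebooks use additive or relative noise is not stated here, and the function name is hypothetical.

```python
import random


def add_normal_noise(values, sigma, seed=None):
    """Return a copy of values with additive Gaussian noise of width sigma."""
    rng = random.Random(seed)  # a fixed seed makes the mock data reproducible
    return [v + rng.gauss(0.0, sigma) for v in values]
```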

Generate 9 mock data sets with Student's t-distributed noise and repeat the analysis for each of them (double gene knockout mutants).
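The sketch below shows one way to draw Student's t-distributed noise using only the Python standard library: a t variate with a given number of degrees of freedom is a standard normal divided by the square root of an independent chi-square variate over its degrees of freedom. The noise scale, the degrees of freedom, the additive form of the noise, and the function names are all assumptions made for illustration.

```python
import math
import random


def student_t(rng, dof):
    """Draw one Student's t variate with dof degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square variate is a gamma variate with shape dof/2 and scale 2.
    v = rng.gammavariate(dof / 2.0, 2.0)
    return z / math.sqrt(v / dof)


def mock_data_sets(values, scale, dof, n_sets=9, seed=0):
    """Return n_sets noisy copies of values with additive t-distributed noise."""
    rng = random.Random(seed)
    return [
        [v + scale * student_t(rng, dof) for v in values]
        for _ in range(n_sets)
    ]
```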

Analysis for the average of the 9 mock data sets h1 to h9.

Generate a mock data set with log-normally-distributed noise and repeat the analysis.
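Log-normal noise is naturally multiplicative, so a mock data set could be generated as in the Python sketch below. It assumes each value is multiplied by a log-normal factor whose underlying normal distribution has mean 0 and standard deviation sigma; the parameterization and the function name are hypothetical.

```python
import random


def add_lognormal_noise(values, sigma, seed=None):
    """Multiply each value by exp(N(0, sigma)), a log-normal factor."""
    rng = random.Random(seed)
    return [v * rng.lognormvariate(0.0, sigma) for v in values]
```

With this choice, positive growth rates stay positive regardless of the noise level.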

Generate 9 mock data sets with Student's t-distributed noise and repeat the analysis for each of them (temperature sensitive mutants).

Analysis for the average of the 9 mock data sets j1 to j9.

Rank interactions according to their strength, considering only interactions with sufficiently low p-values. Compare how interactions rank according to the two definitions of epistasis. Count how many Gene Ontology interactions are present among the top-ranking genetic pairs. (Only the pairs in the top 5% for sharing GO terms are considered GO-interacting.)
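The ranking and counting steps can be sketched in Python as follows. The sketch assumes that "strength" means the absolute value of the epistasis and that a p-value cutoff is supplied by the caller; the cutoff used in the notebooks is not given here, and the function names are hypothetical.

```python
def rank_pairs(interactions, p_max):
    """Rank gene pairs by interaction strength, strongest first.

    interactions: dict mapping pair -> (epistasis, p_value)
    Only pairs with p_value < p_max are ranked.
    """
    significant = [
        (pair, eps) for pair, (eps, p) in interactions.items() if p < p_max
    ]
    significant.sort(key=lambda item: abs(item[1]), reverse=True)
    return [pair for pair, _ in significant]


def count_in_top(ranked_pairs, reference_pairs, n_top):
    """Count how many of the n_top highest-ranked pairs are in reference_pairs."""
    return sum(1 for pair in ranked_pairs[:n_top] if pair in reference_pairs)
```

Running rank_pairs once per definition of epistasis gives two rankings that can be compared directly, and count_in_top can then be applied to each with the set of GO-interacting pairs as the reference.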

Rank interactions according to their strength, considering only interactions with sufficiently low p-values. Compare how interactions rank according to the two definitions of epistasis. Count how many Gene Ontology interactions are present among the top-ranking genetic pairs. (Only the pairs in the top 20% for sharing GO terms are considered GO-interacting.)
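Selecting the GO-interacting pairs by a percentile cut (top 5% or top 20%) might look like the Python sketch below. It assumes every gene pair has a numeric score measuring how strongly the two genes share GO terms; how that score is computed is not described here, and the function name is hypothetical.

```python
def top_fraction(scores, fraction):
    """Return the set of pairs whose score lies in the top fraction.

    scores: dict mapping pair -> GO-sharing score (higher means more shared)
    fraction: e.g. 0.05 for the top 5% or 0.20 for the top 20%
    """
    n_keep = round(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])
```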

Rank interactions according to their strength, considering only interactions with sufficiently low p-values. Compare how interactions rank according to the two definitions of epistasis. Count how many Gene Ontology interactions are present among the top-ranking genetic pairs. (Only the pairs in the top 5% for sharing GO terms are considered GO-interacting.) The analysis is restricted to subsets of the data set.

Analysis of protein-protein interactions.

Analysis of GO interactions.

Plot additional figures.

Department of Physics

Massachusetts Institute of Technology

Gore Laboratory

13-2008

77 Massachusetts Avenue

Cambridge, MA 02139