  • [10]. In their paper, they applied Welch tests on a leukemia dataset [11] and demonstrated the importance of allowing for unequal variances.

  • The correlation between the R-PCR dataset and the direct labeling dataset is 0.68 (Fig 3A).

  • The mammalian proteins with an increased rate of amino-acid substitution between the human and rodent lineages also show a level of Ks significantly higher than that of the total dataset (Table 1).

  • After this initial round of fuzzy clustering, duplicate centroids (pairs whose Pearson correlation is greater than 0.9) are averaged, and genes with a greater than 0.7 correlation to any of the identified centroids are removed from the dataset (see Materials and methods).

  • For ease of illustration and to emphasize USM's general validity, the test dataset used to describe implementation of the algorithm consists of two stanzas of a poem by Wendy Cope, "The Uncertainty of the Poet" [14]. In the Discussion section, USM was also applied to the DNA sequence of the threonine operon of Escherichia coli K-12 MG1655, obtained from the University of Wisconsin E. coli Genome Project http://www.genetics.wisc.edu, and to its 5'3' first frame proteomic translation, obtained by using the SwissProt online translator http://www.expasy.ch/tools/dna.html.
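
The post-clustering cleanup described in the fuzzy clustering item above (averaging duplicate centroids, then removing genes that correlate strongly with any centroid) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, duplicates are grouped greedily against the first unmerged centroid, and the 0.7 gene cutoff is assumed to be a Pearson correlation like the 0.9 centroid cutoff.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def merge_duplicate_centroids(centroids, threshold=0.9):
    """Average centroids whose correlation exceeds the threshold.

    Assumption: grouping is greedy, comparing each later centroid
    against the first not-yet-merged one.
    """
    merged = []
    used = [False] * len(centroids)
    for i, seed in enumerate(centroids):
        if used[i]:
            continue
        used[i] = True
        group = [seed]
        for j in range(i + 1, len(centroids)):
            if not used[j] and pearson(seed, centroids[j]) > threshold:
                used[j] = True
                group.append(centroids[j])
        # Element-wise mean over the group of duplicate centroids.
        merged.append([sum(vals) / len(group) for vals in zip(*group)])
    return merged


def remove_assigned_genes(genes, centroids, threshold=0.7):
    """Keep only genes that do not exceed the threshold against any centroid.

    `genes` maps a gene name to its expression profile (hypothetical layout).
    """
    return {
        name: profile
        for name, profile in genes.items()
        if all(pearson(profile, c) <= threshold for c in centroids)
    }
```

Calling `merge_duplicate_centroids` first and then passing its result to `remove_assigned_genes` leaves the genes that would be carried into a subsequent round of clustering.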

