Creation and implications of a phenome-genome network
Atul J Butte & Isaac S Kohane
2006, Nature Biotechnology

▣ Background
- Comprehensively consider associations between components of phenotype, genotype and environment to identify genes that may govern phenotype and responses to the environment.

▣ Result
- A network of relations was found between phenotypic, disease, environmental and experimental contexts and the genes whose differential expression is associated with these concepts.


▣ Method

- 1. Seven fields of annotations representing the phenotype, environmental and experimental context from GEO samples, series and data sets are parsed and mapped to UMLS concepts.
  • Download GEO data (http://www.ncbi.nlm.nih.gov/geo/)
  • Extract the seven fields of GEO data into DB tables (gds, gse, gsm, ...)
  • Map the extracted text to UMLS concepts with MetaMap, using the MetaMap Automation Mapping Tool designed by Yonglae.


gds_title    GDS1531    Chito-oligomer effect    C1280500    -827    Effect    [qlco]
gds_title    GDS1531    Chito-oligomer effect    C2348382    -827    Effect    [qlco]
gds_title    GDS1531    on seedlings    C0242437    -1000    Seedlings    [plnt]
gds_title    GDS2674    20-hydroxyecdysone effect    C0013495    -734    20-Hydroxyecdysone    [horm, strd]
gds_title    GDS2674    20-hydroxyecdysone effect    C1280500    -827    Effect    [qlco]
gds_title    GDS2674    20-hydroxyecdysone effect    C0013495    -734    20-Hydroxyecdysone    [horm, strd]
gds_title    GDS2674    20-hydroxyecdysone effect    C2348382    -827    Effect    [qlco]
gds_title    GDS2674    on cultured larval organs    C0010453    -589    Culture    [idcn]
gds_title    GDS2674    on cultured larval organs    C0023047    -589    Larva    [euka]
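
The rows above can be turned into a per-data-set list of concepts with a few lines of Python. This is only a sketch: the column meanings (source field, GDS ID, matched phrase, CUI, MetaMap score, concept name, semantic types) are read off the sample output, and the names `Mapping`, `parse_row` and `concepts_by_dataset` are hypothetical, not part of MetaMap or of the mapping tool mentioned above.

```python
import re
from typing import NamedTuple


class Mapping(NamedTuple):
    field: str            # source annotation field, e.g. "gds_title"
    gds_id: str           # GEO data set ID, e.g. "GDS2674"
    phrase: str           # phrase that was matched
    cui: str              # UMLS concept unique identifier
    score: int            # score as printed in the output (negative)
    concept: str          # concept name
    semantic_types: tuple  # e.g. ("horm", "strd")


def parse_row(line: str) -> Mapping:
    # Assumption: columns are separated by a tab or by two or more spaces,
    # so single spaces inside a phrase or inside "[horm, strd]" are kept.
    parts = re.split(r"\s{2,}|\t", line.strip())
    if len(parts) != 7:
        raise ValueError(f"expected 7 columns, got {len(parts)}: {line!r}")
    field, gds_id, phrase, cui, score, concept, types = parts
    sem = tuple(t.strip() for t in types.strip("[]").split(","))
    return Mapping(field, gds_id, phrase, cui, int(score), concept, sem)


def concepts_by_dataset(lines):
    """Collect the set of CUIs mapped to each GEO data set."""
    result = {}
    for line in lines:
        if not line.strip():
            continue
        row = parse_row(line)
        result.setdefault(row.gds_id, set()).add(row.cui)
    return result
```

Storing the CUIs as a set means a row that appears twice for the same data set is counted once.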

- 2. GEO Platforms are manually related to NCBI Gene identifiers, allowing the same genes to be related across platforms.
  • Using GRIP (http://grip.snubi.org)
  • Using BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi)

- 3. Gene expression measurements are rank normalized within each GEO sample, then averaged across each GEO series.
  • Normalized rank (http://people.revoledu.com/kardi/tutorial/Similarity/Normalized-Rank.html)
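
Step 3 can be sketched in Python as follows. The scaling `(rank - 1) / (n - 1)` and the use of average ranks for ties are assumptions made for this illustration; the notes do not say which variant was used. The function names are hypothetical.

```python
from collections import defaultdict


def normalized_ranks(expr):
    """Map {gene: expression} to {gene: rank scaled to [0, 1]} for one sample.

    Assumption: normalized rank = (rank - 1) / (n - 1), with tied values
    sharing the average of their ranks.
    """
    ordered = sorted(expr.items(), key=lambda kv: kv[1])
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        # Find the end of the block of values tied with position i.
        j = i
        while j + 1 < n and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for gene, _ in ordered[i:j + 1]:
            ranks[gene] = (avg_rank - 1) / (n - 1) if n > 1 else 0.0
        i = j + 1
    return ranks


def mean_normalized_rank(samples):
    """Average each gene's normalized rank over a list of samples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for expr in samples:
        for gene, rank in normalized_ranks(expr).items():
            totals[gene] += rank
            counts[gene] += 1
    return {gene: totals[gene] / counts[gene] for gene in totals}
```

Because only the ordering within a sample matters, samples measured on different intensity scales become comparable before they are averaged.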

- 4. Mean expression measurements for each gene in each GEO data set were related to the concepts mapped from each GEO data set.


Gene ID  CUI       High/Low  p-value   q-value  MRN Concept  MRN Not Concept
1        C0026845  2         0.00314   0        0.322        0.7322           4  5
1        C0205448  1         0.016635  0.03     0.75225      0.388            4  5
1        C0596981  2         0.00314   0        0.322        0.7322           4  5
9        C0026845  2         0.001849  0.125    0.231833     0.448684         6  38
9        C0037585  1         0.01069   0.17     0.499889     0.398343         9  35
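
The comparison behind each row of step 4 can be sketched as below: for one gene and one concept, the data sets are split into those annotated with the concept and those without, and the gene's mean normalized rank is averaged in each group. This is an illustration under assumptions, not the authors' procedure: `compare_gene` and the returned keys are hypothetical, the direction is reported as a string rather than the numeric High/Low code, and the p-value and q-value are left out because the notes do not name the test or the correction used.

```python
from statistics import mean


def compare_gene(mrn_by_dataset, datasets_with_concept):
    """Compare one gene's mean normalized rank with vs. without a concept.

    mrn_by_dataset: {data set ID: mean normalized rank of the gene}
    datasets_with_concept: set of data set IDs annotated with the concept
    """
    with_concept = [v for ds, v in mrn_by_dataset.items()
                    if ds in datasets_with_concept]
    without = [v for ds, v in mrn_by_dataset.items()
               if ds not in datasets_with_concept]
    if not with_concept or not without:
        raise ValueError("both groups need at least one data set")
    m_with, m_without = mean(with_concept), mean(without)
    return {
        "mrn_concept": m_with,
        "mrn_not_concept": m_without,
        "n_concept": len(with_concept),
        "n_not_concept": len(without),
        # "high" when the gene ranks higher in data sets with the concept.
        "direction": "high" if m_with > m_without else "low",
    }
```

A significance test on the two groups, followed by a multiple-testing correction over all gene and concept pairs, would be needed to fill in the p-value and q-value columns.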

