bioinformatics based confirmatory test for identification of disease putative genes
International Journal of Advancements in Research & Technology, Volume 3, Issue 3, March-2014 154 ISSN 2278-7763
Copyright © 2014 SciResPub. IJOART
Bioinformatics based confirmatory test for identification of Disease Putative Genes*
BIKRAM NAYAK (Author) Adeshwar Academy,Jagdalpur,C.G. (India) Email: [email protected]
ABSTRACT
In this paper, several bioinformatics-based approaches and methodologies are applied to obtain a confirmatory classification of the genes of mouse chromosome 11 in the region from 69 Mb to 104 Mb: whether they are lethal or viable, whether they are duplicates or singletons, and how their location determines their properties. The disease lethal (DL) genes found within this region are AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963, CH466556.
The gene with MGI ID 2448712 is not available in genetrap, nor are the 7 genes AF465352, AK039558, AK170258, BC052502, BC052734, CH466596 and AL845465, which have GO ID 0005737, available in the Gene Ontology, so these are disease unknown genes.
Genetrap likewise returns no matching record for MGI:2137026. However, genes with GO ID 0005887 are located in the plasma membrane, so they are grouped as disease viable (DV) genes, and they have very few edges (no more than 1 or 2) in the PPI network. The max binding protein (ID 109150, starting at 74644422 and ending at 74659227, Entrez Gene ID 17428) is clearly a disease lethal gene: its GO ID 0005634 places it in the nucleus, it has tumorigenic properties, it is listed under adenocarcinoma in the MeSH dictionary, and it has more than 5 edges connected to different hubs. All genes AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963, CH466556, defined by their start and end positions, were megablasted against human/mouse, and applying an E-value threshold of 10^20 highlighted duplicacy between human and mouse.
Keywords: mutagenesis, duplicacy, lethality, viability, PPI network, MeSH dictionary, E-value, FDR
1 INTRODUCTION
Broadly, disease genes fall into two types: essential and non-essential. An essential gene is one that is necessary for the organism's survival. Essential disease genes are those for which knockout of the mouse ortholog confers lethality, and non-essential disease genes are those for which the mouse knockout is viable at birth. If no mouse knockout data are available, the gene is treated as a disease unknown gene. Essential disease genes are therefore termed disease lethal (DL) genes, and non-essential disease genes disease viable (DV) genes.
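The naming rule above can be written as a small decision function. The following is only a minimal sketch: the function name, the string labels "lethal" and "viable", and the example gene labels are assumptions made for illustration, not part of the original workflow.

```python
from typing import Optional


def classify_by_knockout(knockout_phenotype: Optional[str]) -> str:
    """Map a mouse knockout outcome to a disease gene class."""
    if knockout_phenotype is None:
        return "disease unknown"   # no mouse knockout data available
    phenotype = knockout_phenotype.strip().lower()
    if phenotype == "lethal":
        return "disease lethal"    # essential disease gene (DL)
    if phenotype == "viable":
        return "disease viable"    # non-essential disease gene (DV)
    raise ValueError(f"unrecognised knockout phenotype: {knockout_phenotype!r}")


# Hypothetical records: gene label -> knockout outcome (None = no data found).
examples = {"geneA": "lethal", "geneB": "viable", "geneC": None}
classes = {gene: classify_by_knockout(p) for gene, p in examples.items()}
```

Treating missing data as its own class, rather than defaulting to viable, keeps the "disease unknown" group explicit, as in the text.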
2. Procedure:- Tools and methods for accessing wet lab datasets
Mouse genes were collected from the following URL (http://www.mouse-genome.bcm.tmc.edu/Bioinformatics/MouseGeneSearchList2.asp) by filling in the submission form. This returned 793 known and unknown loci, from gene ID O08826 to the Mapt gene.
Materials and Methods:
The human-mouse orthology and protein coding gene data from Mouse Genome Informatics (http://www.informatics.jax.org) were obtained through BioMart. BioMart is a simple and robust data integration system for large-scale data querying and a server for extracting data from data warehouses. These data are an appropriate proxy for gene essentiality in humans and are referred to here as viable and lethal.
http://biomart.informatics.jax.org/biomart/martview/28d343acfd5d3bf0896340a4965d54a9
If the same gene ID was returned with 4 different transcript IDs, it was assumed that the gene has 4 predicted transcripts.
Dataset: Mus musculus genes (NCBIM37)
Filters:
  Chromosome: 11
  Gene Start (bp): 69000000
  Gene End (bp): 104000000
  with EMBL ID(s): Only
  Ensembl Gene ID(s): [ID-list specified]
  Gene type: protein_coding
  Source: ensembl
  Status (gene): KNOWN
  Evidence code (GO Cellular component): IC
  Orthologous Human Genes: Only
Attributes:
  Ensembl Gene ID
  Ensembl Transcript ID
  Chromosome Name
  Ensembl Protein ID
  Gene Start (bp)
  Gene End (bp)
  GO Term Accession
  EntrezGene ID
  EMBL (Genbank) ID
  MGI ID
After obtaining the dataset, all the genes were evaluated and analysed according to the following functional parameters.
a. Cellular Localization/Function
b. Biological Function
c. Physiological Function
d. Protein protein interaction
e. Mode of inheritance
f. Evolutionary history/gene age
g. Singleton/duplication event
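The positional part of the BioMart query (chromosome 11, 69 Mb to 104 Mb, protein_coding) can be reproduced on an exported table with a simple filter. The sketch below is illustrative only: the record type, its field names and the third record are hypothetical, the coordinates of the first two records are those quoted in this paper, and the gene type of the second record is assumed.

```python
from dataclasses import dataclass

CHROMOSOME = "11"
REGION_START = 69_000_000   # Gene Start (bp) filter
REGION_END = 104_000_000    # Gene End (bp) filter


@dataclass(frozen=True)
class GeneRecord:
    gene_id: str
    chromosome: str
    start: int
    end: int
    gene_type: str


def passes_filters(gene: GeneRecord) -> bool:
    """True if the gene lies wholly inside the screened region and is protein coding."""
    return (
        gene.chromosome == CHROMOSOME
        and gene.start >= REGION_START
        and gene.end <= REGION_END
        and gene.gene_type == "protein_coding"
    )


records = [
    GeneRecord("MGI:2150020", "11", 87_190_152, 87_218_268, "protein_coding"),
    # Coordinates as quoted in the abstract; gene_type is an assumption.
    GeneRecord("EntrezGene:17428", "11", 74_644_422, 74_659_227, "protein_coding"),
    # Hypothetical gene outside the region, included only to show rejection.
    GeneRecord("hypothetical_outside", "11", 10_000_000, 10_050_000, "protein_coding"),
]
selected = [g.gene_id for g in records if passes_filters(g)]
```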
1. Disease genes localise to different cellular compartments
Disease viable and disease lethal genes differ in the cellular compartments to which they are localised. DL genes are highly represented in the nucleus, whereas DV genes are enriched for localisation to the plasma membrane and the extracellular region. This is why DL genes show a greater number of PPIs: they have a higher probability of localisation within the nucleus.
E.g. all 18 genes AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963, CH466556 with GO ID 0005634 are present in the nucleus and are suggested to be DL genes, genes bearing GO ID 0005887 are present in the plasma membrane (and are grouped as DV genes), and the rest of the genes are disease unknown.
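The localisation rule used in this example can be sketched as a lookup on GO cellular component accessions. This is a minimal illustration, not the exact procedure followed: the function name is hypothetical, the membrane gene label is invented, and giving the nucleus term precedence when a gene carries both terms is an assumption.

```python
GO_NUCLEUS = "GO:0005634"          # nucleus
GO_PLASMA_MEMBRANE = "GO:0005887"  # plasma membrane term used in the text


def classify_by_go(go_ids):
    """Assign a provisional class from GO cellular component accessions."""
    ids = set(go_ids)
    if GO_NUCLEUS in ids:
        return "DL"        # nuclear localisation: suggested disease lethal
    if GO_PLASMA_MEMBRANE in ids:
        return "DV"        # plasma membrane: suggested disease viable
    return "unknown"       # neither term present: disease unknown


annotations = {
    "AK144590": ["GO:0005634"],
    "hypothetical_membrane_gene": ["GO:0005887"],  # invented label for illustration
    "AF465352": ["GO:0005737"],
}
go_classes = {gene: classify_by_go(ids) for gene, ids in annotations.items()}
```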
2. Disease viable and disease lethal genes perform different Biological Functions
The function of a protein depends on its cellular localisation; for example, transcription factors must be present in the nucleus to activate gene expression. GO annotations suggest that essential genes localise to the nucleus, and DL genes are enriched for nucleic acid binding when compared to all genes. For example, in our BioMart output the gene with MGI ID 2150020 has a nucleotide binding property. DV genes are enriched for calcium binding, which points to a role in signal transduction. DV genes are also over-represented in signal transduction functions and in hydrolase activity more than in any of the other metabolic function categories, which indicates that hydrolase activity is a specific feature of DV genes. DL genes, in contrast, are enriched for involvement in embryonic development, as suggested by biological process annotations.
E.g. as found in the genetrap column/GO database:
[Disease Lethal and Disease Viable gene involved in Biological Processes]
[Differentiation on the basis of Molecular Function of Disease Lethal and Disease Viable gene]
3. Disease viable and disease lethal genes perform different Physiological Functions
Disease symptoms generally reflect an irregularity in particular organ systems or physiological processes.
Disease lethal genes are statistically over-represented for expression and behave as cancer genes, directly affecting cell growth and death mechanisms. DL genes have a higher representation in skeletal
disorders and bone disease. This difference may be due to essential genes being involved in developmental patterning of the body axis and skeletal system, but not in bone metabolism. DV genes are also associated with disease, but with different diseases from DL genes: they are involved in nutritional, psychiatric and neurological disorders, are enriched in psychiatric and immune system diseases, and are under-represented among cardiovascular diseases.
4. Protein-protein interaction networks distinguish Disease Lethal and Disease Viable genes
Disease lethal genes are highly connected in the protein-protein interaction network, while disease viable genes have fewer connections. To create the human protein-protein interaction network, data were derived from BioGRID, BIND, HPRD and GeneRIF from NCBI, then viewed and analysed in the "Cytoscape" and "Navigator" tools for protein-protein interaction networks, and the number of hubs and hub-hub connections in the network were recorded.
[Disease lethal gene cluster as graphically represented at PPI network from Cytoscape/netscape]
The graphical representation of the PPI network above, produced with various Java-based plugins of Cytoscape/netscape, suggests that the disease lethal group has a more complex network than the DV genes, with more interactions, less fragmentation and more edges at the highest hub when built from the same datasets.
Below is the interaction map of the DV genes, separated from the DL group. Here the DV genes are less interconnected and more fragmented, and the hub has few or no edges.
[ DV genes are viewed in PPI network in Cytoscape/netscape plugin ]
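The connectivity comparison can be made concrete by counting edges per node in an interaction list. The following is a rough sketch under stated assumptions: the edge list and protein labels are hypothetical, and the cutoffs (more than 5 edges for a DL-like hub, at most 2 edges for a DV-like node) are taken from the figures quoted in the abstract rather than from a formal criterion.

```python
def node_degrees(edges):
    """Count distinct interaction partners per protein in an undirected network."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def connectivity_class(degree, high=5, low=2):
    """Label a node by its degree using the illustrative cutoffs above."""
    if degree > high:
        return "DL-like"
    if degree <= low:
        return "DV-like"
    return "intermediate"


# Hypothetical network: P1 is a hub with six partners, P8-P9 is an isolated pair.
edges = [("P1", f"P{i}") for i in range(2, 8)] + [("P8", "P9")]
degrees = node_degrees(edges)
labels = {node: connectivity_class(d) for node, d in degrees.items()}
```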
5. Functional parameters:- Mode of Inheritance
Disease lethal genes show a dominant mode of inheritance more often than DV genes.
It was observed that the DL gene set showed a higher proportion of autosomal dominant mutations, while DV genes are over-represented for autosomal recessive inheritance.
6. Disease genes vary in their evolutionary history/gene age/phylogenetic distance
DL genes are expected to have the oldest evolutionary history. Using reference genomes representative of each taxonomy category, orthologs are identified for all disease genes, representing the earliest ancestor gene for each human gene. The taxonomy categories are ordered by evolutionary distance, with H. sapiens as the closest and Fungi/Metazoa as the category with the most distant evolutionary origin. DL genes show a higher frequency of orthologs originating in the most distant Fungi/Metazoa or Bilateria classes. Compared to all human genes, DL genes have a much higher proportion with the oldest ancestor in the Chordata class or earlier. DV genes, however, have a higher proportion of genes with the oldest ortholog originating in one of the evolutionarily more recent categories: Tetrapoda, Amniota, Mammalia, Theria and Eutheria. In contrast to human genes as a whole, none of the genes in our annotated categories has its oldest ortholog arising in the most recent evolutionary lineages, such as Euarchontoglires, Primates, Catarrhini, Hominidae, Homininae or Homo sapiens. In summary, DL genes have a more ancient evolutionary origin, and a greater number of orthologs, than the other gene classes analysed.
Gene age:
Much study of gene behaviour in the context of essentiality is under way to provide insights for candidate gene analysis and to identify new disease loci. One such example is gene age, which was measured using the phylogenetic breadth of the distribution of homologous genes among different lineages. For example, old genes are those that are present in more distantly related species, whereas young genes are those that are present only in closely related species such as chimpanzee and macaque.
[The comparison sheet of the gene age of Human beings and other lower ordered organisms ]
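Gene age as phylogenetic breadth can be sketched as picking the most distant taxonomy category in which an ortholog is found. The category list below contains only the names mentioned in this paper, ordered from most distant to closest; the function names, the example input and the use of Chordata as the cutoff for an "old" gene are assumptions made for illustration.

```python
# Taxonomy categories named in the text, from most distant (oldest) to closest.
CATEGORIES_OLD_TO_YOUNG = [
    "Fungi/Metazoa", "Bilateria", "Chordata",
    "Tetrapoda", "Amniota", "Mammalia", "Theria", "Eutheria",
    "Euarchontoglires", "Primates", "Catarrhini", "Hominidae",
    "Homininae", "Homo sapiens",
]
RANK = {name: i for i, name in enumerate(CATEGORIES_OLD_TO_YOUNG)}


def gene_age(categories_with_ortholog):
    """Return the oldest category in which an ortholog of the gene is found."""
    known = [c for c in categories_with_ortholog if c in RANK]
    if not known:
        raise ValueError("no recognised taxonomy category supplied")
    return min(known, key=RANK.__getitem__)


def is_old(category, cutoff="Chordata"):
    """True if the category is the cutoff category or earlier (illustrative cutoff)."""
    return RANK[category] <= RANK[cutoff]


# Hypothetical gene with orthologs detected in three categories.
age = gene_age(["Mammalia", "Bilateria", "Primates"])
```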
When all the genes with MGI accessions are searched again in the genetrap database, a range of phenotypic characters is returned, with Mammalian Phenotype browser accession IDs, orthologous counterparts, ontological evidence on location, OMIM relations, etc.
For example, the expression of gene MGI:99423 is associated with tumor cell invasion and metastasis, and its GO annotation GO:0005634 [nucleus], evidence: IC, suggests that it is a disease lethal gene.
The gene MGI:2150020 also has a protein coding biotype. Its gene tree in Newick format is:
(((((((((((((ENSSTOP00000000315_Stri_:0.0330, ENSDORP00000000165_Dord_:0.1161):0.0048, ((ENSMICP00000005418_Mmur_:0.0116, ENSOGAP00000007601_Ogar_:0.0509):0.0047, ENSTBEP00000013197_Tbel_:0.0487):0.0053):0.0000, ((((((ENSPTRP00000051614_Ptro_:0.0000, ENSP00000372088_Hsap_:0.0013):0.0171, ENSGGOP00000013922_Ggor_:0.0506):0.0000, (ENSMMUP00000033829_Mmul_:0.0094, ENSCJAP00000038748_Cjac_:0.0157):0.0140):0.1059, ENSPPYP00000007198_Ppyg_:0.0000):0.0327, ENSTSYP00000000841_Tsyr_:0.0302):0.0016, ((((ENSDNOP00000013522_Dnov_:0.0076, ENSCHOP00000000528_Chof_:0.0292):0.0051, ((ENSPCAP00000000302_Pcap_:0.0307, ENSLAFP00000007349_Lafr_:0.0346):0.0072, ENSETEP00000002471_Etel_:0.0494):0.0072):0.0076, (((((((ENSSSCP00000005138_Sscr_:0.0000, ENSSSCP00000005140_Sscr_:0.0749):0.0251, ENSBTAP00000003788_Btau_:0.0210):0.0032, ENSTTRP00000010038_Ttru_:0.0127):0.0048, ENSCAFP00000013557_Cfam_:0.0348):0.0000,
(((ENSMLUP00000003893_Mluc_:0.0278, ENSPVAP00000001739_Pvam_:0.0280):0.0000, ENSFCAP00000013649_Fcat_:0.0830):0.0022, ENSVPAP00000003573_Vpac_:0.0463):0.0000):0.0000, ENSSARP00000002591_Sara_:0.0708):0.0018, ENSECAP00000015324_Ecab_:0.0119):0.0147):0.0057, ((ENSOCUP00000014527_Ocun_:0.0243, ENSOPRP00000014620_Opri_:0.0500):0.0179, ENSCPOP00000012705_Cpor_:0.0458):0.0069):0.0024):0.0063):0.0153, ((ENSRNOP00000053270_Rnor_:0.0305, ENSRNOP00000041615_Rnor_:0.1258):0.0000, ENSMUSP00000028795_Mmus_:0.0332):0.0374):0.0592, (ENSMEUP00000013631_Meug_:0.0268, ENSMODP00000000325_Mdom_:0.1465):0.0753):0.0737, ENSOANP00000014708_Oana_:0.2186):0.0000, (((ENSGALP00000032107_Ggal_:0.0011, ENSMGAP00000002659_Mgal_:0.0296):0.0878, ENSTGUP00000007594_Tgut_:0.1159):0.0332, ENSACAP00000013598_Acar_:0.1269):0.0290):0.0631, ENSXETP00000034829_Xtro_:0.1406):0.0470, (((ENSTRUP00000010366_Trub_:0.0517, ENSTNIP00000022389_Tnig_:0.0569):0.0251, (ENSGACP00000016197_Gacu_:0.0667, ENSORLP00000022299_Olat_:0.0997):0.0455):0.0658, (ENSDARP00000060705_Drer_:0.0000, ENSDARP00000102254_Drer_:0.2574):0.1346):0.0524):0.1086, ENSCSAVP00000002327_Csav_:0.3665):0.0401, FBpp0084956_Dmel_:0.3818):0.0988, ((((((ENSDARP00000053846_Drer_:0.2500, ENSGACP00000006597_Gacu_:0.4557):0.0000, ((((((((ENSSTOP00000014167_Stri_:0.0520, ENSCPOP00000009649_Cpor_:0.0895):0.0076, ENSTBEP00000008426_Tbel_:0.0614):0.0024, (ENSPCAP00000012864_Pcap_:0.0390, ENSLAFP00000012636_Lafr_:0.0471):0.0285):0.0035, (((((ENSBTAP00000025252_Btau_:0.0220, ENSSSCP00000002496_Sscr_:0.0599):0.0149, ENSECAP00000011815_Ecab_:0.0389):0.0045, (ENSCAFP00000024255_Cfam_:0.0519, ENSEEUP00000001317_Eeur_:0.0934):0.0066):0.0000, (((ENSFCAP00000007091_Fcat_:0.0588, ENSSARP00000001222_Sara_:0.1050):0.0077, ENSPVAP00000011016_Pvam_:0.0468):0.0030, (ENSTTRP00000014685_Ttru_:0.0213, ENSVPAP00000009741_Vpac_:0.0445):0.0107):0.0014):0.0183, ENSMICP00000009918_Mmur_:0.0348):0.0065):0.0000, ((((((((ENSPTRP00000039267_Ptro_:0.0000, 
ENSP00000419881_Hsap_:0.0000):0.0026, ENSPPYP00000006740_Ppyg_:0.0091):0.0026, ENSMMUP00000020875_Mmul_:0.0092):0.0108, ENSCJAP00000027894_Cjac_:0.0198):0.0241, ENSETEP00000008672_Etel_:0.1043):0.0017, (ENSDORP00000010143_Dord_:0.1035, ENSOPRP00000004397_Opri_:0.1484):0.0086):0.0000, (((ENSMUSP00000078490_Mmus_:0.0478, ENSRNOP00000016459_Rnor_:0.0548):0.1360, ENSOCUP00000000520_Ocun_:0.1224):0.0242, ENSTSYP00000006542_Tsyr_:0.0439):0.0022):0.0023, ENSMLUP00000000626_Mluc_:0.0776):0.0000):0.0078,
ENSCHOP00000012287_Chof_:0.0365):0.0795, ENSMODP00000012669_Mdom_:0.1692):0.0647, (((ENSMGAP00000012489_Mgal_:0.0076, ENSGALP00000015437_Ggal_:0.0793):0.0629, ENSTGUP00000011847_Tgut_:0.0781):0.1183, ENSACAP00000005463_Acar_:0.1986):0.0749):0.3462):1.0094, (((((ENSTNIP00000013172_Tnig_:0.0561, ENSTRUP00000026857_Trub_:0.1996):0.0783, ENSGACP00000027559_Gacu_:0.1800):0.0383, ENSORLP00000019089_Olat_:0.1402):0.1668, ENSDARP00000029740_Drer_:0.2887):0.1059, ((((ENSTGUP00000017411_Tgut_:0.0041, ENSTGUP00000005057_Tgut_:0.0070):0.1025, (ENSGALP00000003458_Ggal_:0.0182, ENSMGAP00000004024_Mgal_:0.1614):0.1141):0.1267, (((((((((ENSMUSP00000018985_Mmus_:0.0262, ENSRNOP00000036257_Rnor_:0.0300):0.1150, ENSCPOP00000017272_Cpor_:0.0890):0.0204, ((((ENSGGOP00000000358_Ggor_:0.0019, ENSP00000378090_Hsap_:0.0551):0.0194, ENSPTRP00000015369_Ptro_:0.0057):0.0109, ENSPPYP00000009193_Ppyg_:0.0097):0.0086, ENSCJAP00000027151_Cjac_:0.0261):0.0390):0.0065, ENSVPAP00000011247_Vpac_:0.1432):0.0005, (((ENSSTOP00000005497_Stri_:0.0783, ENSOCUP00000012289_Ocun_:0.1399):0.0040, ((ENSECAP00000012028_Ecab_:0.0522, ENSMLUP00000011374_Mluc_:0.0698):0.0053, ((ENSFCAP00000000230_Fcat_:0.0271, ENSCAFP00000027051_Cfam_:0.0436):0.0358, ENSBTAP00000025404_Btau_:0.0815):0.0060):0.0086):0.0017, ((ENSEEUP00000001339_Eeur_:0.1162, ENSSARP00000000796_Sara_:0.1880):0.0030, (ENSOGAP00000008835_Ogar_:0.0559, ENSOPRP00000010741_Opri_:0.1132):0.0216):0.0042):0.0016):0.0008, ((((ENSTBEP00000011110_Tbel_:0.0687, ENSMICP00000001136_Mmur_:0.0808):0.0253, ENSPVAP00000003694_Pvam_:0.0811):0.0023, ENSTSYP00000002243_Tsyr_:0.1576):0.0037, ENSTTRP00000014683_Ttru_:0.0614):0.0281):0.0276, ENSDNOP00000003927_Dnov_:0.0568):0.0029, ((ENSLAFP00000015357_Lafr_:0.0294, ENSPCAP00000006649_Pcap_:0.0888):0.0099, ENSETEP00000009798_Etel_:0.1368):0.0158):0.1028, (ENSMODP00000023889_Mdom_:0.0358, ENSMEUP00000007724_Meug_:0.0805):0.1221):0.1544):0.0638, ENSXETP00000019650_Xtro_:0.5555):0.1321):1.5927):0.3322, 
((((((ENSMODP00000018294_Mdom_:0.0380, ENSMEUP00000006550_Meug_:0.1082):0.0896, (((ENSOGAP00000004411_Ogar_:0.0574, ENSOPRP00000000782_Opri_:0.0755):0.0076, ENSCHOP00000006029_Chof_:0.0557):0.0000, (((((((ENSP00000336701_Hsap_:0.0000, ENSPTRP00000016062_Ptro_:0.0039):0.0027, ENSPPYP00000009244_Ppyg_:0.0039):0.0084, ENSMMUP00000011981_Mmul_:0.0051):0.0192, ENSTSYP00000008137_Tsyr_:0.0329):0.0024, ((((ENSECAP00000003008_Ecab_:0.0270, ENSMLUP00000005665_Mluc_:0.0469):0.0012,
(ENSTTRP00000016289_Ttru_:0.0202, ENSSSCP00000018702_Sscr_:0.0425):0.0023):0.0000, (((ENSVPAP00000000334_Vpac_:0.0307, ENSBTAP00000020012_Btau_:0.0327):0.0072, ENSPVAP00000014496_Pvam_:0.0226):0.0016, ENSCAFP00000025883_Cfam_:0.0391):0.0027):0.0048, (ENSLAFP00000002822_Lafr_:0.0493, ENSDNOP00000003626_Dnov_:0.0580):0.0140):0.0069):0.0000, (ENSDORP00000001895_Dord_:0.0672, ENSTBEP00000013404_Tbel_:0.0695):0.0185):0.0000, (((((ENSCJAP00000011941_Cjac_:0.0020, ENSCJAP00000020026_Cjac_:0.0040):0.0058, ENSCJAP00000035050_Cjac_:0.0079):0.0266, ENSMICP00000014682_Mmur_:0.0356):0.0000, ENSEEUP00000001708_Eeur_:0.1161):0.0000, (((ENSOCUP00000001958_Ocun_:0.0000, ENSOCUP00000017133_Ocun_:0.0196):0.0597, ENSPCAP00000002180_Pcap_:0.2204):0.0000, ((((ENSRNOP00000008846_Rnor_:0.0309, ENSMUSP00000007790_Mmus_:0.1641):0.0791, ENSSTOP00000002592_Stri_:0.0494):0.0000, ENSCPOP00000004133_Cpor_:0.0808):0.0120, ENSETEP00000001932_Etel_:0.0886):0.0046):0.0027):0.0000):0.0044):0.1051):0.0621, ENSOANP00000013505_Oana_:0.1106):0.0631, (((ENSGALP00000038538_Ggal_:0.0067, ENSMGAP00000007189_Mgal_:0.0264):0.0699, (ENSTGUP00000007629_Tgut_:0.0050, ENSTGUP00000015360_Tgut_:0.0138):0.0967):0.0858, ENSACAP00000010810_Acar_:0.2435):0.0733):0.0699, ENSXETP00000002461_Xtro_:0.3031):0.1191, ((((ENSTRUP00000004854_Trub_:0.0975, ENSTNIP00000018013_Tnig_:0.1402):0.1259, ENSORLP00000024828_Olat_:0.2694):0.0330, ENSGACP00000000375_Gacu_:0.0683):0.2500, ENSDARP00000090614_Drer_:0.2005):0.1508):0.6278):0.2354, FBpp0084486_Dmel_:1.9909):1.2160, Y43C5A.6a_Cele_:0.3502):0.2385):0.0000, YER095W_Scer_:0.4878):0.0000; Or after the clustalw program if we prepare the cladogram sheet with distance node the gene age,inheritance would be measured and we could easily isolate the dl genes from dv.
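A Newick string such as the one above can be inspected without a phylogenetics package, for instance to list its leaves and count how many sequences each species contributes. The sketch below is a simplified reader, not a full Newick parser: it assumes the string has been joined into one piece, that leaf labels contain no brackets, commas, colons or spaces, and that labels end in a species code of the form `_Hsap_`. The fragment is copied from the tree above and closed with `);` so that it is a valid tree on its own.

```python
import re
from collections import Counter

# A leaf label follows "(" or "," and precedes its ":" branch length.
LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;\s]+)\s*:")
# Labels in this tree end with a species code, e.g. "_Hsap_".
SPECIES_PATTERN = re.compile(r"_([A-Za-z]+)_$")


def leaf_labels(newick):
    """Return the leaf labels of a Newick string in order of appearance."""
    return LEAF_PATTERN.findall(newick)


def species_counts(newick):
    """Count leaves per species code."""
    counts = Counter()
    for label in leaf_labels(newick):
        match = SPECIES_PATTERN.search(label)
        if match:
            counts[match.group(1)] += 1
    return counts


fragment = (
    "((ENSPTRP00000051614_Ptro_:0.0000, ENSP00000372088_Hsap_:0.0013):0.0171, "
    "ENSGGOP00000013922_Ggor_:0.0506);"
)
leaves = leaf_labels(fragment)
per_species = species_counts(fragment)
```

The number of distinct species codes gives a rough measure of the phylogenetic breadth of the gene family, which is the quantity the gene age discussion relies on.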
http://www.ensembl.org/Mus_musculus/Gene/Compara_Ortholog?db=core;g=ENSMUSG00000007646
http://www.ensembl.org/Mus_musculus/Gene/Compara_Tree?db=core;g=ENSMUSG00000007646
Below is the graphical representation of its gene tree.
Duplication and retention
To examine gene duplication events in our categories of disease genes, we used similarity methods to identify paralogs of all human disease genes. The proportion of genes with paralogs, or duplicates, was analysed for each gene category. DL genes are much more likely to be duplicates, while DV genes are mostly singletons. The high proportion of singleton genes in the DV class suggests a difference in retention in the human genome following whole genome duplications for these genes, with many duplicates or paralogs being lost after duplication.
Duplicate and singleton identification method
Sequences were retrieved from Ensembl through BioMart. BLAT v.32 was used for the sequence similarity search, and an E-value threshold of 10^20 was used to identify duplicate and singleton genes.
All genes AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963, CH466556, defined by their start and end positions, were megablasted against human/mouse, and applying an E-value threshold of 10^20 highlights the duplicate genes between human and mouse.
[ Figure represents the frequency of duplicate/singleton nature of DL/DV ]
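The duplicate/singleton call can be sketched as a filter over similarity search hits: a gene is a duplicate if it has at least one non-self hit that passes the E-value cutoff, and a singleton otherwise. This is an illustrative sketch only. The hit records and gene labels are hypothetical, and the cutoff is left as a parameter; the default of 1e-20 is an assumption made here, on the reasoning that smaller E-values indicate stronger matches, and is not taken from the text.

```python
from collections import namedtuple

# One similarity search hit: query gene, matched gene, and the hit's E-value.
Hit = namedtuple("Hit", ["query", "subject", "evalue"])


def classify_duplicates(gene_ids, hits, max_evalue=1e-20):
    """Label each gene 'duplicate' or 'singleton' from its non-self hits."""
    duplicated = set()
    for hit in hits:
        # Self-hits are ignored; only sufficiently strong matches count.
        if hit.query != hit.subject and hit.evalue <= max_evalue:
            duplicated.add(hit.query)
    return {g: ("duplicate" if g in duplicated else "singleton") for g in gene_ids}


# Hypothetical hits for three hypothetical genes.
hits = [
    Hit("geneA", "geneA", 0.0),      # self-hit, ignored
    Hit("geneA", "geneA2", 1e-45),   # strong paralog match
    Hit("geneB", "geneB", 0.0),      # self-hit only
    Hit("geneC", "geneX", 1e-3),     # weak match, above the cutoff
]
calls = classify_duplicates(["geneA", "geneB", "geneC"], hits)
```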
MGI:2150020
Chr.11(-): 87190152-87218268 [NCBI37]
Entrez Gene 114714 Chr.11(-): 87190146-87217940 [NCBI37]
GO:0000166 [nucleotide binding] evidence: IEA
GO:0048476 [Holliday junction resolvase complex] evidence: IC
lethality-prenatal/perinatal
MGI:98742 was not found in genetrap, which suggests that it is a disease unknown gene.
However, its nucleotide binding property suggests that it is a disease lethal gene.
7.1 Appendices:
Below are all the mouse mutagenesis mutants with developmental defects from the 69 Mb to 104 Mb region of chromosome 11 of Mus musculus.
Category        Mutant
craniofacial    30
eye             3
fertility       8
growth          80
lethal          89
metabolism      10
neurological    76
skeletal        32
skin and coat   31
undefined       3
urogenital      1
• CRANIOFACIAL:
MGI Accession Lab Name Phenotype
MGI:2671740. crfm02Jus craniofacial, affected testers are smaller, have shorter snouts.
MGI:2671741. crfm05Jus craniofacial, patchy hair loss.
MGI:2671742. crfm06Jus craniofacial, testers have shorter faces and are smaller than carrier siblings.
MGI:2671743. crfm08Jus craniofacial, testers have a subtle short snout phenotype.
MGI:2671832. crfm18Jus smaller head, short snout, not completely penetrant.
MGI:3046702. crfm26Jus testers are smaller, hydrocephalous, do not live very long past 8 or 12 weeks.
• FERTILITY:
MGI Accession Lab Name Phenotype
MGI:2671711. infm02Jus male infertility, low sperm count, normal morphology. ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI:2671710. infm03Jus female infertility. ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI:2671707. infm04Jus male infertility, low sperm count, not motile, unusual morphology. ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004. also in the same complementation group as inf08 and inf09.
MGI:2671706. infm05Jus male infertility, ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI:2671699. infm07Jus female infertility, ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI:2671697. infm08Jus in the same complementation group as inf04 and inf09. ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI:2671691. infm09Jus in the same complementation group as inf04 and inf08. ref. Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
• GROWTH:
MGI Accession Lab Name Phenotype
MGI:2671718. grom01Jus small animals seen in small litters of 5/4 pups. small runs about 7-9 gm, while other pups are 13-16 gm size.
MGI:2671720. grom22Jus affected testers 3/4 size of normal littermates, low serum cholesterol.
MGI:2671721. grom40Jus testers are 1/2 size of unaffected siblings.
MGI:2671722. grom41Jus 2 of 5 testers 3/4 size of carrier sibs, appears to be on chromosome 11 but not completely penetrant.
MGI:2671723. grom42Jus testers small, 3/4 size of carrier siblings.
MGI:3046714. grom79Jus small, 3/4, 10 gm vs 16 gm at n1f1.
• LETHAL:
MGI Accession Lab Name Time of Death
MGI:2671871. l11Jus01 5.5 - 8.5 dpc.
MGI:2671872. l11Jus02 5.5 - 8.5 dpc.
MGI:2671873. l11Jus03 5.5 - 8.5 dpc.
MGI:2671874. l11Jus04 5.5 - 8.5 dpc.
MGI:2671876. l11Jus05 9.5 - 12.5 dpc.
MGI:2671877. l11Jus06 9.5 - 12.5 dpc.
MGI:2671878. l11Jus07 5.5 - 8.5 dpc.
MGI:2671879. l11Jus08 9.5 - 12.5 dpc.
MGI:2671880. l11Jus09 9.5 - 12.5 dpc.
MGI:2671881. l11Jus10 peri-natal lethal.
MGI:2671882. l11Jus11 5.5 - 8.5 dpc.
MGI:2671883. l11Jus12 5.5 - 8.5 dpc.
MGI:2671884. l11Jus13 peri-natal lethal.
MGI:2671885. l11Jus14 9.5 - 12.5 dpc.
MGI:2671886. l11Jus15 13.5 - 18.5 dpc.
MGI:2671887. l11Jus16 peri-natal lethal.
MGI:2671888. l11Jus17 9.5 - 12.5 dpc.
MGI:2671889. l11Jus18 9.5 - 12.5 dpc.
MGI:2671890. l11Jus19 9.5 - 12.5 dpc.
MGI:2671891. l11Jus20 9.5 - 12.5 dpc.
MGI:2671892. l11Jus21 peri-natal lethal.
MGI:2671893. l11Jus22 peri-natal lethal.
MGI:2671894. l11Jus23 peri-natal lethal.
MGI:2671896. l11Jus24 peri-natal lethal.
MGI:2671897. l11Jus25 post-natal lethal.
MGI:2671898. l11Jus26 post-natal lethal.
MGI:2671899. l11Jus27 9.5 - 12.5 dpc.
MGI:2671900. l11Jus28 9.5 - 12.5 dpc.
MGI:2671901. l11Jus29 13.5 - 18.5 dpc.
MGI:2671902. l11Jus30 peri-natal lethal.
MGI:2671903. l11Jus31 peri-natal lethal.
MGI:2671904. l11Jus32 peri-natal lethal.
MGI:2671906. l11Jus33 peri-natal lethal.
MGI:2671907. l11Jus34 5.5 - 8.5 dpc.
MGI:2671908. l11Jus35 5.5 - 8.5 dpc.
MGI:2671909. l11Jus36 9.5 - 12.5 dpc.
MGI:2671910. l11Jus37 9.5 - 12.5 dpc.
MGI:2671911. l11Jus38 5.5 - 8.5 dpc.
MGI:2671912. l11Jus39 9.5 - 12.5 dpc.
MGI:2671913. l11Jus40 peri-natal lethal.
MGI:2671914. l11Jus41 9.5 - 12.5 dpc.
MGI:2671915. l11Jus42 5.5 - 8.5 dpc.
MGI:2671916. l11Jus43 peri-natal lethal.
MGI:2671917. l11Jus44 peri-natal lethal.
MGI:2671918. l11Jus45 9.5 - 12.5 dpc.
MGI:2671919. l11Jus46 9.5 - 12.5 dpc.
MGI:2671920. l11Jus47 9.5 - 12.5 dpc.
MGI:2671921. l11Jus48 5.5 - 8.5 dpc.
MGI:3034009. l11Jus49 5.5 - 8.5 dpc.
MGI:3034010. l11Jus50 after 12.5 dpc.
MGI:2671922. l11Jus51 post-natal lethal.
MGI:2671923. l11Jus52 post-natal lethal.
MGI:2671924. l11Jus53 post-natal lethal.
MGI:2671925. l11Jus54 post-natal lethal.
MGI:2671926. l11Jus55 post-natal lethal.
MGI:2671927. l11Jus56 post-natal lethal.
MGI:2671928. l11Jus57 post-natal lethal.
MGI:3034011. l11Jus58 9.5 - 12.5 dpc.
MGI:3043663. l11Jus59 still to be determined.
• METABOLISM:
MGI Accession Lab Name Phenotype
MGI:2671716. hemm04 low rbc/hgb/hct 11/17/52.
MGI:2671712. hem1 low wbc, neutrophilic blasts.
MGI:2671713. hem2 low wbc, cf c: 1-2 vs 8.
MGI:2671715. hem3 low rbc/hgb/hct 6/10/32.
• NEUROLOGICAL:
MGI Accession Lab Name Phenotype
MGI:2671725. nurm01Jus tester animals are hyperactive, nervousness, tremors. previously known as jittery 1.
MGI:2671727. nurm02Jus testers are hyperactive. previously known as shaky 3.
MGI:2671729. nurm03Jus small, hyperactive, some affecteds show craniofacial abnormalities. previously known as small hyper 3.
MGI:2671730. nurm04Jus affected animals have a quivering phenotype, the phenotype was late onset. previously known as shaky 4.
MGI:2671731. nurm05Jus affected animals have a quivering phenotype, noticeable when they move. phenotype is late onset, do not see it until at least 3 months old. previously called shaky 5.
MGI:2671732. nurm06Jus affected animals hyperactive and 1/2 size of siblings, 6 of 9 testers, 2 of 24 carriers affected, may be outside of inversion on 11. previously called small hyper 5.
MGI:2671733. nurm07Jus smaller, lethargic, testers have reduced open field activity, develop late onset tremors upon movement. previously known as small lethargic.
MGI:2671734. nurm08Jus hyper, seizing, previously known as flicker.
MGI:2671735. nurm09Jus hyperactive, jittery weaving gait, hearing loss. previously known as jittery 2.
• SKIN AND COAT:
MGI Accession Lab Name Phenotype
MGI:2671724. skcm01Jus greasy looking hair, previously known as greasy coat.
MGI:3038892. Skcm02Jus scruffy, hair sticks out straight, previously known as pete rose hair.
4 CONCLUSION
For many years, the mapping and identification of disease-causing genes in humans has been carried out in many laboratories using different methodologies. Today, classical map-based gene discovery has been augmented by sequence-based gene discovery, given that the Human Genome Project has produced high-precision tools for disease gene locus mapping and identification.
So far, the characterization of genetic defects has been successfully accomplished in more than 1600 human monogenic diseases. Mapping common and genetically complex human disease traits has proved more difficult, but even in these more complex cases a number of mutations associated with complex human diseases have been identified. Like the confirmatory tests for different types of radicals in a chemistry lab, there must be a systematic procedure for the identification of disease genes. In many cases, mouse knockout gene datasets are taken as an alternative way to understand the role of disease genes in humans. Because mouse chromosome 11 is similar to human chromosome 17, it was taken in this article as a suitable proxy model for finding disease loci. Here, bioinformatics-based analyses are carried out to classify genes from mouse chromosome 11 in the mutagenesis screen (69 Mb to 104 Mb) and to address further questions: how many genes take part in duplicacy, what their actual positions are, and whether the function of a gene is affected by its location.
This classification of DL and DV genes, and of their under-expressed and over-expressed characters, helps in the study of human disease, ageing, biosenescence, etc. Bioinformatics-based classification, combined with a few statistical parameters, helps to make such predictions more accurately.
ACKNOWLEDGMENT
I am sincerely thankful to all my students and laboratory staff, and especially to the Director, Mr P.K. Boss, the Chemistry Head, and Prof. John Pejjulo, PhD in biostatistics, for his online support during my work.
Glossary:
1. Cladogram:- Phylogenetic tree showing the relationship between species.
2. Homologues:- Two genes are homologous if they evolved from the same common ancestor.
3. Pattern:- Conserved residues that one can use as a functional signature.
4. TrEMBL:- Translated EMBL, which contains all the putative protein sequences contained in the nucleotide databases. Its US counterpart is the Non-redundant database.
5. E-value:- Expected value, i.e. how likely it is that the similarity between your sequence and a database sequence is due to chance. The lower the E-value, the more suitable the match is for research: an E-value of 10^-32 is better than one of 10^-4.
6. P-value:- In statistical testing, the p-value indicates whether some effect (such as a difference in the average value of some quantity between two groups, or a correlation between two numerical variables) is statistically significant. Statistically significant generally means that the observed result (for example, the difference in an average between two groups) is very unlikely to have arisen only from random fluctuations in the observed sample if there is truly no difference between the two groups in the whole population. The p-value is the probability of getting results at least as convincing as those actually obtained if nothing is really going on beyond random fluctuations. If this p-value is less than some small number (often set at 0.05), the results are said to be statistically significant.
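As a worked illustration of a p-value, the enrichment of one gene class in a cellular compartment can be tested with a one-sided Fisher exact test on a 2x2 table. This is a sketch under stated assumptions: the paper does not say which statistical test was used, the function name is hypothetical, and the counts are invented purely to show the arithmetic.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing at least `a` in the top-left cell
    when the row and column totals are fixed and there is no association.
    """
    row1 = a + b              # e.g. number of DL genes
    col1 = a + c              # e.g. number of nuclear genes
    n = a + b + c + d         # all genes in the table
    total = comb(n, col1)
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / total


# Invented counts: 3 nuclear and 1 non-nuclear DL gene, 1 nuclear and 3 non-nuclear DV genes.
p_value = fisher_exact_greater(3, 1, 1, 3)   # 17/70, about 0.243
significant = p_value < 0.05
```

With these invented counts the p-value is well above 0.05, so the apparent excess of nuclear genes in the DL group would not be called statistically significant; larger gene sets are needed before such a difference can be trusted.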
References
1. Hentges, K.E., Pollock, D.D., Liu, B. and Justice, M.J. (2007) Regional variation in the density of essential genes in mice. PLoS Genet.
2. McKusick, V. (1998) Mendelian Inheritance in Man: A Catalog of Human Genes and Genetic Disorders.
3. Bult, C.J., Eppig, J.T., Kadin, J.A., Richardson, J.E. and Blake, J.A. (2008) The Mouse Genome Database (MGD): mouse biology and model systems. Nucleic Acids Res, 36, D724-8.
4. Maere, S., Heymans, K. and Kuiper, M. (2005) BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics, 21, 3448-9.
5. Lowe, H.J. and Barnett, G.O. (1994) Understanding and using the medical subject heading (MeSH) vocabulary.
6. Smedley, D., Haider, S., Ballester, B., Holland, R., London, D., Thorisson, G. and Kasprzyk, A. (2009) BioMart--biological queries made easy. BMC Genomics, 10, 22.
7. Kent, W.J. (2002) BLAT--the BLAST-like alignment tool. Genome Res, 12, 656-64.