Post on 20-Dec-2015
VisiGene - A Virtual Microscope and Database for In Situ Images at genome.ucsc.edu
Galt Barber, Donna Karolchik, David Haussler, Jim Kent
VisiGene displays images from in situ RNA hybridization, reporter genes, and other techniques that show where a gene, enhancer, or promoter is active in an organism. VisiGene currently contains ~100,000 images from several high-throughput gene projects, along with images from the literature as curated by the model organism databases.

The controls for VisiGene are simple: a text box for search terms, a scrolling list of thumbnails of images that match the search terms, and a large region that serves as a virtual microscope for the selected image. Clicking on a region goes to the next level of magnification, centered on that region. VisiGene transmits only the data for the part of the image being viewed, at the scale at which it is being viewed, so the response time is fast. One can scroll through the image by dragging it with the mouse. Underneath the image is a caption that contains a link to the paper associated with the image; hyperlinks to the UCSC Genome Browser page for the genes; the age, sex, and genotype of the organism; and, when available, human-curated information on the anatomical structures in which the gene is active.

The search terms include gene names and symbols, authors, date of publication, organisms, developmental stages, and anatomical structures. The Genome Browser and Gene Sorter contain tracks and columns that link into VisiGene. Current image sets include mouse transcription factors from the Mahoney Lab, adult mouse brain images from the Allen Brain Atlas, mouse head and brain images from the GENSAT project, whole-mount Xenopus laevis images from the Japanese National Institute for Basic Biology, and images from the mouse literature curated by the GXD group of MGI. We are grateful to all who have contributed images to VisiGene so far, and are actively searching for additional image sets.
Database structure
Indentation shows the parent/child relationship between tables. Key fields used to join tables are underlined. In general, a key field named xyz links into the id field of the xyz table.
table                 fields
submissionSource      id,name,acknowledgement,setUrl,itemUrl,abUrl
submissionSet         id,name,contributors,year,publication,pubUrl,journal,copyright,submissionSource
  journal             id,name,url
  copyright           id,notice
imageFile             id,fileName,priority,imageWidth,imageHeight,submissionSet,submitId,caption
  caption             id,caption
image                 id,submissionSet,imageFile,imagePos,paneLabel,sectionSet,sectionIx,specimen,preparation
  specimen            id,name,taxon,genotype,bodyPart,sex,age,minAge,maxAge,notes
    bodyPart          id,name
    sex               id,name
    genotype          id,taxon,strain,alleles
      strain          id,taxon,name
      genotypeAllele  genotype,allele
        allele        id,gene,name
          gene        id,name,locusLink,refSeq,genbank,uniProt,taxon
  preparation         id,fixation,embedding,permeablization,sliceType,notes
    fixation          id,description
    embedding         id,description
    permeablization   id,description
    sliceType         id,name
imageProbe            image,probe,probeColor
  probe               id,gene,antibody,probeType,fPrimer,rPrimer,seq,bac
    gene              id,name,locusLink,refSeq,genbank,uniProt,taxon
    antibody          id,name,description,taxon
    probeType         id,name
    bac               id,name
  probeColor          id,name
expressionLevel       imageProbe,bodyPart,level,cellType,cellSubtype,expressionPattern
  bodyPart            id,name
  cellType            id,name
  cellSubtype         id,name
  expressionPattern   id,name
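The key-field convention above (a field named xyz holds an id from the xyz table) can be illustrated with a small join. This is only a sketch: it uses an in-memory SQLite database as a stand-in for the MySQL database, keeps just the id and key fields of five of the tables listed above, and fills them with made-up rows. The helper name files_for_gene and all sample values are hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for the real MySQL database (an assumption).
# Only a reduced subset of the fields listed in the table above is kept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene       (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE probe      (id INTEGER PRIMARY KEY, gene INTEGER);
CREATE TABLE imageFile  (id INTEGER PRIMARY KEY, fileName TEXT);
CREATE TABLE image      (id INTEGER PRIMARY KEY, imageFile INTEGER);
CREATE TABLE imageProbe (image INTEGER, probe INTEGER);

-- Made-up sample rows, for illustration only.
INSERT INTO gene       VALUES (1, 'exampleGeneA'), (2, 'exampleGeneB');
INSERT INTO probe      VALUES (10, 1), (11, 2);
INSERT INTO imageFile  VALUES (100, 'a.jpg'), (101, 'b.jpg');
INSERT INTO image      VALUES (1000, 100), (1001, 101), (1002, 101);
INSERT INTO imageProbe VALUES (1000, 10), (1001, 11), (1002, 10);
""")


def files_for_gene(conn, gene_name):
    """Return the sorted file names of all images whose probe targets gene_name.

    Every join follows the stated convention: a key field named xyz is
    matched against the id field of the xyz table.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT imageFile.fileName
        FROM gene
        JOIN probe      ON probe.gene = gene.id
        JOIN imageProbe ON imageProbe.probe = probe.id
        JOIN image      ON image.id = imageProbe.image
        JOIN imageFile  ON imageFile.id = image.imageFile
        WHERE gene.name = ?
        ORDER BY imageFile.fileName
        """,
        (gene_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(files_for_gene(conn, "exampleGeneA"))
```

imageProbe and genotypeAllele have no id field of their own in the table above; they act as link tables, which is why imageProbe is joined on both of its key fields here.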
[Figure: the same image at full resolution, 1/2x, and 1/4x]
Full-sized images are shrunk to 1/2, 1/4, 1/8, 1/16, 1/32, and 1/64 scale. The images at each scale are cut into 512x512 tiles. This processing happens offline on our computer cluster. JavaScript code in "bigImage.html" requests just those tiles needed to show the current window. bigImage.html is independent of the database and could easily be used to deliver other high-resolution imagery over the web.
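The tile arithmetic implied by this scheme can be sketched as follows. This is not the code in bigImage.html or vgPrepImage; it is a minimal Python illustration under stated assumptions: scaled dimensions are rounded up, tiles are addressed by (column, row) starting at the top-left corner, and the function names are hypothetical.

```python
TILE = 512                               # tile edge in pixels, as stated above
SHRINK_FACTORS = (1, 2, 4, 8, 16, 32, 64)  # full size plus the six reduced scales


def ceil_div(a, b):
    """Integer division rounded up (avoids floating point)."""
    return -(-a // b)


def level_size(full_w, full_h, factor):
    """Pixel size of the image shrunk by `factor` (rounding up is an assumption)."""
    return ceil_div(full_w, factor), ceil_div(full_h, factor)


def tile_grid(level_w, level_h, tile=TILE):
    """Number of tile columns and rows needed to cover one scale level."""
    return ceil_div(level_w, tile), ceil_div(level_h, tile)


def visible_tiles(view_x, view_y, view_w, view_h, level_w, level_h, tile=TILE):
    """Return the (col, row) tiles that intersect the viewport at one scale.

    The viewport is given in pixel coordinates of that scale level and is
    clipped to the image, so a window hanging off the edge requests no
    tiles that do not exist.
    """
    x0, y0 = max(view_x, 0), max(view_y, 0)
    x1 = min(view_x + view_w, level_w)
    y1 = min(view_y + view_h, level_h)
    if x0 >= x1 or y0 >= y1:
        return []                        # viewport lies entirely outside the image
    first_col, last_col = x0 // tile, (x1 - 1) // tile
    first_row, last_row = y0 // tile, (y1 - 1) // tile
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


if __name__ == "__main__":
    w, h = level_size(4000, 3000, 2)     # hypothetical 4000x3000 image at 1/2 scale
    print(tile_grid(w, h))
    print(visible_tiles(400, 300, 800, 600, w, h))
```

For the hypothetical 4000x3000 image, an 800x600 window at 1/2 scale touches only 6 of that level's 12 tiles, which is why the amount of data transmitted depends on the window size rather than on the size of the full image.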
[Diagram: data flow from the image sources to the web browser]
Image sources (metadata format; image delivery; probe information):
JAX/MGI: SQL database; JPEGs via http; gene names
Mahoney Lab: Excel spreadsheet; JPEGs on a laptop; PCR primers
NCBI GENSAT: XML dump; JPEGs via http; BAC sequences
Japanese NIBB: file naming scheme; JPEGs on 3 CDs; EST sequences
Allen Brain: Excel spreadsheet; JPEG 2000 on an external hard drive; clone sequences
Loaders:
vgLoadJax, 978 lines of C
vgLoadMahoney, 724 lines of C
vgLoadGensat, 301 lines of C
vgLoadJax, 204 lines of C
vgLoadJax, 253 lines of C
Intermediate files:
Directory containing 3 files per submission: submission.ra, imageInfo.tab, caption.txt
Directories of full-sized images
Processing tools and outputs:
visiGeneLoad (1332 lines of C): ~1,000,000-row MySQL database
vgPrepImage (832 lines of C): ~4,000,000 512x512 JPEG image tiles
vgGetText (290 lines of C): free-text gene-aware index
Serving:
hgVisiGene web CGI script, 3988 lines of C
bigImage.html, JavaScript + HTML, 1098 lines
Your web browser
Acknowledgements
Imagery and Caption Data:
Paul Gray and the Mahoney Lab
Martin Ringwald, Susan McClatchy, Janan Eppig, and the Gene Expression folks at MGI/Jackson Labs
Michael DiCuccio at NCBI and the GENSAT project
Naoto Ueno and the Japanese National Institute for Basic Biology
Susan Sunkin and the Allen Brain Institute
Software Tools:
MySQL
ImageMagick
ER Mapper (for JPEG 2000 libraries)
GNU Compiler Collection & Linux
Funding:
VisiGene was developed as a skunk works under NHGRI grant 1P41HG02371
Special thanks to the Quality Assurance Group at genome.ucsc.edu for all their help in making VisiGene a robust web application.