
VisiGene - A Virtual Microscope and Database for In Situ Images at genome.ucsc.edu

Galt Barber, Donna Karolchik, David Haussler, Jim Kent

VisiGene displays images from in situ RNA hybridization, reporter genes, and other techniques that show where a gene, enhancer, or promoter is active in an organism. Currently VisiGene contains ~100,000 images from several high-throughput gene projects, as well as images from the literature as curated by the model organism databases.

The controls for VisiGene are quite simple. There is a text box for search terms, a scrolling list of thumbnails of the images that match the search terms, and a large region that serves as a virtual microscope for the selected image. Clicking on a region goes to the next level of magnification, centered on that region. One can scroll through the image by dragging it with the mouse. VisiGene transmits only the data for the part of the image being viewed, at the current scale, so the response time is quite fast.

Underneath the image is a caption that contains a link to the paper associated with the image, hyperlinks to the UCSC Genome Browser page for the genes, the age, sex, and genotype of the organism, and, when available, human-curated information on which anatomical structures the gene is active in. The search terms include gene names and symbols, authors, date of publication, organisms, developmental stages, and anatomical structures. The Genome Browser and Gene Sorter contain tracks and columns that link into VisiGene.

Current image sets include mouse transcription factors from the Mahoney Lab, adult mouse brain images from the Allen Brain Atlas, mouse head and brain images from the GENSAT project, whole-mount Xenopus laevis images from the Japanese National Institute for Basic Biology, and images from the mouse literature curated by the GXD group of MGI. We are grateful to all who have contributed images to VisiGene so far, and are actively searching for additional image sets.

Database structure

Indentation shows the parent/child relationships between tables. Key fields used to join tables are underlined. In general, a key field named xyz links into the id field of the xyz table.

table: fields
submissionSource: id, name, acknowledgement, setUrl, itemUrl, abUrl
submissionSet: id, name, contributors, year, publication, pubUrl, journal, copyright, submissionSource
  journal: id, name, url
  copyright: id, notice
imageFile: id, fileName, priority, imageWidth, imageHeight, submissionSet, submitId, caption
  caption: id, caption
image: id, submissionSet, imageFile, imagePos, paneLabel, sectionSet, sectionIx, specimen, preparation
  specimen: id, name, taxon, genotype, bodyPart, sex, age, minAge, maxAge, notes
    bodyPart: id, name
    sex: id, name
    genotype: id, taxon, strain, alleles
      strain: id, taxon, name
      genotypeAllele: genotype, allele
        allele: id, gene, name
          gene: id, name, locusLink, refSeq, genbank, uniProt, taxon
  preparation: id, fixation, embedding, permeablization, sliceType, notes
    fixation: id, description
    embedding: id, description
    permeablization: id, description
    sliceType: id, name
imageProbe: image, probe, probeColor
  probe: id, gene, antibody, probeType, fPrimer, rPrimer, seq, bac
    gene: id, name, locusLink, refSeq, genbank, uniProt, taxon
    antibody: id, name, description, taxon
    probeType: id, name
    bac: id, name
  probeColor: id, name
expressionLevel: imageProbe, bodyPart, level, cellType, cellSubtype, expressionPattern
  bodyPart: id, name
  cellType: id, name
  cellSubtype: id, name
  expressionPattern: id, name
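The joining rule (a key field named xyz links into the id field of the xyz table) can be tried out on a small scale. The following is a minimal sketch, not the VisiGene loader or CGI code: it uses Python's built-in sqlite3 module rather than MySQL, keeps only a handful of the tables and columns listed above, and fills them with made-up rows. The helper name files_for_gene is hypothetical.

```python
import sqlite3

# In-memory stand-in for a small slice of the schema (assumption: SQLite
# instead of MySQL, and only the columns needed for this one query).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene       (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE probe      (id INTEGER PRIMARY KEY, gene INTEGER);
CREATE TABLE imageFile  (id INTEGER PRIMARY KEY, fileName TEXT);
CREATE TABLE image      (id INTEGER PRIMARY KEY, imageFile INTEGER);
CREATE TABLE imageProbe (image INTEGER, probe INTEGER);

-- Made-up sample rows, for illustration only.
INSERT INTO gene VALUES (1, 'Pax6'), (2, 'Sox2');
INSERT INTO probe VALUES (1, 1), (2, 2);
INSERT INTO imageFile VALUES (1, 'a.jpg'), (2, 'b.jpg');
INSERT INTO image VALUES (1, 1), (2, 2), (3, 2);
INSERT INTO imageProbe VALUES (1, 1), (2, 2), (3, 1);
""")


def files_for_gene(conn, gene_name):
    """Return the sorted file names of images probed for the named gene.

    Each join follows the rule from the text: a field named xyz
    matches the id field of the xyz table.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT imageFile.fileName
        FROM gene
        JOIN probe      ON probe.gene = gene.id
        JOIN imageProbe ON imageProbe.probe = probe.id
        JOIN image      ON image.id = imageProbe.image
        JOIN imageFile  ON imageFile.id = image.imageFile
        WHERE gene.name = ?
        ORDER BY imageFile.fileName
        """,
        (gene_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(files_for_gene(conn, "Pax6"))
```

Because imageProbe carries no id of its own and only pairs an image with a probe, one image can be linked to several probes and one probe to several images.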

Figure panels: Full Resolution Image, 1/2x Image, 1/4x Image

Full-sized images are shrunk to 1/2, 1/4, 1/8, 1/16, 1/32, and 1/64 of their original size. The images at each scale are cut into 512x512 tiles. This processing happens off-line on our computer cluster. JavaScript code in “bigImage.html” requests just those tiles needed to show the current window. bigImage.html is independent of the database and could easily be used to deliver other high-resolution imagery over the web.
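The arithmetic behind "just those tiles needed" is simple enough to sketch. The Python below is an illustration only, not the code in bigImage.html or vgPrepImage: the function names are hypothetical, and rounding partial pixels and partial tiles upward is an assumption about how edges are handled.

```python
TILE = 512                                 # tile edge in pixels, from the text
SHRINK_FACTORS = [1, 2, 4, 8, 16, 32, 64]  # full size plus the six reduced scales


def level_size(full_w, full_h, factor):
    """Pixel size of the image after shrinking by `factor` (rounded up)."""
    return (-(-full_w // factor), -(-full_h // factor))


def tile_grid(img_w, img_h, tile=TILE):
    """Number of tile (columns, rows) needed to cover an image of this size."""
    return (-(-img_w // tile), -(-img_h // tile))


def visible_tiles(view_x, view_y, view_w, view_h, img_w, img_h, tile=TILE):
    """List the (col, row) tiles that overlap the viewport at one scale.

    The viewport is given in pixel coordinates of the scaled image and is
    clipped to the image bounds before tiles are selected.
    """
    x0, y0 = max(view_x, 0), max(view_y, 0)
    x1, y1 = min(view_x + view_w, img_w), min(view_y + view_h, img_h)
    if x0 >= x1 or y0 >= y1:
        return []  # viewport lies entirely outside the image
    first_col, last_col = x0 // tile, (x1 - 1) // tile
    first_row, last_row = y0 // tile, (y1 - 1) // tile
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# Example: a hypothetical 8000x6000 image viewed at 1/4 scale
# through an 800x600 window.
w, h = level_size(8000, 6000, 4)
needed = visible_tiles(500, 500, 800, 600, w, h)
cols, rows = tile_grid(w, h)
print(f"{len(needed)} of {cols * rows} tiles requested")
```

The number of tiles fetched depends on the window size rather than the image size, which is why the viewer stays responsive even at full resolution.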

JAX/MGI: SQL database; JPEGs via http; gene names
Mahoney Lab: Excel spreadsheet; JPEGs on laptop; PCR primers
NCBI Gensat: XML dump; JPEGs via http; BAC seq.
Japanese NIBB: file naming scheme; JPEGs on 3 CDs; EST seq.
Allen Brain: Excel spreadsheet; JPEG 2000 on ext. HD; clone seq.

vgLoadJax: 978 lines of C
vgLoadMahoney: 724 lines of C
vgLoadGensat: 301 lines of C
Directory containing 3 files per submission: submission.ra, imageInfo.tab, caption.txt
visiGeneLoad: 1332 lines of C
vgPrepImage: 832 lines of C
~4,000,000 512x512 JPEG image tiles
~1,000,000-row MySQL database
Free-text, gene-aware index
vgGetText: 290 lines of C
Directories of full-sized images
hgVisiGene: Web CGI script, 3988 lines of C
bigImage.html: JavaScript + HTML, 1098 lines
Your web browser

Acknowledgements

Imagery and Caption Data:

Paul Gray and the Mahoney Lab

Martin Ringwald, Susan McClatchy, Janan Eppig, and the Gene Expression folks at MGI/Jackson Labs

Michael DiCuccio at NCBI and the GENSAT project

Naoto Ueno and the Japanese National Institute for Basic Biology

Susan Sunkin and the Allen Brain Institute

Software Tools:

MySQL

ImageMagick

ER Mapper (for JPEG 2000 libraries)

GNU Compiler Collection & Linux

Funding:

VisiGene was developed as a skunk works under NHGRI grant 1P41HG02371

Special thanks to the Quality Assurance Group at genome.ucsc.edu for all their help in making VisiGene a robust web application.
