Topic Maps Custom viewing of data Ashish Mahabal – SPIE ‘02

Post on 20-Dec-2015


Topic Maps

Custom viewing of data

Ashish Mahabal – SPIE ‘02

http://www.astro.caltech.edu/~aam/science/topicmaps/

Roy Williams, George Djorgovski, Robert Brunner

Talk plan

Quick reminder of what Topic Maps are

What UCDs are

What a UCD Topic Map is

Where this is leading us

What are Topic Maps?

A semantic network over an information pool

A configurable data interconnections viewer

SQL for XML

Data Discovery Tool

What do Topic Maps consist of?

Topics (subjects, names, thingies : vertices)

Associations between Topics (relationships : edges)

Occurrences of Topics and of Associations (incidences)

SCOPES!

Some examples

NGC 4261 is a Topic (of type “galaxy”)

Galaxy is a Topic (of type “object”)

A “dustlane” is also a Topic (more general, of type “galaxy feature”)

“NGC 4261 contains a dustlane” is an Association (of type “object-contains-feature”)
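The examples on this slide can be written down as a small graph. The following is only a minimal sketch, assuming a hypothetical in-memory representation (the `Topic` and `Association` classes and the `neighbours` helper are illustrative names, not part of any Topic Map standard or tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """A vertex of the graph: a named subject with a type."""
    name: str
    type: str


@dataclass(frozen=True)
class Association:
    """An edge of the graph: a typed relationship between topics."""
    type: str
    members: tuple


# Topics taken from the slide.
galaxy = Topic("galaxy", "object")
ngc4261 = Topic("NGC 4261", "galaxy")
dustlane = Topic("dustlane", "galaxy feature")

# "NGC 4261 contains a dustlane"
contains = Association("object-contains-feature", (ngc4261, dustlane))


def neighbours(topic, associations):
    """Return every topic that shares an association with `topic`."""
    return [
        member
        for assoc in associations
        if topic in assoc.members
        for member in assoc.members
        if member != topic
    ]


print(neighbours(ngc4261, [contains]))
```

Walking the associations from one topic to its neighbours is what makes the map browsable as a network rather than a flat index.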

NGC 4261 can have multiple occurrences

As an elliptical galaxy

As a radio galaxy (3C 270)

As a dust lane galaxy

As an object in ApJ 1995 Mahabal et al.

The different Occurrences appear as different Scopes for the Topic NGC 4261.

Indexing is possible on Scopes, making Topic Maps more powerful than RDF
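Indexing on Scopes can be illustrated with two plain dictionaries. This is a rough sketch under the assumption that each occurrence is stored as a (topic, scope, label) triple; that layout and the variable names are hypothetical, and the scope strings are simply the four contexts listed on the slide:

```python
from collections import defaultdict

# (topic, scope, name under which the topic appears in that scope)
occurrences = [
    ("NGC 4261", "elliptical galaxy", "NGC 4261"),
    ("NGC 4261", "radio galaxy", "3C 270"),
    ("NGC 4261", "dust lane galaxy", "NGC 4261"),
    ("NGC 4261", "ApJ 1995 Mahabal et al.", "NGC 4261"),
]

by_scope = defaultdict(list)  # scope -> [(topic, label), ...]
by_topic = defaultdict(list)  # topic -> [scope, ...]
for topic, scope, label in occurrences:
    by_scope[scope].append((topic, label))
    by_topic[topic].append(scope)

# Which topics are visible in the "radio galaxy" scope, and under what name?
print(by_scope["radio galaxy"])
# In which scopes does NGC 4261 occur?
print(by_topic["NGC 4261"])
```

The same set of occurrences is indexed twice, once per question one might ask of it.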

Where do UCDs come in?

Tables

Columns

Column names

Standardization of column names in tables:

4-tier hierarchy with over 1400 UCDs (CDS)

UCD Topic Map: Topics

UCDs

Tables

Column names

UCD descriptions

Units

EXTERNAL LINKS

UCD Topic Map: Associations

UCD occurs in TABLE

UCD corresponds to COLUMN DESCRIPTION

UCD in TABLE has units UNIT

COLUMN DESCRIPTION occurs in TABLE

UCD is associated with UCD DESCRIPTION

UCD is parent of UCD (in the hierarchical structure)

UCD Topic Map: Occurrences

In tables

With units

External table links

MORE EXTERNAL LINKS: plots, joins, histograms, statistics

Topic Map from single Table

O(10) UCDs

O(10) Column names

O(1) Units

O(100) Associations

O(100) Occurrences

The UCD Topic Map uses just the MetaData (the header info) and forms a layer distinct from the data (table) itself
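To make the "metadata only" point concrete, here is a minimal sketch of turning one table header into topics and associations. The table name, column names, UCD labels and units are illustrative placeholders rather than values from a specific catalog, and `build_topic_map` is a hypothetical helper, not the authors' implementation:

```python
# One table header: (column name, UCD, unit). Illustrative values only.
table = "example_table"
header = [
    ("RAJ2000", "POS_EQ_RA_MAIN", "deg"),
    ("DEJ2000", "POS_EQ_DEC_MAIN", "deg"),
    ("Vmag", "PHOT_JHN_V", "mag"),
]


def build_topic_map(table, header):
    """Derive topics and associations from header metadata, never from data rows."""
    topics = {("table", table)}
    associations = []
    for column, ucd, unit in header:
        topics |= {("ucd", ucd), ("column", column), ("unit", unit)}
        associations.append(("ucd-occurs-in-table", ucd, table))
        associations.append(("ucd-corresponds-to-column", ucd, column))
        associations.append(("ucd-in-table-has-units", ucd, table, unit))
    return topics, associations


topics, associations = build_topic_map(table, header)
print(len(topics), "topics,", len(associations), "associations")
```

Because only the header is read, the resulting layer stays small regardless of how many rows the table holds.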

What if 100s of tables are used?

Several 100 UCDs – perhaps thousands

Many UCDs are repeated – and that is where the plot gets interesting! Which UCDs are repeated and where?

Step 1 in data discovery:

Which Table(s) does the UCD of MY interest occur in?
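Step 1 amounts to an inverted index from UCD to tables. The sketch below assumes each table is reduced to the list of UCDs in its header; the table names and UCD labels are placeholders chosen for illustration:

```python
from collections import defaultdict

# table name -> UCDs found in its header (placeholder values)
tables = {
    "table_a": ["POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN", "PHOT_JHN_V"],
    "table_b": ["POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN"],
    "table_c": ["PHOT_JHN_V"],
}


def index_ucds(tables):
    """Build an inverted index: UCD -> set of tables it occurs in."""
    index = defaultdict(set)
    for table, ucds in tables.items():
        for ucd in ucds:
            index[ucd].add(table)
    return index


index = index_ucds(tables)

# Which table(s) does the UCD of my interest occur in?
print(sorted(index["PHOT_JHN_V"]))

# Which UCDs are repeated, and where?
repeated = {ucd: sorted(found) for ucd, found in index.items() if len(found) > 1}
print(repeated)
```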

What is the UCD Topic Map?

Multiply connected list of UCDs

Multiply connected list of Tables

Multiply connected list of Column names

Multiply connected list of Units

Multiply connected list of {ANY TOPIC YOU HAVE CARED TO DEFINE}

… within the 100 Tables TM

Can I merge this IR column with that Xray column?

Do both these catalogs talk about extra-galactic objects?

Or would one be good to identify galactic contaminants?

More interesting to go beyond – YOUR OWN CATALOGS + other catalogs

Data discovery is all about asking the right questions!

Can I merge Xray and Radio catalogs? Which ones?

Do their units match?

What are the parameter ranges?

What are the basic statistics?

Histogram?

EXTERNAL LINKS AS TOPICS!
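The first two questions can be answered from metadata alone. A minimal sketch, assuming each catalog is summarized as a mapping from UCD to unit (the catalog contents and the `mergeable_columns` helper are hypothetical):

```python
def mergeable_columns(cat1, cat2):
    """List UCDs present in both catalogs and whether their units agree.

    Each catalog is a dict mapping UCD -> unit string.
    Returns tuples of (ucd, unit_in_cat1, unit_in_cat2, units_match).
    """
    shared = sorted(set(cat1) & set(cat2))
    return [(ucd, cat1[ucd], cat2[ucd], cat1[ucd] == cat2[ucd]) for ucd in shared]


# Placeholder header summaries for an X-ray and a radio catalog.
xray = {"POS_EQ_RA_MAIN": "deg", "POS_EQ_DEC_MAIN": "deg"}
radio = {"POS_EQ_RA_MAIN": "h:m:s", "POS_EQ_DEC_MAIN": "deg"}

for ucd, unit1, unit2, match in mergeable_columns(xray, radio):
    status = "units match" if match else "unit conversion needed"
    print(f"{ucd}: {unit1} vs {unit2} ({status})")
```

Parameter ranges, statistics and histograms need the data itself, which is why the slide proposes attaching them as external links.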

Is that it?

List all UCDs:

instance-of($A, ucd)?

Count number of child UCDs:

Select $A, count($B) from isParentUCD($A : ucd, $B : ucd)?

To order by count, descending:

Select $A, count($B) from isParentUCD($A : ucd, $B : ucd) order by $B desc?

NO! Direct querying and multiple indexing possible!!
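For readers without a Topic Map engine at hand, the three queries can be mimicked over a plain list of parent/child pairs. This is only an analogy in Python, not the query language used on the slide, and the UCD names are abstract placeholders:

```python
from collections import Counter

# isParentUCD(parent, child) pairs; names are abstract placeholders.
is_parent_ucd = [
    ("ucd_a", "ucd_a1"),
    ("ucd_a", "ucd_a2"),
    ("ucd_a", "ucd_a3"),
    ("ucd_b", "ucd_b1"),
]

# List all UCDs (every name that appears as a parent or a child).
all_ucds = sorted({name for pair in is_parent_ucd for name in pair})

# Count the number of child UCDs per parent.
child_counts = Counter(parent for parent, _child in is_parent_ucd)

# Order by count, descending.
ordered = child_counts.most_common()

print(all_ucds)
print(ordered)
```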

What is the UCD Topic Map?

Fully customizable tool for data discovery using metadata!

We have made use of catalogs made available at VizieR

The 100 most frequently used catalogs have been used

More specialized UCD Topic Maps are being made

General applications of Topic Maps are immense

What is out there?

Ontopia’s Omnigator

Mondeca

Infoloom

Topicmaps.org

http://www.astro.caltech.edu/~aam/science/topicmaps/