Mark DuBois Illinois Central College mdubois@icc.edu Information Trapping

Post on 01-Apr-2015

Page 1: Mark DuBois Illinois Central College mdubois@icc.edu Information Trapping

Mark DuBois

Illinois Central College

mdubois@icc.edu

Information Trapping

Page 2

Your background

Why are you here?
What do you hope to gain from this presentation?
What do you know about?
  RSS feeds (live bookmarks)
  Microformats
  e-Mail
  Tagging and social bookmarks

Page 3

Source of a lot of this information

Information trapping book
  Information Trapping: Real-Time Research on the Web. Tara Calishain. (2006) ISBN: 0321491718

Page 4

Why trapping?

Suppose you need to keep up to date with a given technology
You could
  Subscribe to various specialty magazines and e-newsletters
  Use search engines to methodically obtain information
    Searching is so 1990s
    Much of the information available on the WWW today differs from what was there yesterday
Why not set up RSS feeds and other traps?
  Once these are established, you review the results periodically

Page 5

Huh?

Consider the process (contrast with a search)
  1. Examine your subject and carefully develop search queries
  2. Evaluate places to search
  3. Establish your queries
  4. Receive and periodically evaluate the results
The initial process is more time-consuming
It is not as easy to tweak the traps as it is to modify a search query
However, once you have the traps set, you can collect results for months or years

Page 6

Simple Example

Page 7

Initial questions

What is the topic you are interested in?
What are the likely sources of information on this topic?
  This likely includes questions such as what and where (in the event a geographic locality is involved or you wish to focus your results on particular institutions or individuals)
How frequently do you wish to receive results?
How do you want to receive the results?
  Do you prefer e-mail, RSS feeds, or something else?

Page 8

RSS fundamentals

Wikipedia definition (slightly modified)
  “Family of web feed formats used to publish frequently updated content such as blog entries, news headlines, or podcasts.”
  “An RSS document, which is called a ‘feed,’ ‘web feed,’ or ‘channel,’ contains either a summary of content from an associated web site or the full text. RSS makes it possible for people to keep up with their favorite web sites in an automated manner that's easier than checking them manually.”

Page 9

RSS fundamentals

Consider current versions of Firefox – subscribe to this page (instead of bookmark this page)

Page 10

Firefox add-ons

Wizz RSS - https://addons.mozilla.org/en-US/firefox/addon/424
  Purpose is to read and manage RSS feeds
  Useful for a small number of feeds
    Perhaps only critical ones
  Public and private
    Need Wizz account for latter
    Limited security

Page 11

Firefox add-ons

Sage - https://addons.mozilla.org/en-US/firefox/addon/77
  No need for an account
  Linked to Technorati (see what others link to for items of interest)
  A lightweight alternative (like Wizz)

Page 12

Web based RSS readers

http://www.bloglines.com/
  Lots of options
http://www.newsburst.com/
  Part of CNET
  Can use OPML (Outline Processor Markup Language) – XML based file to allow importing/exporting of RSS feeds
http://www.google.com/reader
  Public page if you want to share
http://www.feedbucket.com/
http://reader.rocketinfo.com/desktop/
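The OPML import/export idea can be sketched in a few lines of Python. This is only an illustration, not any particular reader's export: the feed titles and URLs are hypothetical, and the `xmlUrl` attribute follows the usual OPML subscription-list convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical subscription list of the kind a feed reader might export.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>My feeds (hypothetical)</title></head>
  <body>
    <outline text="Example news" type="rss" xmlUrl="http://www.example.com/news.xml"/>
    <outline text="Example blog" type="rss" xmlUrl="http://www.example.org/blog/rss"/>
  </body>
</opml>"""


def feed_urls(opml_text):
    """Return (title, feed URL) pairs for every outline that names a feed."""
    root = ET.fromstring(opml_text)
    return [(node.get("text"), node.get("xmlUrl"))
            for node in root.iter("outline")
            if node.get("xmlUrl")]


for title, url in feed_urls(SAMPLE_OPML):
    print(title, url)
```

Because the list is plain XML, the same file can be moved between readers that support OPML.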

Page 13

Client side RSS readers

http://www.jwizz.com/ - may recall Wizz
  Java based version for desktop
http://www.superwaba.com.br/en/default.asp
  Mobile device RSS reader (based on Wizz)
http://www.sharpreader.net/
  Requires .NET platform
There are many others, but a fair number cost money
  NetNewsWire (for Mac) $29.95
  NewzCrawler (for Windows) $24.95

Page 14

Ok, now I have the software…

So what?
First need to identify possible sources of information (next slide)
Need to understand the technology so you can use it effectively
  Some sites are updated frequently, others not very often
  Before you try to set up traps to monitor sites, I recommend you understand the capabilities of the technology and the nuances of the sites you plan to monitor

Page 15

RSS fundamentals

Sources of feeds
  http://www.newsgator.com/
  http://feedster.com/
  http://www.syndic8.com/
  http://newsisfree.com/
  http://technorati.com/blogs (for weblogs)
  http://2rss.com/
  http://www.rss-network.com/
Types of feeds
  Static
  Keyword based

Page 16

What does RSS look like?

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://rss.cnn.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?>
<?xml-stylesheet href="http://rss.cnn.com/~d/styles/itemcontent.css" type="text/css" media="screen"?>
<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>CNN.com</title>
    <link>http://www.cnn.com/?eref=rss_topstories</link>
    <description>CNN.com delivers up-to-the-minute news and information on the latest top stories, weather, entertainment, politics and more.</description>
    <language>en-us</language>
    <copyright>© 2007 Cable News Network LP, LLLP.</copyright>
    <pubDate>Sun, 04 Nov 2007 12:22:34 EST</pubDate>
    <ttl>5</ttl>
    <image>
      <title>CNN.com</title>
      <link>http://www.cnn.com/?eref=rss_topstories</link>
      <url>http://i.cnn.net/cnn/.element/img/1.0/logo/cnn.logo.rss.gif</url>
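A feed like the one above is ordinary XML, so a reader only has to walk the channel and its items. The Python sketch below shows one way to do that with the standard library. The sample is an abbreviated version of the CNN feed from the slide; its closing tags and its single `<item>` are invented here so that the sample parses.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the feed on the slide. The closing tags and the
# <item> element are assumptions added so the sample is well-formed.
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>CNN.com</title>
    <link>http://www.cnn.com/?eref=rss_topstories</link>
    <language>en-us</language>
    <ttl>5</ttl>
    <item>
      <title>Example headline (hypothetical)</title>
      <link>http://www.example.com/story1</link>
    </item>
  </channel>
</rss>"""


def summarize_feed(xml_text):
    """Return the channel title, its ttl in minutes, and (title, link) pairs."""
    channel = ET.fromstring(xml_text.encode("utf-8")).find("channel")
    ttl = channel.findtext("ttl")
    return {
        "title": channel.findtext("title"),
        "ttl_minutes": int(ttl) if ttl else None,
        "items": [(item.findtext("title"), item.findtext("link"))
                  for item in channel.findall("item")],
    }


print(summarize_feed(SAMPLE_FEED))
```

The `<ttl>` value is the number of minutes the publisher suggests a feed may be cached, which is a reasonable hint for how often a trap needs to check it.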

Page 17

Page Monitors

Isn’t RSS enough?
  Sometimes the content is not available via RSS
  Sometimes you only need a little information
What is a page monitor?
  Automated tool that takes a “snapshot” of a web page
  Returns later and takes another
  Compares the two and reports on differences
Can have false positives (perhaps someone changed the spelling)
Web based or client side tools
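The snapshot-and-compare cycle described above can be sketched as follows. This is not how any of the tools on the next slides work internally; it is a minimal Python illustration that compares two snapshots already held as text (fetching the page is left out). The whitespace normalization is an assumption added to cut down one kind of false positive.

```python
import difflib
import hashlib


def normalize(page_text):
    """Collapse runs of whitespace so pure re-spacing is not reported."""
    return [" ".join(line.split()) for line in page_text.splitlines()]


def fingerprint(lines):
    """Cheap check for 'anything changed at all' between two snapshots."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def report_changes(old_text, new_text):
    """Return removed ('- ') and added ('+ ') lines, or [] if nothing changed."""
    old_lines, new_lines = normalize(old_text), normalize(new_text)
    if fingerprint(old_lines) == fingerprint(new_lines):
        return []
    return [line for line in difflib.ndiff(old_lines, new_lines)
            if line.startswith(("+ ", "- "))]


yesterday = "Price: $10\nIn stock"
today = "Price: $12\nIn stock"
print(report_changes(yesterday, today))
```

A corrected spelling would still show up as a change here, which is exactly the false-positive case the slide warns about.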

Page 18

Page Monitors (2)

Web based
  http://watchthatpage.com/
    Free, must register
    Has been reported on some blacklists
  http://trackengine.com/
    Free (for up to 5 sources)
  http://changedetect.com/
    Free (up to 5 sources)
  http://www.changedetection.com/monitor.html
    Somewhat limited options (no frequency of monitoring)
  http://www.pagehammer.com/
    Free as well

Page 19

Page Monitors (3)

Desktop
  http://aignes.com/ (Website-Watcher)
    Free trial (relatively inexpensive)
  http://www.copernic.com/en/products/tracker/
    $50 (free 30 day trial)
  http://www.safe-install.com/programs/internet-owl.html (Internet Owl)
    Free
Mac
  http://chaoticsoftware.com/ProductPages/WebWatcher.html (Web Watcher)
    $20 shareware

Page 20

e-Mail alerts

Why?
  May want to monitor entire sites (not just selected pages with a page monitor)
  Most don’t have as many false positives as page monitors
  May want to have content sent to places other than your computer (perhaps a cell phone)
There are quite a few of these
  Search for entomology ("email alerts" OR "e-mail alerts") gave me 257,000 possibilities

Page 21

e-Mail alert sites

http://www.google.com/alerts is one

http://alerts.yahoo.com is another

http://googlealert.com/ (not affiliated with Google – this site came before Google alerts)

Page 22

Microformats

These are small islands of HTML data
  Yes, HTML is a data type these days, just like XML, SQL databases and so forth
As long as everyone agrees on underlying format/names
  Can actually use these to interchange data

Page 23

Microformats

Solve a specific problem
Have a low barrier to entry
Design for humans first, machines second
Reuse building blocks from existing standards
Are modular and can be embedded in web pages
Encourage decentralized content and services

Page 24

Types of microformats

hCard – for marking up contact information for people and organizations
hCalendar – for marking up event information for meetings and conferences
hReview – for marking up reviews including products and events
Example sites
  http://corkd.com/ - wine reviews (hReview), contact (hCard)
  http://flickr.com/ - profiles (hCard)
  http://www.last.fm/ - concerts (hCalendar)
  http://upcoming.yahoo.com/ - events (hCalendar), profiles (hCard)

Page 25

Microformats

Desire to re-use bits of HTML
  http://microformats.org/ (good reference site)
  Operator (Firefox add-on) - https://addons.mozilla.org/en-US/firefox/addon/4106
  Dreamweaver microformats extension - http://www.webstandards.org/action/dwtf/microformats/
Consider a few examples
  hCard (for people and organizations)
    http://microformats.org/code/hcard/creator
    hCard creator

Page 26

Microformats.org with Operator

Screen capture below

Page 27

hCard Example

<div id="hcard-Mark-DuBois" class="vcard">
  <a class="url fn" href="http://www.markdubois.info">Mark DuBois</a>
  <div class="org">WOW</div>
  <div class="adr">
    <div class="street-address">1 College Drive</div>
    <span class="locality">East Peoria</span>,
    <span class="region">IL</span>,
    <span class="postal-code">61635</span>
    <span class="country-name">USA</span>
  </div>
</div>
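Because the class names are agreed on, a program can pull the contact fields back out of the markup. The Python sketch below illustrates the idea with the standard library's HTML parser. It is not a complete hCard parser: the field list is limited to the classes used on this slide, and it assumes every tag in the card has a matching closing tag.

```python
from html.parser import HTMLParser

# The hCard from the slide above.
SAMPLE_HCARD = """<div id="hcard-Mark-DuBois" class="vcard">
  <a class="url fn" href="http://www.markdubois.info">Mark DuBois</a>
  <div class="org">WOW</div>
  <div class="adr">
    <div class="street-address">1 College Drive</div>
    <span class="locality">East Peoria</span>,
    <span class="region">IL</span>,
    <span class="postal-code">61635</span>
    <span class="country-name">USA</span>
  </div>
</div>
"""

# Only the class names that appear in the example; the real hCard
# vocabulary is larger.
HCARD_FIELDS = {"fn", "org", "street-address", "locality",
                "region", "postal-code", "country-name"}


class HCardParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_fields = []  # one entry per open tag: the fields it carries
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.open_fields.append([c for c in classes if c in HCARD_FIELDS])

    def handle_endtag(self, tag):
        if self.open_fields:
            self.open_fields.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to every hCard field that is currently open.
        for fields in self.open_fields:
            for field in fields:
                self.card[field] = (self.card.get(field, "") + " " + text).strip()


def parse_hcard(html_text):
    parser = HCardParser()
    parser.feed(html_text)
    return parser.card


print(parse_hcard(SAMPLE_HCARD))
```

Tools such as the Operator add-on do this kind of extraction for you; the sketch only shows why agreed class names make it possible.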

Page 28

hCalendar

http://microformats.org/code/hcalendar/creator

Compact code example

<div class="vevent" id="hcalendar-WOW-Meeting">
  <abbr class="dtstart" title="20071101">November 1st</abbr> &mdash;
  <abbr class="dtend" title="20071103">2nd, 2007</abbr>
  <span class="summary">WOW Meeting</span>&mdash; at
  <span class="location">Las Vegas</span>
  <div class="description">Review established curriculum model and current technology trends as they affect web curricula</div>
</div>
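In the example above the machine-readable dates sit in the `title` attributes, while the visible text stays friendly for people. The Python sketch below reads those attributes. It assumes the dates are whole days in `YYYYMMDD` form, as on the slide, and it treats `dtend` as exclusive (the iCalendar convention hCalendar borrows), which would explain why `20071103` is displayed as "2nd".

```python
from datetime import datetime, timedelta
from html.parser import HTMLParser

# Shortened copy of the event markup from the slide.
SAMPLE_EVENT = """<div class="vevent" id="hcalendar-WOW-Meeting">
  <abbr class="dtstart" title="20071101">November 1st</abbr> &mdash;
  <abbr class="dtend" title="20071103">2nd, 2007</abbr>
  <span class="summary">WOW Meeting</span>
</div>
"""


class EventDates(HTMLParser):
    """Collect the dtstart and dtend dates from title attributes."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name in (attrs.get("class") or "").split():
            if name in ("dtstart", "dtend") and attrs.get("title"):
                self.dates[name] = datetime.strptime(
                    attrs["title"], "%Y%m%d").date()


def event_span(html_text):
    """Return (first day, last day) of a whole-day event."""
    parser = EventDates()
    parser.feed(html_text)
    first_day = parser.dates["dtstart"]
    # dtend is treated as exclusive, so the last day is one day earlier.
    last_day = parser.dates["dtend"] - timedelta(days=1)
    return first_day, last_day


first, last = event_span(SAMPLE_EVENT)
print(first.isoformat(), last.isoformat())
```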

Page 29

Queries

Yes, we also will need to use search engines
  We need to verify we have the desired sites (and have not overlooked something)
How should I actually create queries to obtain needed information?
  Many just plug a couple of words into the text input box
Consider
  Using unique language – instead of ants (which turns up Java related terms in addition to insects), I might look for Formicidae
  Use more words (I believe Google has a limit of 32 words)
    How many have ever approached that limit?
  Try to be as narrow as possible

Page 30

Searching

Basic syntax
  Caterpillar -tractor (using the minus sign in front of a word to exclude sites containing that word from the results)
  JavaScript tutorials examples – since I did not specify, most search engines today assume a Boolean AND between each word
  Sidebar - http://www.googlewhack.com/
Special searching syntax
  intitle:keyword (for Google and Yahoo) – word must be in title
  inurl:keyword (for Google and Yahoo) – word must be in URL
  site:domain (.edu, .com, etc.) – might help if looking for academic information
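The syntax above is easy to assemble in code once a query is going to be reused as a trap. The Python helper below is a hypothetical sketch (the function names and parameters are not from any search engine's API). It joins plain words with spaces, which the engine treats as AND, and adds the minus and `intitle:`/`inurl:`/`site:` forms from this slide.

```python
from urllib.parse import urlencode


def build_query(words, exclude=(), intitle=None, inurl=None, site=None):
    """Assemble a query string using the operators described above."""
    parts = list(words)
    parts += ["-" + word for word in exclude]
    if intitle:
        parts.append("intitle:" + intitle)
    if inurl:
        parts.append("inurl:" + inurl)
    if site:
        parts.append("site:" + site)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Turn the query into a URL; the base address is an assumption."""
    return base + "?" + urlencode({"q": query})


print(build_query(["Caterpillar"], exclude=["tractor"]))
print(build_query(["Formicidae"], intitle="ants", site="edu"))
print(search_url(build_query(["Caterpillar"], exclude=["tractor"])))
```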

Page 31

Tags and conversations

Tags – keyword someone uses to describe a resource in a directory
  People often build a folksonomy (or collaborative tagging)
    http://en.wikipedia.org/wiki/Folksonomy
  These are not full descriptions, only a few words
Conversations – discussions on mailing lists or forums
  There are specialty search engines which index conversations
    http://www.omgili.com/ is an example
Why treat these differently?
  Language

Page 32

Searching within tags

Potentially working with huge datasets
  Consider that by 2010 Gartner Group estimates there will be 1 zettabyte of information generated annually
    2 to the 70th power (roughly 10 to the 21st power) bytes
    “Grains of sand”
  A lot of this information is in the form of audio, images, and video
  This is why tagging has become so popular – it helps people find that content
What do we look for?
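The two zettabyte figures quoted above are close but not identical, which a quick Python check makes concrete. The variable names are only labels chosen for this sketch.

```python
# Decimal zettabyte (10 to the 21st) versus the binary figure (2 to the 70th).
DECIMAL_ZETTABYTE = 10 ** 21
BINARY_ZETTABYTE = 2 ** 70

ratio = BINARY_ZETTABYTE / DECIMAL_ZETTABYTE

print(BINARY_ZETTABYTE)
print(round(ratio, 3))
```

The binary figure is about 18% larger, so "2 to the 70th" and "10 to the 21st" are best read as two approximations of the same order of magnitude.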

Page 33

Searching within tags (2)

Tags are only a couple of words
Consider that you can look for different levels of information
  Insects
  Ants
  Labor Day ants
  Lasius neoniger
Last one might be appropriate for website search but is probably too specific for tag search
Try to stay simple and general

Page 34

Searching within conversations

Create queries that reflect how you would discuss a topic
If you are interested in professional conversations, use their vocabulary
Many of the conversation search sites have advanced search options
  Use them
  Example on next slide

Page 35

Advanced search - conversations

Page 36

Tagging information

Might want to use some existing sites as well
  http://del.icio.us/ (doesn’t look or act like a search engine)
    Yes, there is a search box, but also try http://del.icio.us/tag/keyword1+keyword2
  http://www.spurl.net/
  http://www.blinklist.com/
  http://rawsugar.com/
  http://technorati.com/tag
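The `tag/keyword1+keyword2` pattern above can be generated rather than typed. The Python helper below is a hypothetical sketch: the function name is made up, and the only thing taken from the slide is the URL shape.

```python
from urllib.parse import quote


def tag_url(keywords, base="http://del.icio.us/tag/"):
    """Join keywords with '+' in the tag/keyword1+keyword2 pattern."""
    return base + "+".join(quote(word.strip(), safe="") for word in keywords)


print(tag_url(["ants", "insects"]))
```

Since tags are only a word or two, general keywords such as these tend to work better here than a species name would.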

Page 37

Filtering the input

You may have set a number of traps
  RSS feeds can be organized in the software itself
  eMail tends to accumulate and may hinder your best efforts to control it
One alternative is Gmail
  Lot of storage space (4.5 GB at this time)
  Good filtering ability
  Excellent anti-spam capabilities
  Great searching capabilities
  Can also create multiple [email protected] – send me a message sometime

Page 38

Gmail example

Results of filter for mail sent to [email protected]

Page 39

Gmail example (2)

Setting the filter

Page 40

Gmail example (3)

Setting the filter – part 2

Page 41

Gmail example (4)

Searching
  Show search options
  Note that when you select a label, you are doing a search in your inbox

Page 42

Organizing the information

Consider starting with a simple text editor
  I use Notepad++
    http://notepad-plus.sourceforge.net/uk/site.htm
  Multiple sources of information
  If you use a tool like MS-Word, you get all sorts of formatting (yes, you can deactivate it, but it can be a pain)
Could also use a wiki (a portable one is TiddlyWiki)
  http://www.tiddlywiki.com/
  That is what I will use to provide all these links
  Can download from http://www.markdubois.info/IBEA/

Page 43

Some of the items we covered

RSS
Web page monitors
eMail alerts
Microformats
Queries
Searching
Tags
Conversations
Filtering the results
Organizing the results

Page 44

References

Information trapping book
  Information Trapping: Real-Time Research on the Web. Tara Calishain. (2006) ISBN: 0321491718
Microformats book
  Microformats: Empowering Your Markup for Web 2.0. John Allsopp. (2007) ISBN: 1590598148

Page 45

Mark DuBois

Illinois Central College

[email protected][email protected]

Information Trapping