Post on 19-Dec-2015
Predicting Water Quality in Northwest Indiana
Team members: Carl Summers, Zhe Wei Wang, Brian Hunter, Joseph Robertson
Project Mentor: Dr. Ruijian Zhang
Purdue University Calumet
Purdue University Calumet Undergraduate Research
Achievements
• Research extended to the IEEE CHC61 Web Programming Competition
• Received funding through Purdue University Research Department to pursue See5.0 Web implementation
• Collaborating with Indiana’s Department of Environmental Management
Outline of Presentation:
Water Quality Prediction
• Motivation
• Preparing Data
• Output of See5 decision tree
Website
• Data Graphical Representation
• Web Technologies
  • Flash Professional 8
  • Cascading Style Sheets
  • ASP.NET Framework 2.0
I. Water Quality Prediction
Traditional Mechanistic Models
• Current mechanistic models require significant expert input to provide accurate forecasts.
• These systems are typically used to predict trends in water quality over vast regions and long time spans.
• Improving the detail of a mechanistic model may be too difficult, costly, or time-consuming.
Modeling Methods
[Diagram: Artificial Intelligence, Data Mining, Bayesian Statistics, Decision Tree, See5, Traditional Mechanistic Models]
Implement and compare decision trees, Bayesian networks, and traditional mechanistic modeling techniques.
See5: A Decision Tree Tool
• See5 generates a text file containing a rule set, used for classifying (predicting) each record in a data set into one of a discrete set of predetermined classifications ({Good, Bad}, {Above, Normal, Below}, etc.).
• It uses information gain, from information theory, to determine which attributes to “split” the data on.
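To make the information gain idea concrete, here is a minimal Python sketch. It is not See5's own code, and See5's actual splitting criterion and pruning may differ in detail. The function names (`entropy`, `information_gain`) are illustrative, and the sample values are the flow rate and classification of the first six rows of the clustered data set shown in this presentation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric attribute at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

# Flow rate and classification from the first six rows of the data set.
flow_rate = [30, 35, 46, 52, 31, 43]
labels = ["Good", "Bad", "Good", "Bad", "Bad", "Good"]

# Candidate thresholds: midpoints between consecutive distinct values.
points = sorted(set(flow_rate))
candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
best = max(candidates, key=lambda t: information_gain(flow_rate, labels, t))
print(best, round(information_gain(flow_rate, labels, best), 3))
```

A tree builder would repeat this comparison across all attributes and split on the one with the highest gain, then recurse on each side of the split.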
Data Set
• Raw data was sparse
• Many attributes were useless
• Required extensive work to glean useful information
• Not classified
Clustering
Unclassified data from USGS → Clustering Process → Classified Data
See5 requires classified input data.
Clustering is composed of two parts:
1) A function to group together similar points, and ultimately similar clusters. We refer to these functions as a whole as Joining Methods.
2) A function to quantify the similarity between points or clusters. These are referred to as Similarity Metrics.
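The two parts can be sketched in Python as follows. The slides do not say which joining method or similarity metric the team used, so this sketch assumes single-linkage joining and Euclidean distance purely for illustration. The sample points are made up, and attaching the names Good and Bad to the resulting clusters is an arbitrary choice here.

```python
import math

def euclidean(p, q):
    """Similarity metric: a smaller distance means more similar points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(c1, c2, metric):
    """Joining method: cluster distance is that of the closest pair of members."""
    return min(metric(p, q) for p in c1 for q in c2)

def agglomerate(points, k, metric=euclidean, linkage=single_linkage):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]], metric),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Made-up two-attribute points (Attribute 1, Attribute 2), for illustration only.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9)]
groups = agglomerate(points, k=2)

# Arbitrary naming of the two clusters; a real analysis would inspect them first.
classified = {p: name for name, grp in zip(["Good", "Bad"], groups) for p in grp}
print(classified)
```

In practice the attributes would likely need to be scaled to comparable ranges first, since an unscaled Euclidean distance is dominated by whichever attribute has the largest numeric spread.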
[Figure: Clustering, with axes Attribute 1 and Attribute 2]
Clustered Data Set
Date        Precipitation  Suspended Sediment  Dissolved Oxygen  Flow Rate  Temperature  Classification
12/15/2006  0.34           28                  6.8               30         14.9         Good
12/22/2006  0              9                   7                 35         11.9         Bad
12/29/2006  1.6            10                  6.4               46         9.5          Good
1/5/2007    3              10                  6.4               52         8.5          Bad
1/12/2007   0.56           11                  5.9               31         9.3          Bad
1/19/2007   0              12                  8.4               43         10.8         Good
1/26/2007   0.12           20                  9.2               25         11.9         Bad
2/2/2007    0              21                  9.3               54         9.2          Bad
2/9/2007    0              20                  8.4               35         7.9          Good
2/16/2007   0.4            20                  6.4               47         8.9          Good
2/23/2007   0              17                  6.1               38         9.1          Good
3/2/2007    0.13           17                  6.2               29         11.4         Bad
3/9/2007    2.2            17                  6.7               50         11.7         Bad
3/16/2007   1.7            15                  5.5               50.1       11.9         Good
3/23/2007   0.09           18                  5.7               41         12.2         Good
Offset Classification
Date        Precipitation  Suspended Sediment  Flow Rate  Temperature
12/15/2006  0.34           28                  30         14.9
12/22/2006  0              9                   35         11.9
12/29/2006  1.6            10                  46         9.5
1/5/2007    3              10                  52         8.5
1/12/2007   0.56           11                  31         9.3
1/19/2007   0              12                  43         10.8
Classification: Good, Bad, Good, Bad, Bad, Good
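The slide does not spell out how the offset is defined. One plausible reading is that each row's measurements are paired with the classification observed at a later sampling date, so that the decision tree learns to forecast rather than describe. The Python sketch below illustrates that reading only; the function name `offset_classification` and the offset of one row are assumptions, not the team's documented procedure.

```python
def offset_classification(features, classes, offset=1):
    """Pair each row's features with the class observed `offset` rows later.

    Rows near the end that have no later class to pair with are dropped.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    usable = max(0, len(features) - offset)
    return [(features[i], classes[i + offset]) for i in range(usable)]

# (precipitation, suspended sediment, flow rate, temperature) from the slide.
features = [
    (0.34, 28, 30, 14.9),
    (0.0, 9, 35, 11.9),
    (1.6, 10, 46, 9.5),
    (3.0, 10, 52, 8.5),
    (0.56, 11, 31, 9.3),
    (0.0, 12, 43, 10.8),
]
classes = ["Good", "Bad", "Good", "Bad", "Bad", "Good"]

# Assumed offset of one sampling interval.
training_rows = offset_classification(features, classes, offset=1)
for row, label in training_rows:
    print(row, "->", label)
```

With an offset of one, the last measurement row has no later classification and is left out of the training set.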
Decision Tree
Date        Precipitation  Suspended Sediment  Dissolved Oxygen  Flow Rate  Temperature  Classification
12/15/2006  0.34           28                  6.8               30         14.9         Good
12/22/2006  0              20                  7                 35         16           Bad
05/23/2007  1.6            10                  6.4               46         9.5          ???
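Filling in the "???" means running the unclassified record through the rule set. The Python sketch below shows the general shape of that step under loud assumptions: the rules and thresholds are entirely hypothetical and were not produced by See5, and the first-match evaluation is a simplification of how a real rule set may be applied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]

@dataclass
class Rule:
    description: str
    condition: Callable[[Record], bool]
    classification: str

# HYPOTHETICAL rules for illustration only; not output from See5.
RULES: List[Rule] = [
    Rule("dissolved oxygen <= 6.5 and flow rate > 40",
         lambda r: r["dissolved_oxygen"] <= 6.5 and r["flow_rate"] > 40,
         "Good"),
    Rule("suspended sediment >= 20",
         lambda r: r["suspended_sediment"] >= 20,
         "Bad"),
]
DEFAULT_CLASS = "Good"  # hypothetical fallback when no rule matches

def classify(record: Record, rules: List[Rule] = RULES,
             default: str = DEFAULT_CLASS) -> str:
    """Return the class of the first rule whose condition holds."""
    for rule in rules:
        if rule.condition(record):
            return rule.classification
    return default

# The unclassified record from the slide (05/23/2007).
unknown = {
    "precipitation": 1.6,
    "suspended_sediment": 10,
    "dissolved_oxygen": 6.4,
    "flow_rate": 46,
    "temperature": 9.5,
}
print(classify(unknown))  # meaningful only for the made-up rules above
```

Any class this prints reflects the invented thresholds, not a real water quality prediction.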
II. See5.0 Web Solution
Objective
• Share a visualization of the predictions generated by See5 with the public.
• Provide viewers with a user interface to easily display descriptive and complex data in a comprehensive environment.
Methods
• Provide a cross-platform interface by conforming to W3C standards
  • Web languages will function through various Web browsers
  • Provides consistency to define the appearance of an entire Web site
• Take advantage of Web technologies
  • No package installation required from the user
  • Always available (per server uptime)
  • User interaction
  • Easy to deploy and manage
Website
Interactive Content Page
Data Graphical Representation
• Applying various languages to supply a fully scalable application to the user
  • Flash 8 Professional will provide rich animation and an elegant user interface
  • CSS will allow consistency of format throughout the site
  • ASP.NET 2.0 allows embedded Flash objects
  • Renders server-side code and code-behind files into plain HTML
Flash Professional 8
• Many users won’t be able to install arbitrary ActiveX controls or use a Java plug-in, whereas Flash is preinstalled with Windows on corporate machines, and even most Linux distributions come pre-packaged with Flash
• Flash can consume raw XML data to draw real-time graphs that make water quality easy to determine
• Advantages of ActionScript 2.0
  • Object-oriented programming language
  • Permits vector-based objects to be manipulated quickly and easily, on the fly!
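Since the Flash front end consumes XML, the predictions have to be serialized on the server first. The sketch below shows one way that could look, written in Python only for illustration (the project's server side is ASP.NET). The element and attribute names (`predictions`, `prediction`, `date`, `classification`) are hypothetical, not the project's actual schema.

```python
import xml.etree.ElementTree as ET

def predictions_to_xml(predictions):
    """Serialize (date, classification) pairs into an XML string."""
    root = ET.Element("predictions")  # hypothetical root element name
    for date, classification in predictions:
        ET.SubElement(root, "prediction",
                      date=date, classification=classification)
    return ET.tostring(root, encoding="unicode")

# Sample dates and classes taken from the data set shown earlier.
sample = [("12/15/2006", "Good"), ("12/22/2006", "Bad")]
xml_text = predictions_to_xml(sample)
print(xml_text)

# Round trip: a client would parse the same structure before drawing a graph.
parsed = ET.fromstring(xml_text)
for node in parsed.findall("prediction"):
    print(node.get("date"), node.get("classification"))
```

Keeping each prediction as a flat element with attributes keeps the client-side parsing simple, though a real schema might also carry the measured attributes.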
Cascading Style Sheets
• Allows the provision of a standardized layout throughout the site
  • Modularity
  • End result with CSS means cleaner code
• Provides the user with a consistent interface
  • Conventions stay the same throughout the entire page
• CSS makes updating an easy task
  • Modifications to one style sheet can affect some or all pages that are linked to that style sheet
ASP.NET Framework 2.0
• Access to the .NET Framework 2.0 Class Library
• Easy deployment, configuration, and management with IIS 6 (Windows Server 2003)
• XML Metabase Schema provides quick deployment
• Easy-to-use GUI management utility (inetmgr)
• Quick to update with the latest security patches
• Authentication locks out users without proper credentials to administer or view the content of the page
Summary
• Using clustering tools to classify data in preparation for See5
• Using See5 to generate a rule set
• Use the rule set to obtain predictions
• Ultimately implement and compare other prediction methods
• Provide a public website for the visualization of the prediction