Predicting Median Substrate


Post on 11-Jan-2016






Predicting Median Substrate for Oregon and Washington EMAP Sites Utilizing GIS Data. Julia J. Smith, December 12, 2005.


<ul><li><p>Predicting Median Substrate for Oregon and Washington EMAP Sites Utilizing GIS Data. Julia J. Smith, December 12, 2005</p></li><li><p>Why Predict Median Substrate?</p><p>Indicator of overall stream health; bed load transport; stream power; microinvertebrate habitat; fish habitat; how human development is affecting a stream</p></li><li><p>What is LD50?</p><p>LD50 is a measure of median substrate. Geometric mean of class boundaries. Log10 of the geometric means. Several samples at each site. LD50 is the median value of log10(geometric mean of class).</p></li><li><p>Substrate Classifications</p></li><li><p>Geomorphic Metrics</p><p>τ is the total bank-full shear stress; ρ<sub>s</sub> is the density of sediment; ρ is fluid density; g is gravitational acceleration; h is bank-full depth; S is channel slope</p></li><li><p>Geomorphic Metrics</p><p>Distance-weighted stream power versus LD50: r = 0.327, p-value = 2.63 x 10<sup>-12</sup></p></li><li><p>Geomorphic Metrics</p><p>Outlet link mean slope versus LD50: r = 0.214, p-value = 3.78 x 10<sup>-6</sup></p></li><li><p>Geologic Metrics</p><p>Percent unconsolidated geologic type versus LD50: r = -0.246, p-value = 1.18 x 10<sup>-7</sup></p></li><li><p>Climatic Metrics</p><p>Annual average precipitation versus LD50: r = 0.199, p-value = 1.56 x 10<sup>-6</sup></p></li><li><p>Climatic Metrics</p><p>Average annual potential evapotranspiration (mm) versus LD50: r = -0.046, p-value = 0.342</p></li><li><p>Land Cover Metrics</p><p>1. Developed 2. Barren 3. Forest 4. Grasses 5. Agriculture 6. Wetlands 7. Open water/perennial ice and snow 8. 
Shrubland</p></li><li><p>Land Cover Metrics</p><p>Percentage of watershed that is forest versus LD50: r = 0.19, p-value = 3.516 x 10<sup>-5</sup></p></li><li><p>Distance-Weighted Metrics</p><p>j represents the land cover type of concern; A<sub>j</sub> represents the total area for land cover type j in the watershed; n represents the total number of land cover types. The remaining terms are the coefficient of exponential decay and the average distance from the outlet for land cover of type j.</p></li><li><p>Additional Land Cover Metrics</p><p>Buffered metrics: buffered within a set distance of the stream (30 meters, 100 meters, 300 meters). Buffered and distance-weighted metrics.</p></li><li><p>Goals</p><p>Predict LD50 without visiting sites. Small number of predictors for a scientifically sensible model.</p></li><li><p>Methods: Stepwise Variable Selection</p><p>Multiple linear regression. Top-in-tier models. Top geomorphic models plus one predictor from each of the remaining tiers.</p></li><li><p>Akaike's Information Criterion</p><p>N observations; p predictors; RSS is the sum of squared residuals</p></li><li><p>AIC in stepwise variable selection</p><p>Forward stepwise selection (method for choosing the top predictor from each tier): start with the intercept model; choose the variable that reduces AIC the most and include it in the model.</p><p>Stepwise selection in both directions (method for choosing all top geomorphic predictors): start with the full model; add and subtract variables until the model with minimum AIC is found or iteration stops.</p></li><li><p>Methods: CART (Classification and Regression Trees)</p></li><li><p>Methods: CART (Classification and Regression Trees)</p><p>Predicted Response:</p></li><li><p>Hybrid of Multiple Linear Regression and CART</p><p>Utilize CART on the residuals. Add indicator variables to the multiple linear regression equation, one for each terminal node in the tree except one (number of terminal nodes minus one). Create a new multiple regression model with the variables and indicator variables.</p></li><li><p>Predictive-ability Statistics</p></li><li><p>Analysis Comparison Top 4-tier 
Models</p><p>Problems with top 4-tier models: low adjusted R<sup>2</sup>; low predictive ability; over-prediction and under-prediction of fine and bedrock substrate; non-normal residuals</p><p>Benefit of top 4-tier models: small number of predictors</p></li><li><p>Example of Non-normality of Residuals: Top 4-Tier Model</p></li><li><p>Analysis Comparison Geomorphic plus Top 3-Tier Models</p><p>Problems with top geomorphic plus top 3-tier model: increase in number of variables; predictive ability still low; over-prediction and under-prediction of fine and bedrock substrate; some collinearity between variables</p></li><li><p>Analysis Comparison Geomorphic plus Top 3-Tier Models</p><p>Benefits of top geomorphic plus top 3-tier model: improved predictions; improved normality of residuals</p></li><li><p>Comparison of Analysis: CART</p><p>Problems with CART: low predictive ability; predicts several observed substrate sizes in one node; over-prediction and under-prediction of fines and bedrock substrate; omitting one site creates a different tree</p><p>Benefits of CART: simple analysis; missing variables not an issue</p></li><li><p>CART Predictions</p></li><li><p>Comparison of Analysis: Hybrids</p><p>Problems with hybrid models: increased number of variables; collinearity with introduction of node indicator variables; non-normal residuals</p></li><li><p>Comparison of Analysis: Hybrids</p><p>Benefits of hybrid models: residuals closer to normal; increased predictive ability; explains some of the variation created by fitting a linear model to ordinal data</p></li><li><p>One example: Residual Tree for Hybrid Geomorphic plus Top 3-Tier Model</p><p>Most promising multiple regression prediction model: Geomorphic plus top 3-tier</p><p>Response | Adjusted R<sup>2</sup> | PRESS | p for LD50 | MSPR</p><p>LD50 | 0.362 | 504.802 | 1.274 | 0.319</p></li><li><p>One example: Residual Tree for Hybrid Geomorphic plus Top 3-Tier Model</p></li><li><p>One example: Observed vs. 
Predicted for Hybrid Geomorphic plus Top 3-Tier Model</p><p>Plot of predictions against observed LD50</p></li><li><p>QQ-Plot of Residuals for Hybrid Model</p></li><li><p>Coast Range Ecoregion</p><p>Less skewed distribution of LD50; no measurements are outliers; similar ecosystem throughout region</p></li><li><p>Ecoregion Distributions</p></li><li><p>Coast Range EMAP Sites</p></li><li><p>Top 4-Tier Coast Range Model</p><p>Predictors: average aspect (climatic); average watershed elevation (geomorphic); % watershed as volcanic geologic type (geologic); % wetlands (distance weighted and buffered)</p></li><li><p>QQ-Plot: Top 4-Tier Coast Range</p></li><li><p>Observed versus Predicted: Top 4-Tier Coast Range Model</p></li><li><p>Coast Range Model</p><p>Top geomorphic variables: average watershed elevation (m); drainage density; mean slope within a 300-meter buffer; ratio of width of stream to width of floodplain; coefficient of average hill connectivity; distance to the first tributary (m); percent of landscape with less than 4% slope; percent of landscape with less than 7% slope; measure of size and complexity of river; percent of stream as cascade; distance-weighted stream power; watershed relief divided by its length</p></li><li><p>QQ-Plot: Coast Range Geomorphic plus Top 3-Tier model</p></li><li><p>Observed versus Predicted: Coast Range Geomorphic + Top 3-Tier</p></li><li><p>CART: Coast Range Ecoregion</p><p>Predictions versus observed LD50</p></li><li><p>Coast Range: Hybrid Models</p><p>Benefits of hybrid: improved prediction; improved fit; improved normality of residuals</p><p>Problems with hybrid: increased number of predictors; collinearity with node indicator variables</p></li><li><p>QQ-Plot: Coast Range Hybrid Top 4-Tier</p></li><li><p>Observed versus Predicted: Coast Range Hybrid Top 4-Tier</p></li><li><p>QQ-Plot: Coast Range Hybrid Geomorphic plus Top 3-Tier</p></li><li><p>Observed versus Predicted: Coast Hybrid Geomorphic plus Top 3-Tier</p></li><li><p>Comparison of Coast Models</p><p>Model, adjusted R<sup>2</sup>, second reported value: Top 4-tier (0.384, 0.362); Geomorphic plus 
top-3 (0.548, 0.495); CART (NA, 0.087); Top 4-tier hybrid (0.552, 0.503); Geomorphic plus top-3 hybrid (0.700, 0.614)</p></li><li><p>Conclusions</p><p>LD50 is difficult to predict. Additional geomorphic predictors increase prediction ability. Hybrid models increase prediction ability. More success in the Coast Range Ecoregion.</p></li><li><p>Future Work</p><p>Logistic regression: ordinal data were treated as continuous in this study; 12 categories might require more sophisticated methods</p><p>Spatial analysis: there appears to be spatial correlation in the distribution of LD50</p><p>As a nation, we have been making progress in understanding the extent to which we have an impact on, and a responsibility to, our environment. 2. Use of resources, disposal, and encroachment on wilderness land affect the world we leave for future generations. 3. EPA protects: assessing how the effects of time and human development are changing our world, and trying to strike a balance that allows for development while minimizing impact. The cost and time to send teams to remote and not-so-remote sites armed with meters, clipboards, and electronic devices. With GIS data and satellite imagery, there is potential for minimizing these costs. This project is a small piece of this vast undertaking: predicting rock sizes in streams. How much river debris and substrate does a stream have? How much power is available to move substrate? What is the oxygen content, and what can live in the water? 4. Is the habitat suitable for the animals living in the water to reproduce? Upper limit for bedrock and lower limit for fines. 12 different LD50s in this study. 485 sites: only 432 with LD50 measurements. Some sites were missing important predictors: a struggle between leaving out the site and removing the predictor. Removed predictors that were missing more than 40 site observations. There are 1122 predictors available. 
After taking out predictors with more than 40 missing observations, there were 710 predictors available. These predictors fell into 4 general tiers: geomorphic, geologic, climatic, and land cover. Geomorphic: characteristics of the formation of the land and stream in the watershed. These include predictors that have a strong relationship to substrate size, such as slope and the power or flow of a stream. The ability of a stream to move substrate affects the size of the substrate that is left behind. People have used D50 to predict the ability of a stream to initiate movement and continue movement of substrate (bed load transport). One such equation is the Shields shear stress equation. In fact, these equations have been used to attempt to predict median substrate, which has worked only under moderate conditions. Geologic: geologic types that appear in the watershed. Land cover classes: low/high residential, commercial, industrial, transportation; bare rock/sand/clay, quarries/strip mines/gravel pits, transitional change; deciduous, evergreen, mixed; grasslands/herbaceous, alpine/tundra; pasture/hay, row crops, small grains, fallow land, urban/recreational grasses; woody wetlands, emergent herbaceous wetlands. Debris of land cover could prevent the flow of small substrate downstream or affect substrate size in other ways. 22 land types. 4 coefficients of exponential decay, ranging from -1 hundred-thousandth to -1 thousandth, used twice: once where the distances are weighted further by the Topographic Wetness index, which is an indication of the soil moisture content and indicates where surface flow versus subsurface flow may occur. 
This indicates the ability of run-off in a watershed to carry the land-cover debris to the stream. Calculated only within a certain buffer distance of the stream network. How does the distance of the land cover affect the size of substrate? Minimize. 16 terminal nodes, 2 of which have the same predicted value. Residual predictions are ordered from lowest to highest. Any observation falling into the first node is coded with zeros for each indicator variable. Any observation falling into the next lowest node is coded with a one for the first indicator variable and zeros for the remaining indicator variables. This continues until, finally, all observations falling into the highest terminal node are coded with all zeros except the last variable, which is coded as a one.</p><p>Predicted error sum of squares. -0.1659 to 2.015 (negative 17 hundredths to just a little over two), where LD50 ranges from -2.11 to 3.75. Predictive error sum of squares. Mean square prediction error. Predictive r-squared. 18 terminal nodes. All sand sites in entire site were outliers by the 1.5 IQR rule. 13 ecoregions.</p></li></ul>
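<p>The LD50 definition in the slides (the median, over a site's samples, of log10 of the geometric mean of each sample's class boundaries) can be sketched in a few lines. This is only an illustration: the class names and boundary values in <code>CLASS_BOUNDS_MM</code> are hypothetical placeholders, not the substrate classification used in the study, and <code>ld50</code> is a name chosen here.</p>

```python
import math
from statistics import median

# Hypothetical class boundaries in millimetres (illustrative only;
# the actual classification table is not reproduced in the slides).
CLASS_BOUNDS_MM = {
    "sand": (0.06, 2.0),
    "fine_gravel": (2.0, 16.0),
    "coarse_gravel": (16.0, 64.0),
    "cobble": (64.0, 250.0),
}


def log10_geometric_mean(lower, upper):
    """log10 of the geometric mean of a class's lower and upper boundaries."""
    return math.log10(math.sqrt(lower * upper))


def ld50(sample_classes):
    """Median of log10(geometric mean of class) over the samples at one site."""
    values = [log10_geometric_mean(*CLASS_BOUNDS_MM[c]) for c in sample_classes]
    return median(values)


site_samples = ["sand", "fine_gravel", "fine_gravel", "cobble", "coarse_gravel"]
site_ld50 = ld50(site_samples)
```

<p>Because every sample maps to one of a small number of classes, the result can take only a limited set of values, which is consistent with the note that there are 12 different LD50s in the study.</p>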
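<p>The first forward-selection step described in the slides (start with the intercept model, then add the variable that reduces AIC the most) might look like the sketch below. It assumes a common least-squares form of AIC, N ln(RSS/N) + 2p, which may differ from the slide's formula by an additive constant or in how p is counted. The function names and the toy data are hypothetical, and only one-predictor models are fitted, so this covers a single step rather than the full stepwise search.</p>

```python
import math


def aic(rss, n, p):
    """Assumed least-squares AIC, up to an additive constant: n*ln(RSS/n) + 2p."""
    return n * math.log(rss / n) + 2 * p


def rss_intercept_only(y):
    """Sum of squared residuals for the intercept-only model."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y)


def rss_simple_regression(x, y):
    """Sum of squared residuals for the least-squares line y = a + b*x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def first_forward_step(candidates, y):
    """Return (name, AIC) of the single predictor that lowers AIC the most.

    Returns (None, intercept-only AIC) when no candidate improves on it.
    """
    n = len(y)
    best_name, best_aic = None, aic(rss_intercept_only(y), n, 0)
    for name, x in candidates.items():
        score = aic(rss_simple_regression(x, y), n, 1)
        if score < best_aic:
            best_name, best_aic = name, score
    return best_name, best_aic


# Toy data, invented for illustration only.
response = [0.1, 0.9, 2.2, 2.8, 4.1]
predictors = {
    "slope": [0.0, 1.0, 2.0, 3.0, 4.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
chosen, chosen_aic = first_forward_step(predictors, response)
```

<p>The 2p penalty means a predictor is added only when its reduction in RSS outweighs the cost of an extra parameter, which is how the criterion supports the stated goal of a small number of predictors.</p>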
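<p>The indicator coding described in the notes for the hybrid model (order the residual tree's terminal nodes by predicted residual, code the lowest node as all zeros, and give each higher node a one in its own indicator) can be sketched as follows. The node labels, predicted residuals, and the function name <code>node_indicators</code> are hypothetical; the residual tree is assumed to have been fitted elsewhere.</p>

```python
def node_indicators(node_predictions, observation_nodes):
    """Code each observation's terminal node as k-1 indicator variables.

    node_predictions: dict mapping terminal-node label to its predicted residual.
    observation_nodes: terminal-node label for each observation.
    The node with the lowest predicted residual is the baseline (all zeros).
    """
    ordered = sorted(node_predictions, key=node_predictions.get)
    rank = {node: i for i, node in enumerate(ordered)}
    n_indicators = len(ordered) - 1
    rows = []
    for node in observation_nodes:
        row = [0] * n_indicators
        if rank[node] > 0:
            row[rank[node] - 1] = 1
        rows.append(row)
    return rows


# Hypothetical three-node residual tree: B has the lowest predicted residual.
predicted_residuals = {"A": 0.4, "B": -0.3, "C": 0.0}
coded = node_indicators(predicted_residuals, ["B", "C", "A", "C"])
```

<p>Using one fewer indicator than terminal nodes keeps the indicator columns from summing to the regression intercept column, which would otherwise make the design matrix exactly collinear.</p>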