Residential Behavior-Based Programs
Ryan Firestone, RTF Residential Behavior Subcommittee
February 2, 2016


Page 1

Residential Behavior-Based Programs

Ryan Firestone, RTF Residential Behavior Subcommittee

February 2, 2016

Page 2

Presentation Overview

In October, the RTF instructed staff to put resources towards developing Impact Evaluation Guidance.

Today, we are seeking input from the subcommittee on:
• Use of the Uniform Methods Project protocol as the basis for guidance
  – We will review this
• Inclusion of supplemental, NW-specific guidance
  – We will discuss this

Page 3

Measure Overview
• Measure Description: Information provided periodically to households that influences their energy consumption
  – Grounding in behavioral and social sciences
  – Programs can vary in target audience, communication methods, message contents and presentation, timing and frequency of messaging
• Large pools of potential participants and the delivery mechanism allow for experimental design and robust statistical analysis of impacts
• Opower’s Home Energy Report is a common behavior-based program
  – Monthly or quarterly mailers
  – Comparison of recipient’s energy use to neighbors
  – Customized recommendations on how to save energy

Page 4

Measure Overview

• Typical evaluated savings:
  – 1%–3% of whole-house consumption
  – persist or increase while messaging continues (findings limited by the newness of programs)
  – decay after messaging stops
  – vary by demographics: e.g., larger energy consumers save proportionally more
• DOE Uniform Methods Project (UMP) Residential Behavior Protocol provides many details necessary for
  – program design: randomized control or randomized encouragement, persistence
  – savings estimation: statistical estimates of the difference in energy consumption between control and treatment groups
[We’ll review this document later in the presentation]

Page 5

Measure Overview

• We don’t know what mix of participant actions results in energy savings. For example:
  – turn off lights, turn down HVAC when not in the space
  – lower water heater setpoint
  – use less heat, light, or other services
• This could conflict with our notion of Conservation
  – buy more efficient products and services
• This would impact cost effectiveness
• This mix can vary by program design, implementation, location, and time

Page 6

CAT/Staff Recommendation
CAT/Staff recommendation: develop RTF impact evaluation guidance to supplement the UMP
• Program design and participant characteristics are too varied (or potentially varied) to develop shortcuts to evaluation (i.e., Standard Protocol or UES measures)
• UMP includes much of the technical detail necessary for evaluation
• RTF can supplement the UMP protocol with Northwest-specific details
• If there’s interest in a Standard Protocol, we’d need
  – a specific program description
  – sufficient impact evaluation results to generalize impacts

Page 7

DOE’s Uniform Methods Project: Residential Behavior Programs

Page 8

Uniform Methods Project (UMP)
• DOE protocols for determining savings from energy efficiency measures and programs
• Chapter 17: Residential Behavior Programs
• Applicable to residential behavior programs with a large number of participants (1,000s to 10,000s)
  – Each with individual billing data (e.g., by house)
• Experimental Design:
  – Randomized Control Trial – subjects randomly assigned to a group that gets or does not get messaging
  – Randomized Encouragement Design – all subjects can opt in; subjects randomly assigned to a group that gets or does not get encouragement to participate
• Analysis:
  – Difference: (kWh_control – kWh_treatment)
  – Difference-in-Difference: (kWh_control – kWh_treatment)_post – (kWh_control – kWh_treatment)_pre
  – Simple average, panel regression with or without fixed effects
  – Avoid double counting of trackable program savings – analyze participation data
  – Avoid double counting of untrackable (upstream) program savings (e.g., lighting) – use surveys
• Similar methods in the State and Local Energy Efficiency Action (SEE Action) “Issues and Recommendations” report, evaluations, etc.
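The two estimators above can be sketched in a few lines of Python. The billing data here is simulated (a hypothetical 1,000 kWh/month baseline with a roughly 2% treatment effect), purely for illustration:

```python
import numpy as np

def difference(kwh_control_post, kwh_treatment_post):
    """Simple difference estimator: mean control use minus mean treatment
    use in the post-treatment period (positive value = savings)."""
    return np.mean(kwh_control_post) - np.mean(kwh_treatment_post)

def diff_in_diff(kwh_control_pre, kwh_treatment_pre,
                 kwh_control_post, kwh_treatment_post):
    """Difference-in-difference estimator: nets out any pre-existing gap
    between the randomly assigned groups."""
    post_gap = np.mean(kwh_control_post) - np.mean(kwh_treatment_post)
    pre_gap = np.mean(kwh_control_pre) - np.mean(kwh_treatment_pre)
    return post_gap - pre_gap

# Illustrative data: treatment group saves ~20 kWh on a 1,000 kWh baseline
rng = np.random.default_rng(0)
control_pre = rng.normal(1000, 150, 5000)
treatment_pre = rng.normal(1000, 150, 5000)
control_post = rng.normal(1000, 150, 5000)
treatment_post = rng.normal(980, 150, 5000)

print(round(difference(control_post, treatment_post), 1))
print(round(diff_in_diff(control_pre, treatment_pre,
                         control_post, treatment_post), 1))
```

Both estimators recover roughly the simulated 20 kWh effect here; the D-in-D version is the one the protocol advises because it also removes any chance imbalance between the groups.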

Page 9

Uniform Methods Project – Behavior Protocol

• Section 1 – Measure Description
  – Description: information on household consumption and energy efficiency education/tips
  – May include competitions and/or rewards
  – First large-scale programs in 2008
  – Common features:
    • Randomized experimental design
    • 1,000s of customers
    • Outside vendor implementation
  – Newer programs may include smaller scale and new communication channels (e.g., web, social media, text)
    • Degree of validity and accuracy of savings may be less than experimental methods

Page 10

Uniform Methods Project – Behavior Protocol

• Section 2 – Applicability Conditions of Protocol
  – “Residential utility customers are the target.”
    • Programs/evaluations to date have been primarily residential. Efficacy of methods on non-residential applications has not been tested.
  – “Energy or demand savings are the objective.”
    • The protocol does not directly address the evaluation of other BB program objectives, such as increasing utility customer satisfaction, educating customers about their energy use, or increasing awareness of energy efficiency.
  – “An appropriately sized analysis sample can be constructed.”
    • The protocol is a statistical analysis, which requires a significant sample size given the typically small savings relative to total home consumption and variance.
  – “Accurate energy use measurements for sampled units are available.”
    • If the sampled unit is the household, this would be the individual house billing records or other house-level (or sub-house-level) metered records.

Page 11

Uniform Methods Project – Behavior Protocol

• Section 3 – Savings Concepts
  – Section 3.1 Definitions
    • Control group / Experimental design / External validity / Internal validity / Opt-in program / Opt-out program / Quasi-experimental design / Randomized Control Trial (RCT) / Randomized Encouragement Design (RED) / Treatment / Treatment effect / Treatment group
  – Section 3.2 Experimental Research Designs
    • Requires upfront planning to design the sample
    • RED can be used as an alternative to RCT where the program does not want to limit participation
    • Internal validity: RCT and RED are used because they yield unbiased savings estimates – the program intervention is uncorrelated with subjects’ energy use
    • External validity: results may be applicable to other populations or time periods
    • Experimental evaluation methods apply to a wide range of BB programs

Page 12

Uniform Methods Project – Behavior Protocol

• Section 3 – Savings Concepts (cont’d)
  – Section 3.3 Basic Features
    • Identify the study population
    • Determine sample size – statistical power analysis to determine minimum treatment and control sample sizes “as a function of the hypothesized program effect, the coefficient of variation of energy use, the specific analysis approach that will be used (e.g., simple differences of means, a repeated measure analysis), and tolerances for Type I (false positive) and Type II (false negative) statistical errors.”
    • Randomly assign subjects to treatment and control
    • Administer the treatment
    • Collect data: data must be collected from all study subjects (including drop-outs). Multiple pre- and post-treatment data points are preferable.
    • Estimate savings: difference or difference-in-difference methods
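The sample-size step quoted above can be sketched with the standard two-sample difference-of-means formula. The 2% hypothesized effect and 0.5 coefficient of variation below are illustrative assumptions, not values from the protocol:

```python
from statistics import NormalDist

def min_sample_per_group(effect_frac, cv, alpha=0.05, power=0.8):
    """Minimum treatment and control group sizes for a simple difference
    of means, as a function of the hypothesized program effect (as a
    fraction of mean use), the coefficient of variation of energy use,
    and the Type I (alpha) / Type II (1 - power) error tolerances.
    Two-sided test; n = 2 * ((z_a + z_b) * CV / effect)^2 per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    n = 2 * ((z_alpha + z_beta) * cv / effect_frac) ** 2
    return int(n) + 1   # round up: sample sizes are whole homes

# A hypothesized 2% effect with CV = 0.5 needs roughly 10,000 homes per group
print(min_sample_per_group(0.02, 0.5))
```

This is why the protocol is limited to programs with 1,000s to 10,000s of participants: small percentage savings against noisy whole-house consumption demand very large groups.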

Page 13

Uniform Methods Project – Behavior Protocol

• Section 3 – Savings Concepts (cont’d)
  – Section 3.4 Common Designs
    • Randomized Control Trial with Opt-Out Program Design
      – Random assignment of the study population to control and treatment groups
    • Randomized Control Trial with Opt-In Program Design
      – Random assignment of opt-in customers to control and treatment groups
        » Control group is denied participation, or allowed delayed participation
    • Randomized Encouragement Design
      – For opt-in programs where delaying or denying participation is undesirable
      – Random assignment of customers:
        » Treatment group is encouraged to participate
        » Control group is not
      – Can measure:
        » effect of encouragement
        » effect of program on “compliers” – customers who participate with encouragement but not without encouragement
      – Cannot measure (unless the control group is not permitted to participate):
        » effect of program on “always-takers” – customers who participate with or without encouragement
      – Need a sufficient number of compliers for the methods to work. Otherwise consider a quasi-experimental design.
    • Persistence Design
      – Customers in the treatment group are randomly assigned to groups with different times at which treatment is discontinued (e.g., after one year, after three years, not during the study period).
  – Section 3.5 Evaluation Benefits and Implementation Requirements of Randomized Experiments
    • Summary of earlier sub-sections.
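The complier calculation under a RED can be sketched as follows. Scaling the average encouragement effect by the complier share (the take-up rate difference between groups) is the standard approach; the specific savings and take-up numbers below are invented for illustration:

```python
def complier_effect(itt_savings_kwh, take_up_treatment, take_up_control):
    """Per-complier program effect under a Randomized Encouragement
    Design: the average effect of encouragement (intent-to-treat) divided
    by the share of 'compliers', i.e. the take-up rate difference between
    the encouraged and non-encouraged groups."""
    complier_share = take_up_treatment - take_up_control
    if complier_share < 0.05:
        # Too few compliers for the method to work; the protocol suggests
        # considering a quasi-experimental design instead.
        raise ValueError("insufficient complier share")
    return itt_savings_kwh / complier_share

# Hypothetical numbers: encouragement saves 15 kWh/yr on average across all
# encouraged homes; take-up is 40% with encouragement vs. 10% without.
print(round(complier_effect(15.0, 0.40, 0.10), 1))   # 50.0 kWh/yr per complier
```

Note what this does not give you: the effect on "always-takers", who participate regardless of encouragement, which is why the slide flags pre/post estimates for that group as a separate question.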

Page 14

Uniform Methods Project – Behavior Protocol

• Section 4 – Savings Estimation
  – Difference
    • (kWh_control,post – kWh_treatment,post)
  – Difference-in-Difference (D-in-D)
    • (kWh_control,post – kWh_treatment,post) – (kWh_control,pre – kWh_treatment,pre)
    • More precise results – D-in-D strongly advised
    • Multiple data points and at least one year of pre-treatment consumption recommended
  – Sample Design
    • Based on the hypothesized program effect, coefficient of variation of use, D vs. D-in-D, number of observations, correlation of observations, and tolerance for statistical error
    • Random assignment by an independent third party
      – Or at least certification from a third party that the assignment was done correctly
    • Equivalency check by an independent third party
      – Characteristics of the treatment and control groups are balanced
        » Pre-treatment energy consumption and variance, home characteristics, heating fuel type
      – If not balanced, re-sample and/or consider stratifying the sample
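The equivalency check on pre-treatment consumption can be sketched as a two-sample z-test on group means. This is a simplified stand-in for the fuller balance checks (variance, home characteristics, fuel type), and the kWh values are illustrative:

```python
import numpy as np

def equivalency_check(pre_control, pre_treatment, threshold=1.96):
    """Balance check on pre-treatment consumption: a two-sample
    z-statistic on group means. |z| above the threshold suggests the
    random assignment is not balanced; re-sample and/or consider
    stratifying the sample."""
    n_c, n_t = len(pre_control), len(pre_treatment)
    se = np.sqrt(np.var(pre_control, ddof=1) / n_c +
                 np.var(pre_treatment, ddof=1) / n_t)
    z = (np.mean(pre_control) - np.mean(pre_treatment)) / se
    return bool(abs(z) < threshold), float(z)

# Pre-treatment annual kWh for two small illustrative groups
control = np.array([980.0, 1000.0, 1020.0, 990.0, 1010.0])
treatment = np.array([970.0, 1010.0, 1030.0, 995.0, 1005.0])
balanced, z = equivalency_check(control, treatment)
print(balanced)   # True: the group means differ by far less than the noise
```

In practice the slide asks for this to be done (or certified) by an independent third party, on the full pre-treatment billing history rather than a single annual value.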

Page 15

Uniform Methods Project – Behavior Protocol

• Section 4 – Savings Estimation (cont’d)
  – Data Requirements and Collection
    • Panel data recommended:
      – Pro: multiple data points per customer can allow for more precision/smaller sample sizes, and savings during specific times
      – Con: statistical software likely required for analysis (vs. spreadsheet analysis)
    • At least one year of data pre- and one year post-treatment recommended to capture seasonal effects
    • Participation in other programs must be collected to avoid double counting
    • Temperature and weather data recommended; can help explain results
  – Analysis Methods
    • Panel regression formulation for RCT and RED, D and D-in-D, with and without fixed effects
    • Models for estimating savings persistence
    • Standard errors
      – Recommend clustered standard errors: account for within-subject correlations
    • Handling of opt-out subjects and account closures
  – Energy Efficiency Program Uplift and Double Counting of Savings

Page 16

Uniform Methods Project – Behavior Protocol

• Section 5 – Reporting
  – Evaluators should describe:
    • “The program implementation and the hypothesized effects of the behavioral intervention
    • “The experimental design, including the procedures for randomly assigning subjects to the treatment or control group
    • “The sample design and sampling process
    • “Processes for data collection and preparation for analysis, including all data cleaning steps
    • “Analysis methods, including the application of statistical or econometric models and key assumptions used to identify savings, including tests of those key identification assumptions
    • “Results of savings estimate, including point estimates of savings and standard errors and full results of regressions used to estimate savings.”

Page 17

Supplementing UMP

Page 18

Conservation vs. Curtailment
Proposal: Consider persistent savings as a proxy for conservation (i.e., not curtailment). Evaluation must demonstrate that savings persist while messaging continues. For current HER programs, recent evaluations sufficiently demonstrate this.
• Risk mitigation and deferred generation credits are applicable to conservation
• CAT hypothesizes that savings from significant curtailment would not persist: participants would get tired of being uncomfortable and revert to previous behavior.
• The PSE multi-year evaluation and others show savings increasing over several years when messaging continued
  – This suggests that participants are not too uncomfortable with anything they’re doing; they just need to be reminded
• While BB programs may encourage some compromise in utility, CAT recognizes that many conservation measures include compromises in comfort that are not counted, e.g.:
  – light quality and lack of dimmability of CFLs
  – noise of heat pump products
  – slower recovery time of HPWHs
  – noise/vibration of EE clothes washers
  – shower quality with low-flow showerheads

Page 19

Avoiding Double Counting
Proposal: Recognize that both programs and momentum savings estimates are potential areas for double counting that must be addressed.

Overlap with programs: refer to UMP
• Trackable programs – analyze the program database; only count savings once (either in the behavior program or in the rebate program)
• Untrackable programs (upstream programs): surveys are best practice (but low rigor)
  – Evaluators have not found a better way to measure this
  – For HER, overlap has been found to be small in multiple evaluations.
• But we need to pay attention to this because behavior-based programs are a broader range of programs than just HER, HER messaging may evolve, and upstream programs may evolve

Overlap with momentum savings: If program administrators are claiming momentum (i.e., non-programmatic) savings, they need to avoid double counting. Use the same methods as for untrackable programs.

Page 20

Frequency of Savings Estimations
Proposal: Savings measured from a single year should not be assumed to be representative of savings from other years of the program (or after messaging ends). Savings must be evaluated periodically to estimate the change in savings over time.
• The region is due for a discussion on how savings for shorter-lived measures and measures with time-varying savings are reported, but that is beyond the scope of this effort.
• “How frequently should BB programs be evaluated? Regulators usually determine the frequency of program evaluation. Although requirements vary between jurisdictions, most BB programs are evaluated once per year. Annual evaluation seems appropriate for many BB programs such as home energy reports programs.” (UMP, Section 4)
• RTF could consider some shortcuts:
  – E.g., evaluate less frequently than every year while messaging continues (or after it stops), using interpolation for unevaluated years
  – This may not be useful to programs – would need to hear from programs on their reporting requirements
  – Less frequent surveys (these can be expensive)
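The interpolation shortcut could look like the following hypothetical helper: linear interpolation of evaluated per-home savings between program years. The savings values are invented, and linearity is itself an assumption the RTF would have to accept:

```python
def interpolate_savings(evaluated, year):
    """Linearly interpolate per-home savings (e.g., kWh/yr) for a program
    year that was not evaluated, from the nearest evaluated years.
    `evaluated` maps program year -> measured savings."""
    if year in evaluated:
        return evaluated[year]
    years = sorted(evaluated)
    lo = max(y for y in years if y < year)   # nearest evaluated year below
    hi = min(y for y in years if y > year)   # nearest evaluated year above
    frac = (year - lo) / (hi - lo)
    return evaluated[lo] + frac * (evaluated[hi] - evaluated[lo])

# Evaluations ran in years 1 and 3; estimate the unevaluated year 2
print(interpolate_savings({1: 180.0, 3: 220.0}, 2))   # midpoint: 200.0
```

Note this only fills gaps between evaluations; it deliberately does not extrapolate past the last evaluated year, which matters here because savings are known to decay once messaging stops.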

Page 21

Measure Cost and Cost Effectiveness

Proposal:
• Caution: Measure costs should include not only program costs, but also customer costs to acquire new equipment. State your assumptions.
• Regional credits for risk mitigation and deferred generation are applicable.
• For Standard Protocols and Custom Guidance, the RTF does not estimate costs or cost effectiveness
• However, the Guidelines say, “costs and benefits should be estimated and documented as described in these Guidelines, as appropriate.”
• Risk mitigation and deferred generation credits are applicable because the measure is available/renewable for the duration of the planning period.

Page 22

Firmer Guidance Than UMP
• Loosen applicability to include non-residential?
• Randomized Encouragement Design: Allow? If so, how to estimate savings from all participants, not just the “compliers”?
  – For example, allow pre/post estimates for “always-takers”
• Targeted levels of confidence/precision and explicit methods for estimating (note that this is part of program design, not evaluation)
• Requirement on duration of pre- and post-treatment periods (≥ 1 year)
• Require difference-in-difference (pre/post analysis)
• Treatment of outliers (UMP advises against removing; if removed, show results with and without removal)
• Estimating demand savings from monthly data
• Where a third party is required (sample selection? evaluation?)
• Possibly a punchlist of UMP recommendations (so implementers, evaluators, and regulators don’t have to wade through the UMP)