TRANSCRIPT
The ASSISTments TestBed: Opportunities and Challenges of Experimentation in Online Learning Platforms
- Remnant Based Residualization (REBAR) (Sales et al., 2018a,b; Botelho et al., 2018)
  - Makes use of auxiliary data (the "remnant") to more accurately estimate treatment effects
Residualization: REBAR
[Figure: all users are split into the remnant and the users randomized to condition (treatment or control)]
Botelho, A. F., Sales, A. C., Heffernan, N. T., & Patikorn, T. (2018). The ASSISTments Testbed: Opportunities and Challenges of Online Experimentation in Intelligent Tutors. In submission.
Sales, A. C., Botelho, A. F., Patikorn, T., & Heffernan, N. T. (2018a). Using Big Data to Sharpen Design-Based Inference in A/B Tests. In Proceedings of the 11th International Conference on Educational Data Mining, 479-486.
Sales, A. C., Hansen, B. B., & Rowan, B. (2018b). Rebar: Reinforcing a matching estimator with predictions from high-dimensional covariates. Journal of Educational and Behavioral Statistics, 43(1), 3-31.
Example:
- Use log data from past ASSISTments users
- Train a recurrent neural network to predict assignment completion from a student's previous assignments
- Use the trained RNN and the log data to predict completion for subjects in the experiment
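The steps above can be sketched in code. This is only an illustration: a simple logistic regression stands in for the RNN, and the log data, feature construction, and function names are all synthetic assumptions, not the actual ASSISTments pipeline.

```python
# Hedged sketch: train a predictor on "remnant" log data (students never
# randomized into the experiment), then apply it to experiment subjects.
# A logistic regression stands in for the RNN; all data are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_logs(n, k=5):
    """Simulate k prior-assignment completion flags per student, plus the
    next-assignment completion outcome, which depends on past behavior."""
    X = rng.binomial(1, 0.6, size=(n, k))            # past completion flags
    p = 1 / (1 + np.exp(-(X.mean(axis=1) * 3 - 1)))  # richer history -> likelier completion
    y = rng.binomial(1, p)
    return X, y

# The remnant: plentiful auxiliary users outside the experiment
X_rem, y_rem = make_logs(10_000)
model = LogisticRegression().fit(X_rem, y_rem)

# Experiment subjects: predict completion from pre-treatment log data only,
# so the prediction is invariant to treatment assignment
X_exp, _ = make_logs(500)
y_hat = model.predict_proba(X_exp)[:, 1]
```

Because the model is fit entirely outside the experiment, any amount of tuning in the remnant leaves the experimental inference untouched.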
Use the Remnant as a Training Set
- But what can we do with this remnant data?
- Train an ML algorithm in the remnant to predict potential outcomes
- Use the algorithm to predict outcomes in the experiment
Remnant-based predictions are great!
- It's a covariate! (totally invariant to treatment assignment)
- Unlimited model selection, etc., in the remnant without screwing up experimental inference
- Uses neglected data to explain variance in Y!
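A minimal sketch of how such a prediction is used as a covariate, under a simplified design: subtract the remnant-based prediction from each observed outcome, then compare residuals across arms. The data, effect size, and variable names here are invented for illustration; the actual REBAR estimator (Sales et al., 2018b) is more general.

```python
# Hedged sketch of remnant-based residualization. Because y_hat is built
# from pre-treatment data only, subtracting it from Y in each arm leaves
# the estimate unbiased under randomization while shrinking variance
# whenever y_hat explains outcome variation. All quantities are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000
z = rng.binomial(1, 0.5, size=n)                  # randomized assignment
ability = rng.normal(size=n)                      # latent driver of outcomes
y_hat = ability + rng.normal(scale=0.5, size=n)   # remnant model's prediction
tau = 0.3                                         # true treatment effect
y = ability + tau * z + rng.normal(scale=0.5, size=n)

# Plain difference in means
dm = y[z == 1].mean() - y[z == 0].mean()

# Residualized difference in means: same estimand, less variance
e = y - y_hat
resid_dm = e[z == 1].mean() - e[z == 0].mean()

print(f"diff-in-means: {dm:.3f}  residualized: {resid_dm:.3f}")
```

The residuals `e` carry the treatment signal but far less of the nuisance variance, which is exactly why a good remnant prediction sharpens the estimate.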
Predictions from the Remnant
Problems with remnant-based predictions
- What if the remnant is really different from the experimental sample?
- What if the predictions suck?
- Can worsen precision :(