Statistical Learning

D. Russo and B. Van Roy,
``Learning to Optimize Via Information-Directed Sampling.''

D. Russo and B. Van Roy,
``An Information-Theoretic
Analysis of Thompson Sampling,'' to appear in Journal of Machine
Learning Research.

D. Russo and B. Van Roy,
``Learning
to Optimize Via Posterior Sampling,'' to appear in Mathematics
of Operations Research.

B. Park and B. Van Roy,
``Adaptive
Execution: Exploration and Learning of Price Impact.''

Y.H. Kao and B. Van Roy,
``Directed
Principal Component Analysis,'' to appear in Operations Research.

D. Russo and B. Van Roy,
``Eluder Dimension and the Sample Complexity of Optimistic Exploration,''
Advances in Neural
Information Processing Systems 26, pp. 2256-2264, 2013.

I. Osband, D. Russo, and B. Van Roy,
``(More) Efficient Reinforcement
Learning via Posterior Sampling,'' Advances in Neural
Information Processing Systems 26, pp. 3003-3011, 2013.

Z. Wen and B. Van Roy,
``Efficient
Exploration and Value Function Generalization in Deterministic
Systems,'' Advances in Neural
Information Processing Systems 26, pp. 3021-3029, 2013.

Y.H. Kao and B. Van Roy,
``Learning
a Factor Model via Regularized PCA,'' Machine Learning,
Vol. 91, No. 3, pp. 279-303, 2013.

M. Ibrahimi, A. Javanmard, and B. Van Roy,
``Efficient
Reinforcement Learning for High Dimensional Linear Systems,''
Advances in Neural Information Processing Systems 25,
MIT Press, 2012.

Y. H. Kao, B. Van Roy, and X. Yan,
``Directed
Regression,''
Advances in Neural Information Processing Systems 22,
MIT Press, pp. 889-897, 2009.

B. Van Roy and X. Yan,
``Manipulation
Robustness of Collaborative Filtering,''
Management Science, Vol. 56, No. 11, pp. 1911-1929, 2010.

V. F. Farias, C. C. Moallemi, B. Van Roy, and T. Weissman,
``Universal
Reinforcement Learning,'' IEEE Transactions on Information
Theory, Vol. 56, No. 5, pp. 2441-2454, 2010.

B. Van Roy,
``Performance
Loss Bounds for Approximate Value Iteration with State Aggregation,''
Mathematics of Operations Research, Vol. 31, No. 2, pp. 234-244,
2006.

C. C. Moallemi and B. Van Roy,
``Distributed
Optimization in Adaptive Networks,'' Advances in Neural Information
Processing Systems 16, MIT Press, 2004.

D. S. Choi and B. Van Roy,
``A
Generalized Kalman Filter for Fixed Point Approximation
and Efficient Temporal-Difference Learning,''
Discrete Event Dynamic Systems, Vol. 16, No. 2, April 2006.

J. N. Tsitsiklis and B. Van Roy,
``On Average Versus Discounted Reward Temporal-Difference
Learning,'' Machine Learning, Vol. 49, No. 2-3, 2002, pp. 179-191.

D. P. de Farias and B. Van Roy,
``On the Existence of Fixed Points for Approximate Value
Iteration and Temporal-Difference Learning,''
Journal of Optimization Theory and Applications,
Vol. 105, No. 3, June, 2000.

J. N. Tsitsiklis and B. Van Roy,
``Average Cost
Temporal-Difference Learning,'' Automatica,
Vol. 35, No. 11, November 1999, pp. 1799-1808.

J. N. Tsitsiklis and B. Van Roy,
``An Analysis of
Temporal-Difference Learning with Function Approximation,''
IEEE Transactions on Automatic Control,
Vol. 42, No. 5, May 1997, pp. 674-690.