Statistical Learning

  1. D. Russo and B. Van Roy, ``Learning to Optimize Via Information-Directed Sampling.''

  2. D. Russo and B. Van Roy, ``An Information-Theoretic Analysis of Thompson Sampling,'' to appear in Journal of Machine Learning Research.

  3. D. Russo and B. Van Roy, ``Learning to Optimize Via Posterior Sampling,'' to appear in Mathematics of Operations Research.

  4. B. Park and B. Van Roy, ``Adaptive Execution: Exploration and Learning of Price Impact.''

  5. Y.-H. Kao and B. Van Roy, ``Directed Principal Component Analysis,'' to appear in Operations Research.

  6. D. Russo and B. Van Roy, ``Eluder Dimension and the Sample Complexity of Optimistic Exploration,'' Advances in Neural Information Processing Systems 26, pp. 2256-2264, 2013.

  7. I. Osband, D. Russo, and B. Van Roy, ``(More) Efficient Reinforcement Learning via Posterior Sampling,'' Advances in Neural Information Processing Systems 26, pp. 3003-3011, 2013.

  8. Z. Wen and B. Van Roy, ``Efficient Exploration and Value Function Generalization in Deterministic Systems,'' Advances in Neural Information Processing Systems 26, pp. 3021-3029, 2013.

  9. Y.-H. Kao and B. Van Roy, ``Learning a Factor Model via Regularized PCA,'' Machine Learning, Vol. 91, No. 3, pp. 279-303, 2013.

  10. M. Ibrahimi, A. Javanmard, and B. Van Roy, ``Efficient Reinforcement Learning for High Dimensional Linear Systems,'' Advances in Neural Information Processing Systems 25, MIT Press, 2012.

  11. Y.-H. Kao, B. Van Roy, and X. Yan, ``Directed Regression,'' Advances in Neural Information Processing Systems 22, MIT Press, pp. 889-897, 2009.

  12. B. Van Roy and X. Yan, ``Manipulation Robustness of Collaborative Filtering,'' Management Science, Vol. 56, No. 11, pp. 1911-1929, 2010.

  13. V. F. Farias, C. C. Moallemi, B. Van Roy, and T. Weissman, ``Universal Reinforcement Learning,'' IEEE Transactions on Information Theory, Vol. 56, No. 5, pp. 2441-2454, 2010.

  14. B. Van Roy, ``Performance Loss Bounds for Approximate Value Iteration with State Aggregation,'' Mathematics of Operations Research, Vol. 31, No. 2, pp. 234-244, 2006.

  15. C. C. Moallemi and B. Van Roy, ``Distributed Optimization in Adaptive Networks,'' Advances in Neural Information Processing Systems 16, MIT Press, 2004. [appendix]

  16. D. S. Choi and B. Van Roy, ``A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning,'' Discrete Event Dynamic Systems, Vol. 16, No. 2, April 2006.

  17. J. N. Tsitsiklis and B. Van Roy, ``On Average Versus Discounted Reward Temporal-Difference Learning,'' Machine Learning, Vol. 49, No. 2-3, 2002, pp. 179-191.

  18. D. P. de Farias and B. Van Roy, ``On the Existence of Fixed Points for Approximate Value Iteration and Temporal-Difference Learning,'' Journal of Optimization Theory and Applications, Vol. 105, No. 3, June 2000.

  19. J. N. Tsitsiklis and B. Van Roy, ``Average Cost Temporal-Difference Learning,'' Automatica, Vol. 35, No. 11, November 1999, pp. 1799-1808.

  20. J. N. Tsitsiklis and B. Van Roy, ``An Analysis of Temporal-Difference Learning with Function Approximation,'' IEEE Transactions on Automatic Control, Vol. 42, No. 5, May 1997, pp. 674-690.