Publications


In Progress

  1. D. Russo and B. Van Roy, ``Learning to Optimize Via Information-Directed Sampling.''

  2. D. Russo and B. Van Roy, ``An Information-Theoretic Analysis of Thompson Sampling,'' to appear in Journal of Machine Learning Research.

  3. D. Russo and B. Van Roy, ``Learning to Optimize Via Posterior Sampling,'' to appear in Mathematics of Operations Research.

  4. B. Park and B. Van Roy, ``Adaptive Execution: Exploration and Learning of Price Impact.''

  5. Y.-H. Kao and B. Van Roy, ``Directed Principal Component Analysis,'' to appear in Operations Research.

  6. D. Russo and B. Van Roy, ``Learning to Optimize Via Information-Directed Sampling,'' to appear in Advances in Neural Information Processing Systems 27, 2014.

  7. I. Osband and B. Van Roy, ``Model-Based Reinforcement Learning and the Eluder Dimension,'' to appear in Advances in Neural Information Processing Systems 27, 2014.

  8. I. Osband and B. Van Roy, ``Near-Optimal Reinforcement Learning in Factored MDPs,'' to appear in Advances in Neural Information Processing Systems 27, 2014.


2013

  1. D. Russo and B. Van Roy, ``Eluder Dimension and the Sample Complexity of Optimistic Exploration,'' Advances in Neural Information Processing Systems 26, pp. 2256-2264, 2013.

  2. I. Osband, D. Russo, and B. Van Roy, ``(More) Efficient Reinforcement Learning via Posterior Sampling,'' Advances in Neural Information Processing Systems 26, pp. 3003-3011, 2013.

  3. Z. Wen and B. Van Roy, ``Efficient Exploration and Value Function Generalization in Deterministic Systems,'' Advances in Neural Information Processing Systems 26, pp. 3021-3029, 2013.

  4. Y.-H. Kao and B. Van Roy, ``Learning a Factor Model via Regularized PCA,'' Machine Learning, Vol. 91, No. 3, pp. 279-303, 2013.


2012

  1. Z. Wen, L. J. Durlofsky, B. Van Roy, and K. Aziz, ``Approximate Dynamic Programming for Optimizing Oil Production,'' Chapter 25 in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, edited by F. L. Lewis and D. Liu, Wiley-IEEE Press, 2012.

  2. M. Ibrahimi, A. Javanmard, and B. Van Roy, ``Efficient Reinforcement Learning for High Dimensional Linear Systems,'' Advances in Neural Information Processing Systems 25, MIT Press, 2012.

  3. M. T. Padilla and B. Van Roy, ``Intermediated Blind Portfolio Auctions,'' Management Science, Vol. 58, No. 9, pp. 1747-1760, 2012.

  4. C. C. Moallemi, B. Park, and B. Van Roy, ``Strategic Execution in the Presence of an Uninformed Arbitrageur,'' Journal of Financial Markets, Vol. 15, pp. 361-391, 2012.

  5. A. Chairawongse, S. Kiatsupaibul, S. Tirapat, and B. Van Roy, ``Portfolio Selection with Qualitative Input,'' Journal of Banking and Finance, Vol. 36, No. 2, pp. 489-496, 2012.


2011

  1. G. Y. Weintraub, C. L. Benkard, and B. Van Roy, ``Industry Dynamics: Foundations for Models with an Infinite Number of Firms,'' Journal of Economic Theory, Vol. 146, No. 5, pp. 1965-1994, 2011.

  2. C. C. Moallemi and B. Van Roy, ``Resource Allocation via Message Passing,'' INFORMS Journal on Computing, Vol. 23, No. 2, pp. 205-219, 2011.

  3. J. Han and B. Van Roy, ``Control of Diffusions via Linear Programming,'' in Stochastic Programming: The State of the Art, in Honor of George B. Dantzig, edited by Gerd Infanger, pp. 329-354, Springer, 2011.

  4. Z. Wen, L. J. Durlofsky, B. Van Roy, and K. Aziz, ``Use of Approximate Dynamic Programming for Production Optimization,'' SPE Proceedings, 2011.


2010

  1. B. Van Roy and X. Yan, ``Manipulation Robustness of Collaborative Filtering,'' Management Science, Vol. 56, No. 11, pp. 1911-1929, 2010.

  2. B. Van Roy, ``On Regression-Based Stopping Times,'' Discrete Event Dynamic Systems, Vol. 20, No. 3, pp. 307-324, 2010.

  3. R. Johari, G. Y. Weintraub, and B. Van Roy, ``Investment and Market Structure in Industries with Congestion,'' Operations Research, Vol. 58, No. 5, 2010, pp. 1303-1317.

  4. C. C. Moallemi and B. Van Roy, ``Convergence of the Min-Sum Algorithm for Convex Optimization,'' IEEE Transactions on Information Theory, Vol. 56, No. 4, pp. 2041-2050, 2010.

  5. G. Y. Weintraub, C. L. Benkard, and B. Van Roy, ``Computational Methods for Oblivious Equilibrium,'' Operations Research, Vol. 58, No. 4, pp. 1247-1265, 2010. [Matlab code (updated July 2012)]

  6. V. F. Farias, C. C. Moallemi, B. Van Roy, and T. Weissman, ``Universal Reinforcement Learning,'' IEEE Transactions on Information Theory, Vol. 56, No. 5, pp. 2441-2454, 2010.

  7. V. F. Farias and B. Van Roy, ``Dynamic Pricing with a Prior on Market Response,'' Operations Research, Vol. 58, No. 1, pp. 16-29, 2010.


2009

  1. Y.-H. Kao, B. Van Roy, and X. Yan, ``Directed Regression,'' Advances in Neural Information Processing Systems 22, MIT Press, pp. 889-897, 2009.

  2. B. Van Roy and X. Yan, ``Manipulation-Resistant Collaborative Filtering Systems,'' Proceedings of the Third ACM Conference on Recommender Systems, pp. 165-172, 2009.

  3. C. C. Moallemi and B. Van Roy, ``Convergence of Min-Sum Message Passing for Quadratic Optimization,'' IEEE Transactions on Information Theory, Vol. 55, No. 5, pp. 2413-2423, 2009.


2008

  1. G. Y. Weintraub, C. L. Benkard, and B. Van Roy, ``Markov Perfect Industry Dynamics with Many Firms,'' Econometrica, Vol. 76, No. 6, 2008, pp. 1375-1411. [Technical Appendix]

  2. X. Yan and B. Van Roy, ``Reputation Markets,'' Proceedings of the ACM SIGCOMM 2008 Workshop on Economics of Networks, Systems, and Computation.

  3. H. Permuter, P. Cuff, B. Van Roy, and T. Weissman, ``Capacity of the Trapdoor Channel with Feedback,'' IEEE Transactions on Information Theory, Vol. 54, No. 7, pp. 3150-3165, 2008.

  4. C. C. Moallemi, S. Kumar, and B. Van Roy, ``Approximate and Data-Driven Dynamic Programming for Queueing Networks,'' 2008.


2007

  1. V. F. Farias and B. Van Roy, ``An Approximate Dynamic Programming Approach to Network Revenue Management,'' 2007.

  2. N. O. Keohane, B. Van Roy, and R. J. Zeckhauser, ``Managing the Quality of a Resource with Stock and Flow Controls,'' Journal of Public Economics, Vol. 91, 2007, pp. 541-569.

  3. B. Van Roy, ``A Short Proof of Optimality for the MIN Cache Replacement Algorithm,'' Information Processing Letters, Vol. 102, No. 2, pp. 72-73, 2007.


2006

  1. G. Y. Weintraub, C. L. Benkard, and B. Van Roy, ``Oblivious Equilibrium: A Mean Field Approximation for Large Scale Dynamic Games,'' Advances in Neural Information Processing Systems 18, MIT Press, 2006.

  2. C. C. Moallemi and B. Van Roy, ``Consensus Propagation,'' IEEE Transactions on Information Theory, Vol. 52, No. 11, pp. 4753-4766, 2006.

  3. C. C. Moallemi and B. Van Roy, ``Consensus Propagation,'' Advances in Neural Information Processing Systems 18, MIT Press, 2006.

  4. D. S. Choi and B. Van Roy, ``A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning,'' Discrete Event Dynamic Systems, Vol. 16, No. 2, April 2006.

  5. P. Rusmevichientong, B. Van Roy, and P. W. Glynn, ``A Non-Parametric Approach to Multi-Product Pricing,'' Operations Research, Vol. 54, No. 1, 2006, pp. 82-98.

  6. P. Rusmevichientong, J. A. Salisbury, L. T. Truss, B. Van Roy, and P. W. Glynn, ``Opportunities and Challenges in Using Online Preference Data for Vehicle Pricing: A Case Study at General Motors,'' Journal of Revenue and Pricing Management, Vol. 5, No. 1, pp. 45-61, 2006.

  7. D. P. de Farias and B. Van Roy, ``A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees,'' Mathematics of Operations Research, Vol. 31, No. 3, pp. 597-620, 2006.

  8. V. F. Farias and B. Van Roy, ``Approximation Algorithms for Dynamic Resource Allocation,'' Operations Research Letters, Vol. 34, No. 2, March 2006, pp. 180-190.

  9. R. Cogill, M. Rotkowitz, B. Van Roy, and S. Lall, ``An Approximate Dynamic Programming Approach to Decentralized Control of Stochastic Systems,'' Lecture Notes in Control and Information Sciences, Springer, Berlin, 2006, Vol. 329, pp. 243-256.

  10. B. Van Roy, ``Performance Loss Bounds for Approximate Value Iteration with State Aggregation,'' Mathematics of Operations Research, Vol. 31, No. 2, pp. 234-244, 2006.

  11. B. Van Roy, ``TD(0) Leads to Better Policies than Approximate Value Iteration,'' Advances in Neural Information Processing Systems 18, MIT Press, 2006.

  12. V. F. Farias and B. Van Roy, ``Tetris: A Study of Randomized Constraint Sampling,'' in Probabilistic and Randomized Methods for Design Under Uncertainty, G. Calafiore and F. Dabbene, eds., Springer-Verlag, 2006.


2005

  1. D. P. de Farias and B. Van Roy, ``A Linear Program for Bellman Error Minimization with Performance Guarantees,'' Advances in Neural Information Processing Systems 17, MIT Press, 2005.

  2. V. F. Farias, C. C. Moallemi, B. Van Roy, and T. Weissman, ``A Universal Scheme for Learning,'' Proceedings of the IEEE International Symposium on Information Theory, Adelaide, Australia, September 2005.

  3. X. Yan, P. Diaconis, P. Rusmevichientong, and B. Van Roy, ``Solitaire: Man Versus Machine,'' Advances in Neural Information Processing Systems 17, MIT Press, 2005.


2004

  1. R. Cogill, M. Rotkowitz, B. Van Roy, and S. Lall, ``An Approximate Dynamic Programming Approach to Decentralized Control of Stochastic Systems,'' Proceedings of the Allerton Conference on Communication, Control, and Computing, 2004, pp. 1040-1049.

  2. D. P. de Farias and B. Van Roy, ``On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming,'' Mathematics of Operations Research, Vol. 29, No. 3, August 2004, pp. 462-478.

  3. H. Zhang, A. Goel, R. Govindan, K. Mason, and B. Van Roy, ``Improving Eigenvector-Based Reputation Systems Against Collusion,'' Workshop on Algorithms and Models for the Web Graph, October 2004.

  4. W. B. Powell and B. Van Roy, ``Approximate Dynamic Programming for High-Dimensional Dynamic Resource Allocation Problems,'' in Handbook of Learning and Approximate Dynamic Programming, edited by J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Wiley-IEEE Press, Hoboken, NJ, 2004, pp. 261-279.

  5. C. C. Moallemi and B. Van Roy, ``Distributed Optimization in Adaptive Networks,'' Advances in Neural Information Processing Systems 16, MIT Press, 2004. [appendix]


2003

  1. D. P. de Farias and B. Van Roy, ``The Linear Programming Approach to Approximate Dynamic Programming,'' Operations Research, Vol. 51, No. 6, November-December 2003, pp. 850-865.

  2. C. C. Moallemi and B. Van Roy, ``Decentralized Protocols for Optimization of Sensor Networks,'' Proceedings of Allerton 2003.

  3. D. P. de Farias and B. Van Roy, ``Approximate Linear Programming for Average-Cost Dynamic Programming,'' Advances in Neural Information Processing Systems 15, MIT Press, 2003.

  4. B. Van Roy, ``Book Review: Self-Learning Control of Finite Markov Chains, by A. S. Poznyak, K. Najim, and E. Gomez-Ramirez,'' Automatica, Volume 39, Issue 2, February 2003, pp. 373-376.


2002

  1. N. Agarwal, J. Basch, P. Beckmann, P. Bharti, S. Bloebaum, S. Casadei, A. Chou, P. Enge, W. Fong, N. Hathi, W. Mann, A. Sahai, J. Stone, J. Tsitsiklis, and B. Van Roy, ``Algorithms for GPS Operation Indoors and Downtown,'' GPS Solutions, Vol. 6, No. 3, December, 2002, pp. 149-160.

  2. J. N. Tsitsiklis and B. Van Roy, ``On Average Versus Discounted Reward Temporal-Difference Learning,'' Machine Learning, Vol. 49, No. 2-3, 2002, pp. 179-191.


2001

  1. D. S. Choi and B. Van Roy, ``A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning,'' Proceedings of the International Conference on Machine Learning, 2001.

  2. P. Rusmevichientong and B. Van Roy, ``A Tractable POMDP for a Class of Sequencing Problems,'' Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2001.

  3. B. Van Roy, ``Neuro-Dynamic Programming: Overview and Recent Trends,'' in Handbook of Markov Decision Processes: Methods and Applications, edited by E. Feinberg and A. Shwartz, Kluwer, 2001.

  4. J. N. Tsitsiklis and B. Van Roy, ``Regression Methods for Pricing Complex American-Style Options,'' IEEE Transactions on Neural Networks, Vol. 12, No. 4 (special issue on computational finance), July 2001, pp. 694-703.

  5. P. Rusmevichientong and B. Van Roy, ``An Analysis of Belief Propagation on the Turbo Decoding Graph with Gaussian Densities,'' IEEE Transactions on Information Theory, Vol. 47, No. 2, pp. 745-765, 2001.


2000

  1. P. Rusmevichientong and B. Van Roy, ``An Analysis of Turbo Decoding with Gaussian Priors,'' Advances in Neural Information Processing Systems 12, MIT Press, 2000.

  2. N. O. Keohane, B. Van Roy, and R. J. Zeckhauser, ``The Optimal Harvesting of Environmental Bads,'' Proceedings of the IEEE Conference on Decision and Control, 2000.

  3. D. P. de Farias and B. Van Roy, ``On the Existence of Fixed Points for Approximate Value Iteration and Temporal-Difference Learning,'' Journal of Optimization Theory and Applications, Vol. 105, No. 3, June, 2000.

  4. D. P. de Farias and B. Van Roy, ``Approximate Value Iteration with Randomized Policies,'' Proceedings of the IEEE Conference on Decision and Control, 2000.

  5. D. P. de Farias and B. Van Roy, ``Approximate Value Iteration and Temporal-Difference Learning,'' Proceedings of the IEEE Symposium 2000 on Adaptive Systems for Signal Processing, Communications and Control, 2000.

  6. D. P. de Farias and B. Van Roy, ``Fixed Points for Approximate Value Iteration and Temporal-Difference Learning,'' Proceedings of the International Conference on Machine Learning, 2000.


1999

  1. J. N. Tsitsiklis and B. Van Roy, ``Average Cost Temporal-Difference Learning,'' Automatica, Vol. 35, No. 11, November 1999, pp. 1799-1808.

  2. B. Van Roy, ``Temporal-Difference Learning and Applications in Finance,'' Computational Finance (Proceedings of the Sixth International Conference on Computational Finance, Leonard N. Stern School of Business, January 6-8, 1999). Edited by Y. S. Abu-Mostafa, B. LeBaron, A. W. Lo, and A. S. Weigend. Cambridge, MA: MIT Press, 1999.

  3. J. N. Tsitsiklis and B. Van Roy, ``Optimal Stopping of Markov Processes: Hilbert Space Theory, Approximation Algorithms, and an Application to Pricing High-Dimensional Financial Derivatives,'' IEEE Transactions on Automatic Control, Vol. 44, No. 10, October 1999, pp. 1840-1851.

1997

  1. J. N. Tsitsiklis and B. Van Roy, ``Average Cost Temporal-Difference Learning,'' Proceedings of the IEEE Conference on Decision and Control, 1997.

  2. J. N. Tsitsiklis and B. Van Roy, ``Overview of Neuro-Dynamic Programming and a Case Study in Optimal Stopping,'' Proceedings of the IEEE Conference on Decision and Control, 1997.

  3. J. N. Tsitsiklis and B. Van Roy, ``Approximate Solutions to Optimal Stopping Problems,'' Advances in Neural Information Processing Systems 9, MIT Press, 1997.

  4. J. N. Tsitsiklis and B. Van Roy, ``An Analysis of Temporal-Difference Learning with Function Approximation,'' IEEE Transactions on Automatic Control, Vol. 42, No. 5, May 1997, pp. 674-690.

  5. J. N. Tsitsiklis and B. Van Roy, ``Analysis of Temporal-Difference Learning with Function Approximation,'' Advances in Neural Information Processing Systems 9, MIT Press, 1997.

  6. B. Van Roy, D. P. Bertsekas, Y. Lee, and J. N. Tsitsiklis, ``A Neuro-Dynamic Programming Approach to Retailer Inventory Management,'' Proceedings of the IEEE Conference on Decision and Control, 1997. (full length version)

  7. R. Kennedy, Y. Lee, B. Van Roy, C. Reed, and R. Lippman, Solving Data Mining Problems Through Pattern Recognition, Prentice-Hall, 1997.

1996

  1. J. N. Tsitsiklis and B. Van Roy, ``Feature-Based Methods for Large Scale Dynamic Programming,'' Machine Learning, Vol. 22, 1996, pp. 59-94.

  2. B. Van Roy and J. N. Tsitsiklis, ``Stable Linear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions,'' Advances in Neural Information Processing Systems 8, MIT Press, 1996.


1995

  1. R. Kennedy, Y. Lee, C. Reed, and B. Van Roy, Solving Pattern Recognition Problems, Unica, 1995.

Theses

  1. B. Van Roy, ``Learning and Value Function Approximation in Complex Decision Processes,'' PhD Thesis, Massachusetts Institute of Technology, May 1998.

  2. B. Van Roy, ``Feature-Based Methods for Large Scale Dynamic Programming,'' Master's Thesis, Massachusetts Institute of Technology, January 1995.

  3. B. Van Roy, ``Differential Cost Functions for Training Neural Network Pattern Classifiers,'' Bachelor's Thesis, Massachusetts Institute of Technology, May 1993.