Recent And Representative Publications

Highlight! A slide reviewing my research on "ODE" and Deep Learning: [slide]

Research Papers:

2019
Bin Dong, Jikai Hou, Yiping Lu*, Zhihua Zhang. "Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network." NeurIPS 2019 Workshop on ML with Guarantees. arXiv preprint: 1910.01255

[paper] [arXiv] [slide]

Highlight! Distillation = Early Stopping? Distillation > Early Stopping! Find out why in our paper!
Yiping Lu*, Zhuohan Li*, Di He, Zhiqing Sun, Bin Dong, Tao Qin, Liwei Wang, Tie-Yan Liu. "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View." (*equal contribution) Submitted. arXiv preprint: 1906.02762

[paper] [arXiv] [slide] [code]

Highlight! ODEs can also be used in NLP! We show that the Transformer can be mathematically interpreted as a numerical Ordinary Differential Equation (ODE) solver for a convection-diffusion equation in a multi-particle dynamic system.
Dinghuai Zhang*, Tianyuan Zhang*, Yiping Lu*, Zhanxing Zhu, Bin Dong. "You Only Propagate Once: Painless Adversarial Training Using Maximal Principle." (*equal contribution) 33rd Annual Conference on Neural Information Processing Systems (NeurIPS), 2019.

[paper] [arXiv] [slide] [poster] [code]

Highlight! ODEs can help accelerate adversarial training! Adversarial training does not have to be computationally expensive. We fully exploit the structure of deep neural networks by recasting adversarial training as a differential game, and we propose a novel strategy that decouples the adversary update from gradient back-propagation. Try our code!
Xiaoshuai Zhang*, Yiping Lu*, Jiaying Liu, Bin Dong. "Dynamically Unfolding Recurrent Restorer: A Moving Endpoint Control Method for Image Restoration." (*equal contribution) Seventh International Conference on Learning Representations (ICLR), 2019.

[paper] [arXiv] [code] [slide] [project page] [OpenReview]

Highlight! In this paper, we propose a new control framework, called moving endpoint control, that restores images corrupted at different degradation levels with a single model. The proposed control problem contains restoration dynamics modeled by an RNN. The moving endpoint, which is essentially the terminal time of the associated dynamics, is determined by a policy network. We call the proposed model the dynamically unfolding recurrent restorer (DURR). Numerical experiments show that DURR achieves state-of-the-art performance on blind image denoising and JPEG image deblocking. Furthermore, DURR generalizes well to images with higher degradation levels that are not included in the training stage.
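As a rough illustration of the moving endpoint idea, the sketch below unfolds a recurrent restoration step until a stopping rule decides the terminal time. It is not the DURR implementation: `restore_step` is a hypothetical 3-point smoothing filter standing in for the RNN restoration unit, and `should_stop` is a hand-written threshold rule standing in for the learned policy network. Both names and the tolerance are assumptions made for this example.

```python
import numpy as np

def restore_step(x):
    """Hypothetical restoration unit: one pass of 3-point smoothing on a 1D signal."""
    padded = np.pad(x, 1, mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def should_stop(x_prev, x_curr, tol=1e-2):
    """Hypothetical stand-in for the policy network: stop once updates are small."""
    return float(np.max(np.abs(x_curr - x_prev))) < tol

def unfold(x, max_steps=50):
    """Apply the same restoration step repeatedly; the stopping rule picks the endpoint."""
    for step in range(1, max_steps + 1):
        x_next = restore_step(x)
        if should_stop(x, x_next):
            return x_next, step
        x = x_next
    return x, max_steps

# A clean signal and a noisy one may be unfolded for different numbers of steps.
clean = np.ones(16)
noisy = np.random.default_rng(0).normal(size=64)
_, steps_clean = unfold(clean)
_, steps_noisy = unfold(noisy)
```

The point of the sketch is only that one shared update can serve inputs of different degradation levels, because the number of unfolding steps is chosen per input rather than fixed in advance.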
2018
Yiping Lu, Aoxiao Zhong, Quanzheng Li, Bin Dong. "Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations." Thirty-fifth International Conference on Machine Learning (ICML), 2018.

[paper] [arXiv] [project page] [slide] [bibtex] [poster]

Highlight! This work bridges deep neural network design with numerical differential equations. We show that many effective networks can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture), which is inspired by the linear multi-step method for solving ordinary differential equations.
Zichao Long*, Yiping Lu*, Xianzhong Ma*, Bin Dong. "PDE-Net: Learning PDEs From Data." (*equal contribution) Thirty-fifth International Conference on Machine Learning (ICML), 2018.
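To make the link between layers and discretizations concrete, here is a minimal NumPy sketch contrasting a residual update, which has the form of a forward Euler step, with a two-step linear multi-step update. The specific form `x_{n+1} = (1 - k) x_n + k x_{n-1} + f(x_n)` and the toy branch `toy_block` are assumptions chosen for illustration; in a real network `f` would be a learned block and `k` a trainable parameter.

```python
import numpy as np

def residual_step(x, f):
    """Forward Euler-style update: x_{n+1} = x_n + f(x_n)."""
    return x + f(x)

def lm_step(x_curr, x_prev, f, k):
    """Two-step update: x_{n+1} = (1 - k) * x_n + k * x_{n-1} + f(x_n)."""
    return (1.0 - k) * x_curr + k * x_prev + f(x_curr)

def toy_block(x):
    # Hypothetical stand-in for a learned residual branch.
    return 0.1 * np.tanh(x)

def run_lm(x0, f, ks):
    """Unroll the two-step scheme; the first state comes from a one-step update."""
    x_prev = x0
    x_curr = residual_step(x0, f)
    for k in ks:
        x_prev, x_curr = x_curr, lm_step(x_curr, x_prev, f, k)
    return x_curr

x0 = np.linspace(-1.0, 1.0, 5)
out = run_lm(x0, toy_block, ks=[0.3, 0.3, 0.3])
```

With `k = 0` the two-step update reduces to the residual update, so this form can be read as a residual network with one extra coefficient per layer that mixes in the previous state.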

[paper] [arXiv] [code] [supplementary materials] [bibtex]

Highlight! This paper is an initial attempt to learn evolution PDEs from data. Inspired by the latest development of neural network designs in deep learning, we propose a new feed-forward deep network, called PDE-Net, to fulfill two objectives at the same time: to accurately predict dynamics of complex systems and to uncover the underlying hidden PDE models. The basic idea of the proposed PDE-Net is to learn differential operators by learning convolution kernels (filters), and apply neural networks or other machine learning methods to approximate the unknown nonlinear responses.

We have released a new version of PDE-Net that focuses more on model discovery; please see: Zichao Long, Yiping Lu, Bin Dong. "PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network." arXiv
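The PDE-Net highlight above rests on the fact that a small convolution kernel can act as a finite-difference approximation of a differential operator. The sketch below shows this with a fixed kernel rather than a learned one: the standard 5-point stencil, scaled by the grid spacing, recovers the Laplacian of `u(x, y) = x^2 + y^2`, which equals 4 everywhere. The helper `apply_stencil`, the grid, and the test function are assumptions for illustration and are not taken from the PDE-Net code.

```python
import numpy as np

def apply_stencil(u, kernel):
    """Valid-mode 2D cross-correlation of u with a small kernel."""
    kh, kw = kernel.shape
    H, W = u.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * u[i:i + H - kh + 1, j:j + W - kw + 1]
    return out

h = 0.1  # grid spacing
# 5-point finite-difference stencil for the Laplacian, scaled by 1 / h^2.
laplace_kernel = np.array([[0.0, 1.0, 0.0],
                           [1.0, -4.0, 1.0],
                           [0.0, 1.0, 0.0]]) / h**2

xs = np.arange(0.0, 1.0 + h / 2, h)
X, Y = np.meshgrid(xs, xs, indexing="ij")
u = X**2 + Y**2

lap = apply_stencil(u, laplace_kernel)  # approximately 4 at every interior point
```

In a learned setting the kernel entries would be trainable, so that the filter can adapt to whichever differential operator best explains the observed dynamics.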