Our Contributions:
- We offer a new perspective on network design using tools from numerical differential equations.
- We give a stochastic differential equation view of the stochastic training strategies used in deep learning (e.g. shake-shake and stochastic depth).
- We introduce a novel multi-step scheme to deep learning and explain the resulting performance boost with a modified-equation view.
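
To make the multi-step idea concrete, here is a minimal sketch in plain Python of the update `x_{n+1} = (1 - k_n) * x_n + k_n * x_{n-1} + f_n(x_n)`. It is not the repository's training code: `lm_forward`, `decay`, and the fixed `ks` values are hypothetical names and numbers chosen for illustration, whereas in the paper each `k_n` is a learned scalar. The sketch also assumes the first block has no history, so `x_{-1} = x_0`.

```python
def lm_forward(x0, residual_fns, ks):
    """Run a stack of multi-step residual blocks on a list of floats.

    Update rule: x_{n+1} = (1 - k_n) * x_n + k_n * x_{n-1} + f_n(x_n).
    With k_n = 0 this reduces to the plain ResNet (forward Euler) update.
    """
    # Assumption: no history before the first block, so x_{-1} = x_0.
    x_prev, x = x0, x0
    for f, k in zip(residual_fns, ks):
        residual = f(x)
        x_next = [(1 - k) * a + k * b + r for a, b, r in zip(x, x_prev, residual)]
        x_prev, x = x, x_next
    return x


def decay(x):
    """Toy residual branch standing in for a conv block (illustrative only)."""
    return [-0.1 * v for v in x]


# k = 0 everywhere: identical to stacking ordinary residual blocks.
resnet_out = lm_forward([1.0], [decay] * 3, [0.0] * 3)

# Non-zero k mixes in the previous state, which is what makes it multi-step.
lm_out = lm_forward([1.0], [decay] * 2, [0.5] * 2)
```

Setting every `k_n` to zero recovers the standard residual update, so the multi-step block can be read as a ResNet block with one extra scalar per layer.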
| Model | Layers | Params | Error (%) | Dataset |
|---|---|---|---|---|
| ResNet | 20 | 0.27M | 8.75 | CIFAR10 |
| ResNet | 32 | 0.46M | 7.57 | CIFAR10 |
| ResNet | 44 | 0.66M | 7.17 | CIFAR10 |
| ResNet | 56 | 0.85M | 6.97 | CIFAR10 |
| ResNet | 110, pre-act | 1.14M | 6.37 | CIFAR10 |
| ResNet | 164, pre-act | 1.7M | 5.46 | CIFAR10 |
| ResNet | 110, stochastic depth | 1.14M | 5.25 | CIFAR10 |
| ResNet | 1202, stochastic depth | 10M+ | 4.91 | CIFAR10 |
| LM-ResNet | 20, pre-act | 0.27M | 8.33 | CIFAR10 |
| LM-ResNet | 32, pre-act | 0.46M | 7.18 | CIFAR10 |
| LM-ResNet | 44, pre-act | 0.66M | 6.66 | CIFAR10 |
| LM-ResNet | 56, pre-act | 0.85M | 6.31 | CIFAR10 |
| LM-ResNet | 110, pre-act | 1.14M | 6.16 | CIFAR10 |
| LM-ResNet | 164, pre-act | 1.7M | 5.27 | CIFAR10 |
| LM-ResNet | 56, stochastic depth | 0.85M | 5.14 | CIFAR10 |
| LM-ResNet | 110, stochastic depth | 1.14M | 4.80 | CIFAR10 |
Our Learned Momentum:
Cite us:
@InProceedings{lu18d,
  title = {Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations},
  author = {Lu, Yiping and Zhong, Aoxiao and Li, Quanzheng and Dong, Bin},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {3282--3291},
  year = {2018},
  editor = {Jennifer Dy and Andreas Krause},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  address = {Stockholmsmässan, Stockholm Sweden},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/lu18d/lu18d.pdf},
  url = {http://proceedings.mlr.press/v80/lu18d.html}
}