📝 Publications
An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding, Xuancheng Ren, Ruixuan Luo, Xu Sun
[Paper] [Code] [Video]
We propose AdaMod, an improvement over the Adam optimizer that adds long-term memory of the adaptive learning rates.
Just `pip install adamod` to try it! To date it has several variants and implementations, supporting PyTorch, TensorFlow, and Keras.
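A minimal sketch of how the PyTorch implementation can be dropped in as a replacement for Adam. The toy model, data, and hyperparameter values are only illustrative, and the `AdaMod(...)` interface is assumed from the released `adamod` package:

```python
# pip install adamod
import torch
from adamod import AdaMod  # PyTorch implementation of the optimizer

# A toy model purely for illustration.
model = torch.nn.Linear(10, 2)
loss_fn = torch.nn.MSELoss()

# beta3 controls the long-term memory of the adaptive learning rates;
# values closer to 1 give smoother bounds on the per-parameter step sizes.
optimizer = AdaMod(model.parameters(), lr=1e-3, beta3=0.999)

for step in range(100):
    x = torch.randn(32, 10)
    y = torch.randn(32, 2)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```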

An Adaptive Learning Method for Solving the Extreme Learning Rate Problem of Transformer
Jianbang Ding, Xuancheng Ren, Ruixuan Luo
[Paper] [Code] [Video]
This work presents further empirical studies of AdaMod.
Some third-party comments:
- “In testing AdaMod on some datasets along with other optimizers, I find that AdaMod is consistently a top 5 optimizer.” ——Less Wright
- “I’ve had great success with this wrapped in lookahead.” ——Evan Walters
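A sketch of what "wrapped in lookahead" can look like in practice, assuming the third-party `torch-optimizer` package's `Lookahead` wrapper; the model and hyperparameters below are illustrative, not values reported in the paper:

```python
# pip install adamod torch-optimizer
import torch
from adamod import AdaMod
from torch_optimizer import Lookahead  # third-party Lookahead wrapper

model = torch.nn.Linear(10, 2)  # toy model for illustration

# AdaMod serves as the inner (fast) optimizer; Lookahead keeps a set of
# slow weights that are periodically interpolated toward the fast weights.
base_optimizer = AdaMod(model.parameters(), lr=1e-3, beta3=0.999)
optimizer = Lookahead(base_optimizer, k=5, alpha=0.5)

optimizer.zero_grad()
loss = model(torch.randn(32, 10)).pow(2).mean()
loss.backward()
optimizer.step()
```
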
Adder Encoder for Pre-trained Language Model
Jianbang Ding, Suiyun Zhang, Linlin Li
[Paper] [Poster]
🎉CCL Best Poster Award
AdderBERT achieves highly competitive performance compared with BERT-base on the GLUE benchmark while obtaining a 4.9x reduction in energy consumption.

Disfluency Detection for Real-World Scenarios
Jianbang Ding, Suiyun Zhang, Dandan Tu
[Paper] [Slide] [Video]
Oral Paper
Our approach significantly outperforms previous baselines and achieves state-of-the-art performance (94.3 F-score) on the English Switchboard corpus.