#5072. LSDDL: Layer-Wise Sparsification for Distributed Deep Learning
Publication date | July 2026
Proposal available till | 30-05-2025
Total number of authors per manuscript | 4
Price | 0 $
The title of the journal is available only to authors who have already paid.
Journal’s subject area:
Management Information Systems;
Information Systems and Management;
Information Systems;
Computer Science Applications.
Places in the authors’ list:
1st place - free (for sale)
2nd place - free (for sale)
3rd place - free (for sale)
4th place - free (for sale)
Abstract:
With the escalating race to adopt machine learning (ML) across diverse application domains, there is an urgent need to support distributed ML algorithms efficiently. Since Stochastic Gradient Descent (SGD) is widely used to train ML models, the performance bottleneck of distributed ML is the communication cost of transmitting gradients over the network. In this paper, we propose LSDDL, a scalable and lightweight method to accelerate the training of deep learning models in a shared-nothing environment. The cornerstone of LSDDL lies in the observation that different layers in a neural network have different importance in the process of decompression. To exploit this insight, we devise a sparsification strategy that compresses the gradients of deep neural networks while preserving the structural information of the model. We implement the LSDDL framework in the PyTorch system and encapsulate it as a user-friendly API. We validate the proposed techniques by training several real models on a large cluster. Experimental results show that the communication time of LSDDL is up to 5.43 times lower than that of the original SGD, with little loss in accuracy.
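The layer-wise sparsification idea described in the abstract can be illustrated with a short PyTorch sketch. This is a minimal, hypothetical example of layer-wise top-k gradient sparsification, not the authors' LSDDL implementation; the function sparsify_gradients, the parameter layer_keep_ratios, and the per-layer ratios are assumptions introduced here for illustration only.

import torch
import torch.nn as nn


def sparsify_gradients(model, layer_keep_ratios, default_ratio=0.01):
    # Keep only the largest-magnitude gradient entries of each layer.
    # `layer_keep_ratios` maps parameter names to the fraction of entries
    # retained for that layer, so layers deemed more important keep a denser
    # gradient than others (the layer-wise idea sketched in the abstract).
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        ratio = layer_keep_ratios.get(name, default_ratio)
        grad = param.grad.flatten()
        k = max(1, int(ratio * grad.numel()))
        # Indices of the k largest-magnitude entries of this layer's gradient.
        _, idx = torch.topk(grad.abs(), k)
        sparse_grad = torch.zeros_like(grad)
        sparse_grad[idx] = grad[idx]
        # Overwrite the dense gradient with its sparsified version; in a
        # distributed setting only the retained (index, value) pairs would
        # need to be communicated between workers.
        param.grad.copy_(sparse_grad.view_as(param.grad))


# Usage: keep 10% of the first layer's gradient entries, 1% everywhere else.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
loss = model(torch.randn(32, 784)).sum()
loss.backward()
sparsify_gradients(model, layer_keep_ratios={"0.weight": 0.10})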
Keywords:
Machine learning algorithms; training; deep learning models; sparsification strategy
Contacts: