Aug 24, 2020
Asynchronous distributed deep learning technology for edge computing
~Model training technology even when vast amounts of data are held on distributed servers~
NTT Corporation (NTT) has developed an asynchronous distributed deep learning technology for machine learning on edge computing, which we call edge-consensus learning. Machine learning today, and deep learning in particular, generally trains models, such as those for image/speech recognition, by aggregating data at a fixed location such as a cloud data center. However, in the IoT era, where everything is connected to networks, aggregating vast amounts of data in the cloud is difficult. More and more people are demanding that data be held on a local server/device because of privacy concerns. Legal regulations have also been enacted to guarantee data privacy, including the EU’s General Data Protection Regulation (GDPR). Against this background, interest is growing in edge computing, which decentralizes data processing and storage servers in order to reduce processing load and response time on cloud/communication networks and to protect data privacy.
Our research investigates a training algorithm that obtains a global model as if it were trained by aggregating the data on a single server, even when the data are placed on distributed servers, as in edge computing. Our proposed technology, which is of both academic and practical interest, enables us to obtain a global model (the model that would be obtained by training on all the data aggregated in a single place) even when (1) statistically nonhomogeneous data subsets are placed on multiple servers, and (2) the servers only asynchronously exchange variables related to the model.
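One common way to formalize this goal, offered here as an illustrative assumption rather than as the formulation used in the paper, is a consensus optimization problem. The symbols are introduced only for this sketch: $N$ servers, $f_i$ the training loss on the data held by server $i$, $w_i$ that server's copy of the model variables, and $E$ the set of server pairs that can communicate.

```latex
\min_{w_1,\dots,w_N} \; \sum_{i=1}^{N} f_i(w_i)
\quad \text{subject to} \quad w_i = w_j \;\; \text{for all } (i,j) \in E
```

If the communication graph is connected, the constraints force every server to hold the same variables $w$, so the problem is equivalent to minimizing $\sum_{i} f_i(w)$, the loss over all the data, without any data leaving its server.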
Details are being presented at KDD 2020 (Knowledge Discovery and Data Mining), an international conference sponsored by the Association for Computing Machinery (ACM) and held from August 23 (16.9% acceptance rate). We have also published the code associated with this work on GitHub so that the effectiveness of our method can be verified.
In machine learning today, and deep learning in particular, data are aggregated in a single place and a model is trained there. However, because of the drastic increase in the amount of data as well as data privacy concerns, data are expected to be collected in a distributed manner in the near future. For example, edge computing initiatives call for decentralizing data collection/processing loads, and provisions in the EU’s GDPR restrict the transfer of data across countries or require minimal data collection. A world in which we can benefit from machine learning without having to sacrifice data privacy is more desirable. One technical challenge for this goal is to carry out data aggregation, model training, and processing in a decentralized manner (Fig. 1).
Our proposed training algorithm can obtain a global model even when different, nonhomogeneous data subsets are placed on multiple servers and the communication between them is asynchronous. Instead of aggregating or exchanging the data themselves (e.g., images or speech) between servers, only the variables associated with the model trained on each server are exchanged asynchronously, and this exchange yields a global model. As shown in Fig. 2, the training algorithm is composed of two processes: a procedure that updates the variables inside each server (U) and a variable exchange between servers (X).
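The alternation between a local update (U) and a variable exchange (X) can be illustrated with a small toy example. The sketch below is not the edge-consensus learning algorithm from the paper. It substitutes a much simpler stand-in, local gradient steps combined with asynchronous pairwise averaging (gossip), and every name in it (`local_update`, `exchange`, `train`, the ring of four nodes, the learning rate) is a hypothetical choice made for illustration. Each node holds a single noise-free sample of the line y = 2x - 1 at a different x, so no node can fit the line from its own data, and only model variables, never the samples, move between nodes.

```python
import random

def local_update(w, sample, lr=0.2):
    """U: one gradient step on the node's own squared error (hypothetical helper)."""
    a, b = w
    x, y = sample
    err = a * x + b - y
    return (a - lr * err * x, b - lr * err)

def exchange(models, i, j):
    """X: nodes i and j replace their model variables with the pairwise average."""
    avg = tuple((p + q) / 2 for p, q in zip(models[i], models[j]))
    models[i] = models[j] = avg

def train(samples, neighbors, ticks=5000, seed=0):
    """Asynchronous loop: at each tick one random node updates, then gossips."""
    rng = random.Random(seed)
    models = [(0.0, 0.0) for _ in samples]
    for _ in range(ticks):
        i = rng.randrange(len(samples))   # only one node wakes up per tick
        models[i] = local_update(models[i], samples[i])
        j = rng.choice(neighbors[i])      # talk to one neighbor, not to everyone
        exchange(models, i, j)
    return models

# Assumed toy setup: four nodes on a ring, each seeing a different input value.
TRUE_A, TRUE_B = 2.0, -1.0
xs = [-1.5, -0.5, 0.5, 1.5]
samples = [(x, TRUE_A * x + TRUE_B) for x in xs]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

models = train(samples, neighbors)
for node, (a, b) in enumerate(models):
    print(f"node {node}: a={a:.4f}, b={b:.4f}")
```

In this toy setting every node ends up with nearly the same slope and intercept as a fit on the pooled data. The exact agreement relies on the data being noise-free. With noisy, nonhomogeneous data, plain gossip averaging with a fixed step size generally leaves a residual disagreement between nodes, which is the kind of situation a dedicated algorithm such as the one proposed here is meant to handle.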
We will continue research and development for commercialization. We will release the source code to promote further development of this technology as well as collaboration on applications.
Paper/source code publication
This research is being presented at KDD 2020 (Knowledge Discovery and Data Mining), an international conference sponsored by the Association for Computing Machinery (ACM) and held from August 23.
Title: Edge-consensus Learning: Deep Learning on P2P Networks with Nonhomogeneous Data
Authors: Kenta Niwa (NTT), Noboru Harada (NTT), Guoqiang Zhang (University of Technology Sydney), Bastiaan Kleijn (Victoria University of Wellington)
We have also published the source code associated with this work at the following site so that the effectiveness of our method can be verified.