  2020, Vol. 1 Issue (3): 234-242    doi: 10.23919/ICN.2020.0015
    
Deep reinforcement learning based worker selection for distributed machine learning enhanced edge intelligence in internet of vehicles
Junyu Dong, Wenjun Wu*, Yang Gao, Xiaoxi Wang, Pengbo Si
Faculty of Information Technology, Beijing University of Technology, Beijing 100022, China

Abstract  

Edge Information Systems (EIS) have recently received considerable attention. In EIS, Distributed Machine Learning (DML), which requires fewer computing resources, can implement many artificial intelligence applications efficiently. However, because of the dynamic network topology and the fluctuating transmission quality at the edge, worker node selection strongly affects the performance of DML. In this paper, we focus on the Internet of Vehicles (IoV), a typical EIS scenario, and take DML-based High Definition (HD) mapping and intelligent driving decision models as the example. The worker selection problem is modeled as a Markov Decision Process (MDP) that maximizes the aggregate performance of the DML model, which depends on the timeliness of the local model, the transmission quality of the model parameter uploads, and the effective sensing area of the worker. A Deep Reinforcement Learning (DRL) based solution, called the Worker Selection based on Policy Gradient (PG-WS) algorithm, is proposed. The policy mapping from the system state to the worker selection action is represented by a deep neural network. Episodic simulations are built, and the REINFORCE algorithm with baseline is used to train the policy network. Results show that the proposed PG-WS algorithm outperforms the comparison methods.
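The training idea named in the abstract (a policy that maps the system state to a worker selection, updated by REINFORCE with a baseline) can be illustrated with a minimal sketch. This is not the authors' PG-WS implementation: it assumes a linear softmax policy in place of the deep neural network, one-step episodes, three random per-worker features standing in for timeliness, transmission quality, and sensing area, and an equally weighted reward. All names (`softmax`, `train`, the feature layout) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(num_workers=5, episodes=3000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline on a toy selection task."""
    rng = random.Random(seed)
    reward_weights = (1.0, 1.0, 1.0)  # assumption: equal weights per factor
    theta = [0.0, 0.0, 0.0]           # policy parameters, one per feature
    baseline = 0.0                    # running estimate of the mean reward

    for _ in range(episodes):
        # Hypothetical state: three features in [0, 1) for each worker.
        feats = [[rng.random() for _ in range(3)] for _ in range(num_workers)]

        # Policy: softmax over linear scores theta . x_i.
        scores = [sum(t * x for t, x in zip(theta, f)) for f in feats]
        probs = softmax(scores)
        action = rng.choices(range(num_workers), weights=probs)[0]

        # Toy reward: weighted sum of the selected worker's features.
        reward = sum(w * x for w, x in zip(reward_weights, feats[action]))
        advantage = reward - baseline

        # Gradient of log pi(action) for a linear softmax policy:
        # x_action minus the probability-weighted mean feature vector.
        mean_feat = [sum(p * f[k] for p, f in zip(probs, feats))
                     for k in range(3)]
        for k in range(3):
            theta[k] += lr * advantage * (feats[action][k] - mean_feat[k])

        # Move the baseline toward the observed reward.
        baseline += 0.05 * (reward - baseline)

    return theta


if __name__ == "__main__":
    print(train())
```

Because the toy reward grows with every feature, the learned parameters should end up positive, so the policy comes to prefer workers with higher feature values. Subtracting the baseline leaves the expected gradient unchanged and only reduces its variance.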



Key words: edge information system; internet of vehicles; distributed machine learning; deep reinforcement learning; worker selection
Received: 31 July 2020      Online: 19 August 2021
Fund: Science and Technology Foundation of Beijing Municipal Commission of Education (KM201810005027); National Natural Science Foundation of China (U1633115); Beijing Natural Science Foundation (L192002)
Corresponding Authors: Wenjun Wu     E-mail: wenjunwu@bjut.edu.cn
About author: Junyu Dong received the BS degree in communication engineering from North China University of Technology in 2019. He is currently pursuing the MS degree at the Faculty of Information Technology, Beijing University of Technology. His research interests include internet of vehicles and mobile edge computing.
Wenjun Wu received the BS and PhD degrees from Beijing University of Posts and Telecommunications, Beijing, China in 2007 and 2012, respectively. From 2012 to 2015, she was a post-doctoral researcher at the School of Electronic and Information Engineering, Beihang University, Beijing, China. She is now an associate professor at the Faculty of Information Technology, Beijing University of Technology, Beijing, China. Her research interests are in the fields of mobile edge computing, blockchain, and deep reinforcement learning.
Yang Gao received the BS degree in communication engineering from Beijing University of Technology, Beijing, China in 2018. She is currently pursuing the PhD degree in electronic science and technology at Beijing University of Technology, Beijing, China. Her current research interests include mobile edge computing, blockchain, deep reinforcement learning, and wireless resource management.
Xiaoxi Wang received the BS degree in communication engineering from Hebei University of Engineering in 2019. She is currently pursuing the MS degree at the Faculty of Information Technology, Beijing University of Technology. Her current research interests include distributed machine learning and wireless networks.
Pengbo Si received the BE and PhD degrees from Beijing University of Posts and Telecommunications, Beijing, China in 2004 and 2009, respectively. He joined Beijing University of Technology, Beijing, China in 2009, where he is currently a full professor. From November 2007 to November 2008, he was a visiting student at Carleton University, Ottawa, Canada. From November 2014 to November 2015, he was a visiting scholar at the University of Florida, Gainesville, FL, USA.
Cite this article:

Junyu Dong, Wenjun Wu, Yang Gao, Xiaoxi Wang, Pengbo Si. Deep reinforcement learning based worker selection for distributed machine learning enhanced edge intelligence in internet of vehicles. , 2020, 1: 234-242.

URL:

http://icn.tsinghuajournals.com/10.23919/ICN.2020.0015     OR     http://icn.tsinghuajournals.com/Y2020/V1/I3/234

Fig. 1 System architecture.
Fig. 2 DRL-based solution framework.
Parameter                                   Setting
Inter-site distance (m)                     100
Number of vehicle nodes N                   30
Noise density (dBm/Hz)                      -174
Size of intelligent information (bits)      12 000
Number of channel resource units            48
Moving speed of vehicle nodes (km/h)        [45, 90]
Time of tasks in uplink transmission        {1, 2, 4, 8}
Threshold of link outage                    0.05
Time length of observed states T            20
Discount factor λ                           0.9
Initial weights r1, r2, and r3              1
Table 1 Simulation parameters.
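As a rough illustration of how two of these settings enter a simulation, the sketch below stores some of the table values in a dictionary, converts the noise density into a noise power over a given bandwidth, and computes discounted returns with the listed discount factor. The dictionary keys and function names are hypothetical, and the 180 kHz bandwidth in the usage lines is an assumption chosen only for the arithmetic; the table does not state a bandwidth.

```python
import math

# Values copied from Table 1; the key names are hypothetical.
SIM_PARAMS = {
    "num_vehicle_nodes": 30,
    "noise_density_dbm_per_hz": -174.0,
    "info_size_bits": 12_000,
    "num_channel_resource_units": 48,
    "link_outage_threshold": 0.05,
    "observed_state_length": 20,
    "discount_factor": 0.9,
}


def noise_power_dbm(bandwidth_hz,
                    density=SIM_PARAMS["noise_density_dbm_per_hz"]):
    """Noise power in dBm: density (dBm/Hz) plus 10*log10(bandwidth in Hz)."""
    return density + 10.0 * math.log10(bandwidth_hz)


def discounted_returns(rewards, discount=SIM_PARAMS["discount_factor"]):
    """Return G_t = r_t + discount * G_{t+1} for every step of an episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + discount * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # Assumed bandwidth of 180 kHz, for illustration only.
    print(round(noise_power_dbm(180e3), 2))
    print(discounted_returns([1.0, 1.0, 1.0]))
```

With a discount factor of 0.9, a reward received k steps in the future is scaled by 0.9 to the power k, so rewards roughly ten or more steps away contribute less than 35% of their face value.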
Fig. 3 Performance of average value in the training process. 
Fig. 4 Training performance with different learning rates.
Fig. 5 Performance comparison with different reward weights.