RT Dissertation/Thesis
T1 Machine Learning Algorithms for Provisioning Cloud/Edge Applications
A1 Ayimba, Constantine Ochieng
A2 IMDEA Networks Institute
AB Reinforcement Learning (RL), in which an agent is trained to make the most favourable decisions in the long run, is an established technique in artificial intelligence. Its popularity has increased in the recent past, largely due to the development of deep neural networks spawning deep reinforcement learning algorithms such as Deep Q-Learning. The latter have been used to solve previously insurmountable problems, such as playing the famed game of “Go”, which earlier algorithms could not. Many such problems suffer the curse of dimensionality, in which the sheer number of possible states is so overwhelming that it is impractical to explore every possible option.

While these recent techniques have been successful, they may not be strictly necessary or practical for some applications, such as cloud provisioning. In these situations, the action space is not as vast, and the workload data required to train such systems is not widely shared, as it is considered commercially sensitive by the Application Service Provider (ASP). Given that provisioning decisions evolve over time in response to incident workloads, they fit the sequential decision process problem that legacy RL was designed to solve. However, because of the high correlation of time-series data, states are not independent of each other, and legacy Markov Decision Processes (MDPs) have to be cleverly adapted to create robust provisioning algorithms.

As the first contribution of this thesis, we exploit knowledge of both the application and its configuration to create an adaptive provisioning system leveraging stationary Markov distributions. We then develop algorithms that, with neither application nor configuration knowledge, solve the underlying MDP to create provisioning systems.
Our Q-Learning algorithms factor in the correlation between states, and the consequent transitions between them, to create provisioning systems that not only adapt to workloads but also exploit similarities between them, thereby reducing the retraining overhead. Our algorithms also converge in fewer learning steps, since we restructure the state and action spaces to avoid the curse of dimensionality without the function approximation approach taken by deep Q-Learning systems.

A crucial use case of future networks will be the support of low-latency applications involving highly mobile users. With these in mind, the European Telecommunications Standards Institute (ETSI) has proposed the Multi-access Edge Computing (MEC) architecture, in which computing capabilities can be located close to the network edge, where the data is generated. Provisioning for such applications therefore entails migrating them to the most suitable location on the network edge as the users move. In this thesis, we also tackle this type of provisioning by considering vehicle platooning, or Cooperative Adaptive Cruise Control (CACC), on the edge. We show that our Q-Learning algorithm can be adapted to minimize the number of migrations required to effectively run such an application on MEC hosts, which may also be subject to traffic from other competing applications.
YR 2022
FD 2022-04
LK https://hdl.handle.net/10016/35921
UL https://hdl.handle.net/10016/35921
LA eng
NO Mención Internacional en el título de doctor
NO This work has been supported by IMDEA Networks Institute
DS e-Archivo
RD 27 jul. 2024