Monte Carlo tree search control scheme for multibody dynamics applications

dc.contributor.author: Tang, Yixuan (cs)
dc.contributor.author: Orzechowski, Grzegorz (cs)
dc.contributor.author: Prokop, Aleš (cs)
dc.contributor.author: Mikkola, Aki (cs)
dc.coverage.issue: 10 (cs)
dc.coverage.volume: 112 (cs)
dc.date.issued: 2024-04-03 (cs)
dc.description.abstract: There is considerable interest in applying reinforcement learning (RL) to improve machine control across multiple industries, and the automotive industry is a prime example. Monte Carlo Tree Search (MCTS) has emerged as a powerful method for decision-making games, even without knowledge of the rules. In this study, multibody system dynamics (MSD) control is first modeled as a Markov decision process and solved with MCTS. Based on randomized exploration of the search space, the MCTS framework builds a selective search tree by repeatedly applying a Monte Carlo rollout at each child node. However, without a library of available choices, deciding among the many possibilities for agent parameters can be daunting. In addition, MCTS poses a significant search challenge due to the large branching factor, which is typically overcome by appropriate parameter design, search guiding, action reduction, parallelization, and early termination. To address these shortcomings, the overarching goal of this study is to provide the needed insight into inverted pendulum control via vanilla and modified MCTS agents. A series of reward functions is designed according to the control goal; each maps a specific distribution shape of the reward bonus and guides the MCTS-based controller to maintain the upright position. Numerical examples show that the reward-modified MCTS algorithms significantly improve control performance and robustness over the default constant reward that constitutes the vanilla MCTS. The exponentially decaying reward functions perform better than the constant or polynomial reward functions. Moreover, the exploitation-exploration trade-off and discount parameters are carefully tested. The study's results can guide researchers applying RL to MSD. (en)
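The abstract's selection, expansion, rollout, and backpropagation loop, together with the constant-versus-exponential reward shapes it compares, can be sketched as follows. This is a minimal illustrative sketch only; the toy pendulum dynamics, parameter values, and all function names here are assumptions for illustration, not the paper's implementation.

```python
import math
import random

ACTIONS = (-1.0, 1.0)   # hypothetical discrete pushes: left / right
DT = 0.02               # integration step (s); illustrative value
G, L = 9.81, 1.0        # gravity, pendulum length; illustrative values


def step(state, action):
    """One Euler step of a toy inverted pendulum (upright at theta = 0)."""
    theta, omega = state
    omega += (G / L * math.sin(theta) + action) * DT
    theta += omega * DT
    return (theta, omega)


def reward(state, shape="exp"):
    """Two reward-bonus shapes from the abstract: constant vs. exponential decay."""
    theta, _ = state
    if shape == "const":
        return 1.0 if abs(theta) < 0.5 else 0.0  # vanilla-style constant bonus
    return math.exp(-abs(theta))                 # decays away from upright


class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}          # action -> Node
        self.visits, self.value = 0, 0.0

    def uct_child(self, c=1.4):
        # Exploitation vs. exploration trade-off via the UCT rule.
        return max(
            self.children.items(),
            key=lambda kv: kv[1].value / (kv[1].visits + 1e-9)
            + c * math.sqrt(math.log(self.visits + 1) / (kv[1].visits + 1e-9)),
        )


def rollout(state, depth=20, gamma=0.99):
    """Random Monte Carlo rollout returning a discounted reward sum."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        state = step(state, random.choice(ACTIONS))
        total += discount * reward(state)
        discount *= gamma
    return total


def mcts(root_state, iterations=200):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend fully expanded nodes by UCT.
        while len(node.children) == len(ACTIONS):
            _, node = node.uct_child()
        # Expansion: add one untried action.
        a = random.choice([a for a in ACTIONS if a not in node.children])
        child = Node(step(node.state, a), parent=node)
        node.children[a] = child
        # Simulation + backpropagation.
        value = rollout(child.state)
        while child is not None:
            child.visits += 1
            child.value += value
            child = child.parent
    # Act greedily by visit count at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Swapping `reward(..., shape="exp")` for `shape="const"` in the rollout reproduces the vanilla-versus-shaped comparison the abstract reports, and the constant `c` in `uct_child` is the exploitation-exploration parameter the paper says it tests.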
dc.format: text (cs)
dc.format.extent: 8363-8391 (cs)
dc.format.mimetype: application/pdf (cs)
dc.identifier.citation: NONLINEAR DYNAMICS. 2024, vol. 112, issue 10, p. 8363-8391. (en)
dc.identifier.doi: 10.1007/s11071-024-09509-8 (cs)
dc.identifier.issn: 0924-090X (cs)
dc.identifier.orcid: 0000-0002-7526-1366 (cs)
dc.identifier.other: 188384 (cs)
dc.identifier.researcherid: IWE-1849-2023 (cs)
dc.identifier.scopus: 57055245100 (cs)
dc.identifier.uri: http://hdl.handle.net/11012/245527
dc.language.iso: en (cs)
dc.publisher: Springer Nature (cs)
dc.relation.ispartof: NONLINEAR DYNAMICS (cs)
dc.relation.uri: https://doi.org/10.1007/s11071-024-09509-8 (cs)
dc.rights: Creative Commons Attribution 4.0 International (cs)
dc.rights.access: openAccess (cs)
dc.rights.sherpa: http://www.sherpa.ac.uk/romeo/issn/0924-090X/ (cs)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (cs)
dc.subject: Monte Carlo Tree Search (en)
dc.subject: Multibody dynamics (en)
dc.subject: Reward functions (en)
dc.subject: Parametric analysis (en)
dc.subject: Artificial intelligence control (en)
dc.subject: Inverted pendulum (en)
dc.title: Monte Carlo tree search control scheme for multibody dynamics applications (en)
dc.title.alternative: Monte Carlo tree search control scheme for multibody dynamics applications (en)
dc.type.driver: article (en)
dc.type.status: Peer-reviewed (en)
dc.type.version: publishedVersion (en)
sync.item.dbid: VAV-188384 (en)
sync.item.dbtype: VAV (en)
sync.item.insts: 2025.10.14 15:05:44 (en)
sync.item.modts: 2025.10.14 10:24:49 (en)
thesis.grantor: Vysoké učení technické v Brně. Fakulta strojního inženýrství. Ústav automobilního a dopravního inženýrství (cs)

Files

Original bundle

Name: s11071024095098.pdf
Size: 4.59 MB
Format: Adobe Portable Document Format
Description: file s11071024095098.pdf