Monte Carlo tree search control scheme for multibody dynamics applications

dc.contributor.author: Tang, Yixuan (cs)
dc.contributor.author: Orzechowski, Grzegorz (cs)
dc.contributor.author: Prokop, Aleš (cs)
dc.contributor.author: Mikkola, Aki (cs)
dc.coverage.issue: 10 (cs)
dc.coverage.volume: 112 (cs)
dc.date.issued: 2024-04-03 (cs)
dc.description.abstract: There is considerable interest in applying reinforcement learning (RL) to improve machine control across multiple industries, and the automotive industry is a prime example. Monte Carlo Tree Search (MCTS) has emerged as a powerful method for decision-making games, even without knowledge of the rules. In this study, multibody system dynamics (MSD) control is first modeled as a Markov decision process and solved with MCTS. Based on randomized exploration of the search space, the MCTS framework builds a selective search tree by repeatedly applying a Monte Carlo rollout at each child node. However, without a library of available choices, deciding among the many possibilities for agent parameters can be daunting. In addition, MCTS poses a significant search challenge due to the large branching factor, which is typically overcome by appropriate parameter design, search guiding, action reduction, parallelization, and early termination. To address these shortcomings, the overarching goal of this study is to provide the needed insight into inverted pendulum control via vanilla and modified MCTS agents. A series of reward functions is designed according to the control goal; each maps a specific distribution shape of the reward bonus and guides the MCTS-based controller to maintain the upright position. Numerical examples show that the reward-modified MCTS algorithms significantly improve control performance and robustness over the default constant reward that constitutes the vanilla MCTS. The exponentially decaying reward functions perform better than the constant or polynomial reward functions. Moreover, the exploitation-exploration trade-off and discount parameters are carefully tested. The study's results can guide researchers applying RL to MSD. (en)
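The abstract's selection, expansion, rollout, and backpropagation loop, together with the constant-versus-exponential reward shapes it compares, can be sketched as follows. This is a minimal illustrative sketch only; the toy pendulum dynamics, parameter values, and all function names here are assumptions for illustration, not the paper's implementation.

```python
import math
import random

ACTIONS = (-1.0, 1.0)   # hypothetical discrete pushes: left / right
DT = 0.02               # integration step (s); illustrative value
G, L = 9.81, 1.0        # gravity, pendulum length; illustrative values


def step(state, action):
    """One Euler step of a toy inverted pendulum (upright at theta = 0)."""
    theta, omega = state
    omega += (G / L * math.sin(theta) + action) * DT
    theta += omega * DT
    return (theta, omega)


def reward(state, shape="exp"):
    """Two reward-bonus shapes from the abstract: constant vs. exponential decay."""
    theta, _ = state
    if shape == "const":
        return 1.0 if abs(theta) < 0.5 else 0.0  # vanilla-style constant bonus
    return math.exp(-abs(theta))                 # decays away from upright


class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}          # action -> Node
        self.visits, self.value = 0, 0.0

    def uct_child(self, c=1.4):
        # Exploitation vs. exploration trade-off via the UCT rule.
        return max(
            self.children.items(),
            key=lambda kv: kv[1].value / (kv[1].visits + 1e-9)
            + c * math.sqrt(math.log(self.visits + 1) / (kv[1].visits + 1e-9)),
        )


def rollout(state, depth=20, gamma=0.99):
    """Random Monte Carlo rollout returning a discounted reward sum."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        state = step(state, random.choice(ACTIONS))
        total += discount * reward(state)
        discount *= gamma
    return total


def mcts(root_state, iterations=200):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # Selection: descend fully expanded nodes by UCT.
        while len(node.children) == len(ACTIONS):
            _, node = node.uct_child()
        # Expansion: add one untried action.
        a = random.choice([a for a in ACTIONS if a not in node.children])
        child = Node(step(node.state, a), parent=node)
        node.children[a] = child
        # Simulation + backpropagation.
        value = rollout(child.state)
        while child is not None:
            child.visits += 1
            child.value += value
            child = child.parent
    # Act greedily by visit count at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Swapping `reward(..., shape="exp")` for `shape="const"` in the rollout reproduces the vanilla-versus-shaped comparison the abstract reports, and the constant `c` in `uct_child` is the exploitation-exploration parameter the paper says it tests.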
dc.format: text (cs)
dc.format.extent: 8363-8391 (cs)
dc.format.mimetype: application/pdf (cs)
dc.identifier.citation: NONLINEAR DYNAMICS. 2024, vol. 112, issue 10, p. 8363-8391. (en)
dc.identifier.doi: 10.1007/s11071-024-09509-8 (cs)
dc.identifier.issn: 0924-090X (cs)
dc.identifier.orcid: 0000-0002-7526-1366 (cs)
dc.identifier.other: 188384 (cs)
dc.identifier.researcherid: IWE-1849-2023 (cs)
dc.identifier.scopus: 57055245100 (cs)
dc.identifier.uri: http://hdl.handle.net/11012/245527
dc.language.iso: en (cs)
dc.publisher: Springer Nature (cs)
dc.relation.ispartof: NONLINEAR DYNAMICS (cs)
dc.relation.uri: https://doi.org/10.1007/s11071-024-09509-8 (cs)
dc.rights: Creative Commons Attribution 4.0 International (cs)
dc.rights.access: openAccess (cs)
dc.rights.sherpa: http://www.sherpa.ac.uk/romeo/issn/0924-090X/ (cs)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (cs)
dc.subject: Monte Carlo Tree Search (en)
dc.subject: Multibody dynamics (en)
dc.subject: Reward functions (en)
dc.subject: Parametric analysis (en)
dc.subject: Artificial intelligence control (en)
dc.subject: Inverted pendulum (en)
dc.title: Monte Carlo tree search control scheme for multibody dynamics applications (en)
dc.title.alternative: Monte Carlo tree search control scheme for multibody dynamics applications (en)
dc.type.driver: article (en)
dc.type.status: Peer-reviewed (en)
dc.type.version: publishedVersion (en)
sync.item.dbid: VAV-188384 (en)
sync.item.dbtype: VAV (en)
sync.item.insts: 2025.10.14 15:05:44 (en)
sync.item.modts: 2025.10.14 10:24:49 (en)
thesis.grantor: Vysoké učení technické v Brně. Fakulta strojního inženýrství. Ústav automobilního a dopravního inženýrství (cs)

Files

Original bundle

Name: s11071024095098.pdf
Size: 4.59 MB
Format: Adobe Portable Document Format
Description: file s11071024095098.pdf