Regression Trees and Random Forests for Predictions

Date
2023
Authors
Oberta, Dušan
Publisher
Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky
Abstract
Regression trees are widely used in statistics to capture relationships, not always trivial ones, between predictors (independent variables) and a response variable (dependent variable). They can be applied in a variety of situations where other statistical tools are not suitable, even when the number of predictors exceeds the number of observations in the training data. Random forests generalize the concept of regression trees in order to reduce the variance and improve the stability of single regression trees. In addition to classical regression trees based on the least squares method, the author describes and derives a maximum likelihood approach under the assumption of a gamma-distributed response variable. Compared with the literature surveyed, slightly different proofs of the theorems on pruning of regression trees are given, and a thorough derivation of confidence intervals for the expected value of the response variable is presented as the author's own work. The last part of the article introduces the concept of random forests.
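As a rough illustration of the prediction setting the abstract describes, the following sketch fits a single least-squares regression tree and a random forest to synthetic data using scikit-learn. The data-generating function, hyperparameters, and library choice are assumptions made for illustration only; this is not the author's derivation, and in particular the gamma-likelihood splitting criterion discussed in the article is not implemented here.

```python
# Minimal sketch (illustrative assumptions, not the article's method):
# compare a single least-squares regression tree with a random forest,
# whose averaging over bootstrapped trees typically reduces variance.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))                            # three predictors
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.3, 500)    # noisy response

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Single regression tree grown with the least squares criterion (higher variance).
tree = DecisionTreeRegressor(max_depth=5).fit(X_train, y_train)

# Random forest: an ensemble of trees grown on bootstrap samples and averaged.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

print("tree   test MSE:", mean_squared_error(y_test, tree.predict(X_test)))
print("forest test MSE:", mean_squared_error(y_test, forest.predict(X_test)))
```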
Citation
Kvaternion, 2023, vol. 9, no. 1-2, pp. 113-136. ISSN 1805-1332
http://kvaternion.fme.vutbr.cz/2023/kv23_1-2_oberta_web.pdf
Document type
Peer-reviewed
Document version
Published version
Language of document
en
Document licence
© Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky