Regression Trees and Random Forests for Predictions

Authors

Oberta, Dušan

Publisher

Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky

Abstract

Regression trees are widely used in statistics to capture relationships, not always trivial ones, between predictors (i.e. independent variables) and a response variable (i.e. dependent variable). They can be used in a variety of situations where other statistical tools are not suitable, even when the number of predictors exceeds the number of observations in the training data. Random forests generalize the concept of regression trees in order to reduce the variance and improve the stability of simple regression trees. Apart from classical regression trees based on the least squares method, a maximum likelihood approach under the assumption of a gamma-distributed response variable is described and derived by the author. Compared with the literature surveyed, slightly different proofs of the theorems concerning the pruning of regression trees are offered, and a thorough derivation of confidence intervals for the expected value of the response variable is presented as the author's own work. An introduction to the concept of random forests is given in the last part of the article.
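
The contrast between a single regression tree and a random forest described above can be illustrated with a short sketch. The snippet below is not taken from the article; it is a minimal example, assuming scikit-learn and NumPy are available, that fits a least-squares regression tree and a random forest to synthetic data and compares their test errors.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic data: three predictors, noisy nonlinear response (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=500)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single least-squares regression tree (fully grown, hence high variance).
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# A random forest: an average of many trees grown on bootstrap samples,
# which reduces variance and stabilizes the predictions.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

print("single tree MSE  :", mean_squared_error(y_test, tree.predict(X_test)))
print("random forest MSE:", mean_squared_error(y_test, forest.predict(X_test)))

On data of this kind the forest typically attains a noticeably lower test error than the single tree, which is the variance-reduction effect the abstract refers to.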

Citation

Kvaternion. 2023, vol. 9, no. 1-2, pp. 113-136. ISSN 1805-1332
http://kvaternion.fme.vutbr.cz/2023/kv23_1-2_oberta_web.pdf

Document type

Peer-reviewed

Document version

Published version

Language of document

en
