2019/2
Recent Submissions
- On the ubiquity of the Bayesian paradigm in statistical machine learning and data science (Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky, 2019). Fokoué, Ernest. This paper seeks to provide a thorough account of the ubiquitous nature of the Bayesian paradigm in modern statistics, data science and artificial intelligence. Once maligned, on the one hand by those who philosophically hated the very idea of subjective probability used in prior specification, and on the other hand because of the intractability of the computations needed for Bayesian estimation and inference, the Bayesian school of thought now permeates and pervades virtually all areas of science, applied science, engineering, social science and even liberal arts, often in unsuspected ways. Thanks in part to the availability of powerful computing resources, but also to the literally unavoidable inherent presence of the quintessential building blocks of the Bayesian paradigm in all walks of life, the Bayesian way of handling statistical learning, estimation and inference is not only mainstream but also becoming the most central approach to learning from the data. This paper explores some of the most relevant elements to help the reader appreciate the pervading power and presence of the Bayesian paradigm in statistics, artificial intelligence and data science, with an emphasis on how the Gospel according to Reverend Thomas Bayes has turned out to be the truly good news, and in some cases the amazing saving grace, for all who seek to learn statistically from the data.
- On a global measure of nonlinearity and its application in parameter estimation in nonlinear regression (Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky, 2019). Khinkis, Leonid. The theoretical and computational challenges in least squares estimation of parameters in nonlinear regression models are well documented in statistical literature. The measures of nonlinearity are intended to quantify the degree of nonlinearity and to explain the relationship between nonlinearity and statistical properties of a model. A new measure of nonlinearity reflecting the model's global behavior is introduced and discussed in this paper. Two new criteria for global minimum of the sum of squares in nonlinear regression incorporating this measure are presented and illustrated on several published examples.
- On the versatility and polyvalence of certain statistical learning machines (Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky, 2019). Fokoué, Ernest. As data science and its flurry of lucrative career opportunities continue to dominate strategic planning meetings at companies and universities around the world, it is remarkable to notice that mathematics, the queen of all sciences, is still called upon to play a central role. I use mathematics here sensu lato to mean mathematical sciences in general, including algebra, analysis, probability, statistics and theoretical computer science. Indeed all the statistical learning machines and traditional statistical methods permeating the articles of this special issue have in common the fact that they all rest on strong mathematical foundations, even though some of the vast mathematical details are not shown here in some cases due to space constraints.
- Prediction and evaluation in College Hockey using the Bradley–Terry–Zermelo model (Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky, 2019). Whelan, John T.; Wodon, Adam. We describe the application of the Bradley–Terry model to NCAA Division I Men's Ice Hockey. A Bayesian construction gives a joint posterior probability distribution for the log-strength parameters, given a set of game results and a choice of prior distribution. For several suitable choices of prior, it is straightforward to find the maximum a posteriori point (MAP) and a Hessian matrix, allowing a Gaussian approximation to be constructed. Posterior predictive probabilities can be estimated by 1) setting the log-strengths to their MAP values, 2) using the Gaussian approximation for analytical or Monte Carlo integration, or 3) applying importance sampling to re-weight the results of a Monte Carlo simulation. We define a method to evaluate any models which generate predicted probabilities for future outcomes, using the Bayes factor given the actual outcomes, and apply it to NCAA tournament results. Finally, we describe an on-line tool which currently estimates probabilities of future results using MAP evaluation and describe how it can be refined using the Gaussian approximation or importance sampling.
- What do Asian and non-Asian scriptures have in common? An applied statistical machine learning inquiry (Vysoké učení technické v Brně, Fakulta strojního inženýrství, Ústav matematiky, 2019). Sah, Preeti; Fokoué, Ernest. This paper presents a substantially detailed statistical machine learning approach to the analysis of several aspects of sacred texts from both the Asian and Biblical scriptural canons. The corpus herein considered consists of 4 Asian sacred scriptures, namely the Tao Te Ching, the teachings of the Buddha, the Yogasutras of Patanjali, and the Upanishads, and 4 non-Asian sacred texts, essentially four books from the Bible, namely the Book of Proverbs, the Book of Wisdom, the Book of Ecclesiastes and the Book of Ecclesiasticus. Standard text mining tools are used, like the creation of Document Term Matrices (DTM) to pre-process raw English translations into word frequencies, and both unsupervised and supervised learning methods are used to answer some foundational questions featuring similarities and dissimilarities within each canon and interesting differences between all the canons considered. Despite the vast disparities between the translators of the original texts, our findings reveal sharp differences between Asian and non-Asian scriptures regardless of whether clustering techniques or pattern recognition methods are used. We provide several compelling visualizations to help highlight our striking findings, chief of which are the persistent groupings of the scriptures based on geography.
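
The last abstract above mentions Document Term Matrices (DTM) as the pre-processing step that turns raw text into word frequencies. A minimal sketch of that idea follows, using only the Python standard library. The function name `document_term_matrix`, the tokenization rule and the two toy strings are assumptions made for illustration; they are not taken from the paper, whose actual tooling and corpus are not described in this listing.

```python
import re
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i.

    Illustrative helper only: lowercases the text and keeps runs of ASCII
    letters as tokens, which is a simplifying assumption.
    """
    tokenized = [re.findall(r"[a-z]+", text.lower()) for text in docs]
    # Sorted vocabulary gives a stable column order across runs.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    # Counter returns 0 for terms a document does not contain.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical toy documents, not drawn from the paper's corpus.
docs = ["the path is the goal", "wisdom is better than gold"]
vocab, dtm = document_term_matrix(docs)

for row in dtm:
    print(dict(zip(vocab, row)))
```

Each row of `dtm` is one document and each column one vocabulary term, so rows can be passed to clustering or classification methods of the kind the abstract refers to.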