pi-football: A Bayesian network model for predicting football match results

Association Football (hereafter referred to simply as ‘football’) is the world’s most popular sport [11], [43] and [12], and constitutes the fastest growing gambling market [7]. As a result, researchers continue to introduce a variety of football models which are formulated by diverse forecast methodologies. While some of these focus on predicting tournament outcomes [36], [4], [35], [26] and [27] or league positions [34], our interest is in predicting outcomes of individual matches.

A common approach is goal-based analysis using the Poisson distribution, whereby match results are generated from the attack and defence parameters of the two competing teams [41], [9], [38] and [32]. A similar version is reported in [10], where the authors demonstrate profitability against the market only at very high levels of discrepancy, and only on the basis of a small number of bets against an unspecified bookmaker. A time-varying Poisson version was proposed in [53], in which the authors demonstrate profitability against Intertops (a bookmaker located in Antigua, West Indies); refinements of this technique were later proposed in [8], allowing for a computationally less demanding model.
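To make the Poisson idea concrete, the following is a minimal sketch and not the formulation used in any of the cited papers. It assumes independent Poisson goal counts for the two teams, multiplicative attack and defence strengths, and a hypothetical home-advantage factor; all parameter values are invented for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_probabilities(home_rate: float, away_rate: float, max_goals: int = 10):
    """Return (p_home_win, p_draw, p_away_win) for independent Poisson scores.

    The score grid is truncated at max_goals per team and renormalised,
    which is an approximation chosen here for simplicity.
    """
    p_home = p_draw = p_away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                p_home += p
            elif h == a:
                p_draw += p
            else:
                p_away += p
    total = p_home + p_draw + p_away
    return p_home / total, p_draw / total, p_away / total


# Hypothetical strengths (assumptions, not estimates from real data).
home_attack, home_defence = 1.2, 0.9
away_attack, away_defence = 1.0, 1.1
home_advantage = 1.3

# Expected goals: own attack times the opponent's defensive weakness.
home_rate = home_attack * away_defence * home_advantage
away_rate = away_attack * home_defence

p_h, p_d, p_a = match_probabilities(home_rate, away_rate)
print(f"p(H)={p_h:.3f}  p(D)={p_d:.3f}  p(A)={p_a:.3f}")
```

In practice the strengths would be estimated from historical scores, for example by maximum likelihood, rather than set by hand.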

In contrast to the Poisson models that predict the number of goals scored and conceded, all other models restrict their predictions to the match result, i.e. win, draw, or loss. Typically these are ordered probit regression models that differ in their explanatory variables. For example, [37] considered team performance data as well as published bookmakers’ odds, whereas [24] and [22] considered team quality, recent performance, match significance and geographical distance. Ref. [23] compared goal-driven models with models that only consider match results and concluded that both versions generate similar predictions.
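As a rough illustration of how an ordered probit model turns explanatory variables into result probabilities, the sketch below maps a latent match score and two cut points to {p(H), p(D), p(A)}. The latent mean stands in for a fitted linear predictor; the function names, cut points and values are assumptions for illustration and are not taken from the cited studies.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit(latent_mean: float, cut_low: float, cut_high: float):
    """Return (p_home_win, p_draw, p_away_win).

    Outcomes are ordered away win < draw < home win. The latent variable is
    latent_mean plus standard normal noise; values below cut_low give an
    away win, values above cut_high give a home win. Requires cut_low < cut_high.
    """
    if cut_low >= cut_high:
        raise ValueError("cut_low must be smaller than cut_high")
    p_away = normal_cdf(cut_low - latent_mean)
    p_home = 1.0 - normal_cdf(cut_high - latent_mean)
    p_draw = 1.0 - p_away - p_home
    return p_home, p_draw, p_away


# Hypothetical linear predictor, e.g. a weighted sum of team-quality covariates.
example = ordered_probit(latent_mean=0.3, cut_low=-0.4, cut_high=0.4)
print(example)
```

A larger latent mean shifts probability from the away win towards the home win, while the gap between the two cut points controls how much mass is assigned to the draw.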

Techniques from the field of machine learning have also been proposed for prediction. In [55], the authors claimed that a technique based on genetic programming was superior in predicting football outcomes to two other methods based on fuzzy models and neural networks. More recently, [52] claimed that acceptable match simulation results can be obtained by tuning fuzzy rules using parameters of fuzzy-term membership functions and rule weights by a combination of genetic and neural optimisation techniques.

Models based on team quality ratings have also been considered, but they do not appear to have been extensively evaluated. Knorr-Held [33] used a dynamic cumulative link model to generate ratings for top division football teams in Germany. The ELO rating, initially developed for assessing the strength of chess players [13], has been adapted to football [3]. In [29], the authors used the ELO rating for match predictions and concluded that the ratings appeared to be useful in encoding the information of past results for measuring the strength of a team, but that the forecasts generated were not on par with market odds. The authors of [40] also assessed an ELO rating based model along with the FIFA/Coca-Cola World Ranking model and concluded that both were inferior to bookmakers’ forecasts for EURO 2008.
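For readers unfamiliar with ELO ratings, the sketch below shows the standard chess-style expected-score and update rules. It is not the football adaptation of [3] or the model of [29]; the K-factor of 20 and the ratings are arbitrary illustrative values, and football variants often add terms (such as home advantage or goal difference) that are omitted here.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of team A against team B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return the updated (rating_a, rating_b) after one match.

    score_a is the actual result from A's point of view: 1.0, 0.5 or 0.0.
    The update is zero-sum: whatever A gains, B loses.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: the weaker side wins and gains more than k / 2 points.
new_strong, new_weak = elo_update(1600.0, 1500.0, score_a=0.0)
print(new_strong, new_weak)
```

Because the rating is a single number per team, it summarises past results compactly, which is consistent with the observation in [29] that it encodes historical strength but does not by itself match market odds.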

Numerous studies have considered the impact of specific factors on match outcome. These factors include home advantage [28], ball possession [28], and red cards [51] and [56].

Recently, researchers have considered Bayesian networks and subjective information for football match predictions. In particular, [31] demonstrated the importance of supplementing data with expert judgement by showing that an expert-constructed Bayesian network model was more accurate in generating forecasts for matches involving Tottenham Hotspur than the machine learners MC4, naive Bayes, Bayesian learning and K-nearest neighbour. A model that combined a Bayesian network with a rule-based reasoner appeared to provide reasonable World Cup forecasts in [42] by simulating various predefined strategies along with subjective information, whereas in [2] a hierarchical Bayesian network model that did not incorporate subjective judgments appeared to be inferior in predicting football results when compared to standard Poisson distribution models.

In this paper we present a new Bayesian network model for forecasting the outcomes of football matches in the form of a distribution {p(H), p(D), p(A)}, corresponding to home win, draw and away win respectively. We believe this study is important for the following reasons:

(a) the model is profitable under maximum, mean and common bookmakers’ odds, even by allowing for the bookmakers’ introduced profit margin;

(b) the model priors are dependent on statistics derived from predetermined scales of team-strength, rather than statistics derived from a particular team (hence enabling us to maximise historical data);

(c) the model enables us to revise forecasts from objective data, by incorporating subjective information for important factors that are not captured in the historical data;

(d) the significance of recent information (objective or subjective) is weighted using degrees of uncertainty resulting in a non-symmetric Bayesian parameter learning procedure;

(e) forecasts were published online before the start of each match [49];

(f) although the model has so far been applied for one league (the English Premier League) it is easily applicable to any other football league.
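Reason (a) refers to the bookmakers’ profit margin. As a hedged illustration of what that margin is, the sketch below converts decimal odds into implied probabilities, measures how far they sum above one, and computes the expected profit of a bet given a model probability. The odds and probabilities are invented, and proportional normalisation is only one of several ways to remove the margin; none of this is the paper's own profitability assessment.

```python
def implied_probabilities(decimal_odds):
    """Raw implied probabilities 1/odds; they sum to more than 1 when a margin is included."""
    return [1.0 / o for o in decimal_odds]


def bookmaker_margin(decimal_odds):
    """Profit margin (overround): the excess of the implied probabilities over 1."""
    return sum(implied_probabilities(decimal_odds)) - 1.0


def normalised_probabilities(decimal_odds):
    """Implied probabilities rescaled proportionally so that they sum to 1."""
    raw = implied_probabilities(decimal_odds)
    total = sum(raw)
    return [p / total for p in raw]


def expected_profit(model_probability, decimal_odds, stake=1.0):
    """Expected profit of one bet if the model probability is taken as correct."""
    return stake * (model_probability * decimal_odds - 1.0)


# Hypothetical decimal odds for home win, draw and away win.
example_odds = [2.10, 3.40, 3.60]
print(bookmaker_margin(example_odds))
print(normalised_probabilities(example_odds))
print(expected_profit(0.55, example_odds[0]))
```

A bet has positive expected profit only when the model probability exceeds the raw implied probability 1/odds, so a model must beat the odds by more than the margin to be profitable.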

The paper is organised as follows: section 2 describes the historical data and method used to inform the model priors, section 3 describes the Bayesian network model, section 4 describes the assessment methods and section 5 provides our concluding remarks and future work.

Source:
https://isiarticles.com/
