Bayesian network structure learning (BNSL) is the problem of finding a Bayesian network structure which best explains a given dataset. Score-based learning, a widely-used technique for solving BNSL, assigns a score to each network structure. The score measures the goodness of fit of that network to the data; commonly-used scores are based on, for example, Bayesian posterior likelihoods or minimum description length principles. Solving BNSL optimally is known to be NP-hard. Nevertheless, in the last decade, a variety of algorithms have been proposed which are guaranteed to find a network structure that optimizes a given scoring function. The goal of this work is to better understand the empirical behavior of these algorithms.
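For a concrete sense of what such a decomposable score looks like, the sketch below computes a BIC-style score for a candidate structure. It is an illustration only, not material from the talk; the pandas-based counting and the penalty over observed parent configurations are simplifying assumptions.

```python
# Illustrative sketch only: a decomposable BIC-style score for a candidate DAG,
# computed from a pandas DataFrame of discrete data.
import numpy as np
import pandas as pd

def bic_score(data: pd.DataFrame, parents: dict) -> float:
    """`parents` maps each variable name to the list of its parent names."""
    n = len(data)
    total = 0.0
    for var, pa in parents.items():
        r = data[var].nunique()                          # child cardinality
        groups = data.groupby(pa) if pa else [(None, data)]
        q = 0                                            # observed parent configurations
        for _, sub in groups:
            q += 1
            counts = sub[var].value_counts().to_numpy()
            total += np.sum(counts * np.log(counts / counts.sum()))
        total -= 0.5 * np.log(n) * q * (r - 1)           # BIC complexity penalty
    return total

# Score the structure A -> C <- B on a toy dataset; higher is better.
data = pd.DataFrame({"A": [0, 1, 1, 0], "B": [1, 1, 0, 0], "C": [1, 1, 0, 0]})
print(bic_score(data, {"A": [], "B": [], "C": ["A", "B"]}))
```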
In order to finish within a reasonable amount of time, most of the exact algorithms (safely) prune the search space using a variety of heuristics, such as branch-and-bound search. After briefly introducing several of the exact algorithms, I will discuss problem-dependent characteristics which affect the efficacy of the different heuristics. Empirical results show that, despite the complexity of the algorithms, machine learning techniques based on these characteristics can often be used to accurately predict the algorithms' running times.
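As a rough illustration of the runtime-prediction idea, one can fit a standard regressor to instance characteristics. The features and numbers below are invented, not those used in the talk or the underlying papers.

```python
# Illustrative only: predict an exact BNSL solver's runtime from simple
# instance characteristics (all feature values and runtimes are made up).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical features per instance: #variables, #records,
# mean candidate-parent-set count, and a 0/1 encoding of the score type.
X = np.array([[20, 1000, 35.0, 0],
              [30,  500, 80.0, 1],
              [25, 2000, 50.0, 0]])
y = np.log([12.0, 310.0, 45.0])                # log runtimes in seconds

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(np.exp(model.predict([[28, 1500, 60.0, 1]])))   # predicted runtime (s)
```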
Within BNSL, the score typically reflects fit to a training dataset; however, it is well-known that a model may fit a training dataset very well but generalize poorly to unseen data. Thus, it is not clear that finding a network which optimizes a scoring function is worthwhile. The second part of this talk will focus on a study which compares exact and local search techniques. In some scenarios, algorithms such as greedy hill climbing or the polynomial-time Chow-Liu algorithm suffice; for more complex datasets, though, the exact algorithms consistently produce better networks.
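Since the abstract names the Chow-Liu algorithm, here is a minimal sketch of it: mutual-information edge weights followed by a maximum-weight spanning tree. The networkx-based implementation is an assumption for illustration, not the code used in the study.

```python
# Minimal Chow-Liu sketch (illustrative): pairwise mutual information plus a
# maximum spanning tree over the variables of a discrete dataset.
from itertools import combinations
import numpy as np
import pandas as pd
import networkx as nx

def mutual_information(x: pd.Series, y: pd.Series) -> float:
    joint = pd.crosstab(x, y, normalize=True).to_numpy()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def chow_liu_tree(data: pd.DataFrame) -> list:
    g = nx.Graph()
    for u, v in combinations(data.columns, 2):
        g.add_edge(u, v, weight=mutual_information(data[u], data[v]))
    return list(nx.maximum_spanning_tree(g).edges())

data = pd.DataFrame({"A": [0, 1, 1, 0], "B": [0, 1, 1, 0], "C": [1, 0, 1, 0]})
print(chow_liu_tree(data))   # undirected skeleton; orient it by choosing a root
```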
Very recently, several algorithms have been developed for learning bounded-treewidth networks, in which exact inference is guaranteed to be efficient. This talk will conclude with a discussion of preliminary results which suggest that the generalization performance of bounded-treewidth networks is similar to that of more complex structures.
Note: A tutorial (on the first day of the main UAI conference, July 12) by Changhe Yuan and James Cussens will focus on BNSL algorithms. Consequently, this talk will only include a brief overview of the algorithms.
References:
This talk largely pulls from the following references.
Malone, B.; Kangas, K.; Järvisalo, M.; Koivisto, M. & Myllymäki, P. "Predicting the Hardness of Learning Bayesian Networks." Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2014.
Malone, B.; Järvisalo, M. & Myllymäki, P. "Impact of Learning Strategies on the Quality of Bayesian Networks: An Empirical Evaluation." Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence, 2015.
Berg, J.; Järvisalo, M. & Malone, B. "Learning Optimal Bounded Treewidth Bayesian Networks via Maximum Satisfiability." Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 2014.
Brandon Malone is a postdoctoral researcher at the Max Planck Institute for Biology of Ageing. Much of his research has addressed improving the scalability of exact Bayesian network structure learning using admissible heuristic search. He has also worked with interdisciplinary groups to investigate biological problems, such as processes related to ageing, using probabilistic models. He received his PhD from Mississippi State University in 2012. Email: brandon.malone@age.mpg.de.
Uncertainty and risks are common elements of all major projects. Yet such uncertainty is rarely quantified effectively when analysing project costs and benefits. This paper presents a Bayesian network (BN) modelling framework to calculate the costs, benefits, and return on investment of a project over a specified time period, allowing for changing circumstances and trade-offs. The framework uses hybrid and dynamic BNs containing both discrete and continuous variables over multiple time stages. The BN calculates costs and benefits based on multiple causal factors including the effects of individual risk factors, budget deficits, and time value discounting. The method is illustrated using a case study of an agricultural development project.
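As a rough illustration of the discounting and uncertainty-propagation ingredients, the toy Monte Carlo below propagates uncertain per-stage costs and benefits into a distribution over discounted ROI. It is not the paper's hybrid dynamic BN; all distributions are invented.

```python
# Toy illustration only: Monte Carlo propagation of uncertain per-stage
# costs/benefits into a distribution over discounted return on investment.
import numpy as np

rng = np.random.default_rng(0)
stages, rate, n = 3, 0.05, 10_000
costs = rng.normal([100, 20, 20], [10, 5, 5], size=(n, stages))
benefits = rng.normal([0, 80, 120], [0.01, 20, 30], size=(n, stages))

discount = 1.0 / (1 + rate) ** np.arange(stages)       # time value discounting
npv = ((benefits - costs) * discount).sum(axis=1)
roi = npv / (costs * discount).sum(axis=1)

print(np.mean(roi > 0))                                 # P(project pays off)
print(np.percentile(roi, [5, 50, 95]))                  # ROI uncertainty band
```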
Gated Bayesian networks (GBNs) are an extension of Bayesian networks that aim to model systems that have distinct phases. In this paper, we aim to use GBNs to output buy and sell decisions for use in algorithmic trading systems. These systems may have several parameters that require tuning, and their performance as a function of these parameters cannot be expressed in closed form and thus must be assessed by simulation. Bayesian optimisation has grown in popularity as a means of global optimisation of parameters where the objective function may be costly or a black box. We show how algorithmic trading using GBNs, supported by Bayesian optimisation, can lower risk towards invested capital, while at the same time generating similar or better rewards, compared to the benchmark buy-and-hold investment strategy.
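To make the Bayesian-optimisation component concrete, here is a minimal sketch using a Gaussian process with expected improvement over a single hypothetical trading parameter. `simulate_trading` is an invented stand-in for the costly GBN backtest; none of this is the authors' system.

```python
# Minimal Bayesian optimisation sketch: Gaussian process + expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def simulate_trading(threshold):             # hypothetical black-box objective
    return -(threshold - 0.6) ** 2 + 0.05 * np.sin(20 * threshold)

candidates = np.linspace(0, 1, 200).reshape(-1, 1)
X = [[0.1], [0.9]]                            # initial design points
y = [simulate_trading(x[0]) for x in X]

for _ in range(10):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = max(y)
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
    x_next = candidates[int(np.argmax(ei))]
    X.append(list(x_next))
    y.append(simulate_trading(x_next[0]))

print(X[int(np.argmax(y))], max(y))           # best parameter found so far
```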
A static Bayesian network has been used to support fog forecasting at Melbourne Airport since 2006. We are extending the approach in a Dynamic Bayesian Network (DBN) to support forecasting of fog onset and clearance. In this paper we describe a prototype visualisation tool for understanding and exploring the output of the DBN. This tool supports both the knowledge engineering process and the use of the resultant DBN by forecasters.
12:30 - 14:30 | LUNCH BREAK / POSTER
14:30 - 16:00 | Education
14:30 - 15:00 | An IRT-based Parameterization for Conditional Probability Tables
Russell Almond
paper, presentation
In educational assessment, as in many other areas of application for Bayesian networks, most variables are ordinal. Additionally, conditional probability tables need to express monotonic relationships; e.g., increasing skill should mean an increasing chance of better performance on an assessment task. This paper describes a flexible parameterization for conditional probability tables based on item response theory (IRT) that preserves monotonicity. The parameterization is extensible because it rests on three auxiliary functions: a _mapping_ function which maps discrete parent states to real values, a _combination_ function which combines the parent values into a sequence of real numbers corresponding to the child variable states, and a _link_ function which maps that vector of numbers to conditional probabilities. The paper also describes an EM algorithm for estimating the parameters, and a hybrid implementation using both R and Netica, available for free download.
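The following is a simplified sketch of that three-function decomposition. The specific mapping, compensatory combination, and softmax-style link below are illustrative assumptions, not the paper's exact parameterization or its R/Netica implementation.

```python
# Illustrative sketch of a mapping / combination / link decomposition for one CPT row.
import numpy as np

def mapping(parent_states, n_levels=3):
    # Map ordinal parent states {0,..,n_levels-1} to equally spaced effective thetas.
    return np.array([(2 * s - (n_levels - 1)) / (n_levels - 1) for s in parent_states])

def combination(thetas, weights=None):
    # Compensatory rule: a weighted mean of the parents' effective thetas.
    return np.average(thetas, weights=weights)

def link(theta, difficulties, discrimination=1.0):
    # Softmax-style link: higher theta shifts probability mass toward higher child states.
    logits = discrimination * (theta - np.asarray(difficulties))
    scores = np.concatenate(([0.0], np.cumsum(logits)))
    p = np.exp(scores - scores.max())
    return p / p.sum()

# One CPT row: two skill parents both at their middle level.
row = link(combination(mapping([1, 1])), difficulties=[-0.5, 0.5])
print(row)   # raising a parent's state raises the chance of higher child states
```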
15:00 - 15:30 | Bayesian network models for adaptive testing
Martin Plajner and Jirka Vomlel
paper, presentation
Computerized adaptive testing (CAT) is an interesting and promising approach to testing human abilities. In our research we use Bayesian networks to model the tested individuals. We collected data from paper tests taken by grammar school students. In this article we first provide a summary of the data used for our experiments. We then propose several different Bayesian networks, which we tested and compared by cross-validation. The results are discussed in the paper, and the analysis brings a clearer view of the model selection problem. Future research is outlined in the concluding part of the paper.
15:30 - 16:00 | Computer Adaptive Testing Using the Same-Decision Probability
Suming Chen, Arthur Choi and Adnan Darwiche
paper, presentation
Computer Adaptive Tests dynamically allocate questions to students based on their previous responses. This involves several challenges, such as determining when the test should terminate and which questions should be asked. In this paper, we introduce a Computer Adaptive Test that uses a Bayesian network as the underlying model. Additionally, we show how the notion of the Same-Decision Probability can be used as an information-gathering criterion in this context: to determine whether further questions are needed and, if so, which questions should be asked. We show empirically that the Same-Decision Probability is a viable and intuitive criterion for question selection in Bayesian-network-based Computer Adaptive Tests, as it allows us to ask fewer questions while maintaining the same level of precision and recall when classifying competent students.
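For intuition, here is a toy computation of the Same-Decision Probability by exact enumeration in a two-question naive Bayes model. The numbers and threshold are invented for illustration and are not the paper's model.

```python
# Toy illustration: the probability that observing one more answer leaves a
# pass/fail decision unchanged (the Same-Decision Probability).
import numpy as np

p_c = np.array([0.4, 0.6])                   # P(C): not competent / competent
p_correct = np.array([[0.3, 0.2],            # P(Q=correct | C=0) for Q1, Q2
                      [0.7, 0.8]])           # P(Q=correct | C=1) for Q1, Q2

def posterior(evidence):                     # evidence: {question index: 0/1 answer}
    joint = p_c.copy()
    for q, ans in evidence.items():
        joint = joint * np.where(ans == 1, p_correct[:, q], 1 - p_correct[:, q])
    return joint / joint.sum()

threshold, evidence = 0.5, {0: 1}            # Q1 was answered correctly
decision = posterior(evidence)[1] >= threshold

sdp = 0.0
for ans in (0, 1):                           # enumerate the unasked question Q2
    p_ans = np.sum(posterior(evidence) *
                   np.where(ans == 1, p_correct[:, 1], 1 - p_correct[:, 1]))
    same = (posterior({**evidence, 1: ans})[1] >= threshold) == decision
    sdp += p_ans * same
print(sdp)   # a high SDP suggests asking Q2 is unlikely to change the decision
```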
16:00 - 16:30 | COFFEE BREAK / POSTER
16:30 - 17:30 | Vehicle and Aircraft Applications
16:30 - 17:00 | Influence diagrams for the optimization of a vehicle speed profile
Václav Kratochvíl and Jirka Vomlel
paper, presentation
Influence diagrams are decision-theoretic extensions of Bayesian networks and have been applied to diverse decision problems. In this paper we apply influence diagrams to the optimization of a vehicle speed profile. We present results of computational experiments in which an influence diagram was used to optimize the speed profile of a Formula 1 race car at the Silverstone F1 circuit. The computed lap times and speed profiles correspond well to those achieved by test drivers. An extended version of our model that considers a more complex optimization function and diverse traffic constraints is currently being tested onboard a test car by a major car manufacturer. This paper opens the door to new applications of influence diagrams.
17:00 - 17:30 | Bayesian Predictive Modelling: Application to Aircraft Short-Term Conflict Alert System
Vitaly Schetinin, Livija Jakaite and Wojtek Krzanowski
paper, presentation
Bayesian Model Averaging (BMA), made computationally feasible by Markov Chain Monte Carlo (MCMC) methods, is a well-known methodology for reliable estimation of predictive distributions. The use of decision tree (DT) models for the averaging enables experts not only to estimate the predictive posterior but also to interpret the models of interest and estimate the importance of the predictor factors that are assumed to contribute to the prediction. The MCMC method generates parameters of DT models in order to explore their posterior distribution and draw samples of the models. However, these samples can often over-represent DT models of excessive size, which, in real-world applications, affects the results of BMA. When this happens, the DT model that provides the maximum a posteriori estimate is unlikely to explain the observed data with high accuracy. We propose a new scenario of Bayesian application that allows experts to interpret the explanatory model and improve the accuracy of predictions. In our experiments with aircraft short-term conflict alerts, we show how this scenario can be used for analysing uncertainties in the predictions.
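To illustrate just the averaging step of BMA over tree models (not the paper's MCMC sampler), the sketch below uses equally weighted bootstrap-resampled decision trees as stand-ins for posterior samples of DT models.

```python
# Illustrative sketch of the BMA prediction step: average the predictive
# distributions of an ensemble of decision trees under model weights.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)

models = []
for _ in range(50):
    idx = rng.integers(0, len(X), len(X))          # bootstrap stand-in for an MCMC draw
    models.append(DecisionTreeClassifier(max_depth=4).fit(X[idx], y[idx]))

weights = np.ones(len(models)) / len(models)       # uniform stand-in model weights
probs = sum(w * m.predict_proba(X[:5]) for w, m in zip(weights, models))
print(probs)                                       # averaged predictive distribution
```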
17:30 - 18:00 | Planning for BMAW 2016
17:30 - 18:00 | Final discussion
presentation
18:00 | END OF BMAW ACTIVITIES