Catalogue of Artificial Intelligence Techniques
Keywords: constraint satisfaction problem, neuron, probability, probability distribution, stochastic
Categories: Neural Networks
Author(s): Dominik Glodzik
A Boltzmann machine is an extension of a Hopfield network. As the latter, a Boltzmann machine is a set of neurons, which are active or not, and of weighted connections between them. While operating, units are chosen at random and updated. The two essential differences between a Boltzmann machine and a Hopfield network are: stochastic mechanism of firing a neuron and allowing for hidden units.
A Hopfield network evolves deterministically towards a lower energy state of the system. The final stable state will be a minimum, but not necessarily a globally optimal one. This is because moving from one energy minimum to the other would require increasing the energy of the system. A Boltzmann machine allows for this by a changed mechanism of activating a unit. A randomly chosen neuron is activated with probability pi:
T – temperature of the system
– weight of the connection between units i and j
– +1 if the unit j is activated, -1 if not
Firstly, this equation shows that a unit can be activated even if Ei is low, at sufficiently high temperature T. At low temperatures the behavior similar to a Hopfield network. While the network is evolving, T can be varied. Inspiration from real physical systems and the simulated annealing search method suggests that in order to find the globally optimal solution, T has to be initially high and then gradually decreased.
Secondly, the process of firing a neuron is not deterministic, which means that the network will not settle in one stable state. It will evolve forever, generating a probability distribution of activation patterns. Having reached a thermal equilibrium, probability of a state will depend only on its energy, which is similar to behavior of a physical system as described by Boltzmann distribution.
A Boltzman machine allows for hidden neutrons. A limitation of all stochastic networks with connections that are all pairwise and with visible units only is that they cannot capture probability distributions with order higher than two. Using only a subset of neurons in a Boltzmann machine as a pattern enables to reproduce probability distributions of higher order. However, introducing hidden units makes learning of the network much harder – the probability distribution to be learnt by the network only gives information on what the visible units should be. The learning algorithm for the Boltzmann machines is therefore more complex and requires large amounts of time due to numerous nested loops.
Boltzmann machines can serve both as content-accessible memories (character and face recognition) and a way of solving constraint satisfaction problems (decision making). For example, a constraint satisfaction problem can be solved by introducing a unit for a hypothesis about a variable. Conflicting hypotheses are connected by negatively weighted connections, compatible ones by positive edges. A solution can be found by letting the network operate and consulting the probability distribution of states of the system after reaching thermal equilibrium.
- Knight K., Rich E., Artificial Intelligence, Mc-Graw-Hill, New York, 1991, pp.509-510.