The problem with this approach is that most real-world situations, and even some games, do not have a simple set of rules governing how they work. So some researchers have taken a different approach: build a model of how the environment of a game or scenario affects the outcome, then use that knowledge to make a plan. The drawback is that some domains are so complex that modeling every aspect is almost impossible. This has proven to be the case with most Atari games, for example.
In a way, MuZero combines the best of both worlds. Instead of modeling everything, it tries to consider only those factors that matter for making a decision. As DeepMind points out, humans do the same thing. When most people look out the window and see dark clouds forming on the horizon, they generally don’t think about things like condensation and pressure fronts. Instead, they think about how they should dress to stay dry if they go outside. MuZero does something similar.
MuZero takes three factors into account when making a decision: the outcome of its previous decision, its current position and the best course of action to take next. This seemingly simple approach makes MuZero the most efficient algorithm DeepMind has developed to date. In its testing, the company found that MuZero was as good as AlphaZero at chess, Go and shogi, and better than all previous algorithms, including Agent57, at Atari games. It also found that the more time MuZero was given to consider an action, the better it performed. DeepMind also ran tests that limited the number of simulations MuZero could perform before committing to a move in Ms. Pac-Man. Even in those tests, MuZero was still able to achieve good results.
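For readers who want to see how those three factors could fit together, here is a minimal Python sketch. It is not DeepMind's code. The functions `dynamics` and `predict`, the two-action toy problem and the `GOAL` state are all hypothetical stand-ins, written by hand rather than learned. The planner is a plain depth-limited search, not the Monte Carlo tree search that MuZero actually uses. What it does show is a planner that never consults the rules of the game, only a reward (how good the last action was), a value (how good the current position is) and a policy (which action looks best).

```python
ACTIONS = (0, 1)  # hypothetical toy problem: 0 = step left, 1 = step right
GOAL = 3          # hypothetical state that pays a reward when reached


def dynamics(state, action):
    """Hand-written stand-in for a learned model: returns (next_state, reward)."""
    next_state = state + (1 if action == 1 else -1)
    reward = 1.0 if next_state == GOAL else 0.0  # how good was the last action
    return next_state, reward


def predict(state):
    """Hand-written stand-in for a learned model: returns (value, policy)."""
    value = -0.1 * abs(GOAL - state)  # how good is the current position
    # Which action looks best: prior probability for each action.
    policy = {0: 0.3, 1: 0.7} if state < GOAL else {0: 0.7, 1: 0.3}
    return value, policy


def plan(state, depth, discount=0.9):
    """Look `depth` steps ahead using only reward, value and policy.

    Returns (best_action, predicted_return).
    """
    _, policy = predict(state)
    best_action, best_return = None, float("-inf")
    # Try actions in order of the policy's preference, so ties go to its favourite.
    for action in sorted(ACTIONS, key=lambda a: -policy[a]):
        next_state, reward = dynamics(state, action)
        if depth <= 1:
            future, _ = predict(next_state)  # out of budget: trust the value estimate
        else:
            _, future = plan(next_state, depth - 1, discount)
        total = reward + discount * future
        if total > best_return:
            best_action, best_return = action, total
    return best_action, best_return


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, plan(0, depth))
```

Raising `depth` is this toy's version of giving the planner more time to think. With a small budget it has to lean on the value estimate, and with a larger one it can see the reward for itself.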
Racking up high scores in Atari games is all well and good, but what about the practical applications of DeepMind’s latest research? In a word, they could be revolutionary. Although we are not there yet, MuZero is the closest researchers have come to developing a general-purpose algorithm. The Alphabet subsidiary says MuZero’s learning capabilities could one day help it tackle complex problems in fields such as robotics, where there are no straightforward rules.