Of course we are looking for ways to apply MuZero to real-world problems, and there are some encouraging initial results. To give a concrete example, internet traffic is dominated by video, and a big open problem is how to compress these videos as efficiently as possible. You can think of this as a reinforcement learning problem, because there are these very complicated programs that compress the video, but what will be seen next is unknown. When you plug something like MuZero into it, our initial results look very promising in terms of saving significant amounts of data, maybe something like 5 percent of the bits that are used to compress a video.
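To make the framing concrete, here is a minimal, entirely hypothetical sketch of video compression posed as a reinforcement learning problem: the agent picks a quantization parameter (QP) for each frame and is rewarded for spending few bits while keeping quality above a target, without seeing future frames. The cost and quality models, numbers, and policies below are invented for illustration; they are not how MuZero or any real codec works.

```python
import random

def encode(frame_complexity, qp):
    """Toy stand-in for a real encoder: higher QP -> fewer bits, lower quality."""
    bits = frame_complexity * (52 - qp)           # invented bit-cost model
    quality = 100 - qp - 0.1 * frame_complexity   # invented quality model
    return bits, quality

def episode(policy, seed=0):
    """Run one video through the toy encoder under a given QP policy."""
    rng = random.Random(seed)
    total_bits = total_reward = 0.0
    for _ in range(100):                      # 100 frames of unknown content
        complexity = rng.uniform(1.0, 10.0)   # revealed only one frame at a time
        qp = policy(complexity)
        bits, quality = encode(complexity, qp)
        # Reward: save bits, but pay a large penalty if quality drops below 60.
        reward = -bits - (1000.0 if quality < 60 else 0.0)
        total_bits += bits
        total_reward += reward
    return total_bits, total_reward

# A fixed-QP baseline versus a policy that raises QP on easy frames.
baseline_bits, baseline_reward = episode(lambda c: 30)
adaptive_bits, adaptive_reward = episode(lambda c: 38 if c < 5 else 30)
print(adaptive_bits < baseline_bits)  # the adaptive policy spends fewer bits
```

In a real system the learned agent, rather than a hand-written rule, would discover such rate-control decisions by maximizing the reward.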
In the longer term, where do you think reinforcement learning will have the greatest impact?
I am thinking of a system that can help you as a user achieve your goals as effectively as possible: a really powerful system that sees all the things you see, that has the same senses you have, and that is able to help you achieve your goals in your life. I think that is a really important one. Another long-term transformation would be something that provides personalized health care. There are issues of confidentiality and ethics that need to be addressed, but it would have enormous transformative value; it would change the face of medicine and the quality of people's lives.
Is there anything you think machines will learn to do in your lifetime?
I do not want to put a time limit on it, but I would say that everything a human can achieve, ultimately I think a machine can too. The brain is a computational process; I don't think any magic happens there.
Can we get to the point where we can understand and implement algorithms as efficient and powerful as the human brain? Well, I don't know what the time frame is. But I think the journey is interesting, and we should aim to achieve it. The first step on that journey is to try to understand what it really means to solve intelligence: what problem are we actually trying to solve when we solve intelligence?
Beyond practical uses, are you confident you can move from mastering games like chess and Atari to real intelligence? What makes you believe that reinforcement learning will lead to machines with common-sense understanding?
There is a hypothesis, which we call the reward-is-enough hypothesis, that says the essential process of intelligence could be as simple as a system seeking to maximize its reward, and that the process of trying to achieve a goal by trying to maximize reward is enough to give rise to all the attributes of intelligence that we see in natural intelligence. It is a hypothesis; we do not know whether it is true, but it gives a direction for research.
If we take common sense specifically, the reward hypothesis says: well, if common sense is useful to a system, that means it should help the system achieve its goals better.
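The reward-is-enough idea can be illustrated with a toy example (my own, not from the interview): a Q-learning agent on a tiny corridor whose only objective is to maximize reward, yet which ends up acquiring the useful "skill" of walking toward the goal as a by-product. All names and numbers here are illustrative.

```python
import random

def train(length=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor; reward 1.0 only at the rightmost cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(length)]  # Q[state][action]; 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s < length - 1:
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)                 # explore (or break ties randomly)
            else:
                a = 0 if q[s][0] > q[s][1] else 1    # otherwise act greedily
            s2 = max(0, s - 1) if a == 0 else s + 1  # wall on the left, goal on the right
            r = 1.0 if s2 == length - 1 else 0.0     # reward only at the goal
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
# The greedy action in every interior state after training:
policy = [0 if q[s][0] > q[s][1] else 1 for s in range(4)]
print(policy)
```

Nothing in the objective says "move right"; that behavior emerges purely from reward maximization, which is the hypothesis in miniature.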
You seem to think that your area of expertise, reinforcement learning, is in some sense fundamental to understanding or "solving" intelligence. Is that correct?
I really see it as essential. I think the big question is: is it true? It certainly flies in the face of how many people see AI, namely that intelligence involves an incredibly complex collection of mechanisms, each solving its own kind of problem or working in its own particular way, or that for something like common sense there may be no clear definition of the problem at all. This hypothesis says no: there can in fact be a very clear and simple way to think about all of intelligence, which is that it is a goal-optimizing system, and that if we find a really good way to optimize goals, then all these other things will emerge from that process.