Reinforcement learning is a method in the field of machine learning. Alongside supervised learning and unsupervised learning, it is the third main approach to training algorithms so that they can make decisions on their own. The focus is on developing intelligent solutions for complex control problems.
In contrast to supervised and unsupervised learning, however, reinforcement learning does not require a prepared dataset for training. With the first two methods, programs are fed data up front. Reinforcement learning omits this step. Instead, the data is generated through trial and error during training and is labeled at the same time by the feedback the system receives. To reach a sufficiently accurate result, the program goes through a large number of test runs in a simulation environment. So, instead of being shown the correct results during training (as is the case with supervised learning), the system is guided only by stimuli (i.e. rewards and penalties).
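The trial-and-error process described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common reinforcement learning algorithm chosen here purely for illustration, on a made-up "corridor" environment in which the agent starts at the left end and is rewarded for reaching the right end. The environment, the function names, the reward values, and all parameter settings are assumptions for this example and do not come from any particular library.

```python
import random


def train_corridor_agent(n_states=5, episodes=500, alpha=0.5,
                         gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values for a hypothetical 1-D corridor by trial and error.

    States are positions 0 .. n_states - 1; the last position is the goal.
    Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action] starts at zero: the agent has no prior knowledge.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Trial and error: sometimes explore, otherwise act greedily
            # (ties between equally good actions are broken at random).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Simulated environment: move, then hand back a reward or penalty.
            next_state = max(0, state - 1) if action == 0 else min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else -0.01  # small penalty for every extra step

            # Q-learning update: the reward serves as the training signal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    learned = train_corridor_agent()
    print(greedy_policy(learned))
```

No dataset is passed in: every (state, action, reward) sample is produced by the agent's own interaction with the simulated corridor. The entry for the goal state is never updated, because an episode ends as soon as the goal is reached.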
The desired result of this training is an artificial intelligence that can solve very complex control problems on its own, without prior knowledge provided by humans. Compared to conventional engineering, this can be faster and more efficient and, in the ideal case, yields the best possible result.
Research into reinforcement learning is often conducted through games. Video games are a well-suited basis for researching and understanding reinforcement learning because they generally provide a predefined, interactive simulation environment and a range of possible control actions. In addition, most games present complex problems or tasks to be completed over several rounds of play. Most games also include a point system, which resembles the reward system used in reinforcement learning.
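As a rough illustration of how a game's point system can stand in for a reward signal, the following sketch converts a running score into per-step rewards. The function name and the sample scores are hypothetical and serve only as an example; real games and research setups may define rewards differently.

```python
def rewards_from_scores(scores):
    """Turn a sequence of cumulative game scores into per-step rewards.

    Each reward is the change in score caused by one step of play, so
    gaining points yields a positive reward and losing points a penalty.
    """
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


if __name__ == "__main__":
    # Hypothetical score readout after each of four moves, starting at 0.
    print(rewards_from_scores([0, 0, 10, 10, 5]))
```

Under this assumption, a move that leaves the score unchanged yields a reward of zero, while a move that costs points shows up as a negative reward, i.e. a penalty.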
Leading experts in the area of artificial intelligence consider reinforcement learning to be a very promising method for achieving artificial general intelligence. This would make it possible for a machine to make rational decisions, much like a person, and to carry out any number of tasks successfully. The machine observes and learns and, in this way, is able to solve problems independently.