In the Baidu project, a virtual teacher gave positive or negative feedback to an agent who was responding to commands issued by the teacher. If the agent rightly connected the command with the intended action, it received a reward, and in the case of failure, it was penalized. The agent slowly learned the correct meaning and usage of words. Later, when presented with an unfamiliar command, it was able to extrapolate the correct meaning and fulfill the desired goal, an example of one shot learning. While this whole process took place in a simple 2D maze world, it could likely be extrapolated to vivid 3D environments, and from there to real world settings.Basically, Baidu first trained an AI agent to understand language through positive and negative reinforcement. Then the AI agent used this knowledge to navigate a 2D maze using natural language commands. That is the novelty here, in this simplistic model, the system can apply past knowledge to perform new tasks. This is a job that comes naturally to humans but remains very hard for machine learning.
Via: ExtremeTech