The ability to predict what other people will do next from their body language comes naturally to humans, but not to machines. That may be about to change: researchers at Columbia University's School of Engineering and Applied Science have unveiled a technique that gives machines a more intuitive sense of what will happen next by leveraging higher-level associations between people, animals, and objects. "This algorithm is a big step toward machines being able to make accurate predictions about human behavior, and hence better coordinate their actions with those of humans," said Carl Vondrick, assistant professor of computer science at Columbia, who led the study, which was presented at the Conference on Computer Vision and Pattern Recognition.
The researchers describe it as the most accurate method to date for predicting video action events up to several minutes into the future. After analyzing thousands of hours of movies, sports games, and shows like "The Office," the system learns to predict hundreds of activities, from shaking hands to hugging. When it cannot predict the specific action, it falls back on a higher-level concept that links the candidates together, such as "greeting." Previous attempts at predictive machine learning, including earlier work by this team, have focused on predicting only one action at a time: the algorithm classifies an action as, say, a hug, a high five, or a handshake, or as a non-action such as ignoring.
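The fallback idea can be sketched in a few lines. This is an illustrative toy, not the authors' implementation (their model works on learned video representations): here a hypothetical taxonomy maps fine-grained actions to a parent concept, and when no single action is confident enough, the probability mass of sibling actions is pooled and the shared concept is reported instead.

```python
# Hypothetical action taxonomy: fine-grained actions -> higher-level concept.
HIERARCHY = {
    "handshake": "greeting",
    "hug": "greeting",
    "high five": "greeting",
}

def predict(action_probs, threshold=0.5):
    """Return the most likely action, or its parent concept if uncertain."""
    action, prob = max(action_probs.items(), key=lambda kv: kv[1])
    if prob >= threshold:
        return action
    # No single action is confident enough: pool probability over actions
    # that share a parent concept and report the concept instead.
    concept_probs = {}
    for a, p in action_probs.items():
        parent = HIERARCHY.get(a, a)  # actions without a parent map to themselves
        concept_probs[parent] = concept_probs.get(parent, 0.0) + p
    return max(concept_probs.items(), key=lambda kv: kv[1])[0]

# The model is torn between several greeting-like actions, none dominant,
# so it backs off to the concept they share.
print(predict({"handshake": 0.35, "hug": 0.30, "high five": 0.25, "cook": 0.10}))
```

Here "greeting" wins with pooled probability 0.90, whereas a flat classifier forced to pick one label would have answered "handshake" with only 35% confidence.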