Facebook wants to build artificial intelligence that learns to understand the world the way humans do: by watching our every move. The tech giant has announced plans to teach AI to ‘understand and interact with the world like we do’, from a first-person perspective.
It hopes to do this by using video and audio from augmented reality (AR) glasses like its new high-tech Ray-Bans. “AI typically learns from photos and videos captured in third-person, but next-generation AI will need to learn from videos that show the world from the center of action,” the company said.
It went on: “AI that understands the world from this point of view could unlock a new era of immersive experiences.” For the Ego4D project, Facebook gathered 2,200 hours of first-person video from 700 people going about their daily lives in order to begin training its AI assistants. It says it wants to teach AI to:
remember things, so we can ask it ‘what happened when’
predict human actions and try to anticipate our needs
understand how hands manipulate objects, in order to learn new skills
keep a video ‘diary’ of everyday life and recall specific moments
learn and understand social interaction
No AI system can perform these tasks right now, but they could play a central role in Facebook’s plans to build the ‘metaverse’, a digital 3D overlay of reality using VR and AR.