Artificial Intelligence and Egocentric Vision

by Dr Taberez A Neyazi



Image credit: Shagufta Neyazi




Egocentric understanding, or understanding the world from a first-person perspective instead of the usual third-person one, is one of the latest developments in the field of Artificial Intelligence (AI). By wearing AR (augmented reality) smart glasses, participants can record, in real time, their routine day-to-day activities and how they interact with their environment, including people and objects. The data collected in the form of videos and images could then be used to train AI to better comprehend the world around it and create a more personalized experience for individuals. The ultimate aim is to develop smarter AI, trained on data from individuals’ unscripted real-time experiences, that can interact in the metaverse. By integrating the virtual world with the real world, the metaverse aims to create an immersive experience that can transform lived experience, extending it beyond what we see to what could be visualized.

In addition, AR smart glasses powered by AI assistants could help accomplish specific tasks, such as cooking without referring to cookbooks, locating lost items and providing technical support, and could personalize social interactions, for example by acting as a guide in deciding the itinerary for one’s next family trip. While Facebook has taken the lead through its Ego4D project (Grauman et al., 2021) in creating AI systems trained on first-person perspective data, other tech companies are certain to follow suit. Facebook has also adopted several measures to protect the privacy of people recorded through AR glasses. These measures include blurring personally identifiable information in the videos, obtaining informed consent and de-identifying data where applicable. While these are standard ethical practices in human behavioural research, more informed discussion is needed to identify the larger societal benefits of AI systems trained on first-person perspective data.

Services and products developed from first-person perspective data could be used for the benefit of society and eventually monetized by tech companies. Accomplishing routine tasks with the help of AI assistants is less problematic. However, the idea that such egocentric data could be used to deliver personalized services at the individual level needs to be discussed more seriously, because personalization not only brings us convenience but also creates an insular world devoid of vision. The choices offered to individuals through personalization are based on their past habits and could be limiting, a concern that has been the subject of ongoing debate in the field of AI. Hence, bringing diversity to the choices offered to individuals and allowing them to discover products, services and ideas should be prioritized while developing egocentric perception, though this is not going to be an easy task.


NAF programme title:
 A Close Eyecounter


Grauman, K., Westbury, A., Byrne, E., Chavis, Z., Furnari, A., Girdhar, R., ... & Malik, J. (2021). Ego4D: Around the world in 3,000 hours of egocentric video. arXiv preprint arXiv:2110.07058.