AI that mimics human intelligence can address broader enterprise needs
We humans perceive, process, attend to and act upon signals received by multiple senses simultaneously. We are also able to call upon prior knowledge from multiple domains to make appropriate inferences. While AI has made great advances in the vision, language and audio domains, comparatively little work has gone into combining models from different domains into an integrated whole. Can we combine all three signals to model the ground truth better?
At Aganitha, we frequently combine signals from multiple domains to address the business problems our customers bring to us. Hence multi-modal AI, cutting across computer vision (CV), natural language processing (NLP) and speech, is a very important area of focus for us. Here are some of the research questions we have been looking into:
What’s the best way to combine many narrow models in the course of addressing a larger problem? For example, when we first see a book, signals such as the title, author, publisher, cover picture, table of contents, flowing text, images, tables and equations call for our attention and come together to give us a gestalt of what the book is about. While each of these signals is individually amenable to analysis with AI, enterprise applications require an upstream integration of inputs from multiple data formats to deliver a comprehensive and crisp synthesis.
What’s the most comprehensive way for brands to gauge customer sentiment across multiple channels such as social media, email, chat and call centre conversations? Though the sentiment analysis problem is well studied, most published research focuses on text, and more recently a good deal of vision research has addressed the classification of facial expressions. However, emotions expressed in tone of voice are also useful inputs, and research in this area is sparse. Aganitha has developed platforms that address such gaps to enable the application of AI on integrated inputs from multiple data formats, an aspect critical for enterprises to adopt AI.
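To make the idea of combining channels concrete, here is a minimal Python sketch of late fusion, one common way to merge per-modality sentiment scores. It is illustrative only and does not describe Aganitha's platforms: the function name fuse_sentiment, the modality names, the score range of -1 to 1 and the weights are all assumptions made for this example.

```python
from typing import Dict, Optional


def fuse_sentiment(
    scores: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-modality sentiment scores.

    Assumes each score lies in [-1, 1] and was produced by a separate,
    hypothetical per-modality model. A score of None means that modality
    was unavailable for this interaction, so it is skipped.
    """
    total = 0.0
    weight_sum = 0.0
    for modality, score in scores.items():
        if score is None:
            continue  # modality missing, e.g. no audio for an email
        weight = weights.get(modality, 0.0)
        total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no usable modality scores")
    # Renormalise by the weights actually used so missing modalities
    # do not drag the fused score towards zero.
    return total / weight_sum


# Hypothetical call-centre interaction: transcript and tone, no video.
fused = fuse_sentiment(
    scores={"text": 0.4, "tone": -0.6, "face": None},
    weights={"text": 0.5, "tone": 0.3, "face": 0.2},
)
print(f"fused sentiment: {fused:.3f}")
```

A weighted average is the simplest possible choice. In practice the weights could be learned, or the fusion could happen earlier on shared feature representations, but those options are beyond the scope of this sketch.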
We expect multi-modal AI to rapidly evolve and tackle these questions.