AI can detect sarcasm from audio

20th May 2024

A multimodal algorithm for improved sarcasm detection has been developed by the University of Groningen, Netherlands.

Oscar Wilde once said, "Sarcasm is the lowest form of wit but the highest form of intelligence." Perhaps that is due to how difficult it is to use and understand. Sarcasm is notoriously tricky to convey through text – even in person, it can be easily misinterpreted. The subtle changes in tone that convey sarcasm often confuse AI models too, limiting virtual assistants and content analysis tools.

A team from the University of Groningen's Speech Technology Lab has now developed a multimodal algorithm for improved sarcasm detection that examines multiple aspects of audio recordings for increased accuracy. They presented their work at a joint meeting of the Acoustical Society of America and the Canadian Acoustical Association, which concluded on Friday 17th May in Ottawa, Canada.

Previous algorithms for sarcasm detection have tended to rely on a single parameter to produce results. In this new study, however, researchers used two complementary approaches – sentiment analysis of text, combined with emotion recognition from audio – for a more complete picture. The group developed and trained a neural network on "MUStARD", a database of video clips from US sitcoms, which included Friends and The Big Bang Theory.

"We extracted acoustic parameters such as pitch, speaking rate, and energy from speech, then used Automatic Speech Recognition to transcribe the speech into text for sentiment analysis," explains Xiyuan Gao, MA. "Next, we assigned emoticons to each speech segment, reflecting its emotional content. By integrating these multimodal cues into a machine learning algorithm, our approach leverages the combined strengths of auditory and textual information along with emoticons for a comprehensive analysis."
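The feature-gathering step Gao describes can be illustrated with a rough sketch. This is not the Groningen team's code: the function names, the autocorrelation pitch estimate, the words-per-second measure of speaking rate, and the sentiment score and emoticon passed in as ready-made inputs are all assumptions made for illustration. A real system would obtain the transcript from speech recognition and the sentiment and emoticon from trained models.

```python
import math


def rms_energy(samples):
    """Root-mean-square energy of a mono signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Crude pitch estimate: pick the lag with the highest autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    n = len(samples)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def speaking_rate_wps(transcript, duration_s):
    """Words per second, given a transcript (assumed to come from ASR)."""
    return len(transcript.split()) / duration_s


def build_features(samples, sample_rate, transcript, sentiment, emoticon):
    """Bundle acoustic and textual cues for one speech segment.

    `sentiment` and `emoticon` are hypothetical outputs of separate models.
    """
    duration_s = len(samples) / sample_rate
    return {
        "pitch_hz": estimate_pitch_hz(samples, sample_rate),
        "energy_rms": rms_energy(samples),
        "speaking_rate_wps": speaking_rate_wps(transcript, duration_s),
        "text_sentiment": sentiment,
        "emoticon": emoticon,
    }


# Synthetic stand-in for a speech segment: 0.1 s of a 200 Hz tone.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(800)]
features = build_features(tone, sample_rate, "oh great another meeting", -0.2, ":/")
```

A classifier would then be trained on feature bundles like this one, with each segment labelled as sarcastic or not.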

"When you start studying sarcasm, you become hyper-aware of the extent to which we use it as part of our normal mode of communication," said fellow researcher Matt Coler, Associate Professor and Director of Voice Technology at Groningen. "But we have to speak to our devices in a very literal way, as if we're talking to a robot, because we are. It doesn't have to be this way."

The AI detected sarcasm in unlabelled exchanges from the video clips nearly 75% of the time. The team used synthetic data to increase this accuracy further and now hopes to publish its results in a journal. The researchers believe additional improvements may come from adding visual cues, such as eyebrow movements and smirks, to the AI's training data.

"There are a range of expressions and gestures people use to highlight sarcastic elements in speech," explains Gao. "These need to be better integrated into our project. In addition, we would like to include more languages."

"The development of sarcasm recognition technology can benefit other research domains using sentiment analysis and emotion recognition," she adds. "Traditionally, sentiment analysis mainly focuses on text and is developed for applications such as online hate speech detection and customer opinion mining. Emotion recognition based on speech can be applied to AI-assisted health care. Sarcasm recognition technology that applies a multimodal approach is insightful to these research domains."