Sarcasm detection enters the information warfare battle
With social media platforms serving as major vectors for adversarial propaganda, researchers are examining how information travels through the digital environment and how it translates into action offline. A recent breakthrough in sentiment analysis, an algorithm capable of detecting sarcasm, developed at the University of Central Florida as part of the Defense Advanced Research Projects Agency's SocialSim project, advances both that understanding and the defense against such campaigns.
Researchers are using neural networks, machine learning, and social media datasets to conduct fundamental research on possible solutions to counter information warfare. The proliferation of information warfare against the United States has created a pressing need for counter-capabilities to protect against adversarial interference.
The sarcasm detector is an important first step, says Brian Kettler, a program manager in the Information Innovation Office (I2O) of the Defense Advanced Research Projects Agency (DARPA), who currently leads the Computational Simulation of Online Social Behavior program, the official name of the SocialSim effort.
“Sarcasm can really trip up sentiment analysis,” says the program manager. “If you’re trying to figure out how people engage with a particular story, whether they’re outraged by it or whether they accept it, sarcasm might make the reaction show up one way when it’s really something different. Understanding sarcasm, I think, is a basic capability, so that you don’t get fooled by it when doing things like sentiment analysis on a particular text.”
Two years ago, Kettler took over leadership of the project, which began in 2017, and immediately pivoted SocialSim to tackle information warfare, given the urgent need.
“It started mainly by looking at cybersecurity information, information about vulnerabilities and how it spreads on social media,” he explains. “[Instead], I wanted to focus more on the information warfare we find ourselves in: the influence of information, the information our adversaries can push, and counter-influence operations. I wanted to watch how various narratives spread online and understand how misinformation could evolve.”
DARPA, the University of Central Florida (UCF), three other university partners, and a commercial social media data company are developing solutions to model how information travels online. “This is how we model an online population with publicly available data,” notes Kettler. “We’re not really interested in modeling specific users; we’re looking at the aggregate: how we can model how much engagement a piece of information is going to get online, how many people are likely to retweet something, that sort of thing.”
The researchers looked at posts on Twitter, comments on YouTube, and VK, essentially a Russian Facebook-like platform. At first, the researchers applied different algorithmic models to different social media platforms in order to generalize the models, says Kettler.
“And now we’re looking at how we can take these models and bring in other signals, besides just looking at what happens online. We don’t try to predict online behavior by looking only online, because we all know that online [behavior] has a big impact offline and vice versa. We are trying to make these models much more robust. It’s about understanding human behavior online well enough to create simulations that will tell how various types of information will evolve over time, which could have many useful applications.”
As part of this effort, DARPA issues a new challenge every six months for research teams to develop and test different capabilities on new datasets, explains Ramya Akula, a graduate research assistant at the Complex Adaptive Systems Laboratory in UCF’s Computer Science Department. She is part of the SocialSim team led by Ivan Garibay, associate professor of industrial engineering and management systems at UCF and director of the laboratory.
The latest challenge question has teams dig deeper into the information operations taking place on social media, Akula explains. “It is essentially a question of getting at the propagation of information and the type of information exchanged, or the sensitivity of the information,” she notes.
Kettler points out that the UCF research team has achieved “some very interesting things” in its efforts. First, the team structured its deep neural network in a unique way, as a multi-headed neural network.
UCF’s algorithm begins by converting text from social media platforms into numbers the model can process, she says. “This is the first phase,” Akula says. “Next, the heart of our architecture is essentially the self-attention method. We take different projections of the same input, and then we try to learn all the different combinations of those inputs that come together. The reason we do this is to understand the relationship between two words in a sentence. Especially when you have a really long sentence, understanding the context or the semantics behind it is easy for humans, but it’s hard for a machine to figure out. That is why we must examine every possible combination.”
She cites the example of the word apple. If the text says someone “takes an apple to work,” it could mean an Apple gadget or the fruit, suggests Akula. “Then these layers of self-attention are run through different heads and different layers, so it’s like a bunch of layers and a bunch of heads all trying to figure out all the likely meanings of ‘apple.’”
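The pipeline Akula describes, mapping words to numbers and then running several self-attention heads over different projections of the same input, can be sketched in miniature. The code below is an illustrative sketch only: the vocabulary, dimensions, and random stand-in weights are assumptions for demonstration, not UCF's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads):
    """Scaled dot-product self-attention over token embeddings X
    (seq_len x d_model), split across several heads. The projection
    matrices are random placeholders standing in for learned weights."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head gets its own projection of the same input
        Wq = rng.standard_normal((d_model, d_head)) * 0.1
        Wk = rng.standard_normal((d_model, d_head)) * 0.1
        Wv = rng.standard_normal((d_model, d_head)) * 0.1
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Scores say how strongly each word attends to every other word
        scores = softmax(Q @ K.T / np.sqrt(d_head))
        head_outputs.append(scores @ V)
    # Concatenating the heads restores the model dimension
    return np.concatenate(head_outputs, axis=-1)

# "First phase": map words to integer ids, then to embedding vectors
vocab = {"i": 0, "love": 1, "waiting": 2, "in": 3, "line": 4}
tokens = "i love waiting in line".split()
ids = [vocab[t] for t in tokens]
embedding_table = rng.standard_normal((len(vocab), 8))
X = embedding_table[ids]          # 5 tokens x 8-dim embeddings

out = multi_head_self_attention(X, num_heads=4)
print(out.shape)                  # (5, 8): one contextual vector per token
```

Each token comes out as a new vector that mixes in information from every other token, which is how a model can tell the company Apple from the fruit in context.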
The system then calculates an attention score for each word and sends the data through a Gated Recurrent Unit (GRU), a special type of neural network UCF uses to learn the relationship between two words, Akula says.
“When a word has a higher attention score, it means it carries a certain emphasis,” says the researcher. “Using that attention score, the patterns are learned; this computed information is sent to the GRU networks, which are then connected to a fully connected layer, a basic neural network, and that calculates the probability score.”
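The back half of the pipeline Akula outlines, a GRU over the sequence followed by a fully connected layer that emits a probability score, can also be sketched. All weights below are random placeholders, so the resulting score is meaningless; only the structure follows her description.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(h, x, W, U, b):
    """One Gated Recurrent Unit step: update gate z and reset gate r
    decide how much of the old state to keep versus the new candidate."""
    z = sigmoid(x @ W["z"] + h @ U["z"] + b["z"])
    r = sigmoid(x @ W["r"] + h @ U["r"] + b["r"])
    h_cand = np.tanh(x @ W["h"] + (r * h) @ U["h"] + b["h"])
    return (1 - z) * h + z * h_cand

d_in, d_hid = 8, 16  # toy dimensions, an assumption for illustration
W = {k: rng.standard_normal((d_in, d_hid)) * 0.1 for k in "zrh"}
U = {k: rng.standard_normal((d_hid, d_hid)) * 0.1 for k in "zrh"}
b = {k: np.zeros(d_hid) for k in "zrh"}

# Run the GRU over a toy 5-token sequence of 8-dim vectors
seq = rng.standard_normal((5, d_in))
h = np.zeros(d_hid)
for x in seq:
    h = gru_step(h, x, W, U, b)

# Fully connected layer: final hidden state -> one probability score
w_out = rng.standard_normal(d_hid) * 0.1
p_sarcasm = sigmoid(h @ w_out)    # a value strictly between 0 and 1
print(float(p_sarcasm))
```

In a trained model, the learned weights would push this score toward 1 for sarcastic sentences and toward 0 otherwise.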
The probability score indicates the likelihood that a sentence contains sarcasm. Having first succeeded in detecting sarcasm in words, the algorithm then had to be applied to symbols and emojis. “In our case, we had very good results detecting most sarcasm and non-sarcasm when we have words, but in situations where there are question marks, the algorithm was a little confused about what they mean,” Akula admits. “Otherwise, when there is pure text, or when there are hash symbols or emojis, that sort of thing, it was able to detect very well.”
In fact, UCF’s artificial intelligence tool was found to be accurate 99 percent of the time, Kettler says.
“The accuracy was much better than many other attempts to create classifiers that detect sarcasm; on a single sentence, it is probably comparable to most humans, or maybe even better,” notes the DARPA program manager.
The next steps, according to Kettler, are to increase the accuracy of the models when they are applied to other signal sources. “To validate a model, we have to say, given a particular narrative, how the model says information is going to change over time, and then we compare that to how it actually changes over time,” he says. “So we have a very well-defined ground truth in the actual dissemination of information. Beyond these models, it’s about helping to better understand how people interact with information.”
Kettler expects SocialSim’s research efforts to be completed by the end of the calendar year. After that, the program manager will seek to pave the way for other information warfare solutions for the nation.