On this blog we have already discussed some applications of big data to consumer behavior. But other applications of big data are popping up everywhere, and this one has really caught my interest: a group of University of Wisconsin researchers has developed a machine learning model that can detect tweets relating to bullying, and even identify bullies, victims and witnesses. So the social web is not only a form of modern communication; it is also increasingly becoming a research source for a whole range of subjects, including ‘criminal evidence’.

How does it work?

The researchers have developed a machine learning algorithm that identifies more than 15,000 tweets per day relating to bullying. They developed their model by feeding it two sets of tweets: one they had determined to be about bullying activity and another that was not. Once the model had learned the language identifiers of tweets containing bullying, it started identifying a huge number of tweets from the Twitter firehose, and it also discovered time patterns.

From a big data perspective, what I find most interesting about the story is that the model is also able to say who played what role in the bullying. As the researchers dug into the tweets selected by the computer, they identified a new role: the reporter. From their press release: “We taught it ways to identify bullies, victims, accusers and defenders. These other roles were identified in the early ‘90s in the bullying literature. But the reporter role is new. It’s just like it sounds, a child who witnessed or found out about, but wasn’t participating in, a bullying encounter. That role emerged out of studying the social media roles.”

Big data enabled the researchers to uncover an entirely new insight into the sociology of bullying. It may be one that is specifically tied to bullying on social media, but social media seems to be an important new playing field for bullies.
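To make the training step more concrete, here is a minimal sketch of the idea of learning from two labeled sets of tweets. The press release does not say which algorithm the researchers used, so a small naive Bayes text classifier is swapped in purely for illustration. The `NaiveBayes` class, the `tokenize` helper and the six toy tweets are all my own assumptions, not the team's code or data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier with add-one smoothing.

    Illustrative stand-in only: the researchers' actual model is not
    described in the source.
    """

    def fit(self, texts, labels):
        self.word_counts = {}              # label -> Counter of word frequencies
        self.doc_counts = Counter(labels)  # label -> number of training tweets
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior: share of training tweets carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            # Add-one smoothing keeps unseen words from zeroing the score.
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy tweets standing in for the two hand-labeled sets.
bullying_tweets = [
    "he keeps bullying me at school",
    "they bullied her again today",
    "stop picking on him you bully",
]
other_tweets = [
    "great game last night",
    "i love this new song",
    "what a lovely sunny day",
]

model = NaiveBayes().fit(
    bullying_tweets + other_tweets,
    ["bullying"] * len(bullying_tweets) + ["other"] * len(other_tweets),
)

print(model.predict("she was bullied at school"))
print(model.predict("i love this sunny day"))
```

On this toy data the first tweet comes out as bullying and the second as other. A real system would need far more training data, and a second classification step to assign roles such as bully, victim or reporter.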
What’s next

The team wants to add sentiment analysis to its model so it can try to determine how individuals’ feelings are actually affected by bullying. The team also wants to track bullies and victims over time, which is not possible in traditional social science surveys that typically involve one-off interviews with children.

The question is how we (the school, parents or maybe even the police) can act upon these new insights into bullying. Are we violating privacy by monitoring bullies online? Should we take action if algorithms tell us our children are being bullied? Again from the press release: “The idea is that if someone is powerfully affected by the event, if they are feeling extreme anger or sadness, that’s when they could be a danger to themselves or others. Those are the ones that would need immediate attention.”

How do you feel about this?