Black Lives Matter, but do bots know that?

The volume of content generated each day necessitates automated moderation to curate everything as it is published, ensuring offensive and objectionable material is blocked. But this only works if systems are adequately configured and reviewed.

Social media platforms deal with an almost unimaginable volume of content every day. In the UK alone, there are 45 million active social media users, and the number is far higher once other countries are included.

The problem with this amount of user-generated content, be it through social media, blogging, or other such platforms, is that it very soon becomes an almost Herculean task for each post and comment to be properly moderated by humans.

One solution to this challenge has been the use of automated content moderation systems. As their name suggests, these automate the process of moderating user-generated content, usually through machine learning (ML) systems that are taught what the platform considers to be inappropriate content.

However, automated content moderation systems are only as good as the machine learning-based artificial intelligence (AI) behind them, which in turn depends on the quality of the data used to train it and on the engineers who deploy the systems.

On 8 June 2020, the live-action events company Profound Decisions published a social media post on its Empire LRP page in support of Black Lives Matter. Rather than writing the post itself, the company contacted a black, Asian and minority ethnic (BAME) group and asked whether it would like to write something on the company’s behalf to further the discussion on race and racism.

All seemed well at first. Matt Pennington, the general manager of Profound Decisions, was pleased with the post, as he felt it was well-written and constructive, highlighting what needed to be done. The response was supportive, with people liking and sharing the post.

Blacklisted by Facebook

Then things started to go wrong. “A day went by and everything was fine,” recalls Pennington. “Then at 11 o’clock at night, our entire presence on Facebook just disappeared overnight.”

The first indication that anything was amiss came when Profound Decisions attempted to publish a new post to Facebook and it was instantly blocked with the message: “Your link could not be shared, because this link goes against our Community Standards. If you think this doesn’t go against our Community Standards let us know.” This was despite the post not containing content of any kind that could conceivably be offensive or inappropriate.

Profound Decisions started investigating and soon realised that its statement on Black Lives Matter had come down. Investigating further, it discovered that everything connected to the company had been excised from Facebook, including private messages in users’ inboxes. It also discovered that the same had happened on its Instagram account.

Eventually, Profound Decisions identified that it was the company website, along with anything linked to it, that had been blacklisted by Facebook. Unfortunately, Profound Decisions could not find any way to raise the issue with Facebook. “I spent approximately 12 hours scouring Google, Facebook and help files to find any mechanism to contact Facebook,” says Pennington.

The only opportunity that Profound Decisions had to raise the issue with Facebook was when it was informed that a post had been blocked and it had an opportunity to disagree with the decision. “This was in a very sort of automated way and it would come up with, ‘Thank you for that, we don’t review individual decisions, but this will affect our future algorithms’,” says Pennington. “That was the only form of direct communication.”

Profound Decisions attempted to establish a business account with Facebook, as this contains a mechanism for contacting the company. However, the process, in Pennington’s words, “silently failed”; it concluded with Facebook saying it would be in touch, but this did not happen.

After 30 hours, everything simply reappeared and all the blocks were lifted, without explanation. Ultimately, the block did little harm to Profound Decisions because it happened during lockdown, but had it continued, or occurred when the company was at its busiest, the consequences could have been far worse.

What was frustrating for Profound Decisions was that its post had generated a positive response, with people engaging with the issue, yet the focus then shifted to rumours that Profound Decisions had been blocked. “The debate moved on from what was important, Black Lives Matter, to the one thing we didn’t want to talk about, which was the problems of a group of white guys,” says Pennington.

In response to a request for information about this, Facebook explained that Profound Decisions’ Empire LRP website was temporarily blocked from Facebook after a false positive from its automated spam detection systems. This block was lifted on 11 June, when Facebook became aware of the error. It claimed that this was not related to the Black Lives Matter post.

While, in this case, the block did little harm to Profound Decisions, it highlights the very real danger that automated content moderation systems can pose. Significant monetary and reputational damage can be caused to any company with a major social media presence, or that operates exclusively online.

Why automation needs oversight

Automated content moderation systems are tailored for the platform. Such systems require careful configuration and consistent oversight to ensure they continually provide ethically appropriate moderation.

Every time a comment or message is posted on a platform, it should be adjudicated in the same way, independently of who wrote it. This must be regardless of the user’s sexual orientation, gender, skin colour, and so on. A properly configured automated content moderation system will not have been taught these biases.

From the outset, any organisation that hosts user-generated content online needs to define the behaviour and language that it deems appropriate for its platform. This baseline of acceptability must also be clearly published, so users understand the behaviour that is expected of them.

“We have defined in our contracts such that each party has to follow the United Nations Universal Declaration of Human Rights,” says Mari-Sanna Paukkeri, CEO and co-founder of text analytics and machine learning company Utopia Analytics.

Rather than relying on dictionaries to define content moderation systems (a process that runs the risk of outdated terms being used by the system), machine learning systems should be overseen by humans. These moderators determine what is considered acceptable on the platform. Moderation needs to be performed thoroughly and consistently, otherwise deviations in acceptable behaviour and content can occur.

Also, rather than the adjudication being based on a single word, the comment as a whole needs to be considered, taking into account the earlier conversation. 

“Context is extremely important,” says Paukkeri. “95% of all the systems use word-based processing, which have some kind of dictionaries in the background. When the word is put in a different discussion, the meaning can be totally different. You cannot evaluate one word independently.”
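The limitation Paukkeri describes can be seen in a minimal sketch of a word-based filter. The blocklist, the function name and the sample comments below are all hypothetical, chosen only for illustration, and are not taken from any real moderation system.

```python
import re

# Hypothetical single-word blocklist, for illustration only.
BLOCKLIST = {"trash"}


def word_based_flag(comment: str) -> bool:
    """Flag a comment if any one of its words is on the blocklist.

    Each word is judged in isolation; the rest of the sentence and the
    earlier conversation are ignored.
    """
    words = re.findall(r"[a-z']+", comment.lower())
    return any(word in BLOCKLIST for word in words)


comments = [
    "Remember to put the trash out on Tuesday.",  # benign household reminder
    "You are trash.",                             # insult aimed at a person
    "Lovely photo, thanks for sharing.",          # no listed word at all
]

# One flag per comment, in the same order as the list above.
flags = [word_based_flag(comment) for comment in comments]
```

The filter treats the household reminder and the insult identically, because the only evidence it looks at is the presence of a listed word. Telling the two apart requires the surrounding sentence, which is the context a word-based approach discards.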

Just as language naturally evolves, with words and phrases taking on new meanings and the creation of new words and abbreviations, content moderation systems need to be reviewed to ensure they are providing ethically appropriate content moderation. This can be performed through analysis by machine learning specialists, with human moderators reviewing complaints against the system’s decisions.

No matter how well-trained a system is, errors can and will occur. These take the form of either false negatives, where objectionable material is allowed through, or false positives, where benign material is unintentionally blocked. Users reporting these issues are an effective way for these deviations to be highlighted.

That said, not all errors will be reported, with most people tending not to report minor issues, such as when a post was mistakenly identified as offensive but this was not important to the user.
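The two kinds of error can be counted separately once a human has reviewed a sample of the system’s decisions. The sketch below assumes a hypothetical `Decision` record that pairs what the system did with what a reviewer later judged; the sample data is invented purely to show the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    blocked: bool        # what the automated system did
    objectionable: bool  # what a human reviewer later judged


def error_counts(decisions):
    """Count both kinds of moderation error in a reviewed sample."""
    # False positive: benign material that was blocked.
    false_positives = sum(
        1 for d in decisions if d.blocked and not d.objectionable
    )
    # False negative: objectionable material that was allowed through.
    false_negatives = sum(
        1 for d in decisions if not d.blocked and d.objectionable
    )
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


# Invented sample of five reviewed decisions.
sample = [
    Decision(blocked=True, objectionable=True),    # correct block
    Decision(blocked=True, objectionable=False),   # false positive
    Decision(blocked=False, objectionable=False),  # correct allow
    Decision(blocked=False, objectionable=True),   # false negative
    Decision(blocked=False, objectionable=False),  # correct allow
]

counts = error_counts(sample)
```

Tracking the two counts separately matters because they harm different people: a blocked benign post hurts the author, as in the Profound Decisions case, while objectionable material that gets through hurts the readers.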

Even with the best content moderation system, it can be easy for the human moderators/overseers to become overwhelmed. One solution is for the system to assign a confidence level to each response.

“If the confidence of the AI is very high for those comments, which are being reported by other users, the system could just not react on that,” explains Paukkeri. “If the confidence level is low, then it would go to humans to check.”
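A minimal sketch of that routing rule might look as follows. The function name, the return labels and the 0.9 threshold are assumptions made for illustration; a real platform would tune the threshold against its own data and reviewer capacity.

```python
def route_report(confidence: float, threshold: float = 0.9) -> str:
    """Decide what happens when users report an automated decision.

    confidence: the model's confidence in its original decision, 0.0 to 1.0.
    threshold:  assumed cut-off above which the decision is left to stand.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= threshold:
        # High confidence: the system does not react to the report.
        return "keep_decision"
    # Low confidence: pass the comment to a human moderator.
    return "human_review"


# Hypothetical reported comments with the model's confidence in each decision.
reported = {"comment-1": 0.97, "comment-2": 0.55}
routes = {name: route_report(score) for name, score in reported.items()}
```

Raising the threshold sends more reports to humans and lowers the risk of an unreviewed mistake; lowering it reduces moderator workload at the cost of letting more automated errors stand.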

Due to machine learning requiring a sufficient volume of data to learn from, automated content moderation systems are only viable for platforms that have a sufficiently large user base. If a platform’s users generate a lot of content, then advanced machine learning is a far more efficient method for content moderation than humans. Conversely, if it is a small platform with a few messages a day, then human moderators will be more effective.

Organisations also need to be aware of risk mitigation, and have contingency plans in place if they rely heavily on social media platforms.

As an example of mitigation, an animal rehoming charity might use the term ‘female dog’ rather than the alternative term, which may be deemed offensive.

Contingency plans could include a regular newsletter emailed to supporters, or a social media presence across multiple platforms, thereby reducing dependence on a single platform if it is mistakenly blocked. Having such plans ensures that there is an established means of communication to fall back on.

Ultimately, automated content moderation systems can be a powerful tool in ethically and effectively moderating user-generated content. However, this can only be achieved with viable data to learn from and the appropriate feedback system in place – machine learning can only occur if it has feedback.

“Ethics is one of the biggest things in technology,” concludes Paukkeri. “Technological problems have been solved, but ethics is the most important thing.”
