The Patheos Filter and its Averter

A Discussion

Some time prior to July 2019, a filter was implemented in the Disqus comment system on Patheos which sends any comment with 'bad words' in it into a moderation queue.

Many of the words would indeed not be acceptable language among the genteel. But even the most genteel would have difficulty avoiding the filter. Some words have a perfectly innocent, ordinary meaning as well as a risque one ('bag', 'pot'); others are only a problem if you're a bigot ('lesbian', 'Islam'); others are not remotely offensive and are presumably there to prevent spam ('iphone', 'http'); and others still target white supremacists, but also catch people wanting to discuss white supremacy or simply world history ('KKK', 'Hitler').

As there's no real rhyme or reason to what is on the list, it is impossible to avoid simply by deciding not to swear. It requires being able to remember the arbitrary intersection of the list of words and one's own vocabulary and re-edit oneself on the fly.

Note that it is very unlikely that anyone at Patheos deliberately curated the list of words; the motivation is simply to tick a box to get ad revenue. See below for further discussion on this point.

Moderators on busy forums have difficulty clearing all the flagged posts, and at least one popular forum (Slacktivist) has no active moderation at all, in which case your carefully-worded comment will never be seen. Without recourse to the list of words, or (easier) the filter, working out what word set it off is a painful process of trial-and-error.

How Does the Filter Work?

Essentially it just checks for exact case-insensitive matches against a large list of words and phrases (a little over a thousand items).

There are two bits of cleverness:

The accent coping was added a few months after the filter was put in place.

If you form a word that isn't on the list, by substituting a letter or adding some numbers ('69' is popular for some reason), it won't be detected.

The word lists sometimes include '*', which may have been supposed to be a globbing character (i.e. stands for any string of characters) in the list's original context. The Patheos filter makes no use of this, e.g. 'cock*' appears on the list yet 'cockamamie' isn't detected.
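The behaviour described above can be imitated in a few lines. This is only a guess at the logic, not the actual Patheos or Disqus code: the function names, the four-item word list, the whole-word matching rule and the way accents are stripped are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical four-item stand-in for the real list of a thousand-odd entries.
# Entries are kept verbatim, including any trailing '*'.
BAD_WORDS = ["bag", "pot", "iphone", "cock*"]


def fold(text: str) -> str:
    """Lower-case the text and strip accents (assumed form of the accent handling)."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def flagged_entries(comment: str, entries=BAD_WORDS) -> list:
    """Return the list entries that appear in the comment as whole words or phrases."""
    folded = fold(comment)
    hits = []
    for entry in entries:
        # re.escape makes '*' a literal asterisk, mimicking the lack of globbing.
        pattern = r"(?<!\w)" + re.escape(fold(entry)) + r"(?!\w)"
        if re.search(pattern, folded):
            hits.append(entry)
    return hits


print(flagged_entries("My iPhone is in the bag"))  # case is ignored
print(flagged_entries("A cockamamie idea"))        # 'cock*' is not treated as a glob
print(flagged_entries("Selling an iphone69"))      # added digits form a different word
```

Under these assumptions the first call reports 'bag' and 'iphone', while the other two report nothing, matching the behaviour observed on the site.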

The Other Filter

Note that unfortunately there is another filter, which sometimes flags comments too. This one may be Disqus rather than Patheos, and it is probably a lot smarter, using some kind of machine-learning treatment in all likelihood. It's likely to be impossible to avert, because we have no access to its internal state, and what it's looking for probably changes moment to moment.

Fortunately, it doesn't act very often.

The one reputed thing that seems to put a comment in danger is length, but plenty of long comments are still published without interference. If this happens to you, try making your comment shorter. Alternatively, try a bit of rewording.

Ross has confirmed that it operates at a different stage in the comment's lifecycle, indicating that it is a separate process:

I read the comments via the RSS feed. I've seen people complain about comments getting caught by the spam filter which nevertheless showed up in my RSS feed, whereas things caught by the naughty word filter do not show up in my feed. That suggests to me that the spam filter is mechanically different and interferes at a different point in the process.

How did this come about?

Clancy, who mods on ‘Roll to Disbelieve’, has this to say:

The word that filtered down to us was that Google would not place ads with Patheos unless a filter was implemented, so Patheos insisted Disqus implement the filter. How the content was selected is unknown to us plebes. Patheos has simply not responded to any suggestions about the filter's content. The NR blog with the most traffic, Friendly Atheist, told Patheos to get stuffed, and deleted the filter, so I'm told.

It is terribly difficult for low-traffic blogs with no moderators except the blogger, as you know. Since quarantine, Captain Cassidy posts an article every day, and gets 300-600 comments per day. We also have four moderators. It is an interesting fact that a moderator can edit the filter, so if there were a mod with a lot of chutzpah, some of the stupidest words might have an "accident".

The motive is profit, the implementation is kind of crappy (see below), and the management don't care. Despite the fact that your comment will be flagged if you say 'lesbian', or 'Islam', no-one is intentionally suppressing those topics. On the other hand, Patheos as an organisation functionally does not care that discussion is being suppressed, although how much of this is organisational dysfunction and how much is callous negligence is a matter of speculation.

The Bad Words

The words are available as badwords.txt. Obviously the contents of that file in no way represent the opinions of this author; they are simply provided as a guide to what will get your post vanished into the moderator bin.

Note that it appears to be composed of a number of different alphabetically-ordered lists, so while it starts off alphabetical, it eventually loops back to 'a' again, etc.

The badwords were retrieved from a list Random_Luker provided as a Google Doc.

Here is another version, courtesy of Clancy, which categorizes the words (and helps to show how arbitrary the list is, or more accurately, the lists are).

The words do not seem to have changed during the existence of the filter.

The Filter as Software

Everything speaks to a rush job where the outcome isn't that important:

The most likely scenario is that someone was told to create this on a shoestring, and created the simplest possible filter using a list of words they found on the internet somewhere, adding counter-measures to very obvious work-arounds.

How Does the Averter Work?

In the case of single bad words, each word is detected, reversed (so it appears in the data backwards), and then surrounded with Unicode control characters that make it display backwards. Two backwards make a forwards.

The Patheos filter isn't looking for backwards bad words, so this works.
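A minimal sketch of the reversal trick follows. The text above does not say which control characters are used; this sketch assumes RIGHT-TO-LEFT OVERRIDE (U+202E) to open and POP DIRECTIONAL FORMATTING (U+202C) to close. The function names and the two-word list are also made up for illustration.

```python
import re

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE (assumed choice of control character)
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, ends the override


def disguise_word(word: str) -> str:
    """Store the word backwards, wrapped so that it is displayed forwards."""
    return RLO + word[::-1] + PDF


def avert_single_words(text: str, bad_words) -> str:
    """Disguise every whole-word, case-insensitive occurrence of a listed word."""
    alternatives = "|".join(re.escape(w) for w in bad_words)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: disguise_word(m.group(0)), text)


averted = avert_single_words("Put it in the bag", ["bag", "pot"])
print(repr(averted))
```

The stored text now contains 'gab' between the two control characters, so an exact-match search for 'bag' finds nothing, while a renderer that honours the override should still show 'bag' to the reader.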

For phrases, the phrase is detected and the first common-or-garden ASCII space character is replaced with two thin spaces. The filter is not very smart and looks for exact strings, so replacing the space character with another kind of space is enough to slip past it.
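The phrase handling might look something like the sketch below. It assumes the thin space is U+2009 THIN SPACE; the function names and the placeholder phrase 'bad phrase' are invented for the example and are not taken from the real list.

```python
import re

THIN_SPACE = "\u2009"  # assumed to be the 'thin space' meant above


def split_first_space(phrase: str) -> str:
    """Replace only the first ASCII space with two thin spaces."""
    return phrase.replace(" ", THIN_SPACE * 2, 1)


def avert_phrases(text: str, phrases) -> str:
    """Disguise each case-insensitive occurrence of a listed phrase."""
    for phrase in phrases:
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        text = pattern.sub(lambda m: split_first_space(m.group(0)), text)
    return text


print(repr(avert_phrases("This is a Bad Phrase indeed", ["bad phrase"])))
```

Only the first space needs changing, because a single altered character is enough to defeat an exact string comparison.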

Reversing doesn't work on palindromes, so the two palindromes ('KKK','boob') have letters replaced with Greek look-alikes.
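A sketch of the palindrome case is below. Which letters actually get swapped is not stated above, so the mapping here (Latin K to Greek kappa, Latin o to Greek omicron) is an assumption, as are the function names.

```python
# Assumed look-alike mapping from Latin letters to Greek ones.
LOOKALIKES = {
    "K": "\u039a",  # GREEK CAPITAL LETTER KAPPA
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
}


def is_palindrome(word: str) -> bool:
    """Reversal is pointless when the word reads the same both ways."""
    folded = word.casefold()
    return folded == folded[::-1]


def avert_palindrome(word: str) -> str:
    """Swap letters for Greek look-alikes so the stored string differs."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in word)


for word in ("KKK", "boob"):
    disguised = avert_palindrome(word)
    print(is_palindrome(word), disguised != word, repr(disguised))
```

The result looks much the same on screen but is made of different code points, so an exact comparison against the Latin spelling fails.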

Update 19 July 2021

A moderator on another Patheos site informs me that new words have been added to the filter. A number of them include a regular expression wildcard '.*' which will match anything, so for example 'merd.*' will match 'merdur'.

(It is possible that the filter has always had regular expression support, but the wildcarded expressions in the word list to date have been meant for a simpler alternative, 'globbing', where '*' by itself matches any string. So for example it includes 'bitch*', which presumably is supposed to match 'bitchin', but interpreted regular expression style it only matches 'bitc' followed by any number of 'h' characters: 'bitc', 'bitch', 'bitchh', 'bitchhhhh', etc.)
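The difference between the two readings can be checked with Python's standard fnmatch and re modules. This only illustrates glob versus regular expression semantics in general; whether the filter matches whole words in this way is an assumption, and the helper names are made up.

```python
import fnmatch
import re


def matches_as_glob(entry: str, word: str) -> bool:
    """Glob reading: '*' stands for any string of characters."""
    return fnmatch.fnmatchcase(word.lower(), entry.lower())


def matches_as_regex(entry: str, word: str) -> bool:
    """Regex reading: '*' means 'zero or more of the preceding character'."""
    return re.fullmatch(entry, word, flags=re.IGNORECASE) is not None


for entry, word in [("bitch*", "bitchin"), ("bitch*", "bitchhh"), ("merd.*", "merdur")]:
    print(entry, word, matches_as_glob(entry, word), matches_as_regex(entry, word))
```

Read as a glob, 'bitch*' covers 'bitchin'; read as a regular expression it does not, whereas 'merd.*' covers 'merdur' only under the regular expression reading.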

Either the behaviour has changed between Patheos sites, or the words have changed, as I can post 'merdur' and 'arsenic' to Slacktivist but not to another site.

It's not simply that the Slacktivist word list is behind the times, as at least one of the new words is detected by the filter.

The new words I know about have been added to the PFA, but I have not so far attempted to do anything about the use of regular expressions.

Note that I can no longer be confident what words are in the Slacktivist list. Adding words that appear on other lists is shooting in the dark at this stage, but it does appear to hit things occasionally.