This tool uses an intelligent algorithm to automatically neutralize sensitive text on the web through a browser plugin, combining NLP and web development.




Problem Statement

Automatic neutralization of racist and sexist web pages.

On today's internet, people from all over the world interact with one another. To protect marginalized communities from harassment, we need a robust and general solution, and such a solution will only work if both parties win: the reader is shielded from abuse, while the page remains readable and its meaning intact.

Relevance

  • This project is a combination of NLP and web development. It pairs machine learning with general software development, making it squarely relevant to computer science.
  • Moreover, our team has taken courses in NLP, ML, and Big Data, so the project applies previously learned methods to new ones.

Proposed Solution

We plan to build a generalized machine learning model that can detect and neutralize racist/sexist content. Our solution has three parts.

Parts of the Solution

  • Firstly, a browser extension, available for all major browsers, that watches the pages you visit and scrapes their text content.
  • Secondly, an ML model that detects racist and sexist portions of the text and neutralizes them: it removes the offensive content and paraphrases the rest using the remaining key words while retaining the original meaning.
  • Lastly, the user gets to choose the degree of exposure from the following three options (a minimal sketch of how the pipeline dispatches on this choice follows the list):
    1. View the page with the original text. Sensitive text is still detected and reported automatically, without user intervention.
    2. View the page with neutralized text, with the changed portions highlighted; hovering over a highlight reveals the original text.
    3. View a completely screened and filtered page with no indication of what text was changed, ensuring consensual ignorance.
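
The sketch below, in plain Python, shows how these three modes could drive the pipeline. All names here (detect_sensitive_spans, report, Span) are hypothetical placeholders for the extension and the ML components described later, not our actual implementation.

```python
# A minimal sketch of dispatching on the user's exposure mode; the stub
# functions stand in for the XLNet classifier, the paraphraser, and the
# auto-reporting hook described below.
from dataclasses import dataclass
from enum import Enum


class ExposureMode(Enum):
    ORIGINAL_WITH_REPORT = 1  # option 1: original text, auto-report
    HIGHLIGHT_WITH_HOVER = 2  # option 2: neutralized text, highlighted changes
    FULLY_FILTERED = 3        # option 3: neutralized text, no indication


@dataclass
class Span:
    start: int        # character offset where the sensitive text begins
    end: int          # character offset where it ends
    replacement: str  # neutralized paraphrase for this span


def detect_sensitive_spans(text: str) -> list[Span]:
    return []  # stub: the ML model fills this in


def report(spans: list[Span]) -> None:
    pass  # stub: auto-reporting hook (see Additional Scope)


def process_page(text: str, mode: ExposureMode) -> str:
    spans = detect_sensitive_spans(text)
    if mode is ExposureMode.ORIGINAL_WITH_REPORT:
        report(spans)
        return text  # user sees the untouched page
    out, cursor = [], 0
    for s in spans:
        out.append(text[cursor:s.start])
        if mode is ExposureMode.HIGHLIGHT_WITH_HOVER:
            # wrap the replacement so the extension can show the
            # original text in a tooltip on hover
            out.append(f'<mark title="{text[s.start:s.end]}">{s.replacement}</mark>')
        else:
            out.append(s.replacement)
        cursor = s.end
    out.append(text[cursor:])
    return "".join(out)
```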

Additional Scope

Based on the threat level of the text, we can auto-report the site or the original poster, or simply neutralize the text to suit the user's taste.
E.g., "I will kill this faggot." counts as a threat and is reported.
"This faggot doesn't know anything." is flagged as insensitive but not a threat; the text will be neutralized.

ML Model

XLNet

Text classification is the key function required to detect racist and sexist text. For this, we have scraped data from over 20 sources to collate one consolidated dataset covering over 30,000 rows of sexist text, annotated and classified.
XLNet is an extension of the Transformer-XL model, pre-trained with an autoregressive method that learns bidirectional contexts by maximizing the expected likelihood over all permutations of the input sequence's factorization order. In simple words, XLNet is a generalized autoregressive model. It uses Transformer-XL as its feature-extraction architecture, which improves on BERT's Transformer by adding recurrence, giving XLNet a deeper understanding of language context. XLNet has outperformed BERT, T5, and DistilBERT in the text-classification domain.
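
A minimal inference sketch with the Hugging Face transformers library; the label scheme here is a hypothetical placeholder, and in practice xlnet-base-cased would first be fine-tuned on our dataset:

```python
# A minimal sketch of sequence classification with XLNet; the LABELS list
# is an assumed scheme, and the base checkpoint would need fine-tuning
# before its predictions mean anything.
import torch
from transformers import XLNetForSequenceClassification, XLNetTokenizer

LABELS = ["neutral", "sexist", "racist"]  # hypothetical label scheme

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=len(LABELS)
)
model.eval()


def classify(sentence: str) -> str:
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]


print(classify("Example sentence to screen."))
```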



Paraphrasing

For the purpose of paraphrasing, we have two main frameworks in mind:

Parrot
The three key metrics that measure the quality of a paraphrase are:
  1. Adequacy (Is the meaning preserved adequately?)
  2. Fluency (Is the paraphrase fluent English?)
  3. Diversity (Lexical / Phrasal / Syntactical) (How much has the paraphrase changed the original sentence?)
Parrot offers knobs to control Adequacy, Fluency, and Diversity as per our needs.
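
A minimal usage sketch of those knobs, following the Parrot README; the parameter names and thresholds shown are illustrative and may vary across versions:

```python
# A minimal sketch of Parrot's adequacy / fluency / diversity knobs,
# following the Parrot README; exact parameter names and defaults may
# differ by version.
from parrot import Parrot

parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5")

candidates = parrot.augment(
    input_phrase="This person doesn't know anything.",
    do_diverse=True,          # diversity: prefer lexically varied outputs
    adequacy_threshold=0.90,  # adequacy: keep only meaning-preserving outputs
    fluency_threshold=0.90,   # fluency: keep only fluent English outputs
)
for phrase, score in candidates or []:  # augment returns None if nothing passes
    print(score, phrase)
```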

Pegasus
  • The model derives paraphrases from an input sentence, and we will also compare how each paraphrase differs from the input.
  • The PEGASUS pre-training task closely resembles summarization: several whole sentences are removed and masked from an input document, and the model is tasked with recovering them, generating them together as one output sequence, much like a summary. The input is the document with the missing sentences; the output is the missing sentences concatenated together.
  • The advantage of this self-supervision is that you can create as many examples as there are documents, without any human intervention, which is often the bottleneck in purely supervised systems.
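
A minimal paraphrasing sketch using a community PEGASUS checkpoint fine-tuned for paraphrase generation (tuner007/pegasus_paraphrase, which is not part of this project); the generation settings are illustrative:

```python
# A minimal sketch of paraphrasing with a PEGASUS checkpoint fine-tuned
# for paraphrase generation; the checkpoint and generation settings are
# illustrative, not our final configuration.
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "tuner007/pegasus_paraphrase"  # community paraphrase checkpoint
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)
model.eval()


def paraphrase(sentence: str, n: int = 3) -> list[str]:
    batch = tokenizer([sentence], truncation=True, padding="longest",
                      max_length=60, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(**batch, max_length=60, num_beams=10,
                                 num_return_sequences=n)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


print(paraphrase("This person doesn't know anything."))
```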