Analyzing toxicity in social media posts with a custom dictionary

To examine how many posts in your dataset contain toxic words or phrases, you can use a pre-compiled dictionary of toxic terms and swear words, developed as part of the following publication:

K. Hazel Kwon, & Anatoliy Gruzd. (2017). Is offensive commenting contagious online? Examining public vs interpersonal swearing in response to Donald Trump’s YouTube campaign videos. Internet Research, 27(4), 991–1010. https://doi.org/10.1108/IntR-02-2017-0072 [open access version]

First, download the dictionary file from the online data repository linked below. Make sure to select the CSV file prepared for Netlytic, named ‘NETLYTIC_swear_dictionary.csv’.

https://doi.org/10.5683/SP/J59UUG

Second, import the downloaded dictionary file into Netlytic using the “Import” button inside the “Manual Categories” window.

Important: If you have already analyzed your dataset with this feature, click the “Reset” button and then re-run the “Manual Categories” analysis so that the new category is included in the results.

Once the analysis is ready, you can explore the results using the treemap visualization or export the labeled dataset as a CSV file using the Export option.
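If you would like to sanity-check the counts outside Netlytic, the general idea of dictionary-based matching can be sketched in a few lines of Python. This is only an illustration and not Netlytic's actual implementation: the function names, the placeholder terms, and the sample posts are all assumptions made for the example, and the matching rules (case-insensitive, whole words or phrases only) may differ from what Netlytic does.

```python
import re


def build_pattern(terms):
    """Compile one case-insensitive regex that matches any term as a whole word or phrase."""
    cleaned = {t.strip().lower() for t in terms if t.strip()}
    if not cleaned:
        raise ValueError("the term list is empty")
    # Longest terms first so multi-word phrases are tried before shorter terms.
    ordered = sorted(cleaned, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    # The lookarounds prevent matches inside longer words.
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


def count_matching_posts(posts, terms):
    """Return how many posts contain at least one dictionary term."""
    pattern = build_pattern(terms)
    return sum(1 for post in posts if pattern.search(post))


# Placeholder data for illustration only; not taken from the real dictionary.
example_terms = ["darn", "heck", "shut up"]
example_posts = [
    "Well, darn it!",
    "What a lovely day.",
    "Oh SHUT UP already",
    "Checkmate",  # "heck" appears only inside a longer word, so it is not counted
]

print(count_matching_posts(example_posts, example_terms))  # prints 2
```

To try this on your own data, you would replace the placeholder lists with the terms from the downloaded dictionary file and the post texts from your dataset.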

For more information about this type of analysis in Netlytic, see the following page: https://netlytic.org/home/?page_id=11101
