Online Political Toxicity Report Details

This page describes how we report the toxicity of political figures' online posts.

Step 1: Data collection

To support our mission of reducing online toxicity and encouraging more productive political dialogue, we continuously monitor public posts from selected, influential political figures.

For each individual, we load posts from up to one year back. The list of political figures is curated manually to ensure relevance, public impact, and accountability. If you believe an important voice is missing, please send us your feedback and suggestions; our selection process is open to improvement.

Posts are retrieved from Truth Social and from Twitter/X, the latter through the external service twitterapi.io. The collected data is updated daily to include newly published content.
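The daily update combined with the one-year window can be pictured roughly as follows. This is a minimal sketch, not our production code: the function name `merge_daily_update`, the post ID keys, and the `created_at` field are hypothetical, and dropping posts once they fall outside the window is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone


def merge_daily_update(stored, fetched, now, max_age_days=365):
    """Merge newly fetched posts into the stored set and keep only recent ones.

    Both `stored` and `fetched` map a post ID to a dict that carries a
    timezone-aware "created_at" datetime (hypothetical field name).
    """
    cutoff = now - timedelta(days=max_age_days)
    # Entries from `fetched` win when the same post ID appears twice.
    merged = {**stored, **fetched}
    return {pid: post for pid, post in merged.items() if post["created_at"] >= cutoff}


utc = timezone.utc
stored = {
    "a": {"created_at": datetime(2024, 5, 1, tzinfo=utc)},
    "b": {"created_at": datetime(2025, 1, 1, tzinfo=utc)},
}
fetched = {"c": {"created_at": datetime(2025, 5, 31, tzinfo=utc)}}
current = merge_daily_update(stored, fetched, now=datetime(2025, 6, 1, tzinfo=utc))
```

Keying posts by ID keeps repeated daily runs from storing the same post twice.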

Step 2: Scoring

All retrieved posts are automatically analyzed using the Perspective API, developed by Google Jigsaw. Perspective is a widely used machine-learning service designed to identify and measure harmful language in online conversations.

For each post, we generate scores across multiple dimensions, including toxicity, profanity, identity attacks, insults, and threats. These scores represent the likelihood that a given piece of text contains language that could negatively impact constructive public discourse.
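For readers who want to reproduce a score, the sketch below shows what a single Perspective request and its result could look like. It uses the publicly documented `comments:analyze` endpoint and attribute names; the helper names (`build_request`, `extract_scores`, `score_post`) are hypothetical, and the exact set of attributes and options we request is an assumption based on the dimensions listed above.

```python
import json
import urllib.request

# Perspective attribute names matching the dimensions listed above (assumed set).
ATTRIBUTES = ["TOXICITY", "PROFANITY", "IDENTITY_ATTACK", "INSULT", "THREAT"]
ENDPOINT = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Build the JSON body for one comments:analyze call."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {name: {} for name in ATTRIBUTES},
    }


def extract_scores(response):
    """Map each returned attribute to its summary score (a value from 0 to 1)."""
    return {
        name: data["summaryScore"]["value"]
        for name, data in response.get("attributeScores", {}).items()
    }


def score_post(text, api_key):
    """Send one post to Perspective; needs network access and a valid API key."""
    request = urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return extract_scores(json.load(resp))


# Offline illustration with a hand-written response in the documented shape.
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.12, "type": "PROBABILITY"}},
        "INSULT": {"summaryScore": {"value": 0.05, "type": "PROBABILITY"}},
    }
}
scores = extract_scores(sample_response)
```

The values in `sample_response` are made up for illustration and do not come from a real post.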

Perspective’s models are trained on large, diverse datasets and continuously refined using a combination of human annotations and statistical methods. Detailed information about the underlying training data, model limitations, and methodological approach is publicly available in the official Perspective API documentation and should be consulted for a deeper technical understanding.

It is important to note that these scores are probabilistic assessments, not definitive judgments. They are intended to support transparency, trend analysis, and informed discussion—not to label intent or context with absolute certainty.

Step 3: Aggregation and reporting

Automated language analysis can be sensitive to context, wording, and short-term fluctuations. To mitigate these effects and account for known limitations of machine-learning–based scoring systems, we aggregate all individual post scores before reporting them.

Rather than highlighting single posts in isolation, we compute rolling monthly averages across all analyzed dimensions (e.g. toxicity, insults, threats). This approach reduces noise, smooths out outliers, and allows us to focus on long-term trends and behavioral patterns instead of individual statements or momentary spikes.
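To illustrate the smoothing effect, here is a minimal sketch of a rolling average for one person and one dimension. It assumes that "monthly" means a trailing 30-day window ending on each day with at least one post; the function name `rolling_monthly_average` and the `window_days` parameter are hypothetical, and the actual computation behind our charts may differ.

```python
from datetime import date, timedelta


def rolling_monthly_average(scored_posts, window_days=30):
    """Return {day: mean score of all posts in the window ending on that day}.

    `scored_posts` is a list of (date, score) pairs for one person and one
    dimension, for example toxicity.
    """
    posts = sorted(scored_posts)
    averages = {}
    for day in sorted({d for d, _ in posts}):
        start = day - timedelta(days=window_days - 1)
        window = [score for d, score in posts if start <= d <= day]
        averages[day] = sum(window) / len(window)
    return averages


posts = [
    (date(2025, 1, 1), 0.25),
    (date(2025, 1, 15), 0.75),
    (date(2025, 2, 20), 1.0),
]
trend = rolling_monthly_average(posts)
```

In this made-up example, the value for 15 January averages both January posts, so a single high-scoring post moves the reported figure less than it would on its own.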

The aggregated data is then visualized using Grafana, enabling interactive charts, time-series graphs, and statistical summaries. These visualizations are designed to make changes over time easier to understand and to support evidence-based discussion.

By emphasizing aggregated trends rather than individual scores, we aim to provide a more robust, fair, and interpretable view of political discourse—while minimizing the risk of misinterpretation inherent in automated text analysis.

Community and Sources

Please contact us at info@diametral.net if you have feedback or ideas for improvement. Head over to our GitHub repositories if you're interested in the source code. Database dumps are available here:

- 2025

- 2024