
Harnessing AI for Content Moderation: A Guide to Integrating It into Your App

Italo Orihuela
Engineer
Android
iOS
Web
Feb 7, 2024

Online communities have become a vital part of our daily lives. However, keeping these communities safe and inclusive can be challenging. This is where AI moderation comes into play. Enhance Moderation is a feature that uses AI to automatically scan and moderate all user-generated content, identifying and addressing inappropriate or offensive material. This blog post will guide you through the process of integrating Enhance Moderation into your app so you can provide a secure online environment for your users.

Prerequisites

Before we dive into the details, let’s go over the prerequisites for integrating Enhance Moderation into your app:

  1. The Amity Social Cloud SDK installed in your project
  2. An Amity Social Cloud Portal account
  3. An Amity Social Cloud Console account

Note: If you haven’t already registered for an Amity Social Cloud account, we recommend following our comprehensive step-by-step guide in the Amity Portal to create your new network.

Step 1: Understanding Enhance Moderation

Enhance Moderation is a feature that uses AI to scan and moderate all user-generated content in your app. It can detect and moderate text, images, and videos that contain inappropriate or offensive content. The list of categories it can detect is extensive, ranging from sexually explicit content to hate symbols and violence. At the time of writing, this feature is in Private Beta, and you can submit a request to the Amity Help Center to enable it.

Types of Content that can be Blocked

Enhance Moderation can detect and moderate a wide range of content categories, including but not limited to:

  • Sexually explicit or adult content
  • Offensive content
  • Adult Toys
  • Alcohol and Alcoholic Beverages
  • Drug Paraphernalia and Drug Products
  • Explicit Nudity
  • Hate Symbols
  • Graphic Violence Or Gore
  • Gambling
  • Smoking and Tobacco Products
  • Weapons and Weapon Violence
  • White Supremacy

Step 2: How Enhance Moderation Works

AI moderation relies on two thresholds: flagConfidence and blockConfidence. When a post or a comment is created, the AI system scans the content and generates a confidence value. If this value falls below flagConfidence, the content passes moderation. If the value falls between flagConfidence and blockConfidence, the content is flagged for review. If the value exceeds blockConfidence, the content is deleted. This proactive approach helps keep the online environment safe for your users.
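
The three outcomes described above can be sketched as a small decision function. This is only an illustration of the threshold logic, not the service's actual implementation. The function and enum names are made up for this example, and the handling of values that land exactly on a threshold is an assumption, since the behavior at the boundaries is not specified here.

```python
from enum import Enum


class Decision(Enum):
    PASS = "pass"    # content is published without intervention
    FLAG = "flag"    # content is flagged for review
    BLOCK = "block"  # content is deleted


def decide(confidence: float, flag_confidence: float, block_confidence: float) -> Decision:
    """Map a confidence value to a moderation outcome.

    Assumption: a value exactly equal to either threshold is treated as
    "between" the two thresholds, so it is flagged rather than passed or deleted.
    """
    if confidence > block_confidence:
        return Decision.BLOCK
    if confidence >= flag_confidence:
        return Decision.FLAG
    return Decision.PASS


if __name__ == "__main__":
    # Example thresholds chosen for illustration only.
    for value in (10, 55, 95):
        print(value, decide(value, flag_confidence=40, block_confidence=80).value)
```

Keeping the decision in one pure function like this makes it easy to test how a change to either threshold would affect content that sits near the boundary.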

Step 3: Configuring Enhance Moderation

By default, all categories have an initial flagConfidence value of 40, and a blockConfidence value of 80. However, you can adjust these values based on your moderation policies and the needs of your community. You can use the Enhance Moderation APIs to retrieve and set the confidence level for each moderation category.
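
To plan your settings before calling any API, it can help to model the per-category thresholds locally. The sketch below is a hypothetical helper, not part of the Amity SDK. It uses the default values of 40 and 80 mentioned above, and it assumes a 0 to 100 scale and that flagConfidence should never exceed blockConfidence. The category names used as keys are taken from the list earlier in this post, but their exact spelling in the API is an assumption.

```python
from dataclasses import dataclass

DEFAULT_FLAG_CONFIDENCE = 40
DEFAULT_BLOCK_CONFIDENCE = 80


@dataclass(frozen=True)
class Thresholds:
    flag_confidence: int = DEFAULT_FLAG_CONFIDENCE
    block_confidence: int = DEFAULT_BLOCK_CONFIDENCE

    def __post_init__(self) -> None:
        # Assumption: values live on a 0-100 scale and flag <= block.
        if not 0 <= self.flag_confidence <= self.block_confidence <= 100:
            raise ValueError(
                "expected 0 <= flag_confidence <= block_confidence <= 100, "
                f"got flag={self.flag_confidence}, block={self.block_confidence}"
            )


def thresholds_for(category: str, overrides: dict[str, Thresholds]) -> Thresholds:
    """Return the override for a category, or the defaults if none is set."""
    return overrides.get(category, Thresholds())


# Hypothetical policy: stricter on hate symbols, more lenient on alcohol.
OVERRIDES = {
    "Hate Symbols": Thresholds(flag_confidence=20, block_confidence=60),
    "Alcohol and Alcoholic Beverages": Thresholds(flag_confidence=70, block_confidence=95),
}

if __name__ == "__main__":
    for name in ("Hate Symbols", "Gambling"):
        print(name, thresholds_for(name, OVERRIDES))
```

Validating the pair at construction time catches inverted thresholds before they are ever sent to the service.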

Step 4: Using the API Endpoint

To use the Enhance Moderation APIs, specify the appropriate API endpoint on each HTTP request. Each data center has its own endpoint, so make sure you use the one that matches the region where your network is hosted. Sending requests to the endpoint for the right location helps keep response times low and improves the overall performance of your API requests.
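
One way to keep the endpoint choice in a single place is a small lookup keyed by region. The sketch below uses placeholder hostnames and made-up region codes, not real Amity endpoints. Replace them with the endpoints documented for your data center.

```python
# Placeholder hosts and hypothetical region codes: substitute the real
# endpoints for your data centers before using this.
ENDPOINTS = {
    "eu": "https://api.eu.example.com",
    "sg": "https://api.sg.example.com",
    "us": "https://api.us.example.com",
}


def endpoint_url(region: str, path: str) -> str:
    """Build a full URL for the given region and API path."""
    try:
        base = ENDPOINTS[region.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; expected one of {sorted(ENDPOINTS)}"
        ) from None
    return f"{base}/{path.lstrip('/')}"


if __name__ == "__main__":
    print(endpoint_url("EU", "/some/path"))
```

Failing loudly on an unknown region avoids silently sending requests to the wrong data center.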

Step 5: Getting and Updating Moderation Confidence Level

You can use the Get moderation confidence level API to retrieve the confidence level for each moderation category. Similarly, you can use the Set moderation confidence level API to set the confidence level for any moderation category. We recommend adjusting the confidence levels for each category based on your moderation policies and the needs of your community.
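
As a rough illustration of what the two calls might look like over HTTP, the sketch below builds (but does not send) a read request and an update request using Python's standard library. Everything specific in it is an assumption made for the example: the base URL is a placeholder, the path, HTTP methods, bearer-token header, and the "category" field name are hypothetical, and only the flagConfidence and blockConfidence names come from this post. Check the official API reference for the real routes and payload shape.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com"        # placeholder, not a real endpoint
CONFIDENCE_PATH = "/moderation/confidence"  # hypothetical path


def get_confidence_request(token: str) -> Request:
    """Build a request that would read the current confidence levels."""
    return Request(
        BASE_URL + CONFIDENCE_PATH,
        headers={"Authorization": f"Bearer {token}"},  # auth scheme is assumed
        method="GET",
    )


def set_confidence_request(
    token: str, category: str, flag_confidence: int, block_confidence: int
) -> Request:
    """Build a request that would update the thresholds for one category."""
    body = json.dumps(
        {
            "category": category,  # hypothetical field name
            "flagConfidence": flag_confidence,
            "blockConfidence": block_confidence,
        }
    ).encode("utf-8")
    return Request(
        BASE_URL + CONFIDENCE_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",  # the real API may use a different method
    )


if __name__ == "__main__":
    req = set_confidence_request("my-admin-token", "Gambling", 50, 90)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To actually send either request, you could pass it to urllib.request.urlopen once the placeholders have been replaced with real values.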

Final Thoughts

Harnessing the power of AI for content moderation can significantly enhance the safety and inclusivity of your online communities. With Enhance Moderation, you can identify and address inappropriate or offensive content without relying solely on manual oversight: clear violations are removed automatically, and borderline content is flagged for review. By understanding how it works and adjusting the confidence levels to fit your moderation policies, you can provide a secure online environment for your users.