How to build an AI-powered spam filter

Andrew Tate
Technical writer

Oct 18, 2024

Lovely Spam! Wonderful Spam!

Spam. Spam. Spam. Spam.

Oh, you meant unwanted messages on the web? What a pity, it would have been interesting to build an app to try to find Eric Idle with AI.

160 billion spam emails are sent each day. We won’t go into what they often sell (and you probably know better than to click on them). Of course, you rarely have to see spam emails anymore—email spam detection filters have become increasingly sophisticated, marking these messages as junk and ensuring they never reach your inbox.

AI makes many of these filters possible, and with the growth in public large language models (LLMs), it's become much easier for anyone to build their own custom spam filter for their needs: filtering messages from a feedback form, moderating comments on a blog, screening applications for an online job board, or protecting customer support chat.

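It’s the first of these we’re going to tackle today, taken from our recent AI build-along. We will:

  • Build a Retool database to store the feedback
  • Build a contact form on top of that database
  • Connect the form data to an AI model with a Retool workflow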

Let’s get into it.

Building our Retool database

Why start here instead of with the contact form?

Well, for one, it is good practice. Defining your data schema is always a good idea before you build an app. Doing it this way will help us clearly define the structure of our data and ensure that our application has a solid foundation. It will also make the subsequent steps of creating the contact form and implementing the AI-based spam detection more straightforward, as we'll have a well-defined data structure to work with.

In fact, this is super-true with Retool, because we can automatically create a form using a Retool database schema. We’ll get to that in a moment, but first, we’ll set out our schema. It is going to have:

  • id: added by default and used as the primary key
  • user_name: the name of the individual giving feedback
  • user_feedback: the user's actual comment
  • is_spam: a checkbox for whether this entry is spam or not

We’ll call our table feedback_data.

That’s all we need for now.
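For readers who prefer to see a schema as code, here is a minimal sketch of the same table. It uses Python's built-in sqlite3 module as a local stand-in: in the walkthrough the table is created through the Retool UI, so the column types and the default of 0 for is_spam are assumptions, not what Retool generates.

```python
import sqlite3

# In-memory SQLite database, used purely as a local stand-in for Retool Database.
conn = sqlite3.connect(":memory:")

# Column names come from the article; the types and the default are assumptions.
conn.execute(
    """
    CREATE TABLE feedback_data (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        user_name     TEXT NOT NULL,
        user_feedback TEXT NOT NULL,
        is_spam       INTEGER NOT NULL DEFAULT 0  -- checkbox: 0 = not spam, 1 = spam
    )
    """
)

# A form submission only supplies the two visible fields.
conn.execute(
    "INSERT INTO feedback_data (user_name, user_feedback) VALUES (?, ?)",
    ("John Doe", "I want to sell you something"),
)

row = conn.execute(
    "SELECT id, user_name, user_feedback, is_spam FROM feedback_data"
).fetchone()
print(row)
```

The id and is_spam columns are filled in without any input from the person submitting feedback, which is why the form can safely hide them.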

Building our contact form

With our database ready, we can move on to the contact form. As we have an underlying database, we can use that as the basis for our form, by selecting “Generate from database”:

This will populate the form with the fields from our feedback_data table. We only want user_name and user_feedback to show on the form, so we can hide the other two (id and is_spam). We’ll change the labels and placeholders to something a little nicer, and we’ll end up with a form like this:

That's pretty swish. If we publish this form now, the data will automatically fill in our Retool database. If this were a guide to Retool Forms, then we’d already be done, probably within a minute.

But adding spam detection is going to take a little (not a lot!) more work.

Connecting data to an AI model with a Retool workflow

The heart of this spam detector is a Retool workflow. The workflow will take data from the form, use an AI model with specific prompts to check whether the message and name are likely spam, and then mark the entry spam or not in the database.

Let’s start with showing the full workflow, then we’ll go through each component to see how it was created:

The first part is the trigger. This is what fires when a new entry is created in the form. To activate it, go to “Edit triggers” and select “Webhook.”

We don’t have to manually change the webhook URL—Retool will take care of that. Head back to your Retool form, select “Actions,” and then “Workflow.” Choose your workflow from the dropdown (here, it is called spam workflow), hit save, and then publish. The workflow is now connected to the form.

Next, we want to create our spam detection AI. From the startTrigger, connect another block and choose “AI Action.” This is going to be the heart of your spam detector. Choose Retool AI and then “Generate text.” We will ask the AI model to read the name and feedback from the form and then decide if the submission is spammy. If it is, we want it to generate the text “true.” If not, we want it to generate the text “false.”

We’ll pass both an input prompt and a system message to do this. The input we’re using here is:

“Based on the following contact form submission (full name and submitted data), make your best guess about whether this submission is an actual piece of feedback that should be forwarded to the team or whether it should be marked as spam.

Full Name: {{ startTrigger.data.user_name }}

Feedback: {{ startTrigger.data.user_feedback }}

Do you think the response above is spam? Respond with the single lowercase word true if the response is spam or the single lowercase word false if the response is not spam.”

Using the curly brackets, we can dynamically add the values from our form via our startTrigger block. So, in our test instance, the following would be sent to the AI model we chose (GPT-4o-mini, in this case):

“Based on the following contact form submission (full name and submitted data), make your best guess about whether this submission is an actual piece of feedback that should be forwarded to the team or whether it should be marked as spam.

Full Name: John Doe

Feedback: I want to sell you something

Do you think the response above is spam? Respond with the single lowercase word true if the response is spam or the single lowercase word false if the response is not spam.”

In this case, it should return “true,” as that is obviously spam (sorry to all the John Does out there).
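The substitution itself is plain templating. As a rough illustration outside Retool, the sketch below fills the same prompt with Python's str.format. The names PROMPT_TEMPLATE and build_prompt are hypothetical and only stand in for what the {{ }} expressions do inside the workflow.

```python
# The prompt text is copied from the article; the placeholders replace the
# Retool {{ startTrigger.data.* }} expressions.
PROMPT_TEMPLATE = (
    "Based on the following contact form submission (full name and submitted data), "
    "make your best guess about whether this submission is an actual piece of feedback "
    "that should be forwarded to the team or whether it should be marked as spam.\n\n"
    "Full Name: {user_name}\n\n"
    "Feedback: {user_feedback}\n\n"
    "Do you think the response above is spam? Respond with the single lowercase word "
    "true if the response is spam or the single lowercase word false if the response "
    "is not spam."
)


def build_prompt(user_name: str, user_feedback: str) -> str:
    """Fill the template with one submission's values (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(user_name=user_name, user_feedback=user_feedback)


prompt = build_prompt("John Doe", "I want to sell you something")
print(prompt)
```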

The system message helps give the AI the correct context:

“You are an expert at spam detection, with a discerning eye for feedback that should be passed to the product team.”
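Because the model replies with text rather than a real boolean, it can help to picture how that reply would be interpreted. The sketch below is one possible approach, not part of the Retool workflow in this walkthrough: parse_spam_verdict is a hypothetical helper that accepts only the two expected words and rejects anything else.

```python
def parse_spam_verdict(reply: str) -> bool:
    """Convert the model's text reply into a boolean (hypothetical helper).

    Raises ValueError for anything other than "true" or "false" so that an
    unexpected reply is surfaced instead of being silently stored.
    """
    # Tolerate surrounding whitespace, quotes, a trailing period, and casing.
    normalized = reply.strip().strip("\"'.").lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"Unexpected model reply: {reply!r}")


print(parse_spam_verdict("true"))
print(parse_spam_verdict(" False.\n"))
```

Raising on an unexpected reply is a design choice: a submission the model could not classify cleanly is probably worth a second look instead of a default.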

At the moment, though, nothing will happen with that categorization. So, the next block will be a Resource block, and we’ll choose a Retool database.

Above, we’ve shown the GUI way of accessing and updating the data in the database. Basically, we are saying: where user_name = {{ startTrigger.data.user_name }} and user_feedback = {{ startTrigger.data.user_feedback }}, set the is_spam checkbox to whatever the AI decides. We can also use SQL to do this:

UPDATE
  feedback_data
SET
  is_spam = {{ spamFilter.data }}
WHERE
  feedback_data.user_name = '{{ startTrigger.data.user_name }}'
  AND feedback_data.user_feedback = '{{ startTrigger.data.user_feedback }}';
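To try the same update outside Retool, here is a small sketch using Python's sqlite3 module as a stand-in database. The mark_spam helper, the column types, and the second sample row are all hypothetical. The values are passed as bound parameters rather than pasted into the SQL string.

```python
import sqlite3

# Local stand-in table; the column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback_data ("
    "id INTEGER PRIMARY KEY, user_name TEXT, user_feedback TEXT, "
    "is_spam INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO feedback_data (user_name, user_feedback) VALUES (?, ?)",
    [
        ("John Doe", "I want to sell you something"),
        ("Jane Roe", "The export button fails on large tables"),  # made-up sample
    ],
)


def mark_spam(conn, user_name, user_feedback, is_spam):
    """Set is_spam on rows matching the submission; returns the rows updated."""
    cursor = conn.execute(
        "UPDATE feedback_data SET is_spam = ? "
        "WHERE user_name = ? AND user_feedback = ?",
        (int(is_spam), user_name, user_feedback),
    )
    return cursor.rowcount


updated = mark_spam(conn, "John Doe", "I want to sell you something", True)
rows = conn.execute(
    "SELECT user_name, is_spam FROM feedback_data ORDER BY id"
).fetchall()
print(updated, rows)
```

Like the query above, this matches on name and feedback text, so two identical submissions would both be updated.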

That’s it. Let’s test it. First, we’ll try some spam:

Then, if we check out the database:

There it is! Now, let’s try a real message:

And check whether it has been marked as spam:

No spam here. We now have a functioning spam detection system using AI, built in only a few minutes with a couple of prompts and a single SQL query.

If you want to build a similar AI app in just a few minutes, then you can sign up for Retool and use AI for free to build sophisticated applications like this.

Andrew is an ex-neuroengineer-turned-developer and technical content marketer.