How Spam Filtering Works in Postbox
Postbox layers honeypots, disposable email detection, link analysis, content moderation, and LLM intelligence to stop spam before it wastes your time or credits.
Form spam is a solved problem. It just hasn’t been solved by most form backends.
The standard industry approach is a CAPTCHA. Stick a reCAPTCHA on your form, force your users to click traffic lights, and call it a day. The problem: CAPTCHAs hurt conversion rates, annoy legitimate users, and don’t work against sophisticated bots anyway. They’re a tax on your users to compensate for your backend’s inability to tell real submissions from junk.
We think that’s backwards. Spam detection belongs on the server, not the client. Your users should never see it, never interact with it, and never know it’s there. The bot should never know it was caught either.
Here’s how we built it.
Two strategies, layered by design
Postbox offers two spam strategies: Standard and Intelligent. Standard is free on every plan — it’s a product capability, not a premium feature. Intelligent adds LLM-powered analysis for 1 AI credit per submission and is available on Pro.
Both strategies share one principle: the bot always receives a normal 201 Created response. No error codes. No rejection messages. The submission gets quietly flagged and routed to your spam folder. The sender has no signal that anything happened.
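The "always succeed" pattern can be sketched like this. A minimal illustration with hypothetical names (handle_submission, store, and the injected is_spam check are stand-ins, not Postbox's actual code):

```python
# Sketch of the silent-flagging pattern: the sender always gets 201,
# and only the internal destination folder differs.
def store(submission: dict, folder: str) -> None:
    submission["_folder"] = folder            # stand-in for real persistence

def handle_submission(submission: dict, is_spam) -> tuple[int, dict]:
    folder = "spam" if is_spam(submission) else "inbox"
    store(submission, folder=folder)          # flagged quietly, server-side
    return 201, {"status": "created"}         # identical response for bots and humans
```

The key design choice: the response body and status code carry zero information about the verdict, so a bot can't probe the filter by varying its payload.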
Standard strategy: five layers, zero cost
Standard spam filtering runs five checks in sequence. Each one catches a different category of junk.
1. Honeypot fields
The oldest trick in the book, and still one of the most effective. When you define your form schema, you can mark any field as a honeypot by setting "honeypot": true. That field gets included in the schema but should be hidden from real users via CSS in your HTML form.
Bots scrape the page, see the field, and fill it in. Humans never see it. If the honeypot field contains any value, the submission is instantly flagged as spam. Done. No scoring, no analysis, no API calls.
The beauty of honeypots is that they’re zero-friction for real users and nearly impossible for naive bots to detect. The field looks legitimate in the HTML. The bot fills it out. The response comes back 201. The bot thinks it succeeded. It didn’t.
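The check itself is trivial. A minimal sketch, assuming an illustrative schema in which a "website" field is marked as the honeypot (the field names are made up; only the "honeypot": true convention comes from the text):

```python
# The schema marks "website" as a honeypot; the HTML form hides that
# field with CSS, so only bots ever fill it in.
SCHEMA = {
    "name":    {"type": "string"},
    "email":   {"type": "string"},
    "website": {"type": "string", "honeypot": True},  # hidden from humans
}

def honeypot_fields(schema: dict) -> list[str]:
    return [f for f, spec in schema.items() if spec.get("honeypot")]

def is_honeypot_spam(schema: dict, submission: dict) -> bool:
    # Any non-empty value in a honeypot field flags the submission.
    return any(submission.get(f) for f in honeypot_fields(schema))
```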
2. Disposable email detection
We maintain a list of over 100,000 disposable email domains — services like Mailinator, Guerrilla Mail, Temp Mail, and thousands of others. The list is loaded into ETS (Erlang Term Storage) at boot for O(1) lookups. When a submission includes an email field, we check the domain against this list.
Disposable emails are the hallmark of throwaway spam accounts. Legitimate users occasionally use them for privacy, but in the context of form submissions, they correlate overwhelmingly with junk. If the domain matches, the submission gets flagged.
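The lookup amounts to a constant-time set membership test. A sketch with a Python set standing in for ETS (the three domains shown are a tiny sample, not the real 100,000-entry list):

```python
# Constant-time disposable-domain check. In production the full list
# lives in ETS; a set gives the same O(1) lookup semantics here.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com", "temp-mail.org"}

def is_disposable(email: str) -> bool:
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in DISPOSABLE_DOMAINS
```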
3. Profanity scoring
A blocklist of over 10,000 profane and abusive terms, also loaded into ETS for constant-time lookups. We scan text fields and calculate a score: each matched term adds 5 points. High scores indicate abusive or low-quality submissions.
This isn’t about censorship — it’s about signal. A contact form submission that’s 40% slurs is not a legitimate inquiry. The scoring approach means a single borderline word won’t trigger a flag, but a submission stuffed with abuse will.
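In code, the scoring is a word-by-word set lookup. A sketch with placeholder terms (the real blocklist has 10,000+ entries; only the +5-per-match rule comes from the text):

```python
import re

BLOCKLIST = {"badword1", "badword2"}   # placeholder; real list is 10,000+ terms
POINTS_PER_MATCH = 5

def profanity_score(text: str) -> int:
    # Tokenize to lowercase words, add 5 points per blocklisted term.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(POINTS_PER_MATCH for w in words if w in BLOCKLIST)
```

One borderline match yields a score of 5; a submission stuffed with matches climbs fast, which is exactly the "signal, not censorship" behavior described above.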
4. Link analysis
Spam loves links. We count them and weight them by suspicion:
- Every URL adds +1 to the score
- URL shorteners (bit.ly, tinyurl.com, t.co, and similar) add +3 each — legitimate form submissions rarely contain shortened links
- Suspicious TLDs (.xyz, .top, .click, .buzz, and others commonly abused by spammers) add +2 each
A score of 5 or higher marks the submission as suspicious. A score of 10 or higher flags it as spam. A normal submission with one or two regular links won’t trigger anything. A submission packed with bit.ly links to .xyz domains will get caught instantly.
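The weights and thresholds above translate directly into a small scorer. A sketch (the shortener and TLD lists are abbreviated examples of the larger lists mentioned in the text):

```python
import re

SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}
SUSPICIOUS_TLDS = {".xyz", ".top", ".click", ".buzz"}

def link_score(text: str) -> int:
    score = 0
    for url in re.findall(r"https?://\S+", text):
        host = url.split("/")[2].lower()
        score += 1                                      # every URL: +1
        if host in SHORTENERS:
            score += 3                                  # shortened link: +3
        if any(host.endswith(tld) for tld in SUSPICIOUS_TLDS):
            score += 2                                  # suspicious TLD: +2
    return score

def verdict(score: int) -> str:
    if score >= 10:
        return "spam"
    if score >= 5:
        return "suspicious"
    return "ok"
```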
5. Content moderation
The final layer calls the OpenAI Moderation API. This is a free API — it doesn’t consume any of your AI credits. It checks for harassment, hate speech, self-harm, sexual content, and violence across all standard categories.
This catches the submissions that pass heuristic checks but contain genuinely harmful content. The API is fast, free, and purpose-built for exactly this use case.
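Acting on the moderation result is straightforward. A sketch that omits the HTTP call and works on a dict mirroring the shape of a /v1/moderations response (a top-level results list, each entry with a flagged boolean and per-category booleans); the helper names are illustrative:

```python
# Interpret a moderation result: was anything flagged, and which
# categories triggered it?
def is_harmful(result: dict) -> bool:
    return any(r.get("flagged", False) for r in result.get("results", []))

def flagged_categories(result: dict) -> list[str]:
    cats = []
    for r in result.get("results", []):
        if r.get("flagged"):
            cats += [name for name, hit in r.get("categories", {}).items() if hit]
    return cats
```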
Intelligent strategy: LLM-powered contextual analysis
Standard filtering is rule-based. It catches the obvious stuff — bots, throwaway emails, link farms, abuse. But some spam is subtler. A well-crafted promotional message with no profanity, no suspicious links, and a real email address will sail through heuristics.
That’s where Intelligent filtering comes in.
For 1 AI credit per submission, Postbox sends the submission content to an LLM with explicit instructions to analyze it for spam indicators. The model evaluates:
- Promotional content — unsolicited marketing, SEO pitches, “business opportunities”
- Irrelevant data — submissions that don’t match the form’s purpose
- Gibberish — random characters, keyboard mashing, test submissions
- Phishing attempts — requests for credentials, financial information, or personal data
- Bot patterns — templated text, mail-merge artifacts, generic greetings
- Category spam — crypto, gambling, pharmaceutical, and adult content promotion
The model returns a confidence score and a human-readable reason explaining why it flagged (or didn’t flag) the submission. We run it at temperature 0.0 for deterministic output — the same submission produces the same result every time.
This is the kind of analysis that’s impossible with rules alone. The LLM understands context. It knows that “I’d love to discuss a partnership” on a contact form is probably spam, while the same phrase on a partnerships inquiry form is legitimate.
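Consuming the model's output is the simple part. A sketch of the decision step, assuming the LLM returns JSON with the confidence score and reason described above (the 0.7 threshold and field names are assumptions for illustration, not Postbox's actual values):

```python
import json

SPAM_CONFIDENCE_THRESHOLD = 0.7   # assumed cutoff, not the real setting

def classify(raw_llm_output: str) -> tuple[bool, str]:
    # Parse the model's JSON verdict and apply the confidence cutoff.
    verdict = json.loads(raw_llm_output)
    is_spam = verdict["confidence"] >= SPAM_CONFIDENCE_THRESHOLD
    return is_spam, verdict["reason"]
```

Because the model runs at temperature 0.0, the same submission yields the same JSON, so this classification is reproducible end to end.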
Pipeline order matters
Spam detection runs first in the processing pipeline — before auto-translation and before smart replies. This is deliberate. Translation costs credits. Smart replies cost credits. If a submission is spam, we don’t want to spend credits translating and replying to it.
The pipeline looks like this:
- Spam filtering — Standard or Intelligent
- Auto-translation — if the submission is in a foreign language
- Smart replies — if auto-reply is configured
If step 1 catches spam, steps 2 and 3 never run. Your credits don’t get wasted on junk.
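The short-circuit can be sketched in a few lines. Function names here are illustrative; the ordering and early return are the point:

```python
# Pipeline ordering sketch: spam filtering runs first and short-circuits
# the credit-consuming steps.
def process(submission: dict, is_spam, translate, smart_reply) -> str:
    if is_spam(submission):
        return "spam"          # steps 2 and 3 never run; no credits spent
    translate(submission)      # step 2: auto-translation (costs credits)
    smart_reply(submission)    # step 3: smart replies (costs credits)
    return "inbox"
```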
Spam folder, not spam deletion
Flagged submissions go to a spam folder. We don’t delete them. No spam filter is perfect, and false positives happen. You can review flagged submissions, unflag legitimate ones, and move them back to your inbox.
This is a deliberate design choice. Aggressive spam filters that silently delete submissions are worse than no filter at all — at least with no filter, you can manually sort through everything. With silent deletion, you’ll never know what you missed.
Why this matters
Most form backends punt on spam entirely. They give you a CAPTCHA widget, tell you to add it to your frontend, and wash their hands of the problem. That approach fails for three reasons:
- CAPTCHAs are client-side. If your data is coming from a script, an agent, or a curl command, there’s no browser to render a CAPTCHA in. Server-side filtering works regardless of the source.
- CAPTCHAs are friction. Every CAPTCHA is a user asking themselves “is filling out this form worth solving a puzzle?” Some percentage will say no. That’s lost data you’ll never get back.
- CAPTCHAs are defeatable. Services exist that solve CAPTCHAs for pennies. Bots use them routinely. A CAPTCHA is a speed bump, not a wall.
Honeypots, heuristics, content moderation, and LLM intelligence — layered together, running server-side, invisible to the sender — that’s how spam filtering should work. The bot thinks it won. Your inbox stays clean. Your users never notice a thing.
That’s the design philosophy behind everything we build at Postbox. The hard problems get solved on the backend so they stop being problems for everyone else.