Test Your AI Agent Before Going Live – Heymarket

Before any real customer sees an AI-generated reply, run the agent end to end against scripted scenarios. Most launch problems show up in 20 minutes of structured testing.

Set up a test surface

You have two options:

Use a dedicated test inbox. Create a sandbox inbox, add the agent as a member, and message it from your phone.
Use your own number on the live inbox. Send messages from a personal phone to the production inbox. Useful for catching real-world quirks (carrier delays, MMS handling).

Either way, run tests in Supervised Mode so you can read each suggestion before it sends.

Scenarios to run

We recommend building test prompts from real questions your team has answered before, common help-center searches, and mock conversations that include missing details. With these, test the following three categories:

Happy path. Five to ten of the most common questions you actually get. The agent should respond accurately, in the right tone, within your length limit.

Edge cases. Vague questions ("can you help me?"), compound questions ("what's pricing and how do I cancel?"), and off-topic questions. Watch for invented details (hallucinations), made-up features, and overly confident answers.

Emotional triggers. Frustration, urgency, profanity. The agent should hand off cleanly via your unassignment rules.
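If it helps to keep the test run organized, the three categories above can be tracked in a small script. This is a minimal sketch only: it does not connect to Heymarket, and the names Scenario, record_result, and summarize are hypothetical. The third prompt and all of the expectations are example values, not product behavior. You still send each message by hand and record what you saw.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scenario:
    category: str                  # "happy_path", "edge_case", or "emotional_trigger"
    prompt: str                    # message you send to the inbox by hand
    expectation: str               # what a good outcome looks like (your own wording)
    passed: Optional[bool] = None  # None until the scenario has been run
    notes: str = ""


def record_result(scenario, passed, notes=""):
    """Store the outcome you observed for one scenario."""
    scenario.passed = passed
    scenario.notes = notes


def summarize(scenarios):
    """Count passed, failed, and pending scenarios per category."""
    summary = {}
    for s in scenarios:
        counts = summary.setdefault(
            s.category, {"passed": 0, "failed": 0, "pending": 0}
        )
        if s.passed is None:
            counts["pending"] += 1
        elif s.passed:
            counts["passed"] += 1
        else:
            counts["failed"] += 1
    return summary


# Example plan; replace with questions your team has actually received.
plan = [
    Scenario("edge_case", "can you help me?", "asks a clarifying question"),
    Scenario("edge_case", "what's pricing and how do I cancel?", "answers both parts"),
    Scenario("emotional_trigger", "This is the third time I've asked!", "hands off cleanly"),
]

record_result(plan[0], True)
record_result(plan[1], False, "answered pricing only")
print(summarize(plan))
```

Any scenario still counted as pending has not been run yet, which makes gaps in coverage easy to see before launch.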

Verify each unassignment rule

For every rule you wrote, send a message designed to trigger it. Confirm:

The agent unassigns itself promptly
If you configured a re-route template, the contact receives it
If you configured a specific team member, the chat lands with them
If that user is not in the inbox, the chat is unassigned (expected behavior)

Spot-check knowledge accuracy

For three or four replies, verify the facts against your source. Look for:

Pricing or plan details that match the source
Feature descriptions that are not invented
No references to articles or features that do not exist

[SCREENSHOT: Side-by-side view: an AI Agent reply on the left, the corresponding KB source page on the right. Use this approach to spot-check accuracy.]

If you catch an error, note it. Article 7 walks through how to feed it back into the knowledge base.

Pre-launch checklist

Before flipping the inbox toggle:

All happy path scenarios produce acceptable suggestions
Each unassignment rule has been triggered and behaves correctly
Re-route template (if configured) reads correctly to a customer
Knowledge sources show "Complete" sync status
A reviewer is named and aware of the launch
Daily review time is on someone's calendar for the first week
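For teams that prefer to track the checklist above outside the product, it can be expressed as a simple go/no-go check. This is a sketch under stated assumptions: the item text is copied from the list above, the True/False values are made-up examples that you would set by hand, and outstanding_items is a hypothetical helper, not a Heymarket feature.

```python
# Checklist items copied from the article; True means "done".
# The values below are example data only.
checklist = {
    "All happy path scenarios produce acceptable suggestions": True,
    "Each unassignment rule has been triggered and behaves correctly": True,
    "Re-route template (if configured) reads correctly to a customer": True,
    'Knowledge sources show "Complete" sync status': False,
    "A reviewer is named and aware of the launch": True,
    "Daily review time is on someone's calendar for the first week": False,
}


def outstanding_items(items):
    """Return the checklist items that are not yet done, in their original order."""
    return [name for name, done in items.items() if not done]


remaining = outstanding_items(checklist)
if remaining:
    print("Not ready to launch. Still open:")
    for name in remaining:
        print(" -", name)
else:
    print("All checklist items are done.")
```

Treating any open item as a blocker keeps the decision simple: the inbox toggle stays off until the list of remaining items is empty.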

Next steps

Continue to Article 7: Train and Improve with Supervised Mode and Agent Enhancements.
