Batch test lets you simulate how Fin will respond to real customer questions—before those responses reach actual customers. It helps you check content coverage, debug unexpected results, and refine Fin’s behavior across brands, users, languages, and automations.
Key benefits / use cases:
Validate Fin’s answers across multiple brands and user profiles before going live.
Diagnose and fix content gaps by inspecting the source, personality, and guidance behind each response.
Organize test runs into reusable groups to track changes over time.
See when automations trigger, including Fin tasks, actions, and custom answers.
Control language settings and ensure real-time translations work as expected.
Batch test is designed for all customers who use Intercom to support conversations, whether you're an existing Fin customer or simply exploring its potential.
To access Batch test for Fin, teammates must have a full seat and their conversation access permission set to "All conversations".
How to use Batch test
1. Generate your questions
Go to Fin AI Agent > Test from the main navigation and choose how you want to add questions.
The options for adding questions for a batch test are:
Generate from past conversations
All conversations
From a specific AI topic
Add questions manually
Upload a CSV file of questions
These options let you test Fin against real, relevant customer queries before going live, helping you spot content gaps and optimize responses for different audiences and scenarios.
Note:
You can upload up to 50 questions per test group.
You must have a minimum of 1 conversation in the last 90 days for the Generate from inbox option to appear.
The "Can access people, companies, and account lists" and "Can access lead and user profile pages" permissions are required as well.
How to create your list of customer questions
Each approach to generating question groups comes with its own advantages and ideal use cases.
Generate questions from past conversations
You can automatically generate up to 50 questions based on your most recent customer conversations (from the last 30–90 days), so the question set accurately represents what customers are asking about right now.
This is a quick way to get a list of questions that accurately reflects current customer needs.
Note: This option only appears if your workspace has at least one conversation from the last 90 days.
Generate questions by topic
This method uses AI topics to generate topic-specific question sets based on real customer queries. It’s particularly useful when you want to focus on specific subject areas and understand how well they’re being handled.
Common use cases include:
Preparing for seasonal spikes in customer questions (for example, end-of-year tax queries).
Evaluating Fin’s performance after enabling new content on specific topics.
Prioritizing high-volume or low-CSAT topics identified through the Topics Explorer.
Note: This method requires AI topic categories to be available in your workspace. AI Topics Explorer relies on recent conversation data to generate insights.
Add questions manually
If you don’t want to generate your questions from existing data, you can copy and paste a list of questions or add them one by one. This method gives you full control over the exact content and phrasing with no dependency on historical conversations.
Common use cases include:
You have a pre-prepared question list with compliance-critical scenarios and/or edge cases that don’t appear in historical conversation data.
You’re testing responses about new features or policies with no past conversation volume.
Upload a CSV file of questions
This method lets you upload a set of questions generated elsewhere, such as from another support platform. You simply need a .csv file with a single column containing up to 50 test questions (see the sketch after the list below).
Common use cases include:
You need to bulk load a curated list (e.g., per topic, per audience, per region).
You need to run repeatable evaluations across teams.
You’re preparing for a launch or seasonal spike and want a single upload to cover 30–50 canonical questions.
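If you're assembling the upload file programmatically, here's a minimal sketch of producing a valid single-column .csv in Python. The file name and sample questions are placeholders; the only format requirements stated above are a single column and a maximum of 50 questions.

```python
# A minimal sketch of preparing a single-column question file for upload.
# Assumptions: the file name and sample questions are illustrative; the
# only stated format requirements are one column and at most 50 questions.
import csv

questions = [
    "How do I reset my password?",
    "Can I change my billing plan mid-cycle?",
    "Do you offer refunds on annual subscriptions?",
    # ...add up to 50 questions total
]

assert len(questions) <= 50, "Batch test accepts at most 50 questions per upload"

with open("batch_test_questions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for question in questions:
        writer.writerow([question])  # one question per row, single column
```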
By selecting the right question-generation method for your needs, you can evaluate and refine Fin’s performance against relevant customer scenarios, helping you identify gaps and improve responses over time.
2. Configure your test
By default, your test questions run using a generic preview user that can access all content and trigger all automations without targeting. To configure these test settings, click the Manage dropdown at the top of the page and then select Settings.
Test as
Choose how to test Fin by simulating users or audiences. You can choose to test Fin as:
User or Lead - Test Fin as a user or lead, with their available content, guidance, and automations (tasks, data connectors, and custom answers). Select a user to see how Fin tailors responses based on user attributes such as language, location, and plan. This is perfect for validating setups before launch or troubleshooting after deployment. You can pick from a dropdown list of contacts in your workspace and simulate that specific user, which is particularly useful if you already have test users set up. You can also search for and select real users or leads in your workspace.
Audience - Test Fin as audience(s), with audience-specific content and guidance. Automations will not run.
Preview User - Test Fin with all live content, guidance, and automations included.
Brand
If you've set up multiple brands in your workspace, you can select the brand you'd like to test. For example, if content differs between brands, you can check that Fin's responses pick up those nuances.
Click Confirm to re-run the test with these settings.
3. Run the test and inspect responses
Once you run a test, you can view Fin’s generated response for each question.
Use the "Evaluate answer" panel to see:
Personality settings (e.g. tone of voice)
Guidance (if configured)
Content sources Fin pulled from
Automations (e.g., Fin Tasks, Data connectors, or Custom Answers triggered)
While you can’t directly edit automations triggered in a Batch test, the panel links out to the relevant configuration screen, where you can review them or quickly make adjustments.
4. Add a new question
To add a new question, go to Fin AI Agent > Test from the main navigation. Then, click the + add question button. A dropdown menu will appear, offering you various options for how you'd like to add your questions.
Generate more from all conversations - You can create up to 50 questions from your past conversations.
Generate more by Topic - If AI Insights Topic categories are available in your workspace, questions can be automatically generated based on your conversation topics.
Upload more from a CSV - Import up to 50 questions at once by uploading a CSV file.
Add more manually - You can either copy and paste a list of questions or add them one by one.
5. Adjust language settings and translations
To help you get accurate responses in the correct language, Batch test makes it easy to check and update your Language support and Real-time translation settings during testing.
If you’re seeing a response in a different language than expected (for example, a question in Russian gets a reply in English), Batch test displays clear messages in yellow explaining why this might be happening and how to fix the issue.
Once you enable the appropriate settings and refresh the answer, Fin's responses will appear in the correct language.
Depending on your current configuration, you might need to enable just one setting or both. You can turn them on simultaneously and then return to this screen.
6. Rate Fin's answers
Review Fin’s responses and assess whether they meet expectations, checking for factual accuracy, appropriate tone, and whether the correct data was accessed or the right procedures were triggered.
If you select Acceptable, you can add an internal note to reference later when improving Fin’s responses. These notes will be included in the downloadable CSV report for the test.
If you select Poor, choose a reason for your rating from the list. Conduct a root cause analysis to determine why Fin’s answer did not meet expectations so you can make immediate improvements.
The possible choices are as follows:
Didn't use the correct content
Didn't clarify the customer's question
Used the content incorrectly
Tone wasn't right
Answer length is too long or short
Didn't speak in the right language
Other
Note: These ratings don’t train Fin directly. You need to apply a fix via Improve this answer or by updating content/guidance, then re-run the test.
Read below for indicators to help you conduct root cause analysis for a Poor-rated answer.
Didn’t use the correct content
Use when the answer is factually wrong because Fin relied on an irrelevant, outdated, or incomplete source.
Indicators:
Citations point to the wrong page
Answer references an old policy or different product/plan
Key facts are missing or incorrect
Didn’t clarify the customer’s question
Use when the question is ambiguous or missing key details and Fin answered prematurely instead of asking for context.
Indicators:
Multi‑intent queries (“refund + upgrade”)
Vague terms (“issue with login”)
Dependency on customer specifics (plan, region, platform) that weren’t gathered.
Used the content incorrectly
Use when the underlying content is correct, but Fin interpreted, combined, or ordered it the wrong way.
Indicators:
Steps out of sequence
Applying rules to the wrong audience/plan
Mixing two procedures
Leaving out a prerequisite that’s present in the source
Tone wasn’t right
Use when the response doesn’t match your brand voice or the situation’s sensitivity.
Indicators:
Too casual/cheerful for sensitive topics (billing, security)
Overly formal for simple FAQs
Missing empathy or reassurance
Otherwise does not match your brand voice
Answer length is too long or short
Use when verbosity doesn’t fit the intent or channel.
Indicators:
Wall‑of‑text responses for quick FAQs
Multi-step answers that are too brief for important compliance issues
Key details buried or omitted
Didn’t speak in the right language
Use when the response language doesn’t match the customer or region.
Indicators:
Answer appears in English when the test audience setting is non‑English (or vice versa)
Mixed language responses in the same thread
Other
Use when the issue doesn’t fit the above categories (leave a clear note).
Examples:
Automation triggered (or didn’t) as expected
Brand/audience targeting mismatch
Missing data connector
A UI element (e.g. a screenshot) from the content is needed for illustrative purposes
7. Refine Fin's answers
Once you’ve selected the reason that best matches your rating, click Improve this answer to view dynamic recommendations tailored to the root cause you selected for that specific answer.
Common recommendations include:
Adding or revising guidance to:
Shape tone of voice and response length
Ensure Fin asks clarifying questions for ambiguous queries
Establish clear escalation rules for high-risk areas
Define preferred source content for specific intents or brands
Ensure compliance and policy requirements are applied correctly
Adding a snippet for:
Fast, precise, and private knowledge updates
Immediate stopgaps for outdated or incorrect content
Internal-only details that should not be publicly shared
Seasonal or time-bound information (e.g. promotions)
Narrow edge cases that require specific wording or parameters
Creating new articles or updating existing ones for:
Core FAQs widely needed by customers
Complex procedures that benefit from headings, numbered steps, tables, or images
Content that should be referenced for transparency and customer self-serve value
Topics with repeated snippet fixes that should instead be consolidated into an article
Enabling Fin’s access to specific articles
Updating supported languages to:
Provide support in more languages
Detect incoming language effectively
Enable real-time translation of help content in your default language
Note: If you select Other as the reason for a Poor-rated answer, you may not receive a recommended solution. In some cases, the appropriate solution may involve updating data connectors or creating a procedure that enables Fin to take action on a customer’s behalf.
Example
This example demonstrates a hypothetical root cause analysis and solution for a Poor-rated question.
Question: “How do I add a new user without paying?”
What happened: Fin answered with generic steps to invite a user, but missed the plan-based billing nuance.
Root cause analysis: Answer details show Fin relied on the “Invite user” article which does not cover plan/billing specifics for free vs paid seats (knowledge gap).
Select reason: "Didn't use the correct content."
Suggestions: Update “Invite user” article to include relevant information; Add guidance to ensure Fin asks clarifying questions to identify the relevant plan type before answering.
8. Filter a test and make bulk updates
Filter a test by Answer status:
Any - all questions added to the test group.
Answered questions - only questions where Fin provided a direct answer, disambiguation, or automation (e.g. workflow handover, Fin Task, etc.)
Unanswered questions - only questions where Fin couldn't provide an answer or trigger any follow-up action.
Filter a test by Answer rating:
Any
Good
Acceptable
Poor
Make bulk updates using the checkboxes to the left of each question. You can select multiple Q+A pairs and then download those questions, delete them, create a test group from them, or update their associated answers.
9. Save and organize test groups
You can use test groups to organize and save up to 50 questions and responses in the testing area. This is the maximum number you can upload at once, making it a handy way to group related questions for easier review and reuse. Each test group retains the settings you used during testing—like simulating a specific user—so you can re-run tests with the same configuration anytime.
Click Manage at the top of the page to select the option + Create new group.
From there, you have several flexible options to populate your new test group with questions:
Generate from inbox: Pull questions directly from your existing conversations, either from all conversations or by specific topics.
Manually add: Enter questions one by one.
Upload a CSV file: Import multiple questions quickly using a CSV file.
Click Manage to rename or delete a test group.
Click the name of your test group to create additional groups, or select a different group you've previously saved.
Test groups are especially useful for organizing questions by topic. For example, if you’ve tested and reviewed a batch of insurance claim questions, you can save them as a group labeled “Insurance Claim Questions.” This makes it simple to revisit, rerun, or evaluate that content later.
They’re also great for managing team collaboration. Since Batch test is a workspace-level feature, using test groups lets teammates keep their test runs separate. Instead of deleting previous tests to make space, you can save them into groups to preserve everyone’s work.
10. Download a CSV report
You can generate a CSV file compiling all questions, answers, your applied ratings, and the sources used for each response. Simply click Manage at the top of the page and then select Get CSV report.
This is useful for sharing results with your wider team, or for giving senior leaders visibility into Fin's performance.
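If you'd like to slice the report before sharing it, a short script can summarize ratings at a glance. This is a minimal sketch that assumes hypothetical column names ("Question" and "Rating"); check the header row of your actual export and adjust accordingly.

```python
# A minimal sketch of summarizing a downloaded Batch test report.
# Assumptions: the file name and column names ("Question", "Rating") are
# hypothetical; check the header row of your actual export and adjust.
import pandas as pd

report = pd.read_csv("batch_test_report.csv")

# Count how many answers received each rating (Good / Acceptable / Poor).
print(report["Rating"].value_counts())

# List the Poor-rated questions for follow-up root cause analysis.
poor = report[report["Rating"] == "Poor"]
print(poor["Question"].tolist())
```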
FAQs
What’s not testable in Batch test?
Fin Vision (image recognition) isn’t supported in the Batch test section yet.
Will I be charged for resolutions when using Batch test?
No, the Fin AI Agent > Test page is free to use and you won't be charged for AI answers generated through the Batch test. 👌
Can I generate test questions automatically?
Yes, you can generate test questions automatically. However, for the "Generate from inbox" option to appear, you must have a minimum of 1 conversation in the last 90 days in your workspace.
Do answer ratings train Fin?
No. Batch test is strictly for quality assurance—ratings help you identify areas to improve, not retrain Fin.
Can I test different languages?
Yes. Batch test checks and flags any missing language or translation settings so you can resolve them easily.
Can I simulate different users?
Yes. Batch test allows you to select a user or lead in your workspace and see how Fin would respond based on their specific user attributes.
What is the difference between resetting a test and re-running a test?
Resetting the test will allow you to choose another batch, either from conversation history or an upload. Re-running the test will re-generate answers based on any content changes or answer ratings you provided in the batch so you can continue to refine the performance.
Why is the option "Generate from inbox" disabled?
The "Generate from inbox" option is disabled when there aren't enough conversations or conversations with relevant topics in your inbox.