Creating and configuring Scorecards

Learn how to create and customize scorecards to evaluate conversations and maintain high quality standards.

Written by Dawn
Updated today

A custom scorecard defines what good looks like for your team by explicitly setting the criteria you care about, such as accuracy, tone, or policy adherence. Scorecards work alongside Monitors: a Monitor defines which conversations get reviewed, and a scorecard defines how each one is evaluated. Monitors currently evaluate only Fin AI Agent conversations, not human teammates.

You can have multiple scorecards for different Monitors. Choose which scorecard to associate with a Monitor from the Monitor setup screen.

Note: Scorecards are available as part of the Pro add-on.


To create a scorecard

Go to Fin AI Agent > Analyze > Monitors and click Scorecards. You can use the out-of-the-box Fin Quality Scorecard or create your own by clicking + New scorecard.


Create a new scorecard criterion

Start by adding scorecard criteria: click New scorecard > + Criteria > Create new.

When creating a new criterion, work through the following steps:

1. Name the criterion

Give the criterion a short, clear name (for example, Sentiment or Answer accuracy). This name appears in reports and is used as a reference.

2. Describe what is being evaluated

Add a clear description explaining what the criterion checks and how it should be scored. The description is the prompt the AI uses to score the criterion: the more precise it is, the more accurately the AI will evaluate conversations. It also helps human reviewers apply the criterion consistently.

Tip: For help writing effective descriptions, see how to write effective Monitor and Scorecard Criteria.

3. Choose how the criterion is scored

Decide whether the criterion should be automatically scored by AI or manually scored by human reviewers. You can mix AI-scored and human-scored criteria within the same scorecard.

Note: Scorecard criteria titles and descriptions are reusable. Once you have created a criterion, you can add it to multiple scorecards. Rating scores cannot be reused, however, and must be set from scratch in each scorecard.

4. Define rating options

Add the possible rating values a reviewer or the AI can select (for example: Good, Okay, Poor). Each criterion must have at least two rating options. For each rating option, you will:

  • Name the rating (short and clear)

  • Describe when it should be selected

  • Assign a score (for example, 100%, 50%, 0%) or mark it as Not scored

The score you assign determines how that rating contributes to the overall review score.
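
To make the structure concrete, here is a minimal sketch of a criterion and its rating options as plain data. This is an illustration only, not Intercom's actual data model; the names and values are hypothetical.

    # Hypothetical sketch of a criterion with rating options -- illustrative
    # only, not Intercom's data model. Scores are fractions of the weight.
    escalation_ease = {
        "name": "Escalation ease",
        "description": "Did Fin hand off to a teammate promptly when asked?",
        "rating_options": [
            {"name": "Good", "score": 1.0},         # contributes 100% of the weight
            {"name": "Okay", "score": 0.5},         # contributes 50%
            {"name": "Poor", "score": 0.0},         # contributes 0%
            {"name": "Not scored", "score": None},  # excluded from the overall score
        ],
    }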

5. Choose whether to include it in the review score

You can toggle Include in review score on or off.

  • When enabled, the criterion contributes to the overall review score.

  • When disabled, the criterion is recorded for analysis and reporting, but does not affect the overall score.

In this example, a scorecard criterion has been created to evaluate escalation ease.

6. Enable Auto-review (optional)

You can automate the entire QA process for a scorecard by toggling on Auto-review scorecard.

When enabled:

  • If AI scores all criteria in the scorecard, the manual review step is skipped entirely.

  • If the AI gives a failing score, the conversation is automatically marked as Reviewed + fix needed and routed to the Follow-up actions needed queue.

  • Teammates can still manually override any AI score if they spot a discrepancy.

Tip: Auto-review works best on scorecards where all criteria are AI-scored. If any criterion requires a human reviewer, those conversations will still appear in the Unreviewed queue, as shown in the sketch below.
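
The routing above can be summarized in a few lines. This is an illustrative sketch of the decision flow, not product code; the queue names come from the bullets above.

    # Illustrative sketch of Auto-review routing -- not product code.
    def route_conversation(all_criteria_ai_scored: bool, review_passed: bool) -> str:
        if not all_criteria_ai_scored:
            return "Unreviewed queue"  # a human must still score some criteria
        if review_passed:
            return "Reviewed"          # manual review step skipped entirely
        return "Reviewed + fix needed (Follow-up actions needed queue)"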


Configure your scorecard

After adding scorecard criteria, configure how they affect the overall review result.

Marking a scorecard criterion as critical

You can mark a criterion as Critical. If a critical criterion receives a failing rating, the entire review fails:

  • The overall review score becomes 0%

  • This overrides all weights

  • Not scored ratings exclude the criterion from the overall score and do not trigger failure

Critical criteria are useful for non-negotiable standards such as compliance requirements, safety or policy adherence, and escalation handling.

Scorecard criteria weighting

Each criterion can be assigned a weight to define its relative importance.

  • Weight must be an integer between 0 and 100

  • Higher weights increase the impact of that criterion on the overall review score

Weights only apply to criteria included in the review score. Use weights to reflect what matters most — for example, a higher weight on Accuracy than Efficiency if correctness is more important than speed.

Note: Weights are relative to each other, not fixed to a scale of 100. The total can add up to any number — what matters is the proportion each criterion contributes. A criterion with a weight of 25 out of a total of 50 contributes the same as one weighted at 50 out of 100.
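
As a quick check of that proportionality, the two weightings below produce the same result (a standalone sketch; the ratings and weights are illustrative):

    # Weights are proportional: 25 and 25 (total 50) behave exactly like
    # 50 and 50 (total 100).
    def weighted_score(ratings_and_weights):
        total = sum(weight for _, weight in ratings_and_weights)
        return sum(score * weight for score, weight in ratings_and_weights) / total

    print(weighted_score([(100, 25), (50, 25)]))  # 75.0
    print(weighted_score([(100, 50), (50, 50)]))  # 75.0, identical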

Adding a pass threshold

You can define a pass threshold — the minimum overall score required for a review to be considered passing. For example, if the pass threshold is 80%, any review scoring below 80% is marked as failed.

This is evaluated after weighted scoring, provided no critical criterion has already failed the review.


How the overall review score works

  1. Each criterion is rated using its defined rating options.

  2. Ratings contribute their assigned score (or are excluded if marked Not scored).

  3. Included criteria are combined using their assigned weights.

  4. If any critical criterion receives a failing rating, the overall review score becomes 0%.

  5. The final score is compared against the pass threshold to determine whether the review passes or fails.

Here is an example of how three criteria combine into a final score:

Criteria       Rating selected   Rating score   Weight
Accuracy       Good              100%           60
Tone           Okay              50%            30
Efficiency     Good              100%           10

Overall score = (100×60 + 50×30 + 100×10) / (60 + 30 + 10) = 85%
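
The same calculation can be written out as a short sketch that also applies the critical-criterion override and the pass threshold described above (illustrative code, not part of the product; Not scored ratings are omitted for brevity):

    # Recomputes the example: weighted average, critical override, pass threshold.
    criteria = [
        # (name, rating score in %, weight, is_critical, rating_failed)
        ("Accuracy",   100, 60, False, False),
        ("Tone",        50, 30, False, False),
        ("Efficiency", 100, 10, False, False),
    ]
    PASS_THRESHOLD = 80  # the example threshold from the section above, in percent

    if any(critical and failed for _, _, _, critical, failed in criteria):
        overall = 0.0  # a failing critical criterion zeroes the review, overriding weights
    else:
        total_weight = sum(weight for _, _, weight, _, _ in criteria)
        overall = sum(score * weight for _, score, weight, _, _ in criteria) / total_weight

    print(f"Overall score: {overall:.0f}%")                 # Overall score: 85%
    print("Pass" if overall >= PASS_THRESHOLD else "Fail")  # Pass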


Where to view scores

Once reviews are completed, scores are visible in both the conversation list and within each conversation.

In a Monitor, the conversation list shows the overall review score (a percentage, or Fail) alongside the individual criteria ratings as columns. This makes it easy to scan performance across conversations and spot failures or low scores.

When you open a conversation and go to the Score tab, you can see the assigned scorecard, review status, overall score, and the selected rating for each criterion. This view shows exactly how the final score was determined.


FAQs

Can I reuse scorecard criteria across multiple scorecards?

Yes, criteria titles and descriptions are reusable. Once you have created a criterion, you can add it to multiple scorecards. Note that rating scores cannot be reused and must be set from scratch in each scorecard.

What happens if I do not attach a scorecard to a monitor?

The monitor will still flag conversations that match your criteria, but no scoring will take place. Reviewers will see flagged conversations without any scorecard criteria to fill in. To enable evaluation, attach a scorecard during monitor setup.

Can I mix AI-scored and manually scored criteria in the same scorecard?

Yes. You can choose on a per-criterion basis whether AI or a human reviewer handles scoring. Note that if Auto-review is enabled and any criterion requires manual scoring, those conversations will still appear in the Unreviewed queue.

What does a critical criterion do?

If a critical criterion receives a failing rating, the overall review score drops to 0% regardless of how the other criteria scored. This is useful for non-negotiable standards — compliance, safety, or escalation handling — where a single failure should override everything else.

