A custom scorecard defines what good looks like for your team by explicitly setting the criteria you care about - such as accuracy, tone, or policy adherence. Scorecards work alongside Monitors: a Monitor defines which conversations get reviewed, and a scorecard defines how each one is evaluated. Monitors currently only evaluate Fin AI Agent conversations, not human teammates.
You can have multiple scorecards for different Monitors. Choose which scorecard to associate with a Monitor from the Monitor setup screen:
Note: Scorecards are available as part of the Pro add-on.
To create a scorecard
Go to Fin AI Agent > Analyze > Monitors and click Scorecards. You can use the out-of-the-box Fin Quality Scorecard or create your own by clicking + New scorecard:
Create a new scorecard criteria
Start by adding scorecard criteria: click New scorecard > + Criteria > Create new.
When creating a new criteria, work through the following steps:
1. Name the Criteria
Give the criteria a short, clear name (for example, Sentiment or Answer accuracy). This name appears in reports and is used to reference the criteria elsewhere.
2. Describe what is being evaluated
Add a clear description explaining what the criteria checks and how it should be scored. The description is the prompt the AI uses to score this criteria, and the more precise it is, the more accurately AI will evaluate conversations. It also helps human reviewers apply the same criteria consistently.
Tip: For help writing effective descriptions, see how to write effective Monitor and Scorecard Criteria.
3. Choose how the criteria is scored
Decide whether the criteria should be automatically scored with AI, or manually scored by human reviewers. You can mix AI-scored and human-scored attributes within the same scorecard.
Note: Scorecard criteria titles and descriptions are reusable. Once you have created a criteria, you can add it to multiple scorecards. Rating scores, however, are not carried over and must be set from scratch in each scorecard.
4. Define rating options
Add the possible rating values a reviewer or AI can select (for example: Good, Okay, Poor). Each attribute must have at least two rating options. For each rating option, you will:
Name the rating (short and clear)
Describe when it should be selected
Assign a score (for example, 100%, 50%, 0%) or mark it as Not scored
The score you assign determines how that rating contributes to the overall review score.
5. Choose whether to include it in the review score
You can toggle Include in review score on or off.
When enabled, this attribute contributes to the overall review score.
When disabled, the attribute is recorded for analysis and reporting, but does not affect the overall score.
In this example, a scorecard attribute has been created to evaluate escalation ease:
6. Enable Auto-review (optional)
You can automate the entire QA process for a scorecard by toggling on Auto-review scorecard.
When enabled:
If AI scores all criteria in the scorecard, the manual review step is skipped entirely.
If the AI gives a failing score, the conversation is automatically marked as Reviewed + fix needed and routed to the Follow-up actions needed queue.
Teammates can still manually override any AI score if they spot a discrepancy.
Tip: Auto-review works best on scorecards where all criteria are AI-scored. If any criteria requires a human reviewer, those conversations will still appear in the Unreviewed queue.
Configure your scorecard
After adding scorecard criteria, configure how they affect the overall review result.
Marking a scorecard criteria as critical
You can mark a criteria as Critical. If a critical criteria receives a failing rating, the entire review fails:
The overall review score becomes 0%
This overrides all weights
Not scored ratings exclude the criteria from the overall score and do not trigger failure
Critical criteria are useful for non-negotiable standards such as compliance requirements, safety or policy adherence, and escalation handling.
Scorecard criteria weighting
Each criteria can be assigned a weight to define its relative importance.
Weight must be an integer between 0 and 100
Higher weights increase the impact of that criteria on the overall review score
Weights only apply to criteria included in the review score. Use weights to reflect what matters most — for example, a higher weight on Accuracy than Efficiency if correctness is more important than speed.
Note: Weights are relative to each other, not fixed to a scale of 100. The total can add up to any number — what matters is the proportion each criteria contributes. A criteria with a weight of 25 out of a total of 50 contributes the same as one weighted at 50 out of 100.
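That proportionality can be seen with a quick calculation. This is a minimal illustration of relative weighting (not Intercom code; the function name is hypothetical):

```python
# Weights are normalized by their total, so only proportions matter.
def contribution(weight, all_weights):
    """Fraction of the overall score this criteria's weight carries."""
    return weight / sum(all_weights)

# A weight of 25 out of a 50-point total...
print(contribution(25, [25, 25]))      # 0.5
# ...carries the same share as 50 out of a 100-point total.
print(contribution(50, [50, 30, 20]))  # 0.5
```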
Adding a pass threshold
You can define a pass threshold — the minimum overall score required for a review to be considered passing. For example, if the pass threshold is 80%, any review scoring below 80% is marked as failed.
This is evaluated after weighted scoring, provided no critical criteria has already failed the review.
How the overall review score works
Each criterion is rated using its defined rating options.
Ratings contribute their assigned score (or are excluded if marked Not scored).
Included criteria are combined using their assigned weights.
If any critical criteria receives a failing rating, the overall review score becomes 0%.
The final score is compared against the pass threshold to determine whether the review passes or fails.
Here is an example of how three criteria combine into a final score:
Criteria | Rating selected | Rating score | Weight
--- | --- | --- | ---
Accuracy | Good | 100% | 60
Tone | Okay | 50% | 30
Efficiency | Good | 100% | 10

Overall score = (100×60 + 50×30 + 100×10) / (60 + 30 + 10) = 85%
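The steps above can be sketched as a short calculation. This is an illustrative model of the scoring rules described in this article, not Intercom's implementation; the function names are hypothetical:

```python
def overall_score(ratings, critical_failed=False):
    """ratings: list of (rating_score_percent, weight) pairs.
    A rating score of None means 'Not scored' and is excluded.
    A failing critical criteria forces the overall score to 0%."""
    if critical_failed:
        return 0.0
    # Keep only scored criteria included in the review score.
    scored = [(s, w) for s, w in ratings if s is not None]
    total_weight = sum(w for _, w in scored)
    if total_weight == 0:
        return 0.0
    # Weighted average of the rating scores.
    return sum(s * w for s, w in scored) / total_weight

def passes(score, threshold):
    """Compare the final score against the pass threshold."""
    return score >= threshold

# The worked example: Accuracy 100% @ 60, Tone 50% @ 30, Efficiency 100% @ 10
score = overall_score([(100, 60), (50, 30), (100, 10)])
print(score)              # 85.0
print(passes(score, 80))  # True
```

With an 80% pass threshold, this example review passes; if Accuracy were a critical criteria and received a failing rating, the score would drop to 0% regardless of the other ratings.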
Where to view scores
Once reviews are completed, scores are visible in both the conversation list and within each conversation.
In a Monitor, the conversation list shows the overall review score (percentage or Fail) alongside the individual criteria ratings as columns. This makes it easy to scan performance across conversations and spot failures or low scores.
When you open a conversation and go to the Score tab, you can see the assigned scorecard, review status, overall score, and the selected rating for each criteria. This view shows exactly how the final score was determined.
FAQs
Can I reuse scorecard criteria across multiple scorecards?
Yes, criteria titles and descriptions are reusable. Once you have created a criteria, you can add it to multiple scorecards. Note that rating scores are not carried over and must be set from scratch in each scorecard.
What happens if I do not attach a scorecard to a monitor?
The monitor will still flag conversations that match your criteria, but no scoring will take place. Reviewers will see flagged conversations without any scorecard criteria to fill in. To enable evaluation, attach a scorecard during monitor setup.
Can I mix AI-scored and manually scored criteria in the same scorecard?
Yes. You can choose on a per-criteria basis whether AI or a human reviewer handles scoring. Note that if Auto-review is enabled and any criteria requires manual scoring, those conversations will still appear in the Unreviewed queue.
What does a critical criteria do?
If a critical criteria receives a failing rating, the overall review score drops to 0% regardless of how other criteria scored. This is useful for non-negotiable standards — compliance, safety, or escalation handling — where a single failure should override everything else.









