Fin Vision is a built-in capability of Fin AI Agent that lets it analyze and understand images sent by customers directly within chat or email conversations. This includes screenshots, photos, and images of documents (e.g., scanned pages or photos of receipts and forms).

There’s no need to enable or configure anything, and there’s no additional cost.

Fin Vision helps:

- Diagnose issues faster.
- Eliminate the need for lengthy customer explanations.
- Extract and understand visual content like error messages, receipts, product defects, and more.

How Fin Vision works

Fin Vision uses multimodal large language models (LLMs) to analyze images sent by customers in chat or email conversations.

When a customer shares an image, Fin converts it into a structured textual description that becomes part of the conversation context. This description may include:

- Extracted text (OCR) from the image.
- UI elements and labels visible in screenshots.
- Reference numbers and product details such as order IDs or error codes.
- Context-aware insights derived from what’s shown in the image.
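
To make the idea of a structured textual description more concrete, here is a minimal Python sketch of what such a record could look like. Everything in it is hypothetical: the `ImageDescription` class, its field names, and the `to_context` helper are illustrative assumptions, not Fin's internal format or an Intercom API.

```python
from dataclasses import dataclass, field


@dataclass
class ImageDescription:
    """Hypothetical shape of an image description (not Fin's real format)."""

    extracted_text: str = ""
    ui_elements: list[str] = field(default_factory=list)
    reference_numbers: dict[str, str] = field(default_factory=dict)
    insights: list[str] = field(default_factory=list)

    def to_context(self) -> str:
        """Flatten the description into plain text for the conversation context."""
        parts = []
        if self.extracted_text:
            parts.append(f"Text in image: {self.extracted_text}")
        if self.ui_elements:
            parts.append("UI elements: " + ", ".join(self.ui_elements))
        for name, value in self.reference_numbers.items():
            parts.append(f"{name}: {value}")
        parts.extend(self.insights)
        return "\n".join(parts)


# Example values are invented for illustration.
description = ImageDescription(
    extracted_text="Payment failed (E-402)",
    ui_elements=["Retry button", "Billing tab"],
    reference_numbers={"error_code": "E-402"},
    insights=["The customer is on the billing page."],
)
print(description.to_context())
```

The point of the sketch is only that an image ends up as text, which is why Fin can treat it like any other part of the conversation.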

This visual understanding allows Fin to reason about images the same way it reasons about written customer messages.

With this context, Fin can:

- Search your knowledge base more effectively.
- Resolve Tasks that depend on visual inputs.
- Provide actionable answers grounded in what the customer actually sees.

Fin Vision applies multimodal understanding in two ways:

- To interpret images that customers send in a conversation.
- To evaluate images in your support content when deciding whether they would help answer a customer’s question.

Understanding image analysis vs. image replies

Fin Vision focuses on analyzing images sent by customers.

Fin may also include images from your existing support content in its replies. When deciding whether to include an image, Fin analyzes the image itself using multimodal models, alongside the surrounding passage context and the answer it plans to send.

Note:

- Fin never generates images. It only uses images that already exist in your content.
- Images appear after the text reply, not inline.
- Fin does not use image metadata or alt text when selecting images.
- Fin can only include images from content sources that preserve image data.

Ways to use Fin Vision

Example use cases by industry:

FinTech
- Error troubleshooting: Screenshots of failed transfers or login issues help Fin provide targeted support.
- Fraud alert review: Fin helps identify phishing screenshots or suspicious activity.

SaaS
- Troubleshooting UI bugs: Customers share screenshots of errors or unexpected UI behavior; Fin extracts error messages and provides fixes.
- Onboarding help: Fin can assist customers through unclear UI flows based on shared screenshots.
- License verification: Fin reads license keys or account numbers from uploaded invoices.

Ecommerce
- Return/refund validation: Customers upload images of damaged or incorrect products; Fin evaluates eligibility based on Task instructions.
- Shipping issues: Customers share photos of packaging or contents; Fin determines missing items or packaging damage.
- Invoice processing: Fin extracts order numbers and dates from receipts or packing slips.

Gaming/Gambling
- Bug reporting: Players send screenshots of glitches or crashes; Fin interprets the visuals and logs issues.
- Withdrawal issues: Customers upload screenshots of failed transactions; Fin pulls timestamps, amounts, and transaction IDs.
- Bet slip verification: Fin reads and confirms bet slip details from uploaded images.

Maximizing Fin Vision

Fin Vision works best when combined with Fin Guidance, which lets you define how Fin should act on visual information.

Use Fin Vision with Fin Guidance

1. Reading and Interpreting Receipts

Scenario:

A customer uploads a photo of a purchase receipt and asks, "Can you help me with a refund for this item?"

How Fin Vision and Guidance Work Together:

- Fin Vision extracts key details from the image, such as the item name, purchase date, and total amount.
- Fin Guidance provides custom instructions to Fin, such as: "If a customer asks about a refund and uploads a receipt, check that the purchase date is within 30 days. If so, guide them through the refund process. If not, politely explain the refund policy."

Result:
Fin can automatically verify eligibility and respond with the correct next steps, referencing the extracted receipt details.
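
Fin Guidance is written in natural language, not code, but the branching in the example instruction can be sketched in Python to show the decision it describes. This is an illustration only: `refund_decision`, `REFUND_WINDOW`, and the returned labels are hypothetical names, and treating day 30 as still eligible is an assumption about how "within 30 days" should be read.

```python
from datetime import date, timedelta

# Assumption: mirrors the 30-day rule in the example guidance, inclusive of day 30.
REFUND_WINDOW = timedelta(days=30)


def refund_decision(purchase_date: date, today: date) -> str:
    """Return which branch of the example guidance applies."""
    if purchase_date > today:
        # A future date suggests the receipt was misread, so ask the customer.
        return "ask_for_clarification"
    if today - purchase_date <= REFUND_WINDOW:
        return "guide_through_refund"
    return "explain_refund_policy"


print(refund_decision(date(2024, 5, 1), date(2024, 5, 20)))
```

Spelling out edge cases like a misread date in your guidance helps Fin respond sensibly when the receipt image is hard to read.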

2. Bug Reporting with Screenshots

Scenario:
A user submits a screenshot showing an error message in the app and says, "I'm getting this error—what should I do?"

How Fin Vision and Guidance Work Together:

- Fin Vision analyzes the screenshot to identify the error code or message.
- Fin Guidance instructs Fin to: "If an error code is detected in a screenshot, search the help center for that code and provide the relevant troubleshooting steps."

Result:
Fin can quickly match the error to known issues and deliver targeted support, reducing back-and-forth.
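
As a rough illustration of what "identifying an error code" in extracted screenshot text involves, the sketch below pulls code-like tokens out of a string with a regular expression. It is not how Fin Vision works internally. The `find_error_codes` helper and the assumed code format (one to five capital letters, an optional hyphen, then three to five digits) are hypothetical, and your product's codes may look different.

```python
import re

# Assumed format, e.g. "ERR-4012" or "E502". Adjust to your own error codes.
ERROR_CODE_PATTERN = re.compile(r"\b[A-Z]{1,5}-?\d{3,5}\b")


def find_error_codes(extracted_text: str) -> list[str]:
    """Return unique code-like tokens in order of first appearance."""
    codes: list[str] = []
    for match in ERROR_CODE_PATTERN.findall(extracted_text):
        if match not in codes:
            codes.append(match)
    return codes


text = "Something went wrong. Error ERR-4012: upload failed. (ERR-4012)"
print(find_error_codes(text))
```

If your help center articles mention error codes in the same form customers see on screen, Fin has a better chance of matching the screenshot to the right article.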

3. Device Identification for Support

Scenario:
A customer uploads a photo of their device and asks, "Is my device compatible with your service?"

How Fin Vision and Guidance Work Together:

- Fin Vision identifies the device make and model from the image.
- Fin Guidance tells Fin: "If a device model is recognized, check the compatibility list. If compatible, confirm and share setup instructions. If not, explain the limitations."

Result:
Fin provides a personalized answer based on the actual device, improving accuracy and customer satisfaction.

4. Document Verification

Scenario:
A user uploads a photo of their ID for account verification.

How Fin Vision and Guidance Work Together:

- Fin Vision extracts the name, date of birth, and document type.
- Fin Guidance instructs Fin to: "If the uploaded document is a valid ID and matches the account details, proceed with verification. If not, request a clearer image or additional documentation."

Result:
Fin can automate parts of the verification process, reducing manual review.

Guidance strategies

- Conditional Logic: Fin Guidance can set rules based on what Fin Vision detects (e.g., "If the receipt is older than 30 days, do X").
- Fallbacks: If Fin Vision cannot extract needed information, Guidance can instruct Fin to ask the customer for clarification or a better image.
- Personalization: Guidance can tailor responses based on visual context, making interactions feel more human and relevant.
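
The fallback strategy can be pictured as a simple completeness check: if any detail the workflow needs is missing from what was read off the image, ask for a better one instead of guessing. The Python sketch below is a hypothetical illustration of that logic. The `next_step` function, the field names, and the reply wording are assumptions, and in practice you would express this as natural-language guidance.

```python
def next_step(extracted: dict[str, str], required: list[str]) -> str:
    """Proceed only when every required field was read from the image."""
    missing = [name for name in required if not extracted.get(name)]
    if missing:
        return "Could you send a clearer image? I couldn't read: " + ", ".join(missing)
    return "proceed"


# Invented example: the total could not be read from the receipt photo.
receipt_fields = {"purchase_date": "2024-05-01", "total": ""}
print(next_step(receipt_fields, ["purchase_date", "total"]))
```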

FAQs

What image formats does Fin Vision support?

Fin Vision supports standard image formats including JPG, PNG, and GIF files shared by customers.
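
If you want to check attachments on your own side before they reach a conversation, a filename-based filter is one simple option. The sketch below is an assumption-laden example rather than anything Fin exposes: `is_supported_image` is a hypothetical helper, `.jpeg` is included as the common alternate extension for JPG, and the set may be narrower than what Fin actually accepts, since the formats above are listed as examples.

```python
from pathlib import Path

# Formats named in this article, plus ".jpeg" as an assumed alias for JPG.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def is_supported_image(filename: str) -> bool:
    """Check the file extension, ignoring case."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["receipt.JPG", "screenshot.png", "invoice.pdf"]:
    print(name, is_supported_image(name))
```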

How does Fin handle privacy and sensitive information in images?

Fin is designed with privacy in mind. The vision models are explicitly prompted not to extract any personal or sensitive information from images, such as credit card numbers, CVVs, or identification details. Additionally, images are stored temporarily and are automatically deleted after a short period.

Does Fin store images?

Images are temporarily stored in a secure cloud environment and automatically deleted after a short period.

Do customers need to send images in a certain way?

No. Customers can upload or paste images into a chat or email, and Fin handles the rest.

Can customers send multiple images?

Yes. Fin analyzes the five most recent images individually and uses them as context for its responses.
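
In practical terms, a limit of five recent images means earlier images in a long conversation drop out of consideration. The short Python sketch below illustrates that selection. `images_to_analyze` and `MAX_IMAGES` are hypothetical names, and the sketch assumes images are tracked in the order they were received.

```python
MAX_IMAGES = 5  # limit stated in this article


def images_to_analyze(image_ids: list[str]) -> list[str]:
    """Keep the most recent images; input is ordered oldest first."""
    return image_ids[-MAX_IMAGES:]


sent = ["img1", "img2", "img3", "img4", "img5", "img6", "img7"]
print(images_to_analyze(sent))
```

If an older image is still relevant, the customer can resend it so that it counts among the most recent ones.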

Does Fin generate or send images?

Fin does not generate images. In some conversations, Fin may include images from your existing support content in replies.

Does Fin Vision support multiple languages?

Yes. Fin can extract text from images in many languages, though accuracy depends on the clarity of the image and the complexity of its content.

Can I turn off Fin Vision?

No. Fin Vision is built in and cannot be disabled. It operates automatically as part of Fin’s understanding of conversations.

Can Fin Vision read documents?

Only when they are shared as images. Fin Vision cannot process the contents of document files (PDF/DOCX) directly.
