Analyze Photos in Telegram with AI Vision

Vision lets the @DoublegramAIBot Telegram bot see the images you send — not just read your words about them. In your private chat with the bot, send a photo, add a question in the caption, and the bot analyzes the picture and replies in text.

This is different from generating a new image (see Generate Images in Telegram from Text). Here you already have a picture and want the AI to interpret it without leaving Telegram.

How to analyze an image

Two simple ways:

Send a photo with a caption

1. Open @DoublegramAIBot.
2. Tap the attachment icon and choose a photo (or a screenshot saved as a photo).
3. In the caption field, type your question, for example: Describe what you see or What brand is this logo?
4. Tap Send. The bot processes the image and replies in text.

Reply to an existing photo

If a photo is already in the chat, reply to that message with your question. The bot understands you are asking about the image in the message you replied to.

Try it: Send a screenshot of a webpage with the caption: Summarize the main points shown in this screenshot.
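
For readers curious how the two methods above differ at the Telegram level, here is a minimal Python sketch. It is not @DoublegramAIBot's actual code. It only assumes the standard Telegram Bot API message fields (photo, caption, text, reply_to_message); the function name extract_vision_request is hypothetical, as is the choice to ignore photos sent without a question.

```python
from typing import Optional, Tuple


def _largest_photo_id(msg: dict) -> Optional[str]:
    """Pick the file_id of the highest-resolution size in a message's photo list."""
    sizes = msg.get("photo") or []
    if not sizes:
        return None
    best = max(sizes, key=lambda s: s["width"] * s["height"])
    return best["file_id"]


def extract_vision_request(message: dict) -> Optional[Tuple[str, str]]:
    """Return (file_id, question) if the message asks about a photo, else None.

    Hypothetical helper: `message` is a Telegram Bot API Message as a dict.
    """
    # Method 1: a photo sent together with a caption.
    file_id = _largest_photo_id(message)
    if file_id is not None:
        question = (message.get("caption") or "").strip()
        # Assumption: this sketch skips photos that carry no question.
        return (file_id, question) if question else None

    # Method 2: a text reply to an earlier message that contains a photo.
    replied = message.get("reply_to_message") or {}
    file_id = _largest_photo_id(replied)
    question = (message.get("text") or "").strip()
    if file_id is not None and question:
        return (file_id, question)

    return None
```

In both cases the result is the same pair: one image and one question, which is why the two methods give equivalent answers.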

Use a Gemini text model

For full image analysis, your text model must be a Google Gemini model. Gemini receives the actual image data, so its answer is based on the picture itself and not only on your caption.

To switch:

1. Send /model.
2. Tap Text Model → Google Gemini.
3. Pick a Gemini model (for example Gemini 2.5 Flash for speed, or Gemini 2.5 Pro for harder questions).

If you send a photo while using a non-Gemini text model, the bot cannot process the image properly. Always use Gemini for Vision tasks.

Tip: Gemini 2.5 Flash is the default model for new users — you can analyze images right away without changing anything.

What you can ask

Vision works for many everyday tasks:

Describe a scene — What is happening in this photo?
Read text (OCR) — Transcribe all the text in this image. Useful for photos of documents, signs, or screenshots.
Identify objects — What product is this? or What animal is in the picture?
Analyze screenshots — Explain this error message or What does this chart show?
Get details — What colors dominate this image? or Is there a person wearing a red jacket?
Help with content — Write alt text for this image for a social post.

Credits

Analyzing an image costs your normal model price plus 1 extra credit for the image. For example, if Gemini 2.5 Flash costs 3 credits per request, a photo question costs 4 credits total.

Check your balance with /credits before sending many images. Credits are only charged for successful responses.
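
To estimate costs before sending a batch of photos, the pricing rule above can be written as a short Python sketch. The function names are hypothetical, and the 3-credit model price is only the illustrative figure from the example above; check the real price of your model in the bot.

```python
IMAGE_SURCHARGE = 1  # extra credits per analyzed image, as stated in this guide


def vision_request_cost(model_credits: int) -> int:
    """Credits for one photo question: model price plus the image surcharge."""
    return model_credits + IMAGE_SURCHARGE


def total_cost(model_credits: int, photo_questions: int) -> int:
    """Credits for several photo questions sent as separate requests."""
    return vision_request_cost(model_credits) * photo_questions


def max_photo_questions(balance: int, model_credits: int) -> int:
    """How many photo questions a given credit balance can cover."""
    return balance // vision_request_cost(model_credits)


# Using the illustrative price of 3 credits per request:
print(vision_request_cost(3))      # one photo question
print(total_cost(3, 5))            # five photo questions
print(max_photo_questions(10, 3))  # questions covered by a 10-credit balance
```

With the example price, one photo question costs 3 + 1 = 4 credits, so five questions cost 20 credits and a balance of 10 credits covers two.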

Vision vs image generation

Vision (this guide) — You send a photo → bot analyzes it and answers in text.
Image generation — You describe something in Telegram → bot creates a new picture (see Generate Images in Telegram from Text).

To analyze a picture, send a photo. To create a new one, type generate an image of…

Quick reference

Send photo + caption, or reply to a photo with your question
Use a Gemini text model (/model → Text Model → Google Gemini)
Cost: model credits + 1 credit for the image
Default model (Gemini 2.5 Flash) already supports Vision