Vision lets the @DoublegramAIBot Telegram bot see the images you send — not just read your words about them. In your private chat with the bot, send a photo, add a question in the caption, and the bot analyzes the picture and replies in text.
This is different from generating a new image (see Generate Images in Telegram from Text). Here you already have a picture and want the AI to interpret it without leaving Telegram.
How to analyze an image
Two simple ways:
Send a photo with a caption
- Open @DoublegramAIBot.
- Tap the attachment icon and send a photo (or screenshot saved as a photo).
- In the caption field, type your question — for example: Describe what you see or What brand is this logo?
- Send. The bot processes the image and replies in text.
Reply to an existing photo
If a photo is already in the chat, reply to that message with your question. The bot understands you are asking about the image in the message you replied to.
Try it: Send a screenshot of a webpage with the caption: Summarize the main points shown in this screenshot.
Use a Gemini text model
For full image analysis, your text model must be a Google Gemini model. Gemini receives the actual image data and can truly see what you sent.
To switch:
- Send /model.
- Tap Text Model → Google Gemini.
- Pick a Gemini model (for example Gemini 2.5 Flash for speed, or Gemini 2.5 Pro for harder questions).
If you send a photo while using a non-Gemini text model, the bot cannot process the image properly. Always use Gemini for Vision tasks.
Tip: Gemini 2.5 Flash is the default model for new users — you can analyze images right away without changing anything.
What you can ask
Vision works for many everyday tasks:
- Describe a scene — What is happening in this photo?
- Read text (OCR) — Transcribe all the text in this image. Useful for photos of documents, signs, or screenshots.
- Identify objects — What product is this? or What animal is in the picture?
- Analyze screenshots — Explain this error message or What does this chart show?
- Get details — What colors dominate this image? or Is there a person wearing a red jacket?
- Help with content — Write alt text for this image for a social post.
Credits
Analyzing an image costs your normal model price plus 1 extra credit for the image. For example, if Gemini 2.5 Flash costs 3 credits per request, a photo question costs 4 credits total.
Check your balance with /credits before sending many images. Credits are only charged for successful responses.
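The pricing rule above can be written as a small calculation. This is only an illustrative sketch: the function names are made up for this example, the 3-credit model price is the hypothetical figure from the paragraph above, and the 50-credit balance is invented. Check /credits and the model list for your real numbers.

```python
# Illustrative sketch of the pricing rule described above.
# The function names and the example balance are assumptions, not part of the bot.

IMAGE_SURCHARGE = 1  # extra credit charged per analyzed image, as stated in this guide


def vision_cost(model_price: int, surcharge: int = IMAGE_SURCHARGE) -> int:
    """Credits for one photo question: normal model price plus the image surcharge."""
    return model_price + surcharge


def affordable_questions(balance: int, model_price: int) -> int:
    """Number of photo questions a balance covers, assuming every request succeeds."""
    return balance // vision_cost(model_price)


# Hypothetical model price of 3 credits per request, as in the example above.
print(vision_cost(3))               # 4
print(affordable_questions(50, 3))  # 12
```

Because credits are only charged for successful responses, the second figure is a lower bound on how many photos you can send before running out.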
Vision vs image generation
- Vision (this guide) — You send a photo → bot analyzes it and answers in text.
- Image generation — You describe something in Telegram → bot creates a new picture (see Generate Images in Telegram from Text).
Send a photo to analyze. Type generate an image of… to create a new one.
Quick reference
- Send photo + caption, or reply to a photo with your question
- Use a Gemini text model (/model → Text Model → Google Gemini)
- Cost: model credits + 1 credit for the image
- Default model (Gemini 2.5 Flash) already supports Vision



