Analyze Photos in Telegram with AI Vision

In Telegram, @DoublegramAIBot can analyze photos you send — describe scenes, read text in screenshots, or answer questions about any image. Switch to a Gemini model with /model, then send a picture with a caption like: What text is written in this image?

Vision lets the @DoublegramAIBot Telegram bot see the images you send — not just read your words about them. In your private chat with the bot, send a photo, add a question in the caption, and the bot analyzes the picture and replies in text.

This is different from generating a new image (see Generate Images in Telegram from Text). Here you already have a picture and want the AI to understand it in Telegram.

How to analyze an image

Two simple ways:

Send a photo with a caption

  1. Open @DoublegramAIBot.
  2. Tap the attachment icon and send a photo (or a screenshot saved as a photo).
  3. In the caption field, type your question — for example: Describe what you see or What brand is this logo?
  4. Send. The bot processes the image and replies in text.

Reply to an existing photo

If a photo is already in the chat, reply to that message with your question. The bot understands you are asking about the image in the message you replied to.

Try it: Send a screenshot of a webpage with the caption: Summarize the main points shown in this screenshot.

Use a Gemini text model

For full image analysis, your text model must be a Google Gemini model. Gemini receives the actual image data and can truly see what you sent.

To switch:

  1. Send /model.
  2. Tap Text Model → Google Gemini.
  3. Pick a Gemini model (for example Gemini 2.5 Flash for speed, or Gemini 2.5 Pro for harder questions).

If you send a photo while using a non-Gemini text model, the bot cannot process the image properly. Always use Gemini for Vision tasks.

Tip: Gemini 2.5 Flash is the default model for new users — you can analyze images right away without changing anything.

What you can ask

Vision works for many everyday tasks:

  • Describe a scene: What is happening in this photo?
  • Read text (OCR): Transcribe all the text in this image. Useful for photos of documents, signs, or screenshots.
  • Identify objects: What product is this? or What animal is in the picture?
  • Analyze screenshots: Explain this error message or What does this chart show?
  • Get details: What colors dominate this image? or Is there a person wearing a red jacket?
  • Help with content: Write alt text for this image for a social post.

Credits

Analyzing an image costs your normal model price plus 1 extra credit for the image. For example, if Gemini 2.5 Flash costs 3 credits per request, a photo question costs 4 credits total.

Check your balance with /credits before sending many images. Credits are only charged for successful responses.
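
If you plan to send several images, the arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the bot: the function names are made up for this example, and the 3-credit model price is the hypothetical figure from the example above, so check the actual price of your model in the bot.

```python
# Illustrative sketch of the credit arithmetic described in this guide.
# All names here are hypothetical; the bot does not expose this code.

IMAGE_SURCHARGE = 1  # extra credits charged per analyzed image (from this guide)


def photo_question_cost(model_credits: int) -> int:
    """Credits for one photo question: the model price plus the image surcharge."""
    if model_credits < 0:
        raise ValueError("model_credits must not be negative")
    return model_credits + IMAGE_SURCHARGE


def affordable_photo_questions(balance: int, model_credits: int) -> int:
    """How many photo questions a balance covers, assuming every request succeeds."""
    if balance < 0:
        raise ValueError("balance must not be negative")
    return balance // photo_question_cost(model_credits)


# Example with an assumed model price of 3 credits per request.
cost = photo_question_cost(3)                    # 3 + 1 = 4 credits
questions = affordable_photo_questions(50, 3)    # 50 // 4 = 12 photo questions
print(cost, questions)
```

Because credits are only charged for successful responses, a failed request would leave your balance unchanged, so the second figure is the number of successful photo questions you can afford.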

Vision vs image generation

  • Vision (this guide) — You send a photo → bot analyzes it and answers in text.
  • Image generation — You describe something in Telegram → bot creates a new picture (see Generate Images in Telegram from Text).

Send a photo to analyze. Type generate an image of… to create a new one.

Quick reference

  • Send photo + caption, or reply to a photo with your question
  • Use a Gemini text model (/model → Text Model → Google Gemini)
  • Cost: model credits + 1 credit for the image
  • Default model (Gemini 2.5 Flash) already supports Vision