Gemini -> Image Understanding
Overview
This action analyzes images with the Gemini API. You provide an image file and a prompt (a question or instruction), and the selected Gemini model interprets, describes, or extracts insights from the image content.
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| `gemini_api_key` | text (registry) | Yes | Your Gemini API key (from settings registry). |
| `model` | text | Yes | The Gemini model to use (e.g., `gemini-1.5-flash`, `gemini-pro-vision`). |
| `prompt` | text | Yes | The question, instruction, or prompt about the image. |
| `image` | file resource | Yes | The image file you want Gemini to analyze. |
Function Stack
- Create file resource from image
  - Reads the file payload from the provided `image` input.
- Gemini API Request
  - Sends a POST request to: `https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={gemini_api_key}`
  - The request body includes both the raw image data and the prompt, following Gemini API’s content structure.
- Precondition
  - Verifies that the response status is `200` to continue processing.
- Response
  - Returns the result from the response: `gemini_api.response.result`.
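The request step above can be sketched outside the platform as follows. This is a minimal illustration, not the action's actual implementation: the helper names `build_request` and `send_request` are hypothetical, and the body layout (`contents` → `parts` with a `text` part and a base64-encoded `inline_data` part) is assumed from the public Gemini REST API rather than taken from this action's internals.

```python
import base64
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key: str, model: str, prompt: str,
                  image_bytes: bytes, mime_type: str = "image/jpeg"):
    """Return (url, body) for a generateContent call with one image.

    Hypothetical helper: mirrors the URL pattern shown in the function stack.
    """
    url = f"{API_ROOT}/{model}:generateContent?key={api_key}"
    body = {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": mime_type,
                    # Image bytes are sent as base64 text inside the JSON body.
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }
    return url, body


def send_request(url: str, body: dict) -> dict:
    """POST the body as JSON; urlopen raises HTTPError on a 4xx/5xx status."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the body construction separate from the network call makes the payload easy to inspect before anything is sent.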
Example Usage
Request
{
  "gemini_api_key": "AIzaSyD...",
  "model": "gemini-1.5-flash",
  "prompt": "Describe what is happening in this image.",
  "image": "(attach image file)"
}
Response
{
  "result": "The image shows a group of people hiking on a mountain trail under a clear sky."
}
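The response shown here is simplified. If the action hands back the raw `generateContent` payload (an assumption; this page does not show the exact shape of `gemini_api.response.result`), the generated text would typically sit under `candidates` → `content` → `parts`. A small sketch of pulling it out, using a hypothetical helper named `extract_text`:

```python
def extract_text(result: dict) -> str:
    """Join the text parts of the first candidate; return "" if there is none.

    Assumes the usual generateContent response layout:
    {"candidates": [{"content": {"parts": [{"text": "..."}]}}]}
    """
    candidates = result.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Non-text parts have no "text" key and contribute nothing.
    return "".join(part.get("text", "") for part in parts)
```

Returning an empty string instead of raising keeps downstream steps simple when the model returns no candidates, for example after a safety block.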
Notes
- The `model` parameter lets you select among available Gemini models for vision tasks (such as `gemini-1.5-flash` or similar).
- The `image` input must be an actual image file (PNG, JPEG, etc.).
- Ensure your Gemini API key has the necessary permissions and quota for vision/model usage.
- This action handles encoding and passing the image to Gemini as required by the API.
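A caller can check the file type before invoking the action. The sketch below guesses the MIME type from the filename with Python's standard `mimetypes` module. The helper name `image_mime_type` is hypothetical, and the `SUPPORTED_IMAGE_TYPES` set is an assumption based on the formats Gemini commonly lists for vision input; check the Gemini documentation for the authoritative list.

```python
import mimetypes

# Assumed set of accepted formats; not taken from this action's definition.
SUPPORTED_IMAGE_TYPES = {
    "image/png",
    "image/jpeg",
    "image/webp",
    "image/heic",
    "image/heif",
}


def image_mime_type(filename: str):
    """Return the guessed MIME type if it is in the assumed supported set, else None."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime if mime in SUPPORTED_IMAGE_TYPES else None
```

This looks only at the file extension, so it catches obvious mistakes (a `.txt` upload) but does not verify the file's actual contents.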
Troubleshooting
- `PERMISSION_DENIED` or `UNAUTHORIZED`: Check your Gemini API key and model permissions.
- `INVALID_ARGUMENT`: Make sure your prompt is a string and your image is a valid file format.
- `UNSUPPORTED_MEDIA_TYPE`: Only supported image formats (like `jpeg`, `png`) can be processed.
- Other errors: Refer to the Gemini API documentation for detailed troubleshooting guidance.
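The status names above can be mapped to hints in code. This sketch assumes the error body follows the common Google API error layout, `{"error": {"code": ..., "message": ..., "status": "..."}}`; the helper name `explain_error` is hypothetical, and the hint texts simply restate the guidance listed here.

```python
import json

DEFAULT_HINT = "Refer to the Gemini API documentation for troubleshooting guidance."

HINTS = {
    "PERMISSION_DENIED": "Check your Gemini API key and model permissions.",
    "UNAUTHORIZED": "Check your Gemini API key and model permissions.",
    "INVALID_ARGUMENT": "Make sure the prompt is a string and the image is a valid file.",
    "UNSUPPORTED_MEDIA_TYPE": "Use a supported image format such as jpeg or png.",
}


def explain_error(error_body: str) -> str:
    """Map the 'status' field of a JSON error body to a troubleshooting hint."""
    try:
        status = json.loads(error_body)["error"]["status"]
        return HINTS.get(status, DEFAULT_HINT)
    except (ValueError, KeyError, TypeError):
        # Body was not JSON, or did not have the assumed error layout.
        return DEFAULT_HINT
```

For example, passing the body of a failed request whose status is `PERMISSION_DENIED` returns the key and permissions hint, while an unrecognized or malformed body falls back to the default.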
Version notes
2025-06-16 16:45:39
Current