Gemini -> Audio Understanding
Overview
This action uses the Gemini API to analyze audio. You reference an existing, uploaded audio file and provide a question, and Gemini analyzes, summarizes, or extracts insights from the audio content.
Note: The file_uri must refer to an audio file that has already been uploaded and is accessible to the Gemini API. This action does not perform file uploads.
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| gemini_api_key | text | Yes | Your Gemini API key (from settings registry). |
| file_uri | text | Yes | The URI of an audio file already uploaded to Google Gemini. |
| question | text | Yes | Your prompt, task, or question for Gemini about the audio content. |
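All three inputs are required text values, so a caller may want to check them before invoking the action. The following is a minimal sketch of such a check; the helper name `validate_inputs` and its error wording are hypothetical and not part of the action itself.

```python
# Hypothetical pre-flight check; the input names come from the table above.
REQUIRED_INPUTS = ("gemini_api_key", "file_uri", "question")


def validate_inputs(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means all inputs are present."""
    problems = []
    for name in REQUIRED_INPUTS:
        value = inputs.get(name)
        # Each input must be a non-empty text value.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name} is required and must be non-empty text")
    return problems
```

An empty list means the inputs are structurally complete; it does not confirm that the key is valid or that the file exists.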
Function Stack
- Talk to Uploaded Content
  - Uses the given file_uri and question to instruct Gemini to analyze the specific audio file.
- API Request
  - Posts a request to https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=YOUR_API_KEY, including a payload such as:

        {
          "contents": [
            {
              "parts": [
                { "file_data": { "file_uri": "<file_uri>" } },
                { "text": "<question>" }
              ]
            }
          ]
        }

  - Authenticates with your Gemini API key.
  - Parses Gemini's response.
- Precondition
  - Ensures the API response status is 200.
- Response
  - Returns the result from gemini_api.response.result.
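For readers who want to reproduce the function stack outside the action, the steps above can be sketched in Python with only the standard library. This is an illustration under assumptions, not the action's actual implementation: the function names (`build_payload`, `build_request`, `ask_about_audio`) and the timeout value are hypothetical, while the endpoint, the `key` query parameter, and the payload shape are taken from the description above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as given in the function stack above.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-1.5-flash:generateContent"
)


def build_payload(file_uri: str, question: str) -> dict:
    """Build the request body: one file reference plus one text prompt."""
    return {
        "contents": [
            {
                "parts": [
                    {"file_data": {"file_uri": file_uri}},
                    {"text": question},
                ]
            }
        ]
    }


def build_request(gemini_api_key: str, file_uri: str, question: str) -> urllib.request.Request:
    """Prepare the POST request; the API key is passed as the `key` query parameter."""
    url = f"{ENDPOINT}?{urllib.parse.urlencode({'key': gemini_api_key})}"
    body = json.dumps(build_payload(file_uri, question)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_about_audio(gemini_api_key: str, file_uri: str, question: str, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response (hypothetical helper)."""
    request = build_request(gemini_api_key, file_uri, question)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Mirror the precondition: accept only status 200.
        if response.status != 200:
            raise RuntimeError(f"Unexpected status {response.status}")
        return json.load(response)
```

Keeping `build_request` separate from the network call makes it possible to inspect the URL and payload without spending API quota.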
Example Usage
Request
{
  "gemini_api_key": "AIzaSyD...",
  "file_uri": "https://storage.googleapis.com/path-to-your-audio.wav",
  "question": "Summarize what is being discussed in this meeting recording."
}
Response
{
  "result": "The meeting discusses quarterly revenue figures, marketing strategy, and upcoming project deadlines."
}
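The source does not show how the action derives result from Gemini's raw response. As a rough sketch, assuming the raw generateContent response nests the generated text under `candidates`, then `content`, then `parts`, a hypothetical helper such as `extract_text` could pull out the answer like this:

```python
def extract_text(response: dict) -> str:
    """Join the text parts of the first candidate; return "" if there is none.

    Assumes the raw response shape candidates -> content -> parts -> text.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Parts without a "text" field contribute nothing.
    return "".join(part.get("text", "") for part in parts)
```

Returning an empty string for a response without candidates keeps the caller from failing on a missing key; a stricter caller might prefer to raise instead.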
Notes
- The file_uri must already point to a publicly or API-accessible audio file uploaded to Google Gemini.
- Common use cases: summarize calls, extract action items, identify speakers, or answer specific questions about audio content.
- Ensure your API key’s quota and model access cover the requested usage.
Troubleshooting
- INVALID_ARGUMENT or NOT_FOUND: Check that your file_uri is correct, exists, and is accessible.
- PERMISSION_DENIED: Your Gemini API key might be invalid or lack the necessary permissions.
- UNSUPPORTED_MEDIA_TYPE: Make sure your audio file format is supported.
- For more help, refer to the Gemini API documentation.
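The status names above can be turned into readable hints programmatically. The sketch below assumes the error body follows the common Google API shape `{"error": {"code": ..., "message": ..., "status": ...}}`; the helper name `explain_error` and the fallback wording are hypothetical.

```python
import json

# Hints taken from the troubleshooting list above.
HINTS = {
    "INVALID_ARGUMENT": "Check that file_uri is correct, exists, and is accessible.",
    "NOT_FOUND": "Check that file_uri is correct, exists, and is accessible.",
    "PERMISSION_DENIED": "The API key might be invalid or lack the necessary permissions.",
    "UNSUPPORTED_MEDIA_TYPE": "Make sure the audio file format is supported.",
}
FALLBACK = "Unrecognized error; refer to the Gemini API documentation."


def explain_error(error_body: str) -> str:
    """Map an error response body to a troubleshooting hint."""
    try:
        parsed = json.loads(error_body)
    except json.JSONDecodeError:
        parsed = None
    # Assumed shape: {"error": {"status": "..."}}; anything else falls back.
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        return FALLBACK
    status = error.get("status", "")
    if status in HINTS:
        return f"{status}: {HINTS[status]}"
    return FALLBACK
```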
Version notes
2025-06-30 16:36:53
Current