Content Creation · Marketing & Creative Teams

AI Agent for Generating Product Mockups with Nano Banana Gemini 2.5 Flash Image

Monitor inputs, create composed mockups with Gemini 2.5, log results, and notify stakeholders when mocks are ready.

How it works
1 Step
Prepare inputs
2 Step
Generate mockup
3 Step
Deliver and log
Receive product image, template image, and prompt; convert images to Base64 for the API call.

Overview

End-to-end mockup generation from inputs to delivery.

This AI agent automates the end-to-end creation of product mockups by combining a product image with a model or scene template using Gemini 2.5. It handles image prep (Base64 conversion), sends a multimodal prompt to the OpenRouter API, and returns a ready-to-use image. The result is scalable, repeatable visuals for marketing, ecommerce, and ads.


Capabilities

What Nano Banana AI does

Automates image composition and asset delivery.

01

Ingests inputs: accepts a product image, a template/model image, and a descriptive prompt.

02

Base64-encodes inputs for API payloads.

03

Generates a composed image by sending a multimodal payload to Gemini 2.5 via OpenRouter.

04

Decodes Gemini output and converts it to a binary image file.

05

Delivers: saves the image to storage or exports to CMS-ready formats.

06

Logs metadata and results for audit and reuse.

Why you should use AI Agent for Generating Product Mockups with Nano Banana Gemini 2.5

This AI agent replaces fragmented manual work with a predictable execution flow.

Before
Manual image compositing requires skilled designers and shoots, slowing campaigns.
Lighting and shadows differ across assets, causing inconsistent results.
Time wasted on format conversions and asset management.
Difficult to iterate on scenes or backgrounds quickly.
Hard to scale campaigns with unique visuals across products.
After
Automated input prep and generation produce consistent, share-ready mockups with minimal waiting.
Images maintain consistent lighting and composition across scenes and backgrounds.
Outputs are ready to export to CMS, CMS media libraries, or storage services.
Multiple scene variations can be generated for A/B testing and campaigns.
Assets can be automatically saved to Drive, S3, or Dropbox for immediate use.
Process

How it works

A simple 3-step flow that anyone can follow.

Step 01

Prepare inputs

Receive product image, template image, and prompt; convert images to Base64 for the API call.

Step 02

Generate mockup

Send a multimodal payload to Gemini 2.5 via OpenRouter and receive a Base64 image.

Step 03

Deliver and log

Convert to a binary file, save to the chosen storage, and log metadata.


Example

Example workflow

One realistic scenario.

A fashion retailer uploads a product image and a lifestyle model image, plus a descriptive prompt, via the form trigger. The AI agent processes the inputs, generates three lifestyle mockups on beach and urban scenes within minutes, and saves them to the brand’s Google Drive for immediate marketing use.

Content Creation OpenRouter APIn8n (Form Trigger)Google DriveAWS S3 AI Agent flow

Audience

Who can benefit

Individuals and teams that rely on consistent, fast visual assets.

✍️ Marketing teams

Need fast, consistent visuals for campaigns across channels.

💼 Ecommerce managers

Keep product catalogs up-to-date with fresh visuals.

🧠 Creative agencies

Generate multiple variants for client pitches and concepts.

Product photographers

Explore concepts before shooting with real assets.

🎯 Brand managers

Maintain brand consistency across scenes and backgrounds.

📋 Content creators

Produce social-ready assets quickly without a photoshoot.

Integrations

Works with OpenRouter Gemini 2.5 and common storage/CMS pipelines.

OpenRouter API

Generates the image by sending a multimodal payload to Gemini 2.5.

n8n (Form Trigger)

Captures inputs and triggers the AI agent workflow.

Google Drive

Saves generated images to a folder for quick access.

AWS S3

Stores images for CDN-ready delivery and backup.

Dropbox

Archives assets for collaboration and reuse.

Applications

Best use cases

Practical scenarios to maximize value.

Instantly replace product photos with lifestyle visuals for catalogs and landing pages.
Generate background variation sets to test different marketing scenes.
Create multiple ad creatives from a single product image without new shoots.
Produce social-ready visuals for campaigns with minimal manual edits.
Automate virtual try-on visuals for apparel catalogs and product pages.
Build consistent visual libraries across products and seasons.

FAQ

FAQ

Common questions about using this AI agent.

The agent requires a product image, a template or model image, and a descriptive prompt. The inputs are converted to Base64 for API payloads, and the multimodal Gemini 2.5 model processes all three to compose the final image. The process is triggered by a form or webhook, and the resulting image is returned as a binary file after decoding. Outputs can be saved to storage services or exported for CMS use. Depending on your setup, you can adjust the prompt to steer lighting, background, and composition.

Gemini 2.5 Flash Image is a multimodal model designed to generate images from multiple inputs, including text prompts and images. It can understand how to blend a product image with a model or scene template to produce realistic composites. The Flow involves sending a payload via an API, then decoding the response as an image file. As with any model, results vary with the prompt quality and input compatibility. Access costs depend on the OpenRouter pricing and usage.

The AI agent outputs a binary image file (e.g., PNG/JPG) after decoding the Base64 result. This image can be saved to cloud storage like Google Drive, AWS S3, or Dropbox, or exported to a CMS or marketing library. You can automate the delivery by wiring the agent’s output to your storage or CMS workflow. If needed, you can keep multiple variants in a single folder for quick retrieval.

Inputs are transmitted to an API in a controlled workflow that you initialize. Access to inputs is governed by your OpenRouter and n8n credentials, so only authenticated calls are executed. If your policy requires, you can filter sensitive data or mask certain fields before submission. It's advisable to review data handling practices with your security team, especially for proprietary assets.

A basic understanding of the tools in your stack (n8n, cloud storage, and API credentials) is helpful but not mandatory. The AI agent is designed to be triggered by a form or webhook and can be wired into existing workflows with minimal configuration. The key steps—provide inputs, run the payload, and retrieve the image—are straightforward. If you want deeper customization (selecting models or routes), editing the HTTP request body may be required.

Gemini 2.5 usage is billed via the OpenRouter account you connect to the agent, so pricing depends on model choices and usage. You can switch models by editing the API payload in the HTTP Request node. The workflow supports testing different models to compare results before committing to a production run. Always verify current pricing with OpenRouter before heavy usage.

Yes. You can tailor prompts, select different background templates, and route outputs to separate storage folders per product or campaign. The agent is designed to be adaptable, with the HTTP Request body adjustable to swap models or adjust payloads. You can also replace the Form Trigger with a Webhook to drive automation from other systems. This enables centralized control across teams.


AI Agent for Generating Product Mockups with Nano Banana Gemini 2.5 Flash Image

Monitor inputs, create composed mockups with Gemini 2.5, log results, and notify stakeholders when mocks are ready.

Use this template → Read the docs