Introduction
In the world of automation, platform lock-in is a major source of friction. Manually rebuilding a complex workflow from a tool like Zapier or Make.com into n8n can be a tedious, error-prone process. What if you could just take a screenshot of a workflow and have the code automatically generated?
This isn’t a hypothetical. While reviewing our internal systems, we unearthed a fascinating proof-of-concept: a “Visual Workflow Builder” (VWB) implemented as an n8n workflow. The tool acts as an AI-powered reverse-engineering engine for automations, and in this article we’ll break down exactly how it works.
The Core Concept: A Multi-Agent AI Pipeline
The VWB leverages a multi-agent approach within a single n8n workflow. The idea is to use different AI models for tasks they excel at: one for seeing and understanding, and another for writing precise, structured code.
The process flows through four key stages:
- The Trigger (Image Upload): The entire process kicks off when a user sends a workflow image (e.g., a .png or .jpg) to a dedicated n8n webhook.
- The Vision Agent (Gemini Pro Vision): The image is passed to a vision-capable AI. This agent is prompted to act as an expert automation consultant. Its job isn’t to write code, but to understand the diagram, identifying the apps, triggers, actions, and the logical flow connecting them. It then outputs this understanding as structured text.
- The Expert Coder (Claude 3 Sonnet): The structured text from the Vision Agent is then fed to a second AI. This agent is prompted to be an expert n8n developer. Its sole purpose is to take the logical description and convert it into a complete, valid, and ready-to-import n8n workflow JSON.
- The Response (JSON Output): The final, machine-readable JSON is sent directly back to the user who made the initial request.
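The four stages above can be sketched as plain function composition. This is a minimal Python sketch, not the n8n workflow itself: `run_pipeline`, `fake_vision`, and `fake_coder` are hypothetical names, and the two agents are passed in as callables so that no particular model SDK or API signature is assumed.

```python
from typing import Callable

# Hypothetical type aliases for illustration; not part of n8n or any SDK.
VisionAgent = Callable[[bytes], str]  # image bytes -> structured description
CoderAgent = Callable[[str], str]     # structured description -> workflow JSON text


def run_pipeline(image: bytes, vision_agent: VisionAgent, coder_agent: CoderAgent) -> str:
    """Chain the stages: image in, workflow JSON text out."""
    if not image:
        # Stage 1: the trigger must have delivered some image data.
        raise ValueError("no image data received")
    description = vision_agent(image)         # Stage 2: describe the diagram
    workflow_json = coder_agent(description)  # Stage 3: write the n8n JSON
    return workflow_json                      # Stage 4: returned to the caller


# Stub agents standing in for the real model calls (assumed behaviour).
def fake_vision(image: bytes) -> str:
    return "Trigger: new form submission -> Action: send Slack message"


def fake_coder(description: str) -> str:
    return '{"nodes": [], "connections": {}}'


result = run_pipeline(b"fake-image-bytes", fake_vision, fake_coder)
print(result)
```

Keeping each agent behind a simple callable mirrors the article's separation of "seeing" from "coding": either model could be swapped without touching the other stage.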
The Prompts: A Deeper Look
Most of this pipeline’s behavior comes from the specific instructions given to each AI agent.
Vision Agent Prompt (Gemini):
You are an expert automation consultant specializing in reverse-engineering workflows from visual diagrams. Your task is to analyze the provided image of a workflow from a tool like Zapier or Make.com and break it down into a structured, logical format. Analyze the image provided and perform the following steps...
n8n Coder Prompt (Claude):
You are an n8n expert developer. Your sole purpose is to convert a logical description of a workflow into a complete, valid, and ready-to-import n8n workflow JSON. Your entire output must be the raw n8n JSON. Do not wrap it in markdown code blocks or provide any surrounding text or explanations. Input Data: {{$json.output}}...
By separating the “seeing” from the “coding,” the VWB plays to the strengths of different AI models. A vision model is great at interpreting unstructured visual data, while a model like Claude is excellent at generating precise, structured output like JSON.
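The coder prompt asks for raw JSON with no markdown wrapping, but a model may still occasionally return fenced output. One way to guard against that is a defensive parsing step before the response is sent. The sketch below is an assumption about how such a step might look, not part of the original workflow; `extract_workflow_json` is a hypothetical helper.

```python
import json

# A markdown code-fence marker (three backticks), built programmatically.
FENCE = "`" * 3


def extract_workflow_json(model_output: str) -> dict:
    """Strip a surrounding markdown fence, if present, and parse the JSON."""
    text = model_output.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        # Drop the opening fence line (it may carry a language tag)
        # and the closing fence.
        first_newline = text.find("\n")
        text = text[first_newline + 1 : -len(FENCE)].strip()
    workflow = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(workflow, dict):
        raise ValueError("expected a JSON object at the top level")
    return workflow


# Example: a reply that ignored the "no markdown" instruction.
wrapped = FENCE + 'json\n{"nodes": [], "connections": {}}\n' + FENCE
print(extract_workflow_json(wrapped))
```

Failing loudly on unparseable output is a deliberate choice here: returning broken JSON to the caller would only move the error to import time.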
The n8n Workflow Itself
Here’s a look at the actual nodes in the vwb-n8n-workflow.json file:
- Webhook Trigger: Listens for incoming image uploads.
- Gemini Pro Vision Node: Takes the binary image data and sends it to the Gemini API with the “vision consultant” prompt.
- Anthropic Claude 3 Node: Takes the text output from the Gemini node and sends it to the Claude API with the “n8n expert” prompt.
- Respond to Webhook Node: Returns the JSON output from the Claude node.
This simple, four-step workflow could be the backend for a powerful migration tool, drastically reducing the time and effort required to move automations to n8n.
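Before a tool like this returns generated JSON to a user, a light structural check can catch obviously broken output. The sketch below assumes the usual shape of an exported n8n workflow: a top-level `nodes` list whose entries have a `name` and a `type`, and a `connections` object keyed by source node name. `check_workflow` is a hypothetical helper, the example workflow is illustrative only, and this check is much looser than n8n’s own import validation.

```python
def check_workflow(workflow: dict) -> list:
    """Return a list of problems; an empty list means the basic shape looks right."""
    nodes = workflow.get("nodes")
    connections = workflow.get("connections")
    if not isinstance(nodes, list):
        return ["'nodes' must be a list"]
    if not isinstance(connections, dict):
        return ["'connections' must be an object"]

    problems = []
    names = set()
    for i, node in enumerate(nodes):
        if not isinstance(node, dict) or "name" not in node or "type" not in node:
            problems.append(f"node {i} is missing 'name' or 'type'")
        else:
            names.add(node["name"])

    # Assumed layout: {source: {"main": [[{"node": target, ...}, ...], ...]}}
    for source, outputs in connections.items():
        if source not in names:
            problems.append(f"connection from unknown node '{source}'")
        if not isinstance(outputs, dict):
            problems.append(f"connections for '{source}' must be an object")
            continue
        for branches in outputs.values():
            for branch in branches or []:
                for link in branch or []:
                    target = link.get("node")
                    if target not in names:
                        problems.append(f"connection to unknown node '{target}'")
    return problems


# Illustrative two-node workflow: a webhook wired straight to its response.
example = {
    "nodes": [
        {"name": "Webhook", "type": "n8n-nodes-base.webhook"},
        {"name": "Respond to Webhook", "type": "n8n-nodes-base.respondToWebhook"},
    ],
    "connections": {
        "Webhook": {"main": [[{"node": "Respond to Webhook", "type": "main", "index": 0}]]}
    },
}
print(check_workflow(example))
```

A check like this would not prove the workflow runs, only that every connection points at a node that exists.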
Next Steps & Potential
While this is currently an internal MVP, the potential is huge. With further development, this could become a public-facing tool that helps thousands of users migrate to n8n, solidifying its position as a go-to platform for serious automation work. We’re documenting and cleaning up this workflow with the goal of testing it further. Stay tuned for more updates on this exciting project.