WAI new model (#26739)

mchenco · web-flow · commit e79f024dff39 · 2025-11-25T11:09:18.000-05:00
* push

* edit input img size

* pricing update
diff --git a/src/content/changelog/workers-ai/2025-11-25-flux-2-dev-workers-ai.mdx b/src/content/changelog/workers-ai/2025-11-25-flux-2-dev-workers-ai.mdx
@@ -0,0 +1,243 @@
+---
+title: Launching FLUX.2 [dev] on Workers AI
+description: Partnering with Black Forest Labs to bring their latest FLUX.2 model to Workers AI
+date: 2025-11-25
+---
+
+We've partnered with Black Forest Labs (BFL) to bring their latest FLUX.2 [dev] model to Workers AI! This model excels at generating high-fidelity images with physical world grounding, multi-language support, and digital asset creation. You can also create highly specific images with granular controls like JSON prompting.
+
+Read the [BFL blog](https://bfl.ai/flux2) to learn more about the model itself. Read our [Cloudflare blog](https://blog.cloudflare.com/flux-2-workers-ai) to see the model in action, or try it out yourself on our [multimodal playground](https://multi-modal.ai.cloudflare.com/).
+
+Pricing documentation is available on the [model page](/workers-ai/models/flux-2-dev/) or [pricing page](/workers-ai/platform/pricing/). Note that we expect to lower pricing in the next few days after iterating on the model performance.
+
+## Workers AI Platform specifics
+
+The model hosted on Workers AI supports up to 4 image inputs (512x512 per input image). Note that this image model is one of the most powerful in the catalog and is expected to be slower than the other image models we currently support. One catch to look out for: this model takes multipart form data inputs, even if you only have a prompt.
+
+With the REST API, the multipart form data input looks like this:
+
+```bash
+curl --request POST \
+  --url 'https://api.cloudflare.com/client/v4/accounts/{ACCOUNT}/ai/run/@cf/black-forest-labs/flux-2-dev' \
+  --header 'Authorization: Bearer {TOKEN}' \
+  --header 'Content-Type: multipart/form-data' \
+  --form 'prompt=a sunset at the alps' \
+  --form steps=25 \
+  --form width=1024 \
+  --form height=1024
+```
+
+With the Workers AI binding, you can call it like this:
+
+```javascript
+const form = new FormData();
+form.append('prompt', 'a sunset at the alps')
+form.append('width', 512)
+form.append('height', 512)
+ 
+const resp = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
+    multipart: {
+        body: form,
+        contentType: "multipart/form-data"
+    }
+})
+```
+
+The parameters you can send to the model are detailed here:
+
+<details>
+  <summary>JSON Schema for Model</summary>
+
+**Required Parameters**
+
+- `prompt` (string) - Text description of the image to generate
+
+**Optional Parameters**
+
+- `input_image_0` (string) - Binary image
+- `input_image_1` (string) - Binary image
+- `input_image_2` (string) - Binary image
+- `input_image_3` (string) - Binary image
+- `steps` (integer) - Number of inference steps. Higher values may improve quality but increase generation time
+- `guidance` (float) - Guidance scale for generation. Higher values follow the prompt more closely
+- `width` (integer) - Width of the image. Default: `1024`. Range: 256-1920
+- `height` (integer) - Height of the image. Default: `768`. Range: 256-1920
+- `seed` (integer) - Seed for reproducibility
+
+</details>
+
+## Multi-Reference Images
+
+The FLUX.2 model is great at generating images based on reference images. You can use this feature to apply the style of one image to another, add a new character to an image, or iterate on previously generated images. Use the same multipart form data structure, with the input images sent as binary.
+
+For the prompt, you can reference the images based on the index, like `take the subject of image 1 and style it like image 0` or even use natural language like `place the dog beside the woman`.
+
+Note: you must name the input parameters `input_image_0`, `input_image_1`, `input_image_2`, and `input_image_3` for this to work correctly. Each input image must be 512x512 or smaller.
+
+```bash
+curl --request POST \
+  --url 'https://api.cloudflare.com/client/v4/accounts/{ACCOUNT}/ai/run/@cf/black-forest-labs/flux-2-dev' \
+  --header 'Authorization: Bearer {TOKEN}' \
+  --header 'Content-Type: multipart/form-data' \
+  --form 'prompt=take the subject of image 1 and style it like image 0' \
+  --form input_image_0=@/Users/johndoe/Desktop/icedoutkeanu.png \
+  --form input_image_1=@/Users/johndoe/Desktop/me.png \
+  --form steps=25 \
+  --form width=1024 \
+  --form height=1024
+```
+With the Workers AI binding:
+
+```javascript
+const image0 = await fetch("http://image-url");
+const image1 = await fetch("http://image-url");
+const form = new FormData();
+// Wrap each image in a Blob so FormData sends it as a binary file part
+const image_blob0 = new Blob([await image0.arrayBuffer()], { type: "image/png" });
+const image_blob1 = new Blob([await image1.arrayBuffer()], { type: "image/png" });
+form.append('input_image_0', image_blob0)
+form.append('input_image_1', image_blob1)
+form.append('prompt', 'take the subject of image 1 and style it like image 0')
+ 
+const resp = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
+    multipart: {
+        body: form,
+        contentType: "multipart/form-data"
+    }
+})
+
+```
+
+## JSON Prompting
+
+The model supports prompting in JSON to get more granular control over images. Pass the JSON as the value of the `prompt` field in the multipart form data. See the JSON schema below for the base parameters you can pass to the model.
+
+<details>
+  <summary>JSON Prompting Schema</summary>
+
+```json
+{
+  "type": "object",
+  "properties": {
+    "scene": {
+      "type": "string",
+      "description": "Overall scene setting or location"
+    },
+    "subjects": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "properties": {
+          "type": { 
+            "type": "string", 
+            "description": "Type of subject (e.g., desert nomad, blacksmith, DJ, falcon)" 
+          },
+          "description": { 
+            "type": "string", 
+            "description": "Physical attributes, clothing, accessories" 
+          },
+          "pose": { 
+            "type": "string", 
+            "description": "Action or stance" 
+          },
+          "position": { 
+            "type": "string",
+            "enum": ["foreground", "midground", "background"],
+            "description": "Depth placement in scene"
+          }
+        },
+        "required": ["type", "description", "pose", "position"]
+      }
+    },
+    "style": {
+      "type": "string",
+      "description": "Artistic rendering style (e.g., digital painting, photorealistic, pixel art, noir sci-fi, lifestyle photo, wabi-sabi photo)"
+    },
+    "color_palette": {
+      "type": "array",
+      "items": { "type": "string" },
+      "minItems": 3,
+      "maxItems": 3,
+      "description": "Exactly 3 main colors for the scene (e.g., ['navy', 'neon yellow', 'magenta'])"
+    },
+    "lighting": {
+      "type": "string",
+      "description": "Lighting condition and direction (e.g., fog-filtered sun, moonlight with star glints, dappled sunlight)"
+    },
+    "mood": {
+      "type": "string",
+      "description": "Emotional atmosphere (e.g., harsh and determined, playful and modern, peaceful and dreamy)"
+    },
+    "background": {
+      "type": "string",
+      "description": "Background environment details"
+    },
+    "composition": {
+      "type": "string",
+      "enum": [
+        "rule of thirds",
+        "circular arrangement",
+        "framed by foreground",
+        "minimalist negative space",
+        "S-curve",
+        "vanishing point center",
+        "dynamic off-center",
+        "leading lines",
+        "golden spiral",
+        "diagonal energy",
+        "strong verticals",
+        "triangular arrangement"
+      ],
+      "description": "Compositional technique"
+    },
+    "camera": {
+      "type": "object",
+      "properties": {
+        "angle": { 
+          "type": "string",
+          "enum": ["eye level", "low angle", "slightly low", "bird's-eye", "worm's-eye", "over-the-shoulder", "isometric"],
+          "description": "Camera perspective"
+        },
+        "distance": { 
+          "type": "string",
+          "enum": ["close-up", "medium close-up", "medium shot", "medium wide", "wide shot", "extreme wide"],
+          "description": "Framing distance"
+        },
+        "focus": { 
+          "type": "string",
+          "enum": ["deep focus", "macro focus", "selective focus", "sharp on subject", "soft background"],
+          "description": "Focus type"
+        },
+        "lens": { 
+          "type": "string",
+          "enum": ["14mm", "24mm", "35mm", "50mm", "70mm", "85mm"],
+          "description": "Focal length (wide to telephoto)"
+        },
+        "f-number": { 
+          "type": "string", 
+          "description": "Aperture (e.g., f/2.8, the smaller the number the more blurry the background)" 
+        },
+        "ISO": { 
+          "type": "number", 
+          "description": "Light sensitivity value (comfortable range between 100 & 6400, lower = less sensitivity)" 
+        }
+      }
+    },
+    "effects": {
+      "type": "array",
+      "items": { "type": "string" },
+      "description": "Post-processing effects (e.g., 'lens flare small', 'subtle film grain', 'soft bloom', 'god rays', 'chromatic aberration mild')"
+    }
+  },
+  "required": ["scene", "subjects"]
+}
+```
+</details>
+
+## Other features to try
+
+- The model also supports the most common Latin and non-Latin character languages
+- You can prompt the model with specific hex codes like `#2ECC71`
+- Try creating digital assets like landing pages, comic strips, infographics too!
+
+
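As an illustration of the JSON prompting section in the changelog entry above, here is a minimal sketch of building a prompt string before appending it to the form. The helper name `buildJsonPrompt` and the sample values are hypothetical; the checks only mirror a few constraints stated in the schema (required `scene` and `subjects`, required subject fields, exactly 3 palette colors) and are not a full schema validator.

```javascript
// Hypothetical helper (not part of Workers AI): checks a prompt object against
// a few constraints from the JSON prompting schema and returns it as a string.
function buildJsonPrompt(prompt) {
  // "scene" and "subjects" are the only top-level required fields in the schema.
  if (typeof prompt.scene !== "string") {
    throw new Error("scene must be a string");
  }
  if (!Array.isArray(prompt.subjects)) {
    throw new Error("subjects must be an array");
  }
  // Each subject requires type, description, pose, and position.
  for (const subject of prompt.subjects) {
    for (const key of ["type", "description", "pose", "position"]) {
      if (typeof subject[key] !== "string") {
        throw new Error(`subject is missing "${key}"`);
      }
    }
  }
  // color_palette is optional, but must hold exactly 3 colors when present.
  if (
    prompt.color_palette !== undefined &&
    (!Array.isArray(prompt.color_palette) || prompt.color_palette.length !== 3)
  ) {
    throw new Error("color_palette must contain exactly 3 colors");
  }
  return JSON.stringify(prompt);
}

// Sample values are made up for illustration.
const promptString = buildJsonPrompt({
  scene: "alpine ridge at sunset",
  subjects: [
    {
      type: "hiker",
      description: "red jacket, large backpack",
      pose: "standing, looking at the valley",
      position: "foreground",
    },
  ],
  color_palette: ["#2ECC71", "navy", "orange"],
  camera: { angle: "low angle", lens: "35mm" },
});
```

The resulting string would then be sent as the `prompt` form field, for example `form.append('prompt', promptString)`, in the same multipart request shown in the changelog.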
diff --git a/src/content/docs/workers-ai/platform/pricing.mdx b/src/content/docs/workers-ai/platform/pricing.mdx
@@ -78,6 +78,7 @@ The Price in Tokens column is equivalent to the Price in Neurons column - the di
 | @cf/black-forest-labs/flux-1-schnell  | $0.0000528 per 512x512 tile <br/> $0.0001056 per step      | 4.80 neurons per 512x512 tile <br/> 9.60 neurons per step                |
 | @cf/leonardo/lucid-origin    | $0.006996 per 512x512 tile <br/> $0.000132 per step | 636.00 neurons per 512x512 tile <br/> 12.00 neurons per step |
 | @cf/leonardo/phoenix-1.0     | $0.005830 per 512x512 tile <br/> $0.000110 per step | 530.00 neurons per 512x512 tile <br/> 10.00 neurons per step |
+| @cf/black-forest-labs/flux-2-dev | $0.00021 per input 512x512 tile, per step <br/> $0.00041 per output 512x512 tile, per step | 18.75 neurons per input 512x512 tile, per step <br/> 37.50 neurons per output 512x512 tile, per step |
 
 ## Audio model pricing
 
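To make the per-tile, per-step pricing in the row added above more concrete, here is a rough cost sketch. The two prices come from that row; how tiles are counted is an assumption (output tiles as the number of 512x512 tiles needed to cover the output, and one tile per input image), so treat the result as an estimate rather than a billing formula. The function name `estimateCostUsd` is hypothetical.

```javascript
// Prices taken from the pricing table row for @cf/black-forest-labs/flux-2-dev.
const INPUT_PRICE_USD = 0.00021; // per input 512x512 tile, per step
const OUTPUT_PRICE_USD = 0.00041; // per output 512x512 tile, per step
const TILE = 512;

// Hypothetical estimator. ASSUMPTION: output tiles are counted by rounding each
// dimension up to a whole number of 512px tiles, and each input image (which is
// limited to 512x512) counts as one input tile.
function estimateCostUsd({ width, height, steps, inputImages = 0 }) {
  const outputTiles = Math.ceil(width / TILE) * Math.ceil(height / TILE);
  const inputTiles = inputImages;
  return steps * (inputTiles * INPUT_PRICE_USD + outputTiles * OUTPUT_PRICE_USD);
}

const estimate = estimateCostUsd({
  width: 1024,
  height: 1024,
  steps: 25,
  inputImages: 2,
});
console.log(estimate.toFixed(4));
```

Under these assumptions, a 1024x1024 image at 25 steps with two reference images comes to roughly $0.05.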
diff --git a/src/content/release-notes/workers-ai.yaml b/src/content/release-notes/workers-ai.yaml
@@ -3,6 +3,10 @@ link: "/workers-ai/changelog/"
 productName: Workers AI
 productLink: "/workers-ai/"
 entries:
+  - publish_date: "2025-11-25"
+    title: Black Forest Labs FLUX.2 dev now available
+    description: |-
+      - [`@cf/black-forest-labs/flux-2-dev`](/workers-ai/models/flux-2-dev/) is now available on Workers AI! Read the [changelog](/changelog/2025-11-25-flux-2-dev-workers-ai/) to get started.
   - publish_date: "2025-11-13"
     title: Qwen3 LLM and Embeddings available on Workers AI
     description: |-
diff --git a/src/content/workers-ai-models/flux-2-dev.json b/src/content/workers-ai-models/flux-2-dev.json
@@ -0,0 +1,73 @@
+{
+    "id": "3ae8936e-593e-4fb2-85ee-95dd8a057588",
+    "source": 1,
+    "name": "@cf/black-forest-labs/flux-2-dev",
+    "description": "FLUX.2 [dev] is an image model from Black Forest Labs where you can generate highly realistic and detailed images, with multi-reference support.",
+    "task": {
+        "id": "3d6e1f35-341b-4915-a6c8-9a7142a9033a",
+        "name": "Text-to-Image",
+        "description": "Generates images from input text. These models can be used to generate and modify images based on text prompts."
+    },
+    "created_at": "2025-11-24 15:44:06.050",
+    "tags": [],
+    "properties": [
+        {
+            "property_id": "terms",
+            "value": "https://bfl.ai/legal/terms-of-service"
+        },
+        {
+            "property_id": "partner",
+            "value": "true"
+        },
+        {
+            "property_id": "price",
+            "value": [
+                {
+                    "unit": "per input 512x512 tile, per step",
+                    "price": 0.00021,
+                    "currency": "USD"
+                },
+                {
+                    "unit": "per output 512x512 tile, per step",
+                    "price": 0.00041,
+                    "currency": "USD"
+                }
+            ]
+        }
+    ],
+    "schema": 
+        {
+        "input": {
+            "type": "object",
+            "properties": {
+                "multipart": {
+                    "type": "object",
+                    "properties": {
+                        "body": {
+                            "type": "object"
+                        },
+                        "contentType": {
+                            "type": "string"
+                        }
+                    },
+                    "required": [
+                        "body",
+                        "contentType"
+                    ]
+                }
+            },
+            "required": [
+                "multipart"
+            ]
+        },
+        "output": {
+            "type": "object",
+            "properties": {
+                "image": {
+                    "type": "string",
+                    "description": "Generated image as Base64 string."
+                }
+            }
+        }
+    }
+}
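The output schema above returns the generated image as a Base64 string in the `image` field. A minimal sketch of turning that string into raw bytes follows; the helper name `base64ToBytes` is hypothetical, and the sample input is just the first four bytes of a PNG signature rather than real model output.

```javascript
// Hypothetical helper: decodes a Base64 string into a Uint8Array.
// atob is available globally in Workers and in recent Node.js versions.
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "iVBORw==" encodes the bytes 0x89 0x50 0x4E 0x47 (the start of a PNG file).
const pngSignature = base64ToBytes("iVBORw==");
```

In a Worker, the decoded bytes could be returned with something like `new Response(base64ToBytes(resp.image), { headers: { "Content-Type": "image/png" } })`. The `image/png` content type here is an assumption, since the schema does not state the image format.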