Flux AI: A Beginner-Friendly Overview

Since the release of the Flux.1 AI models on August 1, 2024, we have seen a flurry of activity around them. We have all been trying to figure out how to use them and build tools around them. Now that the dust is starting to settle, it is a good time to summarize the tools and resources for Flux.

In this article, I review what we know about the Flux AI model and the resources available.

What is the Flux AI model?

Developed by Black Forest Labs, the Flux AI model excels at generating photorealistic images. Three models are available: Pro, Dev, and Schnell.

  • Flux.1 Pro: The highest-quality Flux model, intended for professional use where image quality matters most. You cannot run this model locally. It is available only through APIs and image generation services.
  • Flux.1 Dev: A faster, guidance-distilled model that trades some quality for speed. This is an open model widely used by the community.
  • Flux.1 Schnell: An even faster Flux model that generates images in 1 to 4 sampling steps, in exchange for lower quality.

Flux.1 Dev is the most popular Flux model to run locally.

How does Flux AI work?

Like Stable Diffusion, Flux AI is a generative latent diffusion image model: it generates images by denoising random noise in the latent space.
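
To make "latent space" a bit more concrete, here is a rough sketch of the sizes involved. The numbers are assumptions that do not come from this article: an autoencoder that shrinks each side of the image by 8 and outputs 16 latent channels, followed by grouping 2x2 latent pixels into one token for the transformer. Treat it as an illustration of why working in latent space is cheaper than working on pixels, not as a specification.

```python
def latent_layout(width, height, vae_downsample=8, latent_channels=16, patch=2):
    """Return (latent_shape, num_tokens, token_dim) for a given image size.

    All default values are assumptions for illustration, not taken from the article.
    """
    multiple = vae_downsample * patch
    if width % multiple or height % multiple:
        raise ValueError(f"width and height must be multiples of {multiple}")
    lat_h, lat_w = height // vae_downsample, width // vae_downsample
    # Each patch x patch block of latent pixels becomes one transformer token.
    num_tokens = (lat_h // patch) * (lat_w // patch)
    token_dim = latent_channels * patch * patch
    return (latent_channels, lat_h, lat_w), num_tokens, token_dim


shape, tokens, dim = latent_layout(1024, 1024)
print(shape, tokens, dim)  # (16, 128, 128) 4096 64
```

Under these assumptions, a 1024x1024 image is denoised as 4,096 tokens rather than about a million pixels.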

Number of parameters

The Flux AI model has 12 billion parameters. For reference,

  • SDXL: 3.5 billion parameters
  • SD 1.5: 0.98 billion parameters

Larger models tend to perform better in generative AI, all else being equal. That is likely one reason Flux outperforms SDXL in many areas.
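
Parameter count also determines how much memory the weights need, which matters if you plan to run a model locally. The sketch below is simple arithmetic on the parameter counts listed above. It counts weights only; activations and any additional components loaded alongside the model add to the total, so real usage will be higher.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Parameter counts as quoted in this article.
models = {"Flux.1": 12e9, "SDXL": 3.5e9, "SD 1.5": 0.98e9}

for name, num_params in models.items():
    at_16_bit = weight_memory_gib(num_params, 2)  # fp16 / bf16
    at_8_bit = weight_memory_gib(num_params, 1)   # 8-bit quantized
    print(f"{name}: {at_16_bit:.1f} GiB at 16-bit, {at_8_bit:.1f} GiB at 8-bit")
```

At 16-bit precision the Flux.1 weights alone come to roughly 22 GiB, which is why quantized versions are popular on consumer GPUs.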

Model architecture

Little information has been published about the model architecture. According to Black Forest Labs' blog post, these are the distinctive features of the Flux.1 diffusion model:

  • Hybrid architecture that combines multimodal and parallel diffusion transformer blocks.
  • Flow matching
  • Rotary positional embeddings
  • Parallel attention layers
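
Of the features above, flow matching is the easiest to illustrate in a few lines. The idea is that the model learns a velocity field that carries a noise sample toward a data sample, and sampling means integrating that field step by step. The toy below is my own sketch, not Flux's code: the "model" is a hand-written velocity for a single hypothetical target point, and the time convention (t=1 is noise, t=0 is data) is an assumption.

```python
import random


def euler_sample(velocity, x, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=1 (noise) down to t=0 (data)."""
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt  # never reaches 0 inside the loop
        v = velocity(x, t)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical "data" point standing in for an image latent.
TARGET = [0.25, -0.5, 1.0]


def toy_velocity(x, t):
    # On the straight path x_t = (1 - t) * data + t * noise,
    # the velocity is noise - data, which equals (x_t - data) / t.
    return [(xi - di) / t for xi, di in zip(x, TARGET)]


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in TARGET]
for steps in (1, 4):
    print(steps, euler_sample(toy_velocity, noise, steps))
```

Because the path in this toy is perfectly straight, one Euler step already lands on the target (up to floating-point rounding). A real model predicts the velocity with a neural network conditioned on the prompt, and its paths are only approximately straight, so more steps generally help.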

Can I use Flux AI commercially?

Have a business idea that can use Flux AI models? Before writing the first line of code, you should understand that each Flux model has its own license.

Flux.1 Pro is only available through an API or an online service. You will need to check the license terms of the service you get the images from.

Flux.1 Dev is under a Non-Commercial license. The output images can be used commercially but you cannot host a generation service and charge for it.

Flux.1 Schnell is licensed under the permissive Apache 2.0 license. You can use it commercially, including the output images, and host a for-profit image generation business.

Popular ways to use Flux

Here are some different ways you can use Flux.

Text-to-image

Text-to-image turns a text prompt into an image. It is the most basic way to use a Flux model.
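
For readers who prefer code to a GUI, here is a minimal sketch using the Hugging Face diffusers library and its FluxPipeline class. The step counts and guidance values are starting points taken from the public model cards rather than from this article, and the prompt and function names are my own. Running it downloads many gigabytes of weights and needs a capable GPU, and the Dev repository requires accepting its license on Hugging Face first.

```python
def sampling_settings(variant):
    """Starting-point settings per variant (assumed from the public model cards)."""
    settings = {
        "schnell": {
            "repo": "black-forest-labs/FLUX.1-schnell",
            "num_inference_steps": 4,
            "guidance_scale": 0.0,
        },
        "dev": {
            "repo": "black-forest-labs/FLUX.1-dev",
            "num_inference_steps": 50,
            "guidance_scale": 3.5,
        },
    }
    return settings[variant]


def generate(prompt, variant="schnell", seed=0):
    """Generate one image from a text prompt and return it as a PIL image."""
    # Heavy imports live inside the function so the settings helper
    # above can be used on a machine without torch or diffusers.
    import torch
    from diffusers import FluxPipeline

    cfg = sampling_settings(variant)
    pipe = FluxPipeline.from_pretrained(cfg["repo"], torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower VRAM use
    result = pipe(
        prompt,
        num_inference_steps=cfg["num_inference_steps"],
        guidance_scale=cfg["guidance_scale"],
        generator=torch.Generator("cpu").manual_seed(seed),
    )
    return result.images[0]


# Example call (not executed here):
# generate("a photo of a red fox in a snowy forest").save("flux_fox.png")
```

Fixing the seed makes a run repeatable, which helps when comparing prompts or settings.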

Image-to-image

Image-to-image turns an existing image into another image using Flux AI.

When it is applied only to a selected area of the image, it is called inpainting.
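
The core idea of inpainting can be sketched as a mask-based blend: keep the original pixels outside the mask and take the newly generated pixels inside it. The toy below uses small nested lists in place of images. It is only an illustration of the concept; real inpainting pipelines typically work on latents and often use soft-edged masks.

```python
def composite(original, generated, mask):
    """Keep `original` where mask is 0 and take `generated` where mask is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99], [99, 99, 99]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # only the centre pixel is repainted

print(composite(original, generated, mask))
# [[10, 10, 10], [10, 99, 10], [10, 10, 10]]
```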

ControlNet

You can control the composition of the AI images generated with Flux using ControlNet. Currently, you can use the following ControlNet models:

  • Canny
  • HED
  • Depth

Flux Video workflows

Flux AI is an image model. It cannot create video directly but can be paired with an image-to-video AI model to generate a video. Below are a few options.

Flux + CogVideo

CogVideo is an open-source video model. It includes an image-to-video model that can animate an image generated with Flux.

Flux + Gen3

RunwayML’s Gen3 video generator can turn an image into a video. It is a paid service that you can only use online.

Flux + Kling

Kling AI is a state-of-the-art AI video generator. It supports image-to-video so you can use a Flux image as the first frame.

Can Flux AI generate NSFW images?

The Flux base models cannot generate NSFW images, likely because the training images are sanitized. There are LoRA models on CivitAI that you can use with the Flux model to generate NSFW content.

What’s the difference between Flux and Stable Diffusion?

They are both families of diffusion-based AI image models, but with different architectures. Many of the original developers of Stable Diffusion have worked on Flux. Stable Diffusion 1.5 is a very well-trained model, and they have surely brought that expertise to developing Flux.

More importantly, both can be run locally on your PC, which avoids the censorship and privacy concerns of online services.

The widely used local models are:

  • Stable Diffusion: SD 1.5 and SDXL.
  • Flux: Flux.1 Dev.

Can I train a Flux model?

Training a Flux.1 LoRA model is doable on PC or Google Colab. You can also use online services.

Training a full Flux.1 checkpoint model is still a work in progress and will require more GPU resources.
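
The gap between the two comes down to how many parameters are being trained. A LoRA leaves the original weight matrix frozen and learns two small low-rank matrices instead. The arithmetic below uses a hypothetical 3072x3072 layer and rank 16, both chosen by me for illustration rather than taken from the Flux model.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank matrices: (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out)


d = 3072  # hypothetical layer width, for illustration only
full = d * d  # parameters updated when training the full matrix
lora = lora_trainable_params(d, d, rank=16)

print(full, lora, f"{lora / full:.2%}")  # 9437184 98304 1.04%
```

In this example the LoRA trains about 1% of the parameters of the layer it adapts, which is a large part of why LoRA training fits on a single PC or a Colab instance.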

By Andrew

Andrew is an experienced engineer with a specialization in Machine Learning and Artificial Intelligence. He is passionate about programming, art, photography, and education. He has a Ph.D. in engineering.

2 comments

  1. This model is NOT good for ideating. I can’t get even the smallest version to run in under 68 seconds for ONE image in a non-complicated workflow.
    PC 1: RTX3090Ti
    PC2: RTX4060
