How to use Dreambooth to put anything in Stable Diffusion (Colab notebook)

Dreambooth a subject into a Stable Diffusion model

Dreambooth is a way to put anything — your loved one, your dog, your favorite toy — into a Stable Diffusion model. We will introduce what Dreambooth is, how it works, and how to perform the training.

This tutorial is aimed at people who have used Stable Diffusion but have not used Dreambooth before.

You will follow the step-by-step guide to prepare your training images and use our easy 1-click Colab notebook for dreambooth training. No coding is required!

You can put real-life objects or persons into a Stable Diffusion model and generate images in different styles and settings.

Training image.
AI image.
Training image.
AI image.

Did you know that many custom models are trained using Dreambooth? After completing this tutorial, you will know how to make your own.

You will first learn what Dreambooth is and how it works, but you can skip to the step-by-step guide if you are only interested in the training.

Software

To follow this tutorial and perform the training, you will need to

Either option grants you access to the training notebook and example images.

Note:

  1. This notebook can only train a Stable Diffusion v1.5 checkpoint model. Train an SDXL LoRA model if you are interested in the SDXL model.
  2. This notebook can be run with a free Colab account. A paid account allows you to use a faster V100 GPU, which speeds up the training.

What is Dreambooth?

Published in 2022 by the Google research team, Dreambooth is a technique to fine-tune diffusion models (like Stable Diffusion) by injecting a custom subject into the model.

Why is it called Dreambooth? According to the Google research team,

It’s like a photo booth, but once the subject is captured, it can be synthesized wherever your dreams take you.

Sounds great! But how well does it work? Below is an example from the research article. Using just 3 images of a particular dog (Let’s call her Devora) as input, the dreamboothed model can generate images of Devora in different contexts.

dreambooth examples from the dreambooth research article
With as few as 3 training images, Dreambooth injects a custom subject to a diffusion model seamlessly.

How does Dreambooth work?

You may ask why you can't simply fine-tune the model with a few additional training steps on those images. The issue is that doing so is known to cause catastrophic failure due to overfitting (since the dataset is quite small) and language drift.

Dreambooth resolves these problems by

  1. Using a rare word for the new subject (notice I used a rare name, Devora, for the dog) so that the word does not already carry much meaning in the model.
  2. Prior preservation on the class: to preserve the meaning of the class (dog in this case), the model is fine-tuned so that the subject (Devora) is injected while the model's ability to generate images of the class (dog) is preserved, as sketched below.
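Conceptually, prior preservation just adds a second term to the usual denoising loss: one term fits the instance images, and the other keeps the model close to what it already generates for the plain class prompt. Below is a minimal PyTorch sketch of the idea. It is illustrative only, not the notebook's actual training code; the tensors and the prior_loss_weight value are placeholders.

import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred_instance, noise_instance,
                    noise_pred_class, noise_class,
                    prior_loss_weight=1.0):
    # Denoising loss on the instance images ("a photo of Devora dog")
    instance_loss = F.mse_loss(noise_pred_instance, noise_instance)
    # Prior-preservation loss on images the original model generated
    # for the class prompt ("a photo of a dog")
    prior_loss = F.mse_loss(noise_pred_class, noise_class)
    return instance_loss + prior_loss_weight * prior_loss

# Illustrative tensors standing in for predicted vs. true noise on latents
pred_i, true_i = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
pred_c, true_c = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
print(dreambooth_loss(pred_i, true_i, pred_c, true_c))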

There’s another similar technique called textual inversion. The difference is that Dreambooth fine-tunes the whole model, while textual inversion injects a new word, instead of reusing a rare one, and fine-tunes only the text embedding part of the model.

What you need to train Dreambooth

You will need three things

  1. A few custom images
  2. A unique identifier
  3. A class name

In the above example, the unique identifier is Devora, and the class name is dog.

Then you will need to construct your instance prompt:

a photo of [unique identifier] [class name]

And a class prompt:

a photo of [class name]

In the above example, the instance prompt is

a photo of Devora dog

Since Devora is a dog, the class prompt is

a photo of a dog
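In other words, the two prompts are just templates filled in with your unique identifier and class name. A trivial sketch using the example above:

unique_identifier = "Devora"
class_name = "dog"

instance_prompt = f"a photo of {unique_identifier} {class_name}"  # "a photo of Devora dog"
class_prompt = f"a photo of a {class_name}"                       # "a photo of a dog"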

Now that you understand what you need, let's dive into the training!

Step-by-step guide

Step 1: Prepare training images

As in any machine learning task, high-quality training data is the most important factor in your success.

Take 3-10 pictures of your custom subject. The pictures should be taken from different angles.

The subject should also be in a variety of backgrounds so that the model can differentiate the subject from the background.

I will use this toy in the tutorial.

Step 2: Resize your images to 512×512

In order to use the images in training, you will first need to resize them to 512×512 pixels for training with v1 models.

BIRME is a convenient site for resizing images.

  1. Drop your images to the BIRME page.
  2. Adjust the canvas of each image so that it shows the subject adequately.
  3. Make sure the width and height are both 512 px.
  4. Press SAVE FILES to save the resized images to your computer.
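If you would rather script this step than use BIRME, here is a minimal Pillow sketch that center-crops and resizes everything in a folder to 512×512. The folder names are placeholders; adjust them to your setup.

from pathlib import Path
from PIL import Image, ImageOps

src, dst = Path("raw_images"), Path("training_images")  # placeholder folders
dst.mkdir(exist_ok=True)

for path in src.iterdir():
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        continue
    img = Image.open(path).convert("RGB")
    # Center-crop to a square and resize to 512x512 for v1 models
    img = ImageOps.fit(img, (512, 512), Image.LANCZOS)
    img.save(dst / f"{path.stem}.png")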

Alternatively, you can download my resized images if you want to go through the tutorial.

To download the training images:

Step 3: Training

I recommend using Google Colab for training because it saves you the trouble of setting up. The following notebook is modified from Shivam Shrirao's repository but is more user-friendly. Follow the repository's instructions if you prefer other setups.

The whole training takes about 30 minutes. If you don't use Google Colab much, you can probably complete the training without getting disconnected. Otherwise, purchase some compute credits to avoid the frustration of being disconnected mid-training.

The notebook will save the model to your Google Drive. Make sure you have at least 2GB of free space if you choose fp16 (recommended) and 4GB if you don't.

1. Open the Colab notebook.

2. Enter the MODEL_NAME. You can use the Stable Diffusion v1.5 model (HuggingFace page). You can find more models on HuggingFace here. The model name should be in the format user/model.

runwayml/stable-diffusion-v1-5

3. Enter the BRANCH name. See the screenshot below for the model and branch names.

fp16

Huggingface Model name and branch name

4. Put in the instance prompt and class prompt. For my images, I name my toy rabbit zwx, so my instance prompt is:

photo of zwx toy

My class prompt is:

photo of a toy

5. Click the Play button ( ▶️ ) on the left of the cell to start processing.

6. Grant permission to access Google Drive. Currently, there’s no easy way to download the model file except by saving it to Google Drive.

7. Press Choose Files to upload the resized images.

8. It should take 10-30 minutes to complete the training, depending on which runtime machine you use. When it is done, you should see a few sample images generated from the new model.

9. Your custom model will be saved in your Google Drive, under the folder Dreambooth_model. Download the model checkpoint file and install it in your favorite GUI.

That’s it!
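You don't need to run anything yourself, but for the curious: the notebook essentially fills in your settings and launches diffusers' train_dreambooth.py script through accelerate. Here is a rough Python sketch of that kind of call, based on the training log the notebook prints. The paths, flags, and values shown are the notebook's defaults for this example and will differ with your settings; it assumes you run it from a folder containing the training script.

import subprocess

# A sketch of the kind of command the notebook assembles (values are examples).
cmd = [
    "accelerate", "launch", "train_dreambooth.py",
    "--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5",
    "--revision=fp16",
    "--instance_prompt=photo of zwx toy",
    "--class_prompt=photo of a toy",
    "--instance_data_dir=/content/data/instance",
    "--class_data_dir=/content/data/class",
    "--output_dir=/content/output",
    "--with_prior_preservation", "--prior_loss_weight=1.0",
    "--resolution=512", "--train_batch_size=1", "--train_text_encoder",
    "--use_8bit_adam", "--learning_rate=5e-06", "--lr_scheduler=constant",
    "--num_class_images=50", "--max_train_steps=350",
]
subprocess.run(cmd, check=True)

Knowing which knobs exist here (learning rate, max training steps, class images) will make the tips at the end of this article easier to follow.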

Step 4: Testing the model (optional)

You can also use the second cell of the notebook to test the model.

Prompt:

oil painting of zwx in style of van gogh

Using this prompt with my newly trained model, I am happy with what I got:

Note that you have to run this cell right after the training is complete. Otherwise your notebook may be disconnected.
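If you ever want to reproduce this test cell outside the notebook (for example, on your own GPU machine), it is only a few lines of diffusers code. A minimal sketch follows; the output folder is the notebook's default location and the code requires a CUDA GPU.

import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

OUTPUT_DIR = "/content/output"  # where the notebook writes the diffusers-format model

pipe = StableDiffusionPipeline.from_pretrained(
    OUTPUT_DIR, safety_checker=None, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

image = pipe("oil painting of zwx in style of van gogh").images[0]
image.save("zwx_van_gogh.png")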

Using the model

You can use the model checkpoint file in AUTOMATIC1111 GUI. It is a free and full-featured GUI. You can run it on Windows, Mac, and Google Colab.

Using the model with the Stable Diffusion Colab notebook is easy. Your new model is saved in the folder AI_PICS/models in your Google Drive. It is available to load without moving any files around.

If you use AUTOMATIC1111 locally, download your dreambooth model to your local storage and put it in the folder stable-diffusion-webui > models > Stable-diffusion.

How to train from a different model

Stable Diffusion v1.5 may not be the best model to start with if you already know what genre of images you want to generate. For example, you should use the Realistic Vision model (see below) if you ONLY want to generate realistic images with your model.

You will need to change the MODEL_NAME and BRANCH.

Currently, the notebook only supports training half-precision v1 and v2 models. You can tell by looking at the model size. It should be about 2GB for v1 models.
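The ~2GB figure is simply the parameter count times two bytes per half-precision weight. A quick back-of-the-envelope check (the parameter counts below are approximate):

# Approximate parameter counts for a Stable Diffusion v1 checkpoint
params = {"unet": 860e6, "text_encoder": 123e6, "vae": 84e6}
total = sum(params.values())

print(f"fp16: {total * 2 / 1e9:.1f} GB")  # ~2.1 GB -> half-precision checkpoint
print(f"fp32: {total * 4 / 1e9:.1f} GB")  # ~4.3 GB -> full-precision checkpoint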

You can find the model name and the branch name below on a Huggingface page. The page shown below is here.

Huggingface Model name and branch name

Example: a realistic person

Realistic Vision v2 is a good base model for training a new model of a realistic person. Use the following settings for a woman.

MODEL_NAME:

SG161222/Realistic_Vision_V2.0

BRANCH:

main

Instance prompt:

photo of zwx woman

Class prompt:

photo of woman

To download the training images:

Below are some samples of the training images.

Here are a few images from the new model. You can find the training images in the Dreambooth guide.

Tips for successful training

Each training dataset is different. You may need to adjust the settings.

Training images

The quality of the training images is arguably the most important factor for a successful Dreambooth training.

If you are training a face, the dataset should be made up of high-quality images that clearly show the face. Avoid full-body images where the face is too small.

The images should ideally have different backgrounds. Otherwise, the background may show up in the AI images.

You don’t need too many images. 7-10 images are enough. Quality is more important than quantity.

Training steps

It is possible to over-train the model so that the AI images all look too much like the training images. The goal is to train just enough so that the model can generalize your subject to all scenes.

Reduce the steps if the model is over-trained.

Typically, you need 100 to 500 steps to train.

Class prompt

Adding more qualifiers to the class prompt helps the training.

For example, if the subject is a middle-aged woman, instead of using

Photo of a woman

You can use:

Photo of a 50 year old woman

You can also add ethnicity. It helps when the subject belongs to a minority group.

The dreambooth token

Although the traditional wisdom is to use a rare token like zwx or sks, it is not always the best choice.

This is especially true when training the face of a realistic person.

You may be better off using a generic name like Jane, Emma, or Jennifer. Prompt the model with a single name to see what you get, and pick a name that already looks like your subject.
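You can script this audition: generate one image per candidate name with your base model and keep the name whose output already resembles your subject. A minimal sketch (the base model and the names are only examples; it requires a CUDA GPU and the diffusers library):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V2.0", torch_dtype=torch.float16
).to("cuda")

for name in ["Jane", "Emma", "Jennifer"]:  # candidate tokens to audition
    image = pipe(f"photo of {name} woman", num_inference_steps=25).images[0]
    image.save(f"candidate_{name}.png")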

Learning rate

A larger learning rate trains the model faster, so you need fewer steps. But if it is too large, the training won't work and you will get bad results.

If you don’t get good results, you can experiment with reducing the learning rate. But at the same time, you should increase the training steps. Roughly, if you reduce the learning rate by half, you should double your training steps.
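As a rule of thumb, keep the product of the learning rate and the training steps roughly constant. A tiny sketch of that bookkeeping (the baseline values are only examples):

def adjusted_steps(base_steps, base_lr, new_lr):
    # Halving the learning rate roughly doubles the steps needed
    return int(base_steps * base_lr / new_lr)

print(adjusted_steps(300, 5e-6, 2.5e-6))  # 600 steps at half the learning rate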

Further readings

I recommend the following articles if you want to dive deeper into Dreambooth.


By Andrew

Andrew is an experienced engineer with a specialization in Machine Learning and Artificial Intelligence. He is passionate about programming, art, photography, and education. He has a Ph.D. in engineering.

228 comments

  1. Hello, thank you for the guide, Is there a way to contact you directly?
    (I support you, paying, from Italy!!!)
    Anyway, I need a fine-tuned model that generates high-quality photos of a meter (like a water/gas meter). I made it following your steps, but I wonder if there is a way to add a caption for each photo.
    I need to set the “consumption” and the “meter id” before generation, and I want them to appear on the generated meter. I’d like to add a caption for this, i.e., telling the model what the id and the consumption are for each photo so I can ask for specific ones while generating.
    I don’t know if I explained it well. If you need further information, please contact me.
    Thank you very much in advance

    1. You can reach me directly using the “contact us” form on this site.

      SD is not good at printing numbers. It can be tough. XL model is a bit better.

      But if you want to try: try using a prompt like a meter with numbers “1234” on it. If that doesn’t work, you need training images with meters with numbers.

  2. Hello.
    Why is there no step-by-step guide on how to train your own model with automatic1111/Dreambooth?
    I don’t care about Google Colab, which is very restricted. I want to use the tool that you support.

    1. I tried that, and it was finicky, so I don’t want to rely on it. In general, it’s a bad idea to do so many different things with one piece of software.

      What option do you need?

      1. I am going to echo @Juergen DIetl. I was hoping to do it locally.
        Any way to give a general idea of how to pursue it with A1111? Then the rest of us can chat on the forums to figure it out.
        I would love to see a step-by-step guide with photos and see what your end result is like, whether it works or not. It is the examples I am looking for.
        But I get it…you are BUSY!

        FYI – always grateful for your work and for finding this site!

        Thank you!

        V

  3. Love the tutorial! However, is it compatible with SDXL? I get “OSError: Error no file named model_index.json found in directory /content/output.” when I try to run it in Colab. I have a pro account.

  4. Can you please make something more complicated? You need to be a major at M.I.T. to understand all this…. ;-(

      1. Hi Andrew, I’m a different user and I have a question. Everything was running fine in the Colab notebook, until I got some weird errors. I should note I’m on a Mac with an Intel chip/processor or whatever.

        Anyways, I mistakenly downloaded the 6.0 Realistic Vision from Civitai and installed it. I tried to delete its files, but it seems some remained. When I type Vision V6, some tensor files come up that take up about 4 GB each, and I can’t delete them. I don’t know why this is happening or how to resolve it.

        The reason I bring this up is, eventually I used a 5.1 No VAE version from huggingface and put that into the Google Colab notebook. The notebook was processing just fine, until it seems it ran into some weird problems with leftover files from previous versions of Realistic Vision I downloaded and attempted to delete. I can’t code or read code, but that’s what it looks like from the error log. Here is the part where it seems stuff started to go wrong, if you need more, let me know. Please help!

        Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from bitsandbytes) (1.11.4)
        Requirement already satisfied: numpy=1.21.6 in /usr/local/lib/python3.10/dist-packages (from scipy->bitsandbytes) (1.23.5)
        Installing collected packages: bitsandbytes
        Successfully installed bitsandbytes-0.42.0
        accelerate configuration saved at /root/.cache/huggingface/accelerate/default_config.yaml
        2.1.0+cu121
        2024-02-01 05:58:34.696601: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
        2024-02-01 05:58:34.696658: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
        2024-02-01 05:58:34.699075: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
        2024-02-01 05:58:36.304472: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
        usage: train_dreambooth.py [-h] --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH
        [--revision REVISION] [--variant VARIANT]
        [--tokenizer_name TOKENIZER_NAME] --instance_data_dir INSTANCE_DATA_DIR
        [--class_data_dir CLASS_DATA_DIR] --instance_prompt INSTANCE_PROMPT
        [--class_prompt CLASS_PROMPT] [--with_prior_preservation]
        [--prior_loss_weight PRIOR_LOSS_WEIGHT]
        [--num_class_images NUM_CLASS_IMAGES] [--output_dir OUTPUT_DIR]
        [--seed SEED] [--resolution RESOLUTION] [--center_crop]
        [--train_text_encoder] [--train_batch_size TRAIN_BATCH_SIZE]
        [--sample_batch_size SAMPLE_BATCH_SIZE]
        [--num_train_epochs NUM_TRAIN_EPOCHS]
        [--max_train_steps MAX_TRAIN_STEPS]
        [--checkpointing_steps CHECKPOINTING_STEPS]
        [--checkpoints_total_limit CHECKPOINTS_TOTAL_LIMIT]
        [--resume_from_checkpoint RESUME_FROM_CHECKPOINT]
        [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
        [--gradient_checkpointing] [--learning_rate LEARNING_RATE] [--scale_lr]
        [--lr_scheduler LR_SCHEDULER] [--lr_warmup_steps LR_WARMUP_STEPS]
        [--lr_num_cycles LR_NUM_CYCLES] [--lr_power LR_POWER] [--use_8bit_adam]
        [--dataloader_num_workers DATALOADER_NUM_WORKERS]
        [--adam_beta1 ADAM_BETA1] [--adam_beta2 ADAM_BETA2]
        [--adam_weight_decay ADAM_WEIGHT_DECAY] [--adam_epsilon ADAM_EPSILON]
        [--max_grad_norm MAX_GRAD_NORM] [--push_to_hub] [--hub_token HUB_TOKEN]
        [--hub_model_id HUB_MODEL_ID] [--logging_dir LOGGING_DIR]
        [--allow_tf32] [--report_to REPORT_TO]
        [--validation_prompt VALIDATION_PROMPT]
        [--num_validation_images NUM_VALIDATION_IMAGES]
        [--validation_steps VALIDATION_STEPS]
        [--mixed_precision {no,fp16,bf16}]
        [--prior_generation_precision {no,fp32,fp16,bf16}]
        [--local_rank LOCAL_RANK]
        [--enable_xformers_memory_efficient_attention] [--set_grads_to_none]
        [--offset_noise] [--snr_gamma SNR_GAMMA]
        [--pre_compute_text_embeddings]
        [--tokenizer_max_length TOKENIZER_MAX_LENGTH]
        [--text_encoder_use_attention_mask] [--skip_save_text_encoder]
        [--validation_images VALIDATION_IMAGES [VALIDATION_IMAGES ...]]
        [--class_labels_conditioning CLASS_LABELS_CONDITIONING]
        [--validation_scheduler {DPMSolverMultistepScheduler,DDPMScheduler}]
        train_dreambooth.py: error: unrecognized arguments: Vision V6.0 B1
        Traceback (most recent call last):
        File "/usr/local/bin/accelerate", line 8, in
        sys.exit(main())
        File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 47, in main
        args.func(args)
        File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1023, in launch_command
        simple_launcher(args)
        File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 643, in simple_launcher
        raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
        subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=SG161222/Realistic', 'Vision', 'V6.0', 'B1', '--revision=main', '--instance_prompt=photo of olis beautiful woman', '--class_prompt=photo of a beautiful woman', '--class_data_dir=/content/data/class', '--instance_data_dir=/content/data/instance', '--output_dir=/content/output', '--with_prior_preservation', '--prior_loss_weight=1.0', '--seed=1337', '--resolution=512', '--train_batch_size=1', '--train_text_encoder', '--use_8bit_adam', '--gradient_accumulation_steps=1', '--learning_rate=5e-06', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=50', '--sample_batch_size=4', '--max_train_steps=350']' returned non-zero exit status 2.
        /content
        ---------------------------------------------------------------------------
        OSError Traceback (most recent call last)
        in ()
        103
        104 if 'pipe' not in locals():
        --> 105 pipe = StableDiffusionPipeline.from_pretrained(OUTPUT_DIR, safety_checker=None, torch_dtype=torch.float16).to("cuda")
        106 pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
        107 g_cuda = None

        3 frames
        /usr/local/lib/python3.10/dist-packages/diffusers/configuration_utils.py in load_config(cls, pretrained_model_name_or_path, return_unused_kwargs, return_commit_hash, **kwargs)
        368 config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name)
        369 else:
        --> 370 raise EnvironmentError(
        371 f"Error no file named {cls.config_name} found in directory {pretrained_model_name_or_path}."
        372 )

        OSError: Error no file named model_index.json found in directory /content/output.

        1. I think your model name is incorrect. It seems to contain spaces, which it shouldn’t. You need to pick a diffusers model where you see folders similar to the one used in the default settings.

          1. OK, it seems the model was successfully created. There were output images that resembled the person I was training for, but they seemed slightly deformed/off. I tested output using the second cell underneath, but most of them didn’t resemble the subject, or when they did, they had deformities.
            What could be causing this? How do I correct for this? Thank you.

      2. It is not clear to me where I should open the Stable Diffusion screen.
        (I have downloaded a Stable Diffusion on my laptop, but I still don’t understand how to practice with this tutorial.) I have also opened the 1-click file and I see the tutorial with photos of a woman, but I have no idea, and no explanation, of what one should do there to learn from it. You should clarify what a beginner needs to open or touch to follow the tutorial. (Downloading the files, yes, that is simple… changing the format to 512, yes, that is also simple… but how do I bring all of that into Stable Diffusion? Help, I still don’t understand how to follow this tutorial.)

  5. Hey there, thanks for all the great instructions.

    Today when I tried to run the Colab, I received the following errors. Any ideas?

    /content
    ---------------------------------------------------------------------------
    OSError Traceback (most recent call last)
    in ()
    17
    18 if 'pipe' not in locals():
    --> 19 pipe = StableDiffusionPipeline.from_pretrained(OUTPUT_DIR, safety_checker=None, torch_dtype=torch.float16).to("cuda")
    20 pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
    21 g_cuda = None

    3 frames
    /usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
    116 kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    117
    --> 118 return fn(*args, **kwargs)
    119
    120 return _inner_fn # type: ignore

    /usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    1115 cached_folder = pretrained_model_name_or_path
    1116
    --> 1117 config_dict = cls.load_config(cached_folder)
    1118
    1119 # pop out "_ignore_files" as it is only needed for download

    /usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py in _inner_fn(*args, **kwargs)
    116 kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
    117
    --> 118 return fn(*args, **kwargs)
    119
    120 return _inner_fn # type: ignore

    /usr/local/lib/python3.10/dist-packages/diffusers/configuration_utils.py in load_config(cls, pretrained_model_name_or_path, return_unused_kwargs, return_commit_hash, **kwargs)
    368 config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name)
    369 else:
    --> 370 raise EnvironmentError(
    371 f"Error no file named {cls.config_name} found in directory {pretrained_model_name_or_path}."
    372 )

    OSError: Error no file named model_index.json found in directory /content/output.

    1. It cannot load the initial model. You will need to put in a Hugging Face repository with a diffusers model (a checkpoint model doesn’t work). See the article.

  6. I am unable to train the model on my photos successfully. I have tried 5 five times with different settings, and nothing has worked.

    1. My first attempt was with the default settings. I used a set of 20 images, and it was overtrained.
    2. I read some of the articles you posted and saw that a lower setting is needed to train faces. I changed the learning rate to 1e-6 and ran it with 2000 steps (100 per image). This was also overtrained and very distorted. The prompt had no effect, and all photos looked like my training set.
    3. Based on another article, I changed the learning rate to 1e-5 and lowered the steps to 1500. The resulting images were not as bad, but when I ran the last cell, the output looked like a different person.
    4. I decided to go even lower with the steps, and at 1000, the output looked like someone else.
    5. My last attempt was with the learning rate of 1e-5 and steps set at 1600. The result was images that looked like distorted versions of my training set. The prompt did not change anything. I even changed it, but all the generated pictures resemble the training set.

    I don’t know what to do. Please advise me on the optimal settings to train the checkpoint on my face. I need it to be flexible but still look like me. I have images with different lighting, backgrounds, hairstyles, and clothing. I have no idea why it won’t work. Thank you.

      1. Great! I’ll send over a few of the images. I won’t send them all because some are NSFW, as I’m training a model to make artwork for my erotica.

  7. I used this Colab a ton and it works great, but since around Christmas (when the default model name was changed from SD1.5 to Realistic2), all the woman models I make seem to be more facially accurate, but now it is almost impossible to do nudes without using a LoRA. I tried a ton of prompts, positive and negative, with boosted strength; I can’t seem to do nude art with the model anymore while it was working fine before. I don’t know what was changed, but it feels like an NSFW filter or something. I normally train with 2000 to 3000 steps, which might be it, but I’ll do a test with lower training steps if you have an idea of why the model changed. (I tried making the model with both SD1.5 and Realistic as a base, and I have the same problem with both.)

    1. Did some tests after posting my comment, and with 400 training steps, I can make nudes with my model. So I guess I’ll train with fewer steps; so far it seems to work just as well and is easier to work with. Good thing I keep all my images for training; I will have to redo a few. The weird thing is I didn’t have the “nude” problem with a 2000-step model before…

    2. There’s no NSFW filter. You can still use the SD 1.5 base model. The change was made because most people don’t use the 1.5 model anymore, so the default was set to the most frequent use case for convenience.

  8. Hi, quick question.

    I realized that all the pictures I’m generating are very similar to those images I used during the training phase, in terms of environment. Is there a way to improve it, maybe using regularization images?

    1. Your training images should have some diversity in their backgrounds. Otherwise, the training won’t know that the background is not associated with your keyword.

      You can try changing the background of some of your images in software. (I have a recent tutorial on that.)

      You can also try picking a keyword that is close to what you want to train, and reduce the number of steps.

  9. Every time I try to train a diffusers model that is XL and different from the default example, this error appears. Any way to fix it?

    OSError: Error no file named model_index.json found in directory /content/output.

  10. Hi Andrew, I’ve bought your Colab and this is very good!

    Which model do you think, currently, is the best one to train with and create new images keeping my face consistent through them? RealVis 3.0?

    Thanks!

    1. I only tested Realistic Vision v2. It is good for both Dreambooth and LoRA.

      The SDXL version of RealVis LoRA seems to be a bit harder to train but still works.

  11. Just want to say this worked like a charm and also how great the site is in general. I literally couldn’t have progressed with SD / A1111 at all without this site as a resource.

  12. I can’t generate images. In Stable Diffusion, I just get an error and no image is created. It indicates that the training process went well without errors, and the model is saved in Google Drive, but I can’t generate images.

    1. Do you see the sample images showing your subject?

      For a standard environment for troubleshooting, please use the A1111 colab notebook to test the model. Post the error message if you still see it.

      1. Yes, I see them, but I can’t generate images.
        I’ve attached the error message.
        Thanks in advance for your help.

        NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs: query : shape=(2, 4096, 8, 40) (torch.float16) key : shape=(2, 4096, 8, 40) (torch.float16) value : shape=(2, 4096, 8, 40) (torch.float16) attn_bias : p : 0.0 `flshattF` is not supported because: xFormers wasn't build with CUDA support Operator wasn't built - see `python -m xformers.info` for more info `tritonflashattF` is not supported because: xFormers wasn't build with CUDA support requires A100 GPU Only work on pre-MLIR triton for now `cutlassF` is not supported because: xFormers wasn't build with CUDA support Operator wasn't built - see `python -m xformers.info` for more info `smallkF` is not supported because: xFormers wasn't build with CUDA support dtype=torch.float16 (supported: {torch.float32}) max(query.shape[-1] != value.shape[-1]) > 32 Operator wasn't built - see `python -m xformers.info` for more info unsupported embed per head: 40

        1. It has more to do with your settings. You haven’t mentioned what GUI you are using. If you use A1111, remove the --xformers flag, or switch to a T4 in place of the A100 GPU.

  13. Hi,
    I can’t find the model I created in Google Drive. Everything goes successfully, but the model is not saved in Drive. Thanks for the help.

      1. The whole process goes through successfully. Even at the end of the process, I check in Colab and everything works, but the AP_PICS folder is not in Google Drive.

      2. These are the last lines from the end of the whole process.
        Maybe you can recognize a problem there?
        I should point out that in previous versions everything worked fine.
        Thanks in advance.
        Loading pipeline components…: 71% 5/7 [00:00<00:00, 9.39it/s]{'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
        Loaded scheduler as PNDMScheduler from `scheduler` subfolder of runwayml/stable-diffusion-v1-5.
        Loading pipeline components…: 100% 7/7 [00:00<00:00, 10.66it/s]
        {'timestep_spacing', 'prediction_type'} was not found in config. Values will be initialized to default values.
        Configuration saved in /content/output/vae/config.json
        Model weights saved in /content/output/vae/diffusion_pytorch_model.safetensors
        Configuration saved in /content/output/unet/config.json
        Model weights saved in /content/output/unet/diffusion_pytorch_model.safetensors
        Configuration saved in /content/output/scheduler/scheduler_config.json
        Configuration saved in /content/output/model_index.json
        Steps: 100% 200/200 [09:28<00:00, 2.84s/it, loss=0.352, lr=5e-6]
        /content
        Loading pipeline components…: 100%
        6/6 [00:19<00:00, 4.93s/it]
        You have disabled the safety checker for by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
        100%
        25/25 [00:15<00:00, 1.61it/s]
        Traceback (most recent call last):
        File "/content/diffusers/scripts/convert_diffusers_to_original_stable_diffusion.py", line 330, in
        save_file(state_dict, args.checkpoint_path)
        File "/usr/local/lib/python3.10/dist-packages/safetensors/torch.py", line 281, in save_file
        serialize_file(_flatten(tensors), filename, metadata=metadata)
        safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 2, kind: NotFound, message: "No such file or directory" })
        [*] Converted ckpt saved at /content/drive/MyDrive/AI_PICS/models/my_dreambooth_model.safetensors
        Dreambooth completed successfully. It took 19.1 minutes.

          1. Thanks, it works now. Is there maybe a way to load the model through Google Colab? My computer is too weak for a local installation.

          2. Is there a way to resend a model to Google Drive from Colab if the transfer didn’t work?

  14. Training Module in Diffusionbee: I am testing the training using the latest beta Diffusionbee V2.4.3. My M2 Ultra is clocking an iteration every 2.12 seconds. I am about halfway through; this will be interesting as it is a breeze to use the interface.

  15. Hi Andrew.

    I get this error:

    OSError Traceback (most recent call last)

    in ()
    99 if 'pipe' not in locals():
    100 scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)
    --> 101 pipe = StableDiffusionPipeline.from_pretrained(model_path, scheduler=scheduler, safety_checker=None, torch_dtype=torch.float16).to("cuda")
    102 g_cuda = None
    103

    3 frames

    /usr/local/lib/python3.10/dist-packages/diffusers/configuration_utils.py in load_config(cls, pretrained_model_name_or_path, return_unused_kwargs, return_commit_hash, **kwargs)
    368 config_file = os.path.join(pretrained_model_name_or_path, subfolder, cls.config_name)
    369 else:
    --> 370 raise EnvironmentError(
    371 f"Error no file named {cls.config_name} found in directory {pretrained_model_name_or_path}."
    372 )

    OSError: Error no file named model_index.json found in directory output.

    Thank you for everything.

  16. It all works perfectly for me, thank you so much! I have trained many models successfully since I started using Chrome on my M2 Mac Studio Ultra (Safari was causing failures). Meanwhile, Diffusionbee beta 2.3 supports SDXL, and I was wondering if you have a template to create such models, and also which realistic initial model you would link to on Hugging Face for this.

      1. Andrew, I tried changing the suggested base model from SG161222/Realistic_Vision_V2.0 (main) to SG161222/Realistic_Vision_V3.0_VAE (main), and the final four generated images on Colab are four small Asian boys’ faces, not my supplied faces which work with V2.0. Why is this?