Anatomy of a Prompt

Tags
Prompt engineering
Stable Diffusion
Text2Image
Published
March 7, 2023
Author
Arman Chaudhry
 

Anatomy of a Prompt

 
A prompt is structured as follows:
 
💭
!dream “Your prompt here” -optional_modifiers_here
 
The modifiers are optional, but they help enhance the image. Getting started is that simple, so try it out now!
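
To make the structure concrete, here is a minimal Python sketch that splits a command into its prompt text and its modifiers. The regular expression and the function name `split_command` are my own illustration, not the bot's actual parser; the sketch assumes the prompt is wrapped in either straight or curly double quotes.

```python
import re

# Hypothetical pattern (an assumption, not the bot's real parser):
# the "!dream" trigger, a quoted prompt, then whatever modifiers follow.
COMMAND_RE = re.compile(
    r'^!dream\s+["“](?P<prompt>.*?)["”]\s*(?P<modifiers>.*)$'
)


def split_command(command: str) -> tuple[str, list[str]]:
    """Return (prompt_text, modifier_tokens) for a !dream command."""
    match = COMMAND_RE.match(command.strip())
    if match is None:
        raise ValueError(f"not a !dream command: {command!r}")
    # Modifiers are split on whitespace, e.g. "-H 768" -> ["-H", "768"].
    return match.group("prompt"), match.group("modifiers").split()
```

Calling `split_command('!dream "a red fox" -H 768')` separates the prompt `a red fox` from the modifier tokens `-H` and `768`; a command with no modifiers simply yields an empty modifier list.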

Modifiers Overview

 
While a simple prompt can produce good results, getting familiar with the available modifiers makes it easier to get what you’re looking for, and to get it consistently. Here is a quick overview of them; later sections go into more detail.
 
Note that modifiers are case-sensitive: for example, -h (help) and -H (height) are different modifiers.
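
Several short flags in this guide differ only by case, so it is worth seeing what a case-sensitive lookup looks like. The sketch below is only an illustration: the table `MODIFIERS` and the function `describe` are hypothetical names, and the flag meanings are copied from this overview.

```python
# Short flags and their meanings, as listed in this guide.
MODIFIERS = {
    "-h": "help",
    "-H": "height",
    "-W": "width",
    "-C": "cfg_scale",
    "-n": "number",
    "-i": "separate-images",
    "-g": "grid",
    "-A": "sampler",
    "-s": "steps",
    "-S": "seed",
    "-t": "show tokens",
    "-a": "ASCII art",
}


def describe(flag: str) -> str:
    """Exact, case-sensitive lookup: the flag is never lower- or upper-cased."""
    return MODIFIERS.get(flag, "unknown modifier")
```

With this table, `describe("-s")` gives `steps` while `describe("-S")` gives `seed`, and a mistyped `-w` is reported as an unknown modifier instead of being treated as width.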
 
  1. -h or --help This will return a list of all the modifiers you can use, their default settings and options.
  1. -H or --height This sets the height of your image and takes values that are multiples of 64. The default is 512 (images are 512x512 by default). Changing the height or width can cause artifacts such as duplicated limbs or people, so some trial and error might be needed when generating images with a non-standard aspect ratio. See the section on custom widths and heights before modifying. Example prompt is: !dream “Your prompt here” -H 768
  1. -W or --width Same as height: this takes multiples of 64 and the default is 512. Changing it might increase the number of images with artifacts, but you can still generate great images with a non-standard aspect ratio. See the section on custom widths and heights before modifying. Example prompt is: !dream “Your prompt here” -W 768
  1. -C or --cfg_scale Classifier-free guidance (CFG) scale is one of the most asked-about modifiers. There is a section dedicated to it later in this guide, as well as a link to a visual comparison of CFG settings in the additional references section. In short, it changes how strongly the AI follows your prompt; the default is 7. I recommend not changing it until you’ve read the section dedicated to CFG scale. Example prompt is: !dream “Your prompt here” -C 12.0
  1. -n or --number This sets how many images will be returned for your request. Default is 1 image and you can increase it up to a maximum of 9 images per request. Example prompt is: !dream “Your prompt here” -n 9
  1. -i or --separate-images Images used to come in a grid, and this modifier specified that you wanted each image as a separate file. This is now the default mode, so you do not need to specify it, but I included it in case you see -i in the bot’s interpretation of your prompt or in others’ prompts. Example prompt is: !dream “Your prompt here” -i
  1. -g or --grid The opposite of -i, this will return your generation as one image file, with all images put together in a grid. This is useful when you want to easily compare images or rapidly prototype prompts and settings. Example prompt is: !dream “Your prompt here” -g
  1. -A or --sampler The sampler is what the AI model uses to decide how to generate your image, and it is a setting I don’t recommend changing unless you are a power user. The output image is almost the same whichever sampler you use, with 2 notable exceptions (check the linked sampler study in the additional resources section). Default is k_lms and options are k_lms, ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2, and k_dpm_2_ancestral. Example prompt is: !dream “Your prompt here” -A k_euler
  1. -s or --steps The AI model starts from random noise and then iteratively denoises it until you get your final image. This modifier decides how many denoising steps it will go through. Default is 50, which is perfect for most scenarios. For reference, at around 10 steps you generally have a good idea of the composition and whether you will like the image, and at around 20 it is very close to finished. If cfg_scale and sampler are at their default settings, the difference between 20 steps and 150 (the maximum) is often hard to tell, so if you want your images generated faster, try lowering the steps. Increasing steps often (but not always) adds finer detail and fixes artifacts. Example prompt is: !dream “Your prompt here” -s 20
  1. -S or --seed All generated images start from random noise, and the seed determines what that random noise will be. By default the bot picks seeds randomly for you, but if you like an image you generated, you can reuse its seed (with the same prompt and modifiers) and get the exact same result back. This is most useful when you want to make small variations to the prompt or modifiers: reusing the seed keeps the generated images close to the original composition you liked, which helps you home in on your desired result. Example prompt is: !dream “Your prompt here” -S 12345
  1. -t This lets you see the tokenized version of your prompt. What are tokens? They are how the AI model sees and interprets your inputs. Currently there is a 77 token limit, including the “startoftext” and “endoftext” tokens, so effectively you have 75 tokens of room for your prompt. This modifier is useful when you have a long prompt and aren’t sure whether all of it will be seen and used by the AI. If your prompt is too long, you might get an error or warning stating that your prompt was truncated, which means the parts after the 77 token limit were thrown away. Also, don’t worry if the tokens seem to break your words into multiple parts or numbers get separated with spaces; that is expected. Example prompt is: !dream “Your prompt here” -t
  1. -a and --ac The -a modifier returns your generated image as ASCII art. If you enjoy leet hacker aesthetics, you can try this out and get something to put in your terminal. You can also add the -ac modifier to control the number of columns for the ASCII art (minimum 40, maximum 160). The result will be sent as a text file. Example prompts are !dream “Your prompt here” -a and !dream “Your prompt here” -a -ac 128
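
If you assemble prompts in a script before pasting them into chat, the limits from this overview can be checked up front. The following is a rough Python sketch under stated assumptions, not part of the bot: `build_dream_command` is a hypothetical helper, it uses straight quotes around the prompt, and the lower bound of 1 for steps is my assumption (the guide only states the maximum of 150).

```python
def build_dream_command(prompt, height=None, width=None, cfg_scale=None,
                        number=None, steps=None, seed=None):
    """Assemble a !dream command, rejecting values outside the guide's limits."""
    parts = [f'!dream "{prompt}"']

    # Height and width must be multiples of 64 (both default to 512).
    for flag, size in (("-H", height), ("-W", width)):
        if size is not None:
            if size <= 0 or size % 64 != 0:
                raise ValueError(f"{flag} must be a positive multiple of 64, got {size}")
            parts.append(f"{flag} {size}")

    if cfg_scale is not None:
        parts.append(f"-C {cfg_scale}")

    # At most 9 images per request.
    if number is not None:
        if not 1 <= number <= 9:
            raise ValueError(f"-n must be between 1 and 9, got {number}")
        parts.append(f"-n {number}")

    # 150 is the stated maximum; the minimum of 1 is an assumption.
    if steps is not None:
        if not 1 <= steps <= 150:
            raise ValueError(f"-s must be between 1 and 150, got {steps}")
        parts.append(f"-s {steps}")

    if seed is not None:
        parts.append(f"-S {seed}")

    return " ".join(parts)
```

For example, `build_dream_command("a red fox", height=768, steps=20)` produces `!dream "a red fox" -H 768 -s 20`, while a height of 700 or a request for 10 images raises a `ValueError` before anything is sent.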