BetterWaifu’s Hentai Prompt Guide (Stable Diffusion & Pony Diffusion)

By gerogero

Updated: October 22, 2024

Here are some notes on how we structure prompts to get better hentai generations. This guide is the result of constant experimentation and of lessons learned from operating BetterWaifu.com, where users create tens of thousands of generations every day.

The map and the territory

First, learn about Danbooru and read its How-to.

If AI hentai generation is the strange and foreign territory we find ourselves operating in, then Danbooru is our map.

Danbooru is the largest anime imageboard in the world, and practically all anime AI generators are trained on its images. Many other ‘booru’ sites, such as Rule34 and Gelbooru, use the same layout, but Danbooru is the largest.

On Danbooru, volunteers tag every image comprehensively by its content. These tags range from physical characteristics (“large_breasts”, “red_hair”) to objects (“book”) to actions (“fellatio”).

You can use these tags directly in your prompts. (There is no difference between an underscore and a space in prompts.)
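As a small illustration of the underscore/space point, here is a minimal sketch that turns Danbooru-style tags into a comma-separated prompt. The helper names `normalize_tag` and `tags_to_prompt` are made up for this example and are not part of any tool.

```python
def normalize_tag(tag: str) -> str:
    """Replace underscores with spaces and collapse stray whitespace."""
    return " ".join(tag.replace("_", " ").split())


def tags_to_prompt(tags) -> str:
    """Join normalized tags into a comma-separated prompt string."""
    return ", ".join(normalize_tag(t) for t in tags)


# Tags copied from Danbooru usually contain underscores.
prompt = tags_to_prompt(["red_hair", " book ", "from_below"])
print(prompt)
```

Either spelling should work in the prompt, so this step is only about keeping your own prompts consistent and easy to read.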

Keep these in mind:

  • The more images a tag has, the stronger the training. The AI can only generate what it knows, and what it knows is determined by the images it’s been trained on. That means the number of images a tag has on Danbooru is usually a good estimate of whether the AI can generate it.
  • Minimal tagging. Many people pad their prompts with filler words. This is a technique from the early days of AI generation, circa 2022, and it is no longer needed. Too many tags add noise and lead to less precise results. Keep things to the strict minimum: put down only what you absolutely want to see.
    e.g. don’t put ‘nsfw’ in the prompt when it is obviously an explicit scene already
  • Prompt what you see, not what you know. Don’t use tags for things that are not visible in the final image.
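The first two points can be sketched as a small filter: drop repeated tags, and drop tags that have too few images to be well trained. This is only an illustration. The function `prune_tags`, the threshold of 1000, and the post counts below are all placeholders invented for the example, not real Danbooru numbers; look up the real count on each tag's page.

```python
def prune_tags(tags, post_counts, min_posts=1000):
    """Keep each tag once, and only if it has at least min_posts images.

    post_counts maps a tag (written with spaces) to its image count.
    Tags missing from post_counts are treated as having zero images.
    """
    seen = set()
    kept = []
    for tag in tags:
        key = " ".join(tag.replace("_", " ").split())
        if not key or key in seen:
            continue  # skip blanks and repeats
        seen.add(key)
        if post_counts.get(key, 0) >= min_posts:
            kept.append(key)
    return kept


# Placeholder counts for illustration only (NOT real Danbooru data).
example_counts = {"red hair": 500_000, "book": 200_000, "some obscure tag": 12}

kept = prune_tags(["red_hair", "book", "red hair", "some_obscure_tag"], example_counts)
print(kept)
```

A hard cutoff is a simplification: a rare tag may still work, just less reliably, so treat the threshold as a hint rather than a rule.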

What NSFW stuff can I generate?

On Danbooru, check these lists of tags of Sex acts and Sexual positions. If a tag has many images, it is usually possible to generate in BetterWaifu.

Structure

Don’t know how to start? Go and see the Tag Groups and think about what is most important visually in your dream image.

I usually like to follow a format by using 5 groups of words, with sub-categories inside.

I try to separate those concepts with newlines, which helps me make adjustments quickly. The newlines do not affect the generation.

Keep in mind that it’s a very rough guideline. It’s ok to mix this order, especially inside a category.

1. Composition

  1. Style (photorealistic, color palette, etc)
  2. Point of view
  3. Light, Time of day (day, night, sunset, …)

2. Subjects

  1. Main subject(s) (1boy, 1girl, object, landscape, …)
    (1girl is just Danbooru’s way of saying ‘1 girl’, and separating it from 2girls and 3girls and so on)

3. Actions

  1. Sex acts
  2. Sexual positions

4. Body

  1. Posture
  2. Main features (size, weight, body type, body fat, skin color, …)
  3. Body elements (breast size, nipples, ass, …)
  4. Face related (eye color, hairstyle, …)
  5. Expressions (happy, surprised, serious, determined, …)
  6. Clothes

5. Background

  1. Main environment (indoor, outdoor, …)
  2. Weather (wind, rain, snow, …)
  3. Objects (furniture, vehicles, …)
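The five-group structure above can be sketched as a small helper that writes one line per group, in order, and skips any group you leave empty. The names `GROUP_ORDER` and `build_prompt` are hypothetical and chosen for this example only; the tags are ordinary Danbooru-style tags.

```python
# Order of the five groups described above.
GROUP_ORDER = ["composition", "subjects", "actions", "body", "background"]


def build_prompt(groups: dict) -> str:
    """Build a multi-line prompt: one comma-separated line per non-empty group."""
    lines = []
    for name in GROUP_ORDER:
        tags = [t.strip() for t in groups.get(name, []) if t.strip()]
        if tags:
            # Trailing comma so the lines still read as one tag list.
            lines.append(", ".join(tags) + ",")
    return "\n".join(lines)


prompt = build_prompt({
    "composition": ["from below", "sidelighting"],
    "subjects": ["1girl", "solo"],
    "body": ["long hair", "glasses", "white tank top"],
    "background": ["indoor", "library"],
})
print(prompt)
```

Here the "actions" group is simply left out, which matches the idea that no category is mandatory.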

Wait, isn’t that a bit long and complicated?

Yes it is! But you don’t have to have any category you don’t want, and a single word can suffice for a category. It’s really about what level of control you want. Let’s take a look at some examples.

I’m thinking of a black maid outfit shot from below. The background doesn’t matter but I’d like cool cinematic effects. For inspiration, I would click on the links above in each of the categories and subcategories I’m interested in.

Composition: from_below, Subjects:

About prompt length

The prompt is split into words (or chunks of words), which are converted into numerical representations called tokens. Depending on which model is used and how tokens are normalized, certain parts of your prompt will get more or less attention. My rule of thumb: the longer the prompt, the less control you’ll have over the whole prompt.
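To get a feel for how long a prompt is, here is a deliberately crude estimate that counts one token per word and one per comma. It is not the real tokenizer: the CLIP text encoders used by SD 1.5 and SDXL use byte-pair encoding, so rare words can take several tokens and the true count is usually higher. The function name `rough_token_estimate` is made up for this sketch. The 75-token figure is the usable part of CLIP's 77-token window; how a front end handles longer prompts varies.

```python
import re

CLIP_WINDOW = 75  # usable tokens per window (77 minus start and end markers)


def rough_token_estimate(prompt: str) -> int:
    """Crude estimate: one token per word-like piece, one per comma."""
    pieces = re.findall(r"[A-Za-z0-9']+|,", prompt)
    return len(pieces)


prompt = "sidelighting, light particles, 1girl, solo, white tank top, indoor, library"
estimate = rough_token_estimate(prompt)
print(f"about {estimate} tokens, window is {CLIP_WINDOW}")
```

For an exact count, run the prompt through the actual tokenizer of the model you are using.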

Here are some resources if you want to read more on the matter:

Tokens with SDXL and SD15 models – Alen Knight

Token normalization & Weight interpretation – BlenderNeko (Github)

Example

Here is an example following the structure. I usually add some line breaks to get a better view of the main areas of the prompt.


sidelighting, light particles,

1girl, ginger, solo, smirk

sweat, freckles, small breasts,
ginger hair, long hair, straight hair,
blue eyes, glowing eyes, glasses, looking at viewer,

smile, smirk, grin, frown,
white tank top,

indoor, library,
sunset, sunny, daylight,
desk, chair,

As you can see, the line breaks don’t perfectly divide the 5 fields of the structure. Depending on what you want to generate, it can make more sense to visually separate smaller and longer parts. Here, the subject, actions and posture of the body are grouped together: splitting them would create a visual mess more than anything.
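Since the line breaks are only there for readability, a multi-line prompt can be flattened back into a plain tag list. The sketch below does that for a shortened copy of the example and also reports tags that appear more than once, which is a quick way to keep the prompt minimal. The helper names `flatten_prompt` and `repeated_tags` are hypothetical.

```python
from collections import Counter


def flatten_prompt(prompt: str) -> list:
    """Split a multi-line prompt into a flat list of tags, ignoring blank entries."""
    parts = prompt.replace("\n", ",").split(",")
    return [p.strip() for p in parts if p.strip()]


def repeated_tags(tags) -> list:
    """Return the tags that occur more than once, in first-seen order."""
    counts = Counter(tags)
    return [tag for tag, n in counts.items() if n > 1]


# Shortened copy of the example prompt above.
example = """
1girl, ginger, solo, smirk

smile, smirk, grin, frown,
white tank top,
"""

tags = flatten_prompt(example)
dupes = repeated_tags(tags)
print(tags)
print(dupes)
```

In this excerpt "smirk" is listed twice, so one occurrence can be dropped without changing the intent.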

Use-cases

Here are some use cases and favorite keywords. I try to organise them by logical group, from the most general to the most specific. I recommend cherry-picking the ones that make sense for you rather than copy-pasting the whole line.

You’ll also find a reference to where I place it in the structure. I’ll use an ID with the format [xx.yy], using the acronym of the category and its sub-category.

Keep in mind that some of them could also be placed earlier or later in the order (e.g. upside down could go under [compos.pov] or [body.posture]).

Generic

Here is what I use in almost all my prompts to influence the overall quality of the image. I usually add extra negative keywords only if something I don’t like appears, not before. Again, I like to keep things short and simple.

The following list covers some classic cases, detailing both positive and negative prompts.

Note: Negative+ is an additional list from which I cherry-pick according to the situation. In my experience, negative keywords related to anatomy don’t always improve the results, so I just keep them aside in case I need them.

Scenes

Subject