Holding an umbrella
SDXL can't do action scenes except poorly. For example, a punch to the face: the fist connecting with the face and the person being hit showing a proper reaction, in either a comic-book style or a realistic one. DALL-E 3 can do this to a certain extent, at least much better than SD.
Action is a limitation of all models due to training. It’s hard to create action poses with motion blur when the model was trained on high-quality images free of blur. We need an SDXL model trained specifically on action images.
I've heard it speculated that DALL-E routes prompts to different models depending on what the prompt is. I don't know if that's official or proven.
DALL-E is comically bad. I got a subscription for it and had to cancel because it just would not listen. ChatGPT even agreed that it was embarrassing how many times it would falsely flag a prompt.
[deleted]
All training images are generally stationary. So there’s nothing to caption.
[deleted]
You’re talking about something different than I am. I’m saying models are not currently trained on images with motion. You’re saying they can be. Please try actively reading what I’m saying and reply to my thoughts or start a new thread.
[deleted]
This post was about SDXL.
[deleted]
I still don’t understand why you’re talking to me about something I never said.
I will tend to believe your opinion on this, u/HotNCuteBoxing
Quick glance at their profile, and…yep. Will defer to their opinion lol
[I think someone heard your challenge and accepted.](https://civitai.com/models/487118/punch-in-the-face?modelVersionId=541681)
Hahaha. I appreciate the effort.
Empty swimming pools, people holding guns, people smoking cigarettes, animals driving a 1992 Chevy Blazer, QWERTY keyboards, monster trucks made of silicone breast implants etc etc etc
Guns are the most baffling to me. So many action scenes we could make, but no gun ever looks good, straight, and at the right angle. Sooooo many porn options, but no guns.
It's because every gun in the training data is labelled "gun". It's vague and non-descriptive. A gun can mean a pea shooter, a revolver, an SMG, a sniper rifle, a staple gun, or many others. All of these are radically different looking. It's always a captioning problem. OpenAI proved this, and it was further established with PonyXL.
Mechanical objects are harder than organic shapes. People have no straight lines, and their shapes and proportions vary naturally. On the other hand, for any specific angle there's only one right way to draw a Glock 17; any variation becomes a visible defect. There are also probably lots of very inaccurate illustrations of guns in the training data, since many artists are bad at drawing them or unconcerned with technical accuracy.
That’s because AI is smart. It knows what *really* matters.
"make love not war" - Stable Diffusion
Objectification and dehumanization hardly qualifies as "love".
I tried to make a LoRA for guns and smoking both. The results are hit or miss.
For the animal driving a car and the monster truck, you could probably get that to work using the prompt-scheduling syntax. I once tried making pictures of gingerbread men with guns fighting in a trench made of cake. It didn't work as a basic prompt, but if I started with a picture of a soldier and switched the prompt to a gingerbread man at the right point, I was able to get what I wanted.
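For reference, the trick above maps onto the AUTOMATIC1111-style prompt-editing syntax `[from:to:when]`, which swaps part of the prompt after a given fraction of the sampling steps. The exact wording and the 0.4 switch point below are illustrative guesses you would tune per image, not the commenter's actual prompt:

```text
[a soldier:a gingerbread man:0.4] holding a rifle, fighting in a trench made of cake
```

Switch too early and you fall back into the basic-prompt failure mode; too late and the gingerbread look never takes over the composition.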
upside down people
Can't make Mussolini 🙃
Flip the laptop around
pls don’t be mean to australians
Yeah, someone doing a cartwheel is very difficult for all image AIs. I have tried it on quite a few; only DALL-E 3 came close.
https://preview.redd.it/09s77q2twl3d1.jpeg?width=1024&format=pjpg&auto=webp&s=62d1d8af454142ee7b6e3a6643d3acaade2af5fc Dalle.3
Can't even do thumbs down.
An archer shooting a bow. I've never seen **any** image generation model so far get the alignment of the arrow and the bowstring right even remotely consistently. (The fact that both of these are really thin probably doesn't help with encoding)
Oh yeah, I've tried to generate some archery shots and it's hilarious. If you specify a compound it can almost look convincing at a glance, until you realise the cables are going all over the place.
Anything where the subject has to interact with something. Holding or grabbing objects in non weird ways, using a leash correctly, using guns, swords, using stones to make a fire, unusual stuff like going swimming in a hazmat suit etc.
Everything about complex designs. I think it’s just great for rendering and melting things together, ultra realistically but it doesn’t do design properly
People lying down, especially when trying to get someone lying on their back, seen from the side, at about the same height as the surface. This usually just produces nightmare fuel.
Not an issue with lineart, but if you're not using lineart you have a very low chance of getting a person lying down correctly. It can happen, but it's a mess. You should be able to get people lying down if you provide decent lineart and use PyraCanny in Fooocus, or some kind of Canny ControlNet in your favorite AI program.
Pretty much every time I try to show a woman lying down, it changes her to a child regardless of negative prompts 🤦 And don't try to have a grown woman with a teddy bear lol
Time to pick a different checkpoint.
The PonyXL models know how to do this quite well. You could use one of them to get the initial image and then use another checkpoint to refine it.
Outfit consistency. Even if I am using a LoRA, it changes in small details.
Hardest are things related to weapons: tridents, archers, hammer warriors, etc. (and people with three eyes or just one eye).
I have difficulties creating literally everything apart from portraits, creatures, close-ups or landscapes. Any action (like opening a door, holding a weapon, texting on a phone, ...) leads me to Inpaint, ControlNet and Photoshop work that can last for hours to get fixed.
Sexualized content. Mind you, I'm only using checkpoints and loras from religious sites... but *surely* there must be a way to see some risqué imagery.
I need a Mormon bubble porn Lora asap
I made a LoRA to show some ankle, I got you covered.
Hot in 1890
Literally anything that isn’t common. For example, a while ago I wanted a couple of photorealistic illustrations for a novel of mine where the protagonist is a 6’6” tall woman, standing next to the co-protagonist, a male of a more “normal” stature, perhaps 6’, and it was impossible: it would render almost everything else pretty well (their other physical features, clothes, background, etc.), but the woman would consistently be *shorter* or, at best, his same height. I assume it’s because there are so few pictures of unusually tall women standing next to average-height men that the models simply don’t know how to visualize it. The only workaround would be to “trick” the model by instead describing a very short man, but that would usually draw him with the limb proportions typical of people with dwarfism. I assume the models just don’t have enough imagery of what I actually asked.

More generally, anything that isn’t common in nature. The other day a guy complained in the ChatGPT subreddit that he couldn’t make DALL-E draw him a spider with two legs; they all came with six or so. Similarly, I assume the model just doesn’t have any two-legged spiders in its training data to get an idea of how that looks.
Did you try using controlnet and regional prompting?
It definitely knows spider and legs, and it does OK-ish with numbers, but it doesn't realize that the six spindly things that are part of "spider" are "legs" you could count and number. It probably sees them as part and parcel of "spider".
>Literally anything that isn’t common.

Tell me about it. Ask it for an image of Thomas the Tank Engine all cheeked up and it just doesn't know what the fuck to do.
You could train a LoRA with 70+ pictures, I think. It would probably need more images than usual to break the model bias; with 200+ quality images it might be easy.
Most Stable Diffusion models have no way to specify height. For this you need to draw it first or make a reference picture.
Consistent hands
Particular species of dinosaurs. Mythological creatures like griffins or centaurs. Scenes with a dozen people in them, each reacting to the situation in a different, but appropriate way (like The Last Supper). A child hanging upside down from monkey bars. Video game buildings from a top-down, front-facing point of view, not isometric.
https://preview.redd.it/nn0gbvsbyk3d1.png?width=1024&format=png&auto=webp&s=69995fbca8b998ee5e36795bebb37a1d0d11423d

and they think AI is going to kill us all
An upside-down umbrella. Edit: This is my benchmark for AI being able to use reasoning to turn an object upside down when it’s almost certainly not trained on images of upside-down umbrellas. GPT-4o got close, but it still looked really wonky.
How did you get access to the 4o image generator? The public version of 4o still uses DALL-E.
Ok, whatever is currently running on 4o, then.
many, many things, which I will not say
Undressing, or naked or half-naked people with the rest of their clothes on the floor.

Both SD 1.5 and SDXL can do, with the right models (and not some obscure LoRA or checkpoint or something):

- Naked people
- Clothed people
- Half-naked people (e.g. swimwear, underwear, etc.)

But it's almost impossible to create realistic images of people in the process of putting on their clothes or taking them off, or, say, a picture of a naked couple making out with their clothes on the floor. I've even tried training LoRAs and TIs (1.5 only; didn't have the time and resources for SDXL ones) for these concepts, to no avail.
Characters standing with their back towards the viewer. Possible but a lot of janky results.
I have 99% good results with the "showing back" prompt.
“Their back is to the camera” or “view from behind” also works well
I make such photos all the time. Just use "photo from behind" and "looking into the distance/horizon" or "looking away".
[deleted]
Here you go, with Ideogram:

https://preview.redd.it/mcr2p4vlzj3d1.png?width=1874&format=png&auto=webp&s=d6f0b42ccdfaf11515777f6535ec88b9841e0226

(Sorry for the double post, replied to the wrong one 😅)
https://preview.redd.it/p65sj3ds0k3d1.jpeg?width=1536&format=pjpg&auto=webp&s=ea0be0e7fa0624def20ce0d24604f9b7694ea292
wtf 🤯
Can confirm. I have tried with DALL-E, Midjourney, and SD 1.5/SDXL, and none of them generated square wheels. wtf
Ideogram can do it apparently, first try with the prompt "a bicycle with square wheels", style "illustration":

https://preview.redd.it/3zl93qhhzj3d1.png?width=1874&format=png&auto=webp&s=bb0885c24b1ad2b511e37b176455785ef6d48e77
To me the bigger wtf is how the hell Ideogram is actually able to do it. In the ontologies that these models have learned a square wheel is probably almost as much of an oxymoron as a square circle.
Yeah it's interesting. At least square wheel is contradictory enough to be a known phrase that appears in pop culture. I'd be more surprised if they could do something like "triangle wheels".
The Ideogram team came from Google, where they were at the forefront of diffusion pioneering, so they're good and it's only going to get better. They're the real deal.
Depicting a broken cup is extremely difficult. Even ControlNet isn't much help, and not only with SD: other AI image-generation services also render it very unnaturally.
Closed eyes. Actually an easy concept, but I can't find any models that do it (maybe I haven't explored enough). No matter what I put in the prompt, it's always open eyes, looking straight at me like she wants me so bad.
Instead of "closed eyes", use "sleeping eyes". Sometimes a combination of the two works even better. With BonoboXL, "sleeping eyes" works almost every time, but with EpicrealismXL I find I need to use both to get it consistent.
Well the eyes are closed but she's lying in bed now. Is that some kind of sign?
Haha did you describe an action first? This one was "a photo of a girl standing at a bus stop, sleeping eyes" https://imgur.com/a/eXUgrR0
Yes, this is exactly what I wanted to do. Which model did you use?
That one was epicrealism v6
👍 Thanks
https://i.imgur.com/fnFfJYk.png Which one?
I was using the SDXL version. This one here https://civitai.com/models/277058?modelVersionId=484695
Thank you.
use "reading a book" and the eyes never open
That works, but not for the scenario I want. However, the other guy's tip of using "sleeping eyes" and "closed eyes" together works most of the time.
Put "opened eyes" in the negatives also.
Mirrors
"Cutoff" content, like part of a person's arm hidden inside a box while their hand is visible. AI can sometimes do it, but there are always errors.
I find "intangible" in the negatives helps somewhat with that.
Katana and archery, two of my favorite sports
Tools such as wrenches and real power tools.
I’m unable to get a three headed dragon wearing cowboy hats. SDXL/SD3 keeps generating three separate dragons. Dall-e seems to follow the prompt fine. Same with a samurai riding a sea horse. SDXL/SD3 keeps generating a horse.
Anybody talking on a telephone. SD3 can't do this either.
"A man is being loud and obnoxious on a coffee date; the woman is bored and unamused, maybe even annoyed, and it's apparent in her face." Or any other variation of that. It has NEVER worked for me, anywhere, with any model, not just SDXL. Somehow "coffee date" ALWAYS implies an image of BOTH smiling.
Don't expect A.I. to understand the nuances of human language. Just replace "coffee date" with "in a coffee shop", and describe each of the subjects as you want them to appear in the image.

https://preview.redd.it/vflgpay67n3d1.jpeg?width=1216&format=pjpg&auto=webp&s=d3555a8dd01d23fe3251e3f0f1dcc91e0b9ee77a

Photo of a young man and a young woman in a coffee shop. The man is talking loudly and animated. The woman is bored.
Designing simple yet elegant wood furniture. There's always something wrong with the perspective, or it is missing legs. Some of the ideas are great though.

(But I'm also still learning to use all the tools, so there's that.)
Carrying a person... I recently tried to create an image of a superhero carrying someone and nope, couldn't do it.
Text to 3D STL.
Guitar smashing. Probably smashing *anything* that's not normally, well... smashed. But SD/SDXL just cannot fathom a way to interact with a guitar other than one hand by the pickups, one hand on the neck.

Also, really weirdly, athletic footwear. SDXL seems to have zero clue what softball/football/soccer/baseball cleats or spikes are, gives hilariously inappropriate results if you use the Commonwealth term "football boots", and mostly draws either basketball hi-tops or plimsolls if you prompt "track shoes" or "running shoes".
Predefined text on a banana that is in some scenery.
Anything underwater is difficult.
This is totally random, but I had the hardest time trying to get any yellow fruit on a yellow background today. I wanted to stay with the same aesthetic as the other images in the series, so I didn't venture out of my family of checkpoints, but every other color worked just fine... I just ended up with a banana on a black background, or a lemon on an orange background. Weird quirk 🤷🏻♂️

https://preview.redd.it/a1ztx0ajil3d1.png?width=8192&format=png&auto=webp&s=d86cb373b7bb3db2e4291429a70db6a881973f4d
In case it might be helpful ([For Real XL v0.5](https://civitai.com/models/432244/forrealxl))

https://preview.redd.it/rf941o88do3d1.png?width=896&format=png&auto=webp&s=81f41196e9be6511b5db58a0fb7a078feb9fe34d
I actually figured out what I was doing wrong. I was messing with the color tones by passing a black image in and even at 1.00 denoise, it was still somehow impacting only the background of only yellow images. Weird.
https://preview.redd.it/58pcrl3ddo3d1.png?width=896&format=png&auto=webp&s=fa706aeace78aba57dbc557f672c8326f53e548c
https://preview.redd.it/e49bp54fdo3d1.png?width=896&format=png&auto=webp&s=e487ef20502694599302a9acf6f2eab76cced5e6
https://preview.redd.it/ho8ehmmgdo3d1.png?width=896&format=png&auto=webp&s=5b09c282601bba472f5165e1bb9b67e975744c07
Holding cards, like in a game of poker. Usually one very distorted card appears, but never a full hand of cards.
Buildings with shifted perspective; all our trials ended up wonky.
Industrial complex
Discworld
Doing Spartan helmets without the fur on top
AI is very good at reaching 90% due to statistical techniques, but the last 10% is always much harder. There is always going to be some hybrid human/AI collaboration to some extent, with stuff like inpainting and ControlNet.
Any industrial equipment. Boxes and shipping containers apparently are tagged often enough to give it at least *some* ideas (or maybe they're simple enough that errors don't come up), but a conveyor belt? An extruder? A drill? Nothing sensible.
normal hands every time
Mirrors, try and make a girl doing her makeup, It has no idea how she would be reflected.
Hair covering a person's face, Sadako-style.
Historic North American Aboriginal peoples, their clothing, dwellings, way of life. Buffalo hunting, teepees, canoeing, kayaking, etc. Prairie life homesteading, wagons, horses, cowboys and RNWMP. I'd also like to see the ability to accurately depict turn of the century coal mining and realistic steam trains.
So far no AI can illustrate the trolley problem.
Beastmen in anything but anime style.

https://preview.redd.it/0s6x1n860o3d1.jpeg?width=1840&format=pjpg&auto=webp&s=825b0f752f6732f21f6e2470b9ae85ccff171a26
Warhammer 40K; it doesn't matter what the subject is, nothing looks good.
Clothes if they’re not being worn and are not neatly folded. It doesn’t understand what a casually strewn or hanging garment should look like
It can't draw a crab
Legit spent half an hour trying to get it to generate someone floating face-down in a lake. :(
A {funny|cute} barn owl {holding|waving with} a red griddle pan.

That's InvokeAI syntax for dynamic prompts. And yes, I'm trying to get that on a birthday card *lol*
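If you want to preview every concrete prompt a `{a|b}` alternation produces, you can expand it locally with a few lines of Python. This is a toy sketch of the combinatorial expansion only (it is not InvokeAI's actual implementation and ignores nested braces):

```python
import itertools
import re

def expand_dynamic_prompt(template: str) -> list[str]:
    """Expand every {a|b|c} alternation into all concrete prompts."""
    # Collect the option lists, in order of appearance.
    groups = re.findall(r"\{([^{}]*)\}", template)
    options = [g.split("|") for g in groups]
    # Replace each {...} group with a str.format placeholder.
    skeleton = re.sub(r"\{[^{}]*\}", "{}", template)
    # Cartesian product over the option lists gives every combination.
    return [skeleton.format(*combo) for combo in itertools.product(*options)]

prompts = expand_dynamic_prompt(
    "A {funny|cute} barn owl {holding|waving with} a red griddle pan."
)
```

Here the two alternations yield 2 × 2 = 4 concrete prompts, which you could feed to a batch run one by one.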
Power armor. Now how am I supposed to touch up my SamusxMaster Chief hentai?
photorealistic food
Anything that challenges a common concept. AI can do a person holding an apple but can't do it the other way around.
Hands.
Mermaids are so full of fail. There are a few cliche mermaid poses/looks/styles that it does well, but if you try to get it to do anything nuanced it will give you monsters. I just sketch and inpaint "fish tail" now rather than prompt for mermaids.
A triangle. A simple line drawing of a triangle.

https://preview.redd.it/7v283hr0yk3d1.jpeg?width=474&format=pjpg&auto=webp&s=07d9f505b6cf7fdab16bc1f669f99382f02f6652
"simple black line drawing of a perfect triangle, geometric, symmetrical, white background", [Mohawk v2](https://civitai.com/models/144952/mohawk)

https://preview.redd.it/87hhoqizeo3d1.png?width=1024&format=png&auto=webp&s=33b6198a4631aebbb8e1e0c664aa62843bb70e75