BlackSwanTW

Holding an umbrella


HotNCuteBoxing

SDXL can't do action except poorly. For example a punch to the face. The fist connecting to the face and the person being hit showing some proper reaction. Either in a comic book style or realistic. Dalle 3 can do this to a certain extent, at least much better than SD.


roychodraws

Action is a limitation of all models due to training. It’s hard to create action poses with motion blur when you trained the model on high quality images free of blur. We need an sdxl model trained specifically on action images.


299314

I've heard it speculated Dall-E chooses different models to feed its prompts to depending on what the prompt is. I don't know if it's official or proven.


Ireallydonedidit

DallE is comically bad. I got a subscription for it and had to cancel because it just would not listen. ChatGPT even agreed that it was embarrassing how many times it would falsely flag a prompt


[deleted]

[removed]


roychodraws

All training images are generally stationary. So there’s nothing to caption.


[deleted]

[removed]


roychodraws

You’re talking about something different than I am. I’m saying models are not currently trained on images with motion. You’re saying they can be. Please try actively reading what I’m saying and reply to my thoughts or start a new thread.


[deleted]

[removed]


roychodraws

This post was about SDXL.


[deleted]

[removed]


roychodraws

I still don’t understand why you’re talking to me about something I never said.


orthomonas

I will tend to believe your opinion on this, u/HotNCuteBoxing


olekingcole001

Quick glance at their profile, and…yep. Will defer to their opinion lol


red__dragon

[I think someone heard your challenge and accepted.](https://civitai.com/models/487118/punch-in-the-face?modelVersionId=541681)


HotNCuteBoxing

Hahaha. I appreciate the effort.


witzowitz

Empty swimming pools, people holding guns, people smoking cigarettes, animals driving a 1992 Chevy Blazer, QWERTY keyboards, monster trucks made of silicone breast implants etc etc etc


Arumin

Guns are the most baffling to me, so many action scenes and such we could make but no gun ever looks good, straight and at the right angle. Sooooo, so much porn options but no guns


ScionoicS

It's because every gun in the training data is labelled "gun". It's vague and nondescriptive. A gun can mean a pea shooter, a revolver, an SMG, a sniper rifle, a staple gun, or many others. All of these look radically different. It's always a captioning problem. OpenAI proved this. It was further established with PonyXL.


son-of-chadwardenn

Mechanical objects are harder than organic shapes. People have no straight lines and shapes and proportions vary naturally. On the other hand for any specific angle there's only one right way to draw a Glock17. Any variation becomes a visible defect. There's also probably lots of very inaccurate illustrations of guns in the training data since lots of artists suck at drawing them or are unconcerned about technical accuracy.


Simbuk

That’s because AI is smart. It knows what *really* matters.


CookLongjumping7404

"make love not war" - Stable Diffusion


ScionoicS

Objectification and dehumanization hardly qualifies as "love".


MyaSturbate

I tried to make a LoRA for guns and smoking both. The results are hit or miss.


Comrade_Derpsky

For the animal driving a car and the monster truck, you could probably get that to work using the prompt scheduling syntax. I once tried making pictures of gingerbread men with guns fighting in a trench made of cake. It didn't work as a basic prompt, but if I started with a picture of a soldier and switched the prompt to a gingerbread man at the right point, I was able to get what I wanted.
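
To make the "switch the prompt at the right point" idea concrete, here is a minimal sketch, assuming the AUTOMATIC1111-style prompt editing form `[from:to:when]` with `when` given as a fraction of the sampling steps. `resolve_prompt` is a hypothetical helper written for illustration, not code from any UI, and it ignores nesting and the other features a real parser supports. It only shows which text the sampler would see at a given step.

```python
import re

# Matches one non-nested [from:to:when] group, e.g. [soldier:gingerbread man:0.4].
_EDIT = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):([0-9]*\.?[0-9]+)\]")


def resolve_prompt(template: str, step: int, total_steps: int) -> str:
    """Return the prompt text in effect at `step` (0-based).

    Assumption: `when` is a fraction in [0, 1] of `total_steps`.
    """
    def pick(match: re.Match) -> str:
        before, after = match.group(1), match.group(2)
        switch_step = round(float(match.group(3)) * total_steps)
        # Before the switch point use the first subject, afterwards the second.
        return before if step < switch_step else after

    return _EDIT.sub(pick, template)


template = "a [soldier:gingerbread man:0.4] with a gun in a trench made of cake"
for step in (0, 7, 8, 19):
    print(step, resolve_prompt(template, step, total_steps=20))
```

With 20 steps and `when` set to 0.4, the first 8 steps are guided by "soldier" (which fixes the pose and composition) and the remaining steps by "gingerbread man". Where to put the switch point is trial and error.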


SupermarketIcy73

upside down people


grbal

Can't make Mussolini 🙃


FengSushi

Flip the laptop around


proxiiiiiiiiii

pls don’t be mean to australians


AI_Alt_Art_Neo_2

Yeah, someone doing a cartwheel is very difficult for all image AI. I have tried it on quite a few, and only DALL-E 3 came close.


AI_Alt_Art_Neo_2

https://preview.redd.it/09s77q2twl3d1.jpeg?width=1024&format=pjpg&auto=webp&s=62d1d8af454142ee7b6e3a6643d3acaade2af5fc Dalle.3


AnticitizenPrime

Can't even do thumbs down.


DuranteA

An archer shooting a bow. I've never seen **any** image generation model so far get the alignment of the arrow and the bowstring right even remotely consistently. (The fact that both of these are really thin probably doesn't help with encoding)


Captain_Biscuit

Oh yeah, I've tried to generate some archery shots and it's hilarious. If you specify a compound it can almost look convincing at a glance, until you realise the cables are going all over the place.


makerTNT

Anything where the subject has to interact with something. Holding or grabbing objects in non weird ways, using a leash correctly, using guns, swords, using stones to make a fire, unusual stuff like going swimming in a hazmat suit etc.


acid-burn2k3

Everything about complex designs. I think it’s just great for rendering and melting things together, ultra realistically but it doesn’t do design properly


Temp_84847399

People lying down, especially when trying to get someone lying on their back, seen from the side, at about the same height as the surface. This usually just produces nightmare fuel.


AltAccountBuddy1337

Not an issue with lineart, but if you're not using lineart you have a very, very low chance of getting a person lying down correctly. It can happen, but it's a mess. You should be able to get people lying down if you provide decent lineart and use PyraCanny in Fooocus or some kind of Canny control in your favorite AI program.


biggerboy998

Pretty much every time I try to show a woman lying down it changes it to a child regardless of negative prompts 🤦 And don't try to have a grown woman with a teddy bear lol


Comrade_Derpsky

Time to pick a different checkpoint.


Comrade_Derpsky

The PonyXL models know how to do this quite well. You could use one of them to get the initial image and then use another checkpoint to refine it.


hgoten0510

Outfit consistency. When I use a LoRA, small parts of the outfit change.


LGN-1983

Hardest are related to weapons: trident, archer, hammer warrior, etc (and people with 3 eyes or just one eye)


Curious-Thanks3966

I have difficulties creating literally everything apart from portraits, creatures, close-ups or landscapes. Any action (like opening a door, holding a weapon, texting on a phone, ...) leads me to Inpaint, ControlNet and Photoshop work that can last for hours to get fixed.


tethercat

Sexualized content. Mind you, I'm only using checkpoints and loras from religious sites... but *surely* there must be a way to see some risqué imagery.


Crimkam

I need a Mormon bubble porn Lora asap


onmyown233

I made a LoRa to show some ankle, I got you covered.


FengSushi

Hot in 1890


Landaree_Levee

Literally anything that isn’t common. For example, a while ago I wanted a couple of photorealistic illustrations for a novel of mine where the protagonist is a 6’6” tall woman standing next to the co-protagonist, a man of more “normal” stature (perhaps 6’), and it was impossible: it would render almost everything else pretty well (their other physical features, clothes, background, etc.), but the woman would consistently come out *shorter* than him or, at best, the same height. I assume it’s because there are so few pictures of unusually tall women standing next to average-height men that the models simply don’t know how to visualize it. The only way would be to “trick” the model by describing a midget-like man instead, but that would usually draw him with limb proportions typical of people with dwarfism. I assume the models just don’t have enough imagery of what I actually asked for. More generally, anything that isn’t common in nature. The other day a guy complained in the ChatGPT subreddit that he couldn’t make DALL-E draw him a spider with two legs; they all came with six or so. Similarly, I assume the model just doesn’t have any two-legged spiders in its training data to get an idea of how that looks.


AgentTin

Did you try using controlnet and regional prompting?


RestorativeAlly

It definitely knows spider and legs, and it does okay-ish with numbers, but it didn't realize that the six spindly things that are part of "spider" are "legs" that you could count and number. It probably sees them as part and parcel of "spider."


the_friendly_dildo

>Literally anything that isn’t common. Tell me about it. Ask it for an image of Thomas the Tank Engine all cheeked up and it just doesn't know what the fuck to do.


HardenMuhPants

You could train a LoRA with 70+ pictures, I think. It would probably need more images than usual to break the model bias, though. With 200+ quality images it might be easy.


Comrade_Derpsky

Most of your stable diffusion models have no way to specify height. For this you need to draw it first or make a reference picture.


I_SHOOT_FRAMES

Consistent hands


summerstay

Particular species of dinosaurs. Mythological creatures like griffins or centaurs. Scenes with a dozen people in them, each reacting to the situation in a different, but appropriate way (like The Last Supper). A child hanging upside down from monkey bars. Video game buildings from a top-down, front-facing point of view, not isometric.


[deleted]

https://preview.redd.it/nn0gbvsbyk3d1.png?width=1024&format=png&auto=webp&s=69995fbca8b998ee5e36795bebb37a1d0d11423d and they think AI is going to kill us all


[deleted]

An upside down umbrella. Edit: This is my benchmark for AI being able to use reasoning to turn an object upside down when it’s almost certainly not been trained on images of upside-down umbrellas. GPT4o got close but still looked really wonky.


MoDErahN

How did you get access to 4o image generator? Public version of 4o still uses DALLE.


[deleted]

Ok, whatever is currently running on 4o, then.


Parogarr

many, many things, which I will not say


baddrudge

Undressing or naked or half naked people with the rest of their clothes on the floor. Both SD1.5 and SDXL can do, with the right models (and not some obscure LoRA or checkpoint or something):
- Naked people
- Clothed people
- Half naked people (e.g. swimwear, underwear, etc.)
But it's almost impossible to create realistic images of people in the process of putting on their clothes or taking them off, or, say, a picture of a naked couple making out with their clothes on the floor. I've even tried training LoRAs and TIs (1.5 only, didn't have the time and resources for SDXL ones) for these concepts to no avail.


b1ackjack_rdd

Characters standing with their back towards the viewer. Possible but a lot of janky results.


Educational_Smell292

I have 99% good results with the "showing back" prompt.


bipolaridiot_

“Their back is to the camera” or “view from behind” also works well


BobFellatio

I make such photos all the time. Just use «photo from behind» and «looking into the distance/horizon» or «looking away»


[deleted]

[removed]


danamir_

Here you go, with Ideogram : https://preview.redd.it/mcr2p4vlzj3d1.png?width=1874&format=png&auto=webp&s=d6f0b42ccdfaf11515777f6535ec88b9841e0226 (Sorry for the double post, replied to the wrong one 😅)


smoowke

​ https://preview.redd.it/p65sj3ds0k3d1.jpeg?width=1536&format=pjpg&auto=webp&s=ea0be0e7fa0624def20ce0d24604f9b7694ea292


0xmgwr

wtf 🤯


Reasonable_Net_6071

Can confirm. I have tried with DALL-E, Midjourney, and SD1.5/SDXL, and none of them generated square wheels. wtf


danamir_

Ideogram can do it apparently, first try with prompt "a bicycle with square wheels", style "illustration" : https://preview.redd.it/3zl93qhhzj3d1.png?width=1874&format=png&auto=webp&s=bb0885c24b1ad2b511e37b176455785ef6d48e77


Sharlinator

To me the bigger wtf is how the hell Ideogram is actually able to do it. In the ontologies that these models have learned a square wheel is probably almost as much of an oxymoron as a square circle.


liimonadaa

Yeah it's interesting. At least square wheel is contradictory enough to be a known phrase that appears in pop culture. I'd be more surprised if they could do something like "triangle wheels".


voltisvolt

The Ideogram team came from Google, where they were at the forefront of diffusion pioneering, so they're good and it's only going to get better. They're the real deal.


SurveyOk3252

Expressing a broken cup is extremely difficult. Even ControlNet isn't much help, and not only with SD but also with other AI image generation services, it is rendered very unnaturally.


Purple_Potato_69

Closed eyes. It's actually an easy concept, but I can't find any models that do it (maybe I haven't explored enough). No matter what I put in the prompt it's always open eyes, looking straight at me like she wants me so bad.


BagOfFlies

Instead of "closed eyes" use "sleeping eyes". Sometimes a combination of the two works better also. Like using BonoboXL "sleeping eyes" works almost every time, but using EpicrealismXL I find I need to use both to get it consistent.


Purple_Potato_69

Well the eyes are closed but she's lying in bed now. Is that some kind of sign?


BagOfFlies

Haha did you describe an action first? This one was "a photo of a girl standing at a bus stop, sleeping eyes" https://imgur.com/a/eXUgrR0


Purple_Potato_69

Yes, this is exactly what I wanted to do. Which model did you use?


BagOfFlies

That one was epicrealism v6


Purple_Potato_69

👍 Thanks


chickenofthewoods

https://i.imgur.com/fnFfJYk.png Which one?


BagOfFlies

I was using the SDXL version. This one here https://civitai.com/models/277058?modelVersionId=484695


chickenofthewoods

Thank you.


[deleted]

use "reading a book" and the eyes never open


Purple_Potato_69

That works, but not for the scenario I want. However, the other guy's tip, using "sleeping eyes" and "closed eyes" together, works most of the time.


IamKyra

Also put "opened eyes" in the negative prompt.


hudsonreaders

Mirrors


fallengt

"Cutoff" content, like part of a person's arm hidden inside a box while their hand is visible. AI can do that, but there are always errors.


HardenMuhPants

I find "intangible" in the negatives helps somewhat with that.


tranducduy

Katana and archery, two of my favorite sports


ilovebigbuttons

Tools such as wrenches and real power tools.


mcstripey-f56

I’m unable to get a three headed dragon wearing cowboy hats. SDXL/SD3 keeps generating three separate dragons. Dall-e seems to follow the prompt fine. Same with a samurai riding a sea horse. SDXL/SD3 keeps generating a horse.


semioticgoth

anybody talking on a telephone. SD3 can't do this either


Guilherme370

"A man is being loud and obnoxious in a coffee date, the woman is bored and unamused, maybe even annoyed, and its apparent in her face." Or any other variation of that, NEVER worked for me, ANYWHERE, ANYMODEL, not just sdxl Somehow coffee date ALWAYS imply an image of BOTH smiling


Apprehensive_Sky892

Don't expect A.I. to understand the nuances of human language. Just replace "coffee date" with "in a coffee shop", and describe each of the subjects as you want them to appear in the image. https://preview.redd.it/vflgpay67n3d1.jpeg?width=1216&format=pjpg&auto=webp&s=d3555a8dd01d23fe3251e3f0f1dcc91e0b9ee77a Photo of a young man and a young woman in a coffee shop. The man is talking loudly and animated. The woman is bored.


goodie2shoes

Designing simple yet elegant wood furniture. There's always something wrong with the perspective, or it is missing legs. Some of the ideas are great though. (But I'm also still learning to use all the tools, so there's that.)


lonewolfmcquaid

Carrying a person... I recently tried to create an image of a superhero carrying someone and nope, couldn't do it.


aibot-420

txt to 3d stl


absolutenobody

Guitar smashing. Probably smashing *anything* that's not normally, well... smashed. But SD/SDXL just cannot fathom a way to interact with a guitar other than one hand by the pickups, one hand on the neck. Also, really weirdly, athletic footwear. SDXL seems to have zero clue what softball/football/soccer/baseball cleats or spikes are, gives hilariously inappropriate results if you use the Commonwealth term "football boots", and mostly draws either basketball hi-tops or plimsolls if you prompt "track shoes" or "running shoes".


Idenwen

Predefined Text on a banana that is in some scenery


scottix

Anything under water is difficult


_roblaughter_

This is totally random, but I had the hardest time trying to get any yellow fruit on a yellow background today. I wanted to stay with the same aesthetic as the other images in the series, so I didn't venture out of my family of checkpoints, but every other color worked just fine... I just ended up with a banana on a black background, or a lemon on an orange background. Weird quirk 🤷🏻‍♂️ https://preview.redd.it/a1ztx0ajil3d1.png?width=8192&format=png&auto=webp&s=d86cb373b7bb3db2e4291429a70db6a881973f4d


Talae06

In case it might be helpful ([For Real XL v0.5](https://civitai.com/models/432244/forrealxl)) https://preview.redd.it/rf941o88do3d1.png?width=896&format=png&auto=webp&s=81f41196e9be6511b5db58a0fb7a078feb9fe34d


_roblaughter_

I actually figured out what I was doing wrong. I was messing with the color tones by passing a black image in and even at 1.00 denoise, it was still somehow impacting only the background of only yellow images. Weird.


Talae06

https://preview.redd.it/58pcrl3ddo3d1.png?width=896&format=png&auto=webp&s=fa706aeace78aba57dbc557f672c8326f53e548c


Talae06

https://preview.redd.it/e49bp54fdo3d1.png?width=896&format=png&auto=webp&s=e487ef20502694599302a9acf6f2eab76cced5e6


Talae06

https://preview.redd.it/ho8ehmmgdo3d1.png?width=896&format=png&auto=webp&s=5b09c282601bba472f5165e1bb9b67e975744c07


Silent_Ad9624

Holding cards, like in a game of poker. Usually one very distorted card appears, but never a full hand of cards.


vanonym_

Buildings with shifted perspective. All our trials ended up wonky.


cnecula

Industrial complex


metasepp

Discworld


TheArchivist314

Doing Spartan helmets without the fur on top


Confusion_Senior

AI is very good at reaching 90% due to statistical techniques, but the last 10% is always much harder. There is always going to be some hybrid human/AI collaboration to some extent, with stuff like inpainting and ControlNet.


BavarianBarbarian_

Any industrial equipment. Boxes and shipping containers apparently are tagged often enough to give it at least *some* ideas (or maybe they're simple enough that errors don't come up), but a conveyor belt? An extruder? A drill? Nothing sensible.


k-r-a-u-s-f-a-d-r

normal hands every time


Treeshark12

Mirrors. Try to make a girl doing her makeup: it has no idea how she would be reflected.


Vargol

Hair covering a person's face, Sadako style.


cradledust

Historic North American Aboriginal peoples, their clothing, dwellings, way of life. Buffalo hunting, teepees, canoeing, kayaking, etc. Prairie life homesteading, wagons, horses, cowboys and RNWMP. I'd also like to see the ability to accurately depict turn of the century coal mining and realistic steam trains.


Tellesus

So far no AI can illustrate the trolley problem.


hihdaniel

Beastmen in anything but anime style. https://preview.redd.it/0s6x1n860o3d1.jpeg?width=1840&format=pjpg&auto=webp&s=825b0f752f6732f21f6e2470b9ae85ccff171a26


Maggotin

Warhammer 40k, does not matter what subject. Nothing looks good.


interpretist

Clothes if they’re not being worn and are not neatly folded. It doesn’t understand what a casually strewn or hanging garment should look like


Ok-Rub-9576

It can't draw a crab


CountCandyhands

Legit spent half an hour trying to get it to generate someone floating face-down in a lake. :(


DeepDay6

A {funny|cute} barn owl {holding|waving with} a red griddle pan. That's InvokeAI syntax for dynamic prompts. And yes, I'm trying to get that on a birthday card *lol*
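
For anyone who hasn't seen that syntax, here is a rough sketch of what the `{a|b}` groups mean, assuming each group is just a list of alternatives separated by `|`. `expand_variants` is a hypothetical helper for illustration, not InvokeAI's own code. It enumerates every combination, whereas a real UI may pick variants randomly or cap how many it generates.

```python
import itertools
import re

# Matches one non-nested {option|option|...} group.
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_variants(template: str) -> list[str]:
    """Expand every {a|b|c} group into all possible prompt combinations."""
    # Collect the alternatives of each group, in order of appearance.
    options = [m.group(1).split("|") for m in _GROUP.finditer(template)]
    # Replace each group with a "{}" placeholder so str.format can fill it.
    skeleton = _GROUP.sub("{}", template)
    return [skeleton.format(*combo) for combo in itertools.product(*options)]


prompt = "A {funny|cute} barn owl {holding|waving with} a red griddle pan"
for variant in expand_variants(prompt):
    print(variant)
```

Two groups with two options each give four prompts to queue, which makes it easy to see whether any wording gets the owl to actually hold the pan.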


MetaCommando

Power armor. Now how am I supposed to touch up my SamusxMaster Chief hentai?


gexaha

photorealistic food


Anxious-Ad693

Anything that challenges a common concept. AI can do a person holding an apple but can't do it the other way around.


AddictiveFuture

Hands.


SlapAndFinger

Mermaids are so full of fail. There are a few cliche mermaid poses/looks/styles that it does well, but if you try to get it to do anything nuanced it will give you monsters. I just sketch and inpaint "fish tail" now rather than prompt for mermaids.


johnnychase

A triangle. A simple line drawing of a triangle. https://preview.redd.it/7v283hr0yk3d1.jpeg?width=474&format=pjpg&auto=webp&s=07d9f505b6cf7fdab16bc1f669f99382f02f6652


Talae06

"simple black line drawing of a perfect triangle, geometric, symmetrical, white background", [Mohawk v2](https://civitai.com/models/144952/mohawk) https://preview.redd.it/87hhoqizeo3d1.png?width=1024&format=png&auto=webp&s=33b6198a4631aebbb8e1e0c664aa62843bb70e75