Extremely slow on 5090

by STTrife - opened 4 days ago

4 days ago

I am trying the demo code, but it seems to take 3+ hours for the 50 steps from the demo... is that normal on a 5090? Or is the base model just not suitable for consumer hardware?

GPU: NVIDIA GeForce RTX 5090 (32GB)
CUDA Compute Capability: 12.0 (sm_120)
NVIDIA Driver: 580.88
OS: Windows 10 (AMD64)
Python: 3.11.9
PyTorch: 2.11.0.dev20251231+cu130 (nightly)
CUDA: 13.0
cuDNN: 91200
Diffusers: 0.37.0.dev0

Manni1000

4 days ago

I think it is maybe running on your CPU.

Geffers

3 days ago

edited 3 days ago

Edited my comment, as I had forgotten to use the Lightning 4-step LoRA (though I am actually using 6 steps). Pretty decent results: a 1920x1080 image in 55 seconds.

fdwork

3 days ago

Yes, that sounded like CPU, and probably a single thread too :) I am running the Q5M unsloth variant on a 5070 Ti; generation time for 1024x1024 at 40 steps is about 148 s.

STTrife

3 days ago

You'd think so, and I suspected that too. But I edited the code slightly to disable the CPU option, and I also checked Task Manager: the GPU was at a full 100% while trying to create an image... very strange. I have now tried it in ComfyUI and that works fine...
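
One way to test the CPU theory directly, rather than inferring it from Task Manager, is to count where the weights actually live. This is only a minimal sketch: it assumes the demo builds a Diffusers pipeline (called `pipe` here, a hypothetical name) whose components are PyTorch modules, and `device_report` is an illustrative helper, not part of Diffusers.

```python
from collections import Counter


def device_report(module):
    """Count parameter elements per device for any object exposing .parameters()."""
    counts = Counter()
    for p in module.parameters():
        # str(p.device) gives e.g. "cuda:0" or "cpu" for a PyTorch parameter
        counts[str(p.device)] += p.numel()
    return dict(counts)


# Hypothetical usage, assuming `pipe` is the pipeline object from the demo code:
# for name, component in pipe.components.items():
#     if hasattr(component, "parameters"):
#         print(name, device_report(component))
```

If every component reports `cuda:0` and generation is still this slow, one possibility worth ruling out is that the weights exceed VRAM and the Windows driver is spilling into system memory, which can keep GPU utilization high while throughput collapses.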

BigBlueWhale

3 days ago

Just use the df11 version https://github.com/LeanModels/DFloat11/issues/30

It fits perfectly on 32 GB VRAM at full quality using DF11

Sikaworld1990

1 day ago

Just use the df11 version https://github.com/LeanModels/DFloat11/issues/30

It fits perfectly on 32 GB VRAM at full quality using DF11

Problem: it has no ComfyUI support yet!

Seventi

about 12 hours ago

The 5090 is a 32GB graphics card, but doesn't the model total 40GB? Can it run?
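
The arithmetic behind this question can be sketched roughly as follows. The 40 GB and 32 GB figures come from this thread; the roughly 70% size ratio is what the DFloat11 project reports for BF16 weights and is treated as an assumption here. The estimate covers weights only and ignores activations and any other components that also need VRAM.

```python
def fits_in_vram(weights_gb, vram_gb, compression_ratio=1.0):
    """Return (size in GB after compression, whether the weights alone fit)."""
    size = weights_gb * compression_ratio
    return size, size <= vram_gb


# 40 GB of weights on a 32 GB card, uncompressed vs. an assumed ~70% DF11 ratio
for label, ratio in [("BF16", 1.0), ("DF11 (assumed ~70%)", 0.70)]:
    size, ok = fits_in_vram(40, 32, ratio)
    print(f"{label}: {size:.1f} GB -> {'fits' if ok else 'does not fit'} in 32 GB")
```

Under these assumptions the uncompressed weights do not fit, which would be consistent with the slowdown reported at the top of the thread, while the compressed weights leave a few GB of headroom.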