Comments
This would be amazing with an option to run in a loop until stopped manually.
Leave it running overnight and get hundreds of variations.
I haven't tried it with this tool, but you should be able to do that by increasing Samples per prompt to a high number. I do that with a command-line interface on a different branch. As of yet, that function doesn't work with save-to-grid, which is what I'm looking for: waking up to many 12-image grids. I can wake up to 100+ images, though, just by increasing the number of images generated per prompt. Got 101 the other night and then upscaled them 4x in the morning.
That feature doesn't work yet. It is possible to do though if I just copy the same request in multiple lines.
Need to copy paste a bunch of times if I want hundreds of results though.
I use an auto clicker to do so. Just run it once, check the time it takes and then put the time on your autoclicker and voila!
Just copy and paste the prompt a few hundred times into the field. Each line of the input field is a new execution.
A frog made out of a strawberry
A frog made out of a strawberry
A frog made out of a strawberry
A frog made out of a strawberry
A frog made out of a strawberry
Would generate 5 results.
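If pasting by hand gets tedious, the repeated lines can be generated and then pasted into the prompt field. This is only a sketch: the function name repeat_prompt is made up here, and it relies on the behaviour described above (one line, one render). It avoids a trailing newline because other comments on this page report that empty lines trigger extra renders.

```python
def repeat_prompt(prompt: str, count: int) -> str:
    """Return `prompt` repeated on `count` lines, with no trailing newline."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # join() puts newlines only between the copies, so no empty last line is left
    return "\n".join([prompt.strip()] * count)


text = repeat_prompt("A frog made out of a strawberry", 5)
print(text)
```

Copy the printed text into the prompt box; change the count for longer overnight runs.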
A WOMAN MADE OUT OF STRAWBERRY
I cannot get diffusion to use my eGPU. Instead it uses the weaker dGPU in the laptop: even when I select the 2060 manually in Windows settings, the program runs on the 2050, limiting VRAM availability. Anyone know a workaround?
I had a similar issue and didn't find a solution. It was using my faster GPU with less VRAM, where I couldn't achieve 512x512. I installed a different tool and edited a file that allowed me to point it at my slower GPU with more VRAM, and now I can get 512x704.
Installed this.
https://github.com/lstein/stable-diffusion/tree/78aba5b770d6e85e44c730da9735118d...
Edit the dream.py file and change the last bit of code at the bottom to point at the GPU you want.
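import os  # at the top of dream.py, if it is not already imported

if __name__ == "__main__":
    # Expose only GPU index 1 to CUDA; this must run before CUDA is initialized
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"
    main()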
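A possible alternative to editing dream.py, offered as a sketch rather than something tested with this GUI: CUDA_VISIBLE_DEVICES is read by the CUDA runtime, so it can also be set in the environment of the process you launch. The helper name env_for_gpu and the example PATH value are made up for illustration.

```python
import os


def env_for_gpu(gpu_index: int, base_env=None) -> dict:
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA numbers devices from 0; the order may differ from the Windows listing
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


env = env_for_gpu(1, base_env={"PATH": "/usr/bin"})
print(env["CUDA_VISIBLE_DEVICES"])
```

The returned dict could be passed as env= to subprocess.run(...) for whatever command you normally start. Whether the packaged .exe respects it is an assumption I have not verified.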
I get an error when extracting: "data error: stable diffusion GRisk GUI/torch/lib/torch_cuda_cu.dll". When I try to run the program, it just opens a console window for half a second and closes.
I'm using a 1650 on a laptop, 256 x 256 image produces completely black result. I don't know what to do! Does anyone know a solution or at least what's causing the problem?
Not enough VRAM, maybe. Reduce the resolution to see what happens.
I had a VRAM problem at 512x512, so I lowered the resolution to 256x256. I checked the console: no errors. Apparently the project description says that 16-series cards have a problem with the half-precision option. It's on by default and cannot be unticked. Maybe there's a workaround to turn it off?
1660 Ti produces black image at 64x64...
Can Successfully run 512x512 with a 2060 SUPER (8GB VRAM)
Yes.
try 704x512 or 512x704
PyTorch is not clearing the VRAM after each run, forcing me to shut down the software to free up the GPU memory between uses. A few other posts sound like they may be experiencing something similar.
Got the same RAM usage, but running a second task replaces the used RAM with the new job, so I have no problems reusing it.
So, is there no way to run this on an AMD Card?
I have rx580 8gb vram
No, at least not yet. I really do wish they did support AMD GPUs, but I guess it is not really the focus of the development right now.
Is there a version that runs on Linux?
I don't know if there is a GUI version for Linux, but you can always download the source code on Linux and run or compile it. The source code of the project is: here
Is there a way, even doing some coding and under-the-hood work to add custom data sets? As well, where's a tip button? I'd tip if I could find one.
Best I get on an RTX A2000 with 4 GB is 128x256. And the output is pretty meaningless... Pretty colors, but nothing like the prompt. Can anyone test an exact prompt, seed, etc. at this setting so I can compare? Thanks!
I can test prompts. Which prompts would you like generated?
How about something simple, like "a dog and a cat" at 128x128? All it gives me is colorful blobs. Thanks.
You are right, I am getting the same results: colors everywhere. I guess it's unusable at this resolution, even at 500 steps...
With 320x320 you can start getting something understandable with LOTS of trial and error.
This is 128x128
This is 320x320
512x512 nails it every time
Hey, are you able to do more than 250 steps? Have you tried this prompt at 500 steps?
I can do up to 500 steps. This compiled version limits the steps to 500; however, recompiling it with a higher max-limit variable would allow going beyond that. (Steps do not require additional VRAM; they just iterate the "dream" more.)
that is very helpful information, thank you. In other words nothing I do will change this short of getting more vram!
Unfortunately on my measly 4 I can't do any better!
you need to bump it up to 512x512 to get normal results. Anything less will get you wacky results
Thanks!!
I ran it on a 3090 Ti at about 5.75 it/s, but I cannot change the resolution: even for a small change it says I'm out of VRAM, lol! I would like to change the aspect ratio, but it doesn't seem to work correctly. If you need to write the prompt once for every generation you want, what is the "Samples per prompt" menu for? Thanks for making it possible to run this locally! Many thanks!
Also, I would love to know more about V scale and half-precision. What are they exactly?
Don't know what V Scale stands for, except that it's the "guidance scale", whatever that means... Half-precision is basically used when you have less than 10 GB of VRAM: it uses float16 precision instead of the default float32 used for GPUs with 10+ GB of VRAM.
Thanks for the details. The problem is that the "half precision" box stays checked despite the 24 GB of VRAM on my 3090 Ti, and I don't really understand why.
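For the two settings asked about above, a rough sketch may help. It is not taken from this tool's code. The first part only shows that a float16 value takes half the bytes of a float32 value, which is why half precision lowers VRAM use. The second part shows the usual classifier-free guidance formula that a "guidance scale" refers to in Stable Diffusion pipelines; that the V scale box maps to exactly this is an assumption. The function names and the toy numbers are made up.

```python
import struct

# struct format "e" is IEEE 754 half precision, "f" is single precision
BYTES_FP16 = struct.calcsize("e")
BYTES_FP32 = struct.calcsize("f")


def tensor_megabytes(num_values: int, bytes_per_value: int) -> float:
    """Memory needed to store `num_values` numbers, in MiB."""
    return num_values * bytes_per_value / (1024 * 1024)


def apply_guidance(uncond, cond, scale):
    """Classifier-free guidance: start from the prediction made without the
    prompt and move `scale` times the distance toward the prompted one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


values = 1024 * 1024  # toy tensor size, not a real model size
print(tensor_megabytes(values, BYTES_FP32), tensor_megabytes(values, BYTES_FP16))
print(apply_guidance([1.0], [3.0], 7.5))
```

A scale of 1 returns the prompted prediction unchanged; larger values push the image harder toward the prompt, usually at some cost in variety.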
For some reason it keeps running out of VRAM... even when I have it on 64x64, it instantly uses up all the memory (4 GB) and throws an error.
Is this happening to anyone else?
You've run out of VRAM. "Alloc failed" means there is no room left to allocate the memory that is needed, so the task fails.
I got this error while generating a 64x64 image with 50 steps:
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
You've run out of VRAM. "Alloc failed" means there is no room left to allocate the memory that is needed, so the task fails.
Okay, this interface (and the work behind it ;) ) is already very good. Maybe fix the samples-per-prompt bug, increment the resolution in multiples of 64 directly instead of 1, and remember the output folder path after the GUI is restarted. I don't know if it's a bug, but I can't do more than 250 steps; this message appears in the console, and the same thing happens at a lower resolution.
Rendering: Cat
0it [00:01, ?it/s]
Traceback (most recent call last):
File "start.py", line 363, in OnRender
File "torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py", line 152, in __call__
File "diffusers\schedulers\scheduling_pndm.py", line 136, in step
File "diffusers\schedulers\scheduling_pndm.py", line 212, in step_plms
File "diffusers\schedulers\scheduling_pndm.py", line 230, in _get_prev_sample
IndexError: index 1000 is out of bounds for dimension 0 with size 1000
I have a problem: the images I get are all black.
Me too. I get the dreaded black image too
Me too! I don't know why it's happening!
What card? Also is your card overclocked?
I can't create images; it just creates a black box.
1650 and entirely black image. Even resolutions as low as 64x64 generate no data at all
RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 4.00 GiB total capacity; 2.55 GiB already allocated; 119.30 MiB free; 2.60 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
can anyone help me with this. i have a 1050ti 4gb. and I tried to get a 512x512 image.
Try reducing the resolution to something like 64x64 and slowly increase it in steps of 64 to find your card's limit.
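The step-by-64 search can be written down as a small loop. This is only an illustration of the procedure: try_render stands for "set this size in the GUI and see whether it finishes", and fake_render is a made-up stand-in that pretends anything above 448x448 runs out of memory.

```python
def largest_working_size(try_render, start=64, stop=1024, step=64):
    """Walk square sizes upward in `step` increments and return the last one
    for which `try_render(width, height)` returned True, or None."""
    best = None
    for size in range(start, stop + 1, step):
        if not try_render(size, size):
            break  # first failure: larger sizes will need even more VRAM
        best = size
    return best


def fake_render(width, height):
    # Stand-in for a real attempt; the 448x448 limit is invented for the demo
    return width * height <= 448 * 448


print(largest_working_size(fake_render))
```

Non-square sizes can be probed the same way by holding one side fixed and stepping the other.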
512x512 is too much for 4 GB of VRAM... I am hitting the limit at 576x576 with 8 GB, so doing this resolution with 4 GB is practically impossible. In some cases my screen "goes black" because there isn't enough VRAM left to draw the display while the process is running...
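The numbers in the out-of-memory message can be read directly to check this. The sketch below parses the message quoted in the question and compares what was requested with what was free, plus the memory PyTorch had reserved but not allocated. The function name parse_oom is made up, and the patterns only cover this message format.

```python
import re

MESSAGE = (
    "CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 4.00 GiB total "
    "capacity; 2.55 GiB already allocated; 119.30 MiB free; 2.60 GiB reserved "
    "in total by PyTorch)"
)

MIB_PER_UNIT = {"MiB": 1, "GiB": 1024}


def parse_oom(message: str) -> dict:
    """Return the sizes named in a PyTorch CUDA OOM message, in MiB."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * MIB_PER_UNIT[match.group(2)]
    return sizes


sizes = parse_oom(MESSAGE)
# Reserved-but-unallocated memory is what max_split_size_mb could help reuse
slack = sizes["reserved"] - sizes["allocated"]
print(f"requested {sizes['requested']:.0f} MiB, free {sizes['free']:.1f} MiB, "
      f"reserved but unused {slack:.1f} MiB")
```

Here the request is far larger than the free memory and the slack combined, which suggests the card is simply full rather than fragmented, so the max_split_size_mb hint in the message is unlikely to help at this resolution.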
is there any way of doing this without an nvidia card?
Yes, there is a project that'll run it on an intel cpu. Setup is more complicated however: https://github.com/bes-dev/stable_diffusion.openvino
1650Ti just produces black :(
got the same issue :/
Don't know if this helps because I haven't really tried it, but in my "GeForce Experience" software I have the option of selecting two kinds of drivers: "studio" and "gaming". I have "studio" selected; try switching between them and re-test.
Hi! What does it mean "UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 85: character maps to <undefined>" ?
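That message means Python tried to write the character "ś" (U+015B) using a legacy Windows code page that has no slot for it. The sketch below reproduces the same kind of error. Which code page the tool actually uses is not known from this thread, so cp1252 here is an assumption chosen because it lacks that letter.

```python
text = "\u015b"  # "ś", the character named in the error message

try:
    text.encode("cp1252")  # a Windows "charmap" code page without this letter
    failed = False
except UnicodeEncodeError as error:
    failed = True
    print(error)

# UTF-8 can represent every Unicode character, so this succeeds
encoded = text.encode("utf-8")
print(encoded)
```

If the letter comes from the prompt or from a folder name in the output path, replacing it with a plain ASCII letter may be a workaround; that is a guess, not something confirmed for this tool.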
Scan your file in VirusTotal @ https://www.virustotal.com/gui/home/upload to check it here.
Perfect for me on an RTX 2070 for 512x512 pictures, but more than 1 sample per prompt doesn't seem to work. Maybe in a future version? Thank you very much for your work!
This is in the description. The sample per prompt isn't working, but you can have more than one prompt (even the same one). Just make sure you don't leave the cursor on an empty line by itself.
Thanks for your answer.
'Ctrl-C/Ctrl-V' is our friend ! 😊
Edit:solved
Potential BUG. It keeps rendering and saving random extra images after finishing mine. Is my GPU being used to render other people's prompts? For example, I will enter 3 prompts, it finishes my 3, and continues onto 1 or 2 more other renders. They generate a txt file of settings but no text prompt.
You more than likely have extra lines without any text for the AI to work with. The same thing happened to me, and all I had to do was delete the empty line below my prompts.
That's what I get for spamming Ctrl v, thanks!
It would be nice if those extra ends of the line were trimmed automatically.
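Until that happens, the list can be cleaned before pasting. A minimal sketch, assuming one prompt per line as described in this thread; the function name clean_prompts and the sample prompts are made up.

```python
def clean_prompts(raw_text: str) -> list:
    """Split the prompt box contents into prompts, dropping blank lines."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


raw = "a red fox\n\na blue bird\n   \n"
prompts = clean_prompts(raw)
print("\n".join(prompts))
```

Joining with "\n" leaves no empty last line, so nothing is left over to trigger an extra render.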
I just want to write here that my RTX 2060 Super runs this fine. It generates an image in about 20 seconds. I can use resolutions up to 448x768, using Windows 10 Home, 64-bit, to give others a baseline. Trying higher resolutions gives an out-of-memory error.
this is awesome, and thanks for the amazing tool.
REQUESTS:
Add UP/DOWN arrow to seed number (it's the only one missing that).
Add ability to increment seed by +1 after generation instead of using random.
Or if you want to give me the source code, I'll code the changes by myself.
Feature suggestions:
— Drag-and-drop an info text file to copy its settings. Great for exploring variants on a particular render.
— With the above option, ideally it could continue from where you left off if you only increase the step count.
— Option to render image variations with a range of "v scales".
Really fun to use and it works pretty well. But for some reason I can't generate 512x512. I'm running it on an "AMD Ryzen 3400G" and a "3060 TI". Anyone know why?
How much VRAM does your 3060 have? I own a 3070 8 GB and I can't generate images over 576x576; the highest I can go vertically is 448x704. If I close as many processes as possible to free up VRAM, I can sometimes go up to 640x640, but it's extremely slow.
The resolution limit is set by the amount of VRAM you have... My top limit is 512x512 with 8 GB of VRAM.
Since the file size is large, hopefully when there is an update, we will be able to just download the newer files, rather than redownloading unchanged files.
I'm also wondering if there is a way to add more libraries or if it would be too much hassle for the average person like me to bother doing.
Also, thanks, this is awesome.
I have it downloaded and installed, but this is what I get on opening:
torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
torch\_jit_internal.py:751: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x000001F2D632F940>.
warnings.warn(f"Unable to retrieve source for @torch.jit._overload function: {func}."))
"This error appear on the .exe startup, it always appears on 0.1, but the app should still work"
File "torch\nn\functional.py", line 2199, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
And then this is at the end, and no imagery... If you know, please help. I have dyslexia, kinda bad, and this is not easy for me; to even get this far is amazing.
There seems to be a hard limit of 500 steps per image. Is there a reason for this and can this be changed? Many prompts only start to look good in the high 400s, at least with my crappy GPU
Yes. It's limited in the code (predict.py). You can always change it to a greater limit and recompile: change the "500" to whatever number suits your needs.
The code is the following:
num_inference_steps: int = Input(
    description="Number of denoising steps", ge=1, le=500, default=50
)
I can't find predict.py ... any idea of the location? The file system for this app is a bit of a mess. LOL.
Did a search and got nothing for that file name.
I'm getting a "PermissionError: [ErrNo 13] Permission Denied: '.\\image.png'" error after generating an image, and the image won't be saved in the output folder. I even tried to run the program as admin, but it didn't fix the problem.
Does anyone know what I'm doing wrong?
It sounds like the program might be sitting in a folder that it doesn't have permission to modify. I would try moving it to a secondary drive if you have one, or to a different folder on your C: drive that doesn't require special permissions.
Thank you very much for the tip! I just tried moving it to a different folder and now it works fine indeed. :)
Have you tried running it as administrator (right click menu)?
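To check whether a folder is the problem before moving everything, one option is to try creating a throwaway file there. This is a sketch, not part of the tool; the function name can_write_to is made up, and it only tells you whether the current user can create files in that folder.

```python
import os
import tempfile


def can_write_to(folder: str) -> bool:
    """Return True if a file can actually be created in `folder`."""
    try:
        # The temporary file is deleted again as soon as the block ends
        with tempfile.TemporaryFile(dir=folder):
            pass
        return True
    except OSError:  # covers PermissionError and missing folders
        return False


print(can_write_to(os.getcwd()))
```

Run it from the folder the program lives in, or pass that folder's path; False would point to the permission issue described above.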
Ran it once and it seemed fine, I ended up restarting my PC and now I'm getting this error and getting an instant crash every time I try running it. Not sure what happened, any ideas? I completely reinstalled the file, which didn't seem to do anything.
ValueError: invalid literal for int() with base 10: '\x00'
[504] Failed to execute script 'start' due to unhandled exception!
It would be cool if you could add to the queue with a REST API. That way it would be easy for devs to make their own GUI that connects with it but has all sorts of features or a design we prefer.
Really wish I had gotten an NVIDIA card now :( I do hope this can be modified to work with AMD though that will be a lot of extra work I imagine.
Works nicely on an Nvidia GeForce 970. Thanks!
P.S.: DALL-E, Imagen, NightCafe and co. will want to buy you (they are losing money)... PLEASE do not sell!
Will you be able to add image2image generation? If so how long till we can expect to see it?