Comments
1650 and entirely black image. Even resolutions as low as 64x64 generate no data at all
RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 4.00 GiB total capacity; 2.55 GiB already allocated; 119.30 MiB free; 2.60 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Can anyone help me with this? I have a 1050 Ti 4GB and I tried to generate a 512x512 image.
Try reducing the resolution to something like 64x64, then slowly increase it by 64 at a time to find out what your card's limit is.
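The step-by-64 search suggested above can be sketched as a small loop. This is only an illustration: `try_generate` is a hypothetical callback standing in for whatever actually renders an image, and it is assumed to raise `RuntimeError` when the GPU runs out of memory, as in the PyTorch error quoted earlier.

```python
def find_max_square_resolution(try_generate, start=64, step=64, limit=1024):
    """Return the largest size for which try_generate(size, size) succeeds.

    try_generate is a hypothetical callback; it is assumed to raise
    RuntimeError when the GPU runs out of memory.
    """
    best = None
    size = start
    while size <= limit:
        try:
            try_generate(size, size)
        except RuntimeError:
            break  # first failure: the previous size is the limit
        best = size
        size += step
    return best


# Stand-in for a real renderer: pretend anything above 384x384 runs out of memory.
def fake_generate(width, height):
    if width * height > 384 * 384:
        raise RuntimeError("CUDA out of memory.")


print(find_max_square_resolution(fake_generate))
```

With a real renderer each probe takes a full generation, so a low step count keeps the search quick.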
512x512 is too much for 4GB of VRAM... I am hitting the limit at 576x576 with 8GB, so doing this resolution with 4GB is practically impossible. In some cases my screen "goes black" because there isn't enough VRAM left to output anything to the screen while the process is running...
is there any way of doing this without an nvidia card?
Yes, there is a project that'll run it on an intel cpu. Setup is more complicated however: https://github.com/bes-dev/stable_diffusion.openvino
1650Ti just produces black :(
got the same issue :/
I don't know if this helps because I haven't actually tried it, but in my "GeForce Experience" software I have the option of selecting two kinds of drivers: "Studio" and "Gaming". I have "Studio" selected; try switching between them and re-test.
Hi! What does it mean "UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 85: character maps to <undefined>" ?
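For context on the error above: '\u015b' is the letter "ś", and the 'charmap' codec error typically appears on Windows when text is written with a legacy code page (such as cp1252) that has no slot for that character. The sketch below reproduces the failure and shows that writing with an explicit UTF-8 encoding avoids it. It is an assumption that the tool hits this while writing the prompt to a text file; the file name and sample prompt here are made up for the demonstration.

```python
import os
import tempfile

# Sample prompt containing U+015B and U+017C, which cp1252 cannot represent.
prompt = "za\u015bnie\u017cony las"

try:
    prompt.encode("cp1252")
except UnicodeEncodeError as err:
    print("cp1252 cannot encode the prompt:", err.reason)

# Writing with an explicit UTF-8 encoding works for any character.
path = os.path.join(tempfile.mkdtemp(), "image_info.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as handle:
    handle.write(prompt)
with open(path, encoding="utf-8") as handle:
    restored = handle.read()

print(restored == prompt)
```

As a user of the packaged program, the practical workaround is probably to keep accented characters out of the prompt.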
Scan your file in VirusTotal @ https://www.virustotal.com/gui/home/upload to check it here.
It has to be, because it's safe and trusted. I have some of the best antivirus programs and they didn't detect anything.
Perfect for me on an RTX 2070 for 512x512 pictures, but more than 1 sample per prompt doesn't seem to work. Maybe in a future version? Thank you very much for your work!
This is in the description. The sample per prompt isn't working, but you can have more than one prompt (even the same). Only make sure you don't have a cursor on a line by itself.
Thanks for your answer.
'Ctrl-C/Ctrl-V' is our friend ! 😊
Edit:solved
Potential BUG. It keeps rendering and saving random extra images after finishing mine. Is my GPU being used to render other people's prompts? For example, I will enter 3 prompts, it finishes my 3, and continues onto 1 or 2 more other renders. They generate a txt file of settings but no text prompt.
you more than likely have extra lines without any text for the AI to work with, the same thing happened to me and all I had to do was delete the empty line below my prompts.
That's what I get for spamming Ctrl v, thanks!
It would be nice if those extra ends of the line were trimmed automatically.
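The automatic trimming asked for above could look roughly like this. It is a minimal sketch, not the tool's actual code: `clean_prompts` is a made-up helper that takes the raw contents of the prompt box and drops lines that are empty or contain only whitespace.

```python
def clean_prompts(text):
    """Split a multi-line prompt box into prompts, dropping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Two real prompts, one empty line between them and a whitespace-only line at the end.
raw = "a castle at sunset\n\nan astronaut riding a horse\n   \n"
print(clean_prompts(raw))
```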
I just want to write here that my RTX 2060 Super runs this fine. It generates an image in about 20 seconds, and I can use resolutions up to 448x768. I'm using Windows 10 Home, 64-bit, to give others a baseline. Trying higher resolutions gives an out of memory error.
this is awesome, and thanks for the amazing tool.
REQUESTS:
Add UP/DOWN arrow to seed number (it's the only one missing that).
Add ability to increment seed by +1 after generation instead of using random.
Or if you want to give me the source code, I'll code the changes by myself.
Feature suggestions:
— Drag-and-drop an info text file to copy its settings. Great for exploring variants on a particular render.
— With the above option, ideally it could continue from where you left off if you only increase the step count.
— Option to render image variations with a range of "v scales".
Really fun to use and it works pretty well. But for some reason I can't generate 512x512. I'm running it on an "AMD Ryzen 3400G" and a "3060 Ti". Anyone know why?
How much VRAM does your 3060 have? I own a 3070 8GB and I can't generate images over 576x576; the highest I can go vertically is 448x704. If I close as many processes as possible to free up VRAM I can sometimes go up to 640x640, but it's extremely slow.
The resolution limit is set by the amount of VRAM you have... My top limit is 512x512 with 8GB of VRAM.
Since the file size is large, hopefully when there is an update, we will be able to just download the newer files, rather than redownloading unchanged files.
I'm also wondering if there is a way to add more libraries or if it would be too much hassle for the average person like me to bother doing.
Also, thanks, this is awesome.
I have it downloaded and installed, but this is the opening I get:
torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
torch\_jit_internal.py:751: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x000001F2D632F940>.
warnings.warn(f"Unable to retrieve source for @torch.jit._overload function: {func}.")
"This error appear on the .exe startup, it always appears on 0.1, but the app should still work"
File "torch\nn\functional.py", line 2199, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
And then this is at the end, and no imagery... If you know, please help. I have dyslexia, kind of bad, and this is not easy for me; to even get this far is amazing.
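The "no kernel image is available" error above generally means the bundled PyTorch build contains no compiled code for that GPU's compute capability. If you have a regular Python environment with PyTorch installed, a check along these lines can confirm it. This is a sketch under assumptions: `arch_is_supported` is a helper written for this example, it only looks at the binary `sm_XY` entries (ignoring PTX `compute_XY` entries that can sometimes be compiled on the fly), and the fallback values at the end are illustrative only.

```python
import re


def arch_is_supported(capability, arch_list):
    """Return True if any 'sm_XY' entry can run on a GPU with this capability.

    A binary built for sm_XY runs on devices with the same major version X
    and a minor version of at least Y.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match and int(match.group(1)) == major and int(match.group(2)) <= minor:
            return True
    return False


try:
    import torch  # optional: only needed for the live check
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    capability = torch.cuda.get_device_capability(0)
    built_for = torch.cuda.get_arch_list()
    print(capability, built_for, arch_is_supported(capability, built_for))
else:
    # Illustrative values only: an older card against a build without its architecture.
    print(arch_is_supported((3, 5), ["sm_37", "sm_50", "sm_60", "sm_70"]))
```

This cannot be run inside the packaged .exe itself; it only tells you whether a given PyTorch build matches your card.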
There seems to be a hard limit of 500 steps per image. Is there a reason for this and can this be changed? Many prompts only start to look good in the high 400s, at least with my crappy GPU
Yes. It's limited in the code (predict.py). You can always change it to a greater limit and recompile it: change the "500" to whatever number suits your needs.
The code is the following:
num_inference_steps: int = Input(
    description="Number of denoising steps", ge=1, le=500, default=50
)
I can't find predict.py ... any idea of the location? The file system for this app is a bit of a mess. LOL.
Did a search and got nothing for that file name.
I'm getting a "PermissionError: [ErrNo 13] Permission Denied: '.\\image.png'" error after generating an image, and the image won't be saved in the output folder. I even tried to run the program as admin, but it didn't fix the problem.
Anyone knows what I'm doing wrong?
It sounds like the program might be located in a folder that it doesn't have permission to modify. I would try moving it to a secondary drive if you have one, or to a different folder on your C: drive that doesn't require elevated permissions.
Thank you very much for the tip! I just tried moving it to a different folder and now it works fine indeed. :)
Have you tried running it as administrator (right click menu)?
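To check whether a folder is the problem before moving the whole program, you can test if it is writable. The sketch below is a generic Python check, not part of the tool; `can_write_to` is a helper invented for this example, and the second path is deliberately one that does not exist.

```python
import os
import tempfile


def can_write_to(folder):
    """Try to create (and automatically delete) a small file in folder."""
    try:
        with tempfile.NamedTemporaryFile(dir=folder):
            pass
    except OSError:  # covers PermissionError and FileNotFoundError
        return False
    return True


print(can_write_to(tempfile.gettempdir()))
print(can_write_to(os.path.join(tempfile.gettempdir(), "no_such_folder_for_this_demo")))
```

Point it at the folder the program lives in; a `False` result matches the explanation given in the reply above.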
Ran it once and it seemed fine, I ended up restarting my PC and now I'm getting this error and getting an instant crash every time I try running it. Not sure what happened, any ideas? I completely reinstalled the file, which didn't seem to do anything.
ValueError: invalid literal for int() with base 10: '\x00'
[504] Failed to execute script 'start' due to unhandled exception!
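The `'\x00'` in that message is a NUL byte, which `int()` cannot parse. One guess (an assumption, not something confirmed in this thread) is that a saved settings file was left filled with NUL bytes after the restart. A defensive parser for such a value might look like the sketch below; `parse_int` and the default of 50 are made up for the illustration.

```python
def parse_int(text, default):
    """Parse an integer setting, falling back to default on corrupted input."""
    cleaned = text.replace("\x00", "").strip()
    try:
        return int(cleaned)
    except ValueError:
        return default


print(parse_int("\x00", 50))   # corrupted value falls back to the default
print(parse_int("42\n", 50))   # normal value is parsed as usual
```

If the guess is right, deleting the program's saved settings file (wherever it keeps one) and letting it be recreated may be enough.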
It would be cool if you could add to the queue with a REST API. That way it would be easy for devs to make their own GUI that connects with it but has all sorts of features or a design we prefer.
Really wish I had gotten an NVIDIA card now :( I do hope this can be modified to work with AMD though that will be a lot of extra work I imagine.
Nice on Nvidia GeForce 970. Merci !
PS: DALL-E, Imagen, NightCafe and co. will want to buy you (they are losing money)... PLEASE do not sell!
Will you be able to add image2image generation? If so how long till we can expect to see it?
Hi guys, amazing work, but how can I update the CUDA PyTorch within the files of the GRisk GUI folder so that it actually runs on the GPU? That worked for me when trying Cupscale. However, when you run this GUI it definitely doesn't use the GPU, just the CPU. Thanks!
I don't know where to report bugs, or which sorts of bugs.
It does not start. I've installed it and put it in a good folder with a good amount of space, but when I click the .exe file, a command prompt pops up for half a second and vanishes. No matter how many times I try, same result.
PS: this stuff is really impressive
It doesn't use GPU, it looks like it only uses CPU... :S
If you go to task manager and to the GPU tab, you need to select the options for CUDA and look at VRAM usage. Even when the GPU is used, CPU usage will also be high. I don't think CUDA use will actually show up as high GPU %usage
I understand, thx
when will it work for 1660 cards?
I have a 3080 10GB but I can't use this software. At start, the Windows command line shows some errors:
torchvision\io\image.py:13: UserWarning: Failed to load image Python extension:
torch\_jit_internal.py:751: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x0000017B06DDD550>.
warnings.warn(f"Unable to retrieve source for @torch.jit._overload function: {func}.")
torch\_jit_internal.py:751: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function _DenseLayer.forward at 0x0000017B06DF38B0>.
warnings.warn(f"Unable to retrieve source for @torch.jit._overload function: {func}.")
Apparently if you read above, this always happens, and is of no consequence. So leave it and enjoy making pictures!
Thanks. I searched this webpage for the word "torch" and got 0 results when I posted, so I suspect they added the information after I posted this question.
Anyway, it works fine, thanks!
this was last updated 3 days ago so it was there lol.
I managed to create 640 x 704 images with RTX 2080ti 11gb and coherency seems to improve from 512 x 512.
I've created 704x704 images, but it is quite a bit slower. (1.20s/it vs 3.5it/s on 512) Do you really see a difference in coherency? maybe I need to try a little bit more.
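To put the two speed figures above on the same scale (one is seconds per iteration, the other iterations per second), a quick calculation helps. The step count of 50 is an assumption used only to turn the rates into a time per image.

```python
def seconds_per_image(steps, iterations_per_second):
    return steps / iterations_per_second


steps = 50  # assumed step count, for illustration
fast = seconds_per_image(steps, 3.5)        # 512x512 at 3.5 it/s
slow = seconds_per_image(steps, 1 / 1.20)   # 704x704 at 1.20 s/it
pixel_ratio = (704 * 704) / (512 * 512)

print(round(fast, 1), "s per image at 512x512")
print(round(slow, 1), "s per image at 704x704")
print(round(slow / fast, 1), "times slower for", round(pixel_ratio, 2), "times the pixels")
```

So the larger size costs noticeably more time than the increase in pixel count alone would suggest.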
I have 4 gb vram but it says cuda out of memory and uses only 3.29 gb vram
With 4GB VRAM you'll need to run a lower image resolution like 384x384
512x384 will run with 4GB
Hi, I am getting double or triple faces in my pictures when I put "woman"/"girl" or similar words in my prompts to get portrait pictures.
How do I minimize getting these odd results?
Seems that you just keep going and get enough single heads. My prompts are "sticky": if I change a prompt, it seems to keep the previous prompt for several photos, and then moves in the direction the prompt I asked it ... !
Yeah, it feels like the more I generate the less I get double faces/persons coming up.
what I find best is to set steps to a low number like 20 or 30 then generate like 100 images with the same prompt but random seed. Delete the bad ones and keep the ones with the right general composition then use the seed from it and iterate on the prompt using a higher number of steps.
Awesome stuff, works out of the box with my GTX 1070 (8GB) at 1.3it/s. Wow!
1. What does V-scale do?
2. How do we make it use other models? (Can we do that yet?)
Higher and higher V Scale values seem to give artefacts. I don't think it's a fully-fledged option yet?
v scale is supposed to make the result more similar to the prompt but it just seems to curse the image from what I've tested. I like 5.5 or 6 though
what has worked for me is when I get a render I like, I pull the seed out of the _info, and set it, then I'll up the iterations to 150. then i'll slowly start increasing the v scale 1 at a time (somewhere around 13-14 comes out best). Thankfully I have a 3090, so each 512x1024 only takes 90 seconds.
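The two-stage workflow described in the comments above (many cheap drafts with random seeds, then a fixed seed with more steps and a sweep of the v scale) can be written down as a list of job settings. This is a sketch only: the dictionary keys, the helper names and the specific numbers are placeholders, and nothing here talks to the GUI.

```python
import random


def exploration_jobs(prompt, count, steps=20, rng=None):
    """Stage 1: many low-step drafts, each with its own random seed."""
    rng = rng or random.Random()
    return [
        {"prompt": prompt, "seed": rng.randrange(2**32), "steps": steps}
        for _ in range(count)
    ]


def refinement_jobs(prompt, seed, scales, steps=150):
    """Stage 2: keep the seed of a draft you like, raise steps, sweep the v scale."""
    return [
        {"prompt": prompt, "seed": seed, "steps": steps, "v_scale": scale}
        for scale in scales
    ]


drafts = exploration_jobs("portrait of a woman", count=100, rng=random.Random(0))
chosen_seed = drafts[3]["seed"]  # pretend draft number 3 had the best composition
finals = refinement_jobs("portrait of a woman", seed=chosen_seed, scales=range(8, 15))
print(len(drafts), len(finals), finals[-1]["v_scale"])
```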
Thanks for creating this. I'm enjoying creating privately vs. Midjourney's public forum. Is there a way to point the program to a GPU1. I'd like to see how it performs on the other GPU in my system. Thanks!
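On pointing a program at the second GPU: CUDA applications in general honour the `CUDA_VISIBLE_DEVICES` environment variable, so setting it before launch is the usual way to pick a card. Whether the packaged GUI respects it has not been tested here. For a plain Python and PyTorch script, the idea looks like this (the PyTorch import is optional so the snippet runs even without it):

```python
import os

# Assumption: GPUs are numbered from 0, so "1" selects the second card.
# The variable has to be set before CUDA is initialised, i.e. before torch is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    # Inside this process the selected card now appears as device 0.
    print(torch.cuda.get_device_name(0))
else:
    print("No usable CUDA device visible with CUDA_VISIBLE_DEVICES =",
          os.environ["CUDA_VISIBLE_DEVICES"])
```

For an .exe, the equivalent would be running `set CUDA_VISIBLE_DEVICES=1` in the same command prompt window before starting the program.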
Is this a cousin of MidJourney or Dream Studio? I get confused when posting on facebook the "genealogy" of this software?!
So Stable Diffusion is what Dream Studio uses, Stability.ai developed both the model and the service.
Midjourney uses something else, but did recently run a short test integrating Stable Diffusion into their service, whatever their implementation of it was, it worked very well, but it's offline for tweaks right now.
Any updates on getting it to work on AMD?
What does V scale do?
Try changing it! I noticed that higher and higher values made the result look worse (artefacts).
it's supposed to make the result more similar to the text prompt but when I try it, the higher value just makes the image look cursed
Working perfectly on my RTX 3060(12GB) at about 3.47it/s, but with a hard limit at 512 of resolution and 250 steps. It's awesome nonetheless. Thank you for your quick and awesome work.
Edit: I've been able to get 704x704 images but they go at 1.15s/it. It's a lot slower, not worth it for me.
Did you try turning on "Use Half-Precision"? This makes it possible to run on hardware with less than 10GB of VRAM. Maybe with this enabled you can achieve greater resolutions and/or speeds, because lower precision means less VRAM is needed (enabling this uses float16 instead of float32).
If you want more speed, I guess, use the same resolutions with half-precision enabled... I can't test this myself, because I only have 8GB of VRAM and the GUI doesn't let me uncheck it.
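The reason half precision saves memory is simple arithmetic: a float32 value takes 4 bytes and a float16 value takes 2. The sketch below applies that to a round, purely illustrative number of values; it is not a measurement of this model, and real VRAM use also includes activations and other overhead.

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2}


def weights_size_gib(num_values, dtype):
    """Memory needed to store num_values numbers of the given dtype, in GiB."""
    return num_values * BYTES_PER_VALUE[dtype] / 1024**3


num_values = 1_000_000_000  # illustrative round number, not the real model size
for dtype in ("float32", "float16"):
    print(dtype, round(weights_size_gib(num_values, dtype), 2), "GiB")
```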
Does it work on a GTX 1650? It only lets me do 62 x 62 and under, and even then it just makes a black screen. Does CUDA have to be installed to make this work? It's giving me a CUDA memory error when I run it at 512 x 512, and I don't know why.
same i NEED ANSWERS
Because the original stable-diffusion needs 10GB+ of VRAM, and this one, even with some optimisations, still needs 6GB+ at 512x512.
I finally got it to work on my 1650 using this link. 512 is still the limit, but it's finally working
Can you make it so it supports overclocked GPUs?
What happens if your GPU is overclocked?
It doesn't allow you to render images, instead the image is just black.
Quality submission but the game jam is over... jokes aside...
Thank you for sharing this fantastic Graphic User Interface!
Is there any way to use this without needing an Nvidia driver? Every time I try to generate an image it tells me: RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver. If not then that's fine.
This program only works with Nvidia graphics cards, so that could be the problem. If you have an Nvidia GPU, you can try installing the latest drivers with GeForce Experience and that might fix it.