Whisper GUI - Generate subtitles for audio and video files.
A downloadable tool for Windows
A GUI for Whisper, the speech recognition model by OpenAI:
https://openai.com/blog/whisper/
https://github.com/openai/whisper
Attention:
Whisper is built on top of PyTorch, which usually works better with NVIDIA cards, so for now the application requires a decent NVIDIA card to run.
About the application:
You can select multiple audio or video files from your computer and generate subtitles for them. It accepts multiple input languages; you can select the language in the GUI.
There is also the option to translate the subtitles to English if you like.
This is still a work in progress; more options will become available as updates are completed.
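For readers curious about what happens under the hood, here is a minimal sketch of transcribing (or translating to English) with the openai-whisper Python package and writing the result as SRT subtitles. It is not the application's actual code; the function names `format_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are made up for this illustration, and it assumes `openai-whisper` and `ffmpeg` are installed.

```python
from typing import Iterable, Mapping, Optional


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Turn Whisper-style segments (start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(
    path: str,
    model_name: str = "small",
    language: Optional[str] = None,
    translate: bool = False,
) -> str:
    """Hypothetical helper: run Whisper on one file and return SRT text."""
    # Imported lazily so the formatting helpers work without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    # task="translate" produces English text; language=None lets Whisper
    # detect the spoken language on its own.
    task = "translate" if translate else "transcribe"
    result = model.transcribe(path, language=language, task=task)
    return segments_to_srt(result["segments"])
```

With a hypothetical file name, a Japanese to English run would look like `transcribe_to_srt("episode.mp4", language="ja", translate=True)`.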
Setup:
- After extracting the .zip, open "Whisper GUI.exe" inside the folder.
- There are multiple models to select from; the application downloads a model if you don't already have it.
- If you have more than 10 GB of VRAM on your card, you will always want to use Large-V2.
- If not, use the largest model your card can run. If your input is in English, use the ".en" version.
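The model advice above can be sketched as a small helper. This is only an illustration, not what the application does internally: `pick_model` and `detect_vram_gb` are hypothetical names, and the VRAM thresholds are the approximate figures from the Whisper README, so treat them as assumptions.

```python
# Approximate VRAM needs in GB, largest first (assumed from the Whisper README).
MODEL_VRAM_GB = [
    ("large-v2", 10),
    ("medium", 5),
    ("small", 2),
    ("base", 1),
]

# Only these sizes ship an English-only ".en" variant.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def pick_model(vram_gb: float, english_only: bool = False) -> str:
    """Return the largest model name that should fit in the given VRAM."""
    name = "tiny"  # fallback when the card is below every threshold
    for candidate, needed_gb in MODEL_VRAM_GB:
        if vram_gb >= needed_gb:
            name = candidate
            break
    if english_only and name in ENGLISH_ONLY_SIZES:
        name += ".en"
    return name


def detect_vram_gb() -> float:
    """Total memory of the first CUDA device in GB, or 0.0 without CUDA."""
    # Imported lazily so pick_model works without PyTorch installed.
    import torch

    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 1024**3
```

For example, `pick_model(detect_vram_gb(), english_only=True)` would suggest a model for the current machine. Cards usually report slightly less than their advertised memory, so a card sitting right at a threshold may fall to the next smaller model.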
Examples:
English transcription:
Japanese -> English translation:
Download
Whisper GUI 0.1 2.3 GB
Comments
I can't download it. Do you have another link, please? No adblock, only my poor internet connection...
What about the v3 model? It's not listed...
Hey, wanted to check in. The tool works great, but even though I have an RTX 3090 and have it selected as the device, it seems to show CPU at 15% usage and GPU at 1% usage. Is this normal?
I'm using small.en (the small model, English input only) with 2 GB of VRAM to test. It does a 1 hour audio file in 3 minutes, so that seems fine anyway, but I wanted to make sure I wasn't missing a step.
I can't tell if this is using CPU or my GPU. I'm doing about
Seems cool, but I can't get it to run. I just get hit with the 'qt.qpa.plugin: Could not load the Qt platform plugin "windows" in "" even though it was found' error when I try to run "Whisper GUI.exe".
Does anyone know what to do in order to fix it?
The medium and large-v2 models are downloaded but not loaded: the download stops at the end (100%) and then starts over, in a loop, constantly downloading from the Internet. The .pt files are downloaded to the cache folder and then nothing happens. They are overwritten again.
The models should be downloaded into (sub)folder where the exe is, not left on the system in some user/cache folder...
Thanks for making this Whisper GUI. I appreciate that it can be used offline. I hope you will continue to improve its functionality, not that it doesn't function, but perhaps to add more bells and whistles. My laptop (NVIDIA GeForce RTX 3060) can only use the small model. Can you recommend a more powerful laptop that would let me use the larger models? Is there a way to compensate you for your incredible work?
Unfortunately, this program is useless for me, because I want a Hungarian language course!
So I still have to use YouTube's service, which works uncertainly and gives dubious results.
P.S.:
I want better so much!
Does someone know if this GUI is usable without a GPU? I know the model can be used without a GPU, but I don't know if the settings in the GUI allow it. I am a Mac user, but I'm trying to find an easy way for a non-techy, Windows-using friend to use Whisper. I know it's going to be slow, but can it work?
Can we have CPU support?
I would like to add the following feature requests:
does nothing
It will be so cool when models like Whisper also attach metadata to each word, like tone, pitch, and start and end time, and recognize different voices, so that we can feed it back into a simple text-to-voice generator and generate new audio to dub videos. There are so many anime, Korean fantasy, and sci-fi dramas that I would love to listen to instead of reading subtitles. It would also help with creating a Star Trek-like communicator that lets anyone talk to others in the same tone they intended.
Love this! Thank you for making an easy-to-use, functional GUI so it's easy to try out!
I might just have missed it, but is there a buy/donate button for this to toss a couple of bucks over in appreciation?
This might be beyond the scope of a GUI, but is there any chance of having it do live subtitles?
Also, I noticed that it seems to pause when I click off the window. I assume this is intended?
Would it be possible to add a progress bar for processing?
Yeah, once I mess a little more with the code.
Awesome. Thanks for the great software.
If possible, another feature request: allow for custom models (e.g. some of the custom-trained ones on Hugging Face).
Does it work with a live feed?
Not right now, perhaps in a future update.
Thanks for the reply,
It is a really nice little app.
As of now it detects the language or lets you choose it, and translates to English by default. What about adding a target language for translation (not just English)?
Other target languages are not supported by Whisper, but I plan to add another model for that.
Amazing!!!!! Can't wait for it. I can be one of your beta testers if you like ;-)
Hi mate, can you please get onto the Rife-app page and update to the 3.36 version? The 3.35 version is broken and will not unzip! Also, it is full of .psy files only... Can you let us know what's going on?
Wonderful tool! Works very well with Japanese!