Traceback (most recent call last):
  File "/app/app.py", line 14, in <module>
    api = APIHandler() # <-- FIXED: no api_key anymore
  File "/app/api_handler.py", line 29, in __init__
    from generate import LeVoPipeline # from Tencent's repo
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'generate'
I’m sure it wouldn’t be too hard to fix that error, but could you please describe the functionality you want to implement in that space using “natural language” rather than just model names?
Just a few lines would be fine… Otherwise, while I can fix the error itself, I won’t be able to make it work properly.
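For reference, that ModuleNotFoundError only means Python could not find a module named generate on its import path inside the Space. Here is a minimal sketch of how you could check for that before importing. The helper name module_available is made up for illustration, and I'm assuming generate is supposed to be a generate.py file copied from the model's repo, which I haven't confirmed.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if Python can locate a module called `name`.

    Hypothetical helper for illustration; it only checks the import path,
    it does not import or run the module.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for missing parent packages or malformed module entries.
        return False


# 'generate' is the module name taken from the traceback above.
if not module_available("generate"):
    print("'generate' is not importable: the file is probably missing from the Space.")
```

This only tells you whether the file is there. Even if the import succeeds, the model still has to fit on the hardware.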
The tencent/SongGeneration model is primarily designed as a large-scale framework (LeLM and music codec) with parameters extending into the billions. Because of its immense memory requirements and specialized inference architecture, it cannot be hosted on Hugging Face’s free CPU-based Inference API.
The daydreamlive/DreamVAE model is not available through the free hosted API because it requires dedicated hardware. Inference needs a machine with around 7.5 GB of VRAM and 27 GB of RAM, which exceeds the limits of the free Hugging Face infrastructure.
Yeah. That’s right. And that’s a problem that can’t be solved even with the best programming or coding. Generative AI is very powerful, but “it’s not magic. It’s technology.” It’s technology that makes it easy to achieve what’s possible. So, it can’t make the impossible possible. By the way, even without generative AI, it’s impossible for me, at least, to make the impossible possible.
So, at a minimum, you’ll probably need to rent a GPU from HF for a fee, or if that’s not possible, you’ll need to find some kind of existing API. However, APIs aren’t usually free either.
I tried to download the models from their repos.
Oh, I see. Running OSS models on a local PC is one option. If you have a PC with a really powerful GeForce GPU, you could probably get it to work with enough effort…
But if it’s like my PC’s GPU, with only 8 GB of VRAM, it might run, but it’ll be super slow…
VRAM capacity isn’t the only indicator of GPU performance, but when it comes to generative AI, running out of VRAM is the biggest hurdle you’ll face. In that case, it’s pretty hopeless—it’s easier to just buy a new GPU or rent one on the cloud.
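To get a rough feel for whether a model can fit at all, you can do the arithmetic yourself: number of parameters times bytes per parameter. Below is a minimal sketch under loud assumptions. The 3-billion parameter count is a made-up example (not the real size of any model mentioned here), the 80% headroom factor is my own rule of thumb, and the estimate covers the weights only, so actual usage during inference will be higher.

```python
def estimate_weight_gb(num_params: int, bytes_per_param: int) -> float:
    """Rough size of the model weights in GB (1 GB = 1e9 bytes).

    Weights only: activations, caches, and framework overhead are NOT included.
    """
    return num_params * bytes_per_param / 1e9


def probably_fits(weights_gb: float, vram_gb: float, headroom: float = 0.8) -> bool:
    """Guess whether the weights fit, leaving some VRAM free.

    The 0.8 headroom factor is an assumption for illustration, not a measured value.
    """
    return weights_gb <= vram_gb * headroom


# Hypothetical 3-billion-parameter model on a GPU with 8 GB of VRAM.
params = 3_000_000_000
fp16_gb = estimate_weight_gb(params, 2)  # 2 bytes per parameter in fp16
fp32_gb = estimate_weight_gb(params, 4)  # 4 bytes per parameter in fp32

print(f"fp16: {fp16_gb:.1f} GB, fits in 8 GB: {probably_fits(fp16_gb, 8)}")
print(f"fp32: {fp32_gb:.1f} GB, fits in 8 GB: {probably_fits(fp32_gb, 8)}")
```

If even the optimistic number is over your VRAM, no amount of code fixes will help.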
Or, if a model is too large to run, you could look for a different, lighter model with a similar purpose.
Why don’t you prove it by duplicating my space: New Riffusion - a Hugging Face Space by Gertie2013
What does that mean???