svjack

No bio provided.

Joined April 2017

Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud that understands text, audio, images, and video and generates speech in real time.

Jupyter Notebook · Updated 5/1/2025

musubi-tuner

No description provided.

Python · Updated 5/1/2025

LLaVA-NeXT

No description provided.

Python · Updated 5/1/2025

FramePack

Let's make video diffusion practical!

Python · Updated 5/1/2025

ai-toolkit

The ultimate training toolkit for finetuning diffusion models

Python · Updated 5/1/2025