Generate captions for music audio
Generate edited video frames using text prompts
Generate images guided by sketches, depth, pose, and more
Engage in multimedia chat with LLMs and ML models