LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 11 items • Updated 2 days ago • 55
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models Paper • 2410.13267 • Published Oct 17, 2024 • 1
Memories are One-to-Many Mapping Alleviators in Talking Face Generation Paper • 2212.05005 • Published Dec 9, 2022
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder Paper • 2303.17550 • Published Mar 30, 2023
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis Paper • 2502.04128 • Published Feb 6, 2025 • 25