YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Paper • 2409.13592 • Published Sep 20, 2024 • 50 • 9
Patch-Level Training for Large Language Models Paper • 2407.12665 • Published Jul 17, 2024 • 17 • 3
LiteSearch: Efficacious Tree Search for LLM Paper • 2407.00320 • Published Jun 29, 2024 • 40 • 5
Scaling Laws for Linear Complexity Language Models Paper • 2406.16690 • Published Jun 24, 2024 • 23 • 4
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17, 2024 • 67 • 4
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Paper • 2406.11831 • Published Jun 17, 2024 • 22 • 4
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published Jun 10, 2024 • 38 • 5
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models Paper • 2406.06563 • Published Jun 3, 2024 • 20 • 10