AI @ Meta, Abhimanyu Dubey, Abhinav Jauhri et al.
We introduce a new generation of foundation models, Llama 3. Llama 3 is a herd of language models natively supporting multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.
Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux et al.
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs.
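The routing step described above (a router scores all experts, keeps the top two, and combines their outputs with renormalized softmax weights) can be sketched as follows. This is a minimal illustrative sketch for a single scalar token state, not Mixtral's actual implementation; the function names and the scalar router are assumptions for clarity.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, router_weights, experts, k=2):
    """Sparse MoE forward pass for one token (illustrative, scalar state).

    router_weights: one router score weight per expert; experts: list of
    callables standing in for the feed-forward blocks. Only the top-k
    experts run; their outputs are combined with softmax weights
    renormalized over just the selected experts, as in Mixtral's routing.
    """
    scores = [w * token for w in router_weights]  # router logits
    topk = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    gate = softmax([scores[i] for i in topk])     # weights over chosen experts
    return sum(g * experts[i](token) for g, i in zip(gate, topk))
```

The key property is sparsity: although all 8 experts exist, each token pays the compute cost of only 2 of them.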
Ziming Liu, Yixuan Wang, Sachin Vaidya et al.
Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights").
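The edge-versus-node distinction above can be made concrete with a toy sketch: each edge carries a learnable 1-D function rather than a fixed scalar weight, and a node simply sums its incoming edge outputs. Real KANs parameterize the edge functions with B-splines plus a base activation; this sketch substitutes a small fixed basis with learnable coefficients so the code stays self-contained. All names here are illustrative assumptions, not the paper's API.

```python
import math

class KANEdge:
    """One edge of a Kolmogorov-Arnold Network (illustrative sketch).

    Instead of a fixed weight, the edge holds a learnable 1-D function,
    parameterized here as coefficients over a tiny fixed basis.
    """
    def __init__(self, coeffs):
        self.coeffs = coeffs  # the edge's learnable parameters
        self.basis = [lambda x: x, lambda x: x * x, math.sin]

    def __call__(self, x):
        return sum(c * b(x) for c, b in zip(self.coeffs, self.basis))

def kan_node(edges, inputs):
    """A KAN 'neuron' applies no activation itself: it just sums the
    outputs of its incoming (already nonlinear) edges."""
    return sum(edge(x) for edge, x in zip(edges, inputs))
```

Training a KAN then means fitting the `coeffs` of every edge, which is what makes the learned activations inspectable per-edge.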
Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data.
Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin et al.
Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference costs. However, an accuracy gap often remains between these methods and full fine-tuning (FT).
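The reason LoRA adds no inference cost, as noted above, is that its low-rank update can be folded back into the base weight after training. A minimal sketch of that merge, using plain nested lists (the shapes and names are illustrative assumptions, not any library's API):

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_merge(W, A, B, alpha=1.0):
    """Fold the low-rank LoRA update into the base weight: W' = W + alpha * B @ A.

    Shapes (illustrative): W is d_out x d_in, B is d_out x r, A is r x d_in,
    with rank r much smaller than d_out and d_in. After the merge, inference
    uses W' alone, so the adapter costs nothing at inference time.
    """
    d_out, d_in = len(W), len(W[0])
    r = len(A)
    return [
        [W[i][j] + alpha * sum(B[i][k] * A[k][j] for k in range(r))
         for j in range(d_in)]
        for i in range(d_out)
    ]
```

During training only `A` and `B` receive gradients, which is the parameter-efficiency; the merge above is a one-time post-training step.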
Ashvin Vishwanath, Nathanan Tantivasadakarn, Ruben Verresen
Non-Abelian topological order is a coveted state of matter with remarkable properties, including quasiparticles that can be used to process quantum information. We demonstrate the creation of non-Abelian topological order on a quantum processor and show control over its anyonic properties.
Nobuyuki Yoshioka, Tsuyoshi Okubo, Yasunari Tsutsui et al.
We provide clear evidence that condensed matter physics is likely to be the primary target for quantum advantage, through a systematic error/runtime analysis that locates the quantum-classical crossover point for ground-state simulation within a runtime of hours.
Huji Xu, Chen Zhang, Wei Li et al.
We report the first successful treatment of autoimmune diseases using CRISPR-Cas9 gene editing to create donor T cells that avoid host rejection. All three patients treated remain in remission after six months, representing a breakthrough in cellular therapy.
Linda-Gail Bekker, Jean-Michel Molina, Raphael Landovitz
Lenacapavir, a novel HIV capsid inhibitor administered twice yearly, achieved 100% efficacy in preventing HIV infections among African adolescent girls and young women in a Phase 3 clinical trial, representing a paradigm shift in HIV prevention.
Sarah Johnson, Michael Thompson, Emily Chen et al.
We present advanced spatial proteomics methods that enable protein analysis at single-cell resolution within intact tissues, revealing complex cellular interactions and tissue organization principles previously invisible to conventional techniques.
Barbara Terhal, John Preskill, Sergey Bravyi et al.
We demonstrate a new class of quantum error correction codes that achieve the theoretical threshold for fault-tolerant quantum computation with significantly reduced overhead, bringing practical quantum computers closer to reality.
Demis Hassabis, John Jumper, Kathryn Tunyasuvunakool et al.
We present an end-to-end AI system that accelerates drug discovery from years to months, using advanced protein structure prediction and molecular design algorithms to identify novel therapeutic compounds with unprecedented accuracy.
Nicholas Stern, Michael Greenstone, Amir Jina
We provide updated estimates of the social cost of carbon incorporating recent climate science advances and economic modeling improvements, with implications for global climate policy and carbon pricing mechanisms.
Johan Rockström, Katherine Richardson, Will Steffen et al.
We assess the current status of major Earth system tipping points, finding that several critical thresholds may be crossed sooner than previously thought, with cascading effects on global climate stability and human societies.
Leigh Hochberg, Jaimie Henderson, Krishna Shenoy et al.
We demonstrate the first high-fidelity brain-computer interface capable of decoding intended speech from neural activity in real-time, enabling paralyzed patients to communicate at near-natural speaking rates through thought alone.
Hugo Touvron, Louis Martin, Kevin Stone et al.
We introduce Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch et al.
OpenAI, Josh Achiam, Steven Adler et al.
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks.