Browse Papers

Explore papers systematically

All Papers (18)

The Llama 3 Herd of Models

AI @ Meta, Abhimanyu Dubey, Abhinav Jauhri et al.

arXiv
2024
4.8
(342 reviews)

We introduce a new generation of foundation models, Llama 3. Llama 3 is a herd of language models natively supporting multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.

Computer Science
Large Language Models
Multimodal AI
Natural Language Processing
In Library

Mixtral of Experts

Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux et al.

arXiv
2024
4.7
(178 reviews)

We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs.

Computer Science
Mixture of Experts
Large Language Models
Sparse Models
Read
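The Mixtral abstract describes a concrete routing mechanism: at each layer, a router scores all 8 experts per token, keeps the top 2, and combines their outputs. A minimal NumPy sketch of that top-k routing follows; the names `moe_layer`, `gate_w`, and `experts` are illustrative, not from the paper, and real implementations operate on batched hidden states with trained parameters.

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Sparse MoE layer sketch: route one token's hidden state x to the
    top-k experts (k=2 in Mixtral) and mix their outputs."""
    logits = x @ gate_w                            # router scores, shape (num_experts,)
    topk = np.argsort(logits)[-k:]                 # indices of the k best-scoring experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                       # softmax over the selected experts only
    # Each selected expert processes the token; outputs are combined by router weight.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))
```

Because only k of the feedforward blocks run per token, the layer uses a fraction of the total parameters at inference time, which is the efficiency argument the abstract makes.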

KAN: Kolmogorov-Arnold Networks

Ziming Liu, Yixuan Wang, Sachin Vaidya et al.

arXiv
2024
4.5
(156 reviews)

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights").

Computer Science
Mathematics
Neural Networks
Function Approximation
Mathematical Foundations
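The KAN abstract contrasts MLPs (fixed activations on nodes) with KANs (learnable 1-D activations on edges, with nodes reduced to summation). A toy NumPy sketch of one such layer follows; the paper parameterizes edge functions as splines, while this sketch uses polynomials purely to keep the idea short, and `kan_layer`/`coeffs` are illustrative names.

```python
import numpy as np

def kan_layer(x, coeffs):
    """KAN-style layer sketch: coeffs[j, i, :] are the polynomial coefficients
    of the learnable 1-D activation on the edge from input i to output j.
    Each node simply sums its incoming edge activations."""
    n_out, n_in, _ = coeffs.shape
    out = np.zeros(n_out)
    for j in range(n_out):
        for i in range(n_in):
            out[j] += np.polyval(coeffs[j, i], x[i])  # apply edge function, sum at node
    return out
```

In an MLP the nonlinearity is the same fixed function everywhere and learning happens in the weights; here every edge learns its own scalar function, which is the structural change the abstract highlights.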

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.

arXiv
2024
4.4
(89 reviews)

Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data.

Computer Science
Self-Supervised Learning
Model Fine-tuning
Language Models
In Library

DoRA: Weight-Decomposed Low-Rank Adaptation

Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin et al.

arXiv
2024
4.3
(94 reviews)

Among the widely used parameter-efficient finetuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference costs. However, an accuracy gap often remains between these methods and full fine-tuning (FT).

Computer Science
Parameter-Efficient Fine-tuning
Low-Rank Adaptation
Model Optimization
Read

Creating Non-Abelian Topological Order in Quantum Processors

Ashvin Vishwanath, Nathanan Tantivasadakarn, Ruben Verresen

Nature
2024
4.9
(67 reviews)

Non-Abelian topological order is a coveted state of matter with remarkable properties, including quasiparticles that can be used to process quantum information. We demonstrate the creation of non-Abelian topological order in a quantum processor and control over its anyonic properties.

Physics
Computer Science
Quantum Computing
Topological Order
Quantum Error Correction

Hunting for Quantum-Classical Crossover in Condensed Matter Problems

Nobuyuki Yoshioka, Tsuyoshi Okubo, Yasunari Tsutsui et al.

npj Quantum Information
2024
4.2
(34 reviews)

We provide clear evidence that condensed matter physics is likely to be the primary target for quantum advantage: a systematic error/runtime analysis shows that the quantum-classical crossover point for ground-state simulation falls within a runtime of hours.

Physics
Computer Science
Quantum Advantage
Condensed Matter
Quantum Simulation
In Library

CRISPR-Cas9 Gene Editing for Autoimmune Disease Treatment

Huji Xu, Chen Zhang, Wei Li et al.

Cell
2024
4.8
(123 reviews)

We report the first successful treatment of autoimmune diseases using CRISPR-Cas9 gene editing to create donor T cells that avoid host rejection. All three patients treated remain in remission after six months, representing a breakthrough in cellular therapy.

Biology & Life Sciences
Medicine & Health
CRISPR
Gene Therapy
Autoimmune Diseases
CAR-T Cells
Read

Lenacapavir for HIV Prevention: 100% Efficacy in Clinical Trials

Linda-Gail Bekker, Jean-Michel Molina, Raphael Landovitz

Science
2024
4.9
(234 reviews)

Lenacapavir, a novel HIV capsid inhibitor administered twice yearly, achieved 100% efficacy in preventing HIV infections among African adolescent girls and young women in a Phase 3 clinical trial, representing a paradigm shift in HIV prevention.

Medicine & Health
Biology & Life Sciences
HIV Prevention
Antiviral Drugs
Clinical Trials
Public Health
In Library

Spatial Proteomics Reveals Tissue Architecture at Single-Cell Resolution

Sarah Johnson, Michael Thompson, Emily Chen et al.

Nature Methods
2024
4.6
(87 reviews)

We present advanced spatial proteomics methods that enable protein analysis at single-cell resolution within intact tissues, revealing complex cellular interactions and tissue organization principles previously invisible to conventional techniques.

Biology & Life Sciences
Proteomics
Single-Cell Analysis
Spatial Biology
Tissue Architecture

Breakthrough in Quantum Error Correction Codes

Barbara Terhal, John Preskill, Sergey Bravyi et al.

Nature Physics
2024
4.7
(56 reviews)

We demonstrate a new class of quantum error correction codes that achieve the theoretical threshold for fault-tolerant quantum computation with significantly reduced overhead, bringing practical quantum computers closer to reality.

Physics
Mathematics
Computer Science
Quantum Error Correction
Coding Theory
Fault-Tolerant Computing
In Library

AI-Driven Drug Discovery: From Molecules to Medicine

Demis Hassabis, John Jumper, Kathryn Tunyasuvunakool et al.

Nature
2024
4.8
(189 reviews)

We present an end-to-end AI system that accelerates drug discovery from years to months, using advanced protein structure prediction and molecular design algorithms to identify novel therapeutic compounds with unprecedented accuracy.

Biology & Life Sciences
Computer Science
Medicine & Health
Drug Discovery
Protein Folding
AlphaFold
Computational Biology
Read

Climate Economics: The Social Cost of Carbon in the 2020s

Nicholas Stern, Michael Greenstone, Amir Jina

Science
2024
4.5
(98 reviews)

We provide updated estimates of the social cost of carbon incorporating recent climate science advances and economic modeling improvements, with implications for global climate policy and carbon pricing mechanisms.

Social Sciences
Environmental Sciences
Economics & Business
Climate Change
Environmental Economics
Policy Analysis
Carbon Pricing

Tipping Points in Earth System Components: A Global Assessment

Johan Rockström, Katherine Richardson, Will Steffen et al.

Nature
2024
4.7
(145 reviews)

We assess the current status of major Earth system tipping points, finding that several critical thresholds may be crossed sooner than previously thought, with cascading effects on global climate stability and human societies.

Environmental Sciences
Physics
Climate Tipping Points
Earth System Science
Global Change
Sustainability
In Library

Brain-Computer Interfaces: Decoding Speech from Neural Signals

Leigh Hochberg, Jaimie Henderson, Krishna Shenoy et al.

Nature Neuroscience
2024
4.6
(78 reviews)

We demonstrate the first high-fidelity brain-computer interface capable of decoding intended speech from neural activity in real-time, enabling paralyzed patients to communicate at near-natural speaking rates through thought alone.

Engineering
Biology & Life Sciences
Computer Science
Brain-Computer Interfaces
Neural Decoding
Speech Recognition
Neuroprosthetics

Llama 2: Open Foundation and Fine-Tuned Chat Models

Hugo Touvron, Louis Martin, Kevin Stone et al.

arXiv
2023
4.8
(456 reviews)

We introduce Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.

Computer Science
Large Language Models
Open Source AI
RLHF
Instruction Following
Read

Mistral 7B

Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch et al.

arXiv
2023
4.5
(298 reviews)

We introduce Mistral 7B, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B on all benchmarks we tested.

Computer Science
Language Models
Model Efficiency
Open Source
Instruction Following
In Library

GPT-4 Technical Report

OpenAI, Josh Achiam, Steven Adler et al.

arXiv
2023
4.9
(789 reviews)

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks.

Computer Science
Large Language Models
Multimodal AI
AI Safety
Model Capabilities
Read