NLP

53 researchers · 26 papers · 58 projects · 58 builders

Researchers (53)

Oriol Vinyals

Google DeepMind

415,165 citations · h-index 95

Jeff Dean

Google DeepMind

400,701 citations · h-index 124

Alec Radford

OpenAI

346,122 citations · h-index 24

Chris Manning

Stanford University

316,101 citations · h-index 122

Jürgen Schmidhuber

KAUST / IDSIA

302,473 citations · h-index 105

Aidan Gomez

Cohere

260,248 citations · h-index 15

Tomas Mikolov

Czech Institute of Informatics (CIIRC)

165,000 citations · h-index 42

Kyunghyun Cho

New York University / Genentech

140,000 citations · h-index 82

Aaron Courville

Mila / Université de Montréal

135,000 citations · h-index 72

Jacob Devlin

Google

127,030 citations · h-index 23

Samy Bengio

Apple AI/ML

112,069 citations · h-index 72

Jason Weston

Meta AI

112,000 citations · h-index 88

Dan Jurafsky

Stanford University

72,000 citations · h-index 82

Luke Zettlemoyer

University of Washington / Meta AI

62,000 citations · h-index 78

Marc'Aurelio Ranzato

Meta AI

62,000 citations · h-index 60

Oren Etzioni

Allen Institute for AI (AI2)

52,000 citations · h-index 85

Tom Brown

OpenAI

52,000 citations · h-index 25

Noah Smith

University of Washington / Allen AI

48,000 citations · h-index 82

Dan Roth

University of Pennsylvania

48,000 citations · h-index 82

Yejin Choi

University of Washington / Allen AI

45,000 citations · h-index 75

Percy Liang

Stanford University

42,998 citations · h-index 85

Sanjeev Arora

Princeton University

42,000 citations · h-index 68

Lior Wolf

Tel Aviv University

38,000 citations · h-index 65

Hang Li

ByteDance AI Lab

38,000 citations · h-index 65

Xiaodong He

JD AI Research

35,000 citations · h-index 58

Regina Barzilay

MIT CSAIL

32,000 citations · h-index 75

Hannaneh Hajishirzi

University of Washington / Allen AI

32,000 citations · h-index 55

Zhiyuan Liu

Tsinghua University

32,000 citations · h-index 60

Danqi Chen

Princeton University

32,000 citations · h-index 38

Graham Neubig

Carnegie Mellon University

31,000 citations · h-index 60

Mirella Lapata

University of Edinburgh

30,000 citations · h-index 68

Noam Shazeer

Google

28,748 citations · h-index 34

Maosong Sun

Tsinghua University

28,000 citations · h-index 65

Sanja Fidler

University of Toronto / NVIDIA

28,000 citations · h-index 58

Iryna Gurevych

TU Darmstadt

28,000 citations · h-index 62

Emily Bender

University of Washington

24,806 citations · h-index 38

Douwe Kiela

Contextual AI

24,000 citations · h-index 45

Sebastian Riedel

UCL / Meta AI

24,000 citations · h-index 55

Pascale Fung

HKUST

22,000 citations · h-index 52

Alexander Rush

Cornell University

22,000 citations · h-index 45

William Yang Wang

UC Santa Barbara

21,000 citations · h-index 52

Kai-Wei Chang

UCLA

19,000 citations · h-index 48

Diyi Yang

Stanford University

16,000 citations · h-index 42

Yue Zhang

Westlake University

16,000 citations · h-index 48

Xiang Ren

USC

15,000 citations · h-index 40

Zhou Yu

Columbia University

14,000 citations · h-index 40

Isabelle Augenstein

University of Copenhagen

13,000 citations · h-index 38

Barbara Plank

LMU Munich

12,000 citations · h-index 40

Tatsunori Hashimoto

Stanford University

12,000 citations · h-index 30

Ryan Cotterell

ETH Zurich

10,000 citations · h-index 38

Ashish Vaswani

Essential AI

9,969 citations · h-index 28

Mrinmaya Sachan

ETH Zurich

8,500 citations · h-index 32

Sara Hooker

Cohere for AI

2,851 citations · h-index 21

Papers (26)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

NAACL 2019 · 2019 · 108,124 citations

Long Short-Term Memory

Neural Computation · 1997 · 95,000 citations

Efficient Estimation of Word Representations in Vector Space

ICLR Workshop 2013 · 2013 · 38,000 citations

Speech and Language Processing

Textbook (3rd Edition draft) · 2023 · 32,000 citations

Language Models are Few-Shot Learners

NeurIPS 2020 · 2020 · 18,500 citations

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

EMNLP 2014 · 2014 · 16,000 citations

Deep contextualized word representations

NAACL 2018 · 2018 · 15,000 citations

GPT-4 Technical Report

arXiv preprint · 2023 · 8,500 citations

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

EMNLP 2019 · 2019 · 7,500 citations

Llama 2: Open Foundation and Fine-Tuned Chat Models

arXiv preprint · 2023 · 7,200 citations

Attention Is All You Need

NeurIPS 2017 · 2017 · 6,510 citations

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?

FAccT 2021 · 2021 · 5,500 citations

Reading Wikipedia to Answer Open-Domain Questions

ACL 2017 · 2017 · 5,200 citations

Training language models to follow instructions with human feedback

NeurIPS 2022 · 2022 · 4,260 citations

Memory Networks

ICLR 2015 · 2015 · 4,200 citations

Measuring Massive Multitask Language Understanding

ICLR 2021 · 2021 · 3,800 citations

Language Models as Knowledge Bases?

EMNLP 2019 · 2019 · 3,200 citations

Language Models are Few-Shot Learners

NeurIPS 2020 · 2020 · 3,027 citations

COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

ACL 2019 · 2019 · 1,500 citations

SayCan: Grounding Language in Robotic Affordances

CoRL 2022 · 2023 · 1,200 citations

Projects (58)

Build a Large Language Model From Scratch

Hands-on book and code repository teaching LLM internals from the ground up. Covers pretraining, fine-tuning, and RLHF with clear PyTorch implementations.

by Sebastian Raschka

Pinecone AI Education & Canopy

Developer education content and open-source RAG framework for Pinecone. Tutorials on vector search, embeddings, and building production RAG systems.

by James Briggs

sentence-transformers

The most popular library for computing dense vector representations of text. Powers semantic search, clustering, and similarity across millions of applications.

by Nils Reimers

TRL (Transformer Reinforcement Learning)

Hugging Face library for training language models with RLHF, DPO, and PPO. The standard open-source toolkit for alignment fine-tuning.

by Lewis Tunstall

Hugging Face Inference Endpoints

Production deployment platform for transformer models. One-click deployment of any Hugging Face model with auto-scaling and optimized inference.

by Philipp Schmid

GPT-NeoX / Pythia

EleutherAI's open-source large language model suite. Pythia provides 16 models (70M-12B parameters) for scientific study of LLM training dynamics.

by Stella Biderman

Together AI Platform

Cloud platform for training, fine-tuning, and running open-source AI models. Offers the fastest inference for Llama, Mixtral, and other open models.

by Vinicius Moens

Mistral AI Models

Open-weight frontier models from Europe. Mistral 7B, Mixtral 8x7B, and Mistral Large compete with models 10x their size through architecture innovations.

by Arthur Mensch

Sakana AI — Evolutionary Model Merging

Nature-inspired AI research lab using evolutionary algorithms to automatically merge and create new foundation models without training from scratch.

by David Ha

Imbue AI Agents

AI systems that reason about code and build software autonomously. Imbue trains models optimized for practical reasoning and agent capabilities.

by Kanjun Qiu

Adept ACT-1

AI model trained to use software tools like a human. ACT-1 can navigate web browsers, use spreadsheets, and operate enterprise software via natural language.

by David Luan

Jasper AI

AI content generation platform for enterprise marketing teams. Used by 100K+ businesses to create on-brand copy, blog posts, and campaigns at scale.

by Dave Rogenmoser

Character.AI

Platform for creating and chatting with AI characters. Over 20M monthly users interact with millions of user-created AI personalities and assistants.

by Daniel De Freitas

Inflection Pi / Microsoft AI

Personal AI assistant designed for emotional intelligence and natural conversation. Pi pioneered empathetic AI before Mustafa Suleyman moved to lead Microsoft AI.

by Mustafa Suleyman

Perplexity AI

AI-native answer engine with real-time web search and citations. Redefining search by providing direct, sourced answers instead of link lists.

by Aravind Srinivas

AssemblyAI Universal-2

State-of-the-art speech-to-text API with speaker diarization, sentiment analysis, and summarization. Universal-2 model achieves best-in-class accuracy.

by Dylan Fox

Deepgram Nova-2

Enterprise speech AI platform with real-time transcription, custom vocabulary, and language understanding. Nova-2 delivers industry-leading speed and accuracy.

by Scott Stephenson

Cohere

Enterprise LLM platform built by a Transformer paper co-author. RAG, embeddings, and language understanding for enterprise search and generation.

by Aidan Gomez

Allen AI (AI2)

Nonprofit AI research lab founded by Paul Allen. Builds open-source AI tools: Semantic Scholar, OLMo, AI2 THOR. Mission: AI for the common good.

by Ali Farhadi

Semantic Scholar

AI-powered academic search engine by Allen AI. Indexes 200M+ papers with AI-generated TLDRs, citation contexts, and research recommendations.

by Michael Schmitz

Lexion AI

AI contract management for enterprises. Uses NLP to extract key terms, track obligations, and automate legal document workflows.

by Ammad Ahmad

OLMo

Fully open language model from Allen AI. Open training data (Dolma), open code, open weights, open evals. Making LLM science reproducible.

by Dirk Groeneveld

Textio

Augmented writing platform. NLP predicts how language performs in job posts, emails, and business writing. Used by Fortune 500 for inclusive hiring.

by Suchen Zang

Megatron-LM

NVIDIA's framework for training large transformer language models efficiently across thousands of GPUs using model and data parallelism.

by Bryan Catanzaro

PyTorch

The most widely used open-source deep learning framework. Provides dynamic computation graphs, GPU acceleration, and a rich ecosystem for research and production.

by Soumith Chintala

Hugging Face Transformers

The definitive open-source library for state-of-the-art NLP, vision, and multimodal models. Provides 200K+ pretrained models and unified APIs.

by Thomas Wolf

RAG (Retrieval-Augmented Generation)

The original RAG framework combining retrieval and generation for knowledge-intensive NLP. Now a foundational pattern used across the LLM ecosystem.

by Patrick Lewis

LoRA

Low-Rank Adaptation of Large Language Models. The most widely adopted parameter-efficient fine-tuning method, reducing trainable parameters by up to 10,000x.

by Edward Hu

bitsandbytes / QLoRA

Efficient 4-bit quantization library enabling fine-tuning of 65B parameter models on a single 48GB GPU. QLoRA made LLM fine-tuning accessible to everyone.

by Tim Dettmers

FlashAttention

IO-aware exact attention algorithm that is 2-4x faster and uses 5-20x less memory. Now integrated into PyTorch and used by virtually all LLM training runs.

by Tri Dao

Mamba

Selective state space model architecture offering linear-time sequence modeling. Achieves transformer-quality performance with 5x faster inference throughput.

by Albert Gu

Qwen

Alibaba's family of large language and multimodal models. Qwen2 series achieves top performance among open-weight models across benchmarks.

by Junyang Lin

Reka AI Models

Multimodal AI models from Reka AI. Reka Core, Flash, and Edge deliver frontier performance across text, image, video, and audio understanding.

by Yi Tay

PaLM Training Infrastructure

Scaling infrastructure behind Google's PaLM model. Achieved efficient training of 540B parameter models across 6144 TPU v4 chips.

by Reiner Pope

Open Assistant

Community-driven open-source project to create a free, high-quality chat assistant dataset and model. One of the earliest open RLHF efforts.

by Christoph Schuhmann

Lil'Log

Comprehensive technical blog covering LLMs, diffusion, RL, and more. The most widely cited personal ML blog, serving as a reference for researchers worldwide.

by Lilian Weng

Contextual AI RAG 2.0

Enterprise RAG platform that goes beyond naive retrieval. Contextual AI builds RAG-native language models trained end-to-end for grounded generation.

by Douwe Kiela

DSPy

Programming framework for optimizing LLM pipelines. Replaces hand-written prompts with composable, learnable modules that auto-optimize via compilation.

by Omar Khattab

ALiBi Positional Encoding

Attention with Linear Biases enables transformers to generalize to longer sequences than seen during training without learned position embeddings.

by Ofir Press

Machine Learning Mastery

The largest practitioner-focused ML education platform with 1000+ tutorials and 20+ books covering deep learning, NLP, computer vision, and time series.

by Jason Brownlee

OpenAI API Platform

Built and scaled OpenAI's API platform serving GPT-4, DALL-E, and Whisper to millions of developers. Architected the product powering ChatGPT.

by Peter Welinder

OpenAI Engineering

Co-founded OpenAI and led engineering from GPT-1 through ChatGPT and GPT-4. Built the infrastructure and team behind the most impactful AI products.

by Greg Brockman

Thinking Machines Lab

AI research startup founded after leaving the CTO role at OpenAI. Previously led the technical development of ChatGPT, GPT-4, and DALL-E at OpenAI.

by Mira Murati

Scale AI Data Platform

Enterprise data labeling and AI infrastructure platform. Powers training data for OpenAI, Meta, and US DoD with human-in-the-loop annotation at scale.

by Alexandr Wang

Hugging Face Transformers

The most popular open-source library for state-of-the-art NLP, vision, and audio models. Hosts 500K+ models and 100K+ datasets. The GitHub of machine learning.

by Clem Delangue

AI Grants & Investments

Portfolio of AI investments and open-source contributions. Co-leads Pioneer Fund. Previously acquired GitHub for Microsoft and shipped Copilot.

by Nat Friedman

OpenAI Cookbook & DevRel

Built OpenAI's developer ecosystem from scratch. Created the OpenAI Cookbook with 200+ examples, grew the API community to millions of developers.

by Logan Kilpatrick

Latent Space & smol.ai

Latent Space is the leading AI engineering podcast and newsletter. smol.ai builds small, useful AI developer tools. Defined the AI Engineer role.

by Swyx (Shawn Wang)

fast.ai

Free deep learning courses and the fastai library. Made cutting-edge deep learning accessible to anyone who can code. Used by hundreds of thousands of students.

by Jeremy Howard

ML Paper Explanations (YouTube)

YouTube channel with 250K+ subscribers explaining cutting-edge ML papers. Co-created OpenAssistant, an open-source ChatGPT alternative.

by Yannic Kilcher

Replit

AI-powered coding platform used by 30M+ developers. Features Replit Agent for building full-stack apps from natural language prompts in the browser.

by Amjad Masad

Chain-of-Thought Research

Pioneering research on chain-of-thought prompting, instruction tuning, and emergent abilities of large language models, begun at Google Brain and continued at OpenAI.

by Jason Wei

v0 by Vercel

AI-powered UI generation. Describe a component in natural language, get production-ready React/Tailwind code. Uses LLMs for code generation.

by Guillermo Rauch

llm

CLI tool for interacting with Large Language Models. Supports OpenAI, Claude, local models via plugins. Log and search prompts.

by Simon Willison

DocETL

AI-powered data processing pipelines for unstructured documents. LLM-driven extraction, transformation, and analysis at scale.

by Shreya Shankar

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs. Reproduces GPT-2 (124M) in ~$20 of compute.

by Andrej Karpathy

MosaicML / DBRX

Open-source efficient LLM training platform. DBRX is a 132B MoE model rivaling GPT-3.5. MosaicML was acquired by Databricks for $1.3B.

by Jonathan Frankle

LangChain

Framework for developing applications powered by language models. Connects LLMs to external data, tools, and agents.

by Harrison Chase