NLP
53 researchers · 26 papers · 58 projects · 58 builders
Researchers (53)
Oriol Vinyals
Google DeepMind
415,165 citations · h-index 95
Jeff Dean
Google DeepMind
400,701 citations · h-index 124
Alec Radford
OpenAI
346,122 citations · h-index 24
Chris Manning
Stanford University
316,101 citations · h-index 122
Jürgen Schmidhuber
KAUST / IDSIA
302,473 citations · h-index 105
Aidan Gomez
Cohere
260,248 citations · h-index 15
Tomas Mikolov
Czech Institute of Informatics (CIIRC)
165,000 citations · h-index 42
Kyunghyun Cho
New York University / Genentech
140,000 citations · h-index 82
Aaron Courville
Mila / Université de Montréal
135,000 citations · h-index 72
Jacob Devlin
127,030 citations · h-index 23
Samy Bengio
Apple AI/ML
112,069 citations · h-index 72
Jason Weston
Meta AI
112,000 citations · h-index 88
Dan Jurafsky
Stanford University
72,000 citations · h-index 82
Luke Zettlemoyer
University of Washington / Meta AI
62,000 citations · h-index 78
Marc'Aurelio Ranzato
Meta AI
62,000 citations · h-index 60
Oren Etzioni
Allen Institute for AI (AI2)
52,000 citations · h-index 85
Tom Brown
OpenAI
52,000 citations · h-index 25
Noah Smith
University of Washington / Allen AI
48,000 citations · h-index 82
Dan Roth
University of Pennsylvania
48,000 citations · h-index 82
Yejin Choi
University of Washington / Allen AI
45,000 citations · h-index 75
Percy Liang
Stanford University
42,998 citations · h-index 85
Sanjeev Arora
Princeton University
42,000 citations · h-index 68
Lior Wolf
Tel Aviv University
38,000 citations · h-index 65
Hang Li
ByteDance AI Lab
38,000 citations · h-index 65
Xiaodong He
JD AI Research
35,000 citations · h-index 58
Regina Barzilay
MIT CSAIL
32,000 citations · h-index 75
Hannaneh Hajishirzi
University of Washington / Allen AI
32,000 citations · h-index 55
Zhiyuan Liu
Tsinghua University
32,000 citations · h-index 60
Danqi Chen
Princeton University
32,000 citations · h-index 38
Graham Neubig
Carnegie Mellon University
31,000 citations · h-index 60
Mirella Lapata
University of Edinburgh
30,000 citations · h-index 68
Noam Shazeer
28,748 citations · h-index 34
Maosong Sun
Tsinghua University
28,000 citations · h-index 65
Sanja Fidler
University of Toronto / NVIDIA
28,000 citations · h-index 58
Iryna Gurevych
TU Darmstadt
28,000 citations · h-index 62
Emily Bender
University of Washington
24,806 citations · h-index 38
Douwe Kiela
Contextual AI
24,000 citations · h-index 45
Sebastian Riedel
UCL / Meta AI
24,000 citations · h-index 55
Pascale Fung
HKUST
22,000 citations · h-index 52
Alexander Rush
Cornell University
22,000 citations · h-index 45
William Yang Wang
UC Santa Barbara
21,000 citations · h-index 52
Kai-Wei Chang
UCLA
19,000 citations · h-index 48
Diyi Yang
Stanford University
16,000 citations · h-index 42
Yue Zhang
Westlake University
16,000 citations · h-index 48
Xiang Ren
USC
15,000 citations · h-index 40
Zhou Yu
Columbia University
14,000 citations · h-index 40
Isabelle Augenstein
University of Copenhagen
13,000 citations · h-index 38
Barbara Plank
LMU Munich
12,000 citations · h-index 40
Tatsunori Hashimoto
Stanford University
12,000 citations · h-index 30
Ryan Cotterell
ETH Zurich
10,000 citations · h-index 38
Ashish Vaswani
Essential AI
9,969 citations · h-index 28
Mrinmaya Sachan
ETH Zurich
8,500 citations · h-index 32
Sara Hooker
Cohere for AI
2,851 citations · h-index 21
Papers (26)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Long Short-Term Memory
Efficient Estimation of Word Representations in Vector Space
Speech and Language Processing
Language Models are Few-Shot Learners
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
Deep contextualized word representations
GPT-4 Technical Report
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Llama 2: Open Foundation and Fine-Tuned Chat Models
Attention Is All You Need
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
Reading Wikipedia to Answer Open-Domain Questions
Training language models to follow instructions with human feedback
Memory Networks
Measuring Massive Multitask Language Understanding
Language Models as Knowledge Bases?
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
SayCan: Grounding Language in Robotic Affordances
Projects (58)
Build a Large Language Model From Scratch
Hands-on book and code repository teaching LLM internals from the ground up. Covers pretraining, fine-tuning, and RLHF with clear PyTorch implementations.
by Sebastian Raschka
Pinecone AI Education & Canopy
Developer education content and open-source RAG framework for Pinecone. Tutorials on vector search, embeddings, and building production RAG systems.
by James Briggs
sentence-transformers
The most popular library for computing dense vector representations of text. Powers semantic search, clustering, and similarity across millions of applications.
by Nils Reimers
TRL (Transformer Reinforcement Learning)
Hugging Face library for training language models with RLHF, DPO, and PPO. The standard open-source toolkit for alignment fine-tuning.
by Lewis Tunstall
Hugging Face Inference Endpoints
Production deployment platform for transformer models. One-click deployment of any Hugging Face model with auto-scaling and optimized inference.
by Philipp Schmid
GPT-NeoX / Pythia
EleutherAI's open-source large language model suite. Pythia provides 16 models (70M to 12B parameters), each released with intermediate training checkpoints, for scientific study of LLM training dynamics.
by Stella Biderman
Together AI Platform
Cloud platform for training, fine-tuning, and running open-source AI models. Offers the fastest inference for Llama, Mixtral, and other open models.
by Vinicius Moens
Mistral AI Models
Open-weight frontier models from Europe. Mistral 7B, Mixtral 8x7B, and Mistral Large compete with models 10x their size through architecture innovations.
by Arthur Mensch
Sakana AI — Evolutionary Model Merging
Nature-inspired AI research lab using evolutionary algorithms to automatically merge and create new foundation models without training from scratch.
by David Ha
Imbue AI Agents
AI systems that reason about code and build software autonomously. Imbue trains models optimized for practical reasoning and agent capabilities.
by Kanjun Qiu
Adept ACT-1
AI model trained to use software tools like a human. ACT-1 can navigate web browsers, use spreadsheets, and operate enterprise software via natural language.
by David Luan
Jasper AI
AI content generation platform for enterprise marketing teams. Used by 100K+ businesses to create on-brand copy, blog posts, and campaigns at scale.
by Dave Rogenmoser
Character.AI
Platform for creating and chatting with AI characters. Over 20M monthly users interact with millions of user-created AI personalities and assistants.
by Daniel De Freitas
Inflection Pi / Microsoft AI
Personal AI assistant designed for emotional intelligence and natural conversation. Pi pioneered empathetic AI before Mustafa Suleyman moved to lead Microsoft AI.
by Mustafa Suleyman
Perplexity AI
AI-native answer engine with real-time web search and citations. Redefining search by providing direct, sourced answers instead of link lists.
by Aravind Srinivas
AssemblyAI Universal-2
State-of-the-art speech-to-text API with speaker diarization, sentiment analysis, and summarization. Universal-2 model achieves best-in-class accuracy.
by Dylan Fox
Deepgram Nova-2
Enterprise speech AI platform with real-time transcription, custom vocabulary, and language understanding. Nova-2 delivers industry-leading speed and accuracy.
by Scott Stephenson
Cohere
Enterprise LLM platform built by a Transformer paper co-author. RAG, embeddings, and language understanding for enterprise search and generation.
by Aidan Gomez
Allen AI (AI2)
Nonprofit AI research lab founded by Paul Allen. Builds open-source AI tools: Semantic Scholar, OLMo, AI2-THOR. Mission: AI for the common good.
by Ali Farhadi
Semantic Scholar
AI-powered academic search engine by Allen AI. Indexes 200M+ papers with AI-generated TLDRs, citation contexts, and research recommendations.
by Michael Schmitz
Lexion AI
AI contract management for enterprises. Uses NLP to extract key terms, track obligations, and automate legal document workflows.
by Ammad Ahmad
OLMo
Fully open language model from Allen AI. Open training data (Dolma), open code, open weights, open evals. Making LLM science reproducible.
by Dirk Groeneveld
Textio
Augmented writing platform. NLP predicts how language performs in job posts, emails, and business writing. Used by Fortune 500 for inclusive hiring.
by Suchen Zang
Megatron-LM
NVIDIA's framework for training large transformer language models efficiently across thousands of GPUs using model and data parallelism.
by Bryan Catanzaro
PyTorch
The most widely used open-source deep learning framework. Provides dynamic computation graphs, GPU acceleration, and a rich ecosystem for research and production.
by Soumith Chintala
Hugging Face Transformers
The definitive open-source library for state-of-the-art NLP, vision, and multimodal models. Provides 200K+ pretrained models and unified APIs.
by Thomas Wolf
RAG (Retrieval-Augmented Generation)
The original RAG framework combining retrieval and generation for knowledge-intensive NLP. Now a foundational pattern used across the LLM ecosystem.
by Patrick Lewis
LoRA
Low-Rank Adaptation of Large Language Models. The most widely adopted parameter-efficient fine-tuning method, reducing trainable parameters by up to 10,000x.
by Edward Hu
bitsandbytes / QLoRA
Efficient 4-bit quantization library enabling fine-tuning of 65B parameter models on a single 48GB GPU. QLoRA made LLM fine-tuning accessible to everyone.
by Tim Dettmers
FlashAttention
IO-aware exact attention algorithm that is 2-4x faster and uses 5-20x less memory. Now integrated into PyTorch and used by virtually all LLM training runs.
by Tri Dao
Mamba
Selective state space model architecture offering linear-time sequence modeling. Achieves transformer-quality performance with 5x faster inference throughput.
by Albert Gu
Qwen
Alibaba's family of large language and multimodal models. Qwen2 series achieves top performance among open-weight models across benchmarks.
by Junyang Lin
Reka AI Models
Multimodal AI models from Reka AI. Reka Core, Flash, and Edge deliver frontier performance across text, image, video, and audio understanding.
by Yi Tay
PaLM Training Infrastructure
Scaling infrastructure behind Google's PaLM model. Achieved efficient training of 540B parameter models across 6144 TPU v4 chips.
by Reiner Pope
Open Assistant
Community-driven open-source project to create a free, high-quality chat assistant dataset and model. One of the earliest open RLHF efforts.
by Christoph Schuhmann
Lil'Log
Comprehensive technical blog covering LLMs, diffusion, RL, and more. The most widely cited personal ML blog, serving as a reference for researchers worldwide.
by Lilian Weng
Contextual AI RAG 2.0
Enterprise RAG platform that goes beyond naive retrieval. Contextual AI builds RAG-native language models trained end-to-end for grounded generation.
by Douwe Kiela
DSPy
Programming framework for optimizing LLM pipelines. Replaces hand-written prompts with composable, learnable modules that auto-optimize via compilation.
by Omar Khattab
ALiBi Positional Encoding
Attention with Linear Biases enables transformers to generalize to longer sequences than seen during training without learned position embeddings.
by Ofir Press
Machine Learning Mastery
The largest practitioner-focused ML education platform with 1000+ tutorials and 20+ books covering deep learning, NLP, computer vision, and time series.
by Jason Brownlee
OpenAI API Platform
Built and scaled OpenAI's API platform serving GPT-4, DALL-E, and Whisper to millions of developers. Architected the product powering ChatGPT.
by Peter Welinder
OpenAI Engineering
Co-founded OpenAI and led engineering from GPT-1 through ChatGPT and GPT-4. Built the infrastructure and team behind the most impactful AI products.
by Greg Brockman
Thinking Machines Lab
AI research startup founded by Mira Murati after she departed OpenAI, where as CTO she led the technical development of ChatGPT, GPT-4, and DALL-E.
by Mira Murati
Scale AI Data Platform
Enterprise data labeling and AI infrastructure platform. Powers training data for OpenAI, Meta, and US DoD with human-in-the-loop annotation at scale.
by Alexandr Wang
Hugging Face Transformers
The most popular open-source library for state-of-the-art NLP, vision, and audio models, paired with the Hugging Face Hub, which hosts 500K+ models and 100K+ datasets. The GitHub of machine learning.
by Clem Delangue
AI Grants & Investments
Portfolio of AI investments and open-source contributions. Co-leads Pioneer Fund. Previously led GitHub as CEO following its acquisition by Microsoft, and shipped Copilot.
by Nat Friedman
OpenAI Cookbook & DevRel
Built OpenAI's developer ecosystem from scratch. Created the OpenAI Cookbook with 200+ examples, grew the API community to millions of developers.
by Logan Kilpatrick
Latent Space & smol.ai
Latent Space is the leading AI engineering podcast and newsletter. smol.ai builds small, useful AI developer tools. Defined the AI Engineer role.
by Swyx (Shawn Wang)
fast.ai
Free deep learning courses and the fastai library. Made cutting-edge deep learning accessible to anyone who can code. Used by hundreds of thousands of students.
by Jeremy Howard
ML Paper Explanations (YouTube)
YouTube channel with 250K+ subscribers explaining cutting-edge ML papers. Co-created OpenAssistant, an open-source ChatGPT alternative.
by Yannic Kilcher
Replit
AI-powered coding platform used by 30M+ developers. Features Replit Agent for building full-stack apps from natural language prompts in the browser.
by Amjad Masad
Chain-of-Thought Research
Pioneering research on chain-of-thought prompting, instruction tuning, and emergent abilities of large language models, carried out largely at Google Brain before a move to OpenAI.
by Jason Wei
v0 by Vercel
AI-powered UI generation. Describe a component in natural language, get production-ready React/Tailwind code. Uses LLMs for code generation.
by Guillermo Rauch
llm
CLI tool for interacting with Large Language Models. Supports OpenAI, Claude, local models via plugins. Log and search prompts.
by Simon Willison
DocETL
AI-powered data processing pipelines for unstructured documents. LLM-driven extraction, transformation, and analysis at scale.
by Shreya Shankar
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs. Reproduces GPT-2 (124M) in ~$20 of compute.
by Andrej Karpathy
MosaicML / DBRX
Open-source efficient LLM training platform. DBRX is a 132B-parameter MoE model rivaling GPT-3.5. MosaicML was acquired by Databricks for $1.3B.
by Jonathan Frankle
LangChain
Framework for developing applications powered by language models. Connects LLMs to external data, tools, and agents.
by Harrison Chase