Shreya Shankar
Builder · Berkeley
@shreyashankar
PhD at UC Berkeley. Building DocETL — AI-powered data processing pipelines. Research on ML data quality and LLM evaluation.
Research Foundations (2)
Projects (1)
AI-powered data processing pipelines for unstructured documents. LLM-driven extraction, transformation, and analysis at scale.
Built on research:
AI Suggested Researchers
Researchers whose work may be relevant to your projects (auto-detected)
Hannaneh Hajishirzi
University of Washington / Allen AI
Kyunghyun Cho
New York University / Genentech
Aaron Courville
Mila / Université de Montréal
Luke Zettlemoyer
University of Washington / Meta AI
AI Suggested Papers
Papers that may have inspired your projects (auto-detected by domain & keyword analysis)
For DocETL:
CPM: A Large-scale Generative Chinese Pre-trained Language Model
AI Open · 2021 · 1,200 citations
OLMo: Accelerating the Science of Language Models
ACL 2024 · 2024 · 800 citations
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
NAACL 2019 · 2019 · 108,124 citations