David J. Lee
Research Engineer at Scale • GitHub • LinkedIn • Email • New York, NY
I'm an incoming ML research engineer on the MLDG team at Scale.
Prior to Scale, I was a graduate student at Cornell, where I worked with Kevin Ellis on code generation, focusing on synthetic data generation for LLM coding applications. In particular, I designed algorithms to generate diverse training datasets for code generation models, using ideas from novelty search. I also worked on probabilistic program synthesis for the ARC-AGI benchmark.
As an undergrad at Williams, I worked on adaptive quotient filters, concurrent program analysis, and knot theory. My thesis was advised by Shikha Singh and Sam McCauley.