🙋 Hello! I'm an MS candidate at Stanford University studying Computer Science and Music.

about me

I did my BS in Computer Science and BA in Music Performance at Stanford. Previously, I did competitive linguistics (silver medalist at the 2018 International Linguistics Olympiad), ran a national horticulture organization, and kept bees!

🔬 I research NLP: the art of making computers understand and work with human language.
I led a research group working on building social chatbots and won $100K.

research

As part of the Stanford NLP Group, I published at NAACL, EMNLP, and ACL. I also help maintain the English Web Treebank.

⚒️ I've spent time at Google AI, DE Shaw Research and Amazon AWS AI. In the fall, I'll be starting at Hudson River Trading in NYC.

work experience

Previously, I worked on understanding the ability of large language models to reason at Google, built neural networks for drug discovery at DE Shaw Research, and developed efficient ASR models at Amazon AWS AI on the Transcribe team, working with Katrin Kirchhoff [NAACL 2021]. In Fall 2021, I will be at Google Research working on understanding the limits of extremely large language models.

🎻 I'm a concert pianist and organist! Click here to see my videos!

music

Currently, I study piano with Dr. Frederick Weldy and organ with Dr. Robert Huw Morgan. I have opened the Holiday Musicale twice on organ. I also enjoy playing violin in the Stanford Symphony, where I have served as President.

For more of my recent performances, check out my YouTube channel.

✏️ In my spare time, I lecture a Stanford course, design fun linguistics puzzles, and think about mass transit systems.

interests

  • I lectured CS 106L, a Stanford elective about the C++ language, for the 2020-2021 school year! I really enjoyed teaching the beauty of C++ to students from all areas of Stanford.

  • I am on the organizing committee for the North American Computational Linguistics Open, a linguistics puzzles competition for high school students. My problems have appeared on contests in the UK, Ireland, Australia, Canada, and Russia! Here's one of my favorite problems I've authored.

  • I co-founded Stanford ACMLab, a club for undergraduates to learn machine learning in a hands-on way. Our members gave a spotlight presentation at WELM @ ICLR'21 and were also accepted to SemEval @ ACL'21.


Publications

Neural, Neural Everywhere: Controlled Generation Meets Scaffolded, Structured Dialogue

to appear, Proceedings of the 2021 Alexa Prize

We present the second iteration of Chirpy Cardinal, an open-domain dialogue agent developed for the Alexa Prize SGC4 competition. We focus on improving conversational flexibility, initiative, and coherence, introducing an array of new neural methods and making major improvements in entity linking, topical transitions, and latency. These components come together to create a refreshed Chirpy Cardinal that is able to initiate conversations filled with interesting facts, engaging topics, and heartfelt responses.

Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment

NAACL 2021

Sequence-to-sequence Transformer speech recognition models achieve extremely high performance, but are difficult to deploy in the wild due to long decoding times. Non-autoregressive models are significantly faster, but at the cost of degraded performance. We introduce a new non-autoregressive method, Align-Refine, which significantly narrows this gap. Our method achieves SoTA-level performance at a 6x speedup over previous work.

pdf

Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT

EACL 2021

Previous work has demonstrated that Transformer encoders learn certain linguistic features, like part-of-speech and syntax. In this work, we demonstrate that this is also the case for deeper linguistic features like morphosyntactic alignment, a core attribute of all languages. We find that the predictive distributions of subjecthood classifiers reflect the morphosyntactic alignment of their training languages, demonstrating the influence of high-level grammatical features not manifested in any one input sentence.

pdf techxplore

Finding Universal Grammatical Relations in Multilingual BERT

ACL 2020

We show that Multilingual BERT is able to learn a general syntactic structure applicable to a variety of natural languages. Additionally, we find evidence that mBERT learns cross-lingual syntactic categories like “subject” and “adverb”—categories that largely agree with traditional linguistic concepts of syntax! Our results imply that simply by reading a large amount of text, mBERT is able to represent syntax—something fundamental to understanding language—in a way that seems to apply across many of the languages it comprehends.

pdf blogpost acm news stanford engineering

An ML algorithm can optimize the day of trigger to improve in vitro fertilization outcomes

Fertility and Sterility

Injecting hormones to trigger ovulation is one of the most important parts of IVF, but the impact of timing decisions has not been previously studied. We applied causal inference methods to optimize the timing, achieving strong results with a simple model!

Changing stimulation protocol on repeat conventional ovarian stimulation cycles does not lead to improved laboratory outcomes

Fertility and Sterility

Physicians often change stimulation protocols after an unsuccessful cycle. We demonstrate empirically that this is not beneficial; on the contrary, we found a minor but statistically significant improvement in certain laboratory outcomes in the group where the same approach was used a second time.

Development and Validation of an Artificial Intelligence System to Optimize Clinician Review of Patient Records

JAMA Network Open, Jul 2021

PDF clinical records are inconsistent and hard to understand, so physician review is time-consuming and fraught with error. We developed a system that summarizes and presents these records in an easy-to-understand web interface. A study with first-time physician users demonstrated a 20% speedup in review time!