💬  click anything to find out more!

🙋 Hello! I'm an MS candidate at Stanford University studying Computer Science and Music.

about me

I'm pursuing an MS/BS/BA in Computer Science (Artificial Intelligence) and Music Performance. Previously, I did competitive linguistics (silver medalist at the 2018 International Linguistics Olympiad), ran a national horticulture organization, and kept bees!

🔬 I research NLP: the art of making computers understand and work with human language.

research

I work with Prof. Chris Manning as part of the Stanford NLP Group, where I coordinate the weekly NLP Lunch seminars. I've published in a variety of NLP and bioinformatics conferences and have served as a reviewer for NAACL, EMNLP, and ACL. I also help maintain the English Web Treebank, an open-source dependency treebank.

👩‍💻 In 2020-2021, I led Stanford's Chirpy Cardinal research team, competing in the Amazon Alexa Prize Challenge for a $500K prize.

chirpy cardinal

On Team Chirpy Cardinal, we built an emotionally engaging socialbot that participates in interesting and supportive conversations. We competed in the SGC4 finals and are awaiting results!

In the meantime, click here to try a live demo of our bot from last year!

⚒️ I've worked at DE Shaw Research and Amazon AWS AI. In Fall 2021, I will be at Google Research.

work experience

Currently, I'm building graph convolutional neural networks for drug discovery at DE Shaw Research. Previously, I developed efficient ASR models at Amazon AWS AI on the Transcribe team, working with Katrin Kirchhoff [NAACL 2021]. In Fall 2021, I will be at Google Research working on understanding the limits of extremely large language models.

🎻 I'm a concert pianist and organist! Click here to see my videos!

music

Currently, I study piano with Dr. Frederick Weldy and organ with Dr. Robert Huw Morgan. I've opened the Holiday Musicale twice on organ, and I also enjoy playing violin in the Stanford Symphony, where I serve as President.

For more of my recent performances, check out my YouTube channel.

✏️ In my spare time, I lecture a Stanford course, design fun linguistics puzzles, and think about mass transit systems.

interests

  • I lectured CS 106L, a Stanford elective about the C++ language, for the 2020-2021 school year! I really enjoyed teaching the beauty of C++ to students from all areas of Stanford.

  • I am on the organizing committee for the North American Computational Linguistics Open, a linguistics puzzles competition for high school students. My problems have appeared on contests in the UK, Ireland, Australia, Canada, and Russia! Here's one of my favorite problems I've authored.

  • I co-founded Stanford ACMLab, a club for undergraduates to learn machine learning in a hands-on way. Our members gave a spotlight presentation at WELM @ ICLR'21 and were also accepted to SemEval @ ACL'21.


Publications (click any to find out more!)

Neural, Neural Everywhere: Controlled Generation Meets Scaffolded, Structured Dialogue

To appear in Proceedings of the 2021 Alexa Prize

We present the second iteration of Chirpy Cardinal, an open-domain dialogue agent developed for the Alexa Prize SGC4 competition. We focus on improving conversational flexibility, initiative, and coherence, introducing an array of new neural methods and making major improvements in entity linking, topical transitions, and latency. These components come together to create a refreshed Chirpy Cardinal that is able to initiate conversations filled with interesting facts, engaging topics, and heartfelt responses.

Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment

NAACL 2021

Sequence-to-sequence Transformer speech recognition models achieve extremely high performance, but are difficult to deploy in the wild due to long decoding times. Non-autoregressive models are significantly faster, but at the cost of degraded performance. We introduce a new non-autoregressive method, Align-Refine, which significantly narrows this gap. Our method achieves SoTA-level performance at a 6x speedup over previous work.

pdf

Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT

EACL 2021

Previous work has demonstrated that Transformer encoders learn certain linguistic features, like part-of-speech and syntax. In this work, we demonstrate that this is also the case for deeper linguistic features like morphosyntactic alignment, a core attribute of all languages. We find that the predictive distributions of subjecthood classifiers reflect the morphosyntactic alignment of their training languages, demonstrating the influence of high-level grammatical features not manifested in any one input sentence.

pdf techxplore

Finding Universal Grammatical Relations in Multilingual BERT

EACL 2021

We show that Multilingual BERT is able to learn a general syntactic structure applicable to a variety of natural languages. Additionally, we find evidence that mBERT learns cross-lingual syntactic categories like “subject” and “adverb”—categories that largely agree with traditional linguistic concepts of syntax! Our results imply that simply by reading a large amount of text, mBERT is able to represent syntax—something fundamental to understanding language—in a way that seems to apply across many of the languages it comprehends.

pdf blogpost acm news stanford engineering

An ML algorithm can optimize the day of trigger to improve in vitro fertilization outcomes

Fertility and Sterility

Injecting hormones to trigger ovulation is one of the most important parts of IVF, but the impact of timing decisions has not been previously studied. We applied causal inference methods to optimize the timing, achieving strong results with a simple model!

Changing stimulation protocol on repeat conventional ovarian stimulation cycles does not lead to improved laboratory outcomes

Fertility and Sterility

Physicians often change stimulation protocols after an unsuccessful cycle. We demonstrate empirically that this is not beneficial; on the contrary, we found a minor but statistically significant improvement in certain laboratory outcomes in the group where the same approach was used a second time.

Development and Validation of an Artificial Intelligence System to Optimize Clinician Review of Patient Records

JAMA Network Open, Jul 2021

PDF clinical records are inconsistent and hard to understand, so physician review is time-consuming and fraught with error. We developed a system that summarizes and presents these records in an easy-to-understand web interface. A study with first-time physician users demonstrated a 20% speedup in review time!