💬 Click anything to find out more!
🙋 Hello! I'm an MS candidate at Stanford University studying Computer Science and Music.
🔬 I research NLP: the art of making computers understand and work with human language.
research
I work with Prof. Chris Manning as part of the Stanford NLP Group, where I coordinate the weekly NLP Lunch seminars. I've published at a variety of NLP and bioinformatics conferences and have served as a reviewer for NAACL, EMNLP, and ACL. I also help maintain the English Web Treebank, an open-source dependency treebank.
👩‍💻 In 2020-2021, I led Stanford's Chirpy Cardinal research team. We're competing in the Amazon Alexa Prize Challenge for a $500K prize.
chirpy cardinal
On Team Chirpy Cardinal, we're building an emotionally engaging socialbot that takes part in interesting and supportive conversations. We competed in the SGC4 finals and are waiting for results!
In the meantime, click here to try a live demo of our bot from last year!
⚒️ I've worked at DE Shaw Research and Amazon AWS AI. In Fall 2021, I will be at Google Research.
work experience
Currently, I'm building graph convolutional neural networks for drug discovery at DE Shaw Research. Previously, I developed efficient ASR models on the Transcribe team at Amazon AWS AI, working with Katrin Kirchhoff [NAACL 2021]. In Fall 2021, I will be at Google Research working on understanding the limits of extremely large language models.
🎻 I'm a concert pianist and organist! Click here to see my videos!
music
Currently, I study piano with Dr. Frederick Weldy and organ with Dr. Robert Huw Morgan. Previously, I served as the president of the Stanford Symphony and have opened the Holiday Musicale twice on organ. I also enjoy playing violin in the Stanford Symphony, where I serve as President.
For more of my recent performances, check out my YouTube channel.
✏️ In my spare time, I lecture a Stanford course, design fun linguistics puzzles, and think about mass transit systems.
interests
I lectured CS 106L, a Stanford elective about the C++ language, for the 2020-2021 school year! I really enjoyed teaching the beauty of C++ to students from all areas of Stanford.
I am on the organizing committee for the North American Computational Linguistics Open, a linguistics puzzles competition for high school students. My problems have appeared on contests in the UK, Ireland, Australia, Canada, and Russia! Here's one of my favorite problems I've authored.
I co-founded Stanford ACMLab, a club for undergraduates to learn machine learning in a hands-on way. Our members gave a spotlight presentation at WELM @ ICLR'21 and were also accepted to SemEval @ ACL'21.
Publications: click any to find out more!
-
Neural, Neural Everywhere: Controlled Generation Meets Scaffolded, Structured Dialogue
Alexa Prize Proceedings 2021
-
Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
NAACL 2021
-
Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
EACL 2021
-
Finding Universal Grammatical Relations in Multilingual BERT
ACL 2020
-
A ML algorithm can optimize the day of trigger to improve in vitro fertilization outcomes
Fertility and Sterility, Jul 2021
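-
Development and Validation of an Artificial Intelligence System to Optimize Clinician Review of Patient Records
JAMA Network Open, Jul 2021
-
Changing stimulation protocol on repeat conventional ovarian stimulation cycles does not lead to improved laboratory outcomes
Fertility and Sterility, May 2021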
NLP
Medical AI
Neural, Neural Everywhere: Controlled Generation Meets Scaffolded, Structured Dialogue
to appear, Proceedings of the 2021 Alexa Prize
We present the second iteration of Chirpy Cardinal, an open-domain dialogue agent developed for the Alexa Prize SGC4 competition. We focus on improving conversational flexibility, initiative, and coherence, introducing an array of new neural methods and making major improvements in entity linking, topical transitions, and latency. These components come together to create a refreshed Chirpy Cardinal that is able to initiate conversations filled with interesting facts, engaging topics, and heartfelt responses.
Align-Refine: Non-Autoregressive Speech Recognition via Iterative Realignment
NAACL 2021
Sequence-to-sequence Transformer speech recognition models achieve extremely high performance, but are difficult to deploy in the wild due to long decoding times. Non-autoregressive models are significantly faster, but at the cost of degraded performance. We introduce a new non-autoregressive method, Align-Refine, which significantly narrows this gap. Our method achieves SoTA-level performance at a 6x speedup over previous work.
Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT
EACL 2021
Previous work has demonstrated that Transformer encoders learn certain linguistic features, like part-of-speech and syntax. In this work, we demonstrate that this is also the case for deeper linguistic features like morphosyntactic alignment, a core attribute of all languages. We find that the predictive distributions of subjecthood classifiers reflect the morphosyntactic alignment of their training languages, demonstrating the influence of high-level grammatical features not manifested in any one input sentence.
Finding Universal Grammatical Relations in Multilingual BERT
ACL 2020
We show that Multilingual BERT is able to learn a general syntactic structure applicable to a variety of natural languages. Additionally, we find evidence that mBERT learns cross-lingual syntactic categories like “subject” and “adverb”, categories that largely agree with traditional linguistic concepts of syntax! Our results imply that simply by reading a large amount of text, mBERT is able to represent syntax, something fundamental to understanding language, in a way that seems to apply across many of the languages it comprehends.
A ML algorithm can optimize the day of trigger to improve in vitro fertilization outcomes
Fertility and Sterility
Injecting hormones to trigger ovulation is one of the most important parts of IVF, but the impact of timing decisions has not been previously studied. We applied causal inference methods to optimize the timing, achieving strong results with a simple model!
Changing stimulation protocol on repeat conventional ovarian stimulation cycles does not lead to improved laboratory outcomes
Fertility and Sterility
Physicians often change stimulation protocols after an unsuccessful cycle. We demonstrate empirically that this is not beneficial; on the contrary, we found a minor but statistically significant improvement in certain laboratory outcomes in the group where the same approach was used a second time.
Development and Validation of an Artificial Intelligence System to Optimize Clinician Review of Patient Records
JAMA Network Open, Jul 2021
PDF clinical records are inconsistent and hard to understand, so physician review is time-consuming and fraught with error. We developed a system that summarizes and presents these records in an easy-to-understand web interface. A study with first-time physician users demonstrated a 20% time speedup!
about me
I'm pursuing an MS/BS/BA in Computer Science (Artificial Intelligence) and Music Performance. Previously, I did competitive linguistics (silver medalist at the 2018 International Linguistics Olympiad), ran a national horticulture organization, and kept bees!