I’m a computational social scientist, project and community leader, and award-winning educator. I’m passionate about helping others and creating positive social impact through rigorous, reproducible research that influences policy.
My research applies computational methods to study how organizational contexts shape the impacts of structural inequalities. I analyze the ways organizational cultures, identities, and structures mediate and reinforce societal inequalities in locally embedded ways.
As someone dedicated to data science community-building and education, I am now a Postdoctoral Fellow with the Massive Data Institute (MDI) at Georgetown University. My goal at MDI is to foster research communities while applying data science methods to policy puzzles. I also co-coordinated the D-Lab’s Computational Text Analysis Working Group and the Text Across Domains (TextXD) symposium at UC Berkeley. Finally, I organized the virtual Summer Institute for Computational Social Science in the San Francisco Bay Area (BAY-SICSS) in 2020, which created partnerships between computational social scientists and Bay Area nonprofits.
Guided by the core values of reproducibility, transparency, and open access, I have developed a complex, open-source project infrastructure to solve my own domain-specific puzzles, and it is portable to other computational social science applications. As project manager for nearly 40 coders through the Undergraduate Research Apprentice and Data Science Discovery Programs, I’ve developed a multi-platform, reproducible approach to coordinating several coding teams (e.g., text analysis, web-scraping, and data management) using GitHub, Slack, BaseCamp, and Box. My team and I have created solutions to a range of pressing challenges faced by many computational social scientists, from corpus creation and distributed web-crawling to Docker and virtual machine environment management. I’ve shared these tools through GitHub and public documentation, and I encourage you to adapt them to your own use case.
For guidance on virtual machine management, web-crawling, and more, see the project documentation. For current code, see my GitHub page and that of the research team I supervise. To reproduce the results of my paper on charter school identities submitted to Sociology of Education (Sorting Schools), you can access the code. You can also read the public pre-registration with the Open Science Framework.
As someone committed to making computational research methods accessible, I’ve led data science workshops at UC Berkeley, including several demos at the D-Lab’s Computational Text Analysis Working Group and the Digital Humanities Faire and a guest lecture on web crawling for a graduate course in computational social science. Example talk titles include “Introduction to a thorough, practical CTA training” and “Web-scraping at scale: How I captured the corpus of inconsistent charter school websites”. I am also experienced in guiding and mentoring undergraduates in computational projects.
As an experienced instructor in sociology, I have three goals for student learning: (1) critical awareness of socially constructed privileges, assumptions, and norms; (2) a sense of connectedness to and compassion for “others”; and (3) accountability for one’s contributions both to the learning environment and to the social world. To meet these goals, I use the active learning practices of small group discussion, a rigorous presentation structure, and incremental writing assignments. For my teaching in sociology at UC Berkeley, I received the Certificate of Teaching and Learning in Higher Education in 2015 and the Outstanding Graduate Student Instructor Award in 2016. As further professional development, I also received a Waldorf Teaching Certificate from the Bay Area Center for Waldorf Teacher Training in 2017.
Since Spring 2018, I have coordinated the Computational Text Analysis Working Group (CTAWG): I’ve arranged speakers, led meetings, presented computational text analysis (CTA) tools and resources, contributed to CTA curriculum for the Data-Intensive Social Sciences Laboratory (D-Lab), developed workflows for collaborative coding, and implemented a collaborative project analyzing the United Nations General Debates Corpus. And I’ve organized and led special events, including two series of Sociology Job Market Practice Talks, the “Making Text Research-Ready” symposium in Spring 2018, and the TextXD (“Text Across Domains”) symposium in 2018 and 2019.
I played several key roles at TextXD 2018. In addition to serving on the core organizing committee and speaking at the event, I worked with the D-Lab to implement cloud infrastructure for the event’s collaborative coding sessions (“hackathons”). And I led a hands-on tutorial on word embeddings during the pre-symposium CTA bootcamp, using a corpus of ancient Akkadian texts also featured in the next day’s keynote. Finally, I served as a hackathon data leader, curating my charter schools data and creating a workflow for exploring word embeddings built on it. See the code for my TextXD word embeddings workshop and collaborative session.