Max W. Shen

Hi, I am a Ph.D. Candidate at MIT. My research uses applied machine learning and statistical methods for scientific discovery and applications.

Cambridge, MA, USA

As a computationalist at heart, I believe in interdisciplinarity, getting one's feet wet, and that better solutions to real-world problems arise by marrying each problem's unique structure with thoughtful modeling and inference design.
I will soon complete my Ph.D. and am actively looking for opportunities. I have published co-first-author papers in Nature and Cell, and gave an invited talk at NeurIPS 2020 in a workshop on machine learning in biology.
During my Ph.D., I have worked with David R. Liu and Aviv Regev at the Broad Institute since 2018. Before that, from 2016 to 2018, I did my Ph.D. work with David Gifford in the MIT Computer Science & Artificial Intelligence Laboratory while collaborating closely with Richard I. Sherwood. I began my Ph.D. at MIT in Computational & Systems Biology in 2015 after graduating summa cum laude from U.C. San Diego with a B.S. in Computer Science with a specialization in bioinformatics.

Publications and work

Reconstruction of genes, evolutionary trajectories, and fitness from short sequencing reads of laboratory evolution experiments using machine learning
Max W Shen, Kevin T. Zhao, David R. Liu
Invited talk at NeurIPS 2020, Learning Meaningful Representations of Life Workshop
Under review as a journal publication.

Determinants of Base Editing Outcomes from Target Library Analysis and Machine Learning
Max W Shen*, Mandana Arbab*, Beverly Mok, Christopher Wilson, Żaneta Matuszek, Christopher A. Cassa, David R. Liu
Cell, 2020. My artwork was featured on the cover of the July 23, 2020 issue!
[Co-first author reordering approved by all co-first authors]
[Code] [Interactive web app]

Predictable and precise template-free CRISPR editing of pathogenic variants
Max W Shen*, Mandana Arbab*, Jonathan Y Hsu, Daniel Worstell, Sannie J Culbertson, Olga Krabbe, Christopher A Cassa, David R Liu, David K Gifford, Richard I Sherwood
Nature, 2018
[Code] [Interactive web app] [Press feature by Dash plotly for data visualization]

Development of C•G-to-G•C transversion base editors from CRISPRi screens, target-library analysis and machine learning
Luke W Koblan*, Max W Shen*, Mandana Arbab, Jeffrey A Hussmann, Andrew V Anzalone, Jordan L Doman, Gregory A Newby, Dian Yang, Beverly Mok, Joseph M Replogle, Albert Xu, Tyler A Sisley, Jonathan S Weissman, Britt Adamson, David R Liu
Nature Biotechnology, 2021. In press.

Detection of gene cis-regulatory element perturbations in single-cell transcriptomes
Grace H.T. Yeo, Oscar Juez, Qing Chen, Budhaditya Banerjee, Lendy Chu, Max W Shen, May Sabry, Ive Logister, Richard I Sherwood, David K Gifford
PLOS Computational Biology, 2021.

PrimeDesign software for rapid and simplified design of prime editing guide RNAs
Jonathan Y Hsu, Julian Grünewald, Regan Szalay, Justine Shih, Andrew V Anzalone, Kin Chung Lam, Max W Shen, Karl Petri, David R Liu, Keith Joung, Luca Pinello
Nature Communications, 2021.

Machine learning based CRISPR gRNA design for therapeutic exon skipping
Wilson Louie, Max W Shen, Zakir Tahiry, Sophia Zhang, Daniel Worstell, Christopher A Cassa, Richard I Sherwood
PLOS Computational Biology, 2021.

Comprehensive Mapping of Key Regulatory Networks that Drive Oncogene Expression
Lin Lin, Benjamin Holmes, Max W. Shen, Darnell Kammeron, Niels Geijsen, David K Gifford, Richard I Sherwood
Cell Reports, 2020.

Continuous evolution of SpCas9 variants compatible with non-G PAMs
Shannon M Miller*, Tina Wang*, Peyton B Randolph, Mandana Arbab, Max W Shen, Tony P Huang, Żaneta Matuszek, Gregory A Newby, Holly A Rees, David R Liu
Nature Biotechnology, 2020

Assembly of long error-prone reads using de Bruijn graphs
Yu Lin*, Jeffrey Yuan*, Mikhail Kolmogorov, Max W Shen, Mark Chaisson, Pavel A Pevzner
Proceedings of the National Academy of Sciences, 2016

plasmidSPAdes: assembling plasmids from whole genome sequencing data
Dmitry Antipov, Nolan Hartwick, Max W Shen, Mikhail Raiko, Alla Lapidus, Pavel A. Pevzner
Bioinformatics, 2016

MEG source imaging method using fast L1 minimum-norm and its applications to signals with brain noise and human resting-state source amplitude images
Ming-Xiong Huang, Charles W Huang, Ashley Robb, AnneMarie Angeles, Sharon L Nichols, Dewleen G Baker, Tao Song, Deborah L Harrington, Rebecca J Theilmann, Ramesh Srinivasan, David Heister, Mithun Diwakar, Jose M Canive, J Christopher Edgar, Yu-Han Chen, Zhengwei Ji, Max W Shen, Fady El-Gabalawy, Michael Levy, Robert McLay, Jennifer Webb-Murphy, Thomas T Liu, Angela Drake, Roland R Lee
Neuroimage, 2014


Causal Inference & Deep Learning
MIT Independent Activities Period, Jan. 2018
Max W. Shen, Fredrik Johansson
○ Prepared and co-taught a short graduate-level class with 4 sessions and 6 total hours. Typical attendance: 20 students.

Applied Probabilistic Programming & Bayesian Machine Learning
MIT Independent Activities Period, Jan. 2017
Max W. Shen, Alvin Shi, Carles Boix
○ Prepared and co-taught a short upper-division class with 6 sessions and 9 total hours. First class attendance: 100 students; typical attendance: 25 students.


Applied Machine Learning: My work customizes powerful, modern deep models to leverage the unique structure within each real-world problem. I have designed deep conditional autoregressive models to model base editing outcomes, and jointly trained multitask sister deep networks to accurately learn a particularly noisy subset of CRISPR editing activity. I have taught classes to MIT undergraduate and graduate students on Bayesian modeling, deep learning, and causal inference. I am proficient in PyTorch, and enjoy keeping up with modern toolkits: see my low-level integration between Pyro and GPyTorch to learn 10 latent Gaussian processes on time series data in a larger probabilistic model with stochastic variational inference.

Software Engineering: See my GitHub. I completed software engineering internships at Qualcomm Korea (2013) and Illumina (2014). Python is my language of choice, though I have previously worked in C++. Summa cum laude B.S. in Computer Science with a specialization in bioinformatics (2011-2015).

Data Visualization: My interactive web apps have received press attention from Dash plotly. I am proficient in Adobe Illustrator, Photoshop, Premiere Pro, and After Effects, and use matplotlib, pandas, and seaborn every day. I am also proficient in HTML, CSS, and Dash plotly.

Communication and Collaboration: My Ph.D. has featured extensive collaboration with wet lab experimentalists, including Richard I. Sherwood and Mandana Arbab. I have completed a 40-hour course on conflict management and mediation that has substantially impacted my life.

Management: For two years in undergrad, I managed 9 teams totaling ~100 students to host a regional urban-style (hip hop) dance competition with ~2,000 audience members and revenue and expenses of $35,000/year. Each team had ~10 students and 2 team leaders, and I oversaw all team leaders together with one co-leader. This provided substantial public speaking experience, as I led dozens of meetings speaking to and motivating our team of 100 students.

