Bakar Computational Health Sciences Institute,
University of California,
San Francisco
Email: Madhumita.Sushil@ucsf.edu
I am an incoming Assistant Professor in the Division of Clinical Informatics and Digital Transformation (DoC-IT), Department of Neurosurgery, and the Bakar Computational Health Sciences Institute (BCHSI) at the University of California, San Francisco. I previously worked as a postdoctoral researcher with Prof. Atul Butte at BCHSI. I obtained a PhD in clinical Natural Language Processing at the Computational Linguistics and Psycholinguistics (CLiPS) Research Center, University of Antwerp, Belgium, under Prof. Dr. Walter Daelemans and Dr. Simon Šuster. My research aims to develop foundation models that capture the nuances of electronic health record data, and to use these insights to answer clinical research questions. I am particularly interested in developing strategies for inferring causal patterns from observational textual data, improving generalization for low-frequency data samples, and developing multi-modal and interpretable foundation models to improve the understanding of patients’ disease and treatment trajectories. I have extensive experience in developing methodology for deep learning model interpretability and retrieval-augmented classification, and in creating benchmarking datasets for advanced oncology-specific information extraction from clinical notes.
During my PhD, I worked as a research intern at the Google Brain Applied team in Zurich, where I investigated the linguistic reasoning skills of BERT representations. I have additionally held several academic service positions. I hold a Master of Science in Language Science and Technology (spec. Language Technology) from Saarland University, Germany, and a Bachelor of Technology in Computer Science and Engineering from VIT University, Vellore, India. I have also worked on clinical text understanding as a Junior Research Developer at the Antwerp University Hospital, Belgium, and contributed to recognizing textual entailment for the EU-funded Excitement project as a Research Assistant at the German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany.
CORAL: Expert-Curated Oncology Reports to Advance Language Model Inference
Madhumita Sushil, Vanessa E. Kennedy#, Divneet Mandair#, Brenda Miao, Travis Zack*, Atul J. Butte*
New England Journal of Medicine (NEJM)-AI, 2024
bibtex | preprint | Dataset
A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports
Madhumita Sushil*, Travis Zack*, Divneet Mandair*, Zhiwei Zheng#, Ahmed Wali#, Yan-Ning Yu#, Yuwei Quan#, Dmytro Lituiev, Atul J. Butte
Journal of the American Medical Informatics Association (JAMIA), 2024
bibtex | Dataset
Cross-institution natural language processing for reliable clinical association studies: a methodological exploration
Madhumita Sushil, Atul J. Butte, Ewoud Schuit, Maarten van Smeden, Artuur M. Leeuwenberg
Journal of Clinical Epidemiology, 2024
bibtex
Algorithmic identification of treatment-emergent adverse events from clinical notes using large language models: a pilot study in inflammatory bowel disease
Anna L Silverman*, Madhumita Sushil*, Balu Bhasuran*, Dana Ludwig, James Buchanan, Rebecca Racz, Mahalakshmi Parakala, Samer El-Kamary, Ohenewaa Ahima, Artur Belov, Lauren Choi, Monisha Billings, Yan Li, Nadia Habal, Qi Liu, Jawahar Tiwari, Atul J. Butte, Vivek A. Rudrapatna
Journal of Clinical Pharmacology and Therapeutics, 2024
bibtex | preprint
Topic modeling on clinical social work notes for exploring social determinants of health factors
Shenghuan Sun, Travis Zack, Christopher Y. K. Williams, Madhumita Sushil*, Atul J. Butte*
JAMIA Open, 2024
bibtex
Are we there yet? Exploring clinical domain knowledge of BERT models
Madhumita Sushil, Simon Šuster, Walter Daelemans
BioNLP Workshop, NAACL 2021
bibtex | slides
Contextual explanation rules for neural clinical classifiers
Madhumita Sushil, Simon Šuster, Walter Daelemans
BioNLP Workshop, NAACL 2021
bibtex | code | poster
Rule induction for global explanation of trained models
Madhumita Sushil, Simon Šuster, Walter Daelemans
Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), EMNLP 2018
bibtex | code | poster
Revisiting neural relation classification in clinical notes with external information
Simon Šuster, Madhumita Sushil, Walter Daelemans
Workshop on Health Text Mining and Information Analysis (LOUHI), EMNLP 2018
bibtex | code | poster
Patient representation learning and interpretable evaluation using clinical notes
Madhumita Sushil, Simon Šuster, Kim Luyckx, Walter Daelemans
Journal of Biomedical Informatics, 2018
bibtex | code | arXiv preprint
Natural language processing for inferences from electronic health record notes
Clinical Informatics Data Science Pathway lecture series, UCSF, 2023.
UCSF-Stanford Center of Excellence in Regulatory Science and Innovation (CERSI) EHR training series to the FDA, 2023.
Lessons learned from clinical language processing
Seminar at Butte lab, University of California-San Francisco, July 2020
Synthetic dataset for explaining and evaluating rules learned by RNNs
Blackbox@NL Workshop, May 2019
Understanding Machine Learning models for healthcare: Why, and how?
Project Accumulate Industry Meeting, March 2019
Identifying Patients with Major Diabetes-related Complications
CLiPS Lab Meeting, May 2018
Model Agnostic Interpretability Techniques
CLiPS Lab Meeting, March 2018
Unsupervised patient representations with interpretable classification decisions
Computational Linguistics in the Netherlands 28 (CLIN28), January 2018
Clinical Data Characteristics and Processing Challenges
Project Accumulate Technical Meeting, December 2016
Psychiatric symptom severity identification, and experiences with cTAKES
Project Accumulate Technical Meeting, August 2016
Large Language Models are Zero-shot Oncology Information Extractors
AMIA Annual Symposium, 2023
Training a transferrable clinical language model from 75 million notes
AMIA Annual Symposium, 2022
Rule induction for global explanation of recurrent neural classifiers
3rd Google NLP Summit, Zurich, June 2019
Symptom Severity Identification from Psychiatric Evaluation Notes
ATILA Workshop, Nijmegen, October 2016
Evolution of Language from an Information Theoretic Point of View
Saarland University, May 2015
Exploring and Understanding Neural Models for Clinical Tasks
Ph.D. Thesis, March 2021
slides
Recognizing Textual Entailment
M.Sc. Thesis, February 2016
slides