We present OncoCITE, a multi-agent AI system for automated genomic evidence extraction from scientific literature. Through systematic exploratory data analysis of the CIViC database (11,312 evidence items, 3,083 publications), we quantified 12 structural bottlenecks, including curation latency (median 31 days, P90 >21 months) and gaps in coverage of emerging targets. Our six-agent architecture, built on the Claude Agent SDK with 22 MCP tools, achieves 84% ground truth recovery, 97.8% novel discovery precision, and 0% critical errors on a 15-paper validation corpus. A novel three-way validation framework identified 24.2% curation errors in expert-curated databases.
@article{quidwai2025oncocite,
  title={OncoCITE: AI-Driven Genomic Evidence Curation for Hematologic Malignancies},
  author={Quidwai, Mujahid Ali and Thibaud, Santiago and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Nature Cancer},
  year={2025},
  note={Submitted to Nature Cancer}
}
ASH
Oncodif: An Auditable AI Framework for Automated Genomic Curation and Natural-Language Clinical Querying in Hematologic Malignancies
Mujahid Ali Quidwai, Santiago Thibaud, Sundar Jagannath, Samir Parekh, and Alessandro Lagana
We present OncoDIF, an auditable AI framework for automated genomic curation achieving 97.8% precision in novel discovery with 0% critical errors. The system enables natural-language clinical querying for hematologic malignancies, providing clinicians with evidence-based genomic insights.
@article{quidwai2025oncodif,
  title={Oncodif: An Auditable AI Framework for Automated Genomic Curation and Natural-Language Clinical Querying in Hematologic Malignancies},
  author={Quidwai, Mujahid Ali and Thibaud, Santiago and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Blood},
  volume={146},
  pages={2646},
  year={2025},
  note={ASH 2025 Conference Abstract}
}
ASH
Development of Predictive Relapse Indicators for Myeloma T Cell Engagers (PRIME) Model for Myeloma Patients Receiving BCMA- and GPRC5D-Targeting T Cell Engagers
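Tarek H. Mouhieddine, Tony Sheng, Junia Vieira Dos Santos, Adriana Aleman, Zachary Avigan, Ariel Siegel, Mujahid Ali Quidwai, Ajai Chari, Sundar Jagannath, Samir Parekh, and Alessandro Lagana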
We developed the PRIME (Predictive Relapse Indicators for Myeloma T-Cell Engagers) model for patients receiving BCMA- and GPRC5D-targeting T-cell engager therapies including teclistamab, elranatamab, and talquetamab. By integrating clinical, genomic, and treatment response data, we identified early relapse indicators enabling risk stratification at treatment initiation and monitoring strategy optimization. The model supports clinical decision-making for the growing landscape of bispecific antibody therapies in multiple myeloma.
@article{mouhieddine2025prime,
  title={Development of Predictive Relapse Indicators for Myeloma T Cell Engagers (PRIME) Model for Myeloma Patients Receiving BCMA- and GPRC5D-Targeting T Cell Engagers},
  author={Mouhieddine, Tarek H. and Sheng, Tony and Vieira Dos Santos, Junia and Aleman, Adriana and Avigan, Zachary and Siegel, Ariel and Quidwai, Mujahid Ali and Chari, Ajai and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Blood},
  volume={146},
  number={Supplement 1},
  pages={3996},
  year={2025},
  doi={10.1182/blood-2025-3996},
  note={ASH 2025 Conference Abstract, Contributing Author}
}
IMS
Integrating Microenvironment with Tumor Multi-Omic Using Unsupervised Machine Learning to Model Heterogeneity Refines Multiple Myeloma Subtypes and Reveals Immune-Based Therapeutic Vulnerabilities
We applied a modified IntegrAO graph neural network methodology to the MMRF COMPASS cohort (N=655 samples) for multi-omics patient stratification in multiple myeloma. Integrating SNV, CNV, tumor microenvironment (TME), and whole exome sequencing data, we achieved a 50% improvement in classification granularity (18 vs. 12 subgroups) and a 258% increase in high-risk detection (43 vs. 12 patients). We identified 18 distinct vulnerability profiles, with 94% of patients having actionable therapeutic targets, enabling personalized treatment recommendations.
@article{hamidi2025multiomics,
  title={Integrating Microenvironment with Tumor Multi-Omic Using Unsupervised Machine Learning to Model Heterogeneity Refines Multiple Myeloma Subtypes and Reveals Immune-Based Therapeutic Vulnerabilities},
  author={Hamidi, Habib and Park, Alison S. and Vieira Dos Santos, Junia and Diaz, Gabriel and Okholm, Trine L. and Quidwai, Mujahid Ali and Chari, Ajai and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Clinical Lymphoma Myeloma and Leukemia},
  volume={25},
  pages={S181-S182},
  year={2025},
  note={IMS 2025 Conference Abstract, Contributing Author}
}
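The two percentages quoted in the abstract above follow from the counts given in parentheses. A minimal sketch of that arithmetic is shown here; the helper name `relative_increase` is purely illustrative and not taken from the paper.

```python
def relative_increase(new: float, baseline: float) -> float:
    """Percentage increase of `new` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Counts as reported in the abstract: 18 vs. 12 subgroups, 43 vs. 12 patients.
subgroup_gain = relative_increase(18, 12)
high_risk_gain = relative_increase(43, 12)

print(f"{subgroup_gain:.0f}% more subgroups, {high_risk_gain:.0f}% more high-risk patients")
```

Both figures are relative increases over the 12-item baseline, so 43 patients corresponds to roughly 3.6 times the baseline count rather than 2.58 times.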
2024
medRxiv
A RAG Chatbot for Precision Medicine of Multiple Myeloma
We developed a production Retrieval-Augmented Generation (RAG) system for clinical decision support in multiple myeloma. The system uses LangGraph orchestration, BAAI/bge-large-en-v1.5 embeddings for semantic retrieval, and Mistral-7B for generation. Processing 5,000+ clinical documents, we achieved 88% clinical effectiveness. The first prototype was deployed and actively used by 2-3 clinicians at Mount Sinai for multiple myeloma research and treatment planning, demonstrating the feasibility of production RAG systems in healthcare settings.
@article{quidwai2024rag,
  title={A RAG Chatbot for Precision Medicine of Multiple Myeloma},
  author={Quidwai, Mujahid Ali and Lagana, Alessandro},
  journal={medRxiv},
  pages={2024.03.14.24304293},
  year={2024},
  doi={10.1101/2024.03.14.24304293},
  note={36 citations}
}
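The retrieve-then-generate flow described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the three-dimensional vectors stand in for real BAAI/bge-large-en-v1.5 embeddings, the passages are placeholders, the function names (`cosine`, `retrieve`, `build_prompt`) are hypothetical, and neither LangGraph nor Mistral-7B is actually called.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble the text that would be handed to the generator model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


# Toy index: hand-written vectors and placeholder text (assumptions, not real data).
TOY_INDEX = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}
TOY_TEXT = {
    "doc_a": "placeholder passage A",
    "doc_b": "placeholder passage B",
    "doc_c": "placeholder passage C",
}

query_vec = [1.0, 0.0, 0.0]  # would come from embedding the user's question
top_ids = retrieve(query_vec, TOY_INDEX, k=2)
prompt = build_prompt("placeholder question", [TOY_TEXT[i] for i in top_ids])
print(top_ids)
print(prompt)
```

In a real deployment the index lookup would typically be delegated to a vector store, and the assembled prompt would be passed to the generation model rather than printed.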
MDPI
Early Detection of Cytokine Release Syndrome Using Wearable Devices and Cytokine Profiling Following CAR-T Therapy for Myeloma
We developed a machine learning system for early prediction of Cytokine Release Syndrome (CRS) in CAR-T therapy patients from an investigator-initiated trial (N=25 patients). We engineered time-lagged features from continuous wearable data (skin temperature, SpO2, heart rate) and a 92-biomarker Olink panel. Evaluating five classifiers with stratified k-fold cross-validation (StratifiedKFold), we achieved 84.62% accuracy for ide-cel and 80.62% for cilta-cel within a 6-hour prediction window. SHAP analysis identified IFN-γ as a cross-product predictor, and a fold-change classifier achieved 90% precision with a mean lead time of 40 hours before CRS onset.
@article{rajeeve2024crs,
  title={Early Detection of Cytokine Release Syndrome Using Wearable Devices and Cytokine Profiling Following CAR-T Therapy for Myeloma},
  author={Rajeeve, Sridevi and Wilkes, Matt and Zahradka, Nicole and Tomalin, Lewis E. and Quidwai, Mujahid Ali and Pan, David and Chari, Ajai and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Information (MDPI)},
  year={2024},
  note={Under Review, Contributing Author, 2 citations}
}
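Two ideas from the abstract above, time-lagged features and a fold-change rule, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the lag lengths, the 2.0 threshold, the function names, and the made-up readings. The abstract does not state which lags or thresholds the study used.

```python
def lagged_features(series, lags):
    """For each step t >= max(lags), pair the current value with its lagged values."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"t": t, "value": series[t]}
        for lag in lags:
            row[f"lag_{lag}"] = series[t - lag]
        rows.append(row)
    return rows


def fold_change_flags(values, baseline, threshold):
    """Flag each measurement whose ratio to the baseline meets the threshold."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return [v / baseline >= threshold for v in values]


# Made-up hourly skin-temperature readings (not study data).
hourly_temp = [36.5, 36.6, 36.6, 36.9, 37.4, 38.1]
rows = lagged_features(hourly_temp, lags=[1, 2])

# Made-up biomarker levels in arbitrary units; the first value is the baseline.
marker = [10.0, 12.0, 25.0, 60.0]
flags = fold_change_flags(marker, baseline=marker[0], threshold=2.0)

print(rows[0])
print(flags)
```

Each row produced by `lagged_features` could serve as one input example for a classifier, while the fold-change rule needs no training at all, which is one reason such rules are attractive as simple early-warning baselines.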
IMS
2P-145: Innovative AI-Driven Decision Support Tool for Multiple Myeloma Using Retrieval Augmented Generation
We present an AI-driven clinical decision support system for multiple myeloma leveraging Retrieval Augmented Generation (RAG) architecture. The system integrates clinical guidelines, research literature, and trial data to provide evidence-based recommendations for treatment planning. Presented at the International Myeloma Society Annual Meeting 2024.
@article{quidwai2024decision,
  title={2P-145: Innovative AI-Driven Decision Support Tool for Multiple Myeloma Using Retrieval Augmented Generation},
  author={Quidwai, Mujahid Ali and Thibaud, Santiago and Richter, Joshua and Jagannath, Sundar and Parekh, Samir and Lagana, Alessandro},
  journal={Clinical Lymphoma Myeloma and Leukemia},
  volume={24},
  pages={S123-S124},
  year={2024},
  note={IMS 2024 Conference Abstract, 1 citation}
}
2023
ACL
Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level
We present a novel multi-faceted NLP approach for detecting AI-generated text in educational settings. Our method employs contrastive loss training and LLM-generated paraphrases to achieve 94% accuracy in human-AI text classification. Unlike black-box approaches, our system provides interpretable signals at both sentence and document levels, enabling educators to understand the nature and extent of AI involvement in student submissions. This work was conducted in collaboration with IBM Research.
@inproceedings{quidwai2023plagiarism,
  title={Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level},
  author={Quidwai, Mujahid Ali and Li, Chunhui and Dube, Parijat},
  booktitle={Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)},
  year={2023},
  organization={Association for Computational Linguistics},
  doi={10.18653/v1/2023.bea-1.58},
  note={31 citations, IBM Research}
}
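The abstract above mentions contrastive loss training without giving the exact formulation. As a hedged illustration, the sketch below uses the classic pairwise margin-based contrastive loss, which is an assumption and may differ from the loss used in the paper. The two-dimensional embeddings and the function names are made up for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive loss: pull same-class pairs together,
    push different-class pairs apart until they are at least `margin` away."""
    d = euclidean(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy sentence embeddings (assumptions, not model outputs).
human_1 = [0.0, 0.0]
human_2 = [0.3, 0.4]   # distance 0.5 from human_1
ai_near = [0.0, 0.6]   # distance 0.6 from human_1, inside the margin
ai_far = [3.0, 4.0]    # distance 5.0 from human_1, outside the margin

loss_same = contrastive_loss(human_1, human_2, same_class=True)
loss_near = contrastive_loss(human_1, ai_near, same_class=False)
loss_far = contrastive_loss(human_1, ai_far, same_class=False)
print(loss_same, loss_near, loss_far)
```

Under this formulation a human/AI pair that is already farther apart than the margin contributes no loss, so training effort concentrates on pairs the encoder still confuses.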