cv
Computational Scientist specializing in AI/ML for precision oncology, multi-agent systems, and clinical decision support.
Basics
| Name | Mujahid Ali Quidwai |
| Label | AI Systems Engineer |
| Email | quidwaiali@gmail.com |
| Phone | (224) 322-6446 |
| Url | https://ali-maq.github.io/ |
| Summary | AI Systems Engineer with production deployments in clinical oncology: systems actively used by Mount Sinai oncologists for treatment decisions. Architect of multi-agent genomic curation (OncoCITE: 97.8% precision; submitted to Nature Cancer) and GPU-accelerated pipelines (95.8% time reduction enabling same-day tumor board). Full-stack AI expertise from fine-tuning 7B-235B models to inference optimization on H100/A100 infrastructure. Research includes a widely cited clinical RAG framework (35+ citations) and first-author work at ACL (BEA Workshop). |
Work
-
2024.12 - Present New York, NY
Computational Scientist
Icahn School of Medicine at Mount Sinai
Leading AI/ML initiatives for precision oncology including multi-agent genomic systems, GPU-accelerated pipelines, and predictive modeling.
- OncoCITE (Submitted to Nature Cancer): Performed systematic EDA on CIViC database (11,312 evidence items, 3,083 publications) quantifying 12 structural bottlenecks. Architected 6-agent solution (Claude Agent SDK, 22 MCP tools) with state serialization and vision-based PDF extraction. Validated on 15-paper corpus: 84% ground truth recovery, 97.8% novel discovery precision, 0% critical errors (n=108). Presented at ASH 2025.
- PRIME Model (Blood 2025): Co-developed predictive model for patients receiving BCMA- and GPRC5D-targeting T-cell engagers. Integrated clinical, genomic, and treatment response data. Presented at ASH 2025.
- MMAP Pipeline (56x Speedup): Engineered production pipeline on Minerva HPC using NVIDIA H100 GPUs with Parabricks and RAPIDS. Orchestrated 57 computational processes across 3 integrated workflows. Achieved 95.8% processing time reduction (7 days → 3 hours), enabling 8 patients/day throughput. Enabled same-day molecular tumor board readiness.
- Multi-Omics Integration (Clin Lymph Myel Leuk 2025): Applied modified IntegrAO GNN to MMRF COMPASS cohort (N=655). Achieved 50% classification granularity improvement and 258% high-risk detection enhancement. Identified 18 distinct vulnerability profiles with 94% actionable targets.
-
2023.10 - 2024.12 New York, NY
Associate Computational Scientist
Icahn School of Medicine at Mount Sinai
Developed production AI systems for clinical decision support, inference optimization, and CAR-T therapy monitoring.
- LLM Inference Optimization: Designed model-routing gateway for cost-performance optimization. Implemented KV cache optimization and prompt compression. Achieved 60-80% inference cost reduction using vLLM batch inference on HPC. Multi-cloud deployment across AWS and Azure OpenAI.
- RAG System (medRxiv, 36 citations): Led development using LangGraph orchestration, BAAI/bge-large-en-v1.5 embeddings, and Mistral-7B. Processed 5,000+ documents achieving 88% clinical effectiveness. First prototype deployed and used by 2-3 clinicians for multiple myeloma research. Presented at IMS 2024.
- CAR-T Adverse Event Prediction (Information MDPI, under review): Led ML development for early CRS detection (N=25 patients). Engineered time-lagged features from wearables and 92-biomarker Olink panel. Achieved 84.62% accuracy (ide-cel), 80.62% (cilta-cel) within 6-hour prediction window. SHAP analysis identified IFN-gamma as cross-product predictor.
- Voice ASR Prototype for Clinical Terminology: Fine-tuned OpenAI Whisper Small for on-device deployment. Trained on recorded patient voice dataset to reduce word error rate for multiple myeloma terminology. Prototype demonstrated improved recognition accuracy for domain-specific medical terms.
- Graduate Student Mentorship: Mentored 6 Carnegie Mellon University graduate students on CAR-T therapy monitoring capstone project. Guided experiment design, baselining, and reporting.
-
2023.07 - 2023.10 New York, NY
Entrepreneurial Fellow
AYA
Conceptualized and built AYA, which uses an RL-optimized NLP model to extract and formulate questions from academic video content.
- Selected for Summer Sprint at NYU's Entrepreneurial Institute (10 selected from 150 startups)
- Developed end-to-end product combining speech processing, natural language generation, and educational content analysis
-
2021.09 - 2023.05 New York, NY
Machine Learning Researcher & Teaching Assistant
New York University
Research on AI-generated text detection with IBM Research; taught bioinformatics algorithms.
- AI-Generated Plagiarism Detection (ACL 2023, BEA Workshop): Developed novel multi-faceted NLP approach with Prof. Parijat Dube (IBM Research) achieving 94% accuracy in human-AI text classification. Method employs contrastive loss and LLM-generated paraphrases. [31 citations]
- Teaching Assistant - Data Structures & Algorithms for Bioinformatics: Supported graduate course under Prof. Manpreet S. Katari for 3 semesters, teaching 130+ master's students. Created assignments bridging CS fundamentals with bioinformatics applications.
-
2016.08 - 2017.09 New Delhi, India
Technical Lead
Carcrew
Founding technical team member for automotive marketplace startup.
- Built MVP using Django, Flask, PostgreSQL
- Contributed to technical due diligence that secured $2M Series A from TVS Group
- Scaled to 10,000+ DAU
Education
-
2021.09 - 2023.05 New York, NY
MS (Honors)
New York University, Tandon School of Engineering
Computer Engineering
- Deep Learning
- Machine Learning
- High Performance ML
- Natural Language Processing
- ML for Cyber-Security
- Algorithms & Data Structures for Bioinformatics
-
2012.08 - 2016.05 Ghaziabad, India
Awards
- 2025.01.01
ASH 2025 Presentations
American Society of Hematology
Two presentations: OncoCITE multi-agent system and PRIME predictive model
- 2024.03.01
IMS 2024 Presentation
International Myeloma Society
RAG for Multiple Myeloma clinical decision support
- 2024.01.01
- 2023.07.01
NYU Entrepreneurial Institute Summer Sprint
NYU Entrepreneurial Institute
Selected in the top 7% (10 of 150 startups) for the AYA EdTech startup
- 2021.09.01
NYU Tandon Graduate Scholarship
New York University
Merit-based scholarship award ($8,000/year) for academic excellence
Publications
-
2025.01.01 Multi-Omics Integration for Multiple Myeloma Subtypes
Clinical Lymphoma Myeloma Leukemia
IntegrAO GNN on MMRF COMPASS cohort (N=655). 50% classification granularity improvement, 258% high-risk detection enhancement.
-
2025.01.01 PRIME: Predictive Relapse Indicators for Myeloma T-Cell Engagers
Blood
Predictive model for BCMA- and GPRC5D-targeting T-cell engagers. ASH 2025.
-
2025.01.01 OncoCITE: AI-Driven Genomic Evidence Curation
Nature Cancer
Multi-agent system for automated genomic evidence extraction. 6-agent solution with 22 MCP tools, 84% ground truth recovery, 97.8% novel discovery precision. Submitted to Nature Cancer.
-
2024.03.01 A RAG Chatbot for Precision Medicine of Multiple Myeloma
medRxiv
Production RAG system using LangGraph, BAAI/bge-large-en-v1.5 embeddings, and Mistral-7B achieving 88% effectiveness. 36 citations.
-
2024.01.01 Early Detection of CRS Using Wearable Devices Following CAR-T Therapy
Information (MDPI)
ML for early CRS detection (N=25). 84.62% accuracy within 6-hour window. SHAP analysis identified IFN-gamma as predictor. Under review.
-
2023.07.01 Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level
ACL BEA 2023
Novel multi-faceted NLP approach achieving 94% accuracy in human-AI text classification. IBM Research collaboration. 31 citations.
Skills
| LLM Fine-tuning & Training | Fine-tuned 7B-235B parameter models (Qwen, Qwen VL, Mistral, MedGemma, Whisper Large); LoRA/QLoRA adapters; Full fine-tuning; DPO (Direct Preference Optimization); Hugging Face Transformers; Multi-GPU training (H100 NVL, A100); LSF/SLURM scheduling; Singularity containers |
| Multi-Agent Systems & Orchestration | Claude Agent SDK; OpenAI Agent SDK; LangChain; LangGraph; DSPy; 6-agent collaborative architectures; State Serialization; Pause-Resume Workflows; Deterministic Replay; MCP (22 tools); Vector databases (OpenSearch, FAISS); Hallucination prevention |
| Model Serving & Deployment | vLLM; SGLang; TGI; TensorRT-LLM; Ollama; PagedAttention; KV Cache Optimization; Context Minimization; <100ms TTFT; FP8/INT4 quantization; Model routing for cost optimization |
| Multimodal AI | Vision (Claude 3.5, Qwen-VL, DeepSeek-OCR); PDF-to-Image Processing (300 DPI); Whisper Fine-tuning; Voice Activity Detection (Silero VAD); Piper TTS; Domain-specific ASR |
| Distributed Computing | NVIDIA H100 NVL/A100/A10; 196-GPU cluster (Minerva); Multi-GPU (144 GPU-hours); CUDA; HPC (LSF/SLURM); NVLink; Parabricks; cuDF/cuML; Spark |
| Genomics & Bioinformatics | Nextflow; NGS Processing (STAR, BWA-MEM, fastp, GATK); Variant Calling (Mutect2, Lancet, HaplotypeCaller); CNV Analysis (FACETS, BEDTools); Gene Fusion (Arriba); Annotation (VEP, SnpEff, Funcotator); Scanpy; Geneformer; IntegrAO; RNA-seq; WES |
| Clinical AI & Healthcare | Precision Medicine; Clinical NLP; Variant Annotation; EHR Integration (EPIC); REDCap; HIPAA Compliance; Regulatory validation (CAP/CLIA); Clinical Decision Support |
| Cloud & DevOps | AWS (Bedrock, EC2, OpenSearch, SageMaker); Azure OpenAI; Multi-cloud deployment; Docker; Singularity; Kubernetes; MLflow; CI/CD |
Languages
| English | Fluent |
| Hindi | Native speaker |
| Urdu | Native speaker |
Interests
| Clinical AI Systems |
| Precision Oncology |
| Multi-Agent AI Systems |
| Real-Time Clinical Decision Support |
| Genomics |
| Drug Discovery |
Projects
- 2024.12 - Present
OncoCITE - Multi-Agent Genomic Evidence Extraction
6-agent system using Claude Agent SDK with 22 MCP tools for automated genomic curation from scientific literature. Submitted to Nature Cancer.
- Systematic EDA on CIViC database (11,312 evidence items, 3,083 publications)
- 84% ground truth recovery, 97.8% novel discovery precision, 0% critical errors
- State serialization enabling pause-resume for long-running extractions
- Identified 24.2% curation errors in expert-curated databases
- 2024.12 - Present
MMAP - GPU-Accelerated Genomic Pipeline (56x Speedup)
Production pipeline on Minerva HPC using NVIDIA H100 GPUs with Parabricks, RAPIDS, and DeepVariant.
- 95.8% processing time reduction (7 days → 3 hours, 56x speedup)
- 57 computational processes across 3 integrated workflows
- 8 patients/day throughput vs. 0.14 previously
- Enabled same-day molecular tumor board readiness
- 2024.01 - Present
Voice ASR Prototype for Clinical Terminology
Fine-tuned ASR for multiple myeloma medical terminology recognition.
- Fine-tuned Whisper Small for on-device deployment
- Trained on patient voice recordings
- Reduced word error rate for myeloma terminology
- Improved recognition of domain-specific medical terms
- 2023.10 - Present
Clinical RAG System
Production RAG system for multiple myeloma clinical decision support. Deployed and used by clinicians.
- LangGraph orchestration with Mistral-7B
- 5,000+ documents processed
- 88% clinical effectiveness
- 36 citations on medRxiv
Volunteer
-
2024.09 - 2024.12 New York, NY
Graduate Research Mentor
Icahn School of Medicine at Mount Sinai
Mentored 6 Carnegie Mellon University graduate students on CAR-T therapy monitoring capstone project.
- Guided experiment design, baselining, and reporting for CRS prediction using wearables and cytokines
- Coordinated with clinical team on project scope and deliverables
- Students developed ML models for early adverse event detection
-
2023.10 - Present New York, NY
AI/ML Integration Mentor
Icahn School of Medicine at Mount Sinai
Mentored team members on AI/ML integration workflows and best practices.
- Trained clinical research staff on RAG system usage and interpretation
- Developed documentation and tutorials for AI tools in clinical workflows
- Guided junior team members on production ML system development
-
2022.01 - 2023.05 New York, NY
Teaching Assistant - Data Structures & Algorithms for Bioinformatics
New York University
Graduate-level course under Prof. Manpreet S. Katari for 3 semesters (Spring 2022, Fall 2022, Spring 2023).
- Taught data structures, algorithms, and genomic algorithms to 130+ master's students
- Created assignments bridging computer science fundamentals with bioinformatics applications
- Held office hours and provided one-on-one support for complex algorithmic concepts
- Developed course materials covering sequence alignment, graph algorithms, and dynamic programming for genomics