Masters Theses

Abstract

This work presents a structured benchmarking study of multimodal large language models (MLLMs) applied to electrocardiogram (ECG) interpretation tasks. We evaluate three representative architectures (MedGemma, HuatuoGPT-Vision, and LLaVA-Med) across progressive experimental stages involving text-only structured prompt normalization, text–image fusion with ECG plots, and full multimodal fusion incorporating time-series signals. A standardized five-section cardiology prompt was designed to enforce a consistent output structure and SCP-code alignment, enabling reproducible metric computation across models. Quantitative evaluation using BERTScore, token-level F1, and diagnostic accuracy demonstrates that HuatuoGPT-Vision achieves the highest semantic and diagnostic alignment, while MedGemma exhibits superior formatting stability and reproducibility. In contrast, LLaVA-Med shows limited ability to handle extended clinical prompts, yielding a high invalid-response rate. Preliminary multimodal results suggest that augmenting textual and visual prompts with ECG time-series data does not enhance diagnostic precision or semantic coherence, indicating that current models remain biased toward image-forward training practices. Overall, the findings highlight the critical role of structured reasoning and modality fusion in improving the interpretability and reliability of medical MLLMs, and provide a reproducible framework for future ECG-centric language–vision model evaluation.
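As an illustration of the metric computation the abstract describes, the sketch below implements a token-level F1 score between a model's generated report and a reference interpretation. The thesis does not specify its exact tokenization or matching scheme, so the lowercased whitespace tokenization and multiset overlap here are assumptions, following common report-generation evaluation practice.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a model response and a reference report.

    Tokens are lowercased, whitespace-split words (an assumed scheme);
    overlapping tokens are counted with multiplicity via multiset
    intersection, as in common QA/report-generation evaluations.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection: each shared token counted min(pred, ref) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Identical texts score 1.0; fully disjoint texts score 0.0.
print(token_f1("normal sinus rhythm", "normal sinus rhythm"))  # → 1.0
```

Semantic metrics such as BERTScore replace this exact-match overlap with contextual-embedding similarity, which is why the two metrics are reported together in the study.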

Advisor(s)

Yang, Huiyuan

Committee Member(s)

Maity, Suman
Yu, Xiaowei

Department(s)

Computer Science

Degree Name

M.S. in Computer Science

Publisher

Missouri University of Science and Technology

Publication Date

Fall 2025

Pagination

ix, 57 pages

Note about bibliography

Includes bibliographical references (pages 54-56)

Rights

© 2026 Prisha Anil, All Rights Reserved

Document Type

Thesis - Open Access

File Type

text

Language

English

Thesis Number

T 12555
