Can you learn to diagnose the severity of aortic stenosis (AS), a common valve disease, from ultrasound images of the heart (echocardiograms)?
The Tufts Medical Echocardiogram Dataset (TMED) is a clinically-motivated benchmark for computer vision and machine learning from limited labeled data. TMED is designed to be an authentic assessment of semi-supervised learning (SSL) methods that train classifiers from a small, hard-to-acquire labeled dataset and a much larger (but easier to acquire) unlabeled set.
Announcements
Jan 2023: Clinical journal paper accepted at J. Amer. Soc. of Echocardiography
Our JASE ’23 paper, written for a clinical audience, shows that automatic AS diagnosis is possible, with external validation. Links: Publisher Link, PubMed
Nov 18 2022: TMED virtual retreat 2022
See event page for details and how to RSVP to attend!
Aug. 2021: Paper accepted at MLHC 2021
Our publication is now available as a PDF. Links: Paper on PMLR, Video on YouTube
Clinical Motivation
Our motivating task is to improve timely diagnosis and treatment of aortic stenosis (AS), a common degenerative cardiac valve condition. If left untreated, severe AS has lower 5-year survival rates than several metastatic cancers [1, 2]. With timely diagnosis, AS becomes a treatable condition via surgical or transcatheter aortic valve replacement with very low mortality [3].
AS is a particularly important condition where automation holds substantial promise. There is evidence that many patients with severe AS are not treated [4, 5], and there are disparities in access to care that must be addressed [6]. Automated screening for AS could increase referral and treatment rates for patients with this life-threatening condition.
We hope this dataset catalyzes research in two directions:
Deployable automatic preliminary screening and early detection of cardiac disease, especially expanding access to patients who live in areas without expert cardiologists (but where ultrasound imaging would still be possible)
Improved ML methodology for learning from limited labeled data. For our use case and many others, acquiring appropriate labels from expert clinicians is expensive and time-consuming. Our dataset deliberately supports semi-supervised methods that can learn simultaneously from a small labeled dataset and a large unlabeled dataset (which is much easier to collect).
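To make the semi-supervised setting concrete, here is a minimal sketch of one common SSL objective, confidence-thresholded pseudo-labeling. It is an illustration only, not necessarily a method evaluated on TMED; the function names, the `threshold` value, and the `unlabeled_weight` parameter are all assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the given class index."""
    return -math.log(max(probs[label], 1e-12))

def ssl_loss(labeled, unlabeled_logits, threshold=0.95, unlabeled_weight=1.0):
    """Pseudo-labeling loss (hypothetical helper, not from the TMED code).

    labeled: list of (logits, true_label) pairs from the small labeled set.
    unlabeled_logits: list of logits from the large unlabeled set.
    """
    # Supervised term: ordinary cross-entropy on the labeled examples.
    sup = sum(cross_entropy(softmax(lg), y) for lg, y in labeled) / len(labeled)

    # Unsupervised term: treat confident predictions as if they were labels.
    terms = []
    for lg in unlabeled_logits:
        probs = softmax(lg)
        confidence = max(probs)
        pseudo_label = probs.index(confidence)
        # Low-confidence examples contribute zero to the loss.
        terms.append(cross_entropy(probs, pseudo_label) if confidence >= threshold else 0.0)
    unsup = sum(terms) / len(terms) if terms else 0.0

    return sup + unlabeled_weight * unsup
```

In a real training loop the logits would come from a neural network and the loss would be minimized by gradient descent; the sketch only shows how labeled and unlabeled examples enter the same objective.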
Dataset Summary
The TMED dataset contains transthoracic echocardiogram (TTE) imagery acquired in the course of routine care consistent with American Society of Echocardiography (ASE) guidelines, all obtained from 2011-2020 at Tufts Medical Center.
When gathering echocardiogram imagery for a patient, a sonographer manipulates a handheld transducer over the patient’s chest, manually choosing different acquisition angles in order to fully assess the heart’s complex anatomy. This imaging process results in multiple cineloop video clips of the heart depicting various anatomical views (see example view types below). We extract one still image from each available video clip, so each patient study is represented in our dataset as multiple images (typically 50-100).
In routine care, neither view labels nor diagnostic labels are recorded at the time images are captured. View labels are never annotated or stored as part of routine practice. Diagnostic labels for aortic stenosis (AS), along with many other observations about heart health, are applied hours or days after a study by an expert clinician, who aggregates information from the many videos and images captured during the echocardiogram study. These diagnostic severity ratings are entered into a human-readable report document stored within that patient’s electronic medical record. For logistical reasons, it is difficult to extract that information into a machine-readable format.
We have invested significant annotation effort to gather appropriate view labels for a subset of the data, as well as significant manual effort to extract diagnostic labels from existing medical records.
Each view label was provided by experts (board-certified sonographers or cardiologists) specifically for this study using a custom labeling tool. These view types are a subset of the many possible view types, chosen because they are relevant for diagnosing valve diseases like AS.
Each diagnostic label was assigned by a board-certified cardiologist in the course of routine practice when interpreting the echocardiogram to care for the patient. These labels were pulled from the patient’s medical record in a labor-intensive manual process.
Classification Tasks
TMED supports two clinically meaningful tasks: view classification and severity diagnosis classification.
Task 1: Classify the view of an image
In echocardiography, many canonical view types are possible, each displaying distinct aspects of the heart’s complex anatomy.
As part of routine clinical care, the sonographer intentionally captures a specific view with each image, but the view type is not annotated on the image or recorded in the electronic record. Thus, from raw data alone (remember, each study typically contains 50-100 images), it is difficult to focus on a specific anatomical view of interest.
For our goal of supporting diagnosis of aortic stenosis, two kinds of views are particularly relevant: parasternal long axis (PLAX) and parasternal short axis (PSAX). Both PLAX and PSAX views are used in the routine clinical assessment of aortic valve disease, because the aortic valve’s structure and function are visible.
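As a rough illustration of how a view classifier's output could be used to focus on PLAX and PSAX images, the sketch below keeps only the images whose most likely view is one of these two. The label name "Other", the `min_confidence` cutoff, and the dictionary layout are assumptions made for the example, not part of the dataset's specification.

```python
# View types the text identifies as relevant for assessing the aortic valve.
RELEVANT_VIEWS = {"PLAX", "PSAX"}

def select_relevant_images(view_probs_by_image, min_confidence=0.5):
    """Return (image_id, view) pairs for images predicted as PLAX or PSAX.

    view_probs_by_image: dict mapping an image id to a dict of
    view name -> predicted probability (hypothetical classifier output).
    """
    selected = []
    for image_id, probs in view_probs_by_image.items():
        # Most likely view for this image under the classifier.
        best_view = max(probs, key=probs.get)
        if best_view in RELEVANT_VIEWS and probs[best_view] >= min_confidence:
            selected.append((image_id, best_view))
    return selected

# Example with made-up probabilities for three images from one study.
example_study = {
    "img_a": {"PLAX": 0.90, "PSAX": 0.05, "Other": 0.05},
    "img_b": {"PLAX": 0.10, "PSAX": 0.20, "Other": 0.70},
    "img_c": {"PLAX": 0.30, "PSAX": 0.45, "Other": 0.25},
}
relevant = select_relevant_images(example_study)
```

Here only `img_a` is kept: `img_b` is most likely some other view, and `img_c` is predicted as PSAX but falls below the confidence cutoff.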
Task 2: Classify the diagnostic severity level of a patient
Our ultimate goal is automated preliminary screening of aortic stenosis (AS), which would improve early detection of this life-threatening disease.
Toward this goal, our diagnosis task requires aggregating predictions across many images of the same patient’s heart (using ~100 images) to make a coherent prediction for that individual.
This task mimics how cardiologists make real AS diagnoses in practice: they have access to ~100 images captured by the sonographer, which vary in signal quality and represent different view types. The cardiologist needs to identify which images are relevant (those showing relevant anatomical views with appropriate quality) and then look for key signs of disease in these images to determine the appropriate severity diagnosis (none, mild, moderate, or severe).
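One simple way to turn per-image predictions into a single patient-level diagnosis is a relevance-weighted average, sketched below. This is an assumed aggregation rule chosen for illustration; the weighting scheme, the function name `diagnose_study`, and the input layout are hypothetical rather than a prescribed TMED method.

```python
# Severity levels named in the task description, in increasing order.
SEVERITY_LEVELS = ["none", "mild", "moderate", "severe"]

def diagnose_study(images):
    """Aggregate per-image predictions into one study-level diagnosis.

    images: list of (relevance, severity_probs) pairs, where relevance is
    the estimated probability that the image shows a relevant view and
    severity_probs lists one probability per entry in SEVERITY_LEVELS.
    Returns (predicted_level, study_level_probabilities).
    """
    total_weight = sum(relevance for relevance, _ in images)
    if total_weight == 0:
        raise ValueError("no image has nonzero relevance")

    # Weighted average: images showing relevant views count for more.
    study_probs = [
        sum(relevance * probs[k] for relevance, probs in images) / total_weight
        for k in range(len(SEVERITY_LEVELS))
    ]
    best = max(range(len(SEVERITY_LEVELS)), key=lambda k: study_probs[k])
    return SEVERITY_LEVELS[best], study_probs

# Example with made-up numbers: two relevant images and one irrelevant one.
label, probs = diagnose_study([
    (0.9, [0.10, 0.10, 0.20, 0.60]),
    (0.1, [0.70, 0.10, 0.10, 0.10]),
    (0.8, [0.05, 0.15, 0.30, 0.50]),
])
```

The low-relevance image barely influences the result, which loosely mirrors how a cardiologist would discount images that do not show the aortic valve.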
Acknowledgement of Funding
We gratefully acknowledge financial support from the Pilot Studies Program at the Tufts Clinical and Translational Science Institute (Tufts CTSI NIH CTSA UL1TR002544).