Leveraging Public Health Data: Analyzing Decades of Records to Redefine the L3 Detection Algorithm

Allwyn conducted a research project to help radiologists and other physicians accurately identify the L3 slice in a CT scan, using AI and machine learning to increase confidence in the result. We discussed the project's background and problem statement in detail in our earlier article. Below are some insights into the chosen methodology.

DATASETS

Using publicly available data sources, we selected CT scans of lung cancer patients and set out to show that a machine learning algorithm can detect L3 accurately and with high confidence. We adjusted data pre-processing parameters to test ways of improving the output.

The data source for this project was The Cancer Imaging Archive (TCIA), a publicly available repository populated by user-submitted data. Before data is accepted on the site, it must meet TCIA standards and contain no protected health information (PHI).

The data used for this project were CT scans in Digital Imaging and Communications in Medicine (DICOM) format. This ensures a standardized structure and data uniformity.
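Because DICOM standardizes how acquisition metadata is stored, each slice can be inspected programmatically. The sketch below, which assumes the pydicom package and a hypothetical file path, shows how a few of the standard tags can be read:

```python
# Minimal sketch (assumes the pydicom package and a hypothetical file path)
# showing how standardized DICOM tags can be read programmatically.
import pydicom

# Hypothetical path to one slice of a CT series downloaded from TCIA.
ds = pydicom.dcmread("example_series/slice_001.dcm")

# Standard DICOM attributes available on a conforming CT slice.
print("Modality:       ", ds.Modality)        # e.g. "CT"
print("Study date:     ", ds.StudyDate)       # YYYYMMDD string
print("Slice thickness:", ds.SliceThickness, "mm")
print("Pixel spacing:  ", ds.PixelSpacing)    # [row, col] spacing in mm
print("Image size:     ", ds.Rows, "x", ds.Columns)

# The pixel data itself is exposed as a NumPy array.
print("Pixel array shape:", ds.pixel_array.shape)
```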

Cases were drawn from five different collections focused on lung cancer research, containing various types of CT scans (full-body, chest, etc.).
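For illustration, the sketch below shows one way a public collection can be queried and downloaded programmatically. It assumes TCIA's tcia_utils helper package, and the collection name is a placeholder; the exact function names and signatures should be verified against the current tcia_utils documentation.

```python
# Illustrative sketch only: assumes the tcia_utils helper package published for TCIA
# (pip install tcia_utils). Function names/signatures should be checked against the
# current documentation, and the collection name below is a placeholder.
from tcia_utils import nbia

# List the image series available in one public lung-cancer collection.
series = nbia.getSeries(collection="LIDC-IDRI")
print(f"Found {len(series)} series in the collection")

# Download a small sample of series for local experimentation.
nbia.downloadSeries(series, number=5)
```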

EXPLORATORY DATA ANALYSIS

The full data set, comprising all five collections, was downloaded, totaling 151.2 GB of storage and 315,406 images. Each collection was examined for year of study, because scans performed prior to 1990 tend to have low image quality. We found that the majority of the scans are post-1990.
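A study-year check of this kind can be run directly over the DICOM headers. The sketch below assumes pydicom and a placeholder local directory layout; it reads only the headers so the pass over several hundred thousand files stays fast.

```python
# Sketch of a simple study-year tally across downloaded DICOM files, assuming
# pydicom and a placeholder local download directory.
from pathlib import Path
from collections import Counter
import pydicom

year_counts = Counter()

for dcm_path in Path("tcia_downloads").rglob("*.dcm"):
    # Read only the header (skip pixel data) to keep the scan over 300k+ files fast.
    ds = pydicom.dcmread(dcm_path, stop_before_pixels=True)
    study_date = getattr(ds, "StudyDate", "")   # DICOM dates are YYYYMMDD strings
    if len(study_date) >= 4:
        year_counts[int(study_date[:4])] += 1

pre_1990 = sum(n for year, n in year_counts.items() if year < 1990)
total = sum(year_counts.values())
print(f"{pre_1990} of {total} images predate 1990")
```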

L3 DETECTION ALGORITHM OVERVIEW

The detection algorithm is based on sarcopenia-ai and the work of previous teams. It is a trained fully convolutional neural network (FCNN), an architecture commonly used in medical image processing.
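To show the general FCNN pattern, the sketch below builds a minimal fully convolutional network in Keras. This is not the sarcopenia-ai architecture itself, only an illustration of the key trait: convolution and pooling layers only, so inputs of any size are accepted and the output is a spatial confidence map rather than a fixed-length vector.

```python
# Minimal illustrative fully convolutional network in Keras; NOT the sarcopenia-ai
# architecture, just a sketch of the FCNN pattern used for slice localization.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_fcnn(channels: int = 1) -> tf.keras.Model:
    # Input height/width left as None: a defining trait of fully convolutional nets.
    inputs = layers.Input(shape=(None, None, channels))

    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)

    # A 1x1 convolution yields a per-location confidence map instead of a dense output.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return models.Model(inputs, outputs)

model = build_fcnn()
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```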

The code was developed using cloud computing resources on Google Colab, a platform that facilitates the creation, sharing, and testing of computer code and is particularly useful for AI applications.

Before attempting detection with the algorithm, we had to convert and transform the data. We converted the DICOM data into a single Neuroimaging Informatics Technology Initiative (NIfTI) file and reduced the 3D volume to a 2D maximum intensity projection (MIP).
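The sketch below illustrates this two-step conversion, assuming the dicom2nifti and nibabel packages and placeholder paths; the projection axis is illustrative and would depend on the orientation of the actual scans.

```python
# Sketch of the conversion step, assuming the dicom2nifti and nibabel packages and
# placeholder paths; the projection axis below is illustrative.
import dicom2nifti
import nibabel as nib

# 1. Collapse a DICOM series (a folder of .dcm slices) into a single NIfTI volume.
dicom2nifti.dicom_series_to_nifti("example_series/", "example_volume.nii.gz",
                                  reorient_nifti=True)

# 2. Load the 3D volume and reduce it to a 2D maximum intensity projection (MIP)
#    by keeping the brightest voxel along one anatomical axis.
volume = nib.load("example_volume.nii.gz").get_fdata()   # shape: (x, y, z)
mip_2d = volume.max(axis=1)                              # project along one axis

print("3D volume shape:", volume.shape)
print("2D MIP shape:   ", mip_2d.shape)
```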

A significant amount of the team's effort went into improving the algorithm workflow: cleanly separating data preparation from L3 detection, organizing the data, and documenting the code.

We will be discussing our solution approach and achievements over the coming weeks. Please follow us on LinkedIn to stay tuned.