Towards the Automation of Kidney Function Classification and Prediction Through Ultrasound-based Kidney Imaging Using Deep Learning

Kidney ultrasound imaging is widely used in clinical practice, for example to exclude reversible causes of acute kidney injury such as urinary obstruction, and to identify irreversible chronic kidney disease (CKD), which spares patients unnecessary workup such as kidney biopsy. Ultrasound has many advantages: it is non-invasive, relatively low in cost, and widely available. Predicting kidney function and CKD from kidney ultrasound images has therefore long been considered desirable in clinical practice. However, high subjective variability in image acquisition and interpretation makes it difficult to translate experience-based prediction into a standardized practice comparable to the (invasive) measurement of serum creatinine. In clinical practice, several ultrasound features are used as markers of the severity of kidney injury: kidney length, volume, cortical thickness, and echogenicity. A few related studies have evaluated renal function with ultrasound, but their results were suboptimal or based on small samples.

Marker	Correlation reported in prior studies	Notes
length	0.36
volume	0.4-0.49
cortical thickness	0.852	only 42 samples

Yaprak et al. (2017) developed a CKD scoring system that integrates three ultrasonographic parameters (kidney length, parenchymal thickness, and echogenicity), but the correlation remained at 0.587.

Our Goal

To overcome the substantial inter-observer variability in kidney ultrasound interpretation, we train a deep learning model to support better clinical decisions, in two parts:

Part 1. Prediction of the current eGFR
Part 2. Classification of CKD status

Study Population

We initially enrolled 8,281 pre-ESRD patients aged 20–89 years at China Medical University, with a total of 203,353 sonographic images acquired since 2003. To keep sharpness, contrast, and noise consistent, we selected studies performed after 2014 on GE ultrasound systems (LOGIQ E9 and LOGIQ P3, GE Healthcare, Milwaukee, WI, USA). eGFR was calculated from serum creatinine measured within 4 weeks before or after the day of the kidney sonography, using the abbreviated MDRD equation:

eGFR = 186 × creatinine^(−1.154) × age^(−0.203) × 1.212 [if black] × 0.742 [if female]
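
For readers who want to reproduce the label computation, the equation above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `mdrd_egfr` and its parameter names are made up here, and it assumes the units conventionally used with the coefficient 186 (serum creatinine in mg/dL, age in years, result in mL/min/1.73 m²).

```python
def mdrd_egfr(creatinine_mg_dl: float, age_years: float,
              female: bool = False, black: bool = False) -> float:
    """Abbreviated MDRD estimate of GFR.

    Assumed units (not stated in the text above): serum creatinine in mg/dL,
    age in years, result in mL/min/1.73 m^2.
    """
    if creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")

    # Base term: 186 x creatinine^-1.154 x age^-0.203
    egfr = 186.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203

    # Multiplicative correction factors from the equation
    if black:
        egfr *= 1.212
    if female:
        egfr *= 0.742
    return egfr


if __name__ == "__main__":
    # Illustrative inputs only, not patient data from the study.
    print(round(mdrd_egfr(1.2, 60, female=True), 1))
```

Because both exponents are negative, the estimate falls as creatinine or age rises, which is the expected direction for declining kidney function.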

Dataset and Data Cleaning

Total images: 37,696 (acquired after 2014)

Training set:
- Around 1106 patients
- Around 3500 images
Validation set:
- Around 180 patients
- Around 180 images (only the image showing the kidney most clearly was selected for each patient)
Testing set:
- 160 patients
- 160 images (only the image showing the kidney most clearly was selected for each patient)
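
The split above is listed per patient, which suggests that no patient contributes images to more than one set. A minimal sketch of such a patient-level split follows. It is an assumption about the procedure, not the study's actual code: `split_by_patient` and `choose_clearest` are hypothetical names, and the default "clearest image" choice is only a placeholder for what was presumably a manual or quality-based selection.

```python
import random
from collections import defaultdict


def split_by_patient(records, n_val, n_test, seed=0, choose_clearest=None):
    """Split (patient_id, image_id) pairs so no patient spans two sets.

    Validation and test sets keep one image per patient, picked by
    `choose_clearest` (placeholder default: the first image listed).
    """
    if choose_clearest is None:
        choose_clearest = lambda images: images[0]  # placeholder, not a quality metric

    # Group images by patient
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    # Shuffle patients reproducibly, then carve out test and validation patients
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    test_ids = patient_ids[:n_test]
    val_ids = patient_ids[n_test:n_test + n_val]
    train_ids = patient_ids[n_test + n_val:]

    return {
        "train": {pid: list(by_patient[pid]) for pid in train_ids},
        "val": {pid: [choose_clearest(by_patient[pid])] for pid in val_ids},
        "test": {pid: [choose_clearest(by_patient[pid])] for pid in test_ids},
    }


if __name__ == "__main__":
    # Toy data: 10 hypothetical patients with 3 images each.
    demo = [(f"p{i}", f"p{i}_img{j}") for i in range(10) for j in range(3)]
    parts = split_by_patient(demo, n_val=2, n_test=3)
    print({name: len(patients) for name, patients in parts.items()})
```

Splitting on patients rather than on images keeps several views of the same kidney from appearing in both training and evaluation data, which would otherwise inflate the measured performance.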

Preprocessing

Obtaining a clean kidney ultrasound image faces several obstacles: surrounding organs such as the liver, intestines, and spleen, and adjacent tissues such as fat, often appear in the frame. Moreover, most ultrasound images contain annotations that are burned into the image pixels. These can act as noise, or “distractions”, for the deep learning model. To cope with this noise, we developed a “tailored crop” process based on the two markers that annotate the kidney length.
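
The exact cropping rule is not described here, so the following is only one plausible sketch of a marker-based crop. It assumes the pixel coordinates of the two length markers are already known, and it takes a square window centred on the midpoint between them, sized to the marker distance plus a margin. The names `crop_box_from_markers`, `crop`, and `margin_ratio` are hypothetical.

```python
import math


def crop_box_from_markers(p1, p2, width, height, margin_ratio=0.15):
    """Return (left, top, right, bottom) of a square box around two markers.

    p1 and p2 are (x, y) pixel coordinates of the kidney-length markers.
    The box is centred on their midpoint; its side is the marker distance
    enlarged by `margin_ratio` on each side, clamped to the image bounds.
    """
    (x1, y1), (x2, y2) = p1, p2
    length = math.hypot(x2 - x1, y2 - y1)
    half = length * (0.5 + margin_ratio)
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2

    left = max(0, math.floor(cx - half))
    top = max(0, math.floor(cy - half))
    right = min(width, math.ceil(cx + half))
    bottom = min(height, math.ceil(cy + half))
    return left, top, right, bottom


def crop(image, box):
    """Crop an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # Toy 480 x 640 blank image with made-up marker positions.
    image = [[0] * 640 for _ in range(480)]
    box = crop_box_from_markers((100, 200), (300, 220), 640, 480)
    patch = crop(image, box)
    print(box, len(patch), len(patch[0]))
```

A window tied to the measured kidney length scales with kidney size in the frame, so most of the surrounding organs and on-screen text fall outside the crop regardless of zoom level.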

Model Architecture

- A convolutional neural network based on ResNet-101 predicts eGFR.
- Features are extracted with the trained CNN model.
- A gradient boosted tree algorithm classifies CKD stage (above stage 2 or not) using the extracted features.
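
The binary target for the classifier is described only as "above stage 2 or not". A minimal sketch of how such a label could be derived from eGFR follows, assuming staging by eGFR alone with the conventional cut-offs of 90, 60, 30, and 15 mL/min/1.73 m², so that "above stage 2" corresponds to eGFR below 60. The function names are hypothetical, and the study may have defined the label differently.

```python
def ckd_stage_from_egfr(egfr: float) -> int:
    """Map eGFR (mL/min/1.73 m^2) to a CKD stage 1-5 using eGFR alone.

    Assumption: conventional eGFR cut-offs. Clinical staging of stages 1-2
    also requires other evidence of kidney damage, which is ignored here.
    """
    if egfr < 0:
        raise ValueError("eGFR must be non-negative")
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def above_stage_2(egfr: float) -> bool:
    """Binary classification target: True when the stage is 3 or higher."""
    return ckd_stage_from_egfr(egfr) > 2


if __name__ == "__main__":
    # Illustrative eGFR values only.
    for value in (95.0, 72.5, 59.9, 22.0, 8.0):
        print(value, ckd_stage_from_egfr(value), above_stage_2(value))
```

Under this assumption the classifier separates patients with eGFR below 60 from the rest, which is the threshold commonly used to flag reduced kidney function.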

Performance

For prediction of eGFR

For classification of CKD stage

Demonstration

Conclusion

Our model is the first fundamental step toward realizing the potential of kidney ultrasound imaging as an effective, real-time, remote screening tool. AI-based GFR estimation offers the possibility of non-invasive assessment of kidney function, a key goal of AI-powered functional automation in clinical practice.

Lin, Wei-Kai