Upper Aero Digestive Tract Cancer Diagnosis using Deep Learning Methods
Objective: Narrow band imaging (NBI) and white light (WL) are endoscopic techniques used to visualize upper aero digestive tract (UADT) cancers. However, these imaging techniques are less effective for diagnosing tumors in less experienced centers, since they depend on skilled medical experts. Recently, there has been evidence that deep learning (DL) has potential applications in UADT video endoscopy. This research aims to develop a DL model for the automatic identification and delineation of UADT cancer.
Approach: An ensemble of YOLO DL models (YOLOv5s with YOLOv5m) was used to diagnose laryngeal squamous cell carcinoma (LSCC) in both WL and NBI frames. Six external LSCC laryngoscopy videos were tested in real time for cancer detection. SegMENT, a segmentation convolutional neural network (CNN) based on a modified DeepLabV3+ model, was proposed for precise UADT cancer delineation using an in-domain transfer learning ensemble technique. Its accuracy was further validated on external datasets of NBI images of oral cavity SCC (OCSCC) and oropharyngeal SCC (OPSCC). SegMENT-Plus is an improved version of the SegMENT model designed for large LSCC datasets. It uses an EfficientNetB5 backbone as the encoder with a modified atrous spatial pyramid pooling (m-ASPP) block. Attention blocks (SE and CBAM) were integrated into the m-ASPP module to improve cancer segmentation. The m-ASPP was used to extract local and global LSCC features to overcome the limitations of conventional ASPP modules in the literature. SegMENT-Plus was evaluated using a multi-center dataset from three hospitals (Genoa, Italy; Brescia, Italy; Seoul, South Korea). The model was tested on LSCC frames, and its delineation performance was compared with that of three otolaryngology experts. Unseen intraoperative laryngoscopy videos were also used to validate real-time performance. SegMENT-Plus was compared with its predecessor SegMENT and other DL models (UNET, ResUNET, DeepLabv3+, DoubleUNET).
Main results: For the LSCC detection task, 219 patients from Genoa, Italy were enrolled, providing 624 LSCC video frames. The data were split into an 82.6% training set, an 8.2% validation set, and a 9.2% testing set for the YOLO models. The ensemble algorithm (YOLOv5s with YOLOv5m, with Test Time Augmentation) achieved the best LSCC detection, with 66% Precision, 62% Recall, and 63% mean Average Precision at 0.5 intersection over union (IoU). The average computation time per frame on laryngoscopy videos was 0.026 seconds. The SegMENT model for UADT cancer delineation was developed using the same 219 patients (624 larynx frames), and external validation from Brescia, Italy for the OPSCC and OCSCC cohorts involved 116 and 102 NBI images, respectively. The SegMENT model achieved an IoU of 0.68 and a Dice similarity coefficient (DSC) of 0.81, outperforming other DL models. The DSC values in the OCSCC and OPSCC datasets improved significantly, with median DSC gains of 10.3% and 11.9%, respectively. For the development of SegMENT-Plus to improve LSCC delineation, 557 patients with 3933 laryngeal images from Genoa, Italy were included. The optimal performance and generalization of the algorithm were confirmed by external testing cohorts from Seoul, South Korea, and Brescia, Italy. The external cohorts showed DSC between 81.4% and 84.9% and IoU between 81.8% and 85.7%.
Significance: The study identified a suitable CNN model for LSCC detection in WL and NBI video laryngoscopes. SegMENT outperformed previous results in external validation cohorts, showing promise for precise tumor segmentation. SegMENT-Plus holds the potential for improved early tumor detection and delineation, laying the foundation for a clinical system in LSCC margin delineation
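The detection figures above (Precision, Recall, and Average Precision at 0.5 IoU) rest on matching predicted boxes to annotated boxes by their overlap. The following is a minimal sketch of that idea, not the authors' evaluation code: the `(x1, y1, x2, y2)` box format, the greedy matching rule, and the function names `box_iou` and `precision_recall` are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes.

    predictions: list of (confidence, box); ground_truth: list of boxes.
    A prediction is a true positive if its best unmatched ground-truth
    box overlaps it with IoU >= iou_threshold (assumed matching rule).
    """
    matched = set()
    true_pos = 0
    # Visit predictions from most to least confident.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Average Precision would additionally sweep the confidence threshold and integrate precision over recall; that step is omitted here for brevity.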
Artificial Intelligence for Upper Aerodigestive Tract Endoscopy and Laryngoscopy: A Guide for Physicians and State‐of‐the‐Art Review
Objective: The endoscopic and laryngoscopic examination is paramount for laryngeal, oropharyngeal, nasopharyngeal, nasal, and oral cavity benign lesions and cancer evaluation. Nevertheless, upper aerodigestive tract (UADT) endoscopy is intrinsically operator-dependent and lacks objective quality standards. At present, there has been an increased interest in artificial intelligence (AI) applications in this area to support physicians during the examination, thus enhancing diagnostic performances. The relative novelty of this research field poses a challenge both for the reviewers and readers as clinicians often lack a specific technical background. Data sources: Four bibliographic databases were searched: PubMed, EMBASE, Cochrane, and Google Scholar. Review methods: A structured review of the current literature (up to September 2022) was performed. Search terms related to topics of AI, machine learning (ML), and deep learning (DL) in UADT endoscopy and laryngoscopy were identified and queried by 3 independent reviewers. Citations of selected studies were also evaluated to ensure comprehensiveness. Conclusions: Forty-one studies were included in the review. AI and computer vision techniques were used to achieve 3 fundamental tasks in this field: classification, detection, and segmentation. All papers were summarized and reviewed. Implications for practice: This article comprehensively reviews the latest developments in the application of ML and DL in UADT endoscopy and laryngoscopy, as well as their future clinical implications. The technical basis of AI is also explained, providing guidance for nonexpert readers to allow critical appraisal of the evaluation metrics and the most relevant quality requirements
Real-Time Laryngeal Cancer Boundaries Delineation on White Light and Narrow-Band Imaging Laryngoscopy with Deep Learning
Objective: To investigate the potential of deep learning for automatically delineating (segmenting) laryngeal cancer superficial extent on endoscopic images and videos. Methods: A retrospective study was conducted extracting and annotating white light (WL) and Narrow-Band Imaging (NBI) frames to train a segmentation model (SegMENT-Plus). Two external datasets were used for validation. The model's performances were compared with those of two otolaryngology residents. In addition, the model was tested on real intraoperative laryngoscopy videos. Results: A total of 3933 images of laryngeal cancer from 557 patients were used. The model achieved the following median values (interquartile range): Dice Similarity Coefficient (DSC) = 0.83 (0.70-0.90), Intersection over Union (IoU) = 0.83 (0.73-0.90), Accuracy = 0.97 (0.95-0.99), Inference Speed = 25.6 (25.1-26.1) frames per second. The external testing cohorts comprised 156 and 200 images. SegMENT-Plus performed similarly on all three datasets for DSC (p = 0.05) and IoU (p = 0.07). No significant differences were noticed when separately analyzing WL and NBI test images on DSC (p = 0.06) and IoU (p = 0.78) and when analyzing the model versus the two residents on DSC (p = 0.06) and IoU (Senior vs. SegMENT-Plus, p = 0.13; Junior vs. SegMENT-Plus, p = 1.00). The model was then tested on real intraoperative laryngoscopy videos. Conclusion: SegMENT-Plus can accurately delineate laryngeal cancer boundaries in endoscopic images, with performances equal to those of two otolaryngology residents. The results on the two external datasets demonstrate excellent generalization capabilities. The computation speed of the model allowed its application on videolaryngoscopies simulating real-time use. Clinical trials are needed to evaluate the role of this technology in surgical practice and resection margin improvement. Level of evidence: III Laryngoscope, 2024
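For readers less familiar with the segmentation metrics reported above, the sketch below shows how a per-image Dice Similarity Coefficient and IoU can be computed from two binary masks, and how per-image scores can be summarized as a median with an interquartile range. It is an illustration only: the nested-list mask format, the convention for two empty masks, and the names `dice_and_iou` and `median_iqr` are assumptions, not details taken from the study.

```python
from statistics import median, quantiles


def dice_and_iou(pred, target):
    """DSC and IoU for two binary masks (equally sized nested lists of 0/1)."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    inter = sum(1 for a, b in zip(p, t) if a and b)
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - inter
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dsc = 2 * inter / (p_sum + t_sum)
    iou = inter / union
    return dsc, iou


def median_iqr(values):
    """Median and (Q1, Q3) of per-image scores; needs at least two values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)
```

A typical use would be to call `dice_and_iou` once per test image and then pass the list of per-image DSC values to `median_iqr`.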
Automatic delineation of laryngeal squamous cell carcinoma during endoscopy
White Light (WL) and Narrow Band Imaging (NBI) endoscopy are widely used to assess the superficial spreading of laryngeal squamous cell carcinoma (LSCC). However, the analysis of images requires a high level of attention and extensive clinical expertise, leading to inter-clinician variability in the assessment of tumor margins. Computer-aided segmentation can automate the identification of LSCC margins, supporting clinicians in this challenging task. In this paper, we present SegMENT-Plus, a Deep Learning segmentation convolutional network specifically developed and optimized for accurate delineation of LSCC. SegMENT-Plus uses EfficientNetB5 as encoder with a new modified Atrous Spatial Pyramid Pooling (m-ASPP) block that integrates the Convolutional Block Attention Module (CBAM) and Squeeze-and-Excitation (SE). In this new architecture, CBAM extracts local and global LSCC features from the encoder, while the SE block refines cancer segmentation on each dilated convolution output. SegMENT-Plus was trained and evaluated on a multi-center dataset including clinical data from three different hospitals. A total of 4289 annotated laryngeal images from 766 patients were included in this study. The experiments showed that SegMENT-Plus achieved a Dice Similarity Coefficient (DSC) between 81.4% and 84.9% and an Intersection over Union (IoU) between 81.8% and 85.7% on the data from the different hospitals, attesting to its high performance and generalization capability. The proposed segmentation architecture also demonstrated statistically significant improvement in DSC and IoU compared to other state-of-the-art architectures, showing that this work is a concrete foundation towards a clinical system for the automatic delineation of LSCC margins in endoscopic images
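The core idea behind atrous spatial pyramid pooling is to apply the same small kernel at several dilation rates in parallel, so that each branch sees a different amount of context. The sketch below illustrates only that idea on a 1D signal in plain Python; it does not reproduce m-ASPP, SE, or CBAM, and the function names, the odd-length kernel, and the rates `(1, 2, 4)` are assumptions chosen for the example.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1D cross-correlation with zero padding and dilation `rate`.

    Assumes an odd-length kernel so that it is centred on each sample.
    """
    half = (len(kernel) // 2) * rate
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_like(signal, kernel, rates=(1, 2, 4)):
    """Apply one kernel at several dilation rates; one output row per rate."""
    return [dilated_conv1d(signal, kernel, r) for r in rates]
```

In a real segmentation network the branches are 2D convolutions with learned weights, and their outputs are concatenated along the channel axis before further processing.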
Deep Learning Applied to White Light and Narrow Band Imaging Videolaryngoscopy: Toward Real‐Time Laryngeal Cancer Detection
To assess a new application of artificial intelligence for real-time detection of laryngeal squamous cell carcinoma (LSCC) in both white light (WL) and narrow-band imaging (NBI) videolaryngoscopies based on the You-Only-Look-Once (YOLO) deep learning convolutional neural network (CNN)
Multimodal Medical Image Registration and Fusion for Quality Enhancement
For the last two decades, physicians and clinical experts have used a single imaging modality to identify the normal and abnormal structure of the human body. However, most of the time, medical experts are unable to accurately analyze and examine the information from a single imaging modality due to the limited information. To overcome this problem, a multimodal approach is adopted to increase the qualitative and quantitative medical information, which helps doctors to diagnose diseases in their early stages. In the proposed method, a Multi-resolution Rigid Registration (MRR) technique is used for multimodal image registration, while Discrete Wavelet Transform (DWT) along with Principal Component Averaging (PCAv) is utilized for image fusion. The proposed MRR method provides more accurate results than Single Rigid Registration (SRR), while the proposed DWT-PCAv fusion process adds more constructive information with less computational time. The proposed method is tested on CT and MRI brain imaging modalities of the HARVARD dataset. The fusion results of the proposed method are compared with the existing fusion techniques. Quality assessment metrics such as Mutual Information (MI), Normalized Cross-correlation (NCC), and Feature Mutual Information (FMI) are computed for statistical comparison of the proposed method. The proposed methodology provides more accurate results, better image quality, and valuable information for medical diagnoses
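As a rough illustration of wavelet-domain fusion with PCA-derived weights, the sketch below fuses two already-registered 1D signals: a single-level Haar DWT splits each into approximation and detail coefficients, the approximations are averaged with weights taken from the principal eigenvector of their 2x2 covariance, and the larger-magnitude detail is kept. This is a generic simplification, not the DWT-PCAv method of the paper; the 1D setting, the Haar wavelet, the max-magnitude detail rule, and all function names are assumptions.

```python
import math


def haar_dwt(x):
    """Single-level Haar DWT of an even-length sequence: (approx, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def pca_weights(u, v):
    """Fusion weights from the principal eigenvector of cov(u, v)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    a = sum((p - mu) ** 2 for p in u) / n
    c = sum((q - mv) ** 2 for q in v) / n
    b = sum((p - mu) * (q - mv) for p, q in zip(u, v)) / n
    if b == 0:
        # Uncorrelated sources: favour the one with larger variance.
        if a == c:
            return 0.5, 0.5
        return (1.0, 0.0) if a > c else (0.0, 1.0)
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # Eigenvector for lam is (b, lam - a); magnitudes are normalised to sum to 1.
    e1, e2 = abs(b), abs(lam - a)
    return e1 / (e1 + e2), e2 / (e1 + e2)


def fuse(x, y):
    """Fuse two registered, equal-length, even-length signals."""
    ax, dx = haar_dwt(x)
    ay, dy = haar_dwt(y)
    wx, wy = pca_weights(ax, ay)
    approx = [wx * p + wy * q for p, q in zip(ax, ay)]
    detail = [p if abs(p) >= abs(q) else q for p, q in zip(dx, dy)]
    return haar_idwt(approx, detail)
```

For images, the same steps would be applied to 2D sub-bands after registration, typically with a dedicated wavelet library rather than hand-written transforms.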
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
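The difference between the two counting schemes can be made concrete with a small sketch. It assumes each citing paper is given as a list of cited works, each cited work as a list of author names in byline order, and it counts a pair of authors once per citing paper in which both appear. Treating co-authors of the same cited work as co-cited is one possible convention and an assumption here, as are the function name `cocitation_counts` and the `max_authors` parameter.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations.

    citing_papers: list of reference lists; each reference is a list of
    author names in byline order. max_authors=1 gives first-author
    counting; max_authors=5 considers the first five authors.
    """
    counts = Counter()
    for references in citing_papers:
        authors = set()
        for ref_authors in references:
            authors.update(ref_authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts
```

With `max_authors=1`, authors who never appear first on a byline drop out of the matrix entirely, which is the limitation the non-traditional counting is meant to address.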
Videomics of the Upper Aero-Digestive Tract Cancer: Deep Learning Applied to White Light and Narrow Band Imaging for Automatic Segmentation of Endoscopic Images
Narrow Band Imaging (NBI) is an endoscopic visualization technique useful for upper aero-digestive tract (UADT) cancer detection and margins evaluation. However, NBI analysis is strongly operator-dependent and requires high expertise, thus limiting its wider implementation. Recently, artificial intelligence (AI) has demonstrated potential for applications in UADT videoendoscopy. Among AI methods, deep learning algorithms, and especially convolutional neural networks (CNNs), are particularly suitable for delineating cancers on videoendoscopy. This study is aimed to develop a CNN for automatic semantic segmentation of UADT cancer on endoscopic images
