Nicholas Drage
Paul van der Stelt


W. Panyarak1, W. Suttapak2, P. Mahasantipiya1, A. Charuakkra1, N. Boonsong3, K. Wantanajittikul4

1Chiang Mai University, Faculty of Dentistry, Division of Oral and Maxillofacial Radiology, Chiang Mai, Thailand, 2University of Phayao, School of Information and Communication Technology, Department of Computer Engineering, Phayao, Thailand, 3Chiang Mai University, Faculty of Dentistry, Department of Oral Biology and Diagnostic Sciences, Chiang Mai, Thailand, 4Chiang Mai University, Faculty of Associated Medical Sciences, Department of Radiologic Technology, Chiang Mai, Thailand

Aim: Frequently encountered radiolucent jaw lesions, such as dentigerous cyst (DC), radicular cyst (RC), odontogenic keratocyst (OKC), and ameloblastoma (AM), share common characteristics, including well-defined borders near tooth-bearing areas. In 2021, CrossViT was introduced as a novel deep-learning approach to image classification that combines multi-scale vision transformers (ViT) with a cross-attention mechanism. However, its application to dental radiographic classification remains unexplored. This preliminary study evaluates CrossViT-15 and CrossViT-18 against ResNet models for classifying common radiolucent jaw lesions.
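CrossViT's defining component is cross-attention between its two token branches: the class (CLS) token of one branch attends to the patch tokens of the other. A minimal single-head numpy sketch of that core operation (the dimensions and random weights are illustrative only, not the published CrossViT configuration):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(cls_token, patch_tokens, Wq, Wk, Wv):
    """CLS token of one branch (1, d) attends to the patch tokens
    of the other branch (n, d); single head for clarity."""
    q = cls_token @ Wq                         # query from the CLS token
    k = patch_tokens @ Wk                      # keys from the other branch
    v = patch_tokens @ Wv                      # values from the other branch
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product, (1, n)
    attn = softmax(scores)                     # attention over patches
    return attn @ v                            # fused representation, (1, d)

rng = np.random.default_rng(0)
d, n = 64, 16
cls_small = rng.standard_normal((1, d))        # small-branch CLS token
patches_large = rng.standard_normal((n, d))    # large-branch patch tokens
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3)]
fused = cross_attention(cls_small, patches_large, *W)
```

In the full model this fused CLS token is projected back into its own branch, letting each scale exchange information with the other at low cost.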

Material and Methods: We conducted a retrospective study involving 169 radiolucent jaw lesions (57 DCs, 43 RCs, 30 OKCs, and 39 AMs) observed in panoramic radiographs (OPGs). Three experienced oral radiologists provided annotations with consensus, with histological confirmation of each lesion. We implemented CrossViT-15, -18, ResNet-50, -101, and -152, with horizontal flip for data augmentation. A four-fold cross-validation approach was employed. The models’ performance was assessed through accuracy, specificity, precision, recall (sensitivity), and F1-score metrics.
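The four-fold cross-validation with horizontal-flip augmentation can be sketched as follows. Stratified splitting, the fixed seed, and the dummy image arrays are assumptions for illustration; the abstract does not specify these details:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Stand-in labels matching the abstract's class counts:
# 57 DC, 43 RC, 30 OKC, 39 AM (169 lesions total).
labels = np.array([0] * 57 + [1] * 43 + [2] * 30 + [3] * 39)
images = np.random.rand(len(labels), 64, 64)  # dummy grayscale crops

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)
fold_sizes = []
for train_idx, val_idx in skf.split(images, labels):
    x_train, x_val = images[train_idx], images[val_idx]
    # Horizontal flip doubles the training data; the validation
    # split is left unaugmented.
    x_train = np.concatenate([x_train, x_train[:, :, ::-1]])
    fold_sizes.append((len(x_train), len(x_val)))
```

Each of the four folds then trains one model instance, and the reported metrics are averaged over folds.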

Results: CrossViT-18 achieved the highest average accuracy (72.81%), followed by ResNet-101 (72.20%), CrossViT-15 (71.01%), ResNet-152 (68.67%), and ResNet-50 (68.65%). CrossViT-15, CrossViT-18, and ResNet-101 showed high specificity (over 90%). ResNet-101 achieved the highest precision (73.04%), followed by CrossViT-15 (70.22%); recall was 70.94% for CrossViT-15 and 69.37% for CrossViT-18. The F1-score was highest for CrossViT-15 (68.35%), followed by ResNet-101 (67.90%) and CrossViT-18 (67.87%).
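All of the reported metrics can be derived from a multi-class confusion matrix; a minimal sketch, using a hypothetical 4-class matrix (not the study's actual results) whose row totals match the abstract's class counts:

```python
import numpy as np

def summary_metrics(cm):
    """Accuracy plus macro-averaged specificity, precision, recall
    (sensitivity), and F1-score from a confusion matrix cm[true, pred]."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp          # predicted as class but wrong
    fn = cm.sum(axis=1) - tp          # missed members of the class
    tn = cm.sum() - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)           # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, specificity.mean(), precision.mean(), recall.mean(), f1.mean()

# Hypothetical confusion matrix for the classes DC, RC, OKC, AM:
cm = [[50, 4, 2, 1],
      [5, 35, 2, 1],
      [3, 2, 22, 3],
      [2, 1, 4, 32]]
acc, spec, prec, rec, f1 = summary_metrics(cm)
```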

Conclusion: CrossViT and ResNet-101 models demonstrated comparable performance in categorizing prevalent radiolucent lesions on OPGs. However, further exploration with a larger dataset is recommended to discern differences among these models.


G. Li1, Y.-L. Wang1, X.-Y. Zhang1, Y. Wang2, J.-P. Li2

1Peking University School and Hospital of Stomatology, Beijing, China, 2Beijing Jiaotong University, Beijing, China

Aim: To investigate whether the proposed method can be used to determine an optimal training set size for U-Net caries recognition.

Material and Methods: The optimal training set size was calculated using the Tests for One-Sample Sensitivity and Specificity module of PASS 15, based on the expected sensitivity, expected specificity, and mean caries rate, with α = 0.05, β = 0.1, and a two-sided test. This method suggested an optimal training set of 263 carious teeth. Following an 8:1:1 ratio for the training, validation, and test sets, the initial training set comprised slices from 25 carious teeth, the second from 50, the third from 75, and so on until the dataset reached 300 teeth with carious lesions. The U-Net was validated and tested in the same way. The trained model and three radiologists were then employed for caries diagnosis in a separate group of images.
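PASS's exact procedure is proprietary, but the usual normal-approximation sample-size formula for a two-sided one-sample test of sensitivity gives the flavour of the calculation. The null and expected sensitivities and the caries rate below are illustrative assumptions, not the study's actual inputs:

```python
import math
from statistics import NormalDist

def n_for_sensitivity(p0, p1, prevalence, alpha=0.05, beta=0.10):
    """Teeth needed so a two-sided one-sample test can distinguish an
    expected sensitivity p1 from a null value p0 at significance alpha
    with power 1 - beta (normal approximation; PASS's exact result may
    differ slightly).  Dividing by prevalence converts 'diseased teeth'
    into total teeth to collect."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(1 - beta)
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    n_diseased = (num / (p1 - p0)) ** 2
    return math.ceil(n_diseased / prevalence)

# Illustrative inputs: null sensitivity 0.80, expected 0.90, caries rate 0.5.
n = n_for_sensitivity(p0=0.80, p1=0.90, prevalence=0.5)
```

The same machinery applied to specificity, with the study's actual inputs, yields figures of the same order as the 263 carious teeth reported above.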

Results: Model performance improved gradually as the number of carious teeth in the training set increased. When the number of carious teeth reached 250, the network achieved its optimal performance, with accuracy, sensitivity, specificity, F1-score, Dice similarity coefficient, and intersection over union of 0.9929, 0.9307, 0.9989, 0.9590, 0.9435, and 0.9008, respectively; performance stabilized after training with more than 250 teeth. The U-Net achieved higher accuracy than the dental radiologists for caries recognition.
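The two overlap metrics reported above, the Dice similarity coefficient and intersection over union (IoU), are computed from binary segmentation masks; a short sketch with toy masks:

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and intersection-over-union for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / union
    return dice, iou

# Toy 8x8 masks: two 4x4 squares offset by one pixel (overlap = 9 px).
pred = np.zeros((8, 8), dtype=int); pred[2:6, 2:6] = 1
gt = np.zeros((8, 8), dtype=int);   gt[3:7, 3:7] = 1
d, i = dice_iou(pred, gt)
```

The two are related by IoU = Dice / (2 − Dice), which is why Dice is always the larger of the pair, as in the 0.9435 vs. 0.9008 reported above.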

Conclusion: The optimal number of images in the training set is predictable.


Z.M. Semerci1, S. Antic2, D. Bracanovic2, A. Janovic2, İ.Ş. Bayrakdar3, Ö. Çelik4, B. Markovic Vasiljkovic2, K. Orhan5

1Akdeniz University Faculty of Dentistry, Oral and Maxillofacial Radiology Department, Antalya, Turkey, 2University of Belgrade School of Dental Medicine, Center for Radiological Diagnostics, Belgrade, Serbia, 3Eskisehir Osmangazi University, Faculty of Dentistry, Department of Oral and Maxillofacial Radiology, Eskisehir, Turkey, 4Eskisehir Osmangazi University Faculty of Science, Department of Mathematics-Computer, Eskisehir, Turkey, 5Ankara University Faculty of Dentistry, Oral and Maxillofacial Radiology Department, Ankara, Turkey

Aim: This study aimed to develop and validate a deep learning algorithm, nnU-Netv2, for the segmentation of the parotid gland in both native and contrast-enhanced computed tomography (CT) images, enhancing the precision of anatomical segmentation in head and neck radiology.

Material and Methods: In this retrospective study, a total of 101 anonymized CT images of the head and neck region were employed. Ground truth labeling was performed with the CranioCatch Annotation Tool (CranioCatch, Eskisehir, Turkey) by a dentomaxillofacial radiologist. We employed the nnU-Netv2 framework to segment the parotid gland from this dataset of 101 DICOM files, converted into NIfTI format for compatibility. The training set included 91 CT images, while the test set consisted of 10. A comprehensive training regimen was undertaken. The automatic segmentation performance was evaluated in terms of the F1-score, precision, sensitivity, and the Area Under Curve (AUC) statistics.

Results: The nnU-Netv2 algorithm demonstrated high performance, evidenced by an accuracy of 99.81%. The AUC for the model was 0.95; precision and recall were 0.894 and 0.903, respectively, further emphasizing the model's effectiveness. The Dice similarity coefficient was 0.898, indicating a robust congruence between the algorithm's segmentations and the ground truth.
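For binary segmentation the Dice coefficient is identical to the F1-score, i.e. the harmonic mean of precision and recall, so the figures above can be cross-checked against each other:

```python
# Reported precision and recall from the results above.
precision, recall = 0.894, 0.903
dice = 2 * precision * recall / (precision + recall)  # harmonic mean
print(round(dice, 3))  # 0.898 — matches the reported Dice coefficient
```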

Conclusion: The nnU-Netv2 network exhibits excellent performance in the segmentation of the parotid gland on CT images, offering near-perfect accuracy and high reliability in delineating gland contours. These results underscore the potential of advanced deep learning frameworks in improving anatomical segmentation, with significant implications for diagnostic radiology and treatment planning in head and neck pathologies.


F. Aşantoğrol1

1Gaziantep University, Faculty of Dentistry, Oral and Maxillofacial Radiology, Gaziantep, Turkey

Aim: The aim of this study was to evaluate the diagnostic success of various deep convolutional neural networks (CNNs) in diagnosing osteoporosis using dental panoramic radiographs.

Material and Methods: A total of 708 patients examined in our clinic between 2014 and 2024 were included: 354 patients previously diagnosed with osteoporosis by dual-energy X-ray absorptiometry (DEXA) scanning who also had panoramic imaging, matched for age and gender with 354 non-osteoporotic individuals. The dataset was created from the images of all patients and randomly divided into training (70%), validation (15%), and test (15%) sets. The dataset was analysed using a series of deep CNN models: the visual geometry group (VGG) models VGG16 and VGG19, Xception, Inception, MobileNet, MobileNet V2, and You Only Look Once (YOLO) v8.
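The random 70/15/15 split can be sketched as two successive splits. Stratification by diagnosis and the fixed seed are assumptions added for reproducibility; the abstract specifies only that the split was random:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 708 patients, balanced 354 osteoporosis / 354 controls (labels only;
# the images and the age/gender matching itself are outside this sketch).
y = np.array([1] * 354 + [0] * 354)
idx = np.arange(len(y))

# First take 70% for training, then split the remaining 30% evenly
# into validation and test sets.
train_idx, rest_idx = train_test_split(idx, test_size=0.30,
                                       stratify=y, random_state=0)
val_idx, test_idx = train_test_split(rest_idx, test_size=0.50,
                                     stratify=y[rest_idx], random_state=0)
```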

Results: Of the seven CNN models tested, YOLOv8 achieved the highest accuracy (100%), followed by MobileNet V2 (91.5%). YOLOv8 exhibited a precision of 1.00, recall of 1.00, F1-score of 1.00, and an AUC of 1.00. The Inception model demonstrated the lowest performance, with an accuracy of 85.8%, precision and recall of 0.86, an F1-score of 0.86, and an AUC of 0.92.

Conclusion: Deep CNN models have shown a high diagnostic performance in detecting osteoporosis patients identified by DEXA, which is considered the gold standard for determining bone mineral density, using only panoramic radiographs. These results can provide valuable information to dentists for the early detection of osteoporosis.