Comparative Analysis of Traditional Methods and AI-Powered Statistical Approaches in Educational Measurement and Evaluation


Creative Commons License

Sarıbacak B., Soylu M., Kara M.

VII. International Applied Statistics Congress (UYIK – 2026), İstanbul, Türkiye, 11-13 May 2026, no. 1348, pp. 405-409 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Page Numbers: pp. 405-409
  • Open Archive Collection: AVESİS Open Access Collection
  • Affiliated with Ondokuz Mayıs University: Yes

Abstract

The increasing integration of artificial intelligence (AI) into data analysis workflows necessitates a reexamination of the classical statistical inference methods used in educational measurement and evaluation. This study investigates the consistency, validity, and potential bias of AI-supported statistical analyses by comparing their outputs with results from conventional statistical procedures. In the first phase, gender-based differences were analyzed using the high school entrance exam scores of 8th-grade students. AI-generated outputs were compared with the results of an Independent Samples t-test conducted in IBM SPSS Statistics (Version 25). Assumptions of normality and homogeneity of variance were tested, and effect sizes were reported. In the second phase, student responses were examined within the framework of Classical Test Theory, where item difficulty and discrimination indices were calculated. Exploratory factor analysis was conducted using appropriate extraction and rotation methods to assess structural validity and identify latent constructs. To ensure accuracy, automated validation tools, including Statcheck and the GRIM test, were employed. Comparisons focused on areas of agreement and divergence between AI-generated results and traditional analyses as data complexity increased. Findings show that AI-supported analyses are consistent with conventional results for descriptive statistics and basic inferential procedures. However, for multivariate and complex datasets, deviations may occur due to assumption violations and model misspecification. In conclusion, the reliability of AI-based statistical analyses depends on integration with formal validation procedures and expert oversight. A hybrid approach combining algorithmic efficiency with methodological rigor can enhance reproducibility and inferential accuracy in large-scale assessment systems.
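
As an illustration of one of the validation tools named in the abstract, the following is a minimal sketch of a GRIM-style consistency check. It is not the authors' implementation. It assumes integer-valued item data and a mean reported to a fixed number of decimals, and the function name `grim_consistent` and its parameters are hypothetical. The idea is that a mean of n integer scores must equal some integer sum divided by n, so a reported mean that no integer sum can produce after rounding is flagged as inconsistent.

```python
import math


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if reported_mean could arise from n integer-valued scores.

    Hypothetical helper for illustration; assumes the mean was rounded
    to `decimals` places and that either rounding direction is acceptable
    for exact halves.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")

    # Half of the last reported digit, plus a small float-error margin.
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9

    # The true sum of scores must be an integer close to mean * n,
    # so only the two nearest integers need to be checked.
    approx_sum = reported_mean * n
    for candidate_sum in (math.floor(approx_sum), math.ceil(approx_sum)):
        if abs(candidate_sum / n - reported_mean) <= tolerance:
            return True
    return False


# Example: with n = 28, the only nearby achievable means are
# 145/28 = 5.1786 and 146/28 = 5.2143.
print(grim_consistent(5.18, 28))  # True
print(grim_consistent(5.19, 28))  # False
```

With two reported decimals the check is only informative for sample sizes below 100, because at larger n every two-decimal mean is achievable by some integer sum.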

Keywords: Artificial Intelligence, Measurement and Evaluation, Statistical Analysis, Reliability, Verifiability