Comparing the Performance of Data Mining Methods in Classifying Successful Students with Scientific Literacy in PISA 2015

  • Author(s) / Creator(s)
  • This study classifies students as successful or unsuccessful in PISA 2015 scientific literacy using the indices and student questionnaire items in the PISA 2015 database. The sample consists of 5895 Turkish students who participated in PISA 2015. Three data mining methods were used: Multilayer Perceptron, Logistic Regression, and Support Vector Machine. Each method was evaluated under three conditions: an 80% training-20% test split, a 70% training-30% test split, and 10-fold Cross Validation. Accuracy, F-measure, Precision, Recall, and ROC Area served as the evaluation criteria. The most important variables for classifying successful and unsuccessful students were the environmental awareness scale items. The highest Accuracy across all conditions was 0.81, obtained by the Support Vector Machine under 10-fold Cross Validation; the lowest was 0.74, obtained by the Multilayer Perceptron under the 80% training-20% test split. The performance measures obtained under 10-fold Cross Validation were the highest among the three evaluation conditions. By Accuracy, the Support Vector Machine performed best under both the 70% training-30% test split and 10-fold Cross Validation. Although the performance measures of the other methods are relatively close to one another across the evaluation criteria, they vary with the evaluation condition.
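
The evaluation conditions and criteria named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' code, and the abstract does not state which software they used. The function names (`binary_metrics`, `holdout_split`, `kfold_indices`) are hypothetical, the label coding 1 = successful and 0 = unsuccessful is an assumption, and the splits are left unshuffled for brevity, whereas a real analysis would normally shuffle or stratify. ROC Area is omitted because it needs predicted scores rather than hard labels.

```python
from typing import Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F-measure for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def holdout_split(n: int, train_percent: int) -> Tuple[List[int], List[int]]:
    """Split indices 0..n-1 into train/test by an integer percentage (e.g. 80 or 70)."""
    cut = n * train_percent // 100  # integer arithmetic avoids float rounding surprises
    indices = list(range(n))
    return indices[:cut], indices[cut:]


def kfold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; each index is a test case exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]  # round-robin fold assignment
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    train, test = holdout_split(5895, 80)
    print(len(train), len(test))
    print([len(test_idx) for _, test_idx in kfold_indices(5895, 10)])
    print(binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```

In a hold-out split each student is either a training or a test case, while in 10-fold Cross Validation every student is used for testing exactly once, so the reported measures there are averaged over ten test folds rather than taken from a single partition.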

  • Date created
    2018-09-05
  • Subjects / Keywords
  • Type of Item
    Conference/Workshop Presentation
  • DOI
    https://doi.org/10.7939/R3KW5812Q
  • License
    Attribution-NonCommercial 4.0 International