Figure 1. The logic model of the design of the automatic analysis of visualized modeling (AAVM) web portal.
The Logic Model of the AAVM Web Portal Design
This design (i.e., the automatic analysis of visualized modeling [AAVM] web portal) employs machine learning to automatically score student-developed visualized scientific models. The motivation for the design rests on three assumptions: (1) science teachers face a heavy workload in grading student-developed scientific models; (2) they lack guidance in connecting grading results with instructional decision making; and (3) they lack the professional programming knowledge needed to develop machine learning models. Building on these assumptions, the design provides an ensemble approach of machine learning neural networks (NN) to reduce teachers’ grading workload and support their instructional decision making in the use of scientific modeling. Figure 1 displays the logic model of this design, which includes six components (i.e., inputs, activities, outputs, short-term outcomes, intermediate outcomes, and impacts). In the following sections, I illustrate the logic model step by step to clarify the design’s value and feasibility.
The logic model starts from the inputs, which present the resources that the machine working platform should have. The first resource is the set of ML NN algorithm models. To score image data, this portal uses an ensemble approach built from four main algorithm models. Among the many machine learning algorithms, the Deep Belief Network (DBN), Convolutional Neural Network (CNN), Deep Convolutional Network (DCN), and Deep Residual Network (DRN) are commonly used for data classification. Unlike most commercial automatic scoring programs, which fit one type of algorithm at a time, the AAVM web portal can run the above four algorithms simultaneously. Based on each algorithm’s predictions derived from cross-validation, the machine assigns greater weight to the algorithm that best optimizes the sets of algorithmic parameters (Breiman, 2001). Extensive experimentation by Large, Lines, and Bagnall (2019) revealed that this ensemble approach has measurable benefits over alternative weighting, selection, or meta-classifier approaches. The second resource is the image data (i.e., student-developed visualized models). Before using the platform, teachers convert all image data into a digital format and split the data into a training set and a testing set at a certain ratio (e.g., 8:2 or 7:3), along with a prediction set.
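To make the cross-validated weighting concrete, the sketch below illustrates the general idea under stated assumptions: it is not the portal’s actual code, and simple scikit-learn classifiers stand in for the DBN, CNN, DCN, and DRN that the portal would apply to image data. All names, models, and the 8:2 split are illustrative.

```python
# Minimal sketch (not the portal's implementation): weight an ensemble of
# classifiers by their cross-validated accuracy, in the spirit of
# Large, Lines, & Bagnall (2019). Lightweight scikit-learn models stand in
# for the DBN/CNN/DCN/DRN named above.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)                       # toy image data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)                   # 8:2 split

models = {
    "mlp": MLPClassifier(max_iter=500, random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(random_state=0),
}

# Weight each model by its mean cross-validated accuracy on the training set.
weights = {name: cross_val_score(m, X_train, y_train, cv=5).mean()
           for name, m in models.items()}

# Fit every model, then combine class-probability predictions in proportion
# to the cross-validated weights.
proba = np.zeros((len(X_test), len(np.unique(y))))
for name, m in models.items():
    m.fit(X_train, y_train)
    proba += weights[name] * m.predict_proba(X_test)

ensemble_pred = proba.argmax(axis=1)
print("ensemble accuracy:", (ensemble_pred == y_test).mean())
```

In this sketch the weights are simply the mean cross-validated accuracies; a portal could instead select only the single best-performing algorithm, as described above, by keeping the model with the highest weight.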
The second step in the logic model is the activities. First, teachers need to upload the training data and testing data to train and test the algorithms. Figure 2 shows an example of a neural network, in which the inputs (x1, x2, x3, …) are multiplied by a set of weights (w) and passed to the next layer of nodes (the hidden layers). Researchers can set the batch size and the number of hidden layers manually before training. Each time new inputs pass through the hidden layers, the weights are updated by the function f(x, w) in each node. The outcome y of one layer becomes the input of the next layer, and the last layer outputs the results, which are compared with the expected values from the training data, yielding an error measure (Zhai, Haudek, Shi, Nehm, & Urban‐Lurain, 2020). This comparison also informs the weight optimizer’s adjustment of the weights to decrease errors in the next epoch. Once the errors show little further improvement, the learning process of the network ends. Second, teachers can load the prediction data into the algorithm that best optimizes the sets of algorithmic parameters.
Figure 2. An example of the neural network workflow.
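The short NumPy sketch below restates the workflow of Figure 2 in code under simplifying assumptions (one hidden layer, a toy dataset, plain gradient descent): inputs are multiplied by weights, passed through the hidden layer, the output is compared with the expected values, and the resulting error drives the weight update in the next epoch. All variable names and hyperparameters are illustrative.

```python
# Illustrative sketch of the training loop described for Figure 2
# (not the portal's actual networks).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # inputs x1, x2, x3
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(float).reshape(-1, 1)

W1 = rng.normal(scale=0.5, size=(3, 8))        # weights into the hidden layer
W2 = rng.normal(scale=0.5, size=(8, 1))        # weights into the output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(500):
    # forward pass: f(x, w) at each layer
    h = sigmoid(X @ W1)                        # hidden-layer outcomes
    y_hat = sigmoid(h @ W2)                    # network output

    error = y_hat - y                          # comparison with expected values
    # backward pass: the "weight optimizer" adjusts weights to reduce error
    delta_out = error * y_hat * (1 - y_hat)
    grad_W2 = h.T @ delta_out / len(X)
    grad_W1 = X.T @ ((delta_out @ W2.T) * h * (1 - h)) / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("final mean squared error:", float((error ** 2).mean()))
```

The mean squared error shrinks over the epochs; once it stops improving appreciably, training would end, mirroring the stopping condition described above.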
With the optimized algorithm model, the machine processes the prediction data within seconds and outputs the results in two formats: (1) a graphical and numerical score report and (2) machine-provided instruction guidance. These outputs of machine automatic scoring can guide teachers’ classroom instruction.
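As one hypothetical illustration of these two output formats, the sketch below summarizes machine-assigned scores into a numerical report, a bar chart, and per-score guidance text. The score scale, guidance wording, and student identifiers are all invented for the example and are not part of the portal’s specification.

```python
# Illustrative only: turning machine-assigned scores into a numerical/graphical
# score report and simple instruction guidance. Levels and text are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

predicted = pd.DataFrame({
    "student": ["S01", "S02", "S03", "S04", "S05"],
    "score":   [3, 1, 2, 3, 2],   # machine-assigned modeling scores (0-3)
})

guidance = {                       # hypothetical instruction guidance per level
    0: "Reteach what a scientific model represents.",
    1: "Prompt students to label the components of the model.",
    2: "Ask students to show relationships between components.",
    3: "Extend: have students use the model to predict a new case.",
}
predicted["guidance"] = predicted["score"].map(guidance)

# Numerical report: distribution of scores across the class
report = predicted["score"].value_counts().sort_index()
print(predicted)
print(report)

# Graphical report: bar chart of the score distribution
report.plot(kind="bar", xlabel="Model score", ylabel="Number of students",
            title="Class score distribution")
plt.tight_layout()
plt.show()
```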
The outputs will have short-term and intermediate outcomes regarding teachers’ instructional decision making and students’ customized learning guidance. In the short term, teachers will have a lighter grading load and can thus give students feedback in a timely fashion. The score report and guidance can also help teachers make better instructional decisions, providing students with customized help and learning guidance. In the intermediate term, teachers will improve their ability to connect the machine scores and guidance with their pedagogical content knowledge, which can be strengthened through additional instruction or training from disciplinary experts. It is critical to ensure that the scores are sensitive to learning quality so that the interpretation of the scores informs instructional planning, adjustment, and implementation. The machine scoring supports high-quality scoring through the best-optimized algorithm models. Teachers’ improved instructional decision making will benefit students significantly. Students will build increasing confidence in developing scientific models, given that they receive timely feedback and professional instruction. This scientific modeling competence will further lead students to develop high-level scientific skills and even pursue science-related careers.
The automatic scoring of visualized scientific modeling will have a great impact on science assessment practice and science education. Developing models to explain phenomena is a critical scientific practice in science classrooms (Zhai, Haudek, Stuhlsatz, & Wilson, 2020). However, student-developed models, expressed through either drawings or written responses, are time-consuming to score, which limits the use of modeling assessments. Without timely scored assessments that align with the vision of the NRC Framework (NRC, 2012) and the NGSS (NGSS Lead States, 2013), teachers may find it challenging to engage students in modeling practice and may fail to adjust instruction using timely feedback. With the aid of the AAVM web portal, however, teachers can score students’ drawings in a very short time, significantly reducing their scoring burden. Freed from this scoring challenge, teachers will be more willing than ever to use scientific modeling in science assessment. More importantly, increased attention to students’ scientific modeling, a critical scientific practice in the NRC framework (NRC, 2012), will foster the achievement of the three-dimensional learning goals suggested by the NRC framework (NRC, 2012) and the NGSS (NGSS Lead States, 2013) (i.e., scientific and engineering practices, crosscutting concepts, and disciplinary core ideas).
References
Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231.
National Research Council. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. Washington, DC: National Academies Press.
Large, J., Lines, J., & Bagnall, A. (2019). A probabilistic classifier ensemble weighting scheme based on cross-validated accuracy estimates. Data Mining and Knowledge Discovery, 33(6), 1674–1709.
NGSS Lead States. (2013). Next generation science standards: For states, by states. Washington, DC: The National Academies Press.
Zhai, X., Haudek, K. C., Shi, L., Nehm, R. H., & Urban‐Lurain, M. (2020). From substitution to redefinition: A framework of machine learning‐based science assessment. Journal of Research in Science Teaching, 57(9), 1430–1459.
Zhai, X., Haudek, K. C., Stuhlsatz, M. A., & Wilson, C. (2020). Evaluation of construct-irrelevant variance yielded by machine and human scoring of a science teacher PCK constructed response assessment. Studies in Educational Evaluation, 67, 100916.