summaryrefslogtreecommitdiff
path: root/aied2017/conclusion.tex
blob: e2cd71bbdf5539e5fdf8249f7ae37fce92c45a7c (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
\section{Conclusion}

AST patterns can be used to describe structural information in programs’ ASTs. By encoding only relations between selected nodes, each pattern can match many programs. AST patterns thus function as “regular expressions for trees”. We have used patterns as features for machine learning.

We presented a method for automatically extracting AST patterns from student programs. Our patterns encode relations between data objects in a program, with each pattern connecting either two instances of the same variable, a variable and a value, or two values. We consider such patterns as the atomic syntactic relations in a program, and use them as machine-learning features.

Evaluation shows that our patterns are useful for classifying Prolog programs. It is likely, however, that other programming languages will require different kinds of patterns. In Python, for example, variables can take on different values and appear in many places. This will likely require patterns relating more than two instances of a variable, or multiple variables.

We explained how patterns can be used as features to induced rules for classifying correct and incorrect programs. Since the goal of our research is to generate hints, we adapted the CN2 algorithm to produce rules useful for that purpose. We induce rules in two passes: we first learn the I-rules for incorrect programs, and then use the programs not covered by any I-rule to learn the C-rules for correct programs.

We showed how to generate hints based on I- and C-rules by highlighting buggy patterns or pointing out what patterns are missing. Evaluation on past student data shows that useful hints can be provided for many incorrect submissions this way. The quality of feedback could be improved by annotating rules with explanations in natural language. Furthermore, since patterns and rules are easily interpretable, they can also help when manually authoring tutoring systems, with I-rules indicating the common errors and C-rules the typical solution strategies for each problem.

In the future we will attempt to improve rule accuracy for certain problems, such as \code{union}. This will likely necessitate new kinds of patterns, for example to handle the cut operator. Adapting our methods to handle Python programs will give us some insight into what kinds of patterns could be useful in different situations. Finally, we will implement hint generation in an online programming tutor CodeQ, and evaluate the effect of automatic feedback on students’ problem-solving.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "aied2017"
%%% End: