aied2017/background.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26

\section{Background}

% solution strategies
Several programming tutors generate hints from differences between the student’s program and a predefined set of possible solutions. The possible solution strategies for each problem can be given as a set of programs, or specified in a domain-specific language. Both Johnson’s Pascal tutor~\cite{johnson1990understanding} and Hong’s Prolog tutor~\cite{hong2004guided} perform hierarchical goal decomposition based on predefined programming \emph{plans} or \emph{techniques} to determine the student’s intent. Gerdes et al. use a small set of annotated model programs to derive solution strategies, which function in a similar way~\cite{gerdes2016ask-elle}.

% canonicalization
Rivers and Koedinger compare students’ programs directly to a database of previous correct submissions~\cite{rivers2015data-driven}. They reduce program variability using equivalence-preserving transformations, such as inlining functions and reordering binary expressions. Hints are generated by suggesting a minimal correct step leading from the current submission to the closest correct program.

% unit tests
Above approaches directly compare ASTs, so they avoid the need for defining program features. Another option is to compare program behavior. Nguyen et al. use results from on a battery of unit tests to cluster incorrect programs~\cite{nguyen2014codewebs}. Li et al. generate test cases to distinguish between programs by selecting inputs that exercise different code paths in the program~\cite{li2016measuring}. Such tutors can point out pertinent failing test cases, which can be very helpful. However, to do any sort of conceptual analysis of student programs, we need to define some language for describing invariant properties of programs.

% constraints
\emph{Constraints}~\cite{mitrovic2012fifteen} encode domain principles using if-then rules with relevance and satisfaction conditions, e.g. “if a function has a non-void return type, then it must have a return statement”~\cite{holland2009j-latte}. If a program violates a constraint, the tutor displays a predefined message. Le’s Prolog tutor improves constraint-based diagnosis by assigning weights to different types of constraints~\cite{le2009using}.

% data-driven tutors
Jin et al. use \emph{linkage graphs} to describe data dependencies between the program’s statements~\cite{jin2012program}; we use AST patterns in a similar way.
Nguyen et al. analyzed submissions in a large machine-learning course to learn a vocabulary of \emph{code phrases}: subtrees of submissions’ abstract syntax trees that perform the same function in a given context. By swapping parts between different programs, they built up a search library of functionally equivalent AST subtrees within a given context.

% tree-regular expressions
The idea for AST patterns comes from Tregex -- \emph{tree regular expressions}, mainly used in the field of natural-language processing~\cite{levy2006tregex}. Tregex patterns can encode complex relations between nodes, but can become unwieldy; in this paper we use a simpler s-expression syntax. Another language for describing tree patterns using s-expressions is \emph{trx}, which additionally supports choice, repetition and other operators~\cite{bagrak2004trx}.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "aied2017"
%%% End: