From b6c03a4ac86e58d876b9a6a337b3afe71deb6c64 Mon Sep 17 00:00:00 2001 From: Timotej Lazar Date: Tue, 7 Feb 2017 00:28:59 +0100 Subject: Include comments --- aied2017/aied2017.tex | 6 +++--- aied2017/background.tex | 5 ++--- aied2017/introduction.tex | 12 ++++++------ aied2017/method.tex | 4 ++-- aied2017/patterns.tex | 21 ++++++++++++--------- 5 files changed, 25 insertions(+), 23 deletions(-) (limited to 'aied2017') diff --git a/aied2017/aied2017.tex b/aied2017/aied2017.tex index c8a6e9c..a253632 100644 --- a/aied2017/aied2017.tex +++ b/aied2017/aied2017.tex @@ -38,13 +38,13 @@ \begin{abstract} % motivation -When implementing a programming tutor, it is often difficult to consider all possible errors encountered by students. A possible alternative is to automatically learn a bug library of erroneous patterns from students’ programs. +When implementing a programming tutor, it is often difficult to manually consider all possible errors encountered by students. An alternative is to automatically learn a bug library of erroneous patterns from students’ programs. % learning -We propose using abstract-syntax-tree patterns as features for learning rules to distinguish between correct and incorrect programs. These rules can be used for debugging student programs: rules for incorrect programs (buggy rules) contain patterns indicating mistakes, whereas each rule for correct programs covers a subset of submissions sharing the same solution strategy. +We propose using abstract-syntax-tree (AST) patterns as features for learning rules to distinguish between correct and incorrect programs. These rules can be used for debugging student programs: rules for incorrect programs (buggy rules) contain patterns indicating mistakes, whereas each rule for correct programs covers a subset of submissions sharing the same solution strategy. % generating hints To generate hints, we first check all buggy rules and point out incorrect patterns. If no buggy rule matches, rules for correct programs are used to recognize the student’s intent and suggest patterns that still need to be implemented. % evaluation -We evaluated our approach on past student programming data for a number of Prolog problems. For many problems, the induced rules correctly classified over 90\% of programs based only on their structural features. For approximately 75\% of incorrect submissions we were able to generate hints that were implemented by the student in some subsequent submission. +We evaluated our approach on past student programming data for a number of Prolog problems. For 31 out of 44 problems, the induced rules correctly classified over 85\% of programs based only on their structural features. For approximately 73\% of incorrect submissions, we were able to generate hints that were implemented by the student in some subsequent submission. \\\\ \textbf{Keywords:} Programming tutors · Error diagnosis · Hint generation · Abstract syntax tree · Syntactic features \end{abstract} diff --git a/aied2017/background.tex b/aied2017/background.tex index bb6bed3..9457b35 100644 --- a/aied2017/background.tex +++ b/aied2017/background.tex @@ -7,14 +7,13 @@ Several programming tutors generate hints from differences between the student Rivers and Koedinger compare students’ programs directly to a database of previous correct submissions~\cite{rivers2015data-driven}. They reduce program variability using equivalence-preserving transformations, such as inlining functions and reordering binary expressions. Hints are generated by suggesting a minimal correct step leading from the current submission to the closest correct program. % unit tests -Above approaches directly compare ASTs, so they avoid the need for defining program features. Another option is to compare program behavior. Nguyen et al. use results from on a battery of unit tests to cluster incorrect programs~\cite{nguyen2014codewebs}. Li et al. generate test cases to distinguish between programs by selecting inputs that exercise different code paths in the program~\cite{li2016measuring}. Such tutors can point out pertinent failing test cases, which can be very helpful. However, to do any sort of conceptual analysis of student programs, we need to define some language for describing invariant properties of programs. +Another option is to compare program behavior. Nguyen et al. classify programs according to results on a preselected set of test inputs~\cite{nguyen2014codewebs}. Li et al. generate test cases to distinguish between programs by selecting inputs that exercise different code paths in the program~\cite{li2016measuring}. Such tutors can point out pertinent failing test cases, which can be very helpful. However, to do any sort of conceptual analysis of student programs, we need to define some language for describing invariant properties of programs. % constraints \emph{Constraints}~\cite{mitrovic2012fifteen} encode domain principles using if-then rules with relevance and satisfaction conditions, e.g. “if a function has a non-void return type, then it must have a return statement”~\cite{holland2009j-latte}. If a program violates a constraint, the tutor displays a predefined message. Le’s Prolog tutor improves constraint-based diagnosis by assigning weights to different types of constraints~\cite{le2009using}. % data-driven tutors -Jin et al. use \emph{linkage graphs} to describe data dependencies between the program’s statements~\cite{jin2012program}; we use AST patterns in a similar way. -Nguyen et al. analyzed submissions in a large machine-learning course to learn a vocabulary of \emph{code phrases}: subtrees of submissions’ abstract syntax trees that perform the same function in a given context. By swapping parts between different programs, they built up a search library of functionally equivalent AST subtrees within a given context. +Jin et al. use \emph{linkage graphs} to describe data dependencies between the program’s statements~\cite{jin2012program}; we use AST patterns in a similar way. Nguyen et al. analyzed submissions in a large machine-learning course to learn a vocabulary of \emph{code phrases}: subtrees of submissions’ abstract syntax trees that perform the same function in a given context~\cite{nguyen2014codewebs}. By swapping parts between different programs, they built up a search library of functionally equivalent AST subtrees within a given context. % tree-regular expressions The idea for AST patterns comes from Tregex -- \emph{tree regular expressions}, mainly used in the field of natural-language processing~\cite{levy2006tregex}. Tregex patterns can encode complex relations between nodes, but can become unwieldy; in this paper we use a simpler s-expression syntax. Another language for describing tree patterns using s-expressions is \emph{trx}, which additionally supports choice, repetition and other operators~\cite{bagrak2004trx}. diff --git a/aied2017/introduction.tex b/aied2017/introduction.tex index 790d1e7..13dc9ec 100644 --- a/aied2017/introduction.tex +++ b/aied2017/introduction.tex @@ -4,22 +4,22 @@ Programming education is becoming increasingly accessible with massive online courses. Since thousands of students can attend such courses, it is impossible for teachers to individually evaluate each participant’s work. On the other hand, in-time feedback that directly addresses students’ mistakes can aid the learning process. Providing feedback automatically could thus greatly enhance these courses. % ITS background -Traditional programming tutors use manually constructed domain models to generate feedback. Model-tracing tutors simulate the problem-solving \emph{process}: how students write programs. This is challenging because there are no well-defined steps when programming (as there are for example in chess). Many tutors instead only analyze individual programs submitted by students, and disregard how a program evolved. They often use models coded in terms of constraints or bug libraries~\cite{keuning2016towards}. +Traditional programming tutors use manually constructed domain models to generate feedback. Model-tracing tutors simulate the problem-solving \emph{process}: how students program. This is challenging because there are no well-defined steps when writing a program. Many tutors instead only analyze individual programs submitted by students, and disregard how a program evolved. They often use models coded in terms of constraints or bug libraries~\cite{keuning2016towards}. % data-driven domain modeling -Developing the domain model requires significant knowledge-engineering effort~\cite{folsom-kovarik2010plan}. This is particularly true for programming tutors, where most problems have several alternative solutions with many possible implementations~\cite{le2013operationalizing}. Data-driven tutors reduce the necessary effort by mining educational data -- often from online courses -- to learn common errors and generate feedback~\cite{rivers2015data-driven,nguyen2014codewebs,jin2012program}. +Developing a domain model requires significant knowledge-engineering effort~\cite{folsom-kovarik2010plan}. This is particularly true for programming tutors, where most problems have several alternative solutions with many possible implementations~\cite{le2013operationalizing}. Data-driven tutors reduce the necessary effort by mining educational data -- often from online courses -- to learn common errors and generate feedback~\cite{rivers2015data-driven,nguyen2014codewebs,jin2012program}. % problem statement -This paper addresses the problem of finding useful features to support data mining in programming tutors. Features should be robust against superficial or irrelevant variations in program code, and relatable to the knowledge components of the target skill (programming), so as to support hint generation. +This paper addresses the problem of finding useful features to support data mining in programming tutors. Features should be robust against irrelevant variations in program code (such as renaming a variable) and relatable to the knowledge components of the target skill (programming), so as to support hint generation. % our approach: patterns + rules -We describe features with \emph{AST patterns} that encode relations between nodes in a program’s abstract syntax tree. We use patterns that describe a path between pairs of leaf nodes referring to variables or values. By omitting some nodes on these paths, patterns can match different programs containing the same relation. We then induce rules to predict program correctness based on AST patterns, allowing us to generate hints based on missing or incorrect patterns. +We describe features with \emph{abstract-syntax-tree patterns} that encode relations between nodes in a program’s abstract syntax tree. We use patterns that describe a path between pairs of leaf nodes referring to variables or values. By omitting some nodes on these paths, patterns can match different programs containing the same relation. We then induce rules to predict program correctness from AST patterns, allowing us to generate hints based on missing or incorrect patterns. % evaluation -We evaluated our approach on existing Prolog programs submitted by students during past lab sessions of a second-year university course. Rule classification accuracy ranged between 85\% and 99\% for most problems. For 75\% of incorrect submissions we are able to suggest potentially useful patterns -- those that the student had actually implemented in the final, correct program. +We evaluated our approach on existing Prolog programs submitted by students during past lab sessions of a second-year university course. For 73\% of incorrect submissions we are able to suggest potentially useful patterns -- those that the students had actually implemented in the final, correct programs. % contributions -The main contributions presented in this paper are: AST patterns as features for machine learning, a rule-based model for predicting program correctness, and hints generated from incorrect or missing patterns in student programs. +The main contributions presented in this paper are: AST patterns as features for machine learning, a rule-based model for predicting program correctness, and hint generation from incorrect or missing patterns in student programs. %%% Local Variables: %%% mode: latex diff --git a/aied2017/method.tex b/aied2017/method.tex index 4df3989..5b7614b 100644 --- a/aied2017/method.tex +++ b/aied2017/method.tex @@ -1,6 +1,6 @@ \section{Method} -This section explains the three main components of our approach: extracting patterns from student submissions, learning classification rules for correct and incorrect programs, and using those rules to generate hints. +This section explains the three main components of our approach: automatically extracting patterns from student submissions, learning classification rules for correct and incorrect programs, and using those rules to generate hints. \subsection{Extracting patterns} \label{sec:extracting-patterns} @@ -18,7 +18,7 @@ we select the following pairs of nodes: \{\code{A},\,\code{A\textsf{'}}\}, \{\co For each selected pair of leaf nodes $(a,b)$ we build a pattern by walking the AST in depth-first order, and recording nodes that lie on the paths to $a$ and $b$. We omit \textsf{and} nodes, as explained in the previous section. We also include certain nodes that do not lie on a path to any selected leaf. Specifically, we include the functor or operator of all \textsf{compound}, \textsf{binop} or \textsf{unop} nodes containing $a$ or $b$. -Patterns constructed in this way form the set of features for rule learning. To keep this set at a reasonable size, we only use patterns that have appeared in programs submitted by at least five students. +Patterns constructed in this way form the set of features for rule learning. To keep this set at a reasonable size, we only use patterns that have appeared in programs submitted at least five times. \subsection{Learning rules} diff --git a/aied2017/patterns.tex b/aied2017/patterns.tex index 25a75e8..fe2ae6c 100644 --- a/aied2017/patterns.tex +++ b/aied2017/patterns.tex @@ -1,6 +1,6 @@ \section{AST patterns} -In this section we describe AST patterns through examples, while Sect.~\ref{sec:extracting-patterns} explains how patterns are extracted from student programs. Consider the following Prolog program implementing the relation \code{sister(X,Y)}\footnote{Meaning “\code{X} is a sister of \code{Y}”.}: +In this section we describe AST patterns through examples, while Sect.~\ref{sec:extracting-patterns} explains how patterns are extracted from student programs. Consider the following Prolog program implementing the relation \code{sister(X,Y)}\footnote{Binary relations like this one should be read as “\code{X} is a sister/parent/… of \code{Y}”.}: \begin{Verbatim} sister(X,Y):- % X is Y’s sister when: @@ -10,7 +10,7 @@ sister(X,Y):- % X is Y’s sister when: X \textbackslash{}= Y. % X and Y are not the same person. \end{Verbatim} -Figure~\ref{fig:sister} shows the program’s AST with two patterns. The pattern drawn with blue dotted arrows encodes the fact that the first argument to the \code{sister} predicate also appears in the call to \code{female}. In other words, this pattern states that \code{X} must be female to be a sister. We write the pattern as the s-expression +Figure~\ref{fig:sister} shows the program’s AST with two patterns. The pattern drawn with blue dotted arrows encodes the fact that the first argument to the \code{sister} predicate also appears in the call to \code{female}. In other words, this pattern states that \code{X} must be female to be a sister. We write this pattern as the s-expression \begin{Verbatim}[fontfamily=sf] (clause (head (compound (functor \code{sister}) (args var))) @@ -68,9 +68,11 @@ Figure~\ref{fig:sister} shows the program’s AST with two patterns. The pattern \label{fig:sister} \end{figure} -Patterns describe relations between nodes in a program’s AST. Specifically, the pattern ($a$ $b$ $c$) means that the nodes $b$ and $c$ are descended from $a$, and that $b$ precedes $c$ in a depth-first tree walk. In general, an AST matches the pattern (\textsf{name} $p_1$ … $p_k$) if it contains a node $n$ labeled \textsf{name}; the subtree rooted at $n$ must also contain, in depth-first order, distinct nodes $n_1$ to $n_k$ matching subpatterns $p_1$ to $p_k$. +Every pattern used in this paper has the same basic structure, and describes paths from a \textsf{clause} node to one or two leaf nodes containing variables or values. All patterns in Figs.~\ref{fig:sister} and~\ref{fig:sum} are induced from such node pairs. For each leaf we also include some local context, for example the predicate name (e.g. \texttt{parent}). + +We regard these patterns as the smallest units of meaning in Prolog programs: each pattern encodes some interaction between two parts of the program. Including more than two leaf nodes in a pattern could make it difficult to pinpoint the exact error when generating hints. Since a pattern contains at most two \textsf{var} nodes, we require they both refer to the same variable, since relating two nodes corresponding to different variables would not tell us much about the program. This allows us to omit variable names from patterns. -We handle syntactic variations in programs by omitting certain nodes from patterns. For example, by not including \code{and} nodes, the above pattern can match predicates with goals in any order (any arrangement of \code{and} nodes in the AST). This way, the same pattern will match any clause containing the “\textsf{sister}-\textsf{female}” relation, regardless of other goals. +We handle syntactic variations in programs by omitting certain nodes from patterns. For example, by not including \textsf{and} nodes, the above pattern can match a clause regardless of the presence (or order) of other goals in its body (any arrangement of \textsf{and} nodes in the AST). Order \emph{is} important for the nodes specified in the pattern; this is explained below. The second pattern in Fig.~\ref{fig:sister}, drawn with solid red arrows, encodes the fact that the two calls to \code{parent} share the first argument. In other words, \code{X}~and~\code{Y} must have the same parent~\code{P}. @@ -79,7 +81,7 @@ The second pattern in Fig.~\ref{fig:sister}, drawn with solid red arrows, encode (compound (functor \code{parent}) (args var))) \end{Verbatim} -All patterns used in this paper have the same basic structure: a union of paths from some \textsf{clause} to a pair of leaf nodes with variables (or values). Some local context is included with each leaf, such as the functor of a compound term. Each pattern only refers to a single variable, so we do not include its name with the \textsf{var} nodes. We regard such patterns as the basic units of meaning in Prolog programs: each pattern encodes an interaction between two parts of the program accessing the same object (variable). +Patterns describe relations between nodes in a program’s AST. Specifically, the pattern ($a$ $b$ $c$) means that the nodes $b$ and $c$ are descended from $a$, and that $b$ precedes $c$ in a depth-first tree walk. In general, an AST matches the pattern (\textsf{name} $p_1$ … $p_k$) if it contains a node $n$ labeled \textsf{name}; the subtree rooted at $n$ must also contain, in depth-first order, distinct nodes $n_1$ to $n_k$ matching subpatterns $p_1$ to $p_k$. A relation between any two objects in a program is insufficient to reason about the program’s behavior on the whole. In the tutoring context, however, there are patterns that strongly indicate the presence of certain bugs. Take for example the following incorrect program to sum a list: @@ -90,7 +92,7 @@ sum([H|T],Sum):- % \textit{recursive case:} Sum is Sum + H. % add first element (\textit{bug:} reused variable) \end{Verbatim} -This error is fairly common with Prolog novices: the variable \code{Sum} is used to represent both the sum of the whole list, and the sum of only the tail elements. The last line fails since Prolog cannot unify \code{Sum} with a (usually) different value of \code{Sum\,+\,H}. The program’s AST is displayed in Fig.~\ref{fig:sum}. +This error is fairly common with Prolog novices: the variable \code{Sum} is used to represent both the sum of the whole list (line 2), and the sum of only the tail elements (line 3). The last line fails since Prolog cannot unify \code{Sum} with a (usually) different value of \code{Sum\,+\,H}. The program’s AST is displayed in Fig.~\ref{fig:sum}. \begin{figure}[htbp] \centering @@ -146,7 +148,7 @@ for tree={ \path[-{Latex[length=1.5mm,width=1mm]},thick,relative,dashed,orange] (c2bb) edge[transform canvas={xshift=-1mm}] (c2bbv); \path[-{Latex[length=1.5mm,width=1mm]},thick,relative,dashed,orange] (c2bb) edge[transform canvas={xshift=0.9mm}] (c2bbo); \end{forest} - \caption{AST for the buggy \texttt{sum} program. Dotted arrows relate the correct values in the base case. Solid and dashed arrows denote two patterns describing incorrect reuse of the \code{Sum} variable.} + \caption{The AST for the buggy \texttt{sum} program. Dotted arrows relate the correct values in the base case. Solid and dashed arrows denote two patterns describing incorrect reuse of the \code{Sum} variable in the recursive case.} \label{fig:sum} \end{figure} @@ -161,7 +163,7 @@ Several patterns capture this mistake. Solid red arrows in Fig.~\ref{fig:sum} sh The second pattern, drawn with dashed orange arrows in the figure, indicates the likely error in the arithmetic expression: \begin{Verbatim}[fontfamily=sf] -(clause (binop (var \code{Sum}) \code{is} (binop (var \code{Sum}) \code{+}))) +(clause (binop var \code{is} (binop var \code{+}))) \end{Verbatim} The leftmost pattern in Fig.~\ref{fig:sum}, drawn with dotted blue arrows, describes the correct relation between the two constants in the base-case rule: @@ -173,7 +175,8 @@ The leftmost pattern in Fig.~\ref{fig:sum}, drawn with dotted blue arrows, descr \noindent We include such patterns in our feature set to cover the base-case clauses in recursive programs, which often include no variables. -While the patterns used in this paper appear to be useful for analyzing Prolog programs, it is likely that other kinds of patterns will be needed for other programming languages. In Python, for example, variables can take on different values and be accessed from many places. This will likely require patterns relating more than two instances of a variable, or multiple variables. +% TODO move to conclusion +%While the patterns used in this paper appear to be useful for analyzing Prolog programs, it is likely that other kinds of patterns will be needed for other programming languages. In Python, for example, variables can take on different values and be accessed from many places. This will likely require patterns relating more than two instances of a variable, or multiple variables. %%% Local Variables: %%% mode: latex -- cgit v1.2.1