diff options
author | Timotej Lazar <timotej.lazar@fri.uni-lj.si> | 2017-02-04 19:52:12 +0100 |
---|---|---|
committer | Timotej Lazar <timotej.lazar@fri.uni-lj.si> | 2017-02-05 03:27:09 +0100 |
commit | 8d5e1278e90d30c7fe79ce5c7d5d97b833853a68 (patch) | |
tree | eb7b1361ba1dfdeb409c719ace62593824ef09d9 /aied2017/method.tex | |
parent | 90ea92ad4c42a5fa980d55576cc96aa4a8fbf3a8 (diff) |
Add section on patterns
Diffstat (limited to 'aied2017/method.tex')
-rw-r--r-- | aied2017/method.tex | 26 |
1 files changed, 25 insertions, 1 deletions
diff --git a/aied2017/method.tex b/aied2017/method.tex index c86af75..9935005 100644 --- a/aied2017/method.tex +++ b/aied2017/method.tex @@ -1,5 +1,26 @@ \section{Method} +The following subsections explain the three main components of our approach: extracting patterns from student submissions, learning classification rules for correct and incorrect programs, and using those rules to generate hints. + +\subsection{Extracting patterns} +\label{sec:extracting-patterns} + +We extract patterns from student programs by selecting certain subsets of leaves in a program’s AST, and building up patterns that match nodes in those subsets. For this paper we always select pairs of nodes from the same clause: either two nodes referring to the same variable (like the examples above), or a value (such as \code{0} or the empty list \code{[]}) and another variable or value that occurrs in the same \code{compound} or \code{binop}. For example, in the clause\footnote{Occurrences of the three variables \code{A}, \code{B} and \code{C} are subscripted for disambiguation.} + +\begin{Verbatim} +a(A\textsubscript{1},B\textsubscript{1}):- + b(A\textsubscript{2},C\textsubscript{1}), + B\textsubscript{2} is C\textsubscript{2} + 18. +\end{Verbatim} + +\noindent +we would select the following sets of leaf nodes: \{\code{A\textsubscript{1}},\code{A\textsubscript{2}}\}, \{\code{B\textsubscript{1}},\code{B\textsubscript{2}}\}, \{\code{C\textsubscript{1}},\code{C\textsubscript{2}}\}, \{\code{B\textsubscript{2}},\code{18}\}, and \{\code{C\textsubscript{2}},\code{18}\}. + +We build a pattern for each set $S$ of selected leaf nodes by walking the AST in depth-first order, and recording nodes that lie on paths to elements of $S$. As explained above, we omit \code{and} nodes, allowing the pattern to generalize to more programs. Patterns also include certain nodes that do not lie on a path to any selected leaf. Specifically, for each included \code{compound} node we also include the corresponding \code{functor} with the predicate name. We also include the operator names (like \code{+} and \code{is}) for all unary and binary (\code{binop}) nodes in the pattern. + +Patterns constructed in this way form the set of features for rule learning. To keep this set at a reasonable size, we only use patterns that have appeared in submissions by at least five students. + + \subsection{Learning rules for correct and incorrect programs} \begin{figure}[t] \centering @@ -79,4 +100,7 @@ incorrect. \subsection{Generating hints from rules} - +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "aied2017" +%%% End:
\ No newline at end of file |