diff options
Diffstat (limited to 'aied2018/introduction.tex')
-rw-r--r-- | aied2018/introduction.tex | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/aied2018/introduction.tex b/aied2018/introduction.tex index 9c395a4..ab69147 100644 --- a/aied2018/introduction.tex +++ b/aied2018/introduction.tex @@ -6,11 +6,11 @@ Prompt, specific feedback could however greatly benefit learning. Furthermore, a Several attempts have been made to automatically discover commonalities in a set of programs~\cite{jin2012program,rivers2015data-driven,nguyen2014codewebs,hovemeyer2016control}. This would allow a teacher to annotate a representative subset of submissions with feedback messages, which could then be automatically propagated to similar programs. These techniques are used for instance by the OverCode tool to visualize variations in student programs~\cite{glassman2015overcode}. -This paper presents a new language for describing patterns in student code. Our approach is based on \emph{tree regular expressions} (TREs) used in natural language processing~\cite{levy2006tregex}. TREs are similar to ordinary regular expressions: they allow us to specify important patterns in a program’s abstract syntax tree (AST) while disregarding irrelevant parts. TREs are sufficiently expressive to represent the various concepts and errors in novice programs. +This paper presents a new language for describing patterns in student code. Our approach is based on \emph{tree regular expressions} (TREs) used in natural language processing~\cite{levy2006tregex}. TREs are similar to ordinary regular expressions: they allow us to specify important patterns in a program’s abstract syntax tree (AST) while disregarding irrelevant parts. We found that TREs are sufficiently expressive to represent various concepts and errors in novice programs. We have previously demonstrated this approach with Prolog programs~\cite{lazar2017automatic}. Here we refine the definition of AST patterns, and show that they can be applied to Python -- representing a different programming paradigm -- with only a few language-specific modifications. We also demonstrate that rules learned from such patterns can be easily interpreted. -Our approach is similar to CodeWebs, in which Nguyen et al. discover functionally equivalent program fragments -- subtrees of the AST that perform the same transformation from inputs to outputs~\cite{nguyen2014codewebs}. Piech et al. learn program embeddings to a similar end~\cite{piech2015learning}. Our work differs mainly in that TREs allow us to describe relations in (potentially disjoint) subparts of the program. We induce rules based on binary correct/incorrect labels and do not execute any program more than once. +In CodeWebs, Nguyen et al. look for functionally equivalent AST subtrees that perform the same transformation from inputs to outputs~\cite{nguyen2014codewebs}. Piech et al. learn program embeddings to a similar end~\cite{piech2015learning}. Our work differs from theirs mainly in that TREs allow us to describe relations in (potentially disjoint) subparts of the program. We induce rules based on binary correct/incorrect labels, and do not have to execute any submission more than once. Jin et al. use linkage graphs to represent dependencies between individual program statements based on the variables they contain~\cite{jin2012program}. This allows them to form equivalence classes of programs by ignoring unrelated statements. While TREs can encode such attributes, we use smaller patterns that encode dependencies between pairs of variables. Hovemeyer et al. focus on control structures such as loops and conditionals~\cite{hovemeyer2016control}. Our AST patterns include TREs that describe control flow, expressing the same features. |