author Timotej Lazar <timotej.lazar@fri.uni-lj.si> 2017-04-17 15:49:33 +0200
committer Timotej Lazar <timotej.lazar@fri.uni-lj.si> 2017-04-17 15:49:33 +0200
commit c3615b45b84e81e421763d76d01fa363dd79f985
tree 3fd0f1b7f6434239d729b2b8b159f418794cb9eb
parent 06e85a88efd8225e9e7044ddd09f3ee0406f4c6e
Add a missing word
-rw-r--r-- paper/evaluation.tex 2
1 file changed, 1 insertion, 1 deletion
diff --git a/paper/evaluation.tex b/paper/evaluation.tex
index 077b684..8e768b1 100644
--- a/paper/evaluation.tex
+++ b/paper/evaluation.tex
@@ -59,7 +59,7 @@ when compared to buggy hints: in the case of problem \code{sister} 84 out of 127
The last column shows the number of submissions where no hints could be generated. This value is relatively high
for the \code{is\_sorted} problem, because the algorithm could not learn any positive rules and thus no intent hints were generated.
-To sum up, buggy hints seem to be good and reliable, since they are always implemented when presented, even when we tested them on past data -- the decisions of students were not influenced by these hints. The percentage of implemented intent hints is, on average, lower (56\%), which is still not a bad result, providing that it is difficult to determine the programmer’s intent. In 12\% (244 out 2057) of generated intent hints, students implemented an alternative hint that was identified by our algorithm. Overall we were able to generate hints for 84.5\% of incorrect submissions. Of those hints, 86\% were implemented (73\% of all incorrect submissions).
+To sum up, buggy hints seem to be good and reliable, since they are always implemented when presented, even when we tested them on past data -- the decisions of students were not influenced by these hints. The percentage of implemented intent hints is, on average, lower (56\%), which is still not a bad result, providing that it is difficult to determine the programmer’s intent. In 12\% (244 out of 2057) of generated intent hints, students implemented an alternative hint that was identified by our algorithm. Overall we were able to generate hints for 84.5\% of incorrect submissions. Of those hints, 86\% were implemented (73\% of all incorrect submissions).
High classification accuracies for many problems imply that it is possible to determine program correctness simply by checking for the presence of a small number of patterns. Our hypothesis is that for each program certain crucial patterns exist that students have difficulties with. When they figure out these patterns, implementing the rest of the program is usually straightforward.