author	Martin Možina <martin.mozina@fri.uni-lj.si>	2017-02-03 13:56:49 +0100
committer	Martin Možina <martin.mozina@fri.uni-lj.si>	2017-02-03 13:56:49 +0100
commit	1ca4a764430bac01100ed0da0fa1582c42541ae2 (patch)
tree	e8db327b97b44fd9f79e561f9ab4a6fd5657ea94
parent	c0f4bbf1ba66c5531e560f2d63a55ef6ee189268 (diff)
Description of method (still very draft).
-rw-r--r--	aied2017/method.tex	83
1 file changed, 82 insertions(+), 1 deletion(-)
diff --git a/aied2017/method.tex b/aied2017/method.tex
index 3a14d53..c86af75 100644
--- a/aied2017/method.tex
+++ b/aied2017/method.tex
@@ -1 +1,82 @@
-\section{Method} \ No newline at end of file
+\section{Method}
+
+\subsection{Learning rules for correct and incorrect programs}
+\begin{figure}[t]
+\centering
+ \begin{enumerate}
+ \item Let $P$ be the data set of all student programs, each described with a set of AST patterns
+ and classified as correct (it passes the unit tests) or incorrect.
+ \item Let $learn\_rules(target, P, P_1, sig, acc)$ be a method that learns
+ a set of rules for class $target$ from data $P$, subject to two additional constraints:
+ every attribute-value pair in the condition part of a rule must be significant
+ with respect to the likelihood-ratio test ($p<sig$), and the classification accuracy
+ of each rule on data $P_1$ must be at least $acc$.
+ \item Let $I-rules = learn\_rules(incorrect, P, P, 0.05, 0.9)$
+ \item Let $P_c$ be data $P$ without programs that are already covered by $I-rules$
+ \item Let $C-rules = learn\_rules(correct, P, P_c, 0.05, 0.9)$
+ \item Return $I-rules$ and $C-rules$
+ \end{enumerate}
+ \caption{An outline of the algorithm for learning rules. The method $learn\_rules$,
+ which induces rules for a specific class, is a variant of the
+ CN2 algorithm~\cite{YYY} implemented within the Orange data-mining suite~\cite{XXX}.
+ In all our experiments, $sig$ was set to 0.05 and $acc$ was set to 0.9. }
+ \label{figure:algorithm}
+\end{figure}
+
+As described in the previous section, the feature space consists of all frequent AST patterns.
+Submitted programs are represented in this feature space and labeled as correct,
+if they pass all prespecified unit tests, or as incorrect otherwise. These labeled programs
+serve as the data set for machine learning.
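+
+A minimal sketch (in Python) of how such a data set could be assembled; the helper
+functions passed in as arguments are illustrative and not part of our implementation:
+\begin{verbatim}
+def build_data_set(programs, patterns, passes_tests, matches):
+    # Represent each program as a binary vector over frequent AST
+    # patterns and label it according to the unit tests.
+    data = []
+    for program in programs:
+        features = [1 if matches(program, p) else 0 for p in patterns]
+        label = "correct" if passes_tests(program) else "incorrect"
+        data.append((features, label))
+    return data
+\end{verbatim}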
+
+However, since we can always validate a program with unit tests,
+the goal of machine learning is not to classify new programs, but to discover patterns
+correlated with program correctness. This approach to machine learning is
+referred to as descriptive induction, that is, the automatic discovery of patterns describing
+regularities in data. We use rule learning for this task because rule-based models are easy to comprehend.
+
+Before explaining the algorithm, we need to discuss the reasons why a program
+can be incorrect. In our pedagogical experience, a student program is
+incorrect 1) if it contains an incorrect pattern that needs to be removed, or 2) if it
+lacks a certain relation (pattern) that must be included before the program can be correct.
+We now explain how both types of errors can be identified with rules.
+
+To discover patterns related to the first point, the algorithm first learns rules that describe
+incorrect programs. The conditions of these rules contain frequent patterns symptomatic of
+incorrect programs. Since rules are used to generate hints, and since hints should not be
+presented to students unless they are likely to be correct, we require that each learned rule's
+classification accuracy exceeds a certain threshold (in our experiments we used 90\%),
+that each conjunct in a condition is significant with respect to the likelihood-ratio test (with $p=0.05$ in our experiments),
+and that a conjunct can only specify the presence of a pattern. The first two constraints are needed
+to induce good rules with significant patterns, while the third constraint ensures that rules mention
+only the presence (and not the absence) of patterns as reasons for a program to be incorrect.
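+
+For illustration, a likelihood-ratio significance test of the kind used by CN2 can be
+sketched as follows (using \texttt{scipy} for the chi-square tail probability; the
+actual implementation in Orange may differ in detail):
+\begin{verbatim}
+from math import log
+from scipy.stats import chi2
+
+def significant(covered_counts, prior_counts, sig=0.05):
+    # covered_counts / prior_counts: class -> number of examples
+    # covered by the rule / in the whole data set.
+    n_covered = sum(covered_counts.values())
+    n_total = sum(prior_counts.values())
+    lrs = 0.0
+    for cls, observed in covered_counts.items():
+        expected = n_covered * prior_counts[cls] / n_total
+        if observed > 0:
+            lrs += 2 * observed * log(observed / expected)
+    dof = len(prior_counts) - 1   # two classes -> 1 degree of freedom
+    return chi2.sf(lrs, dof) < sig
+\end{verbatim}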
+
+With respect to the second type of error, we could try the same approach and learn rules for the class of
+correct programs. Given accurate rules for correct programs, the conditions of these rules would
+define sufficient combinations of patterns that render a program correct. However, it turns out
+to be difficult to learn accurate rules for correct programs, since such rules should contain all relevant patterns
+and exclude incorrect patterns, yet a conjunct can only
+specify the presence of a pattern. Even if specifying the absence of patterns were allowed in rule conditions,
+the learning problem would remain difficult, because there are usually many incorrect patterns.
+A possible way to solve this problem is to learn from a data set that does not contain programs covered by the rules for the incorrect class.
+In this way, all known incorrect patterns are removed from the data and no longer need to appear in rule conditions.
+However, removing these programs also removes the need for relevant patterns. For example, if all incorrect programs were removed,
+a rule ``IF True then class=correct'' would suffice.
+Such a rule does not contain any relevant patterns and could not be used to generate hints.
+We achieved the best results by learning from both data sets: the original data set (with all programs)
+is used to learn rules, while the filtered data are used to test whether a rule achieves the
+required classification accuracy (90\%).
+
+Figure~\ref{figure:algorithm} contains an outline of the algorithm. The rules describing
+incorrect programs are called $I-rules$ and the rules for correct programs are called $C-rules$.
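+
+A minimal sketch of this outline follows; the learner and the coverage test are
+passed in as functions (in our experiments the learner is a CN2 variant from the
+Orange suite), and the names below are illustrative:
+\begin{verbatim}
+def induce_rules(programs, learn_rules, covers, sig=0.05, acc=0.9):
+    # Learn I-rules: rules for incorrect programs, with accuracy
+    # evaluated on the full data set.
+    i_rules = learn_rules("incorrect", programs, programs, sig, acc)
+    # Remove programs covered by I-rules, so that known incorrect
+    # patterns need not appear in the conditions of C-rules.
+    filtered = [p for p in programs
+                if not any(covers(r, p) for r in i_rules)]
+    # Learn C-rules from all programs, but require the accuracy
+    # threshold to hold on the filtered data.
+    c_rules = learn_rules("correct", programs, filtered, sig, acc)
+    return i_rules, c_rules
+\end{verbatim}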
+
+Even though our main interest is the discovery of patterns, we can still use the induced rules to classify
+new programs, for example to evaluate the quality of the rules. The classification procedure has three steps:
+first, check whether an $I-rule$ covers the program to be classified, and if so,
+classify the program as incorrect. Second, check whether a $C-rule$ covers the program and classify
+it as correct if one does. Otherwise, if no induced rule covers the program, classify it as
+incorrect.
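+
+A sketch of this three-step procedure, assuming the same illustrative
+\texttt{covers} predicate as above:
+\begin{verbatim}
+def classify(program, i_rules, c_rules, covers):
+    if any(covers(r, program) for r in i_rules):
+        return "incorrect"   # an I-rule matches
+    if any(covers(r, program) for r in c_rules):
+        return "correct"     # no I-rule, but a C-rule matches
+    return "incorrect"       # not covered by any rule
+\end{verbatim}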
+
+\subsection{Generating hints from rules}
+
+
+