I am not a good reader. Whenever I stay at the library, it ends like this :
 |
| Get books |
 |
| Get to read |
 |
| Get bored |
 |
| Stress |
 |
| Get hungry |
But I have decided to change the habit. I will learn to be a good reader every now and then, starting with these three papers :
- Association Rule Mining as a Data Mining Technique -> Irina Tudor
- Mining Association Rules with Apriori -> Jinbo Paul Lin's resume
- Application using Data Mining Association Rules with Priori Method for Analysis of Data on The Market Basket Pharmacy Sales Transaction -> Leni Meiwati
Paper 1 :
- Data Mining is divided into two major classes, which are Supervised ( Bayesian, Neural Network, Decision Tree, Genetic Algorithm, Fuzzy Set, K-Nearest Neighbor ) and Unsupervised ( Association Rules and Clustering )
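- Two measures are used to evaluate an association rule X -> Y :
a. Support : how often X and Y appear together in the transactions
b. Confidence : how often Y appears in the transactions that contain X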
Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.
Case study : Market Basket Analysis
Association rule mining searches for interesting relationships among items in a given data set. Considering the example of a store that sells DVDs, Videos, CDs, Books and Games, the store owner might want to discover which of these items customers are likely to buy together.
- Suppose the minimum support count required is 2
- Let the minimum confidence required be 60%
- We have to find the frequent itemsets using the Apriori algorithm
- Association rules will be generated using the two parameters, minimum support and minimum confidence
Transactions :
- Customer A bought BOOKS, CD, VIDEO
- Customer B bought CD, GAMES
- Customer C bought CD, DVD
- Customer D bought BOOKS, CD, GAMES
- Customer E bought BOOKS, DVD
- Customer F bought CD, DVD
- Customer G bought BOOKS, DVD
- Customer H bought BOOKS, CD, DVD, VIDEO
- Customer I bought BOOKS, CD, DVD
1-Itemset
*/ { BOOKS }, support count 6
*/ { CD }, support count 7
*/ { VIDEO }, support count 2
*/ { GAMES }, support count 2
*/ { DVD }, support count 6
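To double-check the 1-itemset counts, here is a minimal Python sketch of the counting step. Python is my own choice here (the papers work in Java), and the name `transactions` and the set-per-customer layout are assumptions made only for this illustration.

```python
from collections import Counter

# The nine transactions listed above, one set of items per customer (A to I).
transactions = [
    {"BOOKS", "CD", "VIDEO"},         # A
    {"CD", "GAMES"},                  # B
    {"CD", "DVD"},                    # C
    {"BOOKS", "CD", "GAMES"},         # D
    {"BOOKS", "DVD"},                 # E
    {"CD", "DVD"},                    # F
    {"BOOKS", "DVD"},                 # G
    {"BOOKS", "CD", "DVD", "VIDEO"},  # H
    {"BOOKS", "CD", "DVD"},           # I
]

# Support count of an item = number of transactions that contain it.
item_counts = Counter(item for t in transactions for item in t)

for item in sorted(item_counts):
    print(item, item_counts[item])
```

Every item reaches the minimum support count of 2, so all five items move on to the 2-itemset step.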
2-Itemset
*/ { BOOKS, CD }, support count 4
*/ { BOOKS, VIDEO }, support count 2
*/ { BOOKS, GAMES }, support count 1
*/ { BOOKS, DVD }, support count 4
*/ { CD, VIDEO }, support count 2
*/ { CD, GAMES }, support count 2
*/ { CD, DVD }, support count 4
*/ { VIDEO, GAMES }, support count 0
*/ { VIDEO, DVD }, support count 1
*/ { GAMES, DVD }, support count 0
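This is called the prune step. Any itemset that has a support count less than the minimum support count required is removed from the pool of candidate itemsets. Here { BOOKS, GAMES }, { VIDEO, GAMES }, { VIDEO, DVD } and { GAMES, DVD } fall below the minimum support count of 2, so they are pruned.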
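The join and prune steps for the 2-itemsets can be sketched roughly as follows. This is only an illustration in Python under my own assumptions: `MIN_SUPPORT`, `support_count`, `frequent_2` and `pruned_2` are names I made up, not names from the papers.

```python
from itertools import combinations

# Same nine transactions as in the list above (customers A to I).
transactions = [
    {"BOOKS", "CD", "VIDEO"}, {"CD", "GAMES"}, {"CD", "DVD"},
    {"BOOKS", "CD", "GAMES"}, {"BOOKS", "DVD"}, {"CD", "DVD"},
    {"BOOKS", "DVD"}, {"BOOKS", "CD", "DVD", "VIDEO"}, {"BOOKS", "CD", "DVD"},
]
MIN_SUPPORT = 2  # minimum support count from the case study


def support_count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


# Frequent 1-itemsets: items whose support count reaches the minimum.
items = sorted(set().union(*transactions))
frequent_1 = [i for i in items if support_count(frozenset([i])) >= MIN_SUPPORT]

# Join step: pair up the frequent items to build candidate 2-itemsets.
candidates = [frozenset(pair) for pair in combinations(frequent_1, 2)]

# Prune step: keep only candidates that reach the minimum support count.
frequent_2 = {c: support_count(c) for c in candidates
              if support_count(c) >= MIN_SUPPORT}
pruned_2 = [c for c in candidates if c not in frequent_2]

print(len(candidates), "candidates,", len(frequent_2), "kept,", len(pruned_2), "pruned")
```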
3-Itemset
*/ { BOOKS, CD, VIDEO }, support count 2
*/ { BOOKS, CD, DVD }, support count 2
4-Itemset
*/ { BOOKS, CD, VIDEO, DVD }, support count 1 ( only Customer H ), so it is pruned and no frequent 4-itemset remains
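Putting the levels together, the whole search from 1-itemsets up to 4-itemsets can be sketched as one loop. This is a simplified Python sketch of the level-wise idea, not the implementation from any of the three papers; the function name `apriori` and its parameters are my own assumptions.

```python
from itertools import combinations

# Same nine transactions as in the list above (customers A to I).
transactions = [
    {"BOOKS", "CD", "VIDEO"}, {"CD", "GAMES"}, {"CD", "DVD"},
    {"BOOKS", "CD", "GAMES"}, {"BOOKS", "DVD"}, {"CD", "DVD"},
    {"BOOKS", "DVD"}, {"BOOKS", "CD", "DVD", "VIDEO"}, {"BOOKS", "CD", "DVD"},
]


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = sorted(set().union(*transactions))
    level = {frozenset([i]): count(frozenset([i])) for i in items}
    level = {s: n for s, n in level.items() if n >= min_support}

    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates with an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        # Count the survivors and keep those that reach the minimum support.
        level = {c: count(c) for c in candidates}
        level = {c: n for c, n in level.items() if n >= min_support}
    return frequent


frequent_itemsets = apriori(transactions, min_support=2)
for itemset, n in sorted(frequent_itemsets.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum support count of 2, the loop stops after the 3-itemsets, because the only 4-itemset candidate contains infrequent 3-item subsets.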
The final step is to provide the association rules from frequent itemsets.
- For each frequent itemset " a ", generate all non-empty subsets of " a "
- For every non-empty subset " s " of " a ", output the rule " s -> ( a-s )" if support count (a) / support count (s) >= min_conf
For example, let L = { BOOKS, CD, VIDEO }. Its non-empty proper subsets are { BOOKS, VIDEO }, { BOOKS, CD }, { CD, VIDEO }, { BOOKS }, { VIDEO }, { CD }
Let the minimum confidence threshold be 60 %
R1 : BOOKS and VIDEO -> CD
Confidence = support count { BOOKS, CD, VIDEO } /
support count { BOOKS, VIDEO }
= 2/2
= 100 % - R1 is selected
R2 : BOOKS and CD -> VIDEO
Confidence = support count { BOOKS, CD, VIDEO } /
support count { BOOKS, CD }
= 2/4
= 50 % - R2 is rejected
R3 : CD and VIDEO -> BOOKS
Confidence = support count { BOOKS, CD, VIDEO } /
support count { VIDEO, CD }
= 2/2
= 100 % - R3 is selected
R4 : BOOKS -> VIDEO and CD
Confidence = support count { BOOKS, CD, VIDEO } /
support count { BOOKS }
= 2/6
= 33 % - R4 is rejected
R5 : VIDEO -> BOOKS and CD
Confidence = support count { BOOKS, CD, VIDEO } /
support count { VIDEO }
= 2/2
= 100 % - R5 is selected
R6 : CD -> BOOKS and VIDEO
Confidence = support count { BOOKS, CD, VIDEO } /
support count { CD }
= 2/7
= 29 % - R6 is rejected
In this way, we have found three strong association rules : R1, R3 and R5.
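The six confidence checks above can be reproduced with a short Python sketch. Again, this is only my own illustration: `support_count`, `rules_from` and `min_conf` are hypothetical names, and the confidence is computed as support count (a) / support count (s) exactly as in the rule-generation step.

```python
from itertools import combinations

# Same nine transactions as in the list above (customers A to I).
transactions = [
    {"BOOKS", "CD", "VIDEO"}, {"CD", "GAMES"}, {"CD", "DVD"},
    {"BOOKS", "CD", "GAMES"}, {"BOOKS", "DVD"}, {"CD", "DVD"},
    {"BOOKS", "DVD"}, {"BOOKS", "CD", "DVD", "VIDEO"}, {"BOOKS", "CD", "DVD"},
]


def support_count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from(itemset, min_conf):
    """Return (antecedent, consequent, confidence) for each strong rule."""
    itemset = frozenset(itemset)
    total = support_count(itemset)
    selected = []
    # Try every non-empty proper subset s as the left-hand side of s -> (a - s).
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            s = frozenset(antecedent)
            confidence = total / support_count(s)
            if confidence >= min_conf:
                selected.append((sorted(s), sorted(itemset - s), confidence))
    return selected


strong = rules_from({"BOOKS", "CD", "VIDEO"}, min_conf=0.6)
for lhs, rhs, conf in strong:
    print(" and ".join(lhs), "->", " and ".join(rhs), f"({conf:.0%})")
```

Only the rules whose left-hand side contains VIDEO pass the 60 % threshold, which matches R1, R3 and R5.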
Paper 2 :
Preparation :
a. JAVA
* JDK
* NetBeans
b. MICROSOFT ACCESS
From this paper, I learned how to design a system with a DFD ( Data Flow Diagram ) :
I also learned how to design a database structure in Microsoft Access :
Paper 3 :
I learned more about JAVA in this paper. I learned the pseudocode of the algorithm, including the join step and the prune step.
I also learned how to create a Graphical User Interface ( GUI ), the class structure and another reference for the input :
 |
| Graphical User Interface |
|
**********
My question is : how will all of those references help me finish my final assignment ??
Figure it out now !!!