Apriori Algorithm

This project demonstrates the Apriori algorithm.

The program generates:

Frequent itemsets, using the F_{k-1} × F_1 and F_{k-1} × F_{k-1} candidate generation methods
Frequent Closed Itemsets
Maximal Frequent Itemsets
Association Rules
Enumeration of rules using confidence-based pruning
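
The two candidate generation methods named above can be sketched as follows. This is a minimal illustration, not the project's code: the class name CandidateGeneration and the methods extendWithSingles and mergePairs are hypothetical, and the sketch assumes items are integers and every itemset is kept sorted in ascending order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; names are illustrative and not taken from the project. */
final class CandidateGeneration {

    /** F_{k-1} x F_1: extend each frequent (k-1)-itemset with a frequent item larger than its last item. */
    static List<List<Integer>> extendWithSingles(List<List<Integer>> prev, List<Integer> frequentItems) {
        List<List<Integer>> candidates = new ArrayList<>();
        for (List<Integer> itemset : prev) {
            int last = itemset.get(itemset.size() - 1);
            for (int item : frequentItems) {
                // Only append larger items so each candidate is produced once, in sorted order.
                if (item > last) {
                    List<Integer> candidate = new ArrayList<>(itemset);
                    candidate.add(item);
                    candidates.add(candidate);
                }
            }
        }
        return candidates;
    }

    /** F_{k-1} x F_{k-1}: merge two itemsets that agree on everything except their last item. */
    static List<List<Integer>> mergePairs(List<List<Integer>> prev) {
        List<List<Integer>> candidates = new ArrayList<>();
        for (int i = 0; i < prev.size(); i++) {
            for (int j = i + 1; j < prev.size(); j++) {
                List<Integer> a = prev.get(i);
                List<Integer> b = prev.get(j);
                int n = a.size();
                // Assumes all itemsets in prev have the same size n.
                if (!a.subList(0, n - 1).equals(b.subList(0, n - 1))) {
                    continue;
                }
                int lastA = a.get(n - 1);
                int lastB = b.get(n - 1);
                if (lastA == lastB) {
                    continue; // identical itemsets, nothing to merge
                }
                List<Integer> candidate = new ArrayList<>(a.subList(0, n - 1));
                candidate.add(Math.min(lastA, lastB));
                candidate.add(Math.max(lastA, lastB));
                candidates.add(candidate);
            }
        }
        return candidates;
    }
}
```

Either method only proposes candidates. Each candidate still has to be checked against the minimum support before it counts as frequent, and candidates with an infrequent subset can be pruned before counting.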

The program has been tried on the following datasets:

Car: 1728 records, 7 attributes from https://archive.ics.uci.edu/ml/datasets/Car+Evaluation
Mushroom: 8124 records, 23 attributes from https://archive.ics.uci.edu/ml/datasets/Mushroom
Nursery: 12960 records, 9 attributes from https://archive.ics.uci.edu/ml/datasets/Nursery

These datasets are converted into a sparse binary matrix using binarizer.awk, by running the command:

$ gawk -f binarizer.awk datasetfile > matrixfile

where datasetfile is the dataset file from the UCI Machine Learning Repository, and matrixfile is the file in which the matrix is to be stored.

The script is written in Awk, a language designed for easy processing of text. As it scans the file, it records, for each column, the distinct attribute values in the order in which they are encountered. These are stored in an associative array.

After the entire file has been scanned, the script retrieves each line and replaces the attribute value in each column with either a 0 or a 1. The choice depends on whether the value lies in the first third of the distinct values encountered for that column.
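
The two passes described above can be sketched in Java as follows. This is not a port of binarizer.awk: the class BinarizerSketch and its binarize method are hypothetical, and two details are assumptions because the description does not settle them. The sketch assumes that values in the first third map to 1 (the rest to 0), and that "first third" means the value's position is below one third of the distinct count, rounded up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the two-pass binarization; not the project's binarizer.awk. */
final class BinarizerSketch {

    static int[][] binarize(List<String[]> rows) {
        int columns = rows.isEmpty() ? 0 : rows.get(0).length;

        // Pass 1: for each column, record distinct values in order of first appearance.
        List<List<String>> seen = new ArrayList<>();
        for (int c = 0; c < columns; c++) {
            seen.add(new ArrayList<>());
        }
        for (String[] row : rows) {
            for (int c = 0; c < columns; c++) {
                if (!seen.get(c).contains(row[c])) {
                    seen.get(c).add(row[c]);
                }
            }
        }

        // Pass 2: replace each value with 0 or 1 based on its position among the distinct values.
        int[][] matrix = new int[rows.size()][columns];
        for (int r = 0; r < rows.size(); r++) {
            for (int c = 0; c < columns; c++) {
                List<String> values = seen.get(c);
                int index = values.indexOf(rows.get(r)[c]);
                // Assumption: the first third (rounded up) of the distinct values maps to 1.
                matrix[r][c] = (index * 3 < values.size()) ? 1 : 0;
            }
        }
        return matrix;
    }
}
```

With three distinct values in a column, only the first value seen maps to 1 under these assumptions. With two distinct values, the first maps to 1 and the second to 0.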

The underlying meaning of the data is not entirely lost: correlated attribute values tend to appear in the same order, so there is an increased probability that two or more correlated attributes will be assigned the same binary values.

The main Apriori program is written in Java and requires the following parameters:

matrixfile
columnsfile
minsupportpercentage
minconfpercentage
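
To illustrate what the two percentage thresholds control, here is a minimal sketch of computing support and confidence over a binary matrix. The class RuleMetrics and its methods are hypothetical and not taken from the project. The sketch assumes the matrix is held in memory as int[][] with one row per record, that itemsets are arrays of column indices, and that both parameters are percentages between 0 and 100.

```java
/** Hypothetical helper; illustrates how percentage thresholds could be applied. */
final class RuleMetrics {

    /** Percentage of rows in which every column listed in items is 1. */
    static double supportPercent(int[][] matrix, int[] items) {
        if (matrix.length == 0) {
            return 0.0;
        }
        int count = 0;
        for (int[] row : matrix) {
            boolean containsAll = true;
            for (int item : items) {
                if (row[item] != 1) {
                    containsAll = false;
                    break;
                }
            }
            if (containsAll) {
                count++;
            }
        }
        return 100.0 * count / matrix.length;
    }

    /** Confidence of the rule antecedent -> consequent, as a percentage. */
    static double confidencePercent(int[][] matrix, int[] antecedent, int[] consequent) {
        // Support of the union of both sides, divided by support of the antecedent.
        int[] union = new int[antecedent.length + consequent.length];
        System.arraycopy(antecedent, 0, union, 0, antecedent.length);
        System.arraycopy(consequent, 0, union, antecedent.length, consequent.length);
        double antecedentSupport = supportPercent(matrix, antecedent);
        if (antecedentSupport == 0.0) {
            return 0.0; // the rule never applies, so report zero confidence
        }
        return 100.0 * supportPercent(matrix, union) / antecedentSupport;
    }
}
```

Under these assumptions, an itemset would count as frequent when supportPercent is at least minsupportpercentage, and a rule would be kept when confidencePercent is at least minconfpercentage. Whether the program itself uses an inclusive comparison is not stated here.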

It is advisable to run the program with a larger heap, for example by passing -Xmx8g to the JVM.
