Vol. 1, Issue 1, Part A (2024)

Frequent pattern mining with improved Apriori and FP-growth algorithms for big data applications

Author(s):

Arifa Akter and Rakibul Hasan

Abstract:

Frequent Pattern Mining (FPM) plays a foundational role in data mining by uncovering recurring relationships within large datasets, supporting applications such as market basket analysis, bioinformatics, cybersecurity, and recommendation systems. Traditional algorithms such as Apriori and FP-Growth, though well established, suffer from scalability and efficiency limitations when applied to massive, high-dimensional data. This study proposes optimized versions of both algorithms designed specifically for big data environments. Enhancements include hash-based candidate pruning, dynamic support thresholding, and transaction ID list indexing for Apriori, together with a memory-efficient, iterative FP-tree mining mechanism for FP-Growth. Both algorithms were implemented on Hadoop MapReduce and Apache Spark to exploit distributed computing. Experimental evaluations on the benchmark datasets Retail, Kosarak, and T40I10D100K showed improvements of up to 60% in execution time and up to 50% in memory usage without loss of pattern quality. These findings indicate that the proposed methods are scalable, efficient, and adaptable solutions for frequent pattern mining in real-world big data applications. Future extensions include real-time stream mining, privacy-preserving pattern discovery, and integration with intelligent decision-making systems.
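The abstract does not give implementation details, so the following is only a minimal single-machine sketch of one of the ideas it names: transaction ID (TID) list indexing in a level-wise Apriori search. It assumes an absolute support count and plain Python sets. The function name `apriori_tidlists` and the toy basket data are hypothetical, and the sketch does not cover hash-based pruning, dynamic thresholds, or the Hadoop and Spark implementations described by the authors.

```python
from itertools import combinations


def apriori_tidlists(transactions, min_support):
    """Level-wise frequent itemset mining with transaction-ID lists.

    transactions: list of iterables of hashable items
    min_support:  absolute minimum number of supporting transactions
    Returns a dict mapping frozenset(itemset) -> support count.
    """
    # Vertical index: each single item maps to the set of transaction IDs
    # that contain it, so support is just the size of that set.
    tidlists = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            tidlists.setdefault(frozenset([item]), set()).add(tid)

    # Frequent 1-itemsets.
    current = {s: t for s, t in tidlists.items() if len(t) >= min_support}
    frequent = {s: len(t) for s, t in current.items()}

    k = 2
    while current:
        candidates = {}
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k or union in candidates:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(union, k - 1)):
                continue
            # Support comes from intersecting TID lists; the original
            # transactions are never rescanned.
            tids = current[a] & current[b]
            if len(tids) >= min_support:
                candidates[union] = tids
        frequent.update({s: len(t) for s, t in candidates.items()})
        current = candidates
        k += 1
    return frequent


# Hypothetical toy data for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]
patterns = apriori_tidlists(baskets, min_support=3)
```

Because each candidate's support is obtained by intersecting the TID sets of two frequent subsets, the dataset is scanned only once to build the index. The trade-off is that the TID sets themselves must fit in memory, which is one reason a distributed setting would partition them.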

Pages: 01-09

How to cite this article:
Arifa Akter and Rakibul Hasan. Frequent pattern mining with improved Apriori and FP-growth algorithms for big data applications. J. Data Min. Anal. 2024;1(1):01-09.