Abstract
High utility sequential pattern (HUSP) mining has emerged as a novel topic in data mining; its computational complexity is higher than that of frequent sequence mining and high utility itemset mining. A number of algorithms have been proposed to solve this problem, but they mainly focus on mining HUSPs in static databases and do not take into account streaming data, where unbounded data arrive continuously and often at high speed. The efficiency of mining algorithms remains the main research topic in this field. In view of this, this paper proposes an efficient HUSP mining algorithm over data streams named HUSP-UT (utility on Tail Tree), which is based on a tree structure. Extensive experiments on real datasets show that HUSP-UT identifies high utility sequences efficiently. Compared with the state-of-the-art algorithm HUSP-Stream (HUSP mining over data streams) in our experiments, the proposed HUSP-UT performed significantly better, especially in time efficiency: it was up to one order of magnitude faster on some datasets.
Rights and permissions
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
About this article
Cite this article
Tang, H., Liu, Y. & Wang, L. A New Algorithm of Mining High Utility Sequential Pattern in Streaming Data. Int J Comput Intell Syst 12, 342–350 (2018). https://doi.org/10.2991/ijcis.2019.125905650
DOI: https://doi.org/10.2991/ijcis.2019.125905650