Parallel mining of frequent subtree patterns

Wenwen Qu, Da Yan, Guimu Guo, Xiaoling Wang, Lei Zou, Yang Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Mining frequent subtree patterns in a tree database (or forest) is useful in domains such as bioinformatics and mining semi-structured data. We consider the problem of mining embedded subtrees in a database of rooted, labeled, and ordered trees. We compare two existing serial mining algorithms, PrefixTreeSpan and TreeMiner, and adapt them for parallel execution using PrefixFPM, our general-purpose framework for frequent pattern mining that is designed to effectively utilize the CPU cores in a multicore machine. Our experiments show that TreeMiner is faster than its successor PrefixTreeSpan when a limited number of CPU cores is used, as its total mining workload is smaller; however, PrefixTreeSpan has a much higher speedup ratio and can beat TreeMiner when given enough CPU cores.

Original language: English (US)
Title of host publication: Software Foundations for Data Interoperability and Large Scale Graph Data Analytics - 4th International Workshop, SFDI 2020, and 2nd International Workshop, LSGDA 2020, held in Conjunction with VLDB 2020, Proceedings
Editors: Lu Qin, Wenjie Zhang, Ying Zhang, You Peng, Hiroyuki Kato, Wei Wang, Chuan Xiao
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 18-32
Number of pages: 15
ISBN (Print): 9783030611323
DOIs
State: Published - 2020
Externally published: Yes
Event: 4th International Workshop on Software Foundations for Data Interoperability, SFDI 2020 and 2nd International Workshop on Large Scale Graph Data Analytics, LSGDA 2020, held in Conjunction with VLDB 2020 - Tokyo, Japan
Duration: Sep 4 2020 → Sep 4 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1281
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 4th International Workshop on Software Foundations for Data Interoperability, SFDI 2020 and 2nd International Workshop on Large Scale Graph Data Analytics, LSGDA 2020, held in Conjunction with VLDB 2020
Country/Territory: Japan
City: Tokyo
Period: 9/4/20 → 9/4/20

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics
