Mining Evolution of Complex Structured Data

Sourav S Bhowmick
Associate Professor
School of Computer Engineering, Nanyang Technological University, Singapore

April 2, 14:00-17:00
Room: Science & Engineering Building, Room 103, Soochow University

Much real-life data can be represented as trees or graphs. Various data mining techniques have recently been proposed to mine such complex structured data. These techniques can be broadly classified into two categories: (i) A large number of algorithms have been designed to find different types of patterns in such data by treating it as snapshot data. These techniques are evolution-unaware. (ii) Recently, there have been increasing research efforts in a wide variety of domains, such as XML, the Web, and life sciences, that mine evolutionary features of tree- and graph-structured data to discover novel knowledge. Typically, such knowledge cannot be discovered by mining snapshot data. This tutorial focuses on the second category. That is, we highlight recent efforts in discovering novel knowledge from the historical evolution patterns of tree- and graph-structured data.

The tutorial is structured as follows. We motivate the necessity for mining evolution and give an overview of the evolutionary features of various types of tree and graph-structured data. Next, we identify various research issues involved in evolution mining. Specifically, our discussion can be categorized into the following three main components:

  • Study of evolution mining for tree-structured data. We use XML and web usage data as representatives of tree-structured data.
  • Study of evolution mining for graph-structured data. We use web communities, click-through data, and biological networks as representatives of graph-structured data.
  • Survey of application domains of evolution mining, such as social networks, blogs, web event detection, web personalization, XML query caching, protein function prediction, and protein-protein interaction prediction.

PDF: Mining Evolution of Complex Structured Data

Lecture Notes: PPTX PDF

Sourav S Bhowmick is an Associate Professor in the School of Computer Engineering, Nanyang Technological University, and the Director of the Centre for Advanced Information Systems (CAIS). He is currently a Visiting Associate Professor at the Biological Engineering Division, Massachusetts Institute of Technology (MIT), USA. He also holds the position of Singapore-MIT Alliance (SMA) Fellow in the Computation and Systems Biology program (2005-2008). Sourav received his Ph.D. in computer engineering in 2001. His current research interests include XML data management, systems biology data management, web data management, and data mining. He has published more than 100 papers in major international database and data mining conferences and journals such as VLDB, IEEE ICDE, ACM WWW, ACM SIGMOD, ACM SIGKDD, ACM CIKM, ER, PAKDD, IEEE TKDE, ACM CS, Information Systems, and DKE. Sourav serves as a PC member of various database conferences and workshops and as a reviewer for various database journals. He also serves as a program chair/co-chair of several international workshops in biological and XML data management. He is a member of the editorial boards of several international journals. He has given tutorials at ER 2006, APWeb 2008, and WAIM 2008. He has co-authored a book entitled "Web Data Management: A Warehouse Approach" (Springer-Verlag, October 2003). Sourav is a member of ACM and an affiliate member of IEEE.
Sourav received the Best Interdisciplinary Paper Award (along with Q Zhao, M Mohania, and Y Kambayashi) at ACM CIKM 2004 for the paper titled "Discovering Frequently Changing Structures from Historical Structural Deltas of Unordered XML". He was also nominated for the Excellence in Teaching Award for three consecutive years (2003-2005).