Mining Evolution of Complex Structured Data
Sourav S Bhowmick
School of Computer Engineering, Nanyang Technological University
April 2, 14:00-17:00
Room: Science & Engineering Building, Room 103, Soochow University
Much real-life data can be represented as trees or graphs. Different
data mining techniques have been recently proposed to mine such
complex structured data. These techniques can be broadly classified
into two categories: (i) A large number of algorithms have been
designed to find different types of patterns in such data by treating
it as snapshot data. These techniques are evolution-unaware. (ii)
Recently, there have been increasing research efforts in a wide variety
of domains such as XML, the Web, and life sciences that mine evolutionary
features of tree- and graph-structured data to discover novel knowledge.
Typically, such knowledge cannot be discovered by mining snapshot
data. This tutorial focuses on the second category. That is, we highlight
recent efforts in discovering novel knowledge from the historical
evolution patterns of tree- and graph-structured data.
The tutorial is structured as follows.
We motivate the necessity for mining evolution and give an overview
of the evolutionary features of various types of tree and graph-structured
data. Next, we identify various research issues involved in evolution
mining. Specifically, our discussion can be categorized into the
following three main components:
- Study of evolution mining for tree-structured
data. We use XML and web usage data as representatives of tree-structured data.
- Study of evolution mining for graph-structured
data. We use web communities, click-through data, and biological
networks as representatives of graph-structured data.
- Study of application domains. The tutorial session also reveals various application
domains of evolution mining (such as social networks, blogs, web
event detection, web personalization, XML query caching, protein
function prediction, and protein-protein interaction prediction).
Evolution of Complex Structured Data
Lecture Notes: PPTX
Sourav S Bhowmick is an Associate Professor in the School of Computer Engineering,
Nanyang Technological University, and the Director of the Centre for
Advanced Information Systems (CAIS). He is currently a Visiting Associate
Professor at the Biological Engineering Division, Massachusetts
Institute of Technology (MIT), USA. He also holds the position of
Singapore-MIT Alliance (SMA) Fellow in the Computation and Systems Biology
program (2005-2008). Sourav received his Ph.D. in computer engineering
in 2001. His current research interests include XML data management,
systems biology data management, web data management, and data mining.
He has published more than 100 papers in major international database
and data mining conferences and journals such as VLDB, IEEE ICDE,
ACM WWW, ACM SIGMOD, ACM SIGKDD, ACM CIKM, ER, PAKDD, IEEE TKDE,
ACM CS, Information Systems, and DKE. Sourav is serving as a PC
member of various database conferences and workshops and reviewer
for various database journals. He is also serving as a program chair/co-chair
of several international workshops in biological and XML data management.
He is a member of the editorial boards of several international
journals. He has given tutorials at ER 2006, APWeb 2008, and WAIM
2008. He has co-authored a book entitled "Web Data Management: A Warehouse
Approach" (Springer-Verlag, October 2003). Sourav is a member
of ACM and an affiliate member of IEEE.
Sourav received the Best Interdisciplinary Paper Award (along with
Q Zhao, M Mohania, and Y Kambayashi) at ACM CIKM 2004 for the paper
titled "Discovering Frequently Changing Structures from Historical
Structural Deltas of Unordered XML". He was also nominated for the
Excellence in Teaching Award for three consecutive years (2003 -