| ANNOUNCEMENTS |
With data being collected at an unprecedented rate in almost all fields of human endeavor, there is an emerging economic and scientific need to extract useful information from it. Data mining is the process of automatically discovering patterns, changes, associations, and anomalies in massive databases. It is a highly interdisciplinary field at the confluence of several disciplines, including database systems, data warehousing, machine learning, statistics, algorithms, data visualization, and high-performance computing. This seminar will provide an introductory survey of the main topics in data mining and knowledge discovery (including but not limited to classification, regression, clustering, association rules, trend detection, feature selection, similarity search, data cleaning, and privacy and security issues), as well as a wide spectrum of data mining applications such as bioinformatics, e-commerce, environmental study, financial market study, multimedia data processing, network monitoring, and social service analysis.
The first half of the semester will cover the principles and algorithms of data mining; the second half will emphasize performance issues and applications of data mining. The lectures are based on a collection of journal and conference papers and book chapters. A number of guest lectures by faculty members from other fields or departments will also be scheduled.
Each student in COMP 790-90 is expected to present a paper, lead the discussion following the presentation, and do a project on selected topics. Students in GNET 713 need to attend the first nine lectures and do a project on selected topics. There will be no homework or exams.
| Instructor: Wei Wang; Office: SN 329; Email: weiwang@cs.unc.edu; Voice: 1 (919) 962-1744; Office Hours: By Appointment |
References: No required textbook
| DATE | LECTURE NOTES | READING | PAPER PRESENTATION | PROJECT | |
| Jan. 10 | Introduction [PDF][PPT]; Association Rule (Part I) [PDF][PPT] | Association Rules I | |||
| Jan. 15 | Association Rule (Part II) [PDF][PPT] | Association Rules II | |||
| Jan. 17 | Association Rule (Part III) [PDF][PPT] | Association Rules III | |||
| Jan. 22 | Association Rule (Part IV) [PDF][PPT] | Association Rules IV | Recommendations | ||
| Jan. 24 | Clustering (Part I) [PDF][PPT] | Clustering I | |||
| Jan. 29 | Clustering (Part II) [PDF][PPT] | Clustering II | |||
| Jan. 31 | Clustering (Part III) [PDF][PPT] | Clustering III | Last Day to Select Presentation Paper | ||
| Feb. 5 | Classification (Part I) [PDF][PPT] | Classification I | |||
| Feb. 7 | Classification (Part II) [PDF][PPT] | Classification II | |||
| Feb. 12 | Sequence Clustering [PDF][PPT] | Sequence Clustering | Project Proposal Due | ||
| Feb. 14 | BiClustering (Part I) [PDF][PPT] | Bi-Clustering I | |||
| Feb. 19 | BiClustering (Part II) [PDF][PPT] | Bi-Clustering II | |||
| Feb. 21 | Mining Complex Data (Part I) [PDF][PPT] | Mining Complex Data I | |||
| Feb. 26 | Mining Complex Data (Part II) [PDF][PPT] | Mining Complex Data II | |||
| Feb. 28 | Semi-supervised Learning [PDF][PPT] | ||||
| Mar. 4 | Probabilistic Graphical Models [PDF][PPT] | ||||
| Mar. 6 | Jens Rantil [PDF] [ODP] [PPT] | Tracking multiple topics for finding interesting articles | |||
| Mar. 11 | Spring Break! | ||||
| Mar. 13 | Spring Break! | ||||
| Mar. 18 | Xin Huang [PDF] [PPT] | Detecting anomalous records in categorical datasets | |||
| Mar. 20 | Eric La Force [PDF] [PPT] | Show me the money!: deriving the pricing power of product features by mining consumer reviews | |||
| Mar. 25 | Tao Yu [PDF] [PPT] | Detecting time series motifs under uniform scaling | |||
| Mar. 27 | Stephan Altmueller [PDF] | Webpage understanding: an integrated approach | |||
| Apr. 1 | no class | ||||
| Apr. 3 | Man Lou [PDF] [PPT] | Efficient incremental constrained clustering | |||
| Apr. 8 | no class | ||||
| Apr. 10 | no class | ||||
| Apr. 15 | Ning Jin [PDF] [PPT] | Association analysis-based transformations for protein interaction networks: a function prediction case study | |||
| Apr. 17 | Vishnu Konda [PDF] [PPT] | SCAN: a structural clustering algorithm for networks | |||
| Apr. 22 | Ram Kumar [PDF] [PPT] | From frequent itemsets to semantically meaningful visual patterns | |||
| Apr. 24 | Stephen Olivier [PDF] | Parallel Mining of Frequent Closed Patterns: Harnessing Modern Computer Architectures | |||
| Apr. 25 | GNET 713 Final Project Due | ||||
| Apr. 28 | Eric La Force, Jens Rantil, Man Lou, Ning Jin, Ram Kumar, Stephan Altmueller, Stephen Olivier, Tao Yu, Vishnu Konda, Xin Huang; 4PM in SN 115 | COMP 790-90 Final Project Presentation | |||
| Apr. 29 | COMP 790-90 Final Project Due | ||||