Recommended Readings


January 13th: Association Rule Part I

  • R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD, 207-216, 1993.
  • R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB, 487-499, 1994.
  • S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic itemset counting and implication rules for market basket analysis. SIGMOD, 255-264, 1997.
  • J.S. Park, M.S. Chen, and P.S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD, 175-186, 1995.
  • A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. VLDB, 432-444, 1995.
  • H. Toivonen. Sampling large databases for association rules. VLDB, 134-145, 1996.
  • M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithms for discovery of association rules. Data Mining and Knowledge Discovery, 1:343-374, 1997.

    January 15th: Association Rule Part II

  • J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD, 1-12, 2000.
  • R. J. Bayardo. Efficiently mining long patterns from databases. SIGMOD, 85-93, 1998.
  • N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. ICDT, 398-416, 1999.
  • J. Pei, J. Han, and R. Mao. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. DMKD, 11-20, 2000.

    January 22nd: Association Rule Part III

  • R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and performance improvements. EDBT, 3-17, 1996.
  • J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. ICDE, 215-224, 2001.

    January 27th: Association Rule Part IV

  • J. Yang, W. Wang, P. S. Yu, and J. Han. Mining long sequential patterns in a noisy environment. SIGMOD, 406-417, 2002.

    January 27th: Clustering Part I

  • P. Berkhin. Survey of clustering data mining techniques, 2002.

    January 29th: Clustering Part II

  • R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. VLDB, 144-155, 1994.
  • T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: an efficient data clustering method for very large databases. SIGMOD, 103-114, 1996.
  • S. Guha, R. Rastogi, and K. Shim. Cure: an efficient clustering algorithm for large databases. SIGMOD, 73-84, 1998.

    February 3rd: Clustering Part III

  • M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases. KDD, 226-231, 1996.
  • W. Wang, J. Yang, and R. Muntz. STING: a statistical information grid approach to spatial data mining. VLDB, 186-195, 1997.

    February 5th: Clustering Part IV

  • G. Sheikholeslami, S. Chatterjee, and A. Zhang. WaveCluster: a multi-resolution clustering approach for very large spatial databases. VLDB, 428-439, 1998.
  • R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining. SIGMOD, 94-105, 1998.

    February 10th: Classification Part I

  • S. K. Murthy. Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining and Knowledge Discovery, 2(4), 345-389, 1998.
  • C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 121-168, 1998.

    February 12th: Classification Part II

  • B. Liu, W. Hsu, and Y. Ma. Integrating classification and association rule mining. KDD, 1998.

    February 17th: Sequence Clustering

  • J. Yang and W. Wang. Towards automatic clustering of protein sequences. CSB, 175-186, 2002.
  • J. Yang and W. Wang. CLUSEQ: efficient and effective sequence clustering. ICDE, 101-112, 2003.

    February 19th: Bi-Clustering I

  • Y. Cheng and G. M. Church. Biclustering of expression data. ISMB, 2000.
  • J. Yang, W. Wang, H. Wang, and P. Yu. Delta-cluster: capturing subspace correlation in a large data set. ICDE, 517-528, 2002.

    February 24th: Bi-Clustering II

  • H. Wang, W. Wang, J. Yang, and P. Yu. Clustering by pattern similarity in large data sets. SIGMOD, 394-405, 2002.
  • J. Liu and W. Wang. OP-Cluster: clustering by tendency in high dimensional space. ICDM, 187-194, 2003.
  • S. Yoon, C. Nardini, L. Benini, and G. De Micheli. Enhanced pClustering and its applications to gene expression data. Bioinformatics and Bioengineering, 2004.

    February 26th: Mining Complex Data Part I

  • M. J. Zaki. Efficiently mining frequent trees in a forest. SIGKDD, 71-80, 2002.

    March 3rd: Mining Complex Data Part II

  • X. Yan and J. Han. gSpan: graph-based substructure pattern mining. ICDM, 721-724, 2002.
  • J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. ICDM, 549-552, 2003.
    Wei Wang