SIAM Conference on Data Mining



And the winners are...

Best Algorithms Paper:

Clustering with Bregman Divergences
Arindam Banerjee (University of Texas, Austin), Srujana Merugu (University of Texas, Austin), Inderjit Dhillon (University of Texas, Austin), Joydeep Ghosh (University of Texas)

Best Applications Paper:

Enhancing Communities of Interest using Bayesian Stochastic Blockmodels
Deepak Agarwal (AT&T Laboratories - Research), Daryl Pregibon (AT&T Labs)

Best Student Paper:

Non-linear Manifold Learning For Data Stream
Martin H. C. Law (Department of Computer Science and Engineering, Michigan State University), Nan Zhang (Department of Computer Science and Engineering, Michigan State University), Anil Jain (Michigan State University)

Special thanks to our Sponsors...

SIAM and the Conference Organizing Committee would like to extend special thanks to IBM Research for sponsoring travel grants and to NASA for its generous Platinum-level contribution to the meeting.

IBM Research NASA

And also to the American Statistical Association, University of Minnesota, and Center for Applied Scientific
Computing/Lawrence Livermore National Laboratory
for participating as sponsors as well.



Accepted paper presenters for the conference should IMMEDIATELY review and respond accordingly to the instructions found:


***Please complete and return the copyright transfer form to SIAM IMMEDIATELY. (pdf file)***


Due to unfortunate circumstances, the Hyatt Orlando has closed its doors. As such, the SIAM International Conference on Data Mining will take place at:

Hilton in the Walt Disney World Resort
1751 Hotel Plaza Blvd.
Lake Buena Vista, Florida

Please take note of this change in location. ALL other dates and details remain unchanged.

Conference Participants!


Advances in information technology and data collection methods have led to the availability of large data sets in commercial enterprises and in a wide variety of scientific and engineering disciplines. We have an unprecedented opportunity to analyze this data and extract intelligent and useful information from it. The field of data mining draws upon extensive work in areas such as statistics, machine learning, pattern recognition, databases, and high performance computing to discover interesting and previously unknown information in data.

This conference will provide a forum for the presentation of recent results in data mining, including applications, algorithms, software, and systems. There will be peer reviewed, contributed papers as well as invited talks and tutorials. Best paper awards will be given in different categories. Proceedings of the conference will be available both online at the SIAM Web site and in hard copy form. In addition, several workshops on topics of current interest will be held on the final day of the conference.


Methods and Algorithms


Human Factors and Social Issues


Chandrika Kamath, Lawrence Livermore National Laboratory
David Skillicorn, Queen’s University


Umeshwar Dayal, Hewlett-Packard Laboratories
Michael W. Berry, University of Tennessee

Program Committee

Deepak K. Agarwal, AT&T Shannon Labs
Mihael Ankerst, The Boeing Company
Chid Apte, IBM T.J. Watson Research Center
Lars Asker, Stockholm University, Sweden
Daniel Barbara, George Mason University
Roberto J. Bayardo, IBM Almaden
Clifford Behrens, Telcordia Technologies, Inc.
Michael R. Berthold, Tripos, Inc.
Malú Castellanos, Hewlett-Packard Laboratories
Philip Chan, Florida Institute of Technology
Edward Chang, University of California
Sanjay Chawla, University of Sydney, Australia
Ming-Syan Chen, National Taiwan University
Alok Choudhary, Northwestern University
Chris Clifton, Purdue University
Corinna Cortes, AT&T Laboratories, Research
George Cybenko, Dartmouth College
Tamraparni Dasu, AT&T Laboratories - Research
Dennis DeCoste, California Institute of Technology
Inderjit S. Dhillon, University of Texas, Austin
Jennifer G. Dy, Northeastern University
Wei Fan, IBM T.J. Watson Research
Ronen Feldman, Bar-Ilan University, Israel
William R. Ferng, The Boeing Company
Peter A. Flach, University of Bristol, United Kingdom
Johannes Fuernkranz, Austrian Research Inst. for Artificial Intelligence, Austria
Minos Garofalakis, Bell Laboratories
Johannes Gehrke, Cornell University
Joydeep Ghosh, University of Texas, Austin
Sara James Graves, University of Alabama, Huntsville
Marko Grobelnik, J. Stefan Institute
Jiawei Han, University of Illinois, Urbana-Champaign
Howard Ho, IBM Almaden Research Center
Piotr Indyk, Massachusetts Institute of Technology
Bala Iyer, IBM Silicon Valley Lab
George Karypis, University of Minnesota
Daniel A. Keim, University of Constance, Germany
Eamonn Keogh, University of California, Riverside
Jacob Kogan, University of Maryland, Baltimore County
Helene E. Kulsrud, Center for Communications Research
Diane Lambert, Bell Laboratories, Lucent Technologies
Wenke Lee, Georgia Institute of Technology
King-Ip (David) Lin, University of Memphis
Jiming Liu, Hong Kong Baptist University, Hong Kong
Sheng Ma, IBM T.J. Watson Research Center
Vasileios Megalooikonomou, Temple University
Rajeev Motwani, Stanford University
Richard R. Muntz, University of California, Los Angeles
S. Muthukrishnan, Rutgers University and AT&T Research
Zoran Obradovic, Temple University
Sankar K. Pal, Indian Statistical Institute, Calcutta, India
Byung-Hoon Park, Oak Ridge National Laboratory
Haesun Park, University of Minnesota
Srinivasan Parthasarathy, Ohio State University
Jian Pei, State University of New York, Buffalo
David M. Pennock, Overture Services, Inc.
William M. Pottenger, Lehigh University
Raghu Ramakrishnan, University of Wisconsin-Madison
Luc De Raedt, Albert-Ludwigs-University Freiburg, Germany
Patricia J Riddle, University of Auckland, New Zealand
Greg Ridgeway, RAND
John Roddick, Flinders University, Australia
Joerg Sander, University of Alberta, Canada
Lorenza Saitta, University of Piemonte Orientale, Italy
David W. Scott, Rice University
Kyuseok Shim, Seoul National University, Korea
Simeon J. Simoff, University of Technology, Sydney, Australia
Krishnamoorthy Sivakumar, Washington State University
Myra Spiliopoulou, Otto-von-Guericke-Universitaet Magdeburg, Germany
Nicolas Spyratos, Universite Paris-Sud, France
Jaideep Srivastava, University of Minnesota
Domenico Talia, University of Calabria, Italy
Kai Ming Ting, Monash University, Australia
Hannu Toivonen, University of Helsinki, Finland
Shusaku Tsumoto, Shimane Medical University, Japan
Ramasamy Uthurusamy, General Motors Corporation
Jason T. L. Wang, New Jersey Institute of Technology
Haixun Wang, IBM T. J. Watson Research Center
Layne T. Watson, Virginia Polytechnic Institute and State University
Geoffrey I. Webb, Monash University, Australia
Sally Wood, University of New South Wales, Australia
Stefan Wrobel, Fraunhofer AIS and University of Bonn
Xindong Wu, University of Vermont
Xintao Wu, University of North Carolina, Charlotte
Philip S. Yu, IBM T.J. Watson Research Center
Osmar R. Zaiane, University of Alberta, Canada
Mohammed J. Zaki, Rensselaer Polytechnic Institute
Hongyuan Zha, Pennsylvania State University
Chengqi Zhang, University of Technology, Australia
Ning Zhong, Maebashi Institute of Technology, Japan


Vipin Kumar, Chair, AHPCRC, University of Minnesota
Steven Ashby, Lawrence Livermore National Laboratory
Umeshwar Dayal, Hewlett-Packard Laboratories
Usama Fayyad, Digimine
Robert Grossman, University of Illinois, Chicago
Jiawei Han, Univ. of Illinois at Urbana-Champaign
David Hand, Imperial College, UK
Heikki Mannila, Nokia
Tom Mitchell, Carnegie Mellon University
Andrew Odlyzko, DTC, University of Minnesota
N. Radhakrishnan, Army Research Laboratory
Jeffrey Ullman, Stanford University


Srinivasan Parthasarathy, Ohio State University


Hillol Kargupta, University of Maryland, Baltimore County


Sanjay Ranka, University of Florida


Aleksandar Lazarevic, University of Minnesota


Saso Dzeroski, Jozef Stefan Institute, Slovenia


John Roddick, Flinders University, Australia


Morgan C. Wang, University of Central Florida


Recent Advances in Bayesian Inference Techniques
Christopher M. Bishop, Microsoft Research Cambridge

Data Mining and Data Usability
Sara Graves, University of Alabama, Huntsville

Data Mining Research Questions Raised by Biological Data
C. David Page Jr., University of Wisconsin Medical School

Data Mining for Connecting the Dots
Ted Senator, DARPA


Manuscripts Due:
September 15, 2003 PASSED

Author Notification:
December 15, 2003

Camera Ready Papers:
January 9, 2004 PASSED


Submissions are opening soon! Please visit: for directions on how to register and submit a paper for consideration for the conference. The submission system will START accepting papers on August 25, 2003. CLOSED!

***Please note: The Submission site requires javascript and cookies to be activated on your browser.***

(Workshop and tutorial instructions can be found on this page.)


Papers submitted to the conference should not be under consideration by any other conference with published proceedings or by a journal. The work may be either theoretical or applied, but should make a significant contribution to the field. Papers should be at most 12 pages long (single-spaced, 2 columns, 10 point font, and a margin of at least 0.75 inch on each side), not counting the title page and references, but including tables and figures.

Please use US Letter (8.5" x 11") paper size. Papers must have a keyword list with no more than 6 keywords and an abstract with a maximum of 250 words.

Authors are strongly encouraged to submit their papers electronically in PDF format. For MS Word users, please convert your document to the PDF format.

LaTeX macros are available at Authors should use the SODA and Data Mining Proceedings Macro in particular, this is downloadable from




Clustering High Dimensional Data and its Applications

High Performance and Distributed Mining

Data Mining in Resource Constrained Environments

Link Analysis, Counter-terrorism, and Privacy

Mining Scientific and Engineering Datasets


Please visit HERE for a list of accepted tutorials.

The conference will feature workshops and tutorials on several special topics. Proposals for workshops and tutorials are due on September 3, 2003. PASSED


The SDM-2004 organizing committee is seeking high quality workshop proposals. Selected workshops will focus on new challenges and initiatives in data mining research and applications. They will foster the discussion of exciting research directions and works in progress through paper presentations, discussions, and invited talks. Each workshop will be either a full-day or a half-day event.

The responsibilities of the workshop organizers include (1) preparing the call for papers and publicizing it, (2) maintaining the workshop web site, (3) selecting the workshop organizing and program committees, (4) deciding the workshop program content, (5) selecting the papers through a peer review process, (6) delivering the proceedings to the press in time, and (7) delivering the final workshop program to the workshop chair in time.

Workshop Submission Instructions:

Workshop proposals should be sent via e-mail to the SDM-2004 Workshops Chair, Hillol Kargupta, before September 3, 2003. PASSED

A workshop proposal should include the following information:

a) Workshop title.
b) Full contact information of the organizers.
c) Description of the workshop including objectives, content, and format
of the workshop. Please indicate your preference regarding the length
of the workshop: Half-day or full-day.
d) List of potential attendees.
e) List of potential authors.
f) A short biography of each organizer.

Workshop Deadlines:

Deadline for proposal submission: September 3, 2003 PASSED
Decision notification: September 15, 2003
Call for workshop papers: October 1, 2003
Paper Submission Deadline: January 21, 2004
Acceptance notification to the authors: February 20, 2004
Camera-ready workshop proceedings: March 15, 2004
Conference dates: April 22, 23, and 24, 2004.

For any question regarding the workshops for SDM-2004, please contact:
Hillol Kargupta, University of Maryland Baltimore County and Agnik, LLC.

Contact Address:

Hillol Kargupta
Associate Professor
Department of Computer Science and Electrical Engineering
1000 Hilltop Circle, University of Maryland, Baltimore County
Baltimore, MD 21250
E-mail: [email protected]


The SIAM Data Mining (SDM04) Organizing Committee invites proposals for tutorials to be held in conjunction with the conference. Tutorials are an effective way to educate and/or provide the necessary background to the intended audience enabling them to understand technical advances. For SDM04, we are seeking proposals for tutorials on all topics related to data mining. A tutorial may be a theme-oriented comprehensive survey, discuss novel data mining techniques or may center around successful and timely application of data mining in important application areas (e.g. medicine, national security, scientific data analysis). For examples of typical SIAM tutorials, see the set of accepted tutorials at previous SIAM conferences (SDM01, SDM02 and SDM03).

Tutorials are open to all conference attendees without any extra fees. The typical tutorial will be 2 hours long (longer tutorials will be considered) and will be held in parallel with two paper presentation tracks during the main conference program. This format encourages participation. Previous SDM conferences attracted 50 to 100 attendees per tutorial.

Proposals should be submitted electronically by September 3 PASSED to [email protected] in PDF format (for other formats please contact the tutorial chair first). Proposals should include the following:

  • Basic information: Title, brief description, name and contact information for each tutor, length of the proposed tutorial. If the intended tutorial is expected to take longer than 2 hours, a rationale is expected. Also identify any other venues in which the tutorial has been or will be presented.
  • Audience: Proposals must clearly identify the intended audience for the tutorial (e.g., novice, intermediate, expert).
    • What background will be required of the audience?
    • Why is this topic important/interesting to the SIAM data mining community?
    • What is the benefit to participants?
    • Provide some informal evidence that people would attend (e.g., related workshops).
  • Coverage: Enough material should be included to provide a sense of both the scope of material to be covered and the depth to which it will be covered. The more details that can be provided, the better (up to and including links to the actual slides or viewgraphs). Note that the tutors should not focus mainly on their own research results. If, for certain parts of the tutorial, the material comes directly from the tutors' own research or product, please indicate this clearly in the proposal.
  • Bios: Provide brief biographical information on each tutor (including qualifications with respect to the tutorial's topic).
  • Special equipment (if any): Please indicate any additional equipment needed (if any). The standard equipment includes an LCD projector, an overhead projector, a single projection screen and microphones.


  • Submission: September 3, 2003 PASSED
  • Decision Notification: October 1, 2003 PASSED
  • Complete Set of Tutorial Viewgraphs (Slides): February 15, 2004 PASSED


Program information will be available in January


Registration information is now available!


General information is now available!


Hotel information is now available!

Audio-Visual Policy

Standard A/V Set-Up in Meeting Rooms

All other concurrent breakout rooms will have one overhead projector, one screen, and a data projector. Speakers may order additional audio/visual equipment, other than the standard A/V set-up listed above, by contacting [email protected].