We are very pleased to announce two exciting tutorials at SDM 2005.

The first of these, which has never been presented anywhere before, offers a comprehensive overview of Segmentation Algorithms for Time Series and Sequence Data. The presenters are, respectively, a rising star and one of the founding fathers of data mining.

The second tutorial offers an overview of Pattern Discovery in Biosequences, made accessible to the general data mining community. It has been adapted from a very well-received tutorial presented to the molecular biology community. The presenter, who has published extensively in both the data mining and bioinformatics areas, is in a unique position to introduce this material to the data mining community.

*Segmentation Algorithms for Time Series and Sequence Data*

By Aristides Gionis and Heikki Mannila, Helsinki Institute for Information Technology

The objective of the tutorial is to give a comprehensive overview of the area of segmenting sequential data. Segmentation methods have been used successfully for extracting higher-order structure in sequences. Applications include improving the accuracy of clustering and classification tasks, extracting patterns in time series, improving the efficiency of indexing algorithms, decomposing complex sequences into a small number of discrete states, and discovering structure in genomic sequences, among other tasks. The tutorial will focus on presenting the necessary background for sequence segmentation, discussing recent advances in the area, and demonstrating the usefulness of the methods in various application domains.

*Aristides Gionis* received his Ph.D. from Stanford University in 2003 and is currently a senior researcher at the Basic Research Unit of the Helsinki Institute for Information Technology. His research experience includes summer internships at Bell Labs, AT&T Labs, and Microsoft Research. His research areas are data mining, algorithms, and databases.

*Heikki Mannila* is the research director of the Basic Research Unit of the Helsinki Institute for Information Technology, a joint research unit of the University of Helsinki and Helsinki University of Technology. He is also a professor of computer science at Helsinki University of Technology. He received his Ph.D. in 1985, and has been a professor at the University of Helsinki, a senior researcher at Microsoft Research, Redmond, a research fellow at Nokia Research Center, and a visiting researcher at the Max Planck Institute for Computer Science and at the Technical University of Vienna. His research areas are data mining, algorithms, and databases. He is the author of two books and over 120 scientific publications. He is a member of the Finnish Academy of Science and Letters and editor-in-chief of the journal Data Mining and Knowledge Discovery.

Dr. Mannila is the recipient of the 2003 ACM SIGKDD Innovation Award.

*Pattern Discovery in Biosequences*

By Stefano Lonardi, University of California, Riverside

The tutorial describes recent algorithms for pattern discovery in biosequences. We first classify the methods by the type of pattern they are designed to find: deterministic patterns, rigid patterns, or profiles. Deterministic patterns are simply words over the alphabet, whereas rigid patterns allow substitutions but have a fixed length. Profiles are matrices in which each position is associated with a probability distribution over the symbols of the alphabet.
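As a rough illustration of the three pattern types, the Python sketch below tests a fixed-length window of DNA against each one. It is not taken from the tutorial: the function names, the example profile, and the choice to model a rigid pattern as a word matched within a bounded number of substitutions are all assumptions made for this example.

```python
from typing import Dict, List


def matches_deterministic(pattern: str, window: str) -> bool:
    """A deterministic pattern is a plain word, so it matches only itself."""
    return pattern == window


def matches_rigid(pattern: str, window: str, max_substitutions: int) -> bool:
    """A rigid pattern has a fixed length but tolerates substitutions.

    Assumption: substitutions are counted as mismatching positions.
    """
    if len(pattern) != len(window):
        return False
    mismatches = sum(1 for p, w in zip(pattern, window) if p != w)
    return mismatches <= max_substitutions


def profile_probability(profile: List[Dict[str, float]], window: str) -> float:
    """A profile holds one probability distribution per position.

    The probability of a window is the product of the per-position
    probabilities of its symbols.
    """
    if len(profile) != len(window):
        return 0.0
    probability = 1.0
    for column, symbol in zip(profile, window):
        probability *= column.get(symbol, 0.0)
    return probability


# Hypothetical three-position profile over the DNA alphabet.
example_profile = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]

print(matches_deterministic("ACG", "ACG"))
print(matches_rigid("ACG", "ATG", max_substitutions=1))
print(profile_probability(example_profile, "AGC"))
```

The deterministic test is all-or-nothing, the rigid test relaxes it by a mismatch budget, and the profile replaces the yes/no answer with a probability.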

*Stefano Lonardi* is an Assistant Professor at the University of California, Riverside. He belongs to the Computational Biology group and is also a faculty member of the Genetics program.

Stefano received his Ph.D. in the summer of 2001 from the Department of Computer Sciences, Purdue University, West Lafayette, IN. His thesis, supervised by Prof. A. Apostolico, is entitled "Global Detectors of Unusual Words: Design, Implementation, and Applications to Pattern Discovery in Biosequences".

He also received the Dottorato di Ricerca degree in Computer and Electrical Engineering from the Department of Electrical and Computer Engineering, University of Padua, Italy. During the summer of 1999, he was an intern at Celera Genomics, Department of Informatics Research, Rockville, MD, working under the supervision of E. W. Myers. He has published papers in several journals, including Science, Journal of Computational Biology, Proceedings of the IEEE, and Bioinformatics, among others.


The SIAM Data Mining (SDM05) Organizing Committee invites proposals for tutorials to be held in conjunction with the conference. Tutorials are an effective way to educate the intended audience and provide the background needed to understand technical advances. For SDM05, we are seeking proposals for tutorials on all topics related to data mining. A tutorial may be a theme-oriented comprehensive survey, may discuss novel data mining techniques, or may center on successful and timely applications of data mining in important areas (e.g., medicine, national security, scientific data analysis). For examples of typical SIAM tutorials, see the sets of accepted tutorials at previous SIAM conferences (SDM01, SDM02, and SDM03).

Tutorials are open to all conference attendees at no extra fee. A typical tutorial will be 2 hours long (longer tutorials will be considered) and held in parallel with two paper presentation tracks during the main conference program. This format encourages participation. Tutorials at previous SDM conferences attracted up to 100 attendees each.

Proposals should be submitted electronically by September 3 to:

Eamonn Keogh
Assistant Professor
Computer Science & Engineering Department
Surge building
University of California, Riverside
Riverside, CA 92521

in PDF format (for other formats please contact the tutorial chair first). Proposals should include the following:

  • Basic information: Title, brief description, name and contact information for each tutor, and length of the proposed tutorial. If the tutorial is expected to take longer than 2 hours, a rationale is expected. Also identify any other venues at which the tutorial has been or will be presented.
  • Audience: Proposals must clearly identify the intended audience for the tutorial (e.g., novice, intermediate, expert).
    • What background will be required of the audience?
    • Why is this topic important/interesting to the SIAM data mining community?
    • What is the benefit to participants?
    • Provide some informal evidence that people would attend (e.g., related workshops).
  • Coverage: Enough material should be included to provide a sense of both the scope of the material to be covered and the depth to which it will be covered. The more detail that can be provided, the better (up to and including links to the actual slides or viewgraphs). Note that the tutors should not focus mainly on their own research results. If, for certain parts of the tutorial, the material comes directly from the tutors' own research or product, please indicate this clearly in the proposal.
  • Bios: Provide brief biographical information on each tutor (including qualifications with respect to the tutorial's topic).
  • Special equipment (if any): Please indicate any additional equipment needed. The standard equipment includes an LCD projector, an overhead projector, a single projection screen, and microphones.


  • Submission: September 3, 2004
  • Decision Notification: October 1, 2004
  • Complete Set of Tutorial Viewgraphs (Slides): February 11, 2005


Last Edited: 11/17/04