T2: E-Commerce and Clickstream Mining
Ronny Kohavi, Blue Martini Software, Inc.
Jon Becher, Accrue Software, Inc.
Sites conducting electronic commerce or providing content can generate
significant amounts of data that can yield insight into their visitors,
allowing for improved interactions through personalization. The tutorial
reviews the value proposition for mining e-commerce and clickstream data,
presents several case studies, and takes the attendees through possible
architectures for data collection, warehousing, analysis, and closing
of the loop back to the web site. We review data collection through web
logs, sniffing, and application server logging. We look at the advantages
and disadvantages of building a data warehouse, and present implicit and
explicit ways to "close the loop." Finally, we review the building blocks
of successful solutions for analysis, including data transformations,
reports, visualizations, OLAP, and mining algorithms.
Ronny Kohavi is the director of data mining at Blue Martini Software,
where he heads the engineering group responsible for the data collection,
analysis, reporting, and campaign management modules in the company's
Customer Interaction System. Prior to joining Blue Martini, Kohavi managed
the MineSet project, Silicon Graphics' award-winning product for data
mining and visualization. He joined Silicon Graphics after earning a Ph.D.
in Machine Learning from Stanford University, where he led the MLC++ project,
the Machine Learning library in C++ now used in MineSet and at Blue Martini
Software. Kohavi received his BA from the Technion, Israel. He co-chaired
KDD 99's industrial track with Jim Gray and the KDD Cup 2000 with Carla
Brodley. He co-edited with Foster Provost the special issue of the journal
Machine Learning on Applications of Machine Learning and the special issue
of the Data Mining and Knowledge Discovery journal on Applications of
Data Mining to Electronic Commerce (to appear in 2001). He has been a member
of the editorial board for the Data Mining and Knowledge Discovery journal
since its inception and served as a member of the editorial board for the
journal Machine Learning from 1997 to 1999.
Jonathan Becher is VP of Product Strategy and Business Development at
Accrue Software, where he is responsible for defining the product direction
for the leading scalable Web analysis solution. Becher joined Accrue through
its January 2000 acquisition of NeoVista Software, where he was President,
CEO, and co-founder. While at NeoVista, he was a principal designer of
an enterprise-class knowledge discovery workbench and led the implementation
of the retail industry's first supply chain optimization solution based
on data mining. Prior to NeoVista, Becher led the Application Development
and the Professional Services organizations at MasPar Computer Corporation,
where he defined and helped build multiple production data mining applications.
Earlier in his career, he was a senior software engineer at the MicroElectronics
Center of North Carolina, where he developed advanced pattern recognition
systems. Becher holds a B.S. from the University of Virginia in computer
engineering and an M.S. in computer science from Duke University.