A Data Science Central Community
Date: Monday, January 24, 2011; 6:30 – 9:00 pm (6:30 – 7:00 networking & snacks; 7:00 – 7:10 announcements;
7:10+ presentation, Q&A)
Cost: Free and open to all who wish to attend. Membership is only $20/year. Anyone may join our mailing list at no
charge and receive announcements of upcoming events.
Speaker: Giovanni Seni, PhD
Title: On Diversity, Complexity, and Regularization in Ensemble Models
The discovery of ensemble methods is one of the most influential developments in Data Mining and Machine Learning in the past decade. These methods combine multiple models into a single predictive
system that is more accurate than even the best of its components. The
use of ensemble methods can provide a critical boost to existing systems addressing the hardest of industrial challenges – from investment
timing to drug discovery, from fraud detection to recommendation systems
– where predictive accuracy is vital. This talk, based on a recently
published book by the speaker, offers a concise introduction to this
breakthrough topic. After a sketch of the major concerns in predictive
learning, the talk will give an overview of regularization, a key
concept driving the superior performance of modern ensemble
algorithms. It then takes a shortcut into the heart of the popular
tree-based ensemble creation strategies using recent developments from
the frontiers of statistics, where research efforts are now focused on
explaining and harnessing the mysteries of ensembles.
Giovanni Seni is a Senior Scientist with Elder Research, Inc. (ERI) and directs ERI’s Western office. As an active data mining practitioner in Silicon Valley, he has over 15 years of R&D experience in
statistical pattern recognition, data mining, and human-computer
interaction applications. He has been a member of the technical staff at
large technology companies, and a contributor at smaller organizations.
He holds five US patents and has published over twenty conference and
journal articles. His book with John Elder, “Ensemble Methods in Data
Mining – Improving accuracy through combining predictions”, was
published in February 2010 by Morgan & Claypool. Giovanni is also an
adjunct faculty member in the Computer Engineering Department of Santa Clara
University, where he teaches an Introduction to Pattern Recognition and
Data Mining class.