Over the years, the magazine publishing industry has made significant strides in improving subscription-based circulation by developing analytic frameworks that better predict customer response to acquisition and renewal offers. The objective of this contest is to apply the same analytic discipline to predict newsstand location "response". Specifically, the objective is to predict the number of copies to be placed in each newsstand location, typically referred to as draw, so as to optimize the overall contribution of that location.
HOW TO ENTER: Beginning October 14th, 2010 at 12:01 AM (ET) through December 3rd, 2010 at 11:59 PM (ET) go to the Hearst Challenge website located at www.HearstChallenge.com (the “Site”) and complete and submit the entry form pursuant to the onscreen instructions. Entrants will be provided a historical sample of newsstand location draw, sales and associated location level data to help develop their predictive algorithm. Hearst will in turn hold back two distinct sets of draw/sales data, one to be used as a validation set by the contestant and one to be used as a final contest evaluation set. Entrants may not include any other external variables for the challenge. Additional details will be provided with the data. Entrants will be able to track their performance against the validation set throughout the course of the challenge via a leader tracking board to be made available on the Site. Entries must include the following documentation:
Data file with id variables and expected sales values by store and publication
The final model/algorithm code used to score the final data set
Any supporting documentation that pertains to the development of the submitted model/algorithm including variable creation. Variables that were used in the model need to be traced through from input to coefficient / node (if using a tree based methodology). This includes recodes, transformations, and summarizations. This should be a table, with English labels for variables, and equations (if appropriate) for transformations or recodes.
JUDGING AND FINALIST AND WINNER SELECTION: To determine the three (3) finalists, Hearst will analyze the results of a standard set of magazines to be evaluated by the participants' algorithms using the final evaluation data set. The final evaluation data set, excluding the final sales results, will be posted on the Site at 12:01 AM on December 3, 2010, and entrants will be asked to execute their algorithm against this data set and provide their predicted level of sales in the prescribed format to be uploaded to the website for final evaluation by Hearst and the designated judges (EXL/Kaggle). Entries must be submitted by 11:59 PM on December 3, 2010. Three (3) finalists will be notified on or before December 7, 2010 and must prepare a ten (10) minute presentation to be presented at the NCDM Conference in Miami, Florida on December 13 through 15, 2010.
The presentation must include the following components:
The steps taken with the raw data provided to convert into modeling variables.
The types of exploratory analyses conducted to refine the variables used in the model.
Any variable reduction techniques used.
The steps undertaken in the modeling process.
How the final model to submit for the competition was determined.
The presentation must be in PowerPoint and submitted by December 10, 2010 to the Site. All presentations will be reviewed and edited by the judges for content and educational components. Entrants may be contacted to review changes. Refusal to reveal educational content will be grounds for removal from the competition. By entering, you are agreeing to fully share the steps and methodologies used in your model. Finalists will be allowed to use company logos in the presentation. However, the presentation will be converted to a standard slide template for presentation consistency purposes.
The winner will be announced at the NCDM Conference on December 14, 2010 after the presentations have been concluded. The winner will be the entrant who is able to best predict final store sales given the number of copies placed (draw) in each store. (Best will be defined as the lowest root mean square error between the predicted and final sales.) In the unlikely event of a tie, the presentation will be the tiebreaker.
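As an illustration of the scoring metric named above, the following is a minimal sketch of a root mean square error calculation. It is not the judges' scoring code; the function name `rmse` and the sample values are hypothetical, and the sketch assumes that predicted and final sales are supplied as equal-length sequences with one value per store and publication.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences.

    Hypothetical helper for illustration only; the official scoring
    procedure is not described in these rules.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    # Square each prediction error, average the squares, take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up sample values: one entry per store/publication pair.
predicted_sales = [10.0, 4.0, 7.0]
final_sales = [12.0, 4.0, 6.0]
score = rmse(predicted_sales, final_sales)
print(round(score, 4))
```

A lower value means the predictions are closer to the final sales, and a perfect prediction scores zero.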
The winner will receive $25,000 cash.
The top three (3) finalists along with up to one (1) additional presenter for each finalist will be entitled to round trip coach class airfare to Miami, Florida from a metropolitan airport nearest to their residence (ARV $1000-$2000), lodging for two (2) nights at a hotel selected by Hearst (ARV $400-$800), a $75 per day meal allowance for two (2) days (ARV $150-$300) and registration fees for the NCDM Conference in Miami on December 13-15, 2010 (ARV $1600-$3200). Total ARV: $3150-$6300.
Each of the three (3) finalists and their additional presenter will be required to travel to Miami to attend the NCDM Conference with an arrival date on December 13, presentation on December 14 and departure date of December 15, 2010.
Note: Once the travel schedule has been arranged, it cannot be altered and failure of the finalists to follow such schedule shall not obligate Hearst in any way to provide the finalists with alternate arrangements.
ENTRIES: Limit one (1) final entry per individual/team/company. In the event that multiple entries have been submitted by an individual/team/company, the last entry submitted will represent the official contest entry. Individuals are not eligible to participate on more than one (1) team. Entries, and the associated algorithms, become the property of Hearst. The prize will be awarded to the individuals named in the entry form. Team/Company entries must include the names of all individuals included in the team entry. Proof of submission does not constitute proof of receipt.
ELIGIBILITY: Employees of Hearst, its parents, affiliates and subsidiaries, participating data providers (Experian and CMG), the DMA, and the independent judging organizations (EXL and Kaggle) (and members of their immediate family and/or those living in the same household of each such employee) are not eligible. Void where prohibited by law.
CONDITIONS OF PARTICIPATION: Entrants agree to be bound by the terms of these Official Rules and by the decisions of Hearst, which are final and binding on all matters pertaining to the Challenge. The winner hereby further agrees that it will sign any documents necessary to transfer copyright of the entry to Hearst within seven (7) days following the date of first attempted notification. Acceptance of the prize constitutes permission for Hearst and its agencies to use winner’s name and/or likeness, biographical material and/or entry (including an altered form of the entry) for advertising and promotional purposes without additional compensation, unless prohibited by law. By accepting the prize, winner agrees to hold Hearst, the data providers, judges, and their respective parent companies, subsidiaries, affiliates, partners, representative agents, successors, assigns, officers, directors, and employees harmless for any injury or damage caused or claimed to be caused by participation in the Challenge or acceptance or use of the prize. Hearst is not responsible for any printing, typographical, mechanical or other error in administration of the Challenge or in the announcement of the prize. All taxes are the sole responsibility of the three (3) finalists and winner. Each prize is awarded “as is” with no warranty or guarantee, either express or implied. No transfer, assignment or substitution of a prize is permitted, except Sponsor reserves the right to substitute prize for an item of equal or greater value. All federal, state and local laws and regulations apply. By entering, an entrant represents and warrants that their entry is original, has not been previously published or won any award, and does not contain any material that would violate or infringe upon the rights of any third party, including copyrights, trademarks or rights of privacy or publicity.
Hearst reserves the right in its sole and unfettered discretion to disqualify any entry that it believes contains obscene, offensive or inappropriate content, that does not comply with these official rules or that is not consistent with the spirit or theme of the contest. The decision of Hearst and the judges is final and binding on all matters relating to the Challenge. The NCDM retains non-exclusive rights to the material presented at the Conference. All PR releases regarding the Hearst Challenge must include a statement that it is sponsored by "Hearst Corporation" and "The DMA".
DATA USE LIMITATIONS: Entrants in the Challenge acknowledge and agree to the following data use restrictions:
i) Entrants shall not share or disclose the Experian or CMG data or the results of any solutions with or to any third party other than to members of their Challenge team who have a need to access such data to perform the analysis;
ii) Entrants shall not mail, telemarket, or develop or apply a model using the provided Experian/CMG data outside the models being developed for the Challenge;
iii) Entrants shall not copy, modify, adapt, translate, reverse engineer, decompile, disassemble, or create derivative works based on the Experian/CMG data or merge or incorporate the Experian/CMG data with any other file outside those used in the Challenge;
iv) Entrants shall issue appropriate instruction to each employee given access to the Experian/CMG data regarding the restrictions set forth in this Challenge, and shall provide physical security for the Experian/CMG data to the same or greater degree that entrant uses to protect its own most sensitive data;
v) Entrant is legally responsible for any failure by it, him or her or Entrant’s employees to comply with the terms of these Official Rules; and
vi) Entrant agrees that in the event of an actual or threatened breach of these Official Rules, Experian/CMG will suffer irreparable harm and Experian and CMG shall be entitled to injunctive relief, among other remedies which they may seek.
INTERNET: Hearst is not responsible for electronic transmission errors resulting in omission, interruption, deletion, defect, delay in operations or transmission, theft or destruction or unauthorized access to or alterations of entry materials, or for technical, network, telephone equipment, electronic, computer, hardware or software malfunctions or limitations of any kind, or inaccurate transmissions of or failure to receive entry information by Hearst on account of technical problems or traffic congestion on the Internet or at any Web site or any combination thereof. If for any reason the Internet portion of the program is not capable of running as planned, including infection by computer virus, bugs, tampering, unauthorized intervention, fraud, technical failures, or any other causes which corrupt or affect the administration, security, fairness, integrity, or proper conduct of this Challenge, Hearst reserves the right at its sole discretion to cancel, terminate, modify or suspend the Challenge. Hearst reserves the right to select winners from eligible entries received as of the termination date. Hearst further reserves the right to disqualify any individual who tampers with the entry process. Hearst may prohibit an entrant from participating in the Challenge if it determines that said entrant is attempting to undermine the legitimate operation of the Challenge by cheating, hacking, deception or other unfair playing practices or intending to abuse, threaten or harass other entrants. Caution: Any attempt by a participant to deliberately damage the Site or undermine the legitimate operation of the Challenge is a violation of criminal and civil laws and should such an attempt be made, Hearst reserves the right to seek damages from any such participant to the fullest extent of the law.
DISPUTES/CHOICE OF LAW: Except where prohibited, each entrant agrees that: (1) any and all disputes, claims and causes of action arising out of or connected with this Challenge or any prize awarded shall be resolved individually, without resort to any form of class action, and exclusively by state or federal courts situated in New York, New York; (2) any and all claims, judgments and awards shall be limited to actual out-of-pocket costs incurred, but in no event attorneys' fees; (3) no punitive, incidental, special, consequential or other damages, including without limitation lost profits, may be awarded (collectively, "Special Damages"); and (4) entrant hereby waives all rights to claim Special Damages and all rights to have such damages multiplied or increased. New York State law, without reference to New York’s choice of law rules, governs the Challenge and all aspects related thereto.
SPONSOR: The Sponsor of this Challenge is Hearst Communications, Inc., 300 West 57th Street, New York, New York 10019.