
The INFORMS Data Mining Section (in conjunction with Sinapse) is pleased to announce its third annual Data Mining Contest. This contest requires participants to develop a model that predicts stock price movements at five minute intervals.

Competitors will be provided with intraday trading data showing stock price movements at five minute intervals, sectoral data, economic data, experts' predictions and indexes. (We don’t reveal the underlying stock to prevent competitors from looking up the answers.)

Being able to better predict short-term stock price movements would be a boon for high-frequency traders, so the methods developed in this contest could have a big impact on the finance industry.

We have provided a training database to allow participants to build their predictive models. Participants will submit their predictions for the test database (which doesn't include the variable being predicted). The public leaderboard will be calculated based on 10 per cent of the test dataset.

The submission deadline is October 10th, 2010. Final results will be announced on October 12th. The winners of this contest will be honoured at a session of the INFORMS Annual Meeting in Austin, Texas (November 7-10).


Details at: http://kaggle.com/informs2010


Replies to This Discussion

"Being able to better predict short-term stock price movements would be a boon for high-frequency traders"
So, how come they don't share some of this benefit? Why is there no cash prize?!
Hi Edith -- you are welcome to participate in our stock market competition (flash crash forensics): we offer a $1,000 prize.
Hello Vincent,

Analytics is a heavy duty job.

I could crack this riddle, given adequate data and a prize that matches the challenge.

Come on, Netflix put up $1 million just for a modest improvement in its rating predictions.
Solving NYSE trading problems is surely worth no less.
This is serious stuff, not another student competition.
One needs to attract the real heavyweights to this DM challenge.

Truly,
Edith
I believe that you can provide a great answer in 10 hours of work: 2 hours to get the data (we will actually provide the data - just be patient), 2 hours to think about the problem, 4 hours to analyze the data, and 2 hours for the write up.

A great answer could be:

* all stocks were impacted by the drop
* the worst drops were 30%, but based on the number of stocks involved and extreme value theory, the fact that one stock (Procter & Gamble) fell by 30% is not surprising
* the recovery was extremely swift and did not have chaotic behavior, compared with how stocks typically recover from a massive crash
* thus we believe the crash was "natural" and the recovery "artificial"
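
The "not surprising" argument in the second bullet can be made concrete with a back-of-the-envelope calculation. This is only a minimal sketch, assuming purely for illustration that each stock independently has the same small probability of a 30% drop; the function name and the numbers are hypothetical, not estimates from real data:

```python
def prob_at_least_one(p_single, n_stocks):
    """Chance that at least one of n_stocks independent stocks shows an extreme drop."""
    # Complement of "no stock drops": (1 - p) multiplied over all stocks.
    return 1.0 - (1.0 - p_single) ** n_stocks

# Hypothetical inputs: a 1-in-1000 event per stock, 1000 stocks traded.
print(round(prob_at_least_one(0.001, 1000), 3))
```

With these made-up inputs the chance of seeing at least one such drop is about 0.63, even though the event is rare for any single stock. Independence is a simplification here, since stocks tend to move together in a crash, so a real analysis would have to estimate the tail behaviour from the data.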

I'm not saying this is the right answer - nobody will ever know the right answer. But this is an example of an answer that could win, provided it comes with a sound analysis of stock and index prices before and after the collapse.
Dear Vincent,
Thanks for putting things clearly and allowing this questioning.

First, my estimate of the time needed to analyze the SE 6% critical fall is over two weeks of work (after receiving the data): one week to analyze the data thoroughly, and another week to put the results through statistical tests and make sure no "tautology" has sneaked in, in which case it calls for rework…

* You've made a number of points, which I largely accept. But since they give the impression that nobody will ever know what caused the crisis, I feel obliged to take one step back, put aside the widely accepted notions and search the unexplored part of the evidence.
From my experience, cause-effect relations can be found. My solution, GT data mining, always brings up new hypotheses (though it needs authentic data to operate).

* The fact that "all stocks were impacted by the drop" stands, yet one can imagine that the increasing use of computerized orders lies behind this effect, and that a real risk factor could be lurking underneath. Moreover, ordinary patterns of drop and recovery, together with exceptional patterns of behavior, may turn out to be the fatal combination that we are looking for. So my suggestion here is to study the data directly, with as few assumptions as possible.
Actually, not just the data from the crisis: the whole year may be relevant, as there is a chance of earlier, smaller-scale occurrences with similar characteristics and drivers. If such hidden patterns do exist, it is the role of data mining to dig them out.

* Also, I wouldn't stake my neck on the reassurance that "one stock fell by 30% is not surprising". The words of Shakespeare's Macbeth, "NOTHING IS WHAT IT SEEMS", are truer in the stock exchange than anywhere else.

Best regards
Deadline for the INFORMS Data Mining Contest!

Dear Colleague,

I want to remind you that the submission deadline for the INFORMS Data
Mining Contest 2010 is October 10th, 2010. Time is running out!

Already 788 people have registered for this contest.

The goal of the INFORMS Data Mining Competition is to predict stock price
movements (over 60 minutes) at five-minute intervals with accurate
predictive analysis solutions.
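
One way to read "movements over 60 minutes at five-minute intervals" is as an
up/down label attached to every five-minute bar. The sketch below is only an
illustration of that reading: the function name, the 12-bar horizon (12 x 5
minutes = 60 minutes) and the "strictly higher means up" rule are assumptions,
not the contest's official target definition.

```python
def direction_labels(prices, horizon=12):
    """Label each bar 1 if the price `horizon` bars later is higher, else 0.

    `prices` is a list of prices sampled every five minutes, so the default
    horizon of 12 bars corresponds to 60 minutes (an assumed convention).
    """
    labels = []
    # The last `horizon` bars have no future price to compare against.
    for t in range(len(prices) - horizon):
        labels.append(1 if prices[t + horizon] > prices[t] else 0)
    return labels

# Toy series with a 2-bar horizon so the example stays short.
print(direction_labels([1.0, 2.0, 3.0, 2.0, 1.0], horizon=2))  # [1, 0, 0]
```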

Visit the INFORMS Data Mining Contest web page for more details:
http://kaggle.com/informs2010

Competitors are provided with intraday trading data showing stock price
movements at five minute intervals, sectoral data, economic data, experts'
predictions and indexes.

We have provided a training database to allow participants to build their
predictive models. Participants will be evaluated according to the
arithmetic mean of the AUC on the test database.
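
For readers unfamiliar with the metric, here is a minimal sketch of how an AUC
can be computed and averaged. It uses the pairwise (Mann-Whitney) definition of
AUC; the function names are hypothetical, and averaging over separate groups of
predictions is an assumption about what "arithmetic mean of the AUC" refers to,
not the organizers' scoring code.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive scored above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

def mean_auc(groups):
    """Arithmetic mean of the AUC over several (labels, scores) groups."""
    return sum(auc(y, s) for y, s in groups) / len(groups)

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because AUC depends only on how the predictions are ranked, submissions can be
scores on any scale rather than calibrated probabilities.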

Being able to better predict short-term stock price movements would be a
boon for high-frequency traders, so the methods developed in this contest
could have a big impact on the finance industry.

The winners of this contest will be honored at a session during the INFORMS
Annual Meeting in Austin, Texas (November 7-10).

Louis Duclos-Gosselin
Chair of INFORMS Data Mining Contest 2010
INFORMS Data Mining Section Member
Applied Mathematics (Predictive Analysis, Data Mining) Consultant at
Sinapse
E-Mail: [email protected]
http://www.sinapse.ca/En/Home.aspx
http://dm.section.informs.org/
Phone: 1-866-565-3330
Fax: 1-418-780-3311
Sinapse (Quebec), 1170, Boul. Lebourgneuf
Suite 320, Quebec (Quebec), Canada
G2K 2E3
No prize pay, no gain
