A Data Science Central Community
I posted a few interesting articles. You are free to disagree and post your comments. Continue
Added by Vincent Granville on December 31, 2014 at 12:31pm — No Comments
This week, I invite you to read the following:
Added by Vincent Granville on December 27, 2014 at 1:01pm — No Comments
A prediction problem is a business problem that involves predicting future events by extracting patterns from historical data. Prediction problems are solved using statistical techniques, mathematical models, or machine learning techniques.
For example: forecasting a stock price for the next week, predicting which football team will win the World Cup, etc.
There are two circumstances in which you should use a consultant:
1. When the consultant has both the domain knowledge and exact modeling experience: There are times when a consultant comes to you and sells you the idea of modeling something. Look for exact experience. As for an academic researcher, the most challenging task is to get the dataset, not the idea. The dataset is the execution. In the world of Big Data analytics, whoever owns the data has the command, especially in the…Continue
When importing structured text files into a database using Java alone, we need to assemble the SQL statements manually and handle various troublesome situations: whether the data already exists in the table, whether we should update it or insert new data, whether certain fields are included in the file, and whether the fields in the file are consistent with those in the table.
With esProc assisting the Java program, these problems can be solved…Continue
Added by Lynn Guo on December 22, 2014 at 7:04pm — No Comments
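The checks described in the post above (does the row already exist, do the file's fields match the table) can be sketched in a few lines. This is a minimal illustration in Python with the standard-library sqlite3 module, not the Java or esProc approach from the original article. The table name `orders`, its columns, and the tab-separated layout are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical table layout: column name -> converter for the text read from the file.
COLUMNS = {"id": int, "name": str, "amount": float}

def import_rows(conn, text):
    """Insert new rows and update existing ones, keyed on the id column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    fields = reader.fieldnames or []
    # Check that the fields in the file are consistent with the table.
    # Only validated field names are ever placed into the SQL text below.
    unknown = [f for f in fields if f not in COLUMNS]
    if unknown or "id" not in fields:
        raise ValueError(f"unexpected or missing fields: {fields}")
    others = [f for f in fields if f != "id"]
    for raw in reader:
        row = {f: COLUMNS[f](raw[f]) for f in fields}
        exists = conn.execute(
            "SELECT 1 FROM orders WHERE id = ?", (row["id"],)
        ).fetchone()
        if exists and others:
            # The row is already in the table: update the fields the file provides.
            assignments = ", ".join(f"{f} = ?" for f in others)
            conn.execute(
                f"UPDATE orders SET {assignments} WHERE id = ?",
                [row[f] for f in others] + [row["id"]],
            )
        elif not exists:
            # The row is new: insert it.
            marks = ", ".join("?" for _ in fields)
            conn.execute(
                f"INSERT INTO orders ({', '.join(fields)}) VALUES ({marks})",
                [row[f] for f in fields],
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'old', 10.0)")
import_rows(conn, "id\tname\tamount\n1\tnew\t12.5\n2\tfresh\t3.0\n")
rows = conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()
```

Here the existing row with id 1 is updated and the row with id 2 is inserted. Even this small case needs hand-assembled SQL strings, which is the maintenance burden the post refers to.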
The fact is that a business intelligence solution roll-out should not take 1-2 years, though it often takes that long and then some. When considering business intelligence reporting software, it is crucial for every enterprise to thoroughly explore its options and choose the right solution provider in order to get a solution that is affordable and practical, one that will rapidly achieve ROI and user adoption and ensure low TCO. But there is another, equally important factor…Continue
This list was started a while back and was rather small, but it has grown to 200+ articles in the past few weeks. It will reach 400+ when completed. Essentially, this is the best of all our weekly digests. Also, it features all the articles (double-starred in red) that will be part of my upcoming book Data Science 2.0.…Continue
Added by Vincent Granville on December 20, 2014 at 3:30pm — No Comments
I am facing a simple problem and trying to find the optimum solution:
Y(cont) = x1(cat) + x2(cat) + x3(cat) + x4(cat) + x5(cont)
Where: cat = categorical and cont = continuous. My categorical variables have 100 classes.
So my Y is cont and 4/5 Xs are categorical. What is the optimum approach? ANOVA? For ANOVA I think that would be true only when ALL of my Xs were categorical. If I simply apply a linear regression, then I…Continue
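One common way to put categorical and continuous predictors into the same linear model is dummy (indicator) coding: each categorical variable becomes a set of 0/1 columns, with one reference level dropped. The sketch below only builds such a design matrix; it is an illustration, not an answer taken from the original thread. The helper names and toy data are hypothetical.

```python
def dummy_code(values):
    """Return (kept_levels, rows): one 0/1 column per level except the reference level."""
    levels = sorted(set(values))
    kept = levels[1:]  # drop the first level so the columns are not collinear with the intercept
    rows = [[1.0 if v == level else 0.0 for level in kept] for v in values]
    return kept, rows

def design_matrix(categorical_columns, continuous_column):
    """Build rows of [intercept, dummies for each categorical column, continuous value]."""
    coded = [dummy_code(col)[1] for col in categorical_columns]
    matrix = []
    for i, x_cont in enumerate(continuous_column):
        row = [1.0]
        for block in coded:
            row.extend(block[i])
        row.append(float(x_cont))
        matrix.append(row)
    return matrix

# Toy data: two categorical predictors and one continuous predictor.
x1 = ["a", "b", "c", "a"]
x2 = ["u", "v", "u", "v"]
x5 = [0.5, 1.5, 2.0, 3.0]
X = design_matrix([x1, x2], x5)
```

With 100 classes per categorical variable, each variable would contribute 99 indicator columns, so the amount of data available per class matters.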
Originally posted on sctr7.
Analytics professionals should keep a persistent eye open for opportunities to create cross-functional insights from data. Insights can frequently be…Continue
Added by Scott Mongeau on December 17, 2014 at 6:00am — No Comments
I recorded a video of the amazing IBM Watson Analytics in action. Enjoy!
Source: click here
Added by Venky Rao on December 17, 2014 at 5:30am — No Comments
Some text files are too big to be loaded into memory in their entirety. Yet because the data have been sorted by a certain column, they can be imported group by group according to this column, with each group fitting into memory for computing. Such files include the call detail records of a telecom company, visitor statistics for a website, member information for a shopping mall, etc.
A great deal of complicated code, which is difficult to maintain, is…Continue
Added by Lynn Guo on December 15, 2014 at 6:24pm — No Comments
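The group-by-group reading described in the post above can be sketched with Python's standard-library `itertools.groupby`, which relies on the input already being sorted by the grouping column. This is an illustration rather than the esProc solution from the original article. The tab-separated layout and the sample data are hypothetical.

```python
import io
from itertools import groupby

def read_groups(lines, key_index=0, sep="\t"):
    """Yield (key, records) one group at a time from lines already sorted by the key column."""
    records = (line.rstrip("\n").split(sep) for line in lines if line.strip())
    for key, group in groupby(records, key=lambda rec: rec[key_index]):
        # Only the records of the current group are materialized in memory.
        yield key, list(group)

# A small in-memory stand-in for a large file sorted by its first column.
sample = io.StringIO("u1\t10\nu1\t25\nu2\t7\nu3\t3\nu3\t4\n")
totals = {key: sum(int(rec[1]) for rec in recs) for key, recs in read_groups(sample)}
```

Passing an open file object instead of `io.StringIO` works the same way, because the file is consumed lazily, line by line.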
We have three new articles to share with you:
Added by Vincent Granville on December 12, 2014 at 2:46pm — No Comments
As Java doesn’t directly support dynamically parsing expressions in text files, the computation can only be realized by splitting strings manually and then writing a recursive program. The whole process is complicated and requires a great amount of code that is difficult to maintain. With the assistance of esProc, we can develop a program for the computation in Java without writing that code manually. Let’s look at how esProc works through an example.
Here is a text…Continue
Added by Lynn Guo on December 10, 2014 at 6:30pm — No Comments
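To make the recursive evaluation mentioned in the post above concrete, here is a minimal sketch in Python. It is neither the Java nor the esProc approach from the original article. It uses the standard-library `ast` module to parse each expression and then walks the tree recursively. It supports only numbers and the four basic arithmetic operators, and the sample expressions are hypothetical.

```python
import ast
import operator

# Operators this sketch chooses to support.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def _eval(node):
    """Recursively evaluate a parsed arithmetic expression node."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

def evaluate(expression):
    """Parse one line of text as an arithmetic expression and compute its value."""
    return _eval(ast.parse(expression.strip(), mode="eval").body)

# Expressions as they might appear, one per line, in a text file.
results = [evaluate(line) for line in ["1 + 2 * 3", "(4 - 1) * (2 + 5)", "-8 / 4"]]
```

Anything outside the whitelisted node types raises `ValueError`, so names and function calls in the file are rejected rather than executed.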
This post was originally published on our Text Analysis blog. It set out to analyze and visualize 11 million tweets collected around the time of and during Apple Live 2014.
Apple Live probably got off to the worst start possible earlier…Continue
Added by Mike Waldron on December 8, 2014 at 11:16am — No Comments
We have the following to share with you:
Added by Vincent Granville on December 7, 2014 at 12:00pm — No Comments
Hi All,
I've recently built a predictive model for a campaign to target a group of customers who can switch from payment method "A" to payment method "B".
The idea was to come up with a predictive model with payment/usage variables as the independent variables and the past switch event as the dependent variable.
i) I used a logistic regression model for probability estimation.
ii) The data is sorted in order of descending probability…Continue
When developing database applications, we often need to perform computations on the data within each group. For example: list the names of the students who have published papers in each of the past three years; compile statistics on the employees who have taken part in all previous training sessions; select the three days on which each client got the highest scores in a golf game; and the like. To perform these computations, SQL needs multi-layered nesting, which will…Continue
Added by Lynn Guo on December 3, 2014 at 6:30pm — No Comments
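As a small illustration of the first in-group computation mentioned in the post above (students who published in each of the past three years), here is a sketch in plain Python rather than SQL or esProc. The record layout, names, and years are hypothetical.

```python
from collections import defaultdict

def published_every_year(papers, years):
    """Return the students who have at least one paper in every one of the given years."""
    years_by_student = defaultdict(set)
    for student, year in papers:
        years_by_student[student].add(year)
    required = set(years)
    # A student qualifies when the required years are a subset of the years they published in.
    return sorted(s for s, ys in years_by_student.items() if required <= ys)

# Hypothetical (student, publication year) records.
papers = [
    ("Ann", 2012), ("Ann", 2013), ("Ann", 2014),
    ("Bob", 2013), ("Bob", 2014),
    ("Cy", 2012), ("Cy", 2013), ("Cy", 2014), ("Cy", 2014),
]
result = published_every_year(papers, [2012, 2013, 2014])
```

The grouping step (collecting years per student) and the per-group test (a subset check) stay separate here, which is the part that tends to require nested subqueries in SQL.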
In developing database applications, it is usually the records corresponding to the max/min value that we need to retrieve, rather than the value itself. For example: the occasion on which each employee got his/her biggest pay raise; the three lowest scores ever achieved in golf; the five days in each month when each product had its highest sales amount; and so on. As the max function of SQL can only retrieve the max value, instead of the records to which the max…Continue
Added by Lynn Guo on December 1, 2014 at 6:05pm — No Comments
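The difference between retrieving a maximum value and retrieving the record that holds it can be shown with a short sketch in plain Python, not SQL or esProc. It covers the first example from the post above: the occasion of each employee's biggest pay raise. The field names and data are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def biggest_raise_records(records):
    """Return, for each employee, the whole record with the largest raise, not just the amount."""
    ordered = sorted(records, key=itemgetter("employee"))  # groupby needs sorted input
    return {
        employee: max(group, key=itemgetter("raise"))
        for employee, group in groupby(ordered, key=itemgetter("employee"))
    }

# Hypothetical pay-raise history.
raises = [
    {"employee": "Ann", "date": "2012-03-01", "raise": 500},
    {"employee": "Ann", "date": "2013-03-01", "raise": 900},
    {"employee": "Bob", "date": "2012-06-15", "raise": 700},
    {"employee": "Bob", "date": "2014-06-15", "raise": 300},
]
best = biggest_raise_records(raises)
```

Because `max` is called with a key function, it returns the full record, so the date of the raise comes along with the amount.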
Added by Vikas Kamra on December 1, 2014 at 4:30am — No Comments