A Data Science Central Community
Cross-row and group computations often involve computing the link relative ratio and the year-on-year comparison. The link relative ratio compares the data of the current period with the data of the previous period, usually with the month as the time interval. For example, comparing the sales amount of April with that of March gives a growth rate, and that growth rate is the link relative ratio of April. Hour, day, week and quarter can also be used as the time…Continue
Added by Jessica May on September 22, 2014 at 2:00am — No Comments
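The link relative ratio described in the excerpt above can be sketched in a few lines. This is a minimal Python illustration, not the method used in the original post: the sales figures, the `YYYY-MM` keys and the function names are all assumptions made for the example, and the series is assumed to contain one value per consecutive month.

```python
# Hypothetical monthly sales amounts, keyed by "YYYY-MM" (illustration only).
sales = {"2014-02": 160.0, "2014-03": 200.0, "2014-04": 250.0}


def growth_rate(current, previous):
    """Return (current - previous) / previous for two period values."""
    if previous == 0:
        raise ValueError("previous period is zero; growth rate is undefined")
    return (current - previous) / previous


def link_relative_ratios(series):
    """Map each period after the first to its growth rate over the prior period.

    Assumes the keys sort chronologically and that no period is missing.
    """
    periods = sorted(series)
    return {
        cur: growth_rate(series[cur], series[prev])
        for prev, cur in zip(periods, periods[1:])
    }


ratios = link_relative_ratios(sales)
print(ratios["2014-04"])  # April against March: (250 - 200) / 200 = 0.25
```

The same `growth_rate` helper would serve a year-on-year comparison if it were given the value for the same month of the previous year instead of the previous month.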
Program development for data processing often involves cross-database relational operations. The following example illustrates how Java handles these operations. The sales table is in a DB2 database, and the employee table is in a MySQL database. The task is to join sales with employee through the sellerid field of the sales table and the eid field of the employee table, and filter out the data in sales and employee that…Continue
Added by Jessica May on September 9, 2014 at 1:03am — No Comments
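The general shape of a cross-database join like the one in the excerpt above is to read rows from each source separately and match them in application code. The sketch below is written in Python rather than Java, and it is not the approach from the original post. Two in-memory SQLite databases stand in for the DB2 and MySQL servers, and every column other than `sellerid` and `eid` is hypothetical. The filter step is left out because the excerpt is cut off before it states the condition.

```python
import sqlite3

# Two independent in-memory SQLite databases stand in for DB2 and MySQL.
# Columns other than sellerid and eid are hypothetical.
sales_db = sqlite3.connect(":memory:")
sales_db.execute(
    "CREATE TABLE sales (orderid INTEGER, sellerid INTEGER, amount REAL)"
)
sales_db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 10, 500.0), (2, 11, 120.0), (3, 10, 75.5)],
)

employee_db = sqlite3.connect(":memory:")
employee_db.execute("CREATE TABLE employee (eid INTEGER, name TEXT)")
employee_db.executemany(
    "INSERT INTO employee VALUES (?, ?)", [(10, "Alice"), (11, "Bob")]
)


def join_sales_with_employee(sales_conn, employee_conn):
    """Inner-join sales.sellerid to employee.eid across two connections."""
    # Load the employee side into a dict keyed by eid (the hash side).
    employees = {
        eid: name
        for eid, name in employee_conn.execute("SELECT eid, name FROM employee")
    }
    joined = []
    query = "SELECT orderid, sellerid, amount FROM sales ORDER BY orderid"
    for orderid, sellerid, amount in sales_conn.execute(query):
        # Keep only sales rows whose seller exists in the employee table.
        if sellerid in employees:
            joined.append((orderid, sellerid, employees[sellerid], amount))
    return joined


rows = join_sales_with_employee(sales_db, employee_db)
for row in rows:
    print(row)
```

Loading the smaller table into a dictionary keeps the match to a single pass over the larger one, which is a reasonable default when one side fits in memory.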
In my previous post, we saw that R-squared can give a misleading picture of the quality of a regression fit in terms of predictive power. One thing that R-squared offers no protection against is overfitting. Cross validation, on the other hand, tests the model on cases that are different from the cases in the training set, and so inherently offers protection against overfitting.
1. Do-it-yourself leave-one-out cross validation in R.
In this type…Continue
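Leave-one-out cross validation, as named in the heading above, fits the model once per observation, each time holding that observation out and predicting it from the rest. The following is a minimal Python sketch of that idea for a one-predictor linear regression. It is not the R code from the original post, and the data values and function names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    return intercept, slope


def loocv_mse(xs, ys):
    """Mean squared prediction error over all leave-one-out splits."""
    squared_errors = []
    for i in range(len(xs)):
        # Train on every case except case i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        # Predict the held-out case and record its squared error.
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(loocv_mse(xs, ys))
```

Because every prediction is made for a case the model did not see during fitting, this error estimate does not automatically improve as the model becomes more flexible, which is the protection against overfitting that R-squared lacks.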