A Data Science Central Community
R is commonly used to group and summarize data from files. Sometimes the source file is too big to load into memory in one piece, even though the computed result is small. In that case the practical solution is to import the data in batches, compute on each batch, and merge the partial results. We'll use an example to illustrate how to group and summarize data from big text files in R.
Here is a file,…
Added by Jessica May on August 24, 2014 at 8:54pm — 2 Comments
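The batch approach described in the teaser above (import in batches, compute per batch, merge the partial results) can be sketched roughly as follows. This is a minimal illustration in Python, not the R code from the original post, which is not shown here. The function name `summarize_in_batches`, the column names, the tab delimiter, and the sample data are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def summarize_in_batches(lines, key_col, value_col, batch_size=2, delimiter="\t"):
    """Sum value_col per key_col while holding only one batch of rows in memory.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    totals = defaultdict(float)  # merged result: one entry per group, stays small
    while True:
        # Import step: pull at most batch_size rows from the source.
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        # Compute step: group and sum within this batch only.
        partial = defaultdict(float)
        for row in batch:
            partial[row[key_col]] += float(row[value_col])
        # Merge step: fold the batch subtotals into the running totals.
        for key, subtotal in partial.items():
            totals[key] += subtotal
    return dict(totals)


# Hypothetical sample data standing in for a large tab-separated text file.
sample = io.StringIO("state\tamount\nNY\t10\nCA\t5\nNY\t2.5\nCA\t1\nTX\t4\n")
result = summarize_in_batches(sample, "state", "amount")
print(result)
```

With a real file, an open file handle would be passed in place of the `StringIO` object, and `batch_size` would be set to as many rows as comfortably fit in memory. The sketch works because a sum can be merged across batches; a summary such as a median would need a different merge step.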
We all know the constant struggle between IT and business users when it comes to BI software: Business users want to access data in order to make fast decisions independently, without having to use IT as a middleman every time a new requirement arises, a query is run, or data is added. Yet IT is overwhelmed by constantly changing requests and requirements, and struggles to deliver data in an actionable time frame. The…
Added by Elana Roth on August 19, 2014 at 2:30am — No Comments
The Global Energy Forecasting Competition 2014, by the IEEE Power & Energy Society, launches on the CrowdANALYTIX platform. Choose from 4 tracks: Electric Load, Electric Price, Wind Power and Solar Power.
IEEE, the world’s largest professional organization dedicated to advancing technology for humanity, today announced the start of the Global Energy Forecasting Competition 2014 (GEFCom2014)—sponsored by the IEEE Power & Energy Society (PES) and organized by the IEEE Working…
Added by mohans on August 19, 2014 at 1:45am — No Comments
Original blog post on sctr7.com, by Scott Mongeau.
Network analysis offers a new set of techniques to tackle the persistent and growing problem of complex fraud. Network…
Added by Scott Mongeau on August 19, 2014 at 1:00am — No Comments
Original blog posted on sctr7.com.
The adage ‘garbage-in-garbage-out’ is an analytics mantra so ingrained it has its own shorthand: GIGO. Yet, in the mad, blind rush toward all things ‘big data’, there is the danger of sidelining the crucial-but-dreary topic of data quality, to which GIGO refers.
While data quality is not as ‘sexy’ as big data, anyone who wants…
Added by Scott Mongeau on August 15, 2014 at 1:30am — No Comments
Originally posted on sctr7.com.
Network analysis is a rapidly growing analytics domain propelled by the explosion of interest in social networking. The methods rest upon much older foundations in the realms of statistics and social science. Euler’s graph theory was proposed in the early 18th century and Moreno established the foundations for…
Added by Scott Mongeau on August 15, 2014 at 1:30am — No Comments
In this post, I've tried to capture some of the common aspects of working in the analytics industry. While we occasionally hear about India growing fast into this space, there are a lot of things happening in India that might transform this field further. While some of these aspects are specific to what I've observed in India, a lot of them are generic.
As in previous posts, I try to classify these aspects under different heads:
Added by Amogh Borkar on August 13, 2014 at 1:32am — No Comments
In my last post, I explained MSE. Today I will explain the bias-variance trade-off and the precision-recall trade-off in assessing model accuracy.
Variance refers to the amount by which the estimate of f would change if we estimated it using a different training dataset. Since the training data is used to fit the statistical learning method, different training sets will…
Added by suresh kumar Gorakala on August 5, 2014 at 6:24am — No Comments
Recently, I started reading the book "An Introduction to Statistical Learning", which has a good introduction to assessing model accuracy. This post contains excerpts from the chapter:
Often we take different statistical approaches to build a solution for a data analytical problem. Why is it necessary to introduce so many…
Added by suresh kumar Gorakala on August 5, 2014 at 6:00am — No Comments
© 2021 TechTarget, Inc.