August 2014 Blog Posts (9)

A Method of Grouping and Summarizing Data of Big Text Files in R Language

Grouping and summarizing file data is a common task in R. Sometimes the file is comparatively big: the source data is large even though the computed result is small, and the file cannot be loaded into memory in one piece for computation. In that case the workable solution is to import the data in batches, compute on each batch, and merge the partial results. The following example illustrates how to group and summarize data from a big text file in R.
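The batch approach described above can be sketched as follows. This is a minimal illustration in Python rather than R, and it is not the post's own code: the function names, the tab delimiter, and the column names `state` and `amount` used in the usage line are assumptions made for the example. It reads a fixed number of rows at a time, summarizes each batch, and merges the partial sums.

```python
import csv
from collections import Counter
from itertools import islice


def summarize_batch(rows, key_col, value_col):
    """Group one batch of rows by key_col and sum value_col."""
    totals = Counter()
    for row in rows:
        totals[row[key_col]] += float(row[value_col])
    return totals


def summarize_file(path, key_col, value_col, batch_size=100_000, delimiter="\t"):
    """Sum value_col per key_col without loading the whole file into memory."""
    merged = Counter()
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        while True:
            # Import one batch: at most batch_size rows are held in memory.
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            # Compute on the batch, then merge by adding the partial sums.
            merged.update(summarize_batch(batch, key_col, value_col))
    return dict(merged)


# Hypothetical usage: totals = summarize_file("sales.txt", "state", "amount")
```

Sums and counts merge by simple addition. A mean does not, so a grouped average would need to carry a sum and a count per group and divide only at the end.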

Here is a file,…

Added by Jessica May on August 24, 2014 at 8:54pm — 2 Comments

9 Questions to Determine If a BI Solution Is Truly Self-Service

A Tool That Grants Independence to Business Users

We all know the constant struggle between IT and business users when it comes to BI software: business users want to access data in order to make fast decisions independently, without having to use IT as a middleman every time a new requirement arises, a query is run, or data is added. Yet IT is overwhelmed by constantly changing requests and requirements, and struggles to deliver data in an actionable time-frame. The…

Added by Elana Roth on August 19, 2014 at 2:30am — No Comments

IEEE Power & Energy Society Launches the Global Energy Forecasting Competition 2014

The Global Energy Forecasting Competition 2014, by the IEEE Power & Energy Society, launches on the CrowdANALYTIX platform. Choose among its four tracks: Electric Load, Electric Price, Wind Power and Solar Power.

IEEE, the world’s largest professional organization dedicated to advancing technology for humanity, today announced the start of the Global Energy Forecasting Competition 2014 (GEFCom2014)—sponsored by the IEEE Power & Energy Society (PES) and organized by the IEEE Working…

Added by mohans on August 19, 2014 at 1:45am — No Comments

Excuse me, do you speak fraud? Network graph analysis for fraud detection and mitigation

Original blog post on sctr7.com, by Scott Mongeau.

Executive summary

Network analysis offers a new set of techniques to tackle the persistent and growing problem of complex fraud. Network…

Added by Scott Mongeau on August 19, 2014 at 1:00am — No Comments

Tell me… how ugly is your bad data?

Original blog posted on sctr7.com.

The adage ‘garbage-in-garbage-out’ is an analytics mantra so ingrained it has its own shorthand: GIGO. Yet, in the mad, blind rush toward all things ‘big data’, there is the danger of sidelining the crucial-but-dreary topic of data quality, to which GIGO refers.

While data quality is not as ‘sexy’ as big data, anyone who wants…

Added by Scott Mongeau on August 15, 2014 at 1:30am — No Comments

Network analytics: more than pretty pictures

Originally posted on sctr7.com

Network analysis is a rapidly growing analytics domain propelled by the explosion of interest in social networking. The methods rest upon much older foundations in the realms of statistics and social science. Euler’s graph theory was proposed in the early 18th century and Moreno established the foundations for…

Added by Scott Mongeau on August 15, 2014 at 1:30am — No Comments

Data Science - Where is it headed?

In this post, I've tried to capture some of the common aspects of working in the analytics industry. While we occasionally hear about India growing fast into this space, there are a lot of things happening in India that might transform this field further. While some of these aspects are specific to what I've observed in India, a lot of them are generic.

As in previous posts, I try to classify these aspects under different heads:

  1. The "cost" reduction trap…

Added by Amogh Borkar on August 13, 2014 at 1:32am — No Comments

Assessing Model Accuracy - Part 2

In my last post I explained MSE. Today I will explain the bias-variance trade-off and the precision-recall trade-off in assessing model accuracy.

What are the variance and bias of a statistical learning method?

Variance refers to the amount by which the estimate of f would change if we estimated it using a different training dataset. Since the training data is used to fit the statistical learning method, different training sets will…
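To make the definition concrete, the sketch below simulates many training sets and measures how much a prediction at one point changes from one training set to the next. It is an illustrative assumption, not taken from the book or the post: the true function, the noise level, and the two toy methods (predicting the training mean, and copying the nearest training point) are all hypothetical choices.

```python
import random
import statistics


def true_f(x):
    """Assumed true relationship, chosen only for this simulation."""
    return 2.0 * x


def sample_training_set(rng, n=20, noise_sd=1.0):
    """Draw one training set: x uniform on [0, 1], y = f(x) + Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    """Inflexible method: ignore x0 and predict the mean training response."""
    return statistics.fmean(ys)


def predict_nearest(xs, ys, x0):
    """Flexible method: return the response of the closest training point."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]


def prediction_variance(predict, x0=0.5, repeats=200, seed=0):
    """Variance of the prediction at x0 across independently drawn training sets."""
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(repeats)]
    return statistics.pvariance(preds)


var_mean = prediction_variance(predict_mean)
var_nearest = prediction_variance(predict_nearest)
print(f"mean method: {var_mean:.3f}  nearest-point method: {var_nearest:.3f}")
```

Under these assumptions the nearest-point method changes much more from one training set to the next than the mean does, which is what higher variance means for a more flexible method.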

Added by suresh kumar Gorakala on August 5, 2014 at 6:24am — No Comments

Assessing Model Accuracy: Part 1

Recently I started reading the book "An Introduction to Statistical Learning", which has a good introduction to assessing model accuracy. This post contains excerpts from the chapter:

Often we take different statistical approaches to build a solution for a data analytical problem. Why is it necessary to introduce so many…

Added by suresh kumar Gorakala on August 5, 2014 at 6:00am — No Comments

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC