Featured Blog Posts – September 2013 Archive (21)

Building customer segments using Principal Component Analysis (PCA)

A very common approach to building and understanding customer segments is through the use of clustering techniques such as Principal Component Analysis (PCA). These clustering techniques will analyze your customer data and see if customers tend to cluster by certain features, or combinations of features. Through such an approach, a marketer can use clusters to define specific segments. For example, running a cluster analysis could end up showing two clusters: one with customers who have…

Added by Wojciech Gryc on September 26, 2013 at 11:41am — 1 Comment

Python Scikit-learn to simplify Machine learning : { Bag of words } To [ TF-IDF ]

Text (word) analysis and tokenized text modeling always give a chill air around the ears, especially when you are new to machine learning. Thanks go to Python and its extended libraries for their warm support around text analytics and machine learning. Scikit-learn is a savior and an excellent support in text processing when you also understand some of the concepts like "Bag of words", "Clustering" and "vectorization". Vectorization is a must-know technique for all machine learning learners, text miner…
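To make the step from bag of words to TF-IDF concrete, here is a minimal sketch in plain Python. It is an illustration only, not the post's code and not scikit-learn's implementation: the function names (tokenize, bag_of_words, tf_idf) and the three sample documents are hypothetical, the tokenizer is a naive whitespace split, and the weighting is the textbook tf * log(N / df), which differs from the smoothed variant scikit-learn applies by default.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it on whitespace (deliberately naive)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    vectors = [[c[term] for term in vocab] for c in counts]
    return vocab, vectors


def tf_idf(vectors):
    """Reweight count vectors by textbook TF-IDF: tf * log(N / df)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: number of documents in which each term appears.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]


# Hypothetical toy corpus, for illustration only.
docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
```

A word that occurs in every document ("the" in this toy corpus) ends up with weight zero, while a word unique to one document keeps the largest weight. That contrast is the reason for moving from raw counts to TF-IDF.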

Added by Manish Bhoge on September 25, 2013 at 12:30pm — No Comments

Turning Data into Social Good - 3 Talks to Livestream today

Among many inspiring talks at the Social Good Summit in NYC today, several will focus on the crucial theme of Data and Social Good.  If you have a moment and are interested in the growth of data-driven social good, I recommend tuning into the livestream here at the following times:

  • 12:27 PM EDT: “#Big Data in the Era of…

Added by Jaime Fitzgerald on September 24, 2013 at 10:00am — No Comments

Do the maverick mavens need managers?

The McKinsey & Company report “Big data: The next frontier for innovation, competition and productivity” (May 2011) has been widely publicized and circulated on the internet.

The report projected that the demand for deep analytical positions in a big data world in the United States could exceed the supply, based on the trends seen in 2011, by 140,000 to 190,000 positions. While this was…

Added by Somjit Amrit on September 23, 2013 at 1:53am — No Comments

Unleashing Intelligence through natural language (Part 4 - Data, Information, Knowledge and Wisdom)

In this series I reveal Natural Laws of Intelligence contained within grammar, that can be utilized to unleash intelligence through natural language in software. These laws are extremely simple, but still undiscovered by scientists.



Experts in knowledge technology should be familiar with the DIKW Hierarchy (Data, Information, Knowledge, Wisdom):…

Added by Menno Mafait on September 20, 2013 at 2:00am — No Comments

How to Overcome Google Penguin Update Penalties

Google is the god of search, and most businesses are doing all they can to propitiate the search engine giant. Most of us turn to Google when we want an answer to almost any question, and the great god supplies it. Millions of businesses around the world want their website to show up when potential customers or visitors search for a specific set of keywords. And so we invest lots of time and effort in SEO.

 

Many businesses that had spent…

Added by Rajveer Singh Rathore on September 19, 2013 at 10:39am — No Comments

Corn migrating North: do you agree, based on these two maps (1948 vs. 2008)

Comparing crop acreage harvested per county in the US, 1948-1952 vs. 2008-2012. The article was posted in USA Today with the title Climate Change Changing Agriculture. It is an interesting visual presentation (in USA Today) as you can superimpose the two images for better comparisons. Here, you…

Added by Vincent Granville on September 18, 2013 at 7:30pm — 5 Comments

Better Together: Data Scientists and Automated Analysis

An executive from IBM recently highlighted the need for more rigorous preparation for Big Data analytics within and beyond the financial industry in an article in the Wall Street Journal. The article outlined the dire need for qualified data scientists, how qualified business and finance students are, and how even liberal arts majors can and should be trained to…

Added by Radhika Subramanian on September 18, 2013 at 9:53am — 1 Comment

Weekly digest - September 23

Featured Articles

Added by Vincent Granville on September 17, 2013 at 8:00pm — No Comments

Do not stifle the questions visualized data raises!

Extracting meaningful insights from data to address business needs has benefited immensely from the availability of data visualization tools that have made data more approachable. Today the proliferation of off-the-shelf tools, which are easy to learn and are web enabled, has democratized the way data is presented and consumed. Tools like Spotfire, Tableau and QlikView have helped breathe life into data. They provide a professional look and feel and give an inherent feel of fidelity of the data…

Added by Somjit Amrit on September 16, 2013 at 7:53pm — 3 Comments

"Best Practice" quality Analytics and Decisioning just beginning in our Medical / Healthcare Delivery Systems

I have just returned from the first ever ICHI (= International Conference on Healthcare Informatics) at which I discovered a "new generation" of healthcare researchers and policymakers, mostly under the age of 35 ... BUT with an UNDERSTANDING and PASSION of what needed to take place to get out of being in about 33rd place in the world in the quality of health care delivery and replacing it with ACCURACY / PATIENT CENTERED EFFECTIVENESS / and EFFICIENT COST EFFECTIVE processes that match what…

Added by Gary D. Miner, Ph.D. on September 16, 2013 at 2:00pm — No Comments

Bootstraps, Permutation Tests, and Sampling Orders of Magnitude Faster Using SAS®, John Douglas (“J.D.”) Opdyke

Bootstraps, Permutation Tests, and Sampling Orders of Magnitude Faster Using SAS, Computational Statistics-WIREs, Vol. 5, Issue 5, 391-405.  Download @ http://www.datamineit.com/DMI_publications.htm



While permutation tests and bootstraps have very wide-ranging application, both share a common potential drawback: as data-intensive resampling methods, both can be runtime prohibitive when applied to large or even…
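For readers unfamiliar with the two resampling methods named above, the following is a minimal plain-Python sketch of a Monte Carlo permutation test for a difference in means and of a bootstrap of the sample mean. It is not the SAS approach of the paper and makes no attempt at speed. The function names, the default resample counts, and the two small samples are hypothetical choices made for illustration.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Returns the observed difference and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


def bootstrap_means(sample, n_resamples=1_000, seed=0):
    """Means of resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


# Hypothetical measurements, for illustration only.
a = [12.1, 11.8, 12.5, 12.9, 12.3]
b = [10.2, 10.9, 10.5, 11.1, 10.4]
observed, p_value = permutation_test(a, b)
boot = bootstrap_means(a)
```

Both functions loop over thousands of reshuffles or resamples of the full data set, which is where the runtime cost described above comes from when the data are large.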

Added by J.D. Opdyke on September 16, 2013 at 8:27am — No Comments

Analytics and Belief: The Struggle for Truth

For full post:  http://sctr7.com/2013/09/08/analytics-and-belief-the-struggle-for-truth/

Increasingly sophisticated analytics tools and methods are available to derive business insight from data.  However, as a discipline which drives insight from data, the crucial ‘last step’ in the analytics process is about organizational decision making.  A sophisticated, intensive analysis may all be…

Added by Scott Mongeau on September 14, 2013 at 1:49am — 4 Comments

Do you want to solve real world predictive analytics case study and get ranked amongst your peers?

Statistics.com, a provider of online education in statistics and analytics, announces a partnership with CrowdANALYTIX, a predictive modeling “managed crowdsourcing” company, offering a new online course, “Applied Predictive Analytics in partnership with CrowdANALYTIX”, which will run from Oct. 11 to Nov. 8, 2013.

The goal of this course is to teach users (who have basic knowledge of R programming, predictive analytics…

Added by Janet Dobbins on September 11, 2013 at 10:59am — No Comments

It is time to stop torturing your data

Ronald Coase died last week. Coase was an economist and a Nobel laureate, not someone you would typically associate with modern data analytics. Still, Coase is noted in the field for coining the phrase, “If you torture data long enough it will confess.”

Coase’s quote, and his career, are reminders that analysis can have repercussions that go beyond the screen and the analysis, and have impacts on the work that we do…

Added by Radhika Subramanian on September 10, 2013 at 1:53pm — No Comments

Some insights on the Big Data Barnes & Noble experience

 

Barnes & Noble is one of the 500 largest companies in the world. It operates 1350 bookstores (730 stores in cities and 630 in campuses) and the largest online bookstore, with roughly 10 million customers, sells 300 million books per year, and offers a 6 million references catalog.



The company's goals are to outperform the competition (especially Amazon) and to dominate the eBook market. To do so, the company aims to thoroughly analyze and control its…

Added by Michel Bruley on September 9, 2013 at 2:18am — No Comments

Predictive Analytics for CATS

The Comprehensive Analysis of Time Series (CATS) is an increasingly important use case in the field of Big Data analytics.  Cat videos on the Internet notwithstanding, the prevalence of time series is perhaps even more universally ubiquitous in big data applications: customer purchase histories, web click logs, social events, human behaviors, speech patterns, weather reports, climate science, numerical simulation science, spread of infectious diseases, market…

Added by Kirk Borne on September 6, 2013 at 9:30am — No Comments

Weekly digest - September 9

Featured articles

Added by Vincent Granville on September 5, 2013 at 1:00pm — No Comments

Crime Prediction - Predictive Modeling in Law Enforcement

No predictive model is going to be 100% accurate unless by chance.  The nature of predictive modeling is to learn from the past and see into the future.  Essentially, predictive modeling is just modeling.  Think about why we use statistical models - so we can fit the data into a pattern of behavior and anticipate future results.  It's all about how you use and interpret this model.

 

Crime analysts may use a tool similar to the following example on a robbery…

Added by Nicole on September 5, 2013 at 12:00pm — No Comments

Conditional Formatting in Excel – Highest Number in Each Row

Conditionally formatting each row individually is an issue that I struggled with for some time and finally found an answer.  I have a table that lists 28 different activities by day of the week.  On the report I need to highlight the day with the highest count per activity.

The solution is to essentially conditionally format each row to highlight the highest number.  But who wants to take time formatting 28 rows?  Plus, there are several other cities to analyze.  So it’s…
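The row-by-row logic described above (mark the cell or cells equal to the largest value in each row) can be sketched outside Excel as well. The Python snippet below is only an illustration of that comparison, not the Excel rule from the post. The function name and the sample counts are hypothetical.

```python
def highlight_row_maxima(table):
    """For each row, mark with True every cell equal to that row's maximum."""
    flags = []
    for row in table:
        top = max(row)
        flags.append([value == top for value in row])
    return flags


# Hypothetical counts: one row per activity, one column per weekday.
counts = [
    [3, 7, 2, 7, 1, 0, 4],
    [5, 1, 9, 2, 2, 6, 3],
]
flags = highlight_row_maxima(counts)
```

Each row is evaluated on its own, so one rule covers all 28 activities. In this sketch, when two days tie for the highest count, both are marked.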

Added by Nicole on September 5, 2013 at 5:29am — No Comments

On Data Science Central

© 2020 TechTarget, Inc.