February 2012 Blog Posts (38)

Big Data for the Public Good - Seminar Series

We are generating more data than ever before. Thanks to the data scientists who organize and analyze this information, this abundance of Big Data can be harnessed to serve the public interest in innovative ways.

Big Data for the Public Good is a four-part seminar series hosted by Code for America in San…

Added by Vincent Granville on February 29, 2012 at 10:00pm — No Comments

STATISTICA Decisioning Platform™ Profiled in Review of Decision Management Solutions

StatSoft’s (www.statsoft.com) STATISTICA Decisioning Platform™ is the only enterprise predictive analytics and decision management software platform to…

Added by Vincent Granville on February 29, 2012 at 9:55pm — No Comments

Features Extraction: Co-occurrences and Graph clustering

In the last two posts we discussed co-occurrence analysis as a way to extract features in order to classify documents and extract "meta concepts" from the corpus.

We also noticed that this approach does not perform better than the traditional bag of words.

I would now like to explore some variations of this approach, taking advantage of graph theory.

The co-occurrence graph is really huge and complex; how could we reduce its complexity without big information…

Added by Cristian Mesiano on February 29, 2012 at 1:59pm — No Comments

Hadoop with Hive or an RDBMS? Learn when to use which!

Hadoop with Hive or an RDBMS? Do you know the right technology to use in Big Data Analytics?
Request this new white paper and learn when to use Hadoop, Hive and embedded MapReduce in RDBMS. Author Colin White of BI Research presents a set of scenarios outlining how Hadoop…

Added by Vincent Granville on February 28, 2012 at 3:00pm — No Comments

New Article on Data Mining: Classifiers & the ROC Curve

Many customer behaviors have the flavor of a choice between two alternatives:  Yes or no.  Buy or sell.  Renew or cancel.  Suppose software called a “classifier” is available to predict customer choices in advance.  Would you use it?  Perhaps you’d like to test it to see how well it performs before you commit.  In this installment of my series on the nuts and bolts of data mining, I discuss the use of classifiers and questions about their performance.  Regarding performance, we specifically…

Added by Daniel Graettinger on February 27, 2012 at 10:29am — No Comments
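
The teaser above asks how one might test a yes/no classifier before committing to it. One common way is to trace its ROC curve and compute the area under it. The following is a minimal sketch in plain Python, not code from the linked article; the function names `roc_points` and `auc` and the toy labels and scores are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels: 1 for the positive class (e.g. "renew"), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Lower the threshold one distinct score at a time.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_labels = [1, 1, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    curve = roc_points(toy_labels, toy_scores)
    print(curve)
    print(auc(curve))
```

An area close to 1 indicates that positives are almost always scored above negatives, while an area near 0.5 is what random guessing would produce.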

Compound scenarios: An efficient framework for integrated market-credit risk : ALGORITHMICS

or HOW ALGORITHMICS DOES IT !!!

Compound scenarios: An efficient framework for integrated market-credit risk


Ben De Prisco, Ian Iscoe, Yijun Jiang, Helmut Mausser

http://www.algorithmics.com/en/media/pdfs/algo-ra0507-arps-compoundscenarios.pdf

Added by John A Morrison on February 24, 2012 at 9:23am — No Comments

Quantifying of Extreme Events

Quantifying of Extreme Events

Vicky Fasen, Claudia Kluppelberg, Annette Menzel

September 28, 2011

abstract / summary

Understanding and managing risks due to extreme events is one of the most demanding topics of our society. We consider this problem as a statistical problem and present some of the probabilistic and statistical theory, which was developed to model and quantify extreme events. By the very nature of an extreme event…

Added by John A Morrison on February 24, 2012 at 9:10am — No Comments

THE PREDICTIVE POWER OF THE YIELD CURVE ACROSS COUNTRIES AND TIME

Menzie D. Chinn, Kavan J. Kucko



Working Paper 16398

http://www.nber.org/papers/w16398

NATIONAL BUREAU OF ECONOMIC RESEARCH

ABSTRACT



In recent years, there has been renewed interest in the yield curve (or alternatively, the term premium) as a predictor of future economic activity. In this paper, we re-examine the evidence for this predictor, both for the United States, as well as…

Added by John A Morrison on February 22, 2012 at 1:56am — No Comments

Are 4% mortgage interest rates a mirage?

I believe so. Here are some interesting thoughts on this:

You talk to a mortgage adviser at (say) Wells Fargo bank. You are interested in financing, own > 50%, have 2 salaries (your wife + yourself) that represent more than 50% of the amount you want to refinance, can make a 30% down payment and have an external income…

Added by Vincent Granville on February 21, 2012 at 6:30pm — 4 Comments

Attensity, Teradata and Anderson Analytics talk on the future of text analytics

The text analytics market is set to exceed £635mln as businesses look to capture customer sentiment to gain competitive advantage.

Companies from industries as diverse as financial services, pharmaceuticals and online retail are today looking to harness the voice of the customer across social networks to improve their services.

The technology to capture customer sentiment is becoming increasingly sophisticated, responsive, and flexible to distinct business needs. Despite the…

Added by Vincent Granville on February 21, 2012 at 5:23pm — No Comments

Detecting Economic Events Using a Semantics-Based Pipeline

Detecting Economic Events Using a Semantics-Based Pipeline

http://people.few.eur.nl/fhogenboom/papers/dexa11-speed.pdf

Alexander Hogenboom, Frederik Hogenboom, Flavius Frasincar, Uzay Kaymak, Otto van der Meer, and Kim Schouten

Erasmus University Rotterdam

Abstract.

In today's information-driven global economy, breaking news on economic…

Added by John A Morrison on February 21, 2012 at 8:10am — No Comments

Irrationality or Efficiency of Macroeconomic Survey Forecasts?

Irrationality or Efficiency of Macroeconomic Survey Forecasts?



Implications from the Anchoring Bias Test



Abstract



We analyze the quality of macroeconomic survey forecasts. Recent findings indicate that they are anchoring biased. This irrationality would challenge the results of a wide range of empirical studies, e.g., in asset pricing, volatility clustering or market liquidity, which rely on survey data to capture market participants’…

Added by John A Morrison on February 21, 2012 at 7:00am — No Comments

From Semantic Search & Integration to Analytics

From Semantic Search & Integration to Analytics

Amit Sheth 

LSDIS lab, University of Georgia, 415 Graduate Studies Research Center,

Athens, GA 30602-7404

Semagix Inc., 297 Prince Avenue,

Athens, GA 30601

Abstract.

Semantics is seen as the key ingredient in the next phase of the Web infrastructure as well as the next generation of enterprise content management. Ontology is the centerpiece of the most prevalent semantic technologies…

Added by John A Morrison on February 21, 2012 at 6:30am — No Comments

Document Classification: latent semantic vs bag of words. Who is the best?

A few posts ago we saw an approach to extracting meta "concepts" from text based on the latent semantic paradigm.

In this post we apply this approach to classify documents, and we compare it with the canonical bag of words.

The comparison test will be done through the ensemble method already shown in the last post.

To read the entire post click …

Added by Cristian Mesiano on February 20, 2012 at 7:22am — No Comments

Example of Bad Analytics and How to Remedy it

This came to my mailbox as a sales pitch from Autobox; however, I thought it was interesting:

Since we are always interested in learning about how others do time series and testing how our approaches work vis-à-vis other dated procedures, we pursued the data and would like to share our results. 



Sometimes in an…

Added by Vincent Granville on February 16, 2012 at 10:00pm — No Comments

J.D. Opdyke, Author: A Unified Approach to Algorithms Generating Unrestricted and Restricted Integer Compositions and Integer Partitions, J. of Mathematical Modelling and Algorithms, 2010, 9(1), 53-97

An original algorithm is presented that generates both restricted integer compositions and restricted integer partitions that can be constrained simultaneously by a) upper and lower bounds on the number of summands (“parts”) allowed, and b) upper and lower bounds on the values of those parts.  The algorithm is recursive, based directly on very fundamental mathematical constructs, and reasonably fast with good time complexity.  General solutions to the open problems of counting the number of…

Added by J.D. Opdyke on February 15, 2012 at 10:47am — No Comments
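
To make the constraints in the abstract above concrete, here is a minimal recursive sketch that enumerates integer compositions (ordered) and integer partitions (unordered) subject to bounds on both the number of parts and the value of each part. It is a plain backtracking illustration, not the algorithm from the paper; the function names and the assumption that parts are positive integers are ours.

```python
from typing import Iterator, Tuple


def restricted_compositions(
    n: int, min_parts: int, max_parts: int, min_value: int, max_value: int
) -> Iterator[Tuple[int, ...]]:
    """Yield ordered tuples of parts summing to n within the given bounds."""
    if min_value < 1:
        raise ValueError("this sketch assumes positive parts (min_value >= 1)")

    def extend(remaining: int, prefix: Tuple[int, ...]) -> Iterator[Tuple[int, ...]]:
        if remaining == 0:
            if min_parts <= len(prefix) <= max_parts:
                yield prefix
            return
        if len(prefix) == max_parts:
            return  # no room for another part
        for part in range(min_value, min(max_value, remaining) + 1):
            yield from extend(remaining - part, prefix + (part,))

    yield from extend(n, ())


def restricted_partitions(
    n: int, min_parts: int, max_parts: int, min_value: int, max_value: int
) -> Iterator[Tuple[int, ...]]:
    """Yield partitions of n (parts in non-increasing order) within the bounds."""
    if min_value < 1:
        raise ValueError("this sketch assumes positive parts (min_value >= 1)")

    def extend(remaining: int, prefix: Tuple[int, ...], cap: int) -> Iterator[Tuple[int, ...]]:
        if remaining == 0:
            if min_parts <= len(prefix) <= max_parts:
                yield prefix
            return
        if len(prefix) == max_parts:
            return
        # Each new part may not exceed the previous one, so every
        # partition is produced exactly once.
        for part in range(min(cap, remaining), min_value - 1, -1):
            yield from extend(remaining - part, prefix + (part,), part)

    yield from extend(n, (), max_value)


if __name__ == "__main__":
    # 5 written with 2 to 3 parts, each part between 1 and 3.
    print(list(restricted_compositions(5, 2, 3, 1, 3)))
    print(list(restricted_partitions(5, 2, 3, 1, 3)))
```

The only difference between the two generators is the cap on the next part: compositions treat (2, 3) and (3, 2) as distinct, while partitions keep parts in non-increasing order so each multiset appears once.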

10+ Great Metrics and Strategies for Email Campaign Optimization

This is our first article in a series about good actionable KPIs to optimize various ROI. Future articles will focus on metrics for fraud detection, user engagement etc. This one focuses on newsletter optimization.

If you run an online newsletter, here are a number of metrics you need to track:…

Added by Vincent Granville on February 12, 2012 at 8:00pm — 1 Comment

The Age of Big Data | New York Times

GOOD with numbers? Fascinated by data? The sound you hear is opportunity knocking.…

Added by Vincent Granville on February 12, 2012 at 10:29am — No Comments

J.D. Opdyke, Author: Bootstraps, Permutation Tests, and Sampling With and Without Replacement Orders of Magnitude Faster Using SAS®

A very efficient approach to random sampling in SAS® achieves speed increases orders of magnitude faster than the relevant "built-in" SAS® procedures. For sampling with replacement as applied to bootstraps, seven algorithms are compared, and the fastest ("OPDY"), based on the new approach, achieves speed increases over 220x faster than Proc SurveySelect. OPDY also handles datasets many times larger than those on which two hashing algorithms crash. For sampling without replacement as applied…

Added by J.D. Opdyke on February 12, 2012 at 9:30am — No Comments
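
For readers unfamiliar with the two sampling modes named in the abstract above, the sketch below shows them in plain Python: sampling with replacement to build a percentile bootstrap confidence interval, and sampling without replacement to draw a subset. It does not reproduce the OPDY algorithm or any SAS® procedure; the function names, defaults, and toy data are assumptions for illustration.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def bootstrap_ci(
    data: Sequence[float],
    statistic: Callable[[Sequence[float]], float] = mean,
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for `statistic`."""
    if not data:
        raise ValueError("data must not be empty")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(data)
    # Each resample draws n items WITH replacement from the original data.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return estimates[lo_idx], estimates[hi_idx]


def sample_without_replacement(
    data: Sequence[float], k: int, seed: int = 0
) -> List[float]:
    """Draw k items so that no position in `data` is picked twice."""
    return random.Random(seed).sample(list(data), k)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    toy = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
    print(bootstrap_ci(toy))
    print(sample_without_replacement(toy, 4))
```

The resampling loop is the part whose cost grows with both the dataset size and the number of resamples, which is why the speed of the sampling step matters for bootstraps and permutation tests.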
