All Blog Posts (2,210)

Top 10 challenging problems in data mining

  • Developing a unifying theory of data mining
  • Scaling up for high dimensional data and high speed data streams
  • Mining sequence data and time series data
  • Mining complex knowledge from complex data
  • Data mining in a network setting
  • Distributed data mining and mining multi-agent data
  • Data mining for biological and environmental problems
  • Data mining process-related problems
  • Security, privacy and data…
Continue

Added by Vincent Granville on April 19, 2008 at 12:30pm — No Comments

Applying the Markov copulae approach to modeling credit derivatives

In the latest issue of the Journal of Credit Risk, Bielecki et al. propose a dynamic bottom-up approach that uses Markov copulae for pricing and hedging credit index derivatives and ratings-triggered corporate step-up bonds.



The Markov copula procedure works efficiently and is a useful step towards developing a copula-like formalism for multivariate processes, which can be applied to the modeling of credit derivatives. Read the full article and receive the current edition of the… Continue

Added by Vincent Granville on April 15, 2008 at 2:55pm — No Comments

Narratives Events and Discourses

That's NED for short - our Strathclyde University improve-our-methodology self-help group.
I like to think it can help both qualitative and quantitative researchers, but I suspect I'm the only mixed-methods member!

Added by Dr Stephen Tagg on April 14, 2008 at 9:50am — No Comments

Increasing customer loyalty using data analytics

Loyalty Marketing has become a key strategy for most companies in today's competitive marketplace. The practice is based on a very simple premise - as you develop stronger relationships with your best customers, they will stay with you longer; the longer they stay, the more profitable they become.



It costs less to retain a customer than to acquire a new one. Retaining a customer involves many factors - personal relationships, product quality, customer service, price, and other brand… Continue

Added by Vincent Granville on April 6, 2008 at 12:15pm — 3 Comments

Predictive Model Markup Language - PMML

The Predictive Model Markup Language (PMML) is a markup language for statistical and data mining models. Think of it as a kind of HTML for statistical tasks such as regression and decision trees.



PMML is an XML-based language which provides a way for applications to define statistical and data mining models and to share models between PMML-compliant applications.
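To make the idea of sharing a model as XML concrete, here is a minimal Python sketch that reads a small linear regression out of a PMML-style document and scores one record. The document below is a hand-written illustrative fragment that was not validated against the PMML schema, and the field names (`age`, `income`, `spend`), the coefficients, and the helper functions are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written illustrative fragment (assumption: PMML 4.4 namespace).
# Field names and coefficients are made up for this example.
PMML_DOC = """
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="toy linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression" modelName="toy">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="spend" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.2"/>
      <NumericPredictor name="income" coefficient="0.01"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_4"}


def load_linear_model(pmml_text):
    """Return (intercept, {field: coefficient}) from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefs = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefs


def score(model, row):
    """Apply the linear model to one record given as a dict of field values."""
    intercept, coefs = model
    return intercept + sum(c * row[name] for name, c in coefs.items())


model = load_linear_model(PMML_DOC)
prediction = score(model, {"age": 40.0, "income": 50.0})
```

Because the model lives entirely in the XML, any application that understands the same elements could score it the same way, which is the vendor-independence the post describes.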



PMML provides applications a vendor-independent method of defining models so that proprietary issues and… Continue

Added by Vincent Granville on April 5, 2008 at 1:30am — No Comments

Nonparametric regression: the LOESS procedure

PROC LOESS implements a nonparametric method for estimating local regression surfaces pioneered by Cleveland (1979); also refer to Cleveland et al. (1988) and Cleveland and Grosse (1991). This method is commonly referred to as loess, which is short for local regression.
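As a rough illustration of what local regression does, the following Python sketch fits a weighted straight line around each query point, using tricube weights over the nearest fraction of the data. It is not SAS code and does not reproduce PROC LOESS; the function names and the `span` parameter are assumptions made for this example.

```python
import math


def tricube(u):
    """Tricube kernel: weight decreases smoothly to 0 as |u| reaches 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(x0, xs, ys, span=0.5):
    """Local linear estimate at x0 using roughly the nearest span*n points."""
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))
    # The distance to the k-th nearest neighbour sets the local bandwidth.
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12
    w = [tricube((x - x0) / h) for x in xs]

    # Weighted least squares for y = a + b * (x - x0); a is the fit at x0.
    sw = sum(w)
    if sw == 0.0:
        return sum(ys) / n  # degenerate case: fall back to the plain mean
    sx = sum(wi * (x - x0) for wi, x in zip(w, xs))
    sxx = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxy = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    det = sw * sxx - sx * sx
    if abs(det) < 1e-12:
        return sy / sw  # too few distinct points: use the weighted mean
    return (sxx * sy - sx * sxy) / det


def loess_curve(xs, ys, span=0.5):
    """Smoothed values at every observed x."""
    return [loess_point(x, xs, ys, span) for x in xs]
```

No parametric form is assumed for the whole curve: each fitted value comes from its own small weighted regression, and `span` controls how much of the data each local fit sees.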



PROC LOESS allows greater flexibility than traditional modeling tools because you can use it for situations in which you do not know a suitable parametric form of the regression surface. Furthermore, PROC LOESS is… Continue

Added by Vincent Granville on March 31, 2008 at 11:00pm — No Comments

Double Sampling and Calibrating a visitor usage meter

I've been consulting with a National Park on how to improve estimates of visitor usage. They have door-mounted meters, but these are not considered very accurate. Here is a summary of my sample design:





SUMMARY: National Park Administrators are required to file monthly reports of visitor totals at various park facilities such as visitor centers. Although door-mounted meters can be, and are, used to estimate these totals, there are issues with the quality of meter counts. Meters are… Continue

Added by J. Liddle on March 26, 2008 at 5:11am — No Comments

Upcoming Data Mining Conferences - Submission Deadlines

ISIPS: Interdisciplinary Studies in Information Privacy and Security, due Mar 30 (extended)

International Conference on Advanced Intelligence (ICAI-08), due Apr 1

The Fifth Conference On Email and Anti-Spam (CEAS 2008), due Apr 3

Intelligent Techniques for Web Personalization & Recommender Systems, due Apr 7

EMAIL-2008: AAAI 2008 Workshop On Enhanced Messaging, due Apr 7

Inference and Estimation in Probabilistic Time-Series Models, due Apr 11

ICDM '08 workshop… Continue

Added by Vincent Granville on March 25, 2008 at 7:30am — No Comments

New Ideas to Forecast Stock Market Trends

My interest has mostly been in trading QQQQ and other major indexes, hoping to build a small portfolio of contrarian (negatively correlated) indexes. I am interested in two types of strategies:



1. A strategy where some cash is dormant on a trading account for most of the time, yielding a return of less than 4% a year when in "dormant mode". Once in a while (it could be every two or three years, sometimes every six months), when large movements occur on the stock market, I step in as… Continue

Added by Vincent Granville on March 22, 2008 at 11:36am — 2 Comments

Salary survey for Data Warehouse and Business Intelligence Professionals

The purpose of this report is to gain a better sense of the people and teams who built and maintained business intelligence (BI) and data warehousing (DW) solutions during the 2007 calendar year. This report uses the term “BI” to refer to both business intelligence and data warehousing initiatives, and the term “BI professionals” to the individuals who deliver these initiatives. Specifically, the report looks at…

Continue

Added by Vincent Granville on March 21, 2008 at 2:00am — No Comments

What is Six Sigma?

The concepts surrounding the drive to Six Sigma quality are essentially those of statistics and probability. In simple language, these concepts boil down to, “How confident can I be that what I planned to happen actually will happen?” Basically, the concept of Six Sigma deals with measuring and improving how close we come to delivering on what we planned to do.



Anything we do varies, even if only slightly, from the plan. Since no result can exactly match our intention, we usually… Continue

Added by Vincent Granville on March 15, 2008 at 9:09am — No Comments

Bayesian Mark Recapture for Small Sample Sizes

3/10/08

I gave a short seminar at the National Marine Fisheries Service (NMFS) in Honolulu, Hawaii, entitled "Bayesian Mark-Recapture for Small Sample Sizes".



SUMMARY.

Mark-recapture methods are often used to estimate the abundance of rare or elusive populations but produce highly uncertain results when sample sizes are small. We develop a new estimator for a single-release, single-recapture experiment based on Bayesian methodology to handle this situation. The number of marked… Continue

Added by J. Liddle on March 11, 2008 at 11:52am — No Comments

AnalyticCircle, AnalyticTunnel and Other Analytic Domains

The two domains AnalyticCircle.com and AnalyticTunnel.com both currently point to AnalyticBridge. Feel free to use them when you invite contacts to join our network. Other related domains include



Continue

Added by Vincent Granville on March 8, 2008 at 10:00pm — No Comments

AnalyticBridge is a top 10,000 website in Switzerland

According to Alexa. Traffic breakdown:



US: 43%

Switzerland: 27%

Canada: 7%

UK: 7%

Argentina: 7%

Germany: 4%

India: 4%

Netherlands: 2%



Switzerland is known for being one of the best countries for scientific research. AnalyticBridge started on Feb 16th, 2008. Alexa rankings are based on reach (unique users) and pageviews per user, smoothed over a 1-month time period.



Source:… Continue

Added by Vincent Granville on March 6, 2008 at 12:30am — 1 Comment

How to encourage young people to pursue careers in statistics / high tech?

What do you think? Do you agree with Bill Gates' statement? Many math PhDs don't find it easy to get a job. Do recruiters agree with Bill?

See Bill Gates' question on LinkedIn. As of now, it has generated more than 3,000 answers on LinkedIn.

Added by Vincent Granville on March 3, 2008 at 1:00pm — 1 Comment

Source code for Robust Ridge and Linear Regression with Bootstrap

Works well with non-Gaussian data or outliers. Allows you to set bounds on the regression parameters (similar to ridge regression). Does not use matrix inversion, thus numerically stable. Robust parameter estimation based on Monte Carlo simulations and re-sampling. The source code can easily be modified to perform logistic regression. This package can be used by scientists, programmers, analysts or engineers with limited statistical knowledge. Works on Unix, Linux or Windows. We will help… Continue

Added by Vincent Granville on March 1, 2008 at 3:00pm — No Comments

The Analytics Guru - Web Analytics Blog

Marshall Sponder’s rantings about Web Analytics, Art, Social Media, Travels and Politics.

theanalyticsguru.wordpress.com

Added by Vincent Granville on March 1, 2008 at 2:00pm — No Comments

Derivatives Trading, Hedging and Volatility. London, April 2-3.

Wilmott-Taleb Seminar - 2-3 April - London



DERIVATIVES TRADING, HEDGING AND VOLATILITY IN THE REAL WORLD: AN EXCLUSIVE TWO-DAY WORKSHOP



Past delegates say: "The best course I have ever attended," "Immensely practical," "Passionate," "New insights," "I recommend it to my friends," "The course was excellent" and "Paul and Nassim make a great team."



Paul Wilmott and Nassim Nicholas Taleb have teamed up to give a two-day workshop on "Derivatives Trading,… Continue

Added by Vincent Granville on March 1, 2008 at 1:00am — No Comments

SAS Global Forum 2008. San Antonio, March 16-19.

Mark your calendars now for SAS Global Forum 2008 to be held March 16 - 19 at the Henry B. Gonzalez Convention Center in San Antonio, Texas.



SAS Global Forum 2008, the premier event for SAS professionals worldwide, has offered unequalled educational and networking opportunities for the past 33 years. Don't miss your chance to sharpen your SAS skills and network with other SAS professionals. You will find information on every topic of interest through more than 300 papers and… Continue

Added by Vincent Granville on March 1, 2008 at 1:00am — No Comments

Data Mining Definition

Having read from great authors and researchers, I had always taken the definition of data mining for granted. Following the tradition of Occam's Razor, I tried to summarize one myself but found it difficult to cover the entire, ever-evolving spectrum of data mining. Anyway, here is my try...



Data Mining is a process of extraction of non-trivial patterns from massive datasets which either provides descriptive insights of the data (not perceived without this… Continue

Added by Atif Abdul-Rahman on February 27, 2008 at 1:14pm — 2 Comments

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC