
Hi all,

Do you have any experience (good or bad) with unsupervised fraud detection? What methods did you use: plain clustering or self-organizing maps (SOMs)?

Read any white papers?

I'm interested in the case where you have roughly 1000 observations, each of them a transaction.



Tags: detection, fraud, unsupervised


Replies to This Discussion

Thanks for the feedback, Kumud.
Association and link analysis sounds like an interesting approach to fraud detection, though it isn't applicable in my case.


Do you have any good references on association and link analysis? Perhaps links that explain it.
What does this refer to? I understand what is meant by "fraud detection" and by "unsupervised" but don't know if it is the fraud itself, or if it is detection of the fraud that is "unsupervised". In either usage, how does "unsupervised" fit?
I think it means that the algorithm used for detection is unsupervised clustering. In the context of spam or click fraud, unsupervised methods are typically used; in the context of credit card fraud, supervised methods are used. Supervised means that we have a training set at our disposal, with transactions already known to be fraudulent or non-fraudulent.
That's correct, Vincent.
The client wants me to develop a model that can identify suspicious behaviour that can be put on a watch list.
@John, I don't have any confirmed fraud cases, since the client wants to monitor a new type of fraud. So I can't use a supervised method to find suspicious activity; instead, some (unsupervised) clustering method can be used to identify transactions that may be related to the new fraud approach. That's why I used the term 'unsupervised fraud detection': what I meant was 'an unsupervised method to be used in fraud detection'.
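To make the clustering idea concrete, here is a minimal sketch of one common way to turn clusters into a watch list: fit a plain k-means model and rank transactions by their distance to the nearest cluster centre. It is not necessarily what anyone in this thread used. The two numeric features, the choice of k=2, the synthetic data and the names `kmeans` and `watch_list` are all assumptions made for illustration, and real features would need scaling to comparable ranges first.

```python
import math
import random


def kmeans(points, k, iterations=20):
    """Plain k-means on equal-length numeric tuples; returns the centroids."""
    # Deterministic start: k points spread evenly through the data.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def watch_list(points, centroids, top_n):
    """Indices of the top_n points farthest from their nearest centroid."""
    scores = [min(math.dist(p, c) for c in centroids) for p in points]
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]


# Synthetic stand-in for ~1000 transactions with two already-scaled features.
rng = random.Random(42)
normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
normal += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(500)]
odd = [(5.0, 30.0), (-20.0, 5.0), (30.0, -10.0)]  # injected unusual points
data = normal + odd

centroids = kmeans(data, k=2)
print(watch_list(data, centroids, top_n=3))
```

The output is a ranking to review by hand, not a fraud label: with no confirmed cases there is nothing to validate the cut-off against, so `top_n` is simply how many transactions the client can afford to look at.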

Hi Thomas,

I worked on a small project where I detected fraudulent fuel card transactions. The client was using SQL-based rules to detect fraud. I used plain clustering/profiling methods and was able to detect almost twice as many fraudulent transactions as the client caught with their existing rules.
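As a rough illustration of why profiling can catch cases a fixed rule misses, the sketch below scores each transaction against its own card's typical amount using a median/MAD based (modified) z-score. This is an assumed example, not the method used in the project described above; the `(card_id, amount)` layout, the 3.5 threshold and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def flag_unusual(transactions, threshold=3.5):
    """Return indices of transactions far from their card's usual amount.

    transactions: list of (card_id, amount) pairs (hypothetical layout).
    Score per transaction: 0.6745 * |amount - median| / MAD, per card.
    """
    by_card = defaultdict(list)
    for card_id, amount in transactions:
        by_card[card_id].append(amount)

    # Build a simple profile (median and median absolute deviation) per card.
    profiles = {}
    for card_id, amounts in by_card.items():
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        profiles[card_id] = (med, mad)

    flagged = []
    for idx, (card_id, amount) in enumerate(transactions):
        med, mad = profiles[card_id]
        if mad == 0:
            continue  # no spread in this card's history, so it cannot be scored
        if 0.6745 * abs(amount - med) / mad > threshold:
            flagged.append(idx)
    return flagged


# Made-up sample: card A usually spends about 40, card B about 300.
sample = [
    ("A", 40), ("A", 42), ("A", 38), ("A", 41), ("A", 39), ("A", 400),
    ("B", 300), ("B", 310), ("B", 295), ("B", 305), ("B", 290),
]
print(flag_unusual(sample))
```

A single global rule such as "amount above 350" would treat both cards alike, whereas the per-card profile only reacts when a card departs from its own history.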


That's the approach I'm using, and I find it quite successful.

My client is using some 'gut feeling generated' rules so the clustering method is of course much better at identifying odd/fraudulent behaviour.

I've been interested in this as well. Based on my limited research it seems Peer Group Analysis is a popular method.

Here is a paper on it.

There is also a short Wikipedia blurb on it (under Unsupervised Methods).
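For readers who have not seen the idea, here is a small sketch loosely in the spirit of peer group analysis: each account is compared with the accounts that behaved most like it in the past, and it is scored by how far its current value sits from that peer group. This is a simplified illustration and does not reproduce the linked paper; the weekly-total data layout, k=3, the plain mean/standard deviation summary and the function name are assumptions.

```python
import math
from statistics import mean, stdev


def peer_group_scores(history, current, k=3):
    """Score each account by its deviation from its peer group.

    history: dict account -> list of past weekly totals (equal lengths).
    current: dict account -> this week's total.
    k: peer group size (at least 2, so a spread can be computed).
    """
    scores = {}
    for target, past in history.items():
        # Peers are the k accounts whose past series is closest to the target's.
        others = [a for a in history if a != target]
        peers = sorted(others, key=lambda a: math.dist(past, history[a]))[:k]
        peer_now = [current[a] for a in peers]
        spread = stdev(peer_now)
        gap = abs(current[target] - mean(peer_now))
        if spread == 0:
            scores[target] = 0.0 if gap == 0 else math.inf
        else:
            scores[target] = gap / spread
    return scores


# Made-up data: two groups of similar accounts; "d" breaks away this week.
history = {
    "a": [100, 110, 105], "b": [102, 108, 107],
    "c": [98, 112, 103], "d": [101, 109, 106],
    "e": [500, 520, 510], "f": [505, 515, 512],
    "g": [495, 525, 508], "h": [498, 518, 515],
}
current = {"a": 104, "b": 109, "c": 101, "d": 400,
           "e": 515, "f": 509, "g": 520, "h": 511}

scores = peer_group_scores(history, current)
print(max(scores, key=scores.get))
```

Note that an account behaving oddly also inflates the spread of every peer group it belongs to, so a robust summary (median-based, for instance) may be preferable in practice.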
Hi Larry,

thanks a lot for the reference!


