
Real Life Example of Text Mining to Detect Fraudulent Buyers

The credit card transaction described here in detail is a real example of a fraudulent transaction performed by organized criminals. It went undetected by all the financial institutions involved, yet it is very easy to detect with simple text mining techniques.

It was not caught by any of the parties involved in processing (accepting or declining) the transaction in question: the merchant gateway, the bank associated with the cardholder, the bank associated with the merchant, the e-store, or Visa. It was manually declined by the manager of the e-store, who investigated the transaction.

In short, the scoring algorithms that financial institutions use to decide whether a transaction should be accepted could be significantly improved using findings from the case described below. The pattern associated with this specific online purchase is very typical of online fraudulent transactions.

  • We are dealing with a B2B merchant with a very good rating, located in the US. Financial institutions should have a field in their transaction databases to distinguish B2B from B2C or something else.
  • The purchase took place on a Friday night. This is very unusual for the merchant in question, and it is unusual for US B2B merchants in general.
  • Cardholder address is somewhere in Chicago, IL.
  • Phone number (716-775-8339) is listed in Grand Island, NY, although it was reported before as an Indian cell phone number.
  • The product being purchased carries a higher fraud risk. Historical data should show that the risk associated with this product is higher than for other products from the same merchant.
  • The purchaser's IP address corresponds to a domain name with a server in Dallas, TX, owned by Arunava Bhowmick, who is located in India. In addition, the domain name contains the term "proxy", a red flag by itself (unless it is a corporate proxy, which is not the case here).
  • A Google search on the phone number points to a fraud report about "christan kingdom shipping company Renee Darrin, Terrysa Leteff free car scam Pasadena, Texas".
  • The purchaser's email address is [email protected]. Its domain points to a non-existent website, and quite likely the purchaser provided a fake email address.

All these findings, which make this transaction highly suspicious, would have been extremely easy to detect in real-time, automatically, with a tiny bit of web crawling and text analytics, when the transaction was being reviewed by the merchant. Or even better, before it made its way to the e-store.
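As a rough illustration of how little text analytics two of these checks need, here is a minimal Python sketch. It assumes the reverse lookup and the web search have already been done by some other component, so it only inspects strings. The function names and the `FRAUD_TERMS` list are hypothetical, chosen for this example and not taken from any production system.

```python
import re

# Hypothetical keyword list for this sketch; a real system would tune it.
FRAUD_TERMS = ("fraud", "scam", "abuse", "spam")


def hostname_looks_like_proxy(hostname):
    """Flag a reverse-lookup hostname that contains the term 'proxy'."""
    return "proxy" in hostname.lower()


def fraud_terms_in_text(text):
    """Return the fraud-related terms that appear as whole words in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {term for term in FRAUD_TERMS if term in words}


# The search-result title quoted in the list above.
snippet = ("christan kingdom shipping company Renee Darrin, "
           "Terrysa Leteff free car scam Pasadena, Texas")
print(fraud_terms_in_text(snippet))
print(hostname_looks_like_proxy("fast-proxy.example.net"))  # made-up hostname
```

Whole-word matching keeps the check from firing on unrelated words that merely contain a flagged term. The proxy check is a plain substring test, so it would also flag corporate proxies, which would need a separate allow-list.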

Methodology to detect this type of fraud:

  • Capture a number of metrics on the online purchase form: phone number, e-mail address (keep in mind that the purchaser can fake these fields)
  • Record IP address of purchaser
  • Do a search on the e-mail address and the domain attached to it. Is the domain name empty? Is it a free email account (gmx, hotmail, yahoo, gmail)? From which country? Can you successfully ping the e-mail address?
  • Do a reverse lookup on the IP address to retrieve domain name. Is domain name a non-corporate proxy? From which country?
  • Are IP address, phone number, email address and cardholder address all from different states or countries?
  • Do a search on the phone number. Does the search return results containing any of the following strings: abuse, scam, spam, etc.?
  • Create a credit card transaction score that integrates the above rules.
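The rules above could be combined into a score along the following lines. This is only a sketch under stated assumptions: the `Transaction` fields, the rule weights, and the list of free email domains are all hypothetical, and the lookups (reverse DNS, geolocation, phone number search) are assumed to have been performed upstream so that the function works on plain strings.

```python
from dataclasses import dataclass, field

# Hypothetical lists and weights, chosen only to illustrate the idea.
FREE_EMAIL_DOMAINS = {"gmx.com", "hotmail.com", "yahoo.com", "gmail.com"}
SCAM_TERMS = ("abuse", "scam", "spam")


@dataclass
class Transaction:
    email: str
    ip_hostname: str                      # result of a reverse lookup on the IP
    regions: dict                         # e.g. {"ip": "TX", "phone": "NY"}
    phone_search_snippets: list = field(default_factory=list)


def score_transaction(tx):
    """Return (score, reasons); a higher score means a more suspicious purchase."""
    score, reasons = 0, []

    domain = tx.email.rsplit("@", 1)[-1].lower()
    if domain in FREE_EMAIL_DOMAINS:
        score += 1
        reasons.append("free email account")

    if "proxy" in tx.ip_hostname.lower():
        score += 2
        reasons.append("IP resolves to a proxy-like hostname")

    known = [r for r in tx.regions.values() if r]
    if len(known) > 1 and len(set(known)) == len(known):
        score += 2
        reasons.append("IP, phone and addresses all in different regions")

    text = " ".join(tx.phone_search_snippets).lower()
    if any(term in text for term in SCAM_TERMS):
        score += 3
        reasons.append("phone number appears in fraud reports")

    return score, reasons


suspicious = Transaction(
    email="buyer@gmail.com",              # made-up address
    ip_hostname="proxy.example.net",      # made-up hostname
    regions={"ip": "TX", "phone": "NY", "cardholder": "IL"},
    phone_search_snippets=["free car scam Pasadena, Texas"],
)
print(score_transaction(suspicious))
```

In practice the weights would be fitted on historical transactions labeled as fraudulent or legitimate, and the resulting score would feed a review threshold instead of being used as a hard accept or decline rule.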


Comment by Sandeep Raut on December 23, 2011 at 8:30pm

excellent vincent.
