
5 Things You Can Learn About Analysis from the Intelligence Community

Abstract
Businesses have a large and growing need to analyze data. This is no easy task today with the exploding volumes of data pouring in from everywhere, and the enormous pressure to turn these mountains of data into information that can be acted upon quickly.

It is no surprise that organizations spend over $15B annually on Business Intelligence (BI) and Data Mining technologies. But with all of the focus on infrastructure technologies, there is little emphasis on the art of analysis (analytics).

This is an area where the private sector would be well served by studying the methods used by the US Intelligence Community. This community has been in the business of understanding massive amounts of data for a long time and the applications are as mission critical as they get.

So, what are the lessons that you can apply to your business today? This multi-part blog series will explore 5 specific areas in more detail.


Part 1 The Power of Link Analysis


Business Intelligence solutions often make use of charts and graphs to communicate information. We think visually, and pictures, or information visualizations, can be highly effective at communicating interesting insights to us quickly.

Recognizing this, newer solutions are augmenting their arsenals of charts with new visualizations that are more dynamic and specialized. For example, in addition to standard bar and pie charts, these include heat maps, bubble charts and timelines.

However, one visualization that the law enforcement community has used for a long time has been notably absent from mainstream analytics products: the Relationship Graph (also called a node-and-link diagram). Relationship Graphs fall under the science of Link Analysis, which is used to discover and understand relationships between seemingly unrelated entities. Increasingly, this is becoming an important exercise for businesses in everything from fraud identification to customer and market basket analysis.

The reason relationship analysis is getting so much attention is that our information landscape is getting more dynamic by the day. Most analysis tools require you to know what questions to ask in advance, for example, "What is my revenue by region?" or "How many customers do I have?" However, as soon as you want to explore and navigate through the mountains of information at your disposal, such tools fall short. Yet this is precisely what businesses must do today: discover the unknown and reveal the insights that provide a competitive advantage.

Again, this is not a new challenge for the intel community. They are routinely presented with massive amounts of data and a charge to discover the non-obvious connections. For example, imagine looking at reams of tabular data relating to flight and housing records of foreign visitors to the country. Now, look at the same data in a relationship graph (Figure 1).


Figure 1 - Link Analysis Diagram of Foreign Visitors

Here, an observation 'jumps out' at you: multiple people, coming in on different flights, are going to the same address. The human brain is an unparalleled pattern recognition engine, and when we identify patterns, we tend to draw inferences almost instantly. This is typical of link analysis. When done well, the resulting insights can be remarkable.
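The pattern in Figure 1 can also be surfaced programmatically. The following is a minimal Python sketch, assuming a hypothetical table of arrival records; the names, flight numbers, and addresses are invented for illustration and do not come from the figure. It builds a simple node-and-link structure and flags any address shared by several people who, between them, arrived on at least two distinct flights.

```python
from collections import defaultdict

# Hypothetical arrival records: (person, flight, destination address).
# All values are invented for illustration.
records = [
    ("Visitor A", "FL100", "12 Oak St"),
    ("Visitor B", "FL200", "12 Oak St"),
    ("Visitor C", "FL300", "12 Oak St"),
    ("Visitor D", "FL100", "9 Elm Ave"),
    ("Visitor E", "FL400", "77 Pine Rd"),
]

def build_links(rows):
    """Return an undirected adjacency map with person-flight and person-address links."""
    graph = defaultdict(set)
    for person, flight, address in rows:
        person_node = ("person", person)
        for other in (("flight", flight), ("address", address)):
            graph[person_node].add(other)
            graph[other].add(person_node)
    return graph

def shared_addresses(graph, min_people=2):
    """Find addresses linked to at least min_people who used two or more distinct flights."""
    hits = {}
    for (kind, name), neighbors in graph.items():
        if kind != "address":
            continue
        people = sorted(n[1] for n in neighbors if n[0] == "person")
        flights = {
            f[1]
            for p in people
            for f in graph[("person", p)]
            if f[0] == "flight"
        }
        if len(people) >= min_people and len(flights) >= 2:
            hits[name] = people
    return hits

graph = build_links(records)
suspicious = shared_addresses(graph)
print(suspicious)
```

A graph drawing makes this visible without writing the rule first; the code only confirms a pattern once you know to look for it, which is the point of the visualization.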

So, how could we apply something like that to a more typical business scenario?

Let's take a simple example of analyzing retail data relating to sales promotions (gift cards, etc.) that are currently being run. A common chart here is margin by product category (Figure 2); this is the question that I know to ask in advance. But I see here that I am losing money on Computer Games in the Eastern region. This is odd because I would expect this to be a profitable product category.

Figure 2 - CRM Example of Margin by Product Category.

So now the question is, why are we losing money here? To explore this in more detail, let's look at a relationship graph relating to these margin-eroding transactions. Let's specifically look at a relationship graph that highlights the actual customers along with the promotion that they are using to make the purchases and the state that they live in (Figure 3).


Now, while this graph is starting to get a little busy, where is your eye drawn right away? Note all of the activity relating to the two highlighted individuals. Notice also that both are making all of these purchases on the new Gift Card promotion, and both are from Massachusetts.

So, we have a couple of individuals that are making high volumes of purchases using these gift cards that I have been issuing. Remember that I didn't know that I was looking for this in advance. Indeed, this would be a difficult insight to reveal with traditional reporting unless I had a specific query pre-defined that called it out.

But with this particular visualization, the relationship jumps out at you; the pattern is detected. In this case, I have revealed potential fraud activity that would explain why I am losing money on an otherwise profitable product category in one particular geography.
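The same idea can be sketched in code. The example below is illustrative only: the transaction rows, the customer IDs, and the `min_links` threshold are hypothetical, and counting repeated links per customer is just one simple way to approximate what the eye picks out in the graph.

```python
from collections import Counter

# Hypothetical margin-eroding transactions: (customer, promotion, state).
# All values are invented for illustration.
transactions = [
    ("C001", "Gift Card", "MA"),
    ("C001", "Gift Card", "MA"),
    ("C001", "Gift Card", "MA"),
    ("C001", "Gift Card", "MA"),
    ("C002", "Gift Card", "MA"),
    ("C002", "Gift Card", "MA"),
    ("C002", "Gift Card", "MA"),
    ("C003", "Coupon", "NY"),
    ("C004", "Gift Card", "CT"),
    ("C005", "Free Shipping", "NH"),
]

def busiest_customers(rows, min_links=3):
    """Return (customer, promotion, state) combinations seen at least min_links times,
    ordered from most to least frequent."""
    link_counts = Counter(rows)
    frequent = [key for key, count in link_counts.items() if count >= min_links]
    return sorted(frequent, key=lambda key: -link_counts[key])

flagged = busiest_customers(transactions)
for customer, promotion, state in flagged:
    print(customer, promotion, state)
```

With this made-up data, the two heavy users of the Gift Card promotion stand out, mirroring the two highlighted individuals in the graph. In practice the threshold would need tuning, and a high count is a lead to investigate rather than proof of fraud.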

Highly interactive relationship graphs would allow us even more flexibility. For example, if every node and link is interactive, we could envision 'drilling down' or 'drilling out' to extend the analysis even further. The key theme here is being able to proceed with analysis at the speed of the human brain. Rather than working from a pre-defined set of questions, the analyst can explore the data and let the resulting insights drive where to go next.
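One way to picture 'drilling out' is as expanding the graph one hop at a time from a node the analyst selects. The following is a minimal sketch, assuming a hypothetical adjacency map; the node names and the `drill_out` helper are invented here and are not taken from any product.

```python
from collections import deque

# Hypothetical links between customers, promotions, and states.
links = {
    "C001": {"Gift Card", "MA"},
    "C002": {"Gift Card", "MA"},
    "C003": {"Coupon", "NY"},
    "Gift Card": {"C001", "C002"},
    "Coupon": {"C003"},
    "MA": {"C001", "C002"},
    "NY": {"C003"},
}

def drill_out(graph, start, hops):
    """Return every node within `hops` links of `start`, using breadth-first search."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the requested distance
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

one_hop = drill_out(links, "C001", 1)
two_hops = drill_out(links, "C001", 2)
print(sorted(one_hop))
print(sorted(two_hops))
```

Starting from one customer, the first hop shows that customer's promotion and state, and the second hop brings in other customers who share them. Each click in an interactive graph would correspond to one such expansion.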

Increasing amounts of data, coupled with shrinking time windows to understand and act on the information it contains, are driving businesses to seek new approaches to analytics. The ability to express relationships in a visual way is extremely powerful. The Intelligence Community (IC) has been using link analysis for years, and it is no surprise that the secret is getting out to the commercial BI world.

Stop back and visit us for part 2 of this series titled "Shift Your Lens". To learn more about Interactive Analytics, go to www.centrifugesystems.com.



Comment by Tony Agresta on July 22, 2010 at 10:04am
Thank you for the comment. You are correct: when you draw a graph with hundreds or thousands of linkages, it can get complex. We address this in many ways.

First, users can pre-process the data prior to creating a 'Data view' in Centrifuge. This helps select the data you want to analyze. Second, at any time, the user can filter data using any combination of fields. For example, you can filter based on flights, geo locations, business lines, or any other attribute that is part of your dataview. Third, we have a feature called "spin-off" which allows users to select sections of the relationship graph that may show an unusual pattern of behavior and then "spin-off-the-data." This creates a subset of data without losing the original set. Finally, we are about to release 2.0. In 2.0, we have a new feature called "graph search" that allows users to search the data for specific links and nodes prior to drawing the graph. This is also helpful in controlling graph size. Settings for the graph also provide guidance on how many link-node combinations will be drawn and warn the user to modify the graph based on this. So, there are loads of checks and balances that make the relationship graph usable.

Oftentimes, our users also use charts and timelines prior to drawing the graph. These summary level views help them interpret the data and select subsets prior to drawing a relationship graph.

Please let me know if you have any other comments or questions.

Thanks.
Comment by Yi-Chun Tsai on July 22, 2010 at 9:06am
Hi, Tony:
This seems very promising for the commercial BI world. My question is: if I have hundreds of thousands of customers, won't your visual graph get so crowded and busy that an analyst can't get anything out of it?
