A Data Science Central Community
For decades, the intelligence community has been collecting and analyzing information to produce timely and actionable insights for intelligence consumers. But as the amount of information collected increases, analysts are facing new challenges in terms of data processing and analysis. In this article, we explore the possibilities that graph technology is offering for intelligence analysis.
The digital age brought new possibilities for Intelligence, Surveillance, and Reconnaissance across both traditional and new intelligence sources. The possibilities of collection within each discipline have widened. For instance, Open Source Intelligence (OSINT) collection channels multiplied with the Internet, providing access to valuable new information. The spread of digital technologies also expanded content production, and thus collection possibilities, with users generating and sharing content from portable devices anywhere in the world.
But those changes come at a cost for analysts:
This has a direct impact on the analysis. It’s difficult and time-consuming to handle those large, dynamic, and varied data assets. And in the meantime, the complexity of threats remains the same. To identify them, analysts must be able to cross-check various data assets in order to spot key elements and patterns that will produce actionable intelligence.
To renew and improve the traditional intelligence cycle, intelligence producers are turning to new tools and methods. Among those tools, we find graph technology. The underlying approach allows analysts to rapidly access relevant data and sift through large heterogeneous collections to find the small subset that holds high-value information.
The graph technology approach relies on a model in which you deal with data as a network. Information is stored as nodes, connected to each other by edges representing their relationships. This is actually a natural way to think about intelligence data: whether it’s people, telecommunication or events, the elements often form networks in which they are linked to each other.
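As a minimal sketch of this model, assuming a plain in-memory structure rather than any particular graph database, nodes and edges can be represented as follows. The labels (`Group`, `Event`, `City`), the relationship names, and the sample values are hypothetical and only serve the illustration.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}                      # node id -> {"label": ..., properties}
        self.adjacency = defaultdict(list)   # node id -> [(relationship, neighbor id)]

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = {"label": label, **properties}

    def add_edge(self, source, relationship, target):
        # Store both directions so the graph can be walked from either end.
        self.adjacency[source].append((relationship, target))
        self.adjacency[target].append((relationship, source))

    def neighbors(self, node_id, label=None):
        """Return ids of adjacent nodes, optionally filtered by label."""
        return [
            other for _, other in self.adjacency[node_id]
            if label is None or self.nodes[other]["label"] == label
        ]


graph = Graph()
graph.add_node("g1", "Group", name="Group A")        # made-up sample data
graph.add_node("e1", "Event", date="2014-03-02")
graph.add_node("e2", "Event", date="2014-03-09")
graph.add_node("c1", "City", name="City X")
graph.add_edge("g1", "PERPETRATED", "e1")
graph.add_edge("g1", "PERPETRATED", "e2")
graph.add_edge("e1", "OCCURRED_IN", "c1")

print(graph.neighbors("g1", label="Event"))  # ['e1', 'e2']
```

Answering "which events is this group linked to?" is a lookup on the node's adjacency list, with no join across tables.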
Graph or RDF databases are optimized for the storage of connected data. They emerged as an answer to the limitations of traditional databases. Relational databases were designed to codify and store tabular structures. While they are very good at it, they do not perform well when it comes to handling large volumes of connected data. Graph databases, on the other hand, offer several advantages over traditional technology when it comes to connected data:
Popular graph storage vendors include DataStax, JanusGraph, Neo4j, and Stardog. These systems have developed widely over the last decade, responding to the growing need for a technical solution for organizations working with connected data at scale.
With graph technology, you can combine multi-dimensional data, including time series, demographic or geographic data. It aggregates data from multiple sources and formats into a single, comprehensive data model that can scale up to billions of nodes and edges.
This is essential in multi-intelligence or all-source analysis to identify suspicious patterns, anomalies or irregular behavior. Indeed, suspicious activities are more easily detected when you analyze the dynamics between entities and not just the characteristics of single entities. With this approach, analysts can gather data about people, events, and locations, for example, into one view and analyze it there.
In the end, graph technology offers several advantages to intelligence and law enforcement agencies. It provides a single entry point to multiple data sources and data types that are integrated under a unique model. Analysts can produce intelligence from the analysis of heterogeneous data and its connections.
Introducing graph databases into an organization comes with a set of new challenges. How can analysts access the data in a suitable way? How can they find information hidden in a complex web of billions of nodes and relationships? That’s where graph visualization and analysis tools come in handy.
While the graph approach offers a unified model, finding insights in the enormous volume of data remains a challenge for analysts. Added to this is the pressure from intelligence consumers, who expect analysts to deliver intelligence insights in a timely manner.
As we previously explained, visualization tools can be a great asset for investigation.
When you work with connected data, graph visualization and analysis is definitely a more efficient method than the traditional analysis of spreadsheets or data stored in relational databases.
In addition, graph analysis offers a valuable set of methods to get insights from connected data. For example, there are many algorithms, derived from graph theory and social network analysis, that can be used to identify communities, to spot highly connected individuals or to understand flows of information through a network.
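Two of the simplest such measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any specific tool: the names and links are made up, degree (the number of direct connections) stands in for "highly connected", and connected components are used as a crude substitute for real community detection algorithms.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Turn a list of undirected (a, b) pairs into a neighbor lookup."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degree_ranking(adjacency):
    """Nodes sorted by number of direct connections, highest first."""
    return sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))


def connected_components(adjacency):
    """Group nodes that can reach each other (a crude stand-in for communities)."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue, component = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical contact network.
edges = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy"), ("eve", "fay")]
adjacency = build_adjacency(edges)

print(degree_ranking(adjacency)[0])      # most connected individual
print(connected_components(adjacency))   # two separate groups
```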
Graph investigation tools, such as Linkurious Enterprise, are an additional asset for intelligence analysts facing the challenges of big data. These tools are designed to enable analysts to uncover insights hidden in complex datasets by leveraging the power of graph databases. They also provide more agility than in-house tools or complex proprietary platforms, such as i2 or Palantir.
When it comes to threat detection and investigation, graph investigation tools reduce the complexity and noise induced by the nature and volume of the processed data.
In Linkurious Enterprise, a complex data domain with different data sources or multiple entity types becomes a single, comprehensive graph. Analysts can visually investigate vast data collections. They can search for known patterns and suspicious links from a browser-based interface. Data filters and visual styles help them focus on what’s important and reduce the noise generated by large amounts of data.
Below, we used OSINT data to showcase some of the visualization and analysis capabilities of Linkurious Enterprise. We used a publicly available dataset, the Global Terrorism Database. We modeled part of the data (from 2013 to 2016) into a graph database following a simple graph model.
The data was then ingested into a graph database using a script (there are a few different options to model and import the data). Our database contains about 90,000 nodes and 240,000 relationships that are now all available for investigation in Linkurious Enterprise.
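The script itself is not shown here. As a rough sketch of what such an import step can look like, the following Python turns tabular rows into nodes and relationships. The column names, labels, relationship names, and sample rows are all assumptions made for illustration; they are not the actual Global Terrorism Database schema or the model used in this article.

```python
import csv
import io

# Made-up sample rows with hypothetical column names.
SAMPLE_CSV = """event_id,date,province,city,group_name
1,2014-01-05,Province A,City X,Group A
2,2014-01-20,Province A,City Y,
3,2014-02-11,Province B,City Z,Group A
"""


def load_graph(csv_file):
    """Build nodes and relationships from tabular event records."""
    nodes = {}      # (label, key) -> properties
    edges = set()   # (source, relationship, target)
    for row in csv.DictReader(csv_file):
        event = ("Event", row["event_id"])
        # Keyed by name only; real data would need a province-qualified key.
        city = ("City", row["city"])
        province = ("Province", row["province"])
        nodes[event] = {"date": row["date"]}
        nodes.setdefault(city, {"name": row["city"]})
        nodes.setdefault(province, {"name": row["province"]})
        edges.add((event, "OCCURRED_IN", city))
        edges.add((city, "LOCATED_IN", province))
        if row["group_name"]:  # the perpetrator may be unknown
            group = ("Group", row["group_name"])
            nodes.setdefault(group, {"name": row["group_name"]})
            edges.add((group, "PERPETRATED", event))
    return nodes, edges


nodes, edges = load_graph(io.StringIO(SAMPLE_CSV))
print(len(nodes), "nodes,", len(edges), "relationships")  # 9 nodes, 8 relationships
```

Storing edges in a set means a city that appears in many rows is linked to its province only once.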
Analysts can use full-text search capabilities to look for specific information in the database. With a few clicks, it’s possible to visualize all the terrorist attacks that happened in France between 2013 and 2016. The brown and green nodes represent the provinces of France and their cities, respectively. Each blue node represents a terrorist act recorded in a city. When the perpetrators are known, they are represented by a yellow node linked to the events.
Figure 1: Terrorist activities recorded in the Ile de France province.
Figure 2: Terrorist activities perpetrated by Salafi jihadist groups in France.
Figure 3: Terrorist activities recorded in Corsica.
Figure 4: Terrorist activities recorded in southern France regions.
The underlying graph structure gives us a better understanding of the events. The connections and the different categories of data (events, locations, people) provide some contextual information helpful for the analysis. For instance, by looking at the node clusters and the relationships, we can identify that:
From this quickly generated visualization, we are able to identify the main terrorist trends in France (rise of Islamic terrorism, local conflicts, and nationalist movements). In a real-life scenario, professional intelligence analysts can provide accurate reports based on the analysis of conflict and terrorist data.
Graph models support the aggregation of heterogeneous data, so it’s possible to enrich our OSINT data with geospatial information. In our example, every “Event” node carries geolocation properties, allowing us to display them on a map within Linkurious Enterprise. Below is an example of a geospatial visualization, with events shown as red nodes on the map, covering a month of terrorist activity in 2014.
Within intelligence teams, this feature is used to track down a series of events happening in a region in a short time-frame. A cluster of kinetic events is a known pattern for a trained analyst, who can identify correlations and underlying terrorist tactics.
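The same idea, events close to each other in both space and time, can be approximated in code. The sketch below is an assumption-laden illustration, not a feature of Linkurious Enterprise: the thresholds (50 km, 7 days, at least 3 events), the field names, and the coordinates are all hypothetical, and the grouping is a simple neighborhood count rather than a proper clustering algorithm.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearby_events(events, anchor, max_km, max_days):
    """Events close to `anchor` in both space and time (anchor excluded)."""
    return [
        e for e in events
        if e is not anchor
        and abs((e["date"] - anchor["date"]).days) <= max_days
        and haversine_km(e["lat"], e["lon"], anchor["lat"], anchor["lon"]) <= max_km
    ]


def find_clusters(events, max_km=50, max_days=7, min_size=3):
    """Return groups of event ids where one event has enough close neighbours."""
    clusters = []
    for anchor in events:
        group = [anchor] + nearby_events(events, anchor, max_km, max_days)
        if len(group) >= min_size:
            clusters.append(tuple(sorted(e["id"] for e in group)))
    # Different anchors in the same cluster produce the same tuple; keep one.
    return sorted(set(clusters))


# Made-up events: three close together within a week, one far away.
events = [
    {"id": "e1", "lat": 42.0, "lon": 9.0, "date": date(2014, 5, 1)},
    {"id": "e2", "lat": 42.1, "lon": 9.1, "date": date(2014, 5, 3)},
    {"id": "e3", "lat": 42.2, "lon": 9.0, "date": date(2014, 5, 6)},
    {"id": "e4", "lat": 48.8, "lon": 2.3, "date": date(2014, 5, 2)},
]

print(find_clusters(events))  # [('e1', 'e2', 'e3')]
```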
The advantage of graph databases is that they allow you to quickly traverse a high number of entities and relationships to retrieve information. This is a big change from systems based on relational databases, in which querying connections is a compute- and memory-intensive operation whose cost can grow exponentially with the depth of the search. With Linkurious Enterprise, it’s possible to leverage the power of graph databases to search for a specific scenario, such as “are these two seemingly unconnected terrorist groups connected, and if so how?”. For intelligence analysts, this can help identify key individuals, correlate a series of events with people, or understand the dynamics at work within an organization. Combined with their knowledge and experience, detecting patterns in connected data is an additional asset for conducting intelligence analysis.
Below is an example of a visualization generated with a graph query that matches the world’s ten deadliest attacks since 2013 and their connections to groups, cities, and locations.
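The "are these two groups connected, and if so how?" question is a shortest-path search. A minimal sketch using breadth-first search over an in-memory adjacency list is shown below; graph databases typically offer their own path queries for this, so the code only illustrates the principle. All node names are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the shortest chain of nodes or None."""
    previous = {start: None}   # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Made-up network: two groups linked only through a shared phone number.
adjacency = {
    "Group A": ["Person 1"],
    "Person 1": ["Group A", "Phone 555"],
    "Phone 555": ["Person 1", "Person 2"],
    "Person 2": ["Phone 555", "Group B"],
    "Group B": ["Person 2"],
    "Group C": [],
}

print(shortest_path(adjacency, "Group A", "Group B"))
# ['Group A', 'Person 1', 'Phone 555', 'Person 2', 'Group B']
print(shortest_path(adjacency, "Group A", "Group C"))  # None
```

The returned chain answers both parts of the question: whether a connection exists, and through which intermediaries.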
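The query behind this visualization is not reproduced here. The logic it describes, rank events by fatalities, keep the top ones, then pull in what they are connected to, can be sketched as follows. The property name `fatalities`, the sample figures, and the links are made up for illustration.

```python
import heapq

# Hypothetical events and their links to groups and cities.
events = {
    "e1": {"fatalities": 12},
    "e2": {"fatalities": 40},
    "e3": {"fatalities": 3},
    "e4": {"fatalities": 25},
}
links = [
    ("e1", "City X"),
    ("e2", "City Y"),
    ("e2", "Group A"),
    ("e4", "City Z"),
]


def deadliest_with_context(events, links, k):
    """Top-k events by fatalities, each with the nodes it is connected to."""
    top = heapq.nlargest(k, events, key=lambda e: events[e]["fatalities"])
    return {e: sorted(target for source, target in links if source == e) for e in top}


print(deadliest_with_context(events, links, k=2))
# {'e2': ['City Y', 'Group A'], 'e4': ['City Z']}
```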
In Linkurious Enterprise, pattern detection can be automated as alerts. This reduces the analysts’ workload, with the platform automatically monitoring large volumes of data to uncover hidden connections and complex patterns.
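How Linkurious Enterprise implements alerts is not described here. The general principle of a standing pattern check can still be illustrated: a rule is re-run as data arrives, and only new matches are reported. In the sketch below, the rule (a contact tied to two or more groups) and all sample data are hypothetical.

```python
from collections import defaultdict


def shared_contact_alerts(memberships, seen_alerts):
    """Flag contacts tied to two or more groups; report each match only once."""
    groups_by_contact = defaultdict(set)
    for contact, group in memberships:
        groups_by_contact[contact].add(group)
    new_alerts = []
    for contact, groups in sorted(groups_by_contact.items()):
        alert = (contact, tuple(sorted(groups)))
        if len(groups) >= 2 and alert not in seen_alerts:
            seen_alerts.add(alert)
            new_alerts.append(alert)
    return new_alerts


seen = set()
batch_1 = [("Phone 555", "Group A"), ("Phone 777", "Group B")]
batch_2 = batch_1 + [("Phone 555", "Group B")]  # new link arrives later

print(shared_contact_alerts(batch_1, seen))  # []
print(shared_contact_alerts(batch_2, seen))  # [('Phone 555', ('Group A', 'Group B'))]
print(shared_contact_alerts(batch_2, seen))  # [] (already reported)
```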
In our examples, we created our database in a limited time, from a single data source. However, it is possible to add data from additional sources to enrich the database, depending on the questions you want to answer. For instance, you could add data from phone interceptions or financial transactions to identify potential relationships between attacks.
In addition to what we just saw, analysts can use advanced graph analysis, a set of methods expressly designed to find insights in connected data. There are for instance many algorithms, derived from graph theory, that can be used to identify communities, to spot people who occupy a key position in a network or to understand how information, money, or people flow through a network.
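One common way to spot people who occupy a key position is betweenness centrality, which counts how many shortest paths between other nodes pass through a given node. The sketch below implements Brandes' algorithm for a small unweighted, undirected graph. It is an illustration on made-up data, and choosing betweenness here is an assumption; the article does not say which measures are used in practice.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for unweighted, undirected graphs."""
    score = dict.fromkeys(adjacency, 0.0)
    for source in adjacency:
        # Breadth-first search counting shortest paths from `source`.
        order = []
        parents = {n: [] for n in adjacency}
        paths = dict.fromkeys(adjacency, 0)
        distance = dict.fromkeys(adjacency, -1)
        paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    paths[neighbor] += paths[node]
                    parents[neighbor].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = dict.fromkeys(adjacency, 0.0)
        for node in reversed(order):
            for parent in parents[node]:
                dependency[parent] += paths[parent] / paths[node] * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]
    # Each undirected pair was counted once from each end.
    return {n: s / 2 for n, s in score.items()}


# Two tight groups (a, b, c) and (d, e, f) joined only through "x".
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "x"],
    "x": ["c", "d"],
    "d": ["x", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}

scores = betweenness(adjacency)
print(max(scores, key=scores.get))  # x
```

Node `x` has only two connections, fewer than `c` or `d`, yet every path between the two groups runs through it, which is the kind of broker role a simple connection count would miss.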
In the end, graph technology enables intelligence analysts to tackle the changes induced by the big data era. It’s an asset in the processing, storage, and analysis of the complex data collected today. While graph databases are great for aggregating and connecting a multitude of sources in one place, Linkurious Enterprise helps teams of analysts find hidden intelligence within large graphs. It highlights connections in the data, allowing analysts to better understand and analyze complex situations. As a result, analysts can better exploit their data to generate high-value insights.