A Data Science Central Community
Large-scale equipment for power generation, manufacturing, mining, and similarly sized operations is structurally important to the global economy. It turns raw materials into the energy and other products that keep the economy running.
Consider the gas-turbine electric plant. These installations can involve multiple large-scale gas turbines, like those manufactured by General Electric and Siemens, and they can supply power for thousands of homes and support many jobs. Keeping these turbines running, continuing to turn gas into electricity, requires precise manufacturing: bearings, blades, and shafts must be finely balanced for power generation to continue. It also requires strict maintenance schedules that address corrosion, fatigue, and wear.
As precisely as these giant machines are made, the extreme conditions of highly compressed gas and the continuous runtime requirements of power generation invariably cause some components to fail. Extreme temperatures, high pressure, corrosive environments, and many other factors lead to costly interruptions in power. These machines are intended to run for long periods, and any interruption in power generation, whether for maintenance or repair, is very expensive. To minimize these interruptions, turbine manufacturers and plant operators use statistical analysis to determine optimal maintenance schedules. Much like a personal vehicle, these giant power generators have parts that are designed to wear out and be replaced at planned intervals, and the machines themselves are built to be repaired as quickly as possible. These maintenance schedules help operators plan for a steady flow of electricity and reduce unplanned equipment failures.
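To make the idea of a statistically derived replacement interval concrete, here is a minimal sketch. The article does not say which statistical model manufacturers use; this example assumes a two-parameter Weibull lifetime model, a common choice in reliability engineering, and the function names and parameter values are hypothetical.

```python
import math


def weibull_reliability(t_hours: float, scale_hours: float, shape: float) -> float:
    """Probability that a part survives past t_hours under a Weibull model:
    R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t_hours / scale_hours) ** shape))


def replacement_interval(scale_hours: float, shape: float,
                         target_reliability: float) -> float:
    """Longest running time for which survival probability stays at or above
    target_reliability, found by solving R(t) = target for t."""
    if not 0.0 < target_reliability < 1.0:
        raise ValueError("target_reliability must be strictly between 0 and 1")
    return scale_hours * (-math.log(target_reliability)) ** (1.0 / shape)


# Hypothetical part: characteristic life of 20,000 hours, wear-out behaviour
# (shape > 1), replaced while survival probability is still at least 90%.
interval = replacement_interval(scale_hours=20_000, shape=2.0,
                                target_reliability=0.90)
print(f"Replace after about {interval:.0f} operating hours")
```

A schedule built this way gives a planned window for replacement, which is the kind of estimate the rest of this article contrasts with sensor-driven prediction.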
In spite of these efforts, and although the typical gas turbine has relatively few moving parts, occasional events still interrupt power output. A bearing could fail, or a sheared blade could degrade performance or even cause additional damage. These events, and many others, are tracked by the multitude of sensors in the turbine that monitor vibration and temperature levels, ambient air conditions, exhaust properties, compression levels, and much more. This is where traditional statistical methods fall short: they give a window in which an event is expected, but not the indicators that predict a specific failure.
With so much sensor data being created, however, energy producers are often overwhelmed. The constantly generated, unstructured sensor data are likely to contain many of the predictors of a failure, but collecting, storing, and analyzing all that data has proven to be a daunting task. As a result, the primary approach to predictive maintenance has, to date, been limited to statistical estimation.
Instead of trying to predict failure on the macro scale, these large-scale industrial operations need a solution that makes better use of the sensor data they already have. A more advanced approach involves streaming queries, a process of repeatedly querying data as it is recorded. These ongoing queries have been an advancement in data analysis, but they rely on the analyst to first identify the mode of failure and write queries for events that are already well understood. Such events become identifiable as they happen, but the queries are not truly predictive.
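A streaming query of the kind described above can be sketched in a few lines. This is an illustration only, not any vendor's query engine: the `Reading` record, the sensor names, the window size, and the threshold are all hypothetical. Note that the analyst has to decide in advance which sensor and which threshold matter, which is exactly the limitation the paragraph points out.

```python
from collections import deque
from typing import Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    timestamp: float  # seconds since start of recording (assumed unit)
    sensor: str       # hypothetical sensor name, e.g. "bearing_vibration"
    value: float


def streaming_query(readings: Iterable[Reading], sensor: str,
                    window: int, threshold: float) -> Iterator[Tuple[float, float]]:
    """Re-evaluate a fixed rule as each reading arrives: yield
    (timestamp, rolling_mean) whenever the mean of the last `window`
    values for `sensor` exceeds `threshold`."""
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    for reading in readings:
        if reading.sensor != sensor:
            continue  # the rule was written for one known sensor only
        recent.append(reading.value)
        if len(recent) == window:
            mean = sum(recent) / window
            if mean > threshold:
                yield (reading.timestamp, mean)


# Usage with made-up vibration values: the rule fires once the rolling mean rises.
stream = [Reading(float(i), "bearing_vibration", v)
          for i, v in enumerate([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])]
for ts, mean in streaming_query(stream, "bearing_vibration", window=3, threshold=3.0):
    print(f"t={ts}: rolling mean {mean:.2f} exceeds threshold")
```

The rule only fires after the vibration has already risen, so it detects a known pattern rather than discovering which readings precede a failure.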
Advanced machine-data methods that are typically used to analyze streaming network data are well suited to these large-scale, sensor-intensive applications. With comparable data volumes, approaches like graph analytics automate data analysis, reducing the pitfalls of human bias. The approach uses software derived from graph theory to map and analyze the connections in machine and sensor data, revealing the data points that are associated with failure and allowing analysts to identify the sensor readings that correlate with costly interruptions.
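As a rough illustration of mapping connections in sensor data, the sketch below builds a small co-occurrence graph: nodes are event labels, and an edge weight counts how many time windows contained both labels. This is a simplified stand-in chosen for clarity, not the specific graph analytics software the article has in mind, and the event labels and windows are invented.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, Dict[str, int]]


def build_cooccurrence_graph(windows: Iterable[Set[str]]) -> Graph:
    """Build an undirected weighted graph from sets of event labels.
    Each set holds the labels observed together in one time window."""
    graph: Graph = defaultdict(lambda: defaultdict(int))
    for events in windows:
        for a, b in combinations(sorted(events), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def neighbors_ranked(graph: Graph, node: str) -> List[Tuple[str, int]]:
    """Neighbours of `node`, strongest connection first (ties broken by name)."""
    return sorted(graph.get(node, {}).items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical windows of flagged sensor conditions; "FAILURE" marks an outage.
windows = [
    {"high_vibration", "high_exhaust_temp", "FAILURE"},
    {"high_vibration", "FAILURE"},
    {"high_exhaust_temp"},
    {"high_vibration", "low_inlet_pressure"},
]
graph = build_cooccurrence_graph(windows)
print(neighbors_ranked(graph, "FAILURE"))
```

Unlike a hand-written streaming rule, nothing here names a sensor in advance: the ranking comes from the data. Raw co-occurrence counts favour conditions that are simply frequent, so a real analysis would need some normalization and far more data than this toy example.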
These more predictive methods, coupled with traditional statistical analysis, will help modernize power generation, making the grid more efficient and more cost-effective using the data that we already have.