One of the world's leading producers of cleaning and personal care products, with hundreds of billions of dollars in revenue, has dozens of production lines operating in multiple countries. The company's manufacturing process suffers substantially from unplanned stops caused by equipment failures (breakdowns), which lead to a loss of up to 20% in line uptime. As a result, every factory loses at least 5% of its income to lost productivity from unplanned downtime. The company aims to reduce this downtime by predicting unplanned stops before they occur and preparing maintenance in advance. To this end, it has been collecting data from hundreds of sensors that measure critical parameters on these production lines, such as temperature, pressure, and vacuum rates. At this stage, the company contacted Kaizen Intelligence to develop a better AI model for predicting stops and to help design an efficient maintenance strategy.

While the data volume is large, specific faults are few and far between. The sensor data is also noisy, with large gaps and collection errors. Because of these issues, the previous model, developed in-house, produced a large number of false positives, which made it impractical to deploy in the field. After obtaining sensor data collected over 2 years, we approached the problem as follows. We conducted a statistical analysis of the sensor data and noticed that a considerable portion of it consists of anomalies caused by sensor malfunctions and/or changes in the data collection intervals. Hence, we performed extensive data cleaning while preserving the integrity of the data. Guided by the analysis, we processed the time-series data to extract features useful for the problem. Since labels are sparse, good feature engineering can be critical to obtaining good performance. We built models of varying complexity, ranging from traditional methods such as boosted tree and random forest models to cutting-edge deep learning methods, with various combinations in between.
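The cleaning and feature-extraction steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the project: the function names, the valid-range check, the gap threshold, and the choice of window statistics are all assumptions made for illustration, and the readings are made up.

```python
from statistics import mean, pstdev


def clean_readings(readings, low, high, max_gap_s):
    """Drop missing or out-of-range readings and split the series at long gaps.

    readings: list of (timestamp_seconds, value) sorted by time; value may be None.
    Returns a list of contiguous segments, each a list of (timestamp, value).
    (Hypothetical helper; thresholds are assumptions, not values from the project.)
    """
    segments, current = [], []
    for ts, value in readings:
        if value is None or not (low <= value <= high):
            continue  # treat as a sensor malfunction and skip it
        if current and ts - current[-1][0] > max_gap_s:
            segments.append(current)  # gap too long: start a new segment
            current = []
        current.append((ts, value))
    if current:
        segments.append(current)
    return segments


def window_features(segment, window):
    """Mean, standard deviation, and net change over each run of `window` readings."""
    values = [v for _, v in segment]
    features = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        features.append({
            "timestamp": segment[end - 1][0],
            "mean": mean(chunk),
            "std": pstdev(chunk),
            "delta": chunk[-1] - chunk[0],
        })
    return features


# Made-up temperature readings: one missing value, one spike, one long gap.
readings = [
    (0, 20.0), (60, 20.5), (120, None), (180, 999.0), (240, 21.0),
    (3600, 22.0), (3660, 22.5), (3720, 23.0),
]
segments = clean_readings(readings, low=-50.0, high=150.0, max_gap_s=300)
features = window_features(segments[1], window=3)
```

Splitting at long gaps, rather than interpolating across them, keeps each window from mixing readings taken under different collection intervals. Either choice could be reasonable depending on the sensor.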

We ran extensive offline experiments for model selection and settled on a deep autoencoder architecture, which we carefully fine-tuned. Finally, we deployed our best model online to test it on real-time sensor data. The performance of our deep-learning-based model was observed over a period of 6 months. The model predicted more than 80% of faults, with 6 to 8 hours of advance notice on average, which gives the company ample time to plan its maintenance operations and prevent these stops. In this 6-month period, there were only 8 false alarm events, a false alarm rate well within what the operation can handle. We are currently collaborating with the company to extend this model to more fault types and apply it to a large number of production lines. This work is expected to yield $1.4 million in revenue for each line by increasing the uptime and operational efficiency of the line.
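Figures such as the fraction of faults predicted, the average advance notice, and the number of false alarms can be computed from alarm and fault timestamps. The sketch below shows one way to do so. The matching rule (an alarm counts toward a fault if it falls within a fixed horizon before it) and the function name are assumptions for illustration, not the evaluation protocol used in the project, and the numbers are made up.

```python
def evaluate_alarms(alarm_times, fault_times, horizon_h=24.0):
    """Score alarms against faults (times in hours since the start of observation).

    A fault counts as predicted if at least one alarm fell within `horizon_h`
    hours before it; its lead time is measured from the earliest such alarm.
    Alarms that match no fault count as false alarms.
    (Hypothetical helper; the horizon-based matching rule is an assumption.)
    """
    lead_times = []
    matched_alarms = set()
    for fault in fault_times:
        hits = [a for a in alarm_times if 0 < fault - a <= horizon_h]
        if hits:
            lead_times.append(fault - min(hits))
            matched_alarms.update(hits)
    false_alarms = [a for a in alarm_times if a not in matched_alarms]
    return {
        "recall": len(lead_times) / len(fault_times) if fault_times else 0.0,
        "mean_lead_time_h": sum(lead_times) / len(lead_times) if lead_times else 0.0,
        "false_alarms": len(false_alarms),
    }


# Made-up timestamps for illustration only; not the results reported above.
alarms = [10.0, 50.0, 93.0, 200.0]
faults = [17.0, 100.0, 150.0]
report = evaluate_alarms(alarms, faults, horizon_h=12.0)
```

In this toy example, two of the three faults are preceded by an alarm within the 12-hour horizon, and the alarms at hours 50 and 200 match no fault. The choice of horizon matters: a wider one raises the detection rate and lowers the false alarm count, so it should reflect how much notice maintenance planning actually needs.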
