Big data - black gold?
01 April 2015
In 2013, bestselling author Dan Ariely referred to the widely used term 'big data' as the 'crude oil' of the new millennium - hugely valuable, but useless if unrefined. Johannes Petrowisch seeks to shed some light on big data archiving for manufacturers.
If big data really is this valuable, where and how should we be archiving the binary stockpile of black gold? This question is being asked more frequently by people in charge of production at manufacturing companies who have terabytes of information to store.
There are two main reasons why manufacturing companies want to archive large amounts of data on a long-term basis, and they are virtually the same across industries. The first is compliance.
Traceability and seamless documentation of product history - including information regarding creation, quality and quantity - give a company a clear record for accountability and, most importantly, documented proof of its adherence to legal requirements. The second reason, and the focus of this article, is that knowledge is power.
In the midst of the current fourth industrial revolution - or Industry 4.0, as it is more commonly known - information is power. The more data you collate and analyse today, the stronger and more accurate your predictions will be tomorrow.
Whether your reasons are quality management, predictive maintenance or simply staying ahead of the curve in innovation, there is a strong belief that the more data points archived, the better. So, with companies now hoarding their data like precious fossil fuels, where's best to keep this sensitive material?
Data, big or small, needs to be kept safe and readily accessible for analysis, or there's little point saving it in the first place. Our experience at COPA-DATA has highlighted a recurring trend: the cheaper the storage medium, the slower and more error-prone the re-reading of the data.
One less costly, but somewhat outdated, storage method involves moving data to external media such as magnetic tape. This makes searching for and extracting information cumbersome and time-consuming; it also increases the risk of data loss or theft. Moreover, the storage space is often insufficient for the gargantuan amounts of data recorded in industrial processes today.
Alternatively, the data can be saved in a database. This method makes the data easily accessible, but is more expensive than other approaches. For example, large quantities of information would need to be split into separate database 'shards'. This increases running costs and the complexity of operations and maintenance - and each shard becomes a single point of failure for its portion of the data. So where should the masses of data go?
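To make the sharding idea concrete, here is a minimal sketch of hash-based shard routing. The shard names and the record key format are hypothetical, purely for illustration; real deployments add rebalancing, replication and per-shard backups on top of this.

```python
import hashlib

# Hypothetical shard names; each would be a separate database instance.
SHARDS = ["archive_db_0", "archive_db_1", "archive_db_2"]

def shard_for(key: str) -> str:
    """Pick a shard deterministically by hashing the record key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same key always maps to the same shard, so reads know where to
# look - but every query now needs routing logic, and each shard must
# be operated, monitored and backed up separately.
print(shard_for("plant-7/line-2/2015-04-01"))
```

The determinism is the whole point: routing is cheap and stateless, but the operational overhead of running several databases is exactly the cost the article describes.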
Many are now turning towards the cloud for their archiving needs. Indeed, COPA-DATA estimates that updating to a big data cloud solution could reduce a company's total cost of ownership for storage by 40 to 60 percent.
When collecting plant data, large quantities need to be archived on a daily basis. The elasticity of the cloud makes it ideal for such scenarios. Cloud based storage is capable of rapidly processing large volumes of unstructured and often heterogeneous data to identify patterns that, in turn, can be used to improve business strategies. This can even be done in real-time.
Unsurprisingly, big data environments require big supporting structures. Clusters of servers are used to support the tools necessary for processing large volumes of information. The added benefit of cloud storage is that it is already deployed on pools of server, storage and network resources, which can be scaled up and down as needed.
Security has become of paramount importance to the manufacturing industry in recent years. With firms compiling and archiving ever larger amounts of historical data, the need for increased security has become apparent. As previously mentioned, archiving data on external physical media, such as magnetic tape, increases the risk of that information being lost or stolen.
However, with the cost of cyber attacks on critical infrastructure now running into billions of dollars' worth of damage, and the recent infamous celebrity cloud hacking scandal, you'd be forgiven for thinking that these forms of storage were just as vulnerable.
In reality, the celebrity hacking scandal was less a hack and more a case of personal information available online being abused. Cloud archiving is actually one of the safest methods of big data storage, guarding against both theft and loss.
For example, Microsoft cloud-integrated StorSimple storage (CiS) and Microsoft Azure, when combined with COPA-DATA's industrial automation software, zenon, form a secure, ergonomic and dynamic archiving system. Automatic backup, redundancy, disaster recovery and hardware encryption protect against data loss and unauthorised access. Another security advantage is that metadata is only saved in the local runtime application.
Conventional native archiving technology - consisting of aggregated archives, dynamic re-readability and trend evaluations - ensures that data points are not saved locally on the panel or PC, but on a hardware appliance in the internal network: the CiS. This dynamic storage gateway, with a current capacity of 120 terabytes per device, ensures that data is moved to, and safely archived in, Azure cloud storage.
For these reasons, COPA-DATA believes it makes sense for companies to look to cloud computing for their analytical and archiving needs. For today’s businesses, properly mining and refining data from an artefact into an asset drives innovation and gives competitive advantages. A cloud-based big data archiving system provides a business with the ability and ease to analyse on a large scale without security and cost worries.
Johannes Petrowisch is a partner account manager at COPA-DATA