
Big data has big security woes

Securing big data is as crucial as monetizing it, particularly for IT execs in charge of transforming big data infrastructure and advancing to cloud storage.

25 Sep 2018

Big data security is as important as monetizing the data, particularly for IT executives in charge of transforming data infrastructure, and it has to be baked in, not added as an afterthought. Not only do a company’s reputation and customer trust depend on protecting consumer privacy; regulators also have all eyes on it, with the EU’s General Data Protection Regulation (GDPR) now in effect and ready to dish out substantial penalties and fines to offenders.

And yet, security breaches continue to happen regularly. In just the past few days, breaches have been reported at the US State Department, Western Australia’s Perth Mint and Air Canada. At Mobile World Congress in Barcelona earlier this year, McAfee CEO Chris Young declared that the firm was dealing with hundreds of thousands of unique threats every day. What’s more, the greater the number of stakeholders (users, systems, business units and partners) involved in big data initiatives, the harder it is to catch security breaches and oversights.

Revel in anonymity

This is leading many organizations to consider new ways of protecting user data, such as anonymizing it.

“The really interesting thing about data anonymization is the more you anonymize data, the less useful the dataset is,” says Steve Bowker, CEO and Co-founder of Cardinality, a data analytics software company. Bowker has been leading a TM Forum Catalyst proof of concept called Data Anonymizing API. “So, there is a fine line between meeting regulatory compliance and your own internal governance processes around privacy and security, yet the data still remaining useful. It’s a very important balance.”

The Catalyst project focuses on ways for communications service providers (CSPs) to anonymize customers’ data so that it can be used safely and in compliance with GDPR. It does this through an API that disguises personally identifying data, so that individuals can no longer be identified directly or indirectly. The team used a process called ‘pseudonymization’ to replace personally identifiable information fields within a data record with artificial identifiers, or pseudonyms. The pseudonyms make the records unidentifiable when they’re shared, but the data can be restored to its original state when necessary, allowing individuals to be re-identified.
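The Catalyst’s actual API isn’t published in this article, but a minimal Python sketch can illustrate the idea: PII fields are swapped for keyed pseudonyms, and a protected lookup table allows authorized re-identification later. The field names, key handling and record format below are all invented for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"keep-this-key-in-a-vault"  # hypothetical; a real system fetches this from a KMS

def pseudonymize(value: str) -> str:
    """Derive a deterministic pseudonym from a PII value using a keyed hash."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

class PseudonymVault:
    """Keeps the pseudonym-to-original mapping so records can be restored."""

    def __init__(self):
        self._reverse = {}

    def protect(self, record, pii_fields):
        """Replace PII fields with pseudonyms; the record stays analyzable."""
        out = {}
        for field, value in record.items():
            if field in pii_fields:
                token = pseudonymize(value)
                self._reverse[token] = value
                out[field] = token
            else:
                out[field] = value
        return out

    def restore(self, record, pii_fields):
        """Reverse the pseudonymization for authorized re-identification."""
        return {f: self._reverse.get(v, v) if f in pii_fields else v
                for f, v in record.items()}

vault = PseudonymVault()
cdr = {"msisdn": "+34600123456", "name": "Alice", "cell_id": "A-17", "mb_used": 42}
safe = vault.protect(cdr, {"msisdn", "name"})       # share this version
original = vault.restore(safe, {"msisdn", "name"})  # re-identify when permitted
```

Because the pseudonyms are deterministic, the same customer yields the same token across records, so joins and aggregate analytics still work on the protected data.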

Security bake-off

A report from Informatica argues that rather than building infrastructure first and attempting to bolt on security policies and regulatory compliance later, platforms must have best-practice data security baked in from the start. According to the report, in practice this means companies need to enable:

Discovery and identification

A 360-degree view of sensitive data is crucial to a risk-centric approach to big data management. IT needs to be able to discover, classify and monitor sensitive data stores wherever they reside, and routinely profile that data for exposure to potential threats. Additionally, non-intrusive data masking is required to protect data assets even in the event of a perimeter breach, by de-identifying sensitive data in development and production environments, much as the Catalyst project mentioned above did using Open APIs.
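As a toy illustration of discovery plus masking (the patterns and tag format are invented for this sketch; real classification engines cover far more data types than two regexes):

```python
import re

# Toy discovery-and-masking pass over a text blob.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def discover(text: str) -> dict:
    """Report which sensitive data types appear in the text."""
    return {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}

def mask(text: str) -> str:
    """De-identify sensitive values while preserving record shape."""
    for name, pat in PII_PATTERNS.items():
        text = pat.sub(f"<{name.upper()}-MASKED>", text)
    return text

line = "Customer alice@example.com called +34 600 123 456 about billing."
print(discover(line))  # {'email': ['alice@example.com'], 'phone': ['+34 600 123 456']}
print(mask(line))      # Customer <EMAIL-MASKED> called <PHONE-MASKED> about billing.
```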

Risk scorecards and analytics

A comprehensive view of all sensitive data is crucial to detecting and responding to threats. Risk analytics and scorecards automate the detection of high-risk scenarios and exceptions, based on modeling, score trends, usage and proliferation analysis, so IT is alerted instantly. When it comes to big data security, speed is everything: the longer a threat goes unnoticed, the harder the damage becomes to diagnose, let alone reverse. Cybersecurity firm Mandiant found that it takes an average of 146 days to detect a malicious attack in an organization’s environment, and in that time a substantial amount of information can be stolen and entire infrastructures compromised.
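A hedged sketch of what such a scorecard might boil down to, with entirely hypothetical signals, weights and alert threshold:

```python
# Hypothetical risk scorecard: combine simple per-store signals into one
# score so high-risk scenarios raise an alert automatically.
WEIGHTS = {
    "sensitive_records": 0.4,  # volume of PII in the store
    "unmasked_copies":   0.3,  # proliferation into non-production copies
    "anomalous_access":  0.3,  # unusual usage patterns
}
ALERT_THRESHOLD = 0.7

def risk_score(signals: dict) -> float:
    """Weighted sum of risk signals, each clamped to the 0..1 range."""
    return sum(WEIGHTS[name] * min(max(value, 0.0), 1.0)
               for name, value in signals.items())

store = {"sensitive_records": 0.9, "unmasked_copies": 0.8, "anomalous_access": 0.4}
score = risk_score(store)
if score >= ALERT_THRESHOLD:
    print(f"ALERT: risk score {score:.2f} exceeds threshold")  # fires at 0.72
```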

Universal protection

Threats to big data security are evolving rapidly. According to IT Governance’s ISO 27001 Global Report, there was a 25% increase in data breaches in 2017. So when it comes to securing data across systems, users and regions, shortcuts aren’t an option. A company's big data security strategy needs to be holistic enough to provide masking, encryption, and access control across data types (both live and stored data) and environments (production and non-production).
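The report doesn’t prescribe tooling, but as a minimal at-rest encryption sketch (using the third-party Python cryptography package, installed separately; a real deployment would fetch keys from a KMS or HSM rather than generating them next to the data):

```python
from cryptography.fernet import Fernet  # third-party: pip install cryptography

# Minimal at-rest encryption sketch. The key would come from a
# key-management service in practice, never be generated inline.
key = Fernet.generate_key()
cipher = Fernet(key)

record = b'{"msisdn": "+34600123456", "plan": "unlimited"}'
token = cipher.encrypt(record)      # safe to store or replicate
restored = cipher.decrypt(token)    # requires access to the key
assert restored == record
```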

Centralized policy-based security

Operationally, it’s important that big data security doesn’t become a burden on IT or an obstacle to experimentation. Companies must be able to create and monitor security policies centrally, then distribute them to users, systems and regions. At big data scale, this policy-based approach also makes compliance more manageable, since some privacy laws mandate location- and role-based data controls. Or you could just follow Wetherspoons’ example and delete your entire marketing database.
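To close with something concrete, here is a hypothetical sketch of the centrally defined, role- and region-aware policy table described above (the schema and rules are invented for illustration):

```python
# Hypothetical central policy table: declare once who may see which data
# in which region, then push the decisions out to every downstream system.
POLICIES = [
    {"dataset": "customer_pii", "region": "EU", "role": "analyst", "action": "mask"},
    {"dataset": "customer_pii", "region": "EU", "role": "dpo",     "action": "allow"},
]

def decide(dataset: str, region: str, role: str) -> str:
    """Return the first matching action; default to deny."""
    for policy in POLICIES:
        if (policy["dataset"], policy["region"], policy["role"]) == (dataset, region, role):
            return policy["action"]
    return "deny"

print(decide("customer_pii", "EU", "analyst"))  # -> mask
print(decide("customer_pii", "EU", "intern"))   # -> deny (no matching rule)
```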