ISO/IEC 27046 — Information technology — Big data security and privacy — Implementation guidelines [DRAFT]
This standard is intended to help organizations implement the processes described in ISO/IEC 27045 in order to ensure the security and privacy of big data.
Scope and purpose
‘Big data’ is defined as “extensive datasets - primarily in the characteristics of volume, variety, velocity, and/or variability - that require a scalable architecture for efficient storage, manipulation, and analysis” [Source: ISO/IEC 20546].
Content of the standard
Status of the standard
The project started in 2019.
It is currently at 2nd Working Draft stage.
The definition of ‘big data’ currently adopted in the draft standard does not (in my personal opinion) reflect the term’s widespread use in the IT industry at present, mostly because of the vagueness of ‘extensive’, which is essentially synonymous with plain ‘big’ and adds little clarity.
Wikipedia is more helpful, for example:
“Current usage of the term big data tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem." Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on." Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data-sets in areas including Internet searches, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology and environmental research.”
For me, one of the defining characteristics of big data is that typical (mostly relational) database management systems struggle or fail to cope with the complexity and dynamics/volatility of truly massive data sets. Beyond the limits of their scalability, conventional architectures hit hard constraints and failures, no matter how much raw CPU power is thrown at the problem. That implies the need for fundamentally different approaches, and I rather suspect it entails novel information risks and hence novel security/privacy controls. However, it remains to be seen what this standard will actually address in practice: this is cutting-edge stuff.
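By way of illustration only (the draft standard does not prescribe any particular architecture, and all names below are hypothetical), here is a minimal sketch of one such ‘fundamentally different approach’: hash-based horizontal partitioning, or sharding, which spreads a data set across nodes so capacity grows by adding machines rather than by scaling up a single server.

```python
# Illustrative sketch only: hash-based sharding, one common scale-out
# technique in big-data systems. Not taken from ISO/IEC 27046.
import hashlib

NUM_SHARDS = 4  # hypothetical cluster of four storage nodes

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a record key to a shard deterministically, spreading load evenly."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each shard holds only its own slice of the data; in a real deployment
# these would be separate servers, not in-process dictionaries.
shards = [dict() for _ in range(NUM_SHARDS)]

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:1001", "alice")
put("user:2002", "bob")
assert get("user:1001") == "alice"
```

Note that distributing data this way is exactly where some of the novel risks arise: records now traverse networks between nodes, and any security or privacy control has to work consistently across the whole cluster rather than on one server.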