Organisations have been investing in analytics that rely on internal and external data to gain a competitive advantage. However, legal and regulatory acts imposed nationally and internationally have become a challenge, especially for highly regulated sectors such as health and finance/banking. Data handlers like Facebook and Amazon have already sustained considerable fines or are under investigation for violations of data governance. The era of Big Data has further intensified the challenge of minimising the risk of data loss by introducing the dimensions of Volume, Velocity and Variety into confidentiality. Although Volume and Velocity have been extensively researched, Variety, "the ugly duckling" of Big Data, is often neglected and difficult to solve, thus increasing the risk of data exposure and data loss. To mitigate this risk, this paper proposes a framework that uses algorithmic classification and workflow capabilities to provide a consistent approach to data evaluation across the organisation. A rule-based system implementing the corporate data classification policy minimises the risk of exposure by helping users identify the approved guidelines and enforce them quickly. The framework includes an exception-handling process with appropriate approval for extenuating circumstances. The system was implemented as a Proof of Concept working prototype to showcase the capabilities and provide hands-on experience. The Information System was evaluated and accredited by a diverse audience of academics and senior business executives in the fields of security and data management. The audience had an average of approximately 25 years of experience, amounting to almost three centuries (294 years) in total. The results confirmed that the 3Vs are of concern and that Variety, according to 90% of the commentators, is the most troubling. In addition, with an approximate average of 60%, it was confirmed that appropriate policies, procedures and prerequisites for classification are in place whilst implementation tools are lagging.
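
The rule-based classification and approved-exception workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the prototype described in the paper: the sensitivity levels, the rule patterns, and the names `Rule`, `ExceptionRequest` and `classify` are all hypothetical, chosen only to show how a policy expressed as rules could be applied consistently and overridden only with approval.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical sensitivity levels, ordered from least to most restrictive.
LEVELS = ["public", "internal", "confidential", "restricted"]


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: str  # regular expression searched for in the text
    level: str    # level assigned when the pattern matches


# Illustrative rules only; a real corporate policy would define its own.
RULES = [
    Rule("card_number", r"\b(?:\d[ -]?){13,16}\b", "restricted"),
    Rule("email_address", r"[\w.+-]+@[\w-]+\.[\w.]+", "confidential"),
    Rule("internal_marker", r"(?i)\binternal use only\b", "internal"),
]


@dataclass
class ExceptionRequest:
    requested_level: str
    approved_by: Optional[str] = None  # set once an approver signs off


def classify(text: str, exception: Optional[ExceptionRequest] = None) -> str:
    """Return the most restrictive level whose rule matches the text."""
    level = "public"
    for rule in RULES:
        matched = re.search(rule.pattern, text) is not None
        if matched and LEVELS.index(rule.level) > LEVELS.index(level):
            level = rule.level
    # An exception overrides the rule outcome only when it carries an approval.
    if exception is not None and exception.approved_by:
        return exception.requested_level
    return level
```

In this sketch an unapproved exception request is ignored, so the rule outcome stands until an approver is recorded. This mirrors the idea of an exception process that requires appropriate approval.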



Publication Title

Big Data



Organisational Unit

School of Engineering, Computing and Mathematics