Volume 10 - Issue 3
Session-dependent Usage Control for Big Data
- Gabriele Baldi
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
gabriele.baldi@iit.cnr.it
- Yair Diaz
German Research Center, Huawei Technologies Dusseldorf GmbH, Munich, Germany
yair.diaz@huawei.com
- Theo Dimitrakos
German Research Center, Huawei Technologies Dusseldorf GmbH, Munich, Germany
theo.dimitrakos@huawei.com
- Fabio Martinelli
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
fabio.martinelli@iit.cnr.it
- Christina Michailidou
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
christina.michailidou@iit.cnr.it
- Paolo Mori
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
paolo.mori@iit.cnr.it
- Oleksii Osliak
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
oleksii.osliak@iit.cnr.it
- Andrea Saracino
Consiglio Nazionale delle Ricerche, Istituto Informatica e Telematica, Pisa, Italy
andrea.saracino@iit.cnr.it
Keywords: Big Data, Usage Control, Access Control, Hadoop
Abstract
Business strategies are increasingly driven by the integrated analysis of huge volumes of heterogeneous
data coming from different sources, such as social media or Internet of Things
devices. So-called Big Data are considered relevant assets by companies and organizations,
since they can be analysed to create new valuable knowledge and insights that could help managers
in their strategic decisions. The full potential of Big Data can be realized when the information
comes from several distinct sources, with different characteristics and target audiences. However,
data producers are not always willing to share their data with other companies, due to a lack of trust
and the absence of a data protection framework that can be adopted in a Big Data environment. In
this work, we present BigUCON, a framework that exploits the Usage Control paradigm
to provide enhanced, expressive and flexible authorization support for data protection within
such an environment. The framework is integrated into Apache Hadoop, a software library
that provides the infrastructure for storing, mining and processing large data sets through a collection
of open-source software.