Manifesto for secure data sharing
Last year, at a project meeting in Dagstuhl, Germany, I explained complex security scenarios for remote sharing of databases among multiple teams. My peer Breanndan from the University of Amsterdam commented: 'Why did you make it so complex, when we need it simple? Look: I have a database, my colleague has even more data, and you have the data mining software. We want to run your software over our data, that’s it! Ideally, we need to form a dynamic secure data sharing group on the fly. We give each other the access rights, we run the job, and then we immediately disband the group.'
'Yes,' I replied, 'this is precisely what my Virtual Organizations are for. And by the way, this isn’t a new idea.'
But his point was well taken and it got me thinking. This "simple" scenario is in
fact complex to implement. But really, why did we make it so complex for the
end user?
To step back a bit: over a decade ago, people understood the need to share large data sets, applications, and even machines. Long before the pervasive Web 2.0 and P2P file sharing craze, this need had been acknowledged by the scientific community. There, the idea of Virtual Organizations (VOs) was born: dynamic groups of mutual trust that people (or processes) could set up on the fly. Entities in a VO would share a digital identity scheme to securely identify and authorize each other, share whatever they needed, and then disband the VO as soon as the task was complete. Soon, the first Virtual Organizations were implemented… or were they really?
Nope. Bureaucracy won. It is striking how unfriendly to the user administrative policies can become. In corporations and universities alike, there is one security bottleneck: the human administrator with his procedures. On the pretext of technology requirements, and out of fear of responsibility, we have killed productivity and forgotten the premise of on-the-fly, secure yet easy sharing.
I have not seen many production virtual organizations in action. Instead, I saw a medical project where thousands of anonymized patient records, in the absence of an effective sharing mechanism, were shipped to another institution on a hard drive in a parcel. I have also been to a research center where a crafty project team had quietly set up their own wifi router with a direct Internet cable, to bypass the absurd security imposed by their institution.
Is this rational? For mission-critical data, I agree that it is. But we’ve grown mature enough to understand that different data call for different degrees of confidentiality. In many cases, imposing an equally strict security regime on all data is ridiculous.
Hence my…
Manifesto for secure data sharing
- Free the ordinary users! Let them decide to share.
Today, institutions leave this decision to an
expert or an administrator. But that’s not practical. Everyone owns data these
days. Sharing data is not an IT concept any more, it is our daily bread. Web
2.0 and ubiquitous P2P file shares, whether we want them or not, demonstrate it
best. So: administrators should guard the data critical for the enterprise. Temporary
data of a project should be owned and managed directly by the team.
- A user-friendly data sharing interface should not require IT skills.
Because sharing is not an expert action, it should not need an expert’s involvement. Ideally, it should be a drag-and-drop GUI.
- Data sharing must be easy, efficient and take seconds.
I know cases where the procedure to add members to a group can take hours or even days, mainly due to the involvement of third parties. I think multiple levels of human approval are okay, if that really is the policy of an institution for critical data. Technology should enable, but not enforce, such a model.
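To make the scenario from the beginning of this article concrete, here is a minimal Python sketch of such a short-lived sharing group: members join, each owner shares their own data, a job runs over it, and the group is disbanded. It is an illustration only. The `SharingGroup` class, the `run_job` helper, and the names and sample data are all hypothetical and are not the AdHoc API; a real system would also authenticate users through a shared identity scheme, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class SharingGroup:
    """Hypothetical short-lived sharing group (illustration, not the AdHoc API)."""
    members: set = field(default_factory=set)
    grants: dict = field(default_factory=dict)  # resource name -> (owner, data)
    active: bool = True

    def _require_active(self):
        if not self.active:
            raise PermissionError("group has been disbanded")

    def join(self, user):
        self._require_active()
        self.members.add(user)

    def share(self, owner, name, data):
        # The data owner decides to share; no administrator is involved.
        self._require_active()
        if owner not in self.members:
            raise PermissionError(f"{owner} is not a group member")
        self.grants[name] = (owner, data)

    def read(self, user, name):
        self._require_active()
        if user not in self.members:
            raise PermissionError(f"{user} is not a group member")
        return self.grants[name][1]

    def disband(self):
        # Disbanding revokes every grant and membership at once.
        self.grants.clear()
        self.members.clear()
        self.active = False


def run_job(group, user, names, job):
    """Run `job` over the shared resources that `user` is allowed to read."""
    return job([group.read(user, name) for name in names])


# Form the group, share, run the job, disband.
group = SharingGroup()
for person in ("breanndan", "colleague", "pawel"):
    group.join(person)
group.share("breanndan", "db_a", [1, 2, 3])
group.share("colleague", "db_b", [4, 5, 6, 7])
total = run_job(group, "pawel", ["db_a", "db_b"],
                lambda tables: sum(len(t) for t in tables))
group.disband()
```

After `disband()`, any further `read` or `share` raises `PermissionError`, which is the point of the exercise: access lasts only as long as the task does.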
Breanndan’s remark last year was indeed very stimulating. We went back to the drawing board and came up with an entirely new way of thinking about data sharing security, which was basically what the users wanted. It was baptized AdHoc, as it empowers ordinary users to set up ad-hoc collaborations in three seconds, just like the scenario described at the beginning of this article. AdHoc 1.1.0 was announced just a few days ago. My skilled team crafted this hilarious movie, which describes AdHoc with tongue in cheek.
Big kudos to the poor virolabers, including myself, who became overnight movie stars at one project retreat. Disclaimer: any resemblance of the BigDisaster story to the numerous existing, successful collaborative data sharing projects is purely coincidental.
While AdHoc is now available for users, I’m also interested in further
discussion on practical requirements for secure yet efficient and easy data
sharing. Feel free to comment.



AdHoc, great idea.
Posted by: a.k. | September 04, 2009 at 03:25 PM
Though it is related to other technologies (clouds) the newly established Google's effort "The Data Liberation Front" http://www.dataliberation.org/ has quite a similar goal.
Posted by: Chris Wilk | September 15, 2009 at 02:45 PM
I am watching in wonder the transformation of Gridwisetech from a grid expertise center into a database expertise center. Not only database, but an Oracle Database. The two kinds of expertise are rare to find in a single individual.
I am dreaming or not...
Miha
Posted by: Miha Ahronovitz | March 02, 2010 at 04:24 PM
Thanks for the kind comment, Miha. In fact, this transformation came naturally. We deployed grids in corporations to make their IT scalable, on-demand and agile. But in many cases scaling out the processing was of no use, as long as the monolithic data I/O was in the way.
Now that we have built expertise in scaling out both the database and the processing, it is easier to tackle scalability problems of an enterprise holistically.
Posted by: Pawel Plaszczak | March 17, 2010 at 10:20 AM