Category Archives: Architecture

Access Layer Components

The Task Force work has been divided into two broad areas – exemplars and enablers. The ShowUsABetterWay competition has taken forward the exemplars strand. On the enablers side, we have been working on a model for a different architecture for public service information.

Our enablers work looks at what is needed for a usable Access Layer according to this model. I have tried to illustrate this with some simple questions from a re-user perspective. These are:

DISCOVERY – can I find the data that I want?

LEGAL – am I allowed to use the data?

TECHNICAL – is the data in the right format?

COMMERCIAL – can I afford the data that I need?

I shared these questions during a very useful workshop we recently took part in, organised by the Open Knowledge Foundation.

Discussions at the workshop generated some additional questions that we also need to ask of the access layer.

INTELLIGIBILITY – can I easily interpret the data that I am accessing?

DEPENDENCIES – does this data depend on anything else that could affect my use of it?

This last point on dependencies is drawn from the free software world where software is bundled into packages that are managed by tools like apt for Linux. Each package is constructed so that it is aware of its dependencies on other software packages.

Package libraries such as the Comprehensive Perl Archive Network (CPAN) have grown up for related software families, and this model has been deliberately adopted for the Comprehensive Knowledge Archive Network (CKAN).

The discussion that is ongoing around the use of geographical information that can be regarded as ‘derived’ from Ordnance Survey data is a very good illustration of the need to have dependency information associated with any public dataset.

Data that is dependent on Ordnance Survey data may have very different re-use characteristics from data that has been constructed independently. These significant differences may also apply in respect of other dependencies, for example on Royal Mail Postcode Address File data.
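
The dependency idea can be sketched in code. The following is a minimal illustration, assuming a hypothetical registry of dataset packages in which each package lists the datasets it is derived from. The class, field names, dataset names and licence labels are all invented for this example; they do not describe CKAN's actual data model or the real terms attached to any dataset.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetPackage:
    """Hypothetical package record; the fields are illustrative only."""
    name: str
    licence: str  # placeholder label, not a statement of real terms
    depends_on: list[str] = field(default_factory=list)


def all_dependencies(name: str, registry: dict[str, DatasetPackage]) -> set[str]:
    """Return every dataset that `name` depends on, directly or indirectly."""
    found: set[str] = set()
    stack = list(registry[name].depends_on)
    while stack:
        dep = stack.pop()
        if dep in found:
            continue  # already visited; also protects against cycles
        found.add(dep)
        # Dependencies missing from the registry are recorded but not expanded.
        if dep in registry:
            stack.extend(registry[dep].depends_on)
    return found


def inherited_restrictions(name: str, registry: dict[str, DatasetPackage]) -> list[str]:
    """List the dependencies of `name` whose licence label is not 'open'."""
    return sorted(
        dep
        for dep in all_dependencies(name, registry)
        if dep in registry and registry[dep].licence != "open"
    )


# Invented example data: names and licence labels are assumptions.
registry = {
    package.name: package
    for package in [
        DatasetPackage("base-map", "restricted"),
        DatasetPackage("address-file", "restricted"),
        DatasetPackage("ward-boundaries", "open", ["base-map"]),
        DatasetPackage("school-catchments", "open", ["ward-boundaries", "address-file"]),
        DatasetPackage("rainfall-readings", "open"),
    ]
}

print(inherited_restrictions("school-catchments", registry))
print(inherited_restrictions("rainfall-readings", registry))
```

In this sketch a dataset labelled open can still inherit constraints from something it was derived from two steps away, which is the kind of information a re-user would want surfaced before building on it.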

Richard Allan, Task Force Chair


Filed under Architecture

Packaging Data for Reuse

The Open Knowledge Foundation has been working for some time on promoting knowledge ‘which anyone is free to use, re-use and redistribute without legal, social or technological restriction’. (The comprehensive definition of Open Knowledge is also interesting as it makes explicit reference to ‘government and other administrative information’.)

As part of their work, OKF have developed a tool to package up information resources in a model which will be familiar to anyone working with open source packages and languages such as Perl. This tool, the Comprehensive Knowledge Archive Network, now includes packages for some of the information that has been released as part of the Task Force’s work.

As more data is released into the public domain, signposting it in smart ways will become increasingly important. Getting to the right data and being offered clear information about its technical and legal characteristics is a critical part of a sound access layer. While these are still early days for CKAN, this does look like a promising approach.

Richard Allan, Task Force Chair

Filed under Architecture

More Architecture

Some diagrams to stimulate further thought and discussion on what needs to be done architecturally to realise the power of public information.

Diagram 1 – the ‘traditional approach’

The ‘traditional’ architecture

The emphasis of much web development to date has been on the presentation of the data to the public.

The assumption was that a particular website would be the unique interface to a particular set of data.

This meant that little or no thought might have been given to how anyone else would use the data set in question.

Sometimes the data and any analysis of it could be unpicked from such a site but in many instances this would be extremely difficult.

Diagram 2 – a Power of Information architecture

A Power of Information Architecture

If we want to realise the power of much public information then we need to start thinking differently about the way we treat data sets.

There is a need for an Access Layer to the data.

This must address all the issues that are necessary to enable use of the data. These typically include technical issues such as file formats, intellectual property issues such as copyright, and commercial issues such as pricing where applicable.

With access to the data enabled, multiple players can create their own analyses of it.

There is a need for a further Access Layer to the output of the analysis activity. This must again address any technical, intellectual property and commercial issues.

With the Access Layers in place there is scope for multiple web presentations of the data. Additional value can be generated through the ability to interact with a community around the data.

The power of the information is therefore fully realised when all layers are in place and the architecture is designed to offer opportunities for interaction.

Diagram 3 – Example – Hansard in the old world

Hansard Old Model

It is helpful to illustrate this transition with the example of Hansard, the record of the UK Parliament’s proceedings.

www.parliament.uk has offered an integrated approach to presenting Hansard for a number of years. The Hansard data is wrapped up with Parliament’s own analysis output and presented to the public in an official website.

The Parliamentary site receives a lot of traffic and has evolved over the years to become more accessible.

But it did not offer all the functionality wanted by civic activists, who have been working on alternative solutions to do more with Hansard and other Parliamentary content.

These alternative solutions have produced a more open architecture for this data.

Diagram 4 – Example – Hansard in the new world

Hansard in the new world

An access layer has been created for Hansard, using a screen scraper to address the technical issues and a Click-Use Licence to address the copyright issues.

The scraped data goes through an analysis process at publicwhip.org.uk.

Access to the output of this analysis process is offered by means of XML data under a Creative Commons license. An API has been produced to make it very easy to get this data.
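
To give a flavour of what XML access to analysis output makes possible, here is a minimal sketch of a re-user consuming such a feed. The XML fragment, its element and attribute names, and the member names are invented for illustration; they are not the actual schema or API of publicwhip.org.uk or TheyWorkForYou.

```python
import xml.etree.ElementTree as ET

# Invented sample data: NOT the real schema of any Hansard-derived feed.
SAMPLE_XML = """\
<divisions>
  <division id="d1" date="2008-01-15">
    <vote member="Member A" choice="aye"/>
    <vote member="Member B" choice="no"/>
    <vote member="Member C" choice="aye"/>
  </division>
  <division id="d2" date="2008-01-16">
    <vote member="Member A" choice="no"/>
    <vote member="Member B" choice="no"/>
  </division>
</divisions>
"""


def tally_votes(xml_text: str) -> dict[str, dict[str, int]]:
    """Count votes per choice for each division in the XML document."""
    root = ET.fromstring(xml_text)
    results: dict[str, dict[str, int]] = {}
    for division in root.findall("division"):
        counts: dict[str, int] = {}
        for vote in division.findall("vote"):
            choice = vote.get("choice")
            counts[choice] = counts.get(choice, 0) + 1
        results[division.get("id")] = counts
    return results


print(tally_votes(SAMPLE_XML))
```

Because the data arrives in a structured format, a new analysis like this takes a few lines of code rather than a custom scraper for each site.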

TheyWorkForYou.com provides a very good and popular presentation layer for this content.

The data as reworked by TheyWorkForYou is also commonly presented in many other places on the web such as MPs’ personal sites.

There is a comment facility built into TheyWorkForYou to provide a layer of interaction around the content.

It is also cited in many blogs, which generate their own interaction, and it features in mainstream media, stimulating further discussion.

The new architecture now provides a platform for more innovation around the Hansard data set with very low barriers to doing this.

Richard Allan, Task Force Chair

Filed under Architecture

Interaction Layer

Thinking more about the work programme for the Task Force and the layered model I described earlier, there is a need for an additional layer above the Presentation Layer.

INTERACTION LAYER – the methods used for engaging with people interested in the data being presented, including both reactive methods, e.g. accepting comments, and proactive methods, e.g. introducing material into blogs and social networks.

This goal of increasing interaction around government data is an important part of the Power of Information analysis and a number of activities are underway in this area.

An early priority amongst these is to produce guidelines for civil servants who already want to interact with social media but are unsure about the propriety of this.

Tom Watson blogged his initial ideas on some guidelines a few weeks ago. This has fed into a more formal process that was triggered by PoI report recommendation 13:

To maximise the potential value of civil servants’ input into online fora, by autumn 2007 the Cabinet Office Propriety and Ethics and Government Communications teams should together clarify how civil servants should respond to citizens seeking government advice and guidance online.

This clarification is an essential element in enabling the development of a sound Interaction Layer. Once these rules are in place (a public version is expected soon), we can work on further building expertise and capacity within government for engagement with social media.

Article by Richard Allan, Task Force Chair.

Filed under Architecture, interaction

Information Architectures

Models for presenting information over the internet have often been driven by their ‘shiny front ends’. The user-facing website is all-important and the supporting data is somehow squeezed into this.

Thinking has moved on over recent years with a developing understanding of the importance of separating data from its presentation. If nothing else, this allows for simpler changes to the presentation layer as, for example, websites are redesigned.

We can take up this thinking in the Task Force and consider the architectures that are needed for public sector data to advance the Power of Information goals.

The following model is presented as an initial contribution to this discussion:

PRESENTATION LAYER – the public-facing front end, typically a set of web pages

ANALYSIS LAYER – any form of interpretation of the raw data, typically for summary presentation

ACCESS LAYER – all the information needed to access the data, including technical, legal and commercial aspects

DATA LAYER – the raw data sets

This sketch will be fleshed out over coming weeks into a more comprehensive model against which sources of public sector information can be tested.

This will allow us to understand and work on overcoming any barriers in the data and access layers that prevent innovation in the analysis and presentation layers.
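
As a rough illustration of the four layers, the sketch below keeps each one as a separate piece of code so that two different presentations can share the same data and analysis. Everything in it is an assumption made for the example: the sample records, the `AccessInfo` fields and the function names are hypothetical, not part of any Task Force specification.

```python
from dataclasses import dataclass

# DATA LAYER: the raw data set (invented sample records).
RAW_DATA = [
    {"region": "North", "visits": 120},
    {"region": "South", "visits": 80},
    {"region": "North", "visits": 30},
]


@dataclass(frozen=True)
class AccessInfo:
    """Hypothetical summary of technical, legal and commercial terms."""
    data_format: str
    licence: str
    price_gbp: float


ACCESS_INFO = AccessInfo(
    data_format="list of records",
    licence="example-open-licence",  # placeholder, not a real licence name
    price_gbp=0.0,
)


def get_data() -> tuple[list[dict], AccessInfo]:
    """ACCESS LAYER: hand out a copy of the data together with its terms."""
    return [dict(record) for record in RAW_DATA], ACCESS_INFO


def total_by_region(records: list[dict]) -> dict[str, int]:
    """ANALYSIS LAYER: summarise the raw records."""
    totals: dict[str, int] = {}
    for record in records:
        totals[record["region"]] = totals.get(record["region"], 0) + record["visits"]
    return totals


def render_text(totals: dict[str, int]) -> str:
    """PRESENTATION LAYER (option 1): plain text."""
    return "\n".join(f"{region}: {count}" for region, count in sorted(totals.items()))


def render_html(totals: dict[str, int]) -> str:
    """PRESENTATION LAYER (option 2): an HTML list over the same analysis."""
    items = "".join(
        f"<li>{region}: {count}</li>" for region, count in sorted(totals.items())
    )
    return f"<ul>{items}</ul>"


records, terms = get_data()
totals = total_by_region(records)
print(terms)
print(render_text(totals))
print(render_html(totals))
```

Neither presentation function touches the raw data directly, so a site redesign or a third-party front end only needs to replace the final step.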

Filed under Architecture