Derivadow

Publishing to the iPad

Dec 10, 12 3:39 PM

NPG recently launched a new iPad app, Nature Journals, which allows us to distribute journal content to iPad users. I thought it might be interesting to highlight a few of the design decisions we took and discuss why we took them.

“Magazines to Read” by Long Nguyen. Some rights reserved.

Most publishers, when they make an iPad magazine, tend to design a skeuomorphic digital facsimile of their printed magazine – they build in lots of interactive features but build it using production processes similar to those for print and make it feel like a print magazine. They lay out each page (actually they need to lay out each page twice, once for landscape and once for portrait view) and then produce a big file to be distributed via Apple’s App Store.

This approach feels very wrong to me. For starters it doesn’t scale well – every issue needs a bunch of people to lay out and produce it; from an end user’s point of view they get a very big file, and I’ve seen nothing to convince me most people want all the extra stuff; and from an engineering point of view the lack of separation of concerns worries me. I just think most iPad magazines are doing it wrong.

Now to be clear, I’m not for a moment suggesting that what we’ve built is perfect – I know it’s not – but I think, I hope, we’re on the right track.

So what did we do?

Our overarching focus was to create a clean, uncluttered user experience. We didn’t want to replicate print or the website; instead we wanted to take a path that focused on the content at the expense of ‘features’ while giving the reader the essence of the printed journals.

This meant we wanted decent typography and enough branding to connect the user to the journal but no more, and the features we did build had to be justified in terms of benefits to a scientist’s understanding of the article. And even then we pushed most of the functionality away from the forefront of the interface so that the reader hopefully isn’t too aware of the app. The best app, after all, is no app.

In my experience most publishers tend to go the other way (although there are notable exceptions) – most iPad Magazines have a lot of app and a lot of bells and whistles, so many features in fact that many magazines need an instruction manual to help you navigate them! That can’t be right.

As Craig Mod put it – many publishers build a Homer.

When Homer Simpson was asked to design his ideal car, he made The Homer. Given free reign, Homer’s process was additive. He added three horns and a special sound-proof bubble for the children. He layered more atop everything cars had been. More horns, more cup holders.

We didn’t want to build a Homer! We tried to only include features where they really benefit the reader or their community. For example, we built a figure viewer which lets the reader see the figures within the article at any point and tap through to higher resolution images because that’s useful.

You can also bookmark or share an article, or download the PDF, but these are only there if you need them. The normal reading behaviour assumes you don’t need this stuff and so they are hidden away (until you tap the screen to pull them into focus).

Back to the content…

It’s hard to build good automated pagination unless the content is very simple and homogeneous. Beautiful, fast pagination for most content is simply too hard unless you build each page by hand. Nasty, poorly designed and implemented pagination doesn’t help anyone. We therefore decided to go with scrolling within an article and pagination between articles.

Under the hood we wanted to build a system that would scale, could be automated and ensured separation of concerns.

On the server we therefore render EPUB files from the raw XML documents in MarkLogic and bundle those files along with all the images and other assets into a zip file and serve them to the iPad app.
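To make the bundling step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the production pipeline: the function name `bundle_issue`, the archive paths and the placeholder bytes are all hypothetical, and rendering the EPUB from the XML in MarkLogic is assumed to have happened already.

```python
import io
import zipfile


def bundle_issue(files: dict[str, bytes]) -> bytes:
    """Pack rendered article files and their assets into one zip archive.

    `files` maps a path inside the archive to that file's bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Sort the paths so the same input always produces the same layout.
        for path in sorted(files):
            archive.writestr(path, files[path])
    return buffer.getvalue()


# Hypothetical issue contents: one rendered article plus one figure.
issue = {
    "images/figure-1.jpg": b"<image bytes>",
    "articles/article-001.epub": b"<rendered EPUB bytes>",
}
package = bundle_issue(issue)

# A client would download `package` and unpack it for offline reading.
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    names = archive.namelist()
    article = archive.read("articles/article-001.epub")
```

Because the package is just the rendered files plus their assets, its size tracks the content itself rather than a hand-built layout for every page.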

From the reader’s point of view this means they can download whole issues for offline reading and the total package is quite small – an issue of Nature is c. 30MB and the Review Journals can be as small as 5MB; by way of comparison Wired is c. 250MB.

From our point of view the entire production is automated – we don’t need to have people laying out every page or issue. This also means that as we improve the layout we can roll out those improvements to all the articles – both new content and the archive (although users would need to re-download the content).

A son’s eulogy

Jan 25, 12 6:21 PM

1927 was the year that Ford stopped production of the Model T, the year that for all practical purposes Television was invented, the year that the Spirit of St. Louis crossed the Atlantic to become the first solo nonstop transatlantic flight and the year that the League of Nations signed a treaty abolishing slavery.

1927 was also the year my Daddy was born. Born into a world that was radically different to the one we live in today.

He was born in Belfast into a devout Presbyterian family and grew up during the Second World War. From Belfast he moved to Dublin to train as a Vet at Trinity.

Moving to a Catholic country presented dad with new opportunities – for starters, he was able to supplement his student income by smuggling condoms across the border and selling them to his fellow students.

Although we might all admire this entrepreneurial spirit I should point out that this additional income wasn’t always put to good use.

For when he and his friend, Billy MacArthur, found a bat’s roost they scooped up a bagful of unfortunate bats and headed off to the local cinema, which happened to be showing a zombie movie. I like to think that when he and Billy released the bats they invented the first 3D cinema experience.

After Trinity he left Ireland for Cornwall, where he met mum. The two of them then moved to Bedford where he set up his own practice and started a family.

After 40 years they returned to Cornwall.

Retirement can be a risky business – but my parents were lucky. They found friends who made them laugh, who also enjoyed a bottle of red wine or a few, who made their retirement a full and happy time.

But on the 24th of January, my daddy died of cancer.

My father, as anyone who met him will know, was a cantankerous, stubborn bugger. He would argue with anyone about any subject. I sometimes wondered why.

I’m sure he did it because he loved the challenge, loved the debate, loved challenging why people thought what they thought, and because he was endlessly curious about the world.

Born into a world that was soon to disappear, washed away by the flood of the modern world. It would have been easy for him to have retreated into what he knew.

But his determination and curiosity drove him forward. Stopped him from retreating into the past.

Instead he did what he loved and explored the world – he caught animals in East Africa, worked and travelled in Asia, read anything and everything, built his practice and then in recent years started to explore the world via the Web.

But more than his willingness to embrace the new was his desire to challenge the status quo and the beliefs that others held.

He knew that whoever you are you’re just a mammal. That it was ok to question what you and others believed and did. He taught me that not only was it ok to question but also not to be scared of the consequences. He taught me to question others and do what I thought was right. He taught me quiet determination.

This Christmas, my brother Sean and I ended up discussing life and death over a bottle of whisky. And at some point Sean asked me what I wanted out of life.

I told him I wanted to die happy having made interesting things I could be proud of. I think Dad managed that.

What I learned from my daddy’s death was that character is essential: What he was, was how he died.

In the final days of his life he was very tired but when he woke he woke with a smile. He was happy even though he knew he was dying. He was happy because he was happy with his life: he loved being a vet, he loved living in Cornwall, he was proud of us, his children, grandchildren and great-grandchildren, and he was proud of what he made of his life. But most of all he loved his wife, my mummy.

Don’t mourn his death; he wouldn’t want that.

Remember him for the last time he teased you, the last time you fell for one of his practical jokes, the last time you winced at one of his emails or perhaps just the last time he made you look at the world in a different way.

Scientific publishing on the Web

Jan 22, 12 4:07 PM

As usual these are my thoughts, observations and musings not those of my employer.

Scientific publishing has in many ways remained largely unchanged since 1665. Scientific discoveries are still published in journal articles where the article is a review, a piece of metadata if you will, of the scientists’ research.

Cover of the first issue of Nature, 4 November 1869.

This is of course not all bad. For example, I think it is fair to say that this approach has played a part in creating the modern world. The scientific project has helped us understand the universe, helped eradicate diseases, helped decrease child mortality and helped free us from the drudgery of mere survival. The process of publishing peer-reviewed articles is the primary means of disseminating this human knowledge and as such has been, and remains, central to the scientific project.

And if I am being honest, nor is it entirely fair to claim that things haven’t changed in all those years – clearly they have. Recently new technologies, notably the Web, have made it easier to publish and disseminate those articles, which in turn has led to changes in the associated business models of publishers, e.g. Open Access publications.

However, it seems to me that scientific publishers and the scientific community at large have yet to fully utilize the strengths of the Web.

Content is distributed over HTTP but what is distributed is still, in essence, a print journal over the Web. Little has changed since 1665 – the primary objects, the things an STM publisher publishes, remain the article, issue and journal.

The power of the Web is its ability to share information via URIs and more specifically its ability to globally distribute a wide range of documents and media types (from text to video to raw data and software (as source code or as binaries)). The second and possibly more powerful aspect of the Web is its ability to allow people to recombine information, to make assertions and statements about things in the world and information on the Web. These assertions can create new knowledge and aid discoverability of information.

This is not to say that there shouldn’t be research articles and journals – both provide value – for example journals provide a useful point of aggregation and quality assurance to the author and reader. The article is an immutable summary of the researchers’ work at a given date and, of course, the paper remains the primary means of communication between scientists. However, the Web provides mechanisms to greatly enhance the article, to make it more discoverable and allow it to be placed into a wider context.

In addition to the published article, STM publishers already publish supporting information in the form of ‘supplementary information’; unfortunately this is often little more than a PDF document. However, it is also not clear (to me at least) if the article is the right location for some of this material – it appears to me that a more useful approach is that of the ‘Research Object’, semantically rich aggregations of resources, as proposed by the Force11 community.

It seems to me that the notion of a Research Object as the primary published object is a powerful one. One that might make research more useful.

What is a Research Object?

Well, what I mean by a Research Object is a URI (and if one must, a DOI) that identifies a distinct piece of scientific work. An Open Access ‘container’ that would allow an author to group together all the aspects of their research into a single location. The resources within it might include:

The published article or articles if a piece of research resulted in a number of articles (whether they be OA or not);
The raw data behind the paper(s) or individual figures within the paper(s) (published in a non-proprietary format e.g. csv not Excel);
The protocols used (so an experiment can be easily replicated);
Supporting or supplementary video;
URLs to News and Views or other commentary from the Publisher or elsewhere;
URLs to news stories;
URLs to university reading lists;
URLs to profile pages of the authors and researchers involved in the work;
URLs to the organizations involved in the work (e.g. funding bodies, host university or research lab etc.);
Links to other research (both historical i.e. bibliographic information but also research that has occurred since publication).

Furthermore, the relationships between the different entities within a Research Object should be explicit. It is not enough to treat a Research Object as a bag of stuff; there should be stated and explicit relationships between the resources held within a Research Object. For example, the relationship between the research and the funding organization should be defined via a vocabulary (e.g. funded_by); likewise any raw data should be identified as such and where appropriate linked to the relevant figures within a paper.

Something like this:

The major components of a Research Object.

It is important to note that while the Research Object is open access the resources it contains may or may not be. For example, the raw data might be open whereas the article might not. People would therefore be able to reference the Research Object, point to it on the Web, discuss it and make assertions about it.
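One way to picture such a container is the minimal Python sketch below. It is only an illustration: the class names, the `example.org` URIs and the `data_for` predicate are hypothetical (only `funded_by` comes from the text above), and a real implementation would more likely build on an existing linked data vocabulary than on ad hoc classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    uri: str
    kind: str          # e.g. "article", "raw_data", "organization"
    open_access: bool  # the container is open; its resources need not be


@dataclass
class ResearchObject:
    uri: str
    resources: dict[str, Resource] = field(default_factory=dict)
    # Each relationship is a (subject URI, predicate, object URI) triple.
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources[resource.uri] = resource

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # Refuse links to things the Research Object does not contain.
        known = set(self.resources) | {self.uri}
        if subject not in known or obj not in known:
            raise KeyError("both ends of a relationship must be in the Research Object")
        self.relationships.append((subject, predicate, obj))

    def objects_of(self, subject: str, predicate: str) -> list[str]:
        return [o for s, p, o in self.relationships if s == subject and p == predicate]


# Hypothetical example: a closed article, open raw data and a funder.
ro = ResearchObject(uri="https://example.org/research/42")
ro.add(Resource("https://example.org/articles/42", "article", open_access=False))
ro.add(Resource("https://example.org/data/42.csv", "raw_data", open_access=True))
ro.add(Resource("https://example.org/orgs/funder", "organization", open_access=True))
ro.relate(ro.uri, "funded_by", "https://example.org/orgs/funder")
ro.relate("https://example.org/data/42.csv", "data_for", "https://example.org/articles/42")

funders = ro.objects_of(ro.uri, "funded_by")
open_resources = sorted(r.uri for r in ro.resources.values() if r.open_access)
```

The point of the triples is that the Research Object stops being a bag of stuff: every link says what kind of link it is, and the `open_access` flag is recorded per resource rather than for the container as a whole.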

In the FRBR world a Research Object would be a Work i.e. a “distinct intellectual creation”.

Making research more discoverable

The current publishing paradigm places serious limitations on the discoverability of research articles (or research objects).

Scientists work with others to research a domain of knowledge; in some respects therefore research articles are metadata about the universe (or at least the experiment). They are assertions, made by a group of people, about a particular thing based on their research and the data gathered. It would therefore be helpful if scientists could discover prior research along these lines of enquiry.

Implicit in the above description of a Research Object is the need to publish URIs about: people, organisations (universities, research labs, funding bodies etc.) and areas of research.

These URIs and the links between them would provide a rich network of science – a graph that describes and maps out the interrelationships between people, organisations and their areas of interest, each annotated with research objects. Such a graph would also allow for pages such as:

All published research by an author;
All published research by a research lab;
The researchers that have worked together in a lab;
The researchers who have collaborated on a published paper;
The areas of research by lab, funding body or individual;
Etc.

Such a graph would help readers to both ‘follow their nose’ to discover research and provide meaningful landing pages for search.
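As a rough sketch of how pages like those listed above could fall out of such a graph, the Python below stores a handful of statements as triples and answers a few of the listed queries. Every identifier and predicate name (`authored_by`, `member_of`) is made up for illustration; a real system would use published URIs and most likely a graph or triple store rather than in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) statements.
triples = [
    ("paper:1", "authored_by", "person:alice"),
    ("paper:1", "authored_by", "person:bob"),
    ("paper:2", "authored_by", "person:alice"),
    ("person:alice", "member_of", "lab:genomics"),
    ("person:bob", "member_of", "lab:genomics"),
    ("person:carol", "member_of", "lab:genomics"),
]


def build_index(statements):
    """Index subjects by (predicate, object) so links can be followed backwards."""
    by_object = defaultdict(set)
    for subject, predicate, obj in statements:
        by_object[(predicate, obj)].add(subject)
    return dict(by_object)


def research_by_author(index, person):
    return sorted(index.get(("authored_by", person), set()))


def researchers_in_lab(index, lab):
    return sorted(index.get(("member_of", lab), set()))


def research_by_lab(index, lab):
    # Two hops: lab -> its members -> the papers they authored.
    papers = set()
    for person in index.get(("member_of", lab), set()):
        papers |= index.get(("authored_by", person), set())
    return sorted(papers)


def collaborators(statements, paper):
    return sorted(o for s, p, o in statements if s == paper and p == "authored_by")


index = build_index(triples)
alice_papers = research_by_author(index, "person:alice")
lab_people = researchers_in_lab(index, "lab:genomics")
lab_papers = research_by_lab(index, "lab:genomics")
paper_1_authors = collaborators(triples, "paper:1")
```

Each function corresponds to one landing page, and each is just a short walk along the links, which is what lets a reader follow their nose.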

Digital curation

One of the significant benefits a journal brings to its readership is the role of curation. The editors of the journal select and publish the best research for their readers. On the Web there is no reason this role couldn’t be extended beyond the editor to the users and readers of a site.

Different readers will have different motivations for doing so but providing a mechanism for those users to aggregate and annotate research objects provides a new and potentially powerful mechanism by which scientific discoveries could be surfaced.

For example, a lecturer might curate a collection of papers for an undergraduate class on genomics, combining research objects with their own comments, video and links to other content across the web. This collection could then be shared and used more widely with other lecturers. Alternatively a research lab might curate a collection of papers relevant to their area of research but choose to keep it private.

Providing a rich web of semantically linked resources in this way would allow for the development of a number of different metrics (in addition to Impact Factor). These metrics would not need to be limited to scientific impact; they could be extended to cover:

Educational indices – a measure of the citations in university reading lists;
Social impact – a measure of citations in the mainstream media;
Scientific impact of individual papers;
Impact of individual scientists or research labs;
Etc.

Such metrics could be used directly, e.g. as research indexes, or indirectly, e.g. to help readers find the best or most relevant content.
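To illustrate how simple such indices could be once the links exist, here is a small Python sketch that counts citations by the type of source doing the citing. The records, the source type labels and the `impact` function are all hypothetical, and a plain count is only the crudest possible metric.

```python
from collections import Counter

# Hypothetical citation records: (type of citing source, cited research object).
citations = [
    ("reading_list", "ro:1"),
    ("reading_list", "ro:1"),
    ("news_story", "ro:1"),
    ("reading_list", "ro:2"),
    ("journal_article", "ro:2"),
]


def impact(records, source_type):
    """Count citations of each research object from one type of source."""
    return Counter(cited for kind, cited in records if kind == source_type)


educational = impact(citations, "reading_list")  # citations in reading lists
social = impact(citations, "news_story")         # citations in the media

# Indirect use: order research objects by educational impact for a reader.
ranked = [research_object for research_object, _ in educational.most_common()]
```

The same counts serve both uses described above: published as they are they form an index, and used as a sort key they help surface the most relevant content.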

Finally it is worth remembering that in all cases this information should be available for both humans and machines to consume and process. In other words this information should be available in structured, machine readable formats.
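As a minimal sketch of what a machine-readable form might look like, the Python below serializes a description of a Research Object to JSON and reads it back. The key names and `example.org` URIs are assumptions made for illustration, not an established schema; in practice a shared vocabulary and a linked data serialization would be a better fit.

```python
import json

# Hypothetical description of a Research Object; the keys are illustrative only.
research_object = {
    "uri": "https://example.org/research/42",
    "open_access": True,
    "resources": [
        {"uri": "https://example.org/articles/42", "kind": "article", "open_access": False},
        {"uri": "https://example.org/data/42.csv", "kind": "raw_data", "open_access": True},
    ],
    "relationships": [
        {
            "subject": "https://example.org/research/42",
            "predicate": "funded_by",
            "object": "https://example.org/orgs/funder",
        }
    ],
}

# The same structure can back a human-readable page and a machine-readable document.
document = json.dumps(research_object, indent=2, sort_keys=True)
parsed = json.loads(document)
predicates = [link["predicate"] for link in parsed["relationships"]]
```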
