Category Archives: developments

Announcing the new Copac interface and design…

Posted on by Joy Palmer

A tremendous amount has been going on behind the scenes of Copac for quite some time now. Like everyone across the sector we’re working at what feels like full tilt, tackling multiple projects and figuring out as a team how to juggle and prioritise it all. We’re undertaking quite a few JISC innovation projects, including developing a shared-service prototype for a recommender API based on aggregated circulation data; a considerable amount of effort is being invested in the Copac Collections Management project; we’ve been collaborating with our colleagues across the office on Linked Data research and development and working closely with the Discovery initiative; and our developers (namely Ashley Sanders) have just about cracked the new database design and algorithms that will address some of the major duplication issues we currently face as a national aggregator of bibliographic records.

In order to understand and meet the needs of our current user-base (800,000 search sessions per month, and counting) we’ve also been conducting market research in the form of surveys, focus groups and interviews with our users and stakeholders. We’ve amassed a lot of knowledge about how Copac is used, its benefit to academics and librarians, the features most valued in the interface, and what we could be doing better (deduplication! Ebook records and access!). We still have a way to go to meet all these needs, and as a service with a ‘perpetual beta ethos,’ committed to innovation, we know we’ll never be ‘done’ with this work.

But the launch of the new interface and design today is a very significant milestone, and one we want to mark. These changes are the product of a great deal of work committed to the principles of market research and user-centred design. Thanks to the efforts of Mimas web developers Leigh Morris and Shiraz Anwar, the new application interface positively reflects the real-world user journeys of Copac users, and has been rigorously tested to ensure it’s in line with those needs. The new graphic design has been developed to communicate the value proposition of Copac as a JISC service representing Research Libraries, and also as a tool for Research Libraries. Mimas’ new graphic designer has done an excellent job of transforming a site that was out of date (‘lacked depth’ and ‘cold’ are, I believe, words that were used) into something more engaging, reflecting the breadth and richness of the libraries that make up Copac. Certainly, beyond providing an excellent resource discovery experience for end users (and this is why the simplicity and ease of use of the search and personalisation tools are our primary focus), it is important for us to communicate on behalf of JISC that Copac is a community-driven initiative, made possible by its contributors and representative bodies like RLUK. We hope that the new elements of the website represent this community feel, giving Copac a bit more of an engaging voice than perhaps we’ve previously had.

A big vote of thanks to my fantastic Copac and Mimas colleagues, and particularly those who have worked quite a few late nights and weekends lately: Shirley Cousins, Ashley Sanders, Leigh Morris, Lisa Jeskins, and Beth Ruddock. Thanks to Shiraz Anwar for his work earlier in the project, ensuring every detail of the interface design reflected user needs. Thanks also to Janine Rigby and Lisa Charnock from the Mimas Marketing team for the market research work and for working with us to identify the value proposition and identity of Copac, and to Ben Perry for translating that so swiftly into a design we all instantly agreed on.

Posted in developments, interface | Tagged user centred design, user interface

Copac Beta Interface

Posted on by Ashley Sanders

We’ve just released the beta test version of a new Copac interface and I thought I’d write a few notes about it and how we’ve created it.

Some of the more significant changes to the search result page (or “brief display” as we call it) are:

  • There are now links to the library holdings information pages directly from the brief display. You no longer have to go via the “full record” page to get to the holdings information.
  • You can see a more complete view of a record by clicking on the magnifying glass icon at the end of the title. This enables you to quickly view a more detailed record without having to leave the brief display.
  • You can quickly edit your query terms using the search forms at the top of the page.
  • To further refine your search you can add keywords to the query by typing them into the “Search within results” box.
  • You can change the number of records displayed in the result page.

The pages have been designed using Responsive Web Design techniques, which is jargon meaning that the HTML5 and CSS have been designed in such a way that the web page rearranges itself depending on the size of your screen. The new interface should work whether you are using a desktop with a cinema display, a tablet computer or a mobile phone. Users of those three display types will see a different arrangement of screen elements, and some elements may be missing altogether on the smaller displays. If you use a tablet computer or smartphone, please give the beta a try on them and let us know what you think.

The CGI script that creates the web pages is a C++ application which outputs some fairly simple, custom XML. The XML is fed through an XSLT stylesheet to produce the HTML (and also the various record export formats). Opinion on the web seems divided on whether or not this is a good idea; the most valid complaint seems to be that it is slow. It seems fast enough to us, and the beta way of doing things is actually an improvement: there is now just one XSLT stylesheet used in creating the display, whereas our old way of doing things ran multiple XSLT stylesheets multiple times for each web page. That probably just goes to show that the most significant consumer of time is searching the database rather than creating the HTML.
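For the curious, here is a minimal sketch of that kind of pipeline in Python using lxml, rather than the C++ application Copac actually uses; the record XML and the stylesheet below are invented for illustration, but the point is the same: one stylesheet turns the custom XML into HTML in a single pass.

    # A sketch of the XML -> XSLT -> HTML pipeline described above, done
    # in Python with lxml rather than a C++ application. The record XML
    # and the stylesheet are invented for illustration.
    from lxml import etree

    record_xml = etree.XML(
        "<records><record><title>A Book</title>"
        "<author>A. Person</author></record></records>")

    stylesheet = etree.XML("""
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/records">
        <html><body>
          <xsl:for-each select="record">
            <p><b><xsl:value-of select="title"/></b>
               by <xsl:value-of select="author"/></p>
          </xsl:for-each>
        </body></html>
      </xsl:template>
    </xsl:stylesheet>""")

    transform = etree.XSLT(stylesheet)   # compile the stylesheet once
    html = transform(record_xml)         # one pass from XML to HTML
    print(etree.tostring(html, pretty_print=True).decode())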

Posted in developments, Interfaces | Tagged CSS, developments, HTML, Interfaces, Responsive Web Design, XSLT

Copac deduplication

Posted on by Ashley Sanders

Over 60 institutions contribute records to the Copac database. We try to de-duplicate those contributions so that records from multiple contributors for the same item are “consolidated” together into a single Copac record. Our de-duplication efforts have reduced over 75 million records down to 40 million.

Our contributors send us updates on a regular basis which results in a large amount of database “churn.” Approximately one million records a month are altered as part of the updating process.

Updating a consolidated record

Updating a database like Copac is not as straightforward as you might think. A contributor sending us a new record may result in us deleting a Copac record. A contributor deleting a record may result in a Copac record being created. A diagram may help explain this.

[Diagram: a Copac consolidated record created from five contributed records; lines show how the contributed records match with one another.]

The above graph represents a single Copac record consolidated from five contributed records: a1, a2, a3, b1 & b2. A line between two records indicates that our record matching algorithm thinks the records are for the same bibliographic item. Hence, records a1, a2 & a3 match with one another; b1 & b2 match with each other; and a1 matches with b1.

Should record b1 be deleted from the database, then as b2 does not match with any of a1, a2 or a3 we are left with two clumps of records. Records a1, a2 & a3 would form one consolidated record and b2 would constitute a Copac record in its own right as it matches with no other record. Hence the deletion of a contributed record turns one Copac record into two Copac records.

I hope it is clear that the inverse can happen — that a new contributed record can bring together multiple Copac records into a single Copac record.

The above is what would happen in an ideal world. Unfortunately the current Copac database does not save a log of the record matches it has made and neither does it attempt to re-match the remaining records of a consolidated set when a record is deleted. The result is that when record b1 is deleted, record b2 will stay attached to records a1, a2 & a3. Coupled with the high amount of database churn this can sometimes result in seemingly mis-consolidated records.
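To make that concrete, here is a minimal sketch in Python of the ideal behaviour described above, with the a1–b2 example hard-coded: treat the logged matches as the edges of a graph, and recompute the connected components (the consolidated sets) when a record is deleted. The real matching algorithm and data structures are, of course, far more involved.

    # Records and match "log" from the diagram above: a1, a2 & a3 form a
    # triangle, b1 & b2 match, and a1 matches b1.
    from collections import defaultdict

    records = {"a1", "a2", "a3", "b1", "b2"}
    matches = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
               ("b1", "b2"), ("a1", "b1")]

    def consolidated_sets(records, matches):
        """Group records into consolidated sets (connected components)."""
        adj = defaultdict(set)
        for x, y in matches:
            if x in records and y in records:  # skip edges to deleted records
                adj[x].add(y)
                adj[y].add(x)
        seen, sets = set(), []
        for rec in sorted(records):
            if rec in seen:
                continue
            stack, component = [rec], set()
            while stack:
                node = stack.pop()
                if node not in component:
                    component.add(node)
                    stack.extend(adj[node] - component)
            seen |= component
            sets.append(component)
        return sets

    print(consolidated_sets(records, matches))
    # one Copac record: [{'a1', 'a2', 'a3', 'b1', 'b2'}]
    print(consolidated_sets(records - {"b1"}, matches))
    # deleting b1 gives two Copac records: [{'a1', 'a2', 'a3'}, {'b2'}]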

Smarter updates

As part of our forthcoming improvements to Copac we are keeping a log of records that match. This makes it easier for the Copac update procedures to correctly disentangle a consolidated record, and should result in fewer mis-consolidations.

We are also trying to make the update procedures smarter and have them do less. For historical reasons the current Copac database is really two databases: a database of the contributors’ records and a database of consolidated records. The contributors’ database is updated first, and a set of deletions and additions/updates is passed on to the consolidated database. The consolidated database doesn’t know whether an updated record has changed in a trivial way or now represents another item completely. It therefore has no choice but to re-consolidate the record, and that means deleting it from the database and then adding it back in (there is no update functionality). This is highly inefficient.

The new scheme of things tries to be a bit more intelligent. An updated record from a contributor is compared with the old version of itself and categorised as follows:

  • The main bibliographic details are unchanged and only the holdings information is different.
  • The bibliographic record has changed, but not in a way that would affect the way it has matched with other records.
  • The bibliographic record has changed significantly.

Only in the last case does the updated record need to be re-consolidated (and in future that will be done without having to delete the record first!). In the first two cases we would only need to refresh the record that we use to create our displays.
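As a rough illustration of that three-way categorisation (the field names, and the idea of a fixed set of “match fields”, are placeholders; the real record structure and matching rules are more complex):

    # Hedged sketch: classify an updated record to decide how much work
    # the consolidated database needs to do. Field names are invented.
    MATCH_FIELDS = ("title", "author", "date")  # fields assumed to drive matching

    def categorise_update(old, new):
        old_bib = {k: v for k, v in old.items() if k != "holdings"}
        new_bib = {k: v for k, v in new.items() if k != "holdings"}
        if old_bib == new_bib:
            return "holdings only"        # just refresh the display record
        if all(old.get(f) == new.get(f) for f in MATCH_FIELDS):
            return "trivial bib change"   # matching unaffected; refresh display
        return "significant change"      # must be re-consolidated

    old = {"title": "A Book", "author": "A. Person", "date": "1999",
           "holdings": ["Main Library"]}
    new = dict(old, holdings=["Main Library", "Store"])
    print(categorise_update(old, new))   # -> holdings only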


An analysis of an update from one of our contributors showed that it contained 3818 updated records: 954 had unchanged bibliographic details, only 155 had changed significantly and needed reconsolidating, and the remaining 2709 had bibliographic changes that did not affect matching. The saving is quite big: the current Copac database has to re-consolidate all 3818 records, whereas the new version of Copac needs to re-consolidate only 155. This will reduce database churn significantly, result in updates being applied faster, and allow us to take on more contributors.

Example Consolidations

Just for interest, and because I like the graphs, I’ve included a couple of graphs of consolidated records from our test database. The first graph shows a larger set of records. This set contains two records, the deletion of either of which would break the set up into two smaller sets.

[Graph: a larger consolidated set of records, including two pivotal records whose deletion would split the set in two.]
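For anyone who shares the fondness for graphs: those two pivotal records are the cut vertices (articulation points) of the match graph. A brute-force sketch, with invented records and matches, that finds them by checking whether removing a record increases the number of consolidated sets:

    # r1-r2-r3 is a chain and r3-r4-r5 a triangle, so r2 and r3 are the
    # records whose deletion would split the set. Data is invented.
    from collections import defaultdict

    records = {"r1", "r2", "r3", "r4", "r5"}
    matches = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"),
               ("r4", "r5"), ("r5", "r3")]

    def count_sets(records, matches):
        """Number of consolidated sets (connected components)."""
        adj = defaultdict(set)
        for x, y in matches:
            if x in records and y in records:
                adj[x].add(y)
                adj[y].add(x)
        seen, n = set(), 0
        for rec in records:
            if rec in seen:
                continue
            n += 1
            stack = [rec]
            while stack:
                node = stack.pop()
                if node not in seen:
                    seen.add(node)
                    stack.extend(adj[node] - seen)
        return n

    base = count_sets(records, matches)
    pivotal = sorted(r for r in records
                     if count_sets(records - {r}, matches) > base)
    print(pivotal)   # -> ['r2', 'r3']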

The graph below shows a smaller set of records where each record matches with every other record.

[Graph: a smaller consolidated set in which every record matches every other record.]

Posted in Database, developments | Tagged consolidation, de-duplication, developments, loading

Performance improvements

Posted on by Ashley Sanders

The run-up to Christmas (or the Autumn term if you prefer) is always our busiest time of year as measured by the number of searches performed by our users. Last year the search response times were not what we would have liked, and we have been investigating the causes of the poor performance and ways of improving it. Our IT people determined that at our busiest times the disk drives in our SAN were being pushed to their maximum performance and just couldn’t deliver data any faster. So, over the summer we have installed an array of Solid State Disks to act as a fast cache for our file-systems (for the more technical, I believe it is actually configured as a ZFS Level 2 Cache, or L2ARC).

The SSD cache was turned on during our brief downtime on Thursday morning and so far the results look promising. I’m told the cache is still “warming up” and that performance may improve still further. The best performance indicator I can provide is the graph below. We run a “standard” query against the database every 30 minutes and record the time taken to run the query. The graph below plots the time (in seconds) to run the query since midnight on the 23rd August 2011. I think it is pretty obvious from looking at the graph exactly when the SSD cache was configured in.

[Graph: time in seconds to run the standard query since midnight on 23rd August 2011; the sharp drop marks when the SSD cache was configured in.]
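The monitoring behind that graph is straightforward to sketch. Something along these lines, where run_standard_query is a stand-in for whatever issues the real search against the database (the real setup will differ):

    import time

    def run_standard_query():
        time.sleep(0.1)   # placeholder for the real database search

    def record_timing(logfile="query-times.log"):
        """Time the standard query and append the result to a log."""
        start = time.perf_counter()
        run_standard_query()
        elapsed = time.perf_counter() - start
        with open(logfile, "a") as log:
            log.write(f"{time.strftime('%Y-%m-%dT%H:%M:%S')} {elapsed:.3f}\n")

    record_timing()   # in practice, scheduled every 30 minutes (e.g. via cron)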

It all looks very promising so far and I think we can look forward to the Autumn with less trepidation and hopefully some happier users.

Posted in Database, developments | Tagged cache, developments, performance, search

Copac trial interface: feedback

Posted on by Shirley Cousins

Many thanks to those of you who gave us feedback on the recent trial of the new Copac user interface. We really appreciate the time you put into testing and responding to us through the feedback form, email, and Twitter.

I’ve summarised the feedback below:

  • In general you gave an enthusiastic response to the new interface design, including positive comments on the layout and workflow. Those who tried it on mobile devices were pleased with how it came out.
  • There were also positive comments about the range of features, with the availability of the holding library list on the initial search result listing being particularly popular.
  • The grey ‘colour scheme’ generated a number of comments. Some people liked it but others definitely didn’t! The lack of colour was intended to stop the graphics drawing comment away from the functionality of the new interface, so the site won’t be staying monochrome.
  • There were individual comments about wording, screen elements, or requests for additional features, which are all valuable in helping us refine the presentation and facilities.
  • Amongst those who didn’t like the new interface the major concern was the lack of the ‘Main search’ screen with its range of detailed search options. Whilst the initial test worked just with the Quick search, we can reassure you that we always intended to reintroduce the other search screens once we had feedback on the overall design. This obviously wasn’t as clear as we’d hoped.

We are continuing to work on the interface, reassured that we are moving in the right direction for most of you. In the next stage we’ll be incorporating colour as well as adding the missing search screens. We will also be making changes in response to comments or requests relating to individual features, as well as ensuring that it works well for as wide a range of browsers and devices as possible.

You’ll be able to try out the new interface again in a few months’ time and provide input into the final version before the work is completed.

Posted in developments, interface

Copac trial interface – have your say on the future of Copac!

Posted on by Bethan Ruddock

We are developing a new style of Copac interface with greater search flexibility, new functionality, and clearer displays. Following initial user testing we’re now opening up the trial interface for further comment. We’re making the early draft interface available for a week from 12.00 noon 23rd May to 12.00 noon 30th May. This is your opportunity to try out the new interface, and let us know what you think!

Access the Copac Alpha trial interface.

Please note: The interface is very pared down, and there is no colour scheme. Some elements are just placeholders for planned options. The interface is designed to work in the latest browsers – you might experience issues with display/functionality in older browsers, such as IE 6 and 7.

We’d really appreciate your input into this work. There are feedback options on the screens and all comments will feed into the ongoing development process. You can also email copac@mimas.ac.uk with your feedback.

There will be further opportunities to comment on the interface redevelopment as the work continues. This is part of the complete redevelopment of the Copac service, and additional interface facilities will become available at later stages of the work.

Posted in developments, interface

Getting Excited about Collection Management

Posted on by Joy Palmer

The Copac Collections Management Tools Project is a collaboration between Mimas, RLUK, and the White Rose Consortium.

A number of partners have been working with us here at Mimas on a JISC-funded Collection Management project, which is part of the broader Resource Discovery Taskforce activity.

Since we have all been working on this slightly under the radar, and recognising the need to share more about this project and what’s going on, we’re planning a series of blog posts to update the community on the progress and lessons learned through the partnership. The following update is from Julia Chruszcz, who is project managing this piece of work:

Just two months into the JISC-funded Copac Collection Management Project, progress has been significant. At a meeting of the project partners on the 6th May, each of the representatives from the White Rose Consortium (WRC) universities (Leeds, York and Sheffield) articulated the potential significance of this tool for their decision-making processes around monograph retention and disposal and collection development. This included notions of collaborative collection development and how such a Collection Management Tool could facilitate regional and national approaches, each influencing local decisions for libraries.

The WRC has undertaken the early testing of the web-based tool, an approach the project has adopted to inform development and iteratively assess the tool. The idea is to build up, over the life of the project, a full specification of what will be required to take such a tool forward and introduce it into library workflows. The next stage, between now and the beginning of July, will be to further develop the batch and web technical interfaces based upon the WRC feedback, and for this development to undergo further critical testing. The project is due to provide an interim report at the end of June, with a full report to JISC at the end of July.

The enthusiasm from all the project partners (JISC, Mimas, RLUK and WRC) stems from the realisation that we have the potential to produce a tool that will make a real difference in helping libraries make informed decisions, particularly at a time of financial constraint, and assist in furthering the possibility of a national monographs collection, protecting access for researchers while facilitating local decisions that will save money and resources in the longer term. And all this by intelligent re-use and application of the Copac database, an extensive existing resource invested in by RLUK and JISC over many years.

If this is something you are interested in, we’d really like to hear your viewpoint and perspective.

Posted in developments, Uncategorized | Tagged collections management, rdtf

Surfacing the Academic Long Tail — Announcing new work with activity data

Posted on by Joy Palmer

We’re pleased to announce that JISC has funded us to work on the SALT (Surfacing the Academic Long Tail) Project, which we’re undertaking with the University of Manchester’s John Rylands University Library.

Over the next six months the SALT project will be building a recommender prototype for Copac and the JRUL OPAC interface, which will be tested by the communities of users of those services. Following on from the invaluable work undertaken at the University of Huddersfield, we’ll be working with more than ten years of aggregated and anonymised circulation data amassed by JRUL. Our approach will be to develop an API onto that data, which in turn we’ll use to develop the recommender functionality in both services. Obviously, we’re indebted to the knowledge acquired by the similar project at the University of Huddersfield, and the SALT project will work closely with colleagues there (Dave Pattern and Graham Stone) to see what happens when we apply this concept in the research library and national library service contexts.

Our overall aim is that by working collaboratively with other institutions and Research Libraries UK, the SALT project will advance our knowledge and understanding of how best to support research in the 21st century. Libraries are a rich source of valuable information, but sometimes the sheer volume of materials they hold can be overwhelming even to the most experienced researcher, and we know that researchers’ expectations of how to discover content are shifting in an increasingly personalised digital world. We know that library users, particularly those researching niche or specialist subjects, are often seeking content based on a recommendation from a contemporary, a peer, colleagues or academic tutors. The SALT Project aims to give libraries the ability to provide users with that information. Similar to Amazon’s ‘customers who bought this item also bought…’, the recommenders on this system will appear on a local library catalogue and on Copac, and will be based on circulation data gathered over the past 10 years at The University of Manchester’s internationally renowned research library.
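For the technically minded, the heart of such a recommender is co-occurrence counting over anonymised circulation data. A rough sketch with invented loan records (the real SALT API over the JRUL data will be considerably more sophisticated):

    # Count, for each item, how often other items were borrowed by the
    # same (anonymised) user. The loan data here is invented.
    from collections import Counter, defaultdict
    from itertools import combinations

    loans = [("u1", "Book A"), ("u1", "Book B"),
             ("u2", "Book A"), ("u2", "Book B"), ("u2", "Book C"),
             ("u3", "Book A"), ("u3", "Book C")]

    items_by_user = defaultdict(set)
    for user, item in loans:
        items_by_user[user].add(item)

    co_borrowed = defaultdict(Counter)
    for items in items_by_user.values():
        for x, y in combinations(sorted(items), 2):
            co_borrowed[x][y] += 1
            co_borrowed[y][x] += 1

    def recommend(item, n=3):
        """Items most often borrowed by people who borrowed this item."""
        return [other for other, _ in co_borrowed[item].most_common(n)]

    print(recommend("Book A"))   # -> ['Book B', 'Book C']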

How effective will this model prove to be for users, particularly humanities researchers?

Here’s what we want to find out:

  • Will researchers in the field of humanities benefit from receiving book recommendations, and if so, in what ways?
  • Will the users go beyond the reading list and be exposed to rare and niche collections — will new paths of discovery be opened up?
  • Will collections in the library, previously undervalued and underused, find a new appreciative audience — will the Long Tail be exposed and exploited for research?
  • Will researchers see new links in their studies, possibly in other disciplines?

We also want to consider whether there are other potential beneficiaries. By highlighting rarer collections, valuing niche items and bringing to the surface less popular but nevertheless worthy materials, libraries will have the leverage they need to ensure the preservation of these rich materials. Can such data or services assist in decision-making around collections management? We will be consulting with Leeds University Library and the White Rose Consortium, as well as UKRR, in this area.

And finally, as part of our sustainability planning, we want to look at how scalable this approach might be for developing a shared aggregation service of circulation data for UK University Libraries. We’re working with potential data contributors such as Cambridge University Library, University of Sussex Library, and the M25 consortium, as well as RLUK, to trial and provide feedback on the project outputs, with specific attention to the sustainability of an API service as a national shared service for HE/FE that supports academic excellence and drives institutional efficiencies.