Friday, 22 May 2015

Bib record de-duplication update

Many of you are aware that we've just completed another automated de-duplication process, with considerable success.  After a slight tweak to the "profile" we use to identify and merge records, this process has merged a further 23,284 duplicate records in the system.

As we were changing our profile, we needed to ensure that the changes would be effective without having unintended consequences, i.e. merging records that should not be merged. Di Cranwell at PLS has led this process, and she has been ably supported by a great band of willing volunteers from a range of libraries.  The testing team consisted of Joel Hill – Alexandrina, Margaret Wallace – Pt Adelaide Enfield, Alison Packer – Barossa, Cathy O’Brien – Campbelltown, Anne Knight – Norwood Payneham & St Peters, Karen Rubath – Loxton, Joy Smith and Cathy Ehlers – Burnside, Rae Bromley – Bordertown, Alice Mariano – Holdfast Bay, Leonie Somerfield – Lameroo, Jodie Eckert and Kellie Slape – Onkaparinga, Michelle Cox – Playford, Peter Thomas – Mitcham, Brenton Green and Bronwen Kingwell – Marion.

I want to acknowledge all of the testers as their contribution has been on behalf of everyone who works in our network, as well as all of our customers.

We intend to keep tweaking our profile to increase the success of the automated matching process, so we will be looking for more testers in the future.  All volunteers will be welcome!

Since April last year, when we began keeping records of the monthly manual de-duplication work, the network has manually de-duplicated over 125,000 records, at an average of almost 9,000 per month.  During the height of the blitz from May to September last year, the monthly stats got as high as 19,000 in August, which is an amazing result.
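For anyone who wants to check the monthly average, here is a rough sketch in Python. The month count of 14 (April last year through to this May) is an assumption on my part rather than a figure from our records, and since the total is "over 125,000" the result is a lower bound.

```python
def monthly_average(total_records: int, months: int) -> float:
    """Return the mean number of records de-duplicated per month."""
    return total_records / months


# "Over 125,000" in the post, so treat this as a lower bound.
TOTAL_DEDUPED = 125_000

# Assumption: 14 months of record keeping (April last year to this May).
MONTHS_COUNTED = 14

average = monthly_average(TOTAL_DEDUPED, MONTHS_COUNTED)
print(f"About {average:,.0f} records per month")
```

Changing `MONTHS_COUNTED` shows how sensitive the average is to the exact period counted.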

Regarding this quite laborious process, and on behalf of all library staff and customers, I'd really like to thank the libraries that have stuck to the task and have completed their allocation of records to de-duplicate, or are well on the way towards this goal. We currently have 15 libraries that have completed more than 50% of the de-duplication work asked of them, with 4 libraries now at over 100% of their allocation! I'd encourage all libraries to keep working at this task, as it has direct benefits for all libraries and customers.

If you or your staff are not confident about how to do this work and would like to join your colleagues in continuing to tidy up the database, please contact Di at PLS and we'll be happy to arrange some training.

The net result of all of the work to date is that, while we're adding many new titles to the database, the total number of bib records has fallen from 1,154,576 to 1,099,306.  Perhaps a better indication of the impact of the change is the ratio of copies to bib records. The ratio was 3.39 items per bib record in March last year and has now climbed to 3.74.  While this doesn't seem like a big change, across 1.1M records it is quite significant.
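To put those figures in perspective, here is a small Python sketch of the arithmetic. It uses only the numbers quoted above; the variable and function names are just illustrative, and the ratios are rounded to two decimal places, so the percentage for the ratio is approximate.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Figures quoted in the post.
BIBS_BEFORE = 1_154_576
BIBS_AFTER = 1_099_306
RATIO_BEFORE = 3.39  # items per bib record, March last year
RATIO_AFTER = 3.74   # items per bib record, now

net_drop = BIBS_BEFORE - BIBS_AFTER
bib_change = percent_change(BIBS_BEFORE, BIBS_AFTER)
ratio_change = percent_change(RATIO_BEFORE, RATIO_AFTER)

print(f"Net fall in bib records: {net_drop:,}")
print(f"Change in bib records: {bib_change:.1f}%")
print(f"Change in items per bib record: {ratio_change:.1f}%")
```

On these numbers the database holds 55,270 fewer bib records, a fall of a little under 5%, while the items-per-record ratio has risen by roughly 10%.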

Thanks again to all who continue to contribute to this vital work.
