Where credit’s due

If we’re using and sharing other people’s copyrighted photos, we should be scrupulous in telling people the source, who owns the copyright, and whether someone else can reuse it. But even museum and library professionals regularly get this wrong. Here’s a quick guide to crediting photos properly. This is not just good practice, it’s the law (specifically, the 1994 Copyright Act).

There are two things you should declare when you use someone else’s photo: the copyright owner (whoever owns the right to make copies of it), and the licence (why you were allowed to make a copy).

COPYRIGHT

The copyright owner is usually the photographer. Sometimes, if they took the photo as part of their job, their employer owns that copyright. Regardless, we have to say who the copyright holder is. If the copyright is held by an organisation, like DOC or Knox College, I personally credit the photographer as well if I know who they are. That isn’t required, it’s just being polite. 

Sometimes the copyright has expired – in New Zealand, that happens 50 years after the photographer’s death, or for photos taken before 1944. If there’s no longer any copyright, you can do what you like with the photo, and there’s no requirement to credit anyone. Some institutions put all sorts of conditions and restrictions on the use of out-of-copyright photographs, but you should usually just ignore those; feel free to be polite and identify the author or source, but you don’t have to.
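Those two rules of thumb are simple enough to write down as a check. Here's a minimal Python sketch; the function name and its parameters are made up for illustration, and it assumes the 50 years run to the end of the calendar year, so treat it as a rough guide only.

```python
from typing import Optional


def probably_out_of_copyright_nz(
    year_taken: Optional[int],
    photographer_death_year: Optional[int],
    current_year: int,
) -> bool:
    """Rough check using only the two rules of thumb described above."""
    # Rule 1: photos taken before 1944.
    if year_taken is not None and year_taken < 1944:
        return True
    # Rule 2: 50 years have passed since the photographer's death.
    # Assumption: the term runs to the end of the 50th calendar year.
    if photographer_death_year is not None:
        return current_year > photographer_death_year + 50
    # Not enough information: treat the photo as still in copyright.
    return False
```

If you don't know when the photographer died and the photo isn't obviously pre-1944, the sketch plays it safe and answers False.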

LICENCE

If the photo is still in copyright, the default licence is All Rights Reserved. That usually means you need the permission of the copyright holder to reuse it; there are only a few exceptions (research, private study, criticism, or review – as I’m doing in the examples below). But even if you don’t need permission to reproduce the photo, you should still credit it properly.

You don’t have to put the full credit in the photo caption, and often publishers don’t. They’ll put the copyright owner’s name in tiny capitals under or beside the image, and in a “photo credits” section say something like: “p 74: © Daisy Smith / All Rights Reserved. Used with permission.” You can do this too.

If there’s no copyright (it’s expired, or the creator has released the photo into the public domain) you should say “No known copyright” or “Public domain” in the credit. You don’t have to, but it’s good practice. When you reproduce a photo, you should also be telling people if or how they can reuse that photo themselves: don’t give them false information!

Some photos are released under a Creative Commons or CC licence: everything in WikiCommons* (which supplies most of Wikipedia’s photos), much of Te Ara, and some of Flickr or DigitalNZ. A Creative Commons image is free for anyone to use, but there are usually one or more conditions. If you found the photo on Wikipedia, you should click through to view it, then click “View on Commons” to see where it’s stored on WikiCommons and what those conditions are. For example, the licence might be CC BY 4.0 (translation: Creative Commons, Attribution). This means you have to credit the copyright holder (usually the photographer). This isn’t optional, or being nice! It’s a legal agreement you’ve entered into by reusing that photograph. Sometimes the photographer only gives their silly Wikipedia username. You still have to credit them.

You also have to state what kind of CC licence the photograph has (so people know what they themselves can use it for), and ideally give a source or link that lets people find the original. This is also part of the legal agreement you’ve entered into! The best way to do this if the photo’s online is a link to its page in WikiCommons. In print media you can just say “WikiCommons” or “Wikimedia Commons”.

If a DOC office emailed you this Brown Teal photo, the credit would look something like “© DOC / CC BY”, which translates as “DOC owns the copyright, and they’re releasing the photo under a Creative Commons Attribution licence”. If you found this photo on Flickr, you could link to it directly, using the licence it says on Flickr: “© DOC / Flickr / CC BY 2.0”. But looking at the photo’s metadata tells us it was taken in 2009 by “Fantommst”, who turns out to be Auckland photographer Lisa Ridings. Presumably DOC commissioned it, or she was working for them. So an even better credit would be “© DOC / Lisa Ridings / Flickr / CC BY 2.0”.
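The credit lines above all follow the same pattern: copyright owner, then photographer and source if you know them, then the licence. As a minimal sketch, assuming that ordering and a " / " separator (the function and parameter names are hypothetical, not any standard), it could look like this in Python:

```python
from typing import Optional


def credit_line(
    owner: Optional[str],
    licence: str,
    photographer: Optional[str] = None,
    source: Optional[str] = None,
) -> str:
    """Assemble a photo credit in the order used in the examples above."""
    parts = []
    if owner:
        # Only claim copyright when there is a copyright owner.
        parts.append(f"© {owner}")
    if photographer and photographer != owner:
        # Credit the photographer too when they are not the owner.
        parts.append(photographer)
    if source:
        parts.append(source)
    # The licence always goes last, so readers know what they can do.
    parts.append(licence)
    return " / ".join(parts)


print(credit_line("DOC", "CC BY 2.0", photographer="Lisa Ridings", source="Flickr"))
```

Leaving the owner empty and passing "Public domain" as the licence gives a credit with no © claim, which suits an out-of-copyright photo.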

SOME REAL LIFE EXAMPLES (TWEAKED TO PROTECT THE OFFENDERS)

This is a Sharon Murdoch cartoon published in Stuff. It’s still very much in copyright, even though there’s a copy hosted in the NLNZ collection. So we’d need to ask for permission to reuse this, and the credit line would be:

© Sharon Murdoch/Stuff, All Rights Reserved


“Wikimedia” definitely didn’t take this photo of the Champs Élysées; when we track it down in Wikimedia Commons, we see that German war correspondent Johannes Jörgensen did, and the German National Archive has specified how they’d like to be credited. So the correct credit, with a link to the original, is:

Bundesarchiv, Bild 101I-362-2210-05A / Jörgensen / Wikimedia Commons / CC-BY-SA 3.0.


This is a photo of the late entomologist Ray Shannon, from his personal papers, but just being in a photo doesn’t make you the copyright holder. Ray almost certainly didn’t take this photo, and since it’s less than 50 years old it’s definitely copyrighted to someone; we’d need to track them down and get permission before we could use it at all.


The dreaded Mr or Ms “Supplied”, who seems to take most of the photos in newspapers. The photo is actually a selfie by Axel Wilke, and available in WikiCommons, so the Herald should have put:

© Axel Wilke / Commons / CC BY-SA 4.0


TO SUMMARISE

We should be obeying the law and crediting creators properly. When you use someone else’s photo, clearly state whose it is, and what licence it’s released under. Let’s lift our game.


A version of this blog post appeared in the February 2021 edition of the LIANZA journal Library Life.


* Just to clarify some confusing names: Creative Commons is a licensing scheme for copyrighted photographs; Wikimedia Commons or WikiCommons is a website that hosts freely-usable photos. Most of the photos in WikiCommons have a Creative Commons licence.

An introduction to Wikisource

On 10 February 2021 a Wikisource volunteer who edits under the username Beeswaxcandle gave a workshop to a dozen West Coast librarians, museum workers, and history buffs on how Wikisource works and why it might be useful for digital heritage. These are my notes summarising his talk.

Beeswaxcandle has been a Wikisource editor – a wikisourcerer – since 2009. He began as a Wikipedia editor, but when one of his first articles was promptly shredded by other volunteers he turned to the calmer backwater of Wikisource. He is a musician, and one of his biggest projects there has been transcribing the 1900 Grove Dictionary of Music and Musicians, still in progress (he’s up to Schubert).

Wikisource is a free online digital library that anyone can improve. Its logo is an iceberg; much is happening beneath the surface. It was created as a sister project to Wikipedia, and aims to be a reference library of primary source texts. Its value as a repository is not that it contains scanned images of the texts, but that these have been transcribed and proofread by volunteers (at least two per page) so they can be found by search engines.

The great strength of Wikisource is that its transcribed text is backed up by the original scanned pages; anyone can check the source and correct mistakes. It’s useful to compare it to the free ebook library Project Gutenberg. Gutenberg doesn’t include the original scanned text, and will sometimes merge several different editions: there’s no way to check the accuracy of its transcriptions. (Wikisource began back in 2003 as “Project Sourceberg” and numerous Gutenberg transcriptions have been added to Wikisource, even without scanned pages to back them up.)

There are other Wikimedia projects for housing texts that have different goals to Wikisource: Wikimedia Commons can host images or entire PDFs of publications, but it doesn’t transcribe; Wikibooks is a home for annotated publications and study guides; but Wikisource is solely for accurate transcriptions of the text as written, not commentary nor interpretation.

Wikisource is a small community, with only 421 active users in English and 26 admins. There are substantial efforts in French, Russian, and other languages: 98% of French Wikisource works are backed up by scanned pages, compared with 58% in English. The culture seems more laid-back and friendly than Wikipedia, with people working together to finish someone else’s project as a nice gesture. Rates of vandalism seem low, and the community uses watchlists and the list of recent changes to keep tabs on it. There are regular collaborations and “Proofread of the Month” projects – January is “quirky” month.

Author links to Julius von Haast’s Wikisource page

Usually the only blue hyperlinks in a Wikisource text will be Author pages, which display a short bio and a list of publications. Wikipedia-style linking would be seen as commentary. [To me author pages looked like something that could be generated from Wikidata, but Wikisource seems to have an arm’s-length relationship to Wikidata, and each project blames the other. — Mike] Because Wikisource hosts mostly public-domain works, the authors will usually be long-dead, so no page for J.K. Rowling. The rolling cutoff date for the US public domain is 1925: on January 1st this year The Great Gatsby entered the public domain, and very soon after a full transcribed and downloadable version appeared in Wikisource. [During question time there was some discussion about approaching local historians and convincing them to donate their copyrights to the public domain so some of the small local histories could be made available through Wikisource.]

Out of all the texts published in English from Chaucer to 1925, Wikisource holds about 300,000. That includes plenty of out-of-copyright novels, but Wikisource can also host:

From Why the Shoe Pinches  (1861) by Georg Hermann von Meyer

There are also portals: curated collections on a particular area. The New Zealand portal (a selection from Category:New Zealand) includes legislation, treaties, travel writing, and floras. Some highlights:

One gem is the Letters from New Zealand, 1857–1911 by clergyman Henry W. Harper, who spent 9 years on the West Coast. Advice given to him before moving to Hokitika: “Take an old hand’s advice, don’t be discouraged, and if it rains, let it rain.”

But there is very little New Zealand work, and a need for more volunteers here to get busy expanding and correcting Wikisource’s holdings. There’s also plenty to do sorting and categorising what’s already been done, sourcing better images, finding scanned versions of works already done, and creating author pages.

To add a text to Wikisource, it needs to be scanned at 200 dpi or higher for OCR to work; 250–300 is fine. You can also source already-scanned works from the Internet Archive, the Biodiversity Heritage Library, or the Hathi Trust (which has clearer scans than the Internet Archive). An example text that could be brought into Wikisource would be the Hathi Trust scan of Horatio Gordon Robley’s Pounamu: notes on New Zealand greenstone (1915). I’ve written more on the process of scanning and OCR in another blog post. Once in Wikisource, each page needs to be both proofread and validated by volunteers – and they have to be two different volunteers (also known as Wikisourcerers). As this is happening the work can be transcluded into the main namespace, which is Wikisource jargon for being turned into a live digital document available for download.

George Marriner’s 1908 The Kea: a New Zealand problem, badly OCR’d in the Biodiversity Heritage Library, and a good candidate for Wikisource.

One example of a Wikisource project happened during COVID lockdown in the UK, when the staff of the National Library of Scotland were sent home. The NLS had an extensive collection of scanned pamphlets – for example, The surprising adventures, miraculous escapes, and wonderful travels of the renowned Baron Munchausen – which had been (poorly) OCR’d, but needed to be proofread by humans. The resulting Wikiproject, advised by Beeswaxcandle, uploaded thousands of works, and over 1100 were eventually transcluded, by dozens of NLS staff, from April to August 2020. The NLS was then able to take the corrected text and reimport it to their database.

The talk was well received, with plenty of discussion, and may well be a catalyst for West Coast heritage organisations using Wikisource to make their collections more accessible. Watch this space.

Digitising a tiny book

Many books about the West Coast are a) out of print, b) out of copyright, and c) printed in very small runs. Consequently, this history is only available in a few libraries, so only accessible to a few thousand people.

One solution is to reprint these out of copyright books, and a couple of small New Zealand publishers have started doing this; the results are expensive and not very high quality, and the distribution is still quite limited. How could we make this West Coast history available to millions of people, ideally for free?

A simple way to do this would be to digitise the books and release them online, so anyone can download them for free, for reading as an ebook or to print their own copy. The problems are twofold: the work of typing and proofreading, and finding a permanent place to host the result. One solution is to use the Wikimedia Foundation project Wikisource, a repository for free public-domain texts, which volunteers collaboratively transcribe and proofread in their free time. Here’s how it works, using a small book from the Westland District Library collections and easily-available hardware and software.

Hokitika, N.Z. is a 24-page pamphlet containing a talk by Westland County Clerk David Evans on the origins of Hokitika, essays by William Evans (no relation) and journalist Samuel Saunders, and an excerpt from the writings of Julius von Haast. It was printed as a fund-raising booklet by the Hokitika Guardian in 1921; apart from us, the only libraries in the world which have a copy are in Dunedin, Wellington, and Adelaide. Being pre-1925 means it’s out of copyright in the USA (something that Wikisource requires, because that’s where its servers are hosted).

I started by scanning each page as monochrome text TIFF files at 400 dpi. This didn’t require a fancy scanner, just the library’s multifunction photocopier (an ApeosPort VII) and the free Image Capture that came with my MacBook. A dedicated scanner and software would certainly have sped things up, but they weren’t necessary. It’s important to crop pages evenly and quite close to the text, because a page margin gets added later during printing. I took extra time to clean up the scans in Affinity Photo and adjust the contrast, erase spots and lines, and straighten the columns.

I used Print > PDF in Preview to export the scans as a single PDF; Preview let me reshuffle pages and drop photographic plates into the right place. I could then print the PDF 2-up on A4 using Preview’s booklet layout, and with a long-arm stapler could turn them into an A5 pamphlet (be sure to choose Print Entire Image or the text can be cropped). At this stage we now have a printable PDF which can produce a far better copy of the original than a photocopier could.
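Preview works out the booklet page order for you, but it helps to know what a booklet layout is doing when you check the printout. Here's a rough Python sketch of the usual saddle-stitch ordering, assuming the page count is padded with blanks to a multiple of four; the function name is made up for illustration and this is not Preview's own code.

```python
from typing import List, Optional, Tuple

Side = Tuple[Optional[int], Optional[int]]  # (left page, right page)


def booklet_sides(page_count: int) -> List[Side]:
    """Return the page pairs for each printed side, in printing order."""
    # Pad to a multiple of four: each folded sheet carries four pages.
    padded = -(-page_count // 4) * 4

    def page(n: int) -> Optional[int]:
        # Numbers beyond the real page count are blank padding pages.
        return n if n <= page_count else None

    sides: List[Side] = []
    for sheet in range(padded // 4):
        # The outermost sheet comes first, working in towards the centre.
        front = (page(padded - 2 * sheet), page(1 + 2 * sheet))
        back = (page(2 + 2 * sheet), page(padded - 1 - 2 * sheet))
        sides.extend([front, back])
    return sides


for left, right in booklet_sides(24):
    print(left, right)
```

For a 24-page pamphlet that comes to six sheets of A4 printed on both sides, with pages 24 and 1 sharing the outside of the outermost sheet.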

I used Affinity Photo to stitch together a better version of the pamphlet’s cover – without library stickers – and replaced the one in the PDF. Then I uploaded the PDF to Wikimedia Commons, as well as all the illustrations separately, all the typographic ornaments as high-resolution 1-bit (black and white, not greyscale) TIFFs, and the fancy headings (just because I thought they were a bit hilarious). The images, cover, and PDF were then all available in Category:Hokitika, N.Z : the Birth of the Borough (1921) in Commons for anyone to use.

The PDF and all the images need to be uploaded to WikiCommons first, so Wikisource can use them to assemble pages.

With a clean PDF in Wikimedia Commons, I could create an Index in Wikisource, which lists all those pages and shows whether they’ve been proofread or not. This is a critical step when working with volunteers, who can see at a glance which pages to work on. A page in Wikisource is first proofread side by side with its PDF scan, then saved, and finally verified by a different editor, so every page gets checked at least twice. When all the pages are validated the book can be made available as a [Wikisource work](https://en.wikisource.org/wiki/Hokitika,_N.Z.): essentially becoming a long web page with digital text. There’s a certain amount of formatting, and images are inserted in the right places, but the type size and font are up to the reader’s settings or screen reader.
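The two-person rule is easier to see written out. Below is a toy Python model of it; this is not how Wikisource's software is implemented, and the class, method, and editor names are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    number: int
    proofread_by: Optional[str] = None
    validated_by: Optional[str] = None

    def proofread(self, editor: str) -> None:
        self.proofread_by = editor

    def validate(self, editor: str) -> None:
        if self.proofread_by is None:
            raise ValueError("page must be proofread before validation")
        if editor == self.proofread_by:
            # The second check has to come from a different person.
            raise ValueError("validator must be a different editor")
        self.validated_by = editor

    @property
    def status(self) -> str:
        if self.validated_by:
            return "validated"
        if self.proofread_by:
            return "proofread"
        return "not proofread"


def all_validated(pages: List[Page]) -> bool:
    """True only when there is at least one page and every page is validated."""
    return bool(pages) and all(p.status == "validated" for p in pages)


page = Page(1)
page.proofread("EditorA")  # hypothetical usernames
page.validate("EditorB")
print(page.status)
```

An index is then just the list of pages with their statuses, which is what lets a volunteer see at a glance where to pitch in.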

The typesetter at the Hokitika Guardian was determined to use all the fancy fonts and squiggly rules in the type cases.

Rather than transcribe each page by hand, I used Optical Character Recognition (OCR) software to generate a rough draft. There are plenty of free services that will perform OCR on an uploaded image, and the best ones seem to use the software Tesseract. First I tried uploading single pages to PDF24 Tools: a page took 1 minute to OCR, and 7 minutes to manually clean up ready for upload (cleanup is just sorting line ends and column breaks; the proofreading still has to happen). I also tried the Mac and Windows software PDF OCR X; the community edition can only convert one page at a time, but was reasonably speedy and understood two-column pages. OCRing and uploading the 23 pages of text took half an hour. Wonky columns cause big problems for OCR software, so I was glad I spent some time cleaning up the scans first.
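I did the cleanup by hand, but sorting out line ends is the sort of thing a short script could take a first pass at. Here's a minimal Python sketch, assuming blank lines mark paragraph breaks and that a hyphen at the end of a line means a word was split; the function name is hypothetical, and it won't fix column breaks.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Rejoin hard-wrapped OCR lines into paragraphs."""
    paragraphs = []
    # Blank lines are assumed to mark paragraph breaks.
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        text = ""
        for line in lines:
            if text.endswith("-"):
                # Assume a trailing hyphen is a word split across lines.
                text = text[:-1] + line
            elif text:
                text += " " + line
            else:
                text = line
        paragraphs.append(text)
    return "\n\n".join(paragraphs)


sample = "The town was pro-\nclaimed a borough\nlong ago.\n\nSecond para-\ngraph here."
print(tidy_ocr_text(sample))
```

The hyphen rule will wrongly close up a genuine compound that happens to break at its hyphen (a "fund-raising" split across two lines, say), which is one more reason the proofreading still has to happen.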

As a new transcription the book briefly featured on Wikisource’s home page, just below the intriguing-sounding Dream of a Rarebit Fiend.

Once digitised, proofread, and verified, the book can be downloaded from Wikisource as an EPUB (all e-readers except the Kindle), PDF, or MOBI (for Kindles).

Proofreading took about 10 minutes per page, and an experienced Wikisource editor, User:Beeswaxcandle, validated each one and handled the formatting of headings, page numbers, and even the fancy page rules. Page formatting is not too complicated, and it’s all documented in the Wikisource Help pages, but I’m now preparing a handout for new proofreaders with a list of tips and the common templates you’d use to validate pages. Someone brand new to Wikisource could start proofreading text right away, and leave the more technical stuff to someone else.

Having the book online as text – not just page images – opens up lots of possibilities. The book can be indexed by Google and the contents more easily discovered (it’s now deposited in the Internet Archive and the Open Library, for example). It can be downloaded as a PDF and read on a tablet, or uploaded to Overdrive to be borrowed from the Westland District Library as an EPUB file – which means it’s now accessible to the sight impaired, who use a screenreader or need to increase font size. It can be used as a reference in Wikipedia articles and cited in Wikidata; in a future blog post I’ll demonstrate using this book to support a “Streets of Hokitika” Wikidata project.

Digitising Hokitika, N.Z. (1921) has been a proof of concept, and shows there is scope for proofreading and validating longer texts. Short books we can prepare by hand, but longer ones we’ll want to use better scanning technology for. Notably, some West-Coast-related books have already been digitised and OCR’d by the Internet Archive or the Biodiversity Heritage Library, like George Marriner’s 1908 book The Kea : a New Zealand Problem, and could be imported to Wikisource for proofreading right now.

The next step will be to build up a community of Wikisource volunteers, who could be anywhere in the world but ideally here on the West Coast. We now have regular Wikipedia meetups in Hokitika and Greymouth and can suggest proofreading projects to the attendees. In February Wikisource veteran Andrew Wooding is giving a presentation to local GLAM people, and we’ll have a Wikisource workshop at the West Coast WikiCon in Hokitika in March. I’m hoping people interested in genealogy and local heritage or working in museums, libraries, and archives will see the potential for making out of copyright books much more available.


Many thanks to Sara Thomas from the National Library of Scotland and Andrew Wooding (User:Beeswaxcandle) for all their help with this project.

Arriving on the West Coast

I moved to Hokitika on Sunday 22nd November and started at Westland District Library as a Digital Discovery Librarian the next day. Starting this job marks the end of over two years of having no fixed abode; in June 2018 I left my job as a curator at Whanganui Regional Museum and hit the road as New Zealand Wikipedian at Large, travelling from North Cape to Bluff and helping institutions take Wikipedia seriously. As a roving Wikipedian I went to conferences in Bangkok, Berlin, and Stockholm, lived in Estonia for a month (and in Palmerston North for five months to make up for it) and then in September arrived on the West Coast.

Development West Coast had sponsored me as West Coast Wikipedian at Large to spend six weeks travelling from Westport to Fox Glacier and running workshops for libraries, museums, tourism operators, and the general public. While I was at Westland District Library, the manager Natasha Morris asked me if I’d ever considered relocating to Hokitika. It turned out there was a $59 million COVID relief package allocated to libraries in recognition of their value to communities, administered by the National Library, and Westland District Council had secured funding for two staff positions to run until June 2022.

So for the first time in my life I’m a librarian. I need to learn how to issue, check in, shelve, place holds, and handle overdue fines, but most of my work will be dealing with online sources, photographs, newspapers, and blog posts. As a Digital Discovery Librarian my brief, very broadly, is to help West Coast stories get told online, and empower the people of the Coast with the skills to do that.

Natasha and I are currently brainstorming projects for me to tackle. Some of them will be Wikipedia-based, like making sure there’s a good article about every library and museum on the West Coast—best done by recruiting and training volunteer editors from the community, and supporting them over 18 months so they form a self-sustaining editing community. Some will be working with photo collections, looking at ways to digitise them and make them more widely available and shareable. And some will be working with books: getting some out-of-copyright and out-of-print historical works online. I’m looking forward to working with communities like Ōkārito, Fox Glacier, and Haast, as well as collaborating with librarians in Greymouth and Westport.

Westland District Library • MRD • CC BY

After a week on the job I’ve been joined by Rauhine Coakley, a Community Engagement Librarian supported by the same National-Library-administered fund. So we’ve almost doubled the library team here in Hokitika. I personally think this is the coolest part of the West Coast, a little town that punches above its weight. And it’s close to coastal forest, walking tracks, lakes, and beaches, all of which appeal to my love of getting out into nature and looking at ferns and insects.

The rules for the Hokitika Free Public Library required gentlemen to remove their hats and not spit on the floor.

Over the course of my time as Digital Discovery Librarian I’ll be blogging my progress each month, and sharing quirky and fascinating things I come across. My goal is also to compile useful resources for institutions and individuals wanting to open up their collections and tell stories online. If you want to participate or have ideas for projects, contact me at Mike.Dickison@westlib.co.nz. Kia ora koutou!