PDF OCR and Tagging

What is Optical Character Recognition (OCR) and why is it important?
When a document is scanned to PDF, the scanner essentially takes a picture of each page. The resulting file is one big graphic with no text that a screen reader can understand, making it completely inaccessible to someone who is blind or has a reading disability. It is also less useful for all students: the text is not searchable, and a student cannot highlight sections or copy quotes for their notes.
Scanned
- One giant graphic
- Not searchable
- Not editable
- Not usable by screen readers (there is no text – only an image)
OCR’d
- Text based
- Searchable
- Editable
- Usable by screen readers
Tagging Documents
OCR is only part of the equation. Documents should also be tagged for headings and alt-text and checked for other accessibility issues so that people using assistive technologies such as screen readers can navigate them easily.
For more information on making your materials accessible, visit the Aim for Accessibility Website or sign up for our Accessible Document Creation course available on our Workshops and Events page.
The best way to produce a tagged, accessible PDF is to:
- Create your Word document or PowerPoint presentation using styles/headings (and following all other accessibility recommendations), then convert it to a PDF using the Save As option or the Acrobat Pro DC ribbon (if available), or
- Create your document in Google Suites and use the Grackle Accessibility Tool to check for accessibility, then use the Export to PDF button in Grackle to create an accessible PDF.
If you do not have the original document and cannot find an accessible version, you can use the Panorama Max tool to create and remediate a scanned document in Canvas.
Keep in mind that making the document accessible from the start and then converting it to PDF will always produce cleaner, easier-to-use results.
Not Your Document?
If you are using a document that was created by someone else, try reaching out to the creator to see whether they have an accessible version.
This is especially important if the document is from a periodical or is publisher content. We need to hold them accountable for making their work accessible.
You can also check with the library, which may be able to help find an accessible version. The library also has an Overhead Scanner that makes it easy to scan books and create accessible documents from them.