{"id":604,"date":"2026-03-13T17:09:59","date_gmt":"2026-03-13T17:09:59","guid":{"rendered":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/?p=604"},"modified":"2026-03-13T17:09:59","modified_gmt":"2026-03-13T17:09:59","slug":"why-pdfs-get-a-bad-rep","status":"publish","type":"post","link":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/2026\/03\/13\/why-pdfs-get-a-bad-rep\/","title":{"rendered":"Why PDFs get a bad rep"},"content":{"rendered":"<p><span data-contrast=\"none\">PDFs on the website aren\u2019t great. You know that, we know that, but they\u2019re still there. Why that?<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">There is valid reason to upload a document to the website for certain types of information. In our <\/span><a href=\"https:\/\/www.bath.ac.uk\/topics\/typecase-for-content-manual\/\"><span data-contrast=\"none\">guidelines for Typecase<\/span><\/a><span data-contrast=\"auto\">, we recommend that you only add a document to the website if:<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">people need to download it<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">people need to print it<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\n<\/ul>\n<ul>\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"2\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" data-aria-posinset=\"3\" data-aria-level=\"1\"><span data-contrast=\"auto\">the content of the document is more than 10 pages long<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\n<\/ul>\n<p><span data-contrast=\"auto\">We\u2019ve recommended this for many years now and we still do, but you may have wondered why.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:278}\">\u00a0<\/span><\/p>\n<h1><span data-contrast=\"none\">It\u2019s all about accessibility - except it isn\u2019t<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"auto\">If you\u2019ve ever asked 
someone to help you add a document to a website, you\u2019ve probably been told that you shouldn\u2019t be uploading a PDF to the website. Accessibility, or lack of, is probably the main reason you were quoted. It\u2019s a good reason. Well done to whoever told you that.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">PDF documents are often mostly, if not entirely, inaccessible to people who use things like screen readers to dictate content or need to customise their view. There is also <\/span><a href=\"https:\/\/www.bath.ac.uk\/campaigns\/digital-accessibility-and-why-its-important\/\"><span data-contrast=\"none\">legislation against inaccessible content<\/span><\/a><span data-contrast=\"auto\">.\u00a0\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">\u2018But my document is accessible\u2019<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">You\u2019re smug. Yes, you can make documents, including PDF documents, more accessible. Good job if you\u2019ve done that, enjoy being smug.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">But, and it\u2019s a long but, it\u2019s not the only reason why you shouldn\u2019t put your PDF on the website. There are other reasons. Let\u2019s take a look at those.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h1><span data-contrast=\"none\">The elephant in the PDF<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"auto\">Let\u2019s get it out of the way. 
Have you ever opened a PDF and thought: I need to copy information from this into something I\u2019m working on?<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Your fingers will have swiftly found the Ctrl + C\/V keys (<\/span><span data-contrast=\"auto\">\u2318<\/span><span data-contrast=\"auto\"> + C\/V if you\u2019re on a Mac), and you\u2019ll shortly be greeted with a screen full of formatting and fonts you\u2019ve never heard of. Say hello to Boring Sans, Cringe Gothic or Gravitype.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Oh, you\u2019ve set your Paste command to default to plain text rather than copying the formatting? Smart. Except, those sentences look like they end at the wrong place, and those paragraphs sure aren\u2019t formatted correctly. Enjoy your next ten minutes with the backspace and hard return keys; you\u2019ve earned it.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">We\u2019ve all been here<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">We\u2019d all rather be doing something else than tidying up formatting. 
Sure, there are other ways to avoid it, but if you\u2019re adding a document to the website, it's probably because you want other people to be able to do something with the information, so why not make it easier for them?<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h1><span data-contrast=\"none\">It\u2019s bad customer service<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"none\">I was looking for a bus timetable recently on my phone\u2026<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I wanted to know what time the first bus of the day left so that I could plan my week and future weeks. Sure, I could have used a navigation app, but that relies on data being fed through, which may or may not be entirely accurate when services are changing and you\u2019re planning weeks ahead.\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I wanted certainty, so I went to the source: the official bus service website. They\u2019ll be sure to tell me what\u2019s what with their timetabling.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I found my bus service, and I was presented with a PDF of the timetable. Sure, the information was there, but I had to find the right document, scroll through the many pages to get to my bit, and then zoom in on my small phone screen to find the day and time I was looking for. 
It\u2019s there, hooray!<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">Not a great user experience<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">Did I feel like this was a slick process that was helping me, as a bus user, get what I needed quickly? Not really.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Did I feel like the bus company had created a document for their own internal system and then stuck it on the website because they just needed to get it out there? Yes. And there is some sense to that.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">If there was an emergency bus service, or an urgent change to the timetable, and getting that information out was the priority, then it\u2019s sort of understandable, although not excusable (accessibility, see my earlier point). But this document was there for some time.\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">We can all relate in our jobs; sometimes we do need to get information out quickly. It needs to go out, and today\u2019s problem can be fixed tomorrow. Except it wasn\u2019t in this case.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Overall, I came away thinking someone could have made this easier for me, but they didn\u2019t have time or couldn\u2019t be bothered. Obviously, they\u2019re busy people with a lot to do and they have their own processes and things that this needs to work around. But maybe that\u2019s the point: some of these problems could be solved at the same time. 
It could take a similar amount of time to create a web page as it would a PDF.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">What about printing?<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">You might be thinking from my bus example earlier that some people might want to print the timetable out and stick it up in their home, office or file it away in their Filofax (other personal organisers are available).\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Printing a document is a very good reason to use a PDF. It does preserve the formatting. But it should also be accessible if that is the only way to get the information.\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Your website should also really use a print style sheet to help people print the information they need, not the whole page. If you have that, you may not need another PDF.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h1><span data-contrast=\"none\">It\u2019s time-consuming for me and you<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"auto\">While it can seem like a simple solution to use a PDF for your information on the website, it can waste your time and your visitors\u2019 time.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">Your time<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">We all work with documentation and have administrative duties we need to complete in our roles. 
You may update a document and think \u2018job done\u2019, but if you plan on using that document online, you\u2019re going to need to make sure you have prepared the document for the website and make it accessible.\u00a0Remember the legislation I mentioned earlier.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">It\u2019s time consuming to do this each time. Especially if the information that is relevant to visitors may only be a small part of that document. That can sting.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:278}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">It can be more efficient to update a web page, which is accessible, than do this to a document each time you need to add it to the website.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:278}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">My time<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p><span data-contrast=\"auto\">If I\u2019m visiting a website, I will unconsciously know how to use the website based on established design conventions and your own design systems. Things are usually consistent. 
Your buttons look like buttons, I know what information is a heading and what is a paragraph of text, and I am spoken to in a consistent tone of voice.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I then open the PDF that is linked from your website...<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">It\u2019s like being taken from the shiny showroom into the back admin office with the door slamming shut behind you. The normal rules of casual human conversation (\u2018Hi Matt\u2019) no longer apply in this room, as I am addressed as though I am a distant unknown visitor (\u2018Dear applicant\u2019). I now have to reorient myself, because your language and information architecture are completely different from what I was just using. It takes me longer to work out what I am looking for.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:276}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">I\u2019m exaggerating, but I hope you can see my point. 
The landscape has changed for the user and even a small change, when combined with other changes, can become a big deal.\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:276}\">\u00a0<\/span><\/p>\n<h1><span data-contrast=\"none\">On the internet, no one hears your PDF<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:360,&quot;335559739&quot;:80,&quot;335559740&quot;:278}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"auto\">Ok, that\u2019s not true. PDFs and documents are visited online and indexed by search engines.\u00a0<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">PDFs and documents can rank lower in search results than web pages, especially if they lack structure, metadata and keywording. This can be even more problematic if there are lots of web pages with similar information online, as they are likely to be prioritised in search results.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">If you\u2019re adding a document to the website, it is presumably because it needs to be seen, so don\u2019t put your content in a format that is going to give it less chance of being found.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<h2><span data-contrast=\"none\">\u2018<\/span>But everyone is using AI now, websites are for grown-ups who don't get it<span data-contrast=\"none\">\u2019<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:160,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h2>\n<p>Ouch. 
Well, you may have a bit of a point, but still, ouch.<\/p>\n<p>Ok, well, firstly, where do you think those LLMs are getting a lot of their information from? It's not magic.<\/p>\n<p>Secondly, and you may be surprised to hear this, even some of the most advanced AI models struggle hugely with PDFs. At least at the time of writing this in early 2026, many of these models can't understand the information in PDF documents well.<\/p>\n<p>This is because PDFs were never built to be read by AI; they were built to maintain the visual appearance of a document for humans (remember the 90s? Oh, sorry). The content within them is encoded in such a way that, to a machine at least, the reading order isn't clear. Even if a model can extract the information, the meaning may be lost, increasing the risk of inaccuracy or those pesky hallucinations.<\/p>\n<p>AI is improving, of course, at a rapid rate. There will come a day when AI can understand PDFs much better than it does now, but some of the intrinsic problems with PDFs will likely remain, or the documents will need further work to optimise them for AI.<\/p>\n<h2>If a PDF falls into an LLM, does it make a\u2026?<\/h2>\n<p>There is an interesting point to be made about this. If AI doesn't understand what is in a PDF, then does the information exist (to the machine, at least)?<\/p>\n<p>Think about the kind of information that is typically contained within a PDF - a poster (meh), a timetable for an event (hold on), an academic study based on research and evidence (ok, I see where you're going).<\/p>\n<p>I'm not suggesting that none of these things exist to AI; some information within PDFs can be extracted, but it's an interesting thought. If you add a PDF to the website, you are making it harder for AI to extract information from it. Maybe you don't want AI to use your PDF, and sometimes that's fine, but maybe that information is useful. 
Although you might not want AI to find your content, you do want humans to find it, and who do you think is using the AI?<\/p>\n<h1><span data-contrast=\"none\">Final thought<\/span><span data-ccp-props=\"{&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:360,&quot;335559739&quot;:80}\">\u00a0<\/span><\/h1>\n<p><span data-contrast=\"auto\">Accessibility is still a big part of why you shouldn\u2019t add documents to the website. If you do need to add a document to the website, you should first read our <\/span><a href=\"https:\/\/www.bath.ac.uk\/guides\/making-digital-documents-accessible\/\"><span data-contrast=\"none\">making digital documents accessible guide<\/span><\/a><span data-contrast=\"auto\">.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">But rather than simply trying to make your document accessible, you should be thinking about the end user and whether this is the best way to help them get what they need.<\/span><span data-ccp-props=\"{}\">\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>PDFs on the website aren\u2019t great. You know that, we know that, but they\u2019re still there. Why that?\u00a0 There is valid reason to upload a document to the website for certain types of information. 
In our guidelines for Typecase, we...<\/p>\n","protected":false},"author":41,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[39,29,8,30],"tags":[],"class_list":["post-604","post","type-post","status-publish","format-standard","hentry","category-accessibility","category-content-design","category-content-guidance","category-user-experience"],"acf":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/posts\/604","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/comments?post=604"}],"version-history":[{"count":0,"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/posts\/604\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/media?parent=604"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/categories?post=604"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.bath.ac.uk\/digital-content-and-development\/wp-json\/wp\/v2\/tags?post=604"}],"cur
ies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}