Why PDFs get a bad rep

Posted in: Accessibility, Content design, Content guidance, User experience

PDFs on the website aren’t great. You know that, we know that, but they’re still there. Why is that?

There are valid reasons to upload a document to the website for certain types of information. In our guidelines for Typecase, we recommend that you only add a document to the website if:

  • people need to download it 
  • people need to print it 
  • the content of the document is more than 10 pages long 

We’ve recommended this for many years now and we still do, but you may have wondered why. 

It’s all about accessibility - except it isn’t 

If you’ve ever asked someone to help you add a document to a website, you’ve probably been told that you shouldn’t be uploading a PDF. Accessibility, or the lack of it, is probably the main reason you were given. It’s a good reason. Well done to whoever told you that.

PDF documents are often mostly, if not entirely, inaccessible to people who use things like screen readers to read content aloud or who need to customise their view. There is also legislation against inaccessible content.

‘But my document is accessible’ 

You’re smug. Yes, you can make documents, including PDF documents, more accessible. Good job if you’ve done that, enjoy being smug. 

But, and it’s a long but, it’s not the only reason why you shouldn’t put your PDF on the website. There are other reasons. Let’s take a look at those. 

The elephant in the PDF 

Let’s get it out of the way. Have you ever opened a PDF and thought: I need to copy information from this into something I’m working on? 

Your fingers will have swiftly found the Ctrl + C/V keys (Cmd + C/V if you’re on a Mac), and you’ll shortly be greeted with a screen full of formatting and fonts you’ve never heard of. Say hello to Boring Sans, Cringe Gothic or Gravitype.

Oh, you’ve set your Paste command to default to plain text rather than copying the formatting? Smart. Except, those sentences look like they end at the wrong place, and those paragraphs sure aren’t formatted correctly. Enjoy your next ten minutes with the backspace and hard return keys; you’ve earned it. 

We’ve all been here 

We’d all rather be doing something else than tidying up formatting. Sure, there are ways to avoid it, but if you’re adding a document to the website, it’s probably because you want other people to be able to do something with the information, so why not make it easier for them?

It’s bad customer service 

I was looking for a bus timetable recently on my phone… 

I wanted to know what time the first bus of the day leaves so that I could plan my week and future weeks. Sure, I could have used a navigation app, but that relies on data being fed through, which may or may not be entirely accurate when services are changing and you’re planning weeks ahead.

I wanted certainty, so I went to the source: the official bus service website. They’ll be sure to tell me what’s what with their timetabling. 

I found my bus service, and I was presented with a PDF of the timetable. Sure, the information is there. I had to find the right document, scroll through the many pages to get to my bit, and then zoom in on my small phone screen to find the day and time I was looking for. It’s there, hooray! 

Not a great user experience 

Did I feel like this was a slick process that was helping me, as a bus user, get what I needed quickly? Not really. 

Did I feel like the bus company had created a document for their own internal system and then stuck it on the website because they just needed to get it out there? Yes. And there is some sense to that. 

If there was an emergency bus service, or an urgent change to the timetable, and getting that information out was the priority, then it’s sort of understandable, although not excusable (accessibility, see my earlier point). But this document had been there for some time.

We can all relate in our jobs; sometimes we do need to get information out quickly. It needs to go out, and today’s problem can be fixed tomorrow. Except it wasn’t in this case. 

Overall, I came away thinking someone could have made this easier for me, but they didn’t have time or couldn’t be bothered. Obviously, they’re busy people with a lot to do and they have their own processes and things that this needs to work around. But maybe that’s the point: some of these problems could be solved at the same time. It could take a similar amount of time to create a web page as it would a PDF. 

What about printing? 

You might be thinking from my bus example earlier that some people might want to print the timetable out and stick it up in their home or office, or file it away in their Filofax (other personal organisers are available).

Printing a document is a very good reason to use a PDF. It does preserve the formatting. But it should also be accessible if that is the only way to get the information.  

Your website should also really use a print style sheet to help people print the information they need, not the whole page. If you have that, you may not need another PDF. 

It’s time-consuming for me and you 

While it can seem like a simple solution to use a PDF for your information on the website, it can waste your time and your visitors’ time. 

Your time 

We all work with documentation and have administrative duties we need to complete in our roles. You may update a document and think ‘job done’, but if you plan on using that document online, you’re going to need to make sure you have prepared the document for the website and make it accessible. Remember the legislation I mentioned earlier. 

It’s time-consuming to do this each time, especially if the information that is relevant to visitors is only a small part of that document. That can sting.

It can be more efficient to update a web page, which is accessible, than do this to a document each time you need to add it to the website. 

My time 

If I’m visiting a website, I will unconsciously know how to use the website based on established design conventions and your own design systems. Things are usually consistent. Your buttons look like buttons, I know what information is a heading and what is a paragraph of text and I am spoken to in a consistent tone of voice. 

I then open the PDF that is linked from your website... 

It’s like being taken from the shiny showroom into the back admin office with the door slamming shut behind you. The normal rules of casual human conversation (‘Hi Matt’) no longer apply in this room, as I am addressed as though I am a distant unknown visitor (‘Dear applicant’). I now must reorient myself in this environment, as your language and information architecture are completely different to what I was using a moment ago. It takes me more time in this environment to work out what I am looking for.

I’m exaggerating, but I hope you can see my point. The landscape has changed for the user and even a small change, when combined with other changes, can become a big deal.  

On the internet, no one hears your PDF 

Ok, that’s not true. PDFs and documents are visited online and indexed by search engines.  

PDFs and documents can rank lower in search results than web pages, especially if they lack structure, metadata and keywords. This can be even more problematic if there are lots of web pages with similar information online, as those pages are likely to be prioritised in search results.

If you’re adding a document to the website, it is presumably because it needs to be seen, so don’t put your content in a format that is going to give it less chance of being found. 

But everyone is using AI now, websites are for grown-ups who don't get it 

Ouch. Well, you may have a bit of a point, but still, ouch.

Ok, well, firstly, where do you think those LLMs are getting a lot of their information from? It's not magic.

Secondly, and you may be surprised to hear this, even some of the most advanced AI models struggle hugely with PDFs. At least at the time of writing this in early 2026, many of these models can't understand the information in PDF documents well.

This is because PDFs were never built to be read by AI; they were built to maintain the visual appearance of a document for humans (remember the 90s? Oh, sorry). The content within them is encoded in such a way that, to a machine at least, the reading order isn't clear. Even if a model can extract the information, the meaning may be lost, increasing the risk of inaccuracy or those pesky hallucinations.
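If you like to see these things rather than take my word for it, here is a rough Python sketch of the reading order problem. It is a simplified model, not how any particular PDF tool works: the `Fragment` class, the two extract functions and the timetable values are all made up for illustration. The idea is that a page is a set of text pieces placed at coordinates, and the order they are stored in does not have to match the order you would read them in.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A piece of text placed at a position on the page (hypothetical model)."""
    text: str
    x: float  # distance from the left edge of the page
    y: float  # distance from the bottom edge of the page


# Fragments in the order they happen to be stored (made-up example data).
# Nothing requires this to match the order a person would read them in.
stored_order = [
    Fragment("Saturday", x=300, y=700),
    Fragment("06:15", x=100, y=650),
    Fragment("Monday to Friday", x=100, y=700),
    Fragment("07:40", x=300, y=650),
]


def naive_extract(fragments):
    """Join the text in stored order, as a very simple extractor might."""
    return " ".join(f.text for f in fragments)


def position_sorted_extract(fragments):
    """Guess a reading order: top of the page first, then left to right."""
    ordered = sorted(fragments, key=lambda f: (-f.y, f.x))
    return " ".join(f.text for f in ordered)


print(naive_extract(stored_order))
# Saturday 06:15 Monday to Friday 07:40
print(position_sorted_extract(stored_order))
# Monday to Friday Saturday 06:15 07:40
```

Sorting by position gets the words into a sensible order, but the output still doesn't say which time belongs to which column. That relationship lived in the visual layout, and it is exactly the kind of meaning that can go missing.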

AI is improving, of course, at a rapid rate. There will come a day when AI can understand PDFs much better than it does now, but some of the intrinsic problems with PDFs will likely remain, or the documents will need further work to optimise them for AI.

If a PDF falls into an LLM, does it make a…?

There is an interesting point to be made about this. If AI doesn't understand what is in a PDF, then does the information exist (to the machine at least)?

Think about the kind of information that is typically contained within a PDF - a poster (meh), a timetable for an event (hold on), an academic study based on research and evidence (ok, I see where you're going).

I'm not suggesting that all of these things don't exist to AI; some information within PDFs can be extracted. But it's an interesting thought. If you add a PDF to the website, you are making it harder for AI to extract information from it. Maybe you don't want AI to use your PDF, and sometimes that's fine, but maybe that information is useful. Although you might not want AI to find your content, you do want humans to find it, and who do you think is using the AI?

Final thought 

Accessibility is still a big part of why you shouldn’t add documents to the website. If you do need to add a document to the website, you should first read our making digital documents accessible guide. 

But rather than simply trying to make your document accessible, you should be thinking about the end user and whether this is the best way to help them get what they need. 

