Code Yarns ‍👨‍💻

How to shrink a scanned PDF

📅 2014-Jun-12 ⬩ ✍️ Ashwin Nanjappa

Documents generated by a scanner are typically in PDF format. Scanned PDFs received from others can sometimes be huge, running into hundreds of MBs for just a few pages. Ghostscript is the program most commonly used to shrink a PDF file. Doing this correctly requires a long Ghostscript invocation with many input parameters. Thankfully, there are easier-to-use programs that invoke Ghostscript correctly for us.
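To give a sense of what such a long invocation looks like, here is a minimal sketch that wraps a commonly used set of Ghostscript options in a shell function. The function name shrink_pdf, the GS override variable and the default preset are my own assumptions for illustration; the gs options themselves (pdfwrite device, -dPDFSETTINGS presets such as /screen, /ebook and /printer) are standard, but the right combination for your files may differ.

```shell
#!/usr/bin/env bash
# Sketch only: recompress a PDF by calling Ghostscript directly.
# shrink_pdf and the GS variable are hypothetical names used for illustration.
#
# Arguments: input PDF, output PDF, optional preset (default: ebook).
# Lower-quality presets such as "screen" give smaller files.
shrink_pdf() {
    local input="$1" output="$2" preset="${3:-ebook}"

    # GS lets you point at a different Ghostscript binary; defaults to gs.
    "${GS:-gs}" -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        "-dPDFSETTINGS=/${preset}" \
        -dNOPAUSE -dQUIET -dBATCH \
        "-sOutputFile=${output}" "$input"
}
```

After sourcing the function, it could be used like this:

$ shrink_pdf original.pdf out.pdf screen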

Install Ghostscript and ImageMagick if you do not have them:

$ sudo apt install ghostscript imagemagick

ImageMagick's convert utility can be used to compress the PDF. By default, it tries to use a huge DPI, which makes Ghostscript occupy all the RAM and brings the computer to its knees. Instead, tell convert which DPI to use with the -density option. Start from a small DPI and work your way up until you are satisfied with the output quality.

For example:

$ convert -density 20 original.pdf out.pdf

If you are curious about the Ghostscript invocation that convert uses to perform the conversion, ask it to be verbose:

$ convert -density 20 -verbose original.pdf out.pdf
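The advice to start from a small DPI and work your way up can be automated with a small loop. The sketch below is one possible way to do it: the function name try_densities, the CONVERT override variable and the out-<dpi>.pdf naming scheme are assumptions of mine, not something convert provides. It writes one output file per density and prints its size so you can compare them.

```shell
#!/usr/bin/env bash
# Sketch only: convert the same PDF at several densities and report file sizes.
# try_densities, CONVERT and the out-<dpi>.pdf names are hypothetical.
#
# Arguments: input PDF, followed by one or more DPI values in increasing order.
try_densities() {
    local input="$1"
    shift
    local dpi output size

    for dpi in "$@"; do
        output="out-${dpi}.pdf"

        # CONVERT defaults to ImageMagick's convert; override it if your
        # installation names the command differently.
        "${CONVERT:-convert}" -density "$dpi" "$input" "$output" || return 1

        # wc -c gives the size in bytes; tr strips padding some systems add.
        size="$(wc -c < "$output" | tr -d ' ')"
        printf '%s dpi: %s bytes\n' "$dpi" "$size"
    done
}
```

After sourcing the function, it could be used like this, after which you open each out-<dpi>.pdf and keep the smallest one that still looks acceptable:

$ try_densities original.pdf 20 40 60 80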

Note: Another way to shrink a PDF is to convert it to PS (pdf2ps) and then back to PDF (ps2pdf). I would not recommend this: it does double the work, takes longer, creates a ginormous intermediate PS file, and gives you no control over the DPI or compression ratio.

Tried with: Ubuntu 14.04