Compresspdf.sh

From creative crowd wiki
Revision as of 07:34, 7 April 2023 by Simoon (talk | contribs) (→‎Script)

Script

A PDF's size before compression
The resampled PDF's size, after compression
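#! /bin/bash
# Usage: ./compresspdf.sh <file.pdf> <resolution in DPI>

pdffile=$1  # PDF to compress
dpi=$2      # target image resolution in DPI

# Stop early if an argument is missing.
if [ -z "$pdffile" ] || [ -z "$dpi" ]; then
  echo "Usage: $0 <file.pdf> <resolution in DPI>" >&2
  exit 1
fi

# Rewrite the PDF, downsampling colour, grey and mono images to $dpi.
# A threshold of 1.0 resamples every image above that resolution.
gs \
  -o "${pdffile%.pdf}-resampled.pdf" \
  -sDEVICE=pdfwrite \
  -dDownsampleColorImages=true \
  -dDownsampleGrayImages=true \
  -dDownsampleMonoImages=true \
  -dColorImageResolution="$dpi" \
  -dGrayImageResolution="$dpi" \
  -dMonoImageResolution="$dpi" \
  -dColorImageDownsampleThreshold=1.0 \
  -dGrayImageDownsampleThreshold=1.0 \
  -dMonoImageDownsampleThreshold=1.0 \
  "${pdffile}"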

How to work with it?

What does this script do?
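This script squashes large PDFs into much smaller ones. It uses Ghostscript to resample the images inside the PDF down to a resolution you choose, and writes the result to a new file.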

In which context was it made?
The script was written by Open Source Publishing. It has since travelled through other practices to be used for a variety of applications, usually at the eleventh hour before sending PDFs to print and for grant applications.

What software does it use?
ghostscript

How to download these pieces of software?
You can download Ghostscript from https://www.ghostscript.com/.
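Before running the script, it can help to check whether Ghostscript is already installed. The snippet below is a minimal sketch: `have_tool` is a hypothetical helper name, and the only thing it assumes is that Ghostscript is installed under its usual command name `gs`. Many Linux distributions and Homebrew on Mac OS also offer a `ghostscript` package, which may be easier than a manual download.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the named command is on the PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool gs; then
  echo "ghostscript found, version $(gs --version)"
else
  echo "ghostscript not found; install it first"
fi
```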

On which systems can this script run?
The script runs on Linux and on Mac OS. Windows handles shell scripts differently, so it is likely that it won't work there.

How to run the script?
After installing Ghostscript (see above), save the code above as a bash script with the .sh extension, for example: compresspdf.sh. Don't forget to add #! /bin/bash to the top of your script!

Then, in a terminal session, navigate to the directory where you saved the file. Add the PDF you want to compress to the same directory. If the terminal refuses to run the script, make it executable first with chmod +x compresspdf.sh.

Then do:

./compresspdf.sh <file.pdf> <resolution in DPI>

For example:

./compresspdf.sh input.pdf 300

The script writes the compressed PDF next to the original, with -resampled added to the file name:

input-resampled.pdf
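To see how much the compression gained, you can compare the size of the two files. This is a small sketch and not part of the original script: `size_report` is a hypothetical helper, and the file names are the ones from the example above.

```shell
#!/bin/bash
# Hypothetical helper: print both file sizes in bytes and the
# size of the second file as a percentage of the first.
size_report() {
  local before after
  before=$(wc -c < "$1" | tr -d ' ')
  after=$(wc -c < "$2" | tr -d ' ')
  if [ "$before" -eq 0 ]; then
    echo "original file is empty" >&2
    return 1
  fi
  echo "$before -> $after bytes ($(( after * 100 / before ))% of original)"
}

# Only report when both files from the example exist.
if [ -f input.pdf ] && [ -f input-resampled.pdf ]; then
  size_report input.pdf input-resampled.pdf
fi
```

If the resampled file is not much smaller, try a lower DPI value. PDFs that contain mostly text and vector graphics will shrink less, because the script only resamples images.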

==Preparing print files for ''ongoing circulations''==

This script was written to generate the print files for the publication ''ongoing circulations'' to document a series of five collective learning sessions organised by Varia in 2022. Each session departed from a technological practice and made space for exploring, questioning and (re)turning (to) digital library software, plain text publishing protocols, web-to-print tools, high frequency radio communication, and colonial infrastructures. The publication is based on the format of the letter, a mode of address that was chosen to reach out to the participants of the sessions specifically, but could meanwhile also speak to a larger body of readers and invite them into the questions that were formulated and materials that were produced.

The letters were RISO printed by Printroom, which is a space in Rotterdam dedicated to artist publications, small press and self-publishing. They run a RISO workshop in the city, where they have two A3 RISO machines plus four A4 ones, lots of different colors, and a range of machines for the post production, such as folding machines. Printroom asked us to prepare the print files on A3, because these RISO machines are the most reliable ones they have.

At first i wasn't really sure how to do it... How can each A4 page of the PDF be duplicated and placed side-by-side on one A3 page without losing any quality? Using raster editor software, like Gimp, would not make sense, as i was not sure if this would keep the quality of the PDF intact. Using vector editor software, like Inkscape, would also not make a lot of sense, because the fonts would not be rendered correctly. So i started to look for a tool that could be used from the command line.

The script does two things: it duplicates each page of a PDF, and it places two A4 pages side-by-side on one A3 page.
It took me a while to figure out how you could do this, but eventually [https://superuser.com/a/1549057 this answer on superuser.com] revealed a very nice trick using <code>pdftk</code>. I have used <code>pdftk</code> in the past to extract certain pages from PDFs, so i was aware of the <code>cat</code> option and that you can use it to point to specific pages of your document. I had never seen <code>cat</code> being used to duplicate pages, though! It was a great discovery, especially because other ways of reaching the same outcome involved for loops written in bash, which would probably mean that i would have to write a small bash script to write this all out, and now i needed just one line of code. So in the end, the script uses two PDF manipulation tools: <code>pdftk</code> and <code>pdfjam</code>. The first line of the script calls <code>pdftk</code>, then uses <code>ongoing-circulations.pdf</code> as its input file, then uses <code>cat 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10</code> to select each page of the input file twice, and writes the output to <code>ongoing-circulations-double-pages.pdf</code>.
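
The two steps described above could look roughly like the sketch below. The <code>pdftk</code> line follows the description in the text. The <code>duplicate_pages</code> helper is a hypothetical addition that writes out the page list for documents with a different number of pages. The <code>pdfjam</code> line is an assumption about how the second step might be done: the original command is not shown on this page, and the output name <code>ongoing-circulations-a3.pdf</code> is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every page number from 1 to N twice,
# e.g. "1 1 2 2 3 3" for N=3, ready to pass to pdftk's cat operation.
duplicate_pages() {
  local pages=$1 list="" p
  for ((p = 1; p <= pages; p++)); do
    list+="$p $p "
  done
  echo "${list% }"
}

# Only run when the tools and the input file are available.
if command -v pdftk >/dev/null 2>&1 && command -v pdfjam >/dev/null 2>&1 \
    && [ -f ongoing-circulations.pdf ]; then
  # Step 1: duplicate each of the 10 pages. The command substitution is
  # left unquoted on purpose, so every number becomes its own argument.
  pdftk ongoing-circulations.pdf cat $(duplicate_pages 10) \
    output ongoing-circulations-double-pages.pdf

  # Step 2 (assumed): place two pages side-by-side on a landscape A3 sheet.
  pdfjam ongoing-circulations-double-pages.pdf \
    --nup 2x1 --landscape --paper a3paper \
    --outfile ongoing-circulations-a3.pdf
fi
```

Because every page appears twice in a row, each A3 sheet ends up with two copies of the same A4 page next to each other.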