Compresspdf.sh

From creative crowd wiki

Latest revision as of 09:07, 7 April 2023

Script

Image: a PDF's size before compression
Image: the resampled PDF's size, after compression

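#! /bin/bash
# Usage: ./compresspdf.sh <file.pdf> <resolution in DPI>

pdffile="$1"  # path to the PDF to compress
dpi="$2"      # target image resolution in DPI

# -o names the output file: the input name with ".pdf" replaced
# by "-resampled.pdf". It also makes gs run without prompting.
# The thresholds of 1.0 mean that images above the target
# resolution are downsampled to it.
gs \
  -o "${pdffile%.pdf}-resampled.pdf" \
  -sDEVICE=pdfwrite \
  -dDownsampleColorImages=true \
  -dDownsampleGrayImages=true \
  -dDownsampleMonoImages=true \
  -dColorImageResolution="$dpi" \
  -dGrayImageResolution="$dpi" \
  -dMonoImageResolution="$dpi" \
  -dColorImageDownsampleThreshold=1.0 \
  -dGrayImageDownsampleThreshold=1.0 \
  -dMonoImageDownsampleThreshold=1.0 \
  "${pdffile}"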
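
The output file name comes from the shell parameter expansion <code>${pdffile%.pdf}</code>, which strips a trailing <code>.pdf</code> from the input name. The sketch below isolates just that step so you can try it without running Ghostscript. The helper name <code>resampled_name</code> is made up for this illustration and is not part of the original script.

```shell
#!/bin/bash
# Sketch only: resampled_name is a hypothetical helper, not part of compresspdf.sh.
# It builds the output name the same way the script does.
resampled_name() {
  # ${1%.pdf} removes a trailing ".pdf" from the first argument, if present
  printf '%s\n' "${1%.pdf}-resampled.pdf"
}

resampled_name "input.pdf"           # prints input-resampled.pdf
resampled_name "work/portfolio.pdf"  # prints work/portfolio-resampled.pdf
```

The match is case-sensitive, so a file called <code>scan.PDF</code> keeps its extension and ends up as <code>scan.PDF-resampled.pdf</code>.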

How to work with it?

What does this script do?
This script squashes large PDFs into much smaller ones by downsampling the images inside them to a resolution (in DPI) that you choose.

In which context was it made?
The script was written by Open Source Publishing. It has since travelled through other practices and been used for a variety of applications, usually at the eleventh hour before sending PDFs via email and webforms.

What software does it use?
Ghostscript (the gs command)

How to download these pieces of software?
You can download it from https://www.ghostscript.com/.
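
To check whether Ghostscript is already available in your terminal, something like the following sketch may help. It assumes the program is installed under its usual command name <code>gs</code>. The helper name <code>have_command</code> is made up for this illustration.

```shell
#!/bin/bash
# Sketch only: have_command is a hypothetical helper name.
# It succeeds when the given command can be found by the shell.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

if have_command gs; then
  echo "Ghostscript found, version $(gs --version)"
else
  echo "Ghostscript (gs) not found; install it first"
fi
```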

On which systems can this script run?
The script runs from the command line in a terminal session on Linux and macOS. Windows has a different shell, so it will likely not work there.

How to run the script?
After installing Ghostscript (see above), save the code above as a bash script with the .sh extension, for example: compresspdf.sh. Don't forget to add #! /bin/bash to the top of your script! Then make the script executable with chmod +x compresspdf.sh, otherwise the shell will refuse to run it.

Then, in a terminal session, navigate to the directory where you saved the script. Put the PDF you want to compress in the same directory.

Then do:

./compresspdf.sh <file.pdf> <resolution in DPI>

For example:

./compresspdf.sh input.pdf 300

The compressed file is saved next to the original, with -resampled added to its name:

input-resampled.pdf
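
To see how much was gained, or to check the result against an upload limit, you can compare file sizes in bytes. This is a minimal sketch. The helper names <code>size_in_bytes</code> and <code>fits_limit</code> and the file names in the comments are made up for this illustration.

```shell
#!/bin/bash
# Sketch only: size_in_bytes and fits_limit are hypothetical helper names.

# Print the size of a file in bytes.
size_in_bytes() {
  # wc -c counts bytes; reading from stdin keeps the file name out of the output,
  # and tr removes the padding spaces that some systems add
  wc -c < "$1" | tr -d ' '
}

# Succeed when file $1 is no larger than $2 bytes.
fits_limit() {
  [ "$(size_in_bytes "$1")" -le "$2" ]
}

# Example use, with hypothetical file names:
#   size_in_bytes input.pdf
#   size_in_bytes input-resampled.pdf
#   fits_limit input-resampled.pdf $((10 * 1000 * 1000)) && echo "small enough"
```

If the result is still too large, running the script again on the original file with a lower DPI value is the obvious next thing to try.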

Making a portfolio PDF at the last minute

This script has been used many times on many machines by many users. It's super handy when making files that need to be sent quickly over a network. One particular use was when making a final final final portfolio PDF for a grant application at the eleventh hour.

Usually, the portfolio is the last thing I make after I write the application text. I should use a template to work from, but I almost always make a file from scratch. I should be better organised! I add images willy-nilly, tailoring them to the particular grant application. And using CSS-to-print to make a PDF means that your software doesn't have a handy option to compress files when exporting; you have to add this into the workflow manually.

On the day of the application deadline, January 31st, the resulting PDF was quite large (122.5 MB!). I needed to send it through a webform, which specified that the file could not be bigger than 10 MB.

So, I used this script just a few hours before submitting the PDF. I managed to get the file down to a manageable size without stress, and pressed send in time. In the end I did not get the grant, but I got the script :)