desktop-packages team mailing list archive

[Bug 1476705] Re: postscript printer hideously slow in some cases (pdftops)

 

The problem, I suspect, is the way Cairo writes PDF files (the gmap.pdf
file you attached above was created by Cairo). Cairo *always* writes
the page contents into one or more PDF transparency groups - even when
all the contents are really opaque.

The issue is that, due to the way PDF works (and PDF transparency in
particular) the only way to be sure that all the page contents are
opaque would be to pre-process each page, checking for non-opaque
content, and then re-interpret the page using the information gleaned in
the first pass - which would, frankly, result in an unacceptable
performance drop for the vast majority of PDF files.

Most interpreters I know will pre-scan the quickly accessible elements
of a PDF page, and if no transparency constructs are found, will then
elide the extra processing transparency requires. Unfortunately, those
easily accessible elements don't contain (or, at least, don't reliably
contain) the actual opacity information. So, in most cases I know, just
the existence of the transparency constructs means that extra processing
is enabled, regardless of the actual opacity values.
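To make the "pre-scan" idea concrete, here is a minimal sketch in Python.
It is not how poppler or Ghostscript actually do it; the function name
and the marker list are my own assumptions, and a raw byte scan like
this will miss anything hidden inside compressed object streams. What
it does show is the key point: the check only asks whether transparency
constructs exist, never what the opacity values are.

```python
import re

# Heuristic markers (an assumption for illustration, not an exhaustive list):
# a transparency group dictionary, or any soft-mask entry.
TRANSPARENCY_MARKERS = (
    re.compile(rb"/S\s*/Transparency\b"),
    re.compile(rb"/SMask\b"),
)


def has_transparency_constructs(pdf_bytes: bytes) -> bool:
    """Return True if any transparency-related marker appears in the bytes.

    Only the *presence* of a construct is tested. A page whose contents are
    fully opaque but wrapped in a transparency group still returns True.
    """
    return any(pattern.search(pdf_bytes) for pattern in TRANSPARENCY_MARKERS)


# A page dictionary with no group, and one with the kind of transparency
# group entry described above.
plain_page = b"<< /Type /Page /Contents 4 0 R >>"
grouped_page = (
    b"<< /Type /Page /Contents 4 0 R "
    b"/Group << /Type /Group /S /Transparency /CS /DeviceRGB >> >>"
)

print(has_transparency_constructs(plain_page))
print(has_transparency_constructs(grouped_page))
```

A file that trips this kind of check gets the full transparency
treatment, whether or not anything on the page is actually
non-opaque.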

Now, secondly, Postscript cannot represent PDF transparency in high
level (vector etc) operations. So, the only way to get a visually
accurate representation of a PDF containing transparency in Postscript
is to "flatten" the transparency by rendering it to a sampled image -
and clearly, sampled images end up being larger than vector graphics.

Hence we have the result that basically every Cairo produced PDF will
convert to Postscript as one or more sampled images per page.

And that explains why the Postscript is so much larger than the PDF.

Now, looking at the Postscript file you posted, it *appears* that the
rendering for transparency flattening is being done at 1200dpi which is,
frankly, ridiculous for a couple of reasons. First, your printer has
a maximum physical resolution of 600dpi (the ImageRET modes provide
enhanced quality, claimed to be equivalent to 2400 and, IIRC, 3600 dpi,
but the printer is still a 600 dpi printer). Secondly, our experience
with Ghostscript's Postscript output is that many printers are much,
*much* faster at upsampling images than downsampling.
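To put rough numbers on the resolution, here is a back-of-envelope
calculation in Python. The assumptions are mine, not taken from your
file: a US Letter page (8.5 x 11 inches), rendered as one full-page,
uncompressed, 8-bit RGB image. Real output will differ (compression,
partial-page images, page size), but the scaling with resolution holds.

```python
def raster_bytes(width_in: float, height_in: float, dpi: int,
                 channels: int = 3, bits_per_channel: int = 8) -> int:
    """Uncompressed size in bytes of a page rendered as one sampled image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels * bits_per_channel // 8


# Assumed page size: US Letter, 8.5 x 11 inches.
for dpi in (1200, 600, 300):
    size = raster_bytes(8.5, 11, dpi)
    print(f"{dpi:>5} dpi: {size:>12,} bytes")

# 1200 dpi: 403,920,000 bytes
#  600 dpi: 100,980,000 bytes
#  300 dpi:  25,245,000 bytes
```

Halving the resolution quarters the amount of image data, so going from
1200dpi to 600dpi cuts the data by a factor of 4, and going to 300dpi by
a factor of 16.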

So, my first suggestion would be to poke around the CUPS dialogues
and/or the PPD, and see if you can drop the claimed resolution of the
printer to at most 600dpi and, frankly, I'd even try 300 dpi. As a rule
of thumb, in the printing world, it's generally claimed that dropping
continuous tone, sampled image resolution by 50% from the physical
resolution results in almost no visible loss of quality. Where that
falls down, in cases like this, is that there is text involved, and
the small details inherent in text shapes may well suffer visibly.
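If you want to test the effect of resolution outside CUPS first, one
option is to run the conversion by hand. The sketch below only builds
the command line; it assumes your poppler build of pdftops accepts the
-r option (resolution used when rasterising transparency), so please
check the pdftops man page on your system before relying on it. The
function name and the output file names are just placeholders.

```python
def pdftops_command(pdf_path: str, ps_path: str, dpi: int = 300) -> list[str]:
    """Build a pdftops command line with an explicit rasterisation resolution.

    Assumes pdftops supports "-r <dpi>"; verify against your local man page.
    """
    if dpi <= 0:
        raise ValueError("dpi must be a positive integer")
    return ["pdftops", "-r", str(dpi), pdf_path, ps_path]


# Compare the same file at the printer's physical resolution and at half that.
for dpi in (600, 300):
    cmd = pdftops_command("gmap.pdf", f"gmap-{dpi}.ps", dpi)
    print(" ".join(cmd))
```

Passing each list to subprocess.run(cmd, check=True) would perform the
conversion; comparing the resulting file sizes and print times should
show whether resolution is the dominant factor on your printer.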

Another thing we've found with the Postscript output from Ghostscript is
that many printers are very, very slow at decompressing data, so if you
can find an option to avoid compressing image data, that *might* make a
difference - but that is highly printer dependent.

I'd like to take this opportunity to rant (again): this kind of thing is
the reason that PDF is such a poor, poor choice as a print spool
format.
PDF has a *hugely* rich imaging model, *far* more so than Postscript,
PCL5/PXL or any of the proprietary page description languages (PDL),
which means for almost any low/midrange PDL based printer, there is a
very high chance that the PDF content cannot be converted to a high
level, vector representation, and must be rendered and sent as sampled
images (bitmaps). Perhaps a PDF/A or PDF/X variant would be a better
choice......

And I'm sure someone will point out that more and more printers are
supporting direct PDF printing, but that really just moves the
bottleneck: PDF transparency is (over!) complex, and is extremely
processor and memory intensive. So a low/mid range printer, with limited
memory and processing power, is going to struggle to print files like
these. In truth, many such PDF printers "get around" this by either
not supporting transparency at all, or supporting only a relatively
trivial subset of the full PDF transparency model.

Also, with more and more applications integrating Cairo to do their PDF
output, with the transparency problems outlined above, the situation is
only set to get worse. </rant>


Anyway, as I said, look into adjusting the resolution, and possibly
compression settings, and let us know how you fare.

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to poppler in Ubuntu.
https://bugs.launchpad.net/bugs/1476705

Title:
  postscript printer hideously slow in some cases (pdftops)

Status in poppler package in Ubuntu:
  New
Status in system-config-printer package in Ubuntu:
  New

Bug description:
  With my (old) postscript printer, printing a single page can take many
  minutes in some situations. It happens with some PDF files (not all)
  and Firefox printing of Google map, for example. When this happens, I
  observed in system monitor that pdftops is running continuously. After
  some manual PDF -> PS conversions, I see that pdftops inflates the
  file size for problematic cases, but is ok for other files (size
  similar to the original file, or even smaller). I don't know if modern
  Postscript printers can handle this quickly, but it's unacceptable
  here and certainly not an efficient way to print those files.

  So I suspect that pdftops should be fixed.

  For example, I attach a problematic pdf produced by Google Map in
  Firefox. I tried many conversions. As you can see, I get a much larger
  file (36 times) with pdftops. It is worse with pdf2ps (and it takes
  longer to process), so replacing pdftops with pdf2ps is not an option
  for me. However, pdftocairo quickly produces an efficient file. I have
  the same success if I open the PDF file with Evince and print it as a
  Postscript file. I get a similar file if I print to PS directly from
  Google Map (Firefox). Of course these small PS files produced by
  pdftocairo, Evince or Firefox print flawlessly on my printer.

  See also Bug # 1095498 which I suspect is the same (old) thing, but I am filing a new one since it doesn't seem to be printer specific.
  Of course, another workaround could be to use a PCL driver but none is available for my printer (HP Color Laserjet 2605dn).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/1476705/+subscriptions

