[driverless] HP Color LaserJet MFP M281 prints garbage instead of 'žčš' characters #331

Closed
zdohnal opened this issue Dec 14, 2020 · 30 comments

@zdohnal
Member

zdohnal commented Dec 14, 2020

Hi,

I have a report in Fedora for the HP Color LaserJet MFP M281, which has a PPD generated via the driverless driver.

The user tried to print a file with diacritics (its file from /var/spool/cups is here), but the 'čžš' characters came out of the printer as garbage. We checked the file which comes out of the filter chain, and the diacritics are correct there too.

I don't see any error in the job log or in the IPP response.

We're going to try to print the file unfiltered since the printer accepts PDFs, but I'm not sure how to debug it further.

I guess the error will be connected to fonts, but I'm not sure how to prove it. Would you mind looking into it?

Thank you in advance!

Zdenek

@tillkamppeter
Member

You do not see any error in the log file as the filter got executed correctly and did its task correctly.
Most probably we have a bug in the printer here: it is not able to print PDF files which display correctly on the screen. For further investigation, please try to print the following files unfiltered:

  • issue-with-čžš-characters.pdf (original input file)
  • ps (file which passed the filter chain)

For raw printing, use the command

lp -d PRINTER -o raw FILE

Please tell which of the files print correctly and which of them show the error.

@zdohnal
Member Author

zdohnal commented Dec 16, 2020

The original input file sent raw to the printer prints the characters žčš fine, but another issue appeared - some lines are bold (see).
The 'd' file sent raw shows the same errors as the ps file sent raw and the initial printout, so IMO evince (or poppler/fontforge) makes changes to the PDF which the device doesn't like.

The classic driver (lsb/usr/HP/hp-color_laserjet_mfp_m278-m281-ps.ppd.gz) doesn't have any of those two issues.

So the next action is to report it to evince, do you agree?

@surajkulriya
Contributor

surajkulriya commented Dec 18, 2020

Hey @zdohnal,
I saw the report and saw that the problem was solved by changing the driver manually. Can you please show me the PPD file you later used to solve the issue? It can help me debug the issue for driverless printing.
Thanks :-)

@zdohnal
Member Author

zdohnal commented Dec 18, 2020

Hi @surajkulriya ,
I wouldn't say solved, since the classic printer driver was used and the driverless solution was abandoned (we all want driverless solutions to work at least at the same quality as classic drivers, or better), but here is the classic driver.
The classic driver uses the PostScript document format instead of PDF (which driverless used here), so the bug is probably that the device is not able to handle some font types within PDFs.

@surajkulriya
Contributor

Hey @zdohnal,
since the output coming out of the filter chain is correct, as @tillkamppeter sir stated too, and the printer is also fine, as you said it prints well with a classic driver, the issue we are facing must lie in the CUPS backend, since this is what sends the print data to the printer. So can you please tell me which type of printer you are using: is it a network printer, a USB printer, a socket printer, etc.?
Thanks :-)

@tillkamppeter
Member

Note that evince does not pass through the input PDF file for printing it, but it renders the file as it does for screen display, only that it is redirected into GTK's PDF file generator and so a new PDF file gets generated. This is the significant impact. The filter chain, only consisting of the pdftopdf filter, does not change much more.
The PDF coming from evince seems to cause the problem in the printer, while the original PDF is OK.

When using the classic driver, the printer does not get a PDF file but raster graphics to print. The letters with accents are no longer generated by the printer but by Ghostscript on the host, meaning that Ghostscript is able to render the file correctly. Evince is also able to display the file correctly. Ghostscript and evince even display the print file which evince had generated correctly. So to me it looks like the PDF interpreter in the printer has a bug, and therefore some PDFs do not get printed correctly by this printer.

As PostScript and PDF interpreters are rather complex, they can easily have bugs. I got confronted with a lot of them when switching the standard printing format from PostScript to PDF: suddenly Ghostscript's PDF interpreter got massively exercised, Poppler got exercised in a different way, and QPDF went from a not-that-important hobby project to an integral part of the operating system, so bug reports came in there, too.

As printer manufacturers implementing driverless printing want to serve as many users as possible, including the ones who print from an iPhone, they also include AirPrint and with it Apple Raster support. So IPP printers doing PDF usually also do Apple Raster, another of the 4 page description languages (PDLs) used by driverless printing. Apple Raster is a simple, well-defined raster graphics language, not very bug-prone when implemented in printer firmware. It also needs less of the limited memory and CPU resources of the printer.

I have put the PDLs into a priority order, giving highest priority to PDF, as this is the most sophisticated PDL, expected to deliver the best possible print quality. Switching the prioritization to Apple Raster could improve printing reliability, and it also moves the complex step of PDF rendering to the host, where bugs can get fixed more easily (and we at OpenPrinting can fix them) and where we have a lot more computational (memory and CPU) resources.

Note that when using the original HPLIP driver, most probably raster data (in PCL format), and in some cases also PostScript, is sent to the printer. AFAIK HPLIP does not send PDF. Note that sending PostScript to a PostScript printer also easily causes similar bugs, and it has happened often already; cups-filters contains a lot of workarounds for such bugs (mainly in the pdftops filter). So sending raster graphics is usually the best bet.

So I will try to flick the switch of priorities in cups-filters, so that if a printer supports both Apple Raster and PDF, the former is preferred.

@tillkamppeter
Member

Priorities changed in master with commits fb9ecb4 and 1b39096 and in 1.x with commit a6807df.

@tillkamppeter
Member

Please test and tell us whether this solves the problem. Thanks.

@OdyX

OdyX commented Jan 8, 2021

This removes the driverless_support_strs@Base symbol as exported in libcupsfilters1 since 1.27.5. Removing symbols shouldn't happen over the course of a SONAME's history. In that case, I think it was a mistake to export it, and it was never used anywhere else, so I just went with it.

@tillkamppeter
Member

@OdyX this answer was probably meant for Issue #235 and/or OpenPrinting/cups#65?
The symbol you mean was not meant to be public API. The driverless_support_strs only makes sense for driverless, and the functionality needing this string is now gone from driverless.

@OdyX

OdyX commented Jan 8, 2021

Oh. Snap. Yes. That was a comment on aae86d2 , removing the symbol. What I'm saying is that it should never have been added, so (as it's unused) it's (as an exception) fine to remove now.

@zdohnal
Member Author

zdohnal commented Jan 19, 2021

Hi @tillkamppeter ,

unfortunately preferring image/urf didn't help, the result is the same for the user. IIUC if urf is preferred, rastertopdf by default should be run, right?

I started to experience a similar issue with diacritics with my own printer and a different PDF, I'll file a separate issue for that.

zdohnal reopened this Jan 19, 2021

@tillkamppeter
Member

@zdohnal, if we prefer URF on a driverless workflow, the incoming PDF is turned into PWG Raster via pdftoraster or gstoraster, the former using Poppler and the latter using Ghostscript. If the accents do not come out correctly with such a workflow, they either arrive already broken (evince bug) or at least one of Ghostscript or Poppler is not able to render them.

So we need to isolate. First the incoming PDF needs to get checked. Disable the print queue, send a job, and grab the d.... file in the CUPS spool directory. Are the accents OK? Only if yes, enable the printer and see from the CUPS error_log which filters are called. Try to call them separately on the command line to see which filter breaks the accents. If Ghostscript turns out to be the culprit, grab the Ghostscript command line from error_log/the filters' stderr output and the input file, and file a bug on Ghostscript upstream.
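
The isolation steps above could be scripted roughly as follows. This is only a sketch: the helper names `capture_spool_file` and `list_job_filters` are made up for illustration, `/var/spool/cups` is the usual default spool directory and may differ on your system, and the `Started filter` pattern assumes the message format cupsd writes to error_log when a filter is launched.

```shell
# Sketch only: helper names are hypothetical; paths are common defaults.

# Hold jobs in the queue so the spool file stays around for inspection.
# Needs sufficient privileges (the spool directory is normally root-only).
# Re-enable the queue afterwards with: cupsenable QUEUE
capture_spool_file() {
    queue=$1
    file=$2
    cupsdisable "$queue" &&
    lp -d "$queue" "$file" &&
    # Job data files are named d<job-id>-<file-number>; print the newest one.
    ls -t /var/spool/cups/d* | head -n 1
}

# List the filters cupsd started for one job, in the order they appear
# in the given error_log.
list_job_filters() {
    log=$1
    job=$2
    grep -F "[Job $job] Started filter" "$log" |
        sed -e 's/.*Started filter //' -e 's/ (PID [0-9]*)$//' |
        while read -r path; do basename "$path"; done
}
```

For example, `list_job_filters /var/log/cups/error_log 12` would show whether pdftoraster or gstoraster took part in job 12, assuming the log level is high enough for these messages to be recorded.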

A workaround is to force the use of Poppler or Ghostscript when the other one goes wrong, by removing the execute bit from the failing filter and restarting CUPS.
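
A minimal sketch of that workaround, assuming the filters live in /usr/lib/cups/filter (the directory varies by distribution) and using a made-up helper name, `disable_filter`:

```shell
# Sketch only: disable_filter is a hypothetical helper; the default
# directory below is an assumption and differs between distributions.
disable_filter() {
    name=$1
    dir=${2:-/usr/lib/cups/filter}
    if [ ! -f "$dir/$name" ]; then
        echo "no such filter: $dir/$name" >&2
        return 1
    fi
    # As described above, a filter without the execute bit is not used,
    # so the other rasterizer gets picked after CUPS is restarted.
    chmod a-x "$dir/$name"
}
```

Typical use would be `disable_filter gstoraster` followed by `systemctl restart cups`; running `chmod a+x` on the same file undoes it.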

@tillkamppeter
Member

@zdohnal, did your user re-create his print queue after updating cups-filters? Did his PPD get generated by cups-filters or by CUPS (I changed only cups-filters)? What I need from the user is his current PPD, his error_log (in debug mode), and his current d... spool file.

@zdohnal
Member Author

zdohnal commented Jan 20, 2021

First the data we have right now:

  • he reinstalled his print queue after updating to new cups-filters
  • here is his new PPD - the ppd shows driverless, cups-filters 1.28.7 and has a line *cupsFilter2: "image/urf image/urf 0 -" which confirms it is reinstalled via cups-filters's driverless driver and the change is applied.
  • the error log for the current job shows that neither pdftoraster nor gstoraster is run.

I will ask for the current d spool file and do the debugging filter by filter.

> So we need to isolate. First the incoming PDF needs to get checked. Disable the print queue, send a job, and grab the d.... file in the CUPS spool directory.

Is disabling the print queue needed or can it be substituted by 'PreserveJobFiles Yes'? I'm just making sure I don't mess up.

> Are the accents OK?

Do you have a way to check the PDF without opening it in a viewer (in this case and some others in the past the viewer shows the correct accents, but the printed page looks different) or printing it out?
I usually try to check the PDF in fontforge, but I lack the knowledge to make sense of what fontforge shows.

@tillkamppeter
Member

@zdohnal, could you tell the user to do the following:
Stop the CUPS daemon (service cups stop or systemctl stop cups), then edit your queue's PPD file in /etc/cups/ppd changing the line

*cupsFilter2: "application/vnd.cups-pdf application/pdf 100 -"

to

*cupsFilter2: "application/vnd.cups-pdf application/pdf 200 -"

After saving the file, start CUPS again (service cups start or systemctl start cups), then try to print again. Does this solve the problem? If not, please provide a new error_log of the current job.
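
For anyone who prefers not to edit the PPD by hand, the change could be sketched like this. The helper name `set_pdf_filter_cost` is hypothetical, and the pattern assumes the line looks exactly like the one quoted above apart from the cost value.

```shell
# Sketch only: set_pdf_filter_cost is a hypothetical helper.
# Usage idea (QUEUE is a placeholder for the queue name):
#   systemctl stop cups
#   set_pdf_filter_cost /etc/cups/ppd/QUEUE.ppd 200
#   systemctl start cups
set_pdf_filter_cost() {
    ppd=$1
    cost=$2
    # Keep a backup, then rewrite only the cost field of the
    # application/vnd.cups-pdf -> application/pdf line.
    cp -p "$ppd" "$ppd.bak" &&
    sed -e "s|^\(\*cupsFilter2: \"application/vnd\.cups-pdf application/pdf \)[0-9][0-9]*\( -\"\)|\1$cost\2|" \
        "$ppd.bak" > "$ppd"
}
```

All other lines, including the other *cupsFilter2 entries, are written back unchanged.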

@zdohnal
Member Author

zdohnal commented Jan 20, 2021

Aha, the conversion cost was too high - I got confused by the zero in the updated line in the PPD, but forgot to count all the other conversion costs before it... With this change, gstoraster+rastertopwg are run on top of pdftopdf.
I asked the user to try this too.
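
To make the cost comparison easier to see, here is a small sketch that lists the final-step costs a PPD declares. CUPS chooses the filter chain with the lowest total cost, and the number on a *cupsFilter2 line is only the last step of that chain, so the costs of the filters leading up to it (defined elsewhere, for example in the mime.convs files) have to be added on top. The helper name `ppd_filter_costs` is made up for illustration.

```shell
# Sketch only: ppd_filter_costs is a hypothetical helper.
# Prints "<cost> <source type> -> <destination type>" for every
# *cupsFilter2 line in the PPD, cheapest first. These are the costs of
# the last step only, not of the whole filter chain.
ppd_filter_costs() {
    sed -n 's/^\*cupsFilter2: "\([^ ]*\) \([^ ]*\) \([0-9][0-9]*\) .*"$/\3 \1 -> \2/p' "$1" |
        sort -n
}
```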

@zdohnal
Member Author

zdohnal commented Jan 21, 2021

Just a side note - changing the conversion cost solved my own problem with a different PDF. Let's see if it helps the user too.

Before that, I was able to work around my own accent printing problem by regenerating the original PDF via Ghostscript this way:

$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=<output>.pdf <input>.pdf

and then printing it via lp - evince or libreoffice caused problems even with the regenerated PDF.
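
Put together, the two steps might look like the sketch below. The wrapper name `regen_and_print` and the `-regen.pdf` output suffix are invented for illustration; the gs options are exactly the ones from the command above.

```shell
# Sketch only: regen_and_print is a hypothetical wrapper.
regen_and_print() {
    queue=$1
    input=$2
    output=${input%.pdf}-regen.pdf
    # Re-render the PDF with Ghostscript's pdfwrite device ...
    gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile="$output" "$input" &&
    # ... and submit the result directly with lp, not through a viewer.
    lp -d "$queue" "$output"
}
```

Called as `regen_and_print QUEUE file.pdf`, it leaves `file-regen.pdf` next to the input so the regenerated file can still be inspected afterwards.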

Could this information help us in similar issues? For example, this command could be called as a filter after pdftopdf: Ghostscript would substitute any fonts it cannot find (some cairo fonts in PDFs generated by evince seem especially suspicious), and the result would then go on to the other filters. This way we could keep sending the smaller PDF format to the printer rather than the big raster format.

@tillkamppeter what do you think? Can it be helpful or was I just lucky it worked?

@zdohnal
Member Author

zdohnal commented Jan 21, 2021

The user was able to print accents with corrected conversion costs, but the first printing failed halfway with:

Error writing document data for Send-Document

and about half of the page was printed. It really looks like a printer bug. I asked for a network traffic capture.

The second printing attempt was good - the page was printed whole and with accents.

@tillkamppeter
Member

@zdohnal, I have raised the cost values now by another 100, in master (41382de and bcd1aed) and 1.x (a5dd51e). Thank you for testing, please close if you think it is solved now.

@zdohnal
Member Author

zdohnal commented Jan 22, 2021

Thanks, Till!

I'll verify the commits by testing and then I'll close the bug.

@zdohnal
Member Author

zdohnal commented Jan 22, 2021

How it was tested:

  1. PPD created by driverless cat <uri>
  2. created print queue via lpadmin for IPP device uri.
  3. printed via evince

The accents are ok and the job log shows pdftopdf, gstoraster, rastertopwg filters. Verified.
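
The first two test steps could be reproduced with something like the sketch below. The helper name `make_driverless_queue` and the default PPD file name are assumptions made for illustration; `driverless cat <uri>` is the command from step 1, and the lpadmin options are the standard ones for naming a queue, setting its device URI, assigning a PPD, and enabling it.

```shell
# Sketch only: make_driverless_queue is a hypothetical helper.
make_driverless_queue() {
    queue=$1
    uri=$2
    ppd=${3:-./$queue.ppd}
    # 1. Generate the PPD with cups-filters' driverless utility.
    driverless cat "$uri" > "$ppd" &&
    # 2. Create and enable the print queue for the IPP device URI.
    lpadmin -p "$queue" -v "$uri" -P "$ppd" -E
    # 3. Print from evince, then check the job log for the filters used.
}
```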

zdohnal closed this as completed Jan 22, 2021

@zdohnal
Member Author

zdohnal commented Jan 25, 2021

The user now often experiences the mentioned error when printing image/urf - it looks like the M281 has a bug regarding URF processing. I removed the fix from Fedora for now, maybe we should revert the commit in cups-filters too?

@tillkamppeter
Member

It seems that the user's printer is buggy on all driverless PDLs.
Now the real question is which of the PDLs PDF, Apple Raster, PWG Raster, and PCLm is the most reliable. Your user's printer seems to have problems with both PDF and Apple Raster, and I have already seen a user complaint about PWG Raster, also with an HP printer.
Now we need to decide which PDL prioritization we want.

@zdohnal
Member Author

zdohnal commented Jan 25, 2021

@tillkamppeter what about staying with PDF and adding a Ghostscript call that regenerates the PDF, either inside the pdftopdf filter or as a standalone filter?

@zdohnal
Member Author

zdohnal commented Jan 25, 2021

Like this:

$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=<output>.pdf <input>.pdf

@tillkamppeter
Member

We should not modify pdftopdf, as it is a QPDF-based PDF manipulation tool, not something for rendering the PDF and producing a new, cleaned PDF. Better would be to have a pdftopclm filter called after pdftopdf. PCLm is nothing more than raster-embedded PDF, a PDF which does not depend on fonts in the printer, as the page content is raster graphics and not vector graphics and text. This should reduce the compatibility problems of PDF printers a lot. If the printer is able to do basic PDF printing, it prints documents in any language this way.

The text rendering then happens on the computer, where one has needed fonts more easily or at least one can more easily add missing fonts.

Creating a pdftopclm filter could be done in a GSoC project.

@zdohnal
Member Author

zdohnal commented Jan 26, 2021

pdftopdf already uses pdftocairo or gs for document form flattening, if it is requested via job options, and the result is then processed by QPDF. Both of them do the trick for fonts. I tested them manually - I used the same options as pdftopdf uses to run pdftocairo or gs, and both results were printed with correct accents.

Then the code could be changed to:

  • use pdftocairo if requested in options, then qpdf
  • use gs otherwise and then qpdf

Ad pdftopclm - does it just add a requirement for printers to understand PCLm? I'm not sure if it is desired - I thought PCLm is specific to HP devices.

@tillkamppeter
Member

First, PCLm is a subset of PDF, so every PDF display tool and every PDF printer should be able to display or print it. What is special about PCLm is that it only contains embedded raster graphics, no vector graphics and no text, so it does not require fonts. It was created to allow driverless printing on cheaper printers, which do not have enough computational resources to render regular, full-fledged PDF. No explicit PCLm support is required on the printer side.

So adding a pdftopclm filter after pdftopdf would move the requirement of fonts from the PDF printer to the host.

As Ghostscript and pdftocairo render the PDF and create a new PDF, there is a good probability that they work around such font problems; probably they embed fonts in the new PDF. They still pass on vector graphics with text, so they still have a higher incompatibility potential than raster-graphics-only PCLm.

Also, I had introduced the functionality to run pdftopdf's PDF input through Ghostscript or pdftocairo because we did not have form flattening via QPDF. After the new QPDF-based form flattening arrived, I left Ghostscript and pdftocairo in the pdftopdf filter to use them on request, in case the new QPDF solution fails for some reason. For more than a year I did not get any bug reports about form-flattening problems, so I assumed that the new QPDF method works.

Recently I converted all CUPS filters into filter functions in libcupsfilters, to make them more easily accessible for Printer Applications. Due to the fact that there were no complaints about QPDF-based form flattening I removed the Ghostscript/pdftocairo form-flattening support.

So my idea now was to replace the standard PDF output for PDF printers by raster-only PDF and to get best working raster graphics for the printer, it should be in the printer's resolution, and when the printer has more than one resolution, in the resolution selected for the job. For this the rasterization needs to be done after pdftopdf (as we already do with non-PDF printer drivers) as in PDF pages can get scaled or re-arranged (for example when N-up is used).

@zdohnal
Member Author

zdohnal commented Jan 27, 2021

> First, PCLm is a subset of PDF, so every PDF display tool and every PDF printer should be able to display or print it. What is special about PCLm is that it only contains embedded raster graphics, no vector graphics and no text, so it does not require fonts. It was created to allow driverless printing on cheaper printers, which do not have enough computational resources to render regular, full-fledged PDF. No explicit PCLm support is required on the printer side.

Thank you for the info! Then it makes sense to try PCLm.

> So adding a pdftopclm filter after pdftopdf would move the requirement of fonts from the PDF printer to the host.

I agree.

> As Ghostscript and pdftocairo render the PDF and create a new PDF, there is a good probability that they work around such font problems; probably they embed fonts in the new PDF. They still pass on vector graphics with text, so they still have a higher incompatibility potential than raster-graphics-only PCLm.

I agree it is not ideal, but IMO it is a workaround which is usable until pdftopclm is implemented. The current situation requires manual intervention for people who fail to print URF :( .

> Recently I converted all CUPS filters into filter functions in libcupsfilters, to make them more easily accessible for Printer Applications. Due to the fact that there were no complaints about QPDF-based form flattening I removed the Ghostscript/pdftocairo form-flattening support.

Aha, I checked only version 1.28.7, where gs and pdftocairo form flattening is still available.

> So my idea now was to replace the standard PDF output for PDF printers by raster-only PDF and to get best working raster graphics for the printer, it should be in the printer's resolution, and when the printer has more than one resolution, in the resolution selected for the job. For this the rasterization needs to be done after pdftopdf (as we already do with non-PDF printer drivers) as in PDF pages can get scaled or re-arranged (for example when N-up is used).

Sounds good for the future. However, it would be good to have a workaround for now, since some devices don't understand URF correctly :( .

zdohnal added a commit to zdohnal/cups that referenced this issue Dec 17, 2021
Some printers print out garbled text if the job comes from evince and
the original PDF contains a specific font.

It is not certain whether cairo (PDF generator evince uses) has a bug in
PDF rendering or cairo renders a correct PDF, but in PDF version which
the printer doesn't support - but the switch to raster formats -
image/urf or image/pwg-raster works around the issue.

Related issues in Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=1904405
https://bugzilla.redhat.com/show_bug.cgi?id=1991678

The same workaround has been applied in cups-filters' PPD generator
(OpenPrinting/cups-filters#331), proposing the
possible final solution - using PCLm format.