Missing encoding data for: "MacExpertEncoding" #85

Open
trapero opened this issue Oct 23, 2015 · 12 comments

trapero commented Oct 23, 2015

Hello,

I'm trying to parse the metadata from different files. The library works great, but some documents give me this error:

Missing encoding data for: "MacExpertEncoding"

You can access one of the documents at:

http://www.statistik.rlp.de/fileadmin/dokumente/berichte/C1013_201500_1j_L.pdf

I get this error in the demo too.

Do you know a solution?

Thank you


garraeth commented Aug 9, 2018

I'm getting this same message.

A couple resources I found, but I'm no PDF dev so it's foreign to me:

https://github.com/rohitpaulk/pdfparser/blob/master/gems/pdf-reader-master/lib/pdf/reader/encodings/mac_expert.txt

https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
page: 665

This would be a great addition.

Thanks! Love the project!


garraeth commented Aug 9, 2018

Just on a whim, I created a MacExpertEncoding class (using the MacRomanEncoding.php file as a template) and copied the $encoding data from MacRomanEncoding into my new MacExpertEncoding class...and it worked!

Granted, I might have just gotten lucky.
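For readers who want to picture the workaround described above, here is a minimal sketch. It is not the library's source: the class names, the `getTranslations()` method, and the four-entry placeholder table are all hypothetical stand-ins for a class that holds its glyph names in an `$encoding` string. Note that MacExpertEncoding and MacRomanEncoding are different tables in the PDF specification, so reusing the MacRoman data may silence the error while still decoding some expert-set glyphs to the wrong characters.

```php
<?php
// Hypothetical stand-in for the MacRomanEncoding class used as a template.
class MacRomanEncodingSketch
{
    /** @return string[] glyph names indexed by character code */
    public function getTranslations(): array
    {
        // Placeholder data only: a real table has one entry per character code.
        $encoding = '.notdef space exclam quotedbl';

        return explode(' ', $encoding);
    }
}

// The workaround from the comment above: a new class name that reuses the
// MacRoman data unchanged (an assumption about how the copy was done).
class MacExpertEncodingSketch extends MacRomanEncodingSketch
{
}

$table = (new MacExpertEncodingSketch())->getTranslations();
```

A proper fix would presumably fill the table from the MacExpertEncoding column in the PDF specification rather than from the MacRoman data.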

Owner

smalot commented Aug 9, 2018

Could you create a merge request to contribute this?


adrianbj commented Jan 27, 2021

Sorry if I should open a new issue, but I am actually getting this error:

Missing encoding data for: ""

I have worked around it for now by hardcoding StandardEncoding like this:

$className = '\\Smalot\\PdfParser\\Encoding\\StandardEncoding';

The problem is that $this->get('BaseEncoding') returns false.

Any thoughts on how to fix this properly?

Thanks.
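The hardcoded workaround above could be narrowed so that it only applies when the base encoding is actually missing. The following is a rough sketch under stated assumptions: `resolveEncodingClassSketch()` is a hypothetical helper, not part of the library, and falling back to StandardEncoding is only an approximation (per the PDF specification, an absent BaseEncoding implies the font's built-in encoding for embedded or symbolic fonts).

```php
<?php
// Hypothetical helper, not part of smalot/pdfparser.
function resolveEncodingClassSketch($baseEncoding): string
{
    $namespace = '\\Smalot\\PdfParser\\Encoding\\';

    // A missing BaseEncoding entry arrives here as false (or an empty string);
    // casting false to string yields ''.
    $name = trim((string) $baseEncoding);

    if ($name === '') {
        // Assumption: use StandardEncoding as the default, as in the workaround.
        $name = 'StandardEncoding';
    }

    return $namespace . $name;
}

$missing = resolveEncodingClassSketch(false);
$present = resolveEncodingClassSketch('MacRomanEncoding');
```

With this shape, documents that do declare a base encoding keep it, and only the empty case takes the default.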

Collaborator

k00ni commented Jan 27, 2021

Can you provide us with the PDF that is causing this problem? It must be free of charge and without any obligations.

@adrianbj

Hi @k00ni - thanks for responding so quickly. The issue can be seen with this file.
healthy-chesapeake-waterways.pdf

Collaborator

k00ni commented Jan 27, 2021

Is healthy-chesapeake-waterways.pdf free of charge and without any obligations?

@adrianbj

Yes - it's from here: https://ian.umces.edu/press/newsletters/publication/1/healthy_chesapeake_waterways_2002-05-01/

Filename is different, but that's just because I am rebuilding this site and the new version is automatically renaming PDFs to match the publication title.

@igor-krein
Contributor

Currently, Encoding.php implements the magic method __toString(), which calls getEncodingClass(), which in turn may throw an exception (line 150).

From php.net:

Warning: It was not possible to throw an exception from within a __toString() method prior to PHP 7.4.0. Doing so will result in a fatal error.

I think that, if support for PHP versions before 7.4 needs to be kept, it would be better to change line 150 somehow (for example, to simply return an empty string). Note that the __toString() method must return a string.
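To illustrate the suggestion, here is a minimal sketch of a version-aware __toString(). It is not the library's code: `EncodingSketch`, its constructor, and the class-existence check are hypothetical, and only the overall shape (empty string before PHP 7.4, exception on 7.4+) follows what is discussed in this thread.

```php
<?php
// Hypothetical stand-in for the Encoding class discussed above.
class EncodingSketch
{
    private $className;

    public function __construct(string $className)
    {
        $this->className = $className;
    }

    public function getEncodingClass(): string
    {
        if (!class_exists($this->className)) {
            throw new Exception('Missing encoding data for: "' . $this->className . '"');
        }

        return $this->className;
    }

    public function __toString()
    {
        try {
            return $this->getEncodingClass();
        } catch (Exception $e) {
            // Before PHP 7.4, an exception escaping __toString() is a fatal
            // error, so return an empty string there and rethrow on 7.4+.
            if (PHP_VERSION_ID < 70400) {
                return '';
            }

            throw $e;
        }
    }
}

// Usage: an existing class name converts cleanly.
$known = (string) new EncodingSketch('ArrayObject');

// An unknown class yields '' before PHP 7.4 and a catchable exception on 7.4+.
$unknown = '';
try {
    $unknown = (string) new EncodingSketch('NoSuchEncodingSketch');
} catch (Exception $e) {
    $unknown = '';
}
```

Callers on PHP 7.4+ still need to catch the exception themselves; the sketch only prevents the fatal error on older versions.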


suuuunto commented Apr 6, 2021

I am getting the same error. Is there a quick and dirty fix for it?

k00ni added a commit that referenced this issue Apr 6, 2021
Prior to PHP 7.4, we expect an empty string to be returned (based on the PHP
specification) when the class is invalid.
On PHP 7.4+, we expect an exception to be thrown when the class is invalid.

Hint came from @igor-krein in
#85 (comment)
Collaborator

k00ni commented Apr 6, 2021

@igor-krein and @suuuunto does #407 fix it?

Contributor

igor-krein commented Apr 6, 2021

> @igor-krein and @suuuunto does #407 fix it?

The code looks like it should fix the problem, thanks. I'll try to check it in action asap.

UPD: It is highly likely that the problem is fixed. I don't actually have a sample PDF; I only saw the problem in the logs (evidently some clients' PDF files triggered the fatal error, but no client has complained yet, so there are no samples). These log entries started appearing after the latest system upgrade (it looks like Composer pulled in the latest PdfParser version, too), and now they no longer seem to appear.

Thanks for the changes! Can't wait for them to be merged.

k00ni added a commit that referenced this issue Apr 13, 2021
* Encoding::__toString complies with PHP spec from now on

Prior to PHP 7.4, we expect an empty string to be returned (based on the PHP
specification) when the class is invalid.
On PHP 7.4+, we expect an exception to be thrown when the class is invalid.

Hint came from @igor-krein in
#85 (comment)

* fixed cs issue

* refined Encoding::__toString description

* Update src/Smalot/PdfParser/Encoding.php

Co-authored-by: Igor Peisakhovich <igor.krein@gmail.com>