Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing issue #93: CrossRefStream incorrectly assumes /Index value is a 2 element array #97

Closed
wants to merge 2 commits into from
Closed

Conversation

paulmer
Copy link
Contributor

@paulmer paulmer commented Jul 19, 2016

array containing an arbitrary set of starting object number and object
count pairs instead of a single pair.

Calculate number of objects using final start/count pair. (Actual number
will be less than or equal to this value.)

Modified reading loop to read only objects in the index ranges.

Modified PdfModule to test object number against stream's index instead of
a range from 0 to the last object number.

Attempt to conform to PDF specification
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference15_v6.pdf, page 83 and PDF versions 1.6 and 1.7.

array containing an arbitrary set of starting object number and object
count pairs instead of a single pair.

Calculate number of objects using final start/count pair.  (Actual number
will be less than or equal to this value.)

Modified reading loop to read only objects in the index ranges.

Modified PdfModule to test object number against stream's index instead of
a range from 0 to the last object number.

Attempt to conform to PDF specification
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference15_v6.pdf, page 83 and PDF versions 1.6 and 1.7.
@codecov-io
Copy link

codecov-io commented Jul 19, 2016

Current coverage is 3.43% (diff: 0.00%)

Merging #97 into integration will decrease coverage by 20.30%

@@           integration       #97   diff @@
============================================
  Files              411       411           
  Lines            34669     34690     +21   
  Methods              0         0           
  Messages             0         0           
  Branches          7105      7110      +5   
============================================
- Hits              8231      1192   -7039   
- Misses           26438     33405   +6967   
- Partials             0        93     +93   

Powered by Codecov. Last update 08baeef...3fd54f2

@carlwilson carlwilson added the bug A product defect that needs fixing label Sep 7, 2016
@carlwilson carlwilson added this to the Release 1.16 milestone Sep 7, 2016
@smmorrissey
Copy link

Sheila +1

@pmay
Copy link

pmay commented Sep 19, 2016

+1

1 similar comment
@asciim0
Copy link
Contributor

asciim0 commented Sep 19, 2016

+1

Copy link
Member

@david-russo david-russo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch Paul, and good catch on the discrepancy with the spec. The code is largely fine and only requires some minor clean-up.

private PdfStream _xstrm; // The underlying Stream object.
private PdfDictionary _dict;
private int _size;
private int[] _index;
private int _index_size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_index_size appears to be unnecessary since the same value can be retrieved from _index.length.

/** Range elements of the _index array:
Starting object and number of objects.
*/
private class index_range {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, all class names should be camel-case with an initial capital letter, i.e. IndexRange.

@@ -262,8 +294,16 @@ public int getFreeCount ()
/** Returns the total object count. */
public int getNumObjects ()
{
return _index[0] + _index[1];
return _index[_index_size - 1].start + _index[_index_size - 1].len;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the original method, this seems like a convoluted way to calculate the value already stored in _size. Is there a reason we can't simply return that?

private PdfStream _xstrm; // The underlying Stream object.
private PdfDictionary _dict;
private int _size;
private int[] _index;
private int _index_size;
private index_range[] _index;
private int[] _fieldSizes;
private int _freeCount;
private Filter[] _filters;
private int _entriesRead;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the addition of _read_index below, _entriesRead is no longer in use and can be safely removed.

private int[] _fieldSizes;
private int _freeCount;
private Filter[] _filters;
private int _entriesRead;
private int _read_range;
private int _read_index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try and match any surrounding whitespace conventions. In this case 4 spaces instead of a tab. JHOVE's source code is currently a mess with indentation, but we should avoid adding to the confusion.

Also, the JHOVE (and Java) style convention for common variable names is to have them camel-cased with lowercase initial letters, e.g. _readRange, such as in the surrounding code.

}

public boolean isValidObject(int objNum) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add JavaDocs for any new methods.

david-russo added a commit to bl-dpt/jhove that referenced this pull request Dec 8, 2016
Additionally reverted the `sObjNum` tests in PdfModule, as the
replacement tested for something slightly different: instead of
testing that `sObjNum` fell between 0 and the cross-reference table's
highest object number, the new test only checked for the object number's
existence within the current cross-reference stream.
david-russo added a commit to bl-dpt/jhove that referenced this pull request Dec 8, 2016
Additionally reverted the `sObjNum` tests in PdfModule, as the
replacement tested for something slightly different: instead of
testing that `sObjNum` fell between 0 and the cross-reference table's
highest object number, the new test only checked for the object number's
existence within the current cross-reference stream.
@carlwilson
Copy link
Member

Closed by #188

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A product defect that needs fixing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants