Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds UMD headers to core, display and shared files. #6683

Merged
merged 3 commits into from
Dec 16, 2015

Conversation

yurydelendik
Copy link
Contributor

Now each file is a module. There are some benefits: we can track dependencies between files and various module loading systems can be used, e.g. requirejs or node. Not so good news that we have some circular dependencies to deal with (I fixed/patched couple of them), e.g. colorspace uses image and image uses colorspace. I tried not to fix or refactor anything in this PR (only stuff to make modules dependencies valid). That's just an intermediate step the clean up the dependencies.

jshint globals directives were removed where was possible. I kept only PDFJS, however in the future we can refactor that too. Notice that we may try to turn "unused" on now.

There is validation tool 'external/umdutils/verifier.js' that helps to maintain UMD headers valid. See for UMD format definition there.

Future work might include: use module loading system for the viewer, remove UMD module headers during bundling, remove src/expose_to_global.js (which was created for unit tests), split some files into smaller (e.g. core/fonts.js), refactor PDFJS global more. After this is done we can individually refactor each module and even migrate to some e.g. ES6 translator.

@yurydelendik
Copy link
Contributor Author

After this patch you can execute the following in the node.js:

var api = require('./src/display/api.js');
PDFJS.displayWorker = true; // remove that in the future
var data = require('fs').readFileSync('./web/compressed.tracemonkey-pldi-09.pdf');
var loadingTask = api.getDocument(new Uint8Array(data));
loadingTask.promise.then(function (doc) {
  console.log(doc.numPages);
});

@yurydelendik
Copy link
Contributor Author

Dependo generated graph (only core and display):

screen shot 2015-11-23 at 1 05 06 pm

@Rob--W
Copy link
Member

Rob--W commented Nov 23, 2015

Questions:

  1. The module names are repeated 5+ times:

    • once in AMD
    • once in CommonJS
    • once without module loader
    • once in the function expression's parameters
    • at least once in the function body

    Does your tool enforce that all of these names are consistent and follow the same convention (and briefly be documented somewhere)?

  2. There is lots of boilerplate. Can the build tool be updated to automatically trim the duplicate information when a single-file build is generated? E.g. by only using AMD, CommonJS or the module pattern (whatever is the easiest for you).

@yurydelendik
Copy link
Contributor Author

Does your tool enforce that all of these names are consistent and follow the same convention (and briefly be documented somewhere)?

The validation of the names is done by node make lint command (see also https://github.com/yurydelendik/pdf.js/blob/module-core/external/umdutils/verifier.js#L18). It also tracks circular dependencies. The AMD names (for module id and dependencies) looks like a good candidates to be selected as primary names (it easy to map them to CommonJS path and JS object name).

Can the build tool be updated to automatically trim the duplicate information when a single-file build is generated?

Yes. The intent will be eventually to build without UMD headers -- those can be replaced by regular function closure, e.g.

(function (root, (exports, sharedUtil, displayWebGL) {
  ... 
})((root.pdfjsDisplayPatternHelper = {}), root.pdfjsSharedUtil, root.pdfjsDisplayWebGL);

where root can be some local object instead of window.

Also, I think we shall remove all links to core/display/shared files and use module loaders instead (e.g. RequireJS).

(IHMO it's out of scope of this PR ATM -- I don't want to touch our build system without real need to)

@yurydelendik
Copy link
Contributor Author

Let me also re-iterate how to choose names:

  • for AMD module name:
    • almost all files at 'src/' folder shall start with 'pdfjs/' prefix followed by the relative path excluding the extension, e.g. 'src/core/pattern_helper.js' file will have 'pdfjs/core/pattern_helper' AMD module name
    • The dependencies shall use full module names
    • the name, as well as the file name, shall be only lower case ASCII letter and digits, and '_' can be used
  • for CommonJS reference:
    • all references to the required modules must be relative the the current one, e.g. if 'src/core/pattern_helper.js' refers 'src/shared/util.js' the path shall be '../shared/utils.js'
    • the path shall start from './' or '../' and ends with '.js'
  • for global/root name:
    • the name is based on AMD name: after each '/' character, the letter is upper-cased, and the '/' and '_' characters removed (it's possible to uppercase other letter inside the name, e.g. pdfjsCoreColorSpace or pdfjsCorePatternHelper, but shall be consistent for a module name and its references)
  • import name:
    • the name is part of the corresponding global/root name, but without 'pdfjs' prefix and first letter is in lower case

@@ -542,3 +593,7 @@ var PDFDocument = (function PDFDocumentClosure() {

return PDFDocument;
})();

exports.Paga = Page;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: typo exports.Paga should be exports.Page.

@Snuffleupagus
Copy link
Collaborator

A testing related nit: shouldn't test/unit/obj_spec.js be renamed to test/unit/primitives_spec.js, since that unit-test covers code that is now moved from obj.js?

@timvandermeij
Copy link
Contributor

Aside from the nit above, this PR needs a rebase too after the recently landed PRs.

@yurydelendik yurydelendik force-pushed the module-core branch 3 times, most recently from 663db97 to 1c71074 Compare December 1, 2015 18:20
@timvandermeij
Copy link
Contributor

@yurydelendik Could you rebase this PR again? I think we should land this soon as it keeps on needing a rebase and it looks good to have modules and to drop most of the globals.

@timvandermeij
Copy link
Contributor

/botio-linux preview

@pdfjsbot
Copy link

pdfjsbot commented Dec 1, 2015

From: Bot.io (Linux)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/397b69541371361/output.txt

@timvandermeij
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Dec 1, 2015

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/a8be5c2601c97ae/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Dec 1, 2015

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/1d5e50d933793ce/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Dec 1, 2015

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/a8be5c2601c97ae/output.txt

Total script time: 19.04 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

pdfjsbot commented Dec 1, 2015

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/1d5e50d933793ce/output.txt

Total script time: 20.31 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@yurydelendik
Copy link
Contributor Author

New 'evaluator' on 'function' dependency is added by #6723 -- fixed.

exports.PDFFunction = PDFFunction;
exports.PostScriptEvaluator = PostScriptEvaluator;
exports.PostScriptCompiler = PostScriptCompiler;
}));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you also need to export isPDFFunction here, or am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, node make lint was throwing "ERROR:The non-exported symbol is referred: isPDFFunction from pdfjs/core/function used in pdfjs/core/evaluator". I think we need to change travis to execute that instead.

context.warnCallback(path + ': ' + msg);
}
function error(msg) {
context.info(path + ': ' + msg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this call context.errorCallback instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.

}
// Scans for exports definitions in the body.
var exportedNames = [];
re = /\bexports.(\w+)\s*=\s/g;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While testing the PR locally, I wanted to check that the validator failed if an exports.<name> wasn't present. In order to do that, I initially tested to just comment out a random exports.<name> line. To my surprise, that didn't fail, which lead me to this line.

So my question is: Should this regexp be changed to also check that the line with exports.<name> isn't commented out, since that would cause a runtime failure that the validator currently doesn't catch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we shall catch all of those cases by testing. And we usually enforce that during code review by not allowing commented code to be present. This a temporary solution anyway to just validate introduction of UMDs and keeping code quality until ES6 module imports will be introduced.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine to me!
(Since I happened to notice it, I figured that it couldn't hurt to ask.)

@brendandahl
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://107.22.172.223:8877/876914199871804/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://107.21.233.14:8877/16c4337fe8456a5/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/876914199871804/output.txt

Total script time: 19.77 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/16c4337fe8456a5/output.txt

Total script time: 20.41 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants