Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extended kernel generator docs #1802

Merged
merged 6 commits into from
Apr 1, 2020

Conversation

t4c1
Copy link
Contributor

@t4c1 t4c1 commented Mar 26, 2020

Summary

Adds some documentation on kernel generator, with emphasis on how to implement a new operation.

Tests

No new tests. This is just documentation.

Side Effects

None.

Checklist

  • Math issue Implement OpenCL kernel generator #1342

  • Copyright holder: Tadej Ciglarič

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@t4c1 t4c1 requested a review from SteveBronder March 26, 2020 10:17
@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.82 4.91 0.98 -1.94% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.97 -3.41% slower
eight_schools/eight_schools.stan 0.1 0.09 1.09 8.41% faster
gp_regr/gp_regr.stan 0.22 0.22 1.01 0.86% faster
irt_2pl/irt_2pl.stan 6.44 6.46 1.0 -0.4% slower
performance.compilation 87.94 86.58 1.02 1.55% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.52 7.58 0.99 -0.78% slower
pkpd/one_comp_mm_elim_abs.stan 21.0 20.5 1.02 2.42% faster
sir/sir.stan 93.64 94.02 1.0 -0.4% slower
gp_regr/gen_gp_data.stan 0.05 0.05 0.97 -3.15% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 3.0 0.98 -1.75% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.32 0.34 0.95 -4.75% slower
arK/arK.stan 1.73 1.74 1.0 -0.27% slower
arma/arma.stan 0.66 0.66 1.0 0.4% faster
garch/garch.stan 0.51 0.51 1.0 -0.23% slower
Mean result: 0.998620052855

Jenkins Console Log
Blue Ocean
Commit hash: 15f078d


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@SteveBronder
Copy link
Collaborator

For doc stuff I thought it might be easier just to make a gist with changes. How do you feel about the following. Mostly it makes the requirements into a listicle and rewords the first paragraph. The list makes the output doxygen a bit prettier

https://gist.github.com/SteveBronder/0c94b465dbf4ef9dbccd3fedddae4d6d

@t4c1
Copy link
Contributor Author

t4c1 commented Mar 30, 2020

Content is fine. I just don't understand why do you think GIST is preferable to a PR?

@SteveBronder
Copy link
Collaborator

Actually maybe that was silly.

One other thing about this is that it feels like these docs are more for how to add new expressions than ops right? For instance adding an unary or binary op is much simpler than what's in the docs here

@t4c1
Copy link
Contributor Author

t4c1 commented Mar 31, 2020

Right. Adding a new binary/unary op in fact much simpler, so I feel they don't need in depth explanation. What needs to be done can be easily deduced from code for existing ops.

SteveBronder
SteveBronder previously approved these changes Mar 31, 2020
Copy link
Collaborator

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an optional change to the docs just directing people where to look explicitly but otherwise lgtm thanks for writing these!

Comment on lines 25 to 27
* ## Defining a new kernel generator operation
*
* New kernel generator classes must satsify the conditions below:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* ## Defining a new kernel generator operation
*
* New kernel generator classes must satsify the conditions below:
* ## Defining a new kernel generator expression
*
* For adding new Unary and Binary ops use the macros provided in
* `unary_function_cl.hpp` and `binary_operation.hpp`.
*
* New kernel generator expressions classes must satisfy the conditions below:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Macro for binary operation only works for OpenCL operators. I think I covered all of them.

Copy link
Collaborator

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1!

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 4.82 4.91 0.98 -1.82% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.99 -0.51% slower
eight_schools/eight_schools.stan 0.09 0.09 0.97 -3.31% slower
gp_regr/gp_regr.stan 0.22 0.22 0.99 -0.89% slower
irt_2pl/irt_2pl.stan 6.45 6.46 1.0 -0.24% slower
performance.compilation 88.22 87.18 1.01 1.17% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.57 7.58 1.0 -0.16% slower
pkpd/one_comp_mm_elim_abs.stan 20.78 20.68 1.0 0.48% faster
sir/sir.stan 91.34 91.7 1.0 -0.4% slower
gp_regr/gen_gp_data.stan 0.05 0.05 1.02 1.56% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 2.95 1.0 0.04% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.31 0.33 0.94 -6.88% slower
arK/arK.stan 1.9 1.73 1.1 9.03% faster
arma/arma.stan 0.67 0.66 1.02 1.54% faster
garch/garch.stan 0.51 0.51 1.0 0.12% faster
Mean result: 1.00085913201

Jenkins Console Log
Blue Ocean
Commit hash: d74edce


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1 t4c1 merged commit ad4600f into stan-dev:develop Apr 1, 2020
@t4c1 t4c1 deleted the cl_kernel_generator_docs branch July 10, 2020 07:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants