Tier1 decoder speed optimizations #783

Merged: 12 commits, Sep 13, 2016

Commits on May 21, 2016

  1. Move some MQC functions into a header for speed

    Allow these hot functions to be inlined. This boosts decode performance by ~10%.
    c0nk authored and rouault committed May 21, 2016
    426bf8d
  2. opj_t1_updateflags(): tiny optimization

    We can avoid using a look-up table with some shift arithmetic.
    rouault committed May 21, 2016
    c539808
  3. Improve code generation in opj_t1_dec_clnpass()

    Add an opj_t1_dec_clnpass_step_only_if_flag_not_sig_visit() method that
    does the job of opj_t1_dec_clnpass_step_only() assuming the conditions
    are met. And use it in opj_t1_dec_clnpass(). The compiler generates
    more efficient code.
    rouault committed May 21, 2016
    d8fef96
  4. 23a01df
  5. Reduce number of occurrences of orient function argument

    The orient argument is essentially used to compute an offset into
    lut_ctxno_zc, and that offset can be precomputed once at the beginning
    of opj_t1_decode_cblk() / opj_t1_encode_cblk().
    rouault committed May 21, 2016
    ba1edf6
  6. 31882ad
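
The idea behind commit 5 above (precomputing the orient-dependent offset into the zero-coding context table) can be sketched as follows. This is a minimal illustration, not the OpenJPEG code: the table contents, the 256-entries-per-orientation layout, and every name here (demo_lut, demo_decoder, ctx_lookup_*) are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout for illustration: one 256-entry sub-table per orientation. */
enum { CTX_PER_ORIENT = 256, NUM_ORIENT = 4 };

/* Hypothetical stand-in for a context look-up table. */
static unsigned char demo_lut[NUM_ORIENT * CTX_PER_ORIENT];

static void demo_lut_init(void)
{
    size_t i;
    for (i = 0; i < sizeof demo_lut; ++i) {
        demo_lut[i] = (unsigned char)(i % 9); /* arbitrary filler values */
    }
}

/* "Before": orient travels with every call and is shifted on every lookup. */
static unsigned char ctx_lookup_with_orient(unsigned flags, unsigned orient)
{
    assert(orient < NUM_ORIENT);
    return demo_lut[(orient << 8) | (flags & 0xFFu)];
}

/* "After": the per-orientation sub-table is resolved once per code block. */
typedef struct {
    const unsigned char *lut_for_orient;
} demo_decoder;

static void demo_decoder_begin_block(demo_decoder *d, unsigned orient)
{
    assert(orient < NUM_ORIENT);
    d->lut_for_orient = demo_lut + (orient << 8);
}

/* The hot-path lookup no longer needs the orient argument. */
static unsigned char ctx_lookup_precomputed(const demo_decoder *d,
                                            unsigned flags)
{
    return d->lut_for_orient[flags & 0xFFu];
}
```

Both lookups return the same value for any flags and orient; the second form simply moves the shift out of the per-sample path and removes one argument from the inner functions.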

Commits on May 23, 2016

  1. Tier 1 decoding: add a colflags array

    Additional flag array such that colflags[1+0] is for state of col=0,row=0..3,
    colflags[1+1] for col=1, row=0..3, colflags[1+flags_stride] for col=0,row=4..7, ...
    This array avoids too much cache thrashing when processing 4 vertical samples
    at a time, as done in the various decoding steps.
    rouault committed May 23, 2016
    1da397e
  2. 93f7f90
  3. opj_t1_dec_clnpass(): remove useless test in the runlen decoding path (of the non VSC case)

    rouault committed May 23, 2016
    956c31d
  4. 8371491
  5. Improve perf of opj_t1_dec_sigpass_mqc_vsc() and opj_t1_dec_refpass_mqc_vsc() with loop unrolling

    rouault committed May 23, 2016
    107eb31
  6. 7092f7e
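
The colflags indexing described in the first May 23 commit can be written out as a small helper. This is only a sketch of the layout as the commit message states it: the helper names are hypothetical, and the idea that each entry packs a fixed number of bits per row (bits_per_row) is an assumption, since the message does not say how the four rows share one entry.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Index of the colflags entry covering sample (col, row), following the
 * commit message: entry 1+0 is col 0 / rows 0..3, entry 1+1 is col 1 /
 * rows 0..3, entry 1+flags_stride is col 0 / rows 4..7, and so on.
 */
static size_t colflags_index(size_t col, size_t row, size_t flags_stride)
{
    /* The leading 1 leaves a border entry, so col must fit in the stride. */
    assert(col + 1 < flags_stride);
    return 1 + col + (row / 4) * flags_stride;
}

/*
 * Assumed packing: each of the 4 rows in a stripe owns bits_per_row bits
 * inside the entry. Returns the bit shift for the given row.
 */
static unsigned colflags_row_shift(size_t row, unsigned bits_per_row)
{
    return (unsigned)(row % 4) * bits_per_row;
}
```

With this layout the state of four vertically adjacent samples sits in one array entry, so a pass that walks a column stripe touches one entry instead of four entries that are a full stride apart.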