Fix GLES2 emulated integer functions op_and(), op_or() and op_xor() #45

@juj juj commented Oct 24, 2019

Fix GLES2 emulated integer functions op_and(), op_or() and op_xor() to work properly for negative input values, and optimize to avoid control flow.

E.g. before:

int BITWISE_BIT_COUNT = 32;

int op_modi(int x, int y)
{
   return x - y * (x / y);
}

int op_and(int a, int b)
{
   int result = 0;
   int n = 1;
   for (int i = 0; i < BITWISE_BIT_COUNT; i++)
   {
      if ((op_modi(a, 2) != 0) && (op_modi(b, 2) != 0))
      {
         result += n;
      }
      a = a / 2; // Bug: Fails to shift right if a < 0
      b = b / 2; // Bug: Fails to shift right if b < 0
      n = n * 2;
      if (!(a > 0 && b > 0)) // Bug: Fails if a < 0 or b < 0 (though simple fix to change to test a != 0 && b != 0)
      {
         break;
      }
   }
   return result;
}
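
To see the failure concretely, the old routine can be ported to C and run on the CPU. This is only a rough sketch: it assumes that signed division truncates toward zero as in C99, which the GLSL ES compiler is not guaranteed to match for negative operands. The names old_op_modi and old_op_and are hypothetical, chosen just for this harness.

```c
#include <stdint.h>

/* C port of the pre-fix GLSL routine, for CPU-side experiments only.
   Assumption: signed division truncates toward zero, as in C99. */
static int32_t old_op_modi(int32_t x, int32_t y)
{
    return x - y * (x / y);
}

static int32_t old_op_and(int32_t a, int32_t b)
{
    int32_t result = 0;
    uint32_t n = 1; /* unsigned so that doubling cannot overflow in C */
    for (int i = 0; i < 32; i++)
    {
        if (old_op_modi(a, 2) != 0 && old_op_modi(b, 2) != 0)
            result += (int32_t)n;
        a = a / 2; /* rounds toward zero, so this is not a right shift for a < 0 */
        b = b / 2;
        n = n * 2;
        if (!(a > 0 && b > 0)) /* exits immediately when either input is negative */
            break;
    }
    return result;
}
```

Under that assumption, old_op_and(12, 10) gives 8 as expected, but old_op_and(-1, -1) gives 1 where -1 & -1 is -1, and old_op_and(-4, 7) gives 0 where -4 & 7 is 4.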

After:

int op_and(int a, int b)
{
   // First extract the sign bit to convert inputs to positive values.
   int result = (a < 0 && b < 0) ? -2147483648 : 0;
   if (a < 0) a -= -2147483648;
   if (b < 0) b -= -2147483648;
   int n = 1;
   ivec2 ab = ivec2(a, b); // Use vectorization
   for (int i = 0; i < 31; i++) // Loop excluding the sign bit
   {
      ivec2 ab_div = ab / 2;
      ivec2 ab_rem = ab - ab_div*2; // Avoid calling op_modi() to optimize away integer divs.
      // Here ab_rem.x and ab_rem.y are either 0 or 1.
      result += n * ab_rem.x * ab_rem.y; // for one-bit values a and b,  a & b == a*b
      ab = ab_div;
      n += n;
      // At the end avoid test "if (a == 0 || b == 0) break;", as counterproductive
   }
   return result;
}
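
The fixed routine can be checked the same way. Below is a minimal scalar C port, assuming 32-bit two's complement ints; the ivec2 pair is unrolled into two plain ints, and new_op_and is a hypothetical name used only here.

```c
#include <stdint.h>

/* Scalar C port of the post-fix op_and(), for CPU-side checking only. */
static int32_t new_op_and(int32_t a, int32_t b)
{
    /* Handle bit 31 separately, then clear it so both inputs are >= 0. */
    int32_t result = (a < 0 && b < 0) ? INT32_MIN : 0;
    if (a < 0) a -= INT32_MIN;
    if (b < 0) b -= INT32_MIN;
    uint32_t n = 1; /* unsigned so the final doubling cannot overflow in C */
    for (int i = 0; i < 31; i++) /* the 31 bits below the sign bit */
    {
        int32_t a_div = a / 2;
        int32_t b_div = b / 2;
        int32_t a_rem = a - a_div * 2; /* 0 or 1 */
        int32_t b_rem = b - b_div * 2; /* 0 or 1 */
        result += (int32_t)n * a_rem * b_rem; /* one-bit AND is a product */
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}
```

Comparing new_op_and(x, y) against the native x & y over a handful of values, including -1, INT32_MIN and INT32_MAX, is a quick way to confirm that the negative cases now agree.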

Similar transformations to op_or and op_xor.

In case of op_or, for two one-bit inputs a and b, a | b is implemented as (a+b)/2.

In case of op_xor, a ^ b is implemented using int(ab_rem.x != ab_rem.y).
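
The two sibling functions can be sketched along the same lines. This is a hedged C illustration, not the code from the commit: the sign-bit handling is inferred by analogy with op_and, the names sketch_op_or and sketch_op_xor are hypothetical, and the per-bit OR uses x + y - x*y as a substitution, because under truncating integer division (x + y) / 2 evaluates to 0 for the inputs 1 and 0.

```c
#include <stdint.h>

/* Hypothetical CPU-side sketches; assume 32-bit two's complement ints. */
static int32_t sketch_op_or(int32_t a, int32_t b)
{
    /* Sign bit of the result is set if either input is negative. */
    int32_t result = (a < 0 || b < 0) ? INT32_MIN : 0;
    if (a < 0) a -= INT32_MIN;
    if (b < 0) b -= INT32_MIN;
    uint32_t n = 1;
    for (int i = 0; i < 31; i++)
    {
        int32_t a_div = a / 2, b_div = b / 2;
        int32_t x = a - a_div * 2; /* 0 or 1 */
        int32_t y = b - b_div * 2; /* 0 or 1 */
        result += (int32_t)n * (x + y - x * y); /* one-bit OR (assumed formula) */
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}

static int32_t sketch_op_xor(int32_t a, int32_t b)
{
    /* Sign bit of the result is set if exactly one input is negative. */
    int32_t result = ((a < 0) != (b < 0)) ? INT32_MIN : 0;
    if (a < 0) a -= INT32_MIN;
    if (b < 0) b -= INT32_MIN;
    uint32_t n = 1;
    for (int i = 0; i < 31; i++)
    {
        int32_t a_div = a / 2, b_div = b / 2;
        int32_t x = a - a_div * 2; /* 0 or 1 */
        int32_t y = b - b_div * 2; /* 0 or 1 */
        result += (int32_t)n * (int32_t)(x != y); /* one-bit XOR */
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}
```

Both loops keep the branch-free shape of op_and: only the expression that combines the two remainder bits, and the rule for the sign bit, differ.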

@juj force-pushed the fix_gles2_integer_ops branch from fceb3d1 to 3ef11ff on September 22, 2020, 11:48

Lssikkes commented Jul 3, 2021

Thanks for this juj, I've pulled it into my local branch
