
Sparse computation with kernels #902

Open
cforro opened this issue Apr 29, 2020 · 11 comments
Labels
doc Documentation related issues & PRs question Question on using Taichi

Comments

@cforro

cforro commented Apr 29, 2020

Dear Community,

If I were to simulate diffusion over a 2D grid with a square hole in the middle, how could I make use of sparse structures to avoid populating my kernel with a bunch of if statements?

import taichi as ti
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
#running this in jupyter 
import cv2
ti.reset() #running this inside a jupyter notebook helps relaunch. 
#Otherwise compiler (?) doesn't let jupyter rerun.
ti.cfg.arch = ti.cuda 

real = ti.f32
scalar = lambda: ti.var(dt=real)
vec = lambda: ti.Vector(2, dt=real)

steps=15000
n_grid=256
p=scalar()
ti.root.dense(ti.l, steps).dense(ti.ij, n_grid).place(p)

thick=10
init_cond=np.ones((n_grid,n_grid))
init_cond[thick:-thick,thick:-thick]=0
plt.imshow(init_cond)
#the initial condition has a thick border of concentration 1. The rest is 0.

   
@ti.func 
def laplacian(t, i, j):
    return  (-4 * p[t, i, j] + p[t, i, j - 1] + p[t, i, j + 1] + p[t, i + 1, j] + p[t, i - 1, j])


###Request####
#Here below, I avoid computing the diffusion on a square 25x25 area.
#How can I engineer the field p() to be sparse, so the kernel does not care
#about those values
@ti.kernel
def diffuse(t: ti.i32):
    for i in range(n_grid):
        for j in range(n_grid):
            if not ((i>100)&(i<125)&(j>100)&(j<125)):
                lp = laplacian(t-1,i,j)
                p[t,i,j] = p[t-1,i,j]+0.24*lp
####Request#####
# I would not need to replenish the initial conditions if the borders
# of my field were also inactive in the sparse structure of p()
@ti.kernel
def replenish(t: ti.i32):
    for i in range(n_grid):
        for j in range(n_grid):
            if ((i<thick) or (i>n_grid-thick)) or  ((j<thick) or (j>n_grid-thick)):
                p[t,i,j]=1




@ti.kernel                
def init_condition(arr: ti.ext_arr()):
    for i in range(n_grid):
        for j in range(n_grid):
            p[0,i,j]=arr[i,j]
            

@ti.kernel
def put_into_img(t: ti.i32, arr: ti.ext_arr()):
    for i in range(n_grid):
        for j in range(n_grid):
            arr[i,j]=p[t,i,j]
            
            
#main             
def forward():
    init_condition(init_cond)
    for t in range(1,steps):
        diffuse(t)
        replenish(t)
        if (t+1)%16==0:
            put_into_img(t,display_img)
            cv2.imshow('',display_img)
            cv2.waitKey(1)

forward()

For completeness' sake, I have pasted a working example of diffusion that avoids a central square hole. This runs in my Jupyter notebook.

The concentration field that is updated at every time step, p(), is dense, so I need to tell the kernel to explicitly avoid that region, and also to avoid (or rebuild) the region holding the initial condition (the borders).
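For a point of comparison outside Taichi, the same skip-a-region logic can be sketched in plain NumPy with a boolean mask (a hypothetical illustration; the grid size, hole position, and np.roll boundary handling here are choices of this sketch, not taken from the original code):

```python
import numpy as np

n, thick = 64, 4                      # smaller grid than the Taichi example
p = np.zeros((n, n))
p[:thick, :] = p[-thick:, :] = 1.0    # concentration-1 border, as in init_cond
p[:, :thick] = p[:, -thick:] = 1.0

# active[i, j] is False inside the square hole; the update skips those cells
active = np.ones((n, n), dtype=bool)
active[30:40, 30:40] = False

def diffuse_step(p, active):
    # 5-point Laplacian stencil, same coefficients as the Taichi kernel
    lap = (-4 * p
           + np.roll(p, 1, axis=0) + np.roll(p, -1, axis=0)
           + np.roll(p, 1, axis=1) + np.roll(p, -1, axis=1))
    new = p + 0.24 * lap
    return np.where(active, new, p)   # inactive cells keep their old value

p1 = diffuse_step(p, active)
```

This only hides the hole behind a mask; the point of a truly sparse structure would be to not even visit (or store) the inactive cells.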

To be honest, I tried playing around with the examples in the test functions for sparse structures, but to no avail. For example, I tried this approach:

ti.reset()
n=256
x=ti.var(ti.f32)
ptr1 = ti.root.pointer(ti.ij, n)
ptr1.place(x)


@ti.kernel
def deact():
    for j in range(n):
        for i in range(n):
            if (i>100)&(i<125)&(j>100)&(j<125):
                ti.deactivate(x.parent(),[i,j])
                
@ti.kernel
def fill():
    for i,j in x:
            x[j,i]=i+j

It does not work. Also, I do not understand how to work with pointers: ti.root.pointer(ti.ij,16).dense(ti.ij,16).shape is (256,256), not (16,16,16,16). Etc.

Long story short, I have no idea how to play with sparse data structures. I was not successful at deactivating elements, or playing with the bitmasked structures.

Eventually, I would like to iteratively solve an extremely sparse linear system, where handling the data structure correctly would come in really handy.

Let me know if I should rather post this on the forum.

@archibate
Collaborator

archibate commented Apr 30, 2020

ti.root.pointer(ti.ij,16).dense(ti.ij,16).shape is (256,256)

Try ti.root.pointer(ti.indices(0, 1),16).dense(ti.indices(2, 3),16).shape? If two nodes share the same indices, their extents multiply together. NOTE: ti.ij is an alias of ti.indices(0, 1).
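To make the shape rule concrete, here is a plain-arithmetic sketch (the extents are the 16s from the example above; no Taichi involved):

```python
# Hypothetical extents, mirroring the 16 x 16 example above.
pointer_extent = 16
dense_extent = 16

# Same index pair on both nodes (ti.ij and ti.ij):
# the per-index extents multiply, so the field stays 2D.
same_indices_shape = (pointer_extent * dense_extent,
                      pointer_extent * dense_extent)

# Distinct index pairs (ti.indices(0, 1) vs ti.indices(2, 3)):
# each pair contributes its own axes, so the field becomes 4D.
distinct_indices_shape = (pointer_extent, pointer_extent,
                          dense_extent, dense_extent)
```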

Long story short, I have no idea how to play with sparse data structures. I was not successful at deactivating elements, or playing with the bitmasked structures.

Thanks for asking this! Sorry about the lack of good docs on sparse data structures. To be honest, as an org member I don't know how to operate them well either...
(@yuanming-hu Again, could you complete the docs on sparse structures? Many thanks.)

###Request####
#Here below, I avoid computing the diffusion on a square 25x25 area.
#How can I engineer the field p() to be sparse, so the kernel does not care
#about those values

Let me try to share what I know about solving this:

  1. bitmasked is simpler than pointer; it's suggested to start with bitmasked.

  2. an element is activated when it's assigned, e.g.:

x[i, j] = 123

will mark x[i, j] as active.

  3. to deactivate an element, use:

ti.deactivate(x.parent(), [i, j])

  4. use the struct-for syntax to iterate over only the active elements (important!):

for i, j in x:  # will only loop over those that are activated!!
  ...

  5. if you use the range-for syntax, all elements, whether active or inactive, will be iterated:

for i in range(n):  # will iterate over all!!
  ...
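The activation rules above can be mimicked with a plain Python dict, just to make the semantics concrete (a sketch of the semantics only, not Taichi code; assign and deactivate are hypothetical helper names):

```python
# A dict stands in for a bitmasked field: a key being present == that cell is active.
x = {}

def assign(i, j, value):
    x[(i, j)] = value            # rule 2: assignment activates the element

def deactivate(i, j):
    x.pop((i, j), None)          # rule 3: like ti.deactivate(x.parent(), [i, j])

assign(3, 4, 123.0)
assign(5, 6, 7.0)
deactivate(5, 6)

# rule 4: the struct-for analogue visits only active elements
visited = sorted(x.keys())
```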

@archibate
Collaborator

archibate commented Apr 30, 2020

import taichi as ti

ti.init()

n = 256
x = ti.var(ti.f32)
blk1 = ti.root.bitmasked(ti.ij, (n, n))
blk1.place(x)

@ti.kernel
def act():
    for i in range(n):
        for j in range(n):
            if (i>100)&(i<125)&(j>100)&(j<125):
                # NOTE: all elements in a bitmasked node are initially deactivated,
                # so we have to **activate the elements we want**
                # instead of **deactivating the elements we don't want** (your case)
                x[i, j] = 0  # assigning any value activates!

@ti.kernel
def paint(t: ti.f32):
    for i, j in x:  # use the struct-for syntax
        # now, the loop body is only executed for active elements
        # don't use `for i in range(n)`: it would execute on every element, active or not (your case)
        x[i, j] = ti.sin(t)  # make the activated part to blink


act()

gui = ti.GUI('Test', n)
for frame in range(10000):
   paint(frame * 0.01)
   gui.set_image(x)
   gui.show()

Hope this could be helpful.

@digitalillusions

Is there a way to deactivate all elements at once, without range iterating over the entire structure?
Does the following line do exactly that?

block1.deactivate_all()

@archibate
Collaborator

Is there a way to deactivate all elements at once, without range iterating over the entire structure?
Does the following line do exactly that?

block1.deactivate_all()

Yes, it does :)

@archibate archibate added the doc Documentation related issues & PRs label Apr 30, 2020
@cforro
Author

cforro commented May 1, 2020

Thanks a lot @archibate
I understand it better now, thanks for the details.
Going a bit further, I'm more interested in using diff-taichi, and the ti.Tape, etc. So I want to track the simulation in time. Therefore, my data structure is dense in time but would be bitmasked in space.

ti.root.dense(ti.l, steps).bitmasked(ti.ij, n_grid).place(x)

How do you then run a struct-for only on the bitmasked part of the structure?
I had a suspicion it has to do with something like

@ti.kernel
def paint(t: ti.f32):
    for i, j in x:  # x.leaf(), x.children(), x.node()  ? How to access only the spatial part ?
        x[t, i, j] = ti.sin(t)

but again limited success :)

@archibate
Collaborator

Good question. You can do this by activating only x[0, i, j], so that x[1, i, j] is not activated, e.g.:

import taichi as ti

ti.init()

n = 256
steps = 3
x = ti.var(ti.f32)
img = ti.var(ti.f32, shape=(n, n))
blk1 = ti.root.dense(ti.l, steps).bitmasked(ti.ij, (n, n))
blk1.place(x)

@ti.kernel
def act():
    for i, j in ti.ndrange(n, n):
        if (i>100)&(i<125)&(j>100)&(j<125):
            x[0, i, j] = 0  # activate `x[0, i, j]`, but do not activate `x[1, i, j]`.

@ti.kernel
def paint(n: ti.f32):
    for t, i, j in x:  # although t is included in the indices, it will always be 0, since only x[0, i, j] is activated in act()
        x[t, i, j] = ti.sin(n)

@ti.kernel
def to2dimg():
    for i, j in img:
        img[i, j] = x[0, i, j]  # copy into a dense 2D field so gui.set_image() can display it


ti.root.deactivate_all()
act()

gui = ti.GUI('Test', n)
for frame in range(10000):
   paint(frame * 0.01)
   to2dimg()
   gui.set_image(img)
   gui.show()

@k-ye
Member

k-ye commented May 1, 2020

I think what @archibate said is really close, but slightly off from what @mathusalem2020 wants, since s/he does want t to change with each invocation?

How about defining a mask Taichi tensor, which is bitmasked? E.g.

x = ti.var(ti.f32)
x_mask = ti.var(ti.i32)

ti.root.bitmasked(ti.ij, (n, n)).place(x_mask)
ti.root.dense(ti.l, steps).dense(ti.ij, (n, n)).place(x)

@ti.kernel
def act():
  for i, j in ti.ndrange(n, n):
    if ...:
      x_mask[i, j] = 1  # the value doesn't matter, just to activate

@ti.kernel
def diffuse(t: ti.i32):
  for i, j in x_mask:  # will only iterate over those activated in |x_mask|
    x[t, i, j] = ...

act()
for i in range(steps):
  diffuse(i)
  ...

One nice thing is that the struct-for loop only gives you the indices, not the actual elements of the tensor being iterated. So you can leverage the sparsity of one tensor while iterating over another, I think.
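That access pattern can be sketched in plain Python/NumPy, with an explicit index set standing in for the bitmasked x_mask (a hypothetical illustration of the pattern, not Taichi API; the grid size and update rule are made up):

```python
import numpy as np

n, steps = 8, 3
x = np.zeros((steps, n, n))       # dense in time and space, like x above

# "activate" a static set of cells once, playing the role of x_mask
mask = {(i, j) for i in range(n) for j in range(n)
        if 2 <= i < 5 and 2 <= j < 5}

def diffuse(t):
    for i, j in mask:             # iterate only the "activated" indices...
        x[t, i, j] = t + 1.0      # ...but write into the dense array x

for t in range(steps):
    diffuse(t)
```

The sparsity lives entirely in the index set; the dense array is only touched at those positions, for every t.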

@xumingkuan
Contributor

xumingkuan commented May 1, 2020

@ti.kernel
def act():
    for i, j in ti.ndrange(n, n):
        if (i>100)&(i<125)&(j>100)&(j<125):
            x[0, i, j] = 0  # activate `x[0, i, j]`, but not activate `x[1, i, j]`.

A simpler way to write this:

@ti.kernel
def act():
    for I in ti.grouped(ti.ndrange(1, (101, 125), (101, 125))):
        x[I] = 0
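For intuition, the index set that this ti.grouped(ti.ndrange(...)) loop covers can be written out in plain Python (a sketch; ti.ndrange's (a, b) arguments are half-open ranges, like Python's range):

```python
from itertools import product

# ti.ndrange(1, (101, 125), (101, 125)) spans
# t in [0, 1), i in [101, 125), j in [101, 125)
indices = list(product(range(1), range(101, 125), range(101, 125)))
```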

@archibate
Collaborator

One nice thing is that, the struct-for loop will only return you the index, but not the actual element of the tensor being iterated. So you can leverage the sparsity from one tensor for iterating over another, I think.

Cool! Your answer sounds much closer. But there are two things to consider in your solution:

  1. What if we want to maintain a mask for different t?
  2. If we use pointer instead of bitmasked, then this won't save any memory since x is still dense.

@k-ye
Member

k-ye commented May 1, 2020

  1. What if we want to maintain a mask for different t?

Yeah, then I guess a static mask variable won't work. In that case, I'm not sure whether a sparse data structure is advantageous in any way, since you would have to iterate through the entire mesh grid to figure out which places to activate on every diffuse() iteration?

  2. If we use pointer instead of bitmasked, then this won't save any memory since x is still dense.

I suppose you meant the other way around? That's true, but I don't see memory being a major concern in this issue.

@archibate
Collaborator

archibate commented May 2, 2020 via email

@archibate archibate added the question Question on using Taichi label May 5, 2020