
Sparse computation with kernels #902

Open
cforro opened this issue Apr 29, 2020 · 11 comments
Labels
doc Documentation related issues & PRs question Question on using Taichi

Comments

@cforro

cforro commented Apr 29, 2020

Dear Community,

If I were to simulate diffusion over a 2D grid with a square hole in the middle, how could I make use of sparse structures to avoid populating my kernel with a bunch of if statements?

import taichi as ti
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
#running this in jupyter 
import cv2
ti.reset() #running this inside a jupyter notebook helps relaunch. 
#Otherwise compiler (?) doesn't let jupyter rerun.
ti.cfg.arch = ti.cuda 

real = ti.f32
scalar = lambda: ti.var(dt=real)
vec = lambda: ti.Vector(2, dt=real)

steps=15000
n_grid=256
p=scalar()
ti.root.dense(ti.l, steps).dense(ti.ij, n_grid).place(p)

thick=10
init_cond=np.ones((n_grid,n_grid))
init_cond[thick:-thick,thick:-thick]=0
plt.imshow(init_cond)
#the initial condition has a thick border of concentration 1. The rest is 0.

   
@ti.func 
def laplacian(t, i, j):
    return  (-4 * p[t, i, j] + p[t, i, j - 1] + p[t, i, j + 1] + p[t, i + 1, j] + p[t, i - 1, j])


###Request####
#Here below, I avoid computing the diffusion on a square 25x25 area.
#How can I engineer the field p() to be sparse, so the kernel does not care
#about those values
@ti.kernel
def diffuse(t: ti.i32):
    for i in range(n_grid):
        for j in range(n_grid):
            if not ((i>100)&(i<125)&(j>100)&(j<125)):
                lp = laplacian(t-1,i,j)
                p[t,i,j] = p[t-1,i,j]+0.24*lp
####Request#####
# I would not need to replenish the initial conditions if the borders
# of my field were also inactive in the sparse structure of p()
@ti.kernel
def replenish(t: ti.i32):
    for i in range(n_grid):
        for j in range(n_grid):
            if ((i<thick) or (i>n_grid-thick)) or  ((j<thick) or (j>n_grid-thick)):
                p[t,i,j]=1




@ti.kernel                
def init_condition(arr: ti.ext_arr()):
    for i in range(n_grid):
        for j in range(n_grid):
            p[0,i,j]=arr[i,j]
            

@ti.kernel
def put_into_img(t: ti.i32, arr: ti.ext_arr()):
    for i in range(n_grid):
        for j in range(n_grid):
            arr[i,j]=p[t,i,j]
            
            
#main             
def forward():
    init_condition(init_cond)
    for t in range(1,steps):
        diffuse(t)
        replenish(t)
        if (t+1)%16==0:
            put_into_img(t,display_img)
            cv2.imshow('',display_img)
            cv2.waitKey(1)

forward()

For completeness' sake, I have pasted a working example of diffusion that avoids a central square hole. This runs in my Jupyter notebook.

The concentration field that is updated at every time step, p(), is dense, so I need to tell the kernel to explicitly avoid that region, and also to avoid (or rebuild) the region holding the initial condition (the borders).
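For a point of comparison outside Taichi, the same skip-a-region logic can be sketched in plain NumPy with a boolean mask (a hypothetical illustration; the grid size, hole position, and np.roll boundary handling here are choices of this sketch, not taken from the original code):

```python
import numpy as np

n, thick = 64, 4                      # smaller grid than the Taichi example
p = np.zeros((n, n))
p[:thick, :] = p[-thick:, :] = 1.0    # concentration-1 border, as in init_cond
p[:, :thick] = p[:, -thick:] = 1.0

# active[i, j] is False inside the square hole; the update skips those cells
active = np.ones((n, n), dtype=bool)
active[30:40, 30:40] = False

def diffuse_step(p, active):
    # 5-point Laplacian stencil, same coefficients as the Taichi kernel
    lap = (-4 * p
           + np.roll(p, 1, axis=0) + np.roll(p, -1, axis=0)
           + np.roll(p, 1, axis=1) + np.roll(p, -1, axis=1))
    new = p + 0.24 * lap
    return np.where(active, new, p)   # inactive cells keep their old value

p1 = diffuse_step(p, active)
```

This only hides the hole behind a mask; the point of a truly sparse structure would be to not even visit (or store) the inactive cells.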

To be honest, I tried playing around with the examples in the test functions for sparse structures, but to no avail. For example, I tried this approach:

ti.reset()
n=256
x=ti.var(ti.f32)
ptr1 = ti.root.pointer(ti.ij, n)
ptr1.place(x)


@ti.kernel
def deact():
    for j in range(n):
        for i in range(n):
            if (i>100)&(i<125)&(j>100)&(j<125):
                ti.deactivate(x.parent(),[i,j])
                
@ti.kernel
def fill():
    for i,j in x:
            x[j,i]=i+j

It does not work. Also, I do not understand how to work with pointers: ti.root.pointer(ti.ij,16).dense(ti.ij,16).shape is (256,256), not (16,16,16,16). Etc.

Long story short, I have no idea how to play with sparse data structures. I was not successful at deactivating elements, or playing with the bitmasked structures.

Eventually, I would like to iteratively solve an extremely sparse linear system, where handling the data structure correctly would come in really handy.

Let me know if I should rather post this on the forum.

@archibate
Collaborator

archibate commented Apr 30, 2020

ti.root.pointer(ti.ij,16).dense(ti.ij,16).shape is (256,256)

Try ti.root.pointer(ti.indices(0, 1),16).dense(ti.indices(2, 3),16).shape? If two nodes share the same indices, their extents multiply together. NOTE: ti.ij is an alias of ti.indices(0, 1).
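To make the shape rule concrete, here is a plain-arithmetic sketch (the extents are the 16s from the example above; no Taichi involved):

```python
# Hypothetical extents, mirroring the 16 x 16 example above.
pointer_extent = 16
dense_extent = 16

# Same index pair on both nodes (ti.ij and ti.ij):
# the per-index extents multiply, so the field stays 2D.
same_indices_shape = (pointer_extent * dense_extent,
                      pointer_extent * dense_extent)

# Distinct index pairs (ti.indices(0, 1) vs ti.indices(2, 3)):
# each pair contributes its own axes, so the field becomes 4D.
distinct_indices_shape = (pointer_extent, pointer_extent,
                          dense_extent, dense_extent)
```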

Long story short, I have no idea how to play with sparse data structures. I was not successful at deactivating elements, or playing with the bitmasked structures.

Thanks for asking this! Sorry about the lack of good docs on sparse data structures. To be honest, as an org member I don't know how to operate them well either...
(@yuanming-hu Again, could you complete the docs on sparse structures? Many thanks.)

###Request####
#Here below, I avoid computing the diffusion on a square 25x25 area.
#How can I engineer the field p() to be sparse, so the kernel does not care
#about those values

Let me try to share what I know about solving this:

  1. bitmasked is simpler than pointer; it's suggested to start with bitmasked.

  2. an element is activated when it's assigned, e.g.:

x[i, j] = 123

will mark x[i, j] as active.

  3. to deactivate an element, use:

ti.deactivate(x.parent(), [i, j])

  4. use the struct-for syntax to iterate over only the active elements (important!):

for i, j in x:  # will only loop over those that are activated!!
  ...

  5. if you use the range-for syntax, all elements, whether active or inactive, will be iterated:

for i in range(n):  # will iterate over all!!
  ...
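The activation rules above can be mimicked with a plain Python dict, just to make the semantics concrete (a sketch of the semantics only, not Taichi code; assign and deactivate are hypothetical helper names):

```python
# A dict stands in for a bitmasked field: a key being present == that cell is active.
x = {}

def assign(i, j, value):
    x[(i, j)] = value            # rule 2: assignment activates the element

def deactivate(i, j):
    x.pop((i, j), None)          # rule 3: like ti.deactivate(x.parent(), [i, j])

assign(3, 4, 123.0)
assign(5, 6, 7.0)
deactivate(5, 6)

# rule 4: the struct-for analogue visits only active elements
visited = sorted(x.keys())
```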

@archibate
Collaborator

archibate commented Apr 30, 2020

import taichi as ti

ti.init()

n = 256
x = ti.var(ti.f32)
blk1 = ti.root.bitmasked(ti.ij, (n, n))
blk1.place(x)

@ti.kernel
def act():
    for i in range(n):
        for j in range(n):
            if (i>100)&(i<125)&(j>100)&(j<125):
                # NOTE: all elements in a bitmasked node are initially deactivated,
                # so we have to **activate the elements we want**
                # instead of **deactivating the elements we don't want** (your case)
                x[i, j] = 0  # assigning any value activates!

@ti.kernel
def paint(t: ti.f32):
    for i, j in x:  # use the struct-for syntax
        # now, the loop body is only executed for active elements
        # don't use `for i in range(n)`: it would execute on every element, active or not (your case)
        x[i, j] = ti.sin(t)  # make the activated part to blink


act()

gui = ti.GUI('Test', n)
for frame in range(10000):
   paint(frame * 0.01)
   gui.set_image(x)
   gui.show()

Hope this could be helpful.

@digitalillusions

Is there a way to deactivate all elements at once, without range iterating over the entire structure?
Does the following line do exactly that?

block1.deactivate_all()

@archibate
Collaborator

Is there a way to deactivate all elements at once, without range iterating over the entire structure?
Does the following line do exactly that?

block1.deactivate_all()

Yes, it does :)

@archibate archibate added the doc Documentation related issues & PRs label Apr 30, 2020
@cforro
Author

cforro commented May 1, 2020

Thanks a lot @archibate
I understand it better now, thanks for the details.
Going a bit further, I'm more interested in using diff-taichi, and the ti.Tape, etc. So I want to track the simulation in time. Therefore, my data structure is dense in time but would be bitmasked in space.

ti.root.dense(ti.l, steps).bitmasked(ti.ij, n_grid).place(x)

How do you then run a struct-for only on the bitmasked part of the structure?
I had a suspicion it has to do with something like

@ti.kernel
def paint(t: ti.f32):
    for i, j in x:  # x.leaf(), x.children(), x.node()  ? How to access only the spatial part ?
        x[t, i, j] = ti.sin(t)

but again limited success :)

@archibate
Collaborator

Good question. You can do this by activating only x[0, i, j], so that x[1, i, j] is not activated, e.g.:

import taichi as ti

ti.init()

n = 256
steps = 3
x = ti.var(ti.f32)
img = ti.var(ti.f32, shape=(n, n))
blk1 = ti.root.dense(ti.l, steps).bitmasked(ti.ij, (n, n))
blk1.place(x)

@ti.kernel
def act():
    for i, j in ti.ndrange(n, n):
        if (i>100)&(i<125)&(j>100)&(j<125):
            x[0, i, j] = 0  # activate `x[0, i, j]`, but do not activate `x[1, i, j]`.

@ti.kernel
def paint(n: ti.f32):
    for t, i, j in x:  # although t is included in the indices, it will always be 0, since only x[0, i, j] is activated in act()
        x[t, i, j] = ti.sin(n)

@ti.kernel
def to2dimg():
    for i, j in img:
        img[i, j] = x[0, i, j]  # copy into a dense 2D field so gui.set_image() can display it


ti.root.deactivate_all()
act()

gui = ti.GUI('Test', n)
for frame in range(10000):
   paint(frame * 0.01)
   to2dimg()
   gui.set_image(img)
   gui.show()

@k-ye
Member

k-ye commented May 1, 2020

I think what @archibate said is really close, but slightly off from what @mathusalem2020 wants, since s/he does want t to change with each invocation?

How about defining a mask Taichi tensor, which is bitmasked? E.g.

x = ti.var(ti.f32)
x_mask = ti.var(ti.i32)

ti.root.bitmasked(ti.ij, (n, n)).place(x_mask)
ti.root.dense(ti.l, steps).dense(ti.ij, (n, n)).place(x)

@ti.kernel
def act():
  for i, j in ti.ndrange(n, n):
    if ...:
      x_mask[i, j] = 1  # the value doesn't matter, just to activate

@ti.kernel
def diffuse(t: ti.i32):
  for i, j in x_mask:  # will only iterate over those activated in |x_mask|
    x[t, i, j] = ...

act()
for i in range(steps):
  diffuse(i)
  ...

One nice thing is that the struct-for loop only gives you the indices, not the actual elements of the tensor being iterated. So you can leverage the sparsity of one tensor while iterating over another, I think.
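That access pattern can be sketched in plain Python/NumPy, with an explicit index set standing in for the bitmasked x_mask (a hypothetical illustration of the pattern, not Taichi API; the grid size and update rule are made up):

```python
import numpy as np

n, steps = 8, 3
x = np.zeros((steps, n, n))       # dense in time and space, like x above

# "activate" a static set of cells once, playing the role of x_mask
mask = {(i, j) for i in range(n) for j in range(n)
        if 2 <= i < 5 and 2 <= j < 5}

def diffuse(t):
    for i, j in mask:             # iterate only the "activated" indices...
        x[t, i, j] = t + 1.0      # ...but write into the dense array x

for t in range(steps):
    diffuse(t)
```

The sparsity lives entirely in the index set; the dense array is only touched at those positions, for every t.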

@xumingkuan
Contributor

xumingkuan commented May 1, 2020

@ti.kernel
def act():
    for i, j in ti.ndrange(n, n):
        if (i>100)&(i<125)&(j>100)&(j<125):
            x[0, i, j] = 0  # activate `x[0, i, j]`, but not activate `x[1, i, j]`.

A simpler way to write this:

@ti.kernel
def act():
    for I in ti.grouped(ti.ndrange(1, (101, 125), (101, 125))):
        x[I] = 0
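For intuition, the index set that this ti.grouped(ti.ndrange(...)) loop covers can be written out in plain Python (a sketch; ti.ndrange's (a, b) arguments are half-open ranges, like Python's range):

```python
from itertools import product

# ti.ndrange(1, (101, 125), (101, 125)) spans
# t in [0, 1), i in [101, 125), j in [101, 125)
indices = list(product(range(1), range(101, 125), range(101, 125)))
```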

@archibate
Collaborator

One nice thing is that, the struct-for loop will only return you the index, but not the actual element of the tensor being iterated. So you can leverage the sparsity from one tensor for iterating over another, I think.

Cool! Your answer sounds much closer. But there are two things to consider in your solution:

  1. What if we want to maintain a mask for different t?
  2. If we use pointer instead of bitmasked, then this won't save any memory since x is still dense.

@k-ye
Member

k-ye commented May 1, 2020

  1. What if we want to maintain a mask for different t?

Yeah, then I guess a static mask variable won't work. In that case, I'm not sure whether a sparse data structure is advantageous in any way, since you would have to iterate through the entire mesh grid to figure out which places to activate on every diffuse() iteration?

  2. If we use pointer instead of bitmasked, then this won't save any memory since x is still dense.

I suppose you meant the other way around? That's true, but I don't see memory being a major concern in this issue.

@archibate
Collaborator

archibate commented May 2, 2020 via email

@archibate archibate added the question Question on using Taichi label May 5, 2020