
Better documentation of input/output dimensions #76

Closed
scheidan opened this issue Jul 4, 2022 · 11 comments · Fixed by #77

Comments

@scheidan

scheidan commented Jul 4, 2022

Thanks for this great package! I think the documentation could be improved by defining the expected input dimensions more precisely.

I think something like this would help a lot:

Operator kernel layer

$$v_{t+1}(x) = \sigma(W v_t(x) + \mathcal{K} \{ v_t(x) \} )$$

where $x \in \mathbb{R}^d$, $v_t \in \mathbb{R}^{c_{in}}$ and $v_{t+1} \in \mathbb{R}^{c_{out}}$.

Arguments

  • ch: A `Pair` of input (`c_in`) and output (`c_out`) channel sizes, e.g. `64=>64`.
    
  • modes: The modes to be preserved. A tuple of length `d`.
    
  • ...
    
  • permuted: Whether the dimensions are permuted. If `permuted=true`, the layer accepts data in the order `(c_in, x_1, ..., x_d, batch)`; otherwise the order is `(x_1, ..., x_d, c_in, batch)` (see the sketch below).
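
To make the two layouts concrete, the docstring could also include a short shape demo. This is a minimal sketch that assumes a constructor of the form `OperatorKernel(ch, modes, Transform; permuted)` with a `FourierTransform` argument and follows the layout convention proposed above; the exact signature may differ between package versions:

```julia
using NeuralOperators

# Hypothetical shape demo for a 1-d problem (d = 1): 64 grid points,
# c_in = 2 channels, c_out = 32 channels, batch size 5.
# Constructor signature and layouts are assumptions based on this proposal.

# Default layout: (x_1, ..., x_d, c_in, batch)
layer = OperatorKernel(2 => 32, (16,), FourierTransform)
v = rand(Float32, 64, 2, 5)          # (x_1, c_in, batch)
layer(v)                             # expected size: (64, 32, 5)

# Permuted layout: (c_in, x_1, ..., x_d, batch)
layer_p = OperatorKernel(2 => 32, (16,), FourierTransform; permuted = true)
v_p = rand(Float32, 2, 64, 5)        # (c_in, x_1, batch)
layer_p(v_p)                         # expected size: (32, 64, 5)
```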
    
@scheidan
Author

scheidan commented Jul 4, 2022

(sorry, I'm not sure why the format is off)

@ChrisRackauckas
Member

Or more directly, it should conform to what's described in the SciML Style for this: https://github.com/SciML/SciMLStyle#documentation .

The new SciMLSensitivity.jl documentation is a good model to follow: https://sensitivity.sciml.ai/dev/

@foldfelis
Contributor

foldfelis commented Jul 5, 2022

Actually, I am also thinking about not supporting the `permuted` option, to keep things simple... 😕
I'll improve the doc later.

@scheidan
Author

scheidan commented Jul 5, 2022

Thanks for looking into this! Some other remarks regarding the documentation:

The docs of the "models" FourierNeuralOperator and MarkovNeuralOperator could use some improvements as well. For example, I somehow expected that I could define a model with more or fewer layers depending on the length of the ch argument.

Looking at the code, however, it seems that both functions define a fixed architecture. It is also interesting that they are almost the same, which is very surprising given the very different names! (There seems to be nothing especially Markovian in MarkovNeuralOperator; it's more the application.)

It may be more transparent to remove these wrapper functions from the API and instead define them in the application examples/tutorials.

@foldfelis
Contributor

@scheidan As for the models: on the one hand, they serve as documentation for how to integrate OperatorKernel with other layers. On the other hand, the reason why FourierNeuralOperator and MarkovNeuralOperator work is described in the original paper. I have also tried some other architectures, including increasing or decreasing the number of OperatorKernel layers, but they all turned out to give worse results. Therefore, I decided to implement the model APIs with a fixed architecture (maybe the SOTA architecture). The model APIs provide an easy way to get started quickly with NeuralOperators.jl. Advanced users can build their own model with the OperatorKernel layer and regard the model APIs as documentation 😄
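
For instance, a custom stack might look like the following. This is a hypothetical sketch, assuming Flux's `Chain`/`Dense` and an `OperatorKernel(ch, modes, Transform, σ)` signature; the layer count and widths are illustrative, not the fixed architecture of the model APIs:

```julia
using Flux, NeuralOperators

# Hypothetical custom model: lift 2 input channels to 64, apply a stack of
# operator-kernel layers, then project back down to 1 output channel.
# Widths, depth, and signatures are illustrative assumptions.
model = Chain(
    Dense(2, 64),                                         # lifting layer
    OperatorKernel(64 => 64, (16,), FourierTransform, gelu),
    OperatorKernel(64 => 64, (16,), FourierTransform, gelu),
    OperatorKernel(64 => 64, (16,), FourierTransform),
    Dense(64, 1),                                         # projection layer
)
```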

@scheidan
Author

scheidan commented Jul 6, 2022

Now I understand your reasoning for the models, makes sense! :)

A short sentence like:
"The models are based on the layers described above and implement the architecture used in the referenced paper."
should help to clarify the issue.

@foldfelis
Contributor

@scheidan Thanks for the advice. Please take a look at the PR.

@scheidan
Author

scheidan commented Jul 6, 2022

That's a big improvement, thanks! Especially the models should be clear now.
Unfortunately, I could not comment on the PR directly (or I did not know how to :)

There are two points that are not well defined yet:

  • "A tuple of length c where d is the dimension of data." What is the dimension of the data? The channels are also a dimension...
  • "layer accepts data in the order of (ch, ..., batch)." Here we should explain what the ... stands for. This was the most confusing point for me at the beginning.

To avoid this confusion, I proposed above that we first define d, c_in, c_out and x_1, ..., x_d in mathematical terms using the equation for the operator kernel. Then we can write

(c_in, x_1, ..., x_d, batch)

@foldfelis
Contributor

foldfelis commented Jul 6, 2022

I can understand that this is not that straightforward for people who are not that familiar with neural operators. But I am not quite sure what I can do. The "dimension of data" here is literally the dimension of the "data". The channel, on the other hand, is one of the dimensions of the "input".

The reason why the "channel" appears in the "input" dimensions is that one needs to concatenate the grid information to the "data". This is also mentioned in the original paper, with the definition of d:

The initialization v0(x) to our network (6) can be viewed as the initial guess we make for the solution u(x) as well as any other dependence we want to make explicit. A natural choice is to start with the coefficient a(x) itself as well as the position in physical space x. This (d + 1)-dimensional vector field is then lifted to a n-dimensional vector field, an operation which we may view as the first layer of the overarching neural network. This is then used as an initialization to the kernel neural network, which is iterated T times. In the final layer, we project back to the scalar field of interest with another neural network layer.

So, do I need to copy that again in the doc?
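
Maybe a small sketch of that lifting step in the doc would help: for d = 1, the field a(x) is concatenated with the grid x itself along the channel dimension, which is why the input has d + 1 = 2 channels. Names and shapes below are illustrative assumptions only, not the package's preprocessing code:

```julia
# Hypothetical illustration of the (d + 1)-channel input for d = 1:
# concatenate the field a(x) with the grid coordinates x along the
# channel dimension (layout (x_1, ch, batch) as proposed above).
nx, nbatch = 1024, 8
a = rand(Float32, nx, 1, nbatch)                 # a(x): (x_1, ch = 1, batch)
x = reshape(collect(range(0f0, 1f0; length = nx)), nx, 1, 1)
grid = repeat(x, 1, 1, nbatch)                   # same grid for every sample
v0 = cat(a, grid; dims = 2)                      # (x_1, ch = d + 1 = 2, batch)
@assert size(v0) == (nx, 2, nbatch)
```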

@foldfelis
Contributor

@scheidan Please take a look at the PR and see if this works for you 😄

@foldfelis
Contributor

I would like to merge and temporarily close this issue. Please feel free to re-open the issue if needed.
