
Better documentation of input/output dimensions #76

Closed
scheidan opened this issue Jul 4, 2022 · 11 comments · Fixed by #77

Comments

@scheidan

scheidan commented Jul 4, 2022

Thanks for this great package! I think the documentation could be improved by defining the expected input dimensions more precisely.

I think something like this would help a lot:

Operator kernel layer

$$v_{t+1}(x) = \sigma(W v_t(x) + \mathcal{K} \{ v_t(x) \} )$$

where $x \in \mathbb{R}^d$, $v_t \in \mathbb{R}^{c_{in}}$ and $v_{t+1} \in \mathbb{R}^{c_{out}}$.

Arguments

  • ch: A `Pair` of input (`c_in`) and output (`c_out`) channel sizes, e.g. `64=>64`.
    
  • modes: The modes to be preserved. A tuple of length `d`.
    
  • ...
    
  • permuted: Whether the dimensions are permuted. If `permuted=true`, the layer accepts data in the order `(c_in, x_1, ..., x_d, batch)`; otherwise the order is `(x_1, ..., x_d, c_in, batch)` (see the sketch below).
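
To make the two layouts concrete, the docstring could also include a short shape demo. This is a minimal sketch that assumes a constructor of the form `OperatorKernel(ch, modes, Transform; permuted)` with a `FourierTransform` argument and follows the layout convention proposed above; the exact signature may differ between package versions:

```julia
using NeuralOperators

# Hypothetical shape demo for a 1-d problem (d = 1): 64 grid points,
# c_in = 2 channels, c_out = 32 channels, batch size 5.
# Constructor signature and layouts are assumptions based on this proposal.

# Default layout: (x_1, ..., x_d, c_in, batch)
layer = OperatorKernel(2 => 32, (16,), FourierTransform)
v = rand(Float32, 64, 2, 5)          # (x_1, c_in, batch)
layer(v)                             # expected size: (64, 32, 5)

# Permuted layout: (c_in, x_1, ..., x_d, batch)
layer_p = OperatorKernel(2 => 32, (16,), FourierTransform; permuted = true)
v_p = rand(Float32, 2, 64, 5)        # (c_in, x_1, batch)
layer_p(v_p)                         # expected size: (32, 64, 5)
```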
    
@scheidan
Author

scheidan commented Jul 4, 2022

(sorry, I'm not sure why the format is off)

@ChrisRackauckas
Member

Or more directly, it should conform to what's described in the SciML Style for this: https://github.com/SciML/SciMLStyle#documentation .

The new SciMLSensitivity.jl documentation is a good model to follow: https://sensitivity.sciml.ai/dev/

@foldfelis
Contributor

foldfelis commented Jul 5, 2022

Actually, I am also thinking about not supporting the `permuted` option, to keep things simple... 😕
I'll improve the doc later.

@scheidan
Author

scheidan commented Jul 5, 2022

Thanks for looking into this! Some other remarks regarding the documentation:

The docs of the "models" FourierNeuralOperator and MarkovNeuralOperator could use some improvements as well. For example, I somehow expected that I could define a model with more or fewer layers depending on the length of the ch argument.

Looking at the code, however, it seems that both functions define a fixed architecture. It is also interesting that they are almost the same, which is very surprising given the very different names! (There seems to be nothing especially Markovian in MarkovNeuralOperator; it's more the application.)

It may be more transparent to remove these wrapper functions from the API and instead define them in the application examples/tutorials.

@foldfelis
Contributor

@scheidan As for the models: on the one hand, they serve as documentation for how to integrate OperatorKernel with other layers. On the other hand, the reason why FourierNeuralOperator and MarkovNeuralOperator work is described in the original paper. I have also tried some other architectures, including increasing or decreasing the number of OperatorKernel layers, but they all turned out to give worse results. Therefore, I decided to implement the model APIs with a fixed architecture (maybe the SOTA architecture). The model APIs provide an easy way to get started quickly with NeuralOperators.jl. Advanced users can build their own model with the OperatorKernel layer and regard the model APIs as documentation 😄
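
For instance, a custom stack might look like the following. This is a hypothetical sketch, assuming Flux's `Chain`/`Dense` and an `OperatorKernel(ch, modes, Transform, σ)` signature; the layer count and widths are illustrative, not the fixed architecture of the model APIs:

```julia
using Flux, NeuralOperators

# Hypothetical custom model: lift 2 input channels to 64, apply a stack of
# operator-kernel layers, then project back down to 1 output channel.
# Widths, depth, and signatures are illustrative assumptions.
model = Chain(
    Dense(2, 64),                                         # lifting layer
    OperatorKernel(64 => 64, (16,), FourierTransform, gelu),
    OperatorKernel(64 => 64, (16,), FourierTransform, gelu),
    OperatorKernel(64 => 64, (16,), FourierTransform),
    Dense(64, 1),                                         # projection layer
)
```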

@scheidan
Author

scheidan commented Jul 6, 2022

Now I understand your reasoning for the models, makes sense! :)

A short sentence like:
"The models are based on the layers described above and implement the architecture used in the referenced paper."
should help to clarify the issue.

@foldfelis
Contributor

@scheidan Thanks for the advice. Please take a look at the PR.

@scheidan
Author

scheidan commented Jul 6, 2022

That's a big improvement, thanks! Especially the models should be clear now.
Unfortunately, I could not comment on the PR directly (or I did not know how to :)

There are two points that are not well defined yet:

  • "A tuple of length c where d is the dimension of data." What is the dimension of the data? The channels are also a dimension...
  • "layer accepts data in the order of (ch, ..., batch)." Here we should explain what the ... stands for. This was the most confusing point for me at the beginning.

To avoid this confusion, I proposed above that we first define d, c_in, c_out and x_1, ..., x_d in mathematical terms using the equation for the operator kernel. Then we can write

(c_in, x_1, ..., x_d, batch)

@foldfelis
Contributor

foldfelis commented Jul 6, 2022

I can understand that this is not that straightforward for people who are not that familiar with neural operators. But I am not quite sure what I can do. The "dimension of data" here is literally the dimension of the "data". The channel, on the other hand, is one of the dimensions of the "input".

The reason why the "channel" appears in the "input" dimensions is that one needs to concatenate the grid information to the "data". This is also mentioned in the original paper, with the definition of d:

The initialization v0(x) to our network (6) can be viewed as the initial guess we make for the solution u(x) as well as any other dependence we want to make explicit. A natural choice is to start with the coefficient a(x) itself as well as the position in physical space x. This (d + 1)-dimensional vector field is then lifted to a n-dimensional vector field, an operation which we may view as the first layer of the overarching neural network. This is then used as an initialization to the kernel neural network, which is iterated T times. In the final layer, we project back to the scalar field of interest with another neural network layer.

So, do I need to copy that again in the doc?
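
Maybe a small sketch of that lifting step in the doc would help: for d = 1, the field a(x) is concatenated with the grid x itself along the channel dimension, which is why the input has d + 1 = 2 channels. Names and shapes below are illustrative assumptions only, not the package's preprocessing code:

```julia
# Hypothetical illustration of the (d + 1)-channel input for d = 1:
# concatenate the field a(x) with the grid coordinates x along the
# channel dimension (layout (x_1, ch, batch) as proposed above).
nx, nbatch = 1024, 8
a = rand(Float32, nx, 1, nbatch)                 # a(x): (x_1, ch = 1, batch)
x = reshape(collect(range(0f0, 1f0; length = nx)), nx, 1, 1)
grid = repeat(x, 1, 1, nbatch)                   # same grid for every sample
v0 = cat(a, grid; dims = 2)                      # (x_1, ch = d + 1 = 2, batch)
@assert size(v0) == (nx, 2, nbatch)
```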

@foldfelis
Contributor

@scheidan Please take a look at the PR and see if this works for you 😄

@foldfelis
Contributor

I would like to merge and temporarily close this issue. Please feel free to re-open the issue if needed.
