Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing LY_CTX_BASIC_PLUGINS_ONLY & LYPLG_TYPE_STORE_ONLY #2171

Merged
merged 1 commit into from
Feb 21, 2024

Conversation

steweg
Copy link
Contributor

@steweg steweg commented Feb 13, 2024

This patch separates the build-in type validations from store operations to allow storing a node even if the value doesn't pass the validations. To do that a new context option was introduced:
LY_CTX_STORE_INVALID_DATA

This option also prevent loading of advanced plugins, which cannot really separate store and validation operations

@steweg
Copy link
Contributor Author

steweg commented Feb 13, 2024

Some background:
We have briefly discussed this in #2169. It turned out that opaque node was usable, but only for one use-case to evaluate the "when" condition, without having and lyd_node instance. Unfortunately, I realized that I need it also for cases:

  • to evaluate possible leafref targets
  • to be able to add implicit default values
  • to be able to create whole data tree of values, which comply with just type itself (string is string, int8 is int8...) with no advanced validations during creation/storing
  • to be able to identify associated schema from any of the data node.
  • to be able to create data nodes from dictionary using libyang-python dict_to_dnode, which effectively uses lyd_new_term, lyd_new_list, lyd_new_inner calls.

This last one is not really a problem of libyang project. It's more related to nature of usage of libyang-python + libyang:

  • which limits users of libyang-python to access and modify libyang structures directly
  • the way of concept of how libyang-python objects are being instantiated (many libyang-python objects shares the same libyang objects), preventing effective extension of libyang-python objects to support storing of additional data that will not affect libyang parts and having 1:1 relationship with libyang objects.

For these cases usage of an opaque node is very problematic as it requires to store schema in priv and perform complete tree traversal with temporarily putting a schema value, evaluating, and removing it back. It also can have side effects, as during the evaluation opaque node is wrongly considered as real node, so if anything in the code tries to e.g. access the value of opaq node, it will cause segfaults/invalid reads

@steweg steweg marked this pull request as draft February 13, 2024 10:12
src/context.h Outdated Show resolved Hide resolved
src/context.h Outdated Show resolved Hide resolved
src/context.h Outdated Show resolved Hide resolved
@steweg steweg force-pushed the feature/built-in-validations branch from d65efe2 to a8f7f4a Compare February 13, 2024 12:02
@steweg steweg marked this pull request as ready for review February 13, 2024 12:17
Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think your use-case can be supported but using a different design that may even improve things overall. It can be divided into 2 independent parts:

  1. New context flag for it to avoid using (loading) any type plugins except for the basic ones for the built-in YANG types. I believe this should also include external plugins (which you have not disabled in this PR).

  2. New internal type plugin option LYPLG_TYPE_STORE_ONLY which would be set for every store callback called when parsing data with LYD_PARSE_ONLY and would mean that the callback can skip all validation tasks that would not prevent the value from being stored successfully.

    You have mostly implemented this already leading to more validate callbacks, which would not be called in store if the new option is set (and never return LY_EINCOMPLETE), called otherwise. This would lead to another important change in the background, some validate callbacks would need the data tree (current ones) and some not (the new ones). I am not certain this is relevant and everything will probably be fine if this difference is ignored and the data are always provided to validate (because it would normally be called as part of store when data are not needed just like now).

    This will also lead to more efficient LYD_PARSE_ONLY that would skip redundant type validation. Also, as you mentioned, you will need this new functionality not only as part of lyd_parse() so you can add functions such as lyd_new_term_store_only(), not sure if you need anything else.

PS: Sorry it took me a while, busy last few days.

@steweg
Copy link
Contributor Author

steweg commented Feb 14, 2024

On 1st point:

  • I have kept the external plugins on purpose. The reason is that user have a control about this. So if user provide his own plugin, he knows it's behavior and must take it into account. The problematic part is only advanced plugins within libyang, which you currently cannot really control from user perspective.
  • I was also thinking about possibility to have different set of plugins per context, but this can be done as a separate patch in future (if needed), as well as some more granular filter of which plugins to load and which not (kind of deny-list or allow-list...)

On 2nd point:

  • creation of separate parse flag on plugin level is kind of ok. The only drawback will be that no all plugins will support it, which could be misleading to user. Or do you want me to perform check in all advanced plugin, so that if user will try to use that flag with advanced plugin, it will fail with LY_EDENIED?
  • Also how you want to make sure that it is internal? Currently all the flags/options are cascaded from user API calls, so I am not sure if you wish to put additional argument to store function definition (which will cause backward compatibility issue) or how exactly you want to do this
  • I have put the validations calls into the store functions to avoid need to have duplicate code. If you prefer duplicate code, then I can rewrite it as well, just let me know
  • Return of LY_EINCOMPLETE was already used by leafref plugin, that's why I have followed the same logic, as it seem to be the most generic way. If we don't return this code, caller will not be aware if the validation call is needed or not after the storing. To do that, it will be needed to combine information of what exact type of plugin was called + whether new flag was used + return code (something similar what I have done in schema_compile.c). Or simply always call validation regardless of everything. The cleanest alternative would be to totally separate storing and validations, but in that case we would introduce backward compatibility issue. And again introduce problem of what to do with advanced plugin that cannot really separate those operations. So I still think, that current solution with using of LY_EINCOMPLETE is the correct to avoid all side-effects. Please confirm how to proceed from your PoV

@michalvasko
Copy link
Member

  1. I am trying to make the type plugins have only the new single option to implement (LYPLG_TYPE_STORE_ONLY) meaning the user should not need to control anything. To keep it simple, they would have just the option to disable all plugins except for built-in types, which should be fine for your use-case. Or am I missing something?

  2. I want all the plugins to support it, naturally. Yes, ipv4-address plugin will not be able to store strings even with the flag, which is why you need the option to disable the plugin in point 1. but that is all fine.

    There are already options for the store callback so all I am suggesting is to add a new one. It is technically public but only for the plugin API.

    I said that is all fine and actually can stay in the design I am suggesting, just calling the validation callback in store in case LYPLG_TYPE_STORE_ONLY is not set, not calling it if it is.

    Again, I am only suggesting to keep the current behavior, mostly. In case LYPLG_TYPE_STORE_ONLY the caller does not care that the value may not be valid, that is the point of the flag, it should only be stored. So it is redundant to return it, that is all I said before. And yes, completely separating validation and storing is what I am talking about. For cases when it is not possible it will all be part of storing (and LYPLG_TYPE_STORE_ONLY will not really be supported in the sense that it will succeed only if the value is valid as well), I do not see any issue with this behavior. I do not see how this is backwards-incompatible (the validation callback was always optional and adding it for some types does not cause incompatibility that I know of) but it actually does not matter because we are in the process of releasing a new SO version 3 so it can be a part of that.

@steweg steweg force-pushed the feature/built-in-validations branch from a8f7f4a to 8e216c9 Compare February 16, 2024 08:43
@steweg steweg changed the title Introducing LY_CTX_STORE_INVALID_DATA Introducing LY_CTX_BASIC_PLUGINS_ONLY & LYPLG_TYPE_STORE_ONLY Feb 16, 2024
@steweg steweg force-pushed the feature/built-in-validations branch from 8e216c9 to 8860204 Compare February 16, 2024 09:04
Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without looking at this in much detail, I meant it when I said that you can add new functions for the store-only flag, please do not modify the current ones. This is not about backwards-compatibility, they already have one annoying parameter (output), they do not need another. Also, this makes sense only for a few functions such as lyd_new_term() and okay, even lyd_new_list() or lyd_new_meta().

@steweg
Copy link
Contributor Author

steweg commented Feb 16, 2024

Well adding a new variant of public calls seem to be quite odd..... how about converting ly_bool output in lyd_new_ to uint32_t options or uint8_t options? This way we can convert output to LYD_NEW_LIST_OUTPUT = 1 and introduce LYD_NEW_LIST_STORE_ONLY = 2 and kind of maintain backward compatibility.

Or do it totally different and say that all lyd_new_* API will always imply LYPLG_TYPE_STORE_ONLY usage

@michalvasko
Copy link
Member

michalvasko commented Feb 16, 2024

Or do it totally different and say that all lyd_new_* API will always imply LYPLG_TYPE_STORE_ONLY usage

Was thinking about that but I do not think it is really desired, in general, to not get an error on an invalid value.

But the options parameter is a good idea. It would allow for removal of lyd_new_(term|list)_(bin|canon)() as well. So we end up with options: LYD_NEW_OUTPUT, LYD_NEW_VAL_STORE_ONLY, LYD_NEW_VAL_BIN, and LYD_NEW_VAL_CANON.

One just needs to sanitize the options because some should not be combined. Specifically:

  • LYD_NEW_VAL_STORE_ONLY and LYD_NEW_VAL_CANON - for store_only the canonical value should not be generated because it may not be possible on invalid values
  • LYD_NEW_VAL_STORE_ONLY and LYD_NEW_VAL_BIN - binary value must always be canonical and valid
  • LYD_NEW_VAL_CANON and LYD_NEW_VAL_BIN - value is either binary or a canonical string

Also, the LYD_NEW_VAL_* options will make no sense for lyd_new_inner(). I think this will simplify the API and make it easily extensible (make the options uint32_t to be safe), thanks for your effort.

Actually, these options should then be merged with LYD_NEW_PATH_* options because it can use all of these and has some specific ones. So, for duplicate options I would remove them from LYD_NEW_PATH_* and keep the specific ones the way they are now keeping the prefix LYD_NEW_PATH_*.

@steweg
Copy link
Contributor Author

steweg commented Feb 16, 2024

One side question coming from utest modifications. Is there a reason why lyd_parse_op_ is always using hardcoded LYD_PARSE_ONLY? As LYD_PARSE_ONLY implies LYPLG_TYPE_STORE_ONLY this could have a potential side-effects. Is this ok to keep it like this or should it be exposed to user via some new options field?

@michalvasko
Copy link
Member

Right, not sure about that one. I do not think it is such a problem that LYPLG_TYPE_STORE_ONLY would be used but if possible, it should not in this case. However, I am not sure it is worth adding a new parameter lyd_parse_op().

A solution may be adding a new parsing flag, one would use LYPLG_TYPE_STORE_ONLY, the other not, and both would skip the full validation. For some backwards compatibility, maybe add LYD_PARSE_STORE_ONLY that would use LYPLG_TYPE_STORE_ONLY and keep LYD_PARSE_ONLY without it? Seems a bit confusing but should be okay with a clear documentation.

@steweg
Copy link
Contributor Author

steweg commented Feb 17, 2024

I have accidentally push incomplete patch, no review is needed yet. I will let you know once finished on my side

@steweg steweg force-pushed the feature/built-in-validations branch from b50a0f6 to d0eb796 Compare February 17, 2024 17:34
@steweg
Copy link
Contributor Author

steweg commented Feb 17, 2024

Ok, now it should be the ready for review

@steweg steweg force-pushed the feature/built-in-validations branch 2 times, most recently from a6fab9c to e9c8c5e Compare February 19, 2024 07:31
Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So again, thanks for the effort, a lot of changes were required. Unfortunately, I have not realized that adding the binary value option to the new functions would add another parameter (value length) and make the value void * instead of char *. I would really prefer to avoid changes requiring rewriting of all these functions usage in existing code (as long as no options are needed) so I have to ask you to keep this single variant (_bin) in a separate function. However, the documentation still needs to be improved significantly and I see no point in adding all those comments so I will do that separately after merging this PR and I could also make the binary value change, up to you.

Another thing, please change the target branch to new_so, these changes cannot simply be merged into devel, the other projects needs to be synced first with the changes.

Additional note, lyplg_type_check_hints() should probably be a part of validate callbacks as well, it does not affect the storing process.

src/plugins_exts/schema_mount.c Outdated Show resolved Hide resolved
src/tree_data_new.c Outdated Show resolved Hide resolved
src/tree_data_new.c Outdated Show resolved Hide resolved
src/tree_data_new.c Outdated Show resolved Hide resolved
@steweg
Copy link
Contributor Author

steweg commented Feb 20, 2024

I have put back the lyd_new_term_bin(), with just minor update to use options. And updated restrictions of lyd_new_term() to be not working with binary value. Rebased on new_so and squashed my patches into one.

I deliberately kept lyplg_type_check_hints within the store, just to force user to put value, which really is of given type (so if I put value of "asd" to node of type int8, I get an error).

@steweg steweg force-pushed the feature/built-in-validations branch from e9c8c5e to 2f8f0d8 Compare February 20, 2024 12:36
@steweg steweg changed the base branch from devel to new_so February 20, 2024 12:36
@steweg steweg force-pushed the feature/built-in-validations branch from 2f8f0d8 to aa5419d Compare February 20, 2024 12:53
Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there, hopefully some final adjustments needed.

src/context.c Outdated Show resolved Hide resolved
src/context.h Outdated Show resolved Hide resolved
src/tree_data.h Show resolved Hide resolved
@michalvasko
Copy link
Member

Regarding lyplg_type_check_hints(), it should not cause problems to skip it when storing the value. For int, for example, the value is always parsed (to check it can be stored) and that will fail for any invalid value.

I am talking mainly about encoding problems that if a number is encoded as a string, it can still be stored and the validation should fail. However, hints are not available in the validate callback and it is not worth it to add them there so leave it the way it is then.

This patch separates the build-in type validations from store
operations to allow storing a node even if the value doesn't pass the
validations. To do multiple changes were done:
 - introduces LY_CTX_BASIC_PLUGINS_ONLY flag, which prevents loading of
   any advanced plugins
 - introduces LYPLG_TYPE_STORE_ONLY flag, which skip validation and
   perform just storing
 - introduces LYD_PARSE_STORE_ONLY flag, which implies usage of LYPLG_TYPE_STORE_ONLY
 - introduces options for lyd_new_* API
 - removes lyd_new_(term|list)_(bin|canon) API
 - added sanity checks within lyd_new_(term|list) APIs for proper
   combinations of options
 - refactored lyd_new_* APIs to use common flags to use common options attributes
@steweg steweg force-pushed the feature/built-in-validations branch from aa5419d to b970d5a Compare February 20, 2024 15:16
Copy link
Member

@michalvasko michalvasko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine now, thanks, I will now improve the documentation myself.

@michalvasko michalvasko merged commit e1d790e into CESNET:new_so Feb 21, 2024
1 check passed
michalvasko pushed a commit that referenced this pull request Feb 26, 2024
)

This patch separates the build-in type validations from store
operations to allow storing a node even if the value doesn't pass the
validations. To do multiple changes were done:
 - introduces LY_CTX_BASIC_PLUGINS_ONLY flag, which prevents loading of
   any advanced plugins
 - introduces LYPLG_TYPE_STORE_ONLY flag, which skip validation and
   perform just storing
 - introduces LYD_PARSE_STORE_ONLY flag, which implies usage of LYPLG_TYPE_STORE_ONLY
 - introduces options for lyd_new_* API
 - removes lyd_new_(term|list)_(bin|canon) API
 - added sanity checks within lyd_new_(term|list) APIs for proper
   combinations of options
 - refactored lyd_new_* APIs to use common flags to use common options attributes
@steweg steweg deleted the feature/built-in-validations branch April 5, 2024 07:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants