Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow changing of artifact type in ArtifactIndex #1681

Closed
umesh-timalsina opened this issue May 5, 2020 · 9 comments
Closed

Allow changing of artifact type in ArtifactIndex #1681

umesh-timalsina opened this issue May 5, 2020 · 9 comments

Comments

@umesh-timalsina
Copy link
Contributor

Similar to dblclick handler for name change, we should be able to change the Artifact's type as well.

@brollb
Copy link
Contributor

brollb commented May 5, 2020

It would be nice if we could detect it automatically since it only makes sense to change it when it is unset. That is, it doesn't really make sense to change the value when it is set correctly as it will cause inconsistencies when that data is used in a pipeline. (A simple "echo" operation would update the type to the actual value and then a different serializer could be used.)

@brollb
Copy link
Contributor

brollb commented May 5, 2020

That said, this works in the meantime...

@umesh-timalsina
Copy link
Contributor Author

umesh-timalsina commented May 5, 2020

This would be helpful if you decide to import an artifact first and implement a loader serialization module later.

It would be nice if we could detect it automatically since it only makes sense to change it when it is unset.

Do you mean automatically detecting the type upon import? Or on the output of an operation?

@brollb
Copy link
Contributor

brollb commented May 5, 2020

This would be helpful if you decide to import an artifact first and implement a loader serialization module later.

Why can't you? Maybe I don't understand correctly - you should be able to upload an artifact and then specify the serializer (as long as you specify the serializer before you use it in an operation).

I was referring to detect on import. The downside is that it could be quite slow to import if we need to run a small python job on something like SciServer (as submitting a job takes a little time). (It already is detected when it is the output of an operation.) Perhaps the bigger problem is the chick-and-egg problem that you get into when trying to check the type; you need to know the type to deserialize it and you would need to deserialize it to be able to detect the type.

@umesh-timalsina
Copy link
Contributor Author

umesh-timalsina commented May 5, 2020

Why can't you? Maybe I don't understand correctly - you should be able to upload an artifact and then specify the serializer (as long as you specify the serializer before you use it in an operation).

Currently, an option to specify a Data Type while running UploadArtifact or ImportArtifact is the way to specify Data Type that I know of. But, I don't know a way to change the Data Type of an artifact when using it in an operation Input node or an Operation pointer. Is there a way to do that?

@brollb
Copy link
Contributor

brollb commented May 5, 2020

So, I am not convinced that we should change the data type as it doesn't actually change the data itself and is just a metadata field.

Is there a case where there is more than one correct value for the data type field? Or where we would want to change it from a correct value to another value?

@umesh-timalsina
Copy link
Contributor Author

Is there a case where there is more than one correct value for the data type field? Or where we would want to change it from a correct value to another value?

I cannot think of one for the first case. For the second case, a type of Unknown could be changed to another type.

@brollb
Copy link
Contributor

brollb commented May 6, 2020

If the type information is missing (I believe that is what you mean by Unknown, right?), it certainly makes sense to be able to add it. However, that isn't really a compelling answer to the second question since I don't consider a missing field to have the correct value. Does that make sense?

For now, I think this will be fine to add because we aren't validating the type information so we cannot ensure that it is correct. As we don't know that it is correct, it may be appropriate for the user to modify it. My concern is that this could be abused to insert invalid values for artifacts and be a bit confusing for users as this could cause deserialization errors.

Perhaps it would be worthwhile to validate the types for uploaded/imported artifacts... Unfortunately, this could be slow and would make the process of uploading data more cumbersome...

@umesh-timalsina
Copy link
Contributor Author

umesh-timalsina commented May 6, 2020

If the type information is missing (I believe that is what you mean by Unknown, right?), it certainly makes sense to be able to add it. However, that isn't really a compelling answer to the second question since I don't consider a missing field to have the correct value. Does that make sense?

Yeah, it makes sense. However, there could be some value in changing the artifact type instead of uploading/importing it again.

For now, I think this will be fine to add because we aren't validating the type information so we cannot ensure that it is correct. As we don't know that it is correct, it may be appropriate for the user to modify it. My concern is that this could be abused to insert invalid values for artifacts and be a bit confusing for users as this could cause deserialization errors.

We could use a Dialog/Prompt warning that this could lead to deserialization errors. Or even better, if we could validate that in some way, either my checking whether the changed value is available and registered as a serialization module that would be nice as well.

The other route we could take is only allow this change to happen once i.e. In other words, you could switch from to default serializer to the custom one but that's not logical, as that kind of defies the metamodel IMO.

Perhaps it would be worthwhile to validate the types for uploaded/imported artifacts... Unfortunately, this could be slow and would make the process of uploading data more cumbersome...

It might be worthwhile to have a separate plugin for validation and we run it before the job is executed. I would be curious to know your suggestions to make it little more efficient / less cumbersome.

@brollb brollb closed this as completed in 361720f May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants