Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SchemaDefinition.Create picks up internal fields instead of just public only - affects F# #6209

Open
fwaris opened this issue May 30, 2022 · 9 comments
Labels
F# Support of F# language needs-further-triage P2 Priority of the issue for triage purpose: Needs to be fixed at some point.
Milestone

Comments

@fwaris
Copy link

fwaris commented May 30, 2022

This used to work before but now I cannot use CreateFromEnumerable in F# now.

In F#, we define mutable classes by annotating F# 'record' types with CLIMutable:

[<CLIMutable>]
type D = 
   { 
         Data :  float[]
   }

F# compiler generates IL that looks as follows:

public class D  [ interface list  ] 
{
        internal Data@ = Double[];

  	public Double[] Data
	{
		get
		{
			return Data@;
		}
		set
		{
			Data@ = value;
		}
	}

     [ more methods including a default constructor ]
}

The SchemaDefinition.Create method picks up both "Data@" and "Data" as fields required by the schema. It should only pickup the public fields.

Please add unit tests for F# as well so these issues are caught earlier.

@dsyme

@ghost ghost added the untriaged New issue has not been triaged label May 30, 2022
@fwaris
Copy link
Author

fwaris commented May 30, 2022

I just found out that one workaround for the above issue is to set "ignoreMissingColumns=true" when calling CreateEnumerable(...)

@michaelgsharp
Copy link
Member

Glad you found a workaround, but yeah that isn't the behavior we want I don't think.

When did you notice this change? We haven't made a recent changes to SchemaDefinition.Create that I am aware of. I wonder if it has to do with a .NET version change? What version are you using now? Could you also try an older version?

@michaelgsharp michaelgsharp added F# Support of F# language needs-further-triage labels Jun 13, 2022
@fwaris
Copy link
Author

fwaris commented Jun 14, 2022 via email

@michaelgsharp
Copy link
Member

Since this is a change by the F# tooling there is nothing that we can do about it unfortunately. Closing this for now, but if things change in the future please create a new issue and we will take another look.

@ghost ghost removed the untriaged New issue has not been triaged label Jul 11, 2022
@dsyme
Copy link
Contributor

dsyme commented Jul 11, 2022

@michaelgsharp Just to check....

There are some situations where F# needs to emit public fields - notably when emitting code into multiple assemblies in F# Interactive.

My understanding is that serialization of fields should ignore

  • internal fields
  • private fields
  • public fields marked "CompilerGenerated".

This is what System.Json and Newtonsoft.Json both do. Does SchemaDefinition do that? It would be great if it does.

See also dotnet/fsharp#13494 which is related

@fwaris Do you know if you hit this issue when using F# Interactive?

Thanks
Don

@michaelgsharp
Copy link
Member

@dsyme let me reopen the issue for now while I look into that. I am not actually sure off of the top of my head.

@michaelgsharp michaelgsharp reopened this Jul 13, 2022
@ghost ghost added the untriaged New issue has not been triaged label Jul 13, 2022
@luisquintanilla luisquintanilla added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label Mar 29, 2023
@luisquintanilla luisquintanilla added this to the ML.NET Future milestone Mar 29, 2023
@ghost ghost removed the untriaged New issue has not been triaged label Mar 29, 2023
@Szer
Copy link

Szer commented Jul 20, 2024

Just in case, it is still an issue as all ML.NET guides advice to use records and they are breaking in F# interactive environment.
One have to know about hacky workaround passing --multiemit- to FSI which is not a common knowledge

@fwaris
Copy link
Author

fwaris commented Jul 20, 2024

Check out MLUtils that has some utility code to help use ML.Net from F#. You can use cleanSchema to remove '@' fields from the default schema

#r "nuget: MLUtils"
open MLUtils
type NRow  = { Field1:int; ...}
let xs : NRow list = []
let dv = ctx.Data.LoadFromEnumerable(xs,Schema.cleanSchema typeof<NRow>)

Going the other way - from IDataView to F# record - ignoreMissingColumns flag works.

 ctx.Data.CreateEnumerable<NRow>(dv,false,ignoreMissingColumns=true)

@fwaris
Copy link
Author

fwaris commented Jul 20, 2024

@michaelgsharp Just to check....

There are some situations where F# needs to emit public fields - notably when emitting code into multiple assemblies in F# Interactive.

My understanding is that serialization of fields should ignore

  • internal fields
  • private fields
  • public fields marked "CompilerGenerated".

This is what System.Json and Newtonsoft.Json both do. Does SchemaDefinition do that? It would be great if it does.

See also dotnet/fsharp#13494 which is related

@fwaris Do you know if you hit this issue when using F# Interactive?

Thanks Don

@dsyme, sorry, did not see this until now.
There is a workaround. See the last few messages in this conversation for some notes on the MLUtils package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F# Support of F# language needs-further-triage P2 Priority of the issue for triage purpose: Needs to be fixed at some point.
Projects
None yet
Development

No branches or pull requests

5 participants