SchemaDefinition.Create picks up internal fields instead of just public only - affects F# #6209
Comments
I just found out that one workaround for the above issue is to set "ignoreMissingColumns=true" when calling CreateEnumerable(...).
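Glad you found a workaround, but yeah, that isn't the behavior we want, I don't think. When did you notice this change? We haven't made any recent changes to SchemaDefinition.Create that I am aware of. I wonder if it has to do with a .NET version change? What version are you using now? Could you also try an older version?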
I think the change was in the F# tooling rather than ML.NET (rethinking the issue). Some newer version of F# (maybe 5.x) changed the IL generation for mutable records, surfacing this issue.
Totally agree that only public fields should be considered for serialization/deserialization.
Note also that, as-is, when creating a DataView (LoadFromEnumerable…), the xxx@ fields also get added to the dataset, bloating its size.
Since this is a change by the F# tooling, there is nothing that we can do about it, unfortunately. Closing this for now, but if things change in the future, please create a new issue and we will take another look.
@michaelgsharp Just to check... There are some situations where F# needs to emit public fields, notably when emitting code into multiple assemblies in F# Interactive. My understanding is that serialization of fields should ignore
This is what System.Json and Newtonsoft.Json both do. Does
See also dotnet/fsharp#13494, which is related.
@fwaris Do you know if you hit this issue when using F# Interactive? Thanks
@dsyme let me reopen the issue for now while I look into that. I am not actually sure off the top of my head.
Just in case: this is still an issue, as all ML.NET guides advise using records, and they break in the F# Interactive environment.
Check out MLUtils, which has some utility code to help use ML.Net from F#. You can use cleanSchema to remove '@' fields from the default schema:
#r "nuget: MLUtils"
open MLUtils
type NRow = { Field1:int; ...}
let xs : NRow list = []
let dv = ctx.Data.LoadFromEnumerable(xs,Schema.cleanSchema typeof<NRow>)
Going the other way, from IDataView to F# record, the ignoreMissingColumns flag works:
ctx.Data.CreateEnumerable<NRow>(dv,false,ignoreMissingColumns=true)
@dsyme, sorry, did not see this until now.
This used to work, but now I cannot use CreateFromEnumerable in F#.
In F#, we define mutable classes by annotating F# 'record' types with CLIMutable:
The F# compiler generates IL that looks as follows:
The SchemaDefinition.Create method picks up both "Data@" and "Data" as fields required by the schema. It should only pick up the public fields.
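For anyone who wants to see the two names side by side without involving ML.NET at all, here is a minimal reflection sketch. The record is a hypothetical stand-in: the field name Data comes from the description above, while the type name Row and the Label field are assumptions made for illustration. It only lists what reflection reports for the type; it does not reproduce how SchemaDefinition.Create itself enumerates members.

```fsharp
open System
open System.Reflection

// Hypothetical record for illustration; only the name "Data" comes from the issue text.
[<CLIMutable>]
type Row = { Data: float32; Label: string }

// Look at both public and non-public instance members, since the visibility of the
// backing fields can differ between compiled assemblies and F# Interactive.
let allInstance =
    BindingFlags.Instance ||| BindingFlags.Public ||| BindingFlags.NonPublic

// Backing fields emitted by the F# compiler for the record.
let fieldNames =
    typeof<Row>.GetFields(allInstance)
    |> Array.map (fun f -> f.Name)

// Public properties, i.e. the members a schema would normally be built from.
let propertyNames =
    typeof<Row>.GetProperties(BindingFlags.Instance ||| BindingFlags.Public)
    |> Array.map (fun p -> p.Name)

// Fields that are visible as public; these are the ones a public-only scan would still see.
let publicFieldNames =
    typeof<Row>.GetFields(BindingFlags.Instance ||| BindingFlags.Public)
    |> Array.map (fun f -> f.Name)

printfn "backing fields:   %A" fieldNames
printfn "public props:     %A" propertyNames
printfn "public fields:    %A" publicFieldNames
```

Running this once as a compiled program and once in F# Interactive is a quick way to check whether the "@" fields show up as public in your environment, which is the situation where the extra schema columns appear.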
Please add unit tests for F# as well so these issues are caught earlier.
@dsyme