If the blocking rule used to train `m` values with expectation maximisation (EM) accidentally produces no candidate pairs, training fails with the error `BinderException: Binder Error: Referenced column "nan" not found in FROM clause!`.
The cause appears to be that an empty `__splink__df_blocked` leads to `m_count` and `u_count` (in `__splink__m_u_counts`) both being `NaN` in `_probability_two_random_records_match`. When `__splink__df_predict` is then created, that value is rendered as the string `nan`, which is interpreted as a column name.
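To illustrate the suspected mechanism, here is a minimal sketch in plain Python. It assumes the value reaches Python as a float `NaN` and is interpolated into SQL text; `render_sql_literal` is a hypothetical helper, not part of Splink, and the `CAST('NaN' AS DOUBLE)` form is my assumption about a spelling DuckDB accepts for a NaN literal.

```python
import math


def render_sql_literal(value: float) -> str:
    """Hypothetical helper: render a Python float for inclusion in SQL text."""
    if math.isnan(value):
        # A bare `nan` token is parsed as an identifier (a column name),
        # so emit an explicit cast from a quoted string instead (assumed syntax).
        return "CAST('NaN' AS DOUBLE)"
    return repr(value)


# Stand-in for the value computed when the blocked frame is empty.
prob = float("nan")

# Naive interpolation: the float is formatted as the bare token `nan`.
naive_sql = f"SELECT {prob} AS probability_two_random_records_match"

# Guarded interpolation: NaN is rendered as an explicit literal.
safe_sql = f"SELECT {render_sql_literal(prob)} AS probability_two_random_records_match"

print(naive_sql)
print(safe_sql)
```

The naive string contains an unquoted `nan`, which is the token the binder then tries to resolve as a column.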
Perhaps a clearer error could be raised straight after the blocking step if the blocked frame is empty, which would also make the failure happen earlier.
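A rough sketch of what such a check might look like follows. The names `EmptyBlockingError` and `check_blocked_pair_count` are hypothetical and do not exist in Splink; the sketch assumes the number of rows in `__splink__df_blocked` is available as an integer once blocking has run.

```python
class EmptyBlockingError(ValueError):
    """Hypothetical error for a blocking rule that yields no candidate pairs."""


def check_blocked_pair_count(pair_count: int, blocking_rule: str) -> None:
    """Fail early with a descriptive message if blocking produced no pairs."""
    if pair_count == 0:
        raise EmptyBlockingError(
            f"Blocking rule {blocking_rule!r} produced 0 candidate pairs, "
            "so m values cannot be estimated with EM. "
            "Try a less restrictive blocking rule or a larger input."
        )


# Usage: a non-empty blocked frame passes silently, an empty one raises.
check_blocked_pair_count(1250, "l.first_name = r.first_name")
try:
    check_blocked_pair_count(0, "l.first_name = r.first_name")
except EmptyBlockingError as err:
    print(err)
```

Raising at this point would surface the real cause (no comparisons) instead of the downstream binder error.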
I also thought this worth flagging in case there are other situations where `NaN`s are wrongly interpreted as column names in ways that cause more subtle issues.
I'm not certain this is resolved; I think I hit the same issue today. I took the blocking rule for EM training that I use on large datasets and applied it to a subset of those datasets, which created 0 comparisons. The EM training loop ran for 1 iteration and then I got `BinderException: Binder Error: Referenced column "nan" not found`.
Happy to provide additional details if needed.