Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistency w.r.t. external weights #124

Closed
dmbates opened this issue Feb 8, 2016 · 4 comments
Closed

Consistency w.r.t. external weights #124

dmbates opened this issue Feb 8, 2016 · 4 comments

Comments

@dmbates
Copy link
Contributor

dmbates commented Feb 8, 2016

It looks as if my original code was not consistent with respect to external, or prior, weights, saved as r.wts in a GlmResp object r. The main reason for allowing weights in a GLM is for a Binomial response in the form of the number of successes and the number of trials. In those cases the response is evaluated as the proportion of successes with the prior weights being the number of trials.

In the wrkwt! method for GlmResp there is a check on whether r.wts is empty but it never will be. The fit method for an AbstractGLM initializes the weights to ones(y) if they are not specified as an optional argument.

I have a slight preference for changing the default prior weights to being a zero-length vector and retaining the checks on whether it is empty or not but that may be making the code overly-complicated to save a very small amount of time and storage. Should we always have prior weights, even if they are unit weights?

@andreasnoack
Copy link
Member

The VectorOfOnes type has been mentioned a couple of times. It might make the code simpler than checking for an empty vector, but I'm not sure.

@nalimilan
Copy link
Member

Indeed, might be a good application. For reference: JuliaStats/StatsBase.jl#135

@dmbates
Copy link
Contributor Author

dmbates commented Feb 9, 2016

I expect all of this is very minor but, in addition to having a vector of unit weights take up space there is the issue of unnecessary multiplications by 1. If you have hundreds of thousands of observations and you can do one check of isempty{r.wts) then skip all of the accesses and multiplications by the unit weights you are better off then going through the access of the unit value and multiplication by it.

@nalimilan
Copy link
Member

@dmbates The idea behind the Ones type (see reference above) is that it would be optimized out when compiling a specialized method for that argument type. This would allow writing only the most generic code (with weighting) and still get an efficient special-case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants