add LIBSVM #19

Merged
merged 4 commits into from
May 28, 2019

ValdarT
Contributor

@ValdarT ValdarT commented May 23, 2019

Added LIBSVM. A few notes:

  • If I understand correctly then it needs to be added to the MLJRegistry first, before the loading and hence tests start working properly?
  • I used the names as they are in LIBSVM but perhaps it would be better if they are the same as in the ScikitLearn port?
  • I also included the one class SVM for anomaly/novelty detection but I'm not quite sure how that fits into MLJ currently.

@ablaom
Member

ablaom commented May 26, 2019

Thanks for that!!

If I understand correctly then it needs to be added to the MLJRegistry first, before the loading and hence tests start working properly?

No. You don't need to register to test. You have simply forgotten to add import LIBSVM to your test file (before using MLJModels.LIBSVM_) to trigger lazy loading of your glue code. Travis still fails, but points to some error internal to LIBSVM.jl. Can you investigate?
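A minimal sketch of the test-file preamble described above. It assumes the package layout used in this thread (a lazily loaded MLJModels.LIBSVM_ submodule); the surrounding `using Test` and `using MLJModels` lines are assumptions, not taken from the PR.

```julia
using Test
using MLJModels

# Importing LIBSVM first is what triggers lazy loading of the glue code;
# without it, the submodule on the next line is not available.
import LIBSVM
using MLJModels.LIBSVM_
```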

I used the names as they are in LIBSVM but perhaps it would be better if they are the same as in the ScikitLearn port?

I suggest you use names that are as consistent as possible with the LIBSVM docs. MLJ has no problem with duplicate model names. You just can't simultaneously load two models with the same name from different packages - you can load one or the other.

I also included the one class SVM for anomaly/novelty detection but I'm not quite sure how that fits into MLJ currently.

I expect we should regard this as unsupervised learning (unless there appears some dissent at JuliaAI/MLJ.jl#51 (comment)), so use

mutable struct OneClassSVM <: MLJBase.Unsupervised

instead of

mutable struct OneClassSVM <: MLJBase.Deterministic

And you should probably rename your predict(::OneClassSVM, ...) to transform.
Also, make a final conversion of the BitArray predictions (aka transformations) to a CategoricalVector (just apply categorical to the answer) to ensure they have the right scitype, which you will articulate with a declaration

output_scitype_union(::Type{OneClassSVM}) = MLJBase.Finite{2}

(unsupervised learners have "outputs" rather than "targets").

Incidentally, the target_scitype_union declarations for your classifiers should be MLJBase.Finite. Only the regressors should have Continuous as the scitype.
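
The BitArray-to-CategoricalVector conversion suggested above can be sketched in isolation as follows. The `raw` vector is a hypothetical stand-in for the one-class SVM output, and the `levels!` call is an added assumption (not something requested in the review) that guards against the case where every prediction falls in the same class, which would otherwise leave the pool with a single level.

```julia
using CategoricalArrays

# Hypothetical stand-in for the raw one-class SVM predictions (a BitArray).
raw = BitVector([true, false, true])

# "Just apply categorical to the answer": this yields a CategoricalVector{Bool}.
out = categorical(raw)

# Assumption: make sure both classes are present in the pool even when
# only one of them occurs in the predictions.
levels!(out, [false, true])
```

The element values are unchanged by the conversion; only the container type, and hence the scitype, differs.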

didn't quite understand why it didn't work before but works now
@ablaom
Member

ablaom commented May 28, 2019

Brilliant!! Thanks for this contribution.

It has been argued (JuliaAI/MLJ.jl#51 (comment)) that one-class classification is neither supervised nor unsupervised, and I am happy to concur with this opinion. So we will probably create a new category. For now, I think we can safely leave this as unsupervised.

I will proceed with registering LIBSVM.jl with MLJRegistry, to make the new models findable by all MLJ users. This will also be our first opportunity to test loading model code for models with names duplicated in separate packages. Will close #19 when this testing is complete.

Again many thanks.

@ablaom ablaom merged commit ca1256f into JuliaAI:master May 28, 2019
@ablaom
Member

ablaom commented May 28, 2019

Now registered. Appears to work:

julia> using MLJ

julia> models()["LIBSVM"]
6-element Array{Any,1}:
 "EpsilonSVR" 
 "LinearSVC"  
 "NuSVR"      
 "NuSVC"      
 "SVC"        
 "OneClassSVM"

julia> @load SVC
import MLJModels ✔
import LIBSVM ✔
import MLJModels.LIBSVM_.SVC ✔

julia> mach = machine(SVC(), shuffle!(load_iris()))
Machine{SVC} @ 890


julia> evaluate!(mach)
┌ Info: Evaluating using cross-validation. 
│ nfolds=6. 
│ shuffle=false 
│ measure=MLJ.misclassification_rate 
│ operation=StatsBase.predict 
└ Resampling from all rows. 
Cross-validating: 100%[=========================] Time: 0:00:03
6-element Array{Float64,1}:
 0.0 
 0.0 
 0.04
 0.12
 0.0 
 0.08
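
For a single summary figure, the six per-fold misclassification rates printed above can be averaged. This is only a sketch: the values are copied by hand from the output, and the variable name `rates` is illustrative.

```julia
using Statistics

# Per-fold misclassification rates copied from the evaluate! output above.
rates = [0.0, 0.0, 0.04, 0.12, 0.0, 0.08]

# Mean misclassification rate across the 6 folds (approximately 0.04).
mean_rate = mean(rates)
```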

@ablaom
Member

ablaom commented May 28, 2019

Will now make a new release of MLJModels

@ValdarT ValdarT deleted the LIBSVM branch May 28, 2019 10:23