
Least Square Solution Layer (for ELM) #2565

Closed · wants to merge 60 commits

Conversation

@Macbull commented Jun 6, 2015

This pull request is currently in progress. I need help deciding which library should be used.

  1. Adds Extreme Learning Machine (ELM) support to Caffe. ELMs have recently been considered alternatives to deep neural nets because of their fast training and comparable accuracy.
    Reference: http://www.extreme-learning-machines.org
  2. ELM does not need any solver to optimise its parameters; training reduces to a single least-squares solve (see the sketch after this list).
  3. This implementation can be used to construct ELM auto-encoders easily using just the prototxt file (no additional layer will be required).
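
To make the idea concrete, here is a minimal NumPy sketch of ELM training as described above. It is illustrative only; the function names are mine, not part of this PR:

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Train an ELM: random, fixed hidden weights; least-squares output weights."""
    n_features = X.shape[1]
    # The hidden layer is random and never updated; this is why no solver is needed.
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    # The only "training" step: one least-squares solve for the output weights.
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```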

This PR includes two layers:

  1. LSLayer: a least-squares layer, to be used by ELM or Online Sequential ELM.
  2. TransposeLayer: this layer makes it possible for two layers to share weights in transposed form (Improve / Fix Weight Sharing #1211). It is required for implementing ELM auto-encoders; a sketch of the tied-weight idea follows this list.
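
To illustrate what the transpose layer enables, here is a hypothetical NumPy sketch of tied weights in an ELM auto-encoder (not the layer's C++ implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 64))    # a batch of 32 inputs of dimension 64
W = rng.standard_normal((64, 128))   # encoder weights, 64 -> 128

H = np.tanh(X @ W)    # encoder output
X_rec = H @ W.T       # decoder reuses W in transposed form: 128 -> 64
```

With this sharing the decoder has no parameters of its own; the transpose layer is what would let the second Caffe layer see W transposed instead of a separate blob.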

Issues left:

  1. Pseudo-inverse and transpose functions are not available in cuBLAS, so we need to choose a library that provides them on the GPU (a CPU reference for the pseudo-inverse follows this list).
    Libraries I am considering: CULA or MAGMA. Please suggest!
  2. Testing of the layers once the libraries are decided.
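
For reference, the pseudo-inverse in question can be computed from an SVD; a GPU library such as CULA or MAGMA would supply the decomposition. A minimal NumPy sketch of the CPU reference (my own illustration, not this PR's code):

```python
import numpy as np

def pinv_via_svd(H, tol=1e-12):
    """Moore-Penrose pseudo-inverse via SVD, as a CPU reference."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s_inv = np.where(s > tol * s.max(), 1.0 / s, 0.0)  # invert only non-negligible singular values
    return (Vt.T * s_inv) @ U.T

H = np.random.default_rng(0).standard_normal((100, 20))
assert np.allclose(pinv_via_svd(H), np.linalg.pinv(H))
```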

Future plans: implementing other ELM variants such as Sparse ELM or Kernel ELM.

@Macbull Macbull changed the title Elm ELM Jun 6, 2015
@bhack (Contributor) commented Jun 6, 2015

@Macbull (Author) commented Jun 6, 2015

@bhack: I am aware of that discussion. Here is the ELM inventor's reply: http://www.ntu.edu.sg/home/egbhuang/pdf/ELM-Rosenblatt-Neumann.pdf

Moreover, whoever is correct, you cannot ignore the good performance of ELMs, or whatever they may be called, and that performance is what I think matters for Caffe. Maybe we can change the name of the method in the comments (the only place where I mentioned ELM).

@lunzueta commented Jun 6, 2015

Following up on bhack's comment, regarding the ELM method itself, also take this into account:
https://www.facebook.com/yann.lecun/posts/10152872571572143

And especially this:
http://theanonymousemail.com/view/?msg=ZHEZJ1AJ

I'm not sure this method should be integrated into Caffe yet...

@Macbull (Author) commented Jun 6, 2015

@lunzueta
I was not able to open the first link (the Facebook server is down right now), but I have read the second one, and I don't think it raises any issue with the performance of ELM. As I said earlier, this PR does not add an ELM layer; it adds a least-squares layer, which can be used to find least-squares solutions, and which ELM happens to use.

@lunzueta commented Jun 6, 2015

There are more comments on this discussion, including Yann LeCun's recent remarks about ELM, here: http://www.reddit.com/r/MachineLearning/comments/34u0go/yann_lecun_whats_so_great_about_extreme_learning/

They also discuss the performance of ELM. This is quite a recent debate, so I think we should be careful before integrating this method into Caffe.

@Macbull (Author) commented Jun 6, 2015

@lunzueta: Yes, we should be careful; that's why we are discussing it, right? :D In the comments they discuss the performance of ELM, and most agree that it performs well considering its simplicity and low training time. I haven't gone much deeper into those conversations, but I want to raise some points:

  1. Regarding the complaint, read the publication above. It was an invited article and it clearly addresses the complaint.
  2. Regarding Yann LeCun's comment: ELM performs well beyond SVM in most cases, and comparably in the others (both in terms of time and accuracy).
  3. I will say it again: I have not implemented ELM; I have implemented a least-squares solution layer, which can be used anywhere you want, perhaps in LS-SVM. I will use it to construct an ELM for my own purposes.

@Macbull (Author) commented Jun 6, 2015

Moreover, the complaint was anonymous, while the paper I mentioned in the comments above is published, and I am sure it went through rigorous evaluation before acceptance, given the delicacy of the issue it addresses.

@Macbull Macbull changed the title ELM Least Square Layer (for ELM) Jun 6, 2015
@Macbull Macbull changed the title Least Square Layer (for ELM) Least Square Solution Layer (for ELM) Jun 6, 2015
@Macbull Macbull closed this Jun 8, 2015
@Macbull Macbull reopened this Jun 9, 2015
@zdx3578 commented Dec 30, 2015

HPELM is a high-performance implementation of Extreme Learning Machines (fast randomized neural networks): https://github.com/akusok/hpelm (good code and paper).
Could a Caffe Python layer reference the hpelm code?
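
If someone wanted to prototype that, a Caffe Python layer could wrap a least-squares solve directly. A hypothetical skeleton follows; the class name and blob layout are my assumptions, not this PR's or hpelm's API:

```python
import numpy as np
import caffe

class LeastSquaresLayer(caffe.Layer):
    """Computes beta = argmin ||H beta - T||^2 in the forward pass.

    bottom[0]: hidden activations H (n x d); bottom[1]: targets T (n x k).
    top[0]: output weights beta (d x k).
    """

    def setup(self, bottom, top):
        if len(bottom) != 2:
            raise Exception("Needs two bottoms: activations H and targets T.")

    def reshape(self, bottom, top):
        top[0].reshape(bottom[0].data.shape[1], bottom[1].data.shape[1])

    def forward(self, bottom, top):
        H, T = bottom[0].data, bottom[1].data
        beta, *_ = np.linalg.lstsq(H, T, rcond=None)
        top[0].data[...] = beta

    def backward(self, top, propagate_down, bottom):
        pass  # ELM does not backpropagate through the solve
```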

@akusok commented Jan 2, 2016

Hello guys, I am the author of HPELM. I would be glad to integrate my toolbox with your code if you tell me what you want :-)

About ELM in general:

  1. In performance, ELM = MLP, but the training time is 10,000-100,000 times shorter.
  2. The scandal is only about authorship: ELM has recently started gaining a lot of citations, and everybody wants those citations for themselves.
  3. In practice the pseudo-inverse is not computed explicitly for ELM, and the transpose is available in BLAS as a parameter when multiplying two matrices (see the sketch after this list). Generally, with fewer than 3000 neurons the CPU is fastest; beyond that, the GPU improves speed.
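
A minimal SciPy sketch of point 3: solve the (slightly regularized) normal equations H^T H beta = H^T T instead of forming a pseudo-inverse, and get the transpose through the trans_a flag of BLAS gemm. The ridge term lam is my own addition for numerical stability, not something quoted from this thread:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.linalg.blas import dgemm

rng = np.random.default_rng(0)
H = rng.standard_normal((1000, 200))   # hidden-layer activations
T = rng.standard_normal((1000, 10))    # targets
lam = 1e-6                             # small ridge term for stability (assumption)

HtH = dgemm(1.0, H, H, trans_a=1)      # H^T H; no explicit transpose is materialized
HtT = dgemm(1.0, H, T, trans_a=1)      # H^T T
beta = cho_solve(cho_factor(HtH + lam * np.eye(HtH.shape[0])), HtT)

# Agrees with the direct least-squares solution up to the tiny ridge term.
ref = np.linalg.lstsq(H, T, rcond=None)[0]
assert np.allclose(beta, ref, atol=1e-6)
```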

@ducha-aiki (Contributor) commented

@akusok

> In performance ELM = MLP, but training time is 10,000-100,000 times shorter.

What about SOTA on CIFAR-10/100? :)

@shelhamer (Member) commented

Closing as out-of-scope.

@shelhamer shelhamer closed this Apr 14, 2017