
Combining different filter lengths in 1D convolutional layers #1023

Closed
sebastianruder opened this issue Nov 16, 2015 · 10 comments


@sebastianruder

Zhang and Wallace [1] show that combining filter lengths around the 'best' filter length achieves better sentence classification results than using only the 'best' filter length. Is there a way to specify a combination of filter lengths for 1D convolution? If not, is this a feature that should be implemented?

[1] Zhang, Y., & Wallace, B. (2015). A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification. Retrieved from http://arxiv.org/abs/1510.03820

@dbonadiman
Contributor

This is kind of off topic, but thanks a lot for the paper :)
As for the question itself, I think it should be up to the user to do that. On the other hand, the k-max pooling layer is missing from Keras, and that is a big issue.

@sebastianruder
Author

Np. :)
If it's up to the user, how would I go about implementing combined filter lengths? Can I have separate convolution layers with different configurations that convolve over the same input, and then merge their outputs?

@dbonadiman
Contributor

Using the Graph model, yes. Give me an hour or two and I'll come back; I was planning to add it to my codebase anyway :)

@dbonadiman
Contributor

It was faster than I thought.

# let's define some variables
nb_words = 10000   # vocabulary size
emb_dim = 100      # embedding dimension
nb_filters = 100
filter_lengths = [3, 4, 5]
convs = []
maxlen = 30        # sentence length (in word indices)

model = Graph()
# adding the input as usual: a sentence is a sequence of maxlen word indices
model.add_input(name='sentence', input_shape=(maxlen,), dtype='int')
# passing the sentence through the Embedding layer
model.add_node(Embedding(nb_words, emb_dim, input_length=maxlen),
               name='sentence_embeddings', input='sentence')

# and now create the convolutions/max-pooling
for i in filter_lengths:
    model.add_node(Convolution1D(nb_filter=nb_filters,
                                 filter_length=i,
                                 border_mode='valid',
                                 activation='relu',
                                 subsample_length=1),
                   name=str(i) + '_convolution', input='sentence_embeddings')

    # a 'valid' convolution outputs maxlen - i + 1 positions, so pooling
    # over all of them gives 1-max pooling
    model.add_node(MaxPooling1D(pool_length=maxlen - i + 1),
                   name=str(i) + '_maxpooling', input=str(i) + '_convolution')
    model.add_node(Flatten(), name=str(i) + '_sentence_embedding',
                   input=str(i) + '_maxpooling')
    convs.append(str(i) + '_sentence_embedding')

# now concat all the results and do whatever you want with the new sentence embedding :D
model.add_node(NEXT_LAYER, name='whatever', inputs=convs, merge_mode='concat')

I wrote that directly in this comment and never actually compiled it, but if an error is raised it should be easy to debug.

In order to implement it with the Sequential model you would have to replicate your input sentence many times, but I'm not a fan of that approach.
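To make the pooling arithmetic in the snippet concrete, here is a small plain-Python sketch of the same pipeline: for each filter width k, a 'valid' convolution over a length-maxlen input yields maxlen - k + 1 positions, 1-max pooling keeps the largest activation per filter, and concatenating across widths gives a fixed-size sentence vector. All names and sizes here are illustrative, not from the thread, and each position is a toy scalar standing in for an embedding vector:

```python
import random

def conv1d_valid(sentence, filt):
    """Dot product of `filt` with every length-len(filt) window ('valid' mode)."""
    k = len(filt)
    return [sum(f * x for f, x in zip(filt, sentence[i:i + k]))
            for i in range(len(sentence) - k + 1)]

def sentence_vector(sentence, filter_lengths, nb_filters, rng):
    """Multi-width convolution + 1-max pooling + concatenation."""
    vec = []
    for k in filter_lengths:
        for _ in range(nb_filters):
            filt = [rng.uniform(-1, 1) for _ in range(k)]
            feature_map = conv1d_valid(sentence, filt)  # length: maxlen - k + 1
            vec.append(max(feature_map))                # 1-max pooling
    return vec

rng = random.Random(0)
maxlen = 30
sentence = [rng.uniform(-1, 1) for _ in range(maxlen)]  # toy stand-in for embeddings
vec = sentence_vector(sentence, [3, 4, 5], nb_filters=10, rng=rng)
print(len(vec))  # 3 widths * 10 filters = 30 features
```

The resulting vector length is fixed regardless of maxlen, which is what lets the pooled features feed a dense layer.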

@dbonadiman
Contributor

@sebastianruder Just a reminder: don't overdo nb_filters, because depending on the task you are modelling, the model might not converge (I just ran into that problem). 200 may be fine if you have only one convolution, but with more, 50 may be enough.

@sebastianruder
Author

@dbonadiman: Wow, that was fast indeed! Thanks a lot for the commented implementation snippet and the tip! :D

@dbonadiman
Contributor

@sebastianruder It's fast to do these implementations when you already have a working model to test them on ;)

@MdAsifKhan

This raises an identifier error:
File "build/bdist.linux-x86_64/egg/keras/layers/containers.py", line 366, in add_node
Exception: Unknown identifier: 7input

@Ambros94

Ambros94 commented May 1, 2017

Can this graph model still be created with Keras 2.0 using Sequential models?

@hsiaoyi0504

@Ambros94 They are actually different containers. You should check the documentation for further details.
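For reference, a branching model like the one in this thread cannot be expressed with a single Sequential container, but it maps naturally onto the Keras 2 functional API. Here is a hedged sketch (written against tf.keras; standalone Keras 2 is analogous, and all sizes are illustrative stand-ins, not values from the thread):

```python
# Multi-width 1D convolutions over a shared embedding, merged by
# concatenation -- a functional-API equivalent of the old Graph snippet.
# nb_words, emb_dim, maxlen, nb_filters are made-up example sizes.
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Concatenate)
from tensorflow.keras.models import Model

nb_words, emb_dim, maxlen, nb_filters = 5000, 50, 30, 100
filter_lengths = [3, 4, 5]

sentence = Input(shape=(maxlen,), dtype='int32')
embedded = Embedding(nb_words, emb_dim)(sentence)

branches = []
for k in filter_lengths:
    conv = Conv1D(nb_filters, k, padding='valid', activation='relu')(embedded)
    branches.append(GlobalMaxPooling1D()(conv))  # exact 1-max pooling

# fixed-size sentence embedding: len(filter_lengths) * nb_filters values
sentence_embedding = Concatenate()(branches)
model = Model(sentence, sentence_embedding)
```

Note that GlobalMaxPooling1D replaces the MaxPooling1D(pool_length=maxlen - i + 1) + Flatten workaround from the Graph version: it pools over the whole time axis directly.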
