
Release 0.7 #132

Merged: 6 commits merged into FluxML:master on Apr 4, 2022
Conversation

@darsnack (Member) commented Mar 11, 2022

This should wait on #125 before merging and releasing (which I will review tomorrow). @theabhirath any other PRs with API changes that you think should go in this release?

Last-minute changes:

  • added the ability to customize the activation function in the MobileNet variants, as sketched below (this came up in a colleague's research project, and we already allow it for e.g. ResNet)
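A minimal, self-contained sketch of the idea, not Metalhead's actual code; the builder name and keyword are illustrative only:

```julia
using Flux

# Hypothetical builder (not Metalhead's API): the user-chosen activation
# is threaded through every conv/batchnorm pair in the network.
function tiny_mobilenet(; activation = relu, inchannels = 3, nclasses = 10)
    return Chain(Conv((3, 3), inchannels => 16, pad = 1), BatchNorm(16, activation),
                 Conv((3, 3), 16 => 32, pad = 1), BatchNorm(32, activation),
                 GlobalMeanPool(), Flux.flatten, Dense(32, nclasses))
end

m = tiny_mobilenet(activation = hardswish)  # e.g. swap relu for hardswish
```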

@theabhirath (Member)
Nope, I think this is good, but maybe there's a way to exclude the ViTs from the release for now? I'm working on a refactor for those, and there may be breaking API changes when that finalises (which would probably make releases more complicated). If it's not a biggie then that's fine; a lot of the models will probably require minor API tweaks to support more general implementations in the future, though, so I'm a little unsure how this will play out.

@darsnack (Member, author)
That's fine, we can always make a breaking release for this package when needed. I just want to avoid releasing a new API and immediately changing it within days.

I commented out the ViT code from this release. When you submit the refactor, just comment it back in.

@theabhirath (Member)
Maybe it's worth trying to expose the activation function to the user at a uniform level in the API before the release? It's not much of a pain, but some models do it (ResNet, and now MobileNet) and others don't (DenseNet, for example). One idea I had was that the function for returning the layers could expose this, with a documentation note that the original model can be used from the constructor, while additional customisation can be added by taking the layers and wrapping them (this is implicit right now but isn't documented; also, the docs build keeps failing, and I'm not sure what that's about).
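A hedged sketch of that idea, with hypothetical names (not Metalhead's actual functions): the lowercase layer builder exposes the activation, and extra customisation comes from wrapping the returned layers:

```julia
using Flux

# Hypothetical lowercase builder that exposes the activation; the exported
# constructor would keep the paper defaults.
densenet_layers(; activation = relu) = Chain(Dense(4, 8, activation), Dense(8, 2))

# Additional customisation by wrapping the returned layers in a new Chain:
model = Chain(densenet_layers(activation = gelu), softmax)
```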

Also, another quite minor change (but one that could potentially help clean up the code a little): conv_bn currently returns an array, which means it needs to be splatted every time it's called. Maybe we could just return a Chain instead.
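To make the trade-off concrete, here is a stand-in with the same shape as conv_bn (not the actual Metalhead definition):

```julia
using Flux

# Stand-in for conv_bn: returning a Vector forces a splat at every call site.
conv_bn(k, ch, act) = [Conv(k, ch, pad = 1), BatchNorm(last(ch), act)]
m1 = Chain(conv_bn((3, 3), 3 => 16, relu)..., MaxPool((2, 2)))

# Returning a Chain removes the splat but nests the layers instead:
conv_bn_chain(k, ch, act) = Chain(Conv(k, ch, pad = 1), BatchNorm(last(ch), act))
m2 = Chain(conv_bn_chain((3, 3), 3 => 16, relu), MaxPool((2, 2)))
```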

@darsnack (Member, author)
> One idea I had was that the function for returning the layers could expose this

You mean like densenet(..., activation = relu)? I definitely agree we should consistently expose this for all networks, but I'm not sure the API will be the same. For resnet, it's wrapped up inside the connection argument, so you can implicitly specify it. For mobilenetv[X], we require a different activation for some layers vs. others, so it's part of the config. It would be nice to have the same API but maybe that can be saved for a later release.
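Illustrating the two existing styles with self-contained stand-ins (the names here are illustrative, not Metalhead's internals):

```julia
using Flux

# ResNet style: the activation rides along inside the `connection` closure,
# so it is specified implicitly rather than as its own keyword.
block = SkipConnection(Dense(8, 8), (mx, x) -> relu.(mx .+ x))

# MobileNet style: different stages need different activations, so the
# activation is part of each per-stage config entry instead.
configs = [(kernel = 3, channels = 16, activation = relu),
           (kernel = 3, channels = 24, activation = hardswish)]
```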

> conv_bn currently returns an array, meaning that it needs to be splatted every time it's called in functions. Maybe we could just return a Chain

Taking a look over all the places it is used, it is almost always part of a larger Chain. From a user's perspective, breaking the sequence into nested 2-layer Chains seems like a lot of visual/structural noise.

@theabhirath (Member)
> You mean like densenet(..., activation = relu)? I definitely agree we should consistently expose this for all networks, but I'm not sure the API will be the same. For resnet, it's wrapped up inside the connection argument, so you can implicitly specify it. For mobilenetv[X], we require a different activation for some layers vs. others, so it's part of the config. It would be nice to have the same API but maybe that can be saved for a later release.

Yeah true, this probably needs quite some thinking to make it somewhat similar across all the networks; it probably doesn't need to hold up v0.7.

> Taking a look over all the places it is used, it is almost always part of a larger Chain. From a user's perspective, breaking the sequence into 2-layer Chains seems like a lot of visual/structural noise.

Hmmm, true. I also see that Chain on Flux master works even with arrays inside it, so pretty soon we can do away with the splats anyway, I reckon.

@ToucheSir (Member)
Chain with a Vector is currently an experimental option for models which would suffer from a ton of compilation latency otherwise. I'm not sure if any Metalhead ones fall into that bucket.
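For reference, a small demonstration of that option (a sketch; the latency benefit only matters for very long chains):

```julia
using Flux

# Chain also accepts a Vector of layers; this trades some compile-time type
# information for much lower compilation latency on very deep models.
layers = [Dense(10, 10, relu) for _ in 1:100]
m = Chain(layers)              # no splatting required
y = m(rand(Float32, 10, 4))
```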

Review comment on src/Metalhead.jl (outdated, resolved)
@darsnack force-pushed the release-0.7 branch 2 times, most recently from 1055055 to d162d9e on Mar 28, 2022
@darsnack (Member, author) commented Mar 31, 2022

I tested the Chain(::Vector) option out, and it appears to make no difference for us. I don't think there's anything left before releasing (we just need approval).

@theabhirath (Member)
The ViT tests are still commented out; is that intentional until we find a fix for the OOMs?

@darsnack (Member, author) commented Apr 1, 2022

It almost always OOMs on all systems, right?

@theabhirath (Member) commented Apr 1, 2022

Yep, only macOS is spared because those runners have more memory. One solution could be to remove the tests for the giant and gigantic configurations; I think the tiny and base models should work fine.
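A hedged sketch of what trimming the test matrix might look like; the constructor form, config names, and input size here are assumptions, not the actual test code:

```julia
using Metalhead, Test

@testset "ViT" begin
    # Skip the largest configs to avoid OOM on memory-constrained CI runners.
    for config in (:tiny, :small, :base)
        m = ViT(config)  # hypothetical constructor form
        @test size(m(rand(Float32, 256, 256, 3, 1))) == (1000, 1)
    end
end
```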

@darsnack (Member, author) commented Apr 1, 2022

Let's see what happens.

@darsnack (Member, author) commented Apr 1, 2022

Okay, the Flux v0.13 release seems imminent, so let's just wait for that so we can bump our compat.
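For reference, such a compat bump is just the usual Project.toml entry (the bounds shown are illustrative):

```toml
[compat]
Flux = "0.13"
julia = "1.6"
```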

@darsnack mentioned this pull request on Apr 2, 2022
@darsnack merged commit d51e3a4 into FluxML:master on Apr 4, 2022
@darsnack mentioned this pull request on Apr 4, 2022