What exactly is the input data format expected by Metronome? #2
Comments
It's really similar to the SVMLight format, where each record is just a CSV-style line. In general it comes down to a mapping of an input vector to an output vector, [i0 i1 i2 | o0 o1 o2], where spaces separate the vector entries and then each is indexed to save
So yeah, it's a bit custom, but after looking around and thinking about it we
Adam and I are working on some more robust and complete vectorization tools
TL;DR: yes, vectorization and input formats are important, we're thinking
Thanks!
JP
On Wed, Jul 2, 2014 at 8:52 PM, pchalasani notifications@github.com wrote:
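To illustrate the `[i0 i1 i2 | o0 o1 o2]` layout described in the reply above, here is a minimal Python sketch of a parser for one record. It is not Metronome's own reader. The `index:value` token syntax is an assumption borrowed from SVMLight, since the thread only says the entries are "indexed", and the names `parse_side` and `parse_record` are hypothetical.

```python
from typing import Dict, Tuple


def parse_side(text: str) -> Dict[int, float]:
    """Parse one side of a record into a sparse {index: value} mapping.

    Assumption: indexed entries look like SVMLight's 'index:value'.
    Bare numbers are also accepted and indexed by their position.
    """
    vec: Dict[int, float] = {}
    for pos, token in enumerate(text.split()):
        if ":" in token:
            idx, val = token.split(":", 1)
            vec[int(idx)] = float(val)
        else:
            vec[pos] = float(token)
    return vec


def parse_record(line: str) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Split a line on '|' into an (input vector, output vector) pair."""
    inputs, sep, outputs = line.partition("|")
    if not sep:
        raise ValueError("expected '|' between input and output vectors")
    return parse_side(inputs), parse_side(outputs)


if __name__ == "__main__":
    # Hypothetical record: three input slots (one left sparse), two outputs.
    print(parse_record("0:1.0 2:0.5 | 1:1.0"))
```

Storing each side as a sparse mapping mirrors the SVMLight idea of writing only the non-zero entries; a dense list would work equally well when every position is present.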
I would like to add here that this is a big problem. Rather than take an ad hoc approach, Canova will also support different modes of feature extraction for various kinds of data. Lots of people don't think about word vectors, moving windows over images, and other harder formats. Featurization is a huge problem we'll be tackling here in the coming weeks. As ambitious as it sounds,
Thanks for the clarifications. I was just trying to figure out how I can (say) use Metronome to deploy deep learning on Hadoop for one of our datasets. Eventually, I'll probably put a friendly Clojure wrapper around it.
Glad we could help. Let me know if you need help getting it going, I can
JP
On Mon, Jul 7, 2014 at 2:55 PM, pchalasani notifications@github.com wrote:
subject says it all