Meeting Notes
-
Discussion
- STT 0.8.0!
- iOS CI!
- Remaining iOS issues
- .NET: cannot import DeepSpeech NuGet package
-
Review of on-going work
-
Abhishek
- Tested Bergamot Rest server integration with Firefox on Mac machine
- Global Data Privacy training
- Initial investigation of word alignment information returned by Bergamot rest-server
-
Eren
- Implementing parallel wavegan
- Trained a parallel wavegan model (High quality but a bit slow)
- Refactoring TTS repo to be more Pythonic
-
Alex
- Butter Fuss!
- Review iOS PR
- cuDNN mystery
- Finished Docker work!
-
Reuben
- iOS CI!
- Writing Android app + DS TFLite + microphone streaming tutorial for 1.0
-
Tilman (PTO)
- Training 1.0.0 model
- Evaluation 1.0.0 trainings
- Common Voice imports
-
Kelly
- Reworking developer OKRs for the ∞th time with David at his request
- Foundation for the National Institutes of Health (FNIH)
- Reviewing proposal by FNIH to join a grant working on collecting/opening dysarthria speech
- Trying to set up a new meeting with FNIH
- LANGEQ-2020 Proposal
- Talked with Bangor about a LANGEQ-2020 coalition
- Started the process of applying for a LANGEQ-2020 grant to fund CV + DS work
- Started on writing the proposal
- OpenWRT
- Met with OpenWRT to discuss Deep Speech integration
- Following up with legal and BizDev on work with OpenWRT
- Trying to read the tea leaves of Mozilla history with OpenWRT
- Bergamot
- Packaging server + model(s) as macOS application!
- Creating demo for Bergamot 18th Month Review
- Creating Demo and Integration Presentation for Bergamot 18th Month Review
- Creating Dissemination and Exploitation Presentation for Bergamot 18th Month Review
- Writing Bergamot Periodic Report for Bergamot 18th Month Review
- SIFIS*HOME H2020 Grant
- Continuing the process of getting over the legal hurdles of the grant
-
Rosana (PTO)
- Mozilla Voice WebSite development
- Product research
-
-
Discussion
- STT 0.8.0!
- iOS CI
- Docker support, building and publishing
-
Review of on-going work
-
Abhishek
- Make machine translation from Firefox work with the quality estimation code
-
Eren
- Implementing parallel wavegan
- Training a (parallel wavegan) model
- Working on tflite export of models
- Working with contributors on multi-voice model
-
Alex (Can't Make Meeting)
- Docker support, building and publishing
-
Reuben
- iOS CI
- 0.8.0 documentation
-
Tilman (PTO)
- Training 1.0.0 model
-
Kelly
- Reworking developer OKRs for the ∞th time with David at his request
- Foundation for the National Institutes of Health (FNIH)
- Met with FNIH to discuss inclusion of dysarthria speech into Common Voice
- Reviewing proposal by FNIH to join a grant working on collecting/opening dysarthria speech
- LANGEQ-2020 Proposal
- Talked with Bangor about a LANGEQ-2020 coalition
- Started the process of applying for a LANGEQ-2020 grant to fund CV + DS work
- OpenWRT
- Met with OpenWRT to discuss Deep Speech integration
- Following up with legal and BizDev on work with OpenWRT
- Trying to read the tea leaves of Mozilla history with OpenWRT
- Setting up a meeting with Kathy on Mozilla's history with OpenWRT
- Setting up a meeting with David on Mozilla's history with OpenWRT
- Bergamot
- Packaging server + model(s) as macOS application
- Creating 8-bit models from the pre-trained student models
- Updating the website with pointers to released NMT models
- SIFIS*HOME H2020 Grant
- Continuing the process of getting over the legal hurdles of the grant
-
Rosana
- FxR UX for model integration done
- Mozilla Voice WebSite development
- Product research
-
-
Discussion
- STT 0.8.0
-
Review of on-going work
-
Abhishek
- Make machine translation from Firefox work with the quality estimation code
-
Eren (PTO)
-
Alex
- KaiOS xpcshell work
- Support on discourse
- Initial delegate support
- Docker support, building and publishing
- Updating deepspeech-server to 0.7.X
- Prep for Ministry of Finance meeting tomorrow
-
Reuben
- Training Mandarin model
- Inference time measurements for UTF-8 on laptop + phone
- Making new master alpha
- iOS target
- 0.8.0 documentation
-
Tilman
- DSAlign documentation
- Upgrading DSAlign to DeepSpeech 0.7.X
- Testing transcribe.py
- Fixing importers alphabet problem
- CV imports
-
Kelly
- Bergamot
- Packaging server + model(s) as macOS application
- Starting integration of student teacher scripts
- Starting integration of 8-bit scripts
- Creating 8-bit models from the pre-trained student models
- Updating the website with pointers to released NMT models
- W3C Workshop on Web and Machine Learning
- Reviewing workshop proposals
- NVIDIA + DSAlign's LibriVox data set
- Syncing with NVIDIA on LibriVox release + press
- Syncing with legal on license for LibriVox release
- SIFIS*HOME H2020 Grant
- Continuing the process of getting over the legal hurdles of the grant
-
Rosana
- FxR UX for model integration done
- Mozilla Voice WebSite development
- Product research
-
-
Discussion
- STT 1.0.0 training
-
Review of on-going work
-
Rosana
- Working on product opportunities research
- Working on model backend host
-
Abhishek
- Reproducing Firefox's machine translation workflow with newer marian-server
- Make machine translation from Firefox work with the quality estimation code
-
Eren (PTO)
-
Alex
- TF submodule
- KaiOS xpcshell work
-
Reuben
- C++ generate_scorer_package
- MSYS2 issue
- iOS shared library signing
- Training Mandarin model
-
Tilman
- Restarting trainings
- DSAlign documentation
- Upgrading DSAlign to DeepSpeech 0.7.X
-
Kelly
- Bergamot
- Addressing review comments on deliverable D6.2 Mozilla Cluster Integration
- Addressing review comments on deliverable D7.3 First Dissemination Report
- Starting integration of student teacher scripts
- Starting integration of 8-bit scripts
- Creating 8-bit models from the pre-trained student models
- W3C Workshop on Web and Machine Learning
- Recruiting speakers for the workshop
- NVIDIA + DSAlign's LibriVox data set
- Syncing with NVIDIA on LibriVox release + press
- Syncing with legal on license for LibriVox release
- SIFIS*HOME H2020 Grant
- Getting CA signed
- Continuing the process of getting over the legal hurdles of the grant
-
-
Discussion
- 1.0.0 timeline
- STT 1.0.0 training
-
Review of on-going work
-
Abhishek
- Reproducing Firefox's machine translation workflow with older marian-server
- Make machine translation from Firefox work with new marian-server
-
Eren
- MelGAN's training
- PWGAN implementation
- FastSpeech implementation
- Writing 'Double Decoder Consistency' blog post
-
Alex
- Dockerfile updating
- KaiOS xpcshell continuation
-
Reuben
- Build libdeepspeech.so for iOS with TF 2.2
- Training Mandarin model
-
Tilman
- Benchmarking augmentation
- Other minor fixes
-
Kelly
- Mozilla Voice web site reviews + comms
- Preparing to give presentation for ET at Mozilla weekly meeting
- Deep Speech 1.0.0
- Reviewing for the Nth time the Mozilla Voice website texts
- Met with Alex over Comms planning
- Reviewing issues for 1.0.0 project on GitHub
- Bergamot
- Wrote Deliverable D7.3 First Dissemination Report
- Dealing with reviews of Deliverable D6.2 Mozilla Cluster Integration
- Starting integration of student teacher scripts
- Starting integration of 8-bit scripts
- SIFIS*HOME H2020 Grant
- Getting CA signed
- Continuing the process of getting over the legal hurdles of the grant
-
-
Discussion
- Rosana
- Te Hiku Media contracted research
- STT 0.8.0
- STT 1.0.0 + TTS 0.1.0 what's to be done
-
Review of on-going work
-
Abhishek
- Reviewing Marian Seq2Seq framework
- Bergamot project plan clarifications
- Reading "Visualizing A Neural Machine Translation Model"
-
Eren
- Make decoder and attention masking optional
- Getting Travis happy
- Multi-GPU training to vocoder module
- Writing 'Double Decoder Consistency' blog post
-
Alex
- Training with Mineco dataset
- Discourse support
- Dockerfile split
- TF 2.2 rebase
- Common Voice interviews
-
Reuben
- Helping Andre with some training
- Build libdeepspeech.so for iOS with TF 2.2
- Training Mandarin model
-
Tilman
- Implementing time limits for time-stretch augmentation
- Other minor fixes
-
Kelly
- Mozilla Voice web site reviews
- Bergamot
- Writing Deliverable D7.3 First Dissemination Report
- Starting integration of student teacher scripts
- Starting integration of 8-bit scripts
- Started setting up training pipeline for some more efficient models
- Handling feedback on Deliverable 6.2 Mozilla Cluster Integration
- SIFIS*HOME H2020 Grant
- Getting CA signed
- Continuing the process of getting over the legal hurdles of the grant
-
-
Discussion
- Abhishek!
- All Hands Lightning Talks (Signup by June 10th at 5pm PST)
- All Hands Demos (Signup by June 10th at 5pm PST)
- All Hands Contributor Nominations? (Due today)
- Te Hiku Media contracted research
- STT 0.8.0 (Next Week)
- STT 1.0.0 + TTS 0.1.0 what's to be done (Next Week)
-
Review of on-going work
-
Abhishek
- Read paper: "Attention is all you need"
- Went through the Bergamot Project plan
- Reviewing Marian Seq2Seq framework
-
Eren
- Implementing a Vocoder module for TTS
- Training models
- Learning rate scheduling
- Implementing forward TTS models
- Writing 'Double Decoder Consistency' blog post
-
Alex
- TF 2.2 update
- Learning auditwheel
- Examining KaiOS status
-
Reuben
- Adding read-only metrics
- Making new 0.8 alpha to test PyPI training tests
- Fixed throwing away of last uneven batch
- Flip order of VERSION and GRAPH_VERSION symlinks
- Fixing SWIG wrapper memory leak in decoder package
- Benchmarking UTF-8 decoding for Alex's grant documentation
-
Tilman
- Implemented and tested new caching approach
-
Kelly
- Preparing presentation for Sean on the ML work
- Bergamot
- Writing Deliverable 6.2 Mozilla Cluster Integration
- Writing Deliverable D7.3 First Dissemination Report
- Reviewing UX/UI work from Edinburgh
- Starting setup of student teacher training runs
- Started setup of the 8-bit training runs
- Determining if the quality estimation (QA) work package has models/code
- Integrating QA model into Firefox, if QA model/code exists
- Creating work plan for implementation of UI specifications
- SIFIS*HOME H2020 Grant
- Continuing the process of getting over the legal hurdles of the grant
-
-
Discussion
- Naming!
-
Review of on-going work
-
Eren
- Release TTS v 0.0.2!
- Fix isinf bug
- Creating encoder module
- Write a wiki entry for converting Torch to TF
- Implementing a Vocoder module for TTS
- Implement MelGAN
- Writing 'Double Decoder Consistency' blog post
-
Alex
- Trying to run emulator outside of B2G build tree
- Dumping tflite matrices when using NEON / threads
-
Reuben
- Training a new Mandarin model with latest training code + new validation data
- Testing Mandarin language models with old and new models
- 0.8 stuff, docs, TypeScript, PR reviews
-
Tilman
- Examined augmentation runs
- Re-factoring augmentation cmd-line handling
- Looking into where cores are dumped on the cluster
-
Kelly
- Reading "A Survey of Monte Carlo Tree Search Methods"
- Preparing presentation for Adam the new COO on the ML work
- SIFIS*HOME H2020 Grant
- Working with Ericsson on CA amendments
- Continuing the process of getting over the legal hurdles of the grant
- Bergamot
- Financial planning for the second period
- Determining if the quality estimation (QA) work package has models/code to estimate quality
- Integrating QA model into Firefox, if QA model/code exists
- Review of UI work from work package one
- Creating work plan for implementation of UI specifications
- Running several test training runs
- Started setting up training pipeline for some more efficient models
- Preparing for the new hire
-
-
Discussion
- Naming
- When is this due?
- Who's doing the legal review?
- Is branding helping with this?
- Can we introduce new packages that point to the old ones?
-
Review of on-going work
-
Eren
- Checking out ICLR TTS papers
- Fixing "Attention dies out after 18k iterations quite randomly" issues
- Automatize TTS
- Reproduction of MelGAN
-
Alex
- Making the patches for WiFi on Gonk
-
Reuben
- Improving error handling for scorer loading
-
Tilman
- Started augmentation runs
- Augmentation PR
-
Kelly
- Bergamot
- Preparing financial statements for the new FTE's for the EU finance group
- Determining if the quality estimation (QA) work package has models/code to estimate quality
- Integrating QA model into Firefox, if QA model/code exists
- Review of UI work from work package one
- Creating work plan for implementation of UI specifications
- Running several test training runs
- Started setting up training pipeline for some more efficient models
- Preparing for the new hires
- SIFIS-HOME H2020 Grant
- Reviewing Ericsson markup of the CA agreement for IP vs FOSS license issues
- Organizing next all-partner CA meeting
- Continuing the process of getting over the legal hurdles of the grant
- Work on organizing the W3C Web + ML conference
- Reading "A Survey of Monte Carlo Tree Search Methods" [doing]
-
-
Discussion
- 0.7.1
- Snakepit and submodules
-
Review of on-going work
-
Eren
- Checking out ICLR TTS papers
- Fixing "Attention dies out after 18k iterations quite randomly" issues
- Automatize TTS
- Reproduction of MelGAN
-
Alex
- Nodejs/armv7 breakage
- TF 2.2 for STT runtime (Allows threading in TFLite)
- Starting KaiOS Work
-
Reuben
- Transcribed Mandarin validation data
- 0.7.1 work
- Monthly meeting with Bernardo from Iara Health
- 0.8.0 work (Usage docs, LM docs, NuGet docs, Java docs)
-
Tilman
- Running tests on augmentation test-training
- Preparing augmentation PR for review
- Considering addition of parameter scheduling
-
Kelly
- SIFIS-HOME H2020 Grant
- Reviewing all partners legal's markup of the CA agreement
- Reviewing Mozilla legal's markup of the CA agreement
- Meeting with Mozilla's legal later today on CA agreement
- Organizing next CA meeting
- Continuing the process of getting over the legal hurdles of the grant
- Bergamot
- Review of UI work from work package one
- Creating work plan for implementation of UI specifications
- Running several test training runs
- Started setting up training pipeline
- Working on Windows TC integrations
- Preparing for the new hires
- Reading "Bandit based Monte-Carlo Planning" by Kocsis and Szepesvári
-
-
Discussion
- KaiOS
-
Review of on-going work
-
Eren
- Convert Torch Tacotron2 to TF
- Fixing "Attention dies out after 18k iterations quite randomly"
- Automatize TTS
- Reproduction of MelGAN
- MelGAN with PWGAN loss from scratch (4m iterations)
- Bidirectional Decoding with r=7 in backwards decoder
- Graves attention after fix of "Attention dies out after 18k iterations quite randomly"
-
Alex
- Nodejs/armv7 breakage
- TF 2.2 for STT runtime (Allows threading in TFLite)
- Starting KaiOS Work
-
Reuben
- Undoing concatenation of Mandarin samples for transcription
- 0.7.1 release work
- Fixing bug in 0.7.0 Node.JS binding
- Addressing docs pitfalls and other 0.7.0 regressions raised on discourse
- Adding --candidate_transcripts flag to Python client
- 0.7.1-alpha.1 release
- Fixed bug due to poor error handling in DS_EnableExternalScorer
- Fixing broken package README on NPM
-
Tilman
- German LibriVox data set
- NTP server sync of cluster
- German model building
-
Kelly
- SIFIS-HOME H2020 Grant
- Continuing the process of getting over the legal hurdles of the grant
- Setting up next meeting of partners' legal officers to discuss Coalition Agreement template
- Bergamot H2020 Grant
- Running several test training runs
- Started setting up training pipeline
- Downloading training data to the server
- Working on Windows TC integrations
- Attended meeting on organization of W3C Web & Machine Learning workshop as a virtual conference
-
-
Discussion
- DS 0.7.0 \o/!
- Naming
- Sharing LibriVox Data Set on S3
-
Review of on-going work
-
Eren
- Convert Torch Tacotron2 to TF
- Automatize TTS
- Reproduction of MelGAN
- MelGAN with PWGAN loss from scratch (4m iterations)
- Finetune MelGAN model with TTS spectrograms
- Bidirectional Decoding with r=7 in backwards decoder
-
Alex
- Discourse support
- qm215 investigation
- TF 2.2 + TFLite
-
Reuben
- 0.7 release stuff
- 0.8.0
- Remove links to docs in GitHub from main README
- Branching vs. reducing github visibility
- Upgrade paths to TF 2.x
-
Tilman
- Tests for augmentation PR
- Improved augmentation PR
- German fine-tuning model (WER of 30% on current CV German test-set)
-
Kelly
- 0.7.0 Release
- SIFIS-HOME H2020 Grant
- Met with Finance + Intelligentsia on Indirect Costs
- Got approval from EC on transfer of Coordinator
- Continuing the process of getting over the legal hurdles of the grant
- Setting up a meeting of partners' legal officers to discuss Coalition Agreement template
- Bergamot H2020 Grant
- Running several test training runs
- Started setting up training pipeline
- Working on Windows TC integrations
-
-
Discussion
- Sharing LibriVox Data Set on S3?
- DS 0.7.0?
- LM parameters
- Audio duration
- Add Python 3.7, 3.8 CI coverage
- Do not use m/mu ABI for Py3.8+
- M-AILAB importer: Ensure all samples are 16 kHz
- Add missing external scorer
- ...
-
Review of on-going work
-
Eren
- Implemented Tacotron2 with TF
- Starting the process of passing weights from the Pytorch imp to the TF imp
- TTS overview document
- TF 2.0 version revisit
- Convert Torch Tacotron2 to TF *....
-
Alex
- Discourse support
- Mineco data
- Python version dance
- Working on Kabyle model
-
Reuben
- Delayed beam expansion to fix timing bug
- Looking into VAD/KWS decoder extensions arxiv:1611.09405
- Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
-
Tilman
- Tests regarding potential duplicate samples
- Noise importing
- Fix for M-AILAB importer
- Preparing German fine-tuning
- Packaging Brazilian Portuguese models
-
Kelly
- Bergamot
- Working on TC integrations
- SIFIS-HOME H2020 Grant
- Creating C-Level presentation on the grant
- Giving C-Level presentation on the grant
- Creating estimates for Roxi of "hidden costs"
- Met with Intelligentsia to discuss the less than rapid progress of Mozilla's C-Levels
- Meeting with legal to touch base on the SIFIS agreements
- Trying to get bank account set up to accept grant's funds
- Kicking off the process of getting over the legal hurdles of the grant
-
-
Discussion
- DS 0.7.0?
- Only allow graph/layer initialization at start of training
- PR 2876 (TypeScript Support)
- Transfer-learning docs
- Rewrite generate_lm.py to allow usage with other languages
- Default branch the latest stable?
- ...
-
Review of on-going work
-
Eren
- Implemented Tacotron2 with TF
- Starting the process of passing weights from the Pytorch imp to the TF imp
-
Alex
- TypeScript PR
- Mentoring Kabyle model
- Matrix integration with WebThings
- Intent parsing integrations with WebThings
-
Reuben
- Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
- Looking into VAD/KWS decoder extensions
-
Tilman
- Refactoring (overlay-) augmentation code
-
Kelly
- Bergamot
- Reviewing applicants
- Setting up applicant interviews
- Giving applicant interviews
- Working on TC integrations
- SIFIS-HOME H2020 Grant
- Trying to get bank account set up to accept grant's funds
- Kicking off the process of getting over the legal hurdles of the grant
-
-
Discussion
- DS 0.7.0?
-
Review of on-going work
-
Eren
- TTS with TF for 2.1
- Working with Turkish companies on getting TTS data + using CV for data collection
- Working with Turkish bank on creating a TTS data set
-
Alex
- New hardware
- French Ministerial Transformation Fund data
- Getting MATRIX Voice and WebThings working together
-
Reuben
- Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
- Mandarin data preparation
-
Tilman
- Updating the cluster
- Setting up .compute w/cache on worker
- Training LM from Oscar for PT and DE
-
Kelly
- Bergamot
- Reviewing applicants
- Setting up applicant interviews
- Working on TC integrations
- SIFIS-HOME H2020 Grant
- Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
- Trying to get bank account set up to accept grant's funds
- Kicking off the process of getting over the legal hurdles of the grant
- Prepare Annex A and B of the grant
-
-
Discussion
- How's it going?
- DS 0.7.0?
-
Review of on-going work
-
Eren
- German TTS voice
- Working with Turkish TTS companies
- TTS with TF for 2.1 (Better this time?)
-
Alex
- Getting MATRIX Voice and WebThings working together
-
Reuben
- Prototyping a tool for quick inspection/correction/tagging of samples in the cluster
- QuartzNet explorations
-
Tilman
- Experiments around utf8
- Training/benchmarking SDB's
- Working on unlabeled samples PR (Extension of 2622)
-
Kelly
- Bergamot
- Reviewing applicants
- Setting up applicant interviews
- Working on TC integrations
- SIFIS-HOME H2020 Grant
- Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
- Trying to get bank account set up to accept grant's funds
- Kicking off the process of getting over the legal hurdles of the grant
- Prepare Annex A and B of the grant
- Continuing with grant?
- TTS Voice negotiations
- Continuing discussions on IT's support of the DS server
-
-
Discussion
- How's home treating you?
-
Review of on-going work
-
Eren
- German TTS voice
- Boosting learners
- Refactoring TTS audio processing
- Implementing mean-var normalization in dev
-
Alex
- Getting Win GPU instance ready
- Getting MATRIX Voice and WebThings working together
-
Reuben [PTO]
-
Tilman
- Fixed/ing some issues with Oscar LM tool
- Training/benchmarking SDB's
- Starting to look at noise augmentation
-
Kelly
- Bergamot
- Reviewing applicants
- Setting up applicant interviews
- Working on TC integrations
- SIFIS-HOME H2020 Grant
- Submitting Mz Denmark GmbH's unaudited financials for 2017-2019 to the EC
- Trying to get bank account set up to accept grant's funds
- Kicking off the process of getting over the legal hurdles of the grant
- Prepare Annex A and B of the grant
- TTS Voice negotiations
- Internal "sell" on STT SaaS service
- Continuing discussions on IT's support of the DS server
-
-
Discussion
- Weekly Update Test
-
Review of on-going work
-
Eren
- German dataset
- German model
- German vocoder
- Working on TTS normalization
- Implemented guided attention + training a model with it
-
Alex
- French model
- Getting MATRIX Voice and WebThings working together
-
Reuben [PTO]
-
Tilman
- Re-exporting English LibriVox to SDB
- Creating German LM from Oscar
- Small SDB improvements
-
Kelly
- Reviewing Bergamot applicants again
- Interviewing Bergamot applicants again
- Preparing for Bergamot (remote) work week this week, was going to travel
- Created optimizer built on optuna to optimize lm_alpha + lm_beta
- Reading: "Concentration-of-measure Inequalities" from Lugosi
- Continuing discussions on IT's support of the DS server
- Met with private.ai to discuss privacy preserving training
- Setting up TTS to read the Mozilla weekly update
-
-
Discussion
- Journal Club or Berlin ML Seminar?
- Berlin ML seminar is on "Modelling of non-linear state space systems using deep neural network"[1]
-
Review of on-going work
-
Eren
- MelGan
- Guided attention
- German dataset
-
Alex
- French model
- Getting MATRIX Voice and WebThings working together
- Generalized validate_label, i.e. locale-specific
-
Reuben [PTO]
-
Tilman
- First test training with LibriVox SDB English export
- Creating German LM from Oscar
- Small SDB improvements
-
Kelly
- Reviewing Bergamot applicants again
- Interviewing Bergamot applicants again
- Created optimizer built on optuna to optimize lm_alpha + lm_beta
- Ran optimizer of lm_alpha + lm_beta on dev set + lowest loss model, got WER 5.97 on LibriSpeech clean test
- Reading: "Concentration-of-measure Inequalities" from Lugosi
- Continuing discussions on legal aspects of product data sets
- Continuing discussions on IT's support of the DS server
-
-
Discussion
- Alex and Trello
-
Review of on-going work
-
Eren
- Batching server
- Transferring all audio processing to Pytorch
-
Alex
- Making CI fast!
- pyenv PR improvement
- Android emulator and gradle parts
-
Reuben
- API changes for DeepSpeech v1.0
- Analyzing the speech proxy data dump
- QuartzNet explorations
-
Tilman
- Refactoring DSAlign exporter to consume less memory and run faster
-
Kelly
- Reviewing Bergamot applicants again
- Creating optimizer built on optuna to optimize lm_alpha + lm_beta
- Running optimizer lm_alpha + lm_beta on lowest loss model
- Reading: "Concentration-of-measure Inequalities" from Lugosi
- Kick starting discussions on legal aspects of product data sets
- Kick starting discussions on IT's support of the DS server
-
-
Discussion
- KR's
-
Review of on-going work
-
Eren
- Punctuation
- Dealing with the German dataset
-
Alex
- Landing node-gyp cache
- Trying to get local SWIG build working w/TC
- Working on Matrix on RPi4 with DeepSpeech and WebThings
-
Reuben
- Analyzing the speech proxy data dump
- Landing transfer learning PR
- Reviewing top_paths != 1 PR
-
Tilman
- Worked on catalog combiner
- Worked on export.py of DSAlign
-
Kelly
- Creating optimizer built on optuna to optimize lm_alpha + lm_beta
- Working on Mozilla ML Web Site
- Continued training English models on the 6000's
- Revising KR's
- Bergamot
- Reviewing applicants again
- Working with graphics designer on poster/flyer design
- Working on TC integrations
-
-
Review of on-going work
-
Eren
- Released the best LJSpeech TTS and PWGAN model
- Writing examples for TTS
- Working with German TTS talent to fix some examples + add data
- Add Decouples Linear Loss to Dev branch
- Pytorch based STFT
- Train only with log domain mel w/o any further preprocessing
- Train with larger discriminator as in
-
Alex
- Landing node-gyp cache
- Trying to get local SWIG build working w/TC
- Working on Matrix on RPi4 with DeepSpeech and WebThings
-
Reuben
- API stabilization for DeepSpeech v1 (Single file packaging for LM, model packaging...)
- Analyzing the speech proxy data dump
- Analyzing DeepSpeech errors in Firefox Voice
- Create Mandarin LMs from OSCAR
- Gathering and sending DanSpeech student project ideas
-
Tilman
- SDB PR follow-up
- Implemented SDB export in DSAlign
-
Kelly
- Review PR 2723
- Training English models on the 6000's
- Benchmarking pre MFCC fix English models on the 6000's
- Talking with new finance about the EU grant
- Working on Mozilla India Voice Strategy
- Working on Mozilla+GIZ Call for Proposals
- Working on getting the auditor's declaration to the EU signed
-
-
Discussion
- All-Hands
- DeepSpeech model installation UX
-
Review of on-going work
-
Eren
- Mel-GAN experiments
- PWGAN experiments (Train with larger discriminator, No normalization only db conversion...)
- TTS (Pytorch based STFT, Train only with log domain mel...)
-
Alex
- Landing CTC decoder checks
- Rebasing French docker to current 0.7.0a1 + Running training to verify no regression
- Playing with MATRIX Voice
-
Reuben
- API stabilization for DeepSpeech v1 (Single file packaging for LM, model packaging...)
- Embed beam width into model and simplified CreateModel API
- Analyzing DeepSpeech errors in Firefox Voice
- Create Mandarin LMs from OSCAR
- QuartzNet explorations
- Client library for TTS server
-
Tilman (PTO)
-
Kelly
- Some "BizDev" for Petpooja
- Bergamot travel/conference planning
- Training English models on the 6000's
- Talking with new finance about the EU grant
- Working on getting the auditor's declaration to the EU signed
-
-
Announcements
- Internal Q+A
- Objectives
- All-Hands
- Demos
- Lightning talks
-
Review of on-going work
-
Eren (PTO)
-
Alex
- Discourse/GitHub support
- Updating WebSpeech API code
- Cross-compiled KenLM for RPI to dynamically build on-device LM (Cool!)
-
Reuben
- Single file packaging for LM
- API changes as a result of single file packaging for LM
-
Tilman
- SDB file format (Finished coding on external sorting)
-
Kelly
- Bergamot demo (Compiling + running server on cluster)
- Rasa demo (Getting initial version running and modifying it)
- Getting acoustic office booth for All-Hands (Solidifying shipping date + movers)
- Monthly Bergamot Meeting
- Getting Intelligentsia Consultants paid
- Getting new contract signed for auditor
-
-
Announcements
- Deep Speech 0.6.1 out!
- Copenhagen Report
- Hyperparameter tuning
- Deep Speech 1.0.0
-
Review of on-going work
-
Eren
- Fast Speech (Attention to Feedforward)
- PWGAN experiments (CPU realtime, GPU much faster than realtime)
- Post-Net/Decoder decoupling
- Mel-GAN experiments
- Forward attention and Tacotron2 experiments
- DeepSpeech with TF2
-
Alex
- Discourse/GitHub support
- Cross-compiled KenLM for RPI to dynamically build on-device LM (Cool!)
-
Reuben
- 0.6.1 out the door
- Create Mandarin LMs from OSCAR
- Experimenting with different LM sizes
- Experimenting with QuartzNet
- Training Mandarin models
- TensorFlow 1.15 update
-
Tilman
- Continued work on automatic sourcing of Librivox data
- Calibre based tool to go from formatted book to plain text
- Started a run server, currently aligned 27K hours of English!
- SDB file format
-
Kelly
- Getting acoustic office booth for All-Hands
- Setting up Firefox w/Bergamot translation engine again for All-Hands demo
- Planning for meeting Edinburgh Group in Berlin before All-Hands for demo work
- Proof reading 1E9 interview
- Training English models on the 6000's (Optimizing dropout, alpha, and beta)
- Figuring out legalities of billing + hiring for the EU grant
-
-
Announcements
- Welcome back!
- 0.6.1?
-
Review of on-going work
-
Eren
- Fast Speech (Attention to Feedforward)
- PWGAN experiments (CPU realtime, GPU much faster than realtime)
- Post-Net/Decoder decoupling
- Mel-GAN experiments
- Forward attention and Tacotron2 experiments
- DeepSpeech with TF2
-
Alex
- Discourse/GitHub support
- Cross-compiling KenLM for RPI to dynamically build on-device LM (Cool!)
-
Reuben
- In Copenhagen with Alex talking with Denmark government on their STT challenge
- Experimenting with QuartzNet https://arxiv.org/abs/1910.10261
-
Tilman
- Continued work on automatic sourcing of Librivox data
- More reliable ZIM indexing/book-fetching
- Calibre based tool to go from formatted book to plain text
- Started a run server, currently aligned 27K hours of English!
- Working on experimental data format
-
Kelly
- Nothing useful, answering emails all day (Inbox was over 500 now is at 34)
-
-
Announcements
- Kelly at NeurIPS
-
Review of on-going work
-
Eren
- Picking things back up after vacation
- Inspecting experiments that finished over vacation
- Working around sshfs problems by spawning a separate mount per experiment
- Comparing vocoder architectures
- Talking to some companies that are interested in TTS collaboration
-
Alex
- Talked at Open Source conference
- Got some feedback from people integrating DeepSpeech
- Talk was well received
- Talked to people working on integrating DeepSpeech French model for automatic video transcripts
- Collaboration between several universities
- Following up on conversations
- Firefox DeepSpeech process isolation
- Need to get in touch with maintainer
- Check assumptions on what's doable/acceptable
-
Reuben
- Fixing some problems that popped up in the 0.6 release
- Moving examples off-repo
- Publishing TFLite package on Linux/Windows/macOS
- Using simpler README in online package listings
- Following up on feedback from release
- Talked to Ryan Hileman from https://talonvoice.com/
- He mentioned that using all of Common Voice valid.tsv led to significant improvements for people with non-US accents
- We should experiment with using all of valid (subtracting dev/test) to see if it helps
- Experimenting with QuartzNet https://arxiv.org/abs/1910.10261
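A minimal sketch of the experiment proposed in Reuben's notes above: train on every clip in valid.tsv except those already held out in dev or test. It assumes Common Voice style TSV files with a `path` column identifying each clip; the helper names are hypothetical, and the sketch only removes overlapping clips, not overlapping speakers.

```python
import csv
import io


def read_rows(tsv_text):
    """Parse Common Voice style TSV text into a list of dict rows."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


def training_rows(validated, dev, test, key="path"):
    """Return the validated rows whose clip is in neither dev nor test.

    `key` is assumed to be a column that uniquely identifies a clip.
    """
    held_out = {row[key] for row in dev} | {row[key] for row in test}
    return [row for row in validated if row[key] not in held_out]


# Tiny in-memory stand-ins for valid.tsv, dev.tsv and test.tsv.
validated = read_rows("path\tsentence\na.mp3\thello\nb.mp3\tworld\nc.mp3\tfoo\n")
dev = read_rows("path\tsentence\nb.mp3\tworld\n")
test = read_rows("path\tsentence\nc.mp3\tfoo\n")

train = training_rows(validated, dev, test)
print([row["path"] for row in train])
```

With real data, the three inputs would be read from the release's TSV files instead of inline strings.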
-
Tilman
- Continued work on automatic sourcing of Librivox data
- More reliable ZIM indexing/book-fetching
- Calibre based tool to go from formatted book to plain text
- Started a basic run on home box to test whole setup end-to-end
- Modulo export step, everything seems to be working
- Can improve throughput by parallelizing downloads
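One way the download parallelization mentioned in Tilman's notes above could look, as a rough sketch using a thread pool. `fetch_all` and `download` are hypothetical names rather than part of the existing sourcing tool, and the worker count is an arbitrary assumption.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def download(url):
    """Fetch one URL and return its body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_all(urls, fetch=download, max_workers=8):
    """Run `fetch` over all URLs concurrently, keeping the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

Threads suit this case because downloads are I/O-bound; `fetch` is a parameter so the pool can be exercised without network access.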
-
Kelly
- At NeurIPS
-
-
Announcements
- Release 0.6.0 TODO's
- Blog post [Reuben]
- Partner email [Kelly]
- Benchmark models [Kelly]
- Release notes [Reuben/Kelly]
-
Review of on-going work
-
Eren (PTO)
-
Alex
- Working on presentation for Open Source conference
- Firefox DeepSpeech process isolation
- Meetup on Wed w/Deep Speech + Video Streaming integration group
-
Reuben
- Finished awesome TTS packaging
- Blog post about DeepSpeech 0.6.0
- Improving word timing demo
- Somewhat stable API for TTS models
- Obtained remainder of the OI's 1950 hours of Mandarin
-
Tilman
- Re-Export of NPR data
- Continued work on LibriVox sourcing code
- More reliable ZIM indexing/book-fetching
- Calibre based tool to go from formatted book to plain text
-
Kelly
- IGF Conference
- LREC 2020 paper on Common Voice
- ACL 2020 paper on Interpreting Contextualized Representations via Static Embedding Analysis
- Continuing talks with our voice talent over public use of her voice data
- Getting contract for a German Voice talent signed
- Getting Intelligentsia Consultants paid for their work on the new IoT H2020 grant
- Preparing talk for LT4All Conference
- Denmark GovTech-Program meeting
- Deep Speech 0.6.0
- Training, and benchmarking final 0.6.0 model
- Readthedocs documentation
- Wrote partner email + gathered partner emails
-
-
Announcements
- W3C Workshop on Web & Machine Learning
- All-Hands demos
- Intern interviews
- Release 0.6.0 TODO's
- Blog post [Reuben]
- Release notes [All]
- Language model from new corpus [Reuben]
- Train models using latest Common Voice [Kelly]
- Benchmark model using latest Common Voice [Kelly/Reuben]
- Deep Speech 1.0
-
Review of on-going work
-
Eren (PTO)
-
Alex
- MozIoT addon for STT
- Updating WebSpeech API patches
- Starting work on RemoteDataDecoder
-
Reuben
- API for TTS models
- Making REST server easy to use via wheel packaging
- Testing TTS wheel on Linux
- Writing instructions
- Blog post about DeepSpeech 0.6.0
- Obtained remainder of the OI's 1950 hours of Mandarin
-
Tilman
- Improving DSAlign
- Getting better alignment logging
- Getting (speaker) meta-data
-
Kelly
- Gave LPSS keynote
- Kicking off talks with our voice talent over public use of her voice data
- Getting first draft of contract for a German Voice talent reviewed by legal + voice talent
- Getting Intelligentsia Consultants paid for their work on the new IoT H2020 grant
- Working with Diane Tate on All-Hands demo admin
- Preparing talk for LT4All Conference
- Training, and eventually benchmarking on many data sets, models for 0.6.0
-
-
Announcements
- All-Hands demos
- Rasa integration of STT + TTS (English) [Kelly]
- NMT integration into Firefox (German-to-English) [Kelly]
- WebThings Deep Speech integration (English) [Alex]
- STT (German) [Tilman]
- Standard TTS demo showing quality w/BERT generator (English) [Eren]
- STT in Firefox (English + client-server) [Alex]
- STT in MR (English) [Alex]
- Release 0.6.0 TODO's
- Blog post [Reuben]
- Release notes [All]
- Language model from new corpus [Reuben]
- Train models using latest Common Voice [Kelly]
- Benchmark model using latest Common Voice [Kelly/Reuben]
- Deep Speech 1.0
-
Review of on-going work
-
Eren
- TTS developer support
- Updating/fixing colab example
- Merge Zoneout and new Forward attention to dev branch
- Check incompatibility of PWGAN results wrt ESPNet
- New audio parameters w/PWGAN experiments
- Use Zoneout instead of Dropout w/Forward attention and Tacotron2 experiments
- Implement Guided Attention
-
Alex
- Fixing French Common Voice/Sentence Collector dataset
- Webspeech API (temp) patch for noise cancellation
- WebThings integrations with Deep Speech
- Spoke at Toulouse's Capitole du Libre
-
Reuben
- Experimenting with OpenWebText corpus for LM
- Blog post about DeepSpeech 0.6.0
- Experiments with Mandarin data from OI (Test model quality)
- Optimizing UTF-8 training code
-
Tilman
- Landing transcribe.py
- Working on alignment-automation
-
Kelly
- Training 0.6.0 model (Low learning rates with high dropouts, smooth convergence)
- Preparing talk for LPSS
- Organizing CV demo for LT4All conference
- New H2020 IoT grant legwork (Writing proposal, due November 19th)
- New H2020 grant proposal "AIZEN" submitted
- Flying to Taipei tomorrow
-
-
Announcements
- All-Hands demos
- Rasa integration of STT + TTS (English) [Kelly]
- NMT integration into Firefox (German-to-English) [Kelly]
- WebThings Deep Speech integration (English) [Alex]
- STT (German) [Tilman]
- Standard TTS demo showing quality w/BERT generator (English) [Eren]
- STT in Firefox (English + client-server) [Alex]
- STT in MR (English) [Alex]
- Release 0.6.0 TODO's
- Blog post [Reuben]
- Release notes [All]
- Language model from new corpus [Reuben]
- Train model using latest Common Voice [Kelly]
- Benchmark model using latest Common Voice [Kelly/Reuben]
- Deep Speech 1.0
-
Review of on-going work
-
Eren
- Training a Parallel WaveGAN model
- Forward backward decoder training
- Looking again at Location-Relative Attention Mechanisms
-
Alex
- Preparing for talk next week at Toulouse's Capitole du Libre
- TFLite representative data set optimization does not optimize
-
Reuben
- Landing UTF-8 changes (Replacing current character-based mode with UTF-8 mode)
- Experimenting with OpenWebText corpus for LM
- Blog post about DeepSpeech 0.6.0
- Experiments with Mandarin data from OI (Test model quality)
- Optimizing UTF-8 training code
-
Tilman
- Building German LM
- Imported all German data to server
- Prepping for test epoch of all German data
- Uplifting transcribe.py to work with catalog files
-
Kelly
- Training 0.6.0 model
- Preparing talk for LPSS
- Bergamot WebSite (Re-working the partner page)
- Organizing CV demo for LT4All conference
- New H2020 IoT grant legwork (Writing proposal, due November 19th)
- New H2020 grant proposal "AIZEN" (Writing proposal, due November 19th)
-
-
Announcements
- All-Hands demos?
- Release 0.6.0 TODO's
- Blog post
- Release notes
- Language model from new corpus?
- Train model using latest Common Voice
- Benchmark model using latest Common Voice
- 1.0 Before All-Hands?
- Downloader for models?
- Downloader allows for 3rd parties?
- ...
-
Review of on-going work
-
Eren
- Evaluating different TTS architectures (N x Conv Encoder for different N)
- Fine tuning with a small corpus (Attention fails)
- Implementing TTS in TF 2.0 for deployment (TF2 needs more to make TTS on par with pytorch.)
- Forward backward decoder training
- Looking again at Location-Relative Attention Mechanisms
-
Alex
- Taskcluster migration (Surprise!)
- TFLite representative data set optimization does not optimize
-
Reuben
- Landing UTF-8 changes (Replacing current character-based mode with UTF-8 mode)
- Embedded alphabet in model!
- Experimenting with OpenWebText corpus for LM
- Blog post about DeepSpeech 0.6.0
- Experiments with Mandarin data from OI (Test model quality)
- Optimizing UTF-8 training code
-
Tilman
- Imported all German data to server
- Prepping for test epoch of all German data
- Building German LM
- Uplifting transcribe.py to work with catalog files
-
Kelly
- New H2020 IoT grant legwork (Writing proposal, due November 19th)
- Preparing talk for LPSS
- Organizing CV demo for LT4All conference
- Bergamot WebSite (Re-working the partner page)
- New H2020 grant proposal "AIZEN" (Writing proposal, due November 19th)
-
-
Announcements
- 2020 Planning Meeting
- All-Hands demos?
-
Review of on-going work
-
Eren
- TTS w/TF (Slow)
- Merging bidirectional TTS decoder
-
Alex
- Fixed framabook and wikisource data extractors for unicode normalization
- Fixing evaluate_tflite
- Running a non-optimized French model for WER comparison of representative_dataset usage
-
Reuben
- Landing UTF-8 changes
- Merged and rebased preliminary changes
- Replacing current character-based mode with UTF-8 mode
- Shift byte values by 1 to keep alphabet size 256
- Writing blog post about new things in DeepSpeech 0.6.0
- Experiments with Mandarin data from OI
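One reading of the byte-shift note in Reuben's UTF-8 items above, as a hedged sketch: since the zero byte does not occur in ordinary text, bytes 1..255 can be mapped to labels 0..254, leaving label 255 for the CTC blank and keeping the output layer at 256 units. The blank position and the function names are assumptions, not confirmed by the notes.

```python
BLANK_LABEL = 255  # assumed position of the CTC blank
NUM_OUTPUTS = 256  # 255 byte labels plus the blank


def text_to_labels(text):
    """Encode text as UTF-8 and shift each byte down by one."""
    data = text.encode("utf-8")
    if 0 in data:
        raise ValueError("NUL bytes cannot be represented after the shift")
    return [byte - 1 for byte in data]


def labels_to_text(labels):
    """Drop blanks, shift each label back up by one, and decode as UTF-8."""
    data = bytes(label + 1 for label in labels if label != BLANK_LABEL)
    return data.decode("utf-8")


labels = text_to_labels("你好 hi")
print(labels)
print(labels_to_text(labels))
```

Because the labels are bytes rather than characters, the same 256 outputs cover Mandarin and Latin-script text alike, at the cost of longer label sequences for multi-byte characters.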
-
Tilman
- Importer fixes
- Importing CV-de, M-AILABS, SWC and TUDA on cluster
- Thinking about presenting about online hard example mining in Journal Club next week
-
Kelly
- Voice 2020 planning
- 2 day planning meeting
- Writing up 2020 OKR's
- New H2020 grants legwork
- Writing up grant for secure WoT work including on-device STT
- Writing up grant for AIZEN, PhD "interns" at Mozilla
- Helping security group navigate H2020 grants
-
-
Announcements
- Some nodes' GPUs are falling off the PCIe bus (2 or 3 GPUs)
-
Review of on-going work
-
Eren
- TF2 implementation of DeepSpeech
- Started training on multi-GPU (on local machine)
- Then will try the cluster
- Forward-backwards attention still training
- Fixed some workstation problems
- Will continue experiments on TTS
- Train Tacotron with x-vectors
- Not good quality
-
Alex
- Fixed client memory leak last week
- Valgrind is completely happy with our code now
- Using feeding code to extract a representative dataset for TFLite quantization
- Submitted and got accepted to talk at an event about Common Voice and DeepSpeech. Event is on 16/11/2019.
- Contacted by French company interested in using DeepSpeech in their products for people with a hearing impairment.
- Benchmarking different batch sizes with CuDNN RNN and French data.
- Disabling early stopping and training for longer helps the accuracy of the French model (WER 9.5% w/ 15 epochs -> 7.5% w/ 20 epochs).
-
Reuben
- Experiments with Mandarin data from OI
- Re-importing from scratch with all checks and clean-up passes in a single script to make sure nothing is left out.
- Then OI can use the script to check the data as it's provided by the vendor.
- Adapting scorer clean-up patches to replace current character-level code paths with codepoint-level logic + utf-8 instead
- Re-testing utf-8 checkpoints with test/export graph fix (need to re-test baseline)
-
Tilman
- Cluster set-up last week went smoothly
- Today realized some GPUs are dropping off the PCIe bus
- Doing some stress tests
- Counterintuitively, things seem more stable under stress, but need more testing
- Did some binary search for max batch size on new machines, seems like we can get 180 on the Q6000
- Talked to lawyers about German copyright(-equivalent) law
- Thinking about presenting about online hard example mining in Journal Club next week
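The binary search for a maximum batch size mentioned in Tilman's notes above can be sketched as follows. `fits` is a hypothetical callback that would run a trial training step and report whether it succeeded without running out of GPU memory; the search assumes that if a batch size fits, every smaller one does too.

```python
def max_batch_size(fits, low=1, high=1024):
    """Return the largest n in [low, high] for which fits(n) is True.

    Returns 0 if no batch size in the range fits.
    """
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if fits(mid):
            best = mid        # mid works, try larger sizes
            low = mid + 1
        else:
            high = mid - 1    # mid fails, try smaller sizes
    return best


# Stand-in for a real memory check, using the 180 reported in the notes.
print(max_batch_size(lambda n: n <= 180))
```

The search needs roughly log2(high) trial runs, about ten for the default range.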
-
Kelly Did not attend
-
-
Announcements
- At Meta-Forum, Mozilla won the Meta Seal of Recognition
- Server installation this week!
- Power going out in the Berlin office Wed/Thur nights, but not in the server rooms!
-
Review of on-going work
-
Eren
- Train Tacotron with x-vectors
- Implementing different TTS architectures (Nx conv encoder)
- Implementing TTS in TF 2.0
-
Alex
- WS API Patch work with A and R
- WS API 2 out of 3 positive reviews!
- PR for exposing LM parameters
- Creating a French model v0.3
- Experimenting with metadata for tflite, e.g. graph version
-
Reuben
- Experiments with Mandarin data from OI
- Adapting scorer clean-up patches to replace current character-level code paths with codepoint-level logic + utf-8 instead
- Testing 5127 checkpoint against other datasets
- Re-testing utf-8 checkpoints with test/export graph fix
- pt-BR training runs after reducing sentence duplication in the dataset
- Importing CV 2.0 English in the cluster
- Improving how native client handles sample rates
-
Tilman
- Spoken Wikipedia data/importer
- Preparing statistics about LibriVox
-
Kelly
- EU 9 month review
- Server installation
- Recommendation for Rishi
- Letter of support for EAR proposal
- SIFIS-Home H2020 grant proposal
- AIZEN H2020 grant proposal
- Preparing for 2020 planning meeting at MV next week
- Writing ApS Termination Report for NMT H2020 grant
-
-
Announcements *
-
Review of on-going work
-
Eren traveling to conference
-
Alex
- Refactoring of documentation to RST format
- Looking into memory leaks reported by Carlos
- Landed m-AI Labs importer
- Working on French model with m-AI Labs dataset
-
Reuben
- Creating Mandarin LM from Wikipedia using CV scripts
- LM leads to very marginal improvement in CER
- Latin alphabet being present in LM vocabulary biases decoder towards beams with Latin characters because they get scored first
- Building the LM from the same data but excluding Latin characters to see how big the impact is
- Depending on results need to think about how to make the model handle Latin alphabets for things like brand names, country names, etc.
- Trying (unsuccessfully) to replicate the weirdly low validation loss from 5127 run (compared to test loss)
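A small sketch of the corpus filtering described in Reuben's notes above, so that the LM vocabulary contains no Latin characters. It assumes that whole sentences containing any basic Latin letter are dropped (rather than stripping the letters out), and the function name is hypothetical.

```python
import re

# Basic ASCII Latin letters only; full-width or accented forms are not covered.
LATIN_LETTER = re.compile(r"[A-Za-z]")


def drop_latin_sentences(sentences):
    """Keep only the sentences that contain no Latin letters."""
    return [s for s in sentences if not LATIN_LETTER.search(s)]


corpus = ["我喜欢苹果", "我用 iPhone 打电话", "你好"]
print(drop_latin_sentences(corpus))
```

Comparing CER with an LM built from the filtered corpus against one built from the full corpus would show how much the Latin vocabulary biases the decoder.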
-
Tilman
- First NPR training run!
- Investigating weird results from 5127 run: high test WER, high test loss compared to validation
- Writing German importers for
- "Spoken Wikipedia Corpora" <- focusing on this one, kind of complicated to import
- "German Distant Speech Corpus (TUDA)"
- ...
-
Kelly Traveling to Meta Forum
-
-
Announcements
- Register for All-Hands!
-
Review of on-going work
-
Eren
- Working on multi-speaker model (Got 900 speakers working, females work better)
- Enabled using multiple datasets for TTS
- Training Tacotron2 with x-vectors
- Looking to move to TF2.0
-
Alex
- Trying to work around the duplication problems in French Common Voice (Duplicates hurt WER a lot)
- Integrating DeepSpeech into WebThings for on-device STT
- Evaluating M-AILABS dataset for French (Looking good for STT use)
- Taskcluster: Fixing disk space issue on macOS
-
Reuben
- Tackled the UTF-8 decoder head-on
- UTF-8 close to grapheme performance
- UTF-8 Mandarin training runs with Magicdata
- Creating Mandarin LM from Wikipedia using CV scripts
-
Tilman
- First NPR training run!
- Writing German importers for
- "Spoken Wikipedia Corpora"
- "German Distant Speech Corpus (TUDA)"
- ...
-
Kelly
- ICLR 2020
- Finished submission with Rishi
- LPSS 2020
- Writing submission
- Preparing plenary
- Preparing for Meta Forum 2019 - European Language Grid conference
- Got poster printed
- Preparing talk
- Jugend hackt
- Preparing talk for Friday
- EU NMT grant contractual work
- New H2020 Grant
- Writing grant
- Organizing meetup of coalition
- Organizing funds for Intelligentsia to help with grant writing
- Branding "Kickoff" meeting for TTS and STT
- 100k hours engineering sync up meeting
-
-
Announcements
- London ReWork
- Interspeech
- Meta-Forum
- New server delivered
-
Review of on-going work
-
Eren
- Implement Duration Predictor
- Train Duration Predictor
- Compute phoneme alignments with attention
- Try multi-head attention
-
Alex
- Unified documentation!
- Examples now run in PR's!
- RPi4 works in realtime!
-
Reuben
- Baseline Mandarin grapheme runs
- Testing data augmentation
- Augmentation documentation
-
Tilman
- First NPR forced alignment run done!
- Fixing some data problems
-
Kelly
- ICLR 2020
- Writing/revising submission with Rishi
- LPSS 2020
- Writing submission
- Preparing for Meta Forum 2019 - European Language Grid conference
- Getting poster printed
- Preparing talk
- EU NMT grant contractual work
- Got C. Beard to sign the GA Declaration + Accession Form
- Got the Opinion Letter of the Leaving Beneficiary signed by the ApS
- New H2020 Grant
-
-
Announcements
- N/A?
-
Review of on-going work
-
Eren (OOO)
- Multi-speaker embedding
- Multi-headed attention test
- Implement Duration Predictor
-
Alex (PTO)
-
Reuben
- Baseline Mandarin grapheme runs
- Testing data augmentation
- Data augmentation PR
- API cleanup
-
Tilman
- First NPR forced alignment run done!
-
Kelly
- New servers on order should be built in 3 weeks
- First version of Firefox build with Edinburgh NMT addition
- Preparing for Meta Forum 2019 - European Language Grid conference
- Designing poster
- Purchasing poster holder
- Getting poster printed
- EU NMT grant contractual work
- Getting C. Beard to sign the GA Declaration + Accession Form
- Getting the Opinion Letter of the Leaving Beneficiary signed by the ApS
-
-
Announcements
- Register for a conference if you want:
-
Review of on-going work
-
Eren
- Working on speaker vocoder, trying to overfit test set
- Got multi-speaker working well
-
Alex (PTO)
-
Reuben
- Baseline Mandarin grapheme runs
- Testing data augmentation
- Profiling native client
-
Tilman
- Forced alignment
- Started testing with a subset of the NPR data
- Almost finished transcribing NPR data
-
Kelly
- New servers on order should be built in 3 weeks
- Working with OI over Mandarin data set mitigation plan
- Debugging German Firefox build with Edinburgh NMT addition
- Reviewing rasa's approach to conversational AI
- Read "Rasa: Open Source Language Understanding and Dialog Management"
- Read "Few-Shot Generalization Across Dialog Tasks"
- Went through getting started tutorial
- Going through STT/TTS tutorial
- EU NMT grant contractual work
- Getting C. Beard to sign the GA Declaration + Accession Form
- Getting the Opinion Letter of the Leaving Beneficiary signed by the ApS
-
-
Announcements
- Register for a conference if you want:
-
Review of on-going work
-
Eren
- Testing new non-linearity
- Got server set up and sent out to others for DE TTS
- Working on getting VCTK data set working
- Talking to Vestel on possible partnership
-
Alex (did not attend)
-
Reuben
- TTS C/C++ API
- Baseline Mandarin grapheme runs
- Decoder refactor
-
Tilman
- Forced alignment
- Working on edge cases from alignment algorithm to deal with weaknesses in STT model
- Finished the README for aligner and some related polish work
- Parallelizing DeepSpeech transcription process
- Started testing with a subset of the NPR data
-
Kelly
- Catching up on email
- Celtic Language Technology Workshop Keynote
- The Next Web/Common Voice interview
- EU NMT grant contractual work
- Finding out how the contract should be changed
- Becoming LEAR for GmbH
- Writing letter for ApS to leave grant
- Gathering documents required for the GmbH to sign to enter the grant
- Getting Firefox build setup to integrate the Edinburgh's NMT engine server-side
-
-
Announcements
- Register for a conference if you want:
-
Review of on-going work
-
Eren
- Merging gradual training (gradual decrease of r-value during training)
- In the dev branch, waiting for tests before merging into master
- Started work on German dataset again
- Decided to have a demo model for Telekom
- If demo goes well we'll be better able to record a better German dataset
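The gradual training item in Eren's notes above lowers the r-value as training progresses. A minimal sketch of such a schedule follows, assuming r is the number of output frames the decoder predicts per step; the step thresholds and r values are made-up placeholders, not the settings used in the TTS repo.

```python
# (first_step, r) pairs in increasing step order; all values are hypothetical.
SCHEDULE = [(0, 7), (10000, 5), (50000, 3), (130000, 2)]


def reduction_factor(step, schedule=SCHEDULE):
    """Return the r-value in effect at a given global training step."""
    r = schedule[0][1]
    for first_step, value in schedule:
        if step >= first_step:
            r = value
    return r


for step in (0, 9999, 10000, 200000):
    print(step, reduction_factor(step))
```

Starting with a large r gives the attention an easier alignment problem early on, and the later, smaller values let the decoder produce finer-grained output.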
-
Alex (did not attend)
-
Reuben
- TTS C/C++ API
- Baseline Mandarin grapheme runs
- Decoder refactor
-
Tilman
- Forced alignment
- Working on edge cases from alignment algorithm to deal with weaknesses in STT model
- Finished the README for aligner and some related polish work
- Parallelizing DeepSpeech transcription process
- Then will start testing with a subset of the NPR data
-
Kelly (Celtic Language Technology Workshop)
-
-
Announcements
- Journal Club on PTO, conflicts with ET Monthly All Hands
- Register for a conference if you want:
-
Review of on-going work
-
Eren
- Just back from PTO
-
Alex
- Just back from PTO
-
Reuben
- TTS C/C++ API
- Baseline Mandarin grapheme runs
-
Tilman (PTO)
-
Kelly
- Working w/Google W3C on modifications to the Web Speech API
- Working on SNAFU ApS vs GmbH
- Got PIC number validated
- In the process of getting LEAR appointment
- Prepping for Celtic Language Technology Workshop
- New H2020 grants legwork
- Ordering new worker servers
-
-
Announcements
- Journal Club
- Rishi: Walking backwards on Sesame Street - An evaluation of context independent word vectors derived from context dependent ones
- Register for a conference if you want:
-
Review of on-going work
-
Eren (PTO)
-
Alex (PTO)
-
Reuben
- TTS C/C++ API
- Landed CuDNN RNN PR
- Fixed TF 1.14 regression
- Baseline Mandarin grapheme runs
-
Tilman
- Forced alignment
- Writing README for repo
- Finishing up alpha version
- Recursive split/descent approach for remaining short/bad-matches
-
Kelly
- ACL/NMT/NLP for Conversational AI conferences
- NMT Project WebSite updates
- NMT Project Coalition Meeting
- Working w/Google W3C on modifications to the Web Speech API
- Working on SNAFU ApS vs GmbH
- Got PIC number validated
- In the process of getting LEAR appointment
- LEAR appointment letter
- Declaration of Consent
- Legal Representative identity document
- LEAR identity document
- Prepping for Celtic Language Technology Workshop
- New H2020 grants legwork
- Ordering new worker servers (Waiting for Godot to sign)
- Reviewing CV's/Interviewing for grant positions (On hold due to ApS vs GmbH SNAFU)
-
-
Announcements
- Journal Club volunteers?
-
Review of on-going work
-
Eren
- Train TTS in Libri-TTS with 300 speakers
- Implemented Deep Griffin–Lim via 1903.03971
- Try to predict phase from amplitude spectrogram
- Added speaking embedding with Thomas
-
Reuben
- Apartments!
- Setup of new laptop
- Preparing CuDNN RNN PR
- Fixing TF 1.14 regression
- Baseline Mandarin grapheme run
-
Alex
- Improving macOS build time
- Fixing a Firefox crash on Windows
- Firefox
- Implementing json model descriptor
- Implementing model download from about:deepspeech
- Implemented libdeepspeech download from about:deepspeech
-
Tilman
- Forced alignment
- Writing README for repo
- Finishing up alpha version
-
Kelly
- Submitted grant w/Te Hiku Media to NZ government
- Working on SNAFU ApS vs GmbH
- Getting VAT Extract from Bundeszentralamt für Steuer
- Got registration extract for the GmbH from Handelsregister in Berlin
- Filled out FEL Form private entity form for GmbH
- Got signature from Chris Beard on FEL Form
- Starting process of getting LEAR appointment
- New H2020 grant legwork
- Ordering new worker servers
- Reviewing CV's/Interviewing for grant positions (On hold due to ApS vs GmbH SNAFU)
-
-
Announcements
- Journal Club volunteers?
- ICTurkey in Istanbul
-
Review of on-going work
-
Eren
- ICTurkey in Istanbul
- Train TTS in Libri-TTS with 300 speakers
- Improving on Griffin–Lim with 1903.03971
- Work on optimizing WaveRNN for inference
- Starting on pruning work, reading research
- Adding speaking embedding with Thomas
-
Reuben
- Apartments!
- TF 1.14 update
- Baseline Mandarin grapheme run
- Rebasing CuDNN RNN branch to PR after 1.14 lands
-
Alex
- Fixing TC problems with npm
- Updated nodejs versions
- Fixed nodejs destructor crashes
- Firefox
- Implementing json model descriptor
- Implementing model download from about:deepspeech
- Implemented libdeepspeech download from about:deepspeech
-
Tilman
- Forced alignment
-
Kelly
- Met with Viamo + GIZ on possible use of DS in Viamo's many IVR installs
- Working on Te Hiku Media grant
- Working on SNAFU ApS vs GmbH
- New H2020 grant legwork
- Ordering new worker servers
- Reviewing CV's/Interviewing for grant positions on hold due to SNAFU (ApS vs GmbH)
-
-
Announcements
- Rishi is going to present at this week's Journal Club!
-
Discussion
- No topics?
-
Review of on-going work
-
Eren
- Work on optimizing WaveRNN for inference
- Trying to train Libri TTS w/300 speakers
- Created working model w/24KHz sampling
- Working on model w/16KHz sampling
- Starting on pruning work, reading research
- Working on adding phase prediction net to Griffin-Lim
- Adding speaking embedding with Thomas
- Going to Istanbul for H2020 meeting
-
Reuben (Moving)
- Moving!
-
Alex
- Investigating nodejs 9x issues
- Cleaning French Common Voice data/sentences
-
Tilman (PTO)
- Forced alignment
-
Kelly
- Telekom NDA signed!
- Ordering new worker servers
- Reviewing CV's/Interviewing for grant positions on hold due to SNAFU (ApS vs GmbH)
- Working on SNAFU, met with auditor, financial advisor, and meeting with legal later today
- Finished NMT First Dissemination Plan, turned in to EC.
- Finished NMT Data Plan, turned in to EC.
-
-
Announcements
- Deep Speech reached 100k downloads
- All-Hands
- On-Device WebSpeech API demo at All-Hands well received
- DeepSpeech + androidspeech merged into an experimental Firefox Reality branch
- 0.5.1 and v0.6.0-alpha.0 "Out the door"
- Starting work on another German Government grant working with DFKI
- Eren is going to present at this week's Journal Club! (Really this time!?)
-
Discussion
- Torrent models? (Statistics on model downloads?)
-
Review of on-going work
-
Eren
- Commandline TTS tools
- Working on universal vocoder
- Training with global style tokens
- About to train with global style tokens
- Got funding to attend the H2020 conference in Istanbul
- Got German TTS working for Telekom demo
-
Reuben
- Moving!
- Fixing macOS task failures
- Working on testing loss normalization
-
Alex
- Updating ds-srv to 0.5.1
- Looking at performance of DS on RPi 4
- Cleaning French Common Voice data/sentences
-
Tilman (PTO)
- Forced alignment
-
Kelly
- Ordering new worker servers
- Got NDA back from Telekom, now our lawyers are looking over their changes
- Reviewing CV's of applicants wanting to join the NMT project
- Interviewing applicants wanting to join the NMT project
- Dealing with legal SNAFU on EU grant with respect to the office the purchases are made from
- Dealing with legal SNAFU on EU grant with respect to the office the employees are hired into
- Working on NMT First Dissemination Plan, due to the EU June 30th
-
-
Announcements
- 0.5.0 ("Decoder optimizations" PR and done?)
- Rishi at NAACL this week presenting a paper SPARSE: Structured Prediction using Argument-Relative Structured Encoding
- Starting work on another H2020 grant working with other EU partners (CNR, RISE, Luminem, Intel, PoliTO) dealing with secure IoT where we'd be lead partner
- Eren is going to present at next week's Journal Club
-
Review of on-going work
-
Eren
- Training Tacotron with the "Voice of Mozilla"
- Looking for better German dataset
- Working on Tacotron to merge it with WaveRNN
-
Reuben (PTO)
- Decoder optimizations PR
- TF and TFLite Refactor
-
Alex
- Deep Speech (TFLite) on device with Firefox
- Initial integration
- Integration with proper threading
- Turning on sandbox
- TC Green!
- Working on French model
- Cleaning French Common Voice data/sentences
-
Tilman
- Continued noise1-only training
- Deep Speech (TFLite) on Firefox Reality
- Implementing streamable downloading of models
- Testing streamable downloading of models
-
Kelly
- Sent out NDA to Telekom
- Ordering new worker servers
- ParaCrawl based models legal issues
- Reviewing CV's of applicants wanting to join the NMT project
- Working on NMT First Dissemination Plan, due to the EU June 30th
- Working on NMT Data Management Plan, due to the EU June 30th
-
-
Announcements
- Rishi starts this week
-
Review of on-going work
-
Eren
- First results from Tacotron 2 on "Voice of Mozilla" data set
- Training Tacotron with the "Voice of Mozilla"
-
Reuben
- Aishell runs (UTF-8 branch promising)
- Other Mandarin importers
- Writing model packaging requirements doc
-
Alex
- Deep Speech (TFLite) on device with Firefox
- Initial integration
- Integration with proper threading
- Turning on sandbox
- Working on French model
- Cleaning French Common Voice data/sentences
-
Tilman
- Continued noise1-only training
- Deep Speech (TFLite) on Firefox Reality
- Implementing streamable downloading of models
- Testing streamable downloading of models
-
Kelly
- Prepared talk for re*work Boston
- Gave re*work Boston talk on DS+CV
- ParaCrawl based models legal issues
- NDA for Telekom
- Looking for German TTS training data
- Reviewing CV's of applicants wanting to join the NMT project
- Obtaining new worker servers
- Working on NMT First Dissemination Plan
-
-
Announcements
- Rishi starts again next week
-
Review of on-going work
-
Eren
- Examination of the "Voice of Mozilla" data set
- Training various models with the "Voice of Mozilla" (Seems to not work well.)
- Training Merlin
- Applied to H2020 pair making conference, looking for others to work with
-
Reuben
- Aishell runs (UTF-8 branch promising)
- Other Mandarin importers
- Reviewing streaming decoder PR 2121
-
Alex
- Deep Speech (TFLite) on device with Firefox
- Initial integration
- Integration with proper threading
- Working on French model
- Interview with French newspaper tomorrow
- Meeting in 1 week w/French government on CV
-
Tilman
- Cleaning up snakepit user code
- Deep Speech (TFLite) on Firefox Reality
- Getting Android dev env setup for FxR + DS
-
Kelly
- Preparing talk for re*work Boston
- Re-Engaged with Marketing for naming/branding of STT & TTS
- Writing up Rishi's new Onboarding Plan
- ParaCrawl based models legal issues
- NDA for Telekom
- Looking for German TTS training data
- Reviewing CV's of applicants wanting to join the NMT project
- Obtaining new worker servers
- Working on NMT First Dissemination Plan
- Met with OI on Data Commons
-
-
Announcements
- Journal Club Volunteers
-
Review of on-going work
-
Eren
- Aligning with Gentle
- Testing word based TTS
- Training various models with the "Voice of Mozilla" (Seems to not work well.)
- Reducing vocoder size, pruning + arch changes
- Discourse TTS
-
Reuben
- Immigration bureaucracy
- Aishell runs (UTF-8 branch promising)
- Fixed pre-processing of data (Spaces incorrect)
- Other Mandarin importers
-
Alex
- PSU Replacement
- Demangled symbols
- Updated to newest SWIG
- Removed Python 2.7 support
- Deep Speech (TFLite) on device with Firefox
-
Tilman
- Noise training
- Fixing image on cluster
- Cleaning up snakepit user code
- Adding port forwarding to cluster snakepit
- Running test of standard 0.5.0 training run against the noise test set
-
Kelly
- Met with OI on sentence collection lessons
- Met with spoken.io on possible collaboration
- Preparing talk for re*work Boston
- Working on internal mana page for ML
- Working on NMT First Dissemination Plan
- Reviewing Deep Speech PR 2111
-
-
Announcements
- Kelly at re:publica
- Serious Firefox incident over the weekend, recommend watching the project meeting
-
Review of on-going work
-
Eren
- Continuing training on Mozilla voice dataset
- Comparing different runs to find settings for next run
- Reading papers to investigate different architectures (alternative to Tacotron)
- Collaborating with Thomas Werkmeister on Global Style Tokens implementation
- Will apply for German govt. grant
- Created Discourse category for TTS
- Mozilla dataset voice collection done (25h)
- For now data is not releasable, only for internal use
-
Reuben
- AISHELL baseline grapheme training run
- UTF8 run got similar performance to German CV2 runs
- Test epochs as well as native client can't handle super large alphabets due to caching of logits
- Uses too much memory
- Test epoch needs to be pipelined instead of caching all logits in memory
- Native client would need to have a streaming decoder
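A rough illustration of the pipelining idea in Reuben's notes above: decode each batch as soon as its logits are available, so that only one batch of logits is held in memory at a time. `compute_logits` and `decode` are hypothetical stand-ins for the acoustic model and the decoder.

```python
def evaluate_streaming(batches, compute_logits, decode):
    """Yield decoded results batch by batch instead of caching all logits."""
    for batch in batches:
        logits = compute_logits(batch)  # only this batch's logits are alive
        yield decode(logits)            # logits can be freed after decoding


# Toy stand-ins: "logits" are doubled inputs, "decoding" sums them.
results = list(evaluate_streaming(
    [[1, 2], [3]],
    compute_logits=lambda batch: [x * 2 for x in batch],
    decode=lambda logits: sum(logits),
))
print(results)
```

Peak memory then scales with the batch size and alphabet size rather than with the size of the whole test set.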
-
Alex
- LinguaLibre/TrainingSpeech/French CV2 training run
- Fixing importers
- Training inside Docker
- Creating a training workflow directly from a Common Voice release to be able to iterate quickly when there's new data
- Creating report on our work for last year for French govt. grant
-
Tilman
- Noise augmentation runs
- Halved LR runs not looking so promising
- Snakepit large file FS operations (upload/download) with continue
- Should release soon
-
Kelly (did not attend, at re:publica)
-
-
Announcements
- Live access to TensorBoard?
-
Review of on-going work
-
Eren
- Training on Mozilla voice dataset, various models. We've about 20 hours of (clean) data
- Overfitting; data too clean
- Using WaveRNN
- Looking at pruning for future on-device use cases
-
Reuben
- UTF-8 (German training on all German CV data)
- May integrate "Portuguese streaming patch"
- AISHELL tests
-
Alex
- Meta-data exposed to all bindings!
- Added NodeJS 12 support
- Docker for fr training
- WebSpeech API backed by on-device Deep Speech
-
Tilman
- Back from PTO
- Matrix run with no-noise, noise 1, and noise 2
- Continuing no-noise on noise 1 w/more epochs
- Augmented run with lower learning rate
- Snakepit features [pit exec + pit (big) file transfer]
-
Kelly
- 2-byte run w/ low LR
- Working on getting more info on MR requirements
- Working on a validation plan for when we obtain "MT" mandarin data
- Met with legal on Telekom partnership
- Translating Hindi data from Latin script to Devanagari script
- Getting H2020 positions into GreenHouse and posted on Mozilla's jobs site
- Deriving MOS scores from the latest TTS tests
- 0.5.0 v1 training run
- Preparing talk for re:publica
- Preparing talk for re*work Boston
- Preparing talk for LPSS
-
-
Announcements
- Alex: PTO
- David Bryant: In Berlin this week
-
Review of on-going work
-
Eren
- Start training on Mozilla voice dataset, various models. We've about 20 hours of (clean) data
- Release for LJSpeech and WaveRNN models
- Trying experiments with forward attention
- Trying to create music with Tacotron!
-
Josh
- UTF-8 (English as Multi-Byte idea)
- Writing NSF grant proposal
- Writing Mozilla Fellow grant proposal
-
Reuben
- UTF-8 (German training on all German CV data)
- Fixing importer to not throw away data
- Remove punctuation from text
- Longer runs
- May integrate "Portuguese streaming patch"
-
Alex (PTO)
-
Tilman
- Working on GPT-2 ideas/demo
- Augmentation work training continuing
- pit added various commands + bug fixes
-
Kelly
- Kick-off meeting EU grant audit + payroll firm
- Working on getting more info on MR requirements
- Working on a validation plan for when we obtain "MT" mandarin data
- Debugging the n-byte runs
- H2020 grant poster design
- Reading Neural Ordinary Differential Equations
- Journal Club Presentation
- Automatic Summarization demo
- Small platform STT demo (VW Demo)
-
-
Announcements
- TBD
-
Review of on-going work
-
Eren
- Start training on Mozilla voice dataset, various models. We've 15+ hours of (too clean!) data
- Met with Telekom + Amazon on TTS, possible joint work with Telekom pending NDA
- Working on new release for LJSpeech for Tacotron 2
- Created demo of current voice reading "The Pocket Article" for MOS test against other commercial engines
- Trying experiments with forward attention
- Trying to create music with Tacotron
-
Josh
- UTF-8 (English as Multi-Byte idea)
- Writing NSF grant proposal
- Writing Mozilla Fellow grant proposal
-
Reuben
- Update Deep Speech to newer TF Data API's, TF MFCC's
- Trained w/Deep Speech on newer TF Data API's, TF MFCC's, 1.13
- Add version info to exported graphs
- Adding to extended info, transcript probability
- Fixing crash bug, no way to properly deallocate transcripts
- UTF-8 (German training on all German CV data)
-
Alex
- Working on WebSpeech API in Firefox (DeepSpeech server on GCP)
- Windows Python bindings!
- Add Windows Python packages to upload tasks
- Selective registration and/or limiting CUDA compute compatibility to 3.5 to limit distro size
-
Tilman
- Augmentation work continuing (Fixed bug in voice-corpus-tool)
- Train, dev, test split of noise data
- Fixing issue #2020
- Working on GPT-2 ideas/demo
- Fixing Common Voice 2.0 importer
-
Kelly
- Kick-off meeting EU grant audit + payroll firm this Friday
- Training initial Hindi models using GramVaani (100 hr) data set
- Met with GIZ to work on structure of long term partnership
- Working on getting more info on MR requirements (Meeting w/Janice tomorrow)
-
-
Announcements
- Kyrgyz Voice Technology Hackathon
-
Review of on-going work
-
Eren
- Refactored BN changes
- Collecting/reviewing voice talent data batch 21-22 (9.8 hours)
- Training Nancy with T2+WaveRNN (Dropout hack seems to work a bit)
- Working on WaveRNN precision w/Gaussian and Gaussians
- Start training Tacotron on Mozilla voice dataset (we've 10 hours of (too clean!) data)
-
Josh
- Getting back from Kyrgyz Voice Technology Hackathon
-
Reuben
- TF 1.13 (Infra problems maybe solved!)
- Updating Deep Speech to newer TF Data API's, TF MFCC's
- Training w/Deep Speech on newer TF Data API's, TF MFCC's, 1.13
-
Alex
- Working on WebSpeech API in Firefox
- Deep Speech windows taskcluster support
- Integrated Deep Speech in to Firefox Reality browser!
-
Tilman
- Augmentation work continuing
- Fixing Common Voice 2.0 importer
- Reading GPT-1 and GPT-2 papers
- Journal club presentation on GPT papers
- Reviewing PR 1919
-
Kelly
- Meet w/IT to discuss WebSpeech API for Firefox DeepSpeech Backend Deployment
- Reviewing PR 1919
- Finally got permission to hire for the EU grant from Chris Beard
- Negotiating contract with EU grant's audit + payroll firm (data protection terms)
- Getting DS product requirements from MR
- Hindi importer for GramVaani
-
-
Announcements
-
Review of on-going work
-
Eren
- Replace prenet dropout with BN for test time consistency
- BN Leading to much improved models (Notice the breathing!)
- Check next batch of recordings
- Organize new batches from voice talent
- Trained WaveRNN on Tacotron2 specs
- Created pocket article with WaveRNN
- Extract 10bit audio for WaveRNN training
- Train WaveRNN 10bit
- Implement state transfer for multiple sentence synthesis
- Try Tacotron1 with BN prenet
-
Josh
-
Reuben
- Immigration bureaucracy!
- TF 1.13
- Updating Deep Speech to newer TF API's
-
Alex
- Experimenting with some model download logic in mozillaspeechlibrary/MR browser
- WebSpeech API in Firefox
- Deep Speech windows taskcluster support
- Integrating Deep Speech in to Firefox Reality browser
-
Tilman
- Fixed and restarted augmentation script
- Fixed and deployed fix for apt-daily service problem on workers
- Started reading GPT-1 and GPT-2 papers
- Coding on pit exec
-
Kelly
- Met with Goethe & Aaron.ai (DS + CV partner?)
- Met with Mycroft
- Writing Hindi importer (Handed off to absin1)
- Administrative tasks for EU grant for NMT
- Unsticking payment to finance management firm
- CASA ticket for new auditor + payroll provider
- Obtained new auditor + payroll provider contract
- Starting the process of obtaining new worker servers
- Two quotes from Server Bau, 8xTitan RTX and 8xRTX 2080TI
- Two quotes from BOXX, 8xTitan RTX and 8xRTX 2080TI
-
-
Announcements
- Kigali Hackathon Blog Post
- VW Progress
- New Nodes (Model size question)
- Update on Cluster status
-
Review of on-going work
-
Eren
- Train new master with Nancy
- Transfer learning for Mozilla voice
- Check batch 6-8 recordings
- Organize new batches from voice talent
- Enable process based multi-GPU training for TTS on cluster
- Train Tacotron2 on LJSpeech
-
Josh
- Got UTF-8 working on cluster for Slovenian
- Working out alphabet.txt and LM issues for zh
-
Reuben
- Immigration bureaucracy!
-
Alex
- Experimenting with some model download logic in mozillaspeechlibrary/MR browser
- Identifying broken sentences on Common Voice
- Validated 3600+ french sentences on Sentence Collector tool
-
Tilman
- Some fixes on the cluster (proxy, apt settings)
- Working on job-individual pairing of workers with their daemon
-
Kelly
- Wrote CorporaCreator PR to remove all sentences with digits
- Wrote CorporaCreator PR to sync documentation with removal of all sentences with digits
- Wrote letter of support for National Library of Wales project to use DS + CV for transcribing their Welsh holdings
- Re-negotiated completion date of Voice Talent's contract
- Supplied re-formatted sentences to Voice Talent w/out phonetic spelling to speed their workflow
- Sent DS TFLite demo to VW
- Wrote up instructions for VW on using the DS TFLite demo
- Walked VW through installation of the demo
- Agreed to joint Mozilla + GIZ presentation at re:publica
- Talked to OI on details of the simplified Chinese Mandarin text corpus
- Reading BERT for Journal club
- Interviewing payroll providers + obtaining quotes
-
-
Announcements
- Kigali Hackathon
- HD failure on cluster (Fixed. Thanks to Tilman!)
- Update on Cluster status
-
Review of on-going work
-
Eren
- Creating corpus for TTS voice talent
- Released TTS on LJSpeech!
- Smart init for RNN
- Queuing of frame length
-
Josh (US Holiday)
- Background reading on Chinese ASR
- Waiting for cluster for UTF8 runs
-
Reuben
- Getting streaming decoder re-based + working
- Implies API changes
- Implies beam issues of intermediate vs final decode
- Starting DS tests on the cluster
-
Alex (Did not attend)
-
Tilman
- Updated server to newest snakepit
- Need to figure out CUDA TF version questions on cluster
- Fixed HD!
-
Kelly
- EU Grant Finance + Management signed 3 year contract
- Got finance approval for 2 new headcounts for EU grant
- Interviewing payroll + audit providers
- Talked with VW on DS use
- Talked with OI + VW on Mandarin data
-
-
Announcements
- Kelly at Kigali Hackathon
- HD failure on cluster
-
Review of on-going work
-
Eren
- Creating corpus for TTS voice talent
- Different corpora from open TTS datasets + Common Voice
- Trying to find optimal set of sentences to give to talent
- Trained TTS on LJSpeech
- Trying phonemes instead of graphemes
- It led to overfitting
- Started writing a wiki page for TTS repo with information on training/dataset quality
- Second meeting with voice talent on Thursday, to try out a new batch with fewer weird words
-
Josh
- Background reading on Chinese ASR
- Waiting for cluster for UTF8 runs
-
Reuben
- Getting CuDNN RNN working on DeepSpeech + TF 1.13 (2x faster training)
- Benchmarking clients with changes to model
- Need to verify speedup on cluster machines
-
Alex (Did not attend)
-
Tilman
- Updated server to newest snakepit
- Pretty much done, now figuring out networking capabilities for multi-node jobs
- When running multi-worker jobs, when jobs are stopped the whole head node can crash, taking down all jobs
- LXD API does not help with preventing it
- Trying to find a working alternative for creating virtual networks between the nodes
- Other alternative is to drop network isolation between jobs for now
- Will try some experiments with Josh and Eren's jobs
- Beeping sound in server room, one of the hard drive bays was showing a failure
- Took out failing drive and ordered a replacement
- Replacement comes in on Wednesday
-
Kelly (Did not attend)
-
-
Announcements
- Common Voice + Deep Speech Hackathon in Kigali next week
- Common Voice work week in Berlin
-
Review of on-going work
-
Eren
- Started work on M-AI Labs dataset
- Getting en-UK working (Data set too noisy)
- Audiobooks with different speakers, en-UK is single speaker (Data set too noisy)
- Different voice style for different books, so not very good for TTS
- Training on German dataset
- Same problems as en-UK
- Preparing for a small general release of TTS
- Blog post about TTS changes
- Switching vocoder when cluster is updated
- Creating corpus for TTS voice talent
-
Josh
- MAML + Bytes are all you need....
- Settling back in
-
Reuben
- Bytes for Deep Speech
- Pytorch deployment (Looking at jit and the like)
- Automation of Windows builds
- Arch exploration (Bottleneck at 3rd layer)
-
Alex (PTO)
- Working on Java/Android support
- PR is ready for review
- Has Android 7.0 and 8.1 APKs
- Prepares things for publishing on Maven
- Getting help from Android Components team
- PR also adds Android tests (running in x86 emulator)
- Looking into making tests faster on AWS
- Postponed cleaning up SWIG generated types
- Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
-
Tilman
- Updating server to newest snakepit
-
Kelly
- PR's for CorporaCreator
- Prepping for Common Voice + Deep Speech Hackathon in Kigali
- Common Voice work week this week in Berlin
- TTS Voice talent starts this week
-
-
Announcements
- EU funds in the bank for NMT project
- Welsh Language Technology Conference
- MAML + Bytes Paper at InterSpeech (Deadline: Mar 29) or ACL (Deadline: Mar 4)?
-
Review of on-going work
-
Eren (PTO)
- Started work on M-AI Labs dataset
- Getting en-UK working
- Audiobooks with different speakers, en-UK is single speaker
- Different voice style for different books, so not very good for TTS
- Training on German dataset
- Same problems as en-UK
- Preparing for a small general release of TTS
- Blog post about TTS changes
-
Josh
- MAML + Bytes are all you need....
-
Reuben
- Bytes for Deep Speech
-
Alex (PTO)
- Working on Java/Android support
- PR is ready for review
- Has Android 7.0 and 8.1 APKs
- Prepares things for publishing on Maven
- Getting help from Android Components team
- PR also adds Android tests (running in x86 emulator)
- Looking into making tests faster on AWS
- Postponed cleaning up SWIG generated types
- Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
-
Tilman
- Final stages of refactoring Snakepit
- Code is pretty much complete, now debugging
- Debugging sequelize queries
- Added in queries to filter jobs
- Added ability to add job info to display
- Added continuation
- Importing old job data into DB
- Looking for material for Journal Club presentation
-
Kelly
- PR's for CorporaCreator
- Fixing de quotes
- langs flag
- isset correction
- Reviewing XOR user splitting method of train,dev,test
- Attended + Presented at Welsh Language Technology Conference
- Attended MoFo meeting on GIZ partnership
- Sync Meeting with Andreas Boven's team looking at new form factors for Firefox
-
-
Announcements
- Kelly at Browser Translation Kick-Off meeting
-
Review of on-going work
-
Eren
- Started work on M-AI Labs dataset
- Getting en-UK working
- Audiobooks with different speakers, en-UK is single speaker
- Different voice style for different books, so not very good for TTS
- Training on German dataset
- Same problems as en-UK
- Preparing for a small general release of TTS
- Blog post about TTS changes
-
Josh
- ICML paper
- German finished but performs very badly
- Maybe due to language model deficiencies
- Interpretability of different layers
- Ran alphabet shuffling experiments but can't yet make sense of it
- Finishing paper text today
-
Reuben
- Transfer learning experiments on transferring DeepSpeech to speech/non-speech classification
- Journal club presentation on "Bytes are all you need"
- TTS Pytorch JIT in C++ (on hold)
-
Alex
- Working on Java/Android support
- PR is ready for review
- Has Android 7.0 and 8.1 APKs
- Prepares things for publishing on Maven
- Getting help from Android Components team
- PR also adds Android tests (running in x86 emulator)
- Looking into making tests faster on AWS
- Postponed cleaning up SWIG generated types
- Built and ran mozillaspeech library with DeepSpeech in Firefox Reality
-
Tilman
- Final stages of refactoring Snakepit
- Code is pretty much complete, now debugging
- Debugging sequelize queries
- Looking for material for Journal Club presentation
-
Kelly (Traveling - Browser Translation Kickoff Meeting)
- Could not attend
-
-
Announcements
- ICML paper
- Hindi Data Set
- Common Voice alpha data set release
- Deep Speech release 0.4.0 and 0.4.1!
-
Review of on-going work
-
Eren
- Phoneme based training
- Starting on AI Labs data sets
- Selected Voice123 voices to contract
- Trained another network with TWEB dataset
- Created demo read article for Firefox Listen MOS study
-
Josh (Traveling)
- ICML paper
- Multi-Task learning
- Common Voice data set release
-
Reuben (PTO)
- TTS Pytorch JIT in C++
- "Bytes are all you need"
-
Alex (PTO)
- Maven integration for Android
- Testing for Android on Task Cluster
- Deep Speech release 0.4.0 and 0.4.1
-
Tilman
- Snakepit LXD integration
- Snakepit MySQL integration
-
Kelly
- ICML paper
- Common Voice alpha data set release
- Next Deep Speech release 0.4.0 and 0.4.1
- H2020 Stuff (Hiring Financial manager, 2 Headcounts, payroll service + getting EU funds + logo design + Domain Name + WebSite)
- Building many Probing and Trie language models for v0.5.0 to benchmark
- Benchmarking Probing and Trie language models for v0.5.0
-
-
Announcements
- ICML papers
- Hindi Data Set
- Common Voice data set release
- Next Deep Speech release
- Trained the new model (Best model yet! 8.26% WER on Librispeech clean test)
- Release notes updated
-
Review of on-going work
-
Eren
- Phoneme based training
- Wrote a Google Colab notebook for training TTS
- Replaced dropout with RReLU
- Trained another network with TWEB dataset
-
Josh
- ICML papers
- Multi-Task learning
- Common Voice data set release
-
Reuben
- Next Deep Speech release
- TTS Pytorch JIT in C++
- "Bytes are all you need"
-
Alex
- Reading Email! :-)
- Next Deep Speech release
- Maven integration for Android
- Testing for Android on Task Cluster
-
Tilman
- Snakepit LXD integration
- Snakepit MySQL integration
- Common Voice data set release
-
Kelly
- ICML papers
- H2020 Stuff
- Next Deep Speech release
- Common Voice data set release
- Recruiting H2020 project + financial manager
- Building many Probing and Trie language models for v0.5.0 to benchmark
- Benchmarking Probing and Trie language models for v0.5.0
-
-
Announcements
- Demos!
- Presentations!
- Next Deep Speech release
- Train new model
- Release notes
-
Review of on-going work
-
Eren
- TTS distributed training
- Updating to Pytorch 1.0.0
-
Josh
- NAACL paper
- Multi-Task learning
- Common Voice data set release
-
Reuben
- Windows PR
- Embedding of meta-data
- Training for next Deep Speech release
- Issue 1744 (Increase mfcc step size)
-
Alex (Sick)
- Integrating Deep Speech in to Firefox Reality
- Next Deep Speech release
-
Tilman
- Snakepit LXD integration
- Common Voice data set release
-
Kelly
- Common Voice data set release
- Recruiting H2020 project + financial manager
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Announcements
- Demos
- TTS [Eren]
- Snakepit [Tilman]
- STT Streaming [Reuben]
- AutomaticSummarization [Kelly]
- STT in Firefox Reality [Alex]
- Demo Hardware
- Bluetooth Speaker for Tilman? [Kelly]
- Bluetooth Headphones for Eren [Kelly]
- Bluetooth Headphones + Microphone for Reuben [Kelly]
- USB-C Headset with Mic for Alex [Kelly]
- Presentations
- TTS [Eren]
- STT [Reuben]
- Snakepit [Tilman]
- AutomaticSummarization [Kelly]
- STT in Firefox Reality? [Alex]
- Nancy Export [Done]
- Fisher Re-Export [Done]
- Switchboard Export [Doing]
-
Review of on-going work
-
Eren
- End-to-end Tacotron + WaveRNN training
- Exploring info bottleneck of Tacotron
- Switching parts of Tacotron to Tacotron2 to find problems
-
Josh
- Learning pit
- DS Training on Chuvash
- Data massaging for Chuvash + other languages
-
Reuben
- Expose ctcdecode to Python
- Snapdragon 835 port of Deep Speech
-
Alex
- Investigating TFLite
- Snapdragon 835 port of Deep Speech
- Investigating TFLite on Android using NNAPI
- Integrating Deep Speech in to Firefox Reality
- Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
-
Tilman
- Starting training off augmented data sets HDF5 files
- Exploring LXD integration
-
Kelly
- Recruiting H2020 project + financial manager
- Reviewing H2020 Coalition Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Announcements
- Demos
- TTS [Eren]
- Esper? [Reuben]
- Snakepit [Tilman]
- STT Streaming [Reuben]
- STT Tatar, Kyrgyz...? [Josh]
- STT in Firefox Reality [Alex]
- Failures of TTS cluster based training
- Fisher re-export
- Pipsqueak's home?
-
Review of on-going work
-
Eren
- End-to-end Tacotron + WaveRNN training
- Exploring info bottleneck of Tacotron
- Switching parts of Tacotron to Tacotron2 to find problems
-
Josh
- Learning pit
- DS Training on Chuvash
- Data massaging for Chuvash + other languages
-
Reuben
- Expose ctcdecode to Python
- Snapdragon 835 port of Deep Speech
-
Alex
- Investigating TFLite
- Snapdragon 835 port of Deep Speech
- Investigating TFLite on Android using NNAPI
- Integrating Deep Speech in to Firefox Reality
- Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
-
Tilman
- Starting training off augmented data sets HDF5 files
- Exploring LXD integration
-
Kelly
- Recruiting H2020 project + financial manager
- Reviewing H2020 Coalition Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Announcements
- Pipsqueak's home?
- Josh at Indiana University this week to work on NAACL-HLT Deep Speech paper
-
Review of on-going work
-
Eren
- End-to-end Tacotron + WaveRNN training
- Exploring info bottleneck of Tacotron
- Switching parts of Tacotron to Tacotron2 to find problems
-
Josh
- Learning pit
- DS Training on Chuvash
- Data massaging for Chuvash + other languages
-
Reuben
- Expose ctcdecode to Python
- Snapdragon 835 port of Deep Speech
-
Alex
- Investigating TFLite
- Snapdragon 835 port of Deep Speech
- Investigating TFLite on Android using NNAPI
- Integrating Deep Speech in to Firefox Reality
- Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
-
Tilman
- Starting training off augmented data sets HDF5 files
- Exploring LXD integration
-
Kelly
- Recruiting H2020 project + financial manager
- Reviewing H2020 Coalition Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Review of on-going work
-
Eren
- End-to-end Tacotron + WaveRNN training
- Exploring info bottleneck of Tacotron
- Switching parts of Tacotron to Tacotron2 to find problems
-
Josh
- Learning pit
- DS Training on Kyrgyz
- Data massaging for Kyrgyz + other languages
-
Reuben
- Expose ctcdecode to Python and use it in evaluate.py
- Snapdragon 835 port of Deep Speech
- Running in to many Op bugs
- Starting simple CNN STT runs to test RNN alternatives
- Starting exploring alternatives for broken Ops
-
Alex
- Investigating TFLite
- Upgrade of OSX Build Infra
- Updated TF to newest version
- Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
- Investigating TFLite on Android using NNAPI (Some ops not supported)
- Reviewing "Expose ctcdecode to Python and use it in evaluate.py" PR
-
Tilman
- Starting training off augmented data sets HDF5 files
- Exploring LXD integration
-
Kelly
- Creating/Reviewing 3 year/2019 Language Based Assistants Plan
- Reviewing "Expose ctcdecode to Python and use it in evaluate.py" PR
- Recruiting H2020 project + financial manager
- Reviewing H2020 Grant Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Announcements
- v0.3.0 Release!
- MLPerf wants to use Deep Speech (Wants quantized weights)
- Josh Meyer joins us today as an intern!
-
Review of on-going work
-
Josh
- Orientation
- Journal Club Presentation
-
Eren
- Tacotron + WaveRNN
- Starting on Tacotron2 implementation (Alignment seems to fail)
- Switching parts of Tacotron to Tacotron2 to find problems
-
Alex
- Investigating TFLite
- Upgrade of OSX Build Infra
- Updated TF to newest version
- Investigating TFLite benchmarking (Pixel2 right now at half-realtime)
- Investigating TFLite on Android using NNAPI (Some ops not supported)
-
Tilman
- Starting training off augmented data sets HDF5 files
- Backing out file system
- Exploring LXD
-
Reuben
- New CTC algorithm implementation in native client
- Python binding of CTC algorithm
- Snapdragon 835 port of Deep Speech
- Running into many Op bugs
- Starting simple CNN STT runs to test RNN alternatives
- Starting exploring alternatives for broken Ops
-
Kelly
- Recruiting H2020 project + financial manager
- Reviewing H2020 Grant Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0
-
-
Announcements
- v0.3.0 Release
- Create release notes
- Updating README.md's
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
- Find performance for separate LS clean, LS other, and CV (Nice to Have)
- Testing stuff by hand (Checkpoint, model, Issue 1645 [0.1.X, w/o LM...]....)
-
Review of on-going work
-
Eren
- Wrote caching data loader
- Wrote data loader compatible with common TTS data sets
- Integrated generic data loader with branches
- Starting on Tacotron2 implementation as Tacotron seems the bottleneck
-
Alex
- French STT starting with English model (Transfer learning)
- Investigating TF profiler
- Preparing to simplify build steps on OS X
-
Tilman
- Generating augmented data sets HDF5 files
- Starting training off augmented data sets HDF5 files
-
Reuben
- Setting up TCN runs
- Snapdragon 835 port of Deep Speech
- Running into many Op bugs
- Starting simple CNN STT runs to test RNN alternatives
- Starting exploring alternatives for broken Ops
- Optimization of lm_weight and valid_word_count_weight
-
Kelly
- Recruiting H2020 project + financial manager
- Editing H2020 grant to clarify Mozilla deliverables
- Creating Common Voice slides for "Jugend hackt" on Wednesday
- Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0 with
- LibriSpeech Other
- LibriSpeech Clean
- Common Voice
- Update lm_weight and valid_word_count_weight in repo
-
-
Announcements
- v0.3.0 Release
- Create release notes
- Updating README.md's
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
- Find performance for separate LS clean, LS other, and CV (Nice to Have)
- Testing stuff by hand (Checkpoint, model....)
- Server update, wait on test of LS clean + LS other
-
Review of on-going work
-
Eren
- Long training run of WaveRNN (Training to convergence)
- Training NVIDIA WaveNet
- Fast at inference, slow at training
- Can benefit from the cluster
- Training beyond stop token overfitting as this happens early
- Found trick, place space before punctuation to improve pronunciation
- Got a license for Nancy TTS corpus (High quality recording with little echo and background noise)
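A rough sketch of the "space before punctuation" trick mentioned in Eren's items above, applied as a text pre-processing step. The helper name `space_before_punctuation` and the punctuation set are assumptions for illustration; the notes do not say how the TTS repo implements it.

```python
import re

# Assumed punctuation set; the notes do not specify which marks were used.
_PUNCT = re.compile(r"\s*([.,!?;:])")


def space_before_punctuation(text: str) -> str:
    """Ensure exactly one space precedes each punctuation mark."""
    return _PUNCT.sub(r" \1", text)
```

For example, `space_before_punctuation("Hello, world!")` gives `"Hello , world !"`, and text that already has the space is left unchanged.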
-
Alex
- French STT starting with English model (Transfer learning)
- Trying to switch to gcc7.2 for armv7/aarch64
- Investigating TF profiler
-
Tilman
- httpfs + snakepit integration
- Voice augmentation tooling
- Generates HDF5 files
- Put in relative path handling
- Added pre-computed duration tag to CSV
- Optimizing so that things complete in a reasonable time
- Added artifact creation from down/up sampling, formats...
- Fixing thread pool bug in python (Subprocess of thread in pool hangs)
-
Reuben
- Snapdragon 835 port of Deep Speech
- Running into many Op bugs
- Starting simple CNN STT runs to test RNN alternatives
- Starting exploring alternatives for broken Ops
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
-
-
Kelly
- Recruiting H2020 project + financial manager
- Editing H2020 grant to clarify Mozilla deliverables
- Creating QBR slides for STT, TTS, NMT, Summarization, Common Voice...
- Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- All-Hands mic, amp, mixer... setup
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0 with
- LibriSpeech Other
- LibriSpeech Clean
- Common Voice
- Writing blog post for Common Voice landing in Amazon's Public Data sets
-
Announcements
- v0.2.1 Release
- Create release notes
- Updating README.md's
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
- Find performance for separate LS clean, LS other, and CV (Nice to Have)
- Testing stuff by hand (Checkpoint, model....)
- Cluster instabilities?
-
Review of on-going work
-
Eren
- WaveRNN vocoder
- Joined positive result tech together
- Trained model
- Released model
- In the background FFTNet
- Attention scaling experiments 5!
-
Alex
- Moving to Tensorflow 1.11
- Fixed non-deterministic output with the streaming model
- Fixed intermittent test failure on prod models for NodeJS and Python
- Enforced same sox options as libsox for C++ client
- Optimizing trie loading
-
Tilman
- httpfs + snakepit integration
- snakepit changes[1]
- Voice augmentation tooling
- Cluster crashes
-
Reuben
- Snapdragon 835 port of Deep Speech using Qualcomm's SDK (TFlite experiments)
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
-
Kelly
- Recruiting H2020 project + financial manager
- Editing H2020 grant to clarify Mozilla deliverables
- Creating QBR slides for STT, TTS, NMT, Summarization, Common Voice...
- Reviewing H2020 Annotated Model Grant Agreement for Mozilla specific issues
- De-risking any risky H2020 tasks (Project Management, Financial Management....)
- All-Hands mic, amp, mixer... setup
- Building many Probing and Trie language models for v0.4.0 to benchmark
- Benchmarking Probing and Trie language models for v0.4.0 with
- LibriSpeech Other
- LibriSpeech Clean
- Common Voice
- Writing blog post for Common Voice landing in Amazon's Public Data sets
- Setting up AMI's on Amazon for blog post for Common Voice landing in Amazon's Public Data sets
-
-
Announcement
- v0.2.0 Released!
- Streaming blog post going out!
-
Review of on-going work
-
Eren (PTO)
- WaveNet vocoder
- Binary convergence loss
- Larger Attention filters
- Softmax predictions
- Loc-alignment with only average history
- Bahdanau attention
-
Alex (Conference)
- Moving to Tensorflow 1.11rc's
- Hosting a conference in the Paris office
-
Tilman
- httpfs + snakepit integration
-
Reuben
- Snapdragon 835 port of Deep Speech using Qualcomm's SDK (TFlite experiments)
-
Kelly
- All-Hands mic, amp, mixer... setup
- Building many Probing and Trie language models for v0.3.0 to benchmark
- Benchmarking Probing and Trie language models for v0.3.0 with
- LibriSpeech Other
- LibriSpeech Clean
- Common Voice
- Writing blog post for Common Voice landing in Amazon's Public Data Sets
- Setting up AMIs on Amazon for blog post for Common Voice landing in Amazon's Public Data Sets
- Purchasing Mandarin data sets (LDC and other options)
- Creating Mandarin data sets (MTurk and other options)
-
-
Announcement
- Streaming blog post going out tomorrow
- v0.2.0 Release
- Updating README.md's
- Testing of rounded model
- Optimization of lm_weight and valid_word_count_weight
- Update lm_weight and valid_word_count_weight in repo
- Update lm_weight and valid_word_count_weight in blog post
- Find performance for separate LS clean, LS other, and CV (Nice to Have)
- Getting PyGithub to upload
- Finalize blog post (Link to original 10% blog post)
- Finalize release notes (Linking to new API, Mention feature caching increases RAM needed...)
- Merge Feature caching PR
- Merge Language Models PR
- Testing stuff by hand (Checkpoint, model....)
-
Review of on-going work
-
Eren
- Binary convergence loss
- Larger Attention filters
- Softmax predictions
- Loc-alignment with only average history
- Bahdanau attention
-
Alex
- Discourse/github support
- TF Master merge
- Tool to push to GitHub
- Auto upload of assets to GitHub for release
- TFLite port of Deep Speech on to Pixel 2
-
Tilman
- httpfs + snakepit integration
- Adding "pit ls" and "pit cp" to ls and cp files from the server \o/!
- Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
-
Reuben
- Streaming blog post
- Creating 0.2.0 release model
- Rounding 0.2.0 release model
- Benchmarking 0.2.0 release model
- Benchmarking 0.2.0 rounded release model
- Merging quantized trie language model
- Snapdragon 835 port of Deep Speech using Qualcomm's SDK
-
Kelly
- All-Hands mic, amp, mixer... setup
- Administrative tasks for Horizon 2020 Grant
- NSF + Visa + IP + Other HR Hell
- Helping with 0.2.0 release tasks
- Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
- Purchasing Mandarin data sets (LDC and other options)
- Creating Mandarin data sets (MTurk and other options)
- Writing blog post for Common Voice landing in Amazon's Public Data Sets
- Setting up AMIs on Amazon for blog post for Common Voice landing in Amazon's Public Data Sets
-
-
Review of on-going work
-
Eren (OoO)
- Setup TTS server
- Computer Vision Conference
-
Alex
- Common Voice Kiosk mode
- Upgrading VMware Fusion
- Moving hardware to a new home
- Discourse/github support
- TFLite port of Deep Speech on to Pixel 2
-
Tilman
- httpfs + snakepit integration
- Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
-
Reuben
- Clear Captions Time Estimate
- Common Voice Data test training
- 0.2.0 Optimizations (Pre-processing)
- 0.2.0 Small bug fixes from alpha testers
- Snapdragon 835 port of Deep Speech using Qualcomm's SDK
-
Kelly
- All-Hands mic, amp, mixer... setup
- Administrative tasks for Horizon 2020 Grant
- Getting LXC containers to work on the cluster
- Forced alignment of NPR data using Gentle
- NSF + Visa + IP + Other HR Hell
- Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
- Purchasing Mandarin data sets
- Writing blog post for Common Voice landing in Amazon's Public Data Sets
-
-
Review of on-going work
-
Alex (PTO)
-
Eren (PTO)
-
Tilman
- Added 4 GPU machine to staging env
- Working on freesound.org samples (down/up sampling, collecting metadata tags in csv....)
-
Reuben
- Common Voice Data
- Data cleaning (HTML instead of text....)
- Gotten through one epoch
- Starting training w/F,SW,L+CV
- 0.2.0 Optimizations (Pre-processing)
- 0.2.0 Small bug fixes from alpha testers
- Streaming architecture blog post
- Snapdragon 835 port of Deep Speech starting Qualcomm vs Tensorflow
-
Kelly
- Setting up STT server for MozCast
- All-Hands setup starting
- Administrative tasks for Horizon 2020 Grant
- Getting LXC containers to work on the cluster
- Forced alignment of NPR data using Gentle
- NSF + Visa + IP + Other HR Hell
- Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
-
-
Review of on-going work
-
Alex (PTO)
-
Tilman (PTO)
-
Reuben
- Collating 0.3.0 changes for merge
- 0.3.0 Optimizations (Pre-processing)
- Reading ClariNet (TTS)
-
Eren
- Profiled TTS code with cProfile
- Solved TTS server quality
- Working on new checkpoint release for TTS
- Audio enhancement to improve GriffinLim quality (EnhanceNet) 1703.09452
- Working on blog post for TTS
-
Kelly
- NSF + Visa + IP + Other HR Hell
- Voice Talent Contract
- Uploading Rishi's data
- Pre-processing Rishi's data
- Obtaining Mandarin data sets (THCHS-30, MAT-2000Com, MAT-2500ExtV-Com, TCC-300Com...)
- Setting up demo TTS server for internal use
- Setting up demo STT server for internal use
- Gathering STT requirements for Mozcast
- Gathering STT requirements for MR
-
Rishi
- Summarization
- Code complete for new ideas
- Training on Newsroom with various hyperparameters
- Getting really good results on Newsroom (Adjusting learning rate, number of layers, embedding dimension)
- Starting on Transformer (Something a bit buggy)
-
-
Announcement
- Tilman talking at Journal Club
-
Review of on-going work
-
Reuben (PTO)
- Collating 0.3.0 changes for merge
- 0.3.0 Optimizations (Pre-processing)
- Reading ClariNet (TTS)
-
Eren
- Trained TTS on LG Speech data set and released model
- Starting to use LVS vocoder library
- Started training FFTNet again
- Starting to train on Mycroft data (Multiple Speech Samples)
- Running in to load speed problems for LG Speech
-
Tilman
- Working on fixing Eren's efficiency problem (Seems to be firejail problem)
- Debugging w/Eren
- FUSE Filesystem to get around firejail NFS problems
-
Kelly
- NSF + Visa + IP + Other HR Hell
- Voice Talent Contract
-
Rishi*
- Summarization
- Code complete for new ideas
- Issues w/beam search + ROUGE solved
- Training on Newsroom with various hyperparameters
- Getting really good results on Newsroom (Adjusting learning rate, number of layers, embedding dimension)
- Starting on Transformer (Something a bit buggy)
-
-
Announcement
- Taipei
-
Review of on-going work
-
Reuben
- Collating 0.3.0 changes for merge
- 0.3.0 Optimizations (Pre-processing)
- Reading ClariNet (TTS)
-
Eren (Sick)
-
Tilman
- Working on fixing Eren's efficiency problem
- Server install
- Debugging w/Eren
- FUSE Filesystem to get around firejail NFS problems
-
Kelly
- NSF + Visa + IP + Other HR Hell
- Voice Talent Contract
-
Rishi*
- Summarization
- Code complete for new ideas
- Issues w/beam search + ROUGE solved
- Data needs to be copied to server (preprocessing bad + data slightly corrupt)
-
-
Announcement
- Kelly in Taipei this week
- Alex on parental leave Aug 1 - Sep 4
- Switching ordering of 0.3.0 and 0.2.0 releases, streaming comes first!
-
Review of on-going work
-
Reuben
- Collating 0.3.0 changes for merge
-
Eren
- FFTNET experiments
- Working on pre-processing
- Starting working with new data sets
- Starting trying NVIDIA WaveNet on cluster
-
Tilman
- Working on fixing Eren's efficiency problem
-
Kelly
- Meeting with Taiwanese government on CV+DS
- Meeting with Taiwanese press on CV+DS
- Meeting with Taiwanese community on CV+DS
- Meeting with Taiwanese linguists on CV text corpus collection
-
Rishi*
- Summarization
- OpenNMT now supports multi-GPUs!
- Starting to implement the improvements to OpenNMT described in the LaTeX doc
-
-
Announcement
- Kelly in Taipei next week
- Alex on parental leave Aug 1 - Sep 4
-
Review of on-going work
-
Reuben
- Evaluated LM like that from 0.1.1
- Evaluating decoder with suggestions from community
-
Eren
- FFTNET experiments
- Working on pre-processing
- Starting working with new data sets
- Starting trying NVIDIA WaveNet on cluster
-
Tilman
- Firejail does not work on NFS!
- Working on fix, possibly LXC
- Working on temporary fix for cluster until final solution happens
-
Kelly
- Email!
- LM re-creation (Tomorrow)
- DS + CV lectures in Taipei
- Meeting with Taiwanese government on CV
-
Rishi*
- Journal Club!
- Summarization
- OpenNMT now supports multi-GPUs!
- Starting to implement the improvements to OpenNMT described in the LaTeX doc
-
-
Review of on-going work
-
Reuben
- AOT compilation for streaming
- Figuring out snakepit
- Training streaming model (Waiting on Cluster)
- Experimenting with realigning and splitting pt-BR corpus
-
Eren
- Wrote FFTNET, benchmarked inference speed https://github.com/mozilla/fftnet
- 0.003s per sample ~ 60 seconds per audio second on CPU
- 0.009s per sample ~ 180 seconds per audio second on GPU
- Waiting on cluster!
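
The FFTNET figures above relate per-sample latency to the time needed to synthesize one second of audio. The arithmetic can be sketched as follows; the 20 kHz sample rate is an assumption inferred from the quoted numbers (60 / 0.003 = 20,000), since the notes do not state it.

```python
def seconds_per_audio_second(seconds_per_sample, sample_rate_hz):
    """Compute time spent generating one second of audio, sample by sample."""
    return seconds_per_sample * sample_rate_hz


# Assumption: 20 kHz, inferred from the quoted figures, not stated in the notes.
SAMPLE_RATE_HZ = 20_000

cpu = seconds_per_audio_second(0.003, SAMPLE_RATE_HZ)
gpu = seconds_per_audio_second(0.009, SAMPLE_RATE_HZ)
print(f"CPU: {cpu:.0f} s per audio second")
print(f"GPU: {gpu:.0f} s per audio second")
```

The same function gives the per-sample budget needed for real time: the result must be at most 1, so each sample would have to take no more than 1 / 20,000 s under the same assumption.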
-
Tilman
- Fixed several bugs
- Early stopping and ../tmp file deleting (!)
- Wrong job startup work directory for jobs on Ubuntu 18.04
- Put node mlc2 onto cluster and put Eren into group 'test' - so his jobs will always run on mlc2
- Working on serious issue about insufficient job protection
- Helping cluster users
-
-
Update
- All-Hands!
-
Review of on-going work
-
Reuben
- Back from PTO!
- AOT compilation for streaming
- Training streaming model (Waiting on Cluster)
-
Tilman
- Optimizing Snakepit
- Demo Snakepit install
- Installing servers
- Starting Freesound downloads
-
Eren
- Writing FFTNET + testing it (Working on memoization)
- Waiting on cluster!
-
Kelly
- Setting up server
- Working with Legal + HR for NSF grant
-
Rishi
- AS initial arch
- Initial training of AS archs
-
-
Update
- All-Hands
- Team Dinner Wednesday
- Demos
- STT - Update previous demo + showcase streaming + faster (Memory in Alex's graphs)
- TTS - Original plan to demo on notebook "MVD" (mbx now wants server) so working on AWS server wrapper
- Job Scheduler
- Setting up new cluster on Wednesday
- Shooting for Snakepit install on Wednesday (Fallback AWS install)
- Presentations
-
Review of on-going work
-
Reuben
- Training 0.3.0 model (Experimenting on various hyperparameters, exchanging LSTM w/Block Fused LSTM...)
- Looking at curriculum learning variations
-
Tilman
- Hardening Snakepit (Replacing node addition process)
- Securing Snakepit configuration steps
-
Eren
- NVIDIA install
- Writing FFTNET + testing it (Working on memoization)
- Training new checkpoint with location sensitive attention + ablation w/Tacotron
- Setting up server
-
Kelly
- Created PR for switch of language model for 0.2.0
- Created STT Lightning Talk for All-Hands
- Created 1st Half of longer STT Talk for All-Hands
-
-
Update
- Servers in the office + power cords + 10Gb cables, trying for full install Wednesday
- RE*WORK
-
Review of on-going work
-
Reuben
- Training 0.3.0 model (Experimenting on various hyperparameters)
- Dropout now not used for validation
- Maintaining TTS training on Brazilian corpora with TBTT (Samples too long)
- Drudging through issue 1156 debugging (Rebased on streaming code)
-
Tilman
- Working on integrating pip cache to prevent multiple downloads
- Integrating firejail in to job scheduler to prevent interference
-
Eren
- Released new checkpoint
- Training new checkpoint with location sensitive attention + stop token prediction
- Writing FFTNET + testing it
-
Kelly
- RE*WORK
- Benchmarking on old server done, compiling results for 0.2.0
-
-
Update
- Servers in the office, going to install everything this week
- Manager's off-site
- TTS quality sounding really good
- STT, awesome work, now try and get 0.2.0 and 0.3.0 out the door with noise robust models
-
Review of on-going work
-
Reuben
- Maintaining TTS training on Brazilian corpora with TBTT (Samples too long)
- Drudging through issue 1156 debugging (Rebased on streaming code)
-
Tilman
- Added job group support
- Added auto share flag for groups (By default shared with particular group)
- Moving data backend to SQLite to ease archiving jobs
-
Eren
- Stop token prediction done
- But quality drops, attention misaligned
- Integrated PyTorch Griffin-Lim, much faster than real time
-
Alexandre
- Imported 1.2M strings from French Parliament!
- Starting work on Gutenberg import (1k books)
- Tried OpenCL for Deep Speech on TF 1.8 (Works!)
-
Kelly
- Managers Off-Site
- Lots of benchmarking finished: LSTM RNN, GRU RNN, Vanilla RNN, LSTM BRNN w/varying width...
- Trying to get LM benchmarking off the old server to compile results for 0.2.0
-
-
Review of on-going work
-
Reuben
- Drudging through issue 1156 debugging
- Cleaning up pending streaming API work to prepare for streaming model training
- Maintaining TTS training on Brazilian corpora
-
Tilman
- Optimizations and testing SnakePit with DeepSpeech
- Demo training with PyTorch to make sure things work
- Capturing error codes of failed jobs
- Making SSH sessions to last longer and avoid repeated setups for short polls
-
Eren
- Updated to PyTorch 0.4.0 and retrained, but quality is worse
- Could be PyTorch bug or problem in upgrade, investigating
- Tested much faster GPU Griffin-Lim implementation, slightly lower quality
- Stop token prediction is done and works well
- Looking into NVIDIA's Tacotron 2 implementation for optimization tips and to use as a Tacotron 2 reference
-
Alexandre
- Working on French parliament corpus to make it suitable for training
- Working on packaging code to improve documentation of packages on PyPI/npm
- Playing with newer versions of OpenCL
- Debugging TensorFlow/KenLM linking error due to double-conversion library clash
-
Kelly (Managers Off-Site)
-
-
TODO
- Talk about eager distribution of alpha/beta for testing
-
Review of on-going work
-
Reuben(PTO)
-
Tilman(PTO)
-
Eren
- Released a new checkpoint (170k steps)
- Working on WORLD vocoder impl to replace Griffin-Lim
- Transfer learning experiments from RNN to CNN
- TBTT to deal with long sample
-
Alexandre
- RPi + LePotato taskcluster cluster
- Getting physical location for cluster
- French text data for CV
- Starting work on GPU versions for ARM boards
- Worked on a clean version + distribution mechanism
-
Kelly
- Emails
- Starting to summarize results of benchmarks
-
-
Update
- Servers in the office, electrician done
-
Review of on-going work
-
Reuben
- Working on TTS deployment - how hard it'd be to stand up an internal test server
- Deep Speech streaming C API
-
Tilman
- Job scheduler
- Implemented resource reservation/permission scheme based on groups
- Users and devices have groups assigned and need to match for user to access device
- Per-group folders are mounted on run directory when user has appropriate permissions
- Implementing resource allocation for ports for inter-node communication
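
The group-based permission scheme described above can be illustrated with a small sketch. It assumes that "match" means the user and the device share at least one group; the function names and the example group names are hypothetical and are not taken from the job scheduler's code.

```python
def can_access(user_groups, device_groups):
    """Assumed rule: access is granted if at least one group is shared."""
    return bool(set(user_groups) & set(device_groups))


def mountable_group_dirs(user_groups, group_dirs):
    """Return the per-group folders a user may have mounted on the run directory.

    group_dirs is a hypothetical list of group names that own a folder.
    """
    return sorted(set(user_groups) & set(group_dirs))


print(can_access({"speech", "test"}, {"test"}))
print(mountable_group_dirs({"speech", "test"}, ["speech", "summarization"]))
```

Set intersection keeps both checks symmetric and independent of group order; a stricter policy (for example, requiring all device groups) would only change the comparison inside `can_access`.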
-
Eren
- Released a new checkpoint (170k steps)
- Working on WORLD vocoder impl to replace Griffin-Lim
- Alexandre (could not attend)
- Kelly (out for the week)
-
Update
- Server delivery was scheduled for today, but Kelly is out
- Check Journal Club presentation
-
Review of on-going work
-
Reuben
- Looking into TTS end-of-sequence prediction
- Deep Speech streaming C API
- Reviewing PRs
-
Tilman
- Job scheduler
- Error handling UI polish
- Reporting timing of state changes for jobs
- Implementing user groups + group permissions for accessing restricted resources
-
Alexandre
- Merged RPi3 testing on TaskCluster
- Two RPi3's running at home linked to TaskCluster
- Missing 2.7 support in Raspbian for NumPy and SciPy packages, so dropped 2.7 builds for RPi3
- Debugging mysterious Node v8/v9 crashes in RPi3 with Valgrind/ASAN (workaround by going back to GCC 4.9)
- Eren (had to go before the meeting)
- Kelly (out for the week)
-
Update
- Servers in Berlin, but working on delivery + movers as we no longer have a freight elevator
- Electricians back from vacation may come in this week
-
Review of on-going work
-
Eren
- TTS variations/improvements
- Vocoder variation
- STT CNN exchange for BRNN
-
Reuben
- pt-BR TTS
- Deep Speech streaming C API
- Streaming blog post
-
Kelly
- Talking with partners
- Cluster creation: Contracting Electrician
- Benchmarking: Issues 1241, 1242, 1243, 1246 (All done except vanilla RNN)
- Language model corpus creation: Issues 1244 and 955 (Dealing with server crashes)
- Starting Sprachspiel implementation (Value Head, Attention, Decoder)
- Learning PyTorch
-
Tilman
- Job scheduler
- Monitoring
- Status report
- Alexandre (PTO)
-
Discussion
- Release notes (Include links to release README)
- Link to proper README in PyPi, npm....
-
Review of on-going work
-
Eren
- TTS variations/improvements
- STT CNN exchange for BRNN
-
Reuben
- WER from inference graph
- Deep Speech streaming
- Streaming blog post
-
Kelly
- Cluster creation: Contracting Electrician
- Benchmarking: Issues 1241, 1242, 1243, 1246
- Language model corpus creation: Issues 1244 and 955
- Starting Sprachspiel implementation (Value Head, Attention, Decoder)
- Learning PyTorch
- Tilman (PTO)
- Alexandre (PTO)
-
Discussion
- Use of GitHub Projects
- Release notes (Include links to release README)
- Link to proper README in PyPi, npm....
-
Review of on-going work
-
Eren (On-Boarding)
- TTS variations/improvements
- STT CNN exchange for BRNN
-
Tilman
- Job scheduler
- Exploring other schedulers (Servers are coming real soon™)
-
Reuben
- WER from inference graph
- Deep Speech streaming
-
Kelly
- Cluster creation: Contracting Electrician
- Benchmarking: Issues 1241, 1242, 1243, 1246
- Language model corpus creation: Issues 1244 and 955
- Learning PyTorch
- Starting Sprachspiel implementation (Value Head, Attention, Decoder)
-
Alexandre
- French Common Voice
- TF OpenCL support
-
Discussion
- Release of RC's
- Confirm that RC's don't install automatically. (Confirmed!)
- Release notes (Include links to release README)
- Link to proper README in PyPi, npm....
-
Review of on-going work
-
Eren
- TTS variations/improvements
- STT CNN exchange for BRNN
-
Tilman
- Job scheduler
- Exploring other schedulers (Servers are coming real soon™)
-
Reuben
- Deep Speech streaming
- Training/Inference calculate MFCC's the same
-
Kelly
- Cluster creation: Contracting Electrician
- Benchmarking: Issues 1241, 1242, 1243, 1246
- Language model corpus creation: Issues 1244 and 955
- Learning PyTorch
- Starting Sprachspiel implementation (Value Head, Attention, Decoder)
-
Alexandre
- French Common Voice
- TF OpenCL support
-
Discussion
- Quick meeting today (Monthly Internal Meeting starts in 30min)
-
Review of on-going work
-
Eren
- TTS variations/improvements
- STT CNN exchange for BRNN
-
Tilman
- Job scheduler
- Exploring other schedulers (Servers are coming real soon™)
-
Reuben
- Deep Speech streaming
-
Kelly
- Cluster creation: Contracting Electrician
- Benchmarking: Issues 1241, 1242, 1243, 1246
- Language model corpus creation: Issues 1244 and 955
- Learning PyTorch
- Starting Sprachspiel implementation
-
Alexandre
- French Common Voice
- TF OpenCL support
-
Discussion?
-
Review of on-going work
- Eren (Out sick)
-
Tilman
- Job scheduler
-
Reuben
- Deep Speech streaming
-
Kelly
- Cluster creation: Contracting Electrician
- Mozilla's letter of support for CDT in NLP
- Benchmarking: Issues 1240, 1241, 1242, 1243, 1246, 1254
- Language model corpus creation: Issues 1244 and 955
- Learning PyTorch
- Starting Sprachspiel implementation
-
Alexandre
- TF OpenCL support
-
Release 0.1.1!
- Congrats!
-
Discussion
-
Review of on-going work
-
Eren
- Work Week Presentations!
- TTS Engine
- Initial code repo not completely working, debugging
- Following work on Vocoder
-
Tilman
- Work Week Presentations!
-
Reuben
- Work Week Presentations!
- Issue 1156 (Language model incorrectly drops spaces)
- TTS Engine
- Creating TTS data
- Creation of MAD+VAD to handle music
- Creation of data sets to train MAD+VAD
-
Kelly
- Work Week Presentations!
- Working on getting UPS+PDU (Sent off PO!)
- Conversational agent research
- Preparations for Work Week
- Interviews (AI4All, Newsy, Contagious Magazine, Let's Get Mental)
- Prioritizing Deep Speech partnerships
-
Alexandre
- Work Week Presentations!
- New OS X workers
- Spec'ing out new OS hardware
- Monolithic TF [done]
-
Release 0.1.1
- Release testing [Reuben] (done)
- Release management [Reuben + Alex] (done?)
- Document hyperparameters for release notes [Kelly]
- Release now!
-
Discussion
-
Review of on-going work
-
Eren
- TTS Engine
- Initial code repo not completely working, debugging
- Following work on Vocoder
-
Tilman
- FOSDEM Presentation!
-
Reuben
- Issue 1156 (Language model incorrectly drops spaces)
- TTS Engine
- Creating TTS data
- Creation of MAD+VAD to handle music
- Creation of data sets to train MAD+VAD
-
Kelly
- Working on getting UPS+PDU (Waiting on Finance's PO)
- Conversational agent research
- Preparations for Release 0.1.1
- Preparations for Work Week
- Common Voice Work Week
-
Alexandre
- New OS X workers
- Spec'ing out new OS hardware
- Monolithic TF
-
Release 0.1.1
- Document training from pb [Reuben] (done)
- Create release notes [All] (done)
- Add leading section of Changes [All] (done)
- Add list of contributors since last release [Kelly] (done)
- Release testing [Reuben]
- Release management [Reuben + Alex]
-
Discussion
- Work week initial agenda[1]
-
Monday
- Pipsqueak - Deep Speech on RPi3 (¼ Day) [Alex]
- Presentation on Pipsqueak (1 hour)
- Feedback/Suggestions on Pipsqueak (1 hour)
- Deep Speech streaming support (⅜ Day) [Tilman]
- Presentation on Deep Speech streaming support (1 hour)
- Feedback/Suggestions on Deep Speech streaming support (1 hour)
- Lunch
- Feedback/Suggestions on Deep Speech streaming support (1 hour)
- Pipsqueak - Architectural variations (¼ Day) [TBD]
- Presentation on Pipsqueak (1 hour)
- Feedback/Suggestions on Pipsqueak - Tie in with Deep Speech streaming support (1 hour)
-
Tuesday
- Job Scheduler (¼ day) [Tilman]
- Presentation on Job Scheduler (1 hour)
- Feedback/Suggestions on Job Scheduler (1 hour)
- Deep Speech additional languages (¼ Day) [Kelly+Michael]
- Common Voice additional languages (1 hour) [Michael]
- Deep Speech additional languages (1 hour) [Kelly]
- Lunch
- Virtual Assistant Work (⅛ Day) [Kelly]
- Introduction to Scout - Mozilla's Virtual Assistant
- Survey of Possible Asks - Intent parser, keyword spotter...
- Automatic Summarization (⅛ Day) [Kelly]
- Introduction to Mozilla's Automatic Summarization Work
- Mozilla's Automatic Summarization Corpora
- Dinner
-
Wednesday
- TTS engine initial architecture (½ Day) [Reuben+Eren]
- Lunch
- Automatic Summarization Firefox integration (¼ Day) [Kelly+Martin]
- 3rd Party Automatic Summarization integration (Unknown) [Martin]
- Mozilla's Automatic Summarization integration (Unknown) [Martin]
- TTS + STT Firefox integration [Kelly + All]
-
Thursday
- Heads Down Working
- Lunch
- Heads Down Working
-
Friday
- Heads Down Working
- Lunch
- Heads Down Working
-
- Work week presentation template[2].
-
Review of on-going work
-
Eren
- Orientation
- TTS Engine
-
Tilman
- FOSDEM Presentation
-
Reuben
- Issue 1156 (Language model incorrectly drops spaces)
- TTS Engine
- Creating TTS data
- Creation of MAD+VAD to handle music
- Creation of data sets to train MAD+VAD
-
Kelly
- Working on getting UPS+PDU (Waiting on Finance's PO)
- Conversational agent research
- Preparations for Release 0.1.1
- Preparations for Work Week
- Common Voice Work Week
- AlphaGo Zero Presentation
-
Alexandre
- New OS X workers
- Spec'ing out new OS hardware
- Monolithic TF
-
Release 0.1.1
- Integrating frozen model testing into our main code [Reuben] (done)
- Create pb file [Kelly] (done)
- TODO: Document training from pb [Reuben]
- TODO: Create release notes [All]
- Add leading section of Changes [All]
- Add list of contributors since last release [Kelly]
-
Discussion
- Adversarial Attacks (1801.01944, 1801.00554...)
- Open issue on 1801.01944
- Work week initial agenda
- Job Scheduler
- Deep Speech streaming support [Point Person:Tilman]
- Deep Speech additional languages [Point Person:Kelly]
- Pipsqueak (Deep Speech on RPi3) (Should also explore arch variations) [Point Person:Alex]
- TTS engine initial architecture [Point People:Reuben+Eren]
- Virtual Assistant Work (Initial possibilities: Intent parser, keyword spotter, conversational agent...)
- Automatic summarization
-
Review of on-going work
-
Eren
- Orientation
- TTS Engine
-
Tilman
- Corpus creation
- Audio augmentation
- Job scheduler
- Addressing PR comments
-
Reuben
- Issue 1156 (Language model incorrectly drops spaces)
- TTS Engine
-
Kelly
- Working on getting new hardware (Everything signed, waiting on payment + delivery) [done]
- Working on getting UPS+PDU (Quote took too long to process, new quote active and in flight) [doing]
- Conversational agent research
- Preparations for Release 0.1.1
-
Alexandre
- New OS X workers
-
Announcements
- Eren!
- Work week: Week of February 12
-
Release 0.1.1
- Train model with same hyperparameters as in 0.1.0 model [Kelly] (done)
- Harmonize WER with Kaldi's WER (done) [WER 5.6% on librivox clean test]
- Integrating frozen model testing into our main code [Reuben] (done not merged)
- TODO: Create pb file [Kelly]
- TODO: Link deepspeech properly on macOS (@executable_path) [Alex] (Fixed in #1051)
- TODO: Document training from pb [Reuben]
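
For reference, word error rate as tracked in the release items above is conventionally the word-level edit distance divided by the number of reference words. The sketch below shows that standard definition only; it does not reproduce Kaldi's scoring scripts or any text normalization they apply, and the function name is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


print(word_error_rate("a b c d", "a x c"))
```

Differences between two WER implementations usually come from tokenization and normalization (case, punctuation, how empty references are handled) rather than from the distance computation itself.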
-
Discussion
- Broad 2018 Goals
- Ship 3 ML based technologies and/or services in Firefox
- Release Speech-to-Text engine + models w/<10% WER on 2 other data sets
- Release Speech-to-Text engine and non-English model
- Release RPi3 Speech-to-Text engine and English model(s)
- Release Text-to-Speech engine + English model(s)
- Common Voice as Largest Open English Corpus + another language
- Support Assistants group with required ML algorithms and models
- Explore ML + Data business models
- Work week initial agenda
- Deep Speech streaming support
- Deep Speech additional languages
- Pipsqueak (Deep Speech on RPi3)
- TTS engine initial architecture
- Virtual Assistant Work (Details dependent upon Jofish's team's needs)
-
Review of on-going work
-
Eren
- Orientation
- TTS Engine
-
Tilman
- Corpus creation
- Audio augmentation
- Job scheduler
- Addressing PR comments
-
Reuben
- Issue 1156 (Language model incorrectly drops spaces)
- TTS Engine
-
Kelly
- Training production model on cluster [done]
- Spec'ing out more servers [done]
- Working on getting new hardware (Everything signed, waiting on payment + delivery) [done]
- Working on getting UPS+PDU (Waiting on Bebenita) [doing]
- Alexandre (Sick)
-
Release 0.1.1
- Harmonize WER with Kaldi's WER [Kelly or Tilman]
- Train model with same hyperparameters as in 0.1.0 model [Kelly]
- TODO: Link deepspeech properly on macOS (@executable_path) [Alex]
- TODO: Integrating frozen model testing into our main code [Reuben]
-
Discussion
- Date for 2018 kick-off work week
Review of on-going work
- TF 1.4 support landed [Alex]
- Python 2.7, 3.4, 3.5 and 3.6 builds of our TensorFlow fork [Alex]
- General fixes for problems encountered by users [Alex]
- Removed dependency on older version of SciPy [Alex]
- TODO: Link deepspeech properly on macOS (@executable_path) [Alex]
- Ran benchmarks comparing impact of AVX/AVX2 instructions. Almost no difference in inference time. [Alex]
- Almost done collecting dataset for summarization [Anurag]
- Ran benchmarks on GRU vs LSTM. LSTM converging faster [Anurag]
- TODO: tune hyperparameters over current architecture [Anurag]
- TODO: test orthogonal RNNs [Anurag]
- Got test epochs to work with frozen models [Reuben]
- Fixed data race in feeding code that caused non-deterministic Word Error Rates when running on a single machine [Reuben]
- TODO: Integrating frozen model testing into our main code [Reuben]
- Working on voice agent demo [Tilman]
- Looking into making inference streamable [Tilman]
- TODO: benchmark runs with different architectures (unidirectional vs. bidirectional) [Tilman]
Discussion
- Should we disable AVX2 in our TensorFlow packages?
- Should we do tests on macOS? Alex checking if we can get mac minis as workers.
-
Release TODO
- Go live with the Hacks blog post (Wait until Wednesday 8am PST) [Reuben]
-
All-Hands TODO
- Demo using pb? [Reuben]
-
Discussion
- TF 1.4 support finish at all-hands
- How to support JS in future with no SWIG support
-
Release TODO
- Test CV data set [Tilman]
- Update documents to suggest virtual env [Alex]
- Add GPU whl's to PyPI + update docs [Reuben]
- Change contact point for PyPI packages [Reuben]
- Add node packages by hand + update docs [Reuben + Alex]
- Tag release [Reuben + Alex]
- Upload model + LibriVox clean audio samples to github releases [Kelly]
- Remake gif using github releases model + LibriVox clean audio samples [Kelly]
- Update docs with new gif [Kelly]
- Go live with the Hacks blog post [Reuben]
- Inform partners of release [Kelly]
- Add documentation on GPU vs CPU speed (Talk about numbers from Reuben's computer) [Kelly]
- Add documentation that model size is not optimized [Kelly]
- Apply roundings of graph (Nice to have) [Reuben]
- Write release notes (Send out for review) [Kelly]
- Email testers to create virtual env (Done) [Kelly]
- Link to discourse in release notes/readme [Kelly]
-
Announcements
- Work week on week of Nov 13 in Berlin
- Talk about corpus targets (Street, office....)
- Comms wants us to release on the 21st because of 57
-
Discussion
- Packaging progress?
- Training (Looking good but with LM decoder a bit slow) [Kelly]
- Documentation [Kelly+Reuben]
- Communications (Golem coming to video/interview us on Nov 15) [Kelly]
- Discourse forum on Deep Speech (Done) [Kelly]
- Model testing on Task Cluster [Kelly+ Alex]
- Give large model to Alex (Done) [Kelly]
- Finding beta testers [Michael]
-
Review of on-going work
-
Tilman
- Corpus creation
- Audio augmentation
- Addressing PR comments
-
Reuben
- Reviewing Tilman's PR
- Trying TensorFlow 1.4 MFCC's
- Reach out to hacks
- NPR Importer
-
Kelly
- Training production model on cluster
- Spec'ing out more servers
- Working on getting new hardware
- Alexandre (PTO)
-
Anurag (In NY,NY)
- Creating ORNN presentation
- Formalize corpus creation
-
Announcements
- Work week on week of Nov 13 in Berlin
-
Discussion
- Packaging progress?
- Training (Trained w/out Fisher. Fisher export corrupted, re-exporting) [Kelly]
- Documentation [Kelly+Reuben]
- native_client README.md re-written [Reuben]
- Communications (Golem coming to video/interview us on the release) [Kelly]
- Discourse forum on Deep Speech [Kelly]
- Model testing on Task Cluster [Kelly+ Alex]
- Give large model to Alex [Kelly]
- Finding beta testers [Michael]
-
Review of on-going work
- Tilman (PTO)
-
Reuben
- Documentation native client README.md
- Fixing native client little problems, e.g. error messages, what happens when a param is not there
- Reach out to hacks
-
Kelly
- Setting up current master to run on cluster
- Completed run without Fisher
- Re-exporting Fisher as it seems to be corrupted
- Spec'ing out more servers
- Journal Club A Neural Conversation Model plus other related work?
-
Alexandre
- Python & Node packages cross-compilation locally
- Progress on the use of tfcompile
- Build time the tfcompile configuration file
- Audio length now variable
- Simplifying AOT use
- Anurag (In NY,NY)
-
TODO
- One-pagers motivation on github
-
Announcements
- Work week on week of Nov 13 in Berlin
-
Discussion
- Packaging progress?
- Train model [Kelly]
- Documentation [Kelly+Reuben]
- Communications [Kelly]
- Discourse forum on Deep Speech [Kelly]
- Model testing on Task Cluster [Kelly+Alex]
-
Review of on-going work
-
Tilman
- Cocktail Party Noise importer
- Re-Review Germany blog post
-
Reuben
- Documentation native client README.md
- Fixing small native client problems, e.g. error messages and behavior when a param is missing
- Reach out to hacks
-
Kelly
- Setting up current master to run on cluster
- Speccing out more servers
- Journal Club: "Get To The Point: Summarization with Pointer-Generator Networks"
-
Alexandre
- Reviewing Tilman's PR
- Progress on the use of tfcompile
- Generating the tfcompile configuration file at build time
- Audio length now variable
- Simplifying AOT use
-
Anurag
- Working on Wikipedia based data sets
-
TODO
- One-pagers motivation on github
-
Announcements
- Update on Berlin office opening, C-Level demos, press coverage
- Work week on week of Nov 13 in Berlin
-
Discussion
- Packaging progress?
- Packing script done
- Train model [Kelly]
- Documentation [Kelly]
- Marketing, comic, Hacks [Kelly]
- Discourse forum on Deep Speech [Kelly]
- Model testing on Task Cluster [Kelly+Alex]
- Custom CTC decoder in native clients [Reuben]
- Tool to load checkpoint (Refactor Deep Speech) [Reuben]
- Update export/loading to new API [?]
-
Review of on-going work
-
Tilman
- Rebasing code
- Testing rebased code
- Cocktail Party Noise importer
-
Reuben
- Language Model Blog Post (The Deep Speech Journey)
- Custom CTC in all native clients
- Reach out to hacks
-
Kelly
- Setting up current master to run on cluster
- Speccing out more servers
-
Alexandre
- Progress on the use of tfcompile
- Generating the tfcompile configuration file at build time
- Audio length now variable
- Simplifying AOT use
-
Anurag
- Automatic summarization literature review
- Working on Wikipedia based data sets
-
TODO
- One-pagers motivation on github
-
Announcements
- Report on managers meeting
-
Discussion
- Packaging progress?
- Packing script done
- Train model [Kelly]
- Documentation [Kelly]
- Marketing, comic, Hacks [Kelly]
- Discourse forum on Deep Speech [Kelly]
- Model testing on Task Cluster [Kelly+Alex]
- Custom CTC decoder in native clients [Reuben]
- Tool to load checkpoint (Refactor Deep Speech) [Reuben]
- Update export/loading to new API [?]
-
Review of on-going work
-
Tilman
- Rebasing code
- Testing rebased code
- Cocktail Party Noise importer
-
Reuben
- Language Model Blog Post (The Deep Speech Journey)
- Custom CTC in all native clients
- Reach out to hacks
-
Kelly
- Setting up demo on my laptop
- Setting up current master to run on cluster
- Setting up C-Level Common Voice demo
- Got funding for more servers
-
Alexandre
- Progress on the use of tfcompile
- Generating the tfcompile configuration file at build time
- Audio length now variable
- Simplifying AOT use
-
Anurag
- Automatic summarization literature review
- Working on Wikipedia based data sets
-
Announcements
-
Discussion
- Packaging progress?
- Packing script done
- Train model [Kelly]
- Documentation [Kelly]
- Marketing, comic, Hacks [Kelly]
- Discourse forum on Deep Speech [Kelly]
- Host model/release on github releases
- Model testing on Task Cluster [Kelly+Alex]
- Client allows batch processing [No]
- Custom CTC decoder in native clients [Reuben]
- Tool to load checkpoint (Refactor Deep Speech) [Reuben]
-
Review of on-going work
-
Tilman
- Added module based logging
- In-graph replication single mode working
- In-graph replication cluster mode working
- Checkpoint logic re-write to do early stopping
-
Reuben
- Language Model Blog Post (The Deep Speech Journey)
- Custom CTC in all native clients
- Demo tool
- Reach out to hacks
-
Kelly
- Gave talk in Taipei on Common Voice for Mozilla Developer Conference
- Writing Mozilla Research Grant Proposal with IAS[4]
- Writing initial 2018 plan for the Machine Learning group
-
Alexandre
- OS X taskcluster integration with new MacBook Pro!
- Optimized task cluster build!
- Hosting two meetups!
-
Anurag
- Journal club presentation
- Reading Google's NMT paper
- Automatic summarization literature review
-
Announcements
- Tomorrow Kelly will talk at BerlinNLP about Deep Speech, will be recorded.
-
Discussion
- Packaging for alpha release, what needs to be done? (Unblocking community)
- Setup testing of model on TaskCluster ("Should be easy" -Alex) [Alex]
- Write script that loads checkpoint + does inference [Reuben]
- Write documentation [Kelly]
- Decide where to store models (Language Models + DeepSpeech Models) [Git LFS (LM) + release (GitHub Release)]
- Native client binaries: how are they obtained + installed? [Git LFS, S3...]
- Release should contain everything: language models + DeepSpeech model + code...
- Decide how to package: models + frozen model
- Include a demo? (native client, Reuben’s GUI merged after release [Reuben])
-
Review of on-going work
- Tilman Debugging PR on dynamic batch size & in-graph replication & Cocktail Party
-
Reuben
- Language Model Blog Post (The Deep Speech Journey)
-
Kelly
- Traveling to Taipei for Mozilla Developer Conference
- Creating Common Voice presentation for Mozilla Developer Conference
- Creating Deep Speech presentation for Berlin NLP Meetup
- Automatic Summarization massive param study
- Preparing yaml for baseline model variations
- Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
- Issue 625 (Create NewsML Importer) (Trying new Opus VAD)
- Interviewing job candidates
- Creating Pocket Corpus
-
Alexandre
- PR 834 (TaskCluster Decision Task)
-
Anurag
- Benchmark checkpoints upload
- Automatic summarization literature review
-
Announcements
- Working with legal on:
- Possible integration of Google open speech corpus into Common Voice corpus
- Possible integration of Mythic's open speech corpus into Common Voice corpus
-
Review of on-going work
- Tilman Debugging PR on dynamic batch size & Cocktail Party
-
Reuben
- PR 805 (Score CTC prefix beams with KenLM)
- Language Model Blog Post
- Review PR 810 (Local/Remote benchmarking tool)
- WER Report debugging
-
Kelly
- Creating Common Voice presentation for Mozilla Developer Conference
- Creating Deep Speech presentation for Berlin NLP Meetup
- Automatic Summarization massive param study
- Preparing yaml for baseline model variations
- Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
- Issue 625 (Create NewsML Importer) (VAD is getting tripped up by music)
- Interview with Vozpopuli
- Interviewing job candidates
- Reviewing PR 805 (Score CTC prefix beams with KenLM)
-
Alexandre
- PR 820 (Issue818+819)
- PR 810 (Local/Remote benchmarking tool)
- Reviewing PR 805 (Score CTC prefix beams with KenLM)
- Journal Club: Pete Warden's book "Building Mobile Applications with TensorFlow"
-
Announcements
- Reuben's PR[2] integrating the language model deeper into the CTC decoder gives WER of 6.48% on Librivox clean
- With WER of 6.48% on Librivox clean it appears[1] as if we're the best FOSS STT engine
- Talked with Pete Warden of Google, tech lead of embedded TF + lead of the Google open speech corpus
- Possibility of collaboration on Common Voice
- Wants to work with us on fixing TF bugs preventing Alex from progressing on quantization
- Tomorrow TV interview with NTN24 on Common Voice + Speech at Mozilla
-
Review of on-going work
- Tilman Debugging PR on dynamic batch size & Cocktail Party
- Reuben Journal Club & DS2 Tests & WER Report debugging
-
Kelly
- Automatic Summarization massive param study
- Preparing yaml for baseline model variations
- Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
- Issue 625 (Create NewsML Importer) (VAD is getting tripped up by music)
- Interview with NTN24 and Vozpopuli
- Interviewing job candidates
- Reviewing PR 805 (Score CTC prefix beams with KenLM)
- Review Bug 1396158 (Removal of Pocketsphinx from Firefox)
- Alexandre (Sick)
-
Announcements
- Running TED+Librivox+Fisher+Switchboard training on both cluster nodes
- Librivox clean test data set
- Librivox 4-gram language model
- Librivox clean validation data set
-
Review of on-going work
- Tilman(PTO)
- Reuben CTC Decoder + Language model & DS2 Tests & WER Report debugging
-
Kelly
- Automatic Summarization massive param study
- Preparing yaml for baseline model
- Preparing yaml for baseline model variations
- Getting seq2seq framework to work on the cluster using our enqueue/SLURM framework
- Training TED+LibriVox+Fisher+Switchboard for 13 epochs
- Issue 625 (Create NewsML Importer) [doing]
- Issue 791 (Switch language model to OpenSLR's standard LibriSpeech 4gram model)
- Alexandre C++, Python, and nodejs bindings + AOT (Dealing with compiled-in WAV file length problem)
-
Announcements
- Running Issue 759 (More BiRNN layers no LM) on the server
- Presentation to Sean+Azita+Katharina on Common Voice "Six Month Plan" went well
- Finished TED+Librivox+Fisher 13 epoch run
- WER 27.19% on TED test set
- Google's WER for TED test is 27.32%[1].
- TED test set is hard!
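For reference, WER figures like the ones above are conventionally computed as the word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal sketch, assuming whitespace tokenization and no text normalization; the function name is illustrative and this is not the project's evaluation code:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.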
-
Discussion
- Early stopping architecture
- Checkpoint every epoch in dir1
- Checkpoint every 10 min in dir2
- Flag to switch which checkpoint is used
- Packaging model for "alpha soft release"
- Baby step TED+Librivox+Fisher+SWD training [Kelly]
- Baby step verify deployment infra works [Alex]
- Baby step NPR importer [Kelly]
- Baby step TED+Librivox+Fisher+SWD+NPR training [Kelly]
- Adapting engine to any custom Language
- Reuben integrating this with current LM work
- Should create an alphabet file with the alphabet in use
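The early stopping checkpoint scheme discussed above (one checkpoint per epoch in dir1, one every 10 minutes in dir2, plus a flag choosing which to restore from) could be sketched roughly as below. This is only an illustration under assumptions: the class, the method names, and the `save(directory, step)` callback are hypothetical stand-ins, not the actual Deep Speech training code.

```python
import time
from typing import Callable


class DualCheckpointer:
    """Keeps per-epoch and time-based checkpoints in separate directories."""

    def __init__(
        self,
        save: Callable[[str, int], None],  # hypothetical: save(directory, step)
        epoch_dir: str = "dir1",
        timed_dir: str = "dir2",
        interval_secs: float = 600.0,      # 10 minutes, as in the notes
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.save = save
        self.epoch_dir = epoch_dir
        self.timed_dir = timed_dir
        self.interval_secs = interval_secs
        self.clock = clock
        self._last_timed = clock()

    def after_step(self, step: int) -> None:
        # Time-based checkpoint: protects against losing work mid-epoch.
        now = self.clock()
        if now - self._last_timed >= self.interval_secs:
            self.save(self.timed_dir, step)
            self._last_timed = now

    def after_epoch(self, step: int) -> None:
        # Epoch checkpoint: a stable point to roll back to for early stopping.
        self.save(self.epoch_dir, step)

    def restore_dir(self, use_epoch_checkpoints: bool) -> str:
        # The "flag to switch which checkpoint is used".
        return self.epoch_dir if use_epoch_checkpoints else self.timed_dir
```

Keeping the two streams in separate directories means the frequent timed checkpoints never overwrite the epoch checkpoint that early stopping would want to restore.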
-
Review of on-going work
- Reuben CTC Decoder + Language model + DS2 Tests
- Tilman(PTO)
- Anurag Deep Compression[2] training
-
Kelly
- Automatic Summarization data set creation
- Automatic Summarization parameter study
- Automatic Summarization data set pre-processing for seq2seq framework
- Issue 759+760 (Add more BRNN layers subtract language model)
- Issue 625 (Create NewsML Importer) [doing]
- Issue 692 (Adapting engine to any Custom Language) [doing]
- Preparation for Weekly Journal Club Meeting
- Alexandre C++, Python, and nodejs bindings
-
Announcements
- Currently doing a TED+Librivox+Fisher (Dealing with continuation strangeness)
- Old server back online at the new offices (Internally accessible. VPN?)
-
Discussion
- Do we want to revise a site that reports on WER?
-
Review of on-going work
- Reuben CTC Decoder + Language model + DS2 Tests
- Tilman Dynamic Batch Sizing
- Anurag Deep Compression[2] + Benchmarking + Early Stopping
-
Kelly
- Mycroft PR partnership
- Automatic Summarization data set creation
- Automatic Summarization parameter study
- Six Month Plan for Common Voice
- Presentation to Sean on Common Voice Six Month Plan
- Talk with Voice Fill
- Alexandre(PTO)
-
Announcements
- Currently doing a TED+Librivox+Fisher run
- New servers back online at the new offices
- Old server still not back online at the new offices
-
Discussion
- Distributing models, how do we want to do this?
- Discussion results:
- Distribute protobuf + checkpoint and tf_compile result
- Need to choose distribution of training data, i.e. which training sets to train on
- Fine tune model with a custom data set to target particular use case, TBD
-
Review of on-going work
- Reuben CTC Decoder + Language model + Journal Club
- Tilman Dynamic Batch Sizing
- Anurag Deep Compression[2] + Benchmarking (On hold until old server installed in new Berlin office) + Early Stopping
-
Kelly
- NPR Importer
- Mycroft PR partnership
- New Berlin Office Servers
- Automatic Summarization Roadmap
- Six Month Plan for Common Voice
- Presentation to Sean on Common Voice Six Month Plan
- Alexandre OS X TC builds + nodejs builds
-
TODO
- Look into getting Nanshu access to the Berlin cluster [kelly]
-
Announcements
- Segfault looks to be solved by transparent huge pages being switched off!!!
- Currently doing a TED+Librivox+Fisher run
- Mycroft is making PRs in Common Voice and Deep Speech to follow
- Deep Speech + Common Voice blog post live on Mozilla Blog[1]
- SoftAtHome interested in helping with putting DeepSpeech on RPi3
- Meeting with i2x to discuss possible collaborations
- New servers will be offline sometime this week when moved from the colocation facility to the new offices
- Old server will be offline this week until new hardware is delivered to "rack it"
-
Review of on-going work
-
TODO
- Look into getting Nanshu access to the Berlin cluster [kelly]
-
Announcements
- Working on Mycroft partnership (1-2 more devs + contributions to Common Voice) meeting tomorrow
- Common Voice getting 20k-40k contributions per day
- RiseML created blog post on distributed Deep Speech but hasn't made it public yet
- European Language Resource Association interested in partnering on Common Voice
- Tons of Common Voice press
-
Review of on-going work
- Reuben CTC Decoder (Integrated into TensorFlow, but how to expose to external devs)
- Tilman Segfault Distributed Tensorflow + Dynamic Batch Sizing
- Anurag Deep Compression[0] + Benchmarking[1][2][3][4][5]... + Early Stopping
- Kelly Journal Club Kronecker Recurrent Units + Common Voice Press + Automatic Summarization + NPR Importer[7]
- Alexandre Getting RNN and tfcompile happy together + OS X TC builds + nodejs builds
-
TODO
- Look into getting Nanshu access to the Berlin cluster (On hold until segfault is solved)
- Contact Urdu, Macedonian... developers to see if they want to open source their models and have us host them, e.g. on S3 [done]
- Open issue to allow our code to work on various languages easily [done]
-
Welcome - Nanshu Wang, Rob Smith, Nicholas Lane
-
Announcements
- Working on Mycroft partnership (1-2 more devs + contributions to Common Voice)
- PC Mag covered Common Voice "Mozilla Asks Everyone to Donate Their Voice"
- Common Voice getting 14k contributions per day
- RiseML porting code to Google Cloud
- RiseML creating blog post on distributed Deep Speech
-
Future work
- Kelly Text To Speech & Automatic Summarization
- Anurag DeepCompression