TODO.txt


* changelog and readme
* What else is on the git repository as issues?

* Segmentation options, especially merging, to be sorted (see below)
* The full set of features to be implemented, checked, and then used
* LSTM and CNN plus other learners
* tSNE and PCA

* For cheatsheet and zooniverse
    -> sort out data
    -> make the files
    -> Cheatsheet: make the webpage
    -> Zooniverse: 
        -> What data?
        -> Picture of animal
        -> Sample files
        -> Put the data into the web portal

* Use the spectrogram inversion code to (a) change pitch, (b) clean the file as an image and reconvert

* Wigner-Villes Distribution code (eventually cython)
* New fundamental frequency algorithm

* More thought about how we decide and encode certainty (certainly is, certainly isn't)

# ==============
# TODO

# Finish segmentation
#   Add a minimum length of time for a segment -> make this a parameter
#   Finish sorting out parameters for median clipping segmentation, energy segmentation
#   Finish cross-correlation to pick out similar bits of spectrogram -> and what other methods?
#   Add something that aggregates them -> needs planning

# Interface -> inverted spectrogram does not work - spec and amp do not synchronize

# Actions -> Denoise -> median filter check
# Make the median filter on the spectrogram have params and a dialog. Other options?

# Finish the raven features

# Would it be good to smooth the image? Actually, lots of ideas here! Might be nice way to denoise?
    # Median filter, smoothing, consider also grab-cut
    # Continue to play with inverting spectrogram

# Colourmaps
    # HistogramLUTItem

# Context menu different for day and night birds?

# Minor:
# Consider always resampling to 22050Hz (except when it's less in file :) )?
# Font size to match segment size -> make it smaller, could also move it up or down as appropriate
# Where should label be written?
# Use intensity of colour to encode certainty?
# If don't select something in context menu get error -> not critical
# Colours of the segments to be visible with different colourmaps? Not important!

# Look at raven and praat and luscinia -> what else is actually useful? Other annotations on graphs?

# Don't really want to load the whole thing, just 5 mins, and then move through with arrows -> how?
# This is sometimes called paging, I think. (y, sr = librosa.load(filename, offset=15.0, duration=5.0) might help. Doesn't do much for the overview through)

# ===============
# TODO for AviaNZ

Tier 1 annotation - labels to use so that we can evaluate results in detail
## AviaNZ annotation ######################################## in corresponding GT (-sec.txt) #############
quality     male            female          cannot decide   # time     presence/absence type    quality  #
v close     'Kiwi(M)1'      'Kiwi(F)1'      'Kiwi1'         # 1        1                M/F     *****    #
close       'Kiwi(M)2'      'Kiwi(F)2'      'Kiwi2'         #                                            #
faded       'Kiwi(M)3'      'Kiwi(F)3'      'Kiwi3'         #                                            #
v faded     'Kiwi(M)4'      'Kiwi(F)4'      'Kiwi4'         #                                            #
v v faded   'Kiwi(M)5'      'Kiwi(F)5'      'Kiwi5'         # n        1                M/F     *        #
                                                                                                         #
Bittern annotation - same as kiwi                                                                        #
            'Bittern(B)1' to 'Bittern(B)5' for the booms                                                 #
            'Bittern(I)1' to 'Bittern(I)5' for the inhalations                                           #
##########################################################################################################

AvianZ seems to be stable now! Just minor things
    Set Operator/Reviewer (Current File) -> does not remember if it has no annotation. Instead do we make
    the option disable if it has no annotation?
    Spectrogram options highest frq. 16000->4000 does it need to remember when moving to next file (currently not)
        Ask if you want to save it, then put it in the params file?

    Bug to fix: the very first annotation gets lost.
    #done
    annotations lost after changing interface settings.
    #done, update the overview
    Add a sub menu item 'Save selected sound': useful to extract interesting sounds from long recordings (this stops me going back to Praat)
    #done
    Change spectrogram parameters is not working smooth
        -   if someone changes window length and hop they assume it is saved (when you open a new file changes are gone), so let it remember?
        -   low and high frequency is the same, in addition it is ugly!
    GT format - need GT is in .txt format (no need to have them as .xlsx) then update the code
    #done
    Time axis is wrong when file name has more text in the begining e.g. SM recordings
    #done

Look into MFCC as a way to do template matching. Plus some others, but start there.
So for each segment identified by the wavelets, compute the coefficients, and plot them for now :)
# done some experiments with Ponui dataset and Tier1. MFCC + DTW could manage Ponui data but not good for Tier1, might need more examples
Training data
    -   needs to be less noisy and not overlapped with other species.
    -   how long the examples should be? complete call or individual syllbles?
    -   I choose syllables, then the problem was test sounds (AviaNZ detections) are usually complete calls/more than one syllables.
        length/#syllables in test and templates matter, so I got the average MFCC over all frames.
    -   how many MFCCs? tested with 12, 20, 48. 12 with delta was better.

NP:
Introduce confidence using energy ratio after detection
# done, eRatio was good at picking level 1-3 according to my testing.
# I'm processing Tier1 data using wavelet segmentation followed by this eRatio and generating two excel
# outputs:
        (1) 'Possible': includes all the segments that turned out to be kiwi
        (2) 'withCofidence':  includes only the segments confirmed with eRatio.
# and the annotation with 'kiwi' for segments with confidence and 'kiwi?' for possibles.
# Phase 1 (wavelet detection) of Tier1 data processing is completing on 15th Jan 2018. Started on 30th Oct, done mostly
  between 3 machines (plus 6 from the undergrad lab for 1 week). Tried Azure but wan't easy and had to gave it up (needs data to be uploaded).

SM:

from PyQt4 import QtGui
QtGui.QImageReader.supportedImageFormats()


Check the parameters for the clicks
    And where to use -- current idea is to remove them during/after detection
And can we use the same idea for wind?

Adaptive noise floor/SNR

Finish feature vectors
Look into image denoising more

SM Choose best sets of nodes and evaluate
SM Make the general segmentation more sensitive
SM Get to grips with the bloody wavelet segmenter

NP Features
NP Good training set
NP Work out how not to mistake kiwi
NP Sort out 1 version of code for excel exporting with 1 minute presence/absence
    Move from AviaNZ to SupportClasses
    #Done
    Add the 1 min workflow (see below)
    #Done the first version - arrange output without changing the detectors
NP Check manual
    #Done

We should start to decide what makes a specialised filter
    Mingap, minlength
    Threshold for segmenters
    Wavelet nodes
    Trained classifiers

(2) SRM And the video!
(6) SRM Multiprocessing where possible
    It made things slower!
(7) NP Classify segments
    (i) Wavelets better version
    (ii) Learning based on whatever (check Raven features)
    (iii) SRM Wavelet energy features

# What we want from the Tier 1 output:
Presence or absence at some time resolution (e.g., 1 min, or 5 mins)
    (1) Pick up a 1 min section
    (2) Detect kiwi
    (3) As soon as a definite kiwi is detected, mark presence, move on to next section
    (4) If a possible kiwi is detected, mark it and keep on going
    (5) If you find a definite, delete possible, go to (1)
    (6) If find no kiwi mark as absence
    (7) If find only possibles, ask user at end of all file processing using Check segments interface

Work out how to make old wavelet method produce too many segments

Make a new method (not inside Segmentation):
    (1) Perform segmentation using at least one of old wavelet method and any other segmenter, possibly multiple options
    # Partly done, (FIR + median clip). Add in wavelets.
    # And it does need parameters
    (2) Combine the segments as appropriate
    # Two versions are done. Opens it out so far, no max. Either with or without envelopes kept.
    (3) Perform classification on each segment using at least one of wavelet energies, Raven features, MFCC, LPC, fundamental freq
    # To be done (ready to modify wavelet segmentation part)

Plot the wavelet energy for noise, crickets, calls, etc.
    Suppress the noise nodes, reconstruct -- what happens?
Parallel process the segmenters

get name from standard files and use it
    # Name bit is done, not yet used

# Look into the invertible CQT
    # Doesn't seem to help with tril1
# Dominant frequency
# warbleR features
# plots of eg xcorr
# DTW - 2D, fast
    look into DP

NP:

Test the idea of using the wavelet nodes with either
    (a or b or c) and not (d or e)
    (a and (b or c))
    or maybe both :)
    Will have to count the number of times each occurs
    Idea is to reduce (i) misclassifications, (2) crickets, (3) wide-band clicks, (4) rain

Denoising experiments
    -> short time better than long?
    -> denoise only segments
    -> compare python and matlab

Segmentation experiments
    -> like thesis but better :)
    -> paper

(1) Minor bugs and extensions

    NP: Read mp3 files
# SM: pysox

    ??: What to do with stereo sound? How about consistent sample rates?

    NP: Make play sounds play the denoised version after denoising
# This seems to be complicated with undo etc. Still thinking whats the best way to do it,
  can easily add a separate button to play denoised but not nice. I fixed the plotting problem though.

	NP: Stop loading the file when choose to cancel the progress bar
# Removed the cancel button for now. otherwise have to unroll what happened inside the loadFile when the user cancel it.

	File list dock becomes frozen - one was experiencing this. Had to restart the program.
	    # Can we reproduce it?

    NP: Fully integrate wavelet seg into program
    NP: We actually want Kiwi (M) and Kiwi (F), and need to get all the ruru calls
    NP: Want to have some form of machine learning
		(1) decision tree
		(2) MLP
		(3) SVM
		(4) boosting
    SM: And need to think about the 95% confidence thing a lot more

    SM: Make the segmentations work fully
        Get the minimum time of a segment parameter sorted
        Work out how to combine the methods, particularly, spot overlaps in segments and combine them
        Work out parameters and how to set them
        Remove cross-correlation? Or just improve?

    Both: Think about nice ways to train a new wavelet filter
		And get the whole workflow sorted for it

(4) Features
    Use the wavelets
    Finish the Raven features
    Add MFCC (nearly done)
    And whatever else seems interesting

    More on fundamental frequency
        smoothing?
        harvest or bana (yaapt was awful)
    Shape metrics

(5) Learning
    Standard methods
        MLP
        Decision tree
        Boosting
        SVM
    LSTM or GRU

    HMM to string syllables together

(6) Other
    Think more about the spectrogram inversion
        If it works --> Stu's bats
        Denoise the spectrogram fully (median filter, smoothing, consider grab-cut)

    Any necessary database or metadata things?

    Bats do keep on coming up...

    Generative noise model

# Tier 1
(1) Segmentation that is as good as we can get it
(2) Wavelet recognition ditto
(3) Non-wavelet recognition ditto