
Commit

Added example data and improved docs
mbsantiago committed Sep 26, 2024
1 parent 136eb8f commit 8a957a6
Showing 24 changed files with 252 additions and 81 deletions.
48 changes: 36 additions & 12 deletions back/docs/README.md
@@ -2,18 +2,42 @@

![Whombat Logo](assets/logo.svg)

Welcome to the Whombat documentation, your go-to resource for understanding and
utilizing this open-source audio annotation tool tailored for ML development.
Welcome to the _Whombat_ documentation, your go-to resource for understanding and utilizing this open-source audio annotation tool tailored for ML development.

If you're eager to dive into using Whombat, explore the
[User Guide](user_guide/index.md) for comprehensive instructions.
If you're eager to dive into using Whombat, explore the [User Guide](user_guide/index.md) for comprehensive instructions.

For those keen on delving into the internal workings of Whombat, integrating it
with Python scripts, or contributing to its development, keep an eye on the
upcoming [Developer Guide](developer_guide/index.md). Additionally, find
detailed technical documentation in the [Reference](reference/api.md) section. Check
out the [Contributing](CONTRIBUTING.md) section to learn how you can contribute
to the Whombat community.
## Why Whombat?

Please bear with us, as this documentation is a work in progress, and we're
working hard to smooth out any rough edges.
In the realm of audio analysis tools, numerous options exist, each with its specific strengths.
Tools like [Raven](https://www.ravensoundsoftware.com/) and [Audacity](https://www.audacityteam.org/) excel at audio analysis but might fall short when it comes to the specialized needs of audio annotation for machine learning development.
Whombat bridges this gap by focusing on the following key requirements:

1. **Evolving Datasets**: Whombat empowers you to build and manage enduring datasets of audio recordings, accommodating changes and updates as your project evolves.

2. **Structured Annotation Projects**: Create well-defined annotation projects, complete with clear instructions and tasks.
Track progress and ensure consistency across your team.

3. **Annotation Review**: Thoroughly examine and validate existing annotations to maintain data quality.

4. **Import/Export Flexibility**: Import and export annotations in various formats, facilitating collaboration and integration with other tools.

5. **Flexible Annotation**: Annotate sounds with precision, specifying Regions of Interest (ROIs) using bounding boxes, intervals, or other methods.
Attach rich metadata to annotations, including tags and notes.

6. **Sound Exploration**: Explore your sound collection to uncover patterns and gain insights.
Facilitate sound identification and classification through interactive tools.

7. **Model Comparison**: Compare model outputs with human annotations to identify areas where your model is underperforming and pinpoint where additional annotation is needed.

_Whombat_'s design prioritizes **careful, manual data curation**.
We believe this emphasis on precision will enable the community to build gold-standard bioacoustic datasets, fueling the development of cutting-edge machine learning models.

## Learn more

For those keen on delving into the internal workings of _Whombat_, integrating it with Python scripts, or contributing to its development, keep an eye on the upcoming [Developer Guide](developer_guide/index.md).
Additionally, find detailed technical documentation in the [Reference](reference/api.md) section.
Check out the [Contributing](CONTRIBUTING.md) section to learn how you can contribute to the Whombat community.

!!! warning

Please bear with us, as this documentation is a work in progress, and we're working hard to smooth out any rough edges.
Binary file added back/docs/assets/img/exploration_counts.png
Binary file added back/docs/assets/img/exploration_tabs.png
Binary file added back/docs/assets/img/filtering.png
Binary file added back/docs/assets/img/pagination.png
Binary file added back/docs/assets/img/recording_map.png
Binary file added back/docs/assets/img/sound_event_scatterplot.png
99 changes: 72 additions & 27 deletions back/docs/user_guide/exploration.md
@@ -1,39 +1,84 @@
# Data Exploration

Whombat provides diverse tools for exploring the data you've meticulously worked
on. Below, we'll introduce three key exploration avenues:
_Whombat_ allows you to delve deep into your meticulously curated data through a variety of exploration tools.
Discover patterns, gain insights, and refine your understanding of your audio recordings, audio clips, and sound events.

1. Recordings
2. Audio Clips
3. Sound Events
Key Exploration Areas:

To access the exploration section, click on the "Exploration" button on the
sidebar or utilize the navigation cards on the home page. Once in the
exploration section, choose your preferred exploration type.
1. Recordings: Explore metadata, locations, and spectrograms of your audio recordings.
2. Audio Clips: Examine specific segments of your recordings in detail.
3. Sound Events: Analyze and visualize sound events within their feature space.

## Recordings
## Accessing Data Exploration

Explore and search through all the recordings registered in Whombat. Utilize the
search bar or filters to narrow down the displayed recordings. A list of
recording spectrograms is presented for swift browsing.
To begin your exploration, click on the "Exploration" button on the sidebar or use the navigation cards on the home page.

## Audio Clip
<figure markdown="span">
![side bar](../assets/img/side_bar.png){ width="200" }
<figcaption>Button D will take you to the exploration section</figcaption>
</figure>

In the Audio Clips section, spectrograms are showcased in a Gallery view,
allowing simultaneous viewing of multiple clips. Employ filtering tools to
choose specific clips for examination.
Once you're in the exploration section, you can choose what you want to explore (recordings, clips, or sound events) and your preferred view, typically a list or a gallery.

## Sound Events
<figure markdown="span">
![exploration tabs](../assets/img/exploration_tabs.png){ width="400" }
</figure>

Navigate individual sound events in this section, opting for either the Gallery
view or the Scatter Plot view. The gallery displays each sound event's
spectrogram, while the scatterplot provides a 3D representation. Each dot
represents a sound event, with coordinates based on three acoustic
features—defaulting to duration, bandwidth, and lowest frequency. Click on a
point in the scatterplot to view the corresponding spectrogram of the sound
event.
## Controlling the Number of Items Displayed

## Conclusion
The exploration pages provide tools to navigate through all the items registered in _Whombat_.
To ensure smooth performance, only a limited number of items are displayed at a time.
You can see the current viewing range and the total number of items in the top left corner:

Our commitment to enhancing exploration tools will offer you even greater
insights into your data. Stay tuned for updates!
<figure markdown="span">
![exploration counts](../assets/img/exploration_counts.png){ width="250" }
</figure>

To navigate through the list and adjust the number of items displayed simultaneously, use the pagination controls located in the top right corner:

<figure markdown="span">
![pagination controls](../assets/img/pagination.png){ width="400" }
</figure>
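
The relationship between the pagination controls and the viewing range shown in the top left corner can be sketched in a few lines of Python. This is only an illustration of offset-based pagination under assumed names (`Page`, `paginate`, and the example file names are all hypothetical); it is not taken from the Whombat code base.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One page of items plus the numbers needed to describe it."""

    items: list
    first: int  # 1-based position of the first item shown (0 if the page is empty)
    last: int  # 1-based position of the last item shown (0 if the page is empty)
    total: int  # total number of items available


def paginate(items: list, page: int, page_size: int) -> Page:
    """Return the zero-indexed `page` of `items`, showing `page_size` items at a time."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    offset = page * page_size
    shown = items[offset : offset + page_size]
    first = offset + 1 if shown else 0
    last = offset + len(shown) if shown else 0
    return Page(items=shown, first=first, last=last, total=len(items))


# Hypothetical collection of 95 recordings, viewed 10 at a time.
recordings = [f"recording_{i:03d}.wav" for i in range(95)]
page = paginate(recordings, page=2, page_size=10)
print(f"Showing {page.first}-{page.last} of {page.total}")
```

Increasing `page_size` corresponds to showing more items at once, at the cost of more work per page, which is why only a limited number is displayed by default.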

## Filtering Items

To refine the list of items you're currently viewing, you can apply custom filters.
The filtering controls are situated in the top left corner:

<figure markdown="span">
![filter controls](../assets/img/filtering.png){ width="400" }
</figure>

Click on the "Apply filters" button to open the filtering menu, where you can browse and select from the available filters.
Once applied, the filters will appear as badges on the "Filter bar" next to the button.
You can easily remove any filter by clicking on the cross within its corresponding badge.

## Exploring Recordings on a Map

Visualize the geographical distribution of your recordings by switching to the "Map" tab within the Recording exploration section.

<figure markdown="span">
![recording exploration tabs](../assets/img/recording_exploration_tabs.png){ width="400" }
</figure>

The map on the left will display a marker for each recording location.
Clicking on any marker will reveal details about the corresponding recording on the right side of the screen.
You'll be able to view the spectrogram and listen to the audio playback.

<figure markdown="span">
![recordings map](../assets/img/recording_map.png){ width="700" }
</figure>

## Sound Event Scatterplot

Visualize and explore your sound events in a dynamic 3D scatterplot by navigating to the sound event section.
This representation allows you to quickly browse and analyze a large number of sound events within their feature space.

Every sound event is plotted based on three key features: duration, lowest frequency, and highest frequency.
The color of each point corresponds to the species tag associated with the sound event, providing a visual way to identify patterns and clusters.

<figure markdown="span">
![sound event scatterplot](../assets/img/sound_event_scatterplot.png){ width="700" }
</figure>

Click on any point in the scatterplot to get detailed information about the selected sound event and to see a focused spectrogram highlighting it.
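
To make the three scatterplot axes concrete, here is a minimal sketch of how duration, lowest frequency, and highest frequency could be derived from a bounding-box annotation. The `BoundingBox` type, its field order, and the `scatter_coordinates` helper are assumptions made for this example, not Whombat's internal representation.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Hypothetical time-frequency box: seconds for time, hertz for frequency."""

    start_time: float
    low_freq: float
    end_time: float
    high_freq: float


def scatter_coordinates(box: BoundingBox) -> tuple:
    """Return (duration, lowest frequency, highest frequency) for one sound event."""
    duration = box.end_time - box.start_time
    return (duration, box.low_freq, box.high_freq)


# A half-second event spanning 2 kHz to 8 kHz.
event = BoundingBox(start_time=1.25, low_freq=2000.0, end_time=1.75, high_freq=8000.0)
print(scatter_coordinates(event))
```

Events of similar length and frequency range end up close together in this space, which is what makes clusters visible in the plot.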
63 changes: 49 additions & 14 deletions back/docs/user_guide/faq.md
@@ -1,26 +1,61 @@
# FAQ

## 1. I'm having trouble logging in. What should I do?
## Starting Up Whombat

If you're logging into Whombat for the first time, you might not have a
personalized username and password yet. Don't worry, Whombat automatically sets
up a default user during initialization.
### I'm having trouble logging in. What should I do?

If you're logging into Whombat for the first time, you might not have a personalized username and password yet.
Don't worry, Whombat automatically sets up a default user during initialization.

To get started:

- Username: **admin**
- Password: **admin**

After your initial login, head over to your user profile to customize both your
username and password to something more personalized and secure.
After your initial login, head over to your user profile to customize both your username and password to something more personalized and secure.

## Focusing on sounds

### Can I isolate sounds within a specific frequency range?

If you know your target sounds fall within a specific frequency range, you can apply a bandpass filter to focus your attention and filter out extraneous noise.

To do this:

1. **Access Spectrogram Settings**: Locate the Spectrogram Settings within the annotation interface.
(You may need to refer to the [Spectrogram Settings](spectrogram_display.md#spectrogram-settings) section of the documentation for the precise location).
2. **Apply Bandpass Filter**: Adjust the filter settings to define your desired frequency range.

!!! tip "Additional tips"

- **Experiment with Filter Settings**: Try different frequency ranges to find the optimal settings for isolating your target sounds.
- **Combine with Denoising**: Use the denoising feature in conjunction with filtering to further enhance clarity.
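
For readers unfamiliar with bandpass filtering, the idea can be sketched with the standard library alone. The example below is a deliberately simple "brick-wall" filter that zeroes every frequency outside the chosen band; it is an illustration of the concept, not the filter Whombat applies, and the function names `tone` and `bandpass` are invented for this sketch.

```python
import cmath
import math


def tone(freq_hz, samplerate, n_samples):
    """Generate a unit-amplitude sine wave as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * t / samplerate) for t in range(n_samples)]


def bandpass(signal, samplerate, low_hz, high_hz):
    """Zero every frequency bin outside [low_hz, high_hz] (brick-wall filter)."""
    n = len(signal)
    # Forward discrete Fourier transform (naive O(n^2), fine for short signals).
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 hold negative frequencies; fold them onto positive ones.
        freq = min(k, n - k) * samplerate / n
        if not low_hz <= freq <= high_hz:
            spectrum[k] = 0
    # Inverse transform; the result is real because the kept bins are symmetric.
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


samplerate = 1000  # Hz
mixed = [a + b for a, b in zip(tone(50, samplerate, 200), tone(300, samplerate, 200))]
filtered = bandpass(mixed, samplerate, low_hz=200, high_hz=400)
# `filtered` now closely matches the 300 Hz tone; the 50 Hz tone has been removed.
```

Practical filters use smoother transitions at the band edges to avoid ringing, but the effect on the spectrogram is the same in spirit: energy outside the band is suppressed so the target sounds stand out.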

## Ultrasonic recordings

### I have time-expanded recordings. Can I use them?

Whombat fully supports time-expanded audio recordings, commonly used in bioacoustics research to analyze high-frequency vocalizations like bat calls.
While Whombat assumes recordings are not time-expanded by default, you can easily adjust for this:

1. **Navigate to the Recording Detail Page**: Access the page dedicated to the specific recording you want to work with.
2. **Update Time Expansion Factor**: In the recording media info, you'll find an option to specify the "Time Expansion Factor" used during recording.
Enter the correct value here.

!!! warning "Adjust the time expansion early"

Set the time expansion factor as soon as you upload a time-expanded recording to ensure accurate frequency calculations from the start.

??? tip "Restoring the original sample rate"

While it's possible to unexpand recordings (refer to the [bats section](https://xeno-canto.org/help/FAQ#bats) of the xeno-canto documentation for tips), Whombat allows you to work directly with time-expanded recordings without altering the original data. We recommend this approach as it maintains the integrity of your source material and provides a clear record of how the recording was created.
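
The arithmetic behind the time expansion factor is straightforward: a recording slowed down by a factor of N has its sample rate and all frequencies divided by N, and its duration multiplied by N. The sketch below reverses that scaling. The function names are made up for this example and the numbers are illustrative only.

```python
def original_samplerate(file_samplerate: float, time_expansion: float) -> float:
    """Sample rate at which the sound was actually captured."""
    return file_samplerate * time_expansion


def original_frequency(apparent_hz: float, time_expansion: float) -> float:
    """True frequency of a sound that appears at `apparent_hz` in the expanded file."""
    return apparent_hz * time_expansion


def original_duration(file_duration: float, time_expansion: float) -> float:
    """True duration in seconds of a file that lasts `file_duration` when expanded."""
    return file_duration / time_expansion


# A bat recording stored at 44.1 kHz with a time expansion factor of 10.
factor = 10
print(original_samplerate(44_100, factor))  # 441000
print(original_frequency(4_500, factor))    # 45000
print(original_duration(30.0, factor))      # 3.0
```

A call that looks like 4.5 kHz in the expanded file is therefore a 45 kHz call, which is why setting the factor before annotating matters for correct frequency values.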

## Import and Export

## 2. What is the AOEF format?
### What is the AOEF format?

The AOEF format is a custom data format designed for integration with Whombat
data. It is outlined in the `soundevent` package, and for a more in-depth
understanding, we suggest checking out their
[documentation](https://mbsantiago.github.io/soundevent/). In simple terms, it's
a [JSON](https://www.json.org)-based format, drawing heavy inspiration from the
[COCO dataset](https://cocodataset.org/#format-data) format.
The AOEF format is a custom data format designed for integration with Whombat data.
It is outlined in the `soundevent` package, and for a more in-depth understanding, we suggest checking out their [documentation](https://mbsantiago.github.io/soundevent/).
In simple terms, it's a [JSON](https://www.json.org)-based format, drawing heavy inspiration from the [COCO dataset](https://cocodataset.org/#format-data) format.
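
Because AOEF files are JSON, an export can be inspected with nothing more than Python's built-in `json` module before reaching for the `soundevent` package. The sketch below summarizes the top-level structure of any JSON object without assuming specific field names. The helper `summarize_top_level` is hypothetical, and the inline document uses placeholder keys that are not real AOEF fields.

```python
import json


def summarize_top_level(content: dict) -> dict:
    """Map each top-level key to a short description of its value."""
    summary = {}
    for key, value in content.items():
        if isinstance(value, (list, dict)):
            summary[key] = f"{type(value).__name__} of length {len(value)}"
        else:
            summary[key] = repr(value)
    return summary


# Placeholder document: these keys are made up for illustration, NOT AOEF fields.
text = '{"example_version": "1.0", "example_items": [{"id": 1}, {"id": 2}]}'
for key, description in summarize_top_level(json.loads(text)).items():
    print(f"{key}: {description}")
```

To inspect a real export, replace the inline string with the result of `json.load` on the opened file. For anything beyond a quick look, the `soundevent` documentation linked above is the authoritative description of the format.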

*[AOEF]: Acoustic Object Exchange Format
\*[AOEF]: Acoustic Object Exchange Format
52 changes: 36 additions & 16 deletions back/docs/user_guide/spectrogram_display.md
@@ -1,58 +1,77 @@
# Spectrogram Display
# Spectrogram Navigation

Visualizing audio with spectrograms is key to making accurate annotations, and Whombat makes the process easier with tools to tailor the display to your needs.

You'll typically only view a segment of the full spectrogram to focus on specific parts or to keep things running smoothly.
You'll typically only view a segment of the full recording spectrogram to focus on specific parts or to keep things running smoothly.

??? info "Spectrograms in Chunks"

Whombat avoids calculating the entire spectrogram at once. Instead, it works in manageable chunks, letting you navigate your audio effortlessly. Only the visible chunks (and a few nearby) are computed, so you can zoom out and watch as Whombat fills in the rest in real time.
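
The chunking idea can be sketched as follows: given the visible time window, work out which fixed-length chunks overlap it, then add a few neighbours on each side. This is a simplified illustration under assumed parameters (`chunk_duration`, `margin`, and the function name are hypothetical), not Whombat's actual scheduling logic.

```python
import math


def visible_chunks(view_start, view_end, chunk_duration, recording_duration, margin=1):
    """Indices of chunks overlapping [view_start, view_end), plus `margin` neighbours."""
    if chunk_duration <= 0 or view_end <= view_start:
        raise ValueError("chunk_duration must be > 0 and view_end must exceed view_start")
    total = math.ceil(recording_duration / chunk_duration)
    # First and last chunk touched by the view, widened by the margin.
    first = int(view_start // chunk_duration) - margin
    last = math.ceil(view_end / chunk_duration) - 1 + margin
    # Clamp to the chunks that actually exist in the recording.
    first = max(first, 0)
    last = min(last, total - 1)
    return list(range(first, last + 1))


# Viewing seconds 12 to 18 of a 60-second recording split into 5-second chunks.
print(visible_chunks(12.0, 18.0, chunk_duration=5.0, recording_duration=60.0))
```

Only the returned chunks need a spectrogram computed; everything else can wait until the view moves, which keeps navigation responsive on long recordings.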

Whombat offers several ways to move around your spectrogram. We'll cover each method below, with the labeled spectrogram as your guide.
Whombat offers several ways to move around your spectrogram.
We'll cover each method below, with the labeled spectrogram as your guide.

![Spectrogram](../assets/img/spectrogram.png)

## Moving Around
## Dragging

### Dragging
The simplest way to navigate is by dragging.
Just click anywhere on the spectrogram and drag to move around.
But remember, this only works if dragging is enabled.
Check the toolbar – if the drag button (C) is active, you're good to go.
If not, just click it to activate dragging.

The simplest way to navigate is by dragging. Just click anywhere on the spectrogram and drag to move around. But remember, this only works if dragging is enabled. Check the toolbar – if the drag button (C) is active, you're good to go. If not, just click it to activate dragging.
You can also use the bar at the bottom (G), which shows a mini version of the whole spectrogram.
Click and drag the highlighted section to navigate.

You can also use the bar at the bottom (G), which shows a mini version of the whole spectrogram. Click and drag the highlighted section to navigate.

### Scrolling
## Scrolling

You can also navigate by scrolling.

Just hover your mouse over the spectrogram or the bar at the bottom, then use your scroll wheel:

- Scroll up and down: Move vertically through frequencies. Most likely you won't see any changes as the default behaviour is to display the full frequency range.
- Scroll up and down: Move vertically through frequencies.
Most likely you won't see any changes as the default behaviour is to display the full frequency range.
- Ctrl + Scroll: Move horizontally through time.

Zooming is just as easy:

- Ctrl + Scroll: Zoom in and out vertically (frequencies).
- Ctrl + Shift + Scroll: Zoom in and out horizontally (time).

### Double Clicking
??? warning "I want to scroll down the page, not the spectrogram"

When your mouse hovers over the spectrogram or the navigation bar below it, scrolling with your mouse wheel or trackpad will navigate within the spectrogram, not the webpage itself.
To scroll up or down the webpage, simply move your mouse cursor away from the spectrogram and navigation bar area, then scroll as usual.

## Seek to a Particular Moment

To quickly navigate to a specific point within the audio recording, double-click directly on the spectrogram.
This will automatically position the seek bar (red vertical bar) at the chosen point.
When you resume playback (e.g., by pressing the spacebar), the audio will start from the newly set location.

??? tip "Use the Player slider to seek"

Double-click on the spectrogram to center it at that point and start playback from there. Note that playback will only begin if it's already playing – otherwise, just hit the play button in the audio player (E).
While you can also seek by dragging the slider in the audio player, double-clicking on the spectrogram offers a more precise and convenient way to repeatedly listen to specific sections of the recording, especially for those faint, noisy, or ambiguous animal sounds.

## Zooming

Whombat offers a handy zoom feature beyond scrolling. To zoom in, simply click the zoom button (D) in the toolbar and then draw a box around the area you'd like to focus on.
Whombat offers a handy zoom feature beyond scrolling.
To zoom in, simply click the zoom button (D) in the toolbar and then draw a box around the area you'd like to focus on.

Need to step back? Click the back button (B) to return to the previous view, or click the home button (A) to reset the zoom completely.

## Spectrogram Settings

Fine-tune your spectrogram's appearance by clicking the settings button (F) in the toolbar. This opens a handy side menu where you can adjust various parameters.
Fine-tune your spectrogram's appearance by clicking the settings button (F) in the toolbar.
This opens a handy side menu where you can adjust various parameters.

!!! tip "Saving Your Settings"

Changes you make here only apply to the current spectrogram. To use these settings for all spectrograms, simply click the "Save" button at the top. You can also restore the default settings with the "Reset" button.

You have control over how audio is loaded, and how the spectrogram is calculated and displayed. Here's a breakdown of each setting:
You have control over how audio is loaded, and how the spectrogram is calculated and displayed.
Here's a breakdown of each setting:

| Setting | Description |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -72,7 +91,8 @@ You have control over how audio is loaded, and how the spectrogram is calculated

## Audio Player

At the top right, you'll find the audio player (E). Here, you can easily control playback, pause at any moment, jump to specific points in the audio, and even adjust the playback speed to match your workflow.
At the top right, you'll find the audio player (E).
Here, you can easily control playback, pause at any moment, jump to specific points in the audio, and even adjust the playback speed to match your workflow.

## Keyboard Shortcuts

2 changes: 1 addition & 1 deletion back/mkdocs-guide.yml
@@ -4,7 +4,7 @@ nav:
- user_guide/index.md
- Installation: user_guide/installation.md
- Datasets: user_guide/datasets.md
- Spectrogram Display: user_guide/spectrogram_display.md
- Spectrogram Navigation: user_guide/spectrogram_display.md
- Annotation Projects: user_guide/annotation_projects.md
- Evaluation: user_guide/evaluation.md
- Data Exploration: user_guide/exploration.md
2 changes: 1 addition & 1 deletion back/mkdocs.yml
@@ -6,7 +6,7 @@ nav:
- user_guide/index.md
- Installation: user_guide/installation.md
- Datasets: user_guide/datasets.md
- Spectrogram Display: user_guide/spectrogram_display.md
- Spectrogram Navigation: user_guide/spectrogram_display.md
- Annotation Projects: user_guide/annotation_projects.md
- Evaluation: user_guide/evaluation.md
- Data Exploration: user_guide/exploration.md
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.