
Add new hydrophones to ML pipeline & document process #128

Open
2 tasks
scottveirs opened this issue Aug 2, 2023 · 24 comments
Assignees
Labels
2023-hackathon: Goals or topics for the 2023 annual Microsoft hackathon
2024-hackathon: Goals or issues for the 2024 annual Microsoft hackathon
documentation: Improvements or additions to documentation
inference system: Code to perform inference with the trained model(s)

Comments

@scottveirs
Member

scottveirs commented Aug 2, 2023

In 2024, we are excited to add the North San Juan Channel hydrophone, which was just repaired and resumed streaming last week!

In 2023, the number of active nodes in the network increased from 3 to 7, with these locations ready for production:

[Screenshot, 2023-08-01: map of the active hydrophone locations]

By the time of the 2023 Microsoft hackathon, the current nodes and some of their metadata should be accessible programmatically via a new Orcasound API.

@scottveirs scottveirs self-assigned this Aug 2, 2023
@scottveirs scottveirs added the documentation, inference system, and 2023-hackathon labels Aug 2, 2023
@micya
Member

micya commented Aug 2, 2023

Steps involved per location:

  1. Add configuration file: see Port Townsend config for reference. Place new file in same directory.
  2. Modify last line of Dockerfile to point to new config (NOTE: we should move away from having to bake the config file into the docker image so that we can build one image and specify the relevant configs externally).
  3. Build docker container: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#building-the-docker-container-for-production
  4. Push docker image to Azure Container Registry: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#pushing-your-image-to-azure-container-registry
  5. Deploy to Azure Kubernetes Service: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#deploying-an-updated-docker-build-to-azure-kubernetes-service (create namespace, secret, deployment)
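Steps 3-5 above might look roughly like the following. This is a hedged sketch, not the project's actual commands: ACR_NAME, LOCATION, and the manifest path are placeholder values I've invented for illustration.

```shell
# Hypothetical sketch of steps 3-5. ACR_NAME, LOCATION, and the manifest
# path are placeholders, not the project's actual names.
ACR_NAME=myregistry
LOCATION=sunset_bay

# 3. Build the production image (the config is currently baked in via the Dockerfile)
docker build -t "inference-system:$LOCATION" ./InferenceSystem

# 4. Tag and push the image to Azure Container Registry
az acr login --name "$ACR_NAME"
docker tag "inference-system:$LOCATION" "$ACR_NAME.azurecr.io/inference-system:$LOCATION"
docker push "$ACR_NAME.azurecr.io/inference-system:$LOCATION"

# 5. Deploy to Azure Kubernetes Service: namespace, secret, then deployment
kubectl create namespace "$LOCATION"
kubectl create secret generic inference-secret --namespace "$LOCATION" --from-env-file=.env
kubectl apply --namespace "$LOCATION" -f "deploy/$LOCATION.yaml"
```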

@micya
Member

micya commented Aug 2, 2023

Need to check with @micowan on whether anything needs to be done for moderator portal.

@micowan
Collaborator

micowan commented Aug 3, 2023 via email

@micya
Member

micya commented Aug 3, 2023

From the description, I don't believe the inference system has been brought up yet, so there are no records in Cosmos DB yet.

No additional handling is needed for the inference system -> Cosmos DB path, since Cosmos DB is really just storing a blob of JSON, which accepts any arbitrary string.

@catskids3
Contributor

Checked the code. We did in fact turn the locations into a config setting last go-round, so from the UI perspective, adding locations should be as simple as updating that config with the new ones. I know Scott showed a spreadsheet or API or something during last week's discussion that listed the locations. If they are updating that themselves and we can pull from it, we could make the list "live" instead of a config setting. But that is just a thought.

@scottveirs
Member Author

Hey @micowan et al! I see two possible routes to updating the config file, or more dynamically managing the ML pipeline:

  1. The orcasite wiki lists a recent dump of the feeds table, which I could update this weekend for the hackathon.
  2. Recent Orcasound backend improvements make it possible to access the feeds table itself programmatically, e.g. via https://beta.orcasound.net/graphiql with queries like:

{
  feeds {
    nodeName
  }
}
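As a sketch, the same query could also be sent non-interactively with curl. Note the /graphql endpoint path is my assumption based on the /graphiql browser UI and may differ in the actual backend.

```shell
# Hypothetical: POST the feeds query to the GraphQL endpoint.
# The /graphql path is assumed from the /graphiql UI and may differ.
curl -s -X POST "https://beta.orcasound.net/graphql" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ feeds { nodeName } }"}'
```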

@scottveirs
Member Author

Also @micowan, I mentioned to @skanderm that your existing config file held JSON, so he said he could work on a new API endpoint that could provide JSON to you...

@skanderm

skanderm commented Sep 9, 2023

You should be able to get an updated list here: https://beta.orcasound.net/api/json/feeds

You may need to set these headers as well:
curl -s -H "Content-Type: application/vnd.api+json" -H "Accept: application/vnd.api+json" https://beta.orcasound.net/api/json/feeds
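The same request could be made from Python, mirroring the curl command above. This is an illustrative sketch using only the standard library; the helper names are mine, not part of any existing codebase.

```python
import json
import urllib.request

FEEDS_PATH = "/api/json/feeds"
JSONAPI_HEADERS = {
    "Content-Type": "application/vnd.api+json",
    "Accept": "application/vnd.api+json",
}

def build_feeds_request(base_url="https://beta.orcasound.net"):
    """Build a request equivalent to the curl command above."""
    return urllib.request.Request(base_url + FEEDS_PATH, headers=JSONAPI_HEADERS)

def fetch_feeds(base_url="https://beta.orcasound.net"):
    """Fetch and decode the feeds list as JSON."""
    with urllib.request.urlopen(build_feeds_request(base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```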

@catskids3
Contributor

@scottveirs and @skanderm, the url: https://beta.orcasound.net/api/json/feeds was absolutely perfect!

I have already added this to a new hydrophones endpoint in the API so that we can access it from the UI. I also brought in the URL and HTML fields in case it makes sense to add them to the UI somewhere.

Thanks!!!

@skanderm

skanderm commented Sep 9, 2023

Glad you found it useful! Will the config be modifiable? We’re planning to deploy the changes to https://live.orcasound.net at some point.

@catskids3
Contributor

If I am understanding the question correctly, yes. We will be able to change the URL we are pointing to on the fly by updating the configuration setting in Azure.

@catskids3
Contributor

@scottveirs and @skanderm, a quick question: there is a hydrophone location you call Orcasound Lab. Can you confirm that this is the Haro Strait hydrophone that we reference in the Cosmos DB? And if so, which is the correct name/label? We may need coding/configuration changes on our end if it is "Orcasound Lab".

@scottveirs
Member Author

scottveirs commented Sep 13, 2023 via email

@micowan
Collaborator

micowan commented Sep 13, 2023

OK, great. Since we are changing the partition strategy, which requires a rebuild of the data set, I can take care of that one-off during the migration. We will need to speak with @micya or @pastorep about how it is marked coming out of the ML pipeline. Thanks for the feedback and quick turnaround.

@micya
Member

micya commented Sep 13, 2023

> OK, great. Since we are changing the partition strategy, which requires a rebuild of the data set, I can take care of that one-off during the migration. We will need to speak with @micya or @pastorep about how it is marked coming out of the ML pipeline. Thanks for the feedback and quick turnaround.

I found that location information is hardcoded in the inference system script:

ORCASOUND_LAB_LOCATION = {"id": "rpi_orcasound_lab", "name": "Haro Strait", "longitude": -123.17357, "latitude": 48.55833}
PORT_TOWNSEND_LOCATION = {"id": "rpi_port_townsend", "name": "Port Townsend", "longitude": -122.76045, "latitude": 48.13569}
BUSH_POINT_LOCATION = {"id": "rpi_bush_point", "name": "Bush Point", "longitude": -122.6039, "latitude": 48.03371}
source_guid_to_location = {"rpi_orcasound_lab" : ORCASOUND_LAB_LOCATION, "rpi_port_townsend" : PORT_TOWNSEND_LOCATION, "rpi_bush_point": BUSH_POINT_LOCATION}

We should probably pull that out and configure it via an environment variable.
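A minimal sketch of that suggestion, assuming a hypothetical INFERENCE_LOCATIONS environment variable holding the same table as JSON. The variable name is illustrative, not an existing setting, and the defaults mirror the hardcoded table above.

```python
import json
import os

# Defaults mirror the table currently hardcoded in the inference script.
DEFAULT_LOCATIONS = {
    "rpi_orcasound_lab": {"id": "rpi_orcasound_lab", "name": "Haro Strait",
                          "longitude": -123.17357, "latitude": 48.55833},
    "rpi_port_townsend": {"id": "rpi_port_townsend", "name": "Port Townsend",
                          "longitude": -122.76045, "latitude": 48.13569},
    "rpi_bush_point": {"id": "rpi_bush_point", "name": "Bush Point",
                       "longitude": -122.6039, "latitude": 48.03371},
}

def load_locations():
    """Read the location table from the (hypothetical) INFERENCE_LOCATIONS
    env var, a JSON object keyed by hydrophone id; fall back to the
    hardcoded defaults when the variable is unset."""
    raw = os.environ.get("INFERENCE_LOCATIONS")
    if raw:
        return json.loads(raw)
    return DEFAULT_LOCATIONS
```

With this shape, adding a new hydrophone becomes a deployment-time configuration change rather than a code change and image rebuild.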

@micowan
Collaborator

micowan commented Sep 13, 2023

Michelle, thanks for finding that. Also, if you are going to be changing the data port, we will want to incorporate the changes I requested earlier: remove the reviewed and SRKWFound properties (I may have these spelled wrong) and replace them with a new property called "state", populated with the term "Unreviewed"; "state" is also the new partition key. We also need a new property called "locationName" at the top level of the JSON that duplicates the name in the Location portion of the JSON.
Thanks.
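A hedged sketch of the document change being requested above. The property names ("reviewed", "SRKWFound") are taken from the comment and may be spelled differently in the live schema, as noted.

```python
# Hypothetical migration for one Cosmos DB detection document, per the
# schema change described above. Property names may differ in the live data.
def migrate_detection(doc: dict) -> dict:
    doc = dict(doc)                      # shallow copy; don't mutate the input
    doc.pop("reviewed", None)            # drop the old moderation flags
    doc.pop("SRKWFound", None)
    doc["state"] = "Unreviewed"          # new partition key value
    # Duplicate the location name at the top level of the JSON.
    doc["locationName"] = doc.get("location", {}).get("name")
    return doc
```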

@scottveirs
Member Author

@micowan and @micya --

@salsal97 is teaching David and me here in Redmond how to add the new Sunset Bay location to the ML pipeline.

If the new model deployment creates a candidate, will it now show up in the Moderator portal auto-magically? Or is some hardcoding of the new location needed within the UI portal code (i.e. the Sunset Bay metadata that's now available via the API provided by Skander)?

It looks like your recent pull request, Mike, might be the answer to my question?

Maybe Tara or someone else who knows C# could review the PR?

@catskids3
Contributor

catskids3 commented Sep 14, 2023 via email

@salsal97
Contributor

This PR should be a step toward getting this issue squared away: #136

@skanderm

skanderm commented Nov 1, 2023

Hi everyone! We've updated the live site. As referenced here: #128 (comment), please update the endpoint to https://live.orcasound.net/api/json/feeds. Thank you!

@micowan
Collaborator

micowan commented Nov 3, 2023

@skanderm, thanks for this. I have replaced the beta URL with the new one in the codebase I am working on.

@scottveirs scottveirs added the 2024-hackathon label Jul 20, 2024
@tanviraja24
Contributor

Based on https://live.orcasound.net/listen, are there any new hydrophones available to add?

@micowan
Collaborator

micowan commented Sep 16, 2024

Scott gave me a URL last year, https://live.orcasound.net/api/json/feeds, which lists 7 hydrophones (including Haro Strait as Orcasound Lab). I have changed the API to pull this list for all Moderator features (picklists, etc.).

@scottveirs
Member Author

scottveirs commented Sep 18, 2024

Before taking the steps that Michelle outlined, we need to account for a change that was recently made to the Amazon S3 buckets where the live audio data are stored. In the process of moving the data streams and archive to Amazon-sponsored buckets (and dramatically reducing our storage and egress costs), we had to rename the streaming data bucket.

  • The old name of the audio data bucket was streaming-orcasound-net
  • The new name of the bucket from which OrcaHello should acquire data is audio-orcasound-net

My understanding is that the S3 bucket URI is hard-coded into the Docker images for each location. Ideally, we'd move the audio data source URI/URL outside of the image and into a configuration file.

The other place I see the S3 bucket name hard-coded is in the Orchestrator.py code:

hydrophone_stream_url = 'https://s3-us-west-2.amazonaws.com/streaming-orcasound-net/' + hls_hydrophone_id
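One way to lift the bucket name out of Orchestrator.py, as a sketch: read it from an environment variable, defaulting to the new bucket. S3_AUDIO_BUCKET is a hypothetical variable name I'm using for illustration, not an existing setting.

```python
import os

# Hypothetical: take the bucket name from an env var instead of hardcoding
# it, defaulting to the new Amazon-sponsored bucket.
S3_AUDIO_BUCKET = os.environ.get("S3_AUDIO_BUCKET", "audio-orcasound-net")

def hydrophone_stream_url(hls_hydrophone_id):
    """Build the HLS stream URL for a hydrophone from the configured bucket."""
    return "https://s3-us-west-2.amazonaws.com/{}/{}".format(
        S3_AUDIO_BUCKET, hls_hydrophone_id)
```

This way a future bucket rename becomes a config change instead of an image rebuild per location.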

Steps involved per location:

  1. Add configuration file: see Port Townsend config for reference. Place new file in same directory.
  2. Modify last line of Dockerfile to point to new config (NOTE: we should move away from having to bake the config file into the docker image so that we can build one image and specify the relevant configs externally).
  3. Build docker container: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#building-the-docker-container-for-production
  4. Push docker image to Azure Container Registry: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#pushing-your-image-to-azure-container-registry
  5. Deploy to Azure Kubernetes Service: https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#deploying-an-updated-docker-build-to-azure-kubernetes-service (create namespace, secret, deployment)

Projects
Status: In Progress
7 participants