Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Thredds to supported version #413

Merged
merged 25 commits into from
Dec 5, 2024
Merged

Update Thredds to supported version #413

merged 25 commits into from
Dec 5, 2024

Conversation

mishaschwartz
Copy link
Collaborator

@mishaschwartz mishaschwartz commented Dec 14, 2023

Overview

Unidata has dropped support for TDS versions < 5.x. This updates Thredds to version 5.5.

Requires matching update:

Changes

Non-breaking changes

  • Adds...
  • New component version pavics/thredds-docker:5.5-unidata-2024-11-19-with-tds-plugin-jar-with-dependencies.jar

Breaking changes

  • None

Related Issue / Discussion

Additional Information

Links to other issues or sources.

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false

@github-actions github-actions bot added component/THREDDS Features or components related to THREDDS documentation Improvements or additions to documentation labels Dec 14, 2023
@tlvu
Copy link
Collaborator

tlvu commented Dec 14, 2023

FYI, for this change, the automated pipeline test is not enough. .ncml tests are not done against the new Thredds in this PR but against the production server with the current Thredds to avoid having to deploy all the big data required by .ncml on test servers.

We have already attempted to upgrade Thredds on Ouranos side and we have .ncml issues and, if my memory is right, there are also 2 notebooks not using .ncml failing as well which should be caught by the pipeline.

@mishaschwartz
Copy link
Collaborator Author

@tlvu

Thanks for the info. I mostly created this PR to test the pipeline with the updated version to see what happens. I can close this if you're working on it on the ouranos side.

@tlvu
Copy link
Collaborator

tlvu commented Dec 14, 2023

Oh keep this PR, we have not yet opened a formal PR on our side, we simply override in env.local for testing. We can piggyback on your PR once we sort out the various issues on our side.

@crim-jenkins-bot
Copy link
Collaborator

E2E Test Results

DACCS-iac Pipeline Results

Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2366/
Result : failure

BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master

DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-20.rdext.crim.ca

PAVICS-e2e-workflow-tests Pipeline Results

Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1482/

NOTEBOOK TEST RESULTS
    
[2023-12-14T20:21:16.415Z] ============================= test session starts ==============================
[2023-12-14T20:21:16.415Z] platform linux -- Python 3.9.16, pytest-7.3.1, pluggy-1.0.0
[2023-12-14T20:21:16.415Z] rootdir: /home/jenkins/agent/workspace/PAVICS-e2e-workflow-tests_master
[2023-12-14T20:21:16.415Z] plugins: anyio-3.6.1, dash-2.10.0, nbval-0.9.6, tornasync-0.6.0.post2, xdist-3.3.1
[2023-12-14T20:21:16.415Z] collected 265 items
[2023-12-14T20:21:16.415Z] 
[2023-12-14T20:21:27.112Z] notebooks-auth/geoserver.ipynb ..................                        [  6%]
[2023-12-14T20:21:54.191Z] notebooks-auth/test_cowbird_jupyter.ipynb ..........                     [ 10%]
[2023-12-14T20:22:02.837Z] notebooks-auth/test_thredds.ipynb ...........                            [ 14%]
[2023-12-14T20:22:11.544Z] pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb .......        [ 17%]
[2023-12-14T20:22:21.246Z] pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ......         [ 19%]
[2023-12-14T20:22:29.940Z] pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb ........       [ 22%]
[2023-12-14T20:29:53.778Z] pavics-sdi-master/docs/source/notebooks/climex.ipynb ............        [ 27%]
[2023-12-14T20:29:53.778Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-climate-stations.ipynb . [ 27%]
[2023-12-14T20:29:59.420Z] ...............                                                          [ 33%]
[2023-12-14T20:30:09.837Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-xclim.ipynb .....    [ 35%]
[2023-12-14T20:30:17.178Z] pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb ......            [ 37%]
[2023-12-14T20:30:33.269Z] pavics-sdi-master/docs/source/notebooks/forecasts.ipynb ......           [ 39%]
[2023-12-14T20:30:34.675Z] pavics-sdi-master/docs/source/notebooks/jupyter_extensions.ipynb .       [ 40%]
[2023-12-14T20:30:39.719Z] pavics-sdi-master/docs/source/notebooks/opendap.ipynb .......            [ 42%]
[2023-12-14T20:30:44.519Z] pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb .....       [ 44%]
[2023-12-14T20:34:02.266Z] pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 50%]
[2023-12-14T20:35:18.259Z] .............                                                            [ 55%]
[2023-12-14T20:35:20.033Z] pavics-sdi-master/docs/source/notebooks/rendering.ipynb F.FF             [ 56%]
[2023-12-14T20:35:22.616Z] pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb ........ [ 59%]
[2023-12-14T20:35:39.762Z] .................                                                        [ 66%]
[2023-12-14T20:35:46.780Z] pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ......          [ 68%]
[2023-12-14T20:35:48.173Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 68%]
[2023-12-14T20:36:05.962Z] .........                                                                [ 72%]
[2023-12-14T20:36:15.808Z] finch-master/docs/source/notebooks/dap_subset.ipynb ...........          [ 76%]
[2023-12-14T20:36:25.188Z] finch-master/docs/source/notebooks/finch-usage.ipynb ......              [ 78%]
[2023-12-14T20:36:27.107Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb . [ 78%]
[2023-12-14T20:36:30.198Z] ......                                                                   [ 81%]
[2023-12-14T20:36:38.783Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-2Subsetting.ipynb . [ 81%]
[2023-12-14T20:36:52.964Z] .............                                                            [ 86%]
[2023-12-14T20:37:02.969Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb . [ 86%]
[2023-12-14T20:37:38.282Z] ....s.                                                                   [ 89%]
[2023-12-14T20:37:46.429Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-4Ensembles.ipynb . [ 89%]
[2023-12-14T20:38:00.039Z] ...                                                                      [ 90%]
[2023-12-14T20:38:14.968Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb . [ 90%]
[2023-12-14T20:38:37.896Z] ......                                                                   [ 93%]
[2023-12-14T20:38:39.512Z] notebooks/hummingbird.ipynb ............                                 [ 97%]
[2023-12-14T20:41:28.996Z] notebooks/stress-tests.ipynb ......                                      [100%]
[2023-12-14T20:41:28.996Z] 
[2023-12-14T20:41:28.996Z] =================================== FAILURES ===================================
    
  

@tlvu
Copy link
Collaborator

tlvu commented Jan 17, 2024

A few problems found related to Thredds 5.4
Unidata/tds#449
Unidata/tds#406

Notebook changes required:
Ouranosinc/pavics-sdi#317
Ouranosinc/PAVICS-landing#77

@github-actions github-actions bot added the ci/tests Issues or changes related to tests scripts label Jan 17, 2024
@crim-jenkins-bot
Copy link
Collaborator

E2E Test Results

DACCS-iac Pipeline Results

Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2420/
Result : failure

BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master

DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-46.rdext.crim.ca

PAVICS-e2e-workflow-tests Pipeline Results

Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1498/

NOTEBOOK TEST RESULTS
    
[2024-01-17T22:33:10.547Z] ============================= test session starts ==============================
[2024-01-17T22:33:10.547Z] platform linux -- Python 3.9.16, pytest-7.3.1, pluggy-1.0.0
[2024-01-17T22:33:10.547Z] rootdir: /home/jenkins/agent/workspace/PAVICS-e2e-workflow-tests_master
[2024-01-17T22:33:10.547Z] plugins: anyio-3.6.1, dash-2.10.0, nbval-0.9.6, tornasync-0.6.0.post2, xdist-3.3.1
[2024-01-17T22:33:10.547Z] collected 264 items
[2024-01-17T22:33:10.547Z] 
[2024-01-17T22:33:20.928Z] notebooks-auth/geoserver.ipynb ..................                        [  6%]
[2024-01-17T22:33:54.638Z] notebooks-auth/test_cowbird_jupyter.ipynb ..........                     [ 10%]
[2024-01-17T22:33:59.781Z] notebooks-auth/test_thredds.ipynb ...........                            [ 14%]
[2024-01-17T22:34:09.377Z] pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb .......        [ 17%]
[2024-01-17T22:34:19.740Z] pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ......         [ 19%]
[2024-01-17T22:34:26.438Z] pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb ........       [ 22%]
[2024-01-17T22:47:35.169Z] pavics-sdi-master/docs/source/notebooks/climex.ipynb ............        [ 27%]
[2024-01-17T22:47:35.432Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-climate-stations.ipynb . [ 27%]
[2024-01-17T22:47:44.008Z] ...............                                                          [ 33%]
[2024-01-17T22:47:54.172Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-xclim.ipynb .....    [ 35%]
[2024-01-17T22:48:01.464Z] pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb ......            [ 37%]
[2024-01-17T22:48:18.849Z] pavics-sdi-master/docs/source/notebooks/forecasts.ipynb ......           [ 39%]
[2024-01-17T22:48:24.608Z] pavics-sdi-master/docs/source/notebooks/opendap.ipynb .......            [ 42%]
[2024-01-17T22:48:29.156Z] pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb .....       [ 44%]
[2024-01-17T22:52:13.417Z] pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 50%]
[2024-01-17T22:53:30.247Z] .............                                                            [ 54%]
[2024-01-17T22:53:35.021Z] pavics-sdi-master/docs/source/notebooks/rendering.ipynb ..FF             [ 56%]
[2024-01-17T22:53:37.608Z] pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb ........ [ 59%]
[2024-01-17T22:53:54.973Z] .................                                                        [ 65%]
[2024-01-17T22:54:03.287Z] pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ......          [ 68%]
[2024-01-17T22:54:04.670Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 68%]
[2024-01-17T22:54:16.606Z] ........F                                                                [ 71%]
[2024-01-17T22:54:27.479Z] finch-master/docs/source/notebooks/dap_subset.ipynb ...........          [ 76%]
[2024-01-17T22:54:37.108Z] finch-master/docs/source/notebooks/finch-usage.ipynb ......              [ 78%]
[2024-01-17T22:54:39.038Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb . [ 78%]
[2024-01-17T22:54:42.106Z] ......                                                                   [ 81%]
[2024-01-17T22:54:50.251Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-2Subsetting.ipynb . [ 81%]
[2024-01-17T22:55:06.258Z] .............                                                            [ 86%]
[2024-01-17T22:55:18.512Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb . [ 86%]
[2024-01-17T22:56:04.236Z] ....s.                                                                   [ 89%]
[2024-01-17T22:56:12.412Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-4Ensembles.ipynb . [ 89%]
[2024-01-17T22:56:27.944Z] ...                                                                      [ 90%]
[2024-01-17T22:56:42.852Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb . [ 90%]
[2024-01-17T22:57:07.610Z] ......                                                                   [ 93%]
[2024-01-17T22:57:10.373Z] notebooks/hummingbird.ipynb ............                                 [ 97%]
[2024-01-17T22:59:59.854Z] notebooks/stress-tests.ipynb ......                                      [100%]
[2024-01-17T22:59:59.854Z] 
[2024-01-17T22:59:59.854Z] =================================== FAILURES ===================================
    
  

@crim-jenkins-bot
Copy link
Collaborator

E2E Test Results

DACCS-iac Pipeline Results

Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2421/
Result : failure

BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master

DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-69.rdext.crim.ca

PAVICS-e2e-workflow-tests Pipeline Results

Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1499/

NOTEBOOK TEST RESULTS
    
[2024-01-17T22:34:14.373Z] ============================= test session starts ==============================
[2024-01-17T22:34:14.373Z] platform linux -- Python 3.9.16, pytest-7.3.1, pluggy-1.0.0
[2024-01-17T22:34:14.373Z] rootdir: /home/jenkins/agent/workspace/PAVICS-e2e-workflow-tests_master@2
[2024-01-17T22:34:14.373Z] plugins: anyio-3.6.1, dash-2.10.0, nbval-0.9.6, tornasync-0.6.0.post2, xdist-3.3.1
[2024-01-17T22:34:14.373Z] collected 264 items
[2024-01-17T22:34:14.373Z] 
[2024-01-17T22:34:26.345Z] notebooks-auth/geoserver.ipynb ..................                        [  6%]
[2024-01-17T22:34:56.733Z] notebooks-auth/test_cowbird_jupyter.ipynb ..........                     [ 10%]
[2024-01-17T22:35:02.166Z] notebooks-auth/test_thredds.ipynb ...........                            [ 14%]
[2024-01-17T22:35:16.066Z] pavics-sdi-master/docs/source/notebooks/WCS_example.ipynb .......        [ 17%]
[2024-01-17T22:35:30.408Z] pavics-sdi-master/docs/source/notebooks/WFS_example.ipynb ......         [ 19%]
[2024-01-17T22:35:40.989Z] pavics-sdi-master/docs/source/notebooks/WMS_example.ipynb ........       [ 22%]
[2024-01-17T22:47:57.270Z] pavics-sdi-master/docs/source/notebooks/climex.ipynb ............        [ 27%]
[2024-01-17T22:47:57.270Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-climate-stations.ipynb . [ 27%]
[2024-01-17T22:48:04.968Z] ...............                                                          [ 33%]
[2024-01-17T22:48:15.219Z] pavics-sdi-master/docs/source/notebooks/eccc-geoapi-xclim.ipynb .....    [ 35%]
[2024-01-17T22:48:21.966Z] pavics-sdi-master/docs/source/notebooks/esgf-dap.ipynb ......            [ 37%]
[2024-01-17T22:48:39.144Z] pavics-sdi-master/docs/source/notebooks/forecasts.ipynb ......           [ 39%]
[2024-01-17T22:48:43.930Z] pavics-sdi-master/docs/source/notebooks/opendap.ipynb .......            [ 42%]
[2024-01-17T22:48:48.742Z] pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb .....       [ 44%]
[2024-01-17T22:52:08.226Z] pavics-sdi-master/docs/source/notebooks/regridding.ipynb ............... [ 50%]
[2024-01-17T22:53:27.783Z] .............                                                            [ 54%]
[2024-01-17T22:53:33.197Z] pavics-sdi-master/docs/source/notebooks/rendering.ipynb ..FF             [ 56%]
[2024-01-17T22:53:36.317Z] pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb ........ [ 59%]
[2024-01-17T22:53:53.937Z] .................                                                        [ 65%]
[2024-01-17T22:54:02.283Z] pavics-sdi-master/docs/source/notebooks/subsetting.ipynb ......          [ 68%]
[2024-01-17T22:54:03.691Z] pavics-sdi-master/docs/source/notebook-components/weaver_example.ipynb . [ 68%]
[2024-01-17T22:54:22.062Z] ........F                                                                [ 71%]
[2024-01-17T22:54:31.540Z] finch-master/docs/source/notebooks/dap_subset.ipynb ...........          [ 76%]
[2024-01-17T22:54:41.373Z] finch-master/docs/source/notebooks/finch-usage.ipynb ......              [ 78%]
[2024-01-17T22:54:42.752Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb . [ 78%]
[2024-01-17T22:54:46.046Z] ......                                                                   [ 81%]
[2024-01-17T22:54:52.844Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-2Subsetting.ipynb . [ 81%]
[2024-01-17T22:55:09.289Z] .............                                                            [ 86%]
[2024-01-17T22:55:21.529Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb . [ 86%]
[2024-01-17T22:56:06.782Z] ....s.                                                                   [ 89%]
[2024-01-17T22:56:16.828Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-4Ensembles.ipynb . [ 89%]
[2024-01-17T22:56:31.615Z] ...                                                                      [ 90%]
[2024-01-17T22:56:46.528Z] PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-5Visualization.ipynb . [ 90%]
[2024-01-17T22:57:11.326Z] ......                                                                   [ 93%]
[2024-01-17T22:57:14.072Z] notebooks/hummingbird.ipynb ............                                 [ 97%]
[2024-01-17T23:00:00.239Z] notebooks/stress-tests.ipynb ......                                      [100%]
[2024-01-17T23:00:00.239Z] 
[2024-01-17T23:00:00.239Z] =================================== FAILURES ===================================
    
  

@tlvu
Copy link
Collaborator

tlvu commented Mar 23, 2024

FYI @mishaschwartz we still have not done fixing the issues on our side. Besides there is a performce problem with 5.4 (Unidata/tds#406) and we hope to be fixed in 5.5 so this PR is not ready (5.5 is not released).

@tlvu
Copy link
Collaborator

tlvu commented Nov 19, 2024

NCML, UDDC, ISO links fixed by adding the missing jar, see Unidata/thredds-docker#310 (comment)

No other clues found for broken NetcdfSubset link, opened an issue Unidata/tds#544

@github-actions github-actions bot added the component/magpie Related to https://github.com/Ouranosinc/Magpie label Nov 21, 2024
@@ -33,6 +33,7 @@ providers:
dodsC,
wcs,
wms,
ncss,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave the old location.
THREDDS_VERSION can be overridden, so a server could still employ the older variant.

Given that, that will most probably cause a conflict in the data_type resolution order if all 3 are defined, so maybe alternate paths or a dynamic variable resolution will be needed here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh... you're right. Ok I'll figure something out.

Comment on lines 36 to 37
ncss/grid,
ncss/point,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test if this works?

I do not know if Magpie/Twitcher handles this properly.
The data types were not planned to include a /, so I'm not sure if a url.split("/") might happen somewhere in the code leading to misinterpretation of the targeted data_type.

Copy link
Collaborator Author

@mishaschwartz mishaschwartz Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seemed to be fine but I didn't rigorously test by trying to read the two options with different directory permissions I guess. I'll look into that.

BUT... if these prefixes can't handle a / we should change the service definitions to:

<service name="ncssGrid" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-grid/" />
<service name="ncssPoint" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-point/" />

or similar and that way we don't have to worry about that at all

Copy link
Collaborator Author

@mishaschwartz mishaschwartz Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok maybe we can't change the path... the URL construction docs for this service look like point or grid needs to be a subpath under ncss:

https://docs.unidata.ucar.edu/tds/current/userguide/netcdf_subset_service_ref.html#url-construction

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data types were not planned to include a /, so I'm not sure if a url.split("/") might happen somewhere in the code leading to misinterpretation of the targeted data_type.

Yup magpie splits it up (see here) so our only option at this point is to keep the magpie config as it is and treat grid/ and point/ as directory paths if we want to differentiate their magpie permissions.

That's going to cause other confusion though...

Right now we have URL paths like:

  • ${FQDN}/twitcher/ows/proxy/thredds/ncss/datasets/...
  • ${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...

So if we want to set specific permissions on the datasets/ directory we can do that by setting one directory permission rule on the datasets/ subdirectory and it will work for all services.

But if we instead have:

  • ${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/...
  • ${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...

Then we would need to set the same permission rule on datasets/ as well as point/datasets/. If we don't set the same rule on point/datasets/ then the rule won't apply to resources accessed by the ncss/point service.

Please let me know if I'm interpreting this correctly @fmigneault

If I'm right then I think we need to update Magpie to handle this use case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to revert my changes for now until we figure something out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlvu Magpie cannot handle this case currently because it wasn't designed for this. It assumes THREDDS "services" don't contain a /, so it could rely on {Twitcher-prefix}/{THREDDS_name}/{THREDDS_service}/{rest-as-dir/file}.

@fmigneault I am still lost. What case are you talking about?

Just to be clear, in the current state, with Misha's proposed Magpie config change rollback, do we have any problems?

Been running some Jenkins on the new Thredds and I am having some weird errors. Not sure if it's my test system or the code. So if you foresee some problem, please let me know.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ncssGrid and ncssPoint services won't be protected like other services do (ie: a common browse/read/write set for all services) since the paths are not handled.
For the moment, if you want them protected, you need to duplicate the permission hierarchy.

i.e.:

A file accessed by ncssGrid via ${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/sub/file.nc will actually go through ncss service, and point will be considered by Magpie as if it was any other directory (like datasets or sub).

Therefore, the same datasets/sub/file.nc would need 3 sets of permissions:

  • datasets/sub/file.nc => (browse,read,write) for anything currently handled
  • grid/datasets/sub/file.nc => (read) only if accessing via ncssGrid
  • point/datasets/sub/file.nc => (read) only if accessing via ncssPoint

and those (read) need to be duplicated for every user/group/dir/file combination applicable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ncssGrid and ncssPoint services won't be protected like other services do (ie: a common browse/read/write set for all services) since the paths are not handled.

@fmigneault I am surprised. If we "Deny or Allow" everything under ncss path, isn't both of those variant grid/datasets/sub/file.nc and point/datasets/sub/file.nc will be properly denied or allowed, because they are all under ncss/ path?

Under ncss is read only no write anyways, since there is no write for that kind of path.

Copy link
Collaborator

@fmigneault fmigneault Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlvu
Yes.
A deny top-level will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access. The same issue also happens even when working the other way around -- allow recursive followed by restricted deny lower (see below example).

With the current implementation of THREDDS that checks the service [accses-mode] (dodC, ncss, etc.) between Twitcher-proxy-path and the rest of the dir/file path, those resources would be at different "level" for the same file reference from the point of view of Magpie. So you would need to duplicate your 'allows'/'denies' across [accses-mode].

RESOURCE        PERMISSION APPLIED / RESULT
===========================================
THREDDS         allow-recursive

ncss            [access-mode] (not a resource, cannot have permissions directly)
  datasets      deny-recursive
     sub        allow-recursve (all but sub's contents are blocked)

  public        allow-recursive
     secure     deny-recursive
       secret   allow-match only for user-1, maybe group "manager" also, others...

  point         <-- THREDDS sees this as another [access-mode], but Magpie thinks its a RESOURCE!
     datasets   
       sub      <==== OH! OH! undesired full open-access (it's not under the denied "/datasets")

     public     allow-recursive (but because on THREDDS)
       secure   if you forget to replicate above 'secure', this is still full open acces

  grid          (repeat again, another set of duplicat permissions, for each group/user combination)
    ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A deny top-level will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access. The same issue also happens even when working the other way around -- allow recursive followed by restricted deny lower (see below example).

Oh OK !!! Now I understand, thanks ! Top-level Allow or Deny will work. But adding an exception under the top-level will not.

tlvu added a commit that referenced this pull request Dec 5, 2024
Replay the diff at commit
43646c5
from PR #413
@github-actions github-actions bot added the ci/operations Continuous Integration components label Dec 5, 2024
@tlvu tlvu merged commit 16b12e4 into master Dec 5, 2024
4 of 5 checks passed
@tlvu tlvu deleted the update-thredds-5.4 branch December 5, 2024 23:56
tlvu added a commit to Ouranosinc/pavics-sdi that referenced this pull request Dec 5, 2024
For Thredds v5 upgrade in
bird-house/birdhouse-deploy#413.

For the following error and other output changes:
```
  _______ pavics-sdi-master/docs/source/notebooks/rendering.ipynb::Cell 2 ________
  Notebook cell execution failed
  Cell 2: Cell outputs differ

  Input:
  sorted(wms.contents["tasmax"].styles.keys())

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    "['boxfill/al...fill/sst_36']" == "['colored_co...ter/default']"

    - ['colored_contours/default',
    -  'contours',
    -  'default-scalar/default',
    -  'raster/default']
    + ['boxfill/alg',
    +  'boxfill/alg2',
    +  'boxfill/ferret',
    +  'boxfill/greyscale',
    +  'boxfill/ncview',
    +  'boxfill/occam',
    +  'boxfill/occam_pastel-30',
    +  'boxfill/rainbow',
    +  'boxfill/redblue',
    +  'boxfill/sst_36']

  _______ pavics-sdi-master/docs/source/notebooks/rendering.ipynb::Cell 3 ________
  Notebook cell execution failed
  Cell 3: Cell execution caused an exception

  Input:
  resp = wms.getmap(
      layers=["tasmax"],
      styles=["boxfill/occam"],
      format="image/png",
      colorscalerange=f"{mn},{mx}",
      size=[256, 256],
      srs="CRS:84",
      bbox=(150, 30, 250, 80),
      time="2006-02-15",
      transparent=True,
  )
  Image(resp.read())

  Traceback:

  ---------------------------------------------------------------------------
  ServiceException                          Traceback (most recent call last)
  Cell In[1], line 1
  ----> 1 resp = wms.getmap(
        2     layers=["tasmax"],
        3     styles=["boxfill/occam"],
        4     format="image/png",
        5     colorscalerange=f"{mn},{mx}",
        6     size=[256, 256],
        7     srs="CRS:84",
        8     bbox=(150, 30, 250, 80),
        9     time="2006-02-15",
       10     transparent=True,
       11 )
       12 Image(resp.read())

  File /opt/conda/envs/birdy/lib/python3.11/site-packages/owslib/map/wms130.py:309, in WebMapService_1_3_0.getmap(self, layers, styles, srs, bbox, format, size, time, elevation, dimensions, transparent, bgcolor, exceptions, method, timeout, **kwargs)
      305 data = urlencode(request)
      307 self.request = bind_url(base_url) + data
  --> 309 u = openURL(base_url, data, method, timeout=timeout or self.timeout, auth=self.auth, headers=self.headers)
      311 # need to handle casing in the header keys
      312 headers = {}

  File /opt/conda/envs/birdy/lib/python3.11/site-packages/owslib/util.py:210, in openURL(url_base, data, method, cookies, username, password, timeout, headers, verify, cert, auth)
      207 req = requests.request(method.upper(), url_base, headers=headers, **rkwargs)
      209 if req.status_code in [400, 401]:
  --> 210     raise ServiceException(req.text)
      212 if req.status_code in [404, 500, 502, 503, 504]:    # add more if needed
      213     req.raise_for_status()

  ServiceException: <ServiceExceptionReport version="1.3.0"
                          xmlns="http://www.opengis.net/ogc"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xsi:schemaLocation="http://www.opengis.net/ogc http://schemas.opengis.net/wms/1.3.0/exceptions_1_3_0.xsd">
      <ServiceException code="StyleNotDefined">
          The layer tasmax does not support the style boxfill
      </ServiceException>
  </ServiceExceptionReport>
```

Output change:
```
  _ pavics-sdi-fix-for-Thredds-v5/docs/source/notebooks/CaSR_basic.ipynb::Cell 4 _
  Notebook cell execution failed
  Cell 4: Cell outputs differ

  Input:
  bbox.rotated_pole

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    '<xarray.Data...itude:    0.0' == '<xarray.Data...itude:    0.0'

    Skipping 138 identical leading characters in diff, use -v to show
    Skipping 67 identical trailing characters in diff, use -v to show
      utes:
    -     long_name:                    coordinates of the rotated North Pole
          earth_radius:                 6371220.0
          grid_mapping_name:            rotated_latitude_longitude
          grid_north_pole_latitude:     31.758316040039062
          grid_north_pole_longitude:    87.59703063964844
    +     long_name:                    coordinates of the rotated North Pole
          long
```
tlvu added a commit to Ouranosinc/PAVICS-landing that referenced this pull request Dec 5, 2024
For Thredds v5 update in
bird-house/birdhouse-deploy#413

To fix output change:

```
  _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 0 _
  Notebook cell execution failed
  Cell 0: Cell outputs differ

  Input:
  from siphon.catalog import TDSCatalog

  url = "https://boreas.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.xml"  # TEST_USE_PROD_DATA

  # Create Catalog
  cat = TDSCatalog(url)

  # List of datasets
  print(f"Number of datasets: {len(cat.datasets)}")

  # Access mechanisms - here we are interested in OPENDAP, a data streaming protocol
  cds = cat.datasets[0]
  print(f"Access URLs: {tuple(cds.access_urls.keys())}")

  Traceback:
   mismatch 'stdout'

   assert reference_output == test_output failed:

    "Number of da...cdfSubset')\n" == "Number of da...cdfSubset')\n"

    Skipping 43 identical leading characters in diff, use -v to show
    Skipping 50 identical trailing characters in diff, use -v to show
    - erver', 'OpenDAP', 'NC
    ?           ^^^
    + erver', 'OPENDAP', 'NC
    ?           ^^^

  _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 2 _
  Notebook cell execution failed
  Cell 2: Cell outputs differ

  Input:
  # Extract a subset of the file

  # Again, this only creates an in-memory representation of the data
  sub = ds.tasmin.sel(time="2050").isel(rlon=400, rlat=350)

  # The data is only downloaded when we actually need it for a computation.
  sub.mean(keep_attrs=True).compute()

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    '<xarray.Data...60   50   50]' == '<xarray.Data...60   50   50]'

    Skipping 38 identical leading characters in diff, use -v to show
    Skipping 225 identical trailing characters in diff, use -v to show
      B
    - array(279.47516, dtype=float32)
    ?         ^ - ^^
    + array(278.72336, dtype=float32)
    ?         ^  ^^^
      Coordinates:
          rlat          float32 4B -14.67
          rlon          float32 4B 360.6
          rotated_pole  float32 4B 9.969e+36
          lat           float32 4B 43.57
          lon           float32 4B -91.6
      Attributes:
    -     long_name:                                   Minimal daily temperature
          cell_methods:                                time: minimum within days
          description:                                 Daily minimal temperature as...
          grid_mapping:                                rotated_pole
          history:                                     [DATE_TIME] Data c...
    +     long_name:                                   Minimal daily temperature
          stan

  _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 3 _
  Notebook cell execution failed
  Cell 3: Cell outputs differ

  Input:
  ssp245_data = [cat.datasets[x] for x in cat.datasets if "ssp245" in x]
  ssp245_data

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    '[day_ESPO-G6...1001231.ncml]' == '[day_ESPO-G6...1001231.ncml]'

    Skipping 35 identical leading characters in diff, use -v to show
    - ioMIP_NAM_AS-RCEC_TaiESM1_ssp245_r1i1p1f1_19500101-21001231.ncml,
    ?           ^ ^^^^^ ^^^   ^
    + ioMIP_NAM_NUIST_NESM3_ssp245_r1i1p1f1_19500101-21001231.ncml,
    ?           ^^^ ^ ^   ^
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-MM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MOHC_UKESM1-0-LL_ssp245_r1i1p1f2_19500101-21001230.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_IPSL_IPSL-CM6A-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM4-8_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-Veg_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-CC_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-CM6-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CCCma_CanESM5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml,
       day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CCCma_CanESM5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-CM6-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-CC_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-Veg_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM4-8_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_IPSL_IPSL-CM6A-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MOHC_UKESM1-0-LL_ssp245_r1i1p1f2_19500101-21001230.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-MM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NUIST_NESM3_ssp245_r1i1p1f1_19500101-21001231.ncml]
    ?                                              ^^^  ^^   ^
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_AS-RCEC_TaiESM1_ssp245_r1i1p1f1_19500101-21001231.ncml]
    ?                                              ^ ++++++ ^^   ^

  _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 4 _
  Notebook cell execution failed
  Cell 4: Cell outputs differ

  Input:
  tcr_likely_models = [
      "BCC-CSM2-MR",
      "FGOALS-g3",
      "CMCC-ESM2",
      "CNRM-ESM2-1",
      "ACCESS-CM2",
      "ACCESS-ESM1-5",
      "MPI-ESM1-2-HR",
      "INM-CM5-0",
      "MIROC6",
      "MPI-ESM1-2-LR",
      "MRI-ESM2-0",
      "NorESM2-LM",
      "KACE-1-0-G",
      "GFDL-ESM4",
      "MIROC-ES2L",
  ]

  def _filter_tcr_likely(files):
      return [d for d in files if any([h in d.name for h in tcr_likely_models])]

  # create a simple search sub-function

  def get_ncfilelist(scen=None, url=None, tcr_likely=False):
      cat = TDSCatalog(url)
      ncfiles = [cat.datasets[c] for c in cat.datasets if scen in c]
      if tcr_likely:
          expected = len(tcr_likely_models)
          ncfiles = _filter_tcr_likely(ncfiles)
          if len(ncfiles) == expected:
              display(f"Successfully found {expected} datasets for {scen}")
              return ncfiles
          else:
              raise ValueError(
                  f"Expected number of datasets for {scen} is {expected} : found {len(ncfiles)}"
              )

  datasets = {}
  for scen in ["ssp245", "ssp370"]:
      datasets[scen] = get_ncfilelist(scen=scen, url=url, tcr_likely=True)
  display(datasets["ssp245"])
  display(datasets["ssp370"])

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    '[day_ESPO-G6...1001231.ncml]' == '[day_ESPO-G6...1001231.ncml]'

    Skipping 35 identical leading characters in diff, use -v to show
    - ioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    ?           ^^^ ^^^ ^  ^^^^
    + ioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml,
    ?           ^^^^^^^^^ ^^^^ ^  ^
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
       day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    ?                                              ^^   ^^   ^                                          ^^
    +  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml]
    ?                                              ^   ^   ^   +++                                       ^
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml,
    -  day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml]


  _ PAVICS-landing-fix-for-Thredds-v5-output-update/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb::Cell 0 _
  Notebook cell execution failed
  Cell 0: Cell outputs differ

  Input:
  import os

  os.environ["USE_PYGEOS"] = "0"  # force use Shapely with GeoPandas
  import warnings

  import geopandas as gpd
  import matplotlib.pyplot as plt
  import numba
  import xarray as xr
  from clisops.core import subset
  from dask.diagnostics import ProgressBar
  from siphon.catalog import TDSCatalog
  from xclim import atmos

  warnings.simplefilter("ignore")
  # TODO change address
  url = "https://boreas.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.xml"  # TEST_USE_PROD_DATA

  # Create Catalog
  cat = TDSCatalog(url)

  # Subset over the Gasp�� peninsula in eastern Quebec
  gaspe = gpd.GeoDataFrame.from_file(
      "/notebook_dir/pavics-homepage/tutorial_data/gaspesie_mrc.geojson"
  )
  ds = subset.subset_shape(
      xr.open_dataset(
          cat.datasets[0].access_urls["OPENDAP"],
          chunks=dict(time=365 * 4, rlon=50, rlat=50),
      ),
      shape=gpd.GeoDataFrame(geometry=gaspe.buffer(0.05)),
  )

  # What we see here is only a representation of the full content, the entire data set hasn't been loaded.
  display(ds)

  # plot of single day tasmin
  a = ds.tasmin.isel(time=0).plot(figsize=(10, 4))

  Traceback:
   mismatch 'text/plain'

   assert reference_output == test_output failed:

    '<xarray.Data...    EPSG:4326' == '<xarray.Data...    EPSG:4326'

    Skipping 1172 identical leading characters in diff, use -v to show
    - tes: (12/83)
    ?           ^
    + tes: (12/81)
    ?           ^
          Conventions:                     CF-1.7 CMIP-6.2
          Notes:                           Regridded on the grid of RDRS v2.1, then...
          activity_id:                     CMIP
    -     branch_method:                   Hybrid-restart from year/DATE/of p...
    -     branch_time:                     0.0
    ?            --                      ^^^^^
    +     branch_method:                   standard
    ?              ++++                    ^^^^^^^^
          branch_time_in_child:            0.0
    +     branch_time_in_parent:           109573.0
          ...                              ...
          license_type:                    permissive
          terms_of_use:                    In addition to the provided licence, the...
          attribution:                     Use of this dataset should be acknowledg...
          modeling_realm:                  atmos
    -     source_institution:              AS-RCEC
    ?                                      ^ ^^^^^
    +     source_institution:              NUIST
    ?                                      ^^^ ^
          crs:                             EPSG:4326
```
@tlvu
Copy link
Collaborator

tlvu commented Dec 6, 2024

@fmigneault Before you say I merge this while you had a change requested, your requested change is for Misha's magpie config change and he already reverted it.

@mishaschwartz
Copy link
Collaborator Author

Before you say I merge this while you had a change requested, your requested change is for Misha's magpie config change and he already reverted it.

@tlvu Ok but the issue here (Ouranosinc/Magpie#633) now becomes an actual bug instead of a potential future bug. If any deployment uses specific directory permissions for thredds, this introduces a security risk for that deployment. We really needed to resolve Ouranosinc/Magpie#633 before merging this PR.

I'm going to recommend that we add a warning to release 2.6.3, immediately try to fix Ouranosinc/Magpie#633, and then release 2.7.0 with the updated Magpie changes as soon as possible.

I have 1000+ other things to do but I can put resolving the Ouranosinc/Magpie#633 near the top of my priority list if @fmigneault can review it. If @fmigneault does not have the ability to review it shortly then we should really really revert this change and wait until Ouranosinc/Magpie#633 can be addressed.

@fmigneault
Copy link
Collaborator

Yeah. Same points as @mishaschwartz
It wasn't completely reverted. The THREDDS services are added and are not functional.

I don't mind that much since I'm not using those services. Are they needed by anyone?
If this is needed soon, I can take a lot at it.

@tlvu
Copy link
Collaborator

tlvu commented Dec 6, 2024

@mishaschwartz, @fmigneault

Woops, I guess I should have documented my reasoning.

This Thredds v5 actually have 2 minors issues

I considered them minor and not blocking because

  • Magpie top-level Allow or Deny still work across the board, exception under top-level also works across the board, except for NCSS only and NCSS is not widely used
  • WMS is not widely used, similar to NCSS

Other features I wanted to get from this newer Thredds

  • security fixes (newer Tomcat) and the old v4 series is not even available on DockerHub anymore
    • CRIM apparently also pulled this newer Thredds v5 for another of their deployment due to security concerns
  • new experimental Zarr support

On the topic of documenting, I just realized the CHANGES.md file is not up-to-date with all the discussions here in this PR. I will make a PR amending this !

I think, for long running PR, we should double check the CHANGES.md before merge.

I understand everyone is busy so Ouranosinc/Magpie#633 can wait unless you actually need it.

tlvu added a commit that referenced this pull request Dec 9, 2024
…minor issues (#486)

## Overview

Document reasoning for pulling in new Thredds v5, even with minor issues
in previous PR #413.

## CI Operations

<!--
The test suite can be run using a different DACCS config with
``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use
``birdhouse_skip_ci`` set to ``true`` in the PR description.

Using ``[<cmd>]`` (with the brackets) where ``<cmd> = skip ci`` in the
commit message will override ``birdhouse_skip_ci`` from the PR
description.
Such commit command can be used to override the PR description behavior
for a specific commit update.
However, a commit message cannot 'force run' a PR which the description
turns off the CI.
To run the CI, the PR should instead be updated with a ``true`` value,
and a running message can be posted in following PR comments to trigger
tests once again.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
tlvu added a commit that referenced this pull request Dec 9, 2024
## Overview

Fix the error below at the URL https://HOST/canarie/node/service/stats:
Bad return code from
http://thredds:8080//twitcher/ows/proxy/thredds/catalog.html (Expecting
200, Got 404

Old URL: http://thredds:8080//twitcher/ows/proxy/thredds/catalog.html
New URL:
http://thredds:8080/twitcher/ows/proxy/thredds/catalog/catalog.html

An oversight from the previous PR
#413

## Changes

**Non-breaking changes**
- Adapt canarie-api to new Thredds v5 URL

## CI Operations

<!--
The test suite can be run using a different DACCS config with
``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use
``birdhouse_skip_ci`` set to ``true`` in the PR description.

Using ``[<cmd>]`` (with the brackets) where ``<cmd> = skip ci`` in the
commit message will override ``birdhouse_skip_ci`` from the PR
description.
Such commit command can be used to override the PR description behavior
for a specific commit update.
However, a commit message cannot 'force run' a PR which the description
turns off the CI.
To run the CI, the PR should instead be updated with a ``true`` value,
and a running message can be posted in following PR comments to trigger
tests once again.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/operations Continuous Integration components ci/tests Issues or changes related to tests scripts component/magpie Related to https://github.com/Ouranosinc/Magpie component/THREDDS Features or components related to THREDDS documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate to THREDDS 5.4
4 participants