Update Thredds to supported version #413
FYI, for this change, the automated pipeline test is not enough. We have already attempted to upgrade Thredds on Ouranos side and we have |
Thanks for the info. I mostly created this PR to test the pipeline with the updated version to see what happens. I can close this if you're working on it on the ouranos side. |
Oh keep this PR, we have not yet opened a formal PR on our side, we simply override in |
E2E Test Results
DACCS-iac Pipeline Results
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2366/
Result : failure
BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-20.rdext.crim.ca
PAVICS-e2e-workflow-tests Pipeline Results
Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1482/
NOTEBOOK TEST RESULTS
A few problems found related to Thredds 5.4 Notebook changes required: |
E2E Test Results
DACCS-iac Pipeline Results
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2420/
Result : failure
BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-46.rdext.crim.ca
PAVICS-e2e-workflow-tests Pipeline Results
Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1498/
NOTEBOOK TEST RESULTS
E2E Test Results
DACCS-iac Pipeline Results
Build URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2421/
Result : failure
BIRDHOUSE_DEPLOY_BRANCH : update-thredds-5.4
DACCS_CONFIGS_BRANCH : master
PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master
PAVICS_SDI_BRANCH : master
DESTROY_INFRA_ON_EXIT : true
PAVICS_HOST : https://host-140-69.rdext.crim.ca
PAVICS-e2e-workflow-tests Pipeline Results
Tests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1499/
NOTEBOOK TEST RESULTS
FYI @mishaschwartz, we have still not finished fixing the issues on our side. Besides, there is a performance problem with 5.4 (Unidata/tds#406) that we hope will be fixed in 5.5, so this PR is not ready (5.5 is not released yet).
…antic versioning tag
NCML, UDDC, ISO links fixed by adding the missing jar, see Unidata/thredds-docker#310 (comment) No other clues found for broken NetcdfSubset link, opened an issue Unidata/tds#544 |
@@ -33,6 +33,7 @@ providers:
dodsC,
wcs,
wms,
ncss,
Leave the old location. `THREDDS_VERSION` can be overridden, so a server could still employ the older variant.
Given that, that will most probably cause a conflict in the `data_type` resolution order if all 3 are defined, so maybe alternate paths or a dynamic variable resolution will be needed here.
ugh... you're right. Ok I'll figure something out.
ncss/grid,
ncss/point,
Did you test if this works?
I do not know if Magpie/Twitcher handles this properly.
The data types were not planned to include a `/`, so I'm not sure if a `url.split("/")` might happen somewhere in the code leading to misinterpretation of the targeted `data_type`.
Seemed to be fine but I didn't rigorously test by trying to read the two options with different directory permissions I guess. I'll look into that.
BUT... if these prefixes can't handle a `/` we should change the service definitions to:
<service name="ncssGrid" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-grid/" />
<service name="ncssPoint" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-point/" />
or similar and that way we don't have to worry about that at all
Ok maybe we can't change the path... the URL construction docs for this service look like point or grid needs to be a subpath under ncss:
https://docs.unidata.ucar.edu/tds/current/userguide/netcdf_subset_service_ref.html#url-construction
> The data types were not planned to include a `/`, so I'm not sure if a `url.split("/")` might happen somewhere in the code leading to misinterpretation of the targeted `data_type`.

Yup, Magpie splits it up (see here), so our only option at this point is to keep the Magpie config as it is and treat `grid/` and `point/` as directory paths if we want to differentiate their Magpie permissions.
That's going to cause other confusion though...
Right now we have URL paths like:
${FQDN}/twitcher/ows/proxy/thredds/ncss/datasets/...
${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...
So if we want to set specific permissions on the `datasets/` directory we can do that by setting one directory permission rule on the `datasets/` subdirectory and it will work for all services.
But if we instead have:
${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/...
${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...
Then we would need to set the same permission rule on `datasets/` as well as `point/datasets/`. If we don't set the same rule on `point/datasets/` then the rule won't apply to resources accessed by the `ncss/point` service.
Please let me know if I'm interpreting this correctly @fmigneault
If I'm right then I think we need to update Magpie to handle this use case.
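The path handling described in this comment can be sketched roughly as follows. This is only an illustration, not Magpie's actual code: `parse_thredds_path`, `ParsedRequest` and the default `prefix` are hypothetical names, and the sketch assumes that the first segment after the proxy prefix is taken as the service and every later segment as a directory or file.

```python
from typing import NamedTuple


class ParsedRequest(NamedTuple):
    access_mode: str          # first segment after the prefix, e.g. "ncss" or "dodsC"
    resource_path: list[str]  # remaining segments, treated as directories/files


def parse_thredds_path(path: str, prefix: str = "/twitcher/ows/proxy/thredds/") -> ParsedRequest:
    """Hypothetical parser: split on "/" and take the first segment as the service."""
    if not path.startswith(prefix):
        raise ValueError(f"path does not start with {prefix!r}")
    segments = [s for s in path[len(prefix):].split("/") if s]
    if not segments:
        raise ValueError("no access mode found in path")
    return ParsedRequest(segments[0], segments[1:])


dods = parse_thredds_path("/twitcher/ows/proxy/thredds/dodsC/datasets/sub/file.nc")
point = parse_thredds_path("/twitcher/ows/proxy/thredds/ncss/point/datasets/sub/file.nc")

print(dods.access_mode, dods.resource_path)
print(point.access_mode, point.resource_path)
```

With this kind of split, `point` ends up as the first directory of the resource path, so a rule written for `datasets/` would not match a request that goes through `ncss/point`.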
I'm going to revert my changes for now until we figure something out.
> @tlvu Magpie cannot handle this case currently because it wasn't designed for this. It assumes THREDDS "services" don't contain a `/`, so it could rely on `{Twitcher-prefix}/{THREDDS_name}/{THREDDS_service}/{rest-as-dir/file}`.

@fmigneault I am still lost. What case are you talking about?
Just to be clear, in the current state, with Misha's proposed Magpie config change rollback, do we have any problems?
Been running some Jenkins on the new Thredds and I am having some weird errors. Not sure if it's my test system or the code. So if you foresee some problem, please let me know.
The `ncssGrid` and `ncssPoint` services won't be protected like other services are (i.e., a common browse/read/write set for all services) since the paths are not handled.
For the moment, if you want them protected, you need to duplicate the permission hierarchy.
i.e.:
A file accessed by `ncssPoint` via `${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/sub/file.nc` will actually go through the `ncss` service, and `point` will be considered by Magpie as if it was any other directory (like `datasets` or `sub`).
Therefore, the same `datasets/sub/file.nc` would need 3 sets of permissions:
- datasets/sub/file.nc => (browse,read,write) for anything currently handled
- grid/datasets/sub/file.nc => (read) only if accessing via ncssGrid
- point/datasets/sub/file.nc => (read) only if accessing via ncssPoint
and those (read) need to be duplicated for every user/group/dir/file combination applicable
> The `ncssGrid` and `ncssPoint` services won't be protected like other services do (ie: a common browse/read/write set for all services) since the paths are not handled.

@fmigneault I am surprised. If we "Deny or Allow" everything under the `ncss` path, won't both of those variants, `grid/datasets/sub/file.nc` and `point/datasets/sub/file.nc`, be properly denied or allowed, because they are all under the `ncss/` path?
Anything under `ncss` is read-only anyway, since there is no write for that kind of path.
@tlvu Yes.
A top-level deny will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access to. The same issue also happens when working the other way around: allow recursive followed by a restricted deny lower (see below example).
With the current implementation of THREDDS that checks the service [access-mode] (dodsC, ncss, etc.) between the Twitcher proxy path and the rest of the dir/file path, those resources would be at different "levels" for the same file reference from the point of view of Magpie. So you would need to duplicate your 'allows'/'denies' across each [access-mode].
RESOURCE            PERMISSION APPLIED / RESULT
===========================================
THREDDS             allow-recursive
  ncss              [access-mode] (not a resource, cannot have permissions directly)
    datasets        deny-recursive
      sub           allow-recursive (all but sub's contents are blocked)
        public      allow-recursive
        secure      deny-recursive
          secret    allow-match only for user-1, maybe group "manager" also, others...
    point           <-- THREDDS sees this as another [access-mode], but Magpie thinks it's a RESOURCE!
      datasets
        sub         <==== OH! OH! undesired full open-access (it's not under the denied "/datasets")
          public    allow-recursive (but because on THREDDS)
          secure    if you forget to replicate above 'secure', this is still full open access
    grid            (repeat again, another set of duplicate permissions, for each group/user combination)
      ...
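The effect shown in the tree above can be reproduced with a small model. This is a deliberately simplified sketch, not Magpie's resolution algorithm: it assumes only recursive allow/deny rules where the closest ancestor with a rule wins, and it ignores users, groups and match-only permissions. `RULES` and `resolve` are hypothetical names.

```python
# Rules keyed by resource path (tuple of segments) below the THREDDS service.
# Only the rules set on the "normal" hierarchy are present; nothing is
# replicated under "point/" or "grid/".
RULES = {
    (): "allow",                            # THREDDS service root: allow-recursive
    ("datasets",): "deny",                  # deny-recursive
    ("datasets", "sub"): "allow",           # allow-recursive
    ("datasets", "sub", "secure"): "deny",  # deny-recursive
}


def resolve(path: str, rules=RULES) -> str:
    """Return the rule of the closest ancestor (or the resource itself) that has one."""
    segments = tuple(s for s in path.split("/") if s)
    for depth in range(len(segments), -1, -1):
        rule = rules.get(segments[:depth])
        if rule is not None:
            return rule
    return "deny"  # nothing matched at all: default to deny


print(resolve("datasets/other/file.nc"))
print(resolve("datasets/sub/secure/file.nc"))
print(resolve("point/datasets/sub/secure/file.nc"))
```

In this model the first two paths resolve to `deny`, while the same secure file reached through `point/` falls back to the service-level allow, which is the open-access case flagged in the tree.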
A deny top-level will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access. The same issue also happens even when working the other way around -- allow recursive followed by restricted deny lower (see below example).
Oh OK !!! Now I understand, thanks ! Top-level Allow or Deny will work. But adding an exception under the top-level will not.
For Thredds v5 upgrade in bird-house/birdhouse-deploy#413.

For the following error and other output changes:
```
_______ pavics-sdi-master/docs/source/notebooks/rendering.ipynb::Cell 2 ________
Notebook cell execution failed
Cell 2: Cell outputs differ

Input:
sorted(wms.contents["tasmax"].styles.keys())

Traceback:
 mismatch 'text/plain'

 assert reference_output == test_output failed:

  "['boxfill/al...fill/sst_36']" == "['colored_co...ter/default']"
  - ['colored_contours/default',
  -  'contours',
  -  'default-scalar/default',
  -  'raster/default']
  + ['boxfill/alg',
  +  'boxfill/alg2',
  +  'boxfill/ferret',
  +  'boxfill/greyscale',
  +  'boxfill/ncview',
  +  'boxfill/occam',
  +  'boxfill/occam_pastel-30',
  +  'boxfill/rainbow',
  +  'boxfill/redblue',
  +  'boxfill/sst_36']

_______ pavics-sdi-master/docs/source/notebooks/rendering.ipynb::Cell 3 ________
Notebook cell execution failed
Cell 3: Cell execution caused an exception

Input:
resp = wms.getmap(
    layers=["tasmax"],
    styles=["boxfill/occam"],
    format="image/png",
    colorscalerange=f"{mn},{mx}",
    size=[256, 256],
    srs="CRS:84",
    bbox=(150, 30, 250, 80),
    time="2006-02-15",
    transparent=True,
)
Image(resp.read())

Traceback:

---------------------------------------------------------------------------
ServiceException                          Traceback (most recent call last)
Cell In[1], line 1
----> 1 resp = wms.getmap(
      2     layers=["tasmax"],
      3     styles=["boxfill/occam"],
      4     format="image/png",
      5     colorscalerange=f"{mn},{mx}",
      6     size=[256, 256],
      7     srs="CRS:84",
      8     bbox=(150, 30, 250, 80),
      9     time="2006-02-15",
     10     transparent=True,
     11 )
     12 Image(resp.read())

File /opt/conda/envs/birdy/lib/python3.11/site-packages/owslib/map/wms130.py:309, in WebMapService_1_3_0.getmap(self, layers, styles, srs, bbox, format, size, time, elevation, dimensions, transparent, bgcolor, exceptions, method, timeout, **kwargs)
    305 data = urlencode(request)
    307 self.request = bind_url(base_url) + data
--> 309 u = openURL(base_url, data, method, timeout=timeout or self.timeout, auth=self.auth, headers=self.headers)
    311 # need to handle casing in the header keys
    312 headers = {}

File /opt/conda/envs/birdy/lib/python3.11/site-packages/owslib/util.py:210, in openURL(url_base, data, method, cookies, username, password, timeout, headers, verify, cert, auth)
    207 req = requests.request(method.upper(), url_base, headers=headers, **rkwargs)
    209 if req.status_code in [400, 401]:
--> 210     raise ServiceException(req.text)
    212 if req.status_code in [404, 500, 502, 503, 504]:  # add more if needed
    213     req.raise_for_status()

ServiceException: <ServiceExceptionReport version="1.3.0" xmlns="http://www.opengis.net/ogc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/ogc http://schemas.opengis.net/wms/1.3.0/exceptions_1_3_0.xsd">
    <ServiceException code="StyleNotDefined">
        The layer tasmax does not support the style boxfill
    </ServiceException>
</ServiceExceptionReport>
```

Output change:
```
_ pavics-sdi-fix-for-Thredds-v5/docs/source/notebooks/CaSR_basic.ipynb::Cell 4 _
Notebook cell execution failed
Cell 4: Cell outputs differ

Input:
bbox.rotated_pole

Traceback:
 mismatch 'text/plain'

 assert reference_output == test_output failed:

  '<xarray.Data...itude:  0.0' == '<xarray.Data...itude:  0.0'
  Skipping 138 identical leading characters in diff, use -v to show
  Skipping 67 identical trailing characters in diff, use -v to show
    utes:
  -     long_name: coordinates of the rotated North Pole
        earth_radius: 6371220.0
        grid_mapping_name: rotated_latitude_longitude
        grid_north_pole_latitude: 31.758316040039062
        grid_north_pole_longitude: 87.59703063964844
  +     long_name: coordinates of the rotated North Pole
        long
```
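One way a notebook could avoid hard-coding a style name that only one THREDDS generation advertises is to choose from the styles the layer actually reports. The helper below is a sketch under that assumption; `pick_style` and its `preferred` default are hypothetical, and the style names are copied from the test output above.

```python
def pick_style(available, preferred=("raster/default", "boxfill/occam")):
    """Return the first preferred style the layer advertises, else any advertised one."""
    available = list(available)
    if not available:
        raise ValueError("layer advertises no styles")
    for style in preferred:
        if style in available:
            return style
    # No preferred style is advertised: fall back to a deterministic choice.
    return sorted(available)[0]


# Style lists taken from the diff above (the boxfill list is a subset).
colored_styles = ["colored_contours/default", "contours", "default-scalar/default", "raster/default"]
boxfill_styles = ["boxfill/alg", "boxfill/occam", "boxfill/rainbow"]

print(pick_style(colored_styles))
print(pick_style(boxfill_styles))
```

In the notebook, the `available` argument would come from `wms.contents["tasmax"].styles.keys()`, the same call used in Cell 2, and the result would be passed as `styles=[...]` to `wms.getmap`.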
For Thredds v5 update in bird-house/birdhouse-deploy#413 To fix output change: ``` _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 0 _ Notebook cell execution failed Cell 0: Cell outputs differ Input: from siphon.catalog import TDSCatalog url = "https://boreas.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.xml" # TEST_USE_PROD_DATA # Create Catalog cat = TDSCatalog(url) # List of datasets print(f"Number of datasets: {len(cat.datasets)}") # Access mechanisms - here we are interested in OPENDAP, a data streaming protocol cds = cat.datasets[0] print(f"Access URLs: {tuple(cds.access_urls.keys())}") Traceback: mismatch 'stdout' assert reference_output == test_output failed: "Number of da...cdfSubset')\n" == "Number of da...cdfSubset')\n" Skipping 43 identical leading characters in diff, use -v to show Skipping 50 identical trailing characters in diff, use -v to show - erver', 'OpenDAP', 'NC ? ^^^ + erver', 'OPENDAP', 'NC ? ^^^ _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 2 _ Notebook cell execution failed Cell 2: Cell outputs differ Input: # Extract a subset of the file # Again, this only creates an in-memory representation of the data sub = ds.tasmin.sel(time="2050").isel(rlon=400, rlat=350) # The data is only downloaded when we actually need it for a computation. sub.mean(keep_attrs=True).compute() Traceback: mismatch 'text/plain' assert reference_output == test_output failed: '<xarray.Data...60 50 50]' == '<xarray.Data...60 50 50]' Skipping 38 identical leading characters in diff, use -v to show Skipping 225 identical trailing characters in diff, use -v to show B - array(279.47516, dtype=float32) ? ^ - ^^ + array(278.72336, dtype=float32) ? 
^ ^^^ Coordinates: rlat float32 4B -14.67 rlon float32 4B 360.6 rotated_pole float32 4B 9.969e+36 lat float32 4B 43.57 lon float32 4B -91.6 Attributes: - long_name: Minimal daily temperature cell_methods: time: minimum within days description: Daily minimal temperature as... grid_mapping: rotated_pole history: [DATE_TIME] Data c... + long_name: Minimal daily temperature stan _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 3 _ Notebook cell execution failed Cell 3: Cell outputs differ Input: ssp245_data = [cat.datasets[x] for x in cat.datasets if "ssp245" in x] ssp245_data Traceback: mismatch 'text/plain' assert reference_output == test_output failed: '[day_ESPO-G6...1001231.ncml]' == '[day_ESPO-G6...1001231.ncml]' Skipping 35 identical leading characters in diff, use -v to show - ioMIP_NAM_AS-RCEC_TaiESM1_ssp245_r1i1p1f1_19500101-21001231.ncml, ? ^ ^^^^^ ^^^ ^ + ioMIP_NAM_NUIST_NESM3_ssp245_r1i1p1f1_19500101-21001231.ncml, ? 
^^^ ^ ^ ^ + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-MM_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MOHC_UKESM1-0-LL_ssp245_r1i1p1f2_19500101-21001230.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_IPSL_IPSL-CM6A-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM4-8_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-Veg_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-CC_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml, + 
day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-CM6-1_ssp245_r1i1p1f2_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CCCma_CanESM5_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml, day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CCCma_CanESM5_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-CM6-1_ssp245_r1i1p1f2_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-CC_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3-Veg_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_EC-Earth-Consortium_EC-Earth3_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM4-8_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml, - 
day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_IPSL_IPSL-CM6A-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MOHC_UKESM1-0-LL_ssp245_r1i1p1f2_19500101-21001230.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-MM_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NUIST_NESM3_ssp245_r1i1p1f1_19500101-21001231.ncml] ? ^^^ ^^ ^ + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_AS-RCEC_TaiESM1_ssp245_r1i1p1f1_19500101-21001231.ncml] ? 
^ ++++++ ^^ ^ _ PAVICS-landing-master/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-1DataAccess.ipynb::Cell 4 _ Notebook cell execution failed Cell 4: Cell outputs differ Input: tcr_likely_models = [ "BCC-CSM2-MR", "FGOALS-g3", "CMCC-ESM2", "CNRM-ESM2-1", "ACCESS-CM2", "ACCESS-ESM1-5", "MPI-ESM1-2-HR", "INM-CM5-0", "MIROC6", "MPI-ESM1-2-LR", "MRI-ESM2-0", "NorESM2-LM", "KACE-1-0-G", "GFDL-ESM4", "MIROC-ES2L", ] def _filter_tcr_likely(files): return [d for d in files if any([h in d.name for h in tcr_likely_models])] # create a simple search sub-function def get_ncfilelist(scen=None, url=None, tcr_likely=False): cat = TDSCatalog(url) ncfiles = [cat.datasets[c] for c in cat.datasets if scen in c] if tcr_likely: expected = len(tcr_likely_models) ncfiles = _filter_tcr_likely(ncfiles) if len(ncfiles) == expected: display(f"Successfully found {expected} datasets for {scen}") return ncfiles else: raise ValueError( f"Expected number of datasets for {scen} is {expected} : found {len(ncfiles)}" ) datasets = {} for scen in ["ssp245", "ssp370"]: datasets[scen] = get_ncfilelist(scen=scen, url=url, tcr_likely=True) display(datasets["ssp245"]) display(datasets["ssp370"]) Traceback: mismatch 'text/plain' assert reference_output == test_output failed: '[day_ESPO-G6...1001231.ncml]' == '[day_ESPO-G6...1001231.ncml]' Skipping 35 identical leading characters in diff, use -v to show - ioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml, ? ^^^ ^^^ ^ ^^^^ + ioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml, ? 
^^^^^^^^^ ^^^^ ^ ^ + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml, + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml, day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CAS_FGOALS-g3_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CMCC_CMCC-ESM2_ssp245_r1i1p1f1_19500101-21001231.ncml, ? ^^ ^^ ^ ^^ + day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_BCC_BCC-CSM2-MR_ssp245_r1i1p1f1_19500101-21001231.ncml] ? 
^ ^ ^ +++ ^ - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CNRM-CERFACS_CNRM-ESM2-1_ssp245_r1i1p1f2_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO-ARCCSS_ACCESS-CM2_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_CSIRO_ACCESS-ESM1-5_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_INM_INM-CM5-0_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC-ES2L_ssp245_r1i1p1f2_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MIROC_MIROC6_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-HR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MPI-M_MPI-ESM1-2-LR_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_MRI_MRI-ESM2-0_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NCC_NorESM2-LM_ssp245_r1i1p1f1_19500101-21001231.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NIMS-KMA_KACE-1-0-G_ssp245_r1i1p1f1_19500101-21001230.ncml, - day_ESPO-G6-R2_v1.0.0_CMIP6_ScenarioMIP_NAM_NOAA-GFDL_GFDL-ESM4_ssp245_r1i1p1f1_19500101-21001231.ncml] _ PAVICS-landing-fix-for-Thredds-v5-output-update/content/notebooks/climate_indicators/PAVICStutorial_ClimateDataAnalysis-3Climate-Indicators.ipynb::Cell 0 _ Notebook cell execution failed Cell 0: Cell outputs differ Input: import os os.environ["USE_PYGEOS"] = "0" # force use Shapely with GeoPandas import warnings import geopandas as gpd import matplotlib.pyplot as plt import numba import xarray as xr from clisops.core import subset from dask.diagnostics import ProgressBar from siphon.catalog import TDSCatalog from xclim import atmos warnings.simplefilter("ignore") # TODO change address url = 
"https://boreas.ouranos.ca/twitcher/ows/proxy/thredds/catalog/datasets/simulations/bias_adjusted/cmip6/ouranos/ESPO-G/ESPO-G6-R2v1.0.0/catalog.xml"  # TEST_USE_PROD_DATA

# Create Catalog
cat = TDSCatalog(url)

# Subset over the Gaspé peninsula in eastern Quebec
gaspe = gpd.GeoDataFrame.from_file(
    "/notebook_dir/pavics-homepage/tutorial_data/gaspesie_mrc.geojson"
)

ds = subset.subset_shape(
    xr.open_dataset(
        cat.datasets[0].access_urls["OPENDAP"],
        chunks=dict(time=365 * 4, rlon=50, rlat=50),
    ),
    shape=gpd.GeoDataFrame(geometry=gaspe.buffer(0.05)),
)

# What we see here is only a representation of the full content, the entire data set hasn't been loaded.
display(ds)

# plot of single day tasmin
a = ds.tasmin.isel(time=0).plot(figsize=(10, 4))

Traceback:
 mismatch 'text/plain'

 assert reference_output == test_output failed:

  '<xarray.Data... EPSG:4326' == '<xarray.Data... EPSG:4326'
  Skipping 1172 identical leading characters in diff, use -v to show
  - tes: (12/83)
  ? ^
  + tes: (12/81)
  ? ^
        Conventions: CF-1.7 CMIP-6.2
        Notes: Regridded on the grid of RDRS v2.1, then...
        activity_id: CMIP
  -     branch_method: Hybrid-restart from year/DATE/of p...
  -     branch_time: 0.0
  ? -- ^^^^^
  +     branch_method: standard
  ? ++++ ^^^^^^^^
        branch_time_in_child: 0.0
  +     branch_time_in_parent: 109573.0
        ... ...
        license_type: permissive
        terms_of_use: In addition to the provided licence, the...
        attribution: Use of this dataset should be acknowledg...
        modeling_realm: atmos
  -     source_institution: AS-RCEC
  ? ^ ^^^^^
  +     source_institution: NUIST
  ? ^^^ ^
        crs: EPSG:4326
```
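The `OpenDAP` versus `OPENDAP` difference in the output above suggests that notebooks may want to look up access URLs without depending on key casing. The following is a minimal sketch of such a lookup; `get_access_url` is a hypothetical helper that works on any plain mapping, and the example URLs are placeholders rather than real endpoints.

```python
def get_access_url(access_urls, service):
    """Look up an access URL by service name, ignoring case."""
    wanted = service.lower()
    for name, url in access_urls.items():
        if name.lower() == wanted:
            return url
    raise KeyError(f"no access URL for {service!r}; available: {sorted(access_urls)}")


# Placeholder mappings mimicking the two key spellings seen in the test output.
mixed_case_keys = {"OpenDAP": "https://example.org/thredds/dodsC/file.nc"}
upper_case_keys = {"OPENDAP": "https://example.org/thredds/dodsC/file.nc"}

print(get_access_url(mixed_case_keys, "opendap"))
print(get_access_url(upper_case_keys, "opendap"))
```

The dataset listings above also come back in a different order, so sorting the dataset names before displaying them is one possible way to keep reference outputs stable across THREDDS versions.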
@fmigneault Before you say I merged this while you had a change requested: your requested change was for Misha's Magpie config change, and he already reverted it.
@tlvu Ok but the issue here (Ouranosinc/Magpie#633) now becomes an actual bug instead of a potential future bug. If any deployment uses specific directory permissions for thredds, this introduces a security risk for that deployment. We really needed to resolve Ouranosinc/Magpie#633 before merging this PR. I'm going to recommend that we add a warning to release 2.6.3, immediately try to fix Ouranosinc/Magpie#633, and then release 2.7.0 with the updated Magpie changes as soon as possible. I have 1000+ other things to do but I can put resolving the Ouranosinc/Magpie#633 near the top of my priority list if @fmigneault can review it. If @fmigneault does not have the ability to review it shortly then we should really really revert this change and wait until Ouranosinc/Magpie#633 can be addressed. |
Yeah. Same points as @mishaschwartz I don't mind that much since I'm not using those services. Are they needed by anyone? |
Woops, I guess I should have documented my reasoning. This Thredds v5 actually has 2 minor issues
I considered them minor and not blocking because
Other features I wanted to get from this newer Thredds
On the topic of documenting, I just realized the CHANGES.md file is not up-to-date with all the discussions here in this PR. I will make a PR amending this ! I think, for long running PR, we should double check the CHANGES.md before merge. I understand everyone is busy so Ouranosinc/Magpie#633 can wait unless you actually need it. |
…minor issues (#486)

## Overview

Document reasoning for pulling in new Thredds v5, even with minor issues in previous PR #413.

## CI Operations

<!--
The test suite can be run using a different DACCS config with ``birdhouse_daccs_configs_branch: branch_name`` in the PR description.
To globally skip the test suite regardless of the commit message use ``birdhouse_skip_ci`` set to ``true`` in the PR description.
Using ``[<cmd>]`` (with the brackets) where ``<cmd> = skip ci`` in the commit message will override ``birdhouse_skip_ci`` from the PR description.
Such commit command can be used to override the PR description behavior for a specific commit update.
However, a commit message cannot 'force run' a PR which the description turns off the CI.
To run the CI, the PR should instead be updated with a ``true`` value, and a running message can be posted in following PR comments to trigger tests once again.
-->

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
## Overview

Fix the error below at the URL https://HOST/canarie/node/service/stats:

Bad return code from http://thredds:8080//twitcher/ows/proxy/thredds/catalog.html (Expecting 200, Got 404

Old URL: http://thredds:8080//twitcher/ows/proxy/thredds/catalog.html
New URL: http://thredds:8080/twitcher/ows/proxy/thredds/catalog/catalog.html

An oversight from the previous PR #413

## Changes

**Non-breaking changes**
- Adapt canarie-api to new Thredds v5 URL

## CI Operations

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
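Because `THREDDS_VERSION` can be overridden per deployment, a monitoring URL could in principle be derived from the configured version instead of being fixed. The sketch below illustrates that idea only; it is not how canarie-api is configured in this repository. `catalog_url` is a hypothetical helper, and it assumes that versions before 5 serve the catalog page at the old path quoted in the commit message above.

```python
def catalog_url(base: str, thredds_version: str) -> str:
    """Build the catalog page URL to monitor for a given THREDDS version string."""
    major = int(thredds_version.split(".")[0])
    # Assumption: v5 and later use the "catalog/catalog.html" path, older versions "catalog.html".
    page = "catalog/catalog.html" if major >= 5 else "catalog.html"
    # Strip trailing slashes from the base to avoid a "//" in the result.
    return f"{base.rstrip('/')}/twitcher/ows/proxy/thredds/{page}"


print(catalog_url("http://thredds:8080/", "5.5"))
print(catalog_url("http://thredds:8080", "4.6.18"))
```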
Overview
Unidata has dropped support for TDS versions < 5.x. This updates Thredds to version 5.5.
Requires matching update:
Changes
Non-breaking changes
Breaking changes
Related Issue / Discussion
Additional Information
Links to other issues or sources.
birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false