Error while using add_dqr_to_qc: flag_values and flag_masks not set as expected #750

Closed
sgupta92 opened this issue Nov 8, 2023 · 9 comments · Fixed by #755

Comments

@sgupta92

sgupta92 commented Nov 8, 2023

  • ACT version: 1.5.3
  • Python version: 3.11.4
  • Operating System: Windows

I am trying to add DQRs to the QC variable for 'lv_e' (latent flux) in the 30ecor datastream from SGP.

import act

fluxdata = act.io.armfiles.read_netcdf('sgp30ecorE2.b1.20050622.000000.cdf', keep_variables=['lv_e','wind_dir','qc_lv_e'])
fluxdata = act.qc.arm.add_dqr_to_qc(fluxdata, variable=['lv_e'])

The error occurs when using 'act.qc.arm.add_dqr_to_qc' on either sgp30ecorE2.b1.20050622.000000.cdf or sgp30ecorE6.b1.20040705.000000.cdf. The latter is associated with the following DQR: https://adc.arm.gov//ArchiveServices/DQRService?dqrid=D050524.3

Using 'cleanup_qc=True' did not help, and neither did running clean.cleanup() after reading the file. The error is provided below.

Update: I updated the package from 1.5.1 to 1.5.3, and that did not fix the original issue. I am now also getting an additional error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))


KeyError Traceback (most recent call last)
File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\qcfilter.py:658, in QCFilter.available_bit(self, qc_var_name, recycle)
657 try:
--> 658 flag_masks = self._ds[qc_var_name].attrs['flag_masks']
659 flag_value = False

KeyError: 'flag_masks'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\qcfilter.py:662, in QCFilter.available_bit(self, qc_var_name, recycle)
661 try:
--> 662 flag_masks = self._ds[qc_var_name].attrs['flag_values']
663 flag_value = True

KeyError: 'flag_values'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\qcfilter.py:666, in QCFilter.available_bit(self, qc_var_name, recycle)
665 try:
--> 666 self._ds[qc_var_name].attrs['flag_values']
667 flag_masks = self._ds[qc_var_name].attrs['flag_masks']

KeyError: 'flag_values'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
Cell In[118], line 60
57 fluxdata = act.io.armfiles.read_netcdf('sgp30ecor'+site[0:3]+'.b1/sgp30ecor'+site[0:3]+'.b1.'+date+'.000000.cdf')
59 # Add DQRs to QC variable
---> 60 fluxdata = act.qc.arm.add_dqr_to_qc(fluxdata, variable=['lv_e'])
61 wdir=fluxdata.wind_dir.where(fluxdata.qc_wind_dir==0)
62 latent_flux_1=fluxdata.lv_e.where(fluxdata.qc_lv_e==0)

File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\arm.py:179, in add_dqr_to_qc(ds, variable, assessment, exclude, include, normalize_assessment, cleanup_qc, dqr_link, skip_location_vars)
177 for key, value in dqr_results.items():
178 try:
--> 179 ds.qcfilter.add_test(
180 var_name,
181 index=value['index'],
182 test_meaning=value['test_meaning'],
183 test_assessment=value['test_assessment'],
184 )
185 except IndexError:
186 print(f"Skipping '{var_name}' DQR application because of IndexError")

File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\qcfilter.py:345, in QCFilter.add_test(self, var_name, index, test_number, test_meaning, test_assessment, flag_value, recycle)
342 qc_var_name = self._ds.qcfilter.check_for_ancillary_qc(var_name, flag_type=flag_value)
344 if test_number is None:
--> 345 test_number = self._ds.qcfilter.available_bit(qc_var_name, recycle=recycle)
347 self._ds.qcfilter.set_test(var_name, index, test_number, flag_value)
349 if flag_value:

File ~\AppData\Local\anaconda3\Lib\site-packages\act\qc\qcfilter.py:670, in QCFilter.available_bit(self, qc_var_name, recycle)
668 flag_value = False
669 except KeyError:
--> 670 raise ValueError(
671 'Problem getting next value from '
672 'available_bit(). flag_values and '
673 'flag_masks not set as expected'
674 )
676 if flag_masks == []:
677 next_bit = 1

ValueError: Problem getting next value from available_bit(). flag_values and flag_masks not set as expected

@AdamTheisen
Collaborator

AdamTheisen commented Nov 9, 2023

Thanks @sgupta92!

This problem is due to the older QC metadata in that file. It's using an old ARM convention that we have not incorporated into ACT yet. @zssherman how good are you with regular expressions? I think we would want to extract each of the 0x lines accordingly:
----/bit /Bit description
"0x0 = value is within the specified range\n",

We would also, of course, pull in valid_min and valid_max from the variable.

		:qc_method = "Standard Mentor QC" ;
		:Mentor_QC_Field_Information = "For each qc_<field> interpret the values as follows:\n",
			"\n",
			"Basic mentor QC checks (bit values):\n",
			"==========================================\n",
			"0x0 = value is within the specified range\n",
			"0x1 = value is equal to \'missing_value\'\n",
			"0x2 = value is less than the \'valid_min\'\n",
			"0x4 = value is greater than the \'valid_max\'\n",
			"0x8 = value failed the \'valid_delta\' check\n",
			"\n",
			"If the value is a \'missing_value\' no min, max, or delta checks are performed.\n",
			"\n",
			"The delta checks are done by comparing each data value to the one just\n",
			"prior to it in time. If a previous data value does not exist or is a\n",
			"\'missing_value\' the delta check will not be performed.\n",
			"\n",
			"Note that the delta computation for multi-dimensioned data compares the\n",
			"absolute value between points in the same spatial location to the previous\n",
			"point in time.\n",
			"\n",
			"If the associated non-QC field does not contain any mentor-specified minimum,\n",
			"maximum, or delta information a qc_field is not generated.\n",
			"" ;

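For illustration, the regular-expression route mentioned above could look roughly like the sketch below. It is not ACT code: the function name `parse_mentor_qc_bits` and the pattern are assumptions made for this example, and the sample text is an abbreviated copy of the attribute quoted above.

```python
import re

# Abbreviated copy of the Mentor_QC_Field_Information text quoted above.
MENTOR_QC_TEXT = (
    "Basic mentor QC checks (bit values):\n"
    "==========================================\n"
    "0x0 = value is within the specified range\n"
    "0x1 = value is equal to 'missing_value'\n"
    "0x2 = value is less than the 'valid_min'\n"
    "0x4 = value is greater than the 'valid_max'\n"
    "0x8 = value failed the 'valid_delta' check\n"
)

# Matches lines of the form "0x<hex> = <description>".
BIT_LINE = re.compile(r"^0x([0-9A-Fa-f]+)\s*=\s*(.+?)\s*$", re.MULTILINE)


def parse_mentor_qc_bits(text):
    """Return {bit_value: description} for every '0x.. = ...' line (hypothetical helper)."""
    return {int(m.group(1), 16): m.group(2) for m in BIT_LINE.finditer(text)}


bits = parse_mentor_qc_bits(MENTOR_QC_TEXT)
for value, description in sorted(bits.items()):
    print(value, description)
```

Lines that do not begin with `0x`, such as the header and the row of `=` characters, are ignored by the pattern.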
@zssherman
Collaborator

I have some experience with the regex module, so I can try to take a look. Is the goal to pull those specific lines from the old file? How are they then applied?

@AdamTheisen
Collaborator

@zssherman I'm talking with the ingest lead today and will find out more information. @kenkehoe and I have been talking about this as well, and there are quite a lot of older datastreams that use this convention.

@AdamTheisen
Collaborator

@zssherman @kenkehoe I confirmed with Brian that these all follow the standard ARM convention, so if this global attribute is present and we don't have any other QC information, we can apply it. The assessments would all be bad. Does this sound like a good path forward? It would be a lot easier than using regular expressions to decode the text. Thoughts?
0 = Good
1 = Missing Value
2 = Valid Min
4 = Valid Max
8 = Valid Delta
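
Because these values are bit masks, a single QC integer can report several failed checks at once. The sketch below shows how a value could be decoded using the mapping listed above. The names `ARM_MENTOR_QC` and `decode_qc` are hypothetical and do not come from ACT.

```python
# Mapping taken from the list above; 0 means no check failed.
ARM_MENTOR_QC = {
    1: "Missing Value",
    2: "Valid Min",
    4: "Valid Max",
    8: "Valid Delta",
}


def decode_qc(value):
    """Return the names of all checks whose bit is set in a QC integer (hypothetical helper)."""
    return [name for mask, name in ARM_MENTOR_QC.items() if value & mask]


print(decode_qc(0))  # no bits set, so the sample passed every check
print(decode_qc(6))  # bits 2 and 4 are both set
```

For example, a QC value of 6 has both the 2 and 4 bits set, so it decodes to the Valid Min and Valid Max checks.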

@zssherman
Collaborator

@AdamTheisen I guess I'm still slightly confused. What is the path forward, or what is needed? (My brain is going burr)

@AdamTheisen
Collaborator

AdamTheisen commented Nov 9, 2023

@zssherman I want @kenkehoe's thoughts here, but I'm thinking that in the get_attr_info function, after the assessment and comments are set (~line 356), we could check whether those variables are None and whether this attribute exists. If so, we would set those variables to the generic ARM defaults.

			"Basic mentor QC checks:\n",
			"=======================\n",
			"A value of  0 means that no mentor QC (missing/min/max/delta) checks failed\n",
			"A value of  1 means that the sample contained a \'missing data\' value\n",
			"A value of  2 means that the sample failed the \'minimum\' check\n",
			"A value of  4 means that the sample failed the \'maximum\' check\n",
			"A value of  8 means that the sample failed the \'delta\' check\n",

@kenkehoe
Contributor

kenkehoe commented Nov 9, 2023

I think we should check for the attribute existence along with some simple value checking. We should see if the value starts with "Basic mentor QC checks:" and contains the correct number of carriage returns. That should be enough to ensure we are applying the assumptions correctly.

Also, tests 1, 2, and 3 will be bad, and test 4 will be indeterminate. That is the current ARM standard.
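
A minimal sketch of that fallback is below, with plain dictionaries standing in for the QC variable's attributes and the file's global attributes. The function `apply_default_mentor_qc`, the meaning strings, and the `flag_meanings` / `flag_assessments` attribute names are assumptions made for this example; it is not the code that was merged into ACT.

```python
# Hypothetical fallback table: (bit mask, meaning, assessment).
# Assessments follow the comment above: tests 1-3 Bad, test 4 Indeterminate.
DEFAULT_MENTOR_QC = [
    (1, "Value is equal to missing_value", "Bad"),
    (2, "Value is less than valid_min", "Bad"),
    (4, "Value is greater than valid_max", "Bad"),
    (8, "Value failed the valid_delta check", "Indeterminate"),
]


def apply_default_mentor_qc(qc_attrs, global_attrs):
    """Return a copy of qc_attrs with default flag metadata filled in.

    Defaults are added only when the QC variable has neither flag_masks
    nor flag_values and the old-style global attribute is present.
    """
    filled = dict(qc_attrs)
    has_flags = "flag_masks" in qc_attrs or "flag_values" in qc_attrs
    if has_flags or "Mentor_QC_Field_Information" not in global_attrs:
        return filled
    filled["flag_masks"] = [mask for mask, _, _ in DEFAULT_MENTOR_QC]
    filled["flag_meanings"] = [meaning for _, meaning, _ in DEFAULT_MENTOR_QC]
    filled["flag_assessments"] = [assess for _, _, assess in DEFAULT_MENTOR_QC]
    return filled


old_style = apply_default_mentor_qc(
    {"long_name": "QC for lv_e"},
    {"Mentor_QC_Field_Information": "Basic mentor QC checks: ..."},
)
print(old_style["flag_masks"], old_style["flag_assessments"])
```

Existing flag metadata is left untouched, so files that already follow the newer convention are not altered.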

@AdamTheisen
Collaborator

I ran through all the ARM cases of this, and the text is consistently either 920 or 1562 characters long, so I think we could add a length check to be sure.
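
The combined check might be sketched as follows. The helper name `looks_like_standard_mentor_qc` is hypothetical. It tests for the header anywhere in the text rather than at the very start, because the attribute quoted earlier in this thread opens with an introductory sentence before the header. That choice is an assumption for this example.

```python
# The two lengths reported above for the standard attribute text.
EXPECTED_LENGTHS = {920, 1562}
HEADER = "Basic mentor QC checks"


def looks_like_standard_mentor_qc(text):
    """Return True when text resembles the standard ARM mentor QC attribute (hypothetical helper)."""
    return (
        isinstance(text, str)
        and HEADER in text
        and len(text) in EXPECTED_LENGTHS
    )


# Synthetic stand-in padded to one of the expected lengths.
sample = (HEADER + ":\n").ljust(920, "x")
print(looks_like_standard_mentor_qc(sample))
```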

@zssherman
Collaborator

Sounds good, I'll try to take a crack at this early next week.

AdamTheisen added a commit to AdamTheisen/ACT that referenced this issue Nov 17, 2023
AdamTheisen added a commit that referenced this issue Nov 17, 2023
* ENH: Fix to resolve #750 for older ARM qc convention in files

* ENH: Adding in tesets

* ENH: Streamlining code