Search Keyword: max
- Introduction
- Maximum from Groups of Hashes
- Maximum from 1-Dimensional Arrays
- Dealing With Bad Data
- Bonus: Second Maximum
The [max([NAME])] search keyword will match child nodes which have -- or do not have, when inverted with ! -- the maximum value in either:
- a named child key shared by multiple immediate-child Hashes (maps/dicts); or
- an entire present single-dimension Array (sequence/list).
Zero to many matches are made depending on how many nodes are evaluated and how many of those contain the maximum value. So, it is possible for this keyword to match more than one node when they share the same maximum value. When you really need only one match, you can use a Collector and Array Element index to filter the result to only the first match, demonstrated below.
Remember to place this keyword so that it operates against the parent node of all children that are being evaluated. Against Hash data, each child must have the named key; child nodes missing the key are ignored. An exception is raised when this keyword is placed so that it would be forced to evaluate each child in isolation. The examples below will demonstrate what this looks like.
[max([NAME])] accepts up to one parameter, NAME. This parameter:
- is mandatory when evaluating Hash (map/dict) peers, specifying the exact -- case-sensitive -- name of the required child key; and
- must not be present when evaluating Array (sequence/list) elements.
To illustrate, the example commands below will use this sample data as max-examples.yaml:
---
# Consistent Data Types
prices_aoh:
  - product: doohickey
    price: 4.99
  - product: fob
    price: 4.99
  - product: whatchamacallit
    price: 9.95
  - product: widget
    price: 0.98
  - product: unknown
prices_hash:
  doohickey:
    price: 4.99
  fob:
    price: 4.99
  whatchamacallit:
    price: 9.95
  widget:
    price: 0.98
  unknown:
prices_array:
  - 4.99
  - 4.99
  - 9.95
  - 0.98
  - null
# Inconsistent Data Types
bare: value
bad_prices_aoh:
  - product: doohickey
    price: 4.99
  - product: fob
    price: not set
  - product: whatchamacallit
    price: 9.95
  - product: widget
    price: true
  - product: unknown
bad_prices_hash:
  doohickey:
    price: 4.99
  fob:
    price: not set
  whatchamacallit:
    price: 9.95
  widget:
    price: true
  unknown:
bad_prices_array:
  - 4.99
  - not set
  - 9.95
  - 0.98
  - null
There are two groups of Hashes (maps/dicts) in the sample data:
- Arrays of Hashes (sequence-of-maps / list-of-dicts) at prices_aoh and bad_prices_aoh; and
- Hashes of Hashes (map-of-maps / dict-of-dicts) at prices_hash and bad_prices_hash.
This section will ignore the bad_* data, which is demonstrated later to show how bad data impacts the outcome of this search keyword.
As data expression strategies, each type of Hash grouping has pros and cons versus the other type. The [max(NAME)] Search Keyword can handle both in the same way.
Any search against either type occurs at the parent node of the grouping. In the sample data above, that would be the prices_aoh or prices_hash nodes, like so:
$ yaml-get --query='/prices_aoh[max(price)]' max-examples.yaml
{"product": "whatchamacallit", "price": 9.95}
$ yaml-get --query='/prices_hash[max(price)]' max-examples.yaml
{"price": 9.95}
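To make the matching rule concrete, here is a minimal Python sketch of how a maximum over an Array of Hashes could be computed. It is not yamlpath's implementation; the helper name max_of_hashes is hypothetical, and the data is the prices_aoh sample copied inline.

```python
def max_of_hashes(children, name):
    """Return every child Hash whose `name` key holds the maximum value.

    Hypothetical helper for illustration only; not part of yamlpath.
    """
    # Children that are not Hashes, or that lack the named key, are skipped.
    candidates = [c for c in children if isinstance(c, dict) and name in c]
    if not candidates:
        return []
    top = max(c[name] for c in candidates)
    # More than one child can share the maximum, so return all of them.
    return [c for c in candidates if c[name] == top]


prices_aoh = [
    {"product": "doohickey", "price": 4.99},
    {"product": "fob", "price": 4.99},
    {"product": "whatchamacallit", "price": 9.95},
    {"product": "widget", "price": 0.98},
    {"product": "unknown"},  # no "price" key, so it is ignored
]

aoh_matches = max_of_hashes(prices_aoh, "price")
print(aoh_matches)  # [{'product': 'whatchamacallit', 'price': 9.95}]
```

The function returns a list rather than a single value because the keyword can match zero, one, or many nodes.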
Should you want only the maximum price value, add another Hash Key Segment. Should you want only the name of the matching "product", add a Hash Key Segment for the Array of Hashes, or a [name()] Search Keyword for the Hash of Hashes, because its children are uniquely identified by their key names rather than by the value of their own child identifier key:
$ yaml-get --query='/prices_aoh[max(price)]/price' max-examples.yaml
9.95
$ yaml-get --query='/prices_hash[max(price)]/price' max-examples.yaml
9.95
$ yaml-get --query='/prices_aoh[max(price)]/product' max-examples.yaml
whatchamacallit
$ yaml-get --query='/prices_hash[max(price)][name()]' max-examples.yaml
whatchamacallit
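The difference between the two groupings can be sketched in plain Python. This is an illustration under assumptions, not yamlpath's code: the helper max_names is hypothetical, and prices_hash is the sample data copied inline. In a Hash of Hashes, the identifier of each child is its key, so the "name" of a match is the key rather than a value stored inside the child.

```python
prices_hash = {
    "doohickey": {"price": 4.99},
    "fob": {"price": 4.99},
    "whatchamacallit": {"price": 9.95},
    "widget": {"price": 0.98},
    "unknown": None,  # empty node, so there is no "price" to compare
}


def max_names(parent, name):
    """Return the key names of the children holding the maximum `name` value.

    Hypothetical helper for illustration only; not part of yamlpath.
    """
    candidates = {
        key: child
        for key, child in parent.items()
        if isinstance(child, dict) and name in child
    }
    if not candidates:
        return []
    top = max(child[name] for child in candidates.values())
    return [key for key, child in candidates.items() if child[name] == top]


top_names = max_names(prices_hash, "price")
top_prices = [prices_hash[key]["price"] for key in top_names]
print(top_names)   # ['whatchamacallit']
print(top_prices)  # [9.95]
```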
In the sample data, 1-dimensional Arrays are represented by prices_array and bad_prices_array. For this simple demonstration, we'll look at the good data.
Since 1-dimensional Arrays don't have any keys, searching them uses the empty-parameter list form of the [max()] search keyword. Such a search must be performed at the parent node of all elements under consideration, like so:
$ yaml-get --query='prices_array[max()]' max-examples.yaml
9.95
This search keyword will do its best to coalesce incompatible data-types so that they can be compared. No assumptions are made about the data, so when all elements under comparison are the same data-type -- numbers with numbers, text with text, and such -- the result is predictable. However, when incompatible data-types are all compared together, the results may be unexpected. This is because incompatible data-type comparisons are performed against their String equivalents; the result may seem unnatural. In this case, the character sorting locale of your Python run-time will dictate which String values are considered greater than others.
Take a look at what happens when the maximum of incompatible data is calculated against the bad sample data:
$ yaml-get --query='bad_prices_aoh[max(price)]' max-examples.yaml
{"product": "fob", "price": "not set"}
$ yaml-get --query='bad_prices_hash[max(price)]' max-examples.yaml
{"price": "not set"}
$ yaml-get --query='bad_prices_array[max()]' max-examples.yaml
not set
Notice that these results don't look anything like price amounts, but rather the arbitrary String value, "not set". When all of the available (bad) data is compared as String data, the lower-case "n" was considered to be the highest value among all of the options.
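The effect can be reproduced with a short Python sketch. It rests on assumptions and is not yamlpath's actual comparison code: the helpers is_greater and loose_max are hypothetical, null elements are simply skipped, and values that are not both numbers are compared through str(). Python's plain > on strings compares by Unicode code point, which is enough to show why "not set" wins here.

```python
from numbers import Number


def is_number(value):
    """Treat int and float as numbers, but not bool (True/False)."""
    return isinstance(value, Number) and not isinstance(value, bool)


def is_greater(candidate, current):
    """Return True when candidate should replace current as the maximum."""
    if is_number(candidate) and is_number(current):
        return candidate > current
    # Assumption: mixed data-types fall back to comparing their String forms.
    return str(candidate) > str(current)


def loose_max(values):
    """Return the maximum of values, skipping None (hypothetical helper)."""
    best = None
    for value in values:
        if value is None:
            continue
        if best is None or is_greater(value, best):
            best = value
    return best


good = [4.99, 4.99, 9.95, 0.98, None]
bad = [4.99, "not set", 9.95, 0.98, None]
print(loose_max(good))  # 9.95
print(loose_max(bad))   # not set
```

Once "not set" becomes the running maximum, every later number is compared as text, and digits sort below the lower-case "n".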
It should also be noted that the maximum of any single element is itself and this cannot be inverted:
$ yaml-get --query='bare[max()]' max-examples.yaml
value
$ yaml-get --query='bare[!max()]' max-examples.yaml
CRITICAL: Required YAML Path does not match any nodes, 'bare[!max()]'.
This requires version 3.6.0 or higher.
For fun, what if you wanted only the second-maximum (second-highest) matches? Using a simple Collector -- and continuing to demonstrate against both Array-of-Hash and Hash-of-Hash groups -- you can easily get the maximum of all data that is not the maximum, like so:
$ yaml-get --query='(prices_aoh[!max(price)])[max(price)]' max-examples.yaml
{"product": "doohickey", "price": 4.99}
{"product": "fob", "price": 4.99}
$ yaml-get --query='(prices_hash[!max(price)])[max(price)]' max-examples.yaml
{"price": 4.99}
{"price": 4.99}
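The same idea, the maximum of everything that is not the maximum, can be sketched in Python. This is only an illustration of the logic, not how yamlpath evaluates Collectors; the helpers max_of_hashes and second_max_of_hashes are hypothetical, and prices_aoh is the sample data copied inline.

```python
def max_of_hashes(children, name):
    """Return every child Hash whose `name` key holds the maximum value."""
    candidates = [c for c in children if isinstance(c, dict) and name in c]
    if not candidates:
        return []
    top = max(c[name] for c in candidates)
    return [c for c in candidates if c[name] == top]


def second_max_of_hashes(children, name):
    """Return every child sharing the second-highest `name` value."""
    highest = max_of_hashes(children, name)
    # Drop the first-maximum matches, then take the maximum of what remains.
    remainder = [c for c in children if c not in highest]
    return max_of_hashes(remainder, name)


prices_aoh = [
    {"product": "doohickey", "price": 4.99},
    {"product": "fob", "price": 4.99},
    {"product": "whatchamacallit", "price": 9.95},
    {"product": "widget", "price": 0.98},
    {"product": "unknown"},
]

second = second_max_of_hashes(prices_aoh, "price")
for match in second:
    print(match)
# {'product': 'doohickey', 'price': 4.99}
# {'product': 'fob', 'price': 4.99}
```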
Should you have a more complex use-case, you could also employ Collector Math to subtract the first-maximum (the highest) match from some source data and then get the maximum of the remaining entries, like so:
$ yaml-get --query='(prices_aoh)-(prices_aoh[max(price)])[max(price)]' max-examples.yaml
{"product": "doohickey", "price": 4.99}
{"product": "fob", "price": 4.99}
$ # Notice that Hash-of-Hash results require an extra wildcard segment due to the nature of the collected data
$ yaml-get --query='(prices_hash)-(prices_hash[max(price)]).*[max(price)]' max-examples.yaml
{"price": 4.99}
{"price": 4.99}
Notice in either case there are two matches. This is because both happen to share the same maximum "price" after the first-maximum has been removed from evaluation. If you wanted only the first of these matches, Collect the output and access the first result via an Array Index Segment:
$ yaml-get --query='((prices_aoh[!max(price)])[max(price)])[0]' max-examples.yaml
{"product": "doohickey", "price": 4.99}
$ yaml-get --query='((prices_hash[!max(price)])[max(price)])[0]' max-examples.yaml
{"price": 4.99}
As before, if you really only wanted to see the second-highest price, add another Hash Key Segment to select the price node of the returned Hash (map/dict):
$ yaml-get --query='((prices_aoh[!max(price)])[max(price)])[0].price' max-examples.yaml
4.99
$ yaml-get --query='((prices_hash[!max(price)])[max(price)])[0].price' max-examples.yaml
4.99