Revise YAML species definitions to support multiple thermo models #20

Closed
speth opened this issue Jan 8, 2020 · 23 comments
Labels
feature-request New feature request question Further information is requested

Comments

@speth
Member

speth commented Jan 8, 2020

Abstract

The aim of this enhancement is to improve the organization of thermo data within YAML "species" entries to allow better re-use with multiple phase thermo models and simplify serialization. Comments on the possible solutions would be greatly appreciated.

Motivation

Two problematic cases have been identified with how the species entries are currently organized. The first comes from a discussion on Cantera/cantera#641 about how to store model-specific parameters for multiple models (e.g. Redlich-Kwong and Peng-Robinson coefficients) within a single species definition. The second is one I noticed while working on the implementation of #11: (a) species data needed by the ThermoPhase is almost always in the equation-of-state field, with the exception of several fields used in the Debye-Huckel model, and (b) we are using the equation-of-state field both for setting up ThermoPhase objects and for setting up PDSS objects.

Two species definitions for illustration:

- name: H2  # species for Redlich-Kwong phase
  composition: {H: 2}
  thermo:
    model: NASA7
    temperature-ranges: [200.0, 1000.0, 3500.0]
    data:
    - [2.34433112, 0.00798052075, -1.9478151e-05, 2.01572094e-08, -7.37611761e-12,
      -917.935173, 0.683010238]
    - [3.3372792, -4.94024731e-05, 4.99456778e-07, -1.79566394e-10, 2.00255376e-14,
      -950.158922, -3.20502331]
  equation-of-state:
    model: Redlich-Kwong
    units: {length: cm, pressure: bar, quantity: mol}
    a: [3.0e+08, -3.30e+06]
    b: 31.0
- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  thermo:
    model: piecewise-Gibbs
    h0: -96.03E3 cal/mol
    dimensionless: true
    data: {298.15: -174.5057463, 333.15: -174.5057463}
  equation-of-state:
    model: constant-volume
    molar-volume: 1.3
  electrolyte-species-type: weak-acid-associated  # top-level field used only by Debye-Huckel
  weak-acid-charge: -1.0  # top-level field used only by Debye-Huckel

Possible Solutions

There are several possible solutions, with different pros and cons. In the examples below, the thermo field for each species has been elided for simplicity.

Update: Based on the discussion with @ischoegl some of the pros/cons have been updated, and a 4th option has been introduced. I think the newly-introduced Option 4 is my current preference.

Option 1: Allow equation-of-state to be a list, and put all phase-specific thermo data in an entry in this list

- name: H2  # species for Redlich-Kwong phase
  composition: {H: 2}
  equation-of-state:
  - model: Redlich-Kwong
    units: {length: cm, pressure: bar, quantity: mol}
    a: [3.0e+08, -3.30e+06]
    b: 31.0
  - model: Peng-Robinson
    units: {length: cm, pressure: bar, quantity: mol}
    a: [2.1e+08, -4.50e+06]
    b: 27.0
- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  equation-of-state:
  - model: constant-volume
    molar-volume: 1.3
  - model: Debye-Huckel
    electrolyte-species-type: weak-acid-associated
    weak-acid-charge: -1.0

Pros:

  • No change required for species with only a single equation-of-state field
  • Suggests a parallel route to enabling multiple sets of species transport data

Cons:

  • Implementation complexity due to intermingling of data for setting up both ThermoPhase and PDSS models, and due to the fact that the equation-of-state field can be either a map or a list of maps.
  • Introduces strange behavior for the Debye-Huckel model, where two equation-of-state entries are used simultaneously
  • Mis-categorizes Debye-Huckel species thermo data, which isn't really used to define the equation of state
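The "either a map or a list of maps" complexity mentioned above can be contained in a small normalization step. The following is only a sketch, assuming the species entry has already been parsed into plain Python dicts and lists; the helper name `eos_entries` is hypothetical and not part of Cantera.

```python
def eos_entries(species):
    """Return a species' equation-of-state data as a list of maps.

    Accepts both the current single-map form and the proposed list form,
    so downstream code only ever deals with a list.
    """
    eos = species.get("equation-of-state")
    if eos is None:
        return []
    if isinstance(eos, dict):
        return [eos]  # current form: a single map
    return list(eos)  # proposed form: a list of maps


# Parsed species entries, using the values from the Option 1 example
h2 = {
    "name": "H2",
    "composition": {"H": 2},
    "equation-of-state": [
        {"model": "Redlich-Kwong", "a": [3.0e08, -3.30e06], "b": 31.0},
        {"model": "Peng-Robinson", "a": [2.1e08, -4.50e06], "b": 27.0},
    ],
}
# A species that still uses the existing single-map form
nacl = {
    "name": "NaCl(aq)",
    "composition": {"Na": 1, "Cl": 1},
    "equation-of-state": {"model": "constant-volume", "molar-volume": 1.3},
}

print([entry["model"] for entry in eos_entries(h2)])
print([entry["model"] for entry in eos_entries(nacl)])
```

With this kind of helper, existing files with a single equation-of-state map would keep working unchanged.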

Option 2: Make equation-of-state a map of maps

- name: H2  # species for Redlich-Kwong phase
  composition: {H: 2}
  equation-of-state:
    Redlich-Kwong:
      units: {length: cm, pressure: bar, quantity: mol}
      a: [3.0e+08, -3.30e+06]
      b: 31.0
    Peng-Robinson:
      units: {length: cm, pressure: bar, quantity: mol}
      a: [2.1e+08, -4.50e+06]
      b: 27.0
- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  equation-of-state:
    constant-volume:
      molar-volume: 1.3
    Debye-Huckel:
      electrolyte-species-type: weak-acid-associated
      weak-acid-charge: -1.0

Pros:

  • Simpler implementation than option 1 due to consistent data structure

Cons:

  • Requires an extra layer of nesting, even though for most species there will only be one item in the equation-of-state map.
  • Inconsistent with the current structure of the transport and thermo nodes
  • Requires changes to all existing files that use the equation-of-state field
  • Introduces strange behavior for the Debye-Huckel model, where two equation-of-state entries are used simultaneously
  • Mis-categorizes Debye-Huckel species thermo data, which isn't really used to define the equation of state

Option 3: Use equation-of-state only for PDSS setup, and use a per-model top level field for phase-specific data

- name: H2  # species for Redlich-Kwong phase
  composition: {H: 2}
  Redlich-Kwong:
    units: {length: cm, pressure: bar, quantity: mol}
    a: [3.0e+08, -3.30e+06]
    b: 31.0
  Peng-Robinson:
    units: {length: cm, pressure: bar, quantity: mol}
    a: [2.1e+08, -4.50e+06]
    b: 27.0
- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  equation-of-state:
    model: constant-volume
    molar-volume: 1.3
  Debye-Huckel:
    electrolyte-species-type: weak-acid-associated
    weak-acid-charge: -1.0

Pros:

  • Separation of ThermoPhase and PDSS parameters makes implementation straightforward

Cons:

  • Requires changes for some existing files (mostly those using these two phase models)
  • Equation of state data ends up specified in two different ways depending on what should be considered an implementation detail (whether or not the phase uses the PDSS model)

Option 4: Allow lists in the equation-of-state section for multiple sets of equation of state parameters, of which one will be used with a particular phase. Create a new top-level key for Debye-Huckel parameters. This is a mix of options 1 and 3.

- name: H2  # species for Redlich-Kwong phase
  composition: {H: 2}
  equation-of-state:
  - model: Redlich-Kwong
    units: {length: cm, pressure: bar, quantity: mol}
    a: [3.0e+08, -3.30e+06]
    b: 31.0
  - model: Peng-Robinson
    units: {length: cm, pressure: bar, quantity: mol}
    a: [2.1e+08, -4.50e+06]
    b: 27.0
- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  equation-of-state:
  - model: constant-volume
    molar-volume: 1.3
  Debye-Huckel:
    electrolyte-species-type: weak-acid-associated
    weak-acid-charge: -1.0

Pros:

  • Allows equation of state parameters compatible with different thermo models, stored in a consistent location regardless of the thermo model type

Cons:

  • Some implementation complexity required to allow equation-of-state to be either a map or a list of maps.
  • Debye-Huckel parameters are still unique in using a top-level field of their own in the species definition
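As a rough illustration of how a phase might consume an Option 4 species entry, here is a sketch that picks the single equation-of-state entry matching the models a phase accepts and reads the Debye-Huckel parameters from their own top-level key. It assumes the entry is already parsed into dicts; `select_eos` and the `accepted_models` argument are hypothetical names, not existing Cantera functions.

```python
def select_eos(species, accepted_models):
    """Pick the one equation-of-state entry usable by a given phase model.

    Returns None if the species has no matching entry, and raises if more
    than one entry matches, since only one set should be in use at a time.
    """
    eos = species.get("equation-of-state", [])
    entries = [eos] if isinstance(eos, dict) else list(eos)
    matches = [e for e in entries if e.get("model") in accepted_models]
    if len(matches) > 1:
        raise ValueError(
            f"species {species['name']!r} has multiple matching "
            "equation-of-state entries"
        )
    return matches[0] if matches else None


h2 = {
    "name": "H2",
    "equation-of-state": [
        {"model": "Redlich-Kwong", "a": [3.0e08, -3.30e06], "b": 31.0},
        {"model": "Peng-Robinson", "a": [2.1e08, -4.50e06], "b": 27.0},
    ],
}
nacl = {
    "name": "NaCl(aq)",
    "equation-of-state": [{"model": "constant-volume", "molar-volume": 1.3}],
    "Debye-Huckel": {
        "electrolyte-species-type": "weak-acid-associated",
        "weak-acid-charge": -1.0,
    },
}

# A Peng-Robinson phase ignores the Redlich-Kwong entry
print(select_eos(h2, {"Peng-Robinson"}))
# A Debye-Huckel phase uses the PDSS entry plus its own top-level parameters
print(select_eos(nacl, {"constant-volume"}), nacl.get("Debye-Huckel", {}))
```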
@speth added the feature-request and question labels on Jan 8, 2020
@ischoegl
Member

ischoegl commented Jan 8, 2020

I'm not sure that the differentiation between PDSS (pressure dependent standard state) and ThermoPhase is sufficiently clear, as the former is just an aspect of the latter? I.e. Option 2 (or the less readable Option 1) would make more sense for users that are not familiar with 'under-the-hood' implementations. Without knowing the context, Option 3 is honestly confusing as it mixes two notations and is not consistent: at least in the YAML phase documentation, Debye-Huckel is just another thermo phase, so it should be an alternative to constant-volume, and stand on its own ... ?

As an aside, it is not clear why the current implementation uses top-level entries electrolyte-species-type, weak-acid-charge, ionic-radius for an otherwise specific ThermoPhase implementation (Debye-Huckel, see YAML species documentation). Shouldn't those be part of a Debye-Huckel entry in equation-of-state, as suggested in Options 1 and 2?

PS: there may be an Option 4, which eliminates the equation-of-state field in favor of per-model top level fields only (similar to Option 3, but shifting constant-volume up by one level).

@speth
Member Author

speth commented Jan 8, 2020

I'm not sure that the differentiation between PDSS (pressure dependent standard state) and ThermoPhase is sufficiently clear, as the former is just an aspect of the latter?

I find that there's very little that's actually clear when it comes to the PDSS classes in Cantera. PDSS models apply to individual species, and each species in a phase can use a different PDSS model. At least some of the PDSS models can be used with multiple different ThermoPhase models (of the ones derived from the VPStandardStateTP class). It's important to retain the ability to mix and match these models.

I had hoped that the name equation-of-state for the YAML field would be more helpful than pressure-dependent-standard-state, since in many instances, there is no pressure dependence, e.g. for constant volume species. Of course, things get a bit confused once you recognize that there are species where we have additional data needed to define their equation of state but the corresponding ThermoPhase doesn't use PDSS objects at all (e.g. Redlich-Kwong).

Debye-Huckel is just another thermo phase, so it should be an alternative to constant-volume, and stand on its own ... ?

The DebyeHuckel ThermoPhase class needs each species to have a PDSS model in order to get density information, which it gets from the constant-volume model in this example, but could be something else like density-temperature-polynomial. This phase also requires some additional species-specific information, which does not apply to any particular PDSS model.

Option 3 is honestly confusing as it mixes two notations and is not consistent

Having two different notations is kind of the point of this option. The equation-of-state field specifies one of several PDSS models, which should be used for any phase (at least, any of the ones derived from VPStandardStateTP). The other top-level fields may be used depending on the phase thermo model.

As an aside, it is not clear why the current implementation uses top-level entries electrolyte-species-type, weak-acid-charge, ionic-radius for an otherwise specific ThermoPhase implementation (Debye-Huckel, see YAML species documentation). Shouldn't those be part of a Debye-Huckel entry in equation-of-state, as suggested in Options 1 and 2?

Changing the location of these fields is one of the two main points of this enhancement proposal.

PS: there may be an Option 4, which eliminates the equation-of-state field in favor of per-model top level fields only (similar to Option 3, but shifting constant-volume up by one level).

This would make it difficult to create PDSS objects using the factory pattern, since you wouldn't know which top-level field of the species entry was the appropriate one. This issue also affects Options 1 and 2, to a lesser extent. And of course, I meant to say that suggestions for alternative structures are welcome.
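To make the factory-pattern concern concrete, here is a minimal sketch, not Cantera's actual implementation. The class `ConstantVolume`, the registry `PDSS_MODELS`, and both helper functions are hypothetical. With a fixed equation-of-state field, the factory can dispatch directly on its model value; with per-model top-level fields, the code has to scan every key of the species entry against the list of known PDSS models.

```python
class ConstantVolume:
    """Stand-in for a PDSS model with a fixed molar volume (hypothetical)."""

    def __init__(self, node):
        self.molar_volume = node["molar-volume"]


# Registry mapping the YAML model name to a constructor
PDSS_MODELS = {"constant-volume": ConstantVolume}


def new_pdss(species):
    """Fixed field: look in one known place and dispatch on 'model'."""
    node = species["equation-of-state"]
    return PDSS_MODELS[node["model"]](node)


def candidate_pdss_fields(species):
    """Per-model top-level fields: every key has to be checked."""
    return [key for key in species if key in PDSS_MODELS]


nacl = {
    "name": "NaCl(aq)",
    "equation-of-state": {"model": "constant-volume", "molar-volume": 1.3},
}
nacl_flat = {
    "name": "NaCl(aq)",
    "constant-volume": {"molar-volume": 1.3},
    "Debye-Huckel": {"weak-acid-charge": -1.0},
}

print(new_pdss(nacl).molar_volume)
print(candidate_pdss_fields(nacl_flat))
```

In the flattened form, nothing in the entry itself says which top-level key holds the PDSS data; that knowledge lives only in the registry.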

@ischoegl
Member

ischoegl commented Jan 8, 2020

@speth ... your explanations confirm some of my suspicions (especially regarding Debye-Huckel).

The DebyeHuckel ThermoPhase class needs each species to have a PDSS model in order to get density information, [...]

... would it make sense to merge the PDSS information into the Debye-Huckel entry, since the two aren't supposed to be alternate models? I.e.

- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  Debye-Huckel:
    equation-of-state: 
      model: constant-volume
      molar-volume: 1.3
    electrolyte-species-type: weak-acid-associated
    weak-acid-charge: -1.0

which would work as long as there aren't multiple PDSS specifications in the same file (not sure whether the current YAML approach would allow for this).

@speth
Member Author

speth commented Jan 8, 2020

An equation-of-state field is needed not just for the Debye-Huckel model, though, but to use this species definition with any phase model derived from VPStandardStateTP. I think this structure would make it difficult to instantiate the PDSS object in a suitably generic location in the code. What would you do with the equation-of-state fields for all of the other phase types that use PDSS objects?

@ischoegl
Member

ischoegl commented Jan 8, 2020

@speth ... what I suggested above was mostly in direct response to your explanation. I was also thinking about how things are instantiated from the UI (the user implicitly specifies ThermoPhase by selecting a phase name from the YAML input file, which would be Debye-Huckel, see debye-huckel-B-dot-ak in thermo-models.yaml): from that perspective, PDSS would be selected via the Debye-Huckel switch.

- name: NaCl(aq)  # species for Debye-Huckel phase
  composition: {Na: 1, Cl: 1}
  Debye-Huckel:
    equation-of-state: 
      model: constant-volume
      molar-volume: 1.3 
    electrolyte-species-type: weak-acid-associated
    weak-acid-charge: -1.0
  # other VPStandardStateTP derived phase models

I acknowledge that there may be redundant equation-of-state entries, but imho it is more important to create a clear input structure.

@speth
Member Author

speth commented Jan 8, 2020

So if you wanted to use this species definition with additional phase thermo models, you would have to add a new section to the species definition for each of them, even if all they required was the specification of a constant molar volume? That loses the level of reusability that even the current structure has. I think it's more likely to want to be able to re-use the same PDSS model with multiple phase thermo models than it is to want a different PDSS model for each phase model (and if that is needed, there's always the option of having a completely separate species definition).

My question about the other phase types that use PDSS models was mainly concerned with the fact that with the exception of Debye-Huckel, the other models that use PDSS species do not require any additional species-specific parameters. So you'd end up with what seems like an unnecessary extra level of nesting:

- name: KCl(l)
  composition: {K: 1, Cl: 1}
  Margules:
    equation-of-state:
      model: constant-volume
      molar-volume: 37.57 cm^3/gmol

This structure also makes it more difficult to construct the PDSS object for the species, since we don't know where the equation-of-state field is unless we have the ThermoPhase object on hand and can query it, and then we can only (easily) specify one place to look. In contrast, the current initialization code only needs to check that the phase is a VPStandardStateTP and then the correct PDSS object can be created directly.

I should also clarify that I don't think that these top-level sections should necessarily be paired 1 to 1 with phase thermo models. For instance, it might be useful to allow a critical-state field that contains the critical temperature and pressure, which could then be used to determine Redlich-Kwong or Peng-Robinson parameters if model-specific coefficients are not given.
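As a sketch of what such a fallback could compute, the standard Redlich-Kwong relations give a = 0.42748 R² Tc^2.5 / Pc and b = 0.08664 R Tc / Pc. The field names `critical-state`, `critical-temperature`, and `critical-pressure` below are assumptions for illustration, and the numerical values are only roughly those of hydrogen, not vetted data. Note that this yields a single constant `a` in SI units, whereas the examples above give `a` as a two-element list in other units.

```python
GAS_CONSTANT = 8.314462618  # J/(mol K)


def redlich_kwong_from_critical(t_crit, p_crit):
    """Estimate Redlich-Kwong a and b from critical temperature and pressure.

    t_crit in K, p_crit in Pa. Returns a in Pa m^6 K^0.5 / mol^2 and
    b in m^3 / mol.
    """
    a = 0.42748 * GAS_CONSTANT**2 * t_crit**2.5 / p_crit
    b = 0.08664 * GAS_CONSTANT * t_crit / p_crit
    return a, b


# Hypothetical species entry with a 'critical-state' field and no
# model-specific coefficients; values are illustrative only
h2 = {
    "name": "H2",
    "composition": {"H": 2},
    "critical-state": {
        "critical-temperature": 33.2,  # K
        "critical-pressure": 1.297e6,  # Pa
    },
}

crit = h2["critical-state"]
a, b = redlich_kwong_from_critical(
    crit["critical-temperature"], crit["critical-pressure"]
)
print(f"a = {a:.4g}, b = {b:.4g}")
```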

@ischoegl
Member

ischoegl commented Jan 8, 2020

@speth ... as long as there's a single PDSS for any species, it looks like Option 3 is indeed a decent solution then (I don't see how the other options you listed address what you outlined satisfactorily). The main thing is that the meaning of equation-of-state field is redefined, which was the biggest stumbling block here.

However, PDSS is not really clear from the YAML front-end (and I am not aware of a comprehensive list in the documentation, although my overview is admittedly spotty). Per your comment:

I find that there's very little that's actually clear when it comes to the PDSS classes in Cantera.

is there a way to make the YAML input more self-explanatory? There appears to be a shifting meaning of what is an equation of state, and what is a sub-(or super?)-model of an equation of state. Imho a cleaner delineation of meanings would go a long way (see also issue #6).

As an example, I'm wondering whether it wouldn't make sense to rename equation-of-state to standard-state (or standard-state-model, or simply retain PDSS from the C++ layer) to clarify that this field applies to more than one thermo phase (which is oftentimes synonymous to an EoS).

@speth
Member Author

speth commented Jan 9, 2020

is there a way to make the YAML input more self-explanatory? There appears to be a shifting meaning of what is an equation of state, and what is a sub-(or super?)-model of an equation of state. Imho a cleaner delineation of meanings would go a long way (see also issue #6).

As an example, I'm wondering whether it wouldn't make sense to rename equation-of-state to standard-state (or standard-state-model, or simply retain PDSS from the C++ layer) to clarify that this field applies to more than one thermo phase (which is oftentimes synonymous to an EoS).

There have been some changes in terminology, which I had hoped were an improvement over what was used in XML (and is still pervasive in the C++ code). While I understand the utility of referencing thermodynamic properties to a state that may not correspond to the canonical thermodynamic standard state of the pure substance at 298.15 K and 1 bar (e.g. referencing solutes to infinite dilution in a particular solvent), I still cannot parse what it means for the standard state to depend on pressure and temperature.

However, all of the so-called PDSS models do provide a pressure-volume-temperature relationship for the species, while in many cases (though admittedly not all), the additional information needed to compute thermodynamic properties (i.e. enthalpy, entropy, etc.) is not part of the PDSS model. This was my motivation for the use of the term equation-of-state here. I think the ThermoPhase model is more than just an "equation of state", since it implements not just the P-v-T relationship but also calculations for enthalpy, entropy, and other thermodynamic properties.

I guess one bit of complexity that has arisen here comes from the fact that for equations of state like Redlich-Kwong, there are additional species-specific properties needed, but that model doesn't happen to use the PDSS machinery. I had been hoping to treat this as an implementation detail rather than something that needs to be reflected in the input file. From this perspective, Options 1 and 2 do a better job than Option 3.

@ischoegl
Member

ischoegl commented Jan 9, 2020

OK, I'm starting to see what the motivation is, and can agree on the definition of equation-of-state (although it is unfortunately not clear in the C++ implementation, and R-K etc. straddle definitions). I'm still not convinced that Options 1 and 2 do better, as Debye-Huckel would not constitute an equation-of-state in the strict sense, as it instead implements aspects of ThermoPhase?

Nomenclature aside it may make sense to go with Option 3 as it best reflects the interdependence of classes. I.e. YAML could be used as a starting point for a clarification of the interdependence of equations of state, phases, etc, in documentation (also addressing issue #6). Users typically don't know how things are implemented, so the main point is to be clear in how objects depend on each other in an intuitive interface. Whether or not C++ is clear is a separate issue.

Alternatively, there are solutions that combine aspects of Options 1/2 and 3 (i.e. clarify that Debye-Huckel is not an EoS by listing it separately from equation-of-state). Those should be my 2 cents.

@speth
Member Author

speth commented Jan 9, 2020

OK, I can see that putting the extra Debye-Huckel fields under the equation-of-state field would be confusing, to the extent that they aren't part of the equation of state, and presumably only one set of equation of state data should be in use at a time (even if there is some way of having multiple sets of parameters that could be used by different phase-level thermo models). I guess the issue then is that these Debye-Huckel parameters are unlike anything else we currently have: species-specific parameters of the thermo model that do not define the species equation of state.

In that case, I can see the appeal of taking the approach of Options 1 or 2 for allowing multiple sets of equation-of-state parameters. The logic would then be that for any given ThermoPhase model, at most one set of those parameters would be used, either directly by the ThermoPhase class or indirectly through a PDSS object. But for the Debye-Huckel parameters, use Option 3.

@ischoegl
Member

ischoegl commented Jan 9, 2020

👍 Yes, at least to me that would clarify the interdependence.

I guess the issue then is that these Debye-Huckel parameters are unlike anything else we currently have: species-specific parameters of the thermo model that do not define the species equation of state.

That’s not necessarily a disadvantage if it clarifies the overall structure / interdependence. There are likely other tweaks, but that’s as far as my feedback goes.

@speth
Member Author

speth commented Jan 9, 2020

OK, thanks for the discussion. I've updated some of the pros/cons, and introduced Option 4 (combination of 1 and 3) to provide an easier starting point for others who might want to offer their thoughts.

@decaluwe
Member

decaluwe commented Jan 9, 2020

Well, this quickly escalated past something I can readily comment on 🤣 (i.e. the details of models like PDSS et al. are currently beyond me.)

Regardless, comment I will:

  • Naming: I think of the equation-of-state field as providing parameters required by the EoS. Personally, I find it to be suitably clear, as used. standard-state, imho, would be particularly poorly suited for the Redlich-Kwong and Peng-Robinson models, since these parameters relate not to the standard state, but to the EoS's departure terms.
  • Preferred option: From a user standpoint, I find option 1 to be the easiest to understand. Option 3 would be my second preference, although here I wonder if there is a naming convention for entries like Peng-Robinson or Redlich-Kwong to indicate that these are entries for required parameters. Would something like Peng-Robinson-parameters or somesuch be too long?
  • I will also say that I find all of these suitable, so I will cede primacy to whichever one causes the fewest implementation headaches.

The last thing on my mind which is partially relevant: the Peng-Robinson model also requires the acentric factor. So even if a user provides critical properties, under the current structure they will still need to provide the acentric factor somewhere. I think this should be fine under the current phase construction protocol (i.e. load any user-provided species params, then scan the critical properties database to attempt to fill any unspecified a and b params), but just wanted to raise the issue in case there were unanticipated issues that I'm unaware of.

Lastly, the acentric factor is sort of a headache, regardless, in that it can be both a transport property (for highPressureGas) and a thermo property (for Peng-Robinson). Down the road, might there be a way to load these types of fundamental properties in a more general way that can be utilized as needed by whichever routines need it?

@decaluwe
Member

decaluwe commented Jan 9, 2020

Lol - the conversation evolved beneath my feet, as I was typing that.

Just wanted to note that Option 4 is equivalently suitable to Option 1, from my standpoint.

@speth
Member Author

speth commented Jan 9, 2020

Thanks, @decaluwe. I was vaguely aware that the acentric factor came into play in the Soave-Redlich-Kwong equation, but didn't realize that it was used in the Peng-Robinson equation as well. Given that it is a physical property of the species, not just a model parameter, and that it is used in several equations of state as well as transport models, I think there's a case to be made that it could be included as a top-level field in the species definition. Or perhaps, given the close relationship to the critical properties, as part of a field that also includes those values.

@ischoegl
Member

ischoegl commented Jan 10, 2020

A minor remark following up on @speth's last comment: using that rationale, electrolyte-species-type, weak-acid-charge, and ionic-radius could likewise be viewed as physical properties of the species (rather than a model parameter), in which case the Debye-Huckel field is moot (It just so happens that it is the only model using those)? This would mean that the entry won't look different from the current version?

@speth
Member Author

speth commented Jan 10, 2020

I get the impression that electrolyte-species-type and weak-acid-charge are kind of particular to the Debye-Huckel model, and perhaps even to the specific implementation in Cantera. The ionic radius might be a more general physical property, but I'm a little concerned about whether there are different definitions that could be needed for use with different models, and for that reason tying it to the Debye-Huckel model seems like the safer choice.

@decaluwe
Member

Yeah, the acentric factor is used to calculate kappa, which is used to calculate alpha, which shows up in the Peng-Robinson equation.
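For reference, that chain can be written out using the standard Peng-Robinson correlation, kappa = 0.37464 + 1.54226 w - 0.26992 w² and alpha = [1 + kappa (1 - sqrt(T/Tc))]². This is a sketch of the textbook form only; the function names are hypothetical and the input values are arbitrary, and nothing here is taken from Cantera's implementation.

```python
import math


def peng_robinson_kappa(acentric_factor):
    """kappa as a function of the acentric factor (standard PR correlation)."""
    w = acentric_factor
    return 0.37464 + 1.54226 * w - 0.26992 * w**2


def peng_robinson_alpha(temperature, t_crit, acentric_factor):
    """Temperature-dependent alpha multiplying the attraction parameter."""
    kappa = peng_robinson_kappa(acentric_factor)
    return (1.0 + kappa * (1.0 - math.sqrt(temperature / t_crit))) ** 2


# Arbitrary illustrative inputs, not data for any particular species
print(peng_robinson_kappa(0.2))
print(peng_robinson_alpha(250.0, 300.0, 0.2))
```

By construction, alpha equals 1 at the critical temperature, so the acentric factor only matters away from Tc.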

I think the answer of "what to do with the acentric factor?" depends on if there will be other similar such properties. Having a loose acentric-factor floating around as a top-level entry seems sort of arbitrary and hard for a user to predict. But if there will (eventually) be several, would a map of physical-props be worthwhile?

If not, then I would say that storing it with the critical properties is a good approach. This probably simplifies the implementation, too. If no P-R constants (a, b, and w_ac) are provided, first look to see if a critical-props field was provided, and if not, then scan the critical properties database.
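That lookup order could be sketched roughly as follows. Everything here is hypothetical: the function name, the `acentric-factor` key inside the equation-of-state entry, the `critical-props` field contents, and the dict standing in for the critical properties database, with placeholder values throughout.

```python
def peng_robinson_source(species, database):
    """Decide where the Peng-Robinson inputs for a species come from.

    Order: explicit model constants, then a critical-props field on the
    species, then the critical properties database.
    """
    eos = species.get("equation-of-state", [])
    entries = [eos] if isinstance(eos, dict) else list(eos)
    for entry in entries:
        if entry.get("model") == "Peng-Robinson" and all(
            key in entry for key in ("a", "b", "acentric-factor")
        ):
            return "equation-of-state", entry
    if "critical-props" in species:
        return "critical-props", species["critical-props"]
    if species["name"] in database:
        return "database", database[species["name"]]
    raise KeyError(f"no Peng-Robinson data for species {species['name']!r}")


# Stand-in for the critical properties database (placeholder values)
database = {"X": {"critical-temperature": 300.0, "critical-pressure": 5.0e6,
                  "acentric-factor": 0.1}}

explicit = {"name": "X", "equation-of-state": {
    "model": "Peng-Robinson", "a": 1.0, "b": 2.0, "acentric-factor": 0.1}}
with_crit = {"name": "X", "critical-props": {
    "critical-temperature": 310.0, "critical-pressure": 5.1e6,
    "acentric-factor": 0.12}}
bare = {"name": "X"}

for entry in (explicit, with_crit, bare):
    print(peng_robinson_source(entry, database)[0])
```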

One downside, regardless of how we answer this question at the YAML level: I need to add acentric factors for all those species in the critProperties.xml database. Also, I'm guessing I ought to convert this database to YAML?

@speth
Member Author

speth commented Jan 10, 2020

A physical-properties field would be a possibility, although I wonder if that would lead to a bit of confusion for species where we have the density set under the equation-of-state field. I'd lean toward having the field be named critical-properties, and contain the critical-temperature, critical-pressure, critical-volume, and acentric-factor, even if the acentric factor isn't strictly a property of the critical state.

Yes, we should convert the critProperties.xml file to YAML, although that doesn't necessarily need to be done immediately. When we do this, though, I think we should make sure to have the entries in that file use the same structure as species entries in a "normal" YAML file.

@ischoegl
Member

ischoegl commented Jan 10, 2020

👍 on the critical-properties avenue, although a less specific name to accommodate/future-proof other parameters (?) may be good.

@bryanwweber
Member

@speth Since Cantera/cantera#795 is merged, what's the status of this issue now? Can it be closed?

@speth
Member Author

speth commented Apr 5, 2021

Most of this is resolved by that PR, but I didn't want to lose this last idea of creating a critical-properties field inside a species entry, specifically for use when migrating the critProperties.xml file to YAML. I guess this could be moved to a new issue, probably on Cantera/cantera, regarding the need to replace critProperties.xml given the deprecation of XML within Cantera.

@speth
Member Author

speth commented Jul 8, 2021

I'm closing this since #107 now covers the one remaining issue of finding a home for critical property information.

@speth speth closed this as completed Jul 8, 2021

4 participants