Support no-date citations #70

Closed
rudolf-adamkovic opened this issue Nov 17, 2021 · 32 comments

Comments

@rudolf-adamkovic

rudolf-adamkovic commented Nov 17, 2021

In APA Style, citing sources with no date BibTeX field should render as $AUTHOR (n.d.) instead of $AUTHOR.

Tested with Pandoc and CSL: https://github.com/citation-style-language/styles/blob/master/apa.csl

@rudolf-adamkovic rudolf-adamkovic changed the title from "Fix no-date citations in APA Style" to "Support no-date citations" Nov 17, 2021
@andras-simonyi
Owner

andras-simonyi commented Nov 17, 2021

This is actually an interesting case: the relevant part of the style is this (in the else branch of a conditional checking whether there is a filled-in issued or at least status field):

<else>
     <group>
          <text term="no date" form="short"/>
          <text variable="year-suffix" prefix="-"/>
     </group>
</else>

according to the standard,

cs:group implicitly acts as a conditional: cs:group and its child elements are suppressed if a) at least one rendering element in cs:group calls a variable (either directly or via a macro), and b) all variables that are called are empty.

on my reading this means that the standard-compliant rendering is to suppress the term when year-suffix is empty, i.e., the item isn't disambiguated. If Pandoc's citeproc renders the no date term even for non-disambiguated items then it diverges from the standard AFAICS. Or is year-suffix somehow a special case? @denismaier, @bdarcus could you comment on this?
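
To make the reading above concrete, here is a minimal Python sketch of the quoted cs:group rule. It is a toy model, not code from citeproc-el, Pandoc, or any other processor; the names `Text`, `Var`, and `render_group` are hypothetical, and the style's `prefix="-"` is folded into the variable's value for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Text:
    """A literal term or value; it never counts as a variable call."""
    content: str


@dataclass
class Var:
    """A variable lookup; `value` is None when the variable is empty."""
    name: str
    value: Optional[str]


def render_group(children) -> str:
    """Render a group under the implicit-conditional rule quoted above."""
    called = [c for c in children if isinstance(c, Var)]
    # Suppress when at least one variable is called and every called one is empty.
    if called and all(v.value is None for v in called):
        return ""
    parts = []
    for c in children:
        if isinstance(c, Text):
            parts.append(c.content)
        elif c.value is not None:
            parts.append(c.value)
    return "".join(parts)


# The APA else-branch for an item that was not disambiguated.
undisambiguated = render_group([Text("n.d."), Var("year-suffix", None)])
# The same branch once disambiguation has assigned a suffix.
disambiguated = render_group([Text("n.d."), Var("year-suffix", "-a")])
```

Under this strict reading the undisambiguated item renders nothing at all, which is the behaviour reported in this issue.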

@bdarcus
Collaborator

bdarcus commented Nov 17, 2021

I believe @bwiernik has coded a lot of the APA style, so perhaps he can weigh in.

But at first glance, I agree this doesn't look like a citeproc bug (though does seem like a style bug). Still, surprising if a bug in the style, given how widely it's used.

@bwiernik

Hmm, it may be that both pandoc and citeproc-js diverge from the spec on this point or treat year-suffix specially, as the style works correctly with both of them. Let me look into it. There doesn't at first glance look like a need for a group here.

@bdarcus
Collaborator

bdarcus commented Nov 17, 2021

There doesn't at first glance look like a need for a group here.

What's interesting is the group is not there in the only other place "no date" shows up in the style.

https://github.com/citation-style-language/styles/blob/0710b5153960b5f79a3318c6f2e5b22c3110a5f9/apa.csl#L575-L578

@andras-simonyi
Owner

andras-simonyi commented Nov 17, 2021

What's interesting is the group is not there in the only other place "no date" shows up in the style.

I'm also looking at the other, groupless "no date" occurrence and citeproc-el is actually suppressing it as well -- I've a vague memory that in certain cases the macro element (?) also has this implicit conditional behavior, at least if one looks at some of the test-suite examples. Within the context of citeproc-el the simplest solution would probably be just making an exception for year-suffix, if this doesn't break something else.

bwiernik added a commit to bwiernik/styles that referenced this issue Nov 17, 2021
Unnecessary and removing ensures good performance across citeprocs

andras-simonyi/citeproc-el#70 (comment)
@bwiernik

I opened a PR for the style to remove that group

@andras-simonyi
Owner

meanwhile I've found this comment in the citeproc-js source:

if (variable === "year-suffix") {
    // year-suffix always signals that it produces output,
    // even when it doesn't. This permits it to be used with
    // the "no date" term inside a group used exclusively
    // to control formatting.

this kind of settles the issue for me. OTOH, maybe this special status of year-suffix should be mentioned in the standard?
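
As a rough illustration of what that comment describes, the sketch below exempts year-suffix from the "all called variables are empty" check. It is not the citeproc-js implementation; `render_group`, the tuple encoding, and the `always_present` parameter are hypothetical names chosen for this example.

```python
def render_group(children, always_present=("year-suffix",)):
    """Render a group; children are ("text", s) or ("var", name, value_or_None)."""
    called = [c for c in children if c[0] == "var"]
    # A variable "signals output" if it has a value or is on the exemption list.
    produced = [c for c in called if c[2] is not None or c[1] in always_present]
    if called and not produced:
        return ""
    return "".join(c[1] if c[0] == "text" else (c[2] or "") for c in children)


branch = [("text", "n.d."), ("var", "year-suffix", None)]
no_date = render_group(branch)                    # exemption active
strict = render_group(branch, always_present=())  # plain reading of the spec
```

With the exemption the term survives for a non-disambiguated item; without it the group is suppressed.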

@bdarcus
Collaborator

bdarcus commented Nov 17, 2021

meanwhile I've found this comment in the citeproc-js source:

I'd probably need to think about it more, but doesn't that sound a bit hackish?

Or at least I'm not clear on it; what does that last clause mean?

I wonder how citeproc-rs (which is basically a clean rewrite in rust) handles this ...

@denismaier
Collaborator

Well, let's ask @cormacrelf

@denismaier
Collaborator

But I agree that's hackish. Could we convert no date into some sort of substitute for a missing year? I mean in a future release.

@bdarcus
Collaborator

bdarcus commented Nov 17, 2021

But I agree that's hackish. Could we convert no date into some sort of substitute for a missing year? I mean in a future release.

It looks like that's what cormac is doing.

https://github.com/zotero/citeproc-rs/blob/19f26ddfaaf9eb46d7d075e5e9accea1a494fefd/crates/proc/src/group.rs#L29

Is there something in our spec that is ambiguous, that needs to be changed here?

Or is this just a citeproc-js thing?

My hunch is the latter; that it's an internal detail.

@andras-simonyi
Owner

andras-simonyi commented Nov 17, 2021

AFAICS the problem is that the year-suffix variable has a genuinely special rendering status for items with no date: the accompanying term (no date) has to be rendered regardless of whether the variable is empty or not. In other words, the term-variable rendering dependency is exactly the opposite of the typical (maybe all other??) cases.

@bdarcus
Collaborator

bdarcus commented Nov 17, 2021

AFAICS the problem is that the year-suffix variable has a genuinely special rendering status for items with no date: the accompanying term (no date) has to be rendered regardless of whether the variable is empty or not. In other words, the term-variable rendering dependency is exactly the opposite of the typical (maybe all other??) cases.

Ah, right. In effect, the value for a nil date is not nil.

Citations can be such a PITA.

@bwiernik

bwiernik commented Nov 17, 2021

I think the easiest approach would be to make a "no-date" variable which is empty if issued is present and renders the no-date term otherwise.

That would remove the need for special treatment of year-suffix entirely and can be fixed in existing styles by a batch change.

@andras-simonyi
Owner

andras-simonyi commented Nov 17, 2021

As a temporary solution I've merged a PR hopefully fixing this particular issue by not reporting an empty variable if the variable happens to be the year-suffix. @salutis, could you check? I'll revisit the code if the above suggestion or an alternative gets implemented in the standard and/or in the styles.

@cormacrelf

Bruce has the right tack on what citeproc-rs is doing (also note UnresolvedMissing comes from mixing with Unresolved, which according to a grep on the codebase is not used for anything else other than year-suffix).

For any implementation, it is important to recognise that the rendering of a cite changes as it goes through disambiguation, and that implicit conditionals are a part of the rendering process. Variables that were initially found to be empty may no longer be, and so implicit conditionals that were once implicitly suppressed may no longer be. If your model of the renderer is a completely pure function of (variables, disambiguation progress) => HTML then this will be trivial. citeproc-rs, on the other hand, uses trees with enough information stored in each node, and a way to delay and later resolve a variable's presence, and a procedure for propagating that resolution upward through any implicit conditionals above it in the tree. But the spec need not bless a particular way of implementing this.

As has been suggested, the most helpful addition you could make to the spec would be that the implicit conditional part include a mention of year-suffix, as the delayed resolution of its presence or absence is the only reason this happens (in current CSL at least). Something like

As a result of disambiguation, a variable that was initially empty (in particular year-suffix) may no longer be empty. In this case, groups that were initially implicitly suppressed as a result of that variable being empty will no longer be suppressed.
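
The "pure function" model mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not how citeproc-rs works (it uses trees with delayed resolution, as described); `render_date_part` and the dictionary layout are hypothetical, and the else-branch hard-codes a group made of the no date term plus year-suffix with a "-" prefix.

```python
def render_date_part(variables: dict, year_suffix=None) -> str:
    """Pure function of the item's variables and the current disambiguation state."""
    if variables.get("issued"):
        return variables["issued"] + (year_suffix or "")
    # Else-branch: a group holding the "no date" term and year-suffix.
    if year_suffix is None:
        # The only variable called is empty, so the group is suppressed.
        return ""
    return "n.d.-" + year_suffix


item = {"author": "Doe"}
first_pass = render_date_part(item)                    # before disambiguation
second_pass = render_date_part(item, year_suffix="a")  # after a suffix is assigned
```

Re-running the function after disambiguation is what brings the previously suppressed group back.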

@cormacrelf

cormacrelf commented Nov 18, 2021

@bwiernik The no-date solution is neither necessary nor sufficient:

  • You could use <if match="none" variable="issued"> to achieve that already
  • year-suffix can appear in arbitrary positions, including just for formatting a normal year suffix, or after a name, etc

The problem statement is incorrect:

AFAICS the problem is that the year-suffix variable has a genuinely special rendering status for items with no date

It has a genuinely special rendering status, with no qualifications on that. Any interaction with missing issued is a coincidence. This is illustrated by a final complication:

  • This should all happen with <if disambiguate="true"> as well.

It is just that this device is much rarer, and is not really tested AFAIK. Nevertheless, I believe a variable rendered inside such a conditional should technically be able to wake a ghost group from the dead.

@bwiernik

If citeproc writers are happy to treat year-suffix as always being non-empty, even if it renders nothing, that's fine by me.

@cormacrelf

I disagree. You need it to work with implicit conditionals. Treating it as always non-empty means that the example from above with the grouped term + year-suffix would need an explicit conditional, which is in my view a breaking change to the implicit conditional behaviour promised by the spec. It looks like it would be a very involved review of the styles in the repo to rectify. For example here it would result in empty renders (should be CSL ERROR etc) showing up as : , without amendment. Further:

  • As I pointed out above, it doesn't save any work for implementation, since those pesky arbitrary conditionals still need to be able to produce output, and they work essentially the same way.
  • I think it's a perfectly reasonable thing for the spec to require of implementations. It would be freakishly complicated that implicit conditionals put terms/plain text on equal footing with affixes, in all situations except when near a year-suffix. It undermines the feature as a whole, which already feels a little scary to rely on. You may as well remove the entire thing and make people spell out every variable in a branch, if you're going to make it that hard to use confidently.
  • There are perfectly valid use cases where implicit conditionals are needed for effective use of year-suffix, beyond what APA and its ilk do. Not everyone's using it in an else-branch of an issued macro. (Note also that year-suffix is completely disentangled from dates, you can use it in any position with or without a date, e.g. attached to a title.) One use case:
<group>
  <text term="circa" /> <!-- please don't render me unnecessarily! -->
  <text macro="date-year" /> <!-- renders one of any number of date variables, or some other term -->
  <text variable="year-suffix" /> <!-- can't feasibly test all those date variables to guard this -->
</group>

That's almost what's going on here

@cormacrelf

cormacrelf commented Nov 18, 2021

Ah, it goes even further: year-suffix is subject to multiple-use suppression. You cannot reliably know, even by testing issued and every other date variable it's meant to appear next to, whether any particular year-suffix ought to get rendered with unconditional surrounding plaintext. You can put five year-suffixes in a row (e.g. via five different date macros), and only the first one will end up non-empty. The rest should suppress their surrounding plaintext, but if they're always non-empty, then you would get e.g. n.d.-a ... n.d. n.d. n.d. n.d.. No combination of date variable tests can differentiate this.
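
A small Python sketch of the five-in-a-row case, under the assumption stated above that only the first year-suffix call yields a value. The function `render_slots` and its parameters are hypothetical and do not correspond to any processor's API.

```python
def render_slots(suffix: str, slots: int, always_non_empty: bool) -> str:
    """Render `slots` copies of a group holding the no-date term and year-suffix.

    Only the first year-suffix call yields a value; later calls are empty.
    """
    out = []
    for i in range(slots):
        value = suffix if i == 0 else None
        if value is None and not always_non_empty:
            continue  # the implicit conditional suppresses the whole group
        out.append("n.d." + ("-" + value if value else ""))
    return " ".join(out)


with_implicit = render_slots("a", 5, always_non_empty=False)
with_exception = render_slots("a", 5, always_non_empty=True)
```

Here `with_implicit` is `n.d.-a`, while `with_exception` is `n.d.-a n.d. n.d. n.d. n.d.`, the stray output described above.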

This would mix especially poorly with e.g. APA-style date macros called multiple times due to cs:substitute, which would result in suppression of issued, and the second macro call to render the else branch and therefore an extraneous unconditional n.d..

If anything I think year-suffix is the one variable where implicit conditionals are strictly necessary as the only way to conditionally suppress surrounding plaintext output. Every other variable in an implicit conditional can be translated into a (verbose, difficult to maintain) bunch of explicit if statements. (This does raise the question of what it means to test <if variable="year-suffix">, but perhaps that's best considered undefined, it would be far more complicated still.)

@bwiernik

Cormac, I think we are thinking about the issue on weirdly different levels, so let's zoom out.

What is the issue we are trying to resolve?

I thought the problem was the conflict of the general "groups are suppressed if all variables are empty" logic with the expectation related to year-suffix.

I offered 2 solutions to this unique exception. Either we introduce a new date variable that can avoid the group suppression logic (which is necessary and sufficient to resolve the issue at hand), or we short circuit the group-related suppression logic for the unique case that is year-suffix, which is efficiently addressed by treating it as always non-empty.

There seems to be some broader issue about variables we are disagreeing on. Can you elaborate about what your concern is?

@bdarcus
Collaborator

bdarcus commented Nov 18, 2021

I don't fully understand the Haskell code, but here seems to be where the new Haskell library handles it.

https://github.com/jgm/citeproc/blob/4a7b98afabebd7a074489ba500d68ee6aa75d3a8/src/Citeproc/Eval.hs#L1644

@denismaier
Collaborator

denismaier commented Nov 18, 2021

Has anyone actually checked pandoc's behaviour? I don't understand how etext affects egroup. @jgm?

@denismaier
Collaborator

denismaier commented Nov 18, 2021

Ok, it does work correctly.
And I think it's implemented here:
https://github.com/jgm/citeproc/blob/4a7b98afabebd7a074489ba500d68ee6aa75d3a8/src/Citeproc/Eval.hs#L1549
year-suffix is not counted as a variable.

adam3smith pushed a commit to citation-style-language/styles that referenced this issue Nov 19, 2021
Unnecessary and removing ensures good performance across citeprocs

andras-simonyi/citeproc-el#70 (comment)
@rudolf-adamkovic
Author

@andras-simonyi

@salutis, could you check?

It works beautifully. Thank you!

@cormacrelf

cormacrelf commented Nov 21, 2021

Ok, a couple of clarifications:

  • citeproc-rs didn't intend to treat year-suffix as always present. Somehow a combination of things I've done has roughly that effect, which I consider a bug. You should be able to do <group><text value="attached plain text" /> <text variable="year-suffix" /></group> and have the implicit group-suppression do its thing, that's what I was trying to permit, and I have devised a fix.
  • I did indeed misunderstand what was being solved, because we're all talking about year-suffix but I didn't actually see any reason it needs to be special, apart from the whole disambiguation thing where its 'empty/non-empty' status changes as you disambiguate. But that wasn't the problem, I see that.
  • The most relevant special thing about it is that it is often used next to a no date term, and despite no date standing in for what would otherwise be a rendered date variable, terms do not act as variables for implicit conditional groups. I believe that is the core problem here.
  • This points to a much better solution: treat the no date term as a non-empty variable for implicit conditionals.
  • I think this addresses the problem much better than dealing with year-suffix does. It is also better to be adding to the behaviour than making a very subtle exception that one must watch out for.

For a thorny example of why year-suffix always acting non-empty (despite mostly being literally empty) is worse:

<group>
    <text value="prefix-" />
    <date variable="issued" />
    <text variable="year-suffix" />
</group>

You don't want a lone prefix- hanging around in the output. That is possible because year-suffix can obviously end up being empty. The attached text here should be rendered whenever either of issued or year-suffix has a value, but other than that, never. This means people have to very carefully write out their groupings such that this cannot happen.

Whereas with no date acting as a variable, it's impossible to get the attached plain text without the n.d., because terms do not produce empty output on their own (unless the term itself is empty, but one must already account for that). So your output always at least makes sense, and it lines up with the actual usage of no date.
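
The proposal can be sketched as follows in Python. This is a toy model under the assumption that only the no date term is promoted to variable-like status; `VARIABLE_LIKE_TERMS`, `TERMS`, and `render_group` are hypothetical names, not part of CSL or any processor.

```python
VARIABLE_LIKE_TERMS = {"no date"}  # assumption: only this term is promoted
TERMS = {"no date": "n.d."}


def render_group(children, item):
    """Render a group; children are ("value", s), ("term", name) or ("var", name)."""
    out, attempted, succeeded = [], False, False
    for kind, arg in children:
        if kind == "value":
            out.append(arg)
        elif kind == "term":
            out.append(TERMS[arg])
            if arg in VARIABLE_LIKE_TERMS:
                # The term counts as a variable that rendered something.
                attempted = succeeded = True
        else:
            attempted = True
            if item.get(arg):
                succeeded = True
                out.append(item[arg])
    return "" if attempted and not succeeded else "".join(out)


undated = {}
with_term = render_group(
    [("value", "prefix-"), ("term", "no date"), ("var", "year-suffix")], undated
)
without_term = render_group(
    [("value", "prefix-"), ("var", "issued"), ("var", "year-suffix")], undated
)
```

In this model the attached text only ever appears together with either a real variable or the n.d. term, so a lone `prefix-` cannot leak into the output.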

@denismaier
Collaborator

  • The most relevant special thing about it is that it is often used next to a no date term, and despite no date standing in for what would otherwise be a rendered date variable, terms do not act as variables for implicit conditional groups. I believe that is the core problem here.
  • This points to a much better solution: treat the no date term as a non-empty variable for implicit conditionals.

I think that's pretty much the reasoning behind my suggestion above to convert "no date" into some sort of a substitute for a missing year. I was thinking about something like this:

<date variable="issued" >
  <substitute term="no date"/>
</date>

Here, the term would act as a variable.

So, @cormacrelf's example would look like this:

<group>
    <text value="prefix-" />
    <date variable="issued" >
      <substitute term="no date"/>
    </date>
    <text variable="year-suffix" />
</group>

@bwiernik

The most relevant special thing about it is that it is often used next to a no date term, and despite no date standing in for what would otherwise be a rendered date variable, terms do not act as variables for implicit conditional groups. I believe that is the core problem here.

Is that the problem, though? Isn't suppressing terms when an accompanying variable is not present one of the core reasons for this feature?

Perhaps my mental model of when terms are used inside groups like this is wrong. What are some other examples where a term should be suppressed or not suppressed when paired with a variable in group?

I'm wondering if this is specific to the no date/year-suffix context.

@cormacrelf

cormacrelf commented Nov 23, 2021

Read it again, I wasn't referring to the general behaviour that terms have (and should have), I was talking about how it applies to no date even though no date is really a stand-in for a variable. That's the crucial bit, people use no date where they would normally use <date>, and it is bound to end up in duplicated code with a <date> replaced and people are going to expect it to work too. The whole concept of no date and styles that use n.d. is to "fill" an empty date variable and avoid it rendering nothing, especially for an author-date style where the date itself indicates it's a citation and not just a parenthetical name. Avoiding rendering nothing should entail avoiding group suppression. So when it doesn't behave in CSL like a date at all, that is very incongruous and unintuitive, and that's why I think it's the cause of the problem.

There might be other terms like this, and a cursory look through the list gives two others: ibid (standing in for most of a cite) and anonymous (standing in for a name, I think?). But even then, they do not have the problem where they are usually used next to a very-likely-empty variable year-suffix. So yes, it may be specific to no date/year-suffix, but worth considering these two. If you want to put it in the spec, then it's easy to explain why no date and maybe others are special: they are used to render something in spite of an associated variable being empty, and you might want it to behave as if it had rendered a variable. If you are like every style on earth and have a macro for your dates, it makes even more sense:

<macro name="date">
  <choose><if variable="issued">
    <date variable="issued" ... />
  </if>
  <else>
    <text term="no date" />
  </else></choose>
</macro>

...

<group>
  <text value="first published " />
  <text macro="date" />
</group>

You would really expect this to work as if a variable had been rendered, even in the case of first published n.d.. If people didn't want this missing date variable to render anything, they wouldn't have put no date in there in the first place, because the default else branch is empty.

There is another solution using the following, but as I just outlined, it requires thinking your way around the natural meaning of no date and digging into the CSL spec for an author to accomplish.

<!-- no variables attempted inside the group, so it is never suppressed! -->
<group>
  <text term="no date" />
</group>
<text variable="year-suffix" />

The examples you were looking for:

<group> <!-- absent any variable attempts inside (and through macros/conditionals), group is not suppressed -->
  <text value="verbatim" />
  <text term="reference" /> <!-- terms and values are treated the same way -->
</group>

<group>
  <text value="verbatim" />
  <text variable="MISSING" />
  <text variable="AS MANY MISSING AS YOU LIKE" />
  <text variable="PRESENT" /> <!-- satisfies the group condition,
                                   it attempted many but one succeeded;
                                   the group is not suppressed -->
</group>

<group>
  <text value="verbatim" />
  <text variable="MISSING" /> <!-- causes group to be suppressed as a whole -->
</group>

@rudolf-adamkovic
Author

rudolf-adamkovic commented Dec 15, 2021

I have been citing no-date items for a while now, and it works. Can I close this issue?

@andras-simonyi
Owner

I have been citing no-date items for a while now, and it works. Can I close this issue?

Yes, I think so -- thanks in advance!


6 participants