DOI missing from attributes slot #182

Closed
jbdorey opened this issue Feb 19, 2023 · 2 comments
Labels
bug Something isn't working

Comments

jbdorey commented Feb 19, 2023

Hi there,

I'm having trouble with a script that previously worked. My script is:

ColsToKeep = c("scientificName","family", "subfamily","genus","subgenus","subspecies","species" ) # This list is longer but includes ALA and non-ALA columns
ALA_taxon = "Apiformes"

ALA_Occurence_download <- galah::galah_call() %>%
  galah::galah_identify(ALA_taxon) %>%
  galah::galah_select(tidyselect::any_of(ColsToKeep)) %>%
  galah::atlas_occurrences(mint_doi = TRUE)

attrs_ALA_Occurence_download <- attributes(ALA_Occurence_download)

However, now the doi slot is empty (attrs_ALA_Occurence_download$doi) and I'm not sure why. This then stops me from downloading the file in a later line of my function.

Let me know if you need more context!

galah version: 1.5.1

@jbdorey jbdorey added the bug Something isn't working label Feb 19, 2023
mjwestgate added a commit that referenced this issue Feb 20, 2023
This got deleted at some point in the last release
daxkellie added a commit that referenced this issue Feb 21, 2023
* Save DOI to attributes of download from `atlas_occurrences` when `mint_doi = TRUE`
* Fixed `collect_occurrences` to work with DOI
* `collect_occurrences` doesn't work with urls still
@daxkellie
Contributor

Thanks for raising this issue, and it looks like you caught a minor mistake on our end. In a nutshell, we made lots of changes behind the scenes to improve internal downloads in galah 1.5.1, and after we made those changes it looks like we accidentally omitted the bit of code that adds the DOI onto the download when mint_doi = TRUE (even though we still generated it along with the downloaded records!)

The latest commit to the dev branch has fixed this. If you install {galah} from the current GitHub dev branch, saving a DOI should now work!

remotes::install_github("AtlasOfLivingAustralia/galah@dev")

The DOI can be passed to collect_occurrences() to download the data again. Be sure to name the argument explicitly with doi =, because this function still has some bugs, and naming the argument seems to avoid them until we fix them all.

Here's a working example of the code you provided above:

# remotes::install_github("AtlasOfLivingAustralia/galah@dev")
library(galah)
library(dplyr)
library(tidyr)

galah_config(email = "dax.kellie@csiro.au")

ColsToKeep <- c("scientificName","family", "subfamily","genus","subgenus",
               "subspecies","species")
ALA_taxon <- "Apiformes"

ALA_Occurrence_download <- galah_call() %>%
  galah_identify(ALA_taxon) %>%
  galah_select(any_of(ColsToKeep)) %>%
  atlas_occurrences(mint_doi = TRUE)
#> This query will return 271,088 records
#> 
#> Checking queue
#> Current queue size: 1 inqueue . running ........

attributes(ALA_Occurrence_download)$doi # Returns DOI
#> [1] "https://doi.org/10.26197/ala.37450b00-40c1-4d55-8067-0f9e80d4d5f7"

# Redownloads records from DOI
collect_occurrences(doi = attributes(ALA_Occurrence_download)$doi) 
#> Downloading
#> # A tibble: 271,088 × 7
#>    scientificName family subfamily genus subgenus subspecies species
#>    <chr>          <chr>  <chr>     <chr> <chr>    <lgl>      <chr>  
#>  1 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  2 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  3 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  4 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  5 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  6 APIDAE         Apidae Apinae    <NA>  <NA>     NA         <NA>   
#>  7 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#>  8 APIDAE         Apidae Apinae    <NA>  <NA>     NA         <NA>   
#>  9 APIDAE         Apidae <NA>      <NA>  <NA>     NA         <NA>   
#> 10 APIDAE         Apidae Apinae    <NA>  <NA>     NA         <NA>   
#> # … with 271,078 more rows
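
For scripts that rely on the DOI in a later step, a small guard can fail early when the attribute is missing, as it was in galah 1.5.1. This is a minimal sketch in base R, assuming the DOI is stored in the `doi` attribute of the download as in the example above. `get_download_doi()` and `fake_download` are hypothetical names used for illustration and are not part of galah:

```r
# Hypothetical helper (not part of galah): return the DOI stored on a
# download, or stop with a clear message if it is missing or empty.
get_download_doi <- function(download) {
  doi <- attr(download, "doi", exact = TRUE)
  if (is.null(doi) || length(doi) != 1 || is.na(doi) || !nzchar(doi)) {
    stop("No DOI found in attributes(download)$doi; was mint_doi = TRUE set?",
         call. = FALSE)
  }
  doi
}

# Stand-in object so the sketch runs without querying the ALA
fake_download <- data.frame(scientificName = "APIDAE")
attr(fake_download, "doi") <-
  "https://doi.org/10.26197/ala.37450b00-40c1-4d55-8067-0f9e80d4d5f7"

# Returns the DOI string; an object without the attribute raises an error
get_download_doi(fake_download)
```

With a real download, the returned value could then be passed on as `collect_occurrences(doi = get_download_doi(ALA_Occurrence_download))`.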

@jbdorey
Author

jbdorey commented Feb 25, 2023

Thank you very much for the quick fix! It looks like my function is once again functional ;)
