
bda - getter function #127

Closed
wants to merge 11 commits into from

Conversation

zoometh
Contributor

zoometh commented Mar 23, 2021

First step to recover the 'bda' C14 database.

Added values to / completed the files:

  • url_references.csv
  • variable_reference.csv
  • material_thesaurus.csv
  • country_thesaurus.csv

Added 'bda' to get_c14data()

@dirkseidensticker
Contributor

@zoometh what a great starting point. I do not know how far along you are with writing the actual parser, but maybe you want to consider having a look at the parser for the euroevol database. That database is also based on multiple individual files that need to be joined.

If you need any assistance, I am here as well. Many thanks for your contribution already 👍

@zoometh
Contributor Author

zoometh commented Mar 23, 2021

@dirkseidensticker Yes, I've started writing the get_bda() function, inspired by the get_euroevol() function. I guess the get_bda() function will be ready within a week. If I run into issues, I'll let you know.

@zoometh
Contributor Author

zoometh commented Mar 28, 2021

@dirkseidensticker I've finally created the get_bda() function. The joins between the 3 unzipped .xlsx files (Dates, Sites, Occupations) work, but I was unable to check it with the c14bazAAR package since the url_references.csv file appears to have been removed (temporarily) from the package, maybe because of my previous commit. This is the first time I've used branches, and my GitHub skills are still basic. Anyway, I can still help out.
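The flow described here can be sketched roughly as follows. This is a simplified sketch assembled from the snippets reviewed later in this thread, not the exact PR code; it assumes get_db_url("bda") resolves the download URL and that the accent-free column names suggested in the review are in place:

```r
# Sketch of the get_bda() download/join logic (assumption: simplified from the PR).
td <- tempdir()
temp <- tempfile(fileext = ".zip")
utils::download.file(get_db_url("bda"), temp, mode = "wb")

# unzip and read the three tables
utils::unzip(temp, files = "BDA-Table_Dates.xlsx", exdir = td, overwrite = TRUE)
c14dates <- readxl::read_excel(paste0(td, "/BDA-Table_Dates.xlsx"))
utils::unzip(temp, files = "BDA-Table_Sites.xlsx", exdir = td, overwrite = TRUE)
c14sites <- readxl::read_excel(paste0(td, "/BDA-Table_Sites.xlsx"))
utils::unzip(temp, files = "BDA-Table_Occupations.xlsx", exdir = td, overwrite = TRUE)
c14occupa <- readxl::read_excel(paste0(td, "/BDA-Table_Occupations.xlsx"))

# join dates -> sites -> occupations (key columns as in the review comments;
# native |> pipe used here instead of magrittr's %>% to keep the sketch self-contained)
bda <- c14dates |>
  dplyr::left_join(c14sites, by = c("id_site_associe" = "id_sites")) |>
  dplyr::left_join(c14occupa, by = c("id_occupation_liee" = "id_occupations_2019"))
```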

@dirkseidensticker
Contributor

@zoometh we made some changes within #128 and the latest v2 release. We removed the url table, which one had to update on the remote. Could you pull the changes and add your URLs to the new db_info_table.csv table? get_db_url("[the database name]") now refers to this table.

Removing the odd url_references.csv reference was something we had wanted to do for some time; sorry that it interfered with your PR.
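For reference, registering a database in db_info_table.csv amounts to a single CSV line; the line below is taken from the diff further down in this thread (the column meanings — database name, date, version, URL — are an assumption):

```
bda,2021-03-23,1,https://api.nakala.fr/data/10.34847/nkl.dde9fnm8/189e04d917ffd68352a389006f357b58efda855e
```

get_db_url("bda") then looks up this row and returns the URL.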

@zoometh
Contributor Author

zoometh commented Mar 29, 2021

I've just added a line in db_info_table.csv to register the bda database, here:


unzip(temp, files="BDA-Table_Dates.xlsx", exdir=td, overwrite=TRUE)
c14dates <- readxl::read_excel(paste0(td,"/BDA-Table_Dates.xlsx"))
unzip(temp, files="BDA-Table_Occupations.xlsx", exdir=td, overwrite=TRUE)
Contributor:

Add the utils:: namespace here, i.e. utils::unzip(...)

c14dates <- readxl::read_excel(paste0(td,"/BDA-Table_Dates.xlsx"))
unzip(temp, files="BDA-Table_Occupations.xlsx", exdir=td, overwrite=TRUE)
c14occupa <- readxl::read_excel(paste0(td,"/BDA-Table_Occupations.xlsx"))
unzip(temp, files="BDA-Table_Sites.xlsx", exdir=td, overwrite=TRUE)
Contributor:

Add the utils:: namespace here, i.e. utils::unzip(...)

c14occupa <- readxl::read_excel(paste0(td,"/BDA-Table_Occupations.xlsx"))
unzip(temp, files="BDA-Table_Sites.xlsx", exdir=td, overwrite=TRUE)
c14sites <- readxl::read_excel(paste0(td,"/BDA-Table_Sites.xlsx"))

Contributor:

Add

names(c14dates) <- gsub("\u00E9", "e", names(c14dates))
names(c14occupa) <- gsub("\u00E9", "e", names(c14occupa))
names(c14sites) <- gsub("\u00E9", "e", names(c14sites))

in order to replace the non-ASCII characters in the column names
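As a quick check that this substitution does what is intended (pure base R; iconv() with "ASCII//TRANSLIT" would be a more general alternative for other accented characters, though that is not what the PR uses):

```r
# "\u00E9" is "é"; gsub() replaces every occurrence in each name.
nms <- c("M\u00E9thode", "Mat\u00E9riel_dat\u00E9", "R\u00E9gion")
gsub("\u00E9", "e", nms)
# "Methode" "Materiel_date" "Region"

# A broader (platform-dependent) option for arbitrary accents:
# iconv(nms, from = "UTF-8", to = "ASCII//TRANSLIT")
```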

dplyr::left_join(c14sites, by = c("id_site_associé" = "id_sites")) %>%
dplyr::left_join(c14occupa, by = c("id_occupation_liée" = "id_occupations_2019")) %>%
dplyr::transmute(
method = .data[["Méthode"]],
Contributor:

Change "Méthode" into "Methode"

site = .data[["Nom_site"]],
sitetype = .data[["Nature_site"]],
feature = .data[["num_couche"]],
period = .data[["période"]],
Contributor:

Change "période" into "periode"

feature = .data[["num_couche"]],
period = .data[["période"]],
culture = .data[["culture"]],
material = .data[["Matériel_daté"]],
Contributor:

Change "Matériel_daté" into "Materiel_date"

period = .data[["période"]],
culture = .data[["culture"]],
material = .data[["Matériel_daté"]],
species = .data[["Matériel_daté_précision"]],
Contributor:

Change "Matériel_daté_précision" into "Materiel_date_precision"

culture = .data[["culture"]],
material = .data[["Matériel_daté"]],
species = .data[["Matériel_daté_précision"]],
region = .data[["Région"]],
Contributor:

Change "Région" into "Region"

lat = .data[["Latitude"]],
lon = .data[["Longitude"]],
shortref = .data[["Ref_biblio"]],
comment = .data[["Fiabilité.x"]],
Contributor:

Change "Fiabilité.x" into "Fiabilite.x"

c14sites <- readxl::read_excel(paste0(td,"/BDA-Table_Sites.xlsx"))

bda <- c14dates %>%
dplyr::left_join(c14sites, by = c("id_site_associé" = "id_sites")) %>%
Contributor:

Change "id_site_associé" into "id_site_associe"


bda <- c14dates %>%
dplyr::left_join(c14sites, by = c("id_site_associé" = "id_sites")) %>%
dplyr::left_join(c14occupa, by = c("id_occupation_liée" = "id_occupations_2019")) %>%
Contributor:

Change "id_occupation_liée" into "id_occupation_liee"

@@ -21,3 +21,4 @@ emedyd,2017,1,https://discovery.ucl.ac.uk/id/eprint/1570274/1/robertsetal17.zip
katsianis,2020-08-20,1,https://rdr.ucl.ac.uk/ndownloader/files/23166314
rapanui,2020-08-21,1,https://github.com/clipo/rapanui-radiocarbon/archive/master.zip
mesorad,2020-09-01,1,https://github.com/eehh-stanford/price2020/raw/master/MesoRAD-v.1.1_FINAL_no_locations.xlsx
bda,2021-03-23,1,https://api.nakala.fr/data/10.34847/nkl.dde9fnm8/189e04d917ffd68352a389006f357b58efda855e
Contributor:

This line should be removed, as it is solved with #130

@dirkseidensticker (Contributor) left a comment

Dear @zoometh, please excuse the long wait. I checked your parser, which looks (nearly) perfect! Congratulations and many thanks for your efforts from @nevrome and myself.

While running the checks, I noticed that you kept the French "é" in the column headers. While this does not trigger an error, it raises a warning. Could you replace those characters?

Also, I noticed that your PR #130 was a bit redundant. Just keep in mind that when you add the url to the reference table, you need to perform a "Clean and Rebuild" (in RStudio, within the "More" menu of the "Build" tab). That triggers rebuilding of the helper indices, like db_info_table. Maybe we should add this to the README? @nevrome what do you think?

@zoometh
Contributor Author

zoometh commented Apr 1, 2021 via email

@dirkseidensticker
Contributor

... Just keep in mind that when you add the url to the reference table, you need to perform a "Clean and Rebuild" (in RStudio, within the "More" menu of the "Build" tab). That triggers rebuilding of the helper indices, like db_info_table. Maybe we should add this to the README? @nevrome what do you think?

I missed this one yesterday: you need to run data-raw/data_prep.R to integrate the URL added to the reference table (see no. 8 in "Adding database getter functions").
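Concretely, that step is just sourcing the script from the package root and then rebuilding. This is a sketch: the script path follows the comment above, and devtools is only one of several ways to rebuild the package:

```r
# Regenerate the package-internal helper tables (e.g. db_info_table)
# after editing data-raw/db_info_table.csv:
source("data-raw/data_prep.R")

# then "Clean and Rebuild" in RStudio, or equivalently:
# devtools::install()
```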

@nevrome
Member

nevrome commented Apr 3, 2021

Yes, it's maybe a bit tricky that nos. 5, 6 and 7 don't do anything without 8, but I didn't want to be redundant.

@zoometh closed this Apr 4, 2021
@zoometh reopened this Apr 4, 2021
@zoometh (Contributor Author) left a comment

I guess that's all fine. What will be the next step @dirkseidensticker ?

@nevrome mentioned this pull request Apr 4, 2021
Merged
@nevrome closed this Apr 4, 2021