-
Get retraction notices
- Select the Title field and search for retract*
- Filter the results to "Retractions"
- Export all results with Author, Source, Title, and Abstract selected
-
Clean retraction notices file
-
Merge retracted articles and citations to retracted articles
-
-
Using the output of #3, iterate over the retracted articles and, for each one, get the day and month (PD), year (PY), and research area (WC).
-
Creates a query string of the following format:
(SO=({SO})) AND DOP=(YYYY-MM-DD/YYYY-MM-DD) AND DT=(Article) AND WC=({WC})
Notes:
* DOP is constructed from PD and PY.
* When the day is not present, DOP ranges from the start of the month to the end of the month, e.g., 2005-01-01/2005-01-31.
* When the day is present, DOP covers that single day, e.g., 2005-01-21/2005-01-21.
* DT is always the same.
* Sample final search string:
```
(SO=(Science)) AND DOP=(2005-01-21/2005-01-21) AND DT=(Article) AND WC=(Multidisciplinary Sciences)
```
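The query construction described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `build_query` is hypothetical, and it assumes PD arrives as a three-letter month optionally followed by a day (for example `JAN 21` or `JAN`), which may not cover every value found in an export.

```python
import calendar
from datetime import date

# Assumption: PD starts with a three-letter English month abbreviation.
MONTHS = {
    name: number
    for number, name in enumerate(
        "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1
    )
}


def build_query(so, pd_field, py, wc):
    """Build the advanced-search string for one retracted article (hypothetical helper)."""
    parts = pd_field.split()
    month = MONTHS[parts[0][:3].upper()]
    year = int(py)
    if len(parts) > 1:
        # Day is present: the range covers that single day.
        start = end = date(year, month, int(parts[1]))
    else:
        # Day is missing: the range covers the whole month.
        last_day = calendar.monthrange(year, month)[1]
        start, end = date(year, month, 1), date(year, month, last_day)
    return (
        f"(SO=({so})) AND DOP=({start.isoformat()}/{end.isoformat()}) "
        f"AND DT=(Article) AND WC=({wc})"
    )


print(build_query("Science", "JAN 21", 2005, "Multidisciplinary Sciences"))
```

Using `calendar.monthrange` to find the last day of the month keeps leap years correct without a hand-written table.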
-
Goes to www.webofscience.com, clicks on advanced search and parameters, and enters the query string.
-
Exports all search results as an Excel file and saves it as search_results/rowid.xls, where rowid is the rowid of the retracted article being queried.
-
-
Selects two records at random from the search results and saves them to a separate file (selected_records.csv) along with the rowid column. Results from each query are concatenated to selected_records.csv.
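One way the random selection and the concatenation into selected_records.csv could look is sketched below. It assumes the search results have already been loaded into a list of dictionaries (reading the .xls export is left out), and the function name `append_random_controls` and the sample field names are hypothetical.

```python
import csv
import os
import random


def append_random_controls(records, rowid, out_path="selected_records.csv",
                           k=2, rng=random):
    """Pick up to k records at random and append them, tagged with rowid (hypothetical helper)."""
    chosen = rng.sample(records, min(k, len(records)))
    if not chosen:
        return []
    fieldnames = ["rowid"] + list(chosen[0].keys())
    # Write the header only when the file is new or empty.
    write_header = not os.path.exists(out_path) or os.path.getsize(out_path) == 0
    with open(out_path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        for record in chosen:
            writer.writerow({"rowid": rowid, **record})
    return chosen


# Illustrative data only; real records come from search_results/rowid.xls.
sample = [{"TI": f"Title {i}", "SO": "Science"} for i in range(5)]
append_random_controls(sample, rowid=1, out_path="selected_records_demo.csv")
```

Passing a seeded `random.Random(...)` as `rng` makes the selection reproducible, which can help when a run has to be repeated.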
-
Get articles that cite control group articles
- Using the output from #6, for each of the two records:
- Searches for the title (TI), e.g.,
TI=(Mathematical modeling of planar cell polarity to understand domineering nonautonomy)
- Clicks on the record's citations
- Exports all the citations as tab delimited files (citing_articles/rowid_1.tsv and citing_articles/rowid_2.tsv)
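The title query and the output file names for the citing articles might be produced along these lines. This is only a sketch: `title_query` and `citing_path` are hypothetical names, and collapsing whitespace in the title is an assumption about what the search field tolerates.

```python
from pathlib import Path


def title_query(title):
    """Wrap a title in the TI=(...) search syntax, collapsing stray whitespace."""
    return f"TI=({' '.join(title.split())})"


def citing_path(rowid, index, out_dir="citing_articles"):
    """Return the export path for control record 1 or 2 of a given rowid."""
    return Path(out_dir) / f"{rowid}_{index}.tsv"


print(title_query(
    "Mathematical modeling of planar cell polarity to understand domineering nonautonomy"
))
print(citing_path(12, 1).as_posix())
```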
- Python 3 and Selenium
- Chrome (version 92 or newer)
- Make sure chromedriver.exe (download it from https://sites.google.com/chromium.org/driver/) is in the same directory as the main script when you run it
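A small pre-flight check for the chromedriver requirement could look like the sketch below. The helper name `find_chromedriver` is hypothetical and is not part of the scripts described here; it only verifies that the file sits in the given directory.

```python
from pathlib import Path


def find_chromedriver(script_dir, name="chromedriver.exe"):
    """Return the driver path, or fail early with a clear message (hypothetical helper)."""
    path = Path(script_dir) / name
    if not path.is_file():
        raise FileNotFoundError(
            f"{name} not found in {Path(script_dir)}; place it next to the main script"
        )
    return path
```

In the main script it could be called as `find_chromedriver(Path(__file__).resolve().parent)` before the browser is started.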