Commit b8e9516

Merge pull request #145 from UA-Libraries-Research-Data-Services/update-r-recipes
Update Wiley and ScienceDirect in R recipes
2 parents d17a80a + 58610bf commit b8e9516

5 files changed

Lines changed: 104 additions & 44 deletions

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-jupyter-book
+jupyter-book<2
 matplotlib
 numpy

src/python/usa-spending.ipynb

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 "- Terms\n",
 " - <a href=\"https://github.com/fedspendingtransparency/usaspending-api?tab=CC0-1.0-1-ov-file\" target=\"_blank\">USAspending API License</a>: <a href=\"\" target=\"_blank\">CC0 1.0 Universal</a>\n",
 "- Data Reuse\n",
-" - <a href=\"https://www.usaspending.gov/about#about-licensing\" target=\"_blank\">USspending Data Reuse</a>\n",
+" - <a href=\"https://www.usaspending.gov/about#about-licensing\" target=\"_blank\">USAspending Data Reuse</a>\n",
 "\n",
 "*These recipe examples were tested on May 5, 2025.*"
 ]

src/r/arxiv.md

Lines changed: 2 additions & 1 deletion
@@ -30,9 +30,10 @@ Please see the following resources for more information on API usage:
 
 The following packages need to be installed into your environment to run the code examples in this tutorial. These packages can be installed with `install.packages()`.
 
-- <a href="https://cran.r-project.org/web/packages/aRxiv/index.html" target="_blank">arXiv: Interface to the arXiv API</a>
+- <a href="https://cran.r-project.org/web/packages/aRxiv/index.html" target="_blank">aRxiv: Interface to the arXiv API</a>
 - <a href="https://cran.r-project.org/web/packages/ggplot2/index.html" target="_blank">ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics</a>
 
+We load the libraries used in this tutorial below:
 
 ``` r
 library(aRxiv)

src/r/sdirect.md

Lines changed: 65 additions & 23 deletions
@@ -1,40 +1,72 @@
+---
+title: "ScienceDirect API in R"
+output:
+  html_document:
+    keep_md: true
+---
+
 # ScienceDirect API in R
 
 by Michael T. Moen
 
-These recipe examples demonstrate how to use Elsevier’s [ScienceDirect API](https://dev.elsevier.com/) to retrieve full-text articles in various formats (XML, text).
+These recipe examples demonstrate how to use Elsevier’s <a href="https://dev.elsevier.com/" target="_blank">ScienceDirect API</a> to retrieve full-text articles in various formats (XML, text).
 
 *This tutorial content is intended to help facilitate academic research. Please check your institution for their Text and Data Mining or related License Agreement with Elsevier.*
 
-- **Documentation**
-  - [ScienceDirect API](https://dev.elsevier.com/)
-  - [ScienceDirect API Documentation](https://dev.elsevier.com/sd_api_spec.html)
-
-- **Terms**
-  - [ScienceDirect API Terms of Use](https://dev.elsevier.com/api_key_settings.html)
+Please see the following resources for more information on API usage:
 
-- **Data Reuse**
-  - [Elsevier Text & Data Mining](https://dev.elsevier.com/tecdoc_text_mining.html)
+- Documentation
+  - <a href="https://dev.elsevier.com/" target="_blank">ScienceDirect API</a>
+  - <a href="https://dev.elsevier.com/sd_api_spec.html" target="_blank">ScienceDirect API Documentation</a>
+- Terms
+  - <a href="https://dev.elsevier.com/api_key_settings.html" target="_blank">ScienceDirect API Terms of Use</a>
+- Data Reuse
+  - <a href="https://dev.elsevier.com/tecdoc_text_mining.html" target="_blank">Elsevier Text & Data Mining</a>
 
-> **Note:** See your institution's rate limit in the [ScienceDirect API Terms of Use](https://dev.elsevier.com/api_key_settings.html).
+_**NOTE:**_ See your institution's rate limit in the <a href="https://dev.elsevier.com/api_key_settings.html" target="_blank">ScienceDirect API Terms of Use</a>.
 
+*If you have copyright or other related text and data mining questions, please contact The University of Alabama Libraries or your respective library/institution.*
 
-*These recipe examples were tested on February 7, 2025.*
+*These recipe examples were tested on October 27, 2025.*
 
 ## Setup
 
 ### Import Libraries
 
-```r
+The following packages need to be installed into your environment to run the code examples in this tutorial. These packages can be installed with `install.packages()`.
+
+- <a href="https://cran.r-project.org/web/packages/httr/index.html" target="_blank">httr: Tools for Working with URLs and HTTP</a>
+
+We load the libraries used in this tutorial below:
+
+
+``` r
 library(httr)
 ```
 
 ### Import API Key
 
-An API key is required to access the ScienceDirect API. Registration is available on the [Elsevier developer portal](https://dev.elsevier.com/). The key is imported from an environment variable below:
+An API key is required to access the ScienceDirect API. You can sign up for one on the <a href="https://dev.elsevier.com/" target="_blank">Elsevier developer portal</a>.
+
+We keep our token in a `.Renviron` file that is stored in the working directory and use `Sys.getenv()` to access it. The `.Renviron` should have an entry like the one below.
 
-```r
-myAPIKey <- Sys.getenv("sciencedirect_key")
+```text
+SCIENCE_DIRECT_API_KEY="PUT_YOUR_API_KEY_HERE"
+```
+
+Below, we test whether the key was successfully imported.
+
+
+``` r
+if (nzchar(Sys.getenv("SCIENCE_DIRECT_API_KEY"))) {
+  print("API key successfully loaded.")
+} else {
+  warning("API key not found or is empty.")
+}
+```
+
+```
+## [1] "API key successfully loaded."
 ```
 
 ### Identifier Note
@@ -43,41 +75,51 @@ We will use DOIs as the article identifiers. See our Crossref and Scopus API tut
 
 ## 1. Retrieve full-text XML of an article
 
-```r
+
+``` r
 # For XML download
 elsevier_url <- "https://api.elsevier.com/content/article/doi/"
 doi1 <- '10.1016/j.tetlet.2017.07.080' # Example Tetrahedron Letters article
-fulltext1 <- GET(paste0(elsevier_url, doi1, "?APIKey=", myAPIKey, "&httpAccept=text/xml"))
+fulltext1 <- GET(paste0(
+  elsevier_url, doi1,
+  "?APIKey=", Sys.getenv("SCIENCE_DIRECT_API_KEY"),
+  "&httpAccept=text/xml"))
 
 # Save to file
 writeLines(content(fulltext1, "text"), "fulltext1.xml")
 ```
 
 ## 2. Retrieve plain text of an article
 
-```r
+
+``` r
 # For simplified text download
 doi2 <- '10.1016/j.tetlet.2022.153680' # Example Tetrahedron Letters article
-fulltext2 <- GET(paste0(elsevier_url, doi2, "?APIKey=", myAPIKey, "&httpAccept=text/plain"))
+fulltext2 <- GET(paste0(
+  elsevier_url, doi2,
+  "?APIKey=", Sys.getenv("SCIENCE_DIRECT_API_KEY"),
+  "&httpAccept=text/plain"))
 
 # Save to file
 writeLines(content(fulltext2, "text"), "fulltext2.txt")
 ```
 
 ## 3. Retrieve full-text in a loop
 
-```r
+
+``` r
 # Make a list of 5 DOIs for testing
 dois <- c('10.1016/j.tetlet.2018.10.031',
           '10.1016/j.tetlet.2018.10.033',
           '10.1016/j.tetlet.2018.10.034',
           '10.1016/j.tetlet.2018.10.038',
           '10.1016/j.tetlet.2018.10.041')
-```
 
-```r
 for (doi in dois) {
-  article <- GET(paste0(elsevier_url, doi, "?APIKey=", myAPIKey, "&httpAccept=text/plain"))
+  article <- GET(paste0(
+    elsevier_url, doi,
+    "?APIKey=", Sys.getenv("SCIENCE_DIRECT_API_KEY"),
+    "&httpAccept=text/plain"))
   doi_name <- gsub("/", "_", doi)
   writeLines(content(article, "text"), paste0(doi_name, "_plain_text.txt"))
   Sys.sleep(1)
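
The three ScienceDirect examples in this file repeat the same `paste0()` call to build the request URL and the same `gsub()` call to turn a DOI into a file name. As a minimal sketch, and not part of this commit, those two steps could be factored into helpers. The names `build_article_url()` and `doi_to_filename()` are hypothetical, introduced here only for illustration, and the key in the usage lines is a placeholder.

```r
# Hypothetical helpers (not part of the original recipe or this commit)

# Build the article retrieval URL the same way the recipe does with paste0()
build_article_url <- function(doi, api_key, accept = "text/plain") {
  elsevier_url <- "https://api.elsevier.com/content/article/doi/"
  paste0(elsevier_url, doi, "?APIKey=", api_key, "&httpAccept=", accept)
}

# Replace "/" with "_" so the DOI can be used as a file name
doi_to_filename <- function(doi, suffix = "_plain_text.txt") {
  paste0(gsub("/", "_", doi), suffix)
}

# Usage with a placeholder key; no request is sent here
example_url <- build_article_url("10.1016/j.tetlet.2018.10.031", "PLACEHOLDER_KEY")
example_file <- doi_to_filename("10.1016/j.tetlet.2018.10.031")
print(example_url)
print(example_file)
```

Inside the recipe's loop, `GET(build_article_url(doi, Sys.getenv("SCIENCE_DIRECT_API_KEY")))` would then stand in for the inline `paste0()` call.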

src/r/wiley-tdm.md

Lines changed: 35 additions & 18 deletions
@@ -1,5 +1,5 @@
 ---
-title: "wiley-tdm"
+title: "Wiley Text and Data Mining (TDM) in R"
 output:
   html_document:
     keep_md: true
@@ -9,26 +9,32 @@ output:
 
 by Michael T. Moen
 
-This tutorial is designed to support academic research. Please consult your institution’s library or legal office regarding its Text and Data Mining license agreement with Wiley.
+The Wiley Text and Data Mining (TDM) API allows users to retrieve the full-text articles of subscribed Wiley content in PDF form. TDM use is for non-commercial scholarly research; see the terms and restrictions in the links below.
 
-### Documentation
-- [Wiley Text and Data Mining](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining)
+*This tutorial content is intended to help facilitate academic research. Please check your institution for their Text and Data Mining or related License Agreement with Wiley.*
 
-### Terms of Use
-- [Wiley Text and Data Mining Agreement](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining#accordionHeader-3)
+Please see the following resources for more information on API usage:
 
-### Data Reuse
-- [Service Name] Data Reuse *(link to be provided by the service)*
+- Documentation
+  - <a href="https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining" target="_blank">Wiley Text and Data Mining</a>
+- Terms
+  - <a href="https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining#accordionHeader-3" target="_blank">Wiley Text and Data Mining Agreement</a>
+- Data Reuse
+  - <a href="https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining#accordionHeader-3" target="_blank">Wiley TDM Data Reuse</a> (see sections 4 and 5 of Text and Data Mining Agreement)
 
-*These recipe examples were tested on February 12, 2025.*
+*These recipe examples were tested on October 27, 2025.*
 
 **_NOTE:_** The Wiley TDM API limits requests to a maximum of 3 requests per second.
 
 ## Setup
 
 ### Import Libraries
 
-This tutorial uses the following libraries:
+The following packages need to be installed into your environment to run the code examples in this tutorial. These packages can be installed with `install.packages()`.
+
+- <a href="https://cran.r-project.org/web/packages/httr/index.html" target="_blank">httr: Tools for Working with URLs and HTTP</a>
+
+We load the libraries used in this tutorial below:
 
 
 ``` r
@@ -37,14 +43,27 @@ library(httr)
 
 ### Text and Data Mining Token
 
-A token is required to access the Wiley TDM API. Sign up can be found [here](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining#accordionHeader-2). Import your token below:
+A token is required for text and data mining with Wiley. You can sign up for one on the <a href="https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining#accordionHeader-2" target="_blank">Wiley Text and Data Mining page</a>.
+
+We keep our token in a `.Renviron` file that is stored in the working directory and use `Sys.getenv()` to access it. The `.Renviron` should have an entry like the one below.
+
+```text
+WILEY_TDM_TOKEN="PUT_YOUR_TOKEN_HERE"
+```
+
+Below, we test whether the token was successfully imported.
 
 
 ``` r
-wiley_token <- Sys.getenv("wiley_token")
+if (nzchar(Sys.getenv("WILEY_TDM_TOKEN"))) {
+  print("API key successfully loaded.")
+} else {
+  warning("API key not found or is empty.")
+}
+```
 
-# The token will be sent as a header in the API calls
-headers <- add_headers("Wiley-TDM-Client-Token" = wiley_token)
+```
+## [1] "API key successfully loaded."
 ```
 
 ## 1. Retrieve full-text of an article
@@ -59,14 +78,13 @@ In the first example, we download the full-text of the article with the DOI "10.
 doi <- "10.1002/net.22207"
 url <- paste0("https://api.wiley.com/onlinelibrary/tdm/v1/articles/", doi)
 
-response <- GET(url, headers)
+response <- GET(url, add_headers("Wiley-TDM-Client-Token" = Sys.getenv("WILEY_TDM_TOKEN")))
 
 if (status_code(response) == 200) {
   # Download if status code indicates success
   filename <- paste0(gsub("/", "_", doi), ".pdf")
   writeBin(content(response, "raw"), filename)
   cat(paste0(filename, " downloaded successfully\n"))
-
 } else {
   # Print status code if unsuccessful
   cat(paste0("Failed to download PDF. Status code: ", status_code(response), "\n"))
@@ -96,14 +114,13 @@ dois <- c(
 # Loop through DOIs and download each article
 for (doi in dois) {
   url <- paste0("https://api.wiley.com/onlinelibrary/tdm/v1/articles/", doi)
-  response <- GET(url, headers)
+  response <- GET(url, add_headers("Wiley-TDM-Client-Token" = Sys.getenv("WILEY_TDM_TOKEN")))
 
   if (status_code(response) == 200) {
     # Download if status code indicates success
     filename <- paste0(gsub("/", "_", doi), ".pdf")
     writeBin(content(response, "raw"), filename)
     cat(paste0(filename, " downloaded successfully\n"))
-
   } else {
     # Print status code if unsuccessful
     cat(paste0("Failed to download PDF. Status code: ", status_code(response), "\n"))
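
The note in this file states a limit of 3 requests per second, while the download loop in the hunk above is cut off before any pause is visible, so the full loop may or may not include one. Below is a minimal sketch of one way to space out calls, assuming a fixed minimum interval between requests is acceptable. `throttled_lapply()` is a hypothetical helper that is not part of the recipe, and `nchar` stands in for a real download function so the example runs offline.

```r
# Hypothetical helper (not part of the original recipe or this commit):
# call `fn` on each item, waiting at least `min_interval` seconds between calls.
# The default of 0.34 s keeps the rate just under 3 calls per second.
throttled_lapply <- function(items, fn, min_interval = 0.34) {
  results <- vector("list", length(items))
  last_call <- -Inf  # no previous call yet, so the first call never waits
  for (i in seq_along(items)) {
    wait <- min_interval - (as.numeric(Sys.time()) - last_call)
    if (wait > 0) Sys.sleep(wait)
    last_call <- as.numeric(Sys.time())
    results[[i]] <- fn(items[[i]])
  }
  results
}

# Offline usage: nchar() is a stand-in for a function that downloads one DOI
result <- throttled_lapply(c("a", "bb", "ccc"), nchar, min_interval = 0.05)
print(unlist(result))
```

In the recipe, `fn` would be a function that takes a single DOI and performs the `GET()` and `writeBin()` steps shown in the loop.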
