paperscraper.citations
citations
get_citations_by_doi(doi: str) -> int
Get the number of citations of a paper according to Semantic Scholar.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
doi | str | The DOI of the paper. | required |
Returns:
Type | Description |
---|---|
int | The number of citations. |
Source code in paperscraper/citations/citations.py
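A minimal usage sketch, assuming the function can be imported from the module path shown above. The helper `citation_counts` and its `fetch` parameter are hypothetical and not part of paperscraper; injecting the lookup function keeps the example runnable without network access.

```python
from typing import Callable, Dict, Iterable


def citation_counts(dois: Iterable[str], fetch: Callable[[str], int]) -> Dict[str, int]:
    """Map each DOI to the count returned by `fetch` (hypothetical helper)."""
    return {doi: fetch(doi) for doi in dois}


# Intended use with the documented function (requires network access;
# the import path is an assumption based on the source file location):
#   from paperscraper.citations.citations import get_citations_by_doi
#   citation_counts(["10.18653/v1/N18-3011"], get_citations_by_doi)
```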
get_citations_from_title(title: str) -> int
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title | str | Title of the paper to be searched on Scholar. | required |
Raises:
Type | Description |
---|---|
TypeError | If something other than a str is passed. |
Returns:
Type | Description |
---|---|
int | Number of citations of the paper. |
Source code in paperscraper/citations/citations.py
entity
core
Entity
An abstract entity class with a set of utilities shared by the objects that perform self-linking analyses, such as Paper and Researcher.
Source code in paperscraper/citations/entity/core.py
paper
Paper
Bases: Entity
Source code in paperscraper/citations/entity/paper.py
__init__(input: str, mode: ModeType = 'infer')
Set up a Paper object for analysis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | str | Paper identifier. This can be the title, DOI or Semantic Scholar ID of the paper. | required |
mode | ModeType | The format in which the ID was provided. Defaults to "infer". | 'infer' |
Raises:
Type | Description |
---|---|
ValueError | If an unknown mode is given. |
Source code in paperscraper/citations/entity/paper.py
self_references()
Extracts the self references of a paper, for each author.
self_citations()
Extracts the self citations of a paper, for each author.
get_result() -> Optional[PaperResult]
Provides the result of the analysis.
Returns: PaperResult if available.
Source code in paperscraper/citations/entity/paper.py
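The three methods above suggest a simple workflow: run the analyses, then read the result. The sketch below expresses that order against a structural type, assuming the methods can be called synchronously in this sequence. `SelfLinkEntity` and `analyze` are hypothetical names introduced for illustration only.

```python
from typing import Any, Optional, Protocol


class SelfLinkEntity(Protocol):
    """Hypothetical structural type covering the three documented methods."""

    def self_references(self) -> Any: ...
    def self_citations(self) -> Any: ...
    def get_result(self) -> Optional[Any]: ...


def analyze(entity: SelfLinkEntity) -> Optional[Any]:
    """Run both analyses, then return whatever get_result() provides."""
    entity.self_references()
    entity.self_citations()
    return entity.get_result()


# Intended use (requires network access; the import path is an assumption
# based on the source file location shown above):
#   from paperscraper.citations.entity.paper import Paper
#   result = analyze(Paper("10.18653/v1/N18-3011"))
```

Because `get_result()` is typed as Optional, callers should be prepared for a missing result.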
researcher
Researcher
Bases: Entity
Source code in paperscraper/citations/entity/researcher.py
__init__(input: str, mode: ModeType = 'infer')
Construct a Researcher object for self-citation/self-reference analysis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | str | A researcher to search for. | required |
mode | ModeType | This can be a | 'infer' |
Raises:
Type | Description |
---|---|
ValueError | If an unknown mode is given. |
Source code in paperscraper/citations/entity/researcher.py
self_references()
Sifts through all papers of a researcher and extracts the self references.
self_citations()
orcid
orcid_to_author_name(orcid_id: str) -> Optional[str]
Given an ORCID ID (as a string, e.g. '0000-0002-1825-0097'), returns the full name of the author from the ORCID public API.
Source code in paperscraper/citations/orcid.py
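Before querying the public API, it can help to reject malformed identifiers locally. The sketch below checks the hyphenated 16-character layout and the ISO 7064 MOD 11-2 check character that ORCID iDs carry. `is_valid_orcid` is a hypothetical helper, and nothing here implies that paperscraper performs this check itself.

```python
import re

# Four hyphen-separated groups of four; the last character may be the check value X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def is_valid_orcid(orcid_id: str) -> bool:
    """Validate layout and MOD 11-2 check character (hypothetical helper)."""
    if not ORCID_PATTERN.fullmatch(orcid_id):
        return False
    chars = orcid_id.replace("-", "")
    total = 0
    for ch in chars[:-1]:  # the first 15 digits feed the checksum
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[-1] == expected
```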
self_citations
self_citations_paper(inputs: Union[str, List[str]], verbose: bool = False) -> Union[CitationResult, List[CitationResult]]
async
Analyze self-citations for one or more papers by DOI or Semantic Scholar ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Union[str, List[str]] | A single DOI/SSID string or a list of them. | required |
verbose | bool | If True, logs detailed information for each paper. | False |
Returns:
Type | Description |
---|---|
Union[CitationResult, List[CitationResult]] | A single CitationResult if a string was passed, else a list of CitationResults. |
Source code in paperscraper/citations/self_citations.py
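Since the function is marked async, it has to be awaited or driven by an event loop. A minimal sketch of calling such a coroutine function from synchronous code follows; `run_sync` is a hypothetical wrapper, and the coroutine is passed in so the example runs without the library or network access.

```python
import asyncio
from typing import Any, Callable, Coroutine, List, Union

Inputs = Union[str, List[str]]


def run_sync(analyze: Callable[[Inputs], Coroutine[Any, Any, Any]], inputs: Inputs) -> Any:
    """Drive an async analysis function to completion from synchronous code."""
    # asyncio.run creates a fresh event loop, so this must not be called
    # from inside an already running loop (e.g. some notebook environments).
    return asyncio.run(analyze(inputs))


# Intended use (requires network access; the import path is an assumption
# based on the source file location shown above):
#   from paperscraper.citations.self_citations import self_citations_paper
#   single = run_sync(self_citations_paper, "10.18653/v1/N18-3011")
#   many = run_sync(self_citations_paper, ["10.18653/v1/N18-3011"])
```

Per the Returns table, a string input yields a single result and a list input yields a list.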
self_references
self_references_paper(inputs: Union[str, List[str]], verbose: bool = False) -> Union[ReferenceResult, List[ReferenceResult]]
async
Analyze self-references for one or more papers by DOI or Semantic Scholar ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs | Union[str, List[str]] | A single DOI/SSID string or a list of them. | required |
verbose | bool | If True, logs detailed information for each paper. | False |
Returns:
Type | Description |
---|---|
Union[ReferenceResult, List[ReferenceResult]] | A single ReferenceResult if a string was passed, else a list of ReferenceResults. |
Raises:
Type | Description |
---|---|
ValueError | If no references are found for a given identifier. |
Source code in paperscraper/citations/self_references.py
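Because a ValueError is raised when an identifier has no references, one missing paper could abort a whole batch. One possible way around this, sketched under the assumption that calling the function once per identifier is acceptable (rate limits are not considered here), is to gather the calls and keep the error per identifier. `analyze_each` is a hypothetical helper.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict, List


async def analyze_each(
    identifiers: List[str],
    analyze: Callable[[str], Coroutine[Any, Any, Any]],
) -> Dict[str, Any]:
    """Analyze identifiers concurrently; keep the ValueError for those that fail."""
    outcomes = await asyncio.gather(
        *(analyze(i) for i in identifiers), return_exceptions=True
    )
    results: Dict[str, Any] = {}
    for identifier, outcome in zip(identifiers, outcomes):
        # Only the documented ValueError is tolerated; anything else propagates.
        if isinstance(outcome, BaseException) and not isinstance(outcome, ValueError):
            raise outcome
        results[identifier] = outcome
    return results
```

Callers can then separate successes from failures with an `isinstance(value, ValueError)` check.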
tests
test_self_references
TestSelfReferences
Source code in paperscraper/citations/tests/test_self_references.py
test_compare_async_and_sync_performance(dois)
Compares the execution time of asynchronous and synchronous self_references for a list of DOIs.
Source code in paperscraper/citations/tests/test_self_references.py
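The idea behind such a comparison can be sketched generically: time a one-at-a-time run against a concurrent run of the same coroutine and confirm both produce the same results. This is not the test's actual code. `compare_timings` and the `work` parameter are hypothetical, and sequential awaiting stands in for the synchronous variant.

```python
import asyncio
import time
from typing import Any, Callable, Coroutine, List, Tuple

Work = Callable[[Any], Coroutine[Any, Any, Any]]


async def _sequential(items: List[Any], work: Work) -> List[Any]:
    # Await each item in turn, so total time is roughly the sum of all calls.
    return [await work(item) for item in items]


async def _concurrent(items: List[Any], work: Work) -> List[Any]:
    # Schedule all calls at once; gather preserves input order.
    return list(await asyncio.gather(*(work(item) for item in items)))


def compare_timings(items: List[Any], work: Work) -> Tuple[float, float, bool]:
    """Return (sequential seconds, concurrent seconds, results identical)."""
    start = time.perf_counter()
    sequential = asyncio.run(_sequential(items, work))
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    concurrent = asyncio.run(_concurrent(items, work))
    concurrent_time = time.perf_counter() - start
    return sequential_time, concurrent_time, sequential == concurrent
```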
utils
get_doi_from_title(title: str) -> Optional[str]
Searches for the DOI of a paper based on the paper title.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title | str | Paper title | required |
Returns:
Type | Description |
---|---|
Optional[str] | DOI according to the Semantic Scholar API |
Source code in paperscraper/citations/utils.py
get_doi_from_ssid(ssid: str, max_retries: int = 10) -> Optional[str]
Given a Semantic Scholar paper ID, returns the corresponding DOI if available.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ssid | str | The paper ID on Semantic Scholar. | required |
Returns:
Type | Description |
---|---|
Optional[str] | The DOI of the paper, or None if not found or in case of an error. |
Source code in paperscraper/citations/utils.py
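The signature includes a `max_retries` argument that the parameter table does not describe. One common reading of such an argument is a bounded retry loop that gives up and returns None, which matches the documented return value. The sketch below shows that pattern with exponential backoff; `call_with_retries`, `base_delay` and `sleep` are hypothetical, and the library's actual retry behavior may differ.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 10,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fn up to max_retries times; return None if every attempt fails."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            # Back off before the next attempt, but not after the last one.
            if attempt < max_retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

The `sleep` parameter exists only so the delay can be replaced in tests.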
get_title_and_id_from_doi(doi: str) -> Dict[str, Any]
Given a DOI, retrieves the paper's title and Semantic Scholar paper ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
doi | str | The DOI of the paper (e.g., "10.18653/v1/N18-3011"). | required |
Returns:
Type | Description |
---|---|
Dict[str, Any] | dict or None: A dictionary with keys 'title' and 'ssid'. |
Source code in paperscraper/citations/utils.py
author_name_to_ssaid(author_name: str) -> str
Given an author name, returns the Semantic Scholar author ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
author_name | str | The full name of the author. | required |
Returns:
Type | Description |
---|---|
str | str or None: The Semantic Scholar author ID or None if no author is found. |
Source code in paperscraper/citations/utils.py
determine_paper_input_type(input: str) -> Literal['ssid', 'doi', 'title']
Determines the input type intended by the user if not explicitly given (infer).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | str | A DOI, a Semantic Scholar paper ID, or a paper title. | required |
Returns:
Type | Description |
---|---|
Literal['ssid', 'doi', 'title'] | The input type |
Source code in paperscraper/citations/utils.py
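A minimal sketch of how such an inference could work, not the library's implementation. It assumes that DOIs start with a `10.` registrant prefix followed by a slash and that Semantic Scholar paper IDs are 40-character hexadecimal strings; anything else is treated as a title. `guess_input_type` and both patterns are hypothetical.

```python
import re
from typing import Literal

# Assumed shapes: "10.<registrant>/<suffix>" for DOIs, 40 hex characters for paper IDs.
DOI_RE = re.compile(r"10\.[0-9]{4,9}/\S+")
SSID_RE = re.compile(r"[0-9a-f]{40}")


def guess_input_type(value: str) -> Literal["ssid", "doi", "title"]:
    """Classify an identifier by its shape, falling back to 'title'."""
    text = value.strip()
    if DOI_RE.fullmatch(text):
        return "doi"
    if SSID_RE.fullmatch(text.lower()):
        return "ssid"
    return "title"
```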
get_papers_for_author(ss_author_id: str) -> List[str]
async
Given a Semantic Scholar author ID, returns a list of all Semantic Scholar paper IDs for that author.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ss_author_id | str | The Semantic Scholar author ID (e.g., "1741101"). | required |
Returns:
Type | Description |
---|---|
List[str] | A list of paper IDs (as strings) authored by the given author. |
Source code in paperscraper/citations/utils.py
find_matching(first: List[Dict[str, str]], second: List[Dict[str, str]]) -> List[str]
Ingests two sets of authors and returns a list of those that match (either based on name or on author ID).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
first | List[Dict[str, str]] | First set of authors given as list of dict with two keys ( | required |
second | List[Dict[str, str]] | Second set of authors given as list of dict with the same two keys. | required |
Returns:
Type | Description |
---|---|
List[str] | List of names of authors in first list where a match was found. |
Source code in paperscraper/citations/utils.py
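The matching rule described above (a hit on either name or author ID) can be sketched as follows. The key names `name` and `authorId` are assumptions, since the documentation does not show them, and `matching_authors` is a hypothetical stand-in rather than the library's code.

```python
from typing import Dict, List


def matching_authors(
    first: List[Dict[str, str]], second: List[Dict[str, str]]
) -> List[str]:
    """Return names from `first` that match an entry in `second` by ID or name."""
    # Index the second list once so each lookup is a set membership test.
    ids = {a["authorId"] for a in second if a.get("authorId")}
    names = {a["name"].casefold() for a in second if a.get("name")}
    matches = []
    for author in first:
        same_id = bool(author.get("authorId")) and author["authorId"] in ids
        same_name = author.get("name", "").casefold() in names
        if same_id or same_name:
            matches.append(author.get("name", ""))
    return matches
```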
check_overlap(n1: str, n2: str) -> bool
Check whether two author names are identical. TODO: This can be made more robust.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n1 | str | first name | required |
n2 | str | second name | required |
Returns:
Type | Description |
---|---|
bool | Whether names are identical. |
Source code in paperscraper/citations/utils.py
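One direction the TODO could take, shown only as a sketch: compare names after folding case and collapsing whitespace, so that purely cosmetic differences do not count as a mismatch. `names_equal` is a hypothetical function and does not reflect how `check_overlap` is implemented.

```python
def names_equal(n1: str, n2: str) -> bool:
    """Compare two names, ignoring case and repeated or surrounding whitespace."""

    def normalize(name: str) -> str:
        # casefold handles more scripts than lower(); split/join collapses spaces.
        return " ".join(name.casefold().split())

    return normalize(n1) == normalize(n2)
```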
clean_name(s: str) -> str
Clean up a str by removing special characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
s | str | Input possibly containing special symbols | required |
Returns:
Type | Description |
---|---|
str | Homogenized string. |
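As an illustration of what homogenizing a name might involve, the sketch below strips diacritics through Unicode decomposition and replaces punctuation with spaces. Which characters count as special is an assumption here, and `strip_special` is a hypothetical function rather than the body of `clean_name`.

```python
import re
import unicodedata


def strip_special(s: str) -> str:
    """Remove diacritics and punctuation, then collapse whitespace."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", s)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is neither a word character nor whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", no_marks)
    return " ".join(cleaned.split())
```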