paperscraper.scholar
get_citations_from_title(title: str) -> int
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title | str | Title of paper to be searched on Scholar. | required |
Raises:
Type | Description |
---|---|
TypeError | If something other than a str is passed. |
Returns:
Name | Type | Description |
---|---|---|
int | int | Number of citations of paper. |
Source code in paperscraper/citations/citations.py
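The contract above (a str goes in, an int comes out, anything else raises TypeError) can be sketched without touching the network. In the snippet below, `citation_counts` and the `lookup` parameter are hypothetical names introduced for illustration; `lookup` stands in for any function shaped like get_citations_from_title, and the citation numbers are made up.

```python
from typing import Callable, Dict, Iterable


def citation_counts(titles: Iterable[str], lookup: Callable[[str], int]) -> Dict[str, int]:
    """Map each title to the citation count returned by `lookup`.

    `lookup` is expected to behave like get_citations_from_title
    (str in, int out). It is injected so the sketch runs offline.
    """
    counts: Dict[str, int] = {}
    for title in titles:
        # Mirror the documented contract: only str titles are accepted.
        if not isinstance(title, str):
            raise TypeError(f"title must be str, got {type(title).__name__}")
        counts[title] = lookup(title)
    return counts


# Stand-in lookup with invented numbers, for illustration only.
fake_counts = {"Paper A": 12, "Paper B": 0}
result = citation_counts(["Paper A", "Paper B"], lambda t: fake_counts[t])
print(result)
```

With paperscraper installed, the real get_citations_from_title could presumably be passed as `lookup` in place of the lambda.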
dump_papers(papers: pd.DataFrame, filepath: str) -> None
Receives a pd.DataFrame with one paper per row and dumps it into a .jsonl file with one paper per line.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
papers | DataFrame | A dataframe of paper metadata, one paper per row. | required |
filepath | str | Path to dump the papers, has to end with | required |
Source code in paperscraper/utils.py
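To make the "one paper per line" layout concrete, here is a minimal standard-library sketch of writing and reading a .jsonl file. It is not the paperscraper implementation: `write_jsonl` and `read_jsonl` are hypothetical helpers, and they take a list of dicts rather than the pd.DataFrame that dump_papers expects.

```python
import json
import os
import tempfile
from typing import Dict, List


def write_jsonl(records: List[Dict], filepath: str) -> None:
    """Write one JSON object per line (hypothetical helper)."""
    with open(filepath, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_jsonl(filepath: str) -> List[Dict]:
    """Read the file back, parsing each non-empty line as JSON."""
    with open(filepath, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Invented example records, one dict per paper.
papers = [
    {"title": "Example paper one", "year": 2020},
    {"title": "Example paper two", "year": 2021},
]
path = os.path.join(tempfile.mkdtemp(), "papers.jsonl")
write_jsonl(papers, path)
restored = read_jsonl(path)
```

Because every line is an independent JSON document, such a file can be appended to or streamed line by line without parsing the whole dump.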
get_scholar_papers(title: str, fields: List = ['title', 'authors', 'year', 'abstract', 'journal', 'citations'], *args, **kwargs) -> pd.DataFrame
Performs a Google Scholar request for a given title and returns the papers with the desired fields.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title | str | Title to search for on Google Scholar. | required |
fields | List | List of strings with fields to keep in output. | ['title', 'authors', 'year', 'abstract', 'journal', 'citations'] |
Returns:
Type | Description |
---|---|
DataFrame | pd.DataFrame. One paper per row. |
Source code in paperscraper/scholar/scholar.py
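The effect of the `fields` parameter can be illustrated on a plain dict. This is only a sketch of the "fields to keep in output" idea: `keep_fields` is a hypothetical helper, the record is invented, and filling absent fields with None is an assumption, since the behavior of get_scholar_papers for missing fields is not stated here.

```python
from typing import Any, Dict, Sequence

# Default field names as listed in the signature above.
DEFAULT_FIELDS = ("title", "authors", "year", "abstract", "journal", "citations")


def keep_fields(paper: Dict[str, Any], fields: Sequence[str] = DEFAULT_FIELDS) -> Dict[str, Any]:
    """Return a copy of `paper` restricted to `fields` (hypothetical helper).

    Assumption: fields absent from the record are filled with None.
    """
    return {field: paper.get(field) for field in fields}


# Invented record with one key ("url") that is not among the kept fields.
record = {"title": "Example paper", "year": 2021, "citations": 3, "url": "https://example.org"}
slim = keep_fields(record, ["title", "citations"])
full = keep_fields(record)
```

Requesting only the columns that are needed keeps the resulting table small when many titles are queried.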
get_and_dump_scholar_papers(title: str, output_filepath: str, fields: List = ['title', 'authors', 'year', 'abstract', 'journal', 'citations']) -> None
Combines get_scholar_papers and dump_papers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title | str | Paper to search for on Google Scholar. | required |
output_filepath | str | Path where the dump will be saved. | required |
fields | List | List of strings with fields to keep in output. | ['title', 'authors', 'year', 'abstract', 'journal', 'citations'] |
Source code in paperscraper/scholar/scholar.py
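The "combines" relationship can be sketched as a fetch step followed by a dump step. The code below is an illustration under stated assumptions, not the library source: `get_and_dump`, `fake_fetch`, and `fake_dump` are hypothetical names, and the two stand-ins only record that they were called, so the sketch runs offline.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Default field names as listed in the signature above.
DEFAULT_FIELDS = ("title", "authors", "year", "abstract", "journal", "citations")


def get_and_dump(
    title: str,
    output_filepath: str,
    fetch: Callable[[str, List[str]], Any],
    dump: Callable[[Any, str], None],
    fields: Sequence[str] = DEFAULT_FIELDS,
) -> None:
    """Fetch papers for `title`, then write them to `output_filepath`."""
    papers = fetch(title, list(fields))
    dump(papers, output_filepath)


calls: List[Tuple[str, str]] = []


def fake_fetch(title: str, fields: List[str]) -> List[dict]:
    # Stand-in for a function shaped like get_scholar_papers.
    calls.append(("fetch", title))
    return [{"title": title}]


def fake_dump(papers: Any, filepath: str) -> None:
    # Stand-in for a function shaped like dump_papers.
    calls.append(("dump", filepath))


get_and_dump("Example title", "out.jsonl", fake_fetch, fake_dump)
```

Keeping the two steps separate also allows the fetched table to be inspected or filtered before anything is written to disk.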