API Reference

This section documents the public API of paperscraper.

Below you’ll find links to the documentation for each module:

Citation

If you use paperscraper, please cite a paper that motivated the development of this tool:

@article{born2021trends,
  title={Trends in Deep Learning for Property-driven Drug Design},
  author={Born, Jannis and Manica, Matteo},
  journal={Current Medicinal Chemistry},
  volume={28},
  number={38},
  pages={7862--7886},
  year={2021},
  publisher={Bentham Science Publishers}
}


Top-level API

paperscraper

Initialize the module.

dump_queries(keywords: List[List[Union[str, List[str]]]], dump_root: str) -> None

Performs a keyword search on all available servers and dumps the results.

Parameters:

Name | Type | Description | Default
keywords | List[List[Union[str, List[str]]]] | List of lists of keywords. Each second-level list is treated as a separate query. Within a query, the items (whether str or List[str]) are combined with AND. If an item is itself a list, its strings are treated as synonyms (combined with OR). | required
dump_root | str | Path to root for dumping. | required

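The nesting of `keywords` is easier to read with a concrete value. The following is a minimal sketch, not part of paperscraper: `render_query` is a hypothetical helper that only spells out the AND/OR reading described above, and the search terms are arbitrary examples.

```python
from typing import List, Union

# One query: a list whose items are either a term or a list of synonyms.
Query = List[Union[str, List[str]]]


def render_query(query: Query) -> str:
    """Hypothetical helper (not in paperscraper): spell out a query's boolean logic."""
    parts = []
    for item in query:
        if isinstance(item, str):
            # A plain string is a single required term.
            parts.append(item)
        else:
            # A nested list holds synonyms, any of which may match.
            parts.append("(" + " OR ".join(item) + ")")
    # Items within one query must all match.
    return " AND ".join(parts)


covid = ["COVID-19", "SARS-CoV-2"]
ai = ["Artificial intelligence", "Deep learning", "Machine learning"]

# Two separate queries, one per second-level list.
keywords: List[Query] = [
    [covid, ai],
    ["drug design", ai],
]

for query in keywords:
    print(render_query(query))

# With paperscraper installed and network access available, the same
# structure would be passed on as:
#   from paperscraper import dump_queries
#   dump_queries(keywords, "dumps")  # "dumps" is an arbitrary example path
```

The first query reads as "(COVID-19 OR SARS-CoV-2) AND (Artificial intelligence OR Deep learning OR Machine learning)"; the second requires "drug design" together with any of the three synonyms.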
Source code in paperscraper/__init__.py
def dump_queries(keywords: List[List[Union[str, List[str]]]], dump_root: str) -> None:
    """Performs a keyword search on all available servers and dumps the results.

    Args:
        keywords (List[List[Union[str, List[str]]]]): List of lists of keywords.
            Each second-level list is treated as a separate query. Within a
            query, the items (whether str or List[str]) are combined with AND.
            If an item is itself a list, its strings are treated as synonyms
            (combined with OR).
        dump_root (str): Path to root for dumping.
    """

    for idx, keyword in enumerate(keywords):
        for db, f in QUERY_FN_DICT.items():
            logger.info(f" Keyword {idx + 1}/{len(keywords)}, DB: {db}")
            filename = get_filename_from_query(keyword)
            os.makedirs(os.path.join(dump_root, db), exist_ok=True)
            f(keyword, output_filepath=os.path.join(dump_root, db, filename))
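The loop above writes one file per query and database, under `dump_root/<db>/<filename>`. The sketch below reproduces only that layout, without any network access. Everything in it is a stand-in: `fake_filename` replaces `get_filename_from_query` (whose real naming scheme may differ), `STUB_QUERY_FNS` replaces `QUERY_FN_DICT`, and the database names are invented for illustration.

```python
import os
import tempfile
from typing import Callable, Dict, List, Union

Query = List[Union[str, List[str]]]


def fake_filename(query: Query) -> str:
    """Stand-in for get_filename_from_query; the real naming scheme may differ."""
    # Use each plain term, or the first synonym of each nested list.
    flat = [item if isinstance(item, str) else item[0] for item in query]
    return "_".join(flat).lower().replace(" ", "") + ".jsonl"


def make_stub(db: str) -> Callable[..., None]:
    """Build a stub query function that writes a placeholder instead of searching."""
    def stub(query: Query, output_filepath: str) -> None:
        with open(output_filepath, "w") as fh:
            fh.write(f"placeholder results from {db}\n")
    return stub


# Stand-in for QUERY_FN_DICT; the database names are illustrative only.
STUB_QUERY_FNS: Dict[str, Callable[..., None]] = {
    db: make_stub(db) for db in ("server_a", "server_b")
}


def dump_queries_sketch(keywords: List[Query], dump_root: str) -> List[str]:
    """Mirror the loop above and return the paths that were written."""
    written = []
    for query in keywords:
        filename = fake_filename(query)
        for db, query_fn in STUB_QUERY_FNS.items():
            # One subdirectory per database, created on demand.
            os.makedirs(os.path.join(dump_root, db), exist_ok=True)
            path = os.path.join(dump_root, db, filename)
            query_fn(query, output_filepath=path)
            written.append(path)
    return written


with tempfile.TemporaryDirectory() as root:
    paths = dump_queries_sketch([["drug design", ["AI", "ML"]]], root)
    relative = sorted(os.path.relpath(p, root) for p in paths)
    print(relative)
```

Because the filename depends only on the query, the same query ends up under the same name in every database subdirectory, which makes results from different servers easy to pair up afterwards.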