<div align="center">
  <img src="./docs/logo.png" width="600"/>
</div>

Documents: https://paperdl.readthedocs.io/
Paperdl
Search and download papers from specific websites.
If this project is helpful for you, you can star this repository to keep track of it. Thank you for your support.
Support List
| Source | Supports Search? | Supports Download? |
| --- | --- | --- |
| scihub | ✗ | ✓ |
| baiduwenku | ✗ | ✓ |
| arxiv | ✓ | ✓ |
| googlescholar | ✓ | ✓ |
Install
Pip install
run "pip install paperdl"
Source code install
(1) Offline
Step 1: git clone https://github.com/CharlesPikachu/paperdl.git
Step 2: cd paperdl, then run "python setup.py install"
(2) Online
run "pip install git+https://github.com/CharlesPikachu/paperdl.git@master"
Quick Start
Calling API
If you want to search for and download papers from arxiv and Google Scholar, you can write code as follows:
```python
from paperdl import paperdl

config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}}
target_srcs = ['arxiv', 'googlescholar']
client = paperdl.Paperdl(config=config)
client.run(target_srcs)
```
In addition, if you cannot access Google, you can set the config as follows:
```python
config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}, 'area': 'CN'}
```
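The client is then constructed and run in the same way; a minimal sketch reusing the calls from the example above:

```python
# same usage as before, only the config changes
client = paperdl.Paperdl(config=config)
client.run(['arxiv', 'googlescholar'])
```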
You can also download papers via Sci-Hub only, as follows:
```python
from paperdl import paperdl

config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}}
client = paperdl.SciHub(config=config, logger_handle=paperdl.Logger('paper.log'))
paperinfo = {
    'savename': '9193963',
    'ext': 'pdf',
    'savedir': 'outputs',
    'input': 'https://ieeexplore.ieee.org/document/9193963/',
    'source': 'scihub',
}
client.download([paperinfo])
```
Calling EXE
```
Usage: paperdl [OPTIONS]

Options:
  --version               Show the version and exit.
  -m, --mode TEXT         the used mode, support "search" and "download"
  -i, --inp TEXT          the paper to download, the supported format is the
                          same as sci-hub
  -s, --source TEXT       the used source, support "arxiv", "scihub" and
                          "googlescholar", you can use "," to split multi
                          sources
  -d, --savedir TEXT      the directory for saving papers
  -l, --logfilepath TEXT  the logging filepath
  -z, --size INTEGER      search size per source
  -p, --proxies TEXT      the proxies to be adopted
  -a, --area TEXT         your area, support "CN" and "EN"
  -c, --cookie TEXT       the cookie copied from the target website, only used
                          in "baiduwenku"
  --help                  Show this message and exit.
```
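For example, a download run via Sci-Hub might look like this (a minimal sketch, assuming the paperdl console script is on your PATH; the input URL reuses the example from the API section above):

```sh
paperdl -m download -s scihub -i https://ieeexplore.ieee.org/document/9193963/ -d papers -l paperdl.log
```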
Screenshot
Projects in Charles_pikachu
- Games: Create interesting games in pure Python.
- DecryptLogin: APIs for logging in to some websites using requests.
- Musicdl: A lightweight music downloader written in pure Python.
- Videodl: A lightweight video downloader written in pure Python.
- Pytools: Some useful tools written in pure Python.
- PikachuWeChat: Play WeChat with itchat-uos.
- Pydrawing: Beautify your images or videos.
- ImageCompressor: Image compressors written in pure Python.
- FreeProxy: Collect free proxies from the internet.
- Paperdl: Search and download papers from specific websites.
- Sciogovterminal: Browse "The State Council Information Office of the People's Republic of China" in the terminal.
- CodeFree: Make no-code a reality.
- DeepLearningToys: Some deep learning toys implemented in PyTorch.
- DataAnalysis: Some data analysis projects by charles_pikachu.
- Imagedl: Search and download images from specific websites.
- Pytoydl: A toy deep learning framework built upon numpy.
- NovelDL: Search and download novels from some specific websites.
More
WeChat Official Accounts
Charles_pikachu