WikiExtractor for WikiMentions
This is a modified version of the great WikiExtractor, with an additional option to extract the internal Wikipedia links from each article.
If you run the following command, with enwiki-XXXXXXXX-pages-articles1.xml-XXXXXXXX.bz2 replaced by an actual dump file:
python WikiExtractor.py --json --filter_disambig_pages --processes 2 --collect_links enwiki-XXXXXXXX-pages-articles1.xml-XXXXXXXX.bz2 -o test
then each article's dictionary contains an additional field 'internal_links'. Please see this notebook for a HOWTO and a code snippet for reading the data.
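As a rough illustration of reading the output, here is a minimal sketch. It assumes the usual WikiExtractor --json layout, where the output directory (test in the command above) holds subdirectories containing uncompressed files named wiki_00, wiki_01, and so on, each with one JSON object per line. The helper name iter_articles is made up for this example, and the structure of 'internal_links' is not assumed here, so the field is only fetched and its type printed.

```python
import json
from pathlib import Path


def iter_articles(output_dir):
    """Yield one dict per article from a WikiExtractor --json output directory.

    Hypothetical helper, not part of WikiExtractor itself.
    """
    # Assumption: output files are named wiki_* and sit in subdirectories
    # of output_dir (for example AA/wiki_00), uncompressed.
    for path in sorted(Path(output_dir).rglob("wiki_*")):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


if __name__ == "__main__":
    # "test" is the output directory passed via -o in the command above.
    for article in iter_articles("test"):
        # .get() avoids a KeyError if the field is missing for some reason.
        links = article.get("internal_links")
        print(article.get("title"), type(links).__name__)
        break  # only inspect the first article
```

Because the function is a generator, articles are read one at a time rather than loaded into memory all at once, which matters for full dumps.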
For the full README, please consult https://github.com/attardi/wikiextractor. Note, however, that I have not tested my modifications with options other than those used in the command above.