cascadia
<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section --> <!-- ALL-CONTRIBUTORS-BADGE:END -->
cascadia - CSS selector CLI tool
The Go Cascadia package implements CSS selectors for HTML. This is the command line tool for it. It started as a thin wrapper around that package, but is growing into a better tool for testing CSS selectors without writing Go code:
Usage
$ cascadia
cascadia wrapper
Version 1.3.0 built on 2023-06-30
Copyright (C) 2016-2023, Tong Sun
Command line interface to go cascadia CSS selectors package
Usage:
cascadia -i in -c css -o [Options...]
Options:
-h, --help display help information
-i, --in *The html/xml file to read from (or stdin)
-o, --out *The output file (or stdout)
-c, --css *CSS selectors (can provide more if not using --piece)
-t, --text Text output for none-block selection mode
-R, --Raw Raw text output, no trimming of leading and trailing white space
-p, --piece sub CSS selectors within -css to split that block up into pieces
format: PieceName=[PieceStyle:]selector_string
PieceStyle:
RAW : will return the selected as-is
ATTR : will return the value of attribute selector_string
Else the text will be returned
-d, --delimiter delimiter for pieces csv output [= ]
-w, --wrap-html wrap up the output with html tags
-y, --style style component within the wrapped html head
-b, --base base href tag used in the wrapped up html
-q, --quiet be quiet
Its output has two modes, none-block selection mode and block selection mode, depending on whether the --piece parameter is given on the command line or not.
For details about the concept of block and pieces, check out andrew-d/goscrape (in fact, cascadia was initially developed just for it, so that I would not need to tweak Go code, then build and run it, just to test out the block and pieces selectors). Here is the excerpt:
- Inside each page, there's 1 or more blocks - some logical method of splitting up a page into subcomponents.
- Inside each block, you define some number of pieces of data that you wish to extract. Each piece consists of a name, a selector, and what data to extract from the current block.
This all sounds rather complicated, but in practice it's quite simple. See the next section for details.
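The piece format listed under Usage, PieceName=[PieceStyle:]selector_string, can be illustrated with a small Go sketch. This is not the tool's actual implementation: the Piece type and the ParsePiece function are hypothetical names introduced here, and the only behavior assumed is what the help text states (an optional RAW: or ATTR: prefix, otherwise plain text).

```go
package main

import (
	"fmt"
	"strings"
)

// Piece is a hypothetical in-memory form of one --piece argument.
type Piece struct {
	Name     string // piece name, taken from the text before the first "="
	Style    string // "RAW", "ATTR", or "" when the text is to be returned
	Selector string // selector_string (an attribute name when Style is "ATTR")
}

// ParsePiece splits "PieceName=[PieceStyle:]selector_string" into its parts.
// It is an illustrative helper, not a function exported by cascadia.
func ParsePiece(spec string) (Piece, error) {
	// Split on the first "=" only, so selectors such as input[name=Sex] survive.
	name, rest, ok := strings.Cut(spec, "=")
	if !ok || name == "" || rest == "" {
		return Piece{}, fmt.Errorf("invalid piece %q: want PieceName=[PieceStyle:]selector_string", spec)
	}
	p := Piece{Name: name, Selector: rest}
	for _, style := range []string{"RAW", "ATTR"} {
		if strings.HasPrefix(rest, style+":") {
			p.Style = style
			p.Selector = strings.TrimPrefix(rest, style+":")
			break
		}
	}
	return p, nil
}

func main() {
	for _, spec := range []string{"Title=h2 > a", "Link=ATTR:href", "Body=RAW:div.content"} {
		p, err := ParsePiece(spec)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("name=%s style=%q selector=%s\n", p.Name, p.Style, p.Selector)
	}
}
```

The three sample specs are made up for illustration. On the command line, each would be passed as its own --piece argument alongside the -c selector that picks the blocks.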
In summary,
- The none-block selection mode will output the selection as HTML source by default
  - but if -t, or --text cli option is provided, the none-block selection mode will output as text instead.
    - By default, such text output will get their leading and trailing white space trimmed.
    - However, if -R, or --Raw cli option is provided, no trimming will be done.
- The block selection mode will output HTML as text in a tsv/csv table form by default
  - if the --piece selection is prefixed with RAW:, then that specific block selection will output in HTML instead. See the following for details.
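The two text rules summarized above (trimming unless raw output is requested, and delimiter-joined pieces in block mode) can be sketched in Go as follows. TextOut and Row are hypothetical helper names, not part of cascadia, and the tab delimiter in the example is an assumption based on the tsv wording above.

```go
package main

import (
	"fmt"
	"strings"
)

// TextOut mimics the documented text-output rule: leading and trailing
// white space is trimmed unless raw output was requested (as with -R / --Raw).
func TextOut(s string, raw bool) string {
	if raw {
		return s
	}
	return strings.TrimSpace(s)
}

// Row joins the extracted pieces of one block into a single table line.
// The delimiter is supplied by the caller, as with -d / --delimiter.
func Row(pieces []string, delimiter string) string {
	return strings.Join(pieces, delimiter)
}

func main() {
	fmt.Printf("%q\n", TextOut("  Female \n", false)) // prints "Female"
	fmt.Printf("%q\n", TextOut("  Female \n", true))  // prints "  Female \n"
	fmt.Printf("%q\n", Row([]string{"Sex", "F"}, "\t")) // prints "Sex\tF"
}
```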
Examples
All three options, -i, -o, and -c, are required. By default, it reads from stdin and outputs to stdout:
$ echo '<input type="radio" name="Sex" value="F" />' | tee /tmp/cascadia.xml | cascadia -i -o -c 'input[name=Sex][value=F]'
1 elements for 'input[name=Sex][value=F]':
<input type="radio" name="Sex" value="F"/>
Either the input or the output can be followed by a file name:
$ cascadia -i /tmp/cascadia.xml -o -c 'input[name=Sex][value=F]'
1 elements for 'input[name=Sex][value=F]':
<input type="radio" name="Sex" value="F"/>
$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html
1 elements for 'input[name=Sex][value=F]':
$ cat /tmp/out.html
<input type="radio" name="Sex" value="F"/>
Other options can be applied too:
# using --wrap-html
$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html -w
1 elements for 'input[name=Sex][value=F]':
$ cat /tmp/out.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<base href="">
</head>
<body>
<input type="radio" name="Sex" value="F"/>
</body>
# using --wrap-html with --style
$ cascadia -i /tmp/cascadia.xml -c 'input[name=Sex][value=F]' -o /tmp/out.html -w -y '<link rel="stylesheet" href="styles.css">'
1 elements for 'input[name=Sex][value=F]':
$ cat /tmp/out.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<base href="">
<link rel="stylesheet" href="styles.css">
</head>
<body>
<input type="radio" name="Sex" value="F"/>
</body>
- For more on using the --style option, check out "adding styles".
- For more examples, check out the wiki.
Install Debian/Ubuntu package
sudo apt install -y cascadia
Download/install binaries
- The latest binary executables are available as the result of the Continuous-Integration (CI) process.
- I.e., they are built automatically right from the source code at every git release by GitHub Actions.
- There are two ways to get/install such binary executables
- Using the binary executables directly, or
- Using packages for your distro
The binary executables
- The latest binary executables are directly available under https://github.com/suntong/cascadia/releases/latest
- Pick & choose the one that suits your OS and its architecture. E.g., for Linux, it would be the cascadia_verxx_linux_amd64.tar.gz file.
- Available OS for binary executables are
  - Linux
  - Mac OS (darwin)
  - Windows
- If your OS and its architecture is not available in the download list, please let me know and I'll add it.
- The manual installation is just to unpack it and move/copy the binary executable to somewhere in PATH. For example,
tar -xvf cascadia_*_linux_amd64.tar.gz
sudo mv -v cascadia_*_linux_amd64/cascadia /usr/local/bin/
rmdir -v cascadia_*_linux_amd64
Distro package
The repo setup instruction URL has been given above. For example, for Debian:
Debian package
curl -1sLf \
'https://dl.cloudsmith.io/public/suntong/repo/setup.deb.sh' \
| sudo -E bash
# That's it. You then can do your normal operations, like
sudo apt update
apt-cache policy cascadia
sudo apt install -y cascadia
Install Source
To install the source code instead:
go install github.com/suntong/cascadia@latest
Author
Tong SUN
Powered by WireFrame
the one-stop wire-framing solution for Go cli based projects, from init to deploy.
Contributors ✨
Thanks goes to these wonderful people (emoji key):
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section --> <!-- prettier-ignore-start --> <!-- markdownlint-disable --> <table> <tbody> <tr> <td align="center" valign="top" width="14.28%"><a href="https://github.com/suntong"><img src="https://avatars.githubusercontent.com/u/422244?v=4?s=100" width="100px;" alt="suntong"/><br /><sub><b>suntong</b></sub></a><br /><a href="https://github.com/suntong/cascadia/commits?author=suntong" title="Code">💻</a> <a href="#ideas-suntong" title="Ideas, Planning, & Feedback">🤔</a> <a href="#design-suntong" title="Design">🎨</a> <a href="#data-suntong" title="Data">🔣</a> <a href="https://github.com/suntong/cascadia/commits?author=suntong" title="Tests">⚠️</a> <a href="https://github.com/suntong/cascadia/issues?q=author%3Asuntong" title="Bug reports">🐛</a> <a href="https://github.com/suntong/cascadia/commits?author=suntong" title="Documentation">📖</a> <a href="#blog-suntong" title="Blogposts">📝</a> <a href="#example-suntong" title="Examples">💡</a> <a href="#tutorial-suntong" title="Tutorials">✅</a> <a href="#tool-suntong" title="Tools">🔧</a> <a href="#platform-suntong" title="Packaging/porting to new platform">📦</a> <a href="https://github.com/suntong/cascadia/pulls?q=is%3Apr+reviewed-by%3Asuntong" title="Reviewed Pull Requests">👀</a> <a href="#question-suntong" title="Answering Questions">💬</a> <a href="#maintenance-suntong" title="Maintenance">🚧</a> <a href="#infra-suntong" title="Infrastructure (Hosting, Build-Tools, etc)">🚇</a></td> <td align="center" valign="top" width="14.28%"><a href="https://github.com/hoshsadiq"><img src="https://avatars.githubusercontent.com/u/600045?v=4?s=100" width="100px;" alt="Hosh"/><br /><sub><b>Hosh</b></sub></a><br /><a href="https://github.com/suntong/cascadia/commits?author=hoshsadiq" title="Code">💻</a> <a href="https://github.com/suntong/cascadia/issues?q=author%3Ahoshsadiq" title="Bug reports">🐛</a> <a href="#userTesting-hoshsadiq" title="User Testing">📓</a></td> <td 
align="center" valign="top" width="14.28%"><a href="https://github.com/mh-cbon"><img src="https://avatars.githubusercontent.com/u/17096799?v=4?s=100" width="100px;" alt="mh-cbon"/><br /><sub><b>mh-cbon</b></sub></a><br /><a href="https://github.com/suntong/cascadia/issues?q=author%3Amh-cbon" title="Bug reports">🐛</a> <a href="#ideas-mh-cbon" title="Ideas, Planning, & Feedback">🤔</a> <a href="#userTesting-mh-cbon" title="User Testing">📓</a></td> <td align="center" valign="top" width="14.28%"><a href="https://www.digglife.net"><img src="https://avatars.githubusercontent.com/u/1468378?v=4?s=100" width="100px;" alt="朱聖黎 (Zhu Sheng Li)"/><br /><sub><b>朱聖黎 (Zhu Sheng Li)</b></sub></a><br /><a href="https://github.com/suntong/cascadia/issues?q=author%3Adigglife" title="Bug reports">🐛</a> <a href="#userTesting-digglife" title="User Testing">📓</a></td> <td align="center" valign="top" width="14.28%"><a href="https://github.com/himcc"><img src="https://avatars.githubusercontent.com/u/3031794?v=4?s=100" width="100px;" alt="himcc"/><br /><sub><b>himcc</b></sub></a><br /><a href="https://github.com/suntong/cascadia/commits?author=himcc" title="Code">💻</a> <a href="https://github.com/suntong/cascadia/issues?q=author%3Ahimcc" title="Bug reports">🐛</a> <a href="#userTesting-himcc" title="User Testing">📓</a></td> <td align="center" valign="top" width="14.28%"><a href="http://www.devalias.net/"><img src="https://avatars.githubusercontent.com/u/753891?v=4?s=100" width="100px;" alt="Glenn 'devalias' Grant"/><br /><sub><b>Glenn 'devalias' Grant</b></sub></a><br /><a href="https://github.com/suntong/cascadia/commits?author=0xdevalias" title="Code">💻</a> <a href="https://github.com/suntong/cascadia/issues?q=author%3A0xdevalias" title="Bug reports">🐛</a> <a href="#userTesting-0xdevalias" title="User Testing">📓</a></td> </tr> </tbody> </table> <!-- markdownlint-restore --> <!-- prettier-ignore-end --> <!-- ALL-CONTRIBUTORS-LIST:END -->This project follows the all-contributors specification. 
Contributions of any kind welcome!