Home

Awesome

pod-jaydeesimon-jsoup

A babashka pod for parsing and selecting HTML using CSS queries. Built on top of jsoup.

This project is meant to be an experiment for exposing useful functionality from common libraries I use. In this case, jsoup. I often write one-off programs to scrape HTML and answer questions about them using jsoup's select functionality. With babashka pods, this can be called from a babashka script.

Usage

$ rlwrap bb
Babashka v0.1.0 REPL.
Use :repl/quit or :repl/exit to quit the REPL.
Clojure rocks, Bash reaches.

user=> (require '[babashka.pods :as pods])

user=> (pods/load-pod "./pod-jaydeesimon-jsoup")
#:pod{:id "pod.jaydeesimon.jsoup"}

user=> (require '[pod.jaydeesimon.jsoup :as jsoup])

user=> (-> (curl/get "https://clojure.org")
           :body
           (jsoup/select "div.clj-header-message")
           first
           :text)
"Clojure is a robust, practical, and fast programming language with a set of useful features that together form a simple, coherent, and powerful tool."

Alternatively, you can load the pod by using lein run or java -jar and pass it the uberjar.

user=> (pods/load-pod ["lein" "run" "-m" "pod-jaydeesimon-jsoup.core"])

or

;; do `lein uberjar` to make sure the jar is built
user=> (pods/load-pod ["java" "-jar" "target/uberjar/pod-jaydeesimon-jsoup-0.1-standalone.jar"])

Available Vars

This pod exposes just one var.

Credits