Shiva
Shiva library: a Rust implementation of a parser and generator for documents of any type
Features
- Common Document Model (CDM) for all document types
- Parsers produce CDM
- Generators consume CDM
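The parse-to-model, model-to-output flow described above can be sketched in a few lines. This is a minimal illustration only: `Element`, `Document`, `parse_markdown`, `generate_html` and `generate_text` are hypothetical stand-ins written for this example, not Shiva's actual types or functions, and the parser handles only a tiny Markdown subset.

```rust
// Hypothetical stand-ins for illustration; these are NOT Shiva's real types.
#[derive(Debug, Clone, PartialEq)]
enum Element {
    Header { level: u8, text: String },
    Paragraph { text: String },
}

/// The common model: every parser produces it, every generator consumes it.
#[derive(Debug, Clone, PartialEq)]
struct Document {
    elements: Vec<Element>,
}

/// Parser: a tiny Markdown subset (ATX headers and one-line paragraphs) -> model.
fn parse_markdown(input: &str) -> Document {
    let mut elements = Vec::new();
    for line in input.lines().map(str::trim).filter(|l| !l.is_empty()) {
        // Count leading '#' characters; '#' is one byte, so the count is a valid index.
        let level = line.chars().take_while(|c| *c == '#').count();
        if (1..=6).contains(&level) && line[level..].starts_with(' ') {
            elements.push(Element::Header {
                level: level as u8,
                text: line[level..].trim().to_string(),
            });
        } else {
            elements.push(Element::Paragraph { text: line.to_string() });
        }
    }
    Document { elements }
}

/// Generator: model -> minimal HTML (no escaping, to keep the sketch short).
fn generate_html(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        match element {
            Element::Header { level, text } => {
                out.push_str(&format!("<h{level}>{text}</h{level}>\n"))
            }
            Element::Paragraph { text } => out.push_str(&format!("<p>{text}</p>\n")),
        }
    }
    out
}

/// Generator: model -> plain text, one element per line.
fn generate_text(doc: &Document) -> String {
    let mut out = String::new();
    for element in &doc.elements {
        match element {
            Element::Header { text, .. } | Element::Paragraph { text } => {
                out.push_str(text);
                out.push('\n');
            }
        }
    }
    out
}

fn main() {
    // One parse, two generators: neither generator knows the input was Markdown.
    let doc = parse_markdown("# Title\n\nHello, world.");
    print!("{}", generate_html(&doc));
    print!("{}", generate_text(&doc));
}
```

With a shared model, supporting N formats takes N parsers and N generators rather than a dedicated converter for every pair of formats.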
Common Document Model
Supported document types
Document type | Parse | Generate |
---|---|---|
Plain text | + | + |
Markdown | + | + |
HTML | + | + |
PDF | + | + |
JSON | + | + |
XML | + | + |
CSV | + | + |
RTF | + | + |
DOCX | + | + |
XLS | + | - |
XLSX | + | + |
ODS | + | + |
Typst | - | + |
Parse document features
Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
---|---|---|---|---|---|---|---|---|
Plain text | - | + | - | - | - | - | - | - |
Markdown | + | + | + | + | + | + | - | - |
HTML | + | + | + | + | + | + | - | - |
PDF | - | + | + | - | - | - | - | - |
DOCX | + | + | + | + | - | + | - | - |
RTF | + | + | + | + | - | + | + | + |
JSON | + | + | + | + | - | + | + | + |
XML | + | + | + | + | + | + | + | + |
CSV | - | - | - | + | - | - | - | - |
XLS | - | - | - | + | - | - | - | - |
XLSX | - | - | - | + | - | - | - | - |
ODS | - | - | - | + | - | - | - | - |
Generate document features
Document type | Header | Paragraph | List | Table | Image | Hyperlink | PageHeader | PageFooter |
---|---|---|---|---|---|---|---|---|
Plain text | + | + | + | + | - | + | + | + |
Markdown | + | + | + | + | + | + | + | + |
HTML | + | + | + | + | + | + | - | - |
PDF | + | + | + | + | + | + | + | + |
DOCX | + | + | + | + | + | + | - | - |
RTF | + | + | + | + | + | + | - | - |
JSON | + | + | + | + | - | + | + | + |
XML | + | + | + | + | + | + | + | + |
CSV | - | - | - | + | - | - | - | - |
XLSX | - | - | - | + | - | - | - | - |
ODS | - | - | - | + | - | - | - | - |
Typst | + | + | + | + | + | + | + | + |
Using the Shiva library
Cargo.toml
[dependencies]
shiva = { version = "1.4.9", features = ["html", "markdown", "text", "pdf", "json",
"csv", "rtf", "docx", "xml", "xls", "xlsx", "ods", "typst"] }
main.rs
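// `parse` and `generate` are methods of this trait, so it must be in scope.
use shiva::core::TransformerTrait;

fn main() {
    // Read the source file; `bytes::Bytes` requires the `bytes` crate as a dependency.
    let input_vec = std::fs::read("input.html").unwrap();
    let input_bytes = bytes::Bytes::from(input_vec);
    // Parse HTML into the Common Document Model, then generate Markdown from it.
    let document = shiva::html::Transformer::parse(&input_bytes).unwrap();
    let output_bytes = shiva::markdown::Transformer::generate(&document).unwrap();
    std::fs::write("out.md", output_bytes).unwrap();
}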
Shiva CLI & Server
Build the Shiva CLI and Shiva Server executables
git clone https://github.com/igumnoff/shiva.git
cd shiva/cli
cargo build --release
Run the Shiva CLI
cd ./target/release/
./shiva README.md README.html
Run Shiva Server
cd ./target/release/
./shiva-server --port=8080 --host=127.0.0.1
Who uses Shiva
Contributing
I would love to see contributions from the community. If you experience bugs, feel free to open an issue. If you would like to implement a new feature or bug fix, please follow these steps:
- Fork the repository
- Comment on the issue to say that you are going to work on it
- Create a pull request
If you would like to add a new document type, you need to implement the following traits:
Required: shiva::core::TransformerTrait
pub trait TransformerTrait {
    fn parse(document: &Bytes) -> anyhow::Result<Document>;
    fn generate(document: &Document) -> anyhow::Result<Bytes>;
}
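As a rough sketch of what implementing a trait of this shape involves, the example below defines a transformer for a made-up "one paragraph per line" format. To stay compilable without external crates it uses stand-ins: `Bytes` is aliased to `Vec<u8>`, `Result` to a `String`-error result, `Document` is a simplified struct, and the trait is redeclared locally. `LineTransformer` is hypothetical. A real implementation would use `bytes::Bytes`, `anyhow::Result` and Shiva's own `Document` instead.

```rust
// Stand-ins so the sketch builds with the standard library only (assumptions,
// not Shiva's definitions): swap in bytes::Bytes, anyhow::Result and Shiva's
// Document in a real implementation.
type Bytes = Vec<u8>;
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, PartialEq)]
struct Document {
    paragraphs: Vec<String>,
}

// Local copy of the trait shape shown above.
trait TransformerTrait {
    fn parse(document: &Bytes) -> Result<Document>;
    fn generate(document: &Document) -> Result<Bytes>;
}

/// Hypothetical transformer for a "one paragraph per line" text format.
struct LineTransformer;

impl TransformerTrait for LineTransformer {
    fn parse(document: &Bytes) -> Result<Document> {
        // Reject input that is not valid UTF-8 instead of panicking.
        let text = std::str::from_utf8(document).map_err(|e| format!("invalid UTF-8: {e}"))?;
        let paragraphs = text
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect();
        Ok(Document { paragraphs })
    }

    fn generate(document: &Document) -> Result<Bytes> {
        // Serialize each paragraph on its own line.
        Ok(document.paragraphs.join("\n").into_bytes())
    }
}

fn main() {
    let input: Bytes = b"first\n\n  second  \n".to_vec();
    let doc = LineTransformer::parse(&input).expect("input is valid UTF-8");
    let output = LineTransformer::generate(&doc).expect("generation cannot fail here");
    println!("{}", String::from_utf8_lossy(&output));
}
```

Both methods are associated functions with no `self` parameter, so a transformer is typically a unit struct that only carries the implementation.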
Optional: shiva::core::TransformerWithImageLoaderSaverTrait (for formats that store images outside the document, for example HTML and Markdown)
pub trait TransformerWithImageLoaderSaverTrait {
    fn parse_with_loader<F>(document: &Bytes, image_loader: F) -> anyhow::Result<Document>
    where
        F: Fn(&str) -> anyhow::Result<Bytes>;
    fn generate_with_saver<F>(document: &Document, image_saver: F) -> anyhow::Result<Bytes>
    where
        F: Fn(&Bytes, &str) -> anyhow::Result<()>;
}
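The loader and saver closures might be used along the following lines. This is a sketch under stated assumptions, not Shiva's implementation: `Bytes` and `Result` are standard-library stand-ins for `bytes::Bytes` and `anyhow::Result`, and `Image`, `load_images` and `save_images` are hypothetical helpers that only mirror the closure bounds from the trait above.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Stand-ins (assumptions): a real implementation would use bytes::Bytes and anyhow::Result.
type Bytes = Vec<u8>;
type Result<T> = std::result::Result<T, String>;

/// Hypothetical record of an image that lives outside the document.
#[derive(Debug, PartialEq)]
struct Image {
    path: String,
    data: Bytes,
}

/// Finds Markdown-style `![alt](path)` references and resolves each path via `image_loader`.
fn load_images<F>(document: &Bytes, image_loader: F) -> Result<Vec<Image>>
where
    F: Fn(&str) -> Result<Bytes>,
{
    let text = std::str::from_utf8(document).map_err(|e| e.to_string())?;
    let mut images = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find("![") {
        rest = &rest[start + 2..];
        let Some(mid) = rest.find("](") else { break };
        let after = &rest[mid + 2..];
        let Some(end) = after.find(')') else { break };
        let path = &after[..end];
        // The caller decides where the bytes come from: disk, network, memory.
        images.push(Image { path: path.to_string(), data: image_loader(path)? });
        rest = &after[end + 1..];
    }
    Ok(images)
}

/// Hands every image to `image_saver`, stopping at the first error.
fn save_images<F>(images: &[Image], image_saver: F) -> Result<()>
where
    F: Fn(&Bytes, &str) -> Result<()>,
{
    for image in images {
        image_saver(&image.data, &image.path)?;
    }
    Ok(())
}

fn main() {
    // In-memory "file system" standing in for std::fs::read.
    let mut store: HashMap<String, Bytes> = HashMap::new();
    store.insert("logo.png".to_string(), vec![1, 2, 3]);

    let doc: Bytes = b"Intro ![logo](logo.png) text".to_vec();
    let images = load_images(&doc, |path: &str| {
        store.get(path).cloned().ok_or_else(|| format!("image not found: {path}"))
    })
    .expect("all referenced images exist");

    // The bound is `Fn`, not `FnMut`, so a saver that records state needs interior mutability.
    let saved = RefCell::new(Vec::new());
    save_images(&images, |data: &Bytes, path: &str| {
        saved.borrow_mut().push((path.to_string(), data.len()));
        Ok(())
    })
    .expect("saver never fails here");
    println!("{:?}", saved.into_inner());
}
```

Passing closures keeps the transformer independent of where images are stored. The same parsing code can read from disk in a CLI and from an in-memory map in tests.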