Yo!

Micro Web Crawler in PHP & Manticore

Yo! is a super-thin client-server crawler built on Manticore full-text search.
It is compatible with different networks and includes flexible settings, history snapshots, CLI tools, and an adaptive JS-less UI.

An alternative branch is available for the Gemini Protocol!

Features

Components

Install

Environment

Debian

The Yo search engine uses Manticore as its primary database. If your server is sensitive to power loss, change the default binlog flush strategy to binlog_flush = 1.
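
As a rough illustration of the setting above: binlog_flush belongs in the searchd section of the Manticore configuration, and mode 1 flushes and syncs the binary log on every transaction. The helper below is a hypothetical sketch, not part of Yo!. It only checks whether a given config file contains the setting. The config path in the usage note is an assumption and may differ on your system.

```shell
# Expected fragment in the Manticore config (searchd section):
#   searchd {
#       binlog_flush = 1
#   }

# Hypothetical helper: report whether a config file sets binlog_flush = 1.
# Commented-out lines (starting with #) are not counted as a match.
check_binlog_flush() {
    conf="$1"
    if grep -Eq '^[[:space:]]*binlog_flush[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"; then
        echo "binlog_flush = 1 is set"
    else
        echo "binlog_flush = 1 is not set"
        return 1
    fi
}
```

Usage, assuming the config lives at /etc/manticoresearch/manticore.conf: `check_binlog_flush /etc/manticoresearch/manticore.conf`. Restart Manticore after editing the file so the change takes effect.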

Deployment

The project is under development. To create a new search project, use the dev-main branch:

Development

Update

Init

Usage

Web UI

  1. cd src/webui
  2. php -S 127.0.0.1:8080
  3. open http://127.0.0.1:8080 in a browser

Documentation

CLI

Index

Init

Create the initial index

php src/cli/index/init.php [reset]

Alter

Change an existing index

php src/cli/index/alter.php {operation} {column} {type}

Document

Add

php src/cli/document/add.php URL

Crawl

php src/cli/document/crawl.php

Clean

Optimize the index and apply new configuration rules

php src/cli/document/clean.php [limit]

Search

php src/cli/document/search.php '@title "*"' [limit]
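
The document commands above can be chained into a small workflow. The sketch below is only an illustration: `yo_run` and `yo_index_url` are hypothetical helper names, not part of Yo!. It assumes the index has already been created with init.php, that crawl.php processes URLs queued by add.php, and that the script runs from the repository root. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
yo_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: add a URL, crawl, then search document titles.
# $1 = URL to add, $2 = optional search query (defaults to '@title "*"').
yo_index_url() {
    url="$1"
    default_query='@title "*"'
    query="${2:-$default_query}"
    yo_run php src/cli/document/add.php "$url"
    yo_run php src/cli/document/crawl.php
    # 10 is passed as the optional [limit] argument
    yo_run php src/cli/document/search.php "$query" 10
}
```

For example, `DRY_RUN=1; yo_index_url "http://example.com/"` lists the three commands without touching the index.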

Migration

YGGo

Import an index from a YGGo database

php src/cli/yggo/import.php 'host' 'port' 'user' 'password' 'database' [unique=off] [start=0] [limit=100]

Source DB fields required:

Backup

Logical

SQL text dumps can be useful for public index distribution, but require more computing resources.

Read more

Physical

Physical backups are better suited to infrastructure administration and include the original data binaries.

Read more
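
The two backup strategies might be scripted roughly as follows. This is a sketch under stated assumptions, not the project's own tooling: the function names are hypothetical, searchd is assumed to listen for MySQL-protocol connections on 127.0.0.1:9306, and the installed Manticore version is assumed to support mysqldump and ship the manticore-backup tool. Check the Manticore documentation for the flags recommended for your version.

```shell
# Logical backup: write an SQL text dump to the given file.
# Assumes the MySQL-protocol listener is on 127.0.0.1:9306.
yo_backup_logical() {
    out_file="$1"
    mysqldump -h 127.0.0.1 -P 9306 manticore > "$out_file"
}

# Physical backup: copy the table files into the given directory
# using the manticore-backup tool.
yo_backup_physical() {
    backup_dir="$1"
    manticore-backup --backup-dir="$backup_dir"
}
```

Usage: `yo_backup_logical yo-dump.sql` for a dump suitable for distribution, or `yo_backup_physical /path/to/backups` for an administrative backup.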

Instances

Yggdrasil