<p align="center"> <img height="400px" src="assets/logo.png"> </p> <p align="center"> <b>Seamlessly fetch paginated data from any source!</b> </p> <p align="center"> <a href="https://pkg.go.dev/github.com/cyucelen/walker?tab=doc"> <img src="https://img.shields.io/badge/go.dev-reference-007d9c?logo=go&logoColor=white" alt="godoc" title="godoc"/> </a> <a href="https://github.com/cyucelen/walker/tags"> <img src="https://img.shields.io/github/v/tag/cyucelen/walker" alt="semver tag" title="semver tag"/> </a> <a href="https://github.com/cyucelen/walker/actions/workflows/go.yml"> <img src="https://img.shields.io/github/actions/workflow/status/cyucelen/walker/go.yml?branch=master" /> </a> <a href="https://codecov.io/gh/cyucelen/walker"> <img src="https://codecov.io/gh/cyucelen/walker/branch/master/graph/badge.svg" /> </a> <a href="https://goreportcard.com/report/github.com/cyucelen/walker"> <img src="https://goreportcard.com/badge/github.com/cyucelen/walker" /> </a> <a href="https://github.com/cyucelen/walker/blob/master/LICENSE"> <img src="https://img.shields.io/github/license/cyucelen/walker.svg"> </a> </p>

walker

Walker simplifies fetching paginated data from any data source. You can configure the start position and the number of documents to fetch per request, and Walker can run fetches in parallel to retrieve data faster.

The main purpose of the library is to walk through the pagination of API endpoints. With NewApiWalker, you can fetch data from any paginated API endpoint and process it concurrently. You can also create your own custom walker to fit your use case.

Features

Examples

Basic Usage

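package main

import (
	"fmt"

	"github.com/cyucelen/walker"
)

// source returns one batch of data; here it simply echoes its arguments.
func source(start, fetchCount int) ([]int, error) {
	return []int{start, fetchCount}, nil
}

// sink receives each batch; calling stop ends the walk.
func sink(result []int, stop func()) error {
	fmt.Println(result)
	return nil
}

func main() {
	walker.New(source, sink).Walk()
}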

Output:

[0 10]
[1 10]
[4 10]
[2 10]
[3 10]
[5 10]
[8 10]
[9 10]
[7 10]
[6 10]
...
to Infinity
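
The out-of-order output above is what you would expect when several workers call the source at the same time. The following is a minimal standard-library sketch of that idea, not walker's actual implementation: walkPages and collectStarts are hypothetical helpers introduced only for illustration, and the page count is bounded so the program terminates.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// walkPages is a hypothetical helper, not part of walker's API. It calls
// source for start values 0..pages-1 from `workers` goroutines and passes
// each result to sink under a mutex, so sink never runs concurrently.
func walkPages(pages, batchSize, workers int,
	source func(start, fetchCount int) ([]int, error),
	sink func(result []int) error) error {

	jobs := make(chan int)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for start := range jobs {
				result, err := source(start, batchSize)
				mu.Lock()
				if err == nil {
					err = sink(result)
				}
				if err != nil && firstErr == nil {
					firstErr = err // remember only the first failure
				}
				mu.Unlock()
			}
		}()
	}

	for start := 0; start < pages; start++ {
		jobs <- start
	}
	close(jobs)
	wg.Wait()
	return firstErr
}

// collectStarts runs the sketch with an echoing source and returns the
// start values the sink saw, sorted because arrival order is not fixed.
func collectStarts(pages, workers int) ([]int, error) {
	source := func(start, fetchCount int) ([]int, error) {
		return []int{start, fetchCount}, nil
	}
	var starts []int
	sink := func(result []int) error {
		starts = append(starts, result[0])
		return nil
	}
	err := walkPages(pages, 10, workers, source, sink)
	sort.Ints(starts)
	return starts, err
}

func main() {
	starts, err := collectStarts(10, 4)
	fmt.Println(starts, err) // [0 1 2 3 4 5 6 7 8 9] <nil>
}
```

Every start value is delivered exactly once, but the order in which the sink sees them depends on goroutine scheduling, which is why the sketch sorts before printing.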

Walking through the pagination of API endpoints

Fetching all the breweries from Open Brewery DB:

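package main

import (
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/cyucelen/walker"
)

// buildRequest creates the request for the page described by start and fetchCount.
func buildRequest(start, fetchCount int) (*http.Request, error) {
	url := fmt.Sprintf("https://api.openbrewerydb.org/breweries?page=%d&per_page=%d", start, fetchCount)
	return http.NewRequest(http.MethodGet, url, http.NoBody)
}

// sink decodes one page and stops the walk when the page is empty.
func sink(res *http.Response, stop func()) error {
	defer res.Body.Close() // release the connection once the body is read

	var payload []map[string]any
	if err := json.NewDecoder(res.Body).Decode(&payload); err != nil {
		return err
	}

	if len(payload) == 0 {
		stop()
		return nil
	}

	// saveBreweries is application code and is not shown here.
	return saveBreweries(payload)
}

func main() {
	walker.NewApiWalker(http.DefaultClient, buildRequest, sink).Walk()
}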

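To create an API walker, you just need to provide an HTTP client, a function that builds the request for a given start and fetchCount, and a sink function that processes each response.

Check the examples for more use cases.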
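
To try the stop-on-empty-page pattern without calling a real API, the request and sink logic can be exercised against a local test server. The sketch below uses only the standard library and assumes a fake endpoint with 1-based page and per_page query parameters. newFakeAPI and countItems are hypothetical helpers, and the sequential loop is a simplified stand-in for the walker, not its implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// newFakeAPI serves `total` items, paginated by ?page= (1-based) and ?per_page=.
func newFakeAPI(total int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		page, _ := strconv.Atoi(r.URL.Query().Get("page"))
		perPage, _ := strconv.Atoi(r.URL.Query().Get("per_page"))
		items := []map[string]any{} // encodes as [] when the page is empty
		for i := (page - 1) * perPage; i >= 0 && i < page*perPage && i < total; i++ {
			items = append(items, map[string]any{"id": i})
		}
		json.NewEncoder(w).Encode(items)
	}))
}

// countItems requests pages one by one until the sink calls stop,
// and returns how many items were received in total.
func countItems(client *http.Client, baseURL string, perPage int) (int, error) {
	count := 0
	stopped := false
	stop := func() { stopped = true }

	// sink mirrors the example above: decode the page, stop when it is empty.
	sink := func(res *http.Response, stop func()) error {
		defer res.Body.Close()
		var payload []map[string]any
		if err := json.NewDecoder(res.Body).Decode(&payload); err != nil {
			return err
		}
		if len(payload) == 0 {
			stop()
			return nil
		}
		count += len(payload)
		return nil
	}

	for page := 1; !stopped; page++ {
		url := fmt.Sprintf("%s/breweries?page=%d&per_page=%d", baseURL, page, perPage)
		req, err := http.NewRequest(http.MethodGet, url, http.NoBody)
		if err != nil {
			return count, err
		}
		res, err := client.Do(req)
		if err != nil {
			return count, err
		}
		if err := sink(res, stop); err != nil {
			return count, err
		}
	}
	return count, nil
}

func main() {
	srv := newFakeAPI(25)
	defer srv.Close()

	n, err := countItems(srv.Client(), srv.URL, 10)
	fmt.Println(n, err) // 25 <nil>
}
```

With 25 items and 10 per page, the loop reads pages 1 to 3, receives an empty page 4, and stops.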

Configuration

| Option | Description | Default | Available Values |
| --- | --- | --- | --- |
| WithPagination | Defines the pagination strategy | walker.OffsetPagination{} | walker.OffsetPagination{}, walker.CursorPagination{} |
| WithMaxBatchSize | Defines the maximum number of documents to request per fetch | 10 | int |
| WithParallelism | Defines the number of workers that run the provided source | runtime.NumCPU() | int |
| WithLimiter | Defines the document count after which the walk stops | walker.InfiniteLimiter() | walker.InfiniteLimiter(), walker.ConstantLimiter(int) |
| WithRateLimit | Defines the rate limit as a count per duration | unlimited | (int, time.Duration) |
| WithContext | Defines the context | context.Background() | context.Context |

Contribution

Contributions that make walker better and more feature-rich are welcome. Feel free to contribute your use case!