Component Developer Documentation
This repository is for developer documentation related to various VELOC / SCR components. It also contains documentation on policies that apply to all of the repositories.
Status
All open issues for the components can be viewed on the Components Project Board.
Repo | GitHub Actions |
---|---|
KVTree | |
AXL | |
SPath | |
Shuffile | |
Redset | |
Rankstr | |
ER | |
Components Diagram
Component Descriptions
Basic Data Structures and Algorithms
KVTree: Recursive key-value structure
Documentation:
Each KVTree object contains a list of key/value pairs. Each key is a string, and each value is another kvtree object. This makes it a nested data structure, similar to a Python dict or Perl hash. The library provides functions to serialize a kvtree object to and from a file. It also optionally provides MPI send / recv functions to transfer an object from one process to another.
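To make the nesting concrete, here is a minimal sketch that models the structure with the Python dict the description compares it to. It is not the kvtree C API: the key names, the `write_tree` / `read_tree` helpers, and the use of JSON as the on-disk format are all assumptions chosen for illustration.

```python
import json
import os
import tempfile

# Every key is a string and every value is another tree (dict).
# An empty dict marks a leaf. Key names here are hypothetical.
tree = {
    "dataset": {
        "7": {
            "file": {
                "ckpt_rank0.dat": {},
                "ckpt_rank1.dat": {},
            },
        },
    },
}

def write_tree(tree, path):
    """Serialize a tree to a file (JSON stands in for kvtree's own format)."""
    with open(path, "w") as f:
        json.dump(tree, f)

def read_tree(path):
    """Read a tree back from a file written by write_tree."""
    with open(path) as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tree.json")
    write_tree(tree, path)
    restored = read_tree(path)

# Walk down the nested keys to list the files recorded for dataset 7.
files = sorted(restored["dataset"]["7"]["file"])
```

Because values are always trees, a lookup is just a walk down a chain of string keys, and the whole object round-trips through a file unchanged.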
spath: represent and manipulate file system paths
Documentation:
Create an spath object from a string.
The library includes functions to extract components (such as dirname, basename).
It can create an absolute path or compute a relative path from a source path to a destination path.
It can also simplify a path (e.g., convert ../foo//bar to foo/bar).
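The operations listed above can be sketched with Python's standard `posixpath` module, used here as a stand-in for spath rather than as its API. The example paths are made up, and `posixpath`'s normalization rules may differ from spath's in edge cases.

```python
import posixpath

# Hypothetical checkpoint path used only for illustration.
path = "/scratch/user/run1/ckpt.3/rank_0.dat"

# Extract components.
directory = posixpath.dirname(path)
filename = posixpath.basename(path)

# Build an absolute path by joining a relative path onto a base directory.
absolute = posixpath.join("/scratch/user", "run1/ckpt.3")

# Compute a relative path from a source directory to a destination.
relative = posixpath.relpath("/scratch/user/run2/out", start="/scratch/user/run1")

# Simplify a path: collapse repeated slashes and resolve "." and "..".
simplified = posixpath.normpath("foo//bar/./baz/../qux")
```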
rankstr: split processes into groups whose members all have the same input string
Rankstr uses a bitonic sort as a scalable way to identify process groups. One use is to create a communicator of ranks that all share the same storage device: rank 0 in that communicator can create a directory and then use a barrier to inform the others that it exists. Rankstr is also used to split processes by failure group (for example, a failure group of NODE splits MPI_COMM_WORLD into subgroups based on hostname).
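The result of such a split can be illustrated without MPI. The sketch below takes one string per rank and returns each rank's group id, its rank within the group, and the group size. It uses a centralized dictionary for clarity, not the distributed bitonic sort that rankstr uses, and the function name `split_by_string` is hypothetical.

```python
def split_by_string(strings):
    """Given one string per rank (list index = rank), return for each rank
    a tuple (group_id, rank_in_group, group_size)."""
    group_ids = {}   # string -> group id, assigned in order of first appearance
    members = {}     # string -> ranks holding that string, in rank order
    for rank, s in enumerate(strings):
        if s not in group_ids:
            group_ids[s] = len(group_ids)
            members[s] = []
        members[s].append(rank)

    result = []
    for rank, s in enumerate(strings):
        group = members[s]
        result.append((group_ids[s], group.index(rank), len(group)))
    return result

# Five ranks running on two nodes, split by hostname.
hostnames = ["node1", "node1", "node2", "node2", "node1"]
layout = split_by_string(hostnames)

# The rank with rank_in_group == 0 acts as the leader of its group,
# e.g. the one that would create a shared directory.
leaders = [rank for rank, (_gid, r, _size) in enumerate(layout) if r == 0]
```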
File transfers between cache and parallel file system
AXL: Asynchronous transfer library
Documentation:
AXL transfers a file from one path to another using synchronous or asynchronous methods. Transfers are only supported between storage tiers; AXL does not (yet) support movement within a storage tier (such as between two compute nodes). Asynchronous methods include pthreads, the IBM BB API, and Cray DataWarp. AXL creates directories for destination files.
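As a rough illustration of the asynchronous pattern (start a transfer, keep working, wait for completion), here is a minimal sketch using a Python thread pool. It is not the AXL API: `copy_file`, `transfer_async`, and the directory names are hypothetical, and temporary directories stand in for the cache and the parallel file system.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_file(src, dst):
    """Copy one file, creating the destination directory if needed."""
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copyfile(src, dst)
    return dst

def transfer_async(pairs, executor):
    """Start copying each (src, dst) pair in the background; return futures."""
    return [executor.submit(copy_file, src, dst) for src, dst in pairs]

with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the two storage tiers.
    cache = os.path.join(tmp, "cache")
    pfs = os.path.join(tmp, "pfs", "run1", "ckpt.1")
    os.makedirs(cache)

    src = os.path.join(cache, "rank_0.dat")
    with open(src, "wb") as f:
        f.write(b"checkpoint bytes")
    dst = os.path.join(pfs, "rank_0.dat")

    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = transfer_async([(src, dst)], pool)
        # The application could continue computing here.
        finished = [future.result() for future in futures]  # wait for completion

    with open(dst, "rb") as f:
        copied = f.read()
```

A synchronous transfer is the same `copy_file` call made directly, so the caller blocks until the copy finishes.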
Redundancy Encoding/Decoding and File Migration
Redset: Encode/decode a set of files with a redundancy method
Documentation:
- API overview
- redset data structure
- redundancy schemes
- SCR dev docs: redundancy descriptors
- SCR dev docs: Encoding schemes
- SCR dev docs: XOR scheme
Redset creates the redundancy data needed for a set of files. It can later rebuild a missing file from that redundancy information.
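The idea behind an XOR-style scheme can be shown with a single parity block. The sketch below is a simplification under stated assumptions: it keeps one parity buffer for the whole set and works on in-memory byte strings, whereas the actual redset XOR scheme is described in the documents listed above. The function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(files):
    """Return the XOR parity of all files, zero-padding to the longest one."""
    size = max(len(data) for data in files)
    parity = bytes(size)
    for data in files:
        parity = xor_bytes(parity, data.ljust(size, b"\0"))
    return parity

def rebuild(survivors, parity, missing_len):
    """Recover the single missing file from the survivors and the parity."""
    data = parity
    for s in survivors:
        data = xor_bytes(data, s.ljust(len(parity), b"\0"))
    return data[:missing_len]  # drop the zero padding

files = [b"rank0 data", b"rank1 checkpoint", b"rank2"]
parity = encode(files)

# Pretend the file at index 1 was lost, then rebuild it.
lost = 1
survivors = [data for i, data in enumerate(files) if i != lost]
recovered = rebuild(survivors, parity, len(files[lost]))
```

XORing the parity with every surviving file cancels their contributions and leaves only the missing file, which is why this form tolerates exactly one loss per set.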
Shuffile: Shuffle files between MPI ranks
Documentation:
Shuffile records which MPI rank owns each registered file. During restart, shuffile moves each file back to its 'owning' MPI rank.
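The bookkeeping involved can be sketched as an ownership map plus a list of moves computed at restart. This is a conceptual model only, not the shuffile API: `record_owners`, `plan_moves`, and the file names are hypothetical, and plain dictionaries stand in for state that the library distributes across ranks.

```python
def record_owners(files_by_rank):
    """Map each file name to the rank that registered it."""
    owners = {}
    for rank, names in files_by_rank.items():
        for name in names:
            owners[name] = rank
    return owners

def plan_moves(owners, holder_after_restart):
    """Return (file, from_rank, to_rank) for every file that is not
    currently held by its owning rank."""
    return sorted(
        (name, holder, owners[name])
        for name, holder in holder_after_restart.items()
        if holder != owners[name]
    )

# Before the failure, each rank registers the file it wrote.
before = {0: ["a.dat"], 1: ["b.dat"], 2: ["c.dat"]}
owners = record_owners(before)

# After restart, ranks 0 and 1 are placed on each other's nodes,
# so each now holds the other's file; rank 2 is unchanged.
after = {"a.dat": 1, "b.dat": 0, "c.dat": 2}
moves = plan_moves(owners, after)
```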
ER: Encode + Rebuild
ER provides an abstraction around shuffile and redset. During encode, ER calls redset to apply a redundancy scheme to a set of files, and then it calls shuffile to record which rank owns which files. During rebuild, ER first calls shuffile to move files back to their owning ranks, and then redset is called to rebuild any missing files.