<!-- README.md is generated from README.Rmd. Please edit that file -->
sagemaker
<!-- badges: start --> <!-- badges: end -->
The idea is to rebuild the AWS SageMaker Python SDK using R6 classes, with paws behind the scenes.
Installation
You can install the development version of sagemaker from GitHub with:
# install.packages("remotes")
remotes::install_github("DyfanJones/sagemaker-r-sdk")
Warning!!!
This repo is constantly under development and is not currently stable. sagemaker is currently aligning its API with sagemaker v2; apologies for any breaking changes this causes.
API overview:
This package aims to mimic Python’s AWS SageMaker SDK API, but using R6 and paws.
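As a rough illustration of the R6-over-paws idea, here is a minimal sketch. `SketchSession`, `training_job_names`, and `stub_client` are hypothetical names invented for this example; they are not part of the package, and the real Session class may be designed quite differently. The stub stands in for a paws client so the code runs without AWS credentials. With credentials configured, `paws::sagemaker()` could presumably be passed in instead.

```r
library(R6)

# Hypothetical, simplified session class; NOT the package's actual Session.
SketchSession <- R6Class("SketchSession",
  public = list(
    client = NULL,

    # `client` is any list of functions shaped like a paws service client.
    initialize = function(client) {
      self$client <- client
    },

    # Extract job names from the client's list_training_jobs() response.
    training_job_names = function() {
      resp <- self$client$list_training_jobs()
      vapply(
        resp$TrainingJobSummaries,
        function(job) job$TrainingJobName,
        character(1)
      )
    }
  )
)

# Stub client (an assumption for illustration) mirroring the shape of the
# response returned by the SageMaker ListTrainingJobs operation.
stub_client <- list(
  list_training_jobs = function() {
    list(TrainingJobSummaries = list(
      list(TrainingJobName = "job-a"),
      list(TrainingJobName = "job-b")
    ))
  }
)

session <- SketchSession$new(client = stub_client)
job_names <- session$training_job_names()
print(job_names)
```

Injecting the client into the constructor keeps the class testable offline, which is one plausible reason to wrap the service client in an R6 object rather than calling it directly.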
Architecture Design:
sagemaker is a metadata package that contains all methods to interact with Amazon SageMaker.
- sagemaker.core: Contains core components of the SDK, for example the Session R6 class
- sagemaker.common: Contains common components used throughout the sagemaker SDK
- sagemaker.mlcore: Contains core components for machine learning (ML) and Amazon-developed ML.
- sagemaker.mlframework: Contains ML frameworks developed for Amazon SageMaker, e.g. SKLearn
- sagemaker.workflow: Contains sagemaker pipeline and workflows
- sagemaker.debugger: Contains debugging methods (https://github.com/awslabs/sagemaker-debugger-rulesconfig)
Learn from examples:
Amazon Algorithms:
sagemaker is designed to mimic Python’s sagemaker SDK. Therefore all examples for Python’s sagemaker should be accessible.
Examples:
- Targeted Direct Marketing predicts potential customers that are most likely to convert based on customer and aggregate level metrics, using Amazon SageMaker’s implementation of XGBoost.
- XGBoost Tuning shows how to use SageMaker hyperparameter tuning to improve your model fits for the Targeted Direct Marketing task.
- BlazingText Word2Vec generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker’s fast and scalable BlazingText implementation.
R Model Examples:
- R Multivariate Adaptive Regression Splines example over iris data.frame
Note: If a feature hasn’t yet been implemented, please feel free to raise a pull request or a ticket.
For developers
To keep the package within the CRAN size limit of 5MB, sagemaker currently uses a separate repository (sagemaker-r-test-data) to store R variants of the test data stored in sagemaker-python-sdk. sagemaker-r-test-data will only contain data that can’t be read into R natively, e.g. Python pickle files. For other test data, sagemaker will read it directly from sagemaker-python-sdk.