POMDPs

This package provides a core interface for working with Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). The POMDPTools package acts as a "standard library" for the POMDPs.jl interface, providing implementations of commonly-used components such as policies, belief updaters, distributions, and simulators.

Our goal is to provide a common programming vocabulary for:

  1. Expressing problems as MDPs and POMDPs.
  2. Writing solver software.
  3. Running simulations efficiently.

POMDPs.jl also integrates with tools from other ecosystems.

For a detailed introduction, check out our Julia Academy course! For help, please post in the GitHub Discussions tab. We welcome contributions from anyone! See CONTRIBUTING.md for information about contributing.

Installation

POMDPs.jl and associated solver packages can be installed using Julia's package manager. For example, to install POMDPs.jl and the QMDP solver package, type the following in the Julia REPL:

using Pkg; Pkg.add("POMDPs"); Pkg.add("QMDP")

Quick Start

To run a simple simulation of the classic Tiger POMDP using a policy created by the QMDP solver, you can use the following code (note that POMDPs.jl is not limited to discrete problems with explicitly-defined distributions like this):

using POMDPs, QuickPOMDPs, POMDPTools, QMDP

m = QuickPOMDP(
    states = ["left", "right"],
    actions = ["left", "right", "listen"],
    observations = ["left", "right"],
    initialstate = Uniform(["left", "right"]),
    discount = 0.95,

    transition = function (s, a)
        if a == "listen"
            return Deterministic(s) # tiger stays behind the same door
        else # a door is opened
            return Uniform(["left", "right"]) # reset
        end
    end,

    observation = function (s, a, sp)
        if a == "listen"
            if sp == "left"
                return SparseCat(["left", "right"], [0.85, 0.15]) # sparse categorical distribution
            else
                return SparseCat(["right", "left"], [0.85, 0.15])
            end
        else
            return Uniform(["left", "right"])
        end
    end,

    reward = function (s, a)
        if a == "listen"
            return -1.0
        elseif s == a # the tiger was found
            return -100.0
        else # the tiger was escaped
            return 10.0
        end
    end
)

solver = QMDPSolver()
policy = solve(solver, m)

rsum = 0.0
for (s,b,a,o,r) in stepthrough(m, policy, "s,b,a,o,r", max_steps=10)
    println("s: $s, b: $([s=>pdf(b,s) for s in states(m)]), a: $a, o: $o")
    global rsum += r
end
println("Undiscounted reward was $rsum.")
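
If you only need the return of an episode rather than the step-by-step printout, POMDPTools also provides a RolloutSimulator. The following is a minimal sketch that reuses the `m` and `policy` objects defined above; the step limit and the number of episodes are arbitrary choices for illustration:

rollout_sim = RolloutSimulator(max_steps=50)            # accumulates only the discounted reward
r_single = simulate(rollout_sim, m, policy)             # one episode with the policy's default belief updater
r_mean = sum(simulate(rollout_sim, m, policy) for _ in 1:1000) / 1000  # crude Monte Carlo value estimate
println("Estimated discounted return: $r_mean")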

For more examples, including examples with visualizations, see the Examples and Gallery of POMDPs.jl Problems sections of the documentation.

Documentation and Tutorials

In addition to the above-mentioned Julia Academy course, detailed documentation and examples can be found here.


Supported Packages

Many packages use the POMDPs.jl interface, including MDP and POMDP solvers, support tools, and interface extensions. POMDPs.jl and all packages in the JuliaPOMDP project are fully supported on Linux. macOS and Windows are supported for all native solvers*, and most non-native solvers should work but may require additional configuration.

Tools:

POMDPs.jl itself contains only the core interface for communicating about problem definitions; these packages contain implementations of commonly-used components:

- POMDPTools (hosted in this repository)
- ParticleFilters
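
As a quick illustration of how these tools compose, a particle filter from ParticleFilters.jl can serve as the belief updater in a simulation. The sketch below is illustrative, not canonical: it reuses the Tiger model `m` from the Quick Start, pairs the filter with POMDPTools' RandomPolicy baseline, and picks an arbitrary particle count:

using ParticleFilters

pf = BootstrapFilter(m, 1000)          # particle-filter belief updater with 1000 particles
baseline = RandomPolicy(m)             # uniform-random baseline policy from POMDPTools
for (s, a, o) in stepthrough(m, baseline, pf, "s,a,o", max_steps=5)
    println("s: $s, a: $a, o: $o")
end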

Implemented Models:

Many models have been implemented using the POMDPs.jl interface for various projects. This list contains a few commonly used models:

- POMDPModels
- LaserTag
- RockSample
- TagPOMDPProblem
- DroneSurveillance
- ContinuumWorld
- VDPTag2
- RoombaPOMDPs (Roomba Localization)
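
These model packages plug directly into the solvers and simulators shown above. For example, here is a rough sketch solving the Tiger problem that ships with POMDPModels (instead of the QuickPOMDP definition from the Quick Start; the step limit is an arbitrary choice):

using POMDPs, POMDPModels, POMDPTools, QMDP

tiger = TigerPOMDP()                                        # pre-built Tiger POMDP from POMDPModels
tiger_policy = solve(QMDPSolver(), tiger)
r = simulate(RolloutSimulator(max_steps=20), tiger, tiger_policy)
println("Discounted reward: $r")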

MDP solvers:

| Package | Online/Offline | Continuous States - Actions | Rating<sup>3</sup> |
|---------|----------------|-----------------------------|--------------------|
| DiscreteValueIteration | Offline | N-N | ★★★★★ |
| LocalApproximationValueIteration | Offline | Y-N | ★★ |
| GlobalApproximationValueIteration | Offline | Y-N | ★★ |
| MCTS (Monte Carlo Tree Search) | Online | Y (DPW)-Y (DPW) | ★★★★ |
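
MDP solvers follow the same solve-then-act pattern as the POMDP example above. Here is a minimal sketch with DiscreteValueIteration on the grid-world MDP from POMDPModels (the solver parameters are illustrative, not tuned):

using POMDPs, POMDPModels, DiscreteValueIteration

mdp = SimpleGridWorld()                                               # small discrete grid-world MDP
vi_policy = solve(ValueIterationSolver(max_iterations=100, belres=1e-6), mdp)
s0 = first(states(mdp))
println("Action at $s0: ", action(vi_policy, s0))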

POMDP solvers:

| Package | Online/Offline | Continuous States-Actions-Observations | Rating<sup>3</sup> |
|---------|----------------|----------------------------------------|--------------------|
| QMDP (suboptimal) | Offline | N-N-N | ★★★★★ |
| FIB (suboptimal) | Offline | N-N-N | ★★ |
| BeliefGridValueIteration | Offline | N-N-N | ★★ |
| SARSOP* | Offline | N-N-N | ★★★★ |
| NativeSARSOP | Offline | N-N-N | ★★★★ |
| ParticleFilterTrees (SparsePFT, PFT-DPW) | Online | Y-Y<sup>2</sup>-Y | ★★★ |
| BasicPOMCP | Online | Y-N-N<sup>1</sup> | ★★★★ |
| ARDESPOT | Online | Y-N-N<sup>1</sup> | ★★★★ |
| AdaOPS | Online | Y-N-Y | ★★★★ |
| MCVI | Offline | Y-N-Y | ★★ |
| POMDPSolve* | Offline | N-N-N | ★★★ |
| IncrementalPruning | Offline | N-N-N | ★★★ |
| POMCPOW | Online | Y-Y<sup>2</sup>-Y | ★★★ |
| AEMS | Online | N-N-N | ★★ |
| PointBasedValueIteration | Offline | N-N-N | ★★ |

<sup>1</sup>: Will run, but will not converge to optimal solution

<sup>2</sup>: Will run, but convergence to optimal solution is not proven, and it will likely not work well on multidimensional action spaces. See also https://github.com/michaelhlim/VOOTreeSearch.jl.
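
Because all of these solvers implement the same interface, switching planners usually only changes the solver construction. For instance, here is a hedged sketch of running the online BasicPOMCP planner on the Quick Start Tiger model `m` (the number of tree queries is an arbitrary choice):

using BasicPOMCP

planner = solve(POMCPSolver(tree_queries=1000), m)   # online Monte Carlo tree search planner
for (s, a, o, r) in stepthrough(m, planner, "s,a,o,r", max_steps=10)
    println("s: $s, a: $a, o: $o, r: $r")
end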

Reinforcement Learning:

| Package | Continuous States | Continuous Actions | Rating<sup>3</sup> |
|---------|-------------------|--------------------|--------------------|
| TabularTDLearning | N | N | ★★ |
| DeepQLearning | Y<sup>1</sup> | N | ★★★ |

<sup>1</sup>: For POMDPs, it will use the observation instead of the state as input to the policy.

<sup>3</sup>: Subjective rating; file an issue if you believe a rating should be changed.

Performance Benchmarks:

- DESPOT

*These packages require non-Julia dependencies.

Citing POMDPs

If POMDPs.jl is useful in your research and you would like to acknowledge it, please cite this paper:

@article{egorov2017pomdps,
  author  = {Maxim Egorov and Zachary N. Sunberg and Edward Balaban and Tim A. Wheeler and Jayesh K. Gupta and Mykel J. Kochenderfer},
  title   = {{POMDP}s.jl: A Framework for Sequential Decision Making under Uncertainty},
  journal = {Journal of Machine Learning Research},
  year    = {2017},
  volume  = {18},
  number  = {26},
  pages   = {1-5},
  url     = {http://jmlr.org/papers/v18/16-300.html}
}