Haskell's Dangerous Functions
What does dangerous mean?
Dangerous could mean any of these:
- Partial: can throw exceptions in pure code
- Unsafe: can cause segfaults
- Has unexpected performance characteristics
- Doesn't do what you want
- Doesn't do what you think it does
How to forbid these dangerous functions in your codebase
- Copy the hlint.yaml file in this repository to .hlint.yaml within your repository:
  cat /path/to/haskell-dangerous-functions/hlint.yaml >> /path/to/your/project/.hlint.yaml
- Run hlint on your code. Make sure to require new changes to be hlint-clean. You can use hlint --default to generate a settings file ignoring all the hints currently outstanding. You can use pre-commit hooks to forbid committing non-hlint-clean changes.
- Whenever you want to make an exception, and use a forbidden function anyway, use the ignore key to add an exception to the .hlint.yaml file.
FAQ
-
It seems pretty silly that these functions still exist, and that their dangers are not well documented.
I know! See the relevant discussion on the GHC issue tracker.
-
Surely everyone knows about these?
Maybe, but I certainly didn't, until they caused real production issues.
Contributing
WANTED: Evidence of the danger in these functions. If you can showcase a public incident with real-world consequences that happened because of one of these functions, I would love to refer to it in this document!
If you know about another dangerous function that should be avoided, feel free to submit a PR! Please include:
- an hlint config to forbid the function in hlint.yaml.
- a section in this document with:
  - Why the function is dangerous
  - A reproducible way of showing that it is dangerous.
  - An alternative to the dangerous function
It might be that the function you have in mind is not dangerous but still weird. In that case you can add it to the Haskell WAT list.
Overview of the dangerous functions
forkIO
TL;DR: Using forkIO is VERY hard to get right; use the async library instead.
The main issue is that when threads spawned using forkIO
throw an exception, this exception is not rethrown in the thread that spawned that thread.
As an example, suppose we forkIO
a server and something goes wrong.
The main thread will not notice that anything went wrong.
The only indication that an exception was thrown will be that something is printed on stderr
.
$ cat test.hs
#!/usr/bin/env stack
-- stack --resolver lts-15.15 script
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent
main :: IO ()
main = do
  putStrLn "Starting our 'server'."
  forkIO $ do
    putStrLn "Serving..."
    threadDelay 1_000_000
    putStrLn "Oh no, about to crash!"
    threadDelay 1_000_000
    putStrLn "Aaaargh"
    undefined
  threadDelay 5_000_000
  putStrLn "Still running, eventhough we crashed"
  threadDelay 5_000_000
  putStrLn "Ok that's enough of that, stopping here."
Which outputs:
$ ./test.hs
Starting our 'server'.
Serving...
Oh no, about to crash!
Aaaargh
test.hs: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /home/syd/test/test.hs:17:5 in main:Main
Still running, eventhough we crashed
Ok that's enough of that, stopping here.
Instead, we can use concurrently_
from the async
package:
$ cat test.hs
#!/usr/bin/env stack
-- stack --resolver lts-15.15 script
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent
import Control.Concurrent.Async
main :: IO ()
main = do
  putStrLn "Starting our 'server'."
  let runServer = do
        putStrLn "Serving..."
        threadDelay 1_000_000
        putStrLn "Oh no, about to crash!"
        threadDelay 1_000_000
        putStrLn "Aaaargh"
        undefined
  let mainThread = do
        threadDelay 5_000_000
        putStrLn "Still running, eventhough we crashed"
        threadDelay 5_000_000
        putStrLn "Ok that's enough of that, stopping here."
  concurrently_ runServer mainThread
to output:
$ ./test.hs
Starting our 'server'.
Serving...
Oh no, about to crash!
Aaaargh
test.hs: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /home/syd/test.hs:18:9 in main:Main
See also:
forkProcess
Mostly impossible to get right.
You probably want to be using the async
library instead.
If you think "I know what I'm doing" then you're probably still wrong. Rethink what you're doing entirely.
See also https://www.reddit.com/r/haskell/comments/jsap9r/how_dangerous_is_forkprocess/
Partial functions
head
Throws an exception in pure code when the input is an empty list.
Prelude> head []
*** Exception: Prelude.head: empty list
Use listToMaybe
instead.
Applies to Data.Text.head as well
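A minimal sketch of the listToMaybe alternative. The names safeHead and firstOrDefault are hypothetical, chosen only for this illustration:

```haskell
import Data.Maybe (listToMaybe)

-- Total replacement for head: Nothing on an empty list.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

-- Hypothetical caller that has to decide what an empty list means.
firstOrDefault :: a -> [a] -> a
firstOrDefault def xs = case safeHead xs of
  Nothing -> def
  Just x -> x
```

The caller is now forced to handle the empty case at compile time instead of discovering it at runtime.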
Trail of destruction:
tail
Throws an exception in pure code when the input is an empty list.
Prelude> tail []
*** Exception: Prelude.tail: empty list
Use drop 1
or a case-match instead.
Applies to Data.Text.tail as well
init
Throws an exception in pure code when the input is an empty list.
Prelude> init []
*** Exception: Prelude.init: empty list
Use a case-match on the reverse
of the list instead, but keep in mind that it uses linear time in the length of the list.
Use a different data structure if that is an issue for you.
Since base-4.19
you can also employ unsnoc
.
Applies to Data.Text.init as well
last
Throws an exception in pure code when the input is an empty list.
Prelude> last []
*** Exception: Prelude.last: empty list
Use listToMaybe . reverse
instead, but keep in mind that it uses linear time in the length of the list.
Use a different data structure if that is an issue for you.
Since base-4.19
you can also employ unsnoc
.
Applies to Data.Text.last as well
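If you cannot rely on base-4.19 yet, the case-match on the reversed list could look roughly like this. safeInit and safeLast are hypothetical names, not functions from base:

```haskell
-- Total version of init: Nothing on an empty list.
-- Reverses the list twice, so it is linear in the length of the list.
safeInit :: [a] -> Maybe [a]
safeInit xs = case reverse xs of
  [] -> Nothing
  (_ : rest) -> Just (reverse rest)

-- Total version of last: Nothing on an empty list.
safeLast :: [a] -> Maybe a
safeLast xs = case reverse xs of
  [] -> Nothing
  (x : _) -> Just x
```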
'!!'
Throws an exception in pure code when the index is out of bounds.
Prelude> [1, 2, 3] !! 3
*** Exception: Prelude.!!: index too large
It also allows negative indices, for which it also throws.
Prelude> [1,2,3] !! (-1)
*** Exception: Prelude.!!: negative index
The right way to index is to not use a list, because list indexing takes O(n)
time, even if you find a safe way to do it.
If you really need to deal with list indexing (you don't), then you can use a combination of take
and drop
or (since base-4.19
) '!?'
.
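For illustration only, here is one way to build a total index lookup from drop. The name atMay is defined locally for this sketch; the lookup is still linear in the index:

```haskell
import Data.Maybe (listToMaybe)

-- Total list indexing: Nothing for negative or too-large indices.
-- The explicit guard is needed because drop treats a negative count as 0.
atMay :: [a] -> Int -> Maybe a
atMay xs i
  | i < 0 = Nothing
  | otherwise = listToMaybe (drop i xs)
```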
fromJust
Throws an exception in pure code when the input is Nothing
.
Prelude Data.Maybe> fromJust Nothing
*** Exception: Maybe.fromJust: Nothing
CallStack (from HasCallStack):
error, called at libraries/base/Data/Maybe.hs:148:21 in base:Data.Maybe
fromJust, called at <interactive>:11:1 in interactive:Ghci1
Use a case-match instead.
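A small sketch of such a case-match. describePort is a made-up function for this example:

```haskell
-- Hypothetical function that handles the Nothing case explicitly
-- instead of reaching for fromJust.
describePort :: Maybe Int -> String
describePort mPort = case mPort of
  Nothing -> "no port configured"
  Just port -> "port " ++ show port
```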
read
There are multiple reasons not to use read
.
The most obvious one is that it is partial.
It throws an exception in pure code whenever the input cannot be parsed (and doesn't even give a helpful parse error):
Prelude> read "a" :: Int
*** Exception: Prelude.read: no parse
You can use readMaybe
to get around this issue, HOWEVER:
The second reason not to use read
is that it operates on String
.
read :: Read a => String -> a
If you are doing any parsing, you should be using a more appropriate data type to parse (Text or ByteString).
The third reason is that read
comes from the Read
type class, which has no well-defined semantics.
In an ideal case, read
and show
would be inverses but this is just not the reality.
See UTCTime
as an example.
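If a String really is all you have, readMaybe at least makes the failure explicit. A minimal sketch, where parsePort is a hypothetical helper and the other two caveats above still apply:

```haskell
import Text.Read (readMaybe)

-- Parse an Int from a String without throwing in pure code.
parsePort :: String -> Either String Int
parsePort s = case readMaybe s of
  Nothing -> Left ("not an integer: " ++ show s)
  Just n -> Right n
```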
toEnum
The toEnum :: Enum a => Int -> a
function is partial whenever the Enumerable type a
is smaller than Int:
Prelude> toEnum 5 :: Bool
*** Exception: Prelude.Enum.Bool.toEnum: bad argument
Prelude Data.Word> toEnum 300 :: Word8
*** Exception: Enum.toEnum{Word8}: tag (300) is outside of bounds (0,255)
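One possible total wrapper is sketched below. toEnumMay is defined locally for this example, and it assumes that the type is Bounded, that its Enum values are contiguous, and that fromEnum of both bounds fits in an Int (true for Bool and Word8, not for every type):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Word (Word8)

-- Total toEnum for small, contiguous, bounded enumerations.
-- Assumption: fromEnum minBound and fromEnum maxBound fit in an Int.
toEnumMay :: forall a. (Enum a, Bounded a) => Int -> Maybe a
toEnumMay i
  | i >= fromEnum (minBound :: a) && i <= fromEnum (maxBound :: a) = Just (toEnum i)
  | otherwise = Nothing

-- Example use at a concrete type.
byteFromInt :: Int -> Maybe Word8
byteFromInt = toEnumMay
```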
succ
and pred
These are partial, on purpose. According to the docs:
The calls succ maxBound and pred minBound should result in a runtime error.
Prelude Data.Word> succ 255 :: Word8
*** Exception: Enum.succ{Word8}: tried to take `succ' of maxBound
Prelude Data.Word> pred 0 :: Word8
*** Exception: Enum.pred{Word8}: tried to take `pred' of minBound
Use something like succMay (https://hackage.haskell.org/package/safe-0.3.19/docs/Safe.html#v:succMay) instead.
Functions involving division
Prelude> quot 1 0
*** Exception: divide by zero
Prelude> minBound `quot` (-1) :: Int
*** Exception: arithmetic overflow
Prelude> div 1 0
*** Exception: divide by zero
Prelude> minBound `div` (-1) :: Int
*** Exception: arithmetic overflow
Prelude> rem 1 0
*** Exception: divide by zero
Prelude> mod 1 0
*** Exception: divide by zero
Prelude> divMod 1 0
*** Exception: divide by zero
Prelude> quotRem 1 0
*** Exception: divide by zero
Whenever you consider using division, really ask yourself whether you need division.
For example, you can (almost always) replace a `div` 2 <= b
by a <= 2 * b
.
(If you're worried about overflow, then use a bigger type.)
If your use-case has a fixed (non-0
) literal denominator, like a `div` 2
, and you have already considered using something other than division, then your case constitutes an acceptable exception.
Note that integer division may not be what you want in the first place anyway:
Prelude> 5 `div` 2
2 -- Not 2.5
See also https://github.com/NorfairKing/haskell-WAT#num-int
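When the denominator is not a fixed literal, a total wrapper is one option. This is only a sketch for Int, and safeDiv is a hypothetical name:

```haskell
-- Total integer division: Nothing for a zero divisor and for the
-- minBound `div` (-1) overflow case shown above.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d
  | n == minBound && d == (-1) = Nothing
  | otherwise = Just (n `div` d)
```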
minimum
and maximum
These functions throw an exception in pure code whenever the input is empty:
Prelude> minimum []
*** Exception: Prelude.minimum: empty list
Prelude> maximum []
*** Exception: Prelude.maximum: empty list
Prelude> minimum Nothing
*** Exception: minimum: empty structure
Prelude> minimum (Left "wut")
*** Exception: minimum: empty structure
Prelude Data.Functor.Const> minimum (Const 5 :: Const Int ())
*** Exception: minimum: empty structure
The same goes for minimumBy
and maximumBy
.
You can use minimumMay
from the safe
package (or a case-match on the sort
-ed version of your list, if you don't want an extra dependency).
Applies to Data.Text.maximum and Data.Text.minimum as well
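Another option that needs nothing beyond base is to go through Data.List.NonEmpty, where minimum and maximum cannot hit the empty case. A sketch with locally defined names:

```haskell
import Data.List.NonEmpty (nonEmpty)

-- nonEmpty returns Nothing for [], so minimum and maximum only ever
-- run on a structure that has at least one element.
safeMinimum :: Ord a => [a] -> Maybe a
safeMinimum = fmap minimum . nonEmpty

safeMaximum :: Ord a => [a] -> Maybe a
safeMaximum = fmap maximum . nonEmpty
```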
Data.Text.Encoding.decodeUtf8
Throws on invalid UTF-8 data. Use Data.Text.Encoding.decodeUtf8'
instead.
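A short sketch of handling the Either that decodeUtf8' returns. decodeOrReport is a hypothetical helper name:

```haskell
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text.Encoding as TE

-- Decode UTF-8 without throwing: invalid input becomes a Left value.
decodeOrReport :: BS.ByteString -> Either String Text
decodeOrReport bytes = case TE.decodeUtf8' bytes of
  Left err -> Left ("invalid UTF-8: " ++ show err)
  Right txt -> Right txt
```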
Functions that throw exceptions in pure code on purpose
throw
Purposely throws an exception in pure code.
Prelude Control.Exception> throw $ ErrorCall "here be a problem"
*** Exception: here be a problem
Don't throw from pure code, use throwIO instead.
undefined
Purposely fails, with a particularly unhelpful error message.
Prelude> undefined
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at <interactive>:1:1 in interactive:Ghci1
Deal with errors appropriately instead.
Also see error
below.
error
Purposely fails, with an only slightly less unhelpful error message than undefined
.
Prelude> error "here be a problem"
*** Exception: here be a problem
CallStack (from HasCallStack):
error, called at <interactive>:4:1 in interactive:Ghci1
Deal with errors appropriately instead.
If you're really very extra sure that a certain case will never happen, bubble up the error to the IO
part of your code and then use throwIO
or die
.
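Roughly, "bubbling up" could look like the sketch below. ConfigError, validatePort and requirePort are invented for this example:

```haskell
import Control.Exception (Exception, throwIO)

-- Hypothetical error type for this example.
data ConfigError = MissingPort
  deriving (Show, Eq)

instance Exception ConfigError

-- Pure code reports the problem as a value...
validatePort :: Maybe Int -> Either ConfigError Int
validatePort Nothing = Left MissingPort
validatePort (Just p) = Right p

-- ...and only the IO layer turns it into an exception.
requirePort :: Maybe Int -> IO Int
requirePort mPort = case validatePort mPort of
  Left err -> throwIO err
  Right p -> pure p
```

The exception is now thrown at a well-defined point in IO rather than wherever a lazy pure value happens to be forced.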
Functions that do unexpected things
realToFrac
This function goes through Rational
:
-- | general coercion to fractional types
realToFrac :: (Real a, Fractional b) => a -> b
realToFrac = fromRational . toRational
Rational
does not have all the values that a Real
like Double
might have, so things will go wrong in ways that you don't expect:
Prelude> realToFrac (0/0 :: Double) :: Double
-Infinity
Avoid general coercion functions and anything to do with Double
in particular.
See also https://github.com/NorfairKing/haskell-WAT#real-double
%
: Rational values
The %
function is used to construct rational values:
data Ratio a = !a :% !a deriving Eq
Prelude Data.Int Data.Ratio> 1 % 12 :: Ratio Int8
1 % 12
There are constraints on the two values in a Ratio.
Recall (from the docs): "The numerator and denominator have no common factor and the denominator is positive."
When using fixed-size underlying types, you can end up with invalid Ratio
values using Num
functions:
Prelude Data.Int Data.Ratio> let r = 1 % 12 :: Ratio Int8
Prelude Data.Int Data.Ratio> r - r
0 % (-1)
Prelude Data.Int Data.Ratio> r + r
3 % (-14)
Prelude Data.Int Data.Ratio> r * r
1 % (-112)
When using arbitrarily-sized underlying types, you can end up with arbitrary runtime:
(1 % 100)^10^10^10 :: Rational -- Prepare a way to kill this before you try it out.
Ratio
values create issues for any underlying type, so avoid them.
Consider whether you really need any rational values.
If you really do, and you have a clear maximum value, consider using fixed-point values.
If that does not fit your use-case, consider using Double
with all its caveats.
fromIntegral
and fromInteger
fromIntegral
has no constraints on the size of the output type, so that output type could be smaller than the input type.
In such a case, it performs silent truncation:
> fromIntegral (300 :: Word) :: Word8
44
Similarly for fromInteger
:
> fromInteger 300 :: Word8
44
fromIntegral
has also had some very nasty bugs that involved the function behaving differently (even partially) depending on optimisation levels.
See GHC #20066 and GHC #19345.
Avoid general coercion functions but write specific ones instead, as long as the type of the result is bigger than the type of the input.
word32ToWord64 :: Word32 -> Word64
word32ToWord64 = fromIntegral -- Safe because Word64 is bigger than Word32
Prefer to use functions with non-parametric types and/or functions that fail loudly, like these:
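One function in base that fails loudly is toIntegralSized from Data.Bits, which returns Nothing when the value does not fit the target type. A minimal sketch, where narrowToWord8 is a hypothetical wrapper:

```haskell
import Data.Bits (toIntegralSized)
import Data.Word (Word8)

-- Narrowing conversion that reports overflow instead of truncating.
narrowToWord8 :: Word -> Maybe Word8
narrowToWord8 = toIntegralSized
```

Compare this with fromIntegral (300 :: Word) :: Word8, which silently yields 44.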
Witness the trail of destruction:
- Bug in System.IO.hWaitForInput because of fromIntegral
- Bug in cryptography-related code because of fromIntegral
I was also pointed to the finitary
package but I haven't used it yet.
toEnum
The toEnum
function suffers from the following problem on top of being partial.
Some instances of Enum
use "the next constructor" as the next element while others use an n+1
variant:
Prelude> toEnum 5 :: Double
5.0
Prelude Data.Fixed> toEnum 5 :: Micro
0.000005
Depending on what you expected, one of those doesn't do what you think it does.
fromEnum
From the docs:
It is implementation-dependent what fromEnum returns when applied to a value that is too large to fit in an Int.
For example, some Integer values that do not fit in an Int are mapped to 0, while others are mapped all over the place:
Prelude> fromEnum (2^66 :: Integer) -- To 0
0
Prelude> fromEnum (2^65 :: Integer) -- To 0
0
Prelude> fromEnum (2^64 :: Integer) -- To 0
0
Prelude> fromEnum (2^64 -1 :: Integer) -- To -1 ?!
-1
Prelude> fromEnum (2^63 :: Integer) -- To -2^63 ?!
-9223372036854775808
This is because fromEnum :: Integer -> Int
is implemented using integerToInt
which treats big integers and small integers differently.
succ
and pred
These suffer from the same problem as toEnum
(see above) on top of being partial.
Prelude> succ 5 :: Double
6.0
Prelude Data.Fixed> succ 5 :: Micro
5.000001
Prelude> pred 0 :: Word
*** Exception: Enum.pred{Word}: tried to take `pred' of minBound
Prelude Data.Ord Data.Int> succ (127 :: Int8)
*** Exception: Enum.succ{Int8}: tried to take `succ' of maxBound
fromString
on ByteString
When converting to ByteString
, fromString
silently truncates to the bottom eight bits, turning your string into garbage.
> print "⚠"
"\9888"
> print (fromString "⚠" :: ByteString)
"\160"
The enumFromTo
-related functions
These also suffer from the same problem as toEnum
(see above)
Prelude> succ 5 :: Int
6
Prelude Data.Fixed> succ 5 :: Micro
5.000001
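To see the effect on enumFromTo itself, compare the length of the "same" range at two types. This is a sketch that relies on the Enum instances in base for Double and Micro stepping by 1 and by the smallest representable unit respectively:

```haskell
import Data.Fixed (Micro)

-- Double steps by 1.0, so the range contains 0.0 and 1.0.
doubleSteps :: Int
doubleSteps = length [0 .. 1 :: Double]

-- Micro steps by 0.000001, so the range contains a million and one values.
microSteps :: Int
microSteps = length [0 .. 1 :: Micro]
```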
Functions related to String
-based IO
Input
System.IO.getChar
System.IO.getLine
System.IO.getContents
System.IO.interact
System.IO.readIO
System.IO.readLn
System.IO.readFile
These behave differently depending on env vars, and actually fail on non-text data in files:
Prelude> readFile "example.dat"
*** Exception: Test/A.txt: hGetContents: invalid argument (invalid byte sequence) "\226\8364
See also this blogpost.
Use ByteString
-based input and then use Data.Text.Encoding.decodeUtf8'
if necessary. (But not Data.Text.Encoding.decodeUtf8
, see above.)
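A sketch of what ByteString-based input followed by explicit decoding could look like. readFileUtf8 is a hypothetical helper name, not a function from the text package:

```haskell
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)

-- Read raw bytes, then decode explicitly as UTF-8.
-- The result does not depend on the locale, and invalid data
-- shows up as a Left value rather than an exception.
readFileUtf8 :: FilePath -> IO (Either UnicodeException Text)
readFileUtf8 path = TE.decodeUtf8' <$> BS.readFile path
```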
Output
System.IO.putChar
System.IO.putStr
System.IO.putStrLn
System.IO.print
System.IO.writeFile
System.IO.appendFile
These behave differently depending on env vars:
$ ghci
Prelude> putStrLn "\973"
ύ
but
$ export LANG=C
$ export LC_ALL=C
$ ghci
Prelude> putStrLn "\973"
?
Use ByteString
-based output, on encoded Text
values or directly on bytestrings instead.
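As a rough illustration, output can be encoded explicitly before it is written. putTextLnUtf8 and greet are made-up names for this sketch:

```haskell
import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Encode to UTF-8 bytes first, then write the bytes.
-- The bytes written do not depend on LANG or LC_ALL.
putTextLnUtf8 :: Text -> IO ()
putTextLnUtf8 = BS.putStr . TE.encodeUtf8 . (`T.snoc` '\n')

-- Writes the same two UTF-8 bytes for the character, plus a newline,
-- under any locale.
greet :: IO ()
greet = putTextLnUtf8 (T.pack "\973")
```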
writeFile
caused a real-world outage for @tomjaguarpaw on 2021-09-24.
See also this blogpost.
Functions related to Text
-based IO
Data.Text.IO.readFile
Data.Text.IO.Lazy.readFile
These have the same issues as readFile
.
See also this blogpost.
Since text-2.1
one can replace Data.Text.IO
with Data.Text.IO.Utf8
.
Functions with unexpected performance characteristics
nub
O(n^2)
, use ordNub
instead
Trail of destruction: https://gitlab.haskell.org/ghc/ghc/-/issues/8173#note_236901
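For reference, an order-preserving ordNub can be sketched with Data.Set from the containers package. This is an illustrative implementation, not necessarily the one any particular library ships (containers also offers nubOrd in Data.Containers.ListUtils):

```haskell
import qualified Data.Set as Set

-- De-duplicate in O(n log n) while keeping the first occurrence of
-- each element, by remembering the elements seen so far in a Set.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs
      | otherwise = x : go (Set.insert x seen) xs
```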
foldl
and foldMap
Lazy. Use foldl'
and foldMap'
instead.
See this excellent explanation.
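A minimal example of the strict variant. sumStrict is a hypothetical name; the explicit import is harmless on newer GHC versions where the Prelude may already export foldl':

```haskell
import Data.List (foldl')

-- foldl' forces the accumulator at every step, so no chain of
-- unevaluated (+) thunks builds up while walking the list.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```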
sum
and product
These use a lazy accumulator, but this is fixed as of GHC 9.0.1.
genericLength
genericLength
consumes O(n) stack space when returning a strict numeric type. Lazy numeric types (e.g. data Nat = Z | S Nat
) are very rare in practice so genericLength
is probably not what you want.
Confusing functions
These functions are a bad idea for no other reason than readability. If there is a bug that involves these functions, it will be really easy to read over them.
unless
unless
is defined as follows:
unless p s = if p then pure () else s
This is really confusing in practice; use when
with not
instead.
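A small sketch of the when-with-not style, using Either as the Applicative so that the result is easy to inspect. checkPositive is a made-up function:

```haskell
import Control.Monad (when)

-- Reads as "when n is not positive, fail".
-- Equivalent to: unless (n > 0) (Left "must be positive")
checkPositive :: Int -> Either String ()
checkPositive n = when (not (n > 0)) (Left "must be positive")
```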
either
either takes two functions as arguments, one for the Left
case and one for the Right
case.
Which comes first? I don't know either, just use a case-match instead.
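For comparison, a case-match names each branch explicitly. describeResult is a hypothetical function for this sketch:

```haskell
-- Each constructor is spelled out, so there is no argument order to remember.
describeResult :: Either String Int -> String
describeResult r = case r of
  Left err -> "error: " ++ err
  Right n -> "value: " ++ show n
```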
Modules or packages to avoid
These are debatable, but requiring a good justification for using them is a good default.
Control.Lens
The lens
package is full of abstract nonsense and obscure operators.
There are good reasons (in exceptional cases) for using it, like in cursor
, for example, but it should be avoided in general.
It also has an ENORMOUS dependency footprint.
If you need to use a dependency that uses lenses without the lens
dependency, you can use microlens
to stick with the (relatively) simple parts of lenses.
If you need to use a dependency that uses lens
, go ahead and use lens
, but stick with view
(^.
) and set
(.~
).
Extensions to avoid
These are also debatable and low priority compared to the rest in this document, but still interesting to consider
{-# LANGUAGE DeriveAnyClass #-}
Just use
data MyExample
instance MyClass MyExample
instead of
data MyExample
deriving MyClass
GHC (rather puzzlingly) gives the recommendation to turn on DeriveAnyClass
even when that would lead to code that throws an exception in pure code at runtime.
As a result, banning this extension brings potentially great upside: preventing a runtime exception, as well as reducing confusion, for the cost of writing a separate line for an instance if you know what you are doing.
See also this great explainer video by Tweag.
{-# LANGUAGE TupleSections #-}
This lets you add {-# LANGUAGE TupleSections #-}
and potential confusion to write (, v)
instead of \a -> (a, v)
.
Whenever you feel the need to use TupleSections
, you probably want to be using a data type instead of tuples.
{-# LANGUAGE DuplicateRecordFields #-}
To keep things simple, use prefix-named record fields like this:
data Template = Template { templateName :: Text, templateContents :: Text }
instead of this
{-# LANGUAGE DuplicateRecordFields #-}
data Template = Template { name :: Text, contents :: Text }
It may be more typing but it makes code a lot more readable.
If you are concerned about not being able to auto-derive aeson
's ToJSON
and FromJSON
instances anymore,
you shouldn't be. You can still do that using something like aeson-casing
.
It's also dangerous to have serialisation output depend on the naming of things in your code, so be sure to test your serialisation with both property tests via genvalidity-sydtest-aeson
and golden tests via sydtest-aeson
.
{-# LANGUAGE NamedFieldPuns #-}
Introduces unnecessary syntax.
For this example:
data C = C { a :: Int }
just use this:
f c = foo (a c)
instead of this:
f (C {a}) = foo a
{-# LANGUAGE OverloadedLabels #-}
If you're using this, you either know what you're doing - in which case you should know better than to use this - or you don't - in which case you definitely shouldn't use it. Keep your code simple and just use record field selectors instead.
This extension often goes hand in hand with lens usage, which should also be discouraged, see above.
Unsafe functions
unsafePerformIO
Before you use this function, first read its documentation carefully. If you've done that (and I know you haven't, you liar) and still want to use it, read the following section first.
When you use unsafePerformIO
, you pinkie-promise that the code in the IO a
that you provide is definitely always 100% pure, you swear.
If this is not the case, all sorts of assumptions don't work anymore.
For example, if the code that you want to execute in unsafePerformIO
is not evaluated, then the IO is never executed:
Prelude> fmap (const 'o') $ putStrLn "hi"
hi
'o'
Prelude System.IO.Unsafe> const 'o' $ unsafePerformIO $ putStrLn "hi"
'o'
Another issue is that pure code can be inlined whereas IO-based code cannot. When you pinkie-promise that your code is "morally" pure, you also promise that inlining it will not cause trouble. This is not true in general:
Prelude System.IO.Unsafe> let a = unsafePerformIO $ putStrLn "hi"
Prelude System.IO.Unsafe> a
hi
()
Prelude System.IO.Unsafe> a
()
Lastly, this function is also not type-safe, as you can see here:
$ cat file.hs
import Data.IORef
import System.IO.Unsafe
test :: IORef [a]
test = unsafePerformIO $ newIORef []
main = do
  writeIORef test [42]
  bang <- readIORef test
  print $ map (\f -> f 5 6) (bang :: [Int -> Int -> Int])
$ runhaskell file.hs
[file.hs: internal error: stg_ap_pp_ret
(GHC version 8.8.4 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
[1] 13949 abort (core dumped) runhaskell file.hs
unsafeDupablePerformIO
Like unsafePerformIO
but is even less safe.
unsafeInterleaveIO
Used to define lazy IO, which should be avoided.
unsafeFixIO
Unsafe version of fixIO
.
Deprecated
return
Use pure
instead.
See https://gitlab.haskell.org/ghc/ghc/-/wikis/proposal/monad-of-no-return