Planet Haskell without Tom Moertel's posts
http://pipes.yahoo.com/pipes/pipe.info?_id=Jti_id1G3hGa1KrxdPQQIA
Fri, 22 May 2015 11:29:12 +0000

The new Stackage Server
https://www.fpcomplete.com/blog/2015/05/new-stackage-server

tl;dr: Please check out beta.stackage.org.

I made the first commit to the Stackage Server code
base a little over a year ago. The
goal was to provide a place to host package sets which both limited the
number of packages from Hackage available, and modified packages where
necessary. This server was to be populated by regular Stackage builds, targeted
at multiple GHC versions, and consisted of both inclusive and exclusive sets.
It also allowed interested individuals to create their own package sets.

If any of those details seem surprising today, they should. A lot has happened
for the Stackage project in the past year, making details of what was initially
planned irrelevant, and making other things (like hosting of package
documentation) vital. We now have LTS Haskell. Instead of running with multiple
GHC versions, we have Stackage Nightly which is targeted at a single GHC major
version. To accommodate goals for GPS Haskell (which unfortunately never
materialized),
Stackage no longer makes corrections to upstream packages.

I could go into lots more detail on what is different in project requirements.
Instead, I'll just summarize: I've been working on a simplified version of the
Stackage Server codebase to address our goals better, more easily ensure high
availability, and make the codebase easier to maintain. We also used this
opportunity to test out a new hosting system our DevOps team put together. The
result is running on beta.stackage.org, and will
replace the official stackage.org after a bit more testing (which I hope
readers will help with).

The code

All of this code lives on the simpler branch of the stackage-server code
base, and much to my joy, resulted in quite a bit less
code. In fact,
there's just about a 2000 line reduction. The rest of this post will get into
how that happened.

No more custom package sets

One of the features I mentioned above was custom package sets. This fell out
automatically from the initial way Stackage Server was written, so it was
natural to let others create package sets of their own. However, since release,
only one person actually used that feature. I discussed with him, and he agreed
with the decision to deprecate and then remove that functionality.

So why get rid of it now? Two powerful reasons:

- We already host a public mirror of all packages on S3. Since we no longer patch upstream packages, it's best if tooling is able to just refer to that high-reliability service.
- We now have Git repositories for all of LTS Haskell and Stackage Nightly. Making these the sources of package sets means we don't have two (possibly conflicting) sources of data. That brings me to the second point.

Upload code is gone

We had some complicated logic to allow users to upload package sets. It started
off simple, but over time we added Haddock hosting and other metadata features,
making the code more complex. Actually, it ended up having two parallel code
paths for this. So instead, we now just upload information on the package sets
to the Git repositories, and leave it up to a separate process (described
below) to clone these repositories and make the data available to the server.

Haddocks on S3

After generating a snapshot, the Haddocks used to be tarred and compressed, and
then uploaded as a compressed bundle to S3. Then, Stackage Server would receive
a request for files, unpack them, and serve them. This presented some problems:

- Users would have to wait for a first request to succeed during the unpacking.
- With enough snapshots being generated, we would eventually run out of disk space and need to clear our temp directory.
- Since we run our cluster in a high availability mode with multiple horizontally-scaled machines, one machine may have finished unpacking when another didn't, resulting in unstyled content (see issue #82).

Instead, we now just upload the files to S3 and redirect there from
stackage-server (though we'll likely switch to reverse proxying to allow for
nicer SSL URLs). In fact, you can easily view these docs at URLs such as
http://haddock.stackage.org/lts-2.9/ or
https://s3.amazonaws.com/haddock.stackage.org/nightly-2015-05-21/index.html.

These Haddocks are publicly available and linkable from projects beyond
Stackage Server. Each set of Haddocks is guaranteed to have consistent internal
links to other compatible packages. And while some documentation doesn't
generate due to known package
bugs,
the generation is otherwise reliable.

I've already offered access to these docs to Duncan for usage on Hackage, and
hope that will improve the experience for users there.

Metadata SQLite database

Previously, information on snapshots was stored in a PostgreSQL database that
was maintained by Stackage Server. This database also had package metadata,
like author, homepage, and description. Now, we have a completely different
process:

- The all-cabal-metadata repo from the Commercial Haskell Special Interest Group provides an easily cloneable Git repo with package metadata, which is automatically updated by Travis.
- We run a cron job on the stackage-build server that updates the lts-haskell, stackage-nightly, and all-cabal-metadata repos and generates a SQLite database from them with all of the data that Stackage Server needs. You can look at the Stackage.Database module for some ideas of what this consists of. That database gets uploaded to Amazon S3, and is actually publicly available if you want to poke at it.
- The live server downloads a new version of this file on a regular basis.

I've considered spinning off the Stackage.Download code into its own repository
so that others can take advantage of this functionality in different contexts
if desired. Let me know if you're interested.At this point, the PostgreSQL database is just used for non-critical
functionality, such as social features (tags and likes).

Slightly nicer URLs

When referring to a snapshot, there are "official" short names (slugs) of the
form lts-2.9 and nightly-2015-05-22. The URLs on the new server
now reflect this perfectly, e.g.:
https://beta.stackage.org/nightly-2015-05-22.
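As an aside, the slug format is simple enough to parse in a few lines of Haskell. Here is an illustrative sketch only; the `Slug` type and `parseSlug` function are hypothetical names, not Stackage Server's actual code:

```haskell
import Data.List (stripPrefix)
import Text.Read (readMaybe)

-- A snapshot slug is either "lts-X.Y" or "nightly-YYYY-MM-DD".
-- (Hypothetical types for illustration; the real server differs.)
data Slug = Lts Int Int | Nightly String
  deriving (Eq, Show)

parseSlug :: String -> Maybe Slug
parseSlug s =
  case stripPrefix "lts-" s of
    Just v -> case break (== '.') v of
      -- "2.9" splits into major "2" and minor "9"
      (major, '.' : minor) -> Lts <$> readMaybe major <*> readMaybe minor
      _                    -> Nothing
    Nothing -> Nightly <$> stripPrefix "nightly-" s

main :: IO ()
main = mapM_ (print . parseSlug)
  ["lts-2.9", "nightly-2015-05-22", "bogus"]
```

Stackage Server resolves these slugs server-side; the sketch only mirrors the naming convention described above.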
We originally used hashes of the snapshot content for the original URLs, but
that was fixed a while
ago. Now that we only have
to support these official snapshots, we can always (and exclusively) use these
short names.

As a convenience, if you visit the following URLs, you get automatic redirects:

- /nightly redirects to the most recent nightly
- /lts to the latest LTS
- /lts-X to the latest LTS in the X.* major version (e.g., today, /lts-2 redirects to /lts-2.9)

This also works for URLs under that hierarchy. For example, consider
https://beta.stackage.org/lts/cabal.config,
which is an easy way to get set up with LTS in your project (by running wget
https://beta.stackage.org/lts/cabal.config).

ECS-based hosting

While not a new feature of the server itself, the hosting cluster we're running
this on is brand new. Amazon recently released EC2 Container Service, which is
a service for running Docker containers. Since we're going to be using this for
the new School of
Haskell, it's
nice to be giving it a serious usage now. We also make extensive use of Docker
for customer projects, both for builds and hosting, so it's a natural extension
for us.

This ECS cluster uses standard Amazon services like Elastic Load Balancer (ELB)
and auto-scaling to provide for high availability in the case of machine
failure. And while we have a lot of confidence in our ability to keep Stackage
Server up and running regularly, it's nice that our most important user-facing
content is provided by these external services:

- Haddocks on S3
- Package mirroring on S3
- LTS Haskell and Stackage Nightly build plans on GitHub
- Package metadata on GitHub
- Package index metadata on GitHub (via stackage-update and all-cabal-files/hashes)

This provides for a pleasant experience in both browsing the website and using
Stackage in your build system.

A special thanks to Jason Boyer for providing this new hosting cluster, which
the whole FP Complete team is looking forward to putting through its paces.

https://www.fpcomplete.com/blog/2015/05/new-stackage-server
Fri, 22 May 2015 07:30:00 +0000

Handling Control-C in Haskell
http://neilmitchell.blogspot.com/2015/05/handling-control-c-in-haskell.html
Neil Mitchell
Thu, 21 May 2015 20:19:00 +0000

Clojure/Erlang backend engineer/architect at Zoomo (Full-time)
http://functionaljobs.com/jobs/8831-clojure-erlang-backend-engineer-architect-at-zoomo
Wed, 20 May 2015 10:40:39 +0000

Call C functions from Haskell without bindings
https://www.fpcomplete.com/blog/2015/05/inline-c
Because Haskell is a language of choice for many problem domains, and
for scales ranging from one-off scripts to full scale web services, we
are fortunate by now to have over 8,000 open source packages (and
a few commercial ones besides) available to build from. But in
practice, Haskell programming in the real world involves interacting
with myriad legacy systems and libraries. This is partly because the
industry is far older than the comparatively recent strength of our
community, but also because quality new high-performance
libraries are created every day in languages other than Haskell, be it
intensive numerical codes or frameworks for shuffling bits across
machines. Today we are releasing inline-c, a package for writing
mixed C/Haskell source code that seamlessly invokes native and foreign
functions in the same module. No FFI required.

The joys of programming with foreign code

Imagine that you just found a C library that you wish to use for your
project. The standard workflow is to:

- check Hackage to see if a package with a set of bindings for that library exists,
- if one does, program against that, or
- if it doesn't, write your own bindings package, using Haskell's FFI.

Writing and maintaining bindings for large C libraries is hard work.
The libraries are constantly updated upstream, so the bindings
you find are invariably out of date: they provide only partial coverage
of the library's API, sometimes don't compile cleanly against the latest
upstream version of the C library, or need convoluted and error-prone
conditional compilation directives to support multiple versions of the
API in the package. Which is a shame, because typically you only need
to perform a very specific task using some C library, using only
a minute proportion of its API. It can be frustrating for a bindings
package to fail to install, only because the binding for some function
that you'll never use doesn't match up with the header files of the
library version you happen to have installed on your system.

This is especially true for large libraries that expose sets of
related but orthogonal and independently useful functions, such as
GTK+, OpenCV or numerical libraries such as the GNU Scientific Library
(GSL), NAG and IMSL. inline-c lets you call functions from these
libraries using the full power of C's syntax, directly from client
code, without the need for monolithic bindings packages. High-level
bindings (or "wrappers") may still be useful to wrap low-level details
into an idiomatic Haskell interface, but inline-c enables rapid
prototyping and iterative development of code that directly uses some
of the C library today, leaving for later the task of abstracting
calls into a high-level, type-safe wrapper as needed. In short,
inline-c lets you "pay as you go" when programming foreign code.

We first developed inline-c for use with numerical libraries, in
particular the popular and very high quality commercial
NAG library, for
tasks including ODE solving, function optimization, and interpolation.
If getting seamless access to the gold standard of fast and reliable
numerical routines is what you need, then you will be interested in
our companion package to work specifically with NAG,
inline-c-nag.

A taste of inline-c

What follows is just a teaser of what can be done with inline-c.
Please refer to the
Haddock documentation and
the README for
more details on how to use the showcased features.

Let's say we want to use C's variadic printf function and its
convenient string formats. inline-c lets you write this function
call inline, without any need for a binding to the foreign function:

{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
import qualified Language.C.Inline as C
C.include "<stdio.h>"
C.include "<math.h>"
main :: IO ()
main = do
  x <- [C.exp| int{ printf("Some number: %.2f\n", cos(0.5)) } |]
  putStrLn $ show x ++ " characters printed."

Importing Language.C.Inline brings into scope the Template Haskell
function include to include C headers (<stdio.h> and <math.h>),
and the exp quasiquoter for embedding expressions in C syntax in
Haskell code. Notice how inline-c has no trouble even with
C functions that have advanced calling conventions, such as variadic
functions. This is a crucial point: we have the full power of
C available at our fingertips, not just whatever can be shoe-horned
through the FFI.

We can capture Haskell variables to be used in the C expression, such
as when computing x below:

mycos :: CDouble -> IO CDouble
mycos x = [C.exp| double{ cos($(double x)) } |]

The anti-quotation $(double x) indicates that we want to capture the
variable x from the Haskell environment, and that we want it to have
type double in C (inline-c will check at compile time that this is
a sensible type ascription).

We can also splice in a block of C statements and explicitly return
the result:

C.include "<stdio.h>"
-- | @readAndSum n@ reads @n@ numbers from standard input and returns
-- their sum.
readAndSum :: CInt -> IO CInt
readAndSum n = do
  x <- [C.block| int {
      int i, sum = 0, tmp;
      for (i = 0; i < $(int n); i++) {
        scanf("%d ", &tmp);
        sum += tmp;
      }
      return sum;
    } |]
  print x

Finally, the library provides facilities to easily use Haskell data in
C. For example, we can easily use Haskell ByteStrings in C:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE QuasiQuotes #-}
import qualified Data.ByteString as BS
import Data.Monoid ((<>))
import Foreign.C.Types
import qualified Language.C.Inline as C
import Text.RawString.QQ (r)
C.context (C.baseCtx <> C.bsCtx)
-- | Count the number of set bits in a 'BS.ByteString'.
countSetBits :: BS.ByteString -> IO CInt
countSetBits bs = [C.block|
  int {
    int i, bits = 0;
    for (i = 0; i < $bs-len:bs; i++) {
      unsigned char ch = $bs-ptr:bs[i];
      bits += (ch * 01001001001ULL & 042104210421ULL) % 017;
    }
    return bits;
  }
  |]

In this example, we use the bs-len and bs-ptr anti-quoters to get
the length and pointer for a Haskell ByteString. inline-c has
a modular design: these anti-quoters are completely optional and can
be included on-demand. The C.context invocation adds the extra
ByteStrings anti-quoters to the base set. Similar facilities are
present to easily use Haskell Vectors as well as for invoking
Haskell closures from C code.

Larger examples

We have included various examples in the
inline-c and
inline-c-nag repositories.
Currently they're geared toward scientific and numerical computing,
but we would welcome contributions using inline-c in other fields.

For instance,
gsl-ode.hs
is a great example of combining the strengths of C and the strengths
of Haskell to good effect: we use a function from C's
GNU Scientific Library for
solving ordinary differential equations (ODE) to solve
a Lorenz system, and
then take advantage of the very nice Chart-diagrams Haskell library
to display its x and z coordinates.

In this example, the vec-ptr anti-quoter is used to get a pointer out
of a mutable vector:

$vec-ptr:(double *fMut)

where fMut is a variable of type Data.Vector.Storable.Mutable.Vector
CDouble. Moreover, the fun anti-quoter is used to get a function
pointer from a Haskell function:

$fun:(int (* funIO) (double t, const double y[], double dydt[], void * params))

where funIO is a Haskell function of type

CDouble -> Ptr CDouble -> Ptr CDouble -> Ptr () -> IO CInt

Note that all these anti-quoters (apart from the ones where only one
type is allowed, like vec-len or bs-ptr) force the user to specify
the target C type. The alternative would have been to write the
Haskell type. Either way some type ascription is unfortunately
required, due to a limitation of Template Haskell. We choose C type
annotations because in this way, the user can understand precisely and
state explicitly the target type of any marshalling.

Note that at this stage, type annotations are needed because it is
not possible to get the type of locally defined variables in Template Haskell.

How it works under the hood

inline-c generates a piece of C code for most of the Template Haskell
functions and quasi-quoters that it exports. So when you write

[C.exp| double{ cos($(double x)) } |]

a C function gets generated:

double some_name(double x) {
  return cos(x);
}

This function is then bound in Haskell through an automatically
generated FFI import declaration and invoked passing the right argument
-- the x variable from the Haskell environment. The types specified in
C are automatically translated to the corresponding Haskell types, to
generate the correct type signatures.

Custom anti-quoters, such as vec-ptr and vec-len, handle the C and
Haskell types independently. For example, when writing

[C.block| double {
    int i;
    double res = 0;
    for (i = 0; i < $vec-len:xs; i++) {
      res += $vec-ptr:(double *xs)[i];
    }
    return res;
  } |]

we'll get a function of type

double some_name(int xs_len, double *xs_ptr)

and on the Haskell side the variable xs will be used in conjunction
with some code getting its length and the underlying pointer, both to be
passed as arguments.

Building programs that use inline-c

The C code that inline-c generates is stored in a file named like the
Haskell source file, but with a .c extension.

When using cabal, it is enough to specify the generated C sources and
any options for the C code:

executable foo
  main-is: Main.hs
  other-modules: Foo, Bar
  hs-source-dirs: src
  -- Here the corresponding C sources must be listed for every module
  -- that uses C code. In this example, Main.hs and Bar.hs do, but
  -- Foo.hs does not.
  c-sources: src/Main.c, src/Bar.c
  -- These flags will be passed to the C compiler
  cc-options: -Wall -O2
  -- Libraries to link the code with.
  extra-libraries: m
  ...

Note that currently cabal repl is not supported, because the C code is
not compiled and linked appropriately. However, cabal repl only fails
at the very end, when trying to load the compiled C code, which means that we
can still use it to type-check our package while developing.

If we were to compile the above manually, we could do:

$ ghc -c Main.hs
$ cc -c Main.c -o Main_c.o
$ ghc Foo.hs
$ ghc Bar.hs
$ cc -c Bar.c -o Bar_c.o
$ ghc Main.o Foo.o Bar.o Main_c.o Bar_c.o -lm -o Main

Extending inline-c

As mentioned previously, inline-c can be extended by defining custom
anti-quoters. Moreover, we can also tell inline-c about more C types
beyond the primitive ones.

Both operations are done via the Context data type. Specifically, the
Context contains a TypesTable, mapping C type specifiers to Haskell
types; and a Map of AntiQuoters. A baseCtx is provided specifying
mappings from all the base C types to Haskell types (int to CInt,
double to CDouble, and so on). Contexts can be composed using
their Monoid instance.

For example, the vecCtx contains two anti-quoters, vec-len and
vec-ptr. When using inline-c with external libraries we often
define a context dedicated to said library, defining a TypesTable
converting common types found in the library to their Haskell
counterparts. For example,
inline-c-nag defines a context
containing information regarding the types commonly used in the NAG
scientific library.

See the Language.C.Inline.Context module documentation for more.

C++ support

Our original use case for inline-c was always C oriented. However,
thanks to extensible contexts, it should be possible to build C++
support on top of inline-c, as we dabbled with in
inline-c-cpp. In this way,
one can mix C++ code into Haskell source files, while reusing the
infrastructure that inline-c provides for invoking foreign
functions. Since inline-c generates C wrapper functions for all
inline expressions, one gets a function with bona fide C linkage to
wrap a C++ call, for free. Dealing with C++ templates, passing C++
objects in and out and conveniently manipulating them from Haskell are
the next challenges. If C++ support is what you need, feel free to
contribute to this ongoing effort!

Wrapping up

We meant inline-c as a simple, modular alternative to monolithic
binding libraries, borrowing the core concept of FFI-less programming of
foreign code from the
H project
and
language-c-inline. But
this is just the first cut! We are releasing the library to the
community early in hopes that it will accelerate the Haskell community's
embrace of quality foreign libraries where they exist, as an alternative
to expending considerable resources reinventing such libraries for
little benefit. Numerical programming, machine learning, computer
vision, GUI programming and data analysis come to mind as obvious areas
where we want to leverage existing quality code. In fact, FP Complete is
using inline-c today to enable quick access to all of NAG, a roughly
1.6K function strong library, for a large compute-intensive codebase. We
hope to see many more use cases in the future.

https://www.fpcomplete.com/blog/2015/05/inline-c
Wed, 20 May 2015 10:00:00 +0000

Security
http://feedproxy.google.com/~r/CartesianClosedComic/~3/RaqTXz2_q3Y/27
http://ro-che.info/ccc/27
Tue, 19 May 2015 23:00:00 +0000

Morte: an intermediate language for super-optimizing functional programs
http://www.haskellforall.com/2014/09/morte-intermediate-language-for-super.html
The Haskell language provides the following guarantee (with caveats): if two programs are equal according to equational reasoning then they will behave the same. On the other hand, Haskell does not guarantee that equal programs will generate identical performance. Consequently, Haskell library writers must employ rewrite rules to ensure that their abstractions do not interfere with performance.

Now suppose there were a hypothetical language with a stronger guarantee: if two programs are equal then they generate identical executables. Such a language would be immune to abstraction: no matter how many layers of indirection you might add, the binary size and runtime performance would be unaffected.

Here I will introduce such an intermediate language named Morte that obeys this stronger guarantee. I have not yet implemented a back-end code generator for Morte, but I wanted to pause to share what I have completed so far because Morte uses several tricks from computer science that I believe deserve more attention.

Morte is nothing more than a bare-bones implementation of the calculus of constructions, which is a specific type of lambda calculus. The only novelty is how I intend to use this lambda calculus: as a super-optimizer.

Normalization

The typed lambda calculus possesses a useful property: every term in the lambda calculus has a unique normal form if you beta-reduce everything. If you're new to lambda calculus, normalizing an expression equates to indiscriminately inlining every function call.

What if we built a programming language whose intermediate language was lambda calculus? What if optimization was just normalization of lambda terms (i.e. indiscriminate inlining)? If so, then we could abstract freely, knowing that while compile times might increase, our final executable would never change.

Recursion

Normally you would not want to inline everything because infinitely recursive functions would become infinitely large expressions.
Fortunately, we can often translate recursive code to non-recursive code!

I'll demonstrate this trick first in Haskell and then in Morte. Let's begin from the following recursive List type along with a recursive map function over lists:

import Prelude hiding (map, foldr)

data List a = Cons a (List a) | Nil

example :: List Int
example = Cons 1 (Cons 2 (Cons 3 Nil))

map :: (a -> b) -> List a -> List b
map f Nil = Nil
map f (Cons a l) = Cons (f a) (map f l)

-- Argument order intentionally switched
foldr :: List a -> (a -> x -> x) -> x -> x
foldr Nil c n = n
foldr (Cons a l) c n = c a (foldr l c n)

result :: Int
result = foldr (map (+1) example) (+) 0

-- result = 9

Now imagine that we disable all recursion in Haskell: no more recursive types and no more recursive functions. Now we must reject the above program because:

- the List data type definition recursively refers to itself
- the map and foldr functions recursively refer to themselves

Can we still encode lists in a non-recursive dialect of Haskell?

Yes, we can!

-- This is a valid Haskell program
{-# LANGUAGE RankNTypes #-}

import Prelude hiding (map, foldr)

type List a = forall x . (a -> x -> x) -> x -> x

example :: List Int
example = \cons nil -> cons 1 (cons 2 (cons 3 nil))

map :: (a -> b) -> List a -> List b
map f l = \cons nil -> l (\a x -> cons (f a) x) nil

foldr :: List a -> (a -> x -> x) -> x -> x
foldr l = l

result :: Int
result = foldr (map (+ 1) example) (+) 0

-- result = 9

Carefully note that:

- List is no longer defined recursively in terms of itself
- map and foldr are no longer defined recursively in terms of themselves

Yet, we somehow managed to build a list, map a function over the list, and fold the list, all without ever using recursion! We do this by encoding the list as a fold, which is why foldr became the identity function.

This trick works for more than just lists. You can take any recursive data type and mechanically transform the type into a fold and transform functions on the type into functions on folds.
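To see that the transformation generalizes beyond lists, here is the same fold-encoding applied to a small binary tree type. This is an illustrative sketch of my own, not from the original post; Tree, leaf, branch, and sumTree are made-up names:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The recursive type
--   data Tree a = Leaf a | Branch (Tree a) (Tree a)
-- becomes its own fold: a tree is whatever consumes a
-- leaf-handler and a branch-handler.
type Tree a = forall x . (a -> x) -> (x -> x -> x) -> x

leaf :: a -> Tree a
leaf a = \l _ -> l a

branch :: Tree a -> Tree a -> Tree a
branch t u = \l b -> b (t l b) (u l b)

-- Folding is just function application, exactly as foldr
-- became the identity function for fold-encoded lists.
sumTree :: Tree Int -> Int
sumTree t = t id (+)

main :: IO ()
main = print (sumTree (branch (leaf 1) (branch (leaf 2) (leaf 3))))
```

Neither Tree, leaf, nor branch refers to itself, yet we can still build and consume trees of any shape.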
If you want to learn more about this trick, the specific name for it is "Boehm-Berarducci encoding". If you are curious, this in turn is equivalent to an even more general concept from category theory known as "F-algebras", which let you encode inductive things in a non-inductive way.

Non-recursive code greatly simplifies equational reasoning. For example, we can easily prove that we can optimize map id l to l:

map id l

-- Inline: map f l = \cons nil -> l (\a x -> cons (f a) x) nil
= \cons nil -> l (\a x -> cons (id a) x) nil

-- Inline: id x = x
= \cons nil -> l (\a x -> cons a x) nil

-- Eta-reduce
= \cons nil -> l cons nil

-- Eta-reduce
= l

Note that we did not need to use induction to prove this optimization because map is no longer recursive. The optimization became downright trivial, so trivial that we can automate it!

Morte optimizes programs using this same simple scheme:

- Beta-reduce everything (equivalent to inlining)
- Eta-reduce everything

To illustrate this, I will desugar our high-level Haskell code to the calculus of constructions. This desugaring process is currently manual (and tedious), but I plan to automate this, too, by providing a front-end high-level language similar to Haskell that compiles to Morte:

-- mapid.mt

(   \(List : * -> *)
->  \(  map
    :   forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b
    )
->  \(id : forall (a : *) -> a -> a)
->  \(a : *) -> map a a (id a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- map
(   \(a : *)
->  \(b : *)
->  \(f : a -> b)
->  \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
->  \(x : *)
->  \(Cons : b -> x -> x)
->  \(Nil: x)
->  l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- id
(\(a : *) -> \(va : a) -> va)

This line of code is the "business end" of the program:

\(a : *) -> map a a (id a)

The extra 'a' business is because in any polymorphic lambda calculus you explicitly accept polymorphic types as arguments and specialize functions by applying them to types.
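GHC can make these normally-invisible type arguments explicit as well; here is a tiny illustration of my own (not from the post) using the TypeApplications extension, available in GHC 8.0 and later:

```haskell
{-# LANGUAGE TypeApplications #-}

-- "id @Int" passes the type argument Int explicitly, mirroring how a
-- Morte program writes "id a" before applying id to a value.
main :: IO ()
main = do
  print (id @Int 3)
  -- "map @Int" pins the element type of the input list, just as the
  -- Morte program writes "map a a" before supplying the function.
  print (map @Int (+ 1) [1, 2, 3])
```

In Morte these applications are mandatory rather than optional, which is what the "extra 'a' business" above refers to.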
Higher-level functional languages like Haskell or ML use type inference to automatically infer and supply type arguments when possible.

We can compile this program using the morte executable, which accepts a Morte program on stdin, outputs the program's type to stderr, and outputs the optimized program on stdout:

$ morte < mapid.mt
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a → x → x) → x → x

λ(a : *) → λ(l : ∀(x : *) → (a → x → x) → x → x) → l

The first line is the type, which is a desugared form of:

forall a . List a -> List a

The second line is the program, which is the identity function on lists. Morte optimized away the map completely, the same way we did by hand.

Morte also optimized away the rest of the code, too. Dead-code elimination is just an emergent property of Morte's simple optimization scheme.

Equality

We could double-check our answer by asking Morte to optimize the identity function on lists:

-- idlist.mt

(   \(List : * -> *)
->  \(id : forall (a : *) -> a -> a)
->  \(a : *) -> id (List a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- id
(\(a : *) -> \(va : a) -> va)

Sure enough, Morte outputs an alpha-equivalent result (meaning the same up to variable renaming):

$ ~/.cabal/bin/morte < idlist.mt
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a → x → x) → x → x

λ(a : *) → λ(va : ∀(x : *) → (a → x → x) → x → x) → va

We can even use the morte library to mechanically check if two Morte expressions are alpha-, beta-, and eta-equivalent.
We can parse our two Morte files into Morte's Expr type and then use the Eq instance for Expr to test for equivalence:

$ ghci
Prelude> import qualified Data.Text.Lazy.IO as Text
Prelude Text> txt1 <- Text.readFile "mapid.mt"
Prelude Text> txt2 <- Text.readFile "idlist.mt"
Prelude Text> import Morte.Parser (exprFromText)
Prelude Text Morte.Parser> let e1 = exprFromText txt1
Prelude Text Morte.Parser> let e2 = exprFromText txt2
Prelude Text Morte.Parser> import Control.Applicative (liftA2)
Prelude Text Morte.Parser Control.Applicative> liftA2 (==) e1 e2
Right True
$ -- `Right` means both expressions parsed successfully
$ -- `True` means they are alpha-, beta-, and eta-equivalent

We can use this to mechanically verify that two Morte programs optimize to the same result.

Compile-time computation

Morte can compute as much (or as little) at compile time as you want. The more information you encode directly within lambda calculus, the more compile-time computation Morte will perform for you. For example, if we translate our Haskell List code entirely to lambda calculus, then Morte will statically compute the result at compile time.

-- nine.mt

(   \(Nat : *)
->  \(zero : Nat)
->  \(one : Nat)
->  \((+) : Nat -> Nat -> Nat)
->  \((*) : Nat -> Nat -> Nat)
->  \(List : * -> *)
->  \(Cons : forall (a : *) -> a -> List a -> List a)
->  \(Nil : forall (a : *) -> List a)
->  \(  map
    :   forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b
    )
->  \(  foldr
    :   forall (a : *) -> List a -> forall (r : *) -> (a -> r -> r) -> r -> r
    )
->  (   \(two : Nat)
    ->  \(three : Nat)
    ->  (   \(example : List Nat)
        ->  foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero
        )
        -- example
        (Cons Nat one (Cons Nat two (Cons Nat three (Nil Nat))))
    )
    -- two
    ((+) one one)
    -- three
    ((+) one ((+) one one))
)

-- Nat
(   forall (a : *)
->  (a -> a)
->  a
->  a
)

-- zero
(   \(a : *)
->  \(Succ : a -> a)
->  \(Zero : a)
->  Zero
)

-- one
(   \(a : *)
->  \(Succ : a -> a)
->  \(Zero : a)
->  Succ Zero
)

-- (+)
(   \(m : forall (a : *) -> (a -> a) -> a -> a)
->  \(n : forall (a : *) -> (a -> a)
-> a -> a)
->  \(a : *)
->  \(Succ : a -> a)
->  \(Zero : a)
->  m a Succ (n a Succ Zero)
)

-- (*)
(   \(m : forall (a : *) -> (a -> a) -> a -> a)
->  \(n : forall (a : *) -> (a -> a) -> a -> a)
->  \(a : *)
->  \(Succ : a -> a)
->  \(Zero : a)
->  m a (n a Succ) Zero
)

-- List
(   \(a : *)
->  forall (x : *)
->  (a -> x -> x)  -- Cons
->  x              -- Nil
->  x
)

-- Cons
(   \(a : *)
->  \(va : a)
->  \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
->  \(x : *)
->  \(Cons : a -> x -> x)
->  \(Nil : x)
->  Cons va (vas x Cons Nil)
)

-- Nil
(   \(a : *)
->  \(x : *)
->  \(Cons : a -> x -> x)
->  \(Nil : x)
->  Nil
)

-- map
(   \(a : *)
->  \(b : *)
->  \(f : a -> b)
->  \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
->  \(x : *)
->  \(Cons : b -> x -> x)
->  \(Nil: x)
->  l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- foldr
(   \(a : *)
->  \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
->  vas
)

The relevant line is:

foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

If you remove the type applications to Nat, this parallels our original Haskell example. We can then evaluate this expression at compile time:

$ morte < nine.mt
∀(a : *) → (a → a) → a → a

λ(a : *) → λ(Succ : a → a) → λ(Zero : a) → Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))

Morte reduces our program to a church-encoded nine.

Run-time computation

Morte does not force you to compute everything using lambda calculus at compile time. Suppose that we wanted to use machine arithmetic at run-time instead.
We can do this by parametrizing our program on:the Int type,operations on Ints, andany integer literals we useWe accept these "foreign imports" as ordinary arguments to our program:-- foreign.mt-- Foreign imports \(Int : *) -- Foreign type-> \((+) : Int -> Int -> Int) -- Foreign function-> \((*) : Int -> Int -> Int) -- Foreign function-> \(lit@0 : Int) -- Literal "1" -- Foreign data-> \(lit@1 : Int) -- Literal "2" -- Foreign data-> \(lit@2 : Int) -- Literal "3" -- Foreign data-> \(lit@3 : Int) -- Literal "1" -- Foreign data-> \(lit@4 : Int) -- Literal "0" -- Foreign data-- The rest is compile-time lambda calculus-> ( \(List : * -> *) -> \(Cons : forall (a : *) -> a -> List a -> List a) -> \(Nil : forall (a : *) -> List a) -> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b ) -> \( foldr : forall (a : *) -> List a -> forall (r : *) -> (a -> r -> r) -> r -> r ) -> ( \(example : List Int) -> foldr Int (map Int Int ((+) lit@3) example) Int (+) lit@4 ) -- example (Cons Int lit@0 (Cons Int lit@1 (Cons Int lit@2 (Nil Int)))) ) -- List ( \(a : *) -> forall (x : *) -> (a -> x -> x) -- Cons -> x -- Nil -> x ) -- Cons ( \(a : *) -> \(va : a) -> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x) -> \(x : *) -> \(Cons : a -> x -> x) -> \(Nil : x) -> Cons va (vas x Cons Nil) ) -- Nil ( \(a : *) -> \(x : *) -> \(Cons : a -> x -> x) -> \(Nil : x) -> Nil ) -- map ( \(a : *) -> \(b : *) -> \(f : a -> b) -> \(l : forall (x : *) -> (a -> x -> x) -> x -> x) -> \(x : *) -> \(Cons : b -> x -> x) -> \(Nil: x) -> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil ) -- foldr ( \(a : *) -> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x) -> vas )We can use Morte to optimize the above program and Morte will reduce the program to nothing but foreign types, operations, and values:$ morte < foreign.mt∀(Int : *) → (Int → Int → Int) → (Int → Int → Int) → Int →Int → Int → Int → Int → Intλ(Int : *) → λ((+) : Int → Int → Int) → λ((*) : Int → Int → Int) → λ(lit : Int) → 
λ(lit@1 : Int) → λ(lit@2 : Int) → λ(lit@3 : Int) → λ(lit@4 : Int) → (+) ((+) lit@3 lit) ((+) ((+) lit@3 lit@1) ((+) ((+) lit@3 lit@2) lit@4))If you study that closely, Morte adds lit@3 (the "1" literal) to each literal of the list and then adds them up. We can then pass this foreign syntax tree to our machine arithmetic backend to transform those foreign operations to efficient operations.Morte lets you choose how much information you want to encode within lambda calculus. The more information you encode in lambda calculus the more Morte can optimize your program, but the slower your compile times will get, so it's a tradeoff.CorecursionCorecursion is the dual of recursion. Where recursion works on finite data types, corecursion works on potentially infinite data types. An example would be the following infinite Stream in Haskell:data Stream a = Cons a (Stream a)numbers :: Stream Intnumbers = go 0 where go n = Cons n (go (n + 1))-- numbers = Cons 0 (Cons 1 (Cons 2 (...map :: (a -> b) -> Stream a -> Stream bmap f (Cons a l) = Cons (f a) (map f l)example :: Stream Intexample = map (+ 1) numbers-- example = Cons 1 (Cons 2 (Cons 3 (...Again, pretend that we disable any function from referencing itself so that the above code becomes invalid. This time we cannot reuse the same trick from previous sections because we cannot encode numbers as a fold without referencing itself. Try this if you don't believe me.However, we can still encode corecursive things in a non-corecursive way. This time, we encode our Stream type as an unfold instead of a fold:-- This is also valid Haskell code{-# LANGUAGE ExistentialQuantification #-}data Stream a = forall s . 
MkStream { seed :: s , step :: s -> (a, s) }numbers :: Stream Intnumbers = MkStream 0 (\n -> (n, n + 1))map :: (a -> b) -> Stream a -> Stream bmap f (MkStream s0 k) = MkStream s0 k' where k' s = (f a, s') where (a, s') = k s In other words, we store an initial seed of some type s and a step function of type s -> (a, s) that emits one element of our Stream. The type of our seed s can be anything and in our numbers example, the type of the internal state is Int. Another stream could use a completely different internal state of type (), like this:-- ones = Cons 1 onesones :: Stream Intones = MkStream () (\_ -> (1, ()))The general name for this trick is an "F-coalgebra" encoding of a corecursive type.Once we encode our infinite stream non-recursively, we can safely optimize the stream by inlining and eta reduction:map id l-- l = MkStream s0 k= map id (MkStream s0 k)-- Inline definition of `map`= MkStream s0 k' where k' = \s -> (id a, s') where (a, s') = k s-- Inline definition of `id`= MkStream s0 k' where k' = \s -> (a, s') where (a, s') = k s-- Inline: (a, s') = k s= MkStream s0 k' where k' = \s -> k s-- Eta reduce= MkStream s0 k' where k' = k-- Inline: k' = k= MkStream s0 k-- l = MkStream s0 k= lNow let's encode Stream and map in Morte and compile the following four expressions:map ididmap f . map gmap (f . g)Save the following Morte file to stream.mt and then uncomment the expression you want to test:( \(id : forall (a : *) -> a -> a)-> \( (.) 
: forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> (a -> c) )-> \(Pair : * -> * -> *)-> \(P : forall (a : *) -> forall (b : *) -> a -> b -> Pair a b)-> \( first : forall (a : *) -> forall (b : *) -> forall (c : *) -> (a -> b) -> Pair a c -> Pair b c )-> ( \(Stream : * -> *) -> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> Stream a -> Stream b ) -- example@1 = example@2 -> ( \(example@1 : forall (a : *) -> Stream a -> Stream a) -> \(example@2 : forall (a : *) -> Stream a -> Stream a) -- example@3 = example@4 -> \( example@3 : forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> Stream a -> Stream c ) -> \( example@4 : forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> Stream a -> Stream c ) -- Uncomment the example you want to test -> example@1-- -> example@2-- -> example@3-- -> example@4 ) -- example@1 (\(a : *) -> map a a (id a)) -- example@2 (\(a : *) -> id (Stream a)) -- example@3 ( \(a : *) -> \(b : *) -> \(c : *) -> \(f : b -> c) -> \(g : a -> b) -> map a c ((.) a b c f g) ) -- example@4 ( \(a : *) -> \(b : *) -> \(c : *) -> \(f : b -> c) -> \(g : a -> b) -> (.) 
(Stream a) (Stream b) (Stream c) (map b c f) (map a b g) ) ) -- Stream ( \(a : *) -> forall (x : *) -> (forall (s : *) -> s -> (s -> Pair a s) -> x) -> x ) -- map ( \(a : *) -> \(b : *) -> \(f : a -> b) -> \( st : forall (x : *) -> (forall (s : *) -> s -> (s -> Pair a s) -> x) -> x ) -> \(x : *) -> \(S : forall (s : *) -> s -> (s -> Pair b s) -> x) -> st x ( \(s : *) -> \(seed : s) -> \(step : s -> Pair a s) -> S s seed (\(seed@1 : s) -> first a b s f (step seed@1)) ) ))-- id(\(a : *) -> \(va : a) -> va)-- (.)( \(a : *)-> \(b : *)-> \(c : *)-> \(f : b -> c)-> \(g : a -> b)-> \(va : a)-> f (g va))-- Pair(\(a : *) -> \(b : *) -> forall (x : *) -> (a -> b -> x) -> x)-- P( \(a : *)-> \(b : *)-> \(va : a)-> \(vb : b)-> \(x : *)-> \(P : a -> b -> x)-> P va vb)-- first( \(a : *)-> \(b : *)-> \(c : *)-> \(f : a -> b)-> \(p : forall (x : *) -> (a -> c -> x) -> x)-> \(x : *)-> \(Pair : b -> c -> x)-> p x (\(va : a) -> \(vc : c) -> Pair (f va) vc))Both example@1 and example@2 will generate alpha-equivalent code:$ morte < example1.mt∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → xλ(a : *) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → st$ morte < example2.mt∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → xλ(a : *) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → vaSimilarly, example@3 and example@4 will generate alpha-equivalent code:$ morte < example3.mt∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → xλ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) 
→ s → (s → ∀(x : *) → (c → s → x) → x) → x) → st x (λ(s : *) → λ(seed : s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x (λ(va : a) → Pair (f (g va)))))$ morte < example4.mt∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → xλ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → va x (λ(s : *) → λ(seed : s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x (λ(va : a) → Pair (f (g va))))We inadvertently proved stream fusion for free, but we're still not done, yet! Everything we learn about recursive and corecursive sequences can be applied to model recursive and corecursive effects!EffectsI will conclude this post by showing how to model both recursive and corecursive programs that have side effects. The recursive program will echo ninety-nine lines from stdin to stdout. 
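Before looking at the Morte translation, the recursive IO encoding can be test-driven in plain Haskell. The interpreter `run` below is my addition for illustration, not part of the post, and `echoN` uses ordinary Haskell recursion for brevity; the post's point is that Morte instead unrolls this loop via foldNat.

```haskell
-- The recursive IO type from the post's comment header
-- (renamed IOP to avoid clashing with Prelude's IO)
data IOP r
  = PutStrLn String (IOP r)
  | GetLine (String -> IOP r)
  | Return r

-- Echo n lines from "stdin" to "stdout"
echoN :: Int -> IOP ()
echoN 0 = Return ()
echoN n = GetLine (\str -> PutStrLn str (echoN (n - 1)))

-- A pure interpreter (hypothetical): feed canned input lines,
-- collect output lines and the final result.
run :: IOP r -> [String] -> ([String], r)
run (Return r)     _        = ([], r)
run (PutStrLn s k) ins      = let (out, r) = run k ins in (s : out, r)
run (GetLine f)    (i:ins)  = run (f i) ins
run (GetLine _)    []       = error "ran out of input"

main :: IO ()
main = print (fst (run (echoN 3) ["a", "b", "c", "d"]))
```

Running this prints ["a","b","c"]: three lines read and echoed, with the unused input discarded.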
The equivalent Haskell program is in the comment header:-- recursive.mt-- The Haskell code we will translate to Morte:---- import Prelude hiding (-- (+), (*), IO, putStrLn, getLine, (>>=), (>>), return )-- -- -- Simple prelude---- data Nat = Succ Nat | Zero---- zero :: Nat-- zero = Zero---- one :: Nat-- one = Succ Zero---- (+) :: Nat -> Nat -> Nat-- Zero + n = n-- Succ m + n = m + Succ n---- (*) :: Nat -> Nat -> Nat-- Zero * n = Zero-- Succ m * n = n + (m * n)---- foldNat :: Nat -> (a -> a) -> a -> a-- foldNat Zero f x = x-- foldNat (Succ m) f x = f (foldNat m f x)---- data IO r-- = PutStrLn String (IO r)-- | GetLine (String -> IO r)-- | Return r---- putStrLn :: String -> IO U-- putStrLn str = PutStrLn str (Return Unit)---- getLine :: IO String-- getLine = GetLine Return---- return :: a -> IO a-- return = Return---- (>>=) :: IO a -> (a -> IO b) -> IO b-- PutStrLn str io >>= f = PutStrLn str (io >>= f)-- GetLine k >>= f = GetLine (\str -> k str >>= f)-- Return r >>= f = f r---- -- Derived functions---- (>>) :: IO U -> IO U -> IO U-- m >> n = m >>= \_ -> n---- two :: Nat-- two = one + one---- three :: Nat-- three = one + one + one---- four :: Nat-- four = one + one + one + one---- five :: Nat-- five = one + one + one + one + one---- six :: Nat-- six = one + one + one + one + one + one---- seven :: Nat-- seven = one + one + one + one + one + one + one---- eight :: Nat-- eight = one + one + one + one + one + one + one + one---- nine :: Nat-- nine = one + one + one + one + one + one + one + one + one---- ten :: Nat-- ten = one + one + one + one + one + one + one + one + one + one---- replicateM_ :: Nat -> IO U -> IO U-- replicateM_ n io = foldNat n (io >>) (return Unit)---- ninetynine :: Nat-- ninetynine = nine * ten + nine---- main_ :: IO U-- main_ = replicateM_ ninetynine (getLine >>= putStrLn)-- "Free" variables( \(String : * )-> \(U : *)-> \(Unit : U) -- Simple prelude-> ( \(Nat : *) -> \(zero : Nat) -> \(one : Nat) -> \((+) : Nat -> Nat -> Nat) -> \((*) : Nat -> 
Nat -> Nat) -> \(foldNat : Nat -> forall (a : *) -> (a -> a) -> a -> a) -> \(IO : * -> *) -> \(return : forall (a : *) -> a -> IO a) -> \((>>=) : forall (a : *) -> forall (b : *) -> IO a -> (a -> IO b) -> IO b ) -> \(putStrLn : String -> IO U) -> \(getLine : IO String) -- Derived functions -> ( \((>>) : IO U -> IO U -> IO U) -> \(two : Nat) -> \(three : Nat) -> \(four : Nat) -> \(five : Nat) -> \(six : Nat) -> \(seven : Nat) -> \(eight : Nat) -> \(nine : Nat) -> \(ten : Nat) -> ( \(replicateM_ : Nat -> IO U -> IO U) -> \(ninetynine : Nat) -> replicateM_ ninetynine ((>>=) String U getLine putStrLn) ) -- replicateM_ ( \(n : Nat) -> \(io : IO U) -> foldNat n (IO U) ((>>) io) (return U Unit) ) -- ninetynine ((+) ((*) nine ten) nine) ) -- (>>) ( \(m : IO U) -> \(n : IO U) -> (>>=) U U m (\(_ : U) -> n) ) -- two ((+) one one) -- three ((+) one ((+) one one)) -- four ((+) one ((+) one ((+) one one))) -- five ((+) one ((+) one ((+) one ((+) one one)))) -- six ((+) one ((+) one ((+) one ((+) one ((+) one one))))) -- seven ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))) -- eight ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))) -- nine ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))) -- ten ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))))) ) -- Nat ( forall (a : *) -> (a -> a) -> a -> a ) -- zero ( \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> Zero ) -- one ( \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> Succ Zero ) -- (+) ( \(m : forall (a : *) -> (a -> a) -> a -> a) -> \(n : forall (a : *) -> (a -> a) -> a -> a) -> \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> m a Succ (n a Succ Zero) ) -- (*) ( \(m : forall (a : *) -> (a -> a) -> a -> a) -> \(n : forall (a : *) -> (a -> a) -> a -> a) -> \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> m a (n a Succ) Zero ) -- foldNat ( \(n : forall (a : *) -> (a -> a) -> a -> a) -> n ) -- IO ( \(r : *) -> forall (x : *) 
-> (String -> x -> x) -> ((String -> x) -> x) -> (r -> x) -> x ) -- return ( \(a : *) -> \(va : a) -> \(x : *) -> \(PutStrLn : String -> x -> x) -> \(GetLine : (String -> x) -> x) -> \(Return : a -> x) -> Return va ) -- (>>=) ( \(a : *) -> \(b : *) -> \(m : forall (x : *) -> (String -> x -> x) -> ((String -> x) -> x) -> (a -> x) -> x ) -> \(f : a -> forall (x : *) -> (String -> x -> x) -> ((String -> x) -> x) -> (b -> x) -> x ) -> \(x : *) -> \(PutStrLn : String -> x -> x) -> \(GetLine : (String -> x) -> x) -> \(Return : b -> x) -> m x PutStrLn GetLine (\(va : a) -> f va x PutStrLn GetLine Return) ) -- putStrLn ( \(str : String) -> \(x : *) -> \(PutStrLn : String -> x -> x ) -> \(GetLine : (String -> x) -> x) -> \(Return : U -> x) -> PutStrLn str (Return Unit) ) -- getLine ( \(x : *) -> \(PutStrLn : String -> x -> x ) -> \(GetLine : (String -> x) -> x) -> \(Return : String -> x) -> GetLine Return ))This program will compile to a completely unrolled read-write loop, as most recursive programs will:$ morte < recursive.mt∀(String : *) → ∀(U : *) → U → ∀(x : *) → (String → x → x) → ((String → x) → x) → (U → x) → xλ(String : *) → λ(U : *) → λ(Unit : U) → λ(x : *) → λ(PutStrLn : String → x → x) → λ(GetLine : (String → x) → x) → λ(Return : U → x) → GetLine (λ(va : String) → PutStrLn va (GetLine (λ(va@1 : String) → PutStrLn va@1 (GetLine (λ(va@2 : String) → PutStrLn va@2 (GetLine (λ(va@3 : String) → PutStrLn ... ... 
GetLine (λ(va@92 : String) → PutStrLn va@92 (GetLine (λ(va@93 : String) → PutStrLn va@93 (GetLine (λ(va@94 : String) → PutStrLn va@94 (GetLine (λ(va@95 : String) → PutStrLn va@95 (GetLine (λ(va@96 : String) → PutStrLn va@96 (GetLine (λ(va@97 : String) → PutStrLn va@97 (GetLine (λ(va@98 : String) → PutStrLn va@98 (Return Unit))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))In contrast, if we encode the effects corecursively we can express a program that echoes indefinitely from stdin to stdout:-- corecursive.mt-- data IOF r s-- = PutStrLn String s-- | GetLine (String -> s)-- | Return r---- data IO r = forall s . MkIO s (s -> IOF r s)---- main = MkIO-- Nothing-- (maybe (\str -> PutStrLn str Nothing) (GetLine Just))( \(String : *)-> ( \(Maybe : * -> *) -> \(Just : forall (a : *) -> a -> Maybe a) -> \(Nothing : forall (a : *) -> Maybe a) -> \( maybe : forall (a : *) -> Maybe a -> forall (x : *) -> (a -> x) -> x -> x ) -> \(IOF : * -> * -> *) -> \( PutStrLn : forall (r : *) -> forall (s : *) -> String -> s -> IOF r s ) -> \( GetLine : forall (r : *) -> forall (s : *) -> (String -> s) -> IOF r s ) -> \( Return : forall (r : *) -> forall (s : *) -> r -> IOF r s ) -> ( \(IO : * -> *) -> \( MkIO : forall (r : *) -> forall (s : *) -> s -> (s -> IOF r s) -> IO r ) -> ( \(main : forall (r : *) -> IO r) -> main ) -- main ( \(r : *) -> MkIO r (Maybe String) (Nothing String) ( \(m : Maybe String) -> maybe String m (IOF r (Maybe String)) (\(str : String) -> PutStrLn r (Maybe String) str (Nothing String) ) (GetLine r (Maybe String) (Just String)) ) ) ) -- IO ( \(r : *) -> forall (x : *) -> (forall (s : *) -> s -> (s -> IOF r s) -> x) -> x ) -- MkIO ( \(r : *) -> \(s : *) -> \(seed : s) -> \(step : s -> IOF r s) -> \(x : *) -> \(k : forall (s : *) -> s -> (s -> IOF r s) -> x) -> k s seed step ) ) -- Maybe (\(a : *) -> forall (x : *) -> 
(a -> x) -> x -> x) -- Just ( \(a : *) -> \(va : a) -> \(x : *) -> \(Just : a -> x) -> \(Nothing : x) -> Just va ) -- Nothing ( \(a : *) -> \(x : *) -> \(Just : a -> x) -> \(Nothing : x) -> Nothing ) -- maybe ( \(a : *) -> \(m : forall (x : *) -> (a -> x) -> x-> x) -> m ) -- IOF ( \(r : *) -> \(s : *) -> forall (x : *) -> (String -> s -> x) -> ((String -> s) -> x) -> (r -> x) -> x ) -- PutStrLn ( \(r : *) -> \(s : *) -> \(str : String) -> \(vs : s) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> PutStrLn str vs ) -- GetLine ( \(r : *) -> \(s : *) -> \(k : String -> s) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> GetLine k ) -- Return ( \(r : *) -> \(s : *) -> \(vr : r) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> Return vr ))This compiles to a state machine that we can unfold one step at a time:$ morte < corecursive.mt∀(String : *) → ∀(r : *) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r → x) → x) → x) → xλ(String : *) → λ(r : *) → λ(x : *) → λ(k : ∀(s : *) → s → (s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r → x) → x) → x) → k (∀(x : *) → (String → x) → x → x) (λ(x : *) → λ(Just : String → x) → λ(Nothing : x) → Nothing) (λ(m : ∀(x : *) → (String → x) → x → x) → m (∀(x : *) → (String → (∀(x : *) → (String → x) → x → x) → x) → ((String → ∀(x : *) → (String → x) → x → x) → x) → (r → x) → x) (λ(str : String) → λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x → x) → x) → λ(Return : r → x) → PutStrLn str (λ(x : *) → λ(Just : String → x) → λ(Nothing : x) → Nothing)) (λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x → x) → x) → λ(Return : r → x) → GetLine (λ(va : String) → λ(x : *) → λ(Just : 
String → x) → λ(Nothing : x) → Just va))

I don't expect you to understand that output other than to know that we can translate the output to any backend that provides functions and primitive read/write operations.

Conclusion

If you would like to use Morte, you can find the library on both GitHub and Hackage. I also provide a Morte tutorial that you can use to learn more about the library.

Morte is dependently typed in theory, but in practice I have not exercised this feature, so I don't understand its implications. If this turns out to be a mistake then I will downgrade Morte to System Fω, which has higher kinds and polymorphism, but no dependent types.

Additionally, Morte might be usable to transmit code in a secure and typed way in a distributed environment, or to share code between diverse functional languages by providing a common intermediate language. However, both of those scenarios require additional work, such as establishing a shared set of foreign primitives and creating Morte encoders/decoders for each target language.

Also, there are additional optimizations which Morte might implement in the future. 
For example, Morte could use free theorems (equalities you deduce from the types) to simplify some code fragments even further, but Morte currently does not do this.

My next goals are:

- Add a back-end to compile Morte to LLVM
- Add a front-end to desugar a medium-level Haskell-like language to Morte

Once those steps are complete, Morte will be a usable intermediate language for writing super-optimizable programs.

Also, if you're wondering, the name Morte is a tribute to a talking skull from the game Planescape: Torment, since the Morte library is a "bare-bones" calculus of constructions.

Literature

If this topic interests you more, you may find the following links helpful, in roughly increasing order of difficulty:

- Data and Codata - A blog post by Dan Piponi introducing the notions of data and codata
- Church encoding - A Wikipedia article on Church encoding (converting things to lambda expressions)
- Total Functional Programming - A paper by D. A. Turner on total programming using data and codata
- Recursive types for free! - An unpublished draft by Philip Wadler about F-algebras and F-coalgebras
- Understanding F-algebras - A blog post by Bartosz Milewski
- Beyond Church encoding: Boehm-Berarducci isomorphism of algebraic data types and polymorphic lambda-terms - Oleg Kiselyov's collection of notes on Boehm-Berarducci encoding, a more complete version of Church encoding
- F-algebras and F-coalgebras - Wikipedia articles that are short, sweet, and utterly impenetrable unless you already understand the subject

Gabriel Gonzalez
tag:blogger.com,1999:blog-1777990983847811806.post-1179905794869657936
Fri, 12 Sep 2014 11:00:00 +0000

PSA: GHC 7.10, cabal, and Windows
https://www.fpcomplete.com/blog/2015/05/psa-ghc-710-cabal-windows
Since we've received multiple bug
reports on this, and there are
many people suffering from it reporting on the cabal
issue, Neil and I decided a more
public announcement was warranted.

There is an as-yet undiagnosed bug in cabal which causes some packages to fail
to install. Packages known to be affected are blaze-builder-enumerator,
data-default-instances-old-locale, vector-binary-instances, and
data-default-instances-containers. The output looks something like:

Resolving dependencies...
Configuring data-default-instances-old-locale-0.0.1...
Building data-default-instances-old-locale-0.0.1...
Failed to install data-default-instances-old-locale-0.0.1
Build log ( C:\Users\gl67\AppData\Roaming\cabal\logs\data-default-instances-old-locale-0.0.1.log ):
Building data-default-instances-old-locale-0.0.1...
Preprocessing library data-default-instances-old-locale-0.0.1...
[1 of 1] Compiling Data.Default.Instances.OldLocale ( Data\Default\Instances\OldLocale.hs, dist\build\Data\Default\Instances\OldLocale.o )
C:\Users\gl67\repos\MinGHC\7.10.1\ghc-7.10.1\mingw\bin\ar.exe: dist\build\libHSdata-default-instances-old-locale-0.0.1-6jcjjaR25tK4x3nJhHHjFM.a-11336\libHSdata-default-instances-old-locale-0.0.1-6jcjjaR25tK4x3nJhHHjFM.a: No such file or directory
cabal.exe: Error: some packages failed to install:
data-default-instances-old-locale-0.0.1 failed during the building phase. The
exception was:
ExitFailure 1

There are two workarounds I know of at this time:

1. You can manually unpack and install the package, which seems to work, e.g.:

       cabal unpack data-default-instances-old-locale-0.0.1
       cabal install .\data-default-instances-old-locale-0.0.1

2. Drop down to GHC 7.8.4 until the cabal bug is fixed.

For normal users, you can stop reading here. If you're interested in more
details and may be able to help fix it, here's a summary of the research I've
done so far:

As far as I can tell, this is a bug in cabal-install, not the Cabal library.
Despite reports to the contrary, the parallelization level (the -j option) does
not appear to have any impact. The only thing that seems to
affect the behavior is whether cabal-install unpacks and installs in one
step, or does it in two steps. That's why unpacking and then installing works
around the bug.

I've stared at cabal logs on this quite a bit, but don't see a rhyme or reason
to what's happening here. The bug is easily reproducible, so hopefully someone
with more cabal expertise will be able to look at this soon, as this bug has
high severity and has been affecting Windows users for almost two months.

https://www.fpcomplete.com/blog/2015/05/psa-ghc-710-cabal-windows
Tue, 19 May 2015 08:20:00 +0000

Compiling a Lazy Language in 1,000 words
http://jozefg.bitbucket.org/posts/2015-05-19-compiling.html
http://jozefg.bitbucket.org/posts/2015-05-19-compiling.html
Tue, 19 May 2015 00:00:00 +0000

Can we write a Monoidal Either?
http://jaspervdj.be/posts/2015-05-19-monoidal-either.html
http://jaspervdj.be/posts/2015-05-19-monoidal-either.html
Tue, 19 May 2015 00:00:00 +0000

How Haskellers are seen and see themselves
http://chrisdone.com/posts/haskellers
http://chrisdone.com/posts/haskellers
Tue, 19 May 2015 00:00:00 +0000

Royal Navy whistleblower says Trident is "a disaster waiting to happen"
http://wadler.blogspot.com/2015/05/royal-navy-whistleblower-says-trident.html
A Royal Navy weapons expert who served on HMS Victorious from January to April this year has released via WikiLeaks an eighteen-page report claiming Trident is "a disaster waiting to happen".

McNeilly's report on WikiLeaks.

Original report in The Sunday Herald.

McNeilly's report alleges 30 safety and security flaws on Trident submarines, based at Faslane on the Clyde. They include failures in testing whether missiles could be safely launched, burning toilet rolls starting a fire in a missile compartment, and security passes and bags going unchecked. He also reports alarms being muted because they went off so often, missile safety procedures being ignored, and top secret information left unguarded.

The independent nuclear submarine expert, John Large, concluded McNeilly was credible, though he may have misunderstood some of the things he saw. Large said: "Even if he is right about the disorganisation, lack of morale, and sheer foolhardiness of the personnel around him - and the unreliability of the engineered systems - it is likely that the Trident system as a whole will tolerate the misdemeanours, as it's designed to do."

(Regarding the quote from Large, I'm less sanguine. Ignoring alarms is a standard prelude to disaster. See Normal Accidents.)

Second report in The National.

"We are so close to a nuclear disaster it is shocking, and yet everybody is accepting the risk to the public," he warned. "It's just a matter of time before we're infiltrated by a psychopath or a terrorist."

Coverage in CommonSpace.

Philip Wadler
tag:blogger.com,1999:blog-9757377.post-1327101067614959274
Mon, 18 May 2015 22:14:00 +0000

The internet of code
http://www.haskellforall.com/2015/05/the-internet-of-code.html
In this post I will introduce a proof-of-concept implementation for distributing typed code over the internet, where the unit of compilation is individual expressions.

The core language

To motivate this post, consider this Haskell code:

data Bool = True | False

and :: Bool -> Bool -> Bool
and b1 b2 = if b1 then b2 else False

or :: Bool -> Bool -> Bool
or b1 b2 = if b1 then True else b2

data Even = Zero | SuccE Odd

data Odd = SuccO Even

four :: Even
four = SuccE (SuccO (SuccE (SuccO Zero)))

doubleEven :: Even -> Even
doubleEven (SuccE o) = SuccE (SuccO (doubleOdd o))
doubleEven Zero = Zero

doubleOdd :: Odd -> Even
doubleOdd (SuccO e) = SuccE (SuccO (doubleEven e))

I will encode each one of the above types, terms, and constructors as separate, closed, non-recursive expressions in the calculus of constructions. You can think of the calculus of constructions as a typed assembly language for functional programs which we will use to distribute program fragments over the internet. You can learn more about the calculus of constructions and other pure type systems by reading this clear paper by Simon Peyton Jones: "Henk: a typed intermediate language".

For example, here is how you encode the True constructor in the calculus of constructions:

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

Note that the entire expression is the True constructor, not just the right-hand side:

  This is the True constructor
  vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True
                                                   ^^^^
                                                   Not this

I just chose the variable names so that you can tell at a glance what constructor you are looking at from the right-hand side of the expression.

Similarly, here is how you encode the type Bool in the calculus of constructions:

  This is the `Bool` type
  vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool
                                                   ^^^^
                                                   Not this

Again, the entire expression is the Bool type, but I chose the variable names so that you can tell 
which type you are looking at from the right-hand side.You can learn more about the full set of rules for translating data types to System F (a subset of the calculus of constructions) by reading the paper: "Automatic synthesis of typed Λ-programs on term algebras". Also, I will soon release a compiler named annah that automates this translation algorithm, and I used this compiler to translate the above Haskell code to the equivalent expressions in the calculus of constructions.DistributionWe can distribute these expressions by hosting each expression as text source code on the internet. For example, I encoded all of the above types, terms and constructors in the calculus of constructions and hosted them using a static web server. You can browse these expressions by visiting sigil.place/post/0/.Click on one of the expressions in the directory listing to see how they are encoded in the calculus of constructions. For example, if you click the link to four you will find an ordinary text file whose contents look like this (formatted for clarity): λ(Even : *) -- This entire→ λ(Odd : *) -- expression is→ λ(Zero : Even) -- the number `four`→ λ(SuccE : ∀(pred : Odd) → Even) --→ λ(SuccO : ∀(pred : Even) → Odd) --→ SuccE (SuccO (SuccE (SuccO Zero))) -- Not just this last lineEach one of these expressions gets a unique URL, and we can embed any expression in our code by referencing the appropriate URL.Remote importsWe can use the morte compiler to download, parse, and super-optimize programs written in the calculus of constructions. 
The morte compiler reads in a program from standard input, outputs the program's type to standard error, then super-optimizes the program and outputs the optimized program to standard output.For example, we can compute and True False at compile time by just replacing and, True, and False by their appropriate URLs:$ cabal install 'morte >= 1.2'$ morte#http://sigil.place/post/0/and #http://sigil.place/post/0/True #http://sigil.place/post/0/FalseWhen we hit to signal end of standard input then morte will compile the program:$ morte#http://sigil.place/post/0/and #http://sigil.place/post/0/True #http://sigil.place/post/0/False ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → FalseThe program's type is Bool and morte optimizes away the program to False at compile time. Both the type (Bool) and the value (False) are encoded in the calculus of constructions.Here we are using morte as a compile-time calculator, mainly because Morte does not yet compile to a backend language. When I release a backend language I will go into more detail about how to embed expressions to evaluate at runtime instead of compile time.Local importsWe can shorten this example further because morte also lets you import expressions from local files using the same hashtag syntax. For example, we can create local files that wrap remote URLs like this:$ echo "#http://sigil.place/post/0/Bool" > Bool$ echo "#http://sigil.place/post/0/True" > True$ echo "#http://sigil.place/post/0/False" > False$ echo "#http://sigil.place/post/0/or" > orWe can then use these local files as convenient short-hand aliases for remote imports:$ morte#or #True #False ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → TrueWe can use imports anywhere in our program, even as types! 
For example, in the calculus of constructions you encode if as the identity function on #Bools:

    λ(b : #Bool ) → b  -- Note: imports must end with whitespace

We can then save our implementation of if to a file named if, except using the ASCII symbols \ and -> instead of λ and →:

    $ echo "\(b : #Bool ) -> b" > if

Now we can define our own and function in terms of if. Remember that the Haskell definition of and is:

    and b1 b2 = if b1 then b2 else False

Our definition won't be much different:

    $ echo "\(b1 : #Bool ) -> \(b2 : #Bool ) -> #if b1 #Bool b2 #False" > and

Let's confirm that our new and function works:

    $ echo "#and #True #False" | morte
    ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

    λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

We can also ask morte to resolve all imports for our and function and optimize the result:

    $ morte < and
    ∀(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → ∀(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

    λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) b2 (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

We can then compare our version with the and expression hosted online, which is identical:

    $ curl sigil.place/post/0/and
    λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) b2 (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

Reduction

When we write an expression like this:

    #or #True #False

the compiler resolves all imports transitively until all that is left is an expression in the calculus of constructions, like this one:

    -- or
    (   λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    →   b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
           (λ(Bool : *) → λ(True : Bool) →
    λ(False : Bool) → True))

    -- True
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )

    -- False
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

Then the compiler reduces this expression using β-reduction and η-reduction. We can safely reduce these expressions at compile time because these reductions always terminate in the calculus of constructions, which is a total and non-recursive language. For example, the above expression reduces to:

    (   λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    →   b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
           (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
    )
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

    -- β-reduce
    = (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True )
      (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
      (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
      (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

    -- β-reduce
    = (   λ(True  : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
      →   λ(False : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
      →   True
      )
      (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
      (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

    -- β-reduce
    = (   λ(False : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
      →   λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True
      )
      (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

    -- β-reduce
    = λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

Linking

The and we defined is "dynamically linked", meaning that the file we saved has not yet resolved all imports:

    $ cat and
    \(b1 : #Bool ) -> \(b2 : #Bool ) -> #if b1 #Bool b2 #False

The morte compiler will resolve these imports every time we import this expression within a program. To be precise, each import is resolved once per program and then cached and reused for subsequent duplicate imports.
That means that the compiler only imports #Bool once for the above program, not three times. Also, we can transparently cache these expressions just like any other web resource by providing the appropriate Cache-Control HTTP header. For example, my static web server sets max-age to a day so that expressions can be cached for up to one day.

If our imported expressions change then our program will reflect those changes, which may or may not be desirable. For the above program dynamic linking is undesirable, because if we change the file #False to point to sigil.place/post/0/True then we would break the behavior of the and function.

Alternatively, we can "statically link" the and function by resolving all imports using the morte compiler. For example, I statically linked my remote and expression because the behavior should never change:

    $ curl sigil.place/post/0/and
    λ(b1 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → b1 (∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) b2 (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False)

In other scenarios you might want to dynamically link expressions if you want to automatically pull in upgrades from trusted upstream sources. This is the same rationale behind service-oriented architectures, which optimize for transparent system-wide updates, except that instead of updating a service we update an expression.

Partial application

We can store partially applied functions in files, too. For example, we could store and True in a statically linked file named example using morte:

    $ echo "#and #True" | morte > example
    ∀(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool)
    → ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

The type still goes to standard error, but the partially applied function goes to the example file.
We can use the partially applied function just by referencing our new file:

    $ morte
    #example #False  -- Same as: #and #True #False
    ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

    λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

We can even view example and see that it's still just an ordinary text source file:

    $ cat example
    λ(b2 : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → b2

We can also see that morte was clever and optimized #and #True to the identity function on #Bools.

If we wanted to share our example code with our friends, we'd just host the file using any static web server. I like to use Haskell's warp server (from the wai-app-static package) for static hosting, but even something like python -m SimpleHTTPServer would work just as well:

    $ cabal install wai-app-static
    $ warp
    Serving directory /tmp/code on port 3000 with ["index.html","index.htm"] index files.

Then we could provide our friends with a URL pointing to the example file and they could embed our code within their program by pasting in our URL.

Types

The calculus of constructions is typed, so if you make a mistake, you'll know immediately:

    $ morte
    #True #True
    Expression:
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)
    (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True)

    Error: Function applied to argument of the wrong type

    Expected type: *
    Argument type: ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

Types are what differentiate morte from piping curl into sh. You can use the type system to whitelist the set of permissible values to import. For example, in this code, there are only two values of #x that will type-check (up to α-conversion):

    (λ(b : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → b) #x

Therefore, we can safely import a remote value knowing that the type-checker will reject attempts to inject arbitrary code.

When building a program with effects, we can similarly refine the set of permissible actions using the types.
I introduced one such example in my previous post on morte, where the recursive.mt program restricts the effects to reading and printing lines of text and nothing else. You could then import a remote expression of type:

      ∀(String : *)
    → ∀(U : *)
    → ∀(Unit : U)
    → ∀(IO : *)
    → ∀(GetLine : (String → IO) → IO)
    → ∀(PutStrLn : String → IO → IO)
    → ∀(Return : U → IO)
    → IO

... which is the type of an effect syntax tree built from GetLine/PutStrLn/Return constructors. The type-checker will then enforce that the imported syntax tree cannot contain any other constructors and therefore cannot be interpreted to produce any other effects.

Recursive data types

You can encode recursive data types and functions in the calculus of constructions. This is all the more amazing when you realize that the calculus of constructions does not permit recursion! morte's import system forbids recursion as well; if you try to recurse using imports you will get an error:

    $ echo "#foo" > bar
    $ echo "#bar" > foo
    $ morte < foo
    morte:
    ⤷ #bar
    ⤷ #foo
    Cyclic import: #bar

Joe Armstrong once proposed that the core language for an internet of code would require built-in support for recursion (via letrec or something similar), but that's actually not true! The paper "Automatic synthesis of typed Λ-programs on term algebras" spells out how to encode recursive data types in the non-recursive System F language. What's amazing is that the algorithm works even for mutually recursive data types like Even and Odd from our original Haskell example.

You don't have to take my word for it!
You can verify for yourself that the Even and Odd types and the Zero/SuccE/SuccO constructors that I hosted online are not recursive:

    sigil.place/post/0/Even
    sigil.place/post/0/Odd
    sigil.place/post/0/Zero
    sigil.place/post/0/SuccE
    sigil.place/post/0/SuccO

Let's create local aliases for the constructors so we can build our own Even or Odd values:

    $ echo "#http://sigil.place/post/0/Zero"  > Zero
    $ echo "#http://sigil.place/post/0/SuccE" > SuccE
    $ echo "#http://sigil.place/post/0/SuccO" > SuccO

We can then assemble the number four using these constructors:

    $ morte
    #SuccE (#SuccO (#SuccE (#SuccO #Zero )))
      ∀(Even : *)
    → ∀(Odd : *)
    → ∀(Zero : Even)
    → ∀(SuccE : ∀(pred : Odd) → Even)
    → ∀(SuccO : ∀(pred : Even) → Odd)
    → Even

      λ(Even : *)
    → λ(Odd : *)
    → λ(Zero : Even)
    → λ(SuccE : ∀(pred : Odd) → Even)
    → λ(SuccO : ∀(pred : Even) → Odd)
    → SuccE (SuccO (SuccE (SuccO Zero)))

The result is identical to the four that I hosted:

    $ curl sigil.place/post/0/four
      λ(Even : *)
    → λ(Odd : *)
    → λ(Zero : Even)
    → λ(SuccE : ∀(pred : Odd) → Even)
    → λ(SuccO : ∀(pred : Even) → Odd)
    → SuccE (SuccO (SuccE (SuccO Zero)))

We can even encode functions over mutually recursive types like doubleEven and doubleOdd. You can verify that the ones I wrote are not recursive:

    sigil.place/post/0/doubleEven
    sigil.place/post/0/doubleOdd

...
and then we can test that they work by doubling the number four:

    $ morte
    #http://sigil.place/post/0/doubleEven #http://sigil.place/post/0/four
      ∀(Even : *)
    → ∀(Odd : *)
    → ∀(Zero : Even)
    → ∀(SuccE : ∀(pred : Odd) → Even)
    → ∀(SuccO : ∀(pred : Even) → Odd)
    → Even

      λ(Even : *)
    → λ(Odd : *)
    → λ(Zero : Even)
    → λ(SuccE : ∀(pred : Odd) → Even)
    → λ(SuccO : ∀(pred : Even) → Odd)
    → SuccE (SuccO (SuccE (SuccO (SuccE (SuccO (SuccE (SuccO Zero)))))))

We get back the Even number eight encoded in the calculus of constructions.

Stack traces

morte will provide a "stack trace" if there is a type error or parse error:

    $ echo "\(a : *) ->" > foo  # This is a malformed program
    $ echo "#foo" > bar
    $ echo "#bar" > baz
    $ echo "#baz" > qux
    $ morte < qux
    morte:
    ⤷ #qux
    ⤷ #baz
    ⤷ #bar
    ⤷ #foo
    Line:   2
    Column: 1

    Parsing: EOF

    Error: Parsing failed

You can learn more about how morte's import system works by reading the newly added "Imports" section of the morte tutorial.

Comparison to other software architectures

morte's code distribution system most closely resembles the distribution model of Javascript, meaning that code can be downloaded from any URL and is compiled or interpreted client-side. The most important difference between the two is the granularity of imports and the import mechanism. In morte the unit of distribution is individual types, terms, and constructors, and you can inject a remote expression anywhere in the syntax tree by referencing its URL.
This is why we can do crazy things like use a URL for a type:

    λ(b : #http://sigil.place/post/0/Bool ) → ...

The second difference is that morte is designed from the ground up to be typed and highly optimizable (analogous to asm.js, a restricted subset of Javascript designed for ease of optimization).

The third difference is that morte lets you precisely delimit what remote code can do using the type system, unlike Javascript.

Future directions

This is just one piece of the puzzle in a long-term project of mine to build a typed and distributed intermediate language that we can use to share code across language boundaries. I want to give people the freedom to program in the language of their choice while still interoperating freely with other languages. In other words, I'm trying to build a pandoc for programming languages.

However, this project is still not really usable, even in anger. There are several missing features to go, some of which will be provided by my upcoming annah library:

Requirement #1: There needs to be a way to convert between restricted subsets of existing programming languages and the calculus of constructions.

annah currently provides logic to encode medium-level language abstractions to and from the calculus of constructions. In fact, that's how I converted the Haskell example at the beginning of this post into the calculus of constructions. For example, I used annah to derive how to encode the SuccE constructor in the calculus of constructions:

    $ annah compile
    type Even
    data Zero
    data SuccE (pred : Odd)
    type Odd
    data SuccO (pred : Even)
    in   SuccE

...
and annah correctly deduced the type and value in the calculus of constructions:

      ∀(pred : ∀(Even : *) → ∀(Odd : *) → ∀(Zero : Even) → ∀(SuccE : ∀(pred : Odd) → Even) → ∀(SuccO : ∀(pred : Even) → Odd) → Odd)
    → ∀(Even : *)
    → ∀(Odd : *)
    → ∀(Zero : Even)
    → ∀(SuccE : ∀(pred : Odd) → Even)
    → ∀(SuccO : ∀(pred : Even) → Odd)
    → Even

      λ(pred : ∀(Even : *) → ∀(Odd : *) → ∀(Zero : Even) → ∀(SuccE : ∀(pred : Odd) → Even) → ∀(SuccO : ∀(pred : Even) → Odd) → Odd)
    → λ(Even : *)
    → λ(Odd : *)
    → λ(Zero : Even)
    → λ(SuccE : ∀(pred : Odd) → Even)
    → λ(SuccO : ∀(pred : Even) → Odd)
    → SuccE (pred Even Odd Zero SuccE SuccO)

Among other things, annah automates the algorithm from "Automatic synthesis of typed Λ-programs on term algebras", which is known as "Böhm-Berarducci encoding".

Requirement #2: There must be an efficient way to transmit bytes, text, and numbers alongside code.

My plan is to transmit this information out-of-band as a separate file rather than embedding the data directly within the code, and annah will provide a systematic convention for distributing data and referencing that data within source code.

Requirement #3: There needs to be a standard library of types, data structures, functions, and side effects that all target languages must support.

In other words, there needs to be some sort of thrift for code so that languages can maximize code sharing.

Requirement #4: There must be better tooling for mass installation and hosting of expressions.

For example, I'd like to be able to alias all imports within a remote directory to local files with a single command.

Requirement #5: I need to figure out a way to mesh type inference with an expression-level distribution system.

As far as I can tell this is still an open research problem, and it is most likely going to be the greatest obstacle to making this usable in practice.

Resources

If you would like to learn more about Morte or contribute, then check out the following resources:

    Morte on Github
    Morte on Hackage
    Morte Tutorial

Gabriel
Gonzalez

Mon, 18 May 2015 11:00:00 +0000

Recent Hackage improvements
http://www.well-typed.com/blog/2015/05/recent-hackage-improvements
You may or may not have noticed but over the last few months we’ve had a number of incremental improvements to the Hackage site, with patches contributed by numerous people.
I’m very pleased that we’ve had contributions from so many people recently. Apart from one patch that took a long time to merge, we’ve generally been reasonably good at getting patches merged. Currently there’s just one outstanding pull request on hackage-server’s github site.
I gave a talk a couple months ago at the London Haskell User Group about getting started with hacking on Cabal and Hackage. Unfortunately the video isn’t yet available. (I’ll try and chase that up and link it here later).
An idea we floated at that talk was to run a couple hackathons dedicated to these and related infrastructure projects. If you want to help organise or have a venue in London, please get in touch. If you can’t get to London, fear not as we’d also welcome people attending online. Of course there’s also the upcoming ZuriHac where I expect there will be plenty of infrastructure work going on.
If you do want to get involved, the github site is the place to start. Discussion of features happens partly in issues on github and the #hackage channel on IRC. So those are good places to get feedback if you decide to start working on a bug or feature.
Recent changes
Visible changes
Code to render README and changelog files as markdown has been merged, though this is not yet in its final form.
The idea is that these days many packages have good README files but less good descriptions, partly because Haddock markup has never been very good for writing long prose descriptions, while markdown is much better suited to that. So the goal is to display package READMEs in a useful form. Some people would like to be able to just write a README and not have to partially duplicate it in the .cabal file description, and we want to support that option.
So while the code has been merged (which is good because it was big and had been suffering bitrot), there is still a question of exactly when to display the README. The initial patch would show it in place of the description whenever the README existed, while a later revision would show both the description and README. Unfortunately while this greatly improved things for some packages it made things greatly worse for others.
We are now discussing when to show the README inline so that it benefits those packages that have good READMEs, without making other packages much worse. Suggestions include only showing the README when the description field is blank; putting the README at the bottom of the package page, as github does, to cope better with very long READMEs; or using an expander widget. We very much welcome feedback on this question, and pull requests even more so.
In the meantime, while that is being sorted out, as a temporary solution the package page just has a link to the README rendered as markdown. If nothing else this makes it easy to see what various packages’ READMEs would look like when included inline (including some rendering issues due to the variety of markdown dialects).
Thanks to Christian Conkle and Matthew Pickering for the original patch, and to Matthew Pickering and Michael Snoyman for fixing the bitrot and pestering us to get it reviewed and merged. It’s all much appreciated.
Changelogs are now also rendered as markdown. This doesn’t always work because some changelogs are not markdown and look bad when rendered as such, so we also provide a plain text link. See for example the changelog for zlib-0.6.1.0.
The changelog link has been moved to the list of properties, immediately after the list of all versions. We’d had feedback that people had been overlooking the changelog link where it was previously.
Future directions here include displaying the changelog inline, either in full or in part, or behind an expander.
The display of dependencies and version ranges has been improved. Since the early days of Hackage we used a disjunctive normal form presentation, which was never particularly visually appealing or easy to understand, and was rather verbose. We now present a much shorter summary of the dependencies and their version constraints. This display has been tweaked a few times to make it a more accurate summary.
Thanks to Kristen Kozak for making several improvements and to Michael Snoyman for feedback.
License files are now linked from the package properties. So for example rather than just saying “OtherLicense”, it now links directly to the license file inside the package tarball. See for example iconv-0.2.
The modules list on the package page now only links to Haddock pages that actually exist. This solves the problem of libraries with “semi-public” modules that are technically exposed but where they have a haddock HIDDEN pragma. The haddock index is now also linked.
Browsing the content of the package tarball is much improved. We now show a full directory listing, rather than just one directory at a time, and the presentation is generally clearer. See for example the contents of zlib-0.6.1.0.tar.gz.
Improvements to the mime types of files served from the contents of package documentation or tarballs. We now declare more things to be UTF8 and text/plain, so they’re easier to browse online rather than triggering your browser to download the file.
Thanks to Rabi Shanker and Adam Bergmark.
The recent upload page now links to another page (and corresponding RSS feed) which lists all revisions to .cabal files from authors and trustees editing .cabal metadata. These feeds now always contain at minimum 48 hours’ worth of changes.
Thanks to Gershom for this.
A new audit log for Hackage administrators. All changes to users’ permissions (by adding and removing them from groups) are now logged, with an optional reason/explanation, and this is visible to the admin.
Thanks to Gershom for contributing this feature and Herbert Valerio Riedel for agitating to get it included.
The presentation of edits/diffs between .cabal file revisions has been improved to identify which component the change was in (since sometimes similar changes have to be made to multiple components in a package). See for example the revisions to fay-0.18.0.4.
Thanks to Rabi Shanker.
A few boring but important ones
Build fixes for newer GHC versions. People run hackage-server instances locally so it’s always useful to be able to build with the latest, even if the central community site is using an older stable GHC for its build.
Potential DOS fix. Older versions of the aeson package had a problem with parsing large numbers that would cause it to take huge amounts of memory. We had been stuck with an old aeson package for a while due to incompatible changes in the JSON output for derived instances.
Thanks to Michael Snoyman for reporting that and Daniel Gröber for fixing it.
Updated to Cabal-1.22. This has a number of small effects, like support for new licenses and compilers (HaskellSuite).
One regression that we discovered is that Cabal-1.22 changed one of the distribution QA checks on the -Werror flag, which meant that Hackage started rejecting packages using it, even when it was guarded by a manual “dev” flag.
I have since sent in a patch for Cabal to allow -Werror and other development flags when guarded by a manual cabal configuration flag. This lets people use -Werror in development and in their CI systems, but without Hackage accumulating lots of packages that bitrot quickly due to GHC introducing new warnings and breaking packages that use -Werror by default. This Cabal change has been cherry-picked into the community hackage-server instance for the time being and is now deployed.
Thanks to the many people who pointed out that this had changed and was a problem, and to Herbert Valerio Riedel for identifying the source of the problem.
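As an illustration of the pattern this permits (a hypothetical package, not an example taken from the patch itself), a .cabal file can guard -Werror behind a manual flag that defaults to off:

```cabal
-- Hypothetical example: -Werror is only enabled when the manual
-- "dev" flag is set, so ordinary Hackage builds never see it.
flag dev
  description: Enable development settings such as -Werror
  manual:      True
  default:     False

library
  exposed-modules:  Example
  build-depends:    base
  default-language: Haskell2010
  if flag(dev)
    ghc-options: -Wall -Werror
```

Developers and CI systems can then opt in with cabal configure -fdev, while released builds leave the flag off.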
Related to the Cabal version update is a change in the way that AllRightsReserved packages are handled. Cabal used to complain about distributing AllRightsReserved packages, but that’s not appropriate at that level of the toolchain. Hackage has been updated to implement that check instead, but only for the community’s public instance. If you run your own local Hackage server then it will now accept AllRightsReserved packages.
Miscellaneous small changes
In the administrators’ control panel, old account requests and resets are now cleaned up.
Internally there is a new cron subsystem which will be useful for new features.
More resources are available in JSON format. Thanks to Herbert Valerio Riedel for pointing out places where this was missing.
Improved README instructions for the hackage-server itself. Contributed by Michael Snoyman.

duncan

Mon, 18 May 2015 16:31:57 +0000

[kdrtkgdr] Polymorphic matrix inversion
http://kenta.blogspot.com/2015/05/kdrtkgdr-polymorphic-matrix-inversion.html
Here is matrix inversion implemented in pure Haskell and a demonstration of its use with Complex Rational exact arithmetic:

    A = [ 9   8*i 9*i 4   8*i 3   5*i 3*i
        ; 6*i 6   7*i 5   0   3*i 5*i 7
        ; 4   4*i 4*i 8   1*i 3   1   9*i
        ; 4   6*i 7   5   4*i 7   4   4
        ; 0   3   6   0   4   2*i 0   7
        ; 7   2   0   4*i 3   3*i 1   4*i
        ; 4*i 3   8   1   2*i 3   1   3*i
        ; 3   3   4*i 3*i 6   3   7   1*i ]

    Ainv = [ -651341+282669*i   70449+105643*i     470260+543736*i    -90558-322394*i    940582+401976*i    -1471210-32706*i   304731+202312*i   428573+497242*i
           ; 544367-80697*i     -1154274+187794*i  1139302-87593*i    -21921+1224087*i   1488234-1168263*i  -1627965+1312344*i -1248400-535541*i 337176-362289*i
           ; -31970+991102*i    93690-139399*i     123282-341373*i    -421343-757909*i   50198+495419*i     8593-654052*i      -468577+527063*i  897227+455914*i
           ; 695254+159129*i    -629809+1226054*i  -2027820-1557744*i -161289+1062653*i  -1299979-787107*i  915864+11456*i     1378121-78438*i   1043558-421873*i
           ; -925872-186803*i   1371020-146791*i   -2291428-34310*i   2099076-57282*i    -3405480+2662395*i 1616586-1020331*i  313380-265939*i   -1386116-96576*i
           ; -1420678-1211999*i 1206717-844239*i   93772+1350941*i    1290846-32987*i    -511+2133602*i     1126031+1379618*i  -2659652-75501*i  -2000675-207615*i
           ; 2039672+316590*i   -813732+665441*i   417543-103784*i    -2312041+106452*i  1746852-2305394*i  -871774-685499*i   1641748-24893*i   -261188+84637*i
           ; -23113-302279*i    -610268-221894*i   1101428+322959*i   -838351-211053*i   -239412-540356*i   160747+259506*i    736020+689615*i   -180808+391291*i ]
           / (-14799118+6333791*i)

Aesthetically striking is how simple input matrices become complicated when inverted: complexity seemingly appears out of nowhere. The complexity arises from a convolution of several effects: numerators and denominators intermingle as fractions get added; real and imaginary parts intermingle as complex numbers get multiplied; matrix elements intermingle in row operations.

(Perhaps this mixing could be put to cryptographic use someday? Row operations do vaguely look like Salsa20.
AES has a field inversion to define its S-box.)

The code is kind of a mess, not really trying to achieve performance: we compute the LU decomposition to get the determinant, which is used just to provide the scalar factor in the final displayed result of the inverse, to avoid fractions inside the matrix. (It is neat that multiplying through by the determinant makes fractions disappear.) We do a separately implemented Gauss-Jordan elimination on the original matrix to compute the inverse, then multiply by the determinant computed via LU.

We made modifications to two library packages.

We forked the base module Data.Complex into a new module Data.ComplexNum to provide a Complex type that does not require RealFloat (so only requires Num). (Complex normally requires RealFloat because abs and signum of a complex number require sqrt.) Avoiding RealFloat allowed us to form Complex Rational, i.e., Complex (Ratio Integer), a powerful data type. We defined abs and signum as error.

Our starting point for matrix operations was Data.Matrix in the matrix package (by Daniel Díaz). The original luDecomp function calls abs to select the largest pivot during LU decomposition. The output type of abs is constrained in the Num class to be the same as the input type (abs :: a -> a), so Complex numbers weirdly provide a Complex output type for abs, even though mathematically the absolute value of a complex number is real. Unfortunately Complex numbers do not provide Ord, so we cannot select the "largest" absolute value as the pivot. Therefore, we added a new entry point to LU decomposition, luDecompWithMag, to allow passing in a custom function to compute the magnitude.
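To make that concrete, here is a small sketch of why a squared magnitude is enough for pivot selection (my own illustration; magnitude2 and pivotIndex are illustrative names, not the exact API of the forked packages). Squaring is monotonic on non-negative reals, so comparing re² + im² ranks pivots the same way as comparing true magnitudes, and it needs only Num and Ord, not RealFloat; with recent versions of base the Complex data type itself carries no RealFloat constraint, so Complex Rational values can be constructed directly:

```haskell
import Data.Complex (Complex((:+)))

-- Squared magnitude: |re :+ im|^2 = re^2 + im^2. No sqrt needed,
-- so a plain Num constraint suffices (works for Complex Rational).
magnitude2 :: Num a => Complex a -> a
magnitude2 (re :+ im) = re * re + im * im

-- Index of the candidate pivot with the largest squared magnitude.
pivotIndex :: (Num a, Ord a) => [Complex a] -> Int
pivotIndex col = snd (maximum (zip (map magnitude2 col) [0 ..]))

main :: IO ()
main = print (pivotIndex ([1 :+ 0, 0 :+ 2, 1 :+ 1] :: [Complex Rational]))  -- prints 1
```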
Since we are avoiding square root, we provided magnitude-squared (magnitude2) as the pivot-choosing function.

We fixed some (but not all) space leaks in LU decomposition, so processing matrices of size 100x100 no longer requires gigabytes of memory.

We added to Data.Matrix two different implementations of matrix inversion, both of which use monadic imperative style for in-place modification of the augmented matrix with row operations: one (invertS) uses Control.Monad.Trans.State.Strict and the other (invert) uses MVector and Control.Monad.ST. The latter is a little bit faster. Both implementations required care to avoid space leaks, which were tracked down using ghc heap profiling.

The modifications to Data.Matrix can be tracked in this github repository.

Runtime (in seconds) and maxresident memory usage (k) of inversion of random Complex Rational matrices, where each entry in the original matrix is a random single-digit pure real or pure imaginary number (as in the example above with size 8), of sizes 10 through 320:

    [ 10 0.01 2888 ; 20 0.26 4280 ; 30 1.40 5324 ; 40 4.84 7132
    ; 50 12.32 10204 ; 60 27.05 14292 ; 70 52.67 19416 ; 80 94.36 26588
    ; 90 157.18 33752 ; 100 250.12 44000 ; 110 373.71 56524 ; 120 546.32 69600
    ; 130 768.92 85988 ; 140 1056.05 103632 ; 150 1419.47 124900 ; 160 1877.97 148696
    ; 170 2448.16 176364 ; 180 3130.31 208104 ; 190 3963.96 238824 ; 200 4958.93 277504
    ; 210 6138.61 318704 ; 220 7543.16 364544 ; 230 9252.55 413700 ; 240 11115.32 467196
    ; 250 13217.16 524300 ; 260 15690.80 594956 ; 270 18520.73 658448 ; 280 21761.36 734460
    ; 290 25509.77 812288 ; 300 30048.34 1022028 ; 310 34641.25 1218636 ; 320 39985.63 1419340 ]

Runtime and memory for Hilbert matrices (Rational) of sizes 10 through 510:

    [ 10 0.00 2736 ; 20 0.03 3676 ; 30 0.14 4164 ; 40 0.51 4608
    ; 50 1.45 5804 ; 60 3.65 8196 ; 70 7.86 9224 ; 80 15.59 11096
    ; 90 27.85 13444 ; 100 45.03 17320 ; 110 67.97 21408 ; 120 98.61 26876
    ; 130 137.02 27132 ; 140 187.48 29664 ; 150 248.99 37792 ; 160 325.17
    38816 ; 170 420.00 43920 ; 180 529.02 55796 ; 190 659.03 56304
    ; 200 811.39 65520 ; 210 980.33 70912 ; 220 1202.42 72608 ; 230 1426.34 82852
    ; 240 1696.02 89996 ; 250 2014.39 97192 ; 260 2300.64 109496 ; 270 2693.48 118692
    ; 280 3124.52 130980 ; 290 3589.06 138164 ; 300 4158.67 210900 ; 310 4711.61 223188
    ; 320 5323.13 243692 ; 330 6047.59 262104 ; 340 6789.74 241640 ; 350 7621.69 269268
    ; 360 8433.56 338924 ; 370 9447.28 360428 ; 380 10496.28 358356 ; 390 11662.98 422872
    ; 400 12935.35 375772 ; 410 14203.69 477164 ; 420 15655.88 420840 ; 430 17143.12 547820
    ; 440 19003.82 517104 ; 450 20592.28 550880 ; 460 22471.48 574428 ; 470 24484.14 604120
    ; 480 26672.65 662492 ; 490 28958.52 766936 ; 500 31302.45 646132 ; 510 33932.76 731120 ]

The slope of runtime on a log-log plot is about 4.3.

According to Wikipedia, the worst-case run time (bit complexity) of Gaussian elimination is exponential, but the Bareiss algorithm can guarantee O(n^5): Bareiss, Erwin H. (1968), "Sylvester's Identity and multistep integer-preserving Gaussian elimination", Mathematics of Computation 22 (102): 565–578, doi:10.2307/2004533. Future work: try the code presented here on the exponentially bad matrices described in the paper; implement the Bareiss algorithm.

Also someday, it may be a fun exercise to create a data type encoding algebraic expressions which is an instance of Num, then use the polymorphism of the matrix code presented here to invert symbolic matrices.

Ken

Mon, 18 May 2015 07:42:00 +0000

Unscientific column store benchmarking in Rust
http://www.randomhacks.net/2015/05/17/unscientific-column-store-experiments-rust/
Sun, 17 May 2015 20:03:57 +0000

Summer School on DSL Design and Implementation
http://wadler.blogspot.com/2015/05/summer-school-on-dsl-design-and.html
The Scala team at EPFL is running a Summer School on DSL Design and Implementation, 12-17 July in Lausanne, Switzerland. They have a great line-up, including Martin Odersky, Tiark Rompf, Kunle Olukotun, and Matthew Flatt. I'll be there, speaking on A practical theory of language-integrated query and Quoted Domain Specific Languages: Everything old is new again.Philip Wadlertag:blogger.com,1999:blog-9757377.post-5101651630124944800Sat, 16 May 2015 16:18:00 +0000Status Report 5
http://wadler.blogspot.com/2015/05/status-report-5.html
I am recovered. My bone marrow biopsy and my scan at the National Amyloidosis Centre show no problems, and my urologist has discharged me. Photo above shows me and Bob Harper (otherwise known as TSOPLRWOKE, The Society of Programming Language Researchers With One Kidney Each) at Asilomar for Snapl.

My thanks again to staff of the NHS. Everyone was uniformly friendly and professional, and the standard of care has been excellent. My thanks also to everyone who wished me well, and especially to the SIGPLAN EC, who passed a get-well card around the world for signing, as shown below. I am touched to have received so many good wishes.

Related: Status Report, Status Report 2, A paean to the Western General, Status Report 3, Status Report 4.Philip Wadlertag:blogger.com,1999:blog-9757377.post-6476056537731579260Sat, 16 May 2015 15:17:00 +0000Status Report 4
http://wadler.blogspot.com/2015/03/status-report-4.html
It seemed as if no time had passed: the anaesthetist injected my spine, and the next thing I knew I was waking in recovery. Keyhole surgery to remove my left kidney was completed on Tuesday 17 March, and I expect to leave the Western General on Saturday 21 March. Meanwhile, progress on diagnosing the amyloid spotted in my liver: I had a bone marrow biopsy on Thursday 19 March, and two days of testing at the National Amyloidosis Centre in London are to be scheduled. The NHS has provided excellent care all around.

My room was well placed for watching the partial eclipse this morning. A nurse with a syringe helped me jury-rig a crude pinhole camera (below), but it was too crude. Fortunately, there was exactly the right amount of cloud cover through which to view the crescent sun. My fellow patients and our nurses all gathered together, and for five minutes it was party time on the ward.

Update: I left the hospital as planned on Saturday 21 March. Thanks to Guido, Sam, Shabana, Stephen, and Jonathan for visits; to Marjorie for soup; to Sukkat Shalom council for a card and to Gillian for hand delivery; and to Maurice for taking me in while my family was away.

Related: Status report, Status report 2, A paean to the Western General, Status report 3.Philip Wadlertag:blogger.com,1999:blog-9757377.post-8767713536338777003Fri, 20 Mar 2015 14:12:00 +0000Multiple inheritance, revisited
http://wadler.blogspot.com/2015/04/multiple-inheritance-revisited.html
Via @ReifyReflect (Sam Lindley) and @PedalKings. Previously: Multiple inheritance.Philip Wadlertag:blogger.com,1999:blog-9757377.post-3831211788037322739Mon, 20 Apr 2015 13:21:00 +0000LTMT Part 3: The Monad Cookbook
http://feedproxy.google.com/~r/SoftwareSimply/~3/1Pl8V9eINZA/ltmt-part-3-monad-cookbook.html
Introduction
The previous two posts in my Less Traveled Monad Tutorial series have not had much in the way of directly practical content. In other words, if you only read those posts and nothing else about monads, you probably wouldn't be able to use monads in real code. This was intentional because I felt that the practical stuff (like do notation) had adequate treatment in other resources. In this post I'm still not going to talk about the details of do notation--you should definitely read about that elsewhere--but I am going to talk about some of the most common things I have seen beginners struggle with and give you cookbook-style patterns that you can use to solve these issues.
Problem: Getting at the pure value inside the monad
This is perhaps the most common problem for Haskell newcomers. It usually manifests itself as something like this:
main = do
    lineList <- lines $ readFile "myfile.txt"
    -- ... do something with lineList here
That code generates the following error from GHC:
Couldn't match type `IO String' with `[Char]'
Expected type: String
Actual type: IO String
In the return type of a call of `readFile'
Many newcomers seem puzzled by this error message, but it tells you EXACTLY what the problem is. The return type of readFile has type IO String, but the thing that is expected in that spot is a String. (Note: String is a synonym for [Char].) The problem is, this isn't very helpful. You could understand that error completely and still not know how to solve the problem. First, let's look at the types involved.
readFile :: FilePath -> IO String
lines :: String -> [String]
Both of these functions are defined in Prelude. These two type signatures show the problem very clearly. readFile returns an IO String, but the lines function is expecting a String as its first argument. IO String != String. Somehow we need to extract the String out of the IO in order to pass it to the lines function. This is exactly what do notation was designed to help you with.
Solution #1
main :: IO ()
main = do
    contents <- readFile "myfile.txt"
    let lineList = lines contents
    -- ... do something with lineList here
This solution demonstrates two things about do notation. First, the left arrow lets you pull things out of the monad. Second, if you're not pulling something out of a monad, use "let foo =". One metaphor that might help you remember this is to think of "IO String" as a computation in the IO monad that returns a String. A do block lets you run these computations and assign names to the resulting pure values.
Solution #2
We could also attack the problem a different way. Instead of pulling the result of readFile out of the monad, we can lift the lines function into the monad. The function we use to do that is called liftM.
liftM :: Monad m => (a -> b) -> m a -> m b
liftM :: Monad m => (a -> b) -> (m a -> m b)
The associativity of the -> operator is such that these two type signatures are equivalent. If you've ever heard Haskell people saying that all functions are single argument functions, this is what they are talking about. You can think of liftM as a function that takes one argument, a function (a -> b), and returns another function, a function (m a -> m b). When you think about it this way, you see that the liftM function converts a function of pure values into a function of monadic values. This is exactly what we were looking for.
main :: IO ()
main = do
    lineList <- liftM lines (readFile "myfile.txt")
    -- ... do something with lineList here
This is more concise than our previous solution, so in this simple example it is probably what we would use. But if we needed to use contents in more than one place, then the first solution would be better.
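Incidentally (my aside, not from the post): liftM is just fmap restricted to monads, so Solution #2 can equally be written with fmap's operator form <$>:

```haskell
import Data.Functor ((<$>))

-- Same as liftM lines (readFile "myfile.txt"), using <$> instead
fileLines :: IO [String]
fileLines = lines <$> readFile "myfile.txt"
```

Many Haskellers prefer this spelling, since it works for any Functor, not just monads.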
Problem: Making pure values monadic
Consider the following program:
import Control.Monad
import System.Environment
main :: IO ()
main = do
    args <- getArgs
    output <- case args of
        [] -> "cat: must specify some files"
        fs -> liftM concat (mapM readFile fs)
    putStrLn output
This program also has an error. GHC actually gives you three errors here because there's no way for it to know exactly what you meant. But the first error is the one we're interested in.
Couldn't match type `[]' with `IO'
Expected type: IO Char
Actual type: [Char]
In the expression: "cat: must specify some files"
Just like before, this error tells us exactly what's wrong. We're supposed to have an IO something, but we only have a String (remember, String is the same as [Char]). It's not convenient for us to get the pure result out of the readFile functions like we did before because of the structure of what we're trying to do. The two patterns in the case statement must have the same type, so that means that we need to somehow convert our String into an IO String. This is exactly what the return function is for.
Solution: return
return :: Monad m => a -> m a
This type signature tells us that return takes any type a as input and returns "m a". So all we have to do is use the return function.
import Control.Monad
import System.Environment
main :: IO ()
main = do
    args <- getArgs
    output <- case args of
        [] -> return "cat: must specify some files"
        fs -> liftM concat (mapM readFile fs)
    putStrLn output
The 'm' that the return function wraps its argument in is determined by the context. In this case, main is in the IO monad, so that's what return uses.
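To see this context-driven behavior in isolation (a toy example of mine, not from the post), the very same expression return 5 produces different results depending on the monad the type signature demands:

```haskell
-- return wraps 5 in whichever monad the context requires
five :: Maybe Int
five = return 5          -- Just 5

fives :: [Int]
fives = return 5         -- [5]

getFive :: IO Int
getFive = return 5       -- an IO action whose result is 5
```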
Problem: Chaining multiple monadic operations
import System.Environment
main :: IO ()
main = do
    [from,to] <- getArgs
    writeFile to $ readFile from
As you probably guessed, this function also has an error. Hopefully you have an idea of what it might be. It's the same problem of needing a pure value when we actually have a monadic one. You could solve it like we did in solution #1 on the first problem (you might want to go ahead and give that a try before reading further). But this particular case has a pattern that makes a different solution work nicely. Unlike the first problem, you can't use liftM here.
Solution: bind
When we used liftM, we had a pure function lines :: String -> [String]. But here we have writeFile :: FilePath -> String -> IO (). We've already supplied the first argument, so what we actually have is writeFile to :: String -> IO (). And again, readFile returns IO String instead of the pure String that we need. To solve this we can use another function that you've probably heard about when people talk about monads...the bind function.
(=<<) :: Monad m => (a -> m b) -> m a -> m b
(=<<) :: Monad m => (a -> m b) -> (m a -> m b)
Notice how the pattern here is different from the first example. In that example we had (a -> b) and we needed to convert it to (m a -> m b). Here we have (a -> m b) and we need to convert it to (m a -> m b). In other words, we're only adding an 'm' onto the 'a', which is exactly the pattern we need here. Here are the two patterns next to each other to show the correspondence.
writeFile to :: String -> IO ()
                a      -> m b
From this we see that "writeFile to" is the first argument to the =<< function. readFile from :: IO String fits perfectly as the second argument to =<<, and then the return value is the result of the writeFile. It all fits together like this:
import System.Environment
main :: IO ()
main = do
    [from,to] <- getArgs
    writeFile to =<< readFile from
Some might point out that this third problem is really the same as the first problem. That is true, but I think it's useful to see the varying patterns laid out in this cookbook style so you can figure out what you need to use when you encounter these patterns as you're writing code. Everything I've said here can be discovered by carefully studying the Control.Monad module. There are lots of other convenience functions there that make working with monads easier. In fact, I already used one of them: mapM.
When you're first learning Haskell, I would recommend that you keep the documentation for Control.Monad close by at all times. Whenever you need to do something new involving monadic values, odds are good that there's a function in there to help you. I would not recommend spending 10 hours studying Control.Monad all at once. You'll probably be better off writing lots of code and referring to it whenever you think there should be an easier way to do what you want to do. Over time the patterns will sink in as you form new connections between different concepts in your brain.
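For a small taste of what's in that module (toy examples of my choosing, not from the post):

```haskell
import Control.Monad (join, filterM, foldM)

-- join collapses one layer of monadic structure
flat :: [Int]
flat = join [[1, 2], [3]]                        -- [1,2,3]

-- filterM in the list monad: every way of keeping or dropping elements
sublists :: [[Int]]
sublists = filterM (const [True, False]) [1, 2]  -- [[1,2],[1],[2],[]]

-- foldM folds with a monadic step; here Maybe short-circuits on Nothing
divideAll :: Int -> [Int] -> Maybe Int
divideAll = foldM step
  where step _ 0 = Nothing                       -- division by zero fails
        step x d = Just (x `div` d)
```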
It takes effort. Some people do pick these things up more quickly than others, but I don't know anyone who just read through Control.Monad and then immediately had a working knowledge of everything in there. The patterns you're grappling with here will almost definitely be foreign to you because no other mainstream language enforces this distinction between pure values and side effecting values. But I think the payoff of being able to separate pure and impure code is well worth the effort.mightybytetag:blogger.com,1999:blog-8768401356830813531.post-4704775857752483121Fri, 19 Dec 2014 07:01:00 +0000EclipseFP end of life (from me at least)
http://jpmoresmau.blogspot.com/2015/05/eclipsefp-end-of-life-from-me-at-least.html
Hello, after a few years and several releases, I am now stopping the maintenance of EclipseFP and its companion Haskell packages (BuildWrapper, ghc-pkg-lib and scion-browser). If anybody wants to take over I'll gladly give them all that's required to get started. Feel free to fork and continue!

Why am I stopping? Not for a very specific reason, though seeing that I had to adapt BuildWrapper to GHC 7.10 didn't exactly fill me with joy, but more generally I got tired of being the single maintainer for this project. I got a few pull requests over the years and some people have at some stage participated (thanks to you, you know who you are!), but not enough, and the biggest part of the work has always been on my shoulders. Let's say I got tired of getting an endless stream of issue reports and enhancement requests with nobody stepping up to actually address them.

Also, I don't think on the Haskell side it makes sense for me to keep on working on a GHC API wrapper like BuildWrapper. There are other alternatives, and with the release of ide-backend, backed by FPComplete, a real company staffed by competent people who seem to have more than 24 hours per day to hack on Haskell tools, it makes more sense to have consolidation there.

The goal of EclipseFP was to make it easy for Java developers or other Eclipse users to move to Haskell, and I think this has been a failure, mainly due to the inherent complexity of the setup (the Haskell stack and the Java stack) and the technical challenges of integrating GHC and Cabal in a complex IDE like Eclipse. Of course we could have done better with the constraints we were operating under, but if more eyes had looked at the code and more hands had been on deck we could have succeeded.

Personally I would now be interested in maybe getting the Atom editor to use ide-backend-client, or maybe work on a web-based (but local) Haskell IDE. Some of my dabblings can be found at https://github.com/JPMoresmau/dbIDE. But I would much prefer not to work on my own, so if you have an open source project you think could do with my help, I'll be happy to hear about it!

I still think Haskell is a great language that deserves a top-notch IDE, for newbies and experts alike, and I hope one day we'll get there.

For you EclipseFP users, you can of course keep using it as long as it works, but if no other maintainers step up, down the line you'll have to look for other options, as compatibility with the Haskell ecosystem will not be assured. Good luck!

Happy Haskell Hacking!JP Moresmautag:blogger.com,1999:blog-37404288.post-7299054554109175241Thu, 14 May 2015 13:09:00 +0000OCaml server-side developer at Ahrefs Research (Full-time)
http://functionaljobs.com/jobs/8827-ocaml-server-side-developer-at-ahrefs-research
urn:uuid:4a49dab8-8160-8a01-811b-ffb3dfcb0305Thu, 14 May 2015 08:07:26 +0000Deprecating system-filepath and system-fileio
http://www.yesodweb.com/blog/2015/05/deprecating-system-filepath
I posted this information on
Google+, but it's
worth advertising this a bit wider. The tl;dr is: system-filepath and
system-fileio are deprecated, please migrate to filepath and directory,
respectively.

The backstory here is that system-filepath came into existence at a time when
there were bugs in GHC's handling of character encodings in file paths.
system-filepath fixed those bugs, and also provided some nice type safety to
prevent accidentally treating a path as a String. However, the internal
representation needed to make that work was pretty complicated, and resulted in
some weird corner case bugs.

Since GHC 7.4, the original character encoding issues have been
resolved. That left a few options: continue to maintain system-filepath for
additional type safety, or deprecate. John Millikin, the author of the package,
decided on the latter back in
December. Since we
were using it extensively at FP Complete via other libraries, we decided to
take over maintenance. However, this week we decided that, in fact, John was
right in the first place.

I've already migrated most of my libraries away from system-filepath (though
doing so quickly was a
mistake, sorry
everyone). One nice benefit of all this is there's no longer a need to convert
between different FilePath representations all over the place. I still
believe overall that type FilePath = String is a mistake and a distinct
datatype would be better, but there's much to be said for consistency.

Some quick pointers for those looking to convert:

- You can drop basically all usages of encodeString and decodeString
- If you're using basic-prelude or classy-prelude, you should get some deprecation warnings around functions like fpToString
- Most functions have a direct translation, e.g. createTree becomes createDirectoryIfMissing True (yes, the system-filepath and system-fileio names oftentimes feel nicer...)

And for those looking for more type safety: all is not lost. Chris Done has
been working on a new
package
which is aimed at providing additional type safety around absolute/relative and
files/directories. It's not yet complete, but is already seeing some
interesting work and preventing bugs at some projects we've been working on
(and which will be announced soon).http://www.yesodweb.com/blog/2015/05/deprecating-system-filepathThu, 14 May 2015 04:00:00 +0000Hoogle inside the sandbox
https://parenz.wordpress.com/2015/05/14/hoogle-inside-the-sandbox/
Introduction
This is my first post in the (hopefully fruitful!) series of blog posts as part of my Haskell SoC project. I will spend a great chunk of my summer hacking away on DarcsDen; in addition, I will document my hardships and successes here. You can follow my progress on my DarcsHub.
This particular post will be about my working environment.
The problem
Hoogle is an amazing tool that usually needs no introduction. Understandably, the online version at haskell.org indexes only so many packages. This means that if I want to use hoogle to search for functions and values in packages like darcs and darcsden, I will have to set up a local copy.
Cabal sandboxing is a relatively recent feature of the Cabal package manager, but I don’t think it is reasonable these days to install from source (let alone develop) a Haskell package without using sandboxing.
The problem seems to be that the mentioned tools do not play well together out of the box, and some amount of magic is required. In this note I sketch the solution I eventually arrived at after a couple of tries.
Using hoogle inside a Cabal sandbox
The presumed setup: a user is working on a package X using the cabal sandboxes. The source code is located in the directory X and the path to the cabal sandbox is X/.cabal-sandbox.
Step 1: Install hoogle inside the sandbox. This is simply a matter of running cabal install hoogle inside X. If you want to have a standard database alongside the database for your packages in development, now is the time to do .cabal-sandbox/bin/hoogle data.
Step 2: Generate haddocks for the packages Y,Z you want to use with hoogle. In my case, I wanted to generate haddocks for darcs and darcsden. This is just a matter of running cabal haddock --hoogle in the correct directory.
Step 3: Convert haddocks to .hoo files. Run the following command in X/:
.cabal-sandbox/bin/hoogle convert /path/to/packageY/dist/doc/html/*/*.txt
You should see something like
Converting /path/to/packageY/dist/doc/html/Y/Y.txt
Converting Y... done
after which the file Y.hoo appears in /path/to/packageY/dist/doc/html/Y/
Step 4: Moving and combining databases. The hoogle database should be stored in .cabal-sandbox/share/*/hoogle-*/databases. Create that directory if it’s not present already. Then copy the ‘default’ database to that folder:
cp .cabal-sandbox/hoogle/databases/default.hoo .cabal-sandbox/share/*/hoogle-*/databases
Finally, you can combine your Y.hoo with the default database.
.cabal-sandbox/bin/hoogle combine /path/to/packageY/dist/doc/html/*/*.hoo .cabal-sandbox/share/*/hoogle-*/databases/default.hoo
mv default.hoo .cabal-sandbox/share/*/hoogle-*/databases/default.hoo
And you are done! You can test your installation
$ .cabal-sandbox/bin/hoogle rOwner
DarcsDen.State.Repo rOwner :: Simple Lens (Repository bp) String
For additional usability, consider adding .cabal-sandbox/bin to your $PATH.
Tagged: cabal, darcs, haskell, hoogleDanhttp://parenz.wordpress.com/?p=172Wed, 13 May 2015 21:54:47 +0000Want to work with me on one of these projects?
http://blog.plover.com/prog/unfinished-projects.html
I did a residency at the Recurse Center last
month. I made a profile page on their web site, which asked me to
list some projects I was interested in working on while there. Nobody
took me up on any of the projects, but I'm still interested. So if you
think any of these projects sounds interesting, drop me a note and
maybe we can get something together.
They are listed roughly in order of their nearness to completion, with
the most developed ideas first and the vaporware at the bottom. I am
generally language-agnostic, except I refuse to work in C++.
Or if you don't want to work with me, feel free to swipe any of these
ideas yourself. Share and enjoy.
Linogram
Linogram is a constraint-based diagram-drawing language that I think
will be better than prior languages (like pic, Metapost, or, god
forbid, raw postscript or SVG) and very different from WYSIWYG drawing
programs like Inkscape or Omnigraffle. I described it in detail in
chapter 9 of Higher-Order
Perl
and it's missing only one or two important features that I can't quite
figure out how to do. It also needs an SVG output module, which I
think should be pretty simple.
Most of the code for this already exists, in Perl.
I have discussed Linogram previously in this blog.
Orthogonal polygons
Each angle of an orthogonal polygon is either 90° or 270°. All 4-sided
orthogonal polygons are rectangles. All 6-sided orthogonal polygons
are similar-looking letter Ls. There are essentially only four
different kinds of 8-sided orthogonal polygons. There are 8 kinds of
10-sided orthogonal polygons, and 29 kinds of 12-sided orthogonal
polygons. I want to efficiently count the number of orthogonal
polygons with N sides, and have the computer draw exemplars of each
type.
I have a nice method for systematically generating descriptions of all
simple orthogonal polygons, and although it doesn't scale to polygons
with many sides I think I have an idea to fix that, making use of
group-theoretic (mathematical) techniques. (These would not be hard
for anyone to learn quickly; my ten-year-old daughter picked them
right up. Teaching the computer would be somewhat trickier.) For
making the pictures, I only have half the ideas I need, and I haven't
done the programming yet.
The little code I have is written in Perl, but it would be no trouble to switch to a different language.
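For what it's worth, the small counts quoted above can be reproduced by brute force, under my own assumption (not stated in the post) that a "kind" is the cyclic sequence of 90°/270° turns around the polygon, identified up to rotation and reversal (reversal corresponds to taking the mirror image):

```haskell
import qualified Data.Set as S

-- Count n-sided orthogonal polygon "kinds" as turn-sequence bracelets.
-- An n-gon's exterior turns sum to 360°, so with turns of +90° (L) and
-- -90° (R) we need (n+4)/2 lefts and (n-4)/2 rights.
countKinds :: Int -> Int
countKinds n = S.size $ S.fromList [canon s | s <- allTurnSeqs]
  where
    r = (n - 4) `div` 2                       -- number of right turns
    allTurnSeqs = [ [ if i `elem` rs then 'R' else 'L' | i <- [0 .. n - 1] ]
                  | rs <- choose r [0 .. n - 1] ]
    choose 0 _        = [[]]                  -- all size-k subsets, in order
    choose _ []       = []
    choose k (x : xs) = map (x :) (choose (k - 1) xs) ++ choose k xs
    canon s = minimum (rotations s ++ rotations (reverse s))
    rotations s = [ drop i s ++ take i s | i <- [0 .. length s - 1] ]
```

This reproduces the counts mentioned above (1 kind with 4 sides, 1 with 6, 4 with 8, 8 with 10, 29 with 12), though it only enumerates abstract turn sequences; it says nothing about choosing edge lengths or drawing exemplars.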
Simple Android app
I want to learn to build Android apps for my Android phone. I think a
good first project would be a utility where you put in a sequence of
letters, say FBS, and it displays all the words that contain those
letters in order. (For FBS the list contains "afterburners",
"chlorofluorocarbons", "fables", "fabricates", …, "surfboards".) I
play this game often with my kid (the letters are supplied by license
plates we pass) and we want a way to cheat when we are stumped.
My biggest problem with Android development in the past has been
getting the immense Android SDK set up.
The project would need to be done in Java, because that is what Android uses.
gi
Git is great, but its user interface is awful. The command set is
obscure and non-orthogonal. Error messages are confusing. gi is a
thinnish layer that tries to present a more intuitive and uniform
command set, with better error messages and clearer advice, without
removing any of git's power.
There's no code written yet, and we could do it in any language. Perl
or Python would be good choices. The programming is probably easy; the
hard part of this project is (a) design and (b) user testing.
I have a bunch of design notes written up about this already.
Twingler
Twingler takes an example of an input data structure and an output
data structure, and writes code in your favorite language for
transforming the input into the output. Or maybe it takes some sort of
simplified description of what is wanted and writes the code from
that. The description would be declarative, not procedural. I'm
really not at all sure what it should do or how it should work, but I
have a lot of notes, and if we could
make it happen a lot of people would love it.
No code is written; we could do this in your favorite language. Haskell maybe?
Bonus: Whatever your favorite language is, I bet it needs something like this.
Crapspad
I want a simple library that can render simple pixel graphics and
detect and respond to mouse events. I want people to be able to learn
to use it in ten minutes. It should be as easy as programming graphics
on an Apple II and easier than a Commodore 64. It should not be a
gigantic object-oriented windowing system with widgets and all that
stuff. It should be possible to whip up a simple doodling program in
Crapspad in 15 minutes.
I hope to get Perl bindings for this, because I want to use it from
Perl programs, but we could design it to have a language-independent
interface without too much trouble.
Git GUI
There are about 17 GUIs for Git and they all suck in exactly the same
way: they essentially provide a menu for running all the same Git
commands that you would run at the command line, obscuring what is
going on without actually making Git any easier to use. Let's fix
this.
For example, why can't you click on a branch and drag it elsewhere to
rebase it, or shift-drag it to create a new branch and rebase that?
Why can't you drag diff hunks from one commit to another?
I'm not saying this stuff would be easy, but it should be
possible. Although I'm not convinced I really want to put in the
amount of effort that would be required. Maybe we could just submit
new features to someone else's already-written Git GUI? Or if they
don't like our features, fork their project?
I have no code yet, and I don't even know what would be good to use.Mark Dominustag:blog.plover.com,2015:/prog/unfinished-projectsWed, 13 May 2015 17:53:00 +0000Senior Systems Engineer (Scala) at AdAgility (Full-time)
http://functionaljobs.com/jobs/8826-senior-systems-engineer-scala-at-adagility
urn:uuid:c411a418-a1ed-0bc0-1ac8-28ab9cca2487Wed, 13 May 2015 16:14:19 +0000Senior Software Engineer (Functional Programming/Scala) at AdAgility (Full-time)
http://functionaljobs.com/jobs/8826-senior-software-engineer-functional-programming-scala-at-adagility
urn:uuid:b4450d57-8832-9241-05b8-f3fce18bbcffWed, 13 May 2015 16:14:19 +0000Sometimes, the old ways are the best
http://www.serpentine.com/blog/2015/05/13/sometimes-the-old-ways-are-the-best/
Over the past few months, the Sigma engineering team at Facebook has rolled out a major Haskell project: a rewrite of Sigma, an important weapon in our armory for fighting spam and malware.
Sigma has a mission-critical job, and it needs to scale: its growing workload currently sees it handling tens of millions of requests per minute.
The rewrite of Sigma in Haskell, using the Haxl library that Simon Marlow developed, has been a success. Throughput is higher than under its predecessor, and CPU usage is lower. Sweet!
Nevertheless, success brings with it surprises, and even though I haven’t worked on Sigma or Haxl, I’ve been implicated in one such surprise. To understand my accidental bit part in the show, let's begin by mentioning that Sigma uses JSON internally for various purposes. These days, the Haskell-powered Sigma uses aeson, the JSON library I wrote, to handle JSON data.
A few months ago, the Haxl rewrite of Sigma was going through an episode of crazytown, in which it would intermittently and unpredictably use huge amounts of CPU and memory. The culprit turned out to be JSON strings containing zillions of backslashes. (I have no idea why. If you’ve worked with large volumes of data for a long time, you won’t even bat an eyelash at the idea that a data store somewhere contains some really weird records.)
The team quickly mitigated the problem, and gave me a nudge that I might want to look into the problem. On Sunday evening, with a glass of red wine in hand, I finally dove in to see what was wrong.
Since the Sigma developers had figured out what was causing these time and space explosions, I immediately had a test case to work with, and the results were grim: decoding a mere megabyte of continuous backslashes took over a second, consumed over a gigabyte of memory, and killed concurrency by causing the runtime system to spend almost 90% of its time in the garbage collector. Yikes!
Whatever was going on? If you look at the old implementation of aeson’s unescape function, it seems quite efficient and innocuous. It’s reasonably tightly optimized low-level Haskell.
Trouble is, unescape uses an API (a bytestring builder) that is intended for streaming a result incrementally. Unfortunately the unescape function can’t hand any data back to its caller until it has processed an entire string.
The result is as you’d expect: we build a huge chain of thunks. In this case, the thunks will eventually write data efficiently into buffers. Alas, the thunks have nobody demanding the evaluation of their contents. This chain consumes a lot (a lot!) of memory and incurs a huge amount of GC overhead (long chains of thunks are expensive). Sadness ensues.
The “old ways” in the title refer to the fix: in place of a fancy streaming API, I simply allocate a single big buffer and blast the bytes straight into it.
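To make the contrast concrete, here is a toy sketch in the same spirit (my own illustration, NOT aeson's actual unescape): collapse each pair of backslashes to a single one by first counting the output size, then writing every byte straight into one preallocated buffer via Data.ByteString.Internal.unsafeCreate, with no intermediate thunks or builder chunks.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI
import qualified Data.ByteString.Unsafe as BU
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (poke)
import Data.Word (Word8)

-- Collapse "\\\\" pairs into single backslashes using one big buffer.
collapseBackslashes :: B.ByteString -> B.ByteString
collapseBackslashes s = BI.unsafeCreate outLen (fill 0 0)
  where
    n         = B.length s
    backslash = 92 :: Word8
    outLen    = n - countPairs 0 0

    -- First pass: how many byte pairs will collapse?
    countPairs !i !acc
      | i + 1 < n
      , BU.unsafeIndex s i == backslash
      , BU.unsafeIndex s (i + 1) == backslash = countPairs (i + 2) (acc + 1)
      | i < n     = countPairs (i + 1) acc
      | otherwise = acc

    -- Second pass: blast bytes directly into the destination buffer.
    fill :: Int -> Int -> Ptr Word8 -> IO ()
    fill !i !o p
      | i >= n    = return ()
      | i + 1 < n
      , BU.unsafeIndex s i == backslash
      , BU.unsafeIndex s (i + 1) == backslash = do
          poke (p `plusPtr` o) backslash
          fill (i + 2) (o + 1) p
      | otherwise = do
          poke (p `plusPtr` o) (BU.unsafeIndex s i)
          fill (i + 1) (o + 1) p
```

The work is strict and linear, and the result is built exactly once, so a megabyte of backslashes costs a megabyte-ish of allocation rather than a long chain of thunks.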
For that pathological string with almost a megabyte of consecutive backslashes, the new implementation is 27x faster and uses 42x less memory, all for the cost of perhaps an hour of Sunday evening hacking (including a little enabling work that incidentally illustrates just how easy it is to work with monad transformers). Not bad!Bryan O'Sullivanhttp://www.serpentine.com/blog/?p=1096Wed, 13 May 2015 16:13:43 +0000Pan-Galactic Division in Haskell
https://byorgey.wordpress.com/2015/05/12/pan-galactic-division-in-haskell/
Summary: given an injective function f : A × N → B × N, it is possible to constructively “divide by N” to obtain an injection A → B, as shown recently by Peter Doyle and Cecil Qiu and expounded by Richard Schwartz. Their algorithm is nontrivial to come up with—this had been a longstanding open question—but it’s not too difficult to explain. I exhibit some Haskell code implementing the algorithm, and show some examples.
Introduction: division by two
Suppose someone hands you the following:
A Haskell function f :: (A, Bool) -> (B, Bool), where A and B are abstract types (i.e. their constructors are not exported, and you have no other functions whose types mention A or B).
A promise that the function f is injective, that is, no two values of (A, Bool) map to the same (B, Bool) value. (Thus (B, Bool) must contain at least as many inhabitants as (A, Bool).)
A list as :: [A], with a promise that it contains every value of type A exactly once, at a finite position.
Can you explicitly produce an injective function f' :: A -> B? Moreover, your answer should not depend on the order of elements in as.
It really seems like this ought to be possible. After all, if (B, Bool) has at least as many inhabitants as (A, Bool), then surely B must have at least as many inhabitants as A. But it is not enough to reason merely that some injection must exist; we have to actually construct one. This, it turns out, is tricky. As a first attempt, we might try f' a = fst (f (a, True)). That is certainly a function of type A -> B, but there is no guarantee that it is injective. There could be a1, a2 :: A which both map to the same b, that is, one maps to (b, False) and the other to (b, True). The picture below illustrates such a situation: (a1, True) and (a2, True) both map to b2. So the function f may be injective overall, but we can’t say much about f restricted to a particular Bool value.
The requirement that the answer not depend on the order of as also makes things difficult. (Over in math-land, depending on a particular ordering of the elements in as would amount to the well-ordering principle, which is equivalent to the axiom of choice, which in turn implies the law of excluded middle—and as we all know, every time someone uses the law of excluded middle, a puppy dies. …I feel like I’m in one of those DirecTV commercials. “Don’t let a puppy die. Ignore the order of elements in as.”) Anyway, making use of the order of values in as, we could do something like the following:
For each a :: A:
Look at the B values generated by f (a,True) and f (a,False). (Note that there might only be one distinct such B value).
If neither B value has been used so far, pick the one that corresponds to (a,True), and add the other one to a queue of available B values.
If one is used and one unused, pick the unused one.
If both are used, pick the next available B value from the queue.
It is not too hard to show that this will always successfully result in a total function A -> B, which is injective by construction. (One has to show that there will always be an available B value in the queue when you need it.) The only problem is that the particular function we get depends on the order in which we iterate through the A values. The above example illustrates this as well: if the A values are listed in the order [a1, a2], then we first choose f'(a1) = b2, and a2 then takes the remaining value. If they are listed in the other order, a2 gets b2 and a1 takes what is left. Whichever value comes first “steals” b2, and then the other one takes whatever is left. We’d like to avoid this sort of dependence on order. That is, we want a well-defined algorithm which will yield a total, injective function A -> B, which is canonical in the sense that the algorithm yields the same function given any permutation of as.
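The queue-based procedure above can be sketched in plain Haskell. This is a standalone illustration (not code from the post), using a concrete Ord type in place of the abstract B so that we can track used values in a set:

```haskell
import qualified Data.Set as Set

-- Order-dependent greedy assignment, following the steps described
-- above: walk through the A values, assigning each one a B value and
-- queueing up any leftover B value for later use.
greedyDivide :: Ord b => ((a, Bool) -> (b, Bool)) -> [a] -> [(a, b)]
greedyDivide f = go Set.empty []
  where
    go _ _ [] = []
    go used queue (a : rest) =
      let bT = fst (f (a, True))   -- the B value behind (a, True)
          bF = fst (f (a, False))  -- the B value behind (a, False)
          pick
            | bT `Set.notMember` used && bF `Set.notMember` used =
                -- neither used: take the True one, queue the other
                (bT, if bF == bT then queue else queue ++ [bF])
            | bT `Set.notMember` used = (bT, queue)
            | bF `Set.notMember` used = (bF, queue)
            | otherwise = case queue of
                (q : qs) -> (q, qs)
                []       -> error "no available B value in the queue"
          (b, queue') = pick
       in (a, b) : go (Set.insert b used) queue' rest
```

Running this over [a1, a2] versus [a2, a1] on a suitable example function produces two different (though individually injective) assignments, which is exactly the order dependence we want to avoid.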
It is possible—you might enjoy puzzling over this a bit before reading on!
Division by N
The above example is a somewhat special case. More generally, let [n] denote a canonical finite set of size n, and let A and B be arbitrary sets. Then, given an injection A × [n] → B × [n], is it possible to effectively (that is, without excluded middle or the axiom of choice) compute an injection A → B?
Translating down to the world of numbers representing set cardinalities—natural numbers if A and B are finite, or cardinal numbers in general—this just says that if a · n ≤ b · n then a ≤ b. This statement about numbers is obviously true, so it would be nice if we could say something similar about sets, so that this fact about numbers and inequalities can be seen as just a “shadow” of a more general theorem about sets and injections.
As hinted in the introduction, the interesting part of this problem is really the word “effectively”. Using the Axiom of Choice/Law of Excluded Middle makes the problem a lot easier, but either fails to yield an actual function that we can compute with, instead merely guaranteeing the existence of such a function, or gives us a function that depends on a particular ordering of A.
Apparently this has been a longstanding open question, recently answered in the affirmative by Peter Doyle and Cecil Qiu in their paper Division By Four. It’s a really great paper: they give some fascinating historical context for the problem, and explain their algorithm (which is conceptually not all that difficult) using an intuitive analogy to a card game with certain rules. (It is not a “game” in the usual sense of having winners and losers, but really just an algorithm implemented with “players” and “cards”. In fact, you could get some friends together and actually perform this algorithm in parallel (if you have sufficiently nerdy friends).) Richard Schwartz’s companion article is also great fun and easy to follow (you should read it first).
A Game of Thrones Cards
Here’s a quick introduction to the way Doyle, Qiu, and Schwartz use a card game to formulate their algorithm. (Porting this framework to use “thrones” and “claimants” instead of “spots” and “cards” is left as an exercise to the reader.)
The finite set [n] is to be thought of as a set of suits. The set A will correspond to a set of players, and B to a set of ranks or values (for example, Ace, 2, 3, …) In that case B × [n] corresponds to a deck of cards, each card having a rank and a suit; and we can think of A × [n] in terms of each player having in front of them a number of “spots” or “slots”, each labelled by a suit. An injection A × [n] → B × [n] is then a particular “deal” where one card has been dealt into each of the spots in front of the players. (There may be some cards left over in the deck, but the fact that the function is total means every spot has a card, and the fact that it is injective is encoded in the common-sense idea that a given card cannot be in two spots at once.) For example, the example function from before:
corresponds to the following deal:
Here each column corresponds to one player’s hand, and the rows correspond to suit spots (with the spade spots on top and the heart spots beneath). We have mapped the values of B to the ranks A, 2, 3, and mapped T and F to Spades and Hearts respectively. The spades are also highlighted in green, since later we will want to pay particular attention to what is happening with them. You might want to take a moment to convince yourself that the deal above really does correspond to the example function from before.
A Haskell implementation
Of course, doing everything effectively means we are really talking about computation. Doyle and Qiu do talk a bit about computation, but it’s still pretty abstract, in the sort of way that mathematicians talk about computation, so I thought it would be interesting to actually implement the algorithm in Haskell.
The algorithm “works” for infinite sets, but only (as far as I understand) if you consider some notion of transfinite recursion. It still counts as “effective” in math-land, but over here in programming-land I’d like to stick to (finitely) terminating computations, so we will stick to finite sets A and B.
First, some extensions and imports. Nothing too controversial.
> {-# LANGUAGE DataKinds #-}
> {-# LANGUAGE GADTs #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> {-# LANGUAGE KindSignatures #-}
> {-# LANGUAGE RankNTypes #-}
> {-# LANGUAGE ScopedTypeVariables #-}
> {-# LANGUAGE StandaloneDeriving #-}
> {-# LANGUAGE TypeOperators #-}
>
> module PanGalacticDivision where
>
> import Control.Arrow (second, (&&&), (***))
> import Data.Char
> import Data.List (find, findIndex, transpose)
> import Data.Maybe
>
> import Diagrams.Prelude hiding (universe, value)
> import Diagrams.Backend.Rasterific.CmdLine
> import Graphics.SVGFonts
We’ll need some standard machinery for type-level natural numbers. Probably all this stuff is in a library somewhere but I couldn’t be bothered to find out. Pointers welcome.
> -- Standard unary natural number type
> data Nat :: * where
> Z :: Nat
> Suc :: Nat -> Nat
>
> type One = Suc Z
> type Two = Suc One
> type Three = Suc Two
> type Four = Suc Three
> type Six = Suc (Suc Four)
> type Eight = Suc (Suc Six)
> type Ten = Suc (Suc Eight)
> type Thirteen = Suc (Suc (Suc Ten))
>
> -- Singleton Nat-indexed natural numbers, to connect value-level and
> -- type-level Nats
> data SNat :: Nat -> * where
> SZ :: SNat Z
> SS :: Natural n => SNat n -> SNat (Suc n)
>
> -- A class for converting type-level nats to value-level ones
> class Natural n where
> toSNat :: SNat n
>
> instance Natural Z where
> toSNat = SZ
>
> instance Natural n => Natural (Suc n) where
> toSNat = SS toSNat
>
> -- A function for turning explicit nat evidence into implicit
> natty :: SNat n -> (Natural n => r) -> r
> natty SZ r = r
> natty (SS n) r = natty n r
>
> -- The usual canonical finite type. Fin n has exactly n
> -- (non-bottom) values.
> data Fin :: Nat -> * where
> FZ :: Fin (Suc n)
> FS :: Fin n -> Fin (Suc n)
>
> finToInt :: Fin n -> Int
> finToInt FZ = 0
> finToInt (FS n) = 1 + finToInt n
>
> deriving instance Eq (Fin n)
Finiteness
Next, a type class to represent finiteness. For our purposes, a type a is finite if we can explicitly list its elements. For convenience we throw in decidable equality as well, since we will usually need that in conjunction. Of course, we have to be careful: although we can get a list of elements for a finite type, we don’t want to depend on the ordering. We must ensure that the output of the algorithm is independent of the order of elements.1 This is in fact true, although somewhat nontrivial to prove formally; I mention some of the intuitive ideas behind the proof below.
While we are at it, we give Finite instances for Fin n and for products of finite types.
> class Eq a => Finite a where
> universe :: [a]
>
> instance Natural n => Finite (Fin n) where
> universe = fins toSNat
>
> fins :: SNat n -> [Fin n]
> fins SZ = []
> fins (SS n) = FZ : map FS (fins n)
>
> -- The product of two finite types is finite.
> instance (Finite a, Finite b) => Finite (a,b) where
> universe = [(a,b) | a <- universe, b <- universe]
Division, inductively
Now we come to the division algorithm proper. The idea is that panGalacticPred turns an injection A × [n+2] → B × [n+2] into an injection A × [n+1] → B × [n+1], and then we use induction on n to repeatedly apply panGalacticPred until we get an injection A × [1] → B × [1].
> panGalacticDivision
> :: forall a b n. (Finite a, Eq b)
> => SNat n -> ((a, Fin (Suc n)) -> (b, Fin (Suc n))) -> (a -> b)
In the base case, we are given an injection A × [1] → B × [1], so we just pass a unit value in along with the a and project out the b.
> panGalacticDivision SZ f = \a -> fst (f (a, FZ))
In the inductive case, we call panGalacticPred and recurse.
> panGalacticDivision (SS n') f = panGalacticDivision n' (panGalacticPred n' f)
Pan-Galactic Predecessor
And now for the real meat of the algorithm, the panGalacticPred function. The idea is that we swap outputs around until the function has the property that every output of the form (b, FZ) corresponds to an input also of the form (a, FZ). That is, using the card game analogy, every spade in play should be in the leftmost spot (the spades spot) of some player’s hand (some spades can also be in the deck). Then simply dropping the leftmost card in everyone’s hand (and all the spades in the deck) yields a game with no spades. That is, we will have an injection between the non-spade spots and the non-spade cards. Taking predecessors everywhere (i.e. “hearts are the new spades”) yields the desired injection A × [n+1] → B × [n+1].
We need a Finite constraint on a so that we can enumerate all possible inputs to the function, and an Eq constraint on b so that we can compare functions for extensional equality (we iterate until reaching a fixed point). Note that whether two functions are extensionally equal does not depend on the order in which we enumerate their inputs, so far validating my claim that nothing depends on the order of elements returned by universe.
> panGalacticPred
> :: (Finite a, Eq b, Natural n)
> => SNat n
> -> ((a, Fin (Suc (Suc n))) -> (b, Fin (Suc (Suc n))))
> -> ((a, Fin (Suc n)) -> (b, Fin (Suc n)))
We construct a function f' which is related to f by a series of swaps, and has the property that it only outputs FZ when given FZ as an input. So given (a,i) we can call f' on (a, FS i) which is guaranteed to give us something of the form (b, FS j). Thus it is safe to strip off the FS and return (b, j) (though the Haskell type checker most certainly does not know this, so we just have to tell it to trust us).
> panGalacticPred n f = \(a,i) -> second unFS (f' (a, FS i))
> where
> unFS :: Fin (Suc n) -> Fin n
> unFS FZ = error "impossible!"
> unFS (FS i) = i
To construct f' we iterate a certain transformation until reaching a fixed point. For finite sets A and B this is guaranteed to terminate, though it is certainly not obvious from the Haskell code. (Encoding this in Agda so that it is accepted by the termination checker would be a fun (?) exercise.)
One round of the algorithm consists of two phases called “shape up” and “ship out” (to be described shortly).
> oneRound = natty n $ shipOut . shapeUp
>
> -- iterate 'oneRound' beginning with the original function...
> fs = iterate oneRound f
> -- ... and stop when we reach a fixed point.
> f' = fst . head . dropWhile (uncurry (=/=)) $ zip fs (tail fs)
> f1 =/= f2 = not (all (\x -> f1 x == f2 x) universe)
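The stopping criterion here is a general idiom; extracted as a standalone helper (a sketch, not part of the post's module), it looks like this:

```haskell
-- Iterate a step function and return the first value that maps to
-- itself, i.e. the first element equal to its successor in the
-- iteration.  This mirrors the zip fs (tail fs) trick above.
fixedPoint :: Eq a => (a -> a) -> a -> a
fixedPoint step x0 = fst . head . dropWhile (uncurry (/=)) $ zip xs (tail xs)
  where
    xs = iterate step x0
```

Note that for the functions above we cannot use (/=) directly, since functions have no Eq instance; that is why the post compares them extensionally over universe instead.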
Encoding Card Games
Recall that a “card” is a pair of a value and a suit; we think of B as the set of values and [n] as the set of suits.
> type Card v s = (v, s)
>
> value :: Card v s -> v
> value = fst
>
> suit :: Card v s -> s
> suit = snd
Again, there are a number of players (one for each element of A), each of which has a “hand” of cards. A hand has a number of “spots” for cards, each one labelled by a different suit (which may not have any relation to the actual suit of the card in that position).
> type PlayerSpot p s = (p, s)
> type Hand v s = s -> Card v s
A “game” is an injective function from player spots to cards. Of course, the type system is not enforcing injectivity here.
> type Game p v s = PlayerSpot p s -> Card v s
Some utility functions. First, a function to project out the hand of a given player.
> hand :: p -> Game p v s -> Hand v s
> hand p g = \s -> g (p, s)
A function to swap two cards, yielding a bijection on cards.
> swap :: (Eq s, Eq v) => Card v s -> Card v s -> (Card v s -> Card v s)
> swap c1 c2 = f
> where
> f c
> | c == c1 = c2
> | c == c2 = c1
> | otherwise = c
leftmost finds the leftmost card in a player’s hand which has a given suit.
> leftmost :: Finite s => s -> Hand v s -> Maybe s
> leftmost targetSuit h = find (\s -> suit (h s) == targetSuit) universe
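For intuition, here is a list-based analogue of leftmost (a hypothetical standalone version, not the post's representation): with a hand as an association list from spots to cards, it finds the first spot holding a card of the target suit.

```haskell
import Data.List (find)

-- List-based analogue of 'leftmost': a hand is an association list
-- from spots to (value, suit) cards; return the first spot whose card
-- has the target suit, if any.
leftmostIn :: Eq s => s -> [(spot, (v, s))] -> Maybe spot
leftmostIn target = fmap fst . find (\(_, (_, s)) -> s == target)
```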
Playing Rounds
playRound abstracts out a pattern that is used by both shapeUp and shipOut. The first argument is a function which, given a hand, produces a function on cards; that is, based on looking at a single hand, it decides how to swap some cards around.2 playRound then applies that function to every hand, and composes together all the resulting permutations.
Note that playRound has both Finite s and Finite p constraints, so we should think about whether the result depends on the order of elements returned by any call to universe—I claimed it does not. Finite s corresponds to suits/spots, which corresponds to [n] in the original problem formulation. [n] explicitly has a canonical ordering, so this is not a problem. The Finite p constraint, on the face of it, is more problematic. We will have to think carefully about each of the rounds implemented in terms of playRound and make sure they do not depend on the order of players. Put another way, it should be possible for all the players to take their turn simultaneously.
> playRound :: (Finite s, Finite p, Eq v) => (Hand v s -> Card v s -> Card v s) -> Game p v s -> Game p v s
> playRound withHand g = foldr (.) id swaps . g
> where
> swaps = map (withHand . flip hand g) players
> players = universe
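The foldr (.) id pattern above composes a whole round's worth of swaps into a single function; in isolation (a standalone sketch, not the post's code) it is just:

```haskell
-- Fold a list of endofunctions into their composition, as playRound
-- does with each hand's swaps; the rightmost function applies first.
composeAll :: [a -> a] -> (a -> a)
composeAll = foldr (.) id
```

Because all the swaps produced within one round commute (see Schwartz), the order of the list—and hence of the players—does not affect the composed permutation, even though composeAll in general is order-sensitive.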
Shape Up and Ship Out
Finally, we can describe the “shape up” and “ship out” phases, beginning with “shape up”. A “bad” card is defined as one having the lowest suit; make sure every hand with any bad cards has one in the leftmost spot (by swapping the leftmost bad card with the card in the leftmost spot, if necessary).
> shapeUp :: (Finite s, Finite p, Eq v) => Game p v s -> Game p v s
> shapeUp = playRound shapeUp1
> where
> badSuit = head universe
> shapeUp1 theHand =
> case leftmost badSuit theHand of
> Nothing -> id
> Just badSpot -> swap (theHand badSuit) (theHand badSpot)
And now for the “ship out” phase. Send any “bad” cards not in the leftmost spot somewhere else, by swapping with a replacement, namely, the card whose suit is the same as the suit of the spot, and whose value is the same as the value of the bad card in the leftmost spot. The point is that bad cards in the leftmost spot are OK, since we will eventually just ignore the leftmost spot. So we have to keep shipping out bad cards not in the leftmost spot until they all end up in the leftmost spot. For some intuition as to why this is guaranteed to terminate, consult Schwartz; note that columns tend to acquire more and more cards that have the same rank as a spade in the top spot (which never moves).
> shipOut :: (Finite s, Finite p, Eq v) => Game p v s -> Game p v s
> shipOut = playRound shipOutHand
> where
> badSuit = head universe
> spots = universe
> shipOutHand theHand = foldr (.) id swaps
> where
> swaps = map (shipOut1 . (theHand &&& id)) (drop 1 spots)
> shipOut1 ((_,s), spot)
> | s == badSuit = swap (theHand spot) (value (theHand badSuit), spot)
> | otherwise = id
And that’s it! Note that both shapeUp and shipOut are implemented by composing a bunch of swaps; in fact, in both cases, all the swaps commute, so the order in which they are composed does not matter. (For proof, see Schwartz.) Thus, the result is independent of the order of the players (i.e. the set A).
Enough code, let’s see an example! This example is taken directly from Doyle and Qiu’s paper, and the diagrams are being generated literally (literately?) by running the code in this blog post. Here’s the starting configuration:
Again, the spades are all highlighted in green. Recall that our goal is to get them all to be in the first row, but we have to do it in a completely deterministic, canonical way. After shaping up, we have:
Notice how the 6, K, 5, A, and 8 of spades have all been swapped to the top of their column. However, there are still spades which are not at the top of their column (in particular the 10, 9, and J) so we are not done yet.
Now, we ship out. For example, the 10 of spades is in the diamonds position in the column with the Ace of spades, so we swap it with the Ace of diamonds. Similarly, we swap the 9 of spades with the Queen of diamonds, and the Jack of spades with the 4 of hearts.
Shaping up does nothing at this point so we ship out again, and then continue to alternate rounds.
In the final deal above, all the spades are at the top of a column, so there is an injection from the set of all non-spade spots to the deck of cards with all spades removed. This example was, I suspect, carefully constructed so that none of the spades get swapped out into the undealt portion of the deck, and so that we end up with only spades in the top row. In general, we might end up with some non-spades also in the top row, but that’s not a problem. The point is that ignoring the top row gets rid of all the spades.
Anyway, I hope to write more about some “practical” examples and about what this has to do with combinatorial species, but this post is long enough already. Doyle and Qiu also describe a “short division” algorithm (the above is “long division”) that I hope to explore as well.
The rest of the code
For completeness, here’s the code I used to represent the example game above, and to render all the card diagrams (using diagrams 1.3).
> type Suit = Fin
> type Rank = Fin
> type Player = Fin
>
> readRank :: SNat n -> Char -> Rank n
> readRank n c = fins n !! (fromJust $ findIndex (==c) "A23456789TJQK")
>
> readSuit :: SNat n -> Char -> Suit n
> readSuit (SS _) 'S' = FZ
> readSuit (SS (SS _)) 'H' = FS FZ
> readSuit (SS (SS (SS _))) 'D' = FS (FS FZ)
> readSuit (SS (SS (SS (SS _)))) 'C' = FS (FS (FS FZ))
>
> readGame :: SNat a -> SNat b -> SNat n -> String -> Game (Player a) (Rank b) (Suit n)
> readGame a b n str = \(p, s) -> table !! finToInt p !! finToInt s
> where
> table = transpose . map (map readCard . words) . lines $ str
> readCard [r,s] = (readRank b r, readSuit n s)
>
> -- Example game from Doyle & Qiu
> exampleGameStr :: String
> exampleGameStr = unlines
> [ "4D 6H QD 8D 9H QS 4C AD 6C 4S"
> , "JH AH 9C 8H AS TC TD 5H QC JS"
> , "KC 6S 4H 6D TS 9S JC KD 8S 8C"
> , "5C 5D KS 5S TH JD AC QH 9D KH"
> ]
>
> exampleGame :: Game (Player Ten) (Rank Thirteen) (Suit Four)
> exampleGame = readGame toSNat toSNat toSNat exampleGameStr
>
> suitSymbol :: Suit n -> String
> suitSymbol = (:[]) . ("♠♥♦♣"!!) . finToInt -- Huzzah for Unicode
>
> suitDia :: Suit n -> Diagram B
> suitDia = (suitDias!!) . finToInt
>
> suitDias = map mkSuitDia (fins (toSNat :: SNat Four))
> mkSuitDia s = text' (suitSymbol s) # fc (suitColor s) # lw none
>
> suitColor :: Suit n -> Colour Double
> suitColor n
> | finToInt n `elem` [0,3] = black
> | otherwise = red
>
> rankStr :: Rank n -> String
> rankStr n = rankStr' (finToInt n + 1)
> where
> rankStr' 1 = "A"
> rankStr' i | i <= 10 = show i
> | otherwise = ["JQK" !! (i - 11)]
>
> text' t = stroke (textSVG' (TextOpts lin INSIDE_H KERN False 1 1) t)
>
> renderCard :: (Rank b, Suit n) -> Diagram B
> renderCard (r, s) = mconcat
> [ mirror label
> , cardContent (finToInt r + 1)
> , back
> ]
> where
> cardWidth = 2.25
> cardHeight = 3.5
> cardCorners = 0.1
> mirror d = d <> d # rotateBy (1/2)
> back = roundedRect cardWidth cardHeight cardCorners # fc white
> # lc (case s of { FZ -> green; _ -> black })
> label = vsep 0.1 [text' (rankStr r), text' (suitSymbol s)]
> # scale 0.6 # fc (suitColor s) # lw none
> # translate ((-0.9) ^& 1.5)
> cardContent n
> | n <= 10 = pips n
> | otherwise = face n # fc (suitColor s) # lw none
> # sized (mkWidth (cardWidth * 0.6))
> pip = suitDia s # scale 1.1
> pips 1 = pip # scale 2
> pips 2 = mirror (pip # up 2)
> pips 3 = pips 2 <> pip
> pips 4 = mirror (pair pip # up 2)
> pips 5 = pips 4 <> pip
> pips 6 = mirror (pair pip # up 2) <> pair pip
> pips 7 = pips 6 <> pip # up 1
> pips 8 = pips 6 <> mirror (pip # up 1)
> pips 9 = mirror (pair (pip # up (2/3) <> pip # up 2)) <> pip # up (case finToInt s of {1 -> -0.1; 3 -> 0; _ -> 0.1})
> pips 10 = mirror (pair (pip # up (2/3) <> pip # up 2) <> pip # up (4/3))
> pips _ = mempty
> up n = translateY (0.5*n)
> pair d = hsep 0.4 [d, d] # centerX
> face 11 = squares # frame 0.1
> face 12 = loopyStar
> face 13 = burst # centerXY
> squares
> = strokeP (mirror (square 1 # translate (0.2 ^& 0.2)))
> # fillRule EvenOdd
> loopyStar
> = regPoly 7 1
> # star (StarSkip 3)
> # pathVertices
> # map (cubicSpline True)
> # mconcat
> # fillRule EvenOdd
> burst
> = [(1,5), (1,-5)] # map r2 # fromOffsets
> # iterateN 13 (rotateBy (-1/13))
> # mconcat # glueLine
> # strokeLoop
>
> renderGame :: (Natural n, Natural a) => Game (Player a) (Rank b) (Suit n) -> Diagram B
> renderGame g = hsep 0.5 $ map (\p -> renderHand p $ hand p g) universe
>
> renderHand :: Natural n => Player a -> Hand (Rank b) (Suit n) -> Diagram B
> renderHand p h = vsep 0.2 $ map (renderCard . h) universe
If we could program in Homotopy Type Theory, we could make this very formal by using the notion of cardinal-finiteness developed in my dissertation (see section 2.4).↩
In practice this function on cards will always be a permutation, though the Haskell type system is not enforcing that at all. An early version of this code used the Iso type from lens, but it wasn’t really paying its way.↩
Brent http://byorgey.wordpress.com/?p=1439 Tue, 12 May 2015 20:49:24 +0000
Distributing our packages without a sysadmin
https://www.fpcomplete.com/blog/2015/05/distributing-packages-without-sysadmin
At FP Complete, we're no strangers to running complex web services. But we know
from experience that the simplest service to maintain is one someone else is
managing for you. A few days ago I described how secure package
distribution with stackage-update
and stackage-install works, focusing on the client side tooling. Today's blog
post is about how we use Amazon S3, Github, and Travis CI to host all of this
with (almost) no servers of our own (that caveat explained in the process).
Making executables available
We use two different Haskell tools for this hosting:
hackage-mirror to copy the raw
packages from Hackage to S3, and
all-cabal-hashes-tool
to populate the raw cabal files with hash/package size information. But we
don't want to have to compile these executables every time we call them.
Instead, we'd like to simply download and run a precompiled executable.
Like many other Github projects, these two utilize Travis CI to build and test
the code every time a commit is pushed. But that's not all; using Travis's
deployment capability, they also
upload an executable to S3.
Figuring out the details of making this work is a bit tricky, so it's easiest
to just look at the .travis.yml
file.
For the security conscious: the trick is that Travis allows us to encrypt data
so that no one but Travis can decrypt it. Then, Travis can decrypt and upload
it to S3 for us.
Result: a fully open, transparent process for executable building that can be
reviewed by anyone in the community, without allowing private credentials to be
leaked. Also, notice how none of our own servers needed to get involved.
Running the executables
We're going to leverage Travis yet again, and use it to run the executables it
so politely generated for us. We'll use all-cabal-hashes as our demonstration,
though
all-cabal-packages
works much the same way. We have an update.sh
script
which downloads and runs our executable, and then commits, signs, and pushes to
Github. In order to sign and push, however, we need to have a GPG and SSH key,
respectively.
Once again, Travis's encryption capabilities come into play. In the
.travis.yml
file,
we decrypt a tar file containing the GPG and SSH key, put them in the correct
location, and also configure Git. Then we call out to the update.sh script.
One wrinkle here is that Travis only supports having a single encrypted file
per repo, which is why we have to tar the two different keys together, a
minor annoyance.
As before, we have processes running on completely open, auditable systems.
Uploads are being made to providers we don't manage (either Amazon or Github).
The only thing kept hidden are the secrets themselves (keys). And if the
process ever fails, I get an immediate notification from Travis. So far, that's
only happened when I was playing with the build or Hackage was unresponsive.
Running regularly
It wouldn't be very useful if these processes weren't run regularly. This is a
perfect place for a cron job. Unfortunately, Travis doesn't yet support cron
jobs, though they seem to be
planning them for the future. In the meantime, we do have to run this on our own
service. Fortunately, it's a simple job that just asks Travis to restart the
last build it ran for each repository.To simplify even further, I run the Travis command line client from inside a
Docker container, so that the only host system dependency is Docker itself. The
wrapper script is:
#!/bin/bash
set -e
set -x
docker run --rm -v /home/ubuntu/all-cabal-files-internal.sh:/run.sh:ro bilge/travis-cli /run.sh
The script that runs inside the Docker container is the following (token hidden
to protect... well, me).#!/bin/bash
set -ex
travis login --skip-version-check --org --github-token XXXXXXXXX
# Trigger the package mirroring first, since it's used by all-cabal-hashes
BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-packages | grep "^hackage" | awk "{ print \$2 }")
BUILDNUM=${BUILD###}
echo BUILD=$BUILD
echo BUILDNUM=$BUILDNUM
travis restart --skip-version-check -r commercialhaskell/all-cabal-packages $BUILDNUM
BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-files | grep "^hackage" | awk "{ print \$2 }")
BUILDNUM=${BUILD###}
echo BUILD=$BUILD
echo BUILDNUM=$BUILDNUM
travis restart --skip-version-check -r commercialhaskell/all-cabal-files $BUILDNUM
# Put in a bit of a delay to allow the all-cabal-packages job to finish. If
# not, no big deal, next job will pick up the change.
sleep 30
BUILD=$(travis branches --skip-version-check -r commercialhaskell/all-cabal-hashes | grep "^hackage" | awk "{ print \$2 }")
BUILDNUM=${BUILD###}
echo BUILD=$BUILD
echo BUILDNUM=$BUILDNUM
travis restart --skip-version-check -r commercialhaskell/all-cabal-hashes $BUILDNUM
Conclusion
Letting someone else deal with our file storage, file serving, executable
building, and update process is a massive time saver. Now our sysadmins can
stop dealing with these problems, and start solving complicated problems. The
fact that everyone can inspect, learn from, and understand what our services are
doing is another advantage. I encourage others to try out these kinds of
deployments whenever possible.
https://www.fpcomplete.com/blog/2015/05/distributing-packages-without-sysadmin Wed, 13 May 2015 00:00:00 +0000
GHC Weekly News - 2015/05/11
http://ghc.haskell.org/trac/ghc/blog/weekly20150511
thoughtpolice http://ghc.haskell.org/trac/ghc/blog/weekly20150511 Mon, 11 May 2015 14:49:11 +0000
Secure package distribution: ready to roll
https://www.fpcomplete.com/blog/2015/05/secure-package-distribution
We're happy to announce that all users of Haskell packages can now securely download packages. As a tl;dr, here are the changes you need to make:
Add the relevant GPG key by following the instructions
Install stackage-update and stackage-install: cabal update && cabal install stackage
From now on, replace usage of cabal update with stk update --verify --hashes
From now on, replace usage of cabal install ... with stk install ...
This takes advantage of the
all-cabal-hashes
repository, which contains cabal files that are modified to contain package
hashes and sizes. The way we generate the all-cabal-hashes is interesting in
its own right, but I won't shoehorn that discussion into this blog post. Wait
for a separate blog post soon for a description of our lightweight architecture
for this.
Note that this is an implementation of Mathieu's secure distribution
proposal,
with some details modified to work with the current state of our tooling (i.e.,
lack of package hash information from Hackage).
How it works
The all-cabal-hashes repository contains all of the cabal files Hackage knows
about. These cabal files are tweaked to have a few extra metadata fields,
including cryptographic hashes of the package tarball and the size of the
package, in bytes. (It also contains the same data in a JSON file, which is
what we currently use due to cabal issue
#2585.) There is also a tag on
the repo, current-hackage, which always points at the latest commit and is
GPG signed. (If you're wondering, we use a tag instead of just commit signing
since it's easier to verify a tag signature.)
When you run stk update --verify --hashes, it fetches the latest content from
that repository, verifies the GPG signature, generates a 00-index.tar file,
and places it in the same location that cabal update would place it. At this
point, you have a verified package index on your local machine, which
contains cryptographic hashes and sizes for each package tarball.
Now, when you run stk install ..., the stackage-install tool handles all
downloads for you (subject to some
caveats, like cabal issue
#2566). stackage-install will
look up all of the hashes and sizes that are present in your package index, and
verify them during download. In particular:
If the server tries to send more data than expected, the download stops immediately and an exception is thrown.
If the server sends less data than expected, an exception is thrown.
If the hash does not match what was expected, an exception is thrown.
Only when the hash and size match does the file get written. In this way,
tarballs are only made available to the rest of your build tools after they
have been verified.
What about Windows?
In mailing list discussions, some people were concerned about supporting
Windows, in particular that Git and GPG may be difficult to install and
configure on Windows. But as I shared on Google+ last week,
MinGHC will now be shipping with both of
those tools. I've tested things myself on Windows with the new versions of
MinGHC, stackage-update, and stackage-install, and the instructions above worked
without a hitch.
Of course, if others discover problems, either on Windows or elsewhere, please
report them so they can be fixed.
Speed and reliability
In addition to the security benefits of this toolchain, there are also two
other obvious benefits. By downloading the package index updates via Git, we
are able to download only the differences since the last time we downloaded.
This leads to less bandwidth usage and a quicker download.
This toolchain also replaces connections to Hackage with two high-reliability
services: Amazon S3 (which holds the package contents) and GitHub. Using
off-the-shelf, widely used services in place of hosting everything ourselves reduces
our community burden and increases our ecosystem's reliability.

Caveats

There are unfortunately still some caveats with this.

The biggest hole in the fence is that we have no way of securing distribution of packages from Hackage itself. While all-cabal-hashes downloads the package index from Hackage via HTTPS (avoiding MITM attacks), there are still other attack vectors to be concerned about (such as breaching the Hackage server itself). The improved Hackage security page documents many of these concerns. Ideally, Hackage would be modified to perform package index signing itself.

Due to cabal issue #2566, it's still possible that cabal-install may download packages for you instead of stackage-install, though these situations should be rare. Hopefully integrating this download code directly with a build tool will eliminate that weakness.

There is still no verification of package author signatures, so that if someone's Hackage credentials are compromised (which is unfortunately very probable), a corrupted package could be present. This is something Chris Done and Tim Dysinger are working on. We're looking for others in the community to work with us on pushing forward on this. If you're interested, please contact us.

Using preexisting tools

What's great about this toolchain is how shallow it is. All of the heavy
lifting is handled by Git, GPG, Amazon S3, Github, and (as you'll see in a
later blog post) Travis CI. We mostly just wrap around these high quality tools
and services. Not only was this a practical decision (reduce development time
and code burden), but also a security decision. Instead of creating a
Haskell-only security and distribution framework, we're reusing the same
components that are being tried and tested on a daily basis by the greater
software community. While this doesn't guarantee the tooling we use is bug
free, it does mean that the "many eyeballs" principle applies.Using preexisting tools also means that we open up the possibility of use cases
never before considered. For example, someone contacted me (anonymity
preserved) about a use case where he wanted to be able to identify which
version of Hackage was being used. Until now, such a concept didn't exist. With
a Git-based package index, the Hackage version can be identified by its commit.I'm sure others will come up with new and innovative tricks to pull off, and I
look forward to hearing about them.https://www.fpcomplete.com/blog/2015/05/secure-package-distributionMon, 11 May 2015 00:00:00 +0000Functional Programming @ Amplidata
http://tomschrijvers.blogspot.com/2015/05/functional-programming-amplidata.html
Koen De Keyser gave an interesting talk about "Functional Programming @ Amplidata" and their transition from OCaml to Haskell at the 4th meeting of the Leuven Haskell User Group.The slides are available here.The slides of other talks at the Leuven Haskell User Group can be found here.Tom Schrijverstag:blogger.com,1999:blog-5844006452378085451.post-1043217177607732057Thu, 07 May 2015 09:02:00 +0000Haskell content spinner
http://www.haskellforall.com/2015/05/haskell-content-spinner.html
Recently somebody posted a template for generating blog comment spam, so I thought: "What sillier way to show how elegant Haskell is than generating comment spam?!"The first "stanza" of the template looks like this:{I have|I've} been {surfing|browsing} online more than {three|3|2|4} hours today, yetI never found any interesting article like yours. {It's|It is}pretty worth enough for me. {In my opinion|Personally|In my view},if all {webmasters|site owners|website owners|web owners} and bloggers made good content as you did, the {internet|net|web} will be {much more|a lot more} useful than ever before.|I {couldn't|could not} {resist|refrain from} commenting.{Very well|Perfectly|Well|Exceptionally well} written!|{I will|I'll} {right away|immediately} {take hold of|grab|clutch|grasp|seize|snatch} your {rss|rss feed} as I {can not|can't} {in finding|find|to find} your {email|e-mail} subscription {link|hyperlink} or {newsletter|e-newsletter} service.Do {you have|you've} any? {Please|Kindly} {allow|permit|let} me {realize|recognize|understand|recognise|know} {so that|in order that} I {mayjust|may|could} subscribe. Thanks.|{It is|It's} {appropriate|perfect|the best} time tomake some plans for the future and {it is|it's} time to be happy. Anything of the form {x|y|z} represents a choice between alternative text fragments x, y, and z. The above template has four large alternative comments to pick from, each with their own internal variations. 
The purpose of these alternatives is to evade simple spam detection algorithms, much like how some viruses evade the immune system by mutating antigens.I wanted to write a Haskell program that selected a random template from one of the provided alternatives and came up with this:{-# LANGUAGE OverloadedStrings #-}import Control.Foldl (random) -- Requires `foldl-1.0.10` or higherimport Turtlemain = do x <- foldIO spam random print xspam :: Shell Textspam = -- 1st major template "" * ("I have" + "I've") * " been " * ("surfing" + "browsing") * " online more than " * ("three" + "3" + "2" + "4") * " hours today, yet I never found any interesting article like yours. " * ("It's" + "It is") * " pretty worth enough for me. " * ("In my opinion" + "Personally" + "In my view") * ", if all " * ("webmasters" + "site owners" + "website owners" + "web owners") * " and bloggers made good content as you did, the " * ("internet" + "net" + "web") * " will be " * ("much more" + "a lot more") * " useful than ever before." -- 2nd major template + " I " * ("couldn't" + "could not") * " " * ("resist" + "refrain from") * " commenting. " * ("Very well" + "Perfectly" + "Well" + "Exceptionally well") * " written!" -- 3rd major template + " " * ("I will" + "I'll") * " " * ("right away" + "immediately") * " " * ("take hold of" + "grab" + "clutch" + "grasp" + "seize" + "snatch") * " your " * ("rss" + "rss feed") * " as I " * ("can not" + "can't") * " " * ("in finding" + "find" + "to find") * " your " * ("email" + "e-mail") * " subscription " * ("link" + "hyperlink") * " or " * ("newsletter" + "e-newsletter") * " service. Do " * ("you have" + "you've") * " any? " * ("Please" + "Kindly") * " " * ("allow" + "permit" + "let") * " me " * ("realize" + "recognize" + "understand" + "recognise" + "know") * " " * ("so that" + "in order that") * " I " * ("may just" + "may" + "could") * " subscribe. Thanks." 
-- 4th major template + " " * ("It is" + "It's") * " " * ("appropriate" + "perfect" + "the best") * " time to make some plans for the future and " * ("it is" + "it's") * " time to be happy."Conceptually, all I did to embed the template in Haskell was to:add a quote to the beginning of the template: "replace all occurrences of { with "*(" (including quotes)replace all occurrences of } with ")*" (including quotes)replace all occurrences of | with "+" (including quotes)add a quote to the end of the template: "In fact, I mechanically transformed the template to Haskell code using simple sed commands within vi and then just formatted the result to be more readable.Before explaining why this works, let's try our program out to verify that it works:$ ghc -O2 spam.hs$ ./spamJust " I will right away grab your rss feed as I can not find your email subscription hyperlink or newsletter service. Do you've any? Please let me recognise in order that I could subscribe. Thanks."$ ./spamJust " I'll immediately seize your rss as I can not find your email subscription link or e-newsletter service. Do you have any? Please allow me realize in order that I may subscribe. Thanks."You might wonder: how does the above program work?TypesLet's begin from the type of the top-level utility named foldIO:foldIO :: Shell a -- A stream of `a`s -> FoldM IO a b -- A fold that reduces `a`s to a single `b` -> IO b -- The result (a `b`)foldIO connects a producer of as (i.e. a Shell) to a fold that consumes as and produces a single b (i.e. a FoldM). For now we will ignore how they are implemented. Instead we will play type tetris to see how we can connect things together.The first argument we supply to foldIO is spam, whose type is:spam :: Shell TextThink of a Shell as a stream, and spam is a stream whose elements are Text values. Each element of this stream corresponds to one possible alternative for our template.
For example, a template with exactly one alternative would be a stream with one element.When we supply spam as the first argument to foldIO, the compiler infers that the first a in the type of foldIO must be TextfoldIO :: Shell a -> FoldM IO a b -> IO b ^ | | |spam :: Shell Text... therefore, the second a must also be Text:foldIO :: Shell a -> FoldM IO a b -> IO b ^ ^ | | +-------------+ |spam :: Shell Text... so in this context foldIO has the more specialized type:foldIO :: Shell Text -> FoldM IO Text b -> IO b... and when we apply foldIO to spam we get the following narrower type:foldIO spam :: FoldM IO Text b -> IO bNow all we need to do is to provide a fold that can consume a stream of Text elements. We choose the random fold, which uses reservoir sampling to pick a random element from the stream. The type of random is:random :: FoldM IO a (Maybe a)In other words, given an input stream of as, this fold reduces the stream to a single Maybe a. The Maybe is either Nothing if the stream is empty or Just some random element from the stream if the stream is non-empty.When we supply random as the second argument to foldIO, the compiler infers that the a in random must be Text:foldIO spam :: FoldM IO Text b -> IO b | | | vrandom :: FoldM IO a (Maybe a)... therefore the second a must also be Text:foldIO spam :: FoldM IO Text b -> IO b | +-----------+ | | v vrandom :: FoldM IO a (Maybe a)So the specialized type of random becomes:foldIO spam :: FoldM IO Text b -> IO brandom :: FoldM IO Text (Maybe Text)Now we can apply type inference in the opposite direction! The compiler infers that the b in the type of foldIO must be Maybe Text:foldIO spam :: FoldM IO Text b -> IO b ^ | | |random :: FoldM IO Text (Maybe Text)... therefore the other b must also be Maybe Text:foldIO spam :: FoldM IO Text b -> IO b ^ ^ | | +-----------+ |random :: FoldM IO Text (Maybe Text)... so we specialize foldIO's type even further to:foldIO spam :: foldM IO Text (Maybe Text) -> IO (Maybe Text)... 
and when we apply that to random the type simplifies down to:foldIO spam random :: IO (Maybe Text)The end result is a subroutine that loops over the stream using reservoir sampling, selects a random element (or Nothing if the stream is empty), and then returns the result.All that's left is to print the result:main = do x <- foldIO spam random print xSo that explains the top half of the code, but what about the bottom half? What is up with the addition and multiplication of strings?OverloadingThe first trick is that the strings are actually not strings at all! Haskell lets you overload string literals using the OverloadedStrings extension so that they type-check as any type that implements the IsString type class. The Shell Text type is one such type. If you provide a string literal where the compiler expects a Shell Text then the compiler will instead build a 1-element stream containing just that string literal.The second trick is that Haskell lets you overload numeric operations to work on any type that implements the Num type class. The Shell Text type implements this type class, so you can add and multiply streams of text elements.The behavior of addition is stream concatenation. In our template, when we write:"Very well" + "Perfectly" + "Well" + "Exceptionally well"... we are really concatenating four 1-element streams into a combined 4-element stream representing the four alternatives.The behavior of multiplication is to sequence two templates. Either template may be a stream of multiple alternatives so when we sequence them we take the "cartesian product" of both streams.
When we multiply two streams we concatenate every alternative from the first stream with every alternative from the second stream and return all possible combinations.For example, when we write:("couldn't " + "could not ") * ("resist" + "refrain from")This reduces to four combinations: "couldn't resist"+ "couldn't refrain from"+ "could not resist"+ "could not refrain from"You can actually derive this using the rules of addition and multiplication:("couldn't " + "could not ") * ("resist" + "refrain from")-- Multiplication distributes over left addition "couldn't " * ("resist" + "refrain from")+ "could not " * ("resist" + "refrain from")-- Multiplication distributes again "couldn't " * "resist"+ "couldn't " * "refrain from"+ "could not " * "resist"+ "could not " * "refrain from"-- Multiplying 1-element templates just sequences them "couldn't resist"+ "couldn't refrain from"+ "could not resist"+ "could not refrain from"Notice how if we sequence two 1-element templates"I have " * "been"... it's identical to string concatenation:"I have been"And that's it! We build the template using arithmetic and then we fold the results using random to select one template at random. That's the complete program! Or is it?WeightingActually there's one catch: our spam generator is very heavily biased towards the third major template. This is because the generator weights all alternatives equally, but each major template has a different number of alternatives:1st template: 2304 alternatives2nd template: 16 alternatives3rd template: 829440 alternatives4th template: 12 alternativesAs a result we're 360 times more likely to get the 3rd template than the next most common template (the 1st one).
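The same alternation/sequencing algebra can be modeled with plain lists, independently of the turtle library. This is an illustrative sketch of my own (a Template is just its list of alternatives), not turtle's actual implementation:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Toy model of the template algebra: a template is its list of alternatives.
newtype Template = Template [String] deriving (Eq, Show)

-- A string literal is a 1-element stream of alternatives.
instance IsString Template where
  fromString s = Template [s]

instance Num Template where
  -- (+) concatenates alternative sets
  Template xs + Template ys = Template (xs ++ ys)
  -- (*) sequences templates: cartesian product of alternatives
  Template xs * Template ys = Template [x ++ y | x <- xs, y <- ys]
  -- an integer literal n is n copies of the empty template (for weighting)
  fromInteger n = Template (replicate (fromInteger n) "")
  abs    = id
  signum = id
  negate = error "Template has no negation"

alts :: Template -> [String]
alts (Template xs) = xs

main :: IO ()
main = mapM_ putStrLn
  (alts (("couldn't " + "could not ") * ("resist" + "refrain from")))
```

Here `2 * t` behaves exactly as described in the text: `alts (2 * ("I have" + "I've"))` lists each alternative twice, which is how integer literals act as weights.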
How can we weight each template to undo this bias?The answer is simple: we can weight each template by using multiplication, scaling each template by the appropriate numeric factor.In this case, the weights we will apply are:1st template: Increase frequency by 360x2nd template: Increase frequency by 51840x3rd template: Keep frequency the same (1x)4th template: Increase frequency by 69120xHere's the implementation:spam = -- 1st major template 360 * "" * ("I have" + "I've") * " been " * ("surfing" + "browsing") * " online more than " * ("three" + "3" + "2" + "4") * " hours today, yet I never found any interesting article like yours. " * ("It's" + "It is") * " pretty worth enough for me. " * ("In my opinion" + "Personally" + "In my view") * ", if all " * ("webmasters" + "site owners" + "website owners" + "web owners") * " and bloggers made good content as you did, the " * ("internet" + "net" + "web") * " will be " * ("much more" + "a lot more") * " useful than ever before." -- 2nd major template + 51840 * " I " * ("couldn't" + "could not") * " " * ("resist" + "refrain from") * " commenting. " * ("Very well" + "Perfectly" + "Well" + "Exceptionally well") * " written!" -- 3rd major template + 1 * " " * ("I will" + "I'll") * " " * ("right away" + "immediately") * " " * ("take hold of" + "grab" + "clutch" + "grasp" + "seize" + "snatch") * " your " * ("rss" + "rss feed") * " as I " * ("can not" + "can't") * " " * ("in finding" + "find" + "to find") * " your " * ("email" + "e-mail") * " subscription " * ("link" + "hyperlink") * " or " * ("newsletter" + "e-newsletter") * " service. Do " * ("you have" + "you've") * " any? " * ("Please" + "Kindly") * " " * ("allow" + "permit" + "let") * " me " * ("realize" + "recognize" + "understand" + "recognise" + "know") * " " * ("so that" + "in order that") * " I " * ("may just" + "may" + "could") * " subscribe. Thanks."
-- 4th major template + 69120 * " " * ("It is" + "It's") * " " * ("appropriate" + "perfect" + "the best") * " time to make some plans for the future and " * ("it is" + "it's") * " time to be happy."Now this produces a fairer distribution between the four major alternatives:$ ./spamJust "I have been surfing online more than three hours today, yet I never found any interesting article like yours. It's pretty worth enough for me. In my view, if all web owners and bloggers made good content as you did, the internet will be a lot more useful than ever before."$ ./spamJust " It's the best time to make some plans for the future and it's time to be happy."$ ./spamJust " I will right away clutch your rss feed as I can't in finding your e-mail subscription link or newsletter service. Do you have any? Kindly let me understand so that I may just subscribe. Thanks."$ ./spamJust " I could not refrain from commenting. Exceptionally well written!"Remember how we said that Shell Text implements the Num type class in order to get addition and multiplication? Well, you can also use the same Num class to overload integer literals. Any time the compiler sees an integer literal where it expects a Shell Text it will replace that integer with a stream of empty strings whose length is the given integer.For example, if you write the number 3, it's equivalent to:-- Definition of 33 = 1 + 1 + 1-- 1 = ""3 = "" + "" + ""So if you write:3 * "some string"... that expands out to:("" + "" + "") * "some string"... and multiplication distributes to give us:("" * "some string") + ("" * "some string") + ("" * "some string")... 
which reduces to three copies of "some string":"some string" + "some string" + "some string"This trick works even when multiplying a number by a template with multiple alternatives:2 * ("I have" + "I've")-- 2 = 1 + 1= (1 + 1) * ("I have" + "I've")-- 1 = ""= ("" + "") * ("I have" + "I've")-- Multiplication distributes= ("" * "I have") + ("" * "I've") + ("" * "I have") + ("" * "I've")-- Simplify= "I have" + "I've" + "I have" + "I've"ArithmeticIn fact, Shell Text obeys all sorts of arithmetic laws:-- `0` is the identity of addition0 + x = xx + 0 = x-- Addition is associative(x + y) + z = x + (y + z)-- `1` is the identity of multiplication1 * x = xx * 1 = x-- Multiplication is associative(x * y) * z = x * (y * z)-- Multiplication right-distributes over addition(x + y) * z = (x * z) + (y * z)-- 1-element streams left-distribute over addition"string" * (x + y) = ("string" * x) + ("string" * y)I'm not sure what the mathematical name is for this sort of structure. I usually call this a "semiring", but that's technically not correct because in a semiring we expect addition to commute, but here it does not because Shell Text preserves the ordering of results. For the case of selecting a random element ordering does not matter, but there are other operations we can perform on these streams that are order-dependent.In fact, the laws of arithmetic enforce that you weigh all fine-grained alternatives equally, regardless of the frequencies of the top-level coarse-grained alternatives.
If you weighted things based on the relative frequencies of top-level alternatives you would get very inconsistent behavior.For example, if you tried to be more "fair" for outer alternatives, then addition stops being associative, meaning that these templates would no longer behave the same:{x|{y|z}}{{x|y}|z}{x|y|z}ConclusionThe Haskell template generator was so concise for two main reasons:We embedded the template directly within Haskell, so we skipped having to parse the templateWe reuse modular and highly generic components (like random, foldIO, (+), and (*) and numeric literals) instead of writing our own custom codeAlso, our program is easily modifiable. For example, if we want to collect all the templates, we just replace random with vector:import Control.Foldl (vector)import Data.Vector (Vector)main = do x <- foldIO spam vector print (x :: Vector Text)That efficiently builds a vector in place using mutation that stores all the results, then purifies the result.However, these sorts of tricks come at a cost. Most of the awesome tricks like this are not part of the standard library and instead exist in libraries, making them significantly harder to discover. Worse, you're never really "done" learning the Haskell language. The library ecosystem is a really deep rabbit hole full of jewels and at some point you just have to pick a point to just stop digging through libraries and build something useful ...... or useless, like a comment spam generator.Gabriel Gonzaleztag:blogger.com,1999:blog-1777990983847811806.post-7420170771118940066Wed, 06 May 2015 14:11:00 +0000Announcing js-jquery Haskell Library
http://neilmitchell.blogspot.com/2015/05/announcing-js-jquery-haskell-library.html
Neil Mitchelltag:blogger.com,1999:blog-7094652.post-8178719252979620784Wed, 06 May 2015 10:28:00 +0000Haskell Web Engineer at Front Row Education (Full-time)
http://functionaljobs.com/jobs/8823-haskell-web-engineer-at-front-row-education
urn:uuid:c918b967-75a7-3f96-89a4-ebb615b26053Tue, 05 May 2015 23:30:26 +0000A concrete piece of evidence for incompleteness
https://theorylunch.wordpress.com/2015/05/04/a-concrete-piece-of-evidence-for-incompleteness/
On Thursday, the 25th of March 2015, Venanzio Capretta gave a Theory Lunch talk about Goodstein’s theorem. Later, on the 9th of March, Wolfgang Jeltsch talked about ordinal numbers, which are at the base of Goodstein’s proof. Here, I am writing down a small recollection of their arguments.
Given a base $b \geq 2$, consider the base-$b$ writing of the nonnegative integer $n$,
$$n = a_k b^k + a_{k-1} b^{k-1} + \dots + a_1 b + a_0,$$
where each $a_i$ is an integer between $0$ and $b - 1$. The Cantor base-$b$ writing of $n$ is obtained by iteratively applying the base-$b$ writing to the exponents as well, until the only values appearing are integers between $0$ and $b$. For example, for and , we have
and also
Given a nonnegative integer $n$, consider the Goodstein sequence $(g_k)_{k \geq 0}$ defined by putting $g_0 = n$, and by constructing $g_{k+1}$ from $g_k$ as follows:
Take the Cantor base-$(k+2)$ representation of $g_k$.
Convert each $k+2$ into $k+3$, getting a new number.
If the value obtained at the previous point is positive, then subtract $1$ from it.
(This is called the woodworm’s trick.)
Goodstein’s theorem. Whatever the initial value $n$, the Goodstein sequence ultimately reaches the value $0$ in finitely many steps.
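The sequence can be computed directly. The following is an illustration of my own, not code from the talk; `hbump b n` writes `n` in hereditary (Cantor) base `b`, replaces every occurrence of `b` by `b+1`, and evaluates the result:

```haskell
-- Goodstein sequence sketch (my own illustration).
hbump :: Integer -> Integer -> Integer
hbump _ 0 = 0
hbump b n = go 0 n
  where
    -- process the base-b digits of m, hereditarily bumping each exponent e
    go _ 0 = 0
    go e m =
      let (q, d) = m `divMod` b
      in d * (b + 1) ^ hbump b e + go (e + 1) q

-- g0 = n; g(k+1) = (g k rewritten from base k+2 to base k+3) - 1.
-- The list ends as soon as the value 0 is reached.
goodstein :: Integer -> [Integer]
goodstein = go 2
  where
    go _ 0 = [0]
    go b m = m : go (b + 1) (hbump b m - 1)

main :: IO ()
main = print (take 6 (goodstein 3))  -- prints [3,3,3,2,1,0]
```

Starting from 4 the sequence begins 4, 26, 41, ... and, despite the theorem, takes an astronomically long time to come back down to 0.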
Goodstein’s proof relies on the use of ordinal arithmetic. Recall the definition: an ordinal number is an equivalence class of well-ordered sets modulo order isomorphisms, i.e., order-preserving bijections.Observe that such an order isomorphism between well-ordered sets, if it exists, is unique: if $X$ and $Y$ are well-ordered sets, and $f, g : X \to Y$ are two distinct order isomorphisms, then either $\{x \mid f(x) < g(x)\}$ or $\{x \mid g(x) < f(x)\}$ has a minimum, which cannot correspond to any element of $Y$.
An initial interval in a well-ordered set $X$ is a subset of the form $\{y \in X \mid y < x\}$ for some $x \in X$.
Fact 1. Given any two well-ordered sets, either they are order-isomorphic, or one of them is order-isomorphic to an initial interval of the other.
In particular, every ordinal is order-isomorphic to the initial interval of the ordinals strictly smaller than itself.
All ordinal numbers can be obtained via von Neumann’s classification:
The zero ordinal is $\emptyset$, which is trivially well-ordered as it has no nonempty subsets.
A successor ordinal is an ordinal of the form $\alpha + 1 = \alpha \cup \{\alpha\}$, with every object in $\alpha$ being smaller than $\alpha$ in $\alpha + 1$.
For instance, the finite ordinal $n + 1$ can be seen as $\{0, 1, \ldots, n\}$.
A limit ordinal is a nonzero ordinal which is not a successor. Such an ordinal must be the least upper bound of the collection of all the ordinals below it.
For instance, the smallest transfinite ordinal $\omega$ is the limit of the collection of the finite ordinals.
Observe that, with this convention, each ordinal is an element of every ordinal strictly greater than itself.
Fact 2. Every set of ordinal numbers is well-ordered with respect to the relation: $\alpha < \beta$ if and only if $\alpha \in \beta$.
Operations between ordinal numbers are defined as follows (up to order isomorphisms):
$\alpha + \beta$ is a copy of $\alpha$ followed by a copy of $\beta$, with every object in the copy of $\alpha$ being strictly smaller than any object in the copy of $\beta$.
If $\alpha$ and $\beta$ are finite ordinals, then $\alpha + \beta$ has the intuitive meaning. On the other hand, $1 + \omega = \omega$, as a copy of $1$ followed by a copy of $\omega$ is order-isomorphic to $\omega$: but $\omega + 1$ is strictly larger than $\omega$, as the latter is an initial interval of the former.
$\alpha \cdot \beta$ is a stack of $\beta$ copies of $\alpha$, with each object in each layer being strictly smaller than any object of any layer above.
If $\alpha$ and $\beta$ are finite ordinals, then $\alpha \cdot \beta$ has the intuitive meaning. On the other hand, $2 \cdot \omega$ is a stack of $\omega$ copies of $2$, which is order-isomorphic to $\omega$: but $\omega \cdot 2$ is a stack of $2$ copies of $\omega$, which is order-isomorphic to $\omega + \omega$.
$\alpha^\beta$ is $1$ if $\beta = 0$, is $\alpha^\gamma \cdot \alpha$ if $\beta$ is the successor of $\gamma$, and is the least upper bound of the ordinals of the form $\alpha^\gamma$ with $\gamma < \beta$ if $\beta$ is a limit ordinal.
If $\alpha$ and $\beta$ are finite ordinals, then $\alpha^\beta$ has the intuitive meaning. On the other hand, $2^\omega$ is the least upper bound of all the ordinals of the form $2^n$ where $n$ is a finite ordinal, which is precisely $\omega$: but $\omega^2 = \omega \cdot \omega > \omega$.
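These standard examples of the failure of commutativity for transfinite ordinals can be summarized in one line:

```latex
% ordinal addition, multiplication and exponentiation are not commutative
\[
  1 + \omega = \omega < \omega + 1, \qquad
  2 \cdot \omega = \omega < \omega \cdot 2 = \omega + \omega, \qquad
  2^{\omega} = \omega < \omega^{2}.
\]
```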
Proof of Goodstein’s theorem: To each integer value $g_k$ we associate an ordinal number $o_k$ by replacing each occurrence of $k + 2$ (which, let’s not forget, is the base $g_k$ is written in) with $\omega$. For example, if , then
and (which, incidentally, equals ) so that
We notice that, in our example, the associated ordinal does not increase: why is it so? Is it just a coincidence, or is there a rule behind this?
At each step where $g_k > 0$, consider the Cantor base-$(k+2)$ writing of $g_k$. Three cases are possible:
.
Then , as , and .
and .
Then for a transfinite ordinal , and .
and .
Then for some , and is a number whose th digit in base is zero: correspondingly, the rightmost term in will be replaced by a smaller ordinal in .
It is then clear that the sequence $(o_k)$ is strictly decreasing as long as $g_k > 0$. But the collection of all ordinals not larger than $o_0$ is a well-ordered set, and every nonincreasing sequence in a well-ordered set is ultimately constant: hence, there must be a value $k$ such that $o_{k+1} = o_k$. But the only way it can be so is when $o_k = 0$: in turn, the only option for $o_k$ to be zero is that $g_k$ is zero as well. This proves the theorem.
So why is it that Goodstein’s theorem is not provable in first-order Peano arithmetic? The intuitive reason is that the exponentiations can be arbitrarily many, which requires having available all the ordinals up to
$$\varepsilon_0 = \sup \{\, \omega,\ \omega^{\omega},\ \omega^{\omega^{\omega}},\ \ldots \,\},$$
that is, $\omega$ to the $\omega$, $n$ times, for every $n$:
this, however, is impossible if induction only allows finitely many steps, as is the case for first-order Peano arithmetic. A full discussion of a counterexample, however, would greatly exceed the scope of this post.Silvio Capobiancohttp://theorylunch.wordpress.com/?p=1596Mon, 04 May 2015 16:34:41 +0000A Proof of Church Rosser in Twelf
http://jozefg.bitbucket.org/posts/2015-05-05-confluence.html
http://jozefg.bitbucket.org/posts/2015-05-05-confluence.htmlTue, 05 May 2015 00:00:00 +0000Modeling garbage collectors with Alloy: part 1
http://mainisusuallyafunction.blogspot.com/2015/05/modeling-garbage-collectors-with-alloy.html
Formal methods for software verification are usually seen as a high-cost tool that you would only use on the most critical systems, and only after extensive informal verification. The Alloy project aims to be something completely different: a lightweight tool you can use at any stage of everyday software development. With just a few lines of code, you can build a simple model to explore design issues and corner cases, even before you've started writing the implementation. You can gradually make the model more detailed as your requirements and implementation get more complex. After a system is deployed, you can keep the model around to evaluate future changes at low cost.Sounds great, doesn't it? I have only a tiny bit of prior experience with Alloy and I wanted to try it out on something more substantial. In this article we'll build a simple model of a garbage collector, visualize its behavior, and fix some problems. This is a warm-up for exploring more complex GC algorithms, which will be the subject of future articles.I won't describe the Alloy syntax in full detail, but you should be able to follow along if you have some background in programming and logic. See also the Alloy documentation and especially the book Software Abstractions: Logic, Language, and Analysis by Daniel Jackson, which is a very practical and accessible introduction to Alloy. It's a highly recommended read for any software developer.You can download Alloy as a self-contained Java executable, which can do analysis and visualization and includes an editor for Alloy code.The modelWe will start like so:open util/ordering [State]sig Object { }one sig Root extends Object { }sig State { pointers: Object -> set Object, collected: set Object,}The garbage-collected heap consists of Objects, each of which can point to any number of other Objects (including itself). 
There is a distinguished object Root which represents everything that's accessible without going through the heap, such as global variables and the function call stack. We also track which objects have already been garbage-collected. In a real implementation these would be candidates for re-use; in our model they stick around so that we can detect use-after-free.The open statement invokes a library module to provide a total ordering on States, which we will interpret as the progression of time. More on this later.RelationsIn the code that follows, it may look like Alloy has lots of different data types, overloading operators with total abandon. In fact, all these behaviors arise from an exceptionally simple data model:Every value is a relation; that is, a set of tuples of the same non-zero length.When each tuple has length 1, we can view the relation as a set. When each tuple has length 2, we can view it as a binary relation and possibly as a function. And a singleton set is viewed as a single atom or tuple.Since everything in Alloy is a relation, each operator has a single definition in terms of relations. For example, the operators . and [] are syntax for a flavor of relational join. If you think of the underlying relations as a database, then Alloy's clever syntax amounts to an object-relational mapping that is at once very simple and very powerful. Depending on context, these joins can look like field access, function calls, or data structure lookups, but they are all described by the same underlying framework.The elements of the tuples in a relation are atoms, which are indivisible and have no meaning individually. Their meaning comes entirely from the relations and properties we define. 
Ultimately, atoms all live in the same universe, but Alloy gives "warnings" when the type system implied by the sig declarations can prove that an expression is always the empty relation.Here are the relations implied by our GC model, as tuple sets along with their types:Object: {Object} = {O1, O2, ..., Om}Root: {Root} = {Root}State: {State} = {S1, S2, ..., Sn}pointers: {(State, Object, Object)}collected: {(State, Object)}first: {State} = {S1}last: {State} = {Sn}next: {(State, State)} = {(S1, S2), (S2, S3), ..., (S(n-1), Sn)}The last three relations come from the util/ordering library. Note that a sig implicitly creates some atoms.DynamicsThe live objects are everything reachable from the root:fun live(s: State): set Object { Root.*(s.pointers)}*(s.pointers) constructs the reflexive, transitive closure of the binary relation s.pointers; that is, the set of objects reachable from each object.Of course the GC is only part of a system; there's also the code that actually uses these objects, which in GC terminology is called the mutator. We can describe the action of each part as a predicate relating "before" and "after" states.pred mutate(s, t: State) { t.collected = s.collected t.pointers != s.pointers all a: Object - t.live | t.pointers[a] = s.pointers[a]}pred gc(s, t: State) { t.pointers = s.pointers t.collected = s.collected + (Object - s.live) some t.collected - s.collected}The mutator cannot collect garbage, but it can change the pointers of any live object. The GC doesn't touch the pointers, but it collects any dead object. In both cases we require that something changes in the heap.It's time to state the overall facts of our model:fact { no first.collected first.pointers = Root -> (Object - Root) all s: State - last | let t = s.next | mutate[s, t] or gc[s, t]}This says that in the initial state, no object has been collected, and every object is in the root set except Root itself. This means we don't have to model allocation as well. 
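The reachability that Alloy gets for free via the closure *(s.pointers) in the fun live above can be mirrored in ordinary code. A sketch of my own (objects as Ints, with 0 playing the role of Root):

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

type Obj  = Int
type Heap = M.Map Obj [Obj]  -- the pointers relation for one State

-- Everything reachable from Root (object 0), mirroring the
-- reflexive-transitive closure Root.*(s.pointers).
liveSet :: Heap -> S.Set Obj
liveSet h = go S.empty [0]
  where
    go seen []       = seen
    go seen (o : os)
      | o `S.member` seen = go seen os
      | otherwise = go (S.insert o seen) (M.findWithDefault [] o h ++ os)

-- The no_dangling assertion for one state: no collected object is live.
noDangling :: Heap -> S.Set Obj -> Bool
noDangling h collected = S.null (collected `S.intersection` liveSet h)

main :: IO ()
main = print (liveSet (M.fromList [(0, [1]), (1, [2]), (3, [1])]))
```

Unlike Alloy, this only checks one concrete heap; Alloy's analysis quantifies over all heaps and all mutator/GC interleavings within the given bounds.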
Each state except the last must be followed by a mutator step or a GC step.The syntax all x: e | P says that the property P must hold for every tuple x in e. Alloy supports a variety of quantifiers like this.Interacting with AlloyThe development above looks nice and tidy — I hope — but in reality, it took a fair bit of messing around to get to this point. Alloy provides a highly interactive development experience. At any time, you can visualize your model as a collection of concrete examples. Let's do that now by adding these commands:pred Show {}run Show for 5Now we select this predicate from the "Execute" menu, then click "Show". The visualizer provides many options to customise the display of each atom and relation. The config that I made for this project is "projected over State", which means you see a graph of the heap at one moment in time, with forward/back buttons to reach the other States.After clicking around a bit, you may notice some oddities:The root isn't a heap object; it represents all of the pointers that are reachable without accessing the heap. So it's meaningless for an object to point to the root. We can exclude these cases from the model easily enough:fact { all s: State | no s.pointers.Root}(This can also be done more concisely as part of the original sig.)Now we're ready to check the essential safety property of a garbage collector:assert no_dangling { all s: State | no (s.collected & s.live)}check no_dangling for 5 Object, 10 StateAnd Alloy says:Executing "Check no_dangling for 5 Object, 10 State" ... 8338 vars. 314 primary vars. 17198 clauses. 40ms. Counterexample found. Assertion is invalid. 14ms.Clicking "Counterexample" brings up the visualization:Whoops, we forgot to say that only pointers to live objects can be stored! We can fix this by modifying the mutate predicate:pred mutate(s, t: State) { t.collected = s.collected t.pointers != s.pointers all a: Object - t.live | t.pointers[a] = s.pointers[a] // new requirement! 
    all a: t.live | t.pointers[a] in s.live
}

With the result:

Executing "Check no_dangling for 5 Object, 10 State"
    8617 vars. 314 primary vars. 18207 clauses. 57ms.
    No counterexample found. Assertion may be valid. 343ms.

SAT solvers and bounded model checking

"May be" valid? Fortunately this has a specific meaning. We asked Alloy to look for counterexamples involving at most 5 objects and 10 time steps. This bounds the search for counterexamples, but it's still vastly more than we could ever check by exhaustive brute force search. (See where it says "8617 vars"? Try raising 2 to that power.) Rather, Alloy turns the bounded model into a Boolean formula, and feeds it to a SAT solver.

This all hinges on one of the weirdest things about computing in the 21st century. In complexity theory, SAT (along with many equivalents) is the prototypical "hardest problem" in NP. Why do we intentionally convert our problem into an instance of this "hardest problem"? I guess for me it illustrates a few things:

The huge gulf between worst-case complexity (the subject of classes like NP) and average or "typical" cases that we encounter in the real world. For more on this, check out Impagliazzo's "Five Worlds" paper.

The fact that real-world difficulty involves a coordination game. SAT solvers got so powerful because everyone agrees SAT is the problem to solve. Standard input formats and public competitions were a key part of the amazing progress over the past decade or two.

Of course SAT solvers aren't quite omnipotent, and Alloy can quickly get overwhelmed when you scale up the size of your model. Applicability to the real world depends on the small scope hypothesis:

If an assertion is invalid, it probably has a small counterexample.

Or equivalently:

Systems that fail on large instances almost always fail on small instances with similar properties.

This is far from a sure thing, but it already underlies a lot of approaches to software testing.
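For readers who want to poke at the idea without Alloy, the same bounded-exhaustive style can be mimicked in miniature in Haskell (my own illustration, not the article's model): enumerate every pointer graph over three objects, take one collection step, and check that nothing reachable from the root is ever collected. The scope is tiny, but the check is exhaustive within it, just as Alloy's is:

```haskell
-- A toy "small scope" check: every pointer graph on three objects,
-- one GC step, and the no_dangling property. Object 0 plays the root.
import Data.List (subsequences)
import qualified Data.Set as Set

type Object = Int

objects :: [Object]
objects = [0 .. 2]

-- Every possible edge set over `objects` (2^9 = 512 heaps).
allHeaps :: [[(Object, Object)]]
allHeaps = subsequences [(a, b) | a <- objects, b <- objects]

-- Reflexive-transitive closure from a frontier, like Root.*(s.pointers).
reachable :: [(Object, Object)] -> Set.Set Object -> Set.Set Object
reachable edges frontier
  | Set.null new = frontier
  | otherwise    = reachable edges (Set.union frontier new)
  where
    step = Set.fromList [b | (a, b) <- edges, a `Set.member` frontier]
    new  = step `Set.difference` frontier

-- One GC step collects exactly the unreachable objects.
gcStep :: [(Object, Object)] -> Set.Set Object
gcStep edges =
  Set.fromList objects `Set.difference` reachable edges (Set.singleton 0)

-- no_dangling: nothing collected is live.
noDangling :: [(Object, Object)] -> Bool
noDangling edges =
  Set.null (gcStep edges `Set.intersection` reachable edges (Set.singleton 0))

main :: IO ()
main = print (all noDangling allHeaps)  -- True
```

As in the article's conclusion, the property holds here almost by construction, because collection is defined as the complement of reachability.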
With Alloy we have the certainty of proof within the size bounds, so we don't have to resort to massive scale to find rare bugs. It's difficult (but not impossible!) to imagine a GC algorithm that absolutely cannot fail on fewer than 6 nodes, but is buggy for larger heaps. Implementations will often fall over at some arbitrary resource limit, but algorithms and models are more abstract.

Conclusion

It's not surprising that our correctness property

all s: State | no (s.collected & s.live)

holds, since it's practically a restatement of the garbage collection "algorithm":

t.collected = s.collected + (Object - s.live)

Because reachability is built into Alloy, via transitive closure, the simplest model of a garbage collector does not really describe an implementation. In the next article we'll look at incremental garbage collection, which breaks the reachability search into small units and allows the mutator to run in-between. This is highly desirable for interactive or real-time apps; it also complicates the algorithm quite a bit. We'll use Alloy to uncover some of these complications.

In the meantime, you can play around with the simple GC model and ask Alloy to visualize any scenario you like. For example, we can look at runs where the final state includes at least 5 pointers, and at least one collected object:

pred Show {
    #(last.pointers) >= 5
    some last.collected
}
run Show for 5

Thanks for reading! You can find the code in a GitHub repository which I'll update if/when we get around to modeling more complex GCs.keegantag:blogger.com,1999:blog-1563623855220143059.post-3353166977116596792Sun, 03 May 2015 21:21:00 +0000GUI - Release of the threepenny-gui library, version 0.6.0.1
http://apfelmus.nfshost.com/blog/2015/05/03-threepenny-gui-0-6.html
http://apfelmus.nfshost.com/blog/2015/05/03-threepenny-gui-0-6.htmlSun, 03 May 2015 15:33:45 +0000Smarter validation
http://feedproxy.google.com/~r/RomanCheplyaka/~3/halAqVxPNMU/2015-05-02-smarter-validation
http://ro-che.info//articles/2015-05-02-smarter-validation.htmlSat, 02 May 2015 20:00:00 +0000Width-adaptive XMonad layout
http://feedproxy.google.com/~r/ezyang/~3/HgGh-F_b7PY/
My usual laptop setup is a wide monitor, with my laptop screen as a secondary monitor. For a long time, I had two XMonad layouts: one full screen layout for my laptop monitor (I use big fonts to go easy on the eyes) and a two-column layout when I'm on the big screen.
But I had an irritating problem: if I switched a workspace from the small screen to the big screen, XMonad would still be using the full screen layout, and I would have to Alt-Tab my way into the two column layout. To add insult to injury, if I moved it back, I'd have to Alt-Tab once again.
After badgering the fine folks on #xmonad, I finally wrote an extension to automatically switch layout based on screen size! Here it is:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
-----------------------------------------------------------------------------
-- |
-- Module : XMonad.Layout.PerScreen
-- Copyright : (c) Edward Z. Yang
-- License : BSD-style (see LICENSE)
--
-- Maintainer :
-- Stability : unstable
-- Portability : unportable
--
-- Configure layouts based on the width of your screen; use your
-- favorite multi-column layout for wide screens and a full-screen
-- layout for small ones.
-----------------------------------------------------------------------------
module XMonad.Layout.PerScreen
( -- * Usage
-- $usage
PerScreen,
ifWider
) where
import XMonad
import qualified XMonad.StackSet as W
import Data.Maybe (fromMaybe)
-- $usage
-- You can use this module by importing it into your ~\/.xmonad\/xmonad.hs file:
--
-- > import XMonad.Layout.PerScreen
--
-- and modifying your layoutHook as follows (for example):
--
-- > layoutHook = ifWider 1280 (Tall 1 (3/100) (1/2) ||| Full) Full
--
-- Replace any of the layouts with any arbitrarily complicated layout.
-- ifWider can also be used inside other layout combinators.
ifWider :: (LayoutClass l1 a, LayoutClass l2 a)
        => Dimension -- ^ target screen width
        -> (l1 a)    -- ^ layout to use when the screen is wide enough
        -> (l2 a)    -- ^ layout to use otherwise
        -> PerScreen l1 l2 a
ifWider w = PerScreen w False
data PerScreen l1 l2 a = PerScreen Dimension Bool (l1 a) (l2 a) deriving (Read, Show)
-- | Construct new PerScreen values with possibly modified layouts.
mkNewPerScreenT :: PerScreen l1 l2 a -> Maybe (l1 a) -> PerScreen l1 l2 a
mkNewPerScreenT (PerScreen w _ lt lf) mlt' =
    (\lt' -> PerScreen w True lt' lf) $ fromMaybe lt mlt'

mkNewPerScreenF :: PerScreen l1 l2 a -> Maybe (l2 a) -> PerScreen l1 l2 a
mkNewPerScreenF (PerScreen w _ lt lf) mlf' =
    (\lf' -> PerScreen w False lt lf') $ fromMaybe lf mlf'
instance (LayoutClass l1 a, LayoutClass l2 a, Show a) => LayoutClass (PerScreen l1 l2) a where
    runLayout (W.Workspace i p@(PerScreen w _ lt lf) ms) r
        | rect_width r > w = do
            (wrs, mlt') <- runLayout (W.Workspace i lt ms) r
            return (wrs, Just $ mkNewPerScreenT p mlt')
        | otherwise = do
            (wrs, mlt') <- runLayout (W.Workspace i lf ms) r
            return (wrs, Just $ mkNewPerScreenF p mlt')
    handleMessage (PerScreen w bool lt lf) m
        | bool      = handleMessage lt m >>= maybe (return Nothing) (\nt -> return . Just $ PerScreen w bool nt lf)
        | otherwise = handleMessage lf m >>= maybe (return Nothing) (\nf -> return . Just $ PerScreen w bool lt nf)
    description (PerScreen _ True l1 _) = description l1
    description (PerScreen _ _ _ l2) = description l2
I'm going to submit it to xmonad-contrib, if I can figure out their darn patch submission process...Edward Z. Yanghttp://blog.ezyang.com/?p=9384Sat, 02 May 2015 04:36:55 +0000Bracket Abstraction: The Smallest PL You've Ever Seen
http://jozefg.bitbucket.org/posts/2015-05-01-brackets.html
http://jozefg.bitbucket.org/posts/2015-05-01-brackets.htmlFri, 01 May 2015 00:00:00 +0000April 2015 1HaskellADay Problems and Solutions
http://logicaltypes.blogspot.com/2015/04/april-2015-1haskelladay-problems-and.html
April 2015
April 30th, 2015: "SHOW ME DA MONAY!" http://lpaste.net/3352992723589136384 for today's #haskell problem Simple? Sure! Solution? Yes. http://lpaste.net/7331259237240143872
April 29th, 2015: We take stock of the Stochastic Oscillator http://lpaste.net/8447434917217828864 for today's #haskell problem #trading We are so partially stoched for a partial solution for the Stochastic Oscillator http://lpaste.net/4307607333212520448
April 28th, 2015: Today's #haskell puzzle as a ken-ken solver http://lpaste.net/6211501623257071616 a solution (beyond my ... ken) is defined at http://lpaste.net/929006498481176576
April 27th, 2015: Rainy days and Mondays do not stop the mail, nor today's #haskell problem! http://lpaste.net/6468251516921708544 The solution posted at http://lpaste.net/6973841984536444928 … shows us view-patterns and how to spell the word 'intercalate'.
April 24th, 2015: Bidirectionally (map) yours! for today's #haskell problem http://lpaste.net/1645129197724631040 A solution to this problem is posted at http://lpaste.net/540860373977268224
April 23rd, 2015: Today's #haskell problem looks impossible! http://lpaste.net/6861042906254278656 So this looks like this is a job for ... KIM POSSIBLE! YAY! @sheshanaag offers a solution at http://lpaste.net/131309 .
April 22nd, 2015: "I need tea." #BritishProblems "I need clean data" #EveryonesPipeDream "Deletia" today's #haskell problem http://lpaste.net/2343021306984792064 Deletia solution? Solution deleted? Here ya go! http://lpaste.net/5973874852434542592
April 21st, 2015: In which we learn about Tag-categories, and then Levenshtein distances between them http://lpaste.net/2118427670256549888 for today's #haskell problem Okay, wait: is it a group of categories or a category of groups? me confused! A solution to today's #haskell at http://lpaste.net/8855539857825464320
April 20th, 2015: Today we can't see the forest for the trees, so let's change that http://lpaste.net/3949027037724803072 A solution to our first day in the tag-forest http://lpaste.net/4634897048192155648 ... make sure you're marking your trail with breadcrumbs!
April 17th, 2015: No. Wait. You wanted line breaks with that, too? Well, why didn't you say so in the first place? http://lpaste.net/8638783844922687488 Have some curry with a line-breaky solution at http://lpaste.net/8752969226978852864
April 16th, 2015: "more then." #okaythen Sry, not sry, but here's today's #haskell problem: http://lpaste.net/6680706931826360320 I can't even. lolz. rofl. lmao. whatevs. And a big-ole-blob-o-words is given as the solution http://lpaste.net/2810223588836114432 for today's #haskell problem. It ain't pretty, but... there it is
April 15th, 2015: Poseidon's trident or Andrew's Pitchfork analysis, if you prefer, for today's #haskell problem http://lpaste.net/5072355173985157120
April 14th, 2015: Refining the SMA-trend-ride http://lpaste.net/3856617311658049536 for today's #haskell problem. Trending and throttling doesn't ... quite get us there, but ... solution: http://lpaste.net/9223292936442085376
April 13th, 2015: In today's #haskell problem we learn zombies are comonadic, and like eating SMA-brains. http://lpaste.net/8924989388807471104 Yeah. That. Hold the zombies, please! (Or: when $40k net profit is not enough by half!) http://lpaste.net/955577567060951040
April 10th, 2015: Today's #haskell problem delivered with much GRAVITAS, boils down to: don't be a dumb@$$ when investing http://lpaste.net/5255378926062010368 #KeepinItReal The SMA-advisor is REALLY chatty, but how good is it? TBD, but here's a very simple advisor: http://lpaste.net/109712 Backtesting for this strategy is posted at http://lpaste.net/109687 (or: how a not so good buy/sell strategy give you not so good results!)
April 9th, 2015: A bit of analysis of historical stock data http://lpaste.net/6960188425236381696 for today's #haskell problem A solution to the SMA-analyses part is posted at http://lpaste.net/3427480809555099648
April 8th, 2015: MOAR! MOAR! You clamor for MOAR real-world #haskell problems, and how can I say no? http://lpaste.net/5198207211930648576 Downloading stock screens Hint: get the screens from a web service; look at, e.g.: https://code.google.com/p/yahoo-finance-managed/wiki/YahooFinanceAPIs A 'foldrM'-solution to this problem is posted at http://lpaste.net/2729747257602605056
April 7th, 2015: Looking at a bit of real-world #haskell for today's stock (kinda-)screen-scraping problem at http://lpaste.net/5737110678548774912 Hint: perhaps you'd like to solve this problem using tagsoup? https://hackage.haskell.org/package/tagsoup *GASP* You mean ... it actually ... works? http://lpaste.net/1209131365107236864 A MonadWriter-y tagsoup-y Monoidial-MultiMap-y solution
April 6th, 2015: What do three men teaching all of high school make, beside today's #haskell problem? http://lpaste.net/667230964799242240 Tired men, of course! Thanks, George Boole! Three Men and a High School, SOLVED! http://lpaste.net/7942804585247145984
April 3rd, 2015: reverseR that list like a Viking! Rrrrr! for today's problem http://lpaste.net/8513906085948555264 … #haskell Totes cheated to get you the solution http://lpaste.net/1880031563417124864 used a library that I wrote, so, like, yeah, totes cheated! ;)
April 2nd, 2015: We're breaking new ground for today's #haskell problem: let's reverse lists... relationally. And tag-type some values http://lpaste.net/389291192849793024 After several fits and starts @geophf learns how to reverse a list... relationally http://lpaste.net/7875722904095162368 and can count to the nr 5, as well
April 1st, 2015: Take a drink of today's #haskell problem: love potion nr9 http://lpaste.net/435384893539614720 because, after all: all we need is love, la-di-dah-di-da! A solution can be found au shaque d'amour posted at http://lpaste.net/6859866252718899200geophftag:blogger.com,1999:blog-4650294074444534066.post-880105444986279982Thu, 30 Apr 2015 17:21:00 +0000Announcing hashabler: like hashable only more so
http://brandon.si/code/announcing-hashabler-like-hashable/
I’ve just released the first version of a haskell library for principled,
cross-platform & extensible hashing of types, which includes an implementation
of the FNV-1a algorithm. It is available on hackage,
and can be installed with:
cabal install hashabler
hashabler is a rewrite of the hashable
library by Milan Straka and Johan Tibell, having the following goals:
Extensibility; it should be easy to implement a new hashing algorithm on any
Hashable type, for instance if one needed more hash bits
Honest hashing of values, and principled hashing of algebraic data types (see
e.g. #30)
Cross-platform consistent hash values, with a versioning guarantee. Where
possible we ensure morally identical data hashes to identical values
regardless of processor word size and endianness.
Make implementing identical hash routines in other languages as painless as
possible. We provide an implementation of a simple hashing algorithm (FNV-1a)
and make an effort to define Hashable instances in a way that is well-documented
and sensible, so that e.g. one can (hopefully) easily implement a string
hashing routine in JavaScript that will match the way we hash strings here.
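For concreteness, here is what the FNV-1a core looks like over raw bytes. This is my own sketch of the published algorithm, not hashabler's actual source, and it ignores the structural information the library's instances mix in:

```haskell
-- 32-bit FNV-1a: start from the offset basis, then for each byte
-- xor it into the hash and multiply by the FNV prime (Word32 wraps mod 2^32).
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word32, Word8)

fnv1a32 :: [Word8] -> Word32
fnv1a32 = foldl' step 2166136261            -- FNV-32 offset basis
  where
    step h b = (h `xor` fromIntegral b) * 16777619  -- FNV-32 prime

-- Hash an ASCII string by its byte values (illustrative helper).
hashAscii :: String -> Word32
hashAscii = fnv1a32 . map (fromIntegral . ord)

main :: IO ()
main = do
  print (hashAscii "")   -- 2166136261 (just the offset basis)
  print (hashAscii "a")  -- 3826002220 == 0xe40c292c, a standard test vector
```

Matching these published test vectors in another language is exactly the cross-language consistency the goals above describe.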
Motivation
I started writing a fast concurrent bloom filter variant, but found none of the
existing libraries fit my needs. In particular hashable was deficient in a
number of ways:
The number of hash bits my data structure requires can vary based on user
parameters, and possibly be more than the 64-bits supported by hashable
Users might like to serialize their bloomfilter
and store it, pass it to other machines, or work with it in a different
language, so we need
hash values that are consistent across platforms
some guarantee of consistency across library versions
I was also very concerned about the general approach taken for algebraic types,
which results in collisions, the use of “hashing” numeric values to themselves,
dubious combining functions, etc. It wasn’t at all clear to me how to ensure my
data structure wouldn’t be broken if I used hashable. See below for a very
brief investigation into hash goodness of the two libraries.
There isn’t interest in supporting my use case or addressing these issues in
hashable (see e.g. #73, #30, and #74)
and apparently hashable is working in practice for people, but maybe this new
package will be useful for some other folks.
Hash goodness of hashable and hashabler, briefly
Hashing-based data structures assume some “goodness” of the underlying hash
function, and may depend on the goodness of the hash function in ways that
aren’t always clear or well-understood. “Goodness” also seems to be somewhat
subjective, but can be expressed statistically in terms of bit-independence
tests, and avalanche properties, etc.; various things that e.g.
smhasher looks at.
I thought for fun I’d visualize some distributions, as that’s easier for my
puny brain to understand than statistics. We visualize 32-bit hashes by
quantizing by 64x64 and mapping that to a pixel following a hilbert curve to
maintain locality of hash values. Then when multiple hash values fall within
the same 64x64 pixel, we darken the pixel, and finally mark it red if we can’t
go any further to indicate clipping.
It’s easy to cherry-pick inputs that will result in some bad behavior by
hashable, but below I’ve tried to show some fairly realistic examples of
strange or less-good distributions in hashable. I haven’t analysed these
at all. Images are cropped ¼ size, but are representative of the whole 32-bit
range.
First, here’s a hash of all [Ordering] of size 10 (~59K distinct values):
Hashabler:
Hashable:
Next here’s the hash of one million (Word8,Word8,Word8) (having a domain ~ 16 mil):
Hashabler:
Hashable:
I saw no difference when hashing English words, which is good news as that’s
probably a very common use-case.
Please help
If you could test the library on a big endian machine and let me know how it
goes, that would be great. See
here.
You can also check out the TODOs scattered throughout the code and send
pull requests. I may not be able to get to them until June, but will be very
grateful!
P.S. hire me
I’m always open to interesting work or just hearing about how companies are
using haskell. Feel free to send me an email at brandon.m.simmons@gmail.comhttp://brandon.si/code/announcing-hashabler-like-hashableThu, 30 Apr 2015 15:03:43 +0000Smarter conditionals with dependent types: a quick case study
http://lambda.jstolarek.com/2015/04/smarter-conditionals-with-dependent-types-a-quick-case-study/
Find the type error in the following Haskell expression:
if null xs then tail xs else xs
You can’t, of course: this program is obviously nonsense unless you’re a typechecker. The trouble is that only certain computations make sense if the null xs test is True, whilst others make sense if it is False. However, as far as the type system is concerned, the type of the then branch is the type of the else branch is the type of the entire conditional. Statically, the test is irrelevant. Which is odd, because if the test really were irrelevant, we wouldn’t do it. Of course, tail [] doesn’t go wrong – well-typed programs don’t go wrong – so we’d better pick a different word for the way they do go.
The above quote is an opening paragraph of Conor McBride’s “Epigram: Practical Programming with Dependent Types” paper. As always, Conor makes a good point – this test is completely irrelevant for the typechecker although it is very relevant at run time. Clearly the type system fails to accurately approximate runtime behaviour of our program. In this short post I will show how to fix this in Haskell using dependent types.
The problem is that the types used in this short program carry no information about the manipulated data. This is true both for Bool returned by null xs, which contains no evidence of the result, as well as lists, that store no information about their length. As some of you probably realize the latter is easily fixed by using vectors, ie. length-indexed lists:
data N = Z | S N -- natural numbers
data Vec a (n :: N) where
    Nil  :: Vec a Z
    Cons :: a -> Vec a n -> Vec a (S n)
The type of a vector encodes its length, which means that the type checker can now be aware of whether it is dealing with an empty vector. Now let’s write null and tail functions that work on vectors:
vecNull :: Vec a n -> Bool
vecNull Nil = True
vecNull (Cons _ _) = False
vecTail :: Vec a (S n) -> Vec a n
vecTail (Cons _ tl) = tl
vecNull is nothing surprising – it returns True for an empty vector and False for a non-empty one. But the tail function for vectors differs from its implementation for lists. tail from Haskell’s standard prelude is not defined for an empty list, so calling tail [] results in an exception (that would be the case in Conor’s example). But the type signature of vecTail requires that the input vector is non-empty. As a result we can rule out the Nil case. That also means that Conor’s example will no longer typecheck [1]. But how can we write a correct version of this example, one that removes the first element of a vector only when it is non-empty? Here’s an attempt:
shorten :: Vec a n -> Vec a m
shorten xs = case vecNull xs of
    True  -> xs
    False -> vecTail xs
That however won’t compile: now that we written type-safe tail function typechecker requires a proof that vector passed to it as an argument is non empty. The weak link in this code is the vecNull function. It tests whether a vector is empty but delivers no type-level proof of the result. In other words we need:
vecNull' :: Vec a n -> IsNull n
ie. a function whose result type carries information about the length of the vector. This data type will have a runtime representation isomorphic to Bool, ie. it will be an enumeration with two constructors, and the type index will correspond to the length of a vector:
data IsNull (n :: N) where
    Null    :: IsNull Z
    NotNull :: IsNull (S n)
Null represents empty vectors, NotNull represents non-empty ones. We can now implement a version of vecNull that carries proof of the result at the type level:
vecNull' :: Vec a n -> IsNull n
vecNull' Nil        = Null
vecNull' (Cons _ _) = NotNull
The type signature of vecNull' says that the return type must have the same index as the input vector. Pattern matching on the Nil case provides the type checker with the information that the n index of Vec is Z. This means that the return value in this case must be Null – the NotNull constructor is indexed with S and that obviously does not match Z. Similarly in the Cons case the return value must be NotNull. However, replacing vecNull in the definition of shorten with our new vecNull' will again result in a type error. The problem comes from the type signature of shorten:
shorten :: Vec a n -> Vec a m
By indexing input and output vectors with different length indices – n and m – we tell the typechecker that these are completely unrelated. But that is not true! Knowing the input length n we know exactly what the result should be: if the input vector is empty the result vector is also empty; if the input vector is not empty it should be shortened by one. Since we need to express this at the type level we will use a type family:
type family Pred (n :: N) :: N where
    Pred Z     = Z
    Pred (S n) = n
(In a fully-fledged dependently-typed language we would write a normal function and then apply it at the type level.) Now we can finally write:
shorten :: Vec a n -> Vec a (Pred n)
shorten xs = case vecNull' xs of
    Null    -> xs
    NotNull -> vecTail xs
This definition should not go wrong. Trying to swap the expressions in the branches will result in a type error.
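Collected in one place, the snippets above compile as a single file given a few GHC extensions; the toList helper and main are my additions for demonstration, not part of the post:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

-- The post's definitions, assembled so they can be loaded and run.
data N = Z | S N  -- natural numbers

data Vec a (n :: N) where
  Nil  :: Vec a 'Z
  Cons :: a -> Vec a n -> Vec a ('S n)

data IsNull (n :: N) where
  Null    :: IsNull 'Z
  NotNull :: IsNull ('S n)

vecNull' :: Vec a n -> IsNull n
vecNull' Nil        = Null
vecNull' (Cons _ _) = NotNull

vecTail :: Vec a ('S n) -> Vec a n
vecTail (Cons _ tl) = tl

type family Pred (n :: N) :: N where
  Pred 'Z     = 'Z
  Pred ('S n) = n

shorten :: Vec a n -> Vec a (Pred n)
shorten xs = case vecNull' xs of
  Null    -> xs
  NotNull -> vecTail xs

-- Forget the length index so we can inspect results at run time.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

main :: IO ()
main = do
  print (toList (shorten (Cons (1 :: Int) (Cons 2 Nil))))  -- [2]
  print (toList (shorten (Nil :: Vec Int 'Z)))             -- []
```

Swapping the two branches of shorten makes this file fail to compile, which is exactly the point of the post.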
Assuming we don’t abuse Haskell’s unsoundness as logic, eg. by using undefined.Jan Stolarekhttp://lambda.jstolarek.com/?p=1630Thu, 30 Apr 2015 14:42:24 +0000Diagrams + Cairo + Gtk + Mouse picking, Reloaded
http://projects.haskell.org/diagrams/blog/2015-04-30-GTK-coordinates.html
diagrams-discusshttp://projects.haskell.org/diagrams/blog/2015-04-30-GTK-coordinates.htmlThu, 30 Apr 2015 00:00:00 +0000Compiling With CPS
http://jozefg.bitbucket.org/posts/2015-04-30-cps.html
http://jozefg.bitbucket.org/posts/2015-04-30-cps.htmlThu, 30 Apr 2015 00:00:00 +0000Domains, Sets, Traversals and Applicatives
http://comonad.com/reader/2015/domains-sets-traversals-and-applicatives/
Last time I looked at free monoids, and noticed that in Haskell lists don't really cut it. This is a consequence of laziness and general recursion. To model a language with those properties, one needs to use domains and monotone, continuous maps, rather than sets and total functions (a call-by-value language with general recursion would use domains and strict maps instead).
This time I'd like to talk about some other examples of this, and point out how doing so can (perhaps) resolve some disagreements that people have about the specific cases.
The first example is not one that I came up with: induction. It's sometimes said that Haskell does not have inductive types at all, or that we cannot reason about functions on its data types by induction. However, I think this is (technically) inaccurate. What's true is that we cannot simply pretend that our types are sets and use the induction principles for sets to reason about Haskell programs. Instead, one has to figure out what inductive domains would be, and what their proof principles are.
Fortunately, there are some papers about doing this. The most recent (that I'm aware of) is Generic Fibrational Induction. I won't get too into the details, but it shows how one can talk about induction in a general setting, where one has a category that roughly corresponds to the type theory/programming language, and a second category of proofs that is 'indexed' by the first category's objects. Importantly, it is not required that the second category is somehow 'part of' the type theory being reasoned about, as is often the case with dependent types, although that is also a special case of their construction.
One of the results of the paper is that this framework can be used to talk about induction principles for types that don't make sense as sets. Specifically:
newtype Hyp = Hyp ((Hyp -> Int) -> Int)
the type of "hyperfunctions". Instead of interpreting this type as a set, where it would effectively require a set that is isomorphic to the power set of its power set, they interpret it in the category of domains and strict functions mentioned earlier. They then construct the proof category in a similar way as one would for sets, except instead of talking about predicates as subsets, we talk about sub-domains instead. Once this is done, their framework gives a notion of induction for this type.
This example is suitable for ML (and suchlike), due to the strict functions, and sort of breaks the idea that we can really get away with only thinking about sets, even there. Sets are good enough for some simple examples (like flat domains where we don't care about ⊥), but in general we have to generalize induction itself to apply to all types in the 'good' language.
While I haven't worked out how the generic induction would work out for Haskell, I have little doubt that it would, because ML actually contains all of Haskell's data types (and vice versa). So the fact that the framework gives meaning to induction for ML implies that it does so for Haskell. If one wants to know what induction for Haskell's 'lazy naturals' looks like, they can study the ML analogue of:
data LNat = Zero | Succ (() -> LNat)
because function spaces lift their codomain, and make things 'lazy'.
----
The other example I'd like to talk about hearkens back to the previous article. I explained how foldMap is the proper fundamental method of the Foldable class, because it can be massaged to look like:
foldMap :: Foldable f => f a -> FreeMonoid a
and lists are not the free monoid, because they do not work properly for various infinite cases.
I also mentioned that foldMap looks a lot like traverse:
foldMap :: (Foldable t , Monoid m) => (a -> m) -> t a -> m
traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
And of course, we have Monoid m => Applicative (Const m), and the functions are expected to agree in this way when applicable.
Now, people like to get in arguments about whether traversals are allowed to be infinite. I know Ed Kmett likes to argue that they can be, because he has lots of examples. But, not everyone agrees, and especially people who have papers proving things about traversals tend to side with the finite-only side. I've heard this includes one of the inventors of Traversable, Conor McBride.
In my opinion, the above disagreement is just another example of a situation where we have a generic notion instantiated in two different ways, and intuition about one does not quite transfer to the other. If you are working in a language like Agda or Coq (for proving), you will be thinking about traversals in the context of sets and total functions. And there, traversals are finite. But in Haskell, there are infinitary cases to consider, and they should work out all right when thinking about domains instead of sets. But I should probably put forward some argument for this position (and even if I don't need to, it leads somewhere else interesting).
One example that people like to give about finitary traversals is that they can be done via lists. Given a finite traversal, we can traverse to get the elements (using Const [a]), traverse the list, then put them back where we got them by traversing again (using State [a]). Usually when you see this, though, there's some subtle cheating in relying on the list to be exactly the right length for the second traversal. It will be, because we got it from a traversal of the same structure, but I would expect that proving the function is actually total to be a lot of work. Thus, I'll use this as an excuse to do my own cheating later.
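The list round-trip just described can be sketched as follows (my own illustration; the pop helper is precisely the "subtle cheating", since it is partial when the list runs out):

```haskell
{-# LANGUAGE DeriveFunctor #-}
-- Traversal via a list: read the elements out with Const [a],
-- then put a list back in with a tiny hand-rolled State applicative.
import Data.Functor.Const (Const (..))

newtype St s a = St { runSt :: s -> (a, s) } deriving Functor

instance Applicative (St s) where
  pure x = St (\s -> (x, s))
  St f <*> St x = St (\s -> let (g, s')  = f s
                                (y, s'') = x s'
                            in (g y, s''))

-- Collect the elements, left to right.
contents :: Traversable t => t a -> [a]
contents = getConst . traverse (Const . (: []))

-- Refill the structure from a list, position by position.
refill :: Traversable t => t a -> [b] -> t b
refill t bs = fst (runSt (traverse (const pop) t) bs)
  where
    pop = St (\(x : xs) -> (x, xs))  -- partial: assumes enough elements

main :: IO ()
main = do
  print (contents [1, 2, 3 :: Int])       -- [1,2,3]
  print (refill "abc" [True, False, True])  -- [True,False,True]
```

The second traversal only succeeds because the list came from a traversal of the very same structure, which is the totality obligation the article alludes to.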
Now, the above uses lists, but why are we using lists when we're in Haskell? We know they're deficient in certain ways. It turns out that we can give a lot of the same relevant structure to the better free monoid type:
newtype FM a = FM (forall m. Monoid m => (a -> m) -> m) deriving (Functor)
instance Applicative FM where
    pure x = FM ($ x)
    FM ef <*> FM ex = FM $ \k -> ef $ \f -> ex $ \x -> k (f x)
instance Monoid (FM a) where
    mempty = FM $ \_ -> mempty
    mappend (FM l) (FM r) = FM $ \k -> l k <> r k
instance Foldable FM where
    foldMap f (FM e) = e f
newtype Ap f b = Ap { unAp :: f b }

instance (Applicative f, Monoid b) => Monoid (Ap f b) where
    mempty = Ap $ pure mempty
    mappend (Ap l) (Ap r) = Ap $ (<>) <$> l <*> r
instance Traversable FM where
    traverse f (FM e) = unAp . e $ Ap . fmap pure . f
So, free monoids are Monoids (of course), Foldable, and even Traversable. At least, we can define something with the right type that wouldn't bother anyone if it were written in a total language with the right features, but in Haskell it happens to allow various infinite things that people don't like.
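As a quick sanity check of those instances, here is a minimal, self-contained round-trip through FM (restated with today's Semigroup/Monoid split; fromList and toList' are helpers of mine, not from the article):

```haskell
{-# LANGUAGE RankNTypes #-}
-- The universal-property free monoid, plus a list round-trip through it.
newtype FM a = FM (forall m. Monoid m => (a -> m) -> m)

single :: a -> FM a
single x = FM ($ x)

instance Semigroup (FM a) where
  FM l <> FM r = FM (\k -> l k <> r k)

instance Monoid (FM a) where
  mempty = FM (\_ -> mempty)

-- Inject a list into the free monoid, one element at a time.
fromList :: [a] -> FM a
fromList = foldr ((<>) . single) mempty

-- foldMap for FM is just application; instantiate it at the list monoid.
toList' :: FM a -> [a]
toList' (FM e) = e (: [])

main :: IO ()
main = print (toList' (fromList "free"))  -- "free"
```

For finite inputs this round-trip is the identity; the interesting cases in the article are the infinite ones, where the list type stops being an adequate stand-in.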
Now it's time to cheat. First, let's define a function that can take any Traversable to our free monoid:
toFreeMonoid :: Traversable t => t a -> FM a
toFreeMonoid f = FM $ \k -> getConst $ traverse (Const . k) f
Now let's define a Monoid that's not a monoid:
data Cheat a = Empty | Single a | Append (Cheat a) (Cheat a)
instance Monoid (Cheat a) where
    mempty = Empty
    mappend = Append
You may recognize this as the data version of the free monoid from the previous article, where we get the real free monoid by taking a quotient. Using this, we can define an Applicative that's not valid:
newtype Cheating b a =
    Cheating { prosper :: Cheat b -> a } deriving (Functor)

instance Applicative (Cheating b) where
    pure x = Cheating $ \_ -> x
    Cheating f <*> Cheating x = Cheating $ \c -> case c of
        Append l r -> f l (x r)
Given these building blocks, we can define a function to relabel a traversable using a free monoid:
relabel :: Traversable t => t a -> FM b -> t b
relabel t (FM m) = prosper (traverse (const hope) t) (m Single)
  where
    hope = Cheating $ \c -> case c of
        Single x -> x
And we can implement any traversal by taking a trip through the free monoid:
slowTraverse
:: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
slowTraverse f t = fmap (relabel t) . traverse f . toFreeMonoid $ t
And since we got our free monoid via traversing, all the partiality I hid in the above won't blow up in practice, rather like the case with lists and finite traversals.
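Assembled into one module, the cheat can actually be run. This sketch reproduces the post's definitions, with deviations for a modern GHC: explicit Semigroup instances (now required by Monoid), a hand-written Functor for FM (deriving refuses the rank-2 field), and explicit error branches marking the intentional partiality:

```haskell
{-# LANGUAGE RankNTypes, DeriveFunctor #-}

import Control.Applicative (Const (..))

newtype FM a = FM (forall m. Monoid m => (a -> m) -> m)

-- DeriveFunctor refuses the rank-2 field, so write fmap by hand.
instance Functor FM where
  fmap g (FM e) = FM $ \k -> e (k . g)

instance Semigroup (FM a) where
  FM l <> FM r = FM $ \k -> l k <> r k

instance Monoid (FM a) where
  mempty = FM $ \_ -> mempty

instance Foldable FM where
  foldMap f (FM e) = e f

newtype Ap f b = Ap { unAp :: f b }

instance (Applicative f, Semigroup b) => Semigroup (Ap f b) where
  Ap l <> Ap r = Ap ((<>) <$> l <*> r)

instance (Applicative f, Monoid b) => Monoid (Ap f b) where
  mempty = Ap (pure mempty)

instance Traversable FM where
  traverse f (FM e) = unAp . e $ Ap . fmap pureFM . f
    where pureFM x = FM ($ x)

toFreeMonoid :: Traversable t => t a -> FM a
toFreeMonoid t = FM $ \k -> getConst $ traverse (Const . k) t

-- The Monoid that's not a monoid, and the Applicative that's not valid.
data Cheat a = Empty | Single a | Append (Cheat a) (Cheat a)

instance Semigroup (Cheat a) where (<>) = Append
instance Monoid (Cheat a) where mempty = Empty

newtype Cheating b a = Cheating { prosper :: Cheat b -> a }
  deriving Functor

instance Applicative (Cheating b) where
  pure x = Cheating $ \_ -> x
  Cheating f <*> Cheating x = Cheating $ \c -> case c of
    Append l r -> f l (x r)
    _          -> error "Cheating: partial by design"

relabel :: Traversable t => t a -> FM b -> t b
relabel t (FM m) = prosper (traverse (const hope) t) (m Single)
  where
    hope = Cheating $ \c -> case c of
      Single x -> x
      _        -> error "Cheating: partial by design"

slowTraverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
slowTraverse f t = fmap (relabel t) . traverse f . toFreeMonoid $ t
```

For example, slowTraverse Just "abc" evaluates to Just "abc" without tripping either error branch, because the Cheat tree built via Const has exactly the association structure the Cheating consumer expects.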
Arguably, this is worse cheating. It relies on the exact association structure to work out, rather than just the number of elements. The reason is that for infinitary cases, you cannot flatten things out, and there's really no way to detect when you have something infinitary. The finitary traversals have the luxury of being able to reassociate everything to a canonical form, while the infinite cases force us to not do any reassociating at all. So this might be somewhat unsatisfying.
But, what if we didn't have to cheat at all? We can get the free monoid by tweaking foldMap, and it looks like traverse, so what happens if we do the same manipulation to the latter?
It turns out that lens has a type for this purpose, a slight specialization of which is:
newtype Bazaar a b t =
Bazaar { runBazaar :: forall f. Applicative f => (a -> f b) -> f t }
Using this type, we can reorder traverse to get:
howBizarre :: Traversable t => t a -> Bazaar a b (t b)
howBizarre t = Bazaar $ \k -> traverse k t
But now, what do we do with this? And what even is it? [1]
If we continue drawing on intuition from Foldable, we know that foldMap is related to the free monoid. Traversable has more indexing, and instead of Monoid uses Applicative. But the latter are actually related to the former; Applicatives are monoidal (closed) functors. And it turns out, Bazaar has to do with free Applicatives.
If we want to construct free Applicatives, we can use our universal property encoding trick:
newtype Free p f a =
Free { gratis :: forall g. p g => (forall x. f x -> g x) -> g a }
This is a higher-order version of the free p, where we parameterize over the constraint we want to use to represent structures. So Free Applicative f is the free Applicative over a type constructor f. I'll leave the instances as an exercise.
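One possible solution to the exercise, for the case we need here (p = Applicative). This is my sketch, not code from the post; it needs ConstraintKinds and FlexibleInstances in addition to the rank-n types:

```haskell
{-# LANGUAGE RankNTypes, ConstraintKinds, KindSignatures, FlexibleInstances #-}

import Data.Functor.Identity (Identity (..))
import Data.Kind (Constraint, Type)

newtype Free (p :: (Type -> Type) -> Constraint) f a =
  Free { gratis :: forall g. p g => (forall x. f x -> g x) -> g a }

-- The instances just delegate to whichever g we are eventually
-- interpreted into; p g = Applicative g supplies the needed
-- Applicative (hence Functor) dictionary.
instance Functor (Free Applicative f) where
  fmap h (Free e) = Free $ \k -> fmap h (e k)

instance Applicative (Free Applicative f) where
  pure x = Free $ \_ -> pure x
  Free ef <*> Free ex = Free $ \k -> ef k <*> ex k

-- Interpreting into Identity exercises both instances.
demo :: Int
demo = runIdentity $
  gratis (fmap (+ 1) (pure 41) :: Free Applicative Identity Int) id
```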
Since the free monoid is a monad, we'd expect Free p to be a monad, too. In this case, it is a McBride style indexed monad, as seen in The Kleisli Arrows of Outrageous Fortune.
type f ~> g = forall x. f x -> g x
embed :: f ~> Free p f
embed fx = Free $ \k -> k fx
translate :: (f ~> g) -> Free p f ~> Free p g
translate tr (Free e) = Free $ \k -> e (k . tr)
collapse :: Free p (Free p f) ~> Free p f
collapse (Free e) = Free $ \k -> e $ \(Free e') -> e' k
That paper explains how these are related to Atkey style indexed monads:
data At key i j where
At :: key -> At key i i
type Atkey m i j a = m (At a j) i
ireturn :: IMonad m => a -> Atkey m i i a
ireturn = ...
ibind :: IMonad m => Atkey m i j a -> (a -> Atkey m j k b) -> Atkey m i k b
ibind = ...
It turns out, Bazaar is exactly the Atkey indexed monad derived from the Free Applicative indexed monad (with some arguments shuffled) [2]:
hence :: Bazaar a b t -> Atkey (Free Applicative) t b a
hence bz = Free $ \tr -> runBazaar bz $ tr . At
forth :: Atkey (Free Applicative) t b a -> Bazaar a b t
forth fa = Bazaar $ \g -> gratis fa $ \(At a) -> g a
imap :: (a -> b) -> Bazaar a i j -> Bazaar b i j
imap f (Bazaar e) = Bazaar $ \k -> e (k . f)
ipure :: a -> Bazaar a i i
ipure x = Bazaar ($ x)
(>>>=) :: Bazaar a j i -> (a -> Bazaar b k j) -> Bazaar b k i
Bazaar e >>>= f = Bazaar $ \k -> e $ \x -> runBazaar (f x) k
(>==>) :: (s -> Bazaar i o t) -> (i -> Bazaar a b o) -> s -> Bazaar a b t
(f >==> g) x = f x >>>= g
As an aside, Bazaar is also an (Atkey) indexed comonad, and the one that characterizes traversals, similar to how indexed store characterizes lenses. A Lens s t a b is equivalent to a coalgebra s -> Store a b t. A traversal is a similar Bazaar coalgebra:
s -> Bazaar a b t
~
s -> forall f. Applicative f => (a -> f b) -> f t
~
forall f. Applicative f => (a -> f b) -> s -> f t
It so happens that Kleisli composition of the Atkey indexed monad above (>==>) is traversal composition.
Anyhow, Bazaar also inherits Applicative structure from Free Applicative:
instance Functor (Bazaar a b) where
fmap f (Bazaar e) = Bazaar $ \k -> fmap f (e k)
instance Applicative (Bazaar a b) where
pure x = Bazaar $ \_ -> pure x
Bazaar ef <*> Bazaar ex = Bazaar $ \k -> ef k <*> ex k
This is actually analogous to the Monoid instance for the free monoid; we just delegate to the underlying structure.
The more exciting thing is that we can fold and traverse over the first argument of Bazaar, just like we can with the free monoid:
bfoldMap :: Monoid m => (a -> m) -> Bazaar a b t -> m
bfoldMap f (Bazaar e) = getConst $ e (Const . f)
newtype Comp g f a = Comp { getComp :: g (f a) } deriving (Functor)
instance (Applicative f, Applicative g) => Applicative (Comp g f) where
pure = Comp . pure . pure
Comp f <*> Comp x = Comp $ liftA2 (<*>) f x
btraverse
:: (Applicative f) => (a -> f a') -> Bazaar a b t -> f (Bazaar a' b t)
btraverse f (Bazaar e) = getComp $ e (Comp . fmap ipure . f)
This is again analogous to the free monoid code. Comp is the analogue of Ap, and we use ipure in traverse. I mentioned that Bazaar is a comonad:
extract :: Bazaar b b t -> t
extract (Bazaar e) = runIdentity $ e Identity
And now we are finally prepared to not cheat:
honestTraverse
:: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
honestTraverse f = fmap extract . btraverse f . howBizarre
So, we can traverse by first turning our Traversable into some structure that's kind of like the free monoid, except having to do with Applicative, traversing that, and then pulling a result back out. Bazaar retains the information that we're eventually building back the same type of structure, so we don't need any cheating.
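The honest route can be packaged into a self-contained module and run. Everything below is assembled from the definitions in this post, with Functor instances written out where deriving will not go through a rank-2 field:

```haskell
{-# LANGUAGE RankNTypes, DeriveFunctor #-}

import Control.Applicative (liftA2)
import Data.Functor.Identity (Identity (..))

newtype Bazaar a b t =
  Bazaar { runBazaar :: forall f. Applicative f => (a -> f b) -> f t }

instance Functor (Bazaar a b) where
  fmap f (Bazaar e) = Bazaar $ \k -> fmap f (e k)

instance Applicative (Bazaar a b) where
  pure x = Bazaar $ \_ -> pure x
  Bazaar ef <*> Bazaar ex = Bazaar $ \k -> ef k <*> ex k

howBizarre :: Traversable t => t a -> Bazaar a b (t b)
howBizarre t = Bazaar $ \k -> traverse k t

ipure :: a -> Bazaar a i i
ipure x = Bazaar ($ x)

newtype Comp g f a = Comp { getComp :: g (f a) } deriving Functor

instance (Applicative f, Applicative g) => Applicative (Comp g f) where
  pure = Comp . pure . pure
  Comp f <*> Comp x = Comp (liftA2 (<*>) f x)

btraverse :: Applicative f => (a -> f a') -> Bazaar a b t -> f (Bazaar a' b t)
btraverse f (Bazaar e) = getComp $ e (Comp . fmap ipure . f)

extract :: Bazaar b b t -> t
extract (Bazaar e) = runIdentity (e Identity)

honestTraverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
honestTraverse f = fmap extract . btraverse f . howBizarre
```

Here honestTraverse Just [1,2,3] evaluates to Just [1,2,3], and nothing in the module is partial.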
To pull this back around to domains, there's nothing about this code to object to if done in a total language. But, if we think about our free Applicative-ish structure, in Haskell, it will naturally allow infinitary expressions composed of the Applicative operations, just like the free monoid will allow infinitary monoid expressions. And this is okay, because some Applicatives can make sense of those, so throwing them away would make the type not free, in the same way that even finite lists are not the free monoid in Haskell. And this, I think, is compelling enough to say that infinite traversals are right for Haskell, just as they are wrong for Agda.
For those who wish to see executable code for all this, I've put files here and here. The latter also contains some extra goodies at the end that I may talk about in further installments.
[1] Truth be told, I'm not exactly sure.
[2] It turns out, you can generalize Bazaar to have a correspondence for every choice of p
newtype Bizarre p a b t =
Bizarre { bizarre :: forall f. p f => (a -> f b) -> f t }
hence and forth above go through with the more general types. This can be seen here.Dan Doelhttp://comonad.com/reader/?p=994Wed, 29 Apr 2015 07:36:19 +0000An informal explanation of stackage-sandbox
https://unknownparallel.wordpress.com/2015/04/28/an-informal-explanation-of-stackage-sandbox/
Dan Burtonhttp://unknownparallel.wordpress.com/?p=181Tue, 28 Apr 2015 20:45:42 +0000C2HS Tutorial Ideas
http://www.skybluetrades.net/blog/posts/2015/04/27/c2hs-tutorial-request.html
http://www.skybluetrades.net/blog/posts/2015/04/27/c2hs-tutorial-request.htmlMon, 27 Apr 2015 15:46:42 +0000FLOPS 2016: Call for Papers
http://tomschrijvers.blogspot.com/2015/04/flops-2016-call-for-papers.html
FLOPS 2016: 13th International Symposium on Functional and Logic Programming
March 3-6, 2016, Kochi, Japan
Call For Papers: http://www.info.kochi-tech.ac.jp/FLOPS2016/
Writing down detailed computational steps is not the only way of programming. The alternative, being used increasingly in practice, is to start by writing down the desired properties of the result. The computational steps are then (semi-)automatically derived from these higher-level specifications. Examples of this declarative style include functional and logic programming, program transformation and re-writing, and extracting programs from proofs of their correctness.
FLOPS aims to bring together practitioners, researchers and implementors of declarative programming, to discuss mutually interesting results and common problems: theoretical advances, their implementations in language systems and tools, and applications of these systems in practice. The scope includes all aspects of the design, semantics, theory, applications, implementations, and teaching of declarative programming. FLOPS specifically aims to promote cross-fertilization between theory and practice and among different styles of declarative programming.
Scope
FLOPS solicits original papers in all areas of declarative programming:
* functional, logic, functional-logic programming, re-writing systems, formal methods and model checking, program transformations and program refinements, developing programs with the help of theorem provers or SAT/SMT solvers;
* foundations, language design, implementation issues (compilation techniques, memory management, run-time systems), applications and case studies.
FLOPS promotes cross-fertilization among different styles of declarative programming. Therefore, submissions must be written to be understandable by the wide audience of declarative programmers and researchers. Submission of system descriptions and declarative pearls is especially encouraged.
Submissions should fall into one of the following categories:
* Regular research papers: they should describe new results and will be judged on originality, correctness, and significance.
* System descriptions: they should contain a link to a working system and will be judged on originality, usefulness, and design.
* Declarative pearls: new and excellent declarative programs or theories with illustrative applications.
System descriptions and declarative pearls must be explicitly marked as such in the title.
Submissions must be unpublished and not submitted for publication elsewhere. Work that already appeared in unpublished or informally published workshop proceedings may be submitted. See also the ACM SIGPLAN Republication Policy.
The proceedings will be published by Springer International Publishing in the Lecture Notes in Computer Science (LNCS) series, as a printed volume as well as online in the digital library SpringerLink.
Post-proceedings: The authors of the 4-7 best papers will be invited to submit extended versions of their FLOPS papers to a special issue of the journal Science of Computer Programming (SCP).
Important dates
Monday, September 14, 2015 (any time zone): Submission deadline
Monday, November 16, 2015: Author notification
March 3-6, 2016: FLOPS Symposium
March 7-9, 2016: PPL Workshop
Submission
Submissions must be written in English and can be up to 15 pages long including references, though pearls are typically shorter. The formatting has to conform to Springer's guidelines. Regular research papers should be supported by proofs and/or experimental results. In case of lack of space, this supporting information should be made accessible otherwise (e.g., a link to a Web page, or an appendix). Papers should be submitted electronically at https://easychair.org/conferences/?conf=flops2016
Program Committee
Andreas Abel, Gothenburg University, Sweden
Lindsay Errington, USA
Makoto Hamana, Gunma University, Japan
Michael Hanus, CAU Kiel, Germany
Jacob Howe, City University London, UK
Makoto Kanazawa, National Institute of Informatics, Japan
Andy King, University of Kent, UK (PC Co-Chair)
Oleg Kiselyov, Tohoku University, Japan (PC Co-Chair)
Hsiang-Shang Ko, National Institute of Informatics, Japan
Julia Lawall, Inria-Whisper, France
Andres Löh, Well-Typed LLP, UK
Anil Madhavapeddy, Cambridge University, UK
Jeff Polakow, PivotCloud, USA
Marc Pouzet, École normale supérieure, France
Vítor Santos Costa, Universidade do Porto, Portugal
Tom Schrijvers, KU Leuven, Belgium
Zoltan Somogyi, Australia
Alwen Tiu, Nanyang Technological University, Singapore
Sam Tobin-Hochstadt, Indiana University, USA
Hongwei Xi, Boston University, USA
Neng-Fa Zhou, CUNY Brooklyn College and Graduate Center, USA
Organizers
Andy King, University of Kent, UK (PC Co-Chair)
Oleg Kiselyov, Tohoku University, Japan (PC Co-Chair)
Yukiyoshi Kameyama, University of Tsukuba, Japan (General Chair)
Kiminori Matsuzaki, Kochi University of Technology, Japan (Local Chair)
flops2016 at logic.cs.tsukuba.ac dot jp
Tom Schrijverstag:blogger.com,1999:blog-5844006452378085451.post-3605919055189838311Mon, 27 Apr 2015 11:14:00 +0000Rejection Sampling
https://idontgetoutmuch.wordpress.com/2015/04/27/rejection-sampling/
Introduction
Suppose you want to sample from the truncated normal distribution. One way to do this is to use rejection sampling. But if you do this naïvely then you will run into performance problems. The excellent Devroye (1986), who references Marsaglia (1964), gives an efficient rejection sampling scheme using the Rayleigh distribution. The random-fu package uses the Exponential distribution.
Performance
> {-# OPTIONS_GHC -Wall #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans #-}
> {-# LANGUAGE FlexibleContexts #-}
> import Control.Monad
> import Data.Random
> import qualified Data.Random.Distribution.Normal as N
> import Data.Random.Source.PureMT
> import Control.Monad.State
Here’s the naïve implementation.
> naiveReject :: Double -> RVar Double
> naiveReject x = doit
> where
> doit = do
> y <- N.stdNormal
> if y < x
> then doit
> else return y
And here’s an implementation using random-fu.
> expReject :: Double -> RVar Double
> expReject x = N.normalTail x
Let’s try running both of them
> n :: Int
> n = 10000000
> lower :: Double
> lower = 2.0
> testExp :: [Double]
> testExp = evalState (replicateM n $ sample (expReject lower)) (pureMT 3)
> testNaive :: [Double]
> testNaive = evalState (replicateM n $ sample (naiveReject lower)) (pureMT 3)
> main :: IO ()
> main = do
> print $ sum testExp
> print $ sum testNaive
Let’s try building and running both the naïve and better tuned versions.
ghc -O2 CompareRejects.hs
./CompareRejects +RTS -s
As we can see below, we get 59.98s and 4.28s, a compelling reason to use the tuned version. And the difference in performance will get worse the less of the tail we wish to sample from.
Tuned
2.3731610476911187e7
11,934,195,432 bytes allocated in the heap
5,257,328 bytes copied during GC
44,312 bytes maximum residency (2 sample(s))
21,224 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 23145 colls, 0 par 0.09s 0.11s 0.0000s 0.0001s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 4.19s ( 4.26s elapsed)
GC time 0.09s ( 0.11s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 4.28s ( 4.37s elapsed)
%GC time 2.2% (2.6% elapsed)
Alloc rate 2,851,397,967 bytes per MUT second
Productivity 97.8% of total user, 95.7% of total elapsed
Naïve
2.3732073159369867e7
260,450,762,656 bytes allocated in the heap
111,891,960 bytes copied during GC
85,536 bytes maximum residency (2 sample(s))
76,112 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 512768 colls, 0 par 1.86s 2.24s 0.0000s 0.0008s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 58.12s ( 58.99s elapsed)
GC time 1.86s ( 2.24s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 59.98s ( 61.23s elapsed)
%GC time 3.1% (3.7% elapsed)
Alloc rate 4,481,408,869 bytes per MUT second
Productivity 96.9% of total user, 94.9% of total elapsed
Bibliography
Devroye, L. 1986. Non-Uniform Random Variate Generation. Springer-Verlag. http://books.google.co.uk/books?id=mEw\_AQAAIAAJ.
Marsaglia, G. 1964. “Generating a Variable from the Tail of the Normal Distribution.” J-Technometrics 6 (1): 101–2.Dominic Steinitzhttp://idontgetoutmuch.wordpress.com/?p=814Mon, 27 Apr 2015 09:59:23 +0000[imxscmnf] Function call syntax
http://kenta.blogspot.com/2015/04/imxscmnf-function-call-syntax.html
A brief survey of syntax used by different programming languages to express calling a function:
Calling a one-argument function:
f x
(f x)
f(x)
Calling a two-argument uncurried function:
f x,y
f(x,y)
(f x y)
Calling a two-argument curried function:
f x y
((f x) y)
f(x)(y)
Calling a zero-argument function:
f
(f)
f()
Intriguing but not implemented as far as I know is a Lisp-like language that prefers currying and partial application like Haskell.Kentag:blogger.com,1999:blog-6757805.post-6795118661420278606Mon, 27 Apr 2015 08:25:00 +0000Fifth place in Godingame World Cup
http://www.joachim-breitner.de/blog/677-Fifth_place_in_Godingame_World_Cup
Joachim Breitnerhttp://www.joachim-breitner.de/blog/677-Fifth_place_in_Godingame_World_CupSun, 26 Apr 2015 15:01:14 +0000Cleaning stale files with Shake
http://neilmitchell.blogspot.com/2015/04/cleaning-stale-files-with-shake.html
Neil Mitchelltag:blogger.com,1999:blog-7094652.post-947269270800409732Sat, 25 Apr 2015 14:21:00 +0000Diagrams 1.3
http://projects.haskell.org/diagrams/blog/2015-04-24-diagrams-1.3.html
diagrams-discusshttp://projects.haskell.org/diagrams/blog/2015-04-24-diagrams-1.3.htmlSat, 25 Apr 2015 00:00:00 +0000Building ghc-7.8.4 for Debian 7 (wheezy) with Stackage
http://tobold.org/entry/2015-04-24
This article is a recipe for building ghc-7.8.4 from source, along
with the other tools you need to use Haskell. Why would you do this
rather than use haskell-platform? Well, the latest
haskell-platform is now over a year old, and ships with
ghc-7.8.3, so perhaps you want or need the latest compiler in the
7.8 series. Or perhaps you're just curious as to how it might be done.
We're also going to ensure that any packages we install to get things
going will be the ones from the current Stackage LTS.
So far, I've only tried this recipe on a Debian 7 (wheezy) server, although
apart from the very first step it should be the same for any distro.
Install a binary ghc
The very first step is to install a functioning binary build of
haskell-platform from somewhere... presumably your distro's
packages:
# apt-get install haskell-platform
(Throughout this recipe, I will prefix commands to be run as root with
# and commands to be run as any non-root user with $. For the
root commands, you can of course either use a root shell, or prefix each
command with sudo.)
We're also going to need some other bits and pieces to build a new
ghc. On a pretty minimal Debian 7, all I needed was this:
# apt-get install make libncurses5-dev
Build and install ghc
This is all straightforward. We want the default installation paths
under /usr/local so there is nothing to pass to configure. Build
as some non-root user:
$ wget https://downloads.haskell.org/~ghc/7.8.4/ghc-7.8.4-src.tar.xz
$ tar xfJ ghc-7.8.4-src.tar.xz
$ cd ghc-7.8.4
$ sh configure
$ make -j5
In the last line, the number 5 should be 1 more than the CPU cores
you have available.
Now install as root:
# cd /home/toby/ghc-7.8.4
# make install
Bootstrap cabal
This is the tricky part, and the one that took me a while to work out. There is a
very loose coupling between ghc and cabal, but they do need to
agree on some things: cabal needs to install libraries to where
ghc is going to look for them! The distro supplied cabal-1.14
won't work properly with our newly installed ghc as it uses the
wrong value for libsubdir. The symptom of this is that any library
including C code will install happily, but anything depending on such a
library will then produce errors like these:
/usr/bin/ld: cannot find -lHStf-random-0.5-ghc7.8.4
/usr/bin/ld: cannot find -lHSprimitive-0.6-ghc7.8.4
/usr/bin/ld: cannot find -lHSrandom-1.1-ghc7.8.4
The cleanest way to fix this is to perform a two-stage boot of
cabal: use cabal-1.14 to install a “stage1” cabal-1.18, then
clean everything out and use that stage1 cabal to install a fully
functional “stage2” cabal-1.18. The particular version of cabal
we are using will be the one from the current LTS stackage:
# cd
# cabal update
# cabal install cabal-install==1.18.0.8
# cp .cabal/bin/cabal .
# rm -r .ghc .cabal
# ./cabal update
The last line creates a new cabal configuration file that just needs a
couple of tweaks. First, we want to reset the user-install flag. You
could do that in your favourite text editor, or with this handy sed
script:
# sed -i '/user-install/auser-install: False' .cabal/config
And we want to append the LTS stackage global file to the end of the file:
# u='https://www.stackage.org/lts/cabal.config?global=true'
# wget -q -O- $u >> .cabal/config
Now we're ready to perform the second stage install:
# ./cabal install cabal-install
# rm ./cabal
# hash -r # sacrifice goat to the great god Bash
Install other tools
Finally, to finish off, install the other basic tools:
# cabal install alex
# cabal install happyhttp://tobold.org/entry/2015-04-24Fri, 24 Apr 2015 19:27:40 +0000Easy exhaustive search with the list monad
http://blog.plover.com/prog/haskell/monad-search.html
(Haskell people may want to skip this article about Haskell, because
the technique is well-known in the Haskell community.)
Suppose you would like to perform an exhaustive search. Let's say for
concreteness that we would like to solve this cryptarithm puzzle:
S E N D
+ M O R E
-----------
M O N E Y
This means that we want to map the letters S, E, N, D, M,
O, R, Y to distinct digits 0 through 9 to produce a five-digit
and two four-digit numerals which, when added in the indicated way,
produce the indicated sum.
(This is not an especially difficult example; my 10-year-old daughter
Katara was able to solve it, with some assistance, in about 30
minutes.)
If I were doing this in Perl, I would write up either a recursive
descent search or a solution based on a stack or queue of partial
solutions which the program would progressively try to expand to a
full solution, as per the techniques of chapter 5 of Higher-Order
Perl. In Haskell, we can use the list monad to hide all the
searching machinery under the surface. First a few utility functions:
import Control.Monad (guard)
digits = [0..9]
to_number = foldl (\a -> \b -> a*10 + b) 0
remove rs ls = foldl remove' ls rs
where remove' ls x = filter (/= x) ls
to_number takes a list of digits like [1,4,3] and produces the
number they represent, 143. remove takes two lists and returns all
the things in the second list that are not in the first list. There
is probably a standard library function for this but I don't remember
what it is. This version is quadratic, but who cares.
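For what it's worth, the standard library function the author is reaching for is close to Data.List's (\\), except that (\\) deletes only one occurrence of each element. An exact (and equally quadratic) equivalent is a filter with notElem:

```haskell
-- Equivalent to the article's remove: keep the elements of the second
-- list that do not appear anywhere in the first. Still quadratic, but
-- so was the original.
remove :: Eq a => [a] -> [a] -> [a]
remove rs = filter (`notElem` rs)
```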
Now the solution to the problem is:
-- S E N D
-- + M O R E
-- ---------
-- M O N E Y
solutions = do
s <- remove [0] digits
e <- remove [s] digits
n <- remove [s,e] digits
d <- remove [s,e,n] digits
let send = to_number [s,e,n,d]
m <- remove [0,s,e,n,d] digits
o <- remove [s,e,n,d,m] digits
r <- remove [s,e,n,d,m,o] digits
let more = to_number [m,o,r,e]
y <- remove [s,e,n,d,m,o,r] digits
let money = to_number [m,o,n,e,y]
guard $ send + more == money
return (send, more, money)
Let's look at just the first line of this:
solutions = do
s <- remove [0] digits
…
The do notation is syntactic sugar for
(remove [0] digits) >>= \s -> …
where “…” is the rest of the block. To expand this further, we need
to look at the overloading for >>= which is implemented differently
for every type. The value on the left of >>= is a list, and
the definition of >>= for lists is:
concat $ map (\s -> …) (remove [0] digits)
where “…” is the rest of the block.
So the variable s is bound to each of 1,2,3,4,5,6,7,8,9 in turn, the
rest of the block is evaluated for each of these nine possible
bindings of s, and the nine returned lists of solutions are combined
(by concat) into a single list.
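Concretely, the list instance of >>= can be sketched as a standalone function; bindList is my name for it, to avoid clashing with the Prelude:

```haskell
-- The list monad's bind: run f on every element and flatten the
-- resulting lists, i.e. concatMap with its arguments flipped.
bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (map f xs)
```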
The next line is the same:
e <- remove [s] digits
for each of the nine possible values for s, we loop over nine value
for e (this time including 0 but not including whatever we chose for
s) and evaluate the rest of the block. The nine resulting lists of
solutions are concatenated into a single list and returned to the
previous map call.
n <- remove [s,e] digits
d <- remove [s,e,n] digits
This is two more nested loops.
let send = to_number [s,e,n,d]
At this point the value of send is determined, so we compute and
save it so that we don't have to repeatedly compute it each time
through the following 300 loop executions.
m <- remove [0,s,e,n,d] digits
o <- remove [s,e,n,d,m] digits
r <- remove [s,e,n,d,m,o] digits
let more = to_number [m,o,r,e]
Three more nested loops and another computation.
y <- remove [s,e,n,d,m,o,r] digits
let money = to_number [m,o,n,e,y]
Yet another nested loop and a final computation.
guard $ send + more == money
return (send, more, money)
This is the business end. I find guard a little tricky so let's
look at it slowly. There is no binding (<-) in the first line, so
these two lines are composed with >> instead of >>=:
(guard $ send + more == money) >> (return (send, more, money))
which is equivalent to:
(guard $ send + more == money) >>= (\_ -> return (send, more, money))
which means that the values in the list returned by guard will be
discarded before the return is evaluated.
If send + more == money is true, the guard expression yields
[()], a list of one useless item, and then the following >>= loops
over this one useless item, discards it, and yields a list
containing the tuple (send, more, money) instead.
But if send + more == money is false, the guard expression yields
[], a list of zero useless items, and then the following >>= loops
over these zero useless items, never runs return at all, and yields
an empty list.
The result is that if we have found a solution at this point, a list
containing it is returned, to be concatenated into the list of all
solutions that is being constructed by the nested concats. But if
the sum adds up wrong, an empty list is returned and concatenated
instead.
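The two behaviours of guard are easy to see in a miniature version of the same pattern (a sketch of mine, not code from the article):

```haskell
import Control.Monad (guard)

-- guard True yields [()], so the rest of the block runs once;
-- guard False yields [], so the branch is pruned and return never runs.
evens :: [Int]
evens = do
  x <- [1 .. 10]
  guard (even x)
  return x
```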
After a few seconds, Haskell generates and tests 1.36 million choices
for the eight bindings, and produces the unique solution:
[(9567,1085,10652)]
That is:
S E N D 9 5 6 7
+ M O R E + 1 0 8 5
----------- -----------
M O N E Y 1 0 6 5 2
It would be an interesting and pleasant exercise to try to implement
the same underlying machinery in another language. I tried this in
Perl once, and I found that although it worked perfectly well, between
the lack of the do-notation's syntactic sugar and Perl's clumsy
notation for lambda functions (sub { my ($s) = @_; … } instead of
\s -> …) the result was completely unreadable and therefore
unusable. However, I suspect it would be even worse in Python
because of semantic limitations of that language. I would be
interested to hear about this if anyone tries it.
[ Addendum: Thanks to Tony Finch for pointing out the η-reduction I missed while writing this at 3 AM. ]
[ Addendum: Several people so far have misunderstood the question
about Python in the last paragraph. The question was not to implement
an exhaustive search in Python; I had no doubt that it could be done
in a simple and clean way, as it can in Perl. The question was to
implement the same underlying machinery, including the list monad
and its bind operator, and to find the solution using the list
monad.
[ Peter De Wachter has written in with a Python solution that, while
not using the list monad, I think clearly demonstrates that the
problems I was worried about will not arise, at least for this
task. I hope to post his solution in the next few days. ]Mark Dominustag:blog.plover.com,2015:/prog/haskell/monad-searchFri, 24 Apr 2015 07:16:00 +0000The list monad in Python
http://blog.plover.com/prog/haskell/monad-search-2.html
Mark Dominustag:blog.plover.com,2015:/prog/haskell/monad-search-2Fri, 24 Apr 2015 13:39:00 +0000