Planet Haskell without Tom Moertel's posts (Pipes Output)
http://pipes.yahoo.com/pipes/pipe.info?_id=Jti_id1G3hGa1KrxdPQQIA
Wed, 23 Apr 2014 23:20:10 +0000

'T' is for Theorem-proving
http://logicaltypes.blogspot.com/2014/04/t-is-for-theorem-proving.html
'T' is for Theorem-proving

So, you've been theorem-proving all your life. Let's just get that one out there.

When you're given a math problem that says:

"Solve x + 1 = 2 (for x) (of course)"

and you say, "Why, x is 1, of course," quick as thinking, you've just proved a theorem. Here, let me show you:

x + 1 = 2    (basis)
   -1 = -1   (reflexive)
x + 0 = 1    (addition)
    x = 1    (identity)

Q.E.D. ('cause you rock)

So, yeah. Theorem proving is this, correction: theorem proving is simply this: going from step 1, to step 2, to step 3, until you've gotten to where you want to be.

How do you go from step 1 to step 2, etc.? The same way you do everything! You follow the rules you're given.

Let's prove a theorem, right now.

So, in classical logic, we have a theorem that says

p → p

That is, if you've got a p, that implies (that you've got a) p. Toughie!

But proving that? How do we go about doing that? Well, in classical logic, you're given three basic axioms (thanks to sigfpe for his article "Adventures in Classic Land"):

1. p → (q → p)
2. (p → (q → r)) → ((p → q) → (p → r))
3. ¬¬p → p

So, p → p. How do we get there? Let's start with axiom 1 above.

1. p → ((p → p) → p)                                 axiom 1 (where q = (p → p))
2. (p → ((p → p) → p)) → ((p → (p → p)) → (p → p))   axiom 2 (where q = (p → p) and r = p)
3. (p → (p → p)) → (p → p)                           modus ponens
4. p → (p → p)                                       axiom 1 (where q = p)
5. p → p                                             modus ponens

Q.E.D. (a.k.a. whew!)

I used something called modus ponens in the above proof. It says, basically, that if you've proved something that you're depending on, you can drop it. Or, logically:

p → q, p ∴ q

Or: we have "p implies q," we've proved p, so now we can just use q.

Now, there's been a lot of thought put into theory, coming up with theorems and proving them, and this has been over a long time, from Aristotle up to now. The big question for a long time was that ... well, theorem proving is a lot of hard work.
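As an aside, under the Curry-Howard correspondence this theorem has a one-line constructive proof: a proof of p → p is any program of that type, and the identity function inhabits it. The following Haskell sketch is my illustration, not part of the original post:

```haskell
-- A proof of (p -> p) is a program of that type; the identity
-- function inhabits it, so the theorem holds constructively.
proofPImpliesP :: p -> p
proofPImpliesP x = x

-- Modus ponens (from p -> q and p, conclude q) is just
-- function application.
modusPonens :: (p -> q) -> p -> q
modusPonens f x = f x
```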
Is there any way to automate it?

So there's been this quest to make theorem proving mechanical. But to make theorem-proving mechanical, you have to mechanize everything: the axioms, the rules, and the process. Up until recent history, theory has been a lot of great minds spouting truths, and 'oh, here's a new way to look at the world!' And that has been great (it has), but it hasn't helped mechanize things.

Then along came Frege. What Frege did was to give us a set of symbols that represented logical connectives, and then symbols that represented things. And there you had it: when you have objects and their relations, you have an ontology, a knowledge-base. Frege provided the tools to represent knowledge as discrete things that could be manipulated by following the rules of the (symbolized) relations.

He gave abstraction to knowledge and a uniform way of manipulating those abstractions, so, regardless of the underlying knowledge, be it about money or chickens or knowledge itself, it could be represented and then manipulated. That way you could go from step 1 to step 2 to step 3, etc., until you arrived at your conclusion, or, just as importantly, arrived at a conclusion (including one that might possibly say what you were trying to prove was inadmissible).

Since Frege, there has been (a lot of) refinement to his system, but we've been using his system ever since, because it works so well. He was the one who came up with the concept of types ('T' is for Types), and from that we've improved logic to be able to deliver this blog post to you on a laptop that is, at base, a pile of sand that is constantly proving theorems in a descendent of Frege's logic.

Let's take a look at one such mechanized system. It's a logical framework (LF) implementation called Twelf. The first example proof from the Princeton team that uses this system is 'implies true,' and it goes like this:

imp_true: pf B -> pf (A imp B) = ...

That's the declaration.
It's saying: the proof of B, or pf B, implies the proof of A implies B, or pf (A imp B).

How can you claim such a thing? Well, you prove it.

imp_true: pf B -> pf (A imp B) =
    [p1: pf B]   % you have the proof of B
    % ... but now what? for we don't have the proof that A implies B.

So the 'now what' is our theorem-proving adventure. Mechanized.

We don't have the proof that A implies B, but we have a logical loophole (called a hole): a declaration that hands you a proof of any proposition C you like:

hole: {C} pf C.

Which says, mechanized, what I just said in words, so with that:

imp_true: pf B -> pf (A imp B) = [p1: pf B] (hole (A imp B)).

And there you go ... that is, if you're fine with a big, gaping loophole in your proof of such a thing. If you are satisfied, then, well, here, sit on this rusty nail until you get unsatisfied. So there.

Okay, so we have an implication, but we need to prove that. So we introduce the implication into the proof, which is defined as:

imp_i: (pf A -> pf B) -> pf (A imp B)

SO! we have that, and can use it:

imp_true: pf B -> pf (A imp B) = [p1 : pf B] (imp_i ([p2: pf A] (hole B))).

So, we have the proof of A, and that leads to B. But wait! We already have B, don't we? It's p1 (that is, [p1 : pf B]).

BAM!

imp_true: pf B -> pf (A imp B) = [p1 : pf B] (imp_i ([p2 : pf A] p1)).

DONE!

This is what theorem-proving is: you start with what you want, e.g.:

imp_true: pf B -> pf (A imp B)

which is: implies-true is that if you have B proved, then you have the proof that (A implies B) is true, too.

Then you take what you've got:

pf B

And give it a value:

[p1 : pf B]

Then you apply your rules (in this case, implication introduction, or 'imp_i') to fill the holes in your proof, to get:

[p1: pf B] (imp_i [p2: pf A] p1)

Which is the demonstration of your proof, and that's why we say 'Q.E.D.,' or 'quod erat demonstrandum': 'which was to be demonstrated.'

Ta-dah!
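The same 'implies true' theorem reads naturally through Curry-Howard in Haskell (my sketch, not part of the Twelf development): a proof of B gives a function that ignores its proof of A and returns the proof of B, exactly the role p1 plays in the final Twelf term.

```haskell
-- imp_true via Curry-Howard: a proof of b yields a proof of (a -> b)
-- by discarding the proof of a, mirroring the Twelf term
--   [p1 : pf B] (imp_i ([p2 : pf A] p1)).
impTrue :: b -> (a -> b)
impTrue p1 = \_p2 -> p1
```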
Now you have all the tools you need to prove theorems.

Now go prove Fermat's Last Theorem.

eheh. (I am so mean!)

Okay, okay, so maybe Fermat's Last Theorem is a bit much, but if you want to try your hand at proving some theorems, there's a list of some simple theorems to prove.

geophf (Wed, 23 Apr 2014 16:24:00 +0000)

'S' is for Simply Summing
http://logicaltypes.blogspot.com/2014/04/s-is-for-simply-summing.html
'S' is for ... simply summing.

Okay, this post was going to be about symmetries and super-symmetries, then I was going to transition to sieves, like the Sieve of Eratosthenes, and how that relates to fizz-buzz, ...

But then I thought: 'eh.'

Yeah, that's what I thought: 'eh.' I'm a moody brute, filled with deep, indiscernible thoughts, 'cause that's how I roll. So, yeah: eh. This one will be a simple post today.

Sums.

Sum the numbers 1. Yeah, I just said that; humor me. Okay, the sum of 1 is 1. Toughie.

Next, sum 1 + 2. 3, right? Still with me?

Okay, 1 + 2 + 3. Got it? 6. No probs.

Now: sum 1 to 10. No laptop help, just sum it yourself. A little more fun, right? So, that's 55.

Okay, no computers: sum 1 to 100. Hm, hm, hm. Done yet? La-di-dah. You're not even trying, are you? You're just reading this post. Okay, the sum from 1 to 100, inclusive, no computers, is ... 5,050.

1 to 1000: 500,500. Pattern? You bet.

1 to ... oh, I don't know: 123. That's a mite harder, but, no computers:

6,200 + 1,240 + 186 = 7,626.

Okay, let's verify, by asking my Haskell interpreter:

Prelude> sum [1 .. 123]
7626

Yeppers, I was right. I was also as fast as a person typing a program into their computer to solve it, if they were using a functional programming language, and much, much faster than a person using a computer if they weren't, and called up Excel, for example (loading ... loading ... loading ... blue screen of death) (You really need to virus-clean your computer, because I see you don't have a Mac, do you? tsk, tsk!), or if they pulled out their HP calculator from their high school days, because they are that old, boys and girls!

You see this thing in my hand, children? This is not a smart phone, it's a calculator.

Children: Ooh! Papa, can I borrow it for a sec ...

Children, borrowing it, then puzzling over it: Um, Dad, ... where's the Plants vs. Zombies 2 app?

Yeah, a calculator, it calculates. And that's it.
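To jump ahead a little: the mental arithmetic above is an instance of a closed form that the rest of the post derives, and it can be checked with a couple of one-liners. This Haskell sketch is my illustration (the names gaussSum and rangeSum are mine):

```haskell
-- Gauss's closed form: sum [1..n] == n * (n + 1) `div` 2.
gaussSum :: Integer -> Integer
gaussSum n = n * (n + 1) `div` 2

-- Summing an arbitrary range a..b by the post's rescaling trick:
-- shift the range down to 0..(b - a), sum that with the closed form,
-- then add a back once for each of the (b - a + 1) terms.
rangeSum :: Integer -> Integer -> Integer
rangeSum a b = gaussSum (b - a) + (b - a + 1) * a
```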
No google, no nothing, just numbers and numeric operations.

Children, looking uncomprehendingly at the thing in their hands, then tossing the cootied thing from their hands and running to their rooms, screaming, remembering with horror the time Dear Old Dad got out an LP and told them that music came out of it, but ... it didn't have playlists!

Yeah. Calculator. But would your calculator get you that sum that fast? I mean, after you go to the CVS to buy a new battery. Those things are long-lasting, but thirty years later? And you expect your calculator to still just work? Pfft! Heh.

Pick a number (natural number), any number, I'll sum it for you, from the origin.

Sum 1 to 31415. Okay:

31,415 * 15,708 = 314,150,000 + 157,075,000 + 21,990,500 + 0 + 251,320 = 493,466,820

Okay, that took a bit longer; let's see what Haskell says:

Prelude> sum [1..31415]
493466820

Okay. Damn! ... Sometimes I impress even myself.

How do I do this? How do I add all the numbers from 1 to 31,415 and in under a minute? And perfectly, perfectly, correctly?

Well, I'll tell you. But first, I'll tell you a story.

Once upon a time, there was a little boy named Gauss, and he was a naughty, little, precocious lad, always getting into trouble for dunking Susie Derkins' pigtails in the inkwell at his desk, and that would be fine if she had jet black hair, but you know the Viking types, right?
All blond-haired, blue-eyed things, and getting your honey-blond long-flowing locks dunked into ink was not on the agenda for poor, little Susie, who cried, and had to be sent home to console herself with an appointment at the beautician's, who thought the Goth look actually fit Susie well, so she came back the next day as Suze-in-black-leather-and-studs, which was disruptive for the class in other ways, so Gauss found himself at detention again.

But this time the math teacher had had enough of little Gauss' pronouncements and recitations of π to sixty-seven places, and decided to teach the little scamp but good.

Not that I'm channeling, or anything. It's not 'Mary Sue'-writing; it's 'walking in your character's moccasins a mile' ... even though Gauss probably didn't wear moccasins all that often, anyway.

Still with me?

So, mean old Herr Schopenhauer told little Gauss: "Okay, twerp, this is gonna be a light one on you. You can go home after you sum the numbers from one to one hundred."

Huh? One to one hundred? But that will take all night!

"That's right," said mean old Herr Bergermeister Meisterburger, "so you'd better get crackin'!" And he cackled evilly with Schadenfreude.

The Head-Master had several names, don't you know ... and a predilection for Schadenfreude with his afternoon tea and muffin.

So, what could little Gauss do? He got cracking, the poor lamb.

1 = 1
1 + 2 = 3
1 + 2 + 3 = 6
1 + 2 + 3 + 4 = 10 ... obviously, you just add the last number to the previous sum
1 + 2 + 3 + 4 + 5 = 15 ... now, wait a minute ...
1 + 2 + 3 + 4 + 5 + 6 = 21 ... there seems to be another pattern here ...
1 + 2 + 3 + 4 + 5 + 6 + 7 = 28 ... Gauss looked at the chalkboard, then ...

Got it! little Gauss thought. Then he wrote:

1 + 2 + 3 + ... + 98 + 99 + 100 = 5,050.

"G'nite, Teach!" Gauss crowed, and off he skipped home with Suze, where they shared a chocolate milk and a strawberry poptart ...
before all that sugar glaze had been added to the crust (shudder).

And Herr Herren HerringSchönheit was left stuttering a "Vat? Vat? Git back here, you rapscallion!"

Okay, what's a rap-scallion? Is it a thug onion with 'tude, a glock, and a triple-platinum pedigree?

But all Herr whatz-his-name could do was sputter, because little Gauss got it. The sum from one (zero, actually) to any number reduces to a simple formula:

sum [1..n] = n * (n + 1) / 2

Either n or n + 1 is even, so one of them is divisible by two, so it works. Try it out:

1 = 1 * (1 + 1) / 2 = 1 * 2 / 2 = 1
1 + 2 = 2 * (2 + 1) / 2 = 2 * 3 / 2 = 3
1 + 2 + 3 = 3 * (3 + 1) / 2 = 3 * 4 / 2 = 3 * 2 = 6

... so obviously:

1 + 2 + 3 + ... + 31,415 = 31,415 * (31,415 + 1) / 2 = 31,415 * 15,708 = that big number I computed earlier. Yeah, almost five hundred million!

Put that in your "ceci n'est pas une pipe" and smoke it. I'd buy that for a dollar. Now, there's something you can impress your friends at Happy Hour with.

Epilogue

Okay, so you can sum 1 to ... whatevs, but what if a friend asks you, very kindly, and very politely: "Okay, Ms. Smartypants, how about summing 1,000 to 2,000? What's that, huh? Betcha can't do that! Huh? Amirite? Amirite? Lookacha, sweating bullets now, arencha? So, well, what is it? Huh?"

Yeah, really politely. Like that.

Now you could say, "But-but-but, Gauss' summer[-function, not -time] doesn't work like that." But then you'd have feet of clay and egg on your face. And with this hot summer coming, you don't want egg on your face. Trust me on that one.

So. What to do? Think about it, that's what.

You simply make your 1,000 your zero, by scaling each number back by 1,000, then get your sum, then scale each number forward again by 1,000. For each time you applied the scale, you apply the scale to the result.

So, the sum of 1,000 to 2,000 inclusive is:

1,000 + 1,001 + 1,002 + ... + 1,998 + 1,999 + 2,000 =

scaled back is

0 + 1 + 2 + ...
+ 998 + 999 + 1000 = 1000 * (1000 + 1) / 2 = 500,500

Now scale each number forward again, by adding 1,000 back to each number. You did that 1,001 times, right? (Don't forget to count the zero.) That's a rescalation (that's a word now) of 1,001,000. Add that to your sum to get 1,501,500. There's your answer.

Let's check:

Prelude> sum [1000..2000]
1501500

Bingo! Tickets! This must be the Front Row!

And you, very politely, just told your friend where they could put their superior 'tude, too. Win-win! Good times; good times!

See y'all tomorrow. By the way, what's the number of moves needed to solve the Towers of Hanoi? Not that 'Towers of Hanoi' is going to be my topic for 'T.' Not at all.

geophf (Tue, 22 Apr 2014 14:49:00 +0000)

How to Fake Dynamic Binding in Nix
http://r6.ca/blog/20140422T142911Z.html
The Nix language is a purely functional, lazy, statically scoped configuration
language that is commonly used to specify software package configurations, OS system configurations, distributed system configurations, and cloud system configurations.
Static scoping means that variables are statically bound;
all variable references are resolved based on their scope at declaration time.
For example, if we declare a set with recursive bindings,
❄> let a = rec { x = "abc"; x2 = x + "123"; }; in a
{ x = "abc"; x2 = "abc123"; }
then the use of x in the definition of x2 is resolved to "abc".
Even if we later update the definition of x, the definition of x2 will not change.
❄> let a = rec { x = "abc"; x2 = x + "123"; }; in a // { x = "def"; }
{ x = "def"; x2 = "abc123"; }
Generally speaking, static binding is a good thing.
I find that languages with static scoping are easy to understand because variable references can be followed in the source code.
Static scoping lets Nix be referentially transparent, so, modulo α-renaming, you can always substitute expressions by their (partially) normalized values.
In the previous example, we can replace the expression rec { x = "abc"; x2 = x + "123"; } with { x = "abc"; x2 = "abc123"; } and the result of the program does not change.
❄> let a = { x = "abc"; x2 = "abc123"; }; in a // { x = "def"; }
{ x = "def"; x2 = "abc123"; }
That said, on some occasions it would be nice to have some dynamic binding.
Dynamic binding is used in Nixpkgs to allow users to override libraries with alternate versions in such a way that all other software packages that depend on it pick up the replacement version.
For example, we might have the following in our nixpkgs declaration
…
boost149 = callPackage ../development/libraries/boost/1.49.nix { };
boost155 = callPackage ../development/libraries/boost/1.55.nix { };
boost = boost155;
…
and perhaps we want to make boost149 the default boost version to work around some regression.
If we write nixpkgs // { boost = boost149; } then we only update the boost field of the nix package collection and none of the packages depending on boost will change.
Instead we need to use the config.packageOverrides to change boost in such a way that all expressions depending on boost are also updated.
Our goal is to understand the technique that packageOverrides and other similar overrides employ to achieve this sort of dynamic binding in a statically scoped language such as Nix.
This same technique is also used to give semantics to object-oriented languages.
First we need to review normal recursive bindings.
The rec operator can almost be defined as a function in Nix itself by taking a fixed point.
Recall that in Nix lambda expressions are written as x: expr.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; in
let a = fix (self: { x = "abc"; x2 = self.x + "123"; }); in
a
{ x = "abc"; x2 = "abc123"; }
The function self: { x = "abc"; x2 = self.x + "123"; } is an object written in the open recursive style.
By taking the fixpoint of this function, the recursion is closed yielding the desired value.
Written this way, we had to prefix the recursive references to x with self..
However using Nix’s with operator, we can bring the values of self into scope allowing us to drop the prefix.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; in
let a = fix (self: with self; { x = "abc"; x2 = x + "123"; }); in
a
{ x = "abc"; x2 = "abc123"; }
This is pretty close to a definition of rec.
We can almost think of rec { bindings } as syntactic sugar for fix (self: with self; { bindings }).
In order to override the definition of x instead of updating it, we need to intercept the definition of x before the open recursion is closed.
To this end, we write a fixWithOverride function that takes a set of overriding bindings and an open recursive object and applies the override bindings before closing the recursion.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint;
withOverride = overrides: f: self: f self // overrides;
fixWithOverride = overrides: f: fix (withOverride overrides f); in
let open_a = self: with self; { x = "abc"; x2 = x + "123"; }; in
fixWithOverride { x = "def"; } open_a
{ x = "def"; x2 = "def123"; }
Success!
We have managed to override the definition of x and get the definition of x2 updated automatically to reflect the new value of x.
Let us step through this code to see how it works.
First we defined open_a to be the same as our previous definition of a, but written as an open recursive object.
The expression fixWithOverride { x = "def"; } open_a reduces to fix (withOverride { x = "def"; } open_a).
What the withOverride function does is take an open recursive object and return a new open recursive object with updated bindings.
In particular, withOverride { x = "def"; } open_a reduces to

self: (with self; { x = "abc"; x2 = x + "123"; }) // { x = "def"; }
which in turn reduces to self: { x = "def"; x2 = self.x + "123"; }.
Finally, fix takes the fixpoint of this updated open recursive object to get the closed value { x = "def"; x2 = "def123"; }.
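The same mechanism can be sketched in Haskell (my illustration, not from the post; the record and field names mirror the Nix example): fix closes the recursion, and an override function rewrites fields of the open object before the recursion is closed, so dependent fields see the new value.

```haskell
import Data.Function (fix)

-- A record standing in for the Nix set { x = ...; x2 = ...; }.
data Obj = Obj { x :: String, x2 :: String } deriving (Eq, Show)

-- The open recursive object: recursive references go through self,
-- like  self: { x = "abc"; x2 = self.x + "123"; }  in Nix.
openA :: Obj -> Obj
openA self = Obj { x = "abc", x2 = x self ++ "123" }

-- Rewrite fields of the open object before the recursion is closed,
-- like  withOverride = overrides: f: self: f self // overrides.
withOverride :: (Obj -> Obj) -> (Obj -> Obj) -> (Obj -> Obj)
withOverride override f = \self -> override (f self)
```

With these definitions, fix openA yields x2 = "abc123", while fixing the overridden object, fix (withOverride (\o -> o { x = "def" }) openA), yields x2 = "def123": the override is seen by every field that depends on x.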
This is great; however, we do not want to have to work with open recursive objects everywhere.
Instead, what we can do is build a closed recursive value, but tack on an extra field named _override that provides access to withOverride applied to the open recursive object.
This will allow us to perform both updates and overrides at our discretion.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint;
withOverride = overrides: f: self: f self // overrides;
virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec
{ example1 = a;
example2 = a // { x = "def"; };
example3 = a._override { x = "def"; };
example4 = example3._override { y = true; };
example5 = example4._override { x = "ghi"; };
}
{ example1 = { _override = <LAMBDA>; x = "abc"; x2 = "abc123"; };
  example2 = { _override = <LAMBDA>; x = "def"; x2 = "abc123"; };
  example3 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; };
  example4 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; y = true; };
  example5 = { _override = <LAMBDA>; x = "ghi"; x2 = "ghi123"; y = true; };
}
One remaining problem with the above definition of virtual is that we cannot override the method x2 and still get access to x.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint;
withOverride = overrides: f: self: f self // overrides;
virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in
a._override { x2 = x + "456"; }
error: undefined variable `x' at `(string):5:23'
Remembering that Nix is statically scoped, we see that the variable x in a._override { x2 = x + "456"; } is a dangling reference and does not refer to anything in lexical scope.
To remedy this, we allow the overrides parameter of withOverride to optionally take an open recursive object rather than necessarily a set of bindings.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint;
withOverride = overrides: f: self: f self //
(if builtins.isFunction overrides then overrides self else overrides);
virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec
{ example6 = a._override (self: with self; { x2 = x + "456"; });
example7 = example6._override { x = "def"; };
}
{ example6 = { _override = <LAMBDA>; x = "abc"; x2 = "abc456"; };
  example7 = { _override = <LAMBDA>; x = "def"; x2 = "def456"; };
}
This illustrates the basic technique that packageOverrides and other similar overrides use.
The packageOverrides code is a bit more complex because there are multiple points of entry into the package override system, but the above is the essential idea behind it.
The makeOverridable function from customisation.nix creates an override field similar to my _override field above, but overrides function arguments rather than overriding recursive bindings.
The syntax virtual (self: with self; { bindings }) is a little heavy.
One could imagine adding a virtual keyword to Nix, similar to rec, so that virtual { bindings } would denote this expression.
After writing all this I am not certain my title is correct.
I called this faking dynamic binding, but I think one could argue that this is actually real dynamic binding.

http://r6.ca/blog/20140422T142911Z.html (Tue, 22 Apr 2014 14:29:11 +0000)

'R' is fer Realz, yo!
http://logicaltypes.blogspot.com/2014/04/r-is-fer-realz-yo.html
'R' is for Let's Get Real (the Real Number line)

... because this reality you're living in? It isn't. It's entirely manufactured for you and by you, and you have no clue that you're living in this unreality that you think isn't.

It's as simple, and as pervasive, as this: The 'Real Numbers'? They aren't.

Let's take a step back.

First, as you know, there was counting, and that started, naturally, from the number one. But even that statement is so obviously flawed and ridiculous on the face of it to a modern-day mathematician. Why are you starting with the number one? Why aren't you starting with the number zero?

This isn't an argument over semantics, by the way; this has real (heh!), fundamental impact, for the number one, in counting, is not the identity. You cannot add one to a number and get that number back, and if you can't do that, you don't have a category (a 'Ring'), and if you don't have that, you have nothing, because your number system has no basis, no foundation, and anything goes because nothing is sure.

But getting to the number zero, admitting that it exists, even though it represents the zero quantity, so technically it refers to things that don't exist (and therefore a fundamental problem with the number zero in ancient societies) ... well, there was a philosopher who posited that the number zero existed. He was summarily executed by Plato and his 'platonic' buddies because he had spouted heresy. If the number zero exists, then you had to be able to divide by it, and when you did, you got infinity. And, obviously, infinity cannot be allowed to exist, so no number zero for you.

We went on for a long, long time without the number zero. Even unto today. You study the German rule, then you learn your multiplication tables starting from which number? Not zero: one. "One times one is one. One times two is two. One times three is three." Where is the recitation for the "Zero times ..."?

And I mean 'we' as Western society, as shaped by Western philosophy.
The Easterners, led by the Indians, had no problem admitting the number zero (they even had a symbol for it, 0) and then playing in the infinite playground it opened up, and therefore Eastern philosophy thrived, flourished, while Western philosophy, and society, stagnated, ...

... for one thousand years, ...

Just because ... now get this ... just because people refused to open their eyes to a new way of seeing the world, ...

... through the number zero.

BOOM! That's what happened when finally West met East, through exchange of ideas through trade (spices) and the Crusades (coffee served with croissants), and philosophers started talking, and the number zero was raised as a possibility.

BOOM! Mathematics, mathematical ideas, and ideas themselves, exploded onto the world and into thought. Now that there was zero, there was infinity; now that there was infinity, and it was tenable, people now had the freedom to explore spaces that didn't exist before. People could go to the New World now, both figuratively and literally.

For growth in Mathematics comes from opening up your mind to the possibilities you wouldn't ('couldn't') consider before, and growth in Mathematics leads to opening the mind further.

Take, for example, the expansion of the counting numbers, from not admitting zero to, now, admitting it, yes, but then the fractional numbers.
You could count fractionally now. From zero to infinity there were an infinite number of numbers, but, with fractions, we now found that from zero to one, there were an equal number of infinite number of fractions.

The really neat discovery was that if you put all the fractions in one set, and you put all the counting numbers into another, there was a one-to-one correspondence between the two. An infinity of counting numbers was the same size as an infinity of counting-by-fraction numbers. Wow.

So, infinity was the biggest number, fer realz, then, eh? No, not really.

Because then came what we call the 'Real Numbers' (which aren't, not by a long shot), and then we found an infinite number of numbers between one-half and one-third. But the thing with these numbers? There were rationals (fractionals) in there, to be sure, but they were also irrationals. There were numbers like π and τ and e and other weird and wonderful numbers, and the problem with these numbers was that there was no correspondence between them and the rational numbers. There was no way you could combine rational numbers in any form and point directly to τ, for example. These numbers were transcendent.

What's more: they were more. There were infinitely more transcendent numbers, irrational numbers, than there were rationals. And not even countably infinite more: they were uncountably more infinite. There was an infinite that was bigger than infinity, and this we call the Continuum.

Why? Because between zero and one, and then between one and two, there's this measured, discrete gap, and this we use for counting. There's a measured, even step between the counting numbers, and even between the fractional numbers: you can count by them, because between them there is this discreteness.

Between the Reals there's no measurable gap. You can't count by them, and you can't add 'just this much' (every time) to go from π to τ ...

(Heh, actually, you can: τ = π + π, but then what? What's the 'next' irrational number?
There's no such thing as the 'next' irrational number, because no matter what 'next' number you choose, there will always be an uncountably infinite number of numbers between that 'next' number and the number you started from, so your 'next' number isn't the next number at all, and never will be.)

So, wow, the Reals. Lots of them. They cover everything, then, right?

Not even close.

There are numbers that are not numbers. For example, what is the number of all the functions that yield the number zero? There are, in fact, an infinite number of those. How about all the functions that give you ('eventually') π? ... Ooh! There are several different ones to find π, aren't there? Yes. Yes, there are. In fact, there are an uncountably infinite number of functions that compute π.

Now, wait. You're saying, geophf, that there are an uncountably infinite number of functions to find each and every Real number, and that the Real numbers are uncountable as well, so that means ...

Yeah, the Continuum reigned supreme for just a while as the biggest-big number, but it was soon toppled by this Powerset infinity (my term for it; it's actually called something else). Now, I don't know the relation between the functions that yield numbers, and the functions that construct functions that do that. But do you see where we're going with this? As big as you can stretch yourself, there are new vistas to see in mathematics (and meta-mathematics, let's not neglect that, now, shall we?).

But we still haven't scratched the surface. Is the world quantized, like the rational numbers? Or is it a continuum, like the Reals? Or is it something else, even something more?

Electricity. Direct current comes to you in a straight, steady line. The thing about DC ('direct current')? It sucks. (Just ask Marvel.)

If you want clean, pure, ... powerful, ... well: power, over any kind of distance, you have to ask ...
not our friend Tommy (Edison), but our wild child, Tesla. He proposed to Edison that we should use AC ('alternating current') to provide electricity, and Edison threw him out of his lab, that idiot, telling him never to show his face there again. Guess how your electricity is delivered to your home today?

The thing about alternating current? It's a wave-form, and not only that, it's a triple wave-form. How do real numbers model electricity? Well, with DC, you've got one number: "That there is 5 volts or 5 amps or 1.21 gigawatts." Boy, that last one came out of the blue: like a bolt of lightning! Heh.

But if it's alternating current, then you need the sine and cosine functions to describe your power. Functions? Wouldn't it be nice if it were just a number? Yes, it would be nice, and there is a number to describe wave-form functions, like your alternating current.

'Imaginary' numbers.

They are called 'imaginary numbers' because, if you look hard enough on the number line, with good enough eyes, eventually you'll see the number π or τ or e or 1, or 2, or 3, or even zero. But no matter how hard you strain your eyes, you will never see a number with an imaginary component, because why? Because most imaginary numbers, being on the curve of the wave-form, are either above or below the number line. They 'aren't' numbers, then, because they're not on the number line. They're imaginary.

I mean, come on! The square root of negative one? Why would anybody do this? Unless they were daft or a bit batty.

The thing is, without imaginary numbers, we wouldn't have the forms to get our heads around alternating current. Most of the world, except those very lucky few who lived within a mile or two of a power plant, would be in darkness. And the computer? Pfft! Don't get me started. Hie thee to the nunnery, because we are now back in the Dark Ages. Or at most in the Age of 'Enlightenment,' where you had to run for cover when the landlady bellowed "'ware slops!" ...
unless you wanted raw sewage mixed with rotted fruit on your head.

But now, here we are, because we have both Real and imaginary numbers, together giving us the complex number set (which, it turns out, is not bigger than the Reals, as there is a one-to-one correspondence between each real number and each complex number. Fancy that! An infinity 'more' of complex numbers above and below the Real number line gives the same number of complex numbers as Reals).

We're good? Not even close.

Last I checked, I don't live in Flatland, and, last I checked, nor do you. Complex numbers go 'above' and 'below' the Real number line, but ... what about the third dimension? Are there numbers to model us in three dimensions? What would such numbers look like?

And here's a stunner. If I were on Mars, or the Moon, and you were here, reading this blog post, how would I know where to look to see you? The Moon, and Mars, too, have their own three-dimensional frames of reference, and the Earth has its own, too (it's called geocentric). So, to draw a line from a satellite (such as the Moon, known as 'Earth's satellite') so that it can look down at a spot on the Earth, you actually have to use a four-dimensional number to connect the two three-dimensional frames of reference so that one can look at the other. This four-dimensional number is called the quaternion.

It's simple, really; it's just rocket science. And it's really ... mind-bending, at first, wrapping your head around the math and drawing pictures, or using both your hands, three fingers out indicating both sets of axes, and you use your nose to draw the line connecting the two, and then you scream, 'but how do I measure the angles?'

Not that I've worked on satellite projects or anything. cough-EarthWatch-cough.

But nowadays, you can't get into making a realistic game without having quaternions under your belt. Monster sees you, monster charges you, monster bonks you on the head. Game over, thank you for playing, please insert $1.50 to die ...
that is to say, to try again.

The thing is: how does the monster 'see' you? The monster has its own frame of reference, just as you do. The monster exists in its own three-dimensional coordinate system, just as you do. If you were standing on a little hillock, would you expect the monster not to see you because you're slightly elevated? Of course not! The monster sees you, the monster bonks you. All of this happens through transformation of disparate coordinate systems via quaternions.

Now that's something to impress people with at cocktail parties. "Yeah, I spent all day translating coordinate systems using quaternions. Busy day, busy day."

Just don't say that to a mathematician, because he'll (in general, 'he') pause, scratch his head, then ask: "So you were checking out babes checking you out?" Then you'll have to admit that, no, not that: you were instead avoiding your boss trying to catch your eye so he could hand you a whole stack of TPS reports to work on over the weekend.

Like I ... didn't. Ooh. Ouch! Guess who was working through Easter? Fun, fun!

Okay, though. Four dimensions. We've got it all, now, right?

Not if you're a bee. Okay, where did that come from?

Bees see the world differently from you and me. Please reflect on the syntax of that sentence, writers. Bees see the world differently (not: 'different') from you and me (not: 'from you and I'). Gosh! Where is the (American) English language going? (But I digress.) (As always.)

If you look at how they communicate through their dance, you see an orientation, but you also see some other activity: they 'waggle' (so it's called the 'waggle-dance'), and the vigor at which they do their waggle communicates a more interesting cache of nectar.
There are other factors, too. The fact of the matter: three dimensions are not enough for the bee's dance to communicate what it needs to say to the other bees about the location, distance, and quantity of nectar to be found. So, it has its waggle-dance to communicate this information. Everybody knows this.

Until, one little girl, working on her Ph.D. in mathematics, stopped by her daddy's apiary, and, at his invitation, watched what he was doing. Eureka. "Daddy," she said, "those bees are dancing in six dimensions." Guess who changed the topic of her Ph.D., right then and there?

Combining distance, angle from the sun, quantity of interest, ... all the factors, the bees came up with a dance. They had only three dimensions to communicate six things. The thing is, nobody told the bees they had only three dimensions to work with. So they do their dance in six dimensions. If you map what they are doing up to the sixth dimension, it gives a simple (six-dimensional) vector, which conveys all the information in one number.

Bees live in six dimensions, and they live pretty darn well in it, too. Or, put this way: 80% of the world's food supply would disappear if bees didn't do what they did. You are living in six dimensions, or, more correctly, you are alive now, thanks to a little six-dimensional dance.

Six dimensions.

Okay, but what if you're Buckaroo Banzai? Pfft! Eight dimensions? Lightweight. In String Theory, we need at least ten dimensions for superstrings, and twenty-six dimensions for some types of strings.

So, 'R' is for real numbers. The neat thing about numbers is ... they can get as big as you can think them. And they're good for a Ph.D. thesis ... or two.

http://logicaltypes.blogspot.com/2014/04/f-is-for-function.html
geophf
tag:blogger.com,1999:blog-4650294074444534066.post-3188544716869431955
Tue, 22 Apr 2014 01:06:00 +0000

'F' is for function
http://logicaltypes.blogspot.com/2014/04/f-is-for-function.html
'F' is for function.

Okay, you just knew we were going there today, didn't you?

Today, and ... well, every day in the last 2,600 years (I'm not joking), has had this dynamic tension between objects and functions. Which is greater? Read this tell-all post to find out!

Sorry, I channeled Carl Sagan in Cosmos for a second. Not really. But I digress.

WAY back when, back in the olden times, there were just things. And then, when you had a full belly and your one thing, your neighbor, who happened to be named 'Jones,' had two things. And then keeping up with the Joneses came to be a real problem, because you had one thing, and they had two. So you got one more. And counting was invented. For things.

In fact, in some cultures, counting doesn't exist as an abstract thing. You don't count, in general; you have different counting systems for different things. Counting, for example, pens is a very different thing from counting eggs.

Ha-ha, so quaint! Who would ever do that? Well, you do. What time is it? You have a different way for counting time than you do for counting distance or for counting groceries that you need to buy, don't you? But, at base, they are all just things: time, distance, and gallons of milk.

Sorry: litres of milk. How very British Imperial of me!

Question: why are Americans just about the only ones using the British Imperial system any more? That is, the real question is: why are the British not using the British Imperial system?
Are they onto something we're not? Anybody want a satellite coded wrongly because some engineers intermingled British Imperial units with metric?

Okay, so: units, things, metrics, whatevs. And counting.

Counting is doing something to something. Whereas, at first, it's important you have that chicken in your stomach (cooked is nice, but optional), suddenly it's become important that you have not one chicken, a hen, but a (very rarely) occasional rooster with all your hens would be nice, too. (Roosters are big jerks, by the way.)

Counting was all we had, after the chickens and their yummy eggs, that is, and that for a long while.

Question: which came first, the chicken or the egg? Answer: that's obvious, the egg. Think about it. You have eggs for breakfast, then you have a chicken-salad sandwich. Like I said: obvious! First the egg, then the chicken. Unless you're one of those weirdos that eat chicken sausage with your breakfast; then it's a toss-up, but in that case I have no help for you.

Okay: chickens, eggs (egg came first) (you read that first, right here on this blog) (and you thought there was no point to math!), and then counting, for a long, long time.

Then ... counting became too much, because you ran out of fingers, then toes, to count your mutton on. It happens. That's a good thing, because then people started to have to think (that's a good thing, too), and they thought, 'How do I count all my mutton when I run out of fingers and toes? I need to know that I have more sheep than the Joneses, and I've paired each of my sheep to his, and I have more than twenty-two than he does. How many more do I have?'

Then people started getting smart.
They did more than one-to-one, or pairwise, groupings; they started grouping numbers, themselves.

All this is very nice background, geophf, but to what end?

First counting, then grouping: those are operations, on or using numbers. Then addition and subtraction, which are abstractions, or generalizations, of counting; then multiplication; and let's not even go into division.

But.

Counting, adding, multiplying. These things worked for your chickens (and eggs) (first), but also on your mutton, and, when you settled down, it worked on how many feet, yards, the hectares of land you had, too. The same principles applied: numbers worked on things, no matter what the things were, and the operations on numbers worked ... well, no matter what the numbers counted.

Now. Today. Or, sometime back in the 40s anyway, way back before the Birkenstock-wearers were mellowing out with lattes, anyway. A couple of dudes (Samuel Eilenberg and Saunders Mac Lane, specifically) said, 'Hey, wait!' they exclaimed, 'it doesn't matter what the thing is, at all!' they said, 'as long as the things are all the same thing, we don't care at all about the thing itself!'

And they invented a little something that eventually became known as Category Theory, which is the basis of ... well, a lot of things now: quantum thermodynamics, for one, and computer science, for another. And they nailed it. They created a one-to-one correspondence between the things in categories and the things in Set theory.

Why is that important? Set theory is this:

{}

Got me?

A set is a thing that contains a thing.
The fundamental set is the set that contains nothing (the 'empty set'), which is this:

{}

And other sets are sets that contain sets:

{{}, {{}, {{}, {}}}}

With sets you can model numbers if you'd like:

0: {}
1: {{}} (the set that contains '0')
2: {{}, {{}}} (the set that contains '0' and '1')
3: {{}, {{}}, {{}, {{}}}} (the set that contains '0', '1', and '2')
etc...

And as Gödel (eventually) showed, once you model numbers, you can model anything. (We'll get to that ... for 'G' is for Gödel-numbering, I'm thinking.)

How? Well, once you have numbers, you can count with them (as just demonstrated), you can add one to them (the 'successor' function), you can take away one from them (the 'predecessor' function), you can add two of them together (successively add one to the first number until the second number is zero. Try it!), subtract, by doing the opposite (or by just removing duplicates from both until one of them is gone ... 'pairwise subtraction'!), multiply two together (successive addition), divide numbers ... (um, yeah) ... and, well, do anything with them.

Once you have Number, you have counting, and from that, you have everything else.

So, our dudes mapped category-theory things (morphisms) down to Set mapping functions, and they had it.

Because why? Well, it's rather cumbersome to carry a whole bunch of sets for your PERT formulation, especially when you want to see the money you (don't) save by paying off your mortgage's principal early.

Banker: oh, geophf, pay off your mortgage principal early, you'll save a ton in the long run.
me: uh, that was true when mortgage interest was 10-15-21% ... (you remember those Jimmy Carter days?)
me, continuing: ... but now that my interest rate is 2.5%, I save zip after 30 years for early payment.

Try it. Run the numbers. Bankers haven't. They're just repeating what was good advice, ...
twenty-five years ago.

But when you have objects, whose representations do not matter, be they 0, 1, a googol, or {{}, {{}}, {{}, {{}}}}, then you can concentrate on what's really important.

The functions.

Whoopsie! I slipped and gave the answer, didn't I? Me, and Alan Kay, both. Alan Kay, inventor of Smalltalk, and inspirer of the movie Tron: the objects aren't important, it's the messages that go between them that are. So there.

Well, okay, functions win. It used to be the chicken (or the egg), but now we've ... grown a bit, worrying less about what fills our stomach, and where we're going to get our next meal (and not be the next meal), you know: survival. We've transcended our stomachs. Maybe. Well, okay, functions rule, so back off.

The dudes didn't stop there, because there have been some really neat things going on with functions, since discovering functions are things in and of themselves, because if functions are things, in and of themselves, then, well, ...

Well, you can number them (tomorrow) (promise), you can count them, you can add them, you can multiply them, you can exponentiate them, you can ... You can, in fact, operate on them, applying functions to functions.

So, our dudes did just that. They said their functions, their morphisms, have two rules to be a function. You can ask what it is, its identity:

id f = f

And you can string them together, composing them:

g . f = ... well, g . f

So, if you have

> f x = succ x

and

> g x = x * 2

then (g . f) gives you some function, we can call it h here if you'd like, and h is the composition of applying f to your number first, then g, or

> h x = 2 * (x + 1)

Try it! Pull up your functional programming language listener (I use Haskell for now) (until I find or invent something better) and enter the above.

Now, it's an interesting proposition to show that

(g . f) == h

But, for now, in Haskell, we cannot do that. Now, in some other programming languages (I'm thinking of my experience with Twelf), you can, but first you have to provide numbers, on your own, like: 0, 1, 2, ... ... and prove they are what they are, you know: they follow in order, and stuff like that ... Then you provide the functions for identity and composition, and then you can prove the above statement, because now you have theorems and a theorem prover.

Yay!

No, ... no 'yay'! I mean, 'yay,' if that's your thing, proving theorems (I happen to have fun doing that at times), but 'ick,' otherwise, because it's a lot of work to get to things like plain old addition.

See, theorem-prover-folk are these hard-core fighters, right on the edge of discovering new mathematics and science. Not plain-old-little me, I just use the stuff (to build new theories from preexisting, proved ones), so I'm a lover of the stuff. To quote an eminent mathematical philosopher: I'm a lover, not a fighter. And, no, Billie Jean is not my lover. She's just a girl that claims that I am the one.

Anyway!

So, but anyway, using the above two functions, identity and composition, you're now able to reason about functions generally, making new functions by stringing together (composing) other, preexisting functions. This is so important that Eilenberg and Mac Lane gave this process its own name: 'natural transformation.'

But okay, categories have units, and you don't particularly care what those units are; we saw one set of units, numbers, themselves, as an example. But units can be anything. Even functions, so you have a category of functions (several categories, in fact, as there are different kinds of functions, just so you know) ... You can even have categories of categories of things (things being some arbitrary unitary types, like numbers, or functions, or categories, again ... it can get as wild and woolly as you want it to get).

And types.
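That claim about (g . f) and h can't be proved inside Haskell directly, but it can at least be spot-checked pointwise. Here's a minimal sketch (the helper name agreeOn is mine, not from the post):

```haskell
f, g, h :: Int -> Int
f x = succ x
g x = x * 2
h x = 2 * (x + 1)

-- Function equality isn't decidable, so sample both sides over a range:
agreeOn :: [Int] -> Bool
agreeOn = all (\x -> (g . f) x == h x)

main :: IO ()
main = print (agreeOn [-100 .. 100])  -- True
```

A property-testing library would do the sampling for you, but the idea is the same: testing, not proving.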
Categories can be used to model types, so you can have a category that represents the natural numbers ... as types:

Nat : Category
zero : 1 -> Nat
succ : Nat -> Nat

so

zero = zero ()
one = succ (zero ())
two = succ one
three = succ two
etc ...

And those formulas, those functions, work regardless of the underlying type of things like 'number' or even 'function.'

Category theory was originally called 'abstract nonsense,' by the way, because Eilenberg and Mac Lane saw it as so abstract that they were actually talking about nothing. Sort of like how Set Theory does, starting from nothing, the empty set, and building everything from that object, but even more abstractly than that, because category theory doesn't even have the 'nothing'-object to start from; it just starts identifying and composing functions against objects about which it doesn't care what they are.

Now, there was, at one point, a working definition of abstraction, and that was: the more useful a theorem is, the less abstract it is. The contrapositive implied: the more abstract a formula is, the less useful.

But I find I play in these domains very happily, and in playing, I'm able to invent stuff, useful stuff that's able to do things, in software, that other software engineers struggle with and even give up on, stating: "This can't be done."

I love it when I'm told I "can't" do something. You know what I hear when I hear the word "can't"? I hear opportunity. Everyone else may have given up on this particular task, because it 'can't' be done. But me? I go right in with my little abstract nonsense, reinventing the language that was unable to frame the problem before with a language that can, and I find, hey, I not only can do it, but I just did.

Language frames our thoughts, telling us what we can and cannot do.
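The Nat-as-types signature earlier can be sketched as a plain Haskell datatype. This is a rough transliteration under my own naming, not the post's code:

```haskell
-- Values are built only with Zero and Succ; we never care what they 'are'.
data Nat = Zero | Succ Nat deriving Show

-- Addition by successive application of Succ, as described earlier.
add :: Nat -> Nat -> Nat
add Zero     m = m
add (Succ n) m = Succ (add n m)

-- 'Count' a Nat back down to an ordinary Int.
toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n

main :: IO ()
main = print (toInt (add (Succ (Succ Zero)) (Succ Zero)))  -- 3
```

Note that add never inspects what a Nat "is" beyond its two constructors, which is the point: the representation doesn't matter.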
English is a very complex language, taking in a lot of the world and describing it just so. The language of mathematics is very, very simple, and describes a very tiny world, a world so tiny, in fact, it doesn't exist in reality. It's a quantum-sized world, mathematics, and it has the possibility to be even smaller than that.

But so what? I get to invent a world that doesn't exist, each time I apply a theorem from mathematics that others are unfamiliar with. And when I do so, I get to create a world where the impossible is possible.

It's very liberating, being able to do things others 'can't.' And when I pull it off, not only do I win, but my customers win, and maybe even those developers who 'can't' win, too, if they're willing to open their eyes to see possibilities they didn't see before.

'F' is for functions: a totally different way of seeing the world, not as a world of objects, but as a world of possibilities to explore.

geophf
tag:blogger.com,1999:blog-4650294074444534066.post-6407711360119459341
Tue, 08 Apr 2014 00:18:00 +0000

Typeable and Data in Haskell
http://chrisdone.com/posts/data-typeable
http://chrisdone.com/posts/data-typeable
Tue, 22 Apr 2014 00:00:00 +0000

Bellman Confirms A Suspicion
http://existentialtype.wordpress.com/2014/04/21/bellman-confirms-a-suspicion/
As is by now well-known, I regard the supposed opposition between static and dynamic languages as a fallacy: the latter, being a special case of the former, can scarcely be an alternative to it. I cannot tell you how many times I’ve been handed arguments along the lines of “oh, static languages are just fine, but I want something more dynamic,” the speaker not quite realizing the absurdity of what they are saying. Yet somehow this sort of argument has an appeal, and I’ve often wondered why. I think it’s mostly just semantic innocence, but I’ve long had the suspicion that part of it is that it sounds good to be dynamic (active, outgoing, nimble) rather than static (passive, boring, staid). As we all know, much of the popularity of programming languages comes down to such superficialities and misunderstandings, so what else is new?
Well, nothing, really, except that I recently learned (from Guy Blelloch) the origin of the notably inapt term dynamic programming for a highly useful method of memoization invented by Richard Bellman that is consonant with my suspicion. Bellman, it turns out, had much the same thought as mine about the appeal of the word “dynamic”, and used it consciously to further his own ends:
“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes.
“An interesting question is, ‘Where did the name, dynamic programming, come from?’ The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities” (p. 159).
Hilarious, or what? It explains a lot, I must say, and confirms a long-standing suspicion of mine about the persistent belief in a non-existent opposition.

Filed under: Research, Teaching Tagged: dynamic and static typing
Robert Harper
http://existentialtype.wordpress.com/?p=922
Mon, 21 Apr 2014 04:24:44 +0000

Setting up Samsung Wireless Printer on Linux
http://feedproxy.google.com/~r/RomanCheplyaka/~3/dRs6qRYL1IM/2014-04-21-wireless-samsung-printer-linux.html
http://ro-che.info//articles/2014-04-21-wireless-samsung-printer-linux.html
Sun, 20 Apr 2014 21:00:00 +0000

CUFP 2014 Call For Presentations
http://comonad.com/reader/2014/cufp-cfp/
Workshop for
Commercial Users of Functional Programming 2014
Sponsored by SIGPLAN
[CUFP 2014](http://cufp.org/conference)
Co-located with ICFP 2014
Gothenburg, Sweden
Sep 4-6
Talk Proposal Submission Deadline: 27 June 2014
CUFP 2014 Presentation Submission Form
The annual CUFP workshop is a place where people can see how others are using functional programming to solve real world problems; where practitioners meet and collaborate; where language designers and users can share ideas about the future of their favorite language; and where one can learn practical techniques and approaches for putting functional programming to work.
Giving a CUFP Talk
If you have experience using functional languages in a practical setting, we invite you to submit a proposal to give a talk at the workshop. We're looking for two kinds of talks:
Experience reports are typically 25 minutes long, and aim to inform participants about how functional programming plays out in real-world applications, focusing especially on lessons learned and insights gained. Experience reports don't need to be highly technical; reflections on the commercial, management, or software engineering aspects are, if anything, more important.
Technical talks are also 25 minutes long, and should focus on teaching the audience something about a particular technique or methodology, from the point of view of someone who has seen it play out in practice. These talks could cover anything from techniques for building functional concurrent applications, to managing dynamic reconfigurations, to design recipes for using types effectively in large-scale applications. While these talks will often be based on a particular language, they should be accessible to a broad range of programmers.
We strongly encourage submissions from people in communities that are underrepresented in functional programming, including but not limited to women; people of color; people in gender, sexual and romantic minorities; people with disabilities; people residing in Asia, Africa, or Latin America; and people who have never presented at a conference before. We recognize that inclusion is an important part of our mission to promote functional programming. So that CUFP can be a safe environment in which participants openly exchange ideas, we abide by the SIGPLAN Conference Anti-Harassment Policy.
If you are interested in offering a talk, or nominating someone to do
so, please submit your presentation before 27 June 2014 via the
CUFP 2014 Presentation Submission Form
You do not need to submit a paper, just a short proposal for your talk! There will be a short scribe's report of the presentations and discussions but not of the details of individual talks, as the meeting is intended to be more a discussion forum than a technical interchange.
Nevertheless, presentations will be videotaped, and presenters will be expected to sign an ACM copyright release form.
Note that we will need all presenters to register for the CUFP workshop and travel to Gothenburg at their own expense.
Program Committee
Edward Kmett (McGraw Hill Financial), co-chair
Marius Eriksen (Twitter, Inc.), co-chair
Ozgun Ataman (Soostone, Inc.)
Tim Chevalier (AlephCloud)
Derek Elkins (Now Business Intelligence)
Matthew Might (University of Utah)
Richard Minerich (Bayard Rock)
Audrey Tang (Apple, Inc.)
Jason Zaugg (Typesafe)
More information
For more information on CUFP, including videos of presentations from
previous years, take a look at the CUFP website at cufp.org. Acceptance and rejection letters will be sent out by July 16th.
Guidance on giving a great CUFP talk
Focus on the interesting bits: Think about what will distinguish your talk, and what will engage the audience, and focus there. There are a number of places to look for those interesting bits.
Setting: FP is pretty well established in some areas, including formal verification, financial processing, and server-side web services. An unusual setting can be a source of interest. If you're deploying FP-based mobile UIs or building servers on oil rigs, then the challenges of that scenario are worth focusing on. Did FP help or hinder in adapting to the setting?
Technology: The CUFP audience is hungry to learn about how FP techniques work in practice. What design patterns have you applied, and to what areas? Did you use functional reactive programming for user interfaces, or DSLs for playing chess, or fault-tolerant actors for large scale geological data processing? Teach us something about the techniques you used, and why we should consider using them ourselves.
Getting things done: How did you deal with large software development in the absence of the myriad of pre-existing support tools that are often expected in larger commercial environments (IDEs, coverage tools, debuggers, profilers) and without larger, proven bodies of libraries? Did you hit any brick walls that required support from the community?
Don't just be a cheerleader: It's easy to write a rah-rah talk about how well FP worked for you, but CUFP is more interesting when the talks also spend time on what _doesn't_ work. Even when the results were all great, you should spend more time on the challenges along the way than on the parts that went smoothly.

Edward Kmett
http://comonad.com/reader/?p=923
Sun, 20 Apr 2014 19:13:41 +0000

Announcing cabal 1.20
http://blog.johantibell.com/2014/04/announcing-cabal-120.html
On behalf of all cabal contributors, I'm proud to announce cabal 1.20. This is quite a big release, with 404 commits since 1.18. To install:
cabal update
cabal install Cabal-1.20.0.0 cabal-install-1.20.0.0
New features
Since there are 404 commits since cabal 1.18, there are too many changes to give all of them a just treatment here. I've cherry-picked some that I thought you would find interesting.
To see the full list of commits, run:
git log Cabal-v1.18.1.3..Cabal-v1.20.0.0
in the cabal repo.
Dependency freezing
If you're building an application that you're delivering to customers, either as binary or as something that runs on your servers, you want to make sure unwanted changes don't sneak in between releases.
For example, say you've found a bug in the just-released 1.0.0.0 version and you want to release 1.0.0.1, which contains a fix. If you build the binary on a different machine than the one you built the release on, you risk building it against a slightly different set of dependencies, which means that you're introducing untested code into your application, possibly causing new bugs.
cabal freeze lets developers of applications freeze the set of dependencies used to make builds reproducible. cabal freeze computes a valid set of package versions, just like cabal install does, and stores the result in cabal.config. You can then check cabal.config into your source control system to make sure that all developers that work on the application build against the exact same set of dependencies.
Here's the contents of the cabal.config file created by running cabal freeze in the cabal-install repo:
constraints: ...
HTTP ==4000.2.8,
array ==0.4.0.1,
base ==4.6.0.1,
bytestring ==0.10.0.2,
...
If you later want to update the set of dependencies either:
manually edit cabal.config or
delete (parts of) cabal.config and re-run cabal freeze.
Note that you should only use cabal freeze on applications you develop in-house, not on packages you intend to upload to Hackage.
Parallel cabal build
Starting with 7.8, GHC now accepts a -j flag to allow using multiple CPU cores to build a single package. This build performance enhancement is now exposed as a -j flag on cabal build (and also cabal test/bench/run). Build time improvements are modest, but free.
Flag to temporarily ignore upper bounds
When a new major version of a package P is released, it usually takes a little while for packages that depend on P to update their version bounds to allow for the new version. This can be frustrating if you just want to test whether your own package, which depends both on some of these other packages and on P, builds with the new P.
The --allow-newer=P flag tells the dependency solver to ignore all upper version bounds on P, allowing you to try to compile all packages against this newer version.
Unnecessary re-linking avoidance
Before 1.20, cabal would sometimes re-link build targets that hadn't changed. For example, if you had several test suites that tested different parts of your library, every test suite would be re-linked when you ran cabal test, even if no source file that the test suite depends on was changed. The same thing would happen for executables and benchmarks.
Now cabal doesn't re-link executables (of any kind) unless something changed.
Streaming cabal test output
cabal test can now stream its output to stdout, making it easier to see progress for long-running tests. Streaming output is enabled by passing --show-details=streaming to cabal test and is off by default (for now).
New cabal exec command
Cabal sandboxes have almost completely replaced previous sandbox implementations. There was one use case that wasn't completely captured by the integrated sandbox implementation, namely starting new shells where the environment was set up to automatically use the sandbox for all GHC commands.
cabal exec allows you to launch any binary in an environment where the sandbox package DB is used by default. In particular you can launch a new shell using cabal exec [ba]sh.
Haddock configuration options
Haddock options can now be set in ~/.cabal/config. Here are the options and their default values:
haddock
-- keep-temp-files: False
-- hoogle: False
-- html: False
-- html-location:
-- executables: False
-- tests: False
-- benchmarks: False
-- all:
-- internal: False
-- css:
-- hyperlink-source: False
-- hscolour-css:
-- contents-location:
How to use cabal
While not strictly related to this release, I thought I'd share how we expect users to use cabal. Using cabal this way makes sure that the features work well for you, now and in the future.
The main message is this: to build a package, use build, not install.
Building packages using cabal install comes from a time when
installing dependencies was more difficult,
depending on non-published packages was difficult (i.e. no sandbox add-source), and
using the other commands required manual configure-ing.
My intention is to remove the need for install from the development workflow altogether. Today the recommended way to build a package is to run this once:
cabal sandbox init
cabal install --only-dependencies # optionally: --enable-tests
The development workflow is then just
cabal build/test/bench/run
No configuring (or manual rebuilding) needed. build implies configure and test/bench/run imply build.
Soon build will also imply the above dependency installation, when running in a sandbox.
Credits
Here are the contributors for this release, ordered by number of commits:
Mikhail Glushenkov
Johan Tibell
Duncan Coutts
Thomas Tuegel
Ian D. Bollinger
Ben Armston
Niklas Hambüchen
Daniel Trstenjak
Tuncer Ayaz
Herbert Valerio Riedel
Tillmann Rendel
Liyang HU
Dominic Steinitz
Brent Yorgey
Ricky Elrod
Geoff Nixon
Florian Hartwig
Bob Ippolito
Björn Peemöller
Wojciech Danilo
Sergei Trofimovich
Ryan Trinkle
Ryan Newton
Roman Cheplyaka
Peter Selinger
Niklas Haas
Nikita Karetnikov
Nick Smallbone
Mike Craig
Markus Pfeiffer
Luke Iannini
Luite Stegeman
John Lato
Jens Petersen
Jason Dagit
Gabor Pali
Daniel Velkov
Ben Millwood
Apologies if someone was left out. Once in a while we have to make commits on behalf of an author. Those authors are not captured above.
Johan Tibell
Sun, 20 Apr 2014 11:33:00 +0000

'M' is for burrito. No, really! (Monads, a burrito-esque introduction)
http://logicaltypes.blogspot.com/2014/04/m-is-for-burrito-no-really-monads.html
'M' is for Burrito. No, really. (A little post about monads)

This one's easy. A monad is a burrito: a nice, soft-shell wrapper hiding all that yummy, but complicated, stuff inside (green peppers and onions? who was the first one to think of that?)

A monad is a burrito, wrapping all that complicated stuff, like an integer:

Just 3

under its soft-shell wrapper:

[1,2,3]

I just showed you two monads. The first one is the 'Maybe' monad, representing semideterminism: we're certain, in this case, that the underlying unit is 'Just' 3, but we could, in a different situation, be equally certain that the underlying unit was 'Nothing'. Those two cases (two = semi) give us our certainty (determinism) one way or the other. The second is the 'List' monad, representing nondeterminism: you could have the empty list [] (no answers, no = non), or, in the above case, a list with a set of different answers (one or several).

So, monad = burrito. Simple, right?

Class dismissed.

"Excuse me, Professor geophf," piped up the wormy student in the back of the class, the one wearing b-c glasses, who always has to ask annoying questions in his whiny voice at the very end of class, making everybody late for their next class and me late for my latte, and you don't want to make me late for my latte! He continued, oblivious: "What about monads that don't contain anything at all? What is that? An empty burrito?"

He giggled at his own joke, thinking himself rather smart and drôle.

"What?" I roared. "An empty burrito? Such sacrilege cannot be! Besides, what's the point of an empty monad? That's just ridiculous!"

I was rather peeved that this young upstart would trespass on the beauty of my burrito-analogue. It was simple and easy to understand. Why mess with it?

"But," he continued in his whiny voice, "what about the empty list, and 'Nothing' in the Maybe monad? They carry no value but are essential to each of those monads. How do you explain these monadic values as burritos? Are they churros?"

I sputtered in fury.
He had me there. So I did what every tyrant teacher does: I relied on my position of authority.

"No, monads are burritos! Get that through your head!" And then: "Class dismissed!"

Everybody fled. I do cut quite the imposing figure when I have that angry glint in my eye and my flinty face set in stone.

Okay, so monads aren't burritos. I admit it. (But not to Mr. Wormwood. Never, never!) There is nothing that says a monad has to carry a value; it's just a convenience to think of monads like that: a wrapper around an 'ordinary' object. But monads are objects themselves, and, in fact, more 'ordinary,' that is: regular, than plain-old objects like integers.

The problem with plain-old objects, the problem that monads address, each in their own way (and there are many species of monads, not just maybes and lists, but eithers, and states, and ... stuff! Lots of wild and weird different stuff!), is that plain-old objects admit the uninhabited instance, ⊥, in their type. So the type of natural numbers is 0, 1, 2, 3, ... but it's also ⊥. Programming languages that realize this (Haskell, for example) deal with it gracefully. Programming languages that realize this but then do nothing about it have a systemic problem: the program goes into an unstable state when it encounters an uninhabited instance.

Something that causes Algol-child language programmers to shake in their booties: the 'null-pointer exception.'

Question: how can Java throw a 'NullPointerException' when it doesn't have the pointer construct in its language at all? Am I the first to call this one out?

So, in these languages that have ⊥ but do not handle it properly, you must always admit that you're not working with functions at all, for:

x + y

is x plus y, but if either is uninhabited, the answer is KABOOM!

A program crash. You just destroyed your entire system. And you didn't even do it: the language designers did.

Monads were an answer to this problem.
A little dude with a long, bushy beard by the name of Kleisli developed the work around what came to be known as the Kleisli categories, which have monads as their objects and (what came to be known as) Kleisli arrows as their morphisms, or functions.

The guarantee of the Kleisli category is that, indeed, every object is a monad, so there's no uninhabited type. This means Kleisli functions always get you back into a Kleisli category (yes: once you enter the monadic domain, you never leave it. That's the guarantee of a Kleisli category: you can only be a monad (or a monadic function, or a monad category ...) to get in, and no monads exist outside the Kleisli categories. The monadic domain is the Hotel California of categories).

Okay, you're stuck. And the guarantee of the monad is that you don't care what's under the soft-shell: there's no 'getting a value out' of a monad, because, as my annoying student, Mr. Wormwood, pointed out, there are monads with no underlying values. So what do you do with them?

Monads have a special function associated with them, called the 'join' operator.

Monads, you see, are a special type in that a join of a monad of a monad is just a monad, so, for example,

join (Just (Just 3))

gives you

Just 3

What does that do for us? Well, the structure of a monadic function is this:

f :: Monad m ↠ a → m b

That is, the function takes an ordinary type a and 'lifts' it into the monadic category, giving a (monadic) answer of type m b (where m is the monadic type and b is the type carried by the monad).

Okay, but monadic functions can't dig out the underlying type in a monad of their category, so how do the functions work at all?

Just as you can lift a unit value into a monad:

return 3 = Just 3

you can also lift an ordinary function into the monadic domain:

liftM succ :: m Int → m Int (for some monadic type m)

Okay, so, but where does that even get us?
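In Haskell these pieces look like this (a minimal sketch using the Maybe monad; join and liftM live in Control.Monad, and halve is a made-up monadic function for illustration):

```haskell
import Control.Monad (join, liftM)

-- join collapses one layer of monadic structure
flattened :: Maybe Int
flattened = join (Just (Just 3))        -- Just 3

-- liftM promotes an ordinary function into the monad
lifted :: Maybe Int
lifted = liftM succ (Just 3)            -- Just 4

-- a monadic function: lifting it and joining gives the same result as (>>=)
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

viaJoin, viaBind :: Maybe Int
viaJoin = join (liftM halve (Just 6))   -- Just 3
viaBind = Just 6 >>= halve              -- Just 3
```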
We're still stuck, right? Well, the thing is, you can even lift monadic functions into the monadic-monadic domain:

liftM f (where f :: Monad m ↠ a → m b) = f' :: Monad m ↠ m a → m (m b)

What does that get us? Well, combining join with a lifted monadic function gets us a monad from monadic input:

join (liftM f (Just 3)) = some monadic answer dependent on the resultant type of f.

This whole lifting-into-the-monadic-domain-and-then-joining-the-result is so common when working in monad categories that a special operator was invented, à la Haskell, named '>>=' (pronounced: 'bind').

Now, with bind, the true power of monads comes out:

m >>= f >>= g >>= h >>= ... etc.

What you see here is a monad being bound, in turn, to the monadic functions f, g, h, etc.

So, okay, you've got a short-hand for monadically binding functions together. Great. Good on you, and have a latte.

The thing here is this. In 'plain-old' programming, you have this ever-present fear that a NullPointerException is going to ruin your life, so you can't write:

obj.getA().getB().getC().thenDoSomething()

getting an A from obj, then getting a B from that A, then getting a C from that B, then doing something with that C.

I mean, you can write that code, but if any of those objects (obj, A, B, C) is null, your whole statement explodes.

No, worse than that: your entire system blows up, and you have to deal with irate customers wondering why they got a 505-Service down error when they submitted their credit card information but forgot to enter their zip code, or some such.

So you have to wrap that statement in a try-block, and try it and see if it works, and if it doesn't, you have this whole envelope, this whole burrito shell of code you have to write to handle the uninhabited case.

Now let's look at the monadic function binding again:

m >>= f >>= g >>= h >>= ... etc.

What happens if m is null, or the result of f m is ...

No. Wait. Monads don't have nulls. Monads are just monads, so the above binding ......
okay, wait for it, ... just. works.

In fact, in the Kleisli category, it is guaranteed to work. It. just. works.

For you, not a Java(script) programmer, you're like: "Well, duh, yeah, that's how it's supposed to be! 1 + 1 = 2, not 'error' or whatever."

You can say that, and expect that, in a pure mathematical domain (with monads implicit in the mathematics for you. You're welcome.) But a Java programmer, with any experience, any hard-won experience, should be (rightly) awed.

"Wait. You mean, it just works? How is that possible?"

Yeah. How is that possible? That Java code just works?

It doesn't. But when I translate the Kleisli category down into a Java implementation, and then lift every Java object into that category, so it's not a Plain-old Java Object (POJO) any more, but now it's a monad in the Kleisli category, and operate on it as such from then on ...

Well, then, it just works.

Some people look at my monadic harness and say I'm unnecessarily complicating things. I beg to differ.

"But, geophf, I just want the start date; all this stuff just complicates things!"

But what do you want the start date ... for? If you just want the start date, then

pojo.getStartDate()

will get you there, and you're done.

But if you want to base any logic off it? Is your start date the start of a work-flow? Then if the start date is null, your system, which you unit tested with start-date values, now crashes in production, and you have no idea why, because you unit tested it, and the code looks good.

But that one null wrecks everything.

I don't have any nulls. So I lift my start date up into a monad and start my work flow, and my work flow works, and if the start date is null, then the start date isn't lifted, the functions fall through, because they're not working with a monad, and I just go on to the next thing.

And I coded zero lines of code around that.

Your try-catch blocks looking for nulls everywhere ... Well, what's more complicated now?
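A toy illustration of why the chain is safe, using the Maybe monad (the 'objects' here are hypothetical association lists, standing in for the getA().getB().getC() chain; names are made up for the example):

```haskell
-- Hypothetical nested "objects", modelled as association lists for brevity.
type Obj = [(String, String)]

objA, objB :: Obj
objA = [("b", "objB")]
objB = [("c", "42")]

-- resolve a reference to the next object; may fail
resolve :: String -> Maybe Obj
resolve "objB" = Just objB
resolve _      = Nothing

-- the whole chain is one expression, with no null checks anywhere
found :: Maybe String
found = lookup "b" objA >>= resolve >>= lookup "c"    -- Just "42"

-- a failure anywhere simply makes the whole chain Nothing; no crash
missing :: Maybe String
missing = lookup "x" objA >>= resolve >>= lookup "c"  -- Nothing
```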
What code works in production, regardless of dirty or bad data?

Monads are a guarantee, and, as burritos go, they are quite tasty. Give them a try!

p.s. Now there's the whole conversation around monads that carry no value intentionally throughout the entire monad. So, for example, if the monad type is 'Quark' then the monadic inhabited types are up, down, strange, charmed, top and bottom, and their values are ... nothing at all: the instance is the value; it has no underlying value it carries (unless you want to get into the guts of spin and charge ...). It's the particular monad instance, or the particular quark, we care about, not what's inside at all: something like monads as enumerated values, but ... eh. Monads-as-burritos is a limited view; it will trip you up, but for the most part it works. When you do get to the point of treating monads as monads, and not caring, anymore, about the underlying value, you can tap yourself on the shoulder with monadic Nirvana. It's a really good feeling, and it gives an entirely new understanding of monads and their utility. "Good to know," as it were, but monad-as-burrito works for the most part, and is tasty, too, to boot.
geophf
Tue, 15 Apr 2014 15:46:00 +0000

JSON validation combinators
http://feedproxy.google.com/~r/RomanCheplyaka/~3/C2rB4qE9qeo/2014-04-20-json-validation-combinators.html
http://ro-che.info//articles/2014-04-20-json-validation-combinators.html
Sat, 19 Apr 2014 21:00:00 +0000

Silicon Valley could force NSA reform, tomorrow. What's taking so long?
http://wadler.blogspot.com/2014/04/silicon-valley-could-force-nsa-reform.html
CEOs from Yahoo to Dropbox and Microsoft to Zynga met at the White House, but are they just playing for the cameras? Photograph: Kevin Lamarque / Reuters Trevor Timm asks a key question in The Guardian:The CEOs of the major tech companies came out of the gate swinging 10 months ago, complaining loudly about how NSA surveillance has been destroying privacy and ruining their business. They still are. Facebook founder Mark Zuckerberg recently called the US a "threat" to the Internet, and Eric Schmidt, chairman of Google, called some of the NSA tactics "outrageous" and potentially "illegal". They and their fellow Silicon Valley powerhouses – from Yahoo to Dropbox and Microsoft to Apple and more – formed a coalition calling for surveillance reform and had conversations with the White House.But for all their talk, the public has come away empty handed. The USA Freedom Act, the only major new bill promising real reform, has been stalled in the Judiciary Committee. The House Intelligence bill may be worse than the status quo. Politico reported on Thursday that companies like Facebook and are now "holding fire" on the hill when it comes to pushing for legislative reform....We know it's worked before. Three years ago, when thousands of websites participated in an unprecedented response to internet censorship legislation, the Stop Online Piracy Act (Sopa), the public stopped a once-invincible bill in its tracks. If they really, truly wanted to do something about it, the online giants of Silicon Valley and beyond could design their systems so that even the companies themselves could not access their users' messages by making their texting and instant messaging clients end-to-end encrypted.But the major internet outfits were noticeably absent from this year's similar grassroots protest – dubbed The Day We Fight Back – and refused to alter their websites à la Sopa. 
If they really believed the NSA was the threat so many of them have claimed, they'd have blacked out their websites in protest already.
Philip Wadler
Sat, 19 Apr 2014 08:59:00 +0000

I Shall Vote No
http://wadler.blogspot.com/2014/04/i-shall-vote-no.html
Spotted on Bella Caledonia.

After Christopher Logue, I Shall Vote Labour (1966)
By A.R. Frith

I shall vote No because, without Westminster, we'd never have got rid of the Poll Tax
I shall vote No because eight hundred thousand Scots live in England, and there are no jobs here to match their talents and meet their aspirations
I shall vote No, because my grandmother was a MacDougall
I shall vote No in case Shell and BP leave and take their oil with them
I shall vote No because otherwise we would have to give back the pandas
I shall vote No because I am feart
I shall vote No because the people who promised us a better deal if we voted No in 79, and warned us of the dire consequences of devolution in 97, tell us we should
I shall vote No so as not to let down my fellow socialists in Billericay and Basildon
and
I shall vote No, because if we got rid of Trident and stopped taking part in illegal wars we would be a target for terrorism
I shall vote No because if I lived under a government that listened to me and had policies I agreed with, I wouldn't feel British
I shall vote No because the RAF will bomb our airports if we are a separate country
I shall vote No because to vote Yes dishonours the Dead of the Great War, who laid down their lives for the rights of small nations
I shall vote No, lest being cut off from England turns Red Leicester cheese and Lincolnshire sausages into unobtainable foreign delicacies, like croissants, or bananas
I shall vote No, because, as a progressive, I have more in common with Billy Bragg or Tariq Ali, who aren't Scottish, than some toff like Lord Forsyth, who is.
I shall vote No, because the certainty of billions of pounds worth of spending cuts to come is preferable to the uncertainty of wealth
I shall vote No, because it is blindingly obvious that Scotland's voice at the UN, and other international bodies, will be much diminished if we are a member-state
I shall vote No because having a parliament with no real power, and another which is run by people we didn't vote for, is the best of both worlds
I shall vote No because I trust and admire Nick Clegg, who is promising us Federalism when the Liberals return to office
I shall vote No, because Emma Thompson would vote No, and her Dad did The Magic Roundabout
I shall vote No, because A.C. Grayling would vote No, and his Mum was born on Burns Night
I shall vote No because David Bowie asked Kate Moss to tell us to, and he lives in New York and used to be famous
I shall vote No, because nobody ever asks me what I think
I shall vote No, because a triple-A credit rating is vital in the modern world
I shall vote No because things are just fine as they are
I shall vote No because the English say they love us, and that if we vote Yes, they will wreck our economy.

Philip Wadler
Sat, 19 Apr 2014 08:35:00 +0000

Haskell gets static typing right
http://www.well-typed.com/blog/91
The following blog post originally appeared as a guest post on the Skills Matter blog. Well-Typed regularly teach both introductory and advanced Haskell courses at Skills Matter, and we will also be running a special course on Haskell’s type system.
Statically typed languages are often seen as a relic of the past – old and clunky. Looking at languages such as C and Java, we’re used to writing down a lot of information in a program that just declares certain variables to be of certain types. And what do we get in return? Not all that much. Yes, granted, some errors are caught at compile time. But the price is high: we’ve cluttered up the code with noisy declarations. Often, code has to be duplicated or written in a more complicated way, just to satisfy the type checker. And then, we still have a significant risk of run-time type errors, because type casting is common-place and can fail at unexpected moments.
So it isn’t a big surprise that dynamically typed languages are now very fashionable. They promise to achieve much more in less time, simply by getting rid of static type checking.
However, I want to argue that we shouldn’t be too keen on giving up the advantages of static types, and instead start using programming languages that get static typing right. Many functional languages such as Scala, F#, OCaml and in particular Haskell are examples of programming languages with strong static type systems that try not to get in the way, but instead guide and help the programmer.
In the rest of this post, I want to look at a few of the reasons why Haskell’s type system is so great. Note that some of the features I’m going to discuss are exclusive to Haskell, but most are not. I’m mainly using Haskell as a vehicle to advertise the virtues of good static type systems.
1. Type inference
Type inference makes the compiler apply common sense to your programs. You no longer have to declare the types of your variables, the compiler looks at how you use them and tries to determine what type they have. If any of the uses are inconsistent, then a type error is reported. This removes a lot of noise from your programs, and lets you focus on what’s important.
Of course, you are still allowed to provide explicit type signatures, and encouraged to do so in places where it makes sense, for example, when specifying the interface of your code.
2. Code reuse
Nothing is more annoying than having to duplicate code. In the ancient days of statically typed programming, you had to write the same function several times if you wanted it to work for several types. These days, most languages have “generics” that allow you to abstract over type parameters. In Haskell, you just write a piece of code that works for several types, and type inference will tell you that it does, by inferring a type that is “polymorphic”. For example, write code that reverses all the elements of a data structure, and type inference will tell you that your code is independent of the type of elements of the data structure, so it’ll just work regardless of what element type you use. If you write code that sorts a data structure, type inference will figure out that all you require to know about the elements is that they admit an ordering.
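A quick sketch of what inference gives you (rev and smallest are illustrative names, not from the post). Even with the signatures deleted, GHC would infer exactly these most general types:

```haskell
-- rev makes no assumptions about the elements, so inference gives it
-- the fully polymorphic type [a] -> [a]: it works on lists of anything
rev :: [a] -> [a]
rev = foldl (flip (:)) []

-- smallest only compares elements, so inference requires just an
-- Ord instance, nothing more
smallest :: Ord a => [a] -> a
smallest = minimum
```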
3. No run-time type information by default
Haskell erases all type information after type checking. You may think that this is mainly a performance issue, but it’s much more than that. The absence of run-time type information means that code that’s polymorphic (i.e., type-agnostic, see above) cannot access certain values. This can be a powerful safety net. For example, just the type signature of a function can tell you that the function could reorder, delete or duplicate elements in a data structure, but not otherwise touch them, modify them or operate on them in any other way. Whereas in the beginning of this post I complained that bad static type systems don’t allow you to do what you want because they’re not powerful enough, here we can deliberately introduce restrictions to save us (as well as colleagues) from accidental mistakes. So polymorphism turns out to be much more than just a way to reduce code duplication.
By the way, Haskell gives you various degrees of selective run-time typing. If you really need it in places, you can explicitly attach run-time type information to values and then make type-based decisions. But you say where and when, making a conscious choice that you gain flexibility at the cost of safety.
4. Introducing new datatypes made easy
In Haskell, it’s a one-liner to define a new datatype with a new name that has the same run-time representation as an existing type, yet is treated as distinct by the type system. (This may sound trivial, but surprisingly many statically typed languages get it wrong.) So for example it’s easy to define lots of different types that are all integers internally: counters, years, quantities, … In Haskell, this simple feature is often used to define safe boundaries: a specific type for URLs, a specific type for SQL queries, a specific type for HTML documents, and so on. Each of these types then comes with specific functions to operate on them. All such operations guarantee that whenever you have a value of this type, it’s well-formed, and whenever you render a value of this type, it’s syntactically correct and properly escaped.
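For example (a minimal sketch; Year and Counter are made-up types):

```haskell
-- Year and Counter share Int's run-time representation, but the
-- type checker keeps them strictly apart:
newtype Year    = Year Int    deriving (Eq, Show)
newtype Counter = Counter Int deriving (Eq, Show)

nextYear :: Year -> Year
nextYear (Year y) = Year (y + 1)

-- nextYear (Counter 5)   -- rejected at compile time: a Counter is not a Year
```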
5. Explicit effects
In virtually all programming languages, a function that performs some calculations on a few numbers and another function that performs the same calculations, but additionally sends a million spam emails to addresses all over the world, have exactly the same type, and therefore the same interface. Not so in Haskell. If a function writes to the screen, reads from the disk, sends messages over the network, accesses the system time, or makes use of any other so-called side effect, this is visible in its type. This has two advantages: first, it makes it much easier to rely on other people’s code. If you look at the interface and a function is effect-free, then you automatically know, for example, that it is also thread-safe. Second, the language facilitates a design where side effects are isolated into relatively small parts of the code base. This may seem difficult to achieve for highly stateful systems, but surprisingly, it usually is not: even interactive systems can usually be described as pure functions reacting to a series of requests with appropriate responses, plus a separate driver that does the actual communication. Such separation makes it not only easier to test the program, but also facilitates the evolution of the program, for example to adapt it to run in a different environment. Haskell’s type system therefore encourages good design.
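A minimal sketch of the distinction (addPure and addNoisy are illustrative names): the two functions compute the same sum, but the effectful one cannot hide that fact in its type.

```haskell
-- A pure function: the type promises there are no side effects
addPure :: Int -> Int -> Int
addPure x y = x + y

-- The same calculation plus a side effect: IO shows up in the type,
-- so every caller can see this function is not effect-free
addNoisy :: Int -> Int -> IO Int
addNoisy x y = do
  putStrLn ("adding " ++ show x ++ " and " ++ show y)
  return (x + y)
```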
6. Types as a guide in program development
If you only ever see types as a source of errors, and therefore as enemies on your path of getting your program accepted, you’re doing them injustice. Types as provided by Haskell are an element of program design. If you give your program precise types and follow systematic design principles, your program almost writes itself. Programming with a strong type system is comparable to playing a puzzle game, where the type system removes many of the wrong choices and helpfully guides you to take the right path. This style of programming is supported by a new language extension called “Typed Holes” where you leave parts of your program unspecified during development, and obtain feedback from the development environment about what type has to go into the hole, and what program fragments you have available locally to construct a value of the desired type. Playing this kind of puzzle game is actually quite fun!
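A small sketch of that workflow (pairUp is a made-up function, and the exact wording of GHC's hole message may differ):

```haskell
-- During development one might write
--   pairUp :: a -> b -> (b, a)
--   pairUp x y = _
-- and GHC (with typed holes, available since 7.8) reports the type the
-- hole must have, (b, a), together with the relevant local bindings
-- x :: a and y :: b. With that information the hole fills itself:
pairUp :: a -> b -> (b, a)
pairUp x y = (y, x)
```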
7. Programming on the type level
Haskell’s type system provides many advanced features that you don’t usually have to know about, but that can help you if you want to ensure that some complicated invariants hold in your program. Scarily named concepts such as “higher-ranked polymorphism”, “generalized algebraic datatypes” and “type families” essentially provide you with a way to write programs that compute with types. The possibilities are nearly endless. From playful things such as writing a C-printf-style function where the first argument determines the number of arguments that are expected afterwards as well as their types, you can go on to code that provides useful guarantees such as that mutable references that are available within one thread of control are guaranteed not to be accessed in a completely different context, arrays that can adapt to different internal representations depending on what type of values they contain, working with lists that are guaranteed to be of a specific length, or with trees that are guaranteed to be balanced, or with heterogeneous lists (where each element can be of a different type) in a type-safe way. The goal is always to make illegal inputs impossible to construct. If they’re impossible to construct by the type system, you can isolate sanity tests at the boundary of your code, rather than having to do them over and over again. The good thing is that these features are mostly optional, and often hardly affect the interface of libraries. So as a user, you can benefit from libraries employing such features and having extra safety guarantees internally. As a library writer, you can choose whether you’re happy with the normal level of Haskell type safety (which is already rather a lot), or if you want to spend some extra effort and get even more.
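As a small taste of the "lists guaranteed to be of a specific length" idea, here is a sketch of a length-indexed vector using GADTs and DataKinds (illustrative only, not code from the post):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Natural numbers lifted to the type level
data Nat = Z | S Nat

-- A vector whose length is part of its type
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- vhead accepts only non-empty vectors, so "head of an empty list"
-- becomes a compile-time error instead of a runtime crash
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- vhead VNil   -- rejected by the type checker: 'Z does not match 'S n
```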
If my overview has tempted you and you now want to learn more about Haskell, you’re welcome follow one of my introductory or advanced Haskell courses that I (together with my colleagues at Well-Typed) regularly teach at Skills Matter. These courses do not just focus on the type system of Haskell (although that’s a significant part). They introduce the entire language in a hands-on way with lots of examples and exercises, as well as providing guidelines on how to write idiomatic Haskell and how to follow good development practices.
If you already know some Haskell and are particularly interested in the advanced type system features mentioned in point 7, we also offer a new one-day course on Haskell’s type system that specifically focuses on how far you can push it.
andres
Thu, 17 Apr 2014 16:04:23 +0000

Autocomplete command-line flags with GHC 7.8.2
http://lambda.jstolarek.com/2014/04/autocomplete-command-line-flags-with-ghc-7-8-2/
GHC 7.8.2 was released just a few days ago¹. This is the first official release of GHC that contains my contributions. Most of them are improvements in the code generator and are thus practically invisible to most users. But I also implemented one very nice feature that will be useful to the average Haskeller: GHC now has a --show-options flag that lists all command-line flags. This can be used to auto-complete command-line flags in shells that support such completion. To enable auto-completion in Bash, add this code snippet to your ~/.bashrc file:
# Autocomplete GHC commands
_ghc()
{
    local envs=`ghc --show-options`
    # get the word currently being completed
    local cur=${COMP_WORDS[$COMP_CWORD]}
    # the resulting completions should be put into this array
    COMPREPLY=( $( compgen -W "$envs" -- $cur ) )
}
complete -F _ghc -o default ghc
From my experience the first completion is a bit slow but once the flags are cached things work fast.
¹ Please ignore the 7.8.1 release. It shipped with a bug that caused rejection of some valid programs.
Jan Stolarek
Wed, 16 Apr 2014 14:48:03 +0000

Team Scotland
http://wadler.blogspot.com/2014/04/team-scotland.html
What happens after Yes? It's not the SNP, it's the people. A great post on Bella Caledonia: The UK chattering classes have been wondering what a real, mass grass-roots campaign might look like in modern, professionalised politics. Impotent is their usual conclusion. Well come on up and we'll show you. The old feudal dance where councilor doffs cap to MP, MP to Minister, Minister to Prime Minister and Prime Minister to corporate CEO may well continue apace even here in Scotland. But it's not winning Scotland.
Philip Wadler
Wed, 16 Apr 2014 12:19:00 +0000

Help ORG restart the debate about internet filters
http://wadler.blogspot.com/2014/04/help-org-restart-debate-about-internet.html
The Open Rights Group is starting a campaign opposed to the default filtering now imposed by all providers in the UK---de facto censorship. You can fund it via IndieGogo.
Philip Wadler
Wed, 16 Apr 2014 11:59:00 +0000

Writing admin interfaces with Fay using fay-builder
http://feedproxy.google.com/~r/typlab/blog/~3/v4aSPW4evFs/82777010096
http://engineering.silk.co/post/82777010096
Tue, 15 Apr 2014 09:38:00 +0000

The heartbleed bug and FP Haskell Center
https://www.fpcomplete.com/blog/2014/04/heartbleed
The heartbleed bug and FP Haskell Center

If you haven't heard about the heartbleed bug and related security issues, this article provides a good explanation.

Applications developed in FPHC and deployed using the FP Application servers are not vulnerable to this bug.

Services we use from Amazon and GitHub were affected. SSL connections to our servers go through Amazon software, and we use SSL to connect to GitHub repositories. Both Amazon and GitHub have already updated their services to remove the vulnerability. FP Complete has followed GitHub's suggestions by changing our passwords and OAuth tokens. You can read those guidelines at GitHub.

While we have no evidence that any sensitive information was leaked from FPHC, we recommend changing your password for FPHC as soon as possible, just in case.

Other measures to increase security never hurt. Using two-factor authentication on sites where it is available, and using password-locker software that generates strong passwords unique to each site, will help prevent people from breaking into your accounts. This event provides a good occasion to consider adding these extra security measures if you aren't already using them.

Mon, 14 Apr 2014 14:25:52 +0000

Continuations and Exceptions
http://jozefg.bitbucket.org/posts/2014-04-14-either-and-conts.html
http://jozefg.bitbucket.org/posts/2014-04-14-either-and-conts.html
Mon, 14 Apr 2014 00:00:00 +0000

As a spin off from teaching programming to my 10 year old son...
http://justtesting.org/post/82283434665
http://justtesting.org/post/82283434665
Thu, 10 Apr 2014 11:47:54 +0000

Migration Complete: Welcome to blog.cppcabrera.com!
http://cppcabrera.blogspot.com/2014/04/migration-complete-welcome-to.html
The migration is now complete. Currently, I have:
* Atom feed support
* Hosted on https:// (with heartbleed patched)
* Sources hosted on GitHub
* Main blog over at https://blog.cppcabrera.com
* All content from here ported over
* All posts tagged appropriately
I've documented the process as my first new post at the new location: Learning Hakyll and Setting Up.
At this point, the one feature I'd like to add to my static blog soon is a Haskell-only feed. I'll be working on that over the coming week.
Thanks again for reading, and I hope you'll enjoy visiting the new site!
Alejandro Cabrera
Thu, 10 Apr 2014 07:27:00 +0000

Parallelism and Concurrency, Revisited
http://existentialtype.wordpress.com/2014/04/09/parallelism-and-concurrency-revisited/
To my delight, I still get compliments on and criticisms of my post from three years ago (can it possibly be that long?) on parallelism and concurrency. In that post I offered a “top down” argument to the effect that these are different abstractions with different goals: parallelism is about exploiting computational resources to maximize efficiency, concurrency is about non-deterministic composition of components in a system. Parallelism never introduces bugs (the semantics is identical to the sequential execution), but concurrency could be said to be the mother lode of all bugs (the semantics of a component changes drastically, without careful provision, when composed concurrently with other components). The two concepts just aren’t comparable, yet somehow the confusion between them persists. (Not everyone agrees with me on this distinction, but neither have I seen a comparable analysis that shows them to be the same concept. Most complaints seem to be about my use of the words “parallelism” and “concurrency”, which is an unavoidable problem, or about my temerity in trying to define two somewhat ill-defined concepts, a criticism that I’ll just have to accept.)
I’ve recently gotten an inkling of why it might be that many people equate the two concepts (or see no point in distinguishing them). This post is an attempt to clear up what I perceive to be a common misunderstanding that seems to explain it. It’s hard for me to say whether it really is all that common of a misunderstanding, but it’s the impression I’ve gotten, so forgive me if I’m over-stressing an obvious point. In any case I’m going to try for a “bottom up” explanation that might make more sense to some people.
The issue is scheduling.
The naive view of parallelism is that it’s just talk for concurrency, because all you do when you’re programming in parallel is fork off some threads, and then do something with their results when they’re done. I’ve previously argued that this is the wrong way to think about parallelism (it’s really about cost), but let’s just let that pass. It’s unarguably true that a parallel computation does consist of a bunch of, well, parallel computations. So, the argument goes, it’s nothing but concurrency, because concurrency is, supposedly, all about forking off some threads and waiting to see what they do, and then doing something with it. I’ve argued that that’s not a good way to think about concurrency either, but we’ll let that pass too. So, the story goes, concurrency and parallelism are synonymous, and bullshitters like me are just trying to confuse people and make trouble.
Being the troublemaker that I am, my response is, predictably, no, just no. Sure, it’s kinda sorta right, as I’ve already acknowledged, but not really, and here’s why: scheduling as you learned about it in OS class (for example) is an altogether different thing than scheduling for parallelism. And this is the heart of the matter, from a “bottom-up” perspective.
There are two aspects of OS-like scheduling that I think are relevant here. First, it is non-deterministic, and second, it is competitive. Non-deterministic, because you have little or no control over what runs when or for how long. A beast like the Linux scheduler is controlled by a zillion “voodoo parameters” (a turn of phrase borrowed from my queueing theory colleague, Mor Harchol-Balter), and who the hell knows what is going to happen to your poor threads once they’re in its clutches. Second, and more importantly, an OS-like scheduler is allocating resources competitively. You’ve got your threads, I’ve got my threads, and we both want ours to get run as soon as possible. We’ll even pay for the privilege (priorities) if necessary. The scheduler, and the queueing theory behind it (he says optimistically) is designed to optimize resource usage on a competitive basis, taking account of quality of service guarantees purchased by the participants. It does not matter whether there is one processor or one thousand processors, the schedule is unpredictable. That’s what makes concurrent programming hard: you have to program against all possible schedules. And that’s why you can’t prove much about the time or space complexity of your program when it’s implemented concurrently.
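The non-determinism point is easy to demonstrate concretely. Here is a small Haskell sketch (mine, not from the post): two threads perform an unsynchronised read-modify-write on shared state, and the final count depends on how the scheduler interleaves them.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_, replicateM_)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Two threads each perform 100000 non-atomic increments on a shared
-- IORef. Because the read and the write can be interleaved by the
-- scheduler, updates may be lost: the final total is schedule-dependent,
-- anywhere up to 200000.
main :: IO ()
main = do
  ref   <- newIORef (0 :: Int)
  dones <- forM [1, 2 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ 100000 $ do
        n <- readIORef ref       -- read ...
        writeIORef ref (n + 1)   -- ... then write: not atomic
      putMVar done ()
    return done
  forM_ dones takeMVar
  total <- readIORef ref
  print (0 < total && total <= 200000)  -- True; the exact total varies
```

Running this repeatedly (especially with the threaded runtime) can yield different totals, which is exactly the "program against all possible schedules" burden described above.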
Parallel scheduling is a whole ‘nother ball of wax. It is (usually, but not necessarily) deterministic, so that you can prove bounds on its efficiency (Brent-type theorems, as I discussed in my previous post and in PFPL). And, more importantly, it is cooperative in the sense that all threads are working together for the same computation towards the same ends. The threads are scheduled so as to get the job (there’s only one) done as quickly and as efficiently as possible. Deterministic schedulers for parallelism are the most common, because they are the easiest to analyze with respect to their time and space bounds. Greedy schedulers, which guarantee to maximize use of available processors, never leaving any idle when there is work to be done, form an important class for which the simple form of Brent’s Theorem is obvious.
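For reference, the simple form of the Brent-type bound mentioned here (the standard statement, not quoted from the post): for a computation with total work W and span (critical-path depth) D, a greedy scheduler on p processors achieves

```latex
% W = total work, D = span (critical-path depth), p = processors
\max\left(\frac{W}{p},\, D\right) \;\le\; T_p \;\le\; \frac{W}{p} + D
```

Since both W/p and D are lower bounds on T_p, any greedy schedule is within a factor of two of optimal.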
Many deterministic greedy scheduling algorithms are known, of which I will mention p-DFS and p-BFS, which do p-at-a-time depth- and breadth-first search of the dependency graph, and various forms of work-stealing schedulers, pioneered by Charles Leiserson at MIT. (Incidentally, if you don’t already know what p-DFS or p-BFS are, I’ll warn you that they are a little trickier than they sound. In particular p-DFS uses a data structure that is sort of like a stack but is not a stack.) These differ significantly in their time bounds (for example, work stealing usually involves expectation over a random variable, whereas the depth- and breadth traversals do not), and differ dramatically in their space complexity. For example, p-BFS is absolutely dreadful in its space complexity. For a full discussion of these issues in parallel scheduling, I recommend Dan Spoonhower’s PhD Dissertation. (His semantic profiling diagrams are amazingly beautiful and informative!)
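The contrast with deterministic parallelism can be seen in miniature. In this Haskell sketch (mine, not from the post), `par` merely advises the runtime that a value could be evaluated on another core; the value computed is identical to the sequential one, so scheduling affects only the cost:

```haskell
import GHC.Conc (par, pseq)

-- Spark nf in parallel with evaluating ns: `par` changes evaluation
-- order and degree of parallelism, never the result.
total :: Int
total = nf `par` (ns `pseq` (nf + ns))
  where
    nf = sum [1 .. 500000 :: Int]        -- may run on another core
    ns = sum [500001 .. 1000000 :: Int]

main :: IO ()
main = print (total == sum [1 .. 1000000])  -- True
```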
So here’s the thing: when you’re programming in parallel, you don’t just throw some threads at some non-deterministic competitive scheduler. Rather, you generate an implicit dependency graph that a cooperative scheduler uses to maximize efficiency, end-to-end. At the high level you do an asymptotic cost analysis without considering platform parameters such as the number of processors or the nature of the interconnect. At the low level the implementation has to validate that cost analysis by using clever techniques to ensure that, once the platform parameters are known, maximum use is made of the computational resources to get your job done for you as fast as possible. Not only are there no bugs introduced by the mere fact of being scheduled in parallel, but even better, you can prove a theorem that tells you how fast your program is going to run on a real platform. Now how cool is that?Filed under: Programming, Research Tagged: concurrency, parallelismRobert Harperhttp://existentialtype.wordpress.com/?p=906Thu, 10 Apr 2014 03:15:00 +0000Gibbs Sampling in R, Haskell, Jags and Stan
http://idontgetoutmuch.wordpress.com/2014/04/09/gibbs-sampling-in-r-haskell-jags-and-stan/
Introduction
It’s possible to Gibbs sampling in most languages and since I am doing some work in R and some work in Haskell, I thought I’d present a simple example in both languages: estimating the mean from a normal distribution with unknown mean and variance. Although one can do Gibbs sampling directly in R, it is more common to use a specialised language such as JAGS or STAN to do the actual sampling and do pre-processing and post-processing in R. This blog post presents implementations in native R, JAGS and STAN as well as Haskell.
Preamble
> {-# OPTIONS_GHC -Wall #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans #-}
> {-# LANGUAGE NoMonomorphismRestriction #-}
> module Gibbs (
> main
> , m
> , Moments(..)
> ) where
>
> import qualified Data.Vector.Unboxed as V
> import qualified Control.Monad.Loops as ML
> import Data.Random.Source.PureMT
> import Data.Random
> import Control.Monad.State
> import Data.Histogram ( asList )
> import Data.Histogram.Fill
> import Data.Histogram.Generic ( Histogram )
> import Data.List
> import qualified Control.Foldl as L
>
> import Diagrams.Backend.Cairo.CmdLine
>
> import LinRegAux
>
> import Diagrams.Backend.CmdLine
> import Diagrams.Prelude hiding ( sample, render )
The length of our chain and the burn-in.
> nrep, nb :: Int
> nb = 5000
> nrep = 105000
Data generated from a normal distribution.
> xs :: [Double]
> xs = [
> 11.0765808082301
> , 10.918739177542
> , 15.4302462747137
> , 10.1435649220266
> , 15.2112705014697
> , 10.441327659703
> , 2.95784054883142
> , 10.2761068139607
> , 9.64347295100318
> , 11.8043359297675
> , 10.9419989262713
> , 7.21905367667346
> , 10.4339807638017
> , 6.79485294803006
> , 11.817248658832
> , 6.6126710570584
> , 12.6640920214508
> , 8.36604701073303
> , 12.6048485320333
> , 8.43143879537592
> ]
A Bit of Theory
Gibbs Sampling
For a multi-parameter situation, Gibbs sampling is a special case of Metropolis-Hastings in which the proposal distributions are the posterior conditional distributions.
Referring back to the explanation of the Metropolis algorithm, let us describe the state by its parameters \theta = (\theta_1, \ldots, \theta_d) and the conditional posteriors by \pi(\theta_i \mid \theta_{-i}). If the proposal for the i-th update is drawn from the conditional posterior then the acceptance ratio is

\alpha = \frac{\pi(\theta^\ast)\, q(\theta \mid \theta^\ast)}{\pi(\theta)\, q(\theta^\ast \mid \theta)}
       = \frac{\pi(\theta_i^\ast \mid \theta_{-i})\, \pi(\theta_{-i})\, \pi(\theta_i \mid \theta_{-i})}{\pi(\theta_i \mid \theta_{-i})\, \pi(\theta_{-i})\, \pi(\theta_i^\ast \mid \theta_{-i})} = 1

where we have used the rules of conditional probability and the fact that the proposal leaves \theta_{-i} unchanged.
Thus we always accept the proposed jump. Note that the chain is not in general reversible as the order in which the updates are done matters.
Normal Distribution with Unknown Mean and Variance
It is fairly standard to use an improper prior

\pi(\mu, \tau) \propto \frac{1}{\tau}, \qquad -\infty < \mu < \infty,\; 0 < \tau < \infty

The likelihood is

p(x \mid \mu, \sigma) = \prod_{i=1}^n \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)

re-writing in terms of precision \tau = 1/\sigma^2

p(x \mid \mu, \tau) \propto \tau^{n/2} \exp\left(-\frac{\tau}{2} \sum_{i=1}^n (x_i - \mu)^2\right)

Thus the posterior is

p(\mu, \tau \mid x) \propto \tau^{n/2 - 1} \exp\left(-\frac{\tau}{2} \sum_{i=1}^n (x_i - \mu)^2\right)

We can re-write the sum in terms of the sample mean \bar{x} and variance s^2 using

\sum_{i=1}^n (x_i - \mu)^2 = (n - 1)s^2 + n(\mu - \bar{x})^2

Thus the conditional posterior for \mu is

\mu \mid \tau, x \sim \mathcal{N}\!\left(\bar{x}, \frac{1}{n\tau}\right)

which we recognise as a normal distribution with a mean of \bar{x} and a variance of 1/(n\tau).
The conditional posterior for \tau is

\tau \mid \mu, x \sim \mathrm{Gamma}\!\left(\frac{n}{2},\ \frac{1}{2}\sum_{i=1}^n (x_i - \mu)^2\right)

which we recognise as a gamma distribution with a shape of n/2 and a rate of \frac{1}{2}\sum_{i=1}^n (x_i - \mu)^2.
In this particular case, we can calculate the marginal posterior of \mu analytically. Writing \nu = n - 1 we have

p(\mu \mid x) = \int_0^\infty p(\mu, \tau \mid x)\, d\tau \propto \left(1 + \frac{n(\mu - \bar{x})^2}{\nu s^2}\right)^{-(\nu + 1)/2}

This is the non-standardized Student's t-distribution t_\nu(\bar{x}, s^2/n).
Alternatively the marginal posterior of \mu is

\frac{\mu - \bar{x}}{s/\sqrt{n}} \,\Big|\, x \sim t_\nu

where t_\nu is the standard t distribution with \nu = n - 1 degrees of freedom.
The Model in Haskell
Following up on a comment from a previous blog post, let us try using the foldl package to calculate the length, the sum and the sum of squares while traversing the list only once. This is an improvement on creating your own strict record and using foldl', though it may not be suitable for every method, e.g. calculating the skewness and kurtosis incrementally; see below.
> x2Sum, xSum, n :: Double
> (x2Sum, xSum, n) = L.fold stats xs
> where
> stats = (,,) <$>
> (L.premap (\x -> x * x) L.sum) <*>
> L.sum <*>
> L.genericLength
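The trick the foldl package uses can be sketched in a few lines. The following is a simplified model of mine, not the real library's (more general) implementation: the Applicative instance pairs up fold states, so composed folds still traverse the list only once.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (foldl')

-- A fold is a step function, an initial state, and a finaliser.
data Fold a b = forall s. Fold (s -> a -> s) s (s -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

instance Applicative (Fold a) where
  pure b = Fold const () (const b)
  Fold stepL beginL doneL <*> Fold stepR beginR doneR =
    Fold (\(sL, sR) a -> (stepL sL a, stepR sR a))  -- run both steps
         (beginL, beginR)                           -- pair the states
         (\(sL, sR) -> doneL sL (doneR sR))

fold :: Fold a b -> [a] -> b
fold (Fold step begin done) = done . foldl' step begin

sumF, sumSqF, lengthF :: Fold Double Double
sumF    = Fold (+) 0 id
sumSqF  = Fold (\s x -> s + x * x) 0 id
lengthF = Fold (\s _ -> s + 1) 0 id

main :: IO ()
main = print (fold ((,,) <$> sumSqF <*> sumF <*> lengthF) [1, 2, 3])
  -- (14.0,6.0,3.0)
```

The composite fold above makes a single left-to-right pass, exactly like the `L.fold stats xs` call in the post.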
And re-writing the sample variance using

s^2 = \frac{1}{n - 1}\sum_{i=1}^n (x_i - \bar{x})^2 = \frac{n}{n - 1}\left(\frac{1}{n}\sum_{i=1}^n x_i^2 - \bar{x}^2\right)

we can then calculate the sample mean and variance using the sums we have just calculated.
> xBar, varX :: Double
> xBar = xSum / n
> varX = n * (m2Xs - xBar * xBar) / (n - 1)
> where m2Xs = x2Sum / n
In random-fu, the Gamma distribution is specified by the rate parameter, β.
> beta, initTau :: Double
> beta = 0.5 * n * varX
> initTau = evalState (sample (Gamma (n / 2) beta)) (pureMT 1)
Our sampler takes an old value of τ and creates new values of μ and τ.
> gibbsSampler :: MonadRandom m => Double -> m (Maybe ((Double, Double), Double))
> gibbsSampler oldTau = do
> newMu <- sample (Normal xBar (recip (sqrt (n * oldTau))))
> let shape = 0.5 * n
> scale = 0.5 * (x2Sum + n * newMu^2 - 2 * n * newMu * xBar)
> newTau <- sample (Gamma shape (recip scale))
> return $ Just ((newMu, newTau), newTau)
From which we can create an infinite stream of samples.
> gibbsSamples :: [(Double, Double)]
> gibbsSamples = evalState (ML.unfoldrM gibbsSampler initTau) (pureMT 1)
As our chains might be very long, we calculate the mean, variance, skewness and kurtosis using an incremental method.
> data Moments = Moments { mN :: !Double
> , m1 :: !Double
> , m2 :: !Double
> , m3 :: !Double
> , m4 :: !Double
> }
> deriving Show
> moments :: [Double] -> Moments
> moments xs = foldl' f (Moments 0.0 0.0 0.0 0.0 0.0) xs
> where
> f :: Moments -> Double -> Moments
> f m x = Moments n' m1' m2' m3' m4'
> where
> n = mN m
> n' = n + 1
> delta = x - (m1 m)
> delta_n = delta / n'
> delta_n2 = delta_n * delta_n
> term1 = delta * delta_n * n
> m1' = m1 m + delta_n
> m4' = m4 m +
> term1 * delta_n2 * (n'*n' - 3*n' + 3) +
> 6 * delta_n2 * m2 m - 4 * delta_n * m3 m
> m3' = m3 m + term1 * delta_n * (n' - 2) - 3 * delta_n * m2 m
> m2' = m2 m + term1
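As a quick sanity check (mine, not part of the original post), the update rules can be restated standalone and compared against the direct central sums for a tiny list: for [1,2,3] the mean is 2 and the central sums are Σ(x−x̄)² = 2, Σ(x−x̄)³ = 0, Σ(x−x̄)⁴ = 2.

```haskell
import Data.List (foldl')

-- Standalone restatement of the incremental (Welford-style) update.
data M = M !Double !Double !Double !Double !Double

step :: M -> Double -> M
step (M n m1 m2 m3 m4) x = M n' m1' m2' m3' m4'
  where
    n'       = n + 1
    delta    = x - m1
    delta_n  = delta / n'
    delta_n2 = delta_n * delta_n
    term1    = delta * delta_n * n
    m1'      = m1 + delta_n
    m4'      = m4 + term1 * delta_n2 * (n' * n' - 3 * n' + 3)
                  + 6 * delta_n2 * m2 - 4 * delta_n * m3
    m3'      = m3 + term1 * delta_n * (n' - 2) - 3 * delta_n * m2
    m2'      = m2 + term1

main :: IO ()
main = case foldl' step (M 0 0 0 0 0) [1, 2, 3] of
  M n m1 m2 m3 m4 -> print (n, m1, m2, m3, m4)  -- (3.0,2.0,2.0,0.0,2.0)
```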
In order to examine the posterior, we create a histogram.
> numBins :: Int
> numBins = 400
> hb :: HBuilder Double (Data.Histogram.Generic.Histogram V.Vector BinD Double)
> hb = forceDouble -<< mkSimple (binD lower numBins upper)
> where
> lower = xBar - 2.0 * sqrt varX
> upper = xBar + 2.0 * sqrt varX
And fill it with the specified number of samples preceded by a burn-in.
> hist :: Histogram V.Vector BinD Double
> hist = fillBuilder hb (take (nrep - nb) $ drop nb $ map fst gibbsSamples)
Now we can plot this.
And calculate the skewness and kurtosis.
> m :: Moments
> m = moments (take (nrep - nb) $ drop nb $ map fst gibbsSamples)
ghci> import Gibbs
ghci> putStrLn $ show $ (sqrt (mN m)) * (m3 m) / (m2 m)**1.5
8.733959917065126e-4
ghci> putStrLn $ show $ (mN m) * (m4 m) / (m2 m)**2
3.451374739494607
We expect a skewness of 0 and a kurtosis of 3 + 6/(ν − 4) ≈ 3.4 for ν = 19 degrees of freedom. Not too bad.
The Model in JAGS
JAGS is a mature, declarative, domain specific language for building Bayesian statistical models using Gibbs sampling.
Here is our model as expressed in JAGS. Somewhat terse.
model {
for (i in 1:N) {
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0, 1.0E-6)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}
To run it and examine its results, we wrap it up in some R
## Import the library that allows R to inter-work with jags.
library(rjags)
## Read the simulated data into a data frame.
fn <- read.table("example1.data", header=FALSE)
jags <- jags.model('example1.bug',
data = list('x' = fn[,1], 'N' = 20),
n.chains = 4,
n.adapt = 100)
## Burnin for 10000 samples
update(jags, 10000);
mcmc_samples <- coda.samples(jags, variable.names=c("mu", "sigma"), n.iter=20000)
png(file="diagrams/jags.png",width=400,height=350)
plot(mcmc_samples)
dev.off()
And now we can look at the posteriors for μ and σ.
The Model in STAN
STAN is a domain specific language for building Bayesian statistical models, similar to JAGS but newer; it allows variables to be re-assigned and so cannot really be described as declarative.
Here is our model as expressed in STAN. Again, somewhat terse.
data {
int N;
real x[N];
}
parameters {
real mu;
real sigma;
}
model{
x ~ normal(mu, sigma);
mu ~ normal(0, 1000);
}
Just as with JAGS, to run it and examine its results, we wrap it up in some R.
library(rstan)
## Read the simulated data into a data frame.
fn <- read.table("example1.data", header=FALSE)
## Running the model
fit1 <- stan(file = 'Stan.stan',
data = list('x' = fn[,1], 'N' = 20),
pars=c("mu", "sigma"),
chains=3,
iter=30000,
warmup=10000)
png(file="diagrams/stan.png",width=400,height=350)
plot(fit1)
dev.off()
Again we can look at the posterior although we only seem to get medians and 80% intervals.
PostAmble
Write the histogram produced by the Haskell code to a file.
> displayHeader :: FilePath -> Diagram B R2 -> IO ()
> displayHeader fn =
> mainRender ( DiagramOpts (Just 900) (Just 700) fn
> , DiagramLoopOpts False Nothing 0
> )
> main :: IO ()
> main = do
> displayHeader "diagrams/DataScienceHaskPost.png"
> (barDiag
> (zip (map fst $ asList hist) (map snd $ asList hist)))
The code can be downloaded from github.Dominic Steinitzhttp://idontgetoutmuch.wordpress.com/?p=666Wed, 09 Apr 2014 11:43:36 +0000Some not-so-grand challenges in bioinformatics
http://blog.malde.org/posts/bioinformatics-challenges.html
http://blog.malde.org/posts/bioinformatics-challenges.htmlTue, 08 Apr 2014 17:00:00 +0000Proposal: Changes to the PVP
http://www.yesodweb.com/blog/2014/04/proposal-changes-pvp
As I mentioned two posts
ago, there was a
serious discussion on the libraries mailing
list
about the Package Versioning
Policy (PVP).This blog post presents some concrete changes I'd like to see to the PVP to
make it better for both general consumers of Hackage, and for library authors
as well. I'll start off with a summary of the changes, and then give the
explanations:The goal of the PVP needs to be clarified. Its purpose is not to ensure
reproducible builds of non-published software, but rather to provide for
more reliable builds of libraries on Hackage. Reproducible builds should be
handled exclusively through version freezing, the only known technique to
actually give the necessary guarantees.Upper bounds should not be included on non-upgradeable packages, such as
base and template-haskell (are there others?). Alternatively, we should
establish some accepted upper bound on these packages, e.g. many people place
base < 5 on their code.We should be distinguishing between mostly-stable packages and unstable
packages. For a package like text, if you simply import Data.Text (Text,
pack, reverse), or some other sane subset, there's no need for upper bounds.Note that this doesn't provide a hard-and-fast rule like the current PVP, but is
rather a matter of discretion. Communication between library authors and users (via
documentation or other means) would be vital to making this work well.For a package version A.B.C, a bump in A or B indicates some level of
breaking change. As an opt-in approach, package authors are free to
associate meaning to A and B beyond what the PVP requires. Libraries which use these
packages are free to rely on the guarantees provided by package authors when
placing upper bounds.Note that this is very related to point (3).While I (Michael Snoyman) am the author of this proposal, the following people
have reviewed the proposal and support it:
Bryan O'Sullivan
Felipe Lessa
Roman Cheplyaka
Vincent Hanquez
Reproducible builds
There are a number of simple cases that can result in PVP-compliant code not
being buildable. These aren't just hypothetical cases; in my experience as both
a package author and Stackage maintainer, I've seen these come up.Package foo version 1.0 provides an instance for MonadFoo for IO and
Identity. Version 1.1 removes the IO instance for some reason. Package bar
provides a function:
bar :: MonadFoo m => Int -> m Double
Package bar compiles with both version 1.0 and 1.1 of foo, and therefore
(following the PVP) adds a constraint to its cabal file foo >= 1.0 && < 1.2.Now a user decides to use the bar package. The user never imports anything from
foo, and therefore has no listing for foo in the cabal file. The user code
depends on the IO instance for MonadFoo. When compiled with foo 1.0, everything
works fine. However, when compiled with foo 1.1, the code no longer compiles.Similarly, instead of typeclass instances, the same situation can occur
with module export lists. Consider version 1.0 of foo which provides:
module Foo (foo1, foo2) where
Version 1.1 removes the foo2 export. The bar package reexports the entire Foo
module, and then a user package imports the module from bar. If the user
package uses the foo2 function, it will compile when foo-1.0 is used, but not
when foo-1.1 is used.In both of these cases, the issue is the same: transitive dependencies are not
being clamped down. The PVP makes an assumption that the entire interface for a
package can be expressed in its version number, which is not true. I see three
possible solutions to this:Try to push even more of a burden onto package authors, and somehow make
them guarantee that their interface is completely immune to changes
elsewhere in the stack. This kind of change was proposed on the libraries list.
I'm strongly opposed to some kind of change like this: it makes authors' lives
harder, and makes it very difficult to provide backwards compatibility in
libraries. Imagine if transformers 0.4 adds a new MonadIO instance; the logical
extreme of this position would be to disallow a library from working with both
transformers 0.3 and 0.4, which will split Hackage in two.Modify the PVP so that instead of listing just direct dependencies, authors
are required to list all transitive dependencies as well. So it would be a
violation to depend on bar without explicitly listing foo in the dependency
list. This will work, and be incredibly difficult to maintain. It will also
greatly increase the time it takes for a new version of a deep dependency to be
usable due to the number of authors who will have to bump version bounds.Transfer responsibility for this to package users: if you first built your
code against foo 1.0, you should freeze that information and continue
building against foo 1.0, regardless of the presence of new versions of foo.
Not only does this increase reproducibility, it's just common sense: it's
entirely possible that new versions of a library will introduce a runtime bug,
performance regression, or even fix a bug that your code depends on. Why should
the reliability of my code base be dependent on the actions of some third party
that I have no control over?
Non-upgradeable packages
There are some packages which ship with GHC and cannot be upgraded. I'm aware
of at least base and template-haskell, though perhaps there are others
(haskell98 and haskell2010?). In the past, there was good reason to place upper
bounds on base, specifically with the base 3/4 split. However, we haven't had
that experience in a while, and don't seem to be moving towards doing that
again. In today's world, we end up with the following options:Place upper bounds on base to indicate "I haven't tested this with newer
versions of GHC." This then makes it difficult for users to test out that
package with newer versions of GHC.Leave off upper bounds on base. Users may then try to install a package onto
a version of GHC on which the package hasn't been tested, which will result
in either (1) everything working (definitely the common case based on my
Stackage maintenance), or (2) getting a compilation error.I've heard two arguments to push us in the direction of keeping the upper
bounds in this case, so I'd like to address them both:
cabal error messages are easier to understand than GHC error messages. I have two problems with that:
I disagree: cabal error messages are terrible. (I'm told this will be fixed in the next version of cabal.) Take the following output as a sample:
cabal: Could not resolve dependencies:
trying: 4Blocks-0.2
rejecting: base-4.6.0.1/installed-8aa... (conflict: 4Blocks => base>=2 && <=4)
rejecting: base-4.6.0.1, 4.6.0.0, 4.5.1.0, 4.5.0.0, 4.4.1.0, 4.4.0.0, 4.3.1.0,
4.3.0.0, 4.2.0.2, 4.2.0.1, 4.2.0.0, 4.1.0.0, 4.0.0.0, 3.0.3.2, 3.0.3.1 (global
constraint requires installed instance)
I've seen a number of users file bug reports not understanding that
this message means "you have the wrong version of GHC."Even if the error messages were more user-friendly, they make it more
difficult to fix the actual problem: the code doesn't compile with the
new version of GHC. Often times, I've been able to report an error message to a
library author and, without necessarily even downloading the new version of
GHC, he/she has been able to fix the problem.Using upper bounds in theory means that cabal will be able to revert to an
older version of the library that is compatible with the new version of
GHC. However, I find it highly unlikely that there's often - if ever - a case
where an older version of a library is compatible with a later version of GHC.
Mostly-stable, and finer-grained versioning
I'll combine the discussion of the last two points. I think the heart of the
PVP debates really comes from mostly-stable packages. Let's contrast with the
extremes. Consider a library which is completely stable, never has a breaking
change, and has stated with absolute certainty that it never will again. Does
anyone care about upper bounds on this library? They're irrelevant! I'd have
no problem with including an upper bound, and I doubt even the staunchest PVP
advocates would really claim it's a problem to leave it off.On the other hand, consider an extremely unstable library, which is releasing
massively breaking changes on a weekly basis. I would certainly agree in that
case that an upper bound on that library is highly prudent.
The sticking point is the middle ground. Consider the following code snippet:
import Data.Text (Text, pack)
foo :: Text
foo = pack "foo"
According to the PVP as it stands today, this snippet requires an upper bound
of < 1.2 on the text package. But let's just play the odds here: does anyone
actually believe there's a real chance that the next iteration of text will
break this code snippet? I highly doubt it; this is a stable subset of the text
API, and I doubt it will ever be changing. The same can be said of large
subsets of many other packages.By putting in upper bounds in these cases, we run a very real risk of
bifurcating Hackage into "those demanding the new text version for some new
feature" vs "those who haven't yet updated their upper bounds to allow the new
version of text."The PVP currently takes an extremely conservative viewpoint on this, with the
goal of solving just one problem: making sure code that compiles now continues
to compile. As I demonstrated above, it doesn't actually solve that problem
completely. And in addition, in this process, it has created other problems,
such as this bifurcation.So my proposal is that, instead of creating rigid rules like "always put an
upper bound no matter what," we allow some common sense into the process, and
also let package authors explicitly say "you can rely on this API not
changing."http://www.yesodweb.com/blog/2014/04/proposal-changes-pvpTue, 08 Apr 2014 15:55:00 +0000Generate Javascript classes for your .NET types
http://noamlewis.wordpress.com/2014/04/08/generate-javascript-classes-for-your-net-types/
We open-sourced another library: ClosureExterns.NET (on github and nuget). It generates Javascript classes from .NET-based backend types, to preserve type “safety” (as safe as Javascript gets) across front- and backend. As a bonus you get Google closure annotations. The type annotations are understood by WebStorm (and other editors) and improve your development experience. Also, if you use Google Closure to compile or verify your code, it will take these types into account. We use it extensively with C#. We haven’t tried it with F#, but it’s supposed to work with any .NET type.
ClosureExterns.NET makes it easier to keep your frontend models in sync with your backend. The output is customizable – you can change several aspects of the generated code. For example you can change the constructor function definition, to support inheritance from some other Javascript function. For more details see ClosureExternOptions.
Getting Started
First, install it. Using nuget, install the package ClosureExterns.
Then, expose a method that generates your externs. For example, a console application:
public static class Program
{
public static void Main()
{
var types = ClosureExternsGenerator.GetTypesInNamespace(typeof(MyNamespace.MyType));
var output = ClosureExternsGenerator.Generate(types);
Console.Write(output);
}
}
You can also customize the generation using a ClosureExternsOptions object.
Example input/output
Input
class B
{
public int[] IntArray { get; set; }
}
Output
var Types = {};
// ClosureExterns.Tests.ClosureExternsGeneratorTest+B
/** @constructor
*/
Types.B = function() {};
/** @type {Array.<number>} */
Types.B.prototype.intArray = null;
For a full example see the tests. Tagged: .NET, c#, Javascriptsinelawhttp://noamlewis.wordpress.com/?p=454Tue, 08 Apr 2014 15:34:48 +0000Bargain Priced Coroutines
http://jozefg.bitbucket.org/posts/2014-04-08-bargain-coroutines.html
http://jozefg.bitbucket.org/posts/2014-04-08-bargain-coroutines.htmlTue, 08 Apr 2014 00:00:00 +0000Haskell error reporting with locations, update
http://augustss.blogspot.com/2014/04/haskell-error-reporting-with-locations_5.html
augustsstag:blogger.com,1999:blog-25541020.post-5606262483642528561Sat, 05 Apr 2014 19:53:00 +0000A new SQL injection attack?
http://wadler.blogspot.com/2014/04/a-new-sql-injection-attack.html
Against speed cameras?Philip Wadlertag:blogger.com,1999:blog-9757377.post-5752325024466233033Sat, 05 Apr 2014 12:47:00 +0000This is a test
http://blog.sigfpe.com/2014/04/this-is-test.html
Dan Piponitag:blogger.com,1999:blog-11295132.post-4225046217596306003Sat, 05 Apr 2014 01:05:00 +0000Calculating the Minimum Variance Portfolio in R, Pandas and IAP
https://www.fpcomplete.com/blog/2014/04/mvp
Introduction
As part of producing a demo for FP Complete's new IAP product, I wound up implementing the Minimum Variance Portfolio calculation for a stock portfolio in R, then in Haskell for the IAP, and finally in Python using the NumPy and SciPy extension libraries. I want to look at the process of writing each of these, and the resulting code, in this article. I am not an expert in these things, so if you know a better way to do them, please let me know. In particular, the R example was borrowed from someone else.
The function
First, we find a version of the formula for MVP that can be converted into those systems. I like the one used by Yue Kuen KWOK:
ω = Ω⁻¹1 / (1 ⋅ Ω⁻¹1)
where Ω is the covariance matrix of the returns for the stocks in question.
R version
The R version is fairly straightforward - we just need to put the pieces together:
minvar <- function (px){
Rt <- px[-1,]/px[-nrow(px),]-1
cov.Rt <- cov(Rt)
one.vec <- rep(1,nrow(cov.Rt))
num <- solve(cov.Rt) %*% one.vec
den <- as.numeric(t(one.vec) %*% num)
return(num/den)
}
Calculating returns can be done with standard matrix operations and slicing. The covariance function is built in, as is inverting a matrix (solve). Since the numerator Ω⁻¹1 appears in the denominator, I reuse its value there.
All the array operations were documented in the same place. That I only needed one unit vector was a bit of a surprise, but R sized it dynamically to work. That I had to transpose the unit vector and use the cross product operator (%*%) to get a dot product was a less pleasant surprise, but is apparently a standard R idiom.
Python version
The Python version is almost a direct copy of the R version.
def minvar(prices):
cov = np.cov((prices[1:] / prices[:-1] - 1).transpose())
vu = np.array(cov.shape[1] * [1], float)
num = np.dot(np.linalg.inv(cov), vu)
den = np.dot(vu, num)
return num / denIn this case, I passed the returns matrix to the covariant function directly. The NumPy dot function performs both cross products and dot products, and again the unit vector adopts it's length dynamically.Documentation was a bit more scattered. Being a mix of traditional imperative and object-oriented, some functions are functions in the module, and others are object methods. The biggest surprise was that the returns matrix needed to be transposed before the covariant was calculated.Haskell versionHaskell is not quite as obvious a translation of the R and Python versions, but is a more straightforward translation of the original formula - once you notice that Ω⁻¹ 1 has been factored into tv. It reads from top to bottom like the original, with the main formula at the top and the various terms defined below.Returns again use the standard Haskell idiom for slicing the array. This is a bit more verbose than either R or Python, as they are functions rather than special syntax.minVariance prices = tv / scalar (tvu <.> tv)
where rets = dropRows 1 prices / takeRows (rows prices - 1) prices -
(_, cov) = meanCov $ rets
vu = constant 1.0 (cols cov)
tvu = constant 1.0 (rows cov)
tv = inv cov <> vu
The documentation was again straightforward, with everything being a function in the hmatrix package. In this case, both unit vectors were needed, as Haskell does not scale them automatically. It was the least surprising in at least one way - it used a distinct dot product operator for the two vectors rather than transposing the unit vector - whether implicitly like Python or explicitly like R - in a matrix product.PerformanceWhile performance comparisons with IAP aren't very useful, as it runs in the cloud so comparisons on identical systems may be impossible, Haskell does have one interesting advantage.All three of these systems have the same basic architecture - a high-level language running in an interactive environment, with bindings to low-level, fast implementations of the matrix manipulations. Haskell adds the ability to compile your program into native x86_64 code. Doing so reduced the wall clock time of this short demo by roughly two orders of magnitude.SummaryI found the IAP version a little easier to deal with. Having custom operators and functions for everything - and this will only get better with time - along with being able to use the mathematical layout of the where statement just made things a little simpler. While not having unit vectors that automatically size themselves - or transpose into matrices - is a little inconvenient, this also exposed a problem in the original R version, in that the unit vector's length was initially set wrong. I'm not sure that will make any real difference, but the thought that the language can catch such errors for me is comforting.https://www.fpcomplete.com/blog/2014/04/mvpFri, 04 Apr 2014 20:07:00 +0000Haskell error reporting with locations
http://augustss.blogspot.com/2014/04/haskell-error-reporting-with-locations.html
augustsstag:blogger.com,1999:blog-25541020.post-8125460940361199531Fri, 04 Apr 2014 07:28:00 +0000Premature optimization: it's not just about code
http://feedproxy.google.com/~r/blogspot/hSASX/~3/oMbF7HE3tUw/premature-optimization-its-not-just.html
When we talk about performance, we shouldn't look only at the performance of code execution. There are more types that we should care about, like time to market, development time and processes. Actually, we shouldn't look at any of the performance measures from a single perspective. As I see it, all these measures sum up to constitute a single value of business performance. There are lots of things that can slow down your business, and if you have a slow business, you'll have a hard time keeping pace with your rivals. As you discover that some areas of your company may be slowing you down, you may become tempted to add key performance indicators to each step, trying to squeeze all possible value out of each part of your processes. This can bring up a whole new set of problems: your business can be seen as a complex system, and it will adapt itself to render good results even in chaotic scenarios. [1] So, to avoid going nuts with indicators all over the place, you decide to avoid bottlenecks from the start. For each and every new thing you need to deliver, you'll start a precise planning routine, choosing among a plethora of providers, technologies and unicorns, to avoid at all costs having a new bottleneck in the future. This is what I like to call premature optimization in business performance. Simply put: you can't have a business if you don't have a business. How can you possibly know all the events that will have an impact on the decision you take today? You can't.I'm not saying that planning is wrong or a Bad Thing™. What I really want to avoid is losing energy on stuff that won't make a difference. You may spend a whole year choosing between apples and oranges, and in the end be crushed by some weirdo playing around with random technology. Why? Because he was not worried about future bottlenecks. He was really trying to do something cool, and he did it while you were arguing in endless and purposeless meetings. You're always trading performance measurements. 
For example, everyone is talking about service oriented architecture (SOA), but what does it really mean in terms of business? It means a tradeoff between code execution performance and other benefits to the business as a whole, like decentralized growth and continuous delivery. Or, you may look at the difference between Apple's App Store and Google's Play Store model. There is a huge tradeoff of quality assurance and time to market between these two players: one offers their customers (smartphone owners) quality-assured apps; the other offers their customers (operating system users) fast time to market for their apps. The big question really is: what are you trying to deliver to your customer? Are you trying to deliver software over time or value over time? I bet most of you are trying to deliver value over time, so why bother with premature optimization of technologies or processes? Let the bad stuff die on its own: failing fast is orders of magnitude better than not doing anything at all. And let the good stuff accompany you to where you want to go. Don't bother guessing the future, embrace the uncertainty of life: you can't predict how a complex system will behave, you can't know how your systems, services or whatever you are doing will be used in the real world. Put it into the wild, take the complaints and tune it (or throw it away) afterwards; this is a constant process. [2] Start measuring how much you are delivering over time and stop over-planning. Your end goal is to deliver stuff. The world couldn't care less about what you can do, it only cares about what you do. ReferencesComplex Systems and Key Performance Indicators - Márcio Geovani JasinskiComplaint-Driven Development - Jeff AtwoodThiago Negritag:blogger.com,1999:blog-5832470180761099563.post-2111154740007733423Thu, 03 Apr 2014 16:46:00 +0000Heads up: scotty-hastache 0.2.1
http://parenz.wordpress.com/2014/04/03/scotty-hastache-0-2-1/
To accommodate the release of hastache-0.6, I’ve updated the scotty-hastache package to version 0.2.1. Tagged: haskell, hastache, scotty, webDanhttp://parenz.wordpress.com/?p=163Thu, 03 Apr 2014 13:59:11 +0000A small Haskell extension
http://augustss.blogspot.com/2014/04/a-small-haskell-extension.html
augustsstag:blogger.com,1999:blog-25541020.post-5657007189613124192Thu, 03 Apr 2014 07:40:00 +0000Consolidation progress: shakespeare, conduit, http-client
http://www.yesodweb.com/blog/2014/04/consolidation-progress
My last blog post detailed a number of changes I was going to be making for package consolidation. A number of those have gone through already; this blog post is just a quick summary of the changes.shakespeareshakespeare is now a single package. hamlet, shakespeare-css, shakespeare-js,
shakespeare-i18n, shakespeare-text, and servius have all been merged in and
marked as deprecated. I've also uploaded new, empty versions of those
deprecated packages. This means that, in order to support both the old and new
versions of shakespeare, you just need to ensure that you have both the
shakespeare and deprecated packages listed in your cabal file. In other words,
if previously you depended on hamlet, now you should depend on hamlet and
shakespeare. When you're ready to drop backwards compatibility, simply put a
lower bound of >= 2.0 on shakespeare and remove the deprecated packages.(Note: this method for dealing with deprecated packages is identical for all
future deprecations, I won't detail the steps in the rest of this blog post.)conduitconduit-extra now subsumes attoparsec-conduit, blaze-builder-conduit,
network-conduit, and zlib-conduit. It also includes three modules that used to
be in conduit itself: .Text, .Binary, and .Lazy. To deal with this change,
simply adding conduit-extra to your dependencies should be sufficient.The other changes have to do with resourcet. In particular:Data.Conduit no longer reexports identifiers from resourcet and
monad-control. These should be imported directly from their sources.Instead of defining its own MonadThrow typeclass, resourcet now uses the
MonadThrow typeclass from the exceptions package. For backwards
compatibility, Control.Monad.Trans.Resource provides monadThrow as an alias
for the new throwM function.The Resource monad had a confusing name, in that it wasn't directly related to the ResourceT transformer. I've renamed it to Acquire, and put it in its own module (Data.Acquire).I'm actually very happy with Acquire, and think it's a great alternative to hard-coding either the bracket pattern or resourcet into libraries. I'm hoping to add better support to WAI for Acquire, and blog a bit more about the usage of Acquire.MonadUnsafeIO has been removed entirely. All of its functionality can be replaced with MonadPrim and MonadBase (for example, see the changes to blaze-builder-conduit).MonadActive, which is only needed for Data.Conduit.Lazy, has been moved to that module.http-clienthttp-client-multipart has been merged into http-client. In addition, instead of using the failure package, http-client now uses the exceptions package.http-client-conduit has been merged into http-conduit. I've also greatly expanded the Network.HTTP.Client.Conduit module to contain what I consider its next-gen API. In particular:No usage of ResumableSource.Instead of explicit ResourceT usage, it uses the Acquire monad and bracket pattern (acquireResponse, withResponse).Instead of explicitly passing around a Manager, it uses MonadReader and the HasHttpManager typeclass.I'm curious how people like the new API. I have no plans on removing or changing the current Network.HTTP.Conduit module, this is merely an alternative approach.Updated yesod-platformI've also released a new version of yesod-platform that uses the new versions
of the packages above. A number of packages on Hackage still depend on conduit
1.0, but I've sent quite a few pull requests in the past few days to get things
up-to-date. Thankfully, maintaining compatibility with both 1.0 and 1.1 is
pretty trivial.http://www.yesodweb.com/blog/2014/04/consolidation-progressThu, 03 Apr 2014 07:07:00 +0000Server Game Developer at Quark Games (Full-time)
http://functionaljobs.com/jobs/8701-server-game-developer-at-quark-games
urn:uuid:8842c567-d462-3cad-3354-954134280d66Wed, 02 Apr 2014 23:36:10 +0000Haddock 2.14.2
http://fuuzetsu.co.uk/blog/posts/2014-04-02-Haddock-2.14.2.html
http://fuuzetsu.co.uk/blog/posts/2014-04-02-Haddock-2.14.2.htmlWed, 02 Apr 2014 23:03:20 +0000Student’s T and Space Leaks
http://idontgetoutmuch.wordpress.com/2014/04/02/students-t-and-space-leaks/
Introduction
The other speaker at the Machine Learning Meetup at which I gave my talk on automatic differentiation gave a very interesting talk on A/B testing. Apparently this is big business these days as attested by the fact I got 3 ads above the wikipedia entry when I googled for it.
It seems that people tend to test with small sample sizes and to do so very often, resulting in spurious results. Of course readers of XKCD will be well aware of some of the pitfalls.
I thought a Bayesian approach might circumvent some of the problems and set out to write a blog article, only to discover that there was no Haskell library for sampling from Student’s t. Actually there was one, but it is currently an unreleased part of random-fu. So I set about fixing this shortfall.
I thought I had better run a few tests so I calculated the sampled mean, variance, skewness and kurtosis.
I wasn’t really giving this my full attention and as a result ran into a few problems with space. I thought these were worth sharing and that is what this blog post is about. Hopefully, I will have time soon to actually blog about the Bayesian equivalent of A/B testing.
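As a point of comparison (this is not the random-fu implementation), the classic construction T = Z / √(V/ν), with Z standard normal and V chi-squared with ν degrees of freedom, can be sketched in plain Python; the function name and seed below are illustrative:

```python
import math
import random

def student_t(nu, rng=random):
    # T = Z / sqrt(V / nu): Z standard normal, V chi-squared with
    # nu degrees of freedom (built as a sum of nu squared normals)
    z = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    return z / math.sqrt(v / nu)

random.seed(8)  # an arbitrary seed, echoing the one used below
samples = [student_t(6) for _ in range(100000)]
```

With ν = 6 the distribution has mean 0 and variance ν/(ν−2) = 1.5, which the sampled moments should approach.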
Preamble
> {-# OPTIONS_GHC -Wall #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans #-}
>
> {-# LANGUAGE NoMonomorphismRestriction #-}
>
> module StudentTest (
> main
> ) where
>
> import qualified Data.Vector.Unboxed as V
> import Data.Random.Source.PureMT
> import Data.Random
> import Data.Random.Distribution.T
> import Control.Monad.State
> import Data.Histogram.Fill
> import Data.Histogram.Generic ( Histogram )
> import Data.List
Space Analysis
Let’s create a reasonable number of samples as the higher moments converge quite slowly.
> nSamples :: Int
> nSamples = 1000000
An arbitrary seed for creating the samples.
> arbSeed :: Int
> arbSeed = 8
Student’s t only has one parameter, the number of degrees of freedom.
> nu :: Integer
> nu = 6
Now we can do our tests by calculating the sampled values.
> ts :: [Double]
> ts =
> evalState (replicateM nSamples (sample (T nu)))
> (pureMT $ fromIntegral arbSeed)
> mean, variance, skewness, kurtosis :: Double
> mean = (sum ts) / fromIntegral nSamples
> variance = (sum (map (**2) ts)) / fromIntegral nSamples
> skewness = (sum (map (**3) ts) / fromIntegral nSamples) / variance**1.5
> kurtosis = (sum (map (**4) ts) / fromIntegral nSamples) / variance**2
This works fine for small sample sizes but not for the number we have chosen.
./StudentTest +RTS -hc
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
It seems a shame that the function in the Prelude has this behaviour, but never mind; let us ensure that we consume values strictly (they are being produced lazily).
> mean' = (foldl' (+) 0 ts) / fromIntegral nSamples
> variance' = (foldl' (+) 0 (map (**2) ts)) / fromIntegral nSamples
> skewness' = (foldl' (+) 0 (map (**3) ts) / fromIntegral nSamples) / variance'**1.5
> kurtosis' = (foldl' (+) 0 (map (**4) ts) / fromIntegral nSamples) / variance'**2
We now have a space leak on the heap as using the ghc profiler below shows. What went wrong?
If we only calculate the mean using foldl then all is well. Instead of 35M we only use 45K.
Well that gives us a clue. The garbage collector cannot reclaim the samples as they are needed for other calculations. What we need to do is calculate the moments strictly altogether.
Let’s create a strict record to do this.
> data Moments = Moments { m1 :: !Double
> , m2 :: !Double
> , m3 :: !Double
> , m4 :: !Double
> }
> deriving Show
And calculate the results strictly.
>
> m = foldl' (\m x -> Moments { m1 = m1 m + x
> , m2 = m2 m + x**2
> , m3 = m3 m + x**3
> , m4 = m4 m + x**4
> }) (Moments 0.0 0.0 0.0 0.0) ts
>
> mean'' = m1 m / fromIntegral nSamples
> variance'' = m2 m / fromIntegral nSamples
> skewness'' = (m3 m / fromIntegral nSamples) / variance''**1.5
> kurtosis'' = (m4 m / fromIntegral nSamples) / variance''**2
Now we have what we want; the program runs in small constant space.
> main :: IO ()
> main = do
> putStrLn $ show mean''
> putStrLn $ show variance''
> putStrLn $ show skewness''
> putStrLn $ show kurtosis''
Oh and the moments give the expected answers.
ghci> mean''
3.9298418844289093e-4
ghci> variance''
1.4962681916693004
ghci> skewness''
1.0113188204317015e-2
ghci> kurtosis''
5.661776268997382
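For comparison with the strict Haskell fold, the same single-pass accumulation of raw moments can be sketched in Python (an illustration, not part of the original post), where eager evaluation gives constant space for free:

```python
def raw_moments(xs):
    # accumulate the first four raw (non-central) moments in one
    # pass, just like the strict Moments record above
    n = m1 = m2 = m3 = m4 = 0.0
    for x in xs:
        n += 1
        m1 += x
        m2 += x ** 2
        m3 += x ** 3
        m4 += x ** 4
    mean = m1 / n
    variance = m2 / n
    skewness = (m3 / n) / variance ** 1.5
    kurtosis = (m4 / n) / variance ** 2
    return mean, variance, skewness, kurtosis
```

Because the input is consumed one element at a time, xs can be a generator, so the samples need never all be resident in memory.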
Running the Code
To run this you will need my version of random-fu. The code for this article is here. You will need to compile everything with profiling, something like
ghc -O2 -main-is StudentTest StudentTest.lhs -prof
-package-db=.cabal-sandbox/x86_64-osx-ghc-7.6.2-packages.conf.d
Since you need all the packages to be built with profiling, you will probably want to build using a sandbox as above. The only slightly tricky aspect is building random-fu so it is in your sandbox.
runghc Setup.lhs configure --enable-library-profiling
--package-db=/HasBayes/.cabal-sandbox/x86_64-osx-ghc-7.6.2-packages.conf.d
--libdir=/HasBayes/.cabal-sandbox/libDominic Steinitzhttp://idontgetoutmuch.wordpress.com/?p=659Wed, 02 Apr 2014 10:17:00 +0000Embedded Security in the Mainstream
http://leepike.wordpress.com/2014/04/01/embedded-security-in-the-mainstream/
Science Friday had an interesting podcast with Bruce Schneier worth a listen. It discussed security in embedded systems (or as is hip to say now, the Internet of Things). The idea of security in embedded systems seemed pretty obscure just a few years ago, so it’s nice to see it being featured on a general-audience radio show.
So who’s going to listen to it from their phone, connected to their car’s audio over Bluetooth, on the way to the store to buy a Nest smoke alarm?Lee Pikehttp://leepike.wordpress.com/?p=681Wed, 02 Apr 2014 04:21:04 +0000Exceptional Testing
http://neilmitchell.blogspot.com/2014/04/exceptional-testing.html
Neil Mitchelltag:blogger.com,1999:blog-7094652.post-37638653305136889Tue, 01 Apr 2014 20:57:00 +0000Fixing foldl
http://www.well-typed.com/blog/90
The foldl function is broken. Everyone knows it’s broken. It’s been broken for nearly a quarter of a century. We should finally fix it!
Today I am proposing that Prelude.foldl be redefined using the implementation currently known as Data.List.foldl'.
foldl is broken!
I’m sure you knew that already, but just in case…
Have you ever noticed that Haskellers usually recommend using either foldr or foldl' but not foldl? For example Real World Haskell has this to say:
Due to the thunking behaviour of foldl, it is wise to avoid this function in real programs: even if it doesn’t fail outright, it will be unnecessarily inefficient. Instead, import Data.List and use foldl'.
In the online version of the book the first user comments on that paragraph are
Why isn’t Data.List foldl implementation placed in Prelude?
I second the question: Why isn’t foldl' the default?
Good question.
Ok, so obviously we’re talking about the difference between foldl and foldl':
foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f a [] = a
foldl f a (x:xs) = foldl f (f a x) xs
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a [] = a
foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs
The dry technical difference is that foldl' evaluates the call to f before making the next recursive call. The consequences of that are perhaps not immediately obvious, so we’ll take a step back and look at a slightly bigger picture.
Folding left and right
When we first learn Haskell we learn that there are two ways to fold a list, from the left or right.
foldl f z [x1, x2, ..., xn] = (...((z `f` x1) `f` x2) `f`...) `f` xn
foldr f z [x1, x2, ..., xn] = x1 `f` (x2 `f` ... (xn `f` z)...)
Saying “from the left” or “from the right” is a description of what foldl and foldr calculate, with the parentheses nesting to the left or to the right. At runtime of course we always have to start from the left (front) of the list.
We later learn other ways of thinking about left and right folds, that a left fold can be used like a classic loop where we go through the whole list eagerly, while a right fold can be used like a demand-driven iterator. For the left fold that means using a function that is strict in its first argument (like (+)) while for the right fold that means using a function that is not strict in its second argument (like (:)).
Indeed when looking at whether we want foldl or foldr in any particular case our choice is usually governed by whether we want “all at once” behaviour (foldl) or if we want incremental or short-cut behaviour (foldr).
Accumulating thunks
Again, as we are learning Haskell, we get told that foldl has this crazy behaviour
foldl (+) 0 (1:2:3:[])
= foldl (+) (0 + 1) (2:3:[])
= foldl (+) ((0 + 1) + 2) (3:[])
= foldl (+) (((0 + 1) + 2) + 3) []
= (((0 + 1) + 2) + 3)
when what we had in mind when we thought of an accumulating loop was
foldl' (+) 0 (1:2:3:[])
= foldl' (+) 1 (2:3:[])
= foldl' (+) 3 (3:[])
= foldl' (+) 6 []
= 6
Of course that’s just what foldl' does, it evaluates the call to + before making the next recursive call.
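The difference can be mimicked in Python (purely as an illustration: Python is strict, so here a Haskell thunk is modelled as a zero-argument closure):

```python
def foldl_lazy(f, z, xs):
    # model Haskell's foldl: the accumulator is a thunk that is
    # only forced once the whole list has been consumed
    acc = lambda: z
    for x in xs:
        acc = (lambda a, y: lambda: f(a(), y))(acc, x)
    return acc()  # forcing the final thunk unwinds the whole chain

def foldl_strict(f, z, xs):
    # model foldl': the accumulator is evaluated at every step
    acc = z
    for x in xs:
        acc = f(acc, x)
    return acc
```

Both compute the same sum on a short list, but forcing foldl_lazy's final thunk on a long list exhausts Python's recursion limit, much as foldl overflows the Haskell stack.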
When is foldl (rather than foldl') useful?
The short answer is “almost never”.
As beginners we often assume that foldl must still make sense in some cases (or why have both?) but it turns out that’s not really the case.
When the f argument of foldl is a strict function then delaying the evaluation does not gain us anything as it all has to be evaluated at the end anyway. The only time when delaying the evaluation could save us anything is when the f function is not strict in its first argument – in which case you either don’t care or probably should have been using foldr in the first place.
In fact even if our f function is non-strict in its first argument, we probably do not gain anything from delaying evaluation and it usually does no harm to evaluate earlier. Remember that we still have to traverse the whole list, we don’t get any short-cutting behaviour like with foldr.
We can, if we think about it, construct examples where foldl' would be too strict. We could define last and last' like this:
last = foldl (\_ y -> y) (error "empty list")
last' = foldl' (\_ y -> y) (error "empty list")
Now if we try
> last [1,undefined,3]
3
> last' [1,undefined,3]
*** Exception: Prelude.undefined
This is because our accumulator is always the latest element that we’ve seen but we don’t actually want to evaluate the elements (except the last one).
So it’s true that foldl' fails in this case, but it’s also a silly definition, the usual definition is a lot clearer
last [x] = x
last (_:xs) = last xs
last [] = error "empty list"
That goes for pretty much all the other examples you might be able to think of where foldl would work but foldl' would not: the examples are either artificial or are clearer written in other ways.
People sometimes point out that sum is defined using foldl and not foldl' and claim that this is something to do with Haskell’s designers wanting to allow Num instances where (+) might be lazy. This is pretty much nonsense. If that were the case then sum would have been defined using foldr rather than foldl to properly take advantage of short-cutting behaviour. A much simpler explanation is that foldl' was not available in early versions of Haskell when sum was defined.
In nearly 15 years as a Haskell programmer I think I’ve specifically needed foldl rather than foldl' about three times. I say “about” because I can only actually remember one. That one case was in a slightly cunning bit of code for doing cache updates in a web server. It would almost certainly have been clearer as a local recursion but I was amused to find a real use case for foldl and couldn’t help myself from using it just for fun. Of course it needed a comment to say that I was using it on purpose rather than by mistake!
So why do we have foldl and foldl'?
If foldl is almost always a mistake (or merely benign) then why do we have it in the first place?
I don’t know for sure, but here’s my guess…
When Haskell 1.0 was published on this day 24 years ago there was no seq function at all, so there was no choice but to define foldl in the “classic” way.
Eventually, six years later after much discussion, we got the seq function in Haskell 1.3. Though actually in Haskell 1.3 seq was part of an Eval class, so you couldn’t use it just anywhere, such as in foldl. In Haskell 1.3 you would have had to define foldl' with the type:
foldl' :: Eval b => (b -> a -> b) -> b -> [a] -> b
Haskell 1.4 and Haskell 98 got rid of the Eval class constraint for seq but foldl was not changed. Hugs and GHC and other implementations added the non-standard foldl'.
I suspect that people then considered it a compatibility and inertia issue. It was easy enough to add a non-standard foldl' but you can’t so easily change the standard.
I suspect that if we had had seq from the beginning then we would have defined foldl using it.
Miranda, one of Haskell’s predecessor languages, already had seq 5 years before Haskell 1.0.
A strict foldl in Orwell!
Orwell is an interesting case. Orwell was another Haskell predecessor, very similar to Miranda and early Haskell. An informed source told me that Orwell had defined its foldl in the way that we now define foldl', ie with strict evaluation. Information on Orwell is a little hard to get ahold of online these days so I asked Philip Wadler. Phil very kindly fished out the manuals and looked up the definitions for me.
In the original version:
An Introduction to Orwell
(DRAFT)
Philip Wadler
1 April 1985
In the standard prelude:
lred f a [] = a
lred f a (x:xs) = lred f (f a x) xs
But five years later, by the time Haskell 1.0 is being published…
An Introduction to Orwell 6.00
by Philip Wadler
revised by Quentin Miller
Copyright 1990 Oxford University Computing Lab
In the standard prelude:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f a [] = a
foldl f a (x:xs) = strict (foldl f) (f a x) xs
Note the use of strict. Presumably Orwell’s strict function was defined as (or equivalent to)
strict :: (a -> b) -> a -> b
strict f x = x `seq` f x
(These days in Haskell we call this function ($!).)
So my source was right, Orwell did change foldl to be the strict version!
I contend that this was and is the right decision, and that it was just a consequence of the late arrival of seq in Haskell and inertia and fears about backwards compatibility that have kept us from fixing foldl.
Just do it!
It’d help all of us who are sometimes tempted to use foldl because we can’t be bothered to import Data.List. It’d help confused beginners. It’d save teachers from the embarrassment of having to first explain foldl and then why you should never use it.
Orwell fixed this mistake at least 24 years ago, probably before Haskell 1.0 was released. Just because it’s an old mistake doesn’t mean we shouldn’t fix it now!
A postscript: which foldl'?
I hate to complicate a simple story but I should admit that there are two plausible definitions of foldl' and I’ve never seen any serious discussion of why we use one rather than the other (I suspect it’s another historical accident).
So the version above is the “standard” version, perhaps more clearly written using bang patterns as
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a [] = a
foldl' f a (x:xs) = let !a' = f a x
in foldl' f a' xs
But an equally plausible version is
foldl'' :: (b -> a -> b) -> b -> [a] -> b
foldl'' f !a [] = a
foldl'' f !a (x:xs) = foldl'' f (f a x) xs
The difference is this: in the first version we evaluate the new accumulating parameter before making the recursive call, while in the second version we ensure that the accumulating parameter is always evaluated (to WHNF) on entry into each call.
These two ways of forcing the evaluation have almost the same effect. It takes a moment or two to find a case where it makes a difference, but here’s one:
foldl' (\_ y -> y) undefined [1] = 1
foldl'' (\_ y -> y) undefined [1] = undefined
The standard foldl' ensures that all the new accumulating parameter values are evaluated, but still allows the initial value to be unevaluated. The alternative version simply says that the accumulating parameter is always evaluated.
The second version is attractive from a code generation point of view. One of the clever things that GHC can do with foldl' (and strict function arguments generally) is to unbox the values, so for example an Int can be unboxed to an Int# and passed in registers rather than on the heap. With the standard foldl' it needs a special case for the first iteration of the loop, where the initial value of the accumulating parameter might not be evaluated yet. With foldl'', that’s not a problem: we can unbox things right from the start. In practice, GHC can often see that the initial value is evaluated anyway, but not always.
(Don Stewart and I noticed this a few years back when we were working on stream fusion for lists. We had defined foldl' on streams in a way that corresponded to the second form above and then got a test failure when doing strictness testing comparing against the standard list functions.)
So if we’re going to fix foldl to be the strict version, then perhaps it should be the fully strict version, not just the “strict after the first iteration” version.duncanhttp://www.well-typed.com/blog/90Tue, 01 Apr 2014 10:38:20 +0000Calculating Shanten in Mahjong
http://feedproxy.google.com/~r/ezyang/~3/Sox7M1xX8s4/
Move aside, poker! While the probabilities of various poker hands are well understood and tabulated, the Chinese game of chance Mahjong [1] enjoys a far more intricate structure of expected values and probabilities. [2] This is due in large part to the much larger variety of tiles available (136 tiles, as opposed to the standard playing card deck size of 52), as well as the turn-by-turn game play, which means there is quite a lot of strategy involved in what is ostensibly a game of chance. In fact, the subject is so intricate that I’ve decided to write my PhD thesis on it. This blog post is a condensed version of one chapter of my thesis, considering the calculation of shanten, which we will define below. I’ll be using Japanese terms, since my favorite variant of mahjong is Riichi Mahjong; you can consult the Wikipedia article on the subject if you need to translate.
Calculating Shanten
The basic gameplay of Mahjong involves drawing a tile into a hand of thirteen tiles, and then discarding another tile. The goal is to form a hand of fourteen tiles (that is, after drawing, but before discarding a tile) which is a winning configuration. There are a number of different winning configurations, but most winning configurations share a similar pattern: the fourteen tiles must be grouped into four triples and a single pair. Triples are either three of the same tile, or three tiles in a sequence (there are three “suits” which can be used to form sequences); the pair is two of the same tiles. Here is an example:
Represented numerically, this hand consists of the triples and pairs 123 55 234 789 456.
One interesting quantity that is useful to calculate given a mahjong hand is the shanten number, that is, the number of tiles away from winning you are. This can be used to give you the most crude heuristic of how to play: discard tiles that get you closer to tenpai. The most widely known shanten calculator is this one on Tenhou’s website [3]; unfortunately, the source code for this calculator is not available. There is another StackOverflow question on the subject, but the “best” answer offers only a heuristic approach with no proof of correctness! Can we do better?
Naïvely, the shanten number is a breadth first search on the permutations of a hand. When a winning hand is found, the algorithm terminates and indicates the depth the search had gotten to. Such an algorithm is obviously correct; unfortunately, with 136 tiles, one would have to traverse hands (choices of new tiles times choices of discard) while searching for a winning hand that is n-shanten away. If you are four tiles away, you will have to traverse over six trillion hands. We can reduce this number by avoiding redundant work if we memoize the shanten associated with hands: however, the total number of possible hands is roughly , or 59 bits. Though we can fit (via a combinatorial number system) a hand into a 64-bit integer, the resulting table is still far too large to hope to fit in memory.
The trick is to observe that shanten calculation for each of the suits is symmetric; thus, we can use dynamic programming over a much smaller space of the tiles 1 through 9 for some generic suit, and then reuse these results when assembling the final calculation. is still rather large, so we can take advantage of the fact that because there are four copies of each tile, an equivalent representation is a 9-vector of the numbers zero to four, with the constraint that the sum of these numbers is 13. Even without the constraint, the count is only two million, which is quite tractable. At a byte per entry, that’s 2MB of memory; less than your browser is using to view this webpage. (In fact, we want the constraint to actually be that the sum is less than or equal to 13, since not all hands are single-suited, so the number of tiles in a hand is less.)
The breadth-first search for solving a single suit proceeds as follows:
Initialize a table A indexed by tile configuration (a 9-vector of 0..4).
Initialize a todo queue Q of tile configurations.
Initialize all winning configurations in table A with shanten zero (this can be done by enumeration), recording these configurations in Q.
While the todo queue Q is not empty, pop the front element, mark the shanten of all adjacent uninitialized nodes as one greater than that node, and push those nodes onto the todo queue.
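The table-filling procedure above can be sketched generically in Haskell. This is a hedged illustration on a toy state space (integers 0 to 9 standing in for tile configurations); `bfsLabel` and its neighbour function are names of my own invention, not taken from any real shanten calculator:

```haskell
import qualified Data.Map.Strict as M

-- Label every reachable node with its BFS distance from the seed set,
-- mirroring the table/queue procedure described above: seeds get 0,
-- and each newly seen neighbour gets one more than the node it came from.
bfsLabel :: Ord a => (a -> [a]) -> [a] -> M.Map a Int
bfsLabel neighbours seeds = go seeds (M.fromList [(s, 0) | s <- seeds])
  where
    go []       table = table
    go (x : xs) table =
      let d      = table M.! x
          new    = [y | y <- neighbours x, not (M.member y table)]
          table' = foldr (\y t -> M.insert y (d + 1) t) table new
      in go (xs ++ new) table'  -- appending keeps FIFO (breadth-first) order

main :: IO ()
main = do
  -- Toy space: each "configuration" n is adjacent to n-1 and n+1.
  let step n = [m | m <- [n - 1, n + 1], m >= 0, m <= 9]
  print (M.findWithDefault (-1) 5 (bfsLabel step [0]))  -- distance 5
```

In the real calculator, the seed set is the enumerated winning configurations and the neighbour relation is "add or remove one tile"; the labels are then exactly the per-suit shanten numbers.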
With this information in hand, we can assemble the overall shanten of a hand. It suffices to try every distribution of the triples and the pair over the four types of tiles (also including null tiles), consulting the shanten of the requested shape, and return the minimum of all these configurations. There are (by stars and bars) 35 ways to distribute the four triples, times 4 choices for the pair, for a total of 140 configurations. Computing the shanten of each configuration is a constant time operation into the lookup table generated by the per-suit calculation. A true shanten calculator must also accommodate the rare other hands which do not follow this configuration, but these winning configurations are usually highly constrained, and quite easy to (separately) compute the shanten of.
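The count of 140 can be verified by direct enumeration; a throwaway check (names mine):

```haskell
-- Distribute 4 triples over 4 tile groups: nonnegative counts summing
-- to 4, which stars and bars says is C(4+3, 3) = 35 ways. Then pick
-- which of the 4 groups holds the pair: 35 * 4 = 140 configurations.
tripleSplits :: [[Int]]
tripleSplits =
  [ [a, b, c, d]
  | a <- [0 .. 4], b <- [0 .. 4], c <- [0 .. 4], d <- [0 .. 4]
  , a + b + c + d == 4 ]

main :: IO ()
main = do
  print (length tripleSplits)      -- 35
  print (length tripleSplits * 4)  -- 140
```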
With a shanten calculator, there are a number of other quantities which can be calculated. Uke-ire refers to the number of possible draws which can reduce the shanten of your hand: one strives for high uke-ire because it means a higher probability that you will draw a tile which moves your hand closer to winning. Given a hand, it's very easy to calculate its uke-ire: just look at all adjacent hands and count the number of hands which have lower shanten.
Further extensions
Suppose that you are trying to design an AI which can play Mahjong. Would the above shanten calculator provide a good evaluation metric for your hand? Not really: it has a major drawback, in that it does not consider the fact that some tiles are simply unavailable (they were discarded). For example, if all four “nine stick” tiles are visible on the table, then no hand configuration containing a nine stick is actually reachable. Adjusting for this situation is actually quite difficult, for two reasons: first, we can no longer precompute a shanten table, since we need to adjust at runtime what the reachability metric is; second, the various suits are no longer symmetric, so we have to do three times as much work. (We can avoid an exponential blowup, however, since there is no inter-suit interaction.)
Another downside of the shanten and uke-ire metrics is that they are not direct measures of “tile efficiency”: that is, they do not directly dictate a strategy for discards which minimizes the expected time before you get a winning hand. Consider, for example, a situation where you have the tiles 233, and only need to make another triple in order to win. You have two possible discards: you can discard a 2 or a 3. In both cases, your shanten is zero, but discarding a 2, you can only win by drawing a 3, whereas discarding a 3, you can win by drawing a 1 or a 4. Maximizing efficiency requires considering the lifetime uke-ire of your hands.
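The 233 example can be checked mechanically. In this hedged sketch, `completions` is a hypothetical helper of my own that lists the draws turning two tiles of one suit into a complete meld:

```haskell
import Data.List (sort)

-- Draws (1..9) that complete a two-tile partial set into a meld:
-- either a triple (three equal tiles) or a run (three consecutive).
completions :: Int -> Int -> [Int]
completions a b = [ t | t <- [1 .. 9], isMeld (sort [a, b, t]) ]
  where
    isMeld [x, y, z] = (x == y && y == z) || (x + 1 == y && y + 1 == z)
    isMeld _         = False

main :: IO ()
main = do
  print (completions 3 3)  -- kept 33 after discarding the 2: only a 3 wins
  print (completions 2 3)  -- kept 23 after discarding a 3: a 1 or a 4 wins
```

Both holdings are zero shanten, yet the second waits on twice as many tile kinds, which is exactly the efficiency difference the plain shanten number fails to capture.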
Even then, perfect tile efficiency is not enough to see victory: every winning hand is associated with a point-score, and so in many cases it may make sense to go for a lower-probability hand that has higher expected value. Our decomposition method completely falls apart here, as while the space of winning configurations can be partitioned, scoring has nonlocal effects, so the entire hand has to be considered as a whole. In such cases, one might try for a Monte Carlo approach, since the probability space is too difficult to directly characterize. However, in the Japanese Mahjong scoring system, there is yet another difficulty with this approach: the scoring system is exponential. Thus, we are in a situation where the majority of samples will be low scoring, but exponentially few samples have exponential payoff. In such cases, it’s difficult to say if random sampling will actually give a good result, since it is likely to miscalculate the payoff, unless exponentially many samples are taken. (On the other hand, because these hands are so rare, an AI might do considerably well simply ignoring them.)
To summarize, Mahjong is a fascinating game, whose large state space makes it difficult to accurately characterize the probabilities involved. In my thesis, I attempt to tackle some of these questions; please check it out if you are interested in more.
[1] No, I am not talking about the travesty that is mahjong solitaire.
[2] To be clear, I am not saying that poker strategy is simple—betting strategy is probably one of the most interesting parts of the game—I am simply saying that the basic game is rather simple, from a probability perspective.
[3] Tenhou is a popular Japanese online mahjong client. The input format for the Tenhou calculator is 123m123p123s123z, where numbers before m indicate man tiles, p pin tiles, s sou tiles, and z honors (in order, they are: east, south, west, north, white, green, red). Each entry indicates which tile you can discard to move closer to tenpai; the next list is of uke-ire (and the number of tiles which move the hand further).

Edward Z. Yang | http://blog.ezyang.com/?p=8147 | Tue, 01 Apr 2014 08:20:46 +0000

Fun and Profit with Strongly-Typed Data Schemas
http://www.well-typed.com/blog/89
Over the past few months, Duncan and I have been working with Chris Dornan and Alfredo Di Napoli on api-tools, a DSL for specifying data schemas for REST-like APIs in Haskell. If you’re interested in the real-world use of Haskell, static types and DSLs, why not come along to hear Chris talk about it?
Wednesday 9th April, 6:30pm, London
Find out more and register for free over at Skills Matter:
Typical business apps store structured data, transform it and send it hither and thither. They are typically made of multiple components that have to agree on the schema of the data they exchange. So a large part of what it means to be “flexible” is for it to be easy to modify and extend the schema of the data that the system handles.
Strong typing can help with this, ensuring that the code that accesses the data is consistent with the schema. One idea that has been tried with databases is to generate Haskell types from the database schema, or generate the database schema from the Haskell types, or both from some standalone schema.
In this talk we will describe how we have applied this same idea but for REST APIs. We have a DSL that we use to specify the schema of the data available via a REST API. With a machine description of the schema we can generate the code and documentation for many parts of the system, and of course those parts are then easily adaptable when the schema changes. We generate Haskell types, JSON conversion functions, REST API documentation, disk persistence and more. We also have a story for managing schema changes and migrating data. This system is in use at Iris Connect and is now available on Hackage.
This talk will also discuss further applications we have found for these tools and some of the experience from the development project at Iris Connect, including problems we have had to solve building the Haskell components of the system and the wider challenge of integrating it into the overall development project.
And if that isn’t enough for you, Well-Typed’s Haskell courses are back at the end of April, with an all-new course on advanced features of the Haskell type system. Stay tuned for more events coming soon…

adam | http://www.well-typed.com/blog/89 | Tue, 01 Apr 2014 07:24:01 +0000

Haskell structured diffs
http://chrisdone.com/posts/haskell-diff
http://chrisdone.com/posts/haskell-diff | Tue, 01 Apr 2014 00:00:00 +0000

Infrastructure Engineer at Zeta Project Germany GmbH (Full-time)
http://functionaljobs.com/jobs/8699-infrastructure-engineer-at-zeta-project-germany-gmbh
urn:uuid:2c828a4f-17b0-5797-eec9-5c9c3ac3e298 | Mon, 31 Mar 2014 18:14:23 +0000

Migrating to Static, Self-Hosted Blog
http://cppcabrera.blogspot.com/2014/03/migrating-to-static-self-hosted-blog.html
Hello, readers.

Soon, I'll be moving away from the Blogger platform to a statically hosted blog. I've been playing with hakyll for some time now, and am pleased with its support for syntax highlighting, markup formats, RSS/Atom generation, and packaged Warp. It'll be great to develop my blog with emacs, leveraging git for backup and rsync for deployment.

Expect:

* Less latency
* A home-grown design
* More code samples, with pretty highlighting

With possibly:

* More frequent posts

Stay tuned. I'll be releasing the new blog by the end of the week!

Alejandro Cabrera | tag:blogger.com,1999:blog-5533014087482697280.post-5131689201497099331 | Mon, 31 Mar 2014 15:50:00 +0000

ANN: hastache 0.6
http://parenz.wordpress.com/2014/03/31/hastache-0-6/
Announcing: hastache 0.6
Hastache is a Haskell implementation of the mustache templating system.
Quick start
cabal update
cabal install hastache
A simple example:
import Text.Hastache
import Text.Hastache.Context
import qualified Data.Text.Lazy.IO as TL
main = hastacheStr defaultConfig (encodeStr template) (mkStrContext context)
>>= TL.putStrLn
template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages."
context "name" = MuVariable "Haskell"
context "unread" = MuVariable (100 :: Int)
Read Mustache documentation for template syntax; consult README for more details.
What’s new in 0.6?
The interface of the library has been switched from ByteString to (lazy) Text. That means, for example, that the type of the hastacheStr function is now the following:
hastacheStr
:: MonadIO m => MuConfig m -> Text -> MuContext m -> m Text
The generic context generation (mkGenericContext) now supports functions of the types Text -> Text and Text -> m Text, as well as types with multiple constructors. That is, given a Haskell datastructure
data A = A { str :: String }
| B { num :: Int }
it is possible to write a template like this:
{{#A}}
A : {{str}}
{{/A}}
{{#B}}
B : {{num}}
{{/B}}
Please take a look at the multiple constructors example if you are interested.
Additionally, a couple of new MuVar instances have been added.
Links
hackage
source code
issue tracker
Acknowledgments
Special thanks to Mark Lentczner, Alexander Voikov and others who reported issues, tested the library and provided feedback.

Tagged: haskell, hastache

Dan | http://parenz.wordpress.com/?p=151 | Mon, 31 Mar 2014 07:47:49 +0000

One step forward, two steps back
http://chrisdone.com/posts/one-step-forward-two-steps-back
http://chrisdone.com/posts/one-step-forward-two-steps-back | Sun, 30 Mar 2014 00:00:00 +0000

My first computer
http://gergo.erdi.hu/blog/2014-03-29-my_first_computer
http://gergo.erdi.hu/blog/2014-03-29-my_first_computer | Sat, 29 Mar 2014 13:06:00 +0000

Listing licenses with cabal-db
http://tab.snarc.org/posts/haskell/2014-03-29-cabal-db-license.html
http://tab.snarc.org/posts/haskell/2014-03-29-cabal-db-license.html | Sat, 29 Mar 2014 00:00:00 +0000

Where credit belongs for Hack
http://www.serpentine.com/blog/2014/03/28/where-credit-belongs-for-hack/
It’s been exciting to see the positive reception in the tech press and the open source community to Hack, the new programming language that my team at Facebook released last week.
At the same time, one part of that coverage has made me wince: a couple of articles give the impression that I created Hack, but I didn’t. What I do is manage the Hack team, and since I’m incredibly proud of the work they do, I’d like to shine a little light on the people who really deserve the credit.
A few years ago, Julien Verlaguet and Alok Menghrajani had a notion that Facebook could improve some aspects of how it ships software. They developed a proof-of-concept of a system that could detect certain kinds of logic error in a program, and showed it to a few colleagues. Buoyed by the positive feedback they received, they decided to invest more effort in the project, which would in due course become Hack.
This process underscores one of the aspects I find most appealing about engineering at Facebook: engineers are welcome to try out bold ideas, and to push on with them if they look like they’ve got a good prospect of paying off.
The work that Julien and Alok did to get the project off the ground was far from simple. They had to design not just a type system, but one where the type checking algorithms could scale to run efficiently and incrementally across hundreds of thousands of source files. Not just a static type system, but one that could interoperate seamlessly with PHP’s dynamic types. Even the notionally simple act of watching the filesystem to quickly find changed files is impressively difficult; just ask anyone who’s had to wrestle with race conditions involving Linux’s inotify subsystem.
When presented with the early prototype of Hack, Drew Paroski went beyond simply offering support: he immediately signed up to enrich Hack by building the Collections system, a set of clean, modern APIs for dealing with bulk data. While the face that Collections presents to the programmer is easy to understand, there’s once again a lot of subtle work going on behind the scenes (including a lot of runtime support in HHVM, to make them efficient). Collections interoperate seamlessly with PHP arrays, both when a program is being typechecked and while it’s running.
While designing a language and building a typechecker are substantial efforts in their own right, a big reason that we’re excited about Hack is that it is already a success within Facebook. This is in large part due to the work of two engineers, Gabe Levi and Josh Watzman, who converted a large part of our codebase to use Hack’s type annotations. Pulling this off is a surprisingly subtle task. In a large dynamically typed codebase, there can be both latent type errors and code that is more elegantly left dynamically typed. Introducing static types on a large scale involves a series of judgment calls: is what we’re looking at an error; a piece of code that we can refactor so that we can express its types statically; or something that’s just fine as it stands?
As the rest of the team proceeded with the language design, engineering, and conversion work, Joel Marcey built out the HHVM and Hack documentation, while Eugene Letuchy both added features to the language (e.g. richer support for traits) and made significant contributions to our PHP-to-Hack conversion work.
I’m proud to be associated with such skilled people who put so much passion and care into their work. When it comes to Hack, I’m simply the messenger, and these engineers have all along been the ones making it happen.

Bryan O'Sullivan | http://www.serpentine.com/blog/?p=1038 | Fri, 28 Mar 2014 18:03:09 +0000

Announcing postgresql-simple-0.4.2
http://blog.melding-monads.com/2014/03/27/announcing-postgresql-simple-0-4-2/
With 19 new releases, postgresql-simple has seen substantial development since my announcement of version 0.3.1 nearly a year ago. Compared to my last update, the changes are almost exclusively bugfixes and new features. Little code should break as a result of these changes, and most if not all of the code that does break should fail at compile time.
As it stands today, postgresql-simple certainly has the best support for postgres-specific functionality of anything on hackage. So, for a few highlights:
Parametrized Identifiers
Thanks to contributions from Tobias Florek, perhaps the most exciting change for many is that postgresql-simple now properly supports parametrized identifiers, including column, table, and type names. These are properly escaped via libpq. So for example, you can now write:
execute conn "DROP TABLE IF EXISTS ? CASCADE"
(Only ("schema.table" :: QualifiedIdentifier))
Of course, this particular example is still potentially very insecure, but it shouldn’t be possible to create a SQL injection vulnerability this way.
The downside of this change is that postgresql-simple now requires libpq version 9.0 or later. However, community support for 8.4 is coming to an end this July. Also, it is possible to use newer versions of libpq to connect to older versions of postgres, so you don’t have to upgrade your server. (In fact, in one particular situation I’m still using postgresql-simple to connect to postgresql 8.1, although some features such as non-builtin types don’t work.)
Copy In and Copy Out support
While the postgresql-libpq binding has supported COPY FROM STDIN and COPY TO STDOUT for some time, postgresql-simple now supports these directly without having to muck around with postgresql-libpq calls via the Internal module.
If you are interested in streaming data to and from postgres, you may also be interested in higher-level COPY bindings for pipes and io-streams. This is available in Oliver Charles’s pipes-postgresql-simple, which also supports cursors, and my own unreleased postgresql-simple-streams, which also supports cursors and large objects.
Out-of-box support for JSON, UUID, and Scientific
Thanks to contributions from Bas van Dijk, postgresql-simple now provides FromField and ToField instances for aeson’s value type, Antoine Latter’s uuid, as well as scientific.
Savepoints and nestable folds
Thanks to contributions from Joey Adams, postgresql-simple now has higher-level support for savepoints in the Transaction module, as well as the ability to nest the fold operator. Previously, postgresql-simple had assigned a static name to the cursor that underlies fold, and now every connection has a counter used to generate temporary names.
Parametrized VALUES expressions
While executeMany and returning already support common use cases of this new feature, the new Values type allows you to parameterize more than just a single VALUES table literal.
For example, these situations commonly arise when dealing with writable common table expressions. Let’s say we have a table of things with an associated table of attributes:
CREATE TABLE things (
id SERIAL NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE attributes (
id INT NOT NULL REFERENCES things,
key TEXT NOT NULL,
value BIGINT NOT NULL
);
Then we can populate both a thing and its attributes with a single query, returning the id generated by the database:
query conn [sql|
WITH thing AS (
INSERT INTO things (name) VALUES (?) RETURNING id
), newattrs AS (
INSERT INTO attributes
SELECT thing.id, a.*
FROM thing JOIN ? a
)
SELECT id FROM thing;
|] ("bob", Values [ "text" ,"int8" ]
[ ( "foo" , 42 )
, ( "bar" , 60 ) ])
The empty case is also dealt with correctly; see the documentation for details.
Improved support for outer joins
A long standing challenge with postgresql-simple is dealing with outer joins in a sane way. For example, with the schema above, let’s say we want to fetch all the things along with their attributes, whether or not any given thing has any attributes at all. Previously, we could write:
getAllTheThings :: Connection -> IO [(Text, Maybe Text, Maybe Int64)]
getAllTheThings conn = do
query conn [sql|
SELECT name, key, value
FROM things LEFT OUTER JOIN attributes
ON things.id = attributes.id
|]
Now, the columns from the attributes table are not nullable, so normally we could avoid the Maybe constructors; however, the outer join changes that. Since both of these columns are not nullable, they are always both null or both not null, which is an invariant not captured by the type. And a separate Maybe for each column gets awkward to deal with, especially when more columns are involved.
What we would really like to do is change the type signature to:
getAllTheThings :: Connection -> IO [Only Text :. Maybe (Text, Int64)]
And now we can, and it will just work! Well, almost. The caveat is that there is a separate instance FromRow (Maybe ...) for most of the provided FromRow instances. This won’t work with your own FromRow instances unless you also declare a second instance. What’s really desired is a generic instance:
instance FromRow a => FromRow (Maybe a)
This instance would return Nothing if all the columns that would normally be consumed are null, and attempt a full conversion otherwise. This would reduce code bloat and repetition, and improve polymorphism and compositionality.
But alas, it’s not possible to define such an instance without changing the FromRow interface, and quite probably breaking everybody’s FromRow instances. Which I’m totally willing to do, once somebody comes up with a way to do it.

lpsmith | http://blog.melding-monads.com/?p=505 | Fri, 28 Mar 2014 00:46:56 +0000

Old neglected theorems are still theorems
http://existentialtype.wordpress.com/2014/03/20/old-neglected-theorems-are-still-theorems/
I have very recently been thinking about the question of partiality vs totality in programming languages, a perennial topic in PL’s that every generation thinks it discovers for itself. And this got me to remembering an old theorem that, it seems, hardly anyone knows ever existed in the first place. What I like about the theorem is that it says something specific and technically accurate about the sizes of programs in total languages compared to those in partial languages. The theorem provides some context for discussion that does not just amount to opinion or attitude (and attitude always seems to abound when this topic arises).
The advantage of a total programming language such as Goedel’s T is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of T that it precludes divergence.
Why, then, bother with a language such as PCF that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language M of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”) The notion does seem appealing until one actually tries to write a program in a language such as T.
Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in PCF by solving the following equations using general recursion:

gcd(m, 0) = m
gcd(0, n) = n
gcd(m, n) = gcd(m − n, n)   if m ≥ n > 0
gcd(m, n) = gcd(m, n − m)   if n > m > 0

The type of gcd defined in this manner is the partial function type nat ⇀ nat ⇀ nat, which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.
Now consider programming this function in T. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in T the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be a “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be. (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)
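To make the "timer that goes down by ones" concrete, here is an illustrative Haskell sketch (Haskell is a partial language, so this is an analogy, not Gödel's T itself). gcdGen uses general recursion by repeated subtraction; gcdFuel computes the same function but threads an explicit fuel argument that decreases by one on every call, the way an encoding into primitive recursion must. Both names are mine:

```haskell
-- General recursion: termination is an external proof obligation
-- (induction on m + n), not visible anywhere in the code itself.
gcdGen :: Int -> Int -> Int
gcdGen m 0 = m
gcdGen 0 n = n
gcdGen m n
  | m > n     = gcdGen (m - n) n
  | otherwise = gcdGen m (n - m)

-- Primitive-recursion style: an explicit timer baked into the code.
-- With fuel >= m + n the timer never runs out before the answer.
gcdFuel :: Int -> Int -> Int -> Int
gcdFuel 0    m _ = m                 -- fuel exhausted: give up
gcdFuel _    m 0 = m
gcdFuel _    0 n = n
gcdFuel fuel m n
  | m > n     = gcdFuel (fuel - 1) (m - n) n
  | otherwise = gcdFuel (fuel - 1) m (n - m)

main :: IO ()
main = do
  print (gcdGen 12 18)             -- 6
  print (gcdFuel (12 + 18) 12 18)  -- 6, with m + n as the termination bound
```

The point of the comparison: in the fueled version the termination bound is part of the running code and is paid for at run time, which is precisely the cost the post describes.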
But, one may argue, T is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?
We can, but at a cost. One limitation of total programming languages is that they are not universal: you cannot write an interpreter for T within T (see Chapter 9 of PFPL for a proof). More importantly, this limitation extends to any total language whatever. If this limitation does not seem important, then consider the Blum Size Theorem (BST) (from 1967), which places a very different limitation on total languages. Fix any total language, L, that permits writing functions on the natural numbers. Pick any computable blowup factor, as expansive as you wish to be. The BST states that there is a total function on the natural numbers that is programmable in L, but whose shortest program in L is larger by the given blowup factor than its shortest program in PCF!
The underlying idea of the proof is that in a total language the proof of termination of a program must be baked into the code itself, whereas in a partial language the termination proof is an external verification condition left to the programmer. Roughly speaking, there are, and always will be, programs whose termination proof is rather complicated to express, if you fix in advance the means by which it may be proved total. (In T it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.) But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.
There are ways around the BST, of course, and I am not saying otherwise. For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice. Could be, but I did mention the GCD in T problem for a reason: there are natural problems that are difficult to express in a language such as T. By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it. One could argue that such a function is of no practical use. I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a T-like language (rendering it, once again, useless). For another example, there is no inherent reason why termination need be assured by means similar to that used in T. We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one. The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality). But it’s not baked into the code itself, affecting its run-time; it is “off to the side” (large though it may be).
Updates: word smithing, fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.

Filed under: Programming, Research, Teaching. Tagged: functional programming, primitive recursion, programming languages, type theory, verification

Robert Harper | http://existentialtype.wordpress.com/?p=881 | Fri, 21 Mar 2014 03:15:55 +0000

Parallel SNP identification
http://blog.malde.org/posts/varan-parallel.html
http://blog.malde.org/posts/varan-parallel.html | Wed, 26 Mar 2014 06:00:00 +0000

Package consolidation
http://www.yesodweb.com/blog/2014/03/package-consolidation
A few weeks ago, there was a pretty heated debate about the PVP on the
libraries mailing
list.
I've seen a few different outcomes from that discussion. One is to reduce
dependency footprints to try and avoid some of the dependency problems.
(Another one is concrete proposals for changes to the PVP; I intend to post a
proposal as well in the coming days, but wanted to get the easier stuff out of
the way first.)

This blog post is about some plans I have for consolidating multiple packages
together, hopefully to result in simpler dependency trees without causing users
to have unneeded package dependencies- at least not too often. The reason I'm
writing up this blog post is to let the community know about these changes in
advance, and let me know if any of these changes will cause them problems.
Also, if I list a package as deprecated, and you'd like to take over
maintainership, please let me know.

One of the guiding principles I've used in setting this up is that I believe
some dependencies should be considered incredibly cheap. Depending on
process, for example, is a cheap dependency, since it comes with GHC. (Unless
you place restrictive bounds on process, which can instead cause dependency
problems.) For more information on these principles, please read my
description.

I'll start at the bottom of the dependency tree and build up.

streaming-commons, zlib-bindings, text-stream-decode, conduit and pipes

Currently, there are about six core conduit packages: conduit, zlib-conduit,
attoparsec-conduit, etc. In addition, for some of these packages, I've split
off helper packages providing functionality to be used by other streaming data
libraries as well, such as zlib-bindings and text-stream-decode.

I want to collapse that into just three packages. All of the streaming helpers
will end up in a new package,
streaming-commons. I've talked
with Gabriel Gonzalez about this, and we'll be collaborating together on
creating a high-quality, low-dependency library. This library will also include
some features currently baked into conduit itself, like lock-free Windows file
reading and directory traversals.

All of the conduit core packages on top of that would then be merged into a new
package, conduit-extra. So we'd end up with conduit, conduit-extra, and
streaming-commons. The only downside is that, if you only needed zlib support,
you'll now get a few extra packages as well. However, following the principle I
listed above, these extra dependencies should all be coming from the "basically
free" dependency category.

Crazier ideas for streaming-commons

This may be taking the idea too far, but we could include some even more
advanced tooling in streaming-commons. This could include not only the data
types from xml-types and json-types- which provide both streaming and tree
based data structures for those data formats- but also attoparsec parsers and
blaze-builder renderers. This could allow quite a bit of the xml-conduit
codebase to be shared by the pipes world, for example.

I'm curious if people think this is a cool idea, or too radical (or both!).

Deprecate failure, attempt, and safe-failure

This one's been on my list for a while, pending some details being worked out
with Edward Kmett. The goal is to completely deprecate failure and attempt in
favor of the exceptions package, and within exceptions split out MonadThrow
from MonadCatch.

This will also mean removing some redundant functionality from resourcet. It
will be nice to be rid of the custom MonadThrow and MonadUnsafeIO defined
there.

http-client-*

A few simple moves: merge http-client-multipart into http-client, and merge
http-client-conduit into http-conduit. The latter change will mean that it's a
bit more difficult to use http-client in conduit without depending on tls, but
that's a use case anyone has expressed interest to me in.Another change I'm planning to do at the same time is to add a new module to
http-conduit, with an alternate API. There are a few places where I'm
dissatisfied with the current API, and this module will work as an experimental
next-gen http-conduit. I'm planning on keeping both versions of the API around
for quite a while for backwards compatibility, however. The changes I'm looking
at are:

- Avoiding ResumableSource
- Using with-semantics (like http-client) to avoid accidentally keeping connections open.
- Don't bake in the need for ResourceT
- Possibly use lenses for field modifiers on the Request data type.

shakespeare

This change is pretty simple: collapse shakespeare, shakespeare-css,
shakespeare-js, shakespeare-text, shakespeare-i18n, and hamlet into a single
package. It made sense to keep these separate when APIs were changing rapidly,
but things are basically stable now.

wai

Since they don't add any extra dependencies, I'd like to merge wai-test and
wai-eventsource into wai-extra. Once again, since we're dealing with stable
APIs, this shouldn't cause too much trouble.

I'm also considering deprecating wai-handler-devel, since it's no longer used
at all by yesod devel.

Deprecate pool-conduit

pool-conduit used to be a resource pool implementation based on code in conduit. Since then:

- The actual resource pool implementation is provided by resource-pool.
- The code I used from conduit has moved to resourcet.

At this point, pool-conduit doesn't really have much in it. If there's code
that people are using from it, I'd like to get it merged into resource-pool
itself.

yesod

The next iteration of Yesod will have a significantly simpler dispatch
system.
This new code doesn't really make sense as a general-purpose routing tool, so
I'm planning on moving that code into yesod-core itself and deprecating
yesod-routes. I know there are other users of yesod-routes; I think it makes
sense to do something like merging yesod-routes into wai-routes, as
yesod-routes has no Yesod-specifics in it.

Another minor change: merge yesod-eventsource into yesod-core. No extra dep,
and a stable API.

Finally, the big (and somewhat controversial) one: merge most of the yesod-*
core packages into the yesod package itself. A number of years ago, we did
precisely the opposite. However, given API stability, I think the time has come
to simplify our dependency list again here. This will have the added benefit
that when a user reports "I'm using Yesod version 1.2.3", it will give us more
information.

xml

I'm planning on deprecating dtd, uri-conduit, and xml-catalog, as I don't use
them any more and have no time to maintain them.

Another idea I'm playing with is merging html-conduit into xml-conduit. This
would add a tagstream-conduit dependency. Alternatively, perhaps
tagstream-conduit could be merged into xml-conduit as well.

classy-prelude

Merge classy-prelude-conduit in with classy-prelude. Downside: classy-prelude
will depend on conduit-combinators, but that doesn't actually add more than 3
extra packages.

Next step: trimming dependencies in general

I'm not planning on rushing any of these changes. The goal is to take them
slowly and methodically, and release changes incrementally. For example, after
the conduit changes are done, I'd release new versions of wai, yesod, etc, that
are compatible with both the new and old versions. Hopefully the user-facing
changes will simply be tweaking import lists and cabal files, but users will be
able to take their time on this.

Ultimately, I'm planning on releasing new versions of persistent (2.0) and Yesod
(1.4). You can see the Persistent 2.0
goals. The
Yesod 1.4 release doesn't actually have any breaking changes planned, aside
from upgrading its dependencies.

One other thing I'm going to be doing in this process is a more general
trimming down of dependencies. I'll be going through yesod-platform to find
packages we don't need. I also want to avoid redundant packages if possible
(e.g., we only need one cryptographic hash package). In many cases, as a
community, we have multiple package options available. I think a great move for
the community would be to start consolidating those options as well where it
makes sense. If I have concrete ideas on that, I'll certainly share them.http://www.yesodweb.com/blog/2014/03/package-consolidationTue, 25 Mar 2014 08:07:00 +0000New Haddock released! A visual guide to changes.
http://fuuzetsu.co.uk/blog/posts/2014-03-24-New-Haddock-released%21-A-visual-guide-to-changes.html
http://fuuzetsu.co.uk/blog/posts/2014-03-24-New-Haddock-released%21-A-visual-guide-to-changes.htmlMon, 24 Mar 2014 15:33:16 +0000Emacs support for debugging Haskell
http://chrisdone.com/posts/ghci-debugger
http://chrisdone.com/posts/ghci-debuggerSun, 23 Mar 2014 00:00:00 +0000