Inadvertent incorrectness

Quality assessment of Haskell programs

Ken — Wed, 28 Sep 2011 21:46:37 +0000

One of the greatest things about writing code in Haskell is the wonderful libraries (incidentally, one of the worst things about writing code in Haskell is the libraries), in particular the libraries for assessing the quality of your own code. I’m especially fond of:

QuickCheck for quickly testing the correctness of my code: I specify the properties I want to check, and it takes care of generating random test cases.
Criterion for checking the time performance of my code in a statistically sound manner; it takes care of dealing with the garbage collector and suchlike.

In this post I give a quick tour of how to use these libraries.

Getting the libraries

Our first step is to install QuickCheck and Criterion:

$ cabal install criterion quickcheck

Now we are ready to go.

Backstory

The standard list-based implementation of quicksort that you so often see seems ripe for some optimisations:

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort lesser ++ [p] ++ quicksort greater
    where
        lesser  = [ x | x <- xs, x < p]
        greater = [ x | x <- xs, x >= p]

One obvious optimisation could be to traverse xs only once when partitioning the elements into those less than, greater than (and equal to) p. Here is one implementation that uses foldl' for maximum performance (we hope):

kquicksort :: Ord a => [a] -> [a]
kquicksort []     = []
kquicksort (p:xs) = kquicksort lesser ++ equal ++ kquicksort greater
    where
        (lesser,equal,greater) = foldl' part ([],[p],[]) xs
        part (l,e,g) x =
          case compare x p of
            LT -> (x : l, e, g)
            GT -> (l, e, x : g)
            EQ -> (l, x : e, g)

Checking some properties

The basic usage of QuickCheck is to specify the properties you care about as ordinary Haskell functions, and then you use the quickCheck function to check the properties. In the following I use QC as the qualified name for values from the QuickCheck library.

The first property we want to check is that kquicksort returns a sorted result. To check this property we also define a helper predicate sorted that checks that a list is sorted:

sorted :: Ord a => [a] -> Bool
sorted (x1:x2:xs) = x1 <= x2 && (sorted $ x2:xs) 
sorted _          = True

prop_sorted :: Ord a => [a] -> Bool
prop_sorted xs = sorted $ kquicksort xs

The next properties that we want to check could be that kquicksort is idempotent and if it is given an ordered argument it doesn’t mess it up:

prop_idempotent :: Ord a => [a] -> Bool
prop_idempotent xs = kquicksort xs == (kquicksort $ kquicksort xs)

prop_ordered :: Ord a => QC.OrderedList a -> Bool
prop_ordered (QC.Ordered xs) = xs == kquicksort xs

For prop_ordered we use the type OrderedList (a wrapper type, not a type class) to specify that this property only needs to hold for arguments that are sorted.

Now we can use the quickCheck function to check all our properties:

checkAll = do
  QC.quickCheck (prop_sorted :: [Int] -> Bool)
  QC.quickCheck (prop_idempotent :: [Int] -> Bool)
  QC.quickCheck (prop_ordered :: QC.OrderedList Int -> Bool)

The type annotations on all the properties are necessary, because they are what makes overloading work, so that quickCheck can generate random test cases. Alternatively we can give our properties a less general non-polymorphic type. However, by keeping the more general type for our properties we can check them for other types as well. For instance, we might want to check them for a custom type, like the following type Colour:

data Colour = Red | Green | Blue
            deriving (Eq, Show, Ord)

Before we can use QuickCheck with the Colour type, we need to specify how to generate random values of this type. That is, make it an instance of the Arbitrary type class:

instance QC.Arbitrary Colour where
  arbitrary = QC.elements [Red, Green, Blue]

Now we can check the prop_sorted property by adding a simple type annotation:

QC.quickCheck (prop_sorted :: [Colour] -> Bool)
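
The property-checking workflow of this section can be imitated in a few lines of Python using only the standard library. This is a rough sketch and not QuickCheck: the names check and random_int_list are invented for the illustration, there is no shrinking of counterexamples, and the function under test is passed in as sort_fn.

```python
import random

def is_sorted(xs):
    """True when every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))

def random_int_list(rng, max_len=20):
    """Hypothetical generator: a random-length list of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]

def check(prop, runs=100, seed=0):
    """Run prop on `runs` random lists; return the first failing input, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = random_int_list(rng)
        if not prop(xs):
            return xs
    return None

# The three properties from the text, stated for a sorting function sort_fn.
def prop_sorted(sort_fn):
    return lambda xs: is_sorted(sort_fn(xs))

def prop_idempotent(sort_fn):
    return lambda xs: sort_fn(xs) == sort_fn(sort_fn(xs))

def prop_ordered(sort_fn):
    # Only sorted inputs are relevant, so sort the random list first.
    return lambda xs: sorted(xs) == sort_fn(sorted(xs))
```

With the built-in sorted as the function under test, check(prop_sorted(sorted)) returns None, meaning no counterexample was found in 100 runs.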

Checking the performance

Once we have convinced ourselves that the code could be correct, we can start worrying about performance. Thus, we turn to Criterion. When doing performance testing there are all sorts of gnarly details that can go wrong and ruin our experiments: we forget to force the lazy evaluation and thus measure the wrong thing; the running time of the function we try to measure is too short to get a meaningful reading with the clock resolution of the computer we use; the garbage collector might introduce noise from one sample into another; we might be using the computer for something else while we run the experiments, which could introduce noise; and so on.

In the following I use C as the qualified name for values from the Criterion library.

Criterion takes care of all these concerns, and just presents us with a simple interface where the only things we need to specify are which functions we want to run, on what data, and how much we want the result to be evaluated. Thus, to benchmark kquicksort on the list [20, 10, 30] and get a fully evaluated result, we first use the nf (for normal form) function from Criterion to get something Benchmarkable:

C.nf kquicksort [20, 10, 30]

When we have something benchmarkable, we use the bench function to label it and turn it into a Benchmark:

C.bench "kquicksort on short list" $ C.nf kquicksort [20, 10, 30]

When we have a list of benchmarks we can hand it over to defaultMain, which makes an excellent main body for us and allows us to configure our benchmarks from the command line without recompiling the program. Without further ado:

makeList :: Int -> [Int]
makeList n = QC.unGen (QC.vector n) (R.mkStdGen 42) 25

main :: IO()
main = do
  checkAll
  let sizes = [ 10000, 20000, 50000, 75000
              , 100000, 250000, 500000, 1000000]
  let inputs = [(n, makeList n) | n <- sizes]
  let benchmarks = [ C.bench (name ++ show n) $ C.nf sort ns 
                   | (n, ns)      <- inputs,
                     (name, sort) <- [("quicksort ",  quicksort),
                                      ("kquicksort ", kquicksort)]]
  C.defaultMain benchmarks

In the helper function makeList I have reused the generator framework from QuickCheck for generating some random test data for my benchmark. For simple integer lists as we have here it is a bit overkill, but for more complex input it can be nice.
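
For readers who want to try the same experiment design without Criterion, here is a much cruder Python sketch built on the standard timeit and statistics modules. The names make_list and bench are invented for this example. It fixes the random seed so that every run sees the same input, as makeList does, and repeats the measurement to get a mean and a standard deviation, but it does none of the more careful statistical analysis that Criterion performs.

```python
import random
import statistics
import timeit

def make_list(n, seed=42):
    """Reproducible pseudo-random input, in the spirit of makeList."""
    rng = random.Random(seed)
    return [rng.randint(-1000, 1000) for _ in range(n)]

def bench(fn, data, repeats=5, number=10):
    """Time fn(data) several times; return (mean, stdev) in seconds per call."""
    # Each of the `repeats` samples runs fn(data) `number` times in a row.
    timings = timeit.repeat(lambda: fn(data), repeat=repeats, number=number)
    per_call = [t / number for t in timings]
    return statistics.mean(per_call), statistics.stdev(per_call)
```

A call such as bench(sorted, make_list(10000)) returns the mean and standard deviation of the time per call.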

The complete program

If you want to find out whether kquicksort really is faster than quicksort, here is the complete program.

import qualified Criterion.Main as C
import qualified Test.QuickCheck as QC
import qualified Test.QuickCheck.Gen as QC

import Data.List(foldl')
import qualified System.Random as R

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort lesser ++ [p] ++ quicksort greater
    where
        lesser  = [ x | x <- xs, x < p]
        greater = [ x | x <- xs, x >= p]

kquicksort :: Ord a => [a] -> [a]
kquicksort []     = []
kquicksort (p:xs) = kquicksort lesser ++ equal ++ kquicksort greater
    where
        (lesser,equal,greater) = foldl' part ([],[p],[]) xs
        part (l,e,g) x =
          case compare x p of
            LT -> (x : l, e, g)
            GT -> (l, e, x : g)
            EQ -> (l, x : e, g)

sorted :: Ord a => [a] -> Bool
sorted (x1:x2:xs) = x1 <= x2 && (sorted $ x2:xs) 
sorted _          = True

prop_sorted :: Ord a => [a] -> Bool
prop_sorted xs = sorted $ kquicksort xs

prop_idempotent :: Ord a => [a] -> Bool
prop_idempotent xs = kquicksort xs == (kquicksort $ kquicksort xs)

prop_ordered :: Ord a => QC.OrderedList a -> Bool
prop_ordered (QC.Ordered xs) = xs == kquicksort xs

data Colour = Red | Green | Blue
            deriving (Eq, Show, Ord)

instance QC.Arbitrary Colour where
  arbitrary = QC.elements [Red, Green, Blue]

checkAll = do
  QC.quickCheck (prop_sorted :: [Int] -> Bool)
  QC.quickCheck (prop_sorted :: [Colour] -> Bool)
  QC.quickCheck (prop_idempotent :: [Int] -> Bool)
  QC.quickCheck (prop_ordered :: QC.OrderedList Int -> Bool)

makeList :: Int -> [Int]
makeList n = QC.unGen (QC.vector n) (R.mkStdGen 42) 25

main :: IO()
main = do
  checkAll
  let sizes = [ 10000, 20000, 50000, 75000
              , 100000, 250000, 500000, 1000000]
  let inputs = [(n, makeList n) | n <- sizes]
  let benchmarks = [ C.bench (name ++ show n) $ C.nf sort ns 
                   | (n, ns)      <- inputs,
                     (name, sort) <- [("quicksort ",  quicksort),
                                      ("kquicksort ", kquicksort)]]
  C.defaultMain benchmarks

Compile it with the following command line:

$ ghc -O3 -W --make Quicksort.hs -o Quicksort

I Am Going to JAOO 2008 As an F# Expert

Ken — Thu, 11 Sep 2008 20:01:45 +0000

Microsoft Denmark has invited me to participate in JAOO 2008 if, in return, I spend some time at the Microsoft stand demoing F# and answering questions about F# and functional programming in general.

When Martin Esmann (Microsoft Academic Developer Evangelist) approached me, I told him that I’d be happy to show up, but I’m not using F# from within Visual Studio. In fact, I’m usually not even using Microsoft’s .NET implementation. I’m a happy Mono user. To Martin’s credit his initial reaction was that it would be cool if I demoed F# using Linux and Mono at the Microsoft stand.

However, after talking it over with his colleagues, Martin got back to me and told me that they thought that using Linux and Mono to demo F# would send too mixed a message. I agree with that. So unfortunately I won’t be demoing Mono at a Microsoft stand this time. But I’ll of course bring my own laptop, and might show F# on Mono and Linux if there are questions about portability. Maybe I should contact the Mono guys and ask for a T-shirt, so that I can silently invite questions about Mono.

Time to practice some F# demos. Any suggestions for what would be effective? I’m planning to show off the brand new Units of Measure and also sequence comprehensions.

Getting Ready for the ICFP 2008 Programming Contest

Ken — Fri, 11 Jul 2008 13:30:18 +0000

I’m getting ready to participate in the eleventh ICFP programming contest.

So far everything works like a charm, KVM can run the LiveCD for the contest without a problem:
kvm -cdrom ICFPCD15.iso &

I hope I’ll be able to spend more time on the contest compared to last time I participated.

Decoding Morse Code With F# Comprehensions

Ken — Fri, 09 Nov 2007 11:31:30 +0000

In my last post I showed how to decode morse code in Python using list comprehensions. In this post I show how to do it in F# instead.

First using list comprehensions:

let codes =
    [("A",".-");   ("B","-..."); ("C","-.-."); ("D","-.."); ("E",".");
     ("F","..-."); ("G","--.");  ("H","...."); ("I","..");  ("J",".---");
     ("K","-.-");  ("L",".-.."); ("M","--");   ("N","-.");  ("O","---");
     ("P",".--."); ("Q","--.-"); ("R",".-.");  ("S","..."); ("T","-");
     ("U","..-");  ("V","...-"); ("W",".--");  ("X","-..-");("Y","-.--");
     ("Z","--..")]

let rec decode input =
    if input = "" then [""]
    else [ for c, code in codes when input.StartsWith(code)
           for rest in decode(input.Substring(String.length code))
           -> c + rest ]

As can be seen, the code is almost identical to the Python code. Incidentally, I could not find a function equivalent to Python’s startswith method in the O’Caml standard library (without using regular expressions). Fortunately F# came with one from the .NET library.

Much to my surprise the compiled F# (running on Mono 1.2.4) is 4 times slower than the Python code. I then rewrote the program to use sequence comprehensions:

let rec decode input =
    if input = "" then { -> "" }
    else { for c, code in codes when input.StartsWith(code)
           for rest in decode(input.Substring(String.length code))
           -> c + rest }

This version runs faster, and uses only a constant amount of memory. Still, the Python version is three and a half times faster.
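
The difference between building a list and producing results on demand can also be sketched in Python, where a generator plays roughly the role of the sequence comprehension. This is an illustration only, not one of the benchmarked programs; the names CODES and decode_lazy are invented here, and the code table is copied from the listing above.

```python
CODES = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}

def decode_lazy(text):
    """Yield every decoding of an unseparated Morse string, one at a time."""
    if text == "":
        yield ""
        return
    for letter, code in CODES.items():
        if text.startswith(code):
            # Results for the remainder are consumed as they are produced,
            # so no complete list of decodings is ever built.
            for rest in decode_lazy(text[len(code):]):
                yield letter + rest
```

Because the results are produced one by one, a caller that only needs the first decoding can stop early with next(decode_lazy(text)).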

I then tried to run the programs on another computer with Microsoft’s .NET implementation. This improved the F# running times a lot. However, they are still 40% to 80% slower than the Python version.

My current guess is that Python is much better at handling strings than .NET.

The actual numbers:

                Linux, Mono               Vista, MS .NET
Program         Time (s)        Ratio     Time (s)        Ratio
morse.py        7.94 ± 0.05     1         11.03 ± 0.03    1
morse.fs        33.07 ± 0.19    4.17      19.65 ± 0.33    1.78
morse-seq.fs    28.1 ± 0.59     3.54      16.12 ± 0.36    1.46

Update 2007-11-12 Added test files:

Python code: morse.py
F# code using list comprehensions: morse.fs
F# code using sequence comprehensions: morse-seq.fs

Morse Code Decoding With Python List Comprehensions

Ken — Wed, 19 Sep 2007 20:16:44 +0000

As a small exercise for getting up to speed with Python I decided to solve Ruby Quiz #121, which is to write a function that finds all possible decodings of a string of Morse codes without letter and word separators. Given the nature of the problem I decided to use Python’s list comprehensions for the solution.

Without further ado here is the code I ended up with:

#!/usr/bin/env python

import string

letters = [('A',".-"),   ('B',"-..."), ('C',"-.-."), ('D',"-.."), ('E',"."),
           ('F',"..-."), ('G',"--."),  ('H',"...."), ('I',".."),  ('J',".---"),
           ('K',"-.-"),  ('L',".-.."), ('M',"--"),   ('N',"-."),  ('O',"---"),
           ('P',".--."), ('Q',"--.-"), ('R',".-."),  ('S',"..."), ('T',"-"),
           ('U',"..-"),  ('V',"...-"), ('W',".--"),  ('X',"-..-"),('Y',"-.--"),
           ('Z',"--..")]

def decode(input):
    if input == "" :
        return [""]
    else:
        return [ letter + remaining
                 for letter, code in letters if input.startswith(code)
                 for remaining in decode(input[len(code):]) ]

# Some testing code (print is called with parentheses so it also runs on Python 3)
def test(s, code):
    if s in decode(code):
        print(code + " can be decoded as " + s)
    else:
        print(code + " can NOT be decoded as " + s)

test("SOFIA", "...---..-....-")
test("SOPHIA", "...---..-....-")
test("EUGENIA", "...---..-....-")

Interestingly, my solution is rather similar to Patrick Logan’s Erlang solution. And I find it simpler to understand than the Haskell solutions at the HaskellWiki.

Recursive Descent Parsers in C#

Ken — Fri, 08 Sep 2006 20:46:16 +0000

Peter Sestoft and I have written a note about how to write scanners and parsers in C#. The note is based on earlier versions for SML and Java.

The note contains a thorough introduction to grammars in Backus–Naur form (BNF). This includes a description of the properties your grammar should have so that it can be mechanically translated to a program, and also some prescriptions for how to transform your grammar so that it has the desired properties. In technical terms, the note describes how you can check that your grammar is an LL(1) grammar, and if your grammar is not an LL(1) grammar, we give you some tricks that will usually transform the grammar into an LL(1) grammar.
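
As a small illustration of the idea (not taken from the note), consider the made-up left-recursive grammar expr ::= expr '-' term | term with term ::= '0' | '1'. Removing the left recursion gives expr ::= term rest and rest ::= '-' term rest | empty, which is LL(1): one token of lookahead is enough to pick a production. A recursive descent parser then has one function per nonterminal. The Python sketch below assumes the tokens are plain strings and evaluates the expression while parsing; all names in it are hypothetical.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def expr(self):
        # expr ::= term rest
        return self.rest(self.term())

    def rest(self, left):
        # rest ::= '-' term rest | empty, chosen by looking at one token.
        if self.peek() == "-":
            self.advance()
            return self.rest(left - self.term())
        return left

    def term(self):
        # term ::= '0' | '1'
        tok = self.peek()
        if tok in ("0", "1"):
            self.advance()
            return int(tok)
        raise SyntaxError("expected 0 or 1, got %r" % tok)

def parse(tokens):
    """Parse a whole token list and return the value of the expression."""
    parser = Parser(tokens)
    value = parser.expr()
    if parser.peek() != "EOF":
        raise SyntaxError("unexpected token %r" % parser.peek())
    return value
```

Passing the accumulated value into rest keeps subtraction left-associative, so parse(["1", "-", "0", "-", "1"]) is (1 - 0) - 1.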

The parsers you write using our method are recursive descent parsers. For the scanners, however, we just use an ad hoc method. Both parsers and scanners make good use of the .NET framework. For instance, the scanners create a token stream from a TextReader. Hence, the scanners can be used to scan both files and strings. Likewise, a token stream is represented as IEnumerable and the scanners use yield to create this token stream, thus creating the token stream lazily.

To give an example, the simplest scanner presented in the note is the following scanner:

using System;     // Char, ApplicationException
using System.IO;  // TextReader
using TokenStream = System.Collections.Generic.IEnumerator<Token>;

class ZeroOneScan : IScanner {
  public TokenStream Scan(TextReader reader) {
    while ( reader.Peek() != -1 ) {
      if ( Char.IsWhiteSpace((char) reader.Peek()) )
        reader.Read();
      else
        switch(reader.Read()) {
        case '-': yield return Token.FromKind(Kind.SUB); break;
        case '0': yield return Token.FromKind(Kind.ZERO); break;
        case '1': yield return Token.FromKind(Kind.ONE); break;
        default: throw new ApplicationException("Illegal character");
        }
    }
    yield return Token.FromKind(Kind.EOF);
  }
}

I find this use of yield quite elegant.
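
The same lazy-scanner shape carries over to other languages with generators. Here is a rough Python counterpart of the scanner above, as a sketch only: the token kinds are plain strings instead of the note's Token and Kind types, and the function name scan is an assumption of this example.

```python
import io

def scan(reader):
    """Lazily yield token kinds from a text stream of '-', '0', '1' and whitespace."""
    kinds = {"-": "SUB", "0": "ZERO", "1": "ONE"}
    while True:
        ch = reader.read(1)
        if ch == "":  # end of input
            break
        if ch.isspace():
            continue
        if ch not in kinds:
            raise ValueError("Illegal character: %r" % ch)
        yield kinds[ch]
    yield "EOF"
```

Since the reader can be any text stream, list(scan(io.StringIO("1 - 0"))) works just as well as scanning an open file, and an illegal character is only reported when the consumer actually reaches it.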

Working with this note and getting my name on it has special meaning to me. A precursor note for SML was actually the note Peter used when he taught me for the first time many years ago (in my second semester at university). Over the years, I have returned to the note many times when I have needed to parse a small language that did not warrant the use of a parser generator, or when a generated parser would have been inconvenient to use because the text to be scanned did not come from a file stream (modern parser generators will not generate parsers with this problem).

The note ends a bit too early, in my opinion. I would like to extend the note to cover Extended BNF. And I would also like to cover parser combinators. Well, one day when time permits…

ICFP Contest 2006, Team KFL

Ken — Thu, 27 Jul 2006 11:37:39 +0000

In 1967, during excavation for the construction of a new shopping center in Monroeville, Pennsylvania, workers uncovered a vault containing a cache of ancient scrolls. Most were severely damaged, but those that could be recovered confirmed the existence of a secret society long suspected to have been active in the region around the year 200 BC.

Based on a translation of these documents, we now know that the society, the Cult of the Bound Variable, was devoted to the careful study of computation, over two millennia before the invention of the digital computer.
…

Like last year, the prospects for my participation in the ICFP Contest were not looking good. None of my team mates from last year’s team seemed to be able to participate, and neither did I myself. The weekend of the contest was packed with family business. And on top of that, when the weekend arrived I was sick Friday night and Saturday.

However, Sunday evening I had some free time and I decided that I would take a crack at the contest just to see what it was about. Judging from the discussion mailing list it sounded quite fun and interesting. The first phase of the contest task was to implement a 14-instruction virtual machine called UM, and when that was running you should use it for running the provided codex for the operating system UMIX.

So I registered my team KFL and started to implement my UM in SML. The first thing I did was to implement an instruction decoder that could translate a 32-bit word into an SML datatype. Then I wrote a function that read in a file of 32-bit words, each encoded in big-endian order as four 8-bit words, and then mapped my decode function over the vector of words. For this task the SML Basis Library really shone:

fun readFile filename =
    let val dev = BinIO.openIn filename
        val all = BinIO.inputAll dev before BinIO.closeIn dev
        val words = Vector.tabulate(Word8Vector.length all div 4,
                                    fn i => Word32.fromLarge(PackWord32Big.subVec(all,i)))
    in  Vector.map decode words
    end
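
For comparison, the byte-to-word step can be sketched in Python with the standard struct module. This is an illustration of the file format only, not part of my UM; the function names are made up, and like the SML code it ignores any trailing bytes that do not fill a whole word.

```python
import struct

def words_from_bytes(data):
    """Split bytes into big-endian unsigned 32-bit words, ignoring a trailing partial word."""
    count = len(data) // 4
    # ">" selects big-endian byte order, "I" is an unsigned 32-bit integer.
    return list(struct.unpack(">%dI" % count, data[:count * 4]))

def read_words(filename):
    """Read a whole file and return its contents as a list of 32-bit words."""
    with open(filename, "rb") as f:
        return words_from_bytes(f.read())
```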

Time spent: 1 hour.

Unfortunately, this did not work. My decoding function failed after 1675 instructions or so, complaining about illegal instructions. And indeed the 32-bit word it complained about did not seem to encode a legal instruction. I tried to reimplement the conversion from 8-bit words to 32-bit words, in case PackWord32Big worked differently from what I thought. But I still got the same error. Thus, I gave up and went to bed.

Time spent: 2 hours.

Monday morning I had to see to some other things first, but then I had some time to spend on the contest. Even after I had slept on the problem I still couldn’t figure out what was wrong. So I asked my colleague Arne if he had 10 minutes to help me debug my program. I explained the problem to him, showed him my code (actually my debug output), and then we looked at the codex in a hex editor. He confirmed that, based on my explanation, my program appeared to be working correctly, and it looked as if there was an illegal instruction in the codex, if all instructions really were encoded as a single 32-bit word. Hence, one or more of my assumptions had to be wrong (it was easy to rule out that the codex was wrong, because more than a hundred teams were able to run the codex). Then it occurred to me: the codex was not required to contain only valid instructions. Maybe the code would jump over damaged parts of the codex, and part of the contest would be to repair the codex. Thus, I changed my code to decode instructions only on demand, and kept the whole program as an array of 32-bit words. Lo and behold, the machine was able to start running the codex! However, it failed in the self-check the codex performed. After some debugging I found one place where I used the name of a register (registers in the UM are named by integers) as a value rather than using the value contained in the register. Now my UM was able to run the codex and the SANDmark (a debug and benchmark suite provided by the contest managers).

Time spent: 2 hours.

My first version ran the SANDmark in a bit more than 18 minutes (14 min user and 4 min sys), or 768 seconds of user time according to MLton’s profiler. That was not too bad, but I’d seen on the discussion list that other participants had UMs that ran the SANDmark in a couple of minutes. Thus, I decided to profile my UM to see where the time was spent. To my surprise the top function in the profile was my decode function, a function that takes a 32-bit word and translates it to an SML datatype. Here are the first few lines of decode together with the helper function standardRegs that extracts the register names:

fun standardRegs w =
    let open Word32
        val A = (w << 0w23) >> 0w29
        val B = (w << 0w26) >> 0w29
        val C = andb(w, 0w7)
    in (toInt A, toInt B, toInt C)
    end

fun decode w =
    let open Word32
        val opr = w >> 0w28
    in case opr of
           0w0  => CMove(standardRegs w)
         | 0w1  => ARead(standardRegs w)
         | 0w2  => AWrite(standardRegs w)
         ...
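
The bit layout that decode and standardRegs rely on (operator number in the top four bits, registers A, B and C in the low nine bits) can be written out in Python as a small sketch. Python integers are unbounded, so instead of shifting left and then right as the Word32 code does, this version shifts right and masks; the function names are just for illustration.

```python
def standard_regs(w):
    """Return register numbers (A, B, C) from the low nine bits of a 32-bit word."""
    a = (w >> 6) & 0b111  # bits 6..8
    b = (w >> 3) & 0b111  # bits 3..5
    c = w & 0b111         # bits 0..2
    return a, b, c

def opcode(w):
    """The operator number is the top four bits of the 32-bit word."""
    return (w >> 28) & 0xF
```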

And the top of my interpreter loop looked like this:

      while true do
             case spin() of
                 CMove(A,B,C) => if $C = 0w0 then ()
                                 else A <- $B
               | ARead(A,B,C) => A <- $$B sub (W32.toInt($C))
               | AWrite(A,B,C) => Array.update($$A, W32.toInt($B), $C)
               ...

Here spin is the function that reads the current word at the program counter, updates the program counter, decodes the word, and returns the instruction. But how could 19% of the time be spent in decode? I moved the call to decode from spin to my interpreter loop to aid the MLton optimizers:

      while true do
             case decode(spin()) of

This made the SANDmark run 5 minutes faster in wall-clock time, that is, in 13 minutes, or 529 seconds in MLton profiler time. A 30% improvement just for moving a function around. Not bad.

Time spent: 30 minutes.

After this optimization my UM was fast enough that I thought I’d try to solve some of the puzzles. So I logged into the UMIX OS using the guest account and started to poke around and collect points. The first real puzzle was to fix a password cracker written in a weird Basic dialect that used Roman numerals instead of decimal notation for integer literals (including for the line numbers).

Time spent: 1½ hours. Collected points: 230.

Then I had to go home, and while I cooked dinner (I was baking pita bread, and while the dough was rising I had time to hack) I was able to write an improved password cracker, in this weird Roman-numeral Basic: hack2.bas. This gained me another 100 points, just before the contest ended (the contest ended at 18:00 CEST).

Time spent: 45 min. Collected points in total: 330.

All in all not bad to make 330 points after spending only seven hours and 45 minutes of rather fragmented time.

After dinner I was able to gain another 35 points by writing a list reversal program in a graphical 2D language: rev.2d. It took half an hour or so.

The setup for the contest was absolutely amazing and most entertaining. My account of it here does not do it justice. An incredible amount of work must have gone into the preparation of it. I’m looking forward to the final debriefing from the Contest Organizers.

Yesterday, I tried for fun to optimize my UM program a bit more. Programs running on the UM are able to allocate and free arrays. In my original implementation I used a ref to a functional Red-Black tree to keep track of the mapping from UM-pointers to arrays. I know, not the best choice of data structure, but I was just trying to get a “good enough” UM up and running. From the profile it was obvious that lots of time and memory were spent on keeping the Red-Black trees balanced. Thus, I replaced this code with an array, and a free-list for reusing UM-pointers (32-bit words). With this change, my code for managing the “heap” went from 12 lines of code (not counting the code in the Red-Black tree library) to 28 lines of code. This small change made the SANDmark run in 4 minutes wall clock (175 seconds of MLton profiler time), an improvement of almost 67%. Looking at the profile, I could see that decode was again on top of the list (using 42% of the time). Thus, I decided to inline decode and deforest the instruction datatype by hand. This made my code 68 lines smaller, and the SANDmark ran in 2.50 minutes (134 seconds of MLton profiler time), a 23% improvement. That is almost four times faster than the UM I participated in the contest with. Time spent: 1½ hours.

Code for the UM I participated with: um2.sml
Code with improved heap handling: um3.sml
Code with inlined decode function: um4.sml
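
The array-plus-free-list idea described above can be sketched in a few lines of Python. This is not a translation of um3.sml, just an illustration of the data structure under the assumption that identifiers are small integers indexing into a growable list; the class and method names are made up.

```python
class Heap:
    """Maps small integer identifiers to arrays, reusing freed identifiers."""

    def __init__(self):
        self.arrays = []    # index = identifier, value = array or None
        self.free_ids = []  # identifiers released by free(), ready for reuse

    def allocate(self, size):
        """Create a zero-filled array and return its identifier."""
        array = [0] * size
        if self.free_ids:
            ident = self.free_ids.pop()  # reuse a released slot
            self.arrays[ident] = array
        else:
            ident = len(self.arrays)     # otherwise grow the table
            self.arrays.append(array)
        return ident

    def free(self, ident):
        """Drop the array and remember that its identifier can be reused."""
        self.arrays[ident] = None
        self.free_ids.append(ident)

    def get(self, ident):
        return self.arrays[ident]
```

Lookup, allocation and release are all constant-time list operations here, which is the point of moving away from a balanced tree.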

Theory of evolution

Ken — Thu, 05 Jan 2006 16:15:03 +0000

There is no theory of evolution. Just a list of creatures Chuck Norris has allowed to live.

Refactoring SML Quiz, Part 2

Ken — Wed, 21 Dec 2005 21:41:43 +0000

The answer to yesterday’s quiz is: Yes, types are necessary for lambda-lifting refactoring, namely if the lifted function contains an overloaded operator such as +.

For example, given the program:

fun foo x =
    let fun add y = x + y
    in  add 5.0 end

where we want to lift the function add. Our first attempt might be to transform the program into something like (refactoring is a source code to source code transformation):

fun add x y = x + y
fun foo x = add x 5.0

However, this might not compile with every SML compiler. It depends on how much local context the compiler uses to resolve overloading.

Moscow ML compiles the transformed program above just fine. But Moscow ML complains about the following version:

fun add x y = x + y;
fun foo x = add x 5.0

Notice the extra semi-colon after the declaration of add.

Update 2005-12-22:
Just for good measure, here is one possible result of refactoring with the necessary type information:

fun add (x : real) y = x + y;
fun foo x = add x 5.0

I believe that only captured variables need to be typed.

Also note, that overloading is not the only problem. It is possible to construct a similar example using records, but I’ll leave that to the reader (unless there is a popular and desperate demand, then I can provide an example).

Refactoring SML Quiz, Part 1

Ken — Tue, 20 Dec 2005 21:55:26 +0000

Yesterday, I discussed with some students who are implementing an SML plug-in for Eclipse whether types are necessary for a lambda-lifting refactoring for SML.

So today’s quiz is simply: Are types necessary for lambda-lifting refactoring in SML? Why/Why not?

Remember, the refactoring works on valid SML programs, and after the transformation the program should still be valid.

Example:

fun foo x =
    let fun bar y = (x,y)
    in bar x end

is transformed/refactored to

fun bar x y = (x,y)
fun foo x = bar x x
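
For readers less familiar with SML, the shape of the transformation can be sketched in Python as well. Python is dynamically typed, so the typing question of the quiz does not arise there; the sketch only shows how the captured variable x becomes an explicit parameter. The function names are made up for the illustration.

```python
# Before: bar is nested and captures x from the enclosing call.
def foo_nested(x):
    def bar(y):
        return (x, y)
    return bar(x)

# After lifting: bar_lifted is top-level and receives x explicitly.
def bar_lifted(x, y):
    return (x, y)

def foo_lifted(x):
    return bar_lifted(x, x)
```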

I’ll give the answer tomorrow.