Windows 7

Windows 7 is coming out in beta this weekend.  It's a no-brainer upgrade from Vista; for my purposes it is a strictly superior OS.  And if you're stuck on XP, it's finally time to move on.  If you're running Leopard or Linux, there's probably nothing I can say to convince you anyway.

This article and the article it rebuts may be useful to you if you're running XP.  http://blogs.zdnet.com/Bott/?p=630 

I installed Windows 7 on my tablet (which was designed for XP) for my trip out east, and used it all weekend for DMing a D&D adventure (using the killer DMing tool, OneNote).  It's fast, responsive, the new handwriting input tip is awesome, and I could swear I gained an hour of battery life just by installing Windows 7.  Also, the new fingerprint scanner drivers are pretty sweet.

One thing Windows 7 introduced is triggered services: services that start automatically only when some triggering event indicates they're needed.  It's a pretty obvious addition, and a nice middle ground between automatic-start and manual-start services.  Now that services don't have to make that tough choice up front, the steady-state load of background processes really drops.

There are a bunch of cool new features in Windows 7 of the kind you'll use every day.  They're the kind of features that integrate into your work style, and then you go back to an XP or Vista machine and can't stand using it.  As one blogger nicely put it (paraphrasing), "there's not a lot of wow in Windows 7, which is exactly what's needed.  It's satisfying on a much deeper level."

My favorite things about Windows 7:
    - the new taskbar, especially the jump lists.
    - gestures for docking two windows side-by-side and maximizing
    - the fact that desktop search is significantly faster and searches the control panel now.
    - action-oriented troubleshooting wizards that work and are actually helpful!
    - improved responsiveness
    - improved battery life
    - simplified and customizable Shutdown/sleep/etc button in the start menu
    - Multi-touch
    - Libraries.  Libraries are virtual folders which aggregate various folders across your disks and network.  This means that your experience with different apps doesn't suck when your music isn't in "My Music", because now programs interact with your Music library.

Pro and Repro were sittin' in a boat... Pro fell out, who was left?

 So, I'm debugging this issue with the tooltips in the Windows common controls.

A particular debugger is crashing when it's trying to display the contents of a particular string in a tooltip.

I am using that particular debugger to debug the crash.  In that debugger, I have a callstack in the tooltips code, and if I hover over the variable containing the text that it's trying to display, I get the crash!

If I were to debug that crash, I would get a callstack in the tooltips code, which if I hovered over the variable containing the text...

You get the idea: a recursive repro case!

F#MiniJava

On Sunday, Chris and I finished our MiniJava compiler implementation in F#.  Whereas most of the class implemented theirs in Java or C#, we knew that all we needed was a functional language with lex and yacc companions and a great runtime library.

The compiler converted Java to x86 assembly code, and we did all of the implementation in F# (and lex and yacc with F# semantic actions), then embedded ml.exe (the MASM assembler) and link.exe (the C++ linker) as resources in our binary to do the assembling and linking against the C runtime.

We had discriminated unions galore, and pattern matches to go with them.  Perfect for a compiler, where there are a few types but lots of operations you want to do on those types.  The Java folks were all using the Visitor pattern.  Also, once we got the AST converted to a list representation, we could use ML's excellent list library, especially map, iter, fold_left, and fold_right.
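Not our actual MiniJava types (those are much bigger), but a toy sketch of why the combination fits a compiler so well:

type Expr =
    | Const of int
    | Add   of Expr * Expr
    | Mul   of Expr * Expr

// Each operation over the tree is just a function with one pattern-match case per shape.
let rec eval expr =
    match expr with
    | Const n    -> n
    | Add (a, b) -> eval a + eval b
    | Mul (a, b) -> eval a * eval b

Adding another operation (pretty-printing, code generation, whatever) is just another function like eval, rather than another visitor class.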

I also made some pretty sweet uses of curried functions.
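For instance (a made-up snippet, not from our compiler), supplying only the first argument of a two-argument function gives you back a new one-argument function:

let emit prefix instr = printfn "%s%s" prefix instr    // takes two arguments
let emitIndented = emit "    "                         // curried: prefix is baked in
List.iter emitIndented ["mov eax, 1"; "ret"]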

Chris is on the F# team but didn't have the FP background that I have, so he tended to use mutable state and objects more than I did.  But he also knew a lot more about the language.  This was great because I learned a lot more than I would have without him, even if it led to style inconsistencies.

One cool discovery of Chris's was the pass-forward operator, |>, which allows you to chain producers and consumers like you would do piping in a shell.  If this is in CAML, I never knew about it.
For example,
let outputMethodListing methodInstrs =
    methodInstrs
    |> List.map (fun (instr: Instruction) -> instr.ToString())
    |> List.iter outputFile.Write

The expression to the left of the pass-forward operator becomes the last argument to the function on its right.  It associates left, just like the | in UNIX or Windows shells.
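A trivial example of my own, to show the equivalence: these two lines compute exactly the same thing, but the piped one reads in the order the data flows.

let doubled  = List.map (fun x -> x * 2) [1; 2; 3]
let doubled' = [1; 2; 3] |> List.map (fun x -> x * 2)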

F# is a convergence of OO and Functional, the F# runtime library and the .NET runtime library, type inference and generics.  So there are a bunch of ways to do anything, which is probably not a good thing.  I definitely don't feel facile with making implementation choices in F#.  

The upshot is that I appreciate now just how much I am an expert in C++, in that I know the various styles and their tradeoffs, and can make a series of choices that fit together holistically.

I'm using F# as my scripting language in my day job.  It's annoying to wait a second for the runtime to load every time I run a script, but it's worth it because I like F# way better than Perl.
Current Music: Radiohead - In Rainbows

Integrated experiences

I had a very well-integrated day doing my compilers project and homework.  I began by using F#'s fslex.exe and the F# compiler in Visual Studio to build a Mini Java scanner.  I used MSDN from IE to get some string functions in the .NET library.  I zipped and emailed the results to my partner using Windows Live Mail.

Then I did the written homework in Word, embedding two Visio diagrams (a parse tree and a DFA) and two Excel tables (a shift-reduce derivation and a parser table).

I referred often to this week's PowerPoint lecture while doing my work.  On Tuesday, poor road conditions kept me home, so I watched the lecture in Media Player side-by-side with the PowerPoint deck; today I only needed to refer back to the accompanying deck.

All the while, I was using Windows and listening to music with the Zune desktop software.

My experience today was not bug-free; Windows froze.  Happily, Office recovered all the documents I was working on.  The Zune software is buggy, but I didn't happen across any bugs today.

But ten Microsoft products... and I didn't even think about it until twenty minutes ago.  Sometimes it's worth sitting back and realizing just how well-integrated and enabling all these products are.  While not a line of my own code appears in any of the scenarios I exercised today, I'm still proud to work for the company that made all this happen.

Compiler Class kicking off

Yes, I'm taking a compiler class for my Master's degree.  That may seem weird since I spent over four years working on compilers, but I was a backend guy, and I'm guessing this class will deal at least two-thirds with frontend issues.  I'm well-versed in those too, but I never actually took a compilers class in undergrad.

For the class project, the instructor recommends Java because he's got the course set up for it, but he says people have used C# in the past, and that ML would be fine (in other words, he's secretly hoping someone will do it in ML).  He stresses that working with someone else on the project is an especially good idea if you're not doing Java.

Java is not exactly on my career path, so of course I'll do C#.  But wait... compiler geeks know that functional languages are made to write compilers!  C# + ML = F#!

I'm wondering if that's feasible when he mentions "oh, and someone wanted to use F#, which I think could be cool but risky.  Is the person from the F# team here?"

A guy I know raises his hand, and he and the instructor share a special moment.  He was on the Visual Basic team and used to carpool with me to a different class.  He's now a tester on the F# team, and I know him to be a pretty smart guy.  And he claims there are scanner and parser generators for F#.  So we're going to do the project together.

So yeah... things are coming together amazingly well.  And if F# turns out not to be ready-for-primetime enough, it can interoperate with C#, so we always have that safety net.

Today was all review, but a fulfilling lecture nonetheless.  Mostly scanners, automata, regular grammars, history, and a tidbit on bottom-up parsing.

Function Calls as Messages

The metaphor of the function call as a message has never resonated with me, and I think I finally know why.

You see the term "message" a lot in software engineering, and not just for function calls.  In Win32 programming, user events are considered to be passed as messages, via the "message pump".  In networking, of course, messages go about via packets.  These are reasonable uses of the term.

But I've seen a surprising tendency to talk about procedure calls as messages, even non-remote procedure calls.  I see it a lot in literature, as in  "[An] API is an older technology that facilitates exchanging messages or data between two or more different software applications." [1]  I see it when reading about OO, as in, objects exchanging messages.  That terminology puts me on edge.

Message implies, to me, some unreliability.  It's asynchronous.  It may not be delivered.  It may be malformed or misunderstood.  There may be bad weather.  Etc.  A message also implies some sort of packaging, like an envelope or a little ribbon around a scroll. 

For the simplest case, a local call to a well-designed procedure on an object: (a) nothing can get messed up along the way, (b) the interface is designed, as much as possible, so that the caller cannot make input mistakes that aren't immediately flagged and actionable by the programmer (at compile time) or by the function (at runtime), and (c) there's no packaging going on, just raw data passed in separate parameters.

Two basic extensions of this simple case flirt with the idea of being messages.

The first is remote procedure calls.  Those look like regular procedure calls, but marshal their parameters to call out-of-process or on another machine, fundamentally asynchronously.  Unreliability and packaging, so this is a message, right?  Well yes, but... the spirit of the RPC technique is to abstract away the messagey-ness and make it look like a simple, synchronous local call.  As an RPC caller, you only need to think about the messagey-ness when you're writing the error-handling code.  At the academic and architectural levels, you're not dealing with messages, you're dealing with something much more tightly bound.  RPC is thus a protocol designed to abstract away the "message" metaphor.

The second is procedure calls that take strings to be interpreted at runtime, as in
Object.SetProperty("Color", "Black");
This technique is often used for XML or SQL queries, where text in another language is passed in as a string and parsed at runtime by the receiving code.  The string can be thought of as a bundle that sits at a layer beneath the procedural program's semantics, and in that way this type of call really is a message.

And that's its problem.  It's so late-bound, it gives you no statically-checked assurances that you're doing something reasonable. It's OK for machine-generated code to do this kind of thing, and that's quite common, but to ask a more fallible human to program like this is a bad idea.  A human would much rather write
Object.SetColor(System.Colors.Black);
with nice autocompletion and red squigglies and other compile time checks.  For the case of SQL, LINQ is supposed to provide the static checking long missing from embedded queries.
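To make the contrast concrete, here's a sketch in F# with a hypothetical Widget type (neither member is a real API; the point is just where mistakes get caught):

open System.Drawing

type Widget() =
    let mutable color = Color.White
    // Late-bound: the property name and value are strings parsed at runtime,
    // so a typo in either one survives until this code actually executes.
    member this.SetProperty(name: string, value: string) =
        match name with
        | "Color" -> color <- Color.FromName(value)
        | _       -> failwithf "Unknown property: %s" name
    // Early-bound: the compiler checks the member name and the argument's type.
    member this.SetColor(c: Color) = color <- c

let w = Widget()
w.SetColor(Color.Black)            // a typo here is a compile error
w.SetProperty("Colour", "Blak")    // compiles fine, fails (or silently misbehaves) at runtime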
Strings are sometimes used for less "legitimate" purposes too, such as for extensibility.  Enums aren't easily extensible from version to version, but strings are.  The lack of compile-time checking is a large price to pay for versioning, and it makes me wonder why better solutions haven't been popularized for this problem.

So, if a function call being like a message is such a bad thing, why embrace the term?

[1] de Souza et al., "How a Good Software Practice Thwarts Collaboration: The Multiple Roles of APIs in Software Development."


Improbable fun with IEEE Floating Point

Had I played the lottery last Tuesday, I might have won.  Something seemingly insignificant but perhaps even more improbable happened.  I’ll have to give some background first before the punchline.

So, you want to compare floating point values.  Plan A is to use ==.  Unfortunately, you aren’t normally supposed to compare exactly, as there can be rounding errors and such, since things like associativity don’t hold.

Plan B is seeing whether they are within a fixed ε of each other.  Strike two.  The absolute precision of a float depends on how far the value is from zero.  A float consists of a sign bit, an exponent, and a mantissa.  In other words, think scientific notation: .000005322 and 5,322,000 take the same number of significant digits, 5.322 * 10^x in each case.  Thus, the absolute size of a rounding error grows with how large the number is, so no single ε fits all magnitudes.

We need a plan C.  Luckily, the IEEE standard cleverly builds floating point values so that they can be compared.  32-bit floats have 1 sign bit (s), 8 exponent bits (e), and 23 mantissa bits (m).  Let M = 2^23 and E = 150 (the exponent bias of 127, plus 23 to account for treating the mantissa as an integer).  A normalized float's value is then

(-1)^s * 2^(e-E) * (M + m)

The M term is totally crucial, because it forces each real value into a particular float representation.  Without that M, you could inversely vary the exponent and the mantissa and have many ways to represent the same value (e.g., 3 * 2^2 and 6 * 2^1).  In particular, this would mean that there would be no plan C for comparing floats.
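To check the formula against a familiar value: 2.0f is stored with s = 0, e = 128, and m = 0, so it comes out to (-1)^0 * 2^(128-150) * (2^23 + 0) = 2^-22 * 2^23 = 2, as expected.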

But because of this setup, we can basically just convert the bits to integers and compare them.  This technique will scale nicely to the size of the floats, and was proposed by a coworker for some test code:

bool AreEqual(float a, float b)
{
      // Reinterpret as integers
      int aAsInt = *(int *)&a;
      int bAsInt = *(int *)&b;

      // Makes sure 0.f and -0.f are equal.
      if (aAsInt < 0)
            aAsInt = 0x80000000 - aAsInt;
      if (bAsInt < 0)
            bAsInt = 0x80000000 - bAsInt;

      // Allow, say, four units of precision difference or less.
      if (abs(aAsInt - bAsInt) <= 4)
            return true;
      else
            return false;
}

Other than the trickery with the subtraction from 0x80000000, it’s pretty straightforward.  However, that part bothered me, because I quickly looked at it and thought that a float and its negative would compare to be the same.  I plugged in 2 and -2, and lo and behold, I was right!  Without thinking further, I shot off a mail to my coworker showing the bug.  He wrote some garbage back about “interesting case” and “minimum negative integer” and crap.  I replied, saying “I think we’re talking about two different things.  I’m saying the function is broken for any a and –a!”

I had just sent the mail when my hackles rose, and I decided to start plugging in other numbers.  Sure enough, 2 and -2 were the only pair I tried that failed.  The reason was completely unrelated to my concern: the argument to the absolute value function was MININT, and since there's one more negative integer than positive, abs cannot return a positive value when given MININT, so it returns MININT.  Of course, MININT is less than 4…
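Spelling out the arithmetic (my own after-the-fact check): 2.0f reinterprets to 0x40000000 and -2.0f to 0xC0000000, which the negative-number fixup maps right back to 0x80000000 - 0xC0000000 = 0xC0000000.  The true difference between 0x40000000 and 0xC0000000 is 2^31, which overflows a 32-bit int to exactly MININT, and abs(MININT) stays MININT, which is indeed less than 4.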

How could this happen?  There are (2^32)^2 pairs of floats, and at most 2^32 pairs that would subtract out of this formula to be MININT.  The probability that I'd pick such a pair at random would be 1/2^32.  Now, my pick wasn't exactly random, nor is the floating point standard, but still…

I wish I’d found those odds in a powerball ticket.  I also missed the opportunity to pretend I had found the failure case through careful deduction and calculation.

Current Music: The House of Slovenly Truth

Functional Programming in the mainstream?

Expert C++ developers have long known how to program with a functional programming style by applying some nice libraries and a few good principles and design patterns.

The most basic principle is, of course, don't write functions with out-parameters; prefer return values for better composability.  And of course, you have to have exceptions that can propagate outside the return flow.  (So there goes C right away.)  Add on delegates (in C++ terms, pointers to functions or member functions) and functors (objects that behave like functions by overloading the () operator), and you're on your way, though there are pitfalls you need to navigate.

One big aspect of FP, however, is having the compiler make types for you on the fly, such as tuples.  C++ can't easily do that, though you could kinda work something out using templates, with templated structs, one for each number of arguments, and constructors that combine answers into those structs... but it would be ugly.

C++ also doesn't have anonymous delegates, which means, gosh, you have to go through the tedium of naming and finding a home for all your functions.

Enter C# 3.0, out in public beta right now.  Due to LINQ (language-integrated query), they needed to add all these FP features to the language.  So now it has anonymous types, lambdas, and expression trees!  This blog post, though a couple years old, offers a good walkthrough of some of the new stuff.
http://www.interact-sw.co.uk/iangblog/2005/09/30/expressiontrees

I should mention that the anonymous types are still statically typed by the compiler.

I hope this means I can one day use FP in my day job!


"When in doubt, leave it out."

Joshua Bloch, who worked extensively on the Java library, said this in his lecture on API usability at OOPSLA last year.  He was referring to the idea that, if you don't have a clear scenario and testing for a method, don't include it in your API.  You can always add later, but once you put something in, you have to support it and can never remove it.  Also, ill-conceived "kitchen sink" APIs tend to be cluttered, and offer little in the way of guidance on where to begin in the common scenarios.  They also tend to require lots of boilerplate code: long strings of function calls with no input from the client.  If there's no input, why not wrap them up into a function that calls all of them?

For me, his comments solidified something I'd been saying to my teammates, in less crisp and concise terms, for some time, and they became a touchstone I began to use when talking to people about choices around APIs.

From the other side of the universe, some of the people on my old team lived in fear that if they didn't expose everything publicly, someone somewhere would want to do something that they couldn't do, and this would be disastrous.  They would often make dire compromises for that principle, including exposing what would normally be construed as implementation details.

Once you expose your implementation, you are locked into that implementation, and can no longer change as new requirements such as performance, scalability, and security come into play in future versions.

The situation is exacerbated by the fact that, in many programming languages, it's simply easier to make something public than private.  This is helped somewhat if you use tools like C#'s XML doc comments and .NET's FxCop, which force you to do extra work on public interfaces to make them presentable.  They impel you to really think about what you're doing.

I'm catching more whiffs of this desire to expose everything down to the metal, despite the sheer impossibility of testing/supporting/maintaining that low-level surface.  But even C++ developers badly need constraints and guidance, such as type safety and other static checking, convenience wrappers, and aggregates.  If something is left out, at least they will know it quickly and look for other solutions.  Oftentimes, a kitchen sink API seems like it will support the client's scenario, only for them to find out it wasn't really that well supported and there are bugs, or worse, it changes in the next version and breaks their application.

The compromise I typically hear is "Make the common stuff easy, and the uncommon stuff possible."  This sounds reasonable at first, but unless you're very careful, you're still exposing a bunch of stuff you probably haven't done focused testing on, and you're going to be right back in the application-compatibility hell that Windows is doomed to for all eternity.

I'm still working out the tradeoffs here, and meanwhile trying to figure out how to communicate with people who come from the other side of the universe.