Monday, May 21, 2012

First Class Space-Time Functions


Let's take a trip into the esoteric for a few minutes.  Here's a corner of my spatial computing work that I've been nibbling away at for a while, which I think may be one of the most important theoretical questions I'm engaged with: how can we define first-class functions over space-time?


Let me unpack that wedge of words a bit:

  • Function: by this, I mean a function in the programming language sense: a chunk of code that we can run in order to do something or other.
  • First-Class: this means that we can treat the function as a piece of data, passing it around in other pieces of code, constructing it on the fly, changing bits of it, and so on.
  • Over Space-Time: here I throw in the nasty bit: I want this function to be something that, on the one hand, I can think about as an operation over a manifold, and on the other hand I can implement as a distributed computation on a whole lot of scattered devices.

First-class functions, just as themselves, have been around for a long time in computer science.  Distributed algorithms have been around for a long time too.  There's even been some nice work on programming models for code that moves around from device to device like data (Marco Mamei's TOTA system, for example, or the code in Bill Butera's paintable computing system).


None of that, though, lets you step back and get a clean global view of what's going on with your program, and that's what I want.  I want (in the way I usually do with Proto), to be able to write code for the aggregate, and compile it to execute on individual devices.


My first stab at this was in 2009, in this paper for the Spatial Computing Workshop.  That came at the problem from the bottom up, worrying about how you can tell whether two executing processes that started at different points where the "same" or not.  I proposed some mechanisms that were way ahead of what the Proto compiler could support at the time, and we're *still* catching up to be able to try them out.  But in the end, what this paper basically said was: this is a serious problem, and none of the straightforward mechanisms work, so let's give it to the programmer to deal with.  Have fun!  Good step forward, but not enough.


The next stab was from the top down, as part of the 2010 Spatial Computing Workshop paper where I proposed a basis set of space-time operators.  This was really asking how we could define space-time computations without reference to a particular programming language, and part of that was coming up with a (very abstract) definition of a space-time computation.  So you could certainly fit first-class functions over space-time into the model created by this paper, but it didn't give any recipe for how to do so.


Most recently, Kyle Usbeck and I put together a precise model of Proto function calls (for SCW'11, where else?), and used it to get some of the things you want out of first-class functions.  But once again, we had to stop short of the full goal---and in fact, ran into a clear problem caused by closures (the use of variables in the definition of a function).


So: what exactly is the problem?  Why are first-class functions so hard?


Here's my current understanding of the problem:

  • When a first-class function is evaluated over space-time, the data values it manipulates are fields over manifolds.  No device knows all these data values, because they're stored locally at the devices representing particular locations in the manifold.
  • When a first-class function acts like data, it can be stored as a value at a particular location in the manifold, and moved to another location.

So that means any data fields used as part of the function definition can't all be stored when the function acts like data, and the function can move around to places where its data fields aren't even defined.  This rules out really standard first-class function tricks like partial evaluation: if we take function f(X,Y) and make a new function using field Z for X, g(Y) := f(Z,Y), then we have no way to assure that Z gets carried around with the function in a meaningful way!


I am undaunted, though.  I think the solution is going to require us to take a much more nuanced view of what "defining a function" and "evaluating a function" actually mean, but I think that solution is out there, and that when it is found it will be extremely important for the future of all of our networked systems.


Perhaps you, dear reader, have a brilliant insight that will resolve this dilemma?  If so, I'd love to argue it with you...

No comments: