Jake Beal's Next Step: 2012

Monday, December 31, 2012

The Year in Review

Good afternoon, dear reader, and welcome to the end of 2012. Unless, of course, you live somewhere quite a ways to my East, and you've already entered 2013. And, of course, this will be up on the Internet for all eternity, so a priori any reader of this is unlikely to still be in 2012. But brushing all that foolishness aside, I am certainly still in 2012 right now, and I think I'll take this opportunity to look back over the year from a scientific perspective.

Let's be organized about it, and look at things in terms of different aspects of life as a scientist:

Research Projects: the core of it all, carving truth from the substance of the world
Publications: the primary product of research, and place to draw research threads together
Funding: powerful amplifier of research, yet generally a trailing indicator of one's impact
Position: one's institution and position within that institution affect opportunity greatly
Impact: what difference one's research makes to others in the world
Professional Service: organizing, reviewing, supervising students, etc.
Work/Life Balance: that portion of life as a scientist which is not being a scientist

Not all of these need to advance every year, but in a healthy career, at least something significant should be happening in most categories.

Looking back over my own past year, the biggest change by far is in the area of work/life balance. I've been running pretty hard for some years now, and since my wife is a scientist as well, "work/life balance" sometimes meant things like "let's sit all snuggled up on the couch while we work on our laptops." In July, that changed irrevocably, with the birth of my daughter. Now I live by an ironclad rule: from the time I get home to the time she goes to bed, I do not work, but spend time just being a parent to my child. More than anything else, this means that I am having to give up perfectionism, and the notion that I can do it all and have it all. My lack of effective triage has been slowly grinding me into dust, and with Harriet's arrival it has accelerated to the point where I can no longer pretend. My goal now is to be only 80% of perfection. This is extremely difficult, but feels doable---I suppose it is my New Year's Resolution. Ask me at the end of 2013 how it has gone.

The other big news for me this year is in scientific publications, with four major journal articles and two book chapters, besides the usual collection of conference and workshop publications. Those journal articles and book chapters loom larger than usual in my view, because of their contents: this is the year when we reported major results from my first funded project in synthetic biology, and in spatial computing we published two key formalizations of space/time computation (one for continuous space/time and the other for discrete), and a massive review of spatial computing programming languages. Overall, it's been a very good year, and there's more in the pipeline from my ongoing research, so I feel very secure about my scientific base.

Funding's been much more of a mixed bag, but I'm still alive, and I'll just keep my fingers crossed on the proposals that are outstanding. Position is a no-op (as one usually expects), and impact is hard to evaluate (Will my energy work escape the lab? Only time will tell.), though Google Scholar indicates a significant uptick in my citations, which is always nice.

In the world of service, I am graduating a co-supervised PhD student, as I reported in this post. The rest is pretty standard: we put out another special issue on spatial computing, and I'm continuing to act as an associate editor for ACM TAAS, plus running my seminar series at BBN and reviewing innumerable papers of highly variable quality. I have also taken a big step by not being an organizer for the 2013 Spatial Computing Workshop (the sixth in the series, and I feel happy that we've been going long enough that I didn't know that number off the top of my head). The 2012 edition was the best yet, and I have confidence that the others will do at least as well without me.

Putting it all together... I think I'm happy: strong on the scientific core and surviving OK everywhere else: not ideal, but a very good base to continue building on. Next year will see big changes as well, both professionally and personally, and from where I sit right now, I think it will go OK. And you can hear my perfectionism again, to not be all superlative, especially in a public forum like this. Honestly, though, I think I prefer a quieter confidence that I can simply stand upon as a firm foundation for the year to come.

Tuesday, December 18, 2012

Better Living Through Manifold Geometry

"Better Living Through Manifold Geometry" was my cheeky title for our editorial introduction to the Computer Journal special issue on Spatial Computing that is just about to come out. Alas, the final article appears to be receiving only the rather more boring simple appellation of "Editorial."

Regardless of title, though, the thing that I found quite striking as I actually read through all of the articles in the special issue, looking for the common threads that drew them together, is that spatial computing really has been getting much more coherent in intellectual approach, and that manifold geometry is one of the key concepts that keeps popping up.

My take on this is that it is not enough simply to recognize the locality and spatial embedding of a distributed system. You also need representations that will let you take advantage of that insight, and normal Euclidean geometry, like we all learned in grade school, is just not sufficient. We need our geometry to align with the structure of how information can actually flow, and the tool for that is a manifold. The nice thing about manifolds is that they can give you the "stretchiness" of topology, warping around whatever constraints exist in the real world, yet they can still provide most of the nice geometric properties we want, like distances, angles, paths, volumes, etc.

The only problem is that we don't grow up learning about them, so most people find manifolds to be a difficult and non-intuitive notion. Even our maps get flattened into Euclidean projections when the surface of the Earth is really a sphere. And of course the formal mathematical notation typically just makes things worse. But that's one of the things that I think we're nibbling away at, bit by bit, as we work on Proto and other spatial computing languages: how to capture the power and ideas of manifolds, but wrap it up in a way that makes it easy for any programmer to take advantage of it.

Monday, December 10, 2012

The International Journal of Mystery

Hi folks... I had a lovely vacation away from the internet last week, and now I'm back with another batch of scientific philosophizing. Lots of discussions of papers queued up, but that will keep a little longer...

Recently, a junior colleague of mine was telling me about a journal publication he's working on, and told me he was a bit concerned because he wasn't sure whether the journal was actually any good or not. To my great shame, the first words out of my mouth were "What's the impact factor?" To my astonishment, his immediate reply: "What's an impact factor?"

I've been thinking more about this since. Could I not have done something even slightly more worthy than immediately falling back to the shared common bugaboo of science? After all, I don't generally pay all that much attention to impact factor either, and certainly can't quote numbers for most of the places I've published. Is it so odd for my colleague to not have known about impact factors? Moreover, I receive pseudo-personalized invitations to publish in various international journals every day and I ignore most of them as academic spam without even bothering to look up their impact factors. How do I actually judge the quality of a journal when I'm deciding whether to submit there?

First, for those of you so fortunate as to join my colleague in his innocence, let me explain. Impact factor is a number used as a way of measuring how important a scientific journal is to a field of research---and therefore as a proxy for measuring how important a piece of research is by the company it keeps. It is typically calculated using three years of journal articles indexed by Thomson Reuters, as the mean number of citations in a given year to articles that a journal has published in the prior two years. You're probably already thinking of objections: Why count only citations from journals? Who the hell is Thomson Reuters and how do they decide what's indexed? Why two years - don't we care if things stand the test of time? Can't people manipulate the system? These, dear reader, are only the tip of the iceberg and there's a long tradition of scientists deriding impact factor as a metric, making up new alternative metrics that address some of the problems while creating other new ones, and generally adding to the chaos of standards. Nevertheless, impact factor, like Microsoft Word, is the lowest common denominator that many are forced to bow to, by their institutions, by their funders, by their tenure committees...
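For the concrete-minded, the two-year calculation described above can be sketched in a few lines of Python. The function name, the dictionary layout, and the numbers are all invented for illustration; this is not how Thomson Reuters actually processes its index, and it glosses over the question of which items count as citable.

```python
def impact_factor(citations_received, articles_published, year):
    """Mean citations received in `year` by articles from the two prior years.

    citations_received: {publication_year: citations counted during `year`}
    articles_published: {publication_year: number of articles published}
    Both layouts are assumptions made for this sketch.
    """
    window = (year - 1, year - 2)
    cites = sum(citations_received.get(y, 0) for y in window)
    items = sum(articles_published.get(y, 0) for y in window)
    if items == 0:
        raise ValueError("no articles published in the two-year window")
    return cites / items


# Made-up journal: 100 articles across 2010-2011, cited 150 times during 2012.
example = impact_factor({2010: 90, 2011: 60}, {2010: 40, 2011: 60}, 2012)
print(example)  # 1.5
```

Note that anything published before the window, however heavily cited, contributes nothing to the number.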

Let's avoid going any further down that tempting rathole of a discussion.

Instead, let's return to the question at the root of the whole discussion:

Is this journal any damned good?

First off, what do we mean by "good" when we're talking about journals? In my view, this basically boils down to three things. In order, from most to least important:

1. Will my reputation be enhanced or tarnished by publishing here? Some journals will add lustre to your work without anybody even reading it. Rightly or wrongly, we primates love argument from authority. Conversely, if you publish in a journal that's a total joke, people will wonder what's wrong with your work that you couldn't put it somewhere meaningful.
2. Will my work be read by lots of people? I believe that most articles will only ever be noticed, let alone read, by people who found them by Googling for keywords in a literature search. And your close colleagues should know about your work because you talk about it together. Each community, though, typically has one or two publications that people just read because they feel it represents the pulse of their scientific community. Get into one of those and you'll be seen by orders of magnitude more readers.
3. Will I be competently reviewed and professionally published? Amongst the great herd of middling journals, some are a pleasure to work with and some are a total train wreck. In the end, though, if you get reviewers who give good feedback and the actual mechanics of publication are handled professionally, that's a nice bonus.

Ideally, impact factor ought to tell you about #1 and #2, but in practice I find it really only tells me about extreme highs.

So, what is it that I actually do in order to tell if a never-before-heard-of journal is any good? Well, first I check the editorial board: Do I know them? Do I know their institutions? Of course, the really big names in a field are often not on boards, or on boards only ceremonially, since they're too busy. I tend to look for the presence of solid mid-rank contributors and decent institutions---the sort of folks who I find form the strongest backbone of professional service. But if nobody I've heard of in the field and nobody at reasonable institutions cares enough to help run the journal, then why should I think that publishing in a particular journal will make any impact?

If the editorial board hasn't convinced me one way or another, then maybe I'll check the impact factor, but really that's just a +/- test: if it has an impact factor of at least 1.0, that's a good sign, but a hazy one and not necessary, since many good venues have no impact factor and impact factor can be gamed. More important is how long something has been around: anything that has survived at least a decade is likely to be solid (though again, not necessarily).

As for black marks: if a never-heard-of-it journal seems to have an extremely random or broad scope, then what could its community possibly be? Those I always find suspicious, since it feels like they are just trolling for submissions. Much worse than that, though, is if the publisher is a known bad actor, especially somebody who spams me repeatedly. I'm sorry, Tamil Nadu, but your academic community will be forever tarred in my eyes by the people who fill my inbox with poorly targeted spam.

Sufficient? Hardly. But those, at least, are my own heuristics for dividing the worthy and the dubious when approaching yet another new journal. I suspect that this isn't a problem for people who don't do as much interdisciplinary work as I do, and that it was a lot easier a few decades ago when the number of journals was much lower. But think: if it's this hard to decide where to write, how much worse is the problem of finding what to read? And that is a discussion for another time...

Saturday, December 01, 2012

Author Order Semantics are Broken

OK, folks: rant time again. This one's been in the back of my mind for a while, and dealing with several different papers recently that all managed their authors differently, the cognitive dissonance is high enough that I think it's time to get it out of my system.

Author ordering on scientific papers is totally broken.

Here's the thing: the order of authors on a paper matters a lot. It's exposure, since the first author is the one that gets associated with the paper, and you'll always see it cited as "[Busybody et al., '01]," and not any of the other permutations. And there's a lot of tea-leaf reading that goes on as people try to work out who's really to credit for the work in an article. Only problem is, there are several conflicting theories of how to interpret author ordering.

Here's the main theories:

1. The first author is the most important, the second author less so, etc.
2. The first author is the most important, the last author is the senior author, the authors in between don't really matter.
3. Authors are listed alphabetically, with no author assumed to have significantly more credit.

Then there's lots of different sub-theories as well, having to do with who did the laboratory or coding work versus who did most of the writing, do you include only really important contributors or anybody who ever commented on the project, how does supervision play into the decision, etc.

Within any given community, there's usually some conventions, often driven by typical author list size. For example, my roots are largely in the more theoretical and software-driven side of computer science, where it's not unusual to see single-author papers, and most are probably 2-3 authors. In that community, Theory #1 tends to dominate, and the bar for authorship is pretty high. On the other hand, my work overlaps a lot with biology now as well, where there tend to be lots of authors, and Theory #2 is more typical. I've also seen Theory #3 pop up in special circumstances, in which I am often unfairly privileged because my name begins with "B."

But these theories conflict, and no paper ever comes with a note saying which theory it belongs to. Oh, there are journals where they have you put in a little assignment of responsibility saying "R.F. wrote the paper, J.X. performed the experiments, K.O. did the data analysis, and P.Q. killed mice until we begged him to stop." But those are typically telegraphic at best, deliberately obscure at worst, and potentially subject to all sorts of odd internal group politics.

The real trouble comes at the boundary cases. When there are ten authors, you can safely assume there's some tiebreaker policy in effect, and most of the ones in the middle aren't terribly important. But what about 3 or 4 or 5? Is the last the unimportant tag-along, or the all-important thought-leader / supervisor? If three authors are in alphabetical order, is that a deliberate choice, or just a 1-in-6 coincidence? How quickly does significance decay going down the list of authors?
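The 1-in-6 figure is just 1/3!, and the odds of an accident shrink quickly as the list gets longer. A quick sketch, assuming that every ordering is equally likely when no policy is in effect and that no two names tie (the names below are placeholders):

```python
from itertools import permutations
from math import factorial


def chance_alphabetical(n_authors):
    """Probability that a uniformly random ordering of n distinct names is sorted."""
    return 1 / factorial(n_authors)


# Brute-force check for three placeholder names.
names = ["Adams", "Baker", "Clark"]
orderings = list(permutations(names))
sorted_orderings = [o for o in orderings if list(o) == sorted(o)]

print(len(sorted_orderings), "of", len(orderings))  # 1 of 6
for n in (3, 4, 5):
    print(n, chance_alphabetical(n))
```

By five authors the chance of an accidental alphabetical listing is 1 in 120, so under these assumptions the ambiguity really lives in the three- and four-author cases.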

Ultimately, I think the trouble comes from the fact that our language is linear, and we're trying to express a team structure that often is not. If we had a symbology of authorship, that would perhaps help, so that one could draw the authorship as a graph with circles and boxes around names and arrows between them. But that will never fly, and probably wouldn't make a difference anyway, since we still have to pick somebody to come first when we're talking about the paper with other people.

So, in the end, what do I think we should do about it? I guess I'm feeling Churchillian tonight, because at the end of the day my feeling is this: author ordering is the worst possible way to indicate responsibility for a paper, but it's better than all the alternatives.

Tuesday, November 27, 2012

Keynote on Engineered Self-Organization

Just a brief note today, as I squeeze a post between proposal, paper, and parenting: earlier this month, I gave a keynote talk at the Through-Life Engineering Services Conference, a new conference put together by folks in England who are involved in a large mixed academia/industry project to tackle complexity in large-scale engineered systems like aerospace vehicles, the power grid, and trains.

Attending was fascinating for me, getting to see how people who are right in the middle of these manufacturing and management problems are actually thinking about things, and what applied research looks like in the area. Actually, it helped clarify for me some ways of thinking and talking about my own research, particularly my work on energy demand management. Lots of interesting people too, and hopefully some of the possible collaborations will come to pass.

The other nice thing about giving a keynote (and writing an invited paper to go along with it) is that it gave me a good chance to put together a review of my work in engineered self-organization, and to pull together a unified view for myself of how all the pieces fit together. The talk, "Engineered Self-Organization Approaches to Adaptive Design," is on my webpage now (also in PDF), as is the paper---though unusually, I recommend reading my slides rather than the paper, since they were finished much later, and I think I understood the story much better by the time I wrote them.

Thursday, November 15, 2012

Silent Communion

Tonight, I looked down into the growing dusk over Montana and saw a single tiny light burning amidst a vast expanse of snow. It sat in the middle of dormant fields, wrapped around by the darker tendrils of a rough-hewn river system. Five minutes later, another slides by, a fiercely orange pinprick of civilization alone in the wilderness of Western America.

Who are these lonely sentinels of the wilderness? I hope they sit warm and content within their domains, no matter the frozen lands around, and I think how lovely silent it could be, alone in the snow and nothing to see but the land, the stars, and the planes passing by above.

Presentations and networking done, I am homeward bound through the night, pulled by the stream of pictures from home that trickled into my phone across the morning, images of my smiling daughter playing, laughing, sleeping, happy in Ananya's arms. Tonight will be late and hard, gliding into Boston well past midnight, with an internal proposal deadline still to hit tomorrow. But I wouldn't give it up for the world. Just sometimes, looking down, I think how nice it would be to spend a month in a cabin in the wilderness, and just let everything stop for a while.

Monday, November 12, 2012

Swarm Presentation & Paper available

I've now posted the presentation and paper from my talk at the AAAI Fall Symposia online. This is a case where I actually recommend the presentation, "From Spatial Computing to Tactical Command of Swarms," and its accompanying bundle of live Proto demos over the paper. The reason is simply that by the time I wrote the presentation, I understood much more clearly how to enunciate the contribution I am making in the area of swarm control.

It comes down to one of the core problems that I hit on again and again in all of these different areas: composability. There are lots of clever ideas for how to make a swarm of robots do something together as a group. Many of them come from natural inspiration (e.g., flocking like birds, foraging like ants or bees, flowing like water). The problem, however, is that robots are neither birds, nor insects, nor water. For any realistically complex application, there are a lot of different aspects that have to all be gotten right, and inevitably it is the case that not all of those will be identical to any particular natural source. For example, if you want your robots to flock together like birds, well, they probably don't steer like birds, and the consequences of hitting one another may be more severe than for birds, and their sensors pick up different sorts of information, and they communicate with different ranges, and so on and so forth. So we need to take the basic natural behavior (e.g., bird-like flocking), and modulate it to fit the requirements of our actual platform and application. Moreover, you're probably going to need to put a bunch of these different pieces together in order to get anything complicated done---and our ambitions for engineered systems are usually pretty complicated, even when the core ideas or main "normal mode" behavior is simple.

So what we get from a continuous abstraction like the amorphous medium, and composition models like Proto uses, is a clean model for how to put the pieces together to get complicated behavior. Dataflow composition gives us a clean separation of different computations, state-through-feedback means we don't have to deal with weird interactions through persistent variables, and restriction---ah restriction, the most subtle spatial operation---lets us modulate behaviors by changing where they are being computed.
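To make restriction a bit more concrete, here is a toy sketch in Python rather than Proto. It is not Proto code and does not use Proto's API; the grid of devices, the function names, and the wall are all assumptions made up for illustration. The same hop-count behavior is computed twice, once everywhere and once restricted to the devices outside a wall, and the behavior itself is never modified.

```python
from collections import deque


def hop_distance(devices, neighbors, sources):
    """Breadth-first hop count from the sources, computed only over `devices`."""
    dist = {d: 0 for d in sources if d in devices}
    queue = deque(dist)
    while queue:
        d = queue.popleft()
        for n in neighbors(d):
            if n in devices and n not in dist:
                dist[n] = dist[d] + 1
                queue.append(n)
    return dist


def restrict(devices, predicate):
    """Keep only the devices where the predicate holds."""
    return {d for d in devices if predicate(d)}


def grid_neighbors(d):
    """Four-connected neighbors on a hypothetical grid of devices."""
    x, y = d
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


grid = {(x, y) for x in range(5) for y in range(3)}
wall = {(2, 0), (2, 1)}  # devices that should not take part

everywhere = hop_distance(grid, grid_neighbors, {(0, 0)})
outside_wall = restrict(grid, lambda d: d not in wall)
around_wall = hop_distance(outside_wall, grid_neighbors, {(0, 0)})

print(everywhere[(4, 0)])   # 4: straight across the bottom row
print(around_wall[(4, 0)])  # 8: detours through the gap at (2, 2)
```

The distance computation never learns that walls exist; changing where it runs is enough to make it route around one.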

If you're interested, the talk lays it out pretty well, and the demos illustrate it really beautifully...

Friday, November 09, 2012

The Joy of Scientific Airline Travel

I'm writing this now in a plane, flying back from England, where I just gave a keynote on Engineered Self-Organization and spent a couple of days after the conference working out possible collaborations with colleagues. I'll talk about all that sometime in the near future---right now, though, what I want to talk about is the joy of scientific air travel.

I never really flew much as a kid---my family tended to drive into the nearby wilderness for our vacations, so I never got exposed enough to become comfortable with flying. When I started flying professionally in dribs and drabs during grad school, I was always completely afraid on takeoff and landing, willing the plane up into the air or safely down to the ground as I stared intensely out the window. It didn't help either that I was coming from Boston, since all the landing paths at Logan Airport come in over the water, and you never have land below you until just moments before the wheels touch the ground.

These days, though, I rather look forward to it. Somehow, flying transformed from a frightening necessity into a comfortable routine. Now I sit by the window just because I enjoy the view, and also because it gives me minimal interference from my fellow passengers. Once we're airborne, out comes my laptop or the papers I need to read, and there I am with nowhere to run and no Internet to find me (no, I have never paid for Gogo, and I pretend it doesn't exist). It's a calm, focused time, tapping away getting things done, and with my MacBook Air these days, I can eke out around eight hours of battery if I'm just writing papers with my screen brightness turned down.

Sometimes I've got something in particular I need to do, other times I just open up my machine and take stock of the state of my intellectual world. Some of my best thinking gets done while doing that (and you get some quality blog posts too). It just seems rather ironic to me that one of the places I am most grounded is when I am 10,000 meters in the air.

I still sit by the window, whenever I can, so that I can look out at the world going by, see the intricacy of the land and the settlements of people upon it. Clouds too, though I'll admit I find them boring after a while. When it's clear down below, I love to watch my progress against the map and try to identify the landmarks as they go by. Chicago is one of my favorite cities from the air, as is New York, and on a good day flying into Boston from the West, I can mark every major city, river, and highway from Utica on in. Once, flying out of San Francisco, we passed right by Half-Dome in Yosemite, and it practically hovered there right outside my window, turning in three dimensions.

But why am I just talking about it? This is a blog, and I can show you pictures just as easily. Here are a few of my favorite memories from the air: flat and two dimensional, faded compared to how they looked in person, but maybe still enough to give you a feel.

Chicago

Hindu-Kush Mountains

Michigan Shore

If I ever stop caring to look out the airplane window, I'll know that I've lost an important part of my soul.

Monday, November 05, 2012

An Accidental Investigation of Publication Metrics

Dear reader, welcome once again to one of my more philosophical posts. I've been working on reorganizing my webpage---something long in need of doing. It used to make sense, when I was a grad student or a young postdoc, to have a simple list of all my publications. Over the past few years, though, as both the number and variety of my publications have grown, I think this has become less sensible. Now the list is rather long, and all silted up with the detritus of scientific publication---dead ends, early work, incremental reports, and important-but-boring filling in the gaps.

One of the things that makes my webpage such a mess is that my current list does not discriminate between types of publication: journals, book chapters, conferences, workshops, tech reports, and unpublished white-papers are all jumbled together in chronological order (possibly the worst reasonable ordering tiebreaker).

I used to solve the density problem by segregating the publications by subject area. Subdividing further would be unsatisfactory to me these days, however, since there are so many connections between different pieces of work---do I put the first "functional blueprints" paper into morphogenetic engineering or spatial computing, since it was much more focused on spatial/cellular approaches than what came after? How about my energy work, which started out as an application of Proto, but has evolved to shed both Proto and spatial computing in general? There are far too many such boundary cases, and I don't want a reader to miss a publication because they're looking in the wrong section.

I suppose I could resolve the density problem by segregating them into type: put the Respectable Journals up front, followed by the High-Impact Factor Conferences, and so on. Problem is, I've got tech reports and workshop papers that I think are more important than some of my journal papers.

Which leads to a general comment on scientific publication, I think. So far, in my career at least, I find there to be a minimal correlation between importance of publication and "significance" of venue. An idea put forth first in a workshop (the amorphous medium abstraction) has become the most central element of my whole line of research, and I still cite that workshop paper. Maybe someday it will be replaced with a Reputable Journal paper updating and expanding the results, but that hasn't happened yet, and isn't likely to happen soon, what with my jam-packed publication queue and parenthood.

So, let's see how my intuitions hold up against data (ah, the scientific lifestyle), and try plotting "venue" vs. "importance." First, I've gone through all the publications on my website and pulled out those that I think are "important," further coding some of them as "foundational"---meaning they are something whose importance I think is broad and durable, which generally means it's at the root of a significant ongoing research program. Now let's group them into publication classes using my CV, which lists 91 non-thesis publications (Google Scholar finds more, but we'll ignore that whole can of worms for the moment). In my CV, publications are broken up into six classes, which we'll order by typical ferocity of peer review (a proxy for venue quality), in decreasing order: Journal, Conference, Book chapter, Workshop, Abstract, Informal (tech report, white-paper, etc.). Plotting the numbers of each type as a stacked graph, we have:

Huh... my publication profile actually looks a lot more conventional than I expected.

It's completely unsurprising that the abstracts are barren of value, since they're typically just too short for anything significant---no more than two pages. The big surprise, looking at this, is how barren the conferences are. My guess is that a lot of those "unimportant" conference articles are steps on the way to a more complete result---and looking more deeply into them, it seems like about half of them are exactly that. That workshop articles are largely barren is less of a surprise, since so many of them are position papers, dead ends, or roads not taken---and a deeper inspection confirms that completely. Workshops are apparently where I toss ideas against the wall, and some of them stick (with massive importance), while most of them just fade away.

Digging into those journal articles further, I find that six of the eight journal articles started life as a "lesser" publication, and then were extended and upgraded into a full journal publication---which then supersedes the prior publication in importance, hogging all of the spotlight. That's appropriate, I suppose.

Does this mean that I should expect the foundational workshop and informal publications to migrate into journals as well over time? Perhaps they will---and in fact, I know that one of them is trying to already.

So what we have here in many ways is a "revisionist" picture of science, where the material that turns out to be important ends up migrating over time upwards in venue quality. If that's the case, then "journal papers are more important" is only true for people who aren't the author: it's a selection process that retroactively highlights the important work, rather than a leading indicator. Perhaps we should instead think of publications as some sort of an exploratory tree process. Here's a notional diagram of what that might look like:

Color to match bar graph above. Arrows indicate dependency, pointing from a dependent work to its source. Size: large=journal, medium=conference/chapter, small=workshop/abstract/informal. Concentric publications indicate "venue promotions" that supersede a prior citation.

Let's say a research program started at the large bottom node with a workshop publication. As it goes up and out, it grows and branches. Importance tends to relate to how much research is running back through a publication. Also, as publications become more important, they sometimes upgrade into more "quality" venues---which renders the prior version (shown as concentric) unimportant. Sometimes a big step can be taken directly, sometimes it needs to go through bridging stages on the way. And of course there are lots of things that end up staying unimportant, either because the initial idea was wrong, hit a dead end, or just plain got triaged by the 24-hours-per-day limit.
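One rough way to put a number on "how much research is running back through a publication" is to count the works that depend on it, directly or through a chain of other works. The sketch below does that in Python. Treating this count as importance is my simplifying assumption, and the publication names and dependency links are invented placeholders, not a coding of any real publication list.

```python
def downstream_counts(depends_on):
    """Count, for each work, how many other works build on it transitively.

    depends_on maps each work to the list of works it draws on, matching
    arrows that point from a dependent work to its source.
    """
    counts = {work: 0 for work in depends_on}
    for work in depends_on:
        seen = set()
        stack = list(depends_on[work])
        while stack:
            source = stack.pop()
            if source not in seen:
                seen.add(source)
                stack.extend(depends_on.get(source, ()))
        # `work` runs back through every source it can reach.
        for source in seen:
            counts[source] = counts.get(source, 0) + 1
    return counts


# Placeholder research program growing out of a single workshop paper.
program = {
    "workshop": [],
    "conference_a": ["workshop"],
    "conference_b": ["workshop"],
    "journal": ["conference_a"],
    "dead_end": ["workshop"],
}
print(downstream_counts(program))
```

In this toy program the workshop paper at the root scores 4, conference_a scores 1, and everything else scores 0. A venue promotion could be handled by merging the superseded version into its replacement before counting.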

I suspect that I may have a somewhat higher than average branching factor, given the nature of my research and personality. I don't know though---this may instead be an impression that I've gotten due to the operation of just such a process. After all, the informal publications tend to fall away from visibility if they are not deliberately preserved and archived online by a researcher, and it's hard to see anything besides the mature work of another researcher. It would be fascinating to study this over a number of scientists, but really hard to do effective coding on publications.

Coming back to the root problem that started me down this intriguing rathole: when it comes to laying out my webpage, since I'm going to be showing people a snapshot of time, I think it's only right to classify things by current perceived importance, and not by category. And now, dear reader, an exercise for you: let's see just how long it takes between this post and an actual restructuring of my website. If it happens very quickly, it probably means I'm engaged in procrastination; if it takes more than six months, well, you have my permission to point and laugh. And if you're a scientist reading this, would you be willing to contribute a coding of your own publications?

Monday, October 29, 2012

Swarm Control at AAAI Fall Symposia

Later this week, I'll be giving a talk at a symposium on Human Control of Bio-Inspired Robot Swarms (part of this year's AAAI Fall Symposium Series). I like the AAAI symposia, because they're a really good place for position papers and preliminary work: you can put your ideas out there, get some good feedback, and at the same time put it in an archival place where people can cite it if they find it useful and inspiring. I think it's also one of the things that keeps the AI community from being scoop-fearful like many other communities.

Anyway, my talk is about an application of my continuous space abstractions and Proto that should be pretty obvious: controlling large swarms of robots. We published a journal paper aiming in this direction a couple of years ago, talking about how viewing a swarm as a "material" flowing through space makes it very simple to create complex swarm behaviors. Rather than worry about all the individual robots and how they should interact, you specify how regions should stretch and squish and flow.

This isn't entirely new---other people have used continuous space models as well, though in a much more control theory or partial differential equations way. The nice thing about Proto is that once you've got a behavior specified, you can build on it, modulate it by choosing which robots are participating, compose it with others, etc., and it's all very simple and easy to predict what's going to happen. That's how you can build up such complex behaviors so quickly and easily---once you've got a few building blocks, you can go wild putting them together.

Well, this week's paper is pushing that work forward, asking the question: What's a good interface for letting ordinary people talk about what they want their swarm to do? I'm proposing that a good starting point is a sort of "command and control" model where you break your swarm into units, and then talk about who's supposed to stick together, where they're supposed to move, and how much they should be long & thin vs. thick and fat. Or to be more precise: specifying the first three moments of the swarm distribution for each unit. That makes it easy to make formations like these:

Swarm moving in a dumbbell formation, and another in a chevron formation.
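To make "the first three moments" a little more concrete, here is a minimal sketch in Python (not Proto, and not code from the paper). It assumes one plausible reading: the zeroth moment is how many robots are in a unit, the first is the unit's centroid (where it is), and the second is the covariance of positions (long and thin vs. thick and fat). The function name `unit_moments` and the 2D point representation are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def unit_moments(positions: List[Point]):
    """Return (count, centroid, covariance) for one unit's 2D robot positions.

    Assumed reading of "first three moments": zeroth = count,
    first = centroid, second = population covariance matrix.
    """
    n = len(positions)
    if n == 0:
        raise ValueError("a unit needs at least one robot")
    # First moment: the mean position of the unit.
    cx = sum(x for x, _ in positions) / n
    cy = sum(y for _, y in positions) / n
    # Second central moments: spread along and across each axis.
    sxx = sum((x - cx) ** 2 for x, _ in positions) / n
    syy = sum((y - cy) ** 2 for _, y in positions) / n
    sxy = sum((x - cx) * (y - cy) for x, y in positions) / n
    return n, (cx, cy), ((sxx, sxy), (sxy, syy))


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (0, 2), (2, 2)]   # compact, "thick and fat"
    column = [(0, 0), (4, 0), (8, 0)]           # stretched, "long & thin"
    print(unit_moments(square))
    print(unit_moments(column))
```

In this reading, a commander would set target values for these three quantities per unit, and the swarm would be responsible for flowing into a shape that matches them. A stretched unit shows up as one covariance entry being much larger than the other.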

There's a bunch of other thoughts in there, and proposals for how we can turn this idea from early work into something practical for people to use with swarms. Not that there is much in the way of actual swarms out there yet either (with some elegant notable exceptions), but that's only a matter of time and cheapening hardware, and there's a lot of folks working on that...

I'll post the paper after the talk is given.

Tuesday, October 23, 2012

Congratulations to Noah Davidsohn!

Following up on last week's post about characterization, let us celebrate Noah Davidsohn's successful Ph.D. defense. Ron Weiss and I have been co-advising Noah's work on the characterization project, where he has done all of the wet-lab work and contributed that perspective into the experimental planning and analysis. Noah presented a quite clear and coherent discussion of the scientific journey of the project, all the way up to the ultimate results that I previewed here last week. The dark cloaked forms of the Thesis Committee now draw aside, and all that remains to the acolyte is to finish the Document itself...

Congratulations, Noah!

Monday, October 15, 2012

Progress in Characterization...

Normally I wouldn't share a significant unpublished result on this blog, but last week we showed this off at a program meeting, so the cat is already mostly out of the bag. To make a long story short, all of the hard work we've been doing on characterization of devices in synthetic biology is beginning to pay off. I've talked about this a couple of times before, on the struggle to make good models of biological devices and on the new characterization protocols we developed so that we could study devices quickly and precisely.

Now, it seems to all be coming together nicely, and we've gotten a few beautiful results like this:

I'm not going to try to explain it all here (well, unless somebody actually asks for details in the comments), but the important things to understand are these:

The circles are our predictions, and the crosses are the experimental data
The colored areas we have confidence to predict in, the grey areas we don't
Our predictions are really, really close to the actual behavior.

That's all I'll say for the moment, but look for a publication with full details appearing Real Soon Now... and I'm damned excited, because if the future work keeps going in the direction these results indicate, it opens the door for massively more complex biological systems and justifies the whole design tools thrust that we've been making at BBN...

Monday, October 08, 2012

That new proposal smell...

Last week, my collaborators and I sent off a full proposal and a pre-proposal. I'm quite proud of both of them: they both build logically on what we have done before, have clear and achievable research programs, and aim towards applications that people care about. Maybe you'll hear about them again, and maybe you won't, depending on whether the folks on the receiving end agree with us...

What I want to talk about here, though, is that wonderful new proposal smell. When I'm writing a proposal that I'll be a PI for, I really have to fall in love with the ideas. I just can't see sending in a second-rate proposal, to do something that you aren't all that enthusiastic about, and I have to think that the reviewers would be able to smell that on one's proposal as well. Oh, I've been an nth writer on proposals I wasn't enthusiastic about before, but that's different than being a PI, the Principal Investigator, the one leading the charge and holding the bag if things go wrong.

So when you start working on a proposal, there's the foundation you've got and the target you're writing to, and this dance where you try to bring the two together, and figure out how to sell your ideas to the folks on the other end, whose purposes are not your own. It would be easy to be cynical about it, I think, and to view the process as some sort of money/power game. And I suspect that it actually is in some of the really giant-money high-stakes stuff, like where Congress gets directly involved. But in the world that I play in, I don't think that's the case. And it's not just about marketing, either.

Rather, I look at proposal-writing as a distinct part of the scientific process. When you're doing science, you need to understand not just "What is my idea?" but more importantly, "How does my idea relate to what else is out there?" and "What difference does it make if my idea is true?" Sometimes this can be related directly to some sort of real-world application (as is the case with my work on energy management). Sometimes it's indirect, where you might end up affecting the world someday, but someday is likely rather far away. Sometimes this just relates to how we understand the world we live in, where we come from, and where we're going (whenever I start to fret too much about real-world applications, I remember that General Relativity, one of the most significant intellectual developments of the 20th Century, has only recently reached into the practical world of our daily lives via the GPS in our smartphones). If my ideas are actually as important as I would like to believe they are, then I had better be able to make a case why somebody else should be interested in them, and a proposal is just a focusing of that discussion to a particular audience. A good proposal isn't a sales job, it's a proposal for a mutually beneficial partnership with your potential funder, where you doing your research helps them achieve their own goals.

For a really good proposal (like, I hope, the ones we just sent in), the effort of writing the proposal is part of the preliminary research on the project. The work of figuring out how to connect with the subject of the proposal solicitation brings up new questions and challenges and forces you to confront problems that could be avoided in prior contexts. With a good team and a good proposal, those tensions lead to new insights even as you write, and you end up formulating a really clean plan of attack on the problems. It's exhilarating, laying out a possible future, seeing how the work all could fit together and what exciting lands it could lead to, before the realities of a project set in and we have to get engaged with all the messy details, side trips, unexpected obstacles, new phenomena, etc. that the world may soon decide to throw at us. That's what I mean by the "new proposal smell," and it's lovely to inhale it every once in a while, and just to say together with your collaborators, when the documents are filed, "Good work everybody. I hope we get to do it."

And none of this, of course, no matter how good your intellectual content or prose, can ensure a proposal will actually be selected for funding. To mangle Anna Karenina, "All funded proposals are (sort of) alike, but every failed proposal may fail in a different way." So perhaps you'll understand me when I say: the saddest thing about proposals is knowing that you may never get to do the research.

Sunday, October 07, 2012

"Organizing the Aggregate" now in print

Just a short note: the book containing our review of spatial computing programming languages is now officially out and on sale, both at the publisher's website and Amazon. It's entitled "Formal and Practical Aspects of Domain-Specific Languages: Recent Developments", and our review "Organizing the Aggregate: Languages for Spatial Computing" is Chapter 16. Now, I don't get any cut of the sales, so it doesn't matter all that much to me whether you buy it in its elegant form in the book or just snag the content from the preprint. But it's nice to see it in print.

Sunday, September 30, 2012

Pinned

I'm currently pinned.

There's a proposal draft I need to read and mark notes on, but I don't have a pen and the nearest one is three feet above me on a shelf. I've read as far as I could, but then got to a section I really need to mark up, and I just can't go forward without making the marks I need to make, or I'll lose all the value of the read-through.

I could try to annotate on the electronic copy on my laptop, but I'm typing this one-handed and full of typos, which you won't see because I'll clean it up before posting. A MacBook Air, by the way, is a wonderful machine for a parent since it is so light and can be balanced on your chest, arm, whatever, without any problem. But although I can use the machine, I sure can't produce content on it right now given my hilariously egregious current typo rate and terribly low words-per-minute while doing low-light upside-down one-awkward-finger hunt-and-peck.

Pinned, flat on my back, able to move legs and arms, but only so far. Not the easiest position in the world, and if I could still sleep, it would be a fine time for it, but it's morning enough that I'm fully awake and no longer able to drowse.

Drowned my sorrows in blogs for a bit, but now my creative side is itching to get started with the day and do some science. That's the thing, you know---I think you can't be a scientist without some level of obsession and inability to just let things alone and be content like a normal person. I certainly can't, and given a long enough time of stillness, I'm always going to start trying to create something---bring something of value into the world that wasn't there before, my own little strike against entropy and time. Not necessarily science, maybe just cooking dinner or organizing our room or watering my plants.

Not that I can do any of that right now.

But at least I can do something meaningful with my brain, more than playing zombie content consumer on my favorite blogs and web-comics. And now finally I think I have a solution: this is an excellent time to try to catch up on the literature a bit---or at least plug my fingers in the dike. Not that it will be particularly easy to read, lying here upside down and holding my laptop up above me.

But hell if I'm going to wake the baby when she's decided to sleep sprawled on my chest like a cat.
Even if it does mean I'm pinned.

Sunday, September 23, 2012

Raising kids with science

Last week, I took my daughter Harriet to the IgNobel awards. This was a terrible idea, of course, since she's only two months old, but in the spirit of the ceremony I figured that a terrible idea might just turn out to be great and went for it. Fortunately, my decidedly risky reasoning turned out to be correct---she enjoyed some parts of the raucous performance (especially the opera), ignored most of the rest, drank quite a bit of formula, squawked loudly only once or twice, had one fast and discreet diaper change, and slept through the last fifteen minutes or so. During the paper airplane barrage, the nice folks sitting near me formed a missile shield and deflected wayward planes that might have otherwise hit her. But beyond all the parenting and silliness, I had some serious thoughts as well: sitting there in that theatre, listening to a celebration of the strangeness in science, made me think a lot about the question of raising kids with science.

I'm a scientist---as well you know from the tagline on this blog. So's my wife, and our shared love of inquiry is one of the standing waves of our relationship. So I have a feeling that my daughter is likely to either embrace science from the start, or to end up running screaming away as fast as she can. So, how should I think about this as a parent? What's the responsible way to approach this whole area of life?

Well, an important place to start is getting a clearer idea of what I mean by "science" in the first place. The obvious starting point is that, yes, I do SCIENCE! for a living, and write papers and grants and take data and stuff. But my study of the more obscure types of questions that makes up my career has had a backward effect on the rest of my life as well. During graduate school, one of the most liberating lessons that I learned was that "I don't know" is a totally respectable answer---quite liberating for somebody who used to be an obnoxious know-it-all have-to-be-right kid in grade school. You mean I don't have to stake my ego on having answers? Later, one of the hardest struggles toward my thesis was staring at the pile of conjecture and mechanisms I was working with and asking how I could really justify what I thought I knew.

Those lessons I learned in graduate school boil down to two simple questions that I believe are the root of science, and they are eminently applicable to everyday life:
1) What do you actually know, and what do you not know?
2) How do you know what you know?
Once you know where you stand, the obvious and tempting extensions are "Let's go find out..." and "Would this help?"

Science, the profession, is simply about answering those questions in places that other people are also interested in and where the answers are not yet known and finding the answers typically requires rare knowledge or equipment. But you can practice the same things anywhere. Does taking this shortcut actually help me get home faster at rush hour? Should I pack my lunch or eat at the cafeteria? Day care or nanny or stay-at-home parent? Get the baby her vaccinations on schedule? None of this requires the trappings and ceremonial indicators of science, just a willingness to recognize that you may have bias in your preferences, to ask how you can test what makes sense, and then get the information.

Let me give an illustrative example---good scientific practice, giving the reader a cross-check of what's been said so far. When Harriet was but a young fetus, we faced a common modern pregnancy dilemma: an expecting mother is supposed to eat lots of fish because it's jam-packed with Omega-3s and other Good Nouns, yet must limit her fish intake to one to two servings per week to avoid mercury. What's a loving parent to do? So as the designated reader of medical horror material, I went digging around to try to understand where the one to two servings limit was actually coming from, since it's cited everywhere but typically doesn't actually come with hard numbers about how much mercury is the actual recommended dose limit (unlike, say, caffeine, where the recommendations almost always come with milligram dose numbers). It turns out, though, that with a little bit of Googling you can actually get hard per-species numbers directly from the FDA. Taking canned tuna as a reference point (recommended 1 serving per week, 0.128 mean ppm), it quickly becomes obvious that the species lumped together into "two servings per week" vary wildly in their typical mercury load. Herring is clearly at the right level (0.084 ppm), as is mackerel (0.050 ppm for Atlantic), but you can safely eat an order of magnitude more sardines or tilapia (both 0.013 ppm) and scallops until you're sick (0.003 ppm). Moreover, since mercury is an accumulative toxin that is flushed out of your system over a long period of time, it's the mean rate of consumption that matters rather than the particular time period (again, unlike caffeine). That means that if you've had a week where you didn't eat fish, you can eat twice as much the next week with little worry. So science showed us a clear way out of the dilemma: we just wrote down a list of all the seafood where appetite was a bigger limiter than mercury, taped the list to the fridge, and had our seafood without fear.
Science to the rescue, needing just a little math.
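For the curious, the "little math" can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not medical advice: it uses the ppm figures quoted above, takes canned tuna at one serving per week as the reference, and assumes (my simplification) that all servings are the same size and that mercury intake scales linearly with ppm. The names `MERCURY_PPM` and `equivalent_servings_per_week` are made up for this sketch.

```python
# Mean mercury concentrations (ppm) as quoted in the post above.
MERCURY_PPM = {
    "canned tuna": 0.128,
    "herring": 0.084,
    "mackerel (Atlantic)": 0.050,
    "sardines": 0.013,
    "tilapia": 0.013,
    "scallops": 0.003,
}

# Reference point from the post: canned tuna at 1 serving per week.
REFERENCE_SPECIES = "canned tuna"
REFERENCE_SERVINGS_PER_WEEK = 1.0


def equivalent_servings_per_week(species: str) -> float:
    """Servings per week carrying the same mercury as the tuna reference.

    Assumes equal serving sizes and intake proportional to ppm.
    """
    budget = MERCURY_PPM[REFERENCE_SPECIES] * REFERENCE_SERVINGS_PER_WEEK
    return budget / MERCURY_PPM[species]


if __name__ == "__main__":
    # List species from highest to lowest mercury concentration.
    for name in sorted(MERCURY_PPM, key=MERCURY_PPM.get, reverse=True):
        servings = equivalent_servings_per_week(name)
        print(f"{name:20s} {servings:5.1f} servings/week")
```

Under those assumptions, herring comes out at roughly 1.5 servings per week and Atlantic mackerel at about 2.6, while sardines and tilapia land near 10 and scallops above 40, which is where the fridge list comes from. Because it's the mean rate that matters, the same budget can be averaged over several weeks.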

I give that example to show how thinking scientifically can help us sort through the blizzard of information that makes up normal daily life. The science in that story is not about how the FDA got those numbers to put on its website, but the fact that we realized we didn't know why "two servings" was given as the magic number, went and found a reliable source of information (the FDA), and then solved our real problem ("What should we cook for dinner?") by turning that complex information into a simple list on the fridge. So, in the putative words of Socrates, "[I am wise] because I do not fancy I know what I do not know," or to quote another more recent philosopher: Science. It works, bitches.

Coming back to the original point...
Do I care if my daughter becomes a Scientist?
Not in the least. But I care deeply that she understands the process of science, that it's something for her as natural as breathing and as basic as talking.

What exactly that translates to in terms of actionable policy recommendations for parenting a particular sample size of one (viz: Harriet) is a subject of ongoing study, but at least I know what I'm trying to do as a parent...

Monday, September 17, 2012

A modest proposal for reviewers

Peer review is a necessary evil in the life of every scientist. On the one hand, pretty much every meaningful paper you ever publish will go to a bunch of peer reviewers, including the infamous Reviewer #3, who always suggests more experiments. On the other hand, a significant chunk of your professional service will be reviewing on program committees, reading some good papers and a lot of others that are painful messes where the authors clearly need to do more experiments. On the third hand, sometimes you'll find yourself wrangling reviewers yourself, and trying to get the damned procrastinators to actually turn in their reviews so you can let the authors know whether their paper is being blessed with publication or cursed with rejection or a request for more experiments.

Journal papers are particularly bad in this regard, since there's no particular schedule on which the paper has to be accepted or rejected, and sometimes a paper can languish for more than a year in limbo, unable to be cited or even submitted elsewhere. And what's happening during that time? Well, from my own experiences wrangling reviewers, half the time the editor is waiting to see whether the reviewers will actually do the reviews they promised or not. See, as a reviewer you get told, "We'd like to have you review this paper, and you've got six weeks to do it," or some similarly long time.

Six weeks? No problem! There's got to be a time in the next six weeks when you'll be able to read this paper... and then other projects and deadlines intervene, and the time slips away, and you end up at the end of six weeks trying to find a time to actually give the paper its fair shot. I'm pretty faithful about turning in on time, but some people definitely aren't. So when I'm acting as an editor or program chair, I spend a lot of time cajoling or tearing my hair and trying to get somebody else to review at the last moment when a reviewer fails. And as an author waiting for a response, I'm always wondering whether the reviewers are doing anything or not...

So here's my modest proposal for fixing peer review timing: if the reviewers are going to review at the last moment, why not bring that last moment much closer? Why don't we give reviewers only a single week, no matter how massive a paper they're going to review? Then the process of negotiation back and forth can start much earlier and we can toss out the reviewers who aren't going to review much faster. Everybody wins: authors get responses quickly, editors get their reviews back faster, and it will even lower the load on reviewers, since the editor no longer needs to recruit extra reviewers in case some fail.

So, dear reader, would you be in favor of such a fast-tracked world?

Thursday, September 13, 2012

Au Revoir, SASO

Today is the last day of SASO, the IEEE International Conference on Self-Adaptive and Self-Organizing systems. There are more workshops tomorrow, but I've been away long enough and I'm hopping on an early morning plane to go home to my wife and daughter.

It's been a good conference, not just for my personal aggrandizement as a scientist, but also for me to learn things and have good conversations with colleagues. Overall, I had about a 30% hit rate on talks I was interested in---pretty high for a conference---and I'm taking away a couple of things I need to look into more, and some possible collaborations to continue.

Right at the end, I had the privilege to sit on this conference's panel discussion, which focused on the topic of "New Research Directions." My own slides were a subject of much discussion and debate, as they challenged people to spend more time focusing on the refinement of SASO material into reliable and reusable engineering building blocks.

The bit from the whole discussion that sticks best in my mind, however, was Mark Jelasity's declaration that SASO is "a place to send your rejected papers---but only the odd ones, not the bad ones." I found that an apt description, and based on the discussion, I think a lot of other people did too. The same ideas were reflected somewhat in my own slides---SASO, I find, is a place filled with people wrestling with exceedingly difficult problems, and looking outside of their own domains to find solutions. It's hard and slow and produces a lot of false starts, but damn it's an interesting breed of science.

It's been a good conference, and now it's time to go home. You may now expect this blog to go back to its normal weekly posting schedule.

From Lyon, good night.

Wednesday, September 12, 2012

Yesterday WAS a Good Day to Demo!

Just back from the SASO conference banquet, at which awards were handed out... and we won one! Our Proto demo of self-stabilizing robot team formation won the "Best Demonstration" award. We were cited, among other things, for being:

simple and easy to understand,
an excellent example of self-adaptation and self-organization, and
freely available for anybody to download and play with themselves.

Major kudos to Jeff Cleveland and Kyle Usbeck for the work they contributed to building the demo and also to ensuring we had a nice webpage and movie to show it off to best advantage. Go and check it out for yourself!

Fast Demand Response: ColorPower 2.0

I'm quite happy to take the cap off of this paper: at long last, the ColorPower 2.0 paper, Fast Precise Distributed Control for Energy Demand Management, is officially published, and I can put it up as well. The pictures have been up before, in these two posts, and now you can learn all the key ideas about how we're doing our distributed energy management.

This builds on the prior work from Vinayak Ranade's thesis and paper in SASO 2010, where we showed that fast distributed control of energy demand was possible. The controller we used in that paper was terrible, though, and we acknowledged that right there in the paper---it just wasn't the focus then, and we hadn't had a chance to study that aspect of the problem well.

Over the last year, however, first my colleague Jeff Berliner at BBN figured out the right representation for understanding the control problem, and then I was able to turn that into an algorithm to actually do the control correctly. Together, and with the help of Kevin Hunter, we refined it into the shining gem presented in this paper: the ColorPower 2.0 algorithm (can you tell I'm excited about it?). We simulated it at all sorts of scales, with all sorts of problems, and it always stands up well---and better, matches our theoretical predictions nicely too. Plus the experiments produce beautiful looking figures like these, showing the convergence and quiescence times of the algorithm for abrupt changes of target:

The bottom line: we've got a system that should be able to shape the energy consumption of millions of consumer devices in only a few dozen seconds. Now we just have to get it out of the lab and into the field. Come talk to us if you want to use it, though, since it's also protected by patents...

Tuesday, September 11, 2012

Today is a Good Day to Demo

One of the things I always feel indebted to Jonathan Bachrach for is how pretty Proto looks. When we were first developing the language together, he was the one who hacked together the original simulator with OpenGL, based on the previous work he'd done with multimedia processing languages. So Proto's simulator got built by somebody to whom appearance really mattered, and with an artist's touch and attention to detail. And so, to me at least, the simulations we make look gorgeous, and I just love playing with them.

Well, today I got to show off my toys in the SASO demo session. This spring, I put together a set of self-stabilizing algorithms for robot team formation. Kyle Usbeck and Jeff Cleveland then helped turn these into a nice demonstration---well, it was a contest entry originally, but SASO didn't get enough entries, so they just rolled us into the demo session. In any case, the robots form up into little "snakes" for each team, and go crawling randomly around in 2D or 3D in wonderfully distracting colorful patterns.

Kyle and Jeff made a nice movie showing off the algorithms, and how they're resilient to pretty much any way you can think of to break them---adding robots, destroying robots, moving robots, changing goals and communication properties, etc.:

The upshot of all of this today is that I got to talk myself hoarse in front of a projector for two hours while lots of folks enjoyed the beauty of our simulation, and hopefully even got to understand a bit about Proto and the continuous space abstractions that made it all so easy to do.

If you want to play with this stuff too, feel free: you can read about it and download it all here.

Monday, September 10, 2012

How resilient is it anyway?

As engineers and scientists, we worry a lot about how well the things we build hold up. Anything that goes out into the real world will suffer all sorts of buffets from unexpected interactions with its environment, strange behaviors by its users, idiosyncratic failures of components, and myriad other differences between theory and reality. So we care a lot about knowing how resilient a system is, but don't currently have any particularly good way of measuring it.

Oh, there's lots of ways to measure resilience in particular aspects of particular systems. Like if I'm building a phone network, I might want to know how frequently a call fails---either by getting dropped or failing to connect in the first place. I might also measure how call failures increase when there are too many people in one place (like a soccer match) or when atmospheric conditions degrade (like a thunderstorm) or when a phone goes haywire and starts broadcasting all the time.

But these sorts of measures leave a lot to be desired, since they only look at particular aspects of a system's behavior and don't have anything to say about what happens when we link systems together to form a bigger system. That's why I'm interested in generic ways to measure the resilience of a system. My hope is that if we can design highly resilient components, then when they're connected together to form bigger components, we will be more easily able to ensure that those larger components are resilient as well.

Even better is if we can get compositional proofs, so that we know that certain types of composition are guaranteed to produce resilient systems---just as there are compositions of linear systems that produce linear systems and digital systems that produce digital systems, etc. This is the type of foundation that lays the groundwork for explosions in the complexity and variety of artifacts that we can engineer, just like we've seen previously in digital computers or clockwork mechanical systems. I want to see the same thing happen for systems that live in more open worlds, so that we can have an infrastructure for our civilization that helps to maintain itself and that can tolerate more of the insults that we crazy humans throw at it.

But first, small and humble steps. In order to be able to even formulate these problems of resilience sanely, we need to better quantify what this "resilience" thing might mean. In my paper at the Evaluation for SASO workshop at IEEE SASO, I take a crack at the problem, proposing a way to quantify "graceful degradation" using dimensionless numbers. The notion of graceful degradation is an important one for understanding resilience, because it gets at the notion of margins of error in the operation of a system. When you push a system that degrades gracefully, you start seeing problems in its behavior long before it collapses. For example, on an overloaded internet connection that shows graceful degradation, things start going slower and slower, rather than going directly from fast communication to none at all.

In my paper, I propose that we can measure how gracefully a system degrades in a relatively simple manner. Consider the space formed by all the parameters describing the structure of a system and of the environment in which it operates. We break that space into three parts: the acceptable region where things are going well, the failing region where things have collapsed entirely, and the degraded region in between.

If we draw a line slicing through this space, then we get a sequence of intervals of acceptable, degraded, and failing behavior. We can then compare the length of the acceptable intervals and the degraded intervals on their borders. The longer the degraded intervals that separate acceptable and failing intervals, the better the system is. So in order to know the weakest point of a system, we just look for the lowest ratio between degraded and acceptable on any line through the space.
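To make the idea concrete, here is a minimal sketch of the computation along a single line, under some simplifying assumptions of my own for this illustration: the line is sampled at evenly spaced points, each sample is already labeled "A" (acceptable), "D" (degraded), or "F" (failing), and an acceptable interval that touches a failing interval directly counts as a ratio of zero. The function names are hypothetical, and this is not the exact formulation from the paper.

```python
from itertools import groupby


def intervals(labels, step=1.0):
    """Run-length encode per-sample labels into (label, length) intervals."""
    return [(label, sum(1 for _ in run) * step) for label, run in groupby(labels)]


def weakest_ratio(labels, step=1.0):
    """Lowest degraded/acceptable length ratio along one sampled line.

    Assumption for this sketch: an acceptable interval that borders a
    failing interval directly has no degraded margin, so it scores 0.
    Returns None if no acceptable interval has a neighbor on the line.
    """
    runs = intervals(labels, step)
    ratios = []
    for i, (label, length) in enumerate(runs):
        if label != "A":
            continue
        # Look at the interval on each side of this acceptable interval.
        for j in (i - 1, i + 1):
            if 0 <= j < len(runs):
                neighbor, neighbor_len = runs[j]
                ratios.append(neighbor_len / length if neighbor == "D" else 0.0)
    return min(ratios) if ratios else None


# Eight acceptable samples, then four degraded, then failure.
print(weakest_ratio("AAAAAAAADDDDFF"))
# Acceptable behavior that drops straight off a cliff.
print(weakest_ratio("AAAAFF"))
```

To get a number for the whole system, you would repeat this over many lines through the parameter space and keep the minimum; how to choose those lines well is one of the open questions.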

What this metric really tells us is how painful the tradeoff is between speed of adaptation and safety of adaptation. The lower the number, the easier it is for changes to drive the system into failure before it can effectively react, or for the system to accidentally drive itself off the cliff. The higher the number, the more there is a margin for error.

So, here's a start. There are scads of open questions about how to apply this metric, how to understand what it's telling us, etc., but it may be a good point to start from, since it can pull out the weak points of a system and tell us what they are...

Sunday, September 09, 2012

Enter SASO 2012

This week is the 6th annual SASO: the IEEE Conference on Self-Adaptive and Self-Organizing Systems. If any conference is my "home conference" at the moment, this is probably it. The attendees of this moderate size (~100 people) single track conference tend to be all over the map in terms of interests and applications, but there is one thing that clearly unites us: a dissatisfaction with the brittleness of ordinary complex systems engineering, and a desire to address it by making the systems smarter in some way.

It makes for a very diverse and rather messy conference, with a lot more proof-of-concept and early work than finished systems or grand results. There's also a lot of folks out there in the greater scientific world with really flaky ideas about how to go about this type of resilient engineering---lots of magical thinking of the form "it smells kinda like Nature, and Nature is awesome, so it must be awesome too!" The conference has tightened itself up quite a bit over time, though, and by now the quality of the papers is generally pretty good. My Ph.D. advisor, Gerry Sussman, once told me that he judges the quality of a conference by how long he remembers something that he learned there, and in that sense SASO does quite well for me.

I'm much looking forward to SASO this year, the more so because I'm going to be quite busy talking about interesting things. In particular, the highlights of the conference for me are:

Monday, I will be talking about metrics for graceful degradation in the Workshop on Evaluation of SASO Systems.
Tuesday, I will be chairing a session in the morning and giving a demo of our self-stabilizing robot team formation algorithms in the afternoon.
Wednesday, I will be presenting my work on the ColorPower algorithm for distributed energy demand management.
Finally, on Thursday, I'll be sitting on a panel on New Research Directions, talking about my views on the need for nature-inspired systems to move beyond one-off applications and toward the extraction of principles and laws.

And then there's colleagues to catch up with and other talks to see... it will be a busy week, and also my first time away from home without Harriet since she was born. But fear not, dear reader, I shall salve my wounds and fatigue by writing some extra blog posts about all of the cool things I get up to this week.

Monday, September 03, 2012

Pretty BioCompiler GRN Diagrams

One of the pieces of work I'm rather proud of is my Proto BioCompiler. Back in 2008, as I was hanging around at the synthetic biology lunches at MIT, I realized that there was a nice tight mapping between the genetic regulatory network diagrams that the biologists were drawing and the dataflow computation graphs that I was using to express the semantics of Proto. Basically, in biology you can represent a computation with a bunch of reactions evolving in parallel over continuous time, with the products of one reaction being used as the inputs for others. Similarly, in Proto my model of computing had a bunch of operations evolving in parallel over continuous space and time, with the outputs of one operation being used as the inputs for others.

So after giving a couple of exploratory talks in the lunch meetings, I wrote up the ideas in this paper in the first spatial computing workshop, and then went ahead and actually built the thing a couple years later. We've been refining it ever since, and the lovely thing about the BioCompiler is that it takes all of the guesswork out of designing biological computations. Most of the really effective genetic regulations that we can engineer with right now are repression, which means that designs are typically implemented in negative logic. For example, "A and B" might be implemented as "not either not A or not B", which is just much harder for us poor humans to think about, and means that, for me at least, anything over a few regulatory elements is almost certain to contain mistakes in the first design. When you're dealing with biological experiments, where it can be weeks from design to the very first test, you really don't want to get it wrong the first time.
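For the curious, the negative-logic trick can be sketched in a few lines. This is only an illustration of the Boolean identity, assuming an idealized repressor that behaves like a NOR gate (output on only when no repressor is present); the function names are made up for this example and have nothing to do with the BioCompiler's internals.

```python
def repress(*inputs):
    """Idealized repression: output is on only when no repressor is present (NOR)."""
    return not any(inputs)


def and_gate(a, b):
    """'A and B' built only from repression: not (either not A or not B)."""
    not_a = repress(a)
    not_b = repress(b)
    return repress(not_a, not_b)


# Check the construction against ordinary positive logic.
for a in (False, True):
    for b in (False, True):
        assert and_gate(a, b) == (a and b)
```

Even this tiny case takes three repression stages to express one "and", which gives a feel for how quickly hand-built designs get hard to follow.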

So the BioCompiler lets us just sidestep that whole mess: you write the program in normal red-blooded American positive logic, and it turns it all inside out on its own and then optimizes. Won't work for every program, but for the range that it knows how to work in, it's beautiful. In an instant, out of the BioCompiler comes the new design and also a simulation file for testing it in silico.

The only problem is that the output of the BioCompiler looks like this:

BioCompiler supposedly "human readable" output for an XOR circuit

or, worse, like this:

Just the top bit of the machine-readable XML generated by BioCompiler for an SR-latch circuit

Bleah. Sure, there's a genetic regulatory network buried in there, but to actually communicate it to another human we need to turn that soup into a diagram, and that's not only a pain but another really good way to introduce errors into the works.

No more. I just recently sat down with the manual for GraphViz and figured out how to build complicated custom nodes, with which a diagram can be created. Then I hacked a new output option into the BioCompiler, for creating GraphViz diagrams. The result is still a bit rough on the artistic front, but it is intelligible and can be turned into an SVG file for editing. Booyah!

Diagram of a circuit for computing ordering relations
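To give a flavor of what GraphViz input for a regulatory network can look like, here is a small sketch that emits DOT text from Python. It is not the BioCompiler's actual output format: the function name, the two-field record nodes (promoter and gene), and the example parts are all assumptions chosen for illustration. It relies only on standard GraphViz features: record-shaped nodes with named ports, and the "tee" arrowhead, which gives the flat-headed arrow conventionally used for repression.

```python
def grn_to_dot(units, repressions):
    """Build DOT text for a small genetic regulatory network.

    units: dict mapping a unit name to (promoter, gene) labels.
    repressions: list of (source_unit, target_unit) pairs, meaning the
        source's gene product represses the target's promoter.
    """
    lines = ["digraph grn {", "  rankdir=LR;", "  node [shape=record];"]
    for name, (promoter, gene) in units.items():
        # Each record node has two ports: <p> for the promoter, <g> for the gene.
        lines.append(f'  "{name}" [label="<p> {promoter} | <g> {gene}"];')
    for source, target in repressions:
        # A tee arrowhead draws the flat bar used for repression.
        lines.append(f'  "{source}":g -> "{target}":p [arrowhead=tee];')
    lines.append("}")
    return "\n".join(lines)


# Illustrative two-unit network: the first unit's product represses the second.
dot_text = grn_to_dot(
    {"u1": ("pTet", "LacI"), "u2": ("pLac", "GFP")},
    [("u1", "u2")],
)
print(dot_text)
```

Saving the printed text to a file such as grn.dot and running `dot -Tsvg grn.dot -o grn.svg` should produce an editable SVG.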

Monday, August 27, 2012

In Which Jake Questions the Free Market

One of the things that makes my work with Zome different from a lot of other approaches to modulating energy demand is that we aren't trying to solve the problem by using price-signalling or any other sort of market-based approach.

Market-based approaches are the dominant line of thinking for how to control demand, and when you are first approaching the problem, the reasoning behind it makes sense:

A lot of things in our society manage supply/demand relations pretty well by markets, where changes in supply lead to changes in price that lead to change in demand, until equilibrium is reached. This is total Econ 101 material.
At the macro-scale, power in the US grid is pretty much entirely managed by markets, and that works pretty well (if you ignore the whole Enron thing).
In pilot studies, you give people information about power prices (e.g., through a red light that glows when the price is high) and they moderate their demand, or you give the same information to appliance controllers (e.g., thermostats with a budget and a cooling goal) and they shift their use accordingly.

I don't trust it though. You see, I just think that price is a pretty lousy and impoverished signal, and there's no reason to restrict ourselves to that when we're running an algorithm on computers. If you commit to using a market-based approach, you're throwing out pretty much the whole world of distributed algorithms and other engineered self-organization approaches. And most engineered systems don't use markets for good reason---they're a lot of complex hassle to get right, give you lots of ways to shoot yourself in the foot with unexpected emergent effects, and really just beg for exploitation if you get any real money involved.

If you assume you have to solve the problem with a market, you probably can (I'm pretty sure market-based approaches are Turing complete, if you allow complex enough structures and derivatives), but you might have to twist the problem into a pretzel to do so.

So I tend instead to think the right way to approach a complex distributed control problem like energy demand shaping is to start by solving the distributed control problem, and then figure out how to match it well with external incentives. Maybe the answer will be a market, but usually it won't, just because markets are only one tiny corner of a really big design space.

And I might be on the way to saying something much more definitive about it. Just recently, when looking at how the Zome approach compares to price-signalling approaches, I noticed that it's really easy for a distributed demand response market to end up completely failing and always being very far from equilibrium. And this result just might be more general...

Monday, August 20, 2012

Making the cover of ACS Synthetic Biology

Just a brief note today... the current issue of ACS Synthetic Biology, which includes our papers on our TASBE tool-chain for designing organisms and our MatchMaker algorithm for selecting genetic regulatory elements, has the following lovely cover image and blurb:

This cover depicts bio-design automation’s transformative effect on synthetic biology. DNA design increasingly will be a computational effort culminating in software “toolchains” which cover the specification, design, and assembly of novel biological systems.

Dear reader, I am proud, because that image is an illustration inspired by our TASBE tool-chain paper. That is all.

Monday, August 13, 2012

Publish and Perish

Dear reader, if I can beg your indulgence for a little while, I'd like to do some philosophical maundering for a bit. Before my paternity leave, I'd been through a hard stretch of "publish and perish" recently (we had a bit of a perfect storm of anticipated deadlines, unanticipated deadlines, and requests for revision) and so the nature of publication and science has been much upon my mind.

One of the standard maxims of science that every scientist knows is "publish or perish"---if you don't get your ideas out there, you can't have an impact. Of course, there's all the other meanings of that statement as well, having to do with career and funding and all that, but I personally tend to look at it through the filter of the impact of my ideas.

Here, I find the philosophy of Bruno Latour compelling (though I disagree strongly with him in certain other areas): in their anthropology of science, he and Woolgar present a "cycle of credit" that summarizes the scientific world as an economic enterprise. Scientists then act as investors, building research capital by transforming one type of scientific resource into another.

The cycle (filtered through my interpretation and mutation-inserting memory) is roughly:

Position and publications give a reputation that can be used to secure funding. Funding allows the production of scientific data. Data in turn supports the production of publications, and so on around the cycle, with established researchers likely to have significant research capital moving through all stages at the same time.

Where, you may ask, are ideas in there? Why at every link! That's the broader job of the scientist. The standard image of science being merely about production of data and testing of hypotheses is only part of the picture, though a cornerstone on which the whole enterprise rests. Drop any of the other ingredients out as well, though, and the whole enterprise founders.

I know that some might find this "economic" view of science crass and unsavorily political. I don't think so, though, because I think about science as being not just about knowledge, but about knowledge that matters in some way. Oh, it doesn't necessarily have to matter any time soon, and it doesn't necessarily need to be practical---some things are worth knowing simply because they shape our understanding of the universe that we live in (an aside: one of my undergraduate degrees is in theoretical mathematics. I always assumed that things like abstract algebra, topology, and measure theory would be purely useless in the "real world," but enjoyed them simply for the sheer power they gave over the abstract world. To my great surprise, I now rely on pretty much all of the theoretical mathematics I ever learned.). So when I look at the cycle of scientific credit, I see each of these steps as marking the motion of ideas outward. Data is how you know your ideas are meaningful, publications are how you spread them to others where they will have an impact, and position and funding are amplifiers that the world gives you as it starts believing your ideas are worthwhile.

What I do notice is missing from Latour's cycle, however, is a clear location for professional service, like organizing or public outreach or teaching. For myself, at least, that's been a very important part of keeping forward momentum, as well as something I think is really important if you take the "impact of knowledge" view of science that I do. Service doesn't really fit the diagram neatly: in my experience, it tends to stem from publications, funding, and itself, and it feeds into all of these by indirect means. But I think it doesn't really fit because it's largely a different type of reputation---but, well, any model only takes you so far.

So: publish or perish. True, I suppose, but I prefer it when it's for the right reason. Not for fear of perishing, but because I've got things that I have learned that I need to communicate and an audience I want to communicate them to. And that, I suppose, is why I'm not snobbish when it comes to Impact Factor of my publication venues. There are places I publish because there are people I want to talk to, others where I publish because I want to broadcast an idea, and others simply because I want to archive a key (but possibly obscure) result for later reference. Only for broadcasting an idea does impact factor really matter---the others are all about getting a strong enough peer validation and placing the ideas where the right group of people will be able to easily find them. Publish or perish? We'll see what happens, but I'm content as long as I'm neither hiding nor overselling the things that I accomplish.