Wednesday, August 31, 2016

Biological Foundries

While in the UK, I had a gap in my schedule at IWBDA, and so I went to Edinburgh.  It was a lovely train ride up the coast from Newcastle, and for the first time in my life I had an opportunity to visit Scotland, birthplace of my McDonald and Houston ancestors.  Getting off the train at Edinburgh station, I was immediately struck by the remarkable degree to which the city is a center of culture, from the omnipresent Robert Burns quotes in the station to the theaters all about, not to mention the burgeoning festivals just beginning their month of explosion through the streets.  It was a fine, warm, sunny day, and I immediately set off walking across the town toward my main business: visiting colleagues.

One of my stops that day was the Edinburgh Genome Foundry, one of only a few such centers in the world. At present, I am aware of only eight: five in the US (Ginkgo Bioworks, Amyris, Zymergen, MIT/Broad, and Urbana-Champaign), two in the UK (Imperial and Edinburgh), and one at the National University of Singapore.  I may well be missing some, and others may be getting founded even as I write (the UK and Singapore foundries are just getting off the ground), but the point is there's a small but growing number of such centers, both in industry and academia.

All of these foundries are aimed at much the same basic goal: to greatly increase the rate at which complex genetic materials can be engineered and tested. The focus differs somewhat from foundry to foundry: some are more focused on assembly, while others are more focused on information processing circuits, and yet others on chemical synthesis.  And of course, as each is its own unique cutting-edge experiment, the particulars of how each is set up are quite different. At heart, however, every foundry is the same: essentially a robot-assisted machine shop for genes, formed of a number of stations of automated lab equipment, fluid-handling robots, and some combination of industrial manipulator arms and lab technicians to move materials from station to station.

Biological foundries differ from other laboratories incorporating automation in that they have much more flexibility, and with that flexibility comes a higher order of challenge in organizing the informational side of the foundries, and thus my interest.  In order to make a foundry work well, you need to have some sort of explicit representation of the genetic constructs that you are aiming to manipulate, the biological processes that you hope to create and affect by means of those constructs, and the various protocols and assays that you intend to perform in order to manufacture and evaluate them.  Much of that is well represented by SBOL, and any foundry that doesn't choose SBOL will likely end up recapitulating its development, so I am hoping we can ensure that all of the foundries adopt it (it is already being used within at least some). Beyond that, I hold that it will also be important for the foundries to adopt good unit calibration in their assays, and to consume that data in model-driven software design tools, in order to avoid some of the past tragedies of large biological characterization projects that produced largely non-reusable data.

Moreover, in a world with many biological foundries, I suspect that ultimately the ones that will have the largest impact will be those that open their processes and data and ensure that their works are recorded in good interchangeable standards.  Some are commercial concerns, of course, and that will limit their ability to share results out, but if those at least are able to take standardized information in, it will no doubt help them in the marketplace as well.  For biological foundries, like everything else, we no longer live in a world where isolated “moon shot” projects are a particularly competitive way to pursue either science or commerce, and I hope that their operators are well able to come to grips with this reality.

Monday, August 29, 2016

Waiting for data is the scariest part

Two years ago, in partnership with the iGEM foundation and some excellent colleagues, I ran the largest inter-laboratory study ever conducted in synthetic biology, in which forty-five teams participated to create an international baseline for fluorescence measurement, synthetic biology's favorite debugging and reporting tool. Last year, we ran the world's largest synthetic biology inter-laboratory study, in which eighty-five teams contributed to help figure out that the calls were coming from inside the house: the most critical source of error in measuring fluorescence is problems with how people use their instruments and handle their data.

DIY fluorimeter tested by Aix-Marseille 2015 interlab team

This year we're once again running the world's largest synthetic biology inter-laboratory study, this time with ninety-one teams, who are helping us test experimental protocols aimed at fixing the problem: we distributed calibration materials and calculation spreadsheets that should let everybody measure their systems in the same, directly comparable units.  This might not sound like a big deal, but imagine how confusing life would be if you measured with a ruler marked in centimeters while other people were using mils, rods, chains, furlongs, and leagues, with nobody realizing that the markings were different.  The world of fluorescence is that bad and worse right now: last year, the numbers we received from different teams measuring the same genetic system varied by a factor of more than one trillion.  Given all that, it's frankly remarkable how much precision the teams were able to achieve in the ratios between units.  If we can get everybody using centimeters (metaphorically), it should let things become much better still, since people will be able to compare experiments directly and figure out much earlier when something's gone wrong and needs to be debugged and redone.
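For the curious, here's a rough sketch of the kind of arithmetic the calibration enables, written in Python with a hypothetical calibrant and made-up numbers; the actual materials, protocol, and spreadsheet formulas we distributed differ in their details.

```python
"""Toy illustration of unit calibration for fluorescence readings.

A hypothetical calibrant of known concentration is serially diluted and
measured on the same instrument and settings as the samples; a linear fit
of arbitrary units (a.u.) against concentration gives a conversion factor,
which turns raw sample readings into calibrant-equivalent units that can be
compared across labs.  Names and numbers here are illustrative only.
"""
import numpy as np

# Hypothetical calibrant dilution series: known concentrations (in uM)
# and the arbitrary-unit readings our plate reader gave for them.
calibrant_conc_uM = np.array([50.0, 25.0, 12.5, 6.25, 3.125, 0.0])
calibrant_au = np.array([41200.0, 20800.0, 10300.0, 5200.0, 2650.0, 90.0])

# Subtract the blank (0 uM) and fit a.u. per uM with a line through the origin.
blank = calibrant_au[calibrant_conc_uM == 0.0].mean()
signal = calibrant_au - blank
au_per_uM = np.sum(signal * calibrant_conc_uM) / np.sum(calibrant_conc_uM ** 2)

def to_calibrated_units(raw_au):
    """Convert raw arbitrary-unit readings to calibrant-equivalent uM."""
    return (np.asarray(raw_au) - blank) / au_per_uM

# Two labs with different gain settings report wildly different raw numbers,
# but after each converts with its own calibration curve the values agree.
print(to_calibrated_units([15000.0, 31000.0]))
```

The point of the exercise is that each lab computes its own conversion factor on its own instrument, after which everybody's numbers live on the same scale.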

Right now, though, the project is in the middle of its most scary and exciting phase.  The teams have got the protocol and the materials, and we've debugged as many of the problems in its design as we could (next year: corrected spreadsheets, better tube stoppers, and a giant red warning sticker that says "Don't freeze the LUDOX!").  At this point, there is not much more that I can do to positively affect the results of our experiment: just take the data from the teams as it arrives, process it, and stare in nervous excitement and concern at the evolving numbers.

Running the iGEM interlab is awesome and scary, a big responsibility: these folks are investing their time, resources, and trust in us as organizers of the study, and we have a responsibility not to waste their time and to ensure that what we do is both good science and good education. So far it has gone well, and the preliminary results are looking good, but there's a lot of data still out there being gathered and anything can happen.

I’m excited and scared, and I love these young people for making such a grand effort possible, for seizing the opportunity and understanding what an important thing this is and how much of a difference their work can make.  Some have run into obstacles and had to withdraw, or have turned in broken or patchy data, and I tell them how much their contribution matters too, because it tells us how things go wrong and what needs to be improved in order to get everybody good rulers for their work. Young men and women, in every corner of the world, all doing their part to contribute one more brick, small but significant, to the foundations of science and society.

I am honored to have the privilege to lead an effort like this, and I'll be on tenterhooks until we know how well it has succeeded.

Friday, August 26, 2016

SBOL at IWBDA 2016

At the start of this year's International Workshop on Bio-Design Automation, we ran a workshop to teach people about SBOL, the Synthetic Biology Open Language.  As you may remember from previous things that I have written, SBOL is a way of describing biological designs: beyond the "marked-up DNA" model supported by older standards like GenBank, SBOL supports the needs of engineering by allowing descriptions of how subsystems are stitched together into a design, as well as the behavior of these components as they interact.  In short, it's a good rendezvous point for tying together all of the different aspects of a biological design.
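For readers who haven't met SBOL before, here is a deliberately oversimplified sketch, in plain Python rather than the real SBOL data model or libraries, of the kind of structure it captures: parts with roles, hierarchical composition into devices, and interactions describing behavior.

```python
"""A much-simplified sketch of the shape of information SBOL captures, using
plain Python classes rather than the actual SBOL libraries or schema.  Real
SBOL carries much more (sequences, provenance, models, RDF serialization);
this is only to show how structure goes beyond marked-up DNA.
"""
from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    name: str
    role: str                       # e.g. "promoter", "CDS", "terminator", "protein"
    subcomponents: List["Component"] = field(default_factory=list)

@dataclass
class Interaction:
    kind: str                       # e.g. "production", "repression"
    participants: List[Component] = field(default_factory=list)

# A toy repressor device: a promoter driving a repressor coding sequence.
ptet = Component("pTet", "promoter")
tetr_cds = Component("TetR_cds", "CDS")
term = Component("term1", "terminator")
device = Component("TetR_device", "engineered_region",
                   subcomponents=[ptet, tetr_cds, term])

# Behavior: the CDS produces a protein, which represses the promoter driving it.
tetr_protein = Component("TetR", "protein")
production = Interaction("production", [tetr_cds, tetr_protein])
repression = Interaction("repression", [tetr_protein, ptet])

print(device, production, repression, sep="\n")
```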

Unlike some of the education events we've done in the past, this one was focused on developers rather than users.  So after a brief introduction, we got deep down into the weeds of the data model and the code libraries.  There were several dozen people there, mostly from across the UK and Europe, and by the end of the afternoon folks had generally managed to get the libraries and demonstration projects running effectively on their machines and had started putting together some complex representations with code.  Some will probably even be able to bring this back effectively to their universities and companies and start making use of it in their own projects.  All told, a nice success, and building on it, we're hoping to run the tutorial in more places, as well as to supply the materials to anybody else who wants to run one.

Challenge problem from the SBOL tutorial: representing a CRISPR repressor system.

SBOL also got some nice mentions in the main track of the workshop: in addition to some “core SBOL” talks, I was very pleased to see it mentioned in a large number of the other talks from the whole community, either as something they’d already taken advantage of in their work, or else something they were aiming to integrate with in the near future.

Standardization is not simple, easy or glamorous.  Bit by bit, however, we are laying the groundwork for a nicely integrated world of biological engineering tools.

Wednesday, August 24, 2016

Going with the Flow in Replicons

RNA replicons are an exciting new platform for synthetic biology.  Basically, you take a virus and you remove the part that makes it infectious, replacing it with whatever synthetic system you want to put into the cells instead.  You leave the part that replicates inside the cell, though, so that even if only one or two pieces of RNA get into the cell, they'll self-amplify and express your system really strongly.  It's great because you can get expression as strong as or stronger than you'd get from editing your system into the cell's DNA, but it's safer to use because it doesn't get into the DNA.

We've already shown that you can precisely engineer unregulated expression from replicons.  Regulating expression, to actually control which parts of your system are running when, is trickier.  People have been able to get repressors like L7Ae working on replicons, but they are much less effective than on DNA, even though they're being expressed more strongly.  This is a problem if you want to make controlled replicon systems, and it seems like one that really ought to be solvable.

Digging into this problem, I found that, indeed, a system set up to work on DNA is likely to work poorly on a replicon, because it ends up fighting against the replicon's natural dynamics.  Turn that around and go with the flow, however, and it looks like it should be possible to get even better performance on a replicon than you can on DNA.

In my talk at IWBDA this year, I showed my conclusions about why things go wrong when you just go straight from DNA to replicon: since everything is amplifying exponentially as the RNA replicates, low levels of expression get raised a lot higher, making the system “leaky,” and outputs rise before there’s enough regulator to shut them down.  Big problem.
Model for L7Ae repression of mVenus from replicon, providing an explanation for observed poor performance.
Once you've identified the problem, though, some good paths toward fixing it appear as well.  In this case, it turns out that decreasing the dose of repressor and increasing the rate at which the output decays look like they should radically increase the efficacy of repression.  Will they really do so?  We won't know until it's tried out in the lab, but the model's based on things that have worked before and there's a nice broad area of high performance to shoot for, so I'm awfully hopeful this will work.
According to the replicon repression model, the best performance comes when both L7Ae dose and mVenus degradation time are moderately reduced.
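For the modeling-inclined, here is a toy differential-equation sketch of the argument.  It is emphatically not the calibrated model behind the figures above: the equations and numbers are invented for illustration, and it captures only the output-decay half of the story, since the repressor-dose effect depends on details of the real system that the toy leaves out.

```python
"""Toy ODE sketch of regulated expression from a self-amplifying replicon.
Not the calibrated model from the talk; all forms and parameters are made up.
Qualitative story: the replicon RNA amplifies roughly exponentially, so leaky
early expression of the output (mVenus) piles up before enough repressor
(L7Ae) has accumulated.  If the output protein is stable, that early leak
dominates the measurement; a faster-decaying output lets repression win once
the repressor catches up.
"""
from scipy.integrate import solve_ivp

def replicon(t, y, k_L, d_V):
    R, L, V = y                                  # replicon RNA, L7Ae, mVenus
    dR = 1.0 * R * (1.0 - R / 1e3)               # logistic self-amplification of RNA
    dL = k_L * R - 0.1 * L                       # repressor expressed from the replicon
    escape = 1.0 / (1.0 + (L / 50.0) ** 2)       # fraction of output expression escaping repression
    dV = 5.0 * R * escape - d_V * V              # output production minus decay
    return [dR, dL, dV]

def fold_repression(d_V, k_L=2.0, t_end=24.0):
    """Measured output without repressor divided by output with repressor at t_end."""
    y0 = [1.0, 0.0, 0.0]
    with_rep = solve_ivp(replicon, (0, t_end), y0, args=(k_L, d_V)).y[2, -1]
    without = solve_ivp(replicon, (0, t_end), y0, args=(0.0, d_V)).y[2, -1]
    return without / with_rep

# A design ported straight from DNA: long-lived output protein.
print(f"fold repression, stable mVenus:     {fold_repression(d_V=0.05):10.0f}")
# "Going with the flow": a destabilized, faster-decaying output protein.
print(f"fold repression, fast-decay mVenus: {fold_repression(d_V=0.5):10.0f}")
```

Even in this crude form, letting the output protein decay faster wipes out the early leak that accumulates before the repressor catches up, which is the direction of the effect described above.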

Monday, August 22, 2016

Biological Design in the English North

This past week was the 8th International Workshop on Bio-Design Automation (IWBDA), along with a couple of other associated events—a tutorial day on the Synthetic Biology Open Language (SBOL) beforehand and a workshop on the integration of electronic and biological design automation afterward.  All told, I spent nine days in England (and Scotland!), immersed in synthetic biology for almost the entirety of that time, and though it was all good, it was also quite exhausting.  I'll come back to the technical content on another day.  For now, I want to talk about Newcastle, where IWBDA and these other events were held this year.

This is my third time in Newcastle, and if I wasn’t cutting down on my travel, I’d be back there again in a month for another meeting on synthetic biology standards (I’ll be dialing in instead). There’s an excellent group of synthetic biologists there, a big node in the larger UK network, and I work frequently with Anil Wipat on SBOL, he being both a long-time contributor and the current chair of that standards effort.  I always enjoy spending time with him and the rest of the Newcastle crew, who, like me, are true believers in the power of characterization and design tools in wrestling with biology.

As a city, I find Newcastle upon Tyne to be a delightful hodge-podge of the old and new.  From its ancient military history, it eventually became a tough old industrial town and port city, one of the anchors of the English North.  Rail lines tangle together at its center, at the top of a steep embankment beside the river, and spider out across bridges in all directions.  One route goes over a remarkably high rail bridge, standing far above more modern road bridges and a beautiful walking swing bridge just a bit further downstream.  A tall old mill has been repurposed into a modern art museum, and on the high side of town the university stretches between two broad lanes of park.
High Level Bridge
Gateshead Millennium Bridge
The Long Stairs
The people I see on the streets of Newcastle are a motley, hardy lot: young women walking, apparently imperturbable, through the freezing rain in their miniskirted clubbing outfits; tough geezers yelling to each other across the streets at four in the morning; loud music in every pub and restaurant.  Joggers rush by the cows in the parkland past the university, and photographers set up on the riverbank, talking amiably to one another by the unhappiest-looking palm trees I've ever seen, their bark and leaves shriveled and grey in the cold. I definitely am an outsider there, but so do many of the others seem to be, and I've never yet been hassled for it.  It feels to me a much less urbane and international sort of crowd than one sees in London, but a rough diversity that I find I do appreciate.

All in all, a good place to wander through with one’s head drifting around and processing all the science of the day.