Thank you. It's a great pleasure to be here, and as he said, I'm coming from far away, namely from across the courtyard, but I don't think I've ever been in this room before, so that was a good experience. Now, I was asked to talk about the Biosystems Institute that we launched just a few weeks ago and which I'm leading right now, and I said, but I want to talk about science, and he said, but you should talk about the institute too. So you get two for the price of one, and I'll try to stay within time anyway. So here's a little preamble. When I was in college, in the Middle Ages, I studied biology and mathematics, and people laughed at me and said, you know, biology is too complicated to use mathematics. By now we know that biology is too complicated not to use mathematics. That's sort of the idea. So here's my overview: I want to talk a little bit about metabolic engineering and the challenges and needs for models, then about a particular modeling approach that we have been using for a long time, called canonical modeling, or biochemical systems theory. I want to show you some of its advantages, some examples, and then various issues of optimization. And then in a second talk I will talk a little bit about the institute.
So I probably don't have to tell you this, because you are in a chemical engineering department and you know what metabolic engineering is, but I took this definition from some working group: it's a new approach to understanding and using metabolic processes. Most of this is done in microbes, and of course you want to talk the microbes into producing something that they really don't want to produce, at least not in significant quantities, and you want to coerce them into doing that without killing them. That's sort of the idea. In the past, most of metabolic engineering has been done through mutagenesis and selection: you take a strain, you select the one that is doing the best according to your criteria, you improve the medium, you zap the strain, and you select new ones. The question is whether there is a more rational way of doing this, and I claim that, at least as an aid, mathematical modeling might be a good way of approaching the problem. So here's an example: citric acid. Citric acid is produced at a rate of about a million tons per year, a million tons per year, and you say, why? Well, look at your marmalades and preserves; all these things contain citric acid, because it's considered non-carcinogenic, and you can add it to almost anything without asking the FDA or USDA or anybody. So there's a lot of demand. It's also used as a starting material for all kinds of chemical reactions and as a cleaning agent for machinery. Around 1917, a fellow named Currie actually started the large-scale production of citric acid in the United States, and you say, so what? Well, it was so important that the United States became independent of Europe with respect to citric acid; in fact, they started exporting it. So this is a simplified scheme of what citric acid metabolism looks like in this fungus, Aspergillus niger, and the task is to reroute things so that the organism produces more citric acid than it usually does.
So the big challenge is the complexity that we see in biological systems; it's easiest to show it this way. What does complexity mean? Organizational complexity means you have a lot of components, you have a lot of processes, the processes are nonlinear, and quantitative changes in parameters can cause qualitative changes in response. That sounds obscure, but just think about going out in the sun and getting a tan. If you get a little bit more sun, all of a sudden there's this invisible threshold and you get a sunburn, which is a totally different reaction, a totally different response of the body, to just a little bit more sun. In terms of numbers, I probably don't have to tell you: there are about six thousand genes in yeast and thousands of proteins, and in E. coli we don't know how many proteins we have; it depends on how you count. And there are something like five octillion atoms in the human body, and I spell it out because I thought you probably haven't seen five octillion; here it is. When we talk about these numbers, we have to keep in mind what a psychologist said in the fifties: the human brain can manage seven plus or minus two items at once. So if you're young and smart, like most of you are, I guess, then you can do nine. People like me can maybe do five; I'm happy if I can do five. But I cannot do forty million, and I cannot do six thousand at once, so we need aids that help us deal with these numbers. And if we look at the whole metabolic map, without even gene regulation, without anything else, just the metabolism, it's scary. So we need help, and I claim that it's systems analysis to the rescue. Systems analysis is always based on a model; the model is at the center, and you probably have an idea of what you'd do with the model once you have it: we want to understand what the system is doing.
We want to extrapolate to new conditions, then we want to manipulate the system, and once we can manipulate it, we want to optimize the manipulation: we want to find drug targets, or we want to increase the yield of some organic compound in these organisms. How do we get to that model? That's a different question. We need the model structure, and if you do a lot of modeling, like I'm doing, then you get much of it from databases like KEGG or MetaCyc, from new experiments, or from the literature. Then you need data, and in the olden days, which are still going on, most of the data came from the literature on enzyme kinetics: characteristic Km values, values of Vmax, and so on. Then you would put them together in some function like the Michaelis-Menten rate law, put them all together in one big model, and hope for the best, and usually that didn't really work very well. Now we have an alternative, and the alternative comes from what I would call global data. There is the internet, of course, that gives us all these wonderful things, and I'm talking about microarrays, mass spec, and NMR methods, and you can use those for snapshots or, what's much more interesting, for time-series measurements. I will come back to that later, but you have an organism like a bacterium, you starve it, then you give it something to eat, and then you can measure non-invasively how the metabolites change over time. Obviously there's a lot of information in these data, and the challenge is to get that information out. If we have good enough data, we can actually infer what the structure of the model should be, and my lab is working very much on this, particularly on those two green arrows, right now. And then of course you have to do diagnostics and testing and validation of those things.
So there are two types of optimization I want to talk about: one is optimization once you have the model, the other is optimization to get the model, and I will come back to both. The model can be set up in different forms; the easiest is stoichiometric systems. Here's a little toy sandbox problem. You have metabolites A, B, C, D, your reactions are v1 through v7, and you can formulate a connectivity matrix that shows what's coming into a pool and what's going out of a pool; I don't have to explain that, you can probably infer what it all means. Then you can set up a differential equation, as shown here, that says the change in the substrate vector S is given by the connectivity matrix N multiplied by a vector of flux values v, one flux for each of these reactions: dS/dt = N v. If you set that equal to zero, it means the system is at a steady state, and if you ignore the differential part, you have an algebraic system, and then you're home free, because you can do a lot with these types of algebraic systems. You can do something a little more advanced, like Bernhard Palsson is doing, and I'm sure many of you know him or have met him: he does flux balance analysis, which is essentially the same thing but adds constraints and other features to the system. The advantages are that you don't need kinetic details: if you have Km values, you don't need them; you just need the topology, and you need to identify the flux values. It's a linear system, and you know that linear systems can be solved with sizes up into the thousands; optimization is straightforward, and people like Stephanopoulos and Palsson and others are using these types of approaches to optimize the production of valuable compounds.
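As a concrete sketch of this stoichiometric setup, here is a hypothetical four-pool, seven-reaction toy network; the slide's exact wiring is not recoverable from the transcript, so the topology below is an assumption, but the machinery (connectivity matrix, dS/dt = N v, kernel at steady state) is exactly what is described above.

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical sandbox network (assumed wiring):
#   v1: -> A,  v2: A -> B,  v3: B -> C,  v4: B -> D,
#   v5: C ->,  v6: D ->,    v7: C -> D
# Rows = pools A..D, columns = reactions v1..v7; +1 produces, -1 consumes.
N = np.array([
    [ 1, -1,  0,  0,  0,  0,  0],   # A
    [ 0,  1, -1, -1,  0,  0,  0],   # B
    [ 0,  0,  1,  0, -1,  0, -1],   # C
    [ 0,  0,  0,  1,  0, -1,  1],   # D
], dtype=float)

# dS/dt = N @ v; steady state means N @ v = 0, i.e. v lies in the kernel of N.
kernel = null_space(N)        # basis of all admissible steady-state flux vectors

# One particular steady-state flux distribution (the branch splits 50/50):
v = np.array([1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.0])
assert np.allclose(N @ v, 0)  # every pool is balanced
```

With four independent balances on seven fluxes, the kernel is three-dimensional, which is why extra constraints (as in flux balance analysis) are needed to pin down a unique solution.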
The steady-state solution is given by the kernel of this equation, and there are some tricks to make the solution unique, but I'm not going to go into that now. The limitations are these: if you have kinetic information about the system, there's no place to put it in. You cannot use your Km's, you can't put in your inhibition constants and those things, and you cannot use nonlinearities; as soon as you put a Michaelis-Menten function in there, the linear structure is gone, and you cannot do the analysis. Because of that, you cannot really put in regulatory signals. Palsson does, in a sense, but it's cheating, because he just puts in matrices that change from time to time and turn things on and off; that's not real regulation. But if you look for optimal strategies for altering the flux distribution, clearly you need regulation, because if you change an organism, if you cut something out of an organism, it's not going to sit there and do nothing. It's fighting against you; it's trying to circumvent the problems you have created. So what you would want is a nonlinear formalism that captures the essence of biological systems, accounts for physiological processes, and at the same time lends itself to analysis, simulation, and optimization. And you need some computational efficiency to make the whole thing work. If we work toward such a nonlinear modeling framework, we can go back to the tenets of systems analysis: each component of the system potentially depends on all other components of the system, and if you want to understand what the system is doing, you need to understand what each component is doing inside the system, not by taking it out. And then the dynamic changes in each system component are driven by inputs and outputs.
So the way we formulate it, we have two types of quantities. We have X's, which here are metabolites, though in different contexts they could be ecological species or genes; and then we have processes, which we call V-plus and V-minus. To write that as a model is easy; here it is. The issue, though, is: what are these V's? We know they are functions of probably a lot of the inside variables and probably some outside variables. So what do we know about these functions beyond this formulation? Well, they're complicated, and the big problem is where to get the functions from. If you're in physics, the functions come from theory: you know how springs behave, you know a lot about electricity and optical phenomena and those things, and so you can piece your models together based on physics. Because that works so well, we can shoot little gadgets up to Mars and actually land and spit out a rover that runs around: because we have theory. In biology, we don't have a theory. We have educated guesses, for instance growth functions that work pretty well, but at a very superficial level; we have partial theories, for instance in enzyme kinetics; and then we have generic approximations. So why don't we use true functions like in physics? Here's an example, the simplest of the simplest: A plus B goes to P plus Q, an enzyme-catalyzed two-substrate, two-product reaction, like glucose and ATP going to glucose 6-phosphate and ADP. That looks easy. If you do it right, you need to figure in how the enzyme gets into the action; this, from a standard enzyme kinetics text, is one mechanism to produce P and Q out of A and B, and if you write out every step with mass action, this is what you get. Now, that is a mess. You can say, well, we have computers, you know, high-performance computing is here, so we can do this.
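To make the parameter explosion tangible, here is a minimal mass-action sketch of a two-substrate enzyme mechanism; the mechanism is simplified and every rate constant is invented for illustration. Even this stripped-down toy needs five rate constants and tracks every enzyme complex explicitly, which is exactly the "mess" being described.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Simplified ordered mechanism with made-up rate constants:
#   E + A <-> EA,  EA + B <-> EAB,  EAB -> E + P + Q  (release lumped, Q omitted)
k1, km1 = 1.0, 0.5
k2, km2 = 1.0, 0.5
k3 = 2.0

def rhs(t, y):
    E, A, B, EA, EAB, P = y
    v1 = k1 * E * A - km1 * EA      # substrate A binds free enzyme
    v2 = k2 * EA * B - km2 * EAB    # substrate B binds the EA complex
    v3 = k3 * EAB                   # product release regenerates free enzyme
    return [-v1 + v3,   # E
            -v1,        # A
            -v2,        # B
            v1 - v2,    # EA
            v2 - v3,    # EAB
            v3]         # P

y0 = [0.1, 1.0, 1.0, 0.0, 0.0, 0.0]   # a little enzyme, equal substrates
sol = solve_ivp(rhs, (0, 300), y0, rtol=1e-9, atol=1e-9)
E, A, B, EA, EAB, P = sol.y
```

Total enzyme (free plus both complexes) is conserved throughout, and nearly all of A ends up as product. Multiply this bookkeeping by every reaction in a metabolic map and the approach collapses under its own parameter count.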
Yes, you can solve it, but the problem is not solving these equations; the problem is getting all these Km's and kcat's and constants that you have to have to bring the thing to life, and besides, many of them you don't really need. If this is for one single reaction, you can imagine what a big metabolic network model would look like. So that's not going to work. Why don't we use linear functions, then? Well, here's something that's not metabolic engineering, but let's say you want to model the heartbeat, and you start with a sine function, so it's going up and down, pumping. Now all of a sudden you're in a horror movie, or you're running up the stairs, and boom, you go into this bigger amplitude, and if you model that with a sine function, the bigger amplitude stays forever. So you never recover from your shock in that horror movie, unless you go to a chick flick and maybe it goes back. On the other hand, if you have a nonlinear system like a van der Pol oscillator, you see at first there's this baseline that looks almost exactly the same, and the perturbation looks the same, but the van der Pol nonlinear oscillator comes back almost immediately to the original oscillation. It's a stable, sustained oscillation, and you cannot get stable, sustained oscillations with linear systems. So the challenge is this: linear representations are unsuited, and if we look for nonlinear functions, there are infinitely many of them. Somebody once said that dividing functions into linear and nonlinear is like dividing the animal kingdom into elephants and non-elephants. So we have all kinds of choices among the nonlinearities and no guidelines for which one to pick.
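The heartbeat argument can be checked in a few lines: kick a van der Pol oscillator to triple its state (the "horror movie") and it relaxes back to its limit cycle of amplitude about 2, whereas a linear sine model would keep the tripled amplitude forever. This is a generic illustration, not the talk's actual slide.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Van der Pol oscillator: x'' - mu*(1 - x^2)*x' + x = 0, written as a system.
mu = 1.0
def vdp(t, y):
    x, v = y
    return [v, mu * (1 - x**2) * v - x]

# Settle onto the limit cycle first...
sol1 = solve_ivp(vdp, (0, 50), [1.0, 0.0], rtol=1e-9, atol=1e-9)

# ...then triple the state (the sudden shock) and keep integrating.
kicked = [3 * sol1.y[0, -1], 3 * sol1.y[1, -1]]
sol2 = solve_ivp(vdp, (0, 50), kicked, rtol=1e-9, atol=1e-9,
                 t_eval=np.linspace(0, 50, 2001))

# Long after the kick, the oscillation is back at the original amplitude (~2);
# a linear oscillator has no preferred amplitude to return to.
late = sol2.y[0][sol2.t > 30]
```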
So here's a solution, from Michael Savageau, who was in Michigan for a long time and is now at Davis; I worked with him a long time ago. He said, well, we can take the functions we have, the V-plus and V-minus, and approximate them in a logarithmic coordinate system using Taylor theory, and Taylor, of course, has been dead for a long time, so what he did must be right. That leads to something we call canonical modeling, or biochemical systems theory. The result is this, and maybe only a mother can love it, but it is very useful. You have the incoming processes and the outgoing processes; all the variables affecting the influx appear with exponents that we call kinetic orders, and the alphas are rate constants. A kinetic order can be positive or negative, integer or non-integer, any real number; the alphas and betas are non-negative. What's important is that each term contains only those variables, exactly those variables, that have a direct effect: for instance, if X2 has no effect on this particular term, the kinetic order g_i2 is zero, X2 to the power zero is one, and it drops out of the product. So it's an unusual way of looking at things, but it is a very useful way, and I hope I can convince you that it really is useful. There are two formulations within biochemical systems theory. One is what I just showed you: you take all the incoming processes, aggregate them into a sum, and then do the approximation, and you do the same with everything that's going out of the pool. Or you can approximate each process separately, and then you get a sum of products of power-law functions. The two are different mathematically, but they are exactly the same at the chosen operating point.
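Here is what such a power-law model looks like in practice, as a minimal sketch: a hypothetical two-metabolite chain with a fixed input X0 and end-product inhibition of the first step, expressed through a negative kinetic order. All rate constants and kinetic orders are invented for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical chain: X0 -> X1 -> X2 ->, with X2 inhibiting production of X1.
X0 = 1.0
alpha1, g10, g12 = 2.0, 0.8, -0.5    # V1+ = 2 * X0^0.8 * X2^(-0.5)
beta1, h11 = 1.0, 0.5                # V1- = 1 * X1^0.5
beta2, h22 = 0.8, 0.5                # V2- = 0.8 * X2^0.5

def s_system(t, x):
    X1, X2 = x
    v1_in  = alpha1 * X0**g10 * X2**g12   # inhibition via the negative exponent
    v1_out = beta1 * X1**h11              # efflux of X1 is the influx of X2
    v2_out = beta2 * X2**h22
    return [v1_in - v1_out, v1_out - v2_out]

sol = solve_ivp(s_system, (0, 100), [0.5, 0.5], rtol=1e-9, atol=1e-9)
X1_ss, X2_ss = sol.y[0, -1], sol.y[1, -1]
```

Setting both equations to zero and solving by hand gives X2 = 2.5 and X1 = 1.6, and the simulation relaxes to exactly that steady state, which previews the point made below: the dynamics are nonlinear, but the steady-state algebra is trivial.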
Away from that point the two differ a little, but in most cases it doesn't really make much difference which one you use; both are very similar in accuracy, although there are differences between the mathematical formats. This gives you roughly the doable size that we can manage these days: something like twenty-five metabolites. It has to do with sphingolipid metabolism; the details are not so important. Something like thirty enzymes, many, many parameters, and to get the values of those parameters, this poor guy of ours spent three or four or five years putting them together. It's a hard job; it's a hard job. There are many applications of this: in pathway analysis, dopamine metabolism, lignin synthesis for biofuels; there are papers on gene circuitry, on genome expression data, on metabolic engineering, and so on. There are very interesting mathematical features of these equations, but I don't have time to talk about them. One feature of these S-systems that is very interesting shows up when you look at the steady-state equations. Here is the differential equation for the i-th variable; we have internal variables and external variables, which are treated exactly the same except that the external variables don't have their own equations. We set the X-dot equal to zero and take logarithms, and you can see a log here and a log there, and the whole thing becomes a linear system in logarithmic coordinates. Now, this is quite amazing, because the S-system format itself is highly nonlinear; in fact, essentially anything you can write as a differential equation you can write exactly as an S-system. I don't have time to talk about that either, but if you're curious, come to me afterwards. So the systems themselves are very, very nonlinear, but the steady-state equations are linear.
Now, that is very cool, because you can compute the steady states, and you can compute features like stability and sensitivities, and so on. Here is one application: you can very easily do pathway optimization with this system. Say I want to run my optimization at batch conditions for the cultures, at steady state; then I can write a linear program. Even though the nonlinear kinetics is all taken into account, the only restriction is that everything happens at, or close to, steady state. So: maximize the log of a flux or the log of a variable. For some variable, say citric acid, I maximize its log, or I can look at the flux with which the organism spits citric acid out into the medium and take the log of that. I can formulate the steady-state conditions; it's all linear in log space. I can formulate the constraints on the variables in log space, and constraints on the fluxes. So everything is a linear system, and we can use the methods of operations research, which, as you know, are applicable even to thousands of equations; that's what AT&T uses to optimize your connection when you reach out and try to talk to somebody. These methods are very well understood, very robust, and of course incomparably faster than nonlinear methods. So we have done some applications, and so have people you probably know by name: Bailey, who is dead now, Hatzimanikatis, and Floudas, who was here a while ago. They used these types of optimization schemes for optimizing pathway structure, and we have done a lot of work with GMA models, which, as I told you, are the alternative formulation. So, as an example, coming back to citric acid: here's the pathway, and these arrows indicate roughly how much you would, at least theoretically, have to change the enzyme activities or gene expression to optimize the flux coming out of the cell and going into the medium.
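As a sketch of how this linear program looks, here is a hypothetical one-step pathway with invented numbers: the decision variables are logarithms of enzyme fold-changes and of the metabolite, the steady-state condition is a linear equality in those logs, and bounds play the role of the viability constraints mentioned above.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical pathway X0 -> X1 -> product, in power-law form, with enzyme
# fold-changes e1, e2 as the knobs (all numbers invented):
#   influx  V1 = e1 * 2.0 * X0^0.5,   efflux V2 = e2 * 1.0 * X1^0.5
# Decision variables are all logarithms: x = [z1, z2, y1],
# where z_i = ln(e_i) and y1 = ln(X1); X0 is fixed at 1, so ln(X0) = 0.
ln = np.log
h = 0.5
ln_a, ln_b = ln(2.0), ln(1.0)
y1_base = 2 * ln(2.0)                 # baseline steady state: X1 = 4

# Maximize ln(V2) = z2 + ln_b + h*y1   (linprog minimizes, hence the signs)
c = [0.0, -1.0, -h]

# Steady state (influx = efflux) is LINEAR in the logs:
#   z1 + ln_a = z2 + ln_b + h*y1
A_eq = [[1.0, -1.0, -h]]
b_eq = [ln_b - ln_a]

# Viability: enzymes changeable at most 5-fold up or down, the metabolite
# allowed to drift at most 2-fold from its baseline concentration.
bounds = [(-ln(5), ln(5)), (-ln(5), ln(5)),
          (y1_base - ln(2), y1_base + ln(2))]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
best_flux = float(np.exp(ln_b - res.fun))   # optimized output flux
```

The program pushes the first enzyme to its five-fold bound and reports the output flux rising from its baseline of 2.0 to 10.0; the citric-acid study is the same construction with twenty-odd enzymes and many more constraints.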
If you look at the system, it has about twenty enzymes that one could plausibly muck around with, or twenty genes. If we allow an optimization where you can change every single one of these twenty enzymes, or twenty-two actually, then, according to these optimizations, you can increase the yield about twelve-fold. Now, that twelve-fold depends on how tightly you set your boundaries: you want to say, well, the flux from here to there cannot be more than such-and-such, or a variable cannot drop below a certain concentration, otherwise the organism isn't viable. So you can increase it twelve-fold, no question, but that's a lot of work, changing twenty enzymes. If I can only change one enzyme, which one should it be? You can pick any one of those twenty and change it any way you want, up or down, by any amount: how much can you improve? The answer is, pick whichever you want, it's not going to do anything good for you. The yield is not going to change. OK, that's a damper. So let's change two or three enzymes; how much improvement do we get? If you change two, nothing happens; your yield will not increase. If you change three, nothing happens. Four, five, six: almost nothing, it goes up a little bit. If you want at least a three- or four-fold yield increase, there are seven enzymes that you need to change. Now, philosophically, that's a very interesting observation, because what it means is this: with the standard techniques of random mutagenesis, optimizing the medium, zapping the strain, and so on, which people have practiced for nearly a hundred years, since Currie I should say, you find all the simple solutions, the combinations of one or two or three changes. Those you can find by random search. But if you imagine rolling seven dice at once, you don't hit the right combination by coincidence.
So here we can predict what you could do, potentially, and what is not going to work. That's kind of cool. Now I want to talk about the other optimization, which my lab is working on quite intensely, and that is parameter estimation from time-series data. You go with your problem to a computer scientist, and you feel like you're on a bicycle at a four-way stop and the other guy is in an F-350: he arrives a minute later, but he's just going to go first. That's the impression you get when you talk with computer scientists. They say, well, we solved this problem in the forties and fifties; there are many algorithms out there, so many methods. Now, I have not yet seen a computer scientist come back to me with a solution to our problems. You say, good, here are my data, find me a solution, and then there is this black hole that computer scientists fly into, and they never come back out. OK, which may be good. So there are many methods out there. If you look at parameter estimation within BST, there are about a hundred papers since 2000 trying to estimate parameter values from time-series data. A hundred papers. All of them say they work; all of them do work, for the case that's in the paper. Almost all of them stop working if you change the data in the example a little bit, and none of them works if you change the data a lot. That's the situation. Now, if you do brute force, and this fellow Kikuchi, I believe it was, a Japanese group, did brute force on a five-variable system, trying to estimate the parameters, it took seventy hours on a cluster of one thousand PCs. That's seventy thousand hours of PC time, which is not a happy situation. So the time to convergence is a problem; there are problems with the data, which are collinear and noisy; and there are problems with the models, namely parameter redundancies and compensation of error among terms. And since I have only so many slides left...
Let me tell you what we are doing now. One thing we do is replace the differentials in the differential equations with estimated slopes. We take our time series, estimate slopes at all the different time points, and substitute the slopes for the differentials. What that does is replace each differential equation by K algebraic equations, one per time point. So now you have K times as many equations; that's even worse, no? No, it isn't. We have clocked how much time algorithms, for instance regression algorithms, spend on each part of the process when optimizing these types of equations, and we don't have time to play a guessing game, but they spend something like ninety-five percent of the time on integrating the equations. If the equations are nasty, stiff in the jargon, they spend ninety-nine, if not close to one hundred, percent on integrating the equations. And even if you have a system that's not stiff, not nasty, when you do these random searches for parameters you are bound to run into combinations that make the system stiff. So even with a good, well-behaved system, the algorithms fail very, very often because the equations become stiff, and the solvers keep grinding away at that. So we do this slope-substitution estimation, and that cuts the time down from seventy thousand hours to about one hour on a PC. That's a good improvement, and we have done some other things as well. Now, one issue I want to talk about is compensation of error among terms. What does that mean? Here's an example; it's a made-up example, so the details don't really matter. It's in GMA form; here's my little pathway, with one branch point, a feedback, and a feed-forward activation of this process here. I created a model of that and used that model to generate data.
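A minimal sketch of the slope-substitution idea, using a one-variable, made-up power-law system: the fitting loop below never integrates the ODE; it only evaluates one algebraic residual per sampled time point, which is exactly why the approach is so much faster and immune to stiffness during the search.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

# Made-up system used to generate "data": dX/dt = alpha - beta * X^h
alpha_true, beta_true, h_true = 2.0, 1.0, 0.5
sol = solve_ivp(lambda t, x: [alpha_true - beta_true * x[0]**h_true],
                (0, 10), [0.25], t_eval=np.linspace(0, 10, 201),
                rtol=1e-10, atol=1e-12)
t, X = sol.t, sol.y[0]

# Step 1: estimate slopes from the time series (central differences here;
# real, noisy data would be smoothed first).
slopes = np.gradient(X, t)

# Step 2: substitute the slopes for the differentials. Each time point yields
# one algebraic equation, and the search below involves no integration at all.
def residuals(p):
    alpha, beta, h = p
    return slopes - (alpha - beta * X**h)

fit = least_squares(residuals, x0=[1.0, 0.5, 1.0])
alpha_est, beta_est, h_est = fit.x
```

Starting from a deliberately wrong guess, the fit recovers the true parameters to within the small error introduced by the numerical slopes.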
Then I looked for solutions that would give me decent fits, and if you look at this one and imagine these were experimental data, you would say, you know, this is a good fit. But it has two wrong terms, not just one: we started with one wrong term, and then it compensated with all the other terms, so everything is a little bit wrong. Now, if you just want to look at it, put it in a picture frame, you can say, well, this is OK. But if you want to extrapolate from this situation to something genuinely different, like doubling the input X0 here, or if you ask what happens if there is no activation, or what happens if I take out this or that process, then you are in big trouble. And here's an example; the scale is different. The true system behaves like the dots; the model, which looked very, very promising before, here does a bad job, especially in the beginning. And that comes from compensation of errors. If you look at those hundred papers, zero of them, including ours, were looking into these compensation issues. So that is bad, and because it's bad, we did something about it. We came up with something called dynamic flux estimation; it just got published in Bioinformatics. It was inspired by stoichiometric analysis but extended to time courses. We set up the flux balances at each time point, and of course the balance is not "in equals out" but "in equals out plus the change in that variable". So it's a different system than we had before: the change of a variable at time t is all influxes at t minus all effluxes at t. If you say the variable is not changing at t, then you are back to stoichiometric analysis.
Now, if you set it up like that, you get a linear system, and we solve it as far as possible, and the result is that we get numerical values of fluxes at all time points. We don't get functions out of it; we just get what the individual flux values look like at each time point. Then we can go ahead and represent these fluxes with appropriate functions, appropriate models. So this DFE procedure consists of two phases: the first is totally model-free, the second is model-based. In the model-free phase, we start with a time series, we do some technical work like smoothing, and we estimate the slopes from the data, which can be done automatically. If we know the system topology, we can set up the system of fluxes, which fluxes go into each pool and which ones come out, and we list them all. Then, under determined conditions (we are working on underdetermined conditions right now, but so far under determined conditions), we get a dynamic flux profile, shown here as dots: fluxes versus time, or fluxes versus the contributing variables. Now you leave the model-free estimation phase, and you can make assumptions about what the functions describing these fluxes should look like. If you like Michaelis-Menten, you can use that; if you like power-law biochemical systems theory, you use that; if you like lin-log models or something else, you can use those. You fit that to the flux representation, you do some parameter estimation, and you get numerical flux representations in functional form, and from that you get a fully parameterized kinetic model. So let me show you an example. It has to do with Lactococcus lactis; Lactococcus are these little blue dots here. As I have been told, for many years I showed a wrong picture, of a different organism, but I'm not doing that anymore.
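Here is a toy version of the two DFE phases for a hypothetical two-pool chain with a known (measured) input flux; the flux functions and all numbers are invented. Phase 1 turns slopes plus mass balances into flux values at each time point with no model assumed; phase 2 then fits a chosen functional form, here a power law, which is a straight line in log-log space.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Made-up chain: v0(t) -> X1 -> X2 ->, with v0 a known (measured) input.
v0 = lambda t: 2.0 * np.exp(-0.3 * t)   # decaying substrate feed
v1 = lambda X1: 1.5 * X1**0.5           # true flux law, pretend it's unknown
v2 = lambda X2: 0.8 * X2**0.5

sol = solve_ivp(lambda t, x: [v0(t) - v1(x[0]), v1(x[0]) - v2(x[1])],
                (0, 8), [1.0, 0.5], t_eval=np.linspace(0, 8, 321),
                rtol=1e-10, atol=1e-12)
t, (X1, X2) = sol.t, sol.y

# Phase 1 (model-free): slopes + mass balances give flux VALUES per time point:
#   dX1/dt = v0 - v1   ->   v1 = v0 - dX1/dt
#   dX2/dt = v1 - v2   ->   v2 = v1 - dX2/dt
dX1, dX2 = np.gradient(X1, t), np.gradient(X2, t)
v1_est = v0(t) - dX1
v2_est = v1_est - dX2

# Phase 2 (model-based): choose a functional form for each flux and fit it.
# A power law v1 = gamma * X1^f is a straight line in log-log coordinates.
f, log_gamma = np.polyfit(np.log(X1), np.log(v1_est), 1)
```

The fitted exponent and rate constant come back at essentially their true values (0.5 and 1.5), and repeating this flux by flux yields the fully parameterized kinetic model described above.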
When I did, I had gotten the picture from some website labeled as Lactococcus, and somebody wrote to me and said, you are using my picture and it's wrong. OK, OK. He said, for a fee I will give you a better picture, and I said, no thank you, I don't have money. Anyway, this organism is involved in yogurts, cheeses, dairy products, wine, bread, pickles, all those things you like to eat, and we were interested in the regulation of its glycolysis. We're working with people in Portugal who are producing really cool data, and the goal is to understand how this pathway is regulated, because even though it's glycolysis, which has been in the books for fifty or a hundred years, there are still questions about regulation. We want to extrapolate to new situations, for instance knockouts or overexpression studies or other things, and they want to maximize yield. What the organism does is take glucose and convert it into lactate, and by doing that the medium becomes sour, and it becomes so sour that other critters like E. coli and Streptococcus don't like it, so they don't go into your yogurt, and because of that you can keep the yogurt longer than if you didn't have the lactate there. That's sort of the idea. There are also secondary products that improve the creaminess and texture of the yogurt and all that; eventually they would like to have a whole-cell model, but we're far from that, I think. So these typical data are produced with in vivo NMR measurements, and if you're really interested in that, I have a postdoc right now who comes from that lab and can tell you all the ins and outs of how it's done; it's really cool. So here is the glucose in the medium, and that's of course glucose 6-phosphate, fructose 1,6-bisphosphate, phosphate, ATP, and other things, and in some cases the course of the variables is measured every thirty seconds.
In other cases you have a lot of fluctuation, and that's probably because these two are in a very, very fast equilibrium with each other, so we have some problems with collinearity. Anyway, we've been playing with these data for quite a while. We had actually fitted these data before; at first we didn't find any solutions, then we were happy for a moment, and then we found all kinds of solutions that were really different. How can that be? Well, it's because of compensation, compensation of error between terms. That was our problem: every time we wanted to extrapolate to a different set of conditions, the whole thing blew up, and we said this is not good. So here's our dynamic flux estimation. Here are the fluxes, sort of the model. In contrast to higher organisms that use ATP, these organisms actually use PEP as the phosphate donor, and that produces glucose 6-phosphate, and it has the side effect that you produce pyruvate immediately, and that goes into lactate, so you start souring the medium very, very quickly. Here's our first attempt, with smoothed data that were generated artificially, and you can see the fit is so good; the individual flux profiles are matched very well. So that's all good; you can do that. It's sort of tolerant to noise; we did that, and it's working. But what's really important is this one here. Here we knew the glucose uptake function, but we pretended that it was something else, something like an exponential decay function instead, and we did the analysis. All the fluxes came out OK, like this one: you see the target flux and the fit are essentially the same, except for this one. Here you see clearly a very, very big difference in how this flux depends on glucose. You say, so what? Well, this is a big deal, because it allows us to diagnose assumptions that we, and other people, had automatically put into these models and had no way of checking.
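The model-based second phase can be sketched the same way. Once a flux profile is available as (concentration, flux) pairs from the model-free phase, an assumed functional form is fitted to it; for a power-law (BST) term the fit is even linear in log space. All numbers below are invented for illustration.

```python
import numpy as np

# Suppose the model-free phase produced flux values v at the measured
# substrate concentrations x (both arrays invented, with a little noise).
rng = np.random.default_rng(0)
x = np.linspace(0.5, 4.0, 40)
v = 0.9 * x**0.7 * (1.0 + 0.01 * rng.standard_normal(x.size))

# A power-law term v = a * x**g is linear in log space:
#   log v = log a + g * log x,  so ordinary least squares suffices.
g, log_a = np.polyfit(np.log(x), np.log(v), 1)
print(np.exp(log_a), g)  # close to the generating values 0.9 and 0.7
```

If a Michaelis-Menten or Hill form were assumed instead, the same pairs would be handed to a nonlinear least-squares routine; the point is that each flux is fitted separately, which is what exposes a wrong assumption in a single flux, as described above.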
Now we can check them; we can diagnose wrong assumptions. So it's really cool. We have problems, of course: there are convergence issues, and I don't have time to go into that. In many cases, of course, these fluxes are underdetermined, so we have fewer measured time courses than we have fluxes in the system, and then we have an underdetermined system; we are working right now on methods for combining this with other methods to analyze those data. There are big, big issues with data that are collinear, so they behave sort of the same, or they are very close to equilibrium, and it will take some really good statisticians who can get out of their little silo of linear analysis and regression analysis into something that's dynamic and nonlinear. So if you're a statistician, you can come, and I'll give you a job or so. And then there are models that just mathematically allow different solutions, and we have worked on that; this goes back to Sophus Lie from Norway, one of the two Norwegians in mathematics who have become famous, Abel being the other. So those are problems. So here is the summary of my technical talk; I'm doing OK on time. OK. Biology has become too complicated not to use math, and the reasons, I hope, were clear to you. It's just the number of things: we're talking about twenty-five thousand genes in the human body, something like probably two hundred fifty thousand proteins, and nobody has any clue how many metabolites we have. It's just a lot of stuff to keep track of, and that alone is not sufficient, because all these things are interacting, and they interact in nonlinear fashion. Sometimes you have competing effects: this one wants to increase it, this one wants to decrease it; which one is going to win, and to what degree is it going to win, and so on.
So the nonlinear interactions are really the issue; our brain is not made for nonlinear interactions, at least not in this context. So really the only solution, in the long run, is going to be a computational model. How you get it is a matter of taste; what you do with it is a matter of strategy and technique; but there is a model that needs to be involved, otherwise you cannot do it. Nature does not provide us with guidelines for what model we should use, and that gives us some freedom, and it gets us contests and scientific fights and other things as part of the deal. So there are infinitely many choices, and these canonical models that I showed you are not the solution to every problem on earth, even though we would like to think so; I admit that it's not the case. But they are good defaults, and if you don't know how to get started, this really helps you to get started, because you can set up your biological diagram, and it's an automated procedure to set up the equations; we have a computer program that does it for you. The advantages are that you know exactly what the parameters mean: the effect of this on that, or turnover rates; there's nothing else to associate with them computationally. The steady-state equations of the system become very simple, and that helps with optimization, where you can really solve big systems. Now, there are some problems with it too, but in principle it's a great way of doing it. So optimization is facilitated by the model structure, and optimization means both something like yield optimization and the identification of parameters. So I'll end here, because now the second part is going to start. OK. So this is my current crew; they're all over there in that building, and I'll be happy to send you information, or you can Google me and such. All right, now the integrative biosystems institute, and this is much shorter, so bear with me. So we call it IBSI:
Integrative Biosystems Institute. I don't know if you look at Research Horizons, but I think in the last issue, actually summer two thousand eight, they had a long article about different activities in the center and the institute and so on: how the proteins, the little puzzle pieces, and the computers and the drugs and the DNA and all those things play nicely together. We also have a website, so that should be easy to find; there's a lot of information out there. So here's the history, and the history starts with two thousand four, because that's when I came; everything before that I ignore. So in two thousand four I came, and they put me on a committee with all these esteemed people here, and the question was: should Georgia Tech do something in computational biology or structural biology or systems biology or integrative biology, and do we have enough people? And of course the answer was yes. So that was good, and then in two thousand five the three deans at the time, Schuster, Giddens, and DeMillo, said: OK, here's your committee; you should tell us what such an entity should look like. They asked us to write a white paper, and they had expected something like fifteen pages maybe, but with German thoroughness we wrote a white paper that was one hundred twenty-three pages, and nobody read it, but it made an impression, because it has a lot of weight. OK. So if you see doorstoppers all over campus, that's what it is. Anyway, they were so impressed by the quantity, if not the quality, that they said: good, now you have to write an implementation plan. So we got a new committee, and there was Richard Fujimoto from Computer Science, a colleague from Biology, and myself. All right, so in these activities we did surveys and we had many meetings; we talked with a lot of people. Nobody had really concrete ideas, but everybody said: you know, this is good, we should do that.
And so the vision was, and is, that Georgia Tech would be nationally and internationally recognized as a leading institution for research and study of integrative biological systems. Clearly we need institution-wide participation; it's not just biology, and it's not just computer science; it really needs cross-disciplinary linking. We clearly need to leverage the many strengths at Georgia Tech that are relevant. We need to create a physical, not just a virtual, institute, and we've been promised a significant part of the building that's to be built at Atlantic and Tenth Street, once the recent unpleasantness of the economy lets up; I guess we will then be back on plan. And we are encouraged to do strategic hiring in relevant areas. So the goals are to create such a learning and research environment, to consolidate all the strengths, and then to develop a graduate program in integrative biosystems. Why do we do it, and why do we do it now? Well, if you look at the paradigm of biology over the last fifty or one hundred years, it's what's called reductionist. The idea is: if you want to understand how a heart works, you need to take it apart, and then you need to study the valves and the tissues; if you want to understand how they work, you need to take the tissues apart and look at the cells; if you want to understand how the cells work, you have to take them apart. So you're taking apart a part of a part of a part until you're down at the building blocks, and the idea was: well, once we know the building blocks, we can put them together and we understand what's going on. Well, this last part, which is conveniently ignored in many cases, is the hard part, because we still don't know how we can put all these pieces back together into functioning entities.
So this reductionist approach has to be complemented with something like a reconstructionist approach, where you put the pieces back together, and that requires modeling. The opportunity is there, because in many of the relevant things Georgia Tech is really strong, you know: simulation, pattern recognition, visualization; we have some really cool people who are tagging molecules, and so on. So if we look at biology in general, there's first data collection, and that's the good old-fashioned biology that's still going on and should be going on; I'm not minimizing this at all; that goes from observation to quantitative data. Then bioinformatics came along about fifteen or twenty years ago and organizes these data, especially the high-throughput data. So from quantitative data you go to information: what does it mean, what clusters together, what's causing what? The next one is biochemical and biological systems analysis; that's where my lab is, and many other labs here, and that's where the focus of this institute is going to be; that goes from information to understanding. So we have all these pieces of understanding; how can we put them together so we really know what is happening in this organism? And then, just starting, there is synthetic biology, where you want to reconstruct biological systems from the bottom up; you want to go to innovation and create little biological systems that do a job that is difficult to do otherwise. So if you take these pieces, it goes from observation to quantitative data to information to understanding to innovation. That, for me, is integrative biosystems, or systems biology. Now, the big question that has come up all the time is: what's in it for Georgia Tech, and what's in it for you?
Because academics are like cats, and you can hardly tell them what to do. If you don't believe it, ask your chair: he tries to tell his people what to do, and they rarely do it unless there's something in it for them, right? And that's human nature. So what's in it for everybody is the prospect of solving grand challenge problems in biology and medicine that nobody by himself or herself is able to solve, because the solutions require complex and flexible teams that are transdisciplinary, interdisciplinary, and cross-disciplinary, if you want to be picky. So the guiding themes are multi-level analyses: at this point, probably ninety-nine percent of all analyses are at one level, maybe the genome level, maybe the physiological level, or maybe the metabolic level. We need to combine those levels, because they are coupled; they work together. We need to develop molecular inventories, so we need to know what all the little building blocks are that nature has at its disposal to solve these complicated tasks. And then we need enabling technologies that are suited for systemic approaches; so again, it goes from understanding to creating pathway systems, in some sense. So we have decided to begin with three streams: one is the development from normal cells toward cancer; one has to do with microbial systems and the environment, and there's a lot of interest in that because of infectious diseases and sustainable environments, cleaning up the environment, those things; and supporting both is high-performance computing, and that's where computing fits into this mix. So we have these multi-level systems, or, if you like, integrated systems; we need modeling and simulation, molecular inventories, and then devices, the engineering part: the creation of tagging devices or methods, sensors, visualization tools; those are all the enabling technologies. So the implementation has begun.
We have an executive committee, and it consists of these three folks here; I'm chairing that right now. We have an advisory board with high-ranking Georgia Tech faculty, including one of your own professors. We have a rotating directorship, so I'll be done by the end of next year. The hiring will be through the schools, but other entities like IBSI will be involved in that, and if anybody has good ideas for hiring superstars from outside, the process is: come to me, and I will find money for you. In terms of research, we have funds for seed projects: if you're a member and you can work with somebody from a different unit, say you are here and you find someone in biology or in physics to work on some interesting project, then you can apply for that as faculty. There are funds for shared graduate students, so the graduate student has at least two mentors in two different units; that may be a blessing and maybe a curse, probably both, if you are the student. But that is supposed to trigger collaboration, because students usually don't go away, so you have to deal with them. Then we have a distinguished seminar series, and we had a speaker today, so we have that. Eventually we will have this Biosystems building; the name is not quite decided yet, but we made a big advance by getting the building into the capital campaign. For those of you who don't know what that means: if somebody comes and says, I have thirty million dollars to spend, what should I do with it, then we are there. OK, if you have thirty million dollars to spend, I'm here; come forward. We have a Chalk Talk series that happens every Wednesday at noon. It's an informal setting where people, usually from inside Georgia Tech, talk about their research, and in particular about open research questions. We encourage speakers not to use PowerPoint, which changes the dynamics dramatically.
They scribble on the whiteboard, and sometimes you can't read it, so they have to explain it, and so there's a lot of interaction. We serve food, capital F-O-O-D; so if you're a student: on Wednesdays, free food. It's in the Klaus building. OK. The Chalk Talks I already mentioned; we have some committees; we have a website, of course. We had some events: we started with a brainstorming workshop at Château Élan; we like to do things in style. We had a poster workshop where we had something like a hundred twenty posters and two hundred people showing interest, and we had featured speakers, including your Professor Lou, and we had an announcement celebration in February this year, and then we had our international launch conference just three weeks ago, where we had really, really good speakers. So you missed it; it was great, with about two hundred fifty attendees, and a good poster session. It all happened at a hotel, and we had dinner in the ballroom of the aquarium. And it was announced plentifully, so you can't say you didn't know about it. So, work in progress: we are working on graduate and postdoc programs. We would like to have an endowment, where we have some money so that we can have postdocs that are not associated with one faculty member but are sort of floating postdocs. We are working on training grants, on building partnerships, and then of course we need to work towards sustainability; sustainability, that is the money stuff, but you know, your thirty million will be put to good use, if you have it. So it's a great opportunity. There's a lot of buzz, all the way from the president and the provost to individual faculty all across campus, and we want to leverage that buzz toward our goal. So, I almost made it in the time. I thank you very much. That's a great question, and my students always mock me, because my standard answer is: it depends. And that's my answer here.
It depends, for instance, on how good the data are. If the data are very noisy, you're not going to diagnose anything. If the data are very clean and you see systematic deviations from those data, you may be able to figure out that there's some time delay, and you ask: where is that time delay coming from? Well, the most likely explanation is that there's something in between that is delaying this reaction. So I'm sure that, to some degree, you can diagnose other things that are not just wrong representations, and I'm sure there are many things you would not be able to find. This is all at the beginning, and maybe in three or four years I can give you a better answer, but at this point I think it's partially yes and partially no. Yeah. We're doing that right now. Right now, if things work, fine; if things don't work, you don't know why. If your fit is not OK, you don't know whether it's wrong assumptions for the functions, or whether something happened that screwed up the experiment, or a combination of the two; especially if you have noisy data, it's very difficult to find out what the reason is. The solution, of course, is that you have replicate data, and eventually that will be coming: we're getting a big data set of metabolic profiles, and I'm almost scared how big it's going to be, from a group in Japan who have a lot of machinery and a lot of students who are working there twenty-four hours a day. So that will be a very good test bed, because most of the things we have done are on this very, very small system, where we know a lot about the topology and some about the regulation, and we have really good data. As soon as any of these ideal assumptions deteriorates, problems are lurking, and we know that. Yeah.
So in the ideal case you get a specific number: you have to increase this enzyme by 3.54, and you have to decrease this one to, you know, eighty percent; that is the solution that comes out of it. And a good question is what happens if you cannot get 3.54 and you have to do a fourfold change instead. Then we know that the theoretical maximum will not be reached; how much it deteriorates depends on the special case. Now, there are also assumptions about how much you allow fluxes to deviate from normal and variables to deviate from normal, and most of these boundaries are set by gut feeling, by educated biologists who say: well, you know, if it's more than twentyfold, it's not going to work. It may be possible to get a better solution than the theoretical maximum, but then you violate some of the constraints that you put on, in the case of optimization or in the case of parameter estimation. With the optimization, if everything is working correctly and ideally, life is good, and if you set it up as an S-system, the size doesn't make a difference, because it becomes a linear system, so you can do it. Now, the trouble comes in with getting the right parameter values; if you don't have the right parameter values, then you are on your own, and the question is how you get those, so it comes back to the parameter estimation issue. But with this as the model, it becomes a linear system, and then it's really only your objective function that you can muck around with, and your constraints, and that's under your control, so you can play around with it; it's very fast. It depends, it depends on what you have to cross-validate your data with. If you have one data set with fifty data points and you take out twenty-five of them and put them in a drawer for later use, it's not going to help you.
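A sketch of why the S-system structure makes this optimization so fast: in logarithmic coordinates, both the steady-state equations and the flux to be maximized become linear, and the allowed deviations of metabolites and enzymes from normal become simple box constraints, so the whole thing is a linear program. All coefficients below are invented for illustration; they are not from any real pathway.

```python
import numpy as np
from scipy.optimize import linprog

# Variables: y = [log X1, log X2, log E]; nominal operating point y = 0.
# Maximize the log of a target flux v = gamma * X1**0.5 * E**1.0,
# subject to hypothetical steady-state rows (invented coefficients) and
# allowed deviations: metabolites within 20%, enzyme within 5-fold.
c = -np.array([0.5, 0.0, 1.0])          # linprog minimizes, so negate
A_eq = np.array([[-0.6, -0.2, 0.4],
                 [ 0.5, -0.8, 0.0]])
b_eq = np.zeros(2)
bounds = [(np.log(0.8), np.log(1.2)),   # X1
          (np.log(0.8), np.log(1.2)),   # X2
          (np.log(0.2), np.log(5.0))]   # enzyme E

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.success)                      # feasible: the nominal point y = 0 satisfies everything
print(np.exp(res.x))                    # fold changes at the optimum
```

The boundaries in `bounds` play exactly the role described in the answer above: widening the enzyme bound, say to twentyfold, may raise the achievable flux, but only by moving further from the normal operating point.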
Because the number of data points in these small time profiles is limited; really, you need enough of them so that you know what the profile looks like, so if you take some out and use them for validation, it's not going to work. If you have a data set that is produced under different conditions, say a different substrate, that would help to some degree: it will help by telling you, all wrong, what you've got with your model is not working, period. It does not tell you what is not working, really. Then you can go and say: well, let me fit both together and see whether I can find something, but we have done that, and it's really not a pretty picture, because you have compensation of errors within an equation and between all these equations. So if this really works, and in a stable fashion, it's a big step forward, I think.