Well. That's what. I'll. Welcome hundred zero zero zero S. professor of biomedical mikes of Columbia University the division of that biomedical thinks he is one of those people who spearheaded if not for. Record. Yelling out you know. The one that just discovered you gotta walk so congratulations it's always good news on the very. Last minute ten seconds to the banks the rapper and he's also the resources director of White America the government. I'm just capable I'm sure you said the truth as to why all of that research is fundamentally what I just. You know why birth right this is. A really hard core this study chaotic behavior in high dimensional dynamics is very appropriate for an Italian. Guy out of chaos that's something I'm going to try to make it all work of organized chaos of what was the disorganized chaos. Here in Spain at the University of orange I learned yesterday which I realize. More observers you know folks I did not realize that the cities were pushing for a great fight. You know. You spend dollars I shouldn't have been right sorry if the post look like you were even older than giving in the ranks of the manager. Over there but. Then he went off into industry for a while where he was the founder C.T.O. an executive I present you know. And that was three point one person so here's an example. Wilder's. Biological systems and more from the ape or. Thinker so. Great matter and yes very much. And he was hungry was one of the first people at the foresight to try to reach what. He is on the border with by the National Cancer and there's a number of scientific advisory boards and percent of. All as I said he really is the only major visionary it's just my home. Base and it's a real. Well to talk with about a very important problem. 
which is trying to understand how normal cells or cancer cells are switched on and off, and he's got a very novel approach to very difficult problems.

Thank you, Jeff. So I actually have to say that my relationship with Georgia Tech started about fifteen years ago, when I heard Bob Nerem at the National Academy of Engineering say that computational biology was really at the crossroads of three disciplines: chemistry, biology and computer science. And he said, you know what they say about those: if it moves it's biology, if it stinks it's chemistry, and if it fails it's computer science. So if it stinks and fails to move, it has got to be computational biology. I actually had a slide with that for many years, and now I don't use it anymore, but I thought it was very interesting that he had that vision of integrative science. So this is a cartoon of how our approach to understanding biology, not just in cancer but across the spectrum of both normal physiologic phenotypes and disease-related phenotypes, has changed over the last ten years. At some point we thought that one day we would have the power to sequence everything and that, given a large enough population either of cells or of individuals, we would be able to just look at the association between traits and their underlying genetic components. This turned out to be spectacularly not the case. You probably have seen some of the papers that did very deep genotyping and sequencing for four or six different complex diseases, and basically almost nothing has come out of the other end. The reason for that is not only that the entire set of independent variables that determine the phenotype is much broader, because it includes a very large epigenetic component that is now starting to be studied, but in fact that the dynamics of the cell is a major
element that drives endpoint phenotypes, in a way, without even having to change the genetic and epigenetic layers. And so the name of the game, I think, and to a large number of people in this audience this should be no surprise, is to some extent a change from studying and, you know, sort of building a zoology of the genetic, epigenetic and functional layers in cells and individuals, to trying to understand how the regulatory logic of the cell manipulates and integrates that information to produce a particular trait or disease. So we call this building the assembly manual of the cell. You know, if you're trying to fix a car, probably the first thing you should do, and the first thing most people do, is to actually buy a manual that tells you how the car circuitry works. Whether you are trying to fix a car or a clock, you try to get the assembly manual. We don't have an assembly manual for the cell, and so about ten years ago we took up this challenge and tried to see whether we could actually come up with a more reasonable set of
regulatory processes that integrate that information. Why is that important? Let me give you a very simple example. This is a cartoon, a thirty-thousand-foot view of why it is important, but I promise I will make this extremely tangible in the rest of the talk. The idea is that if I repeat this slide about a thousand times, you'll have a sense of what a genome-wide association study or a gene expression differential analysis study looks like today. You have a bunch of genes. Some of them turn red because they are either overexpressed or underexpressed, or maybe because they carry a particular genetic variant that seems to be associated with a phenotype. You have absolutely no way, from this mess, to identify which genes may be more relevant for the disease. And in fact, because of limitations in our ability to identify genetic alterations and differential expression, a lot of genes fall in the gray area, where you really can't tell whether they are differentially expressed or not. Genes that are green may turn out to be in fact those causally related to the presentation of the phenotype, and they are completely missing from this picture. If you now have a regulatory network superimposed on that view, things start becoming a lot clearer, and the reason they become a lot clearer is that now you can start asking questions of the type: given this structure of the regulatory network, and given that I see this pattern of aberration, what is the gene, or what are the genes, most likely to have contributed to this particular pattern? And in this case the surprising result may be that a gene that in fact is not itself differentially expressed or harboring a genetic variant may in fact be related to the disease. In fact I'll show you precisely examples in which genes that do not harbor genetic variants and are not differentially expressed will be identified as master regulators of specific phenotypes.
OK, so this sounds fantastic in theory. So how do we get down to actually getting it into practice? Before we could actually do anything akin to what I showed in the previous slides, we first had to come up with a reasonable map of regulatory interactions in the cell. So over the last ten years we spent a significant amount of time putting together a variety of tools and algorithms that would allow us to dissect transcriptional networks, post-translational networks and post-transcriptional networks. And then, once you have this map, and it literally took about eight years, you can start figuring out how to interrogate these maps to understand things like, for instance, the mechanism of action of a particular drug, or the genes that are causally associated with the etiology of a particular disease or with a particular physiological transition, for instance a developmental process. This is actually sort of a good slide for you in terms of references, because this was the ARACNe algorithm, which I'll talk a little bit about, and which was literally the first one that was used to dissect a genome-wide regulatory network in humans. This one has been the first one used for post-translational networks, and these are a number of tools that we've created more recently for microRNA regulatory networks and in fact to discover new microRNAs. And so I think what distinguishes our lab to some extent is that we have taken an approach that skipped over some of the simpler organisms, like E. coli or yeast, and tried to go directly into a human context. To some extent the take-home message was that it looked very complicated, but actually some of these methods work much better in human cells than they work in yeast or prokaryotes. It's a very surprising result, probably due to the nature of the regulation and the redundancy of the networks, which is higher than in yeast, and the pleiotropic nature of regulation.
So I'm going to give you a little bit of an example of how these approaches work, with a lot of emphasis on how these approaches are biochemically and functionally validated, because there are literally hundreds of methods, each one promising to work better than the other one. The proof of the pudding is in the eating, and so to a large extent we have to experimentally validate everything that we produce. In fact, if you don't have in your mind a way to think about validation before you even start putting together a computational tool, you should probably think about that first, and not work on the computational tool until your ideas about validation are very clear. So how does the ARACNe algorithm work? A leitmotif of the lab has been the use of information theory. In fact one of the people who was one of the major contributors to some of these algorithms in the lab, Ilya Nemenman, is now a professor here at Emory. And really, using information theory has been an extremely valuable perspective for us in terms of dissecting the regulatory logic of the cell, because it allows us to apply a certain theorem that is very simple, very intuitive, and that actually allows us to
eliminate an enormous number of potential false positives from your networks. So this is an example. This theorem is called the data processing inequality, and it tells you that the information, measured as the mutual information, which is in bits, that is exchanged between individuals in a phone conversation is always larger on the direct path, so between people that talk directly, than between people that talk indirectly. So if you measure information between what Mary knows and what Jane knows, there will be some mutual information, but it is the result of this transfer of information, which is indirect, and as a result it will be smaller than both the information exchanged between Mary and Joe and the information exchanged between Joe and Jane. This is very important because, in a biochemical network, one of the major problems of previous approaches, called relevance networks, was that they associated the idea of co-expression with the idea of regulation, which in fact could not be further from the truth. You can see it in this case, where you have a transcription factor regulating another one, which is regulating the target. In a co-expression network you would have inferred a direct interaction with one of these targets, which would have misled you in follow-up experiments. In this case, if you can measure, for instance by using a large set of microarray expression profiles or any kind of molecular profile data, the information transfer between all three of these pairs of variables, so TF1 to TF2, TF2 to the target, and TF1 to the target, you will find that the information transfer from TF1 to the target is much smaller than the information transfer between TF1 and TF2 and between TF2 and the target, and therefore you can eliminate it. OK.
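The pruning step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ARACNe implementation: the histogram-based mutual information estimator, the function names `mutual_information` and `dpi_prune`, the `tolerance` parameter and the simulated TF1 -> TF2 -> target chain are all hypothetical choices made for the example, and no significance threshold on the mutual information is applied.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information in bits.
    (Assumption: a simple binned estimator chosen for illustration.)"""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def dpi_prune(mi, tolerance=0.0):
    """For every triplet, drop the edge whose mutual information is
    smaller than both of the other two (data processing inequality)."""
    n = mi.shape[0]
    keep = mi > 0
    np.fill_diagonal(keep, False)
    remove = np.zeros_like(keep)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                if k == i or k == j:
                    continue
                # Edge i-j is explained by the path i-k-j if it is the weakest.
                if mi[i, j] < min(mi[i, k], mi[j, k]) * (1.0 - tolerance):
                    remove[i, j] = remove[j, i] = True
    return keep & ~remove

# Simulated regulatory chain: TF1 -> TF2 -> target (hypothetical data).
rng = np.random.default_rng(0)
n_samples = 2000
tf1 = rng.normal(size=n_samples)
tf2 = tf1 + 0.5 * rng.normal(size=n_samples)
target = tf2 + 0.5 * rng.normal(size=n_samples)
profiles = [tf1, tf2, target]

mi = np.zeros((3, 3))
for a in range(3):
    for b in range(a + 1, 3):
        mi[a, b] = mi[b, a] = mutual_information(profiles[a], profiles[b])

kept = dpi_prune(mi)
print(kept)  # the TF1-target edge (indices 0 and 2) is the one removed
```

The decisions are made on the original mutual information matrix rather than on a progressively pruned one, so the result does not depend on the order in which triplets are visited.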
So this was applied, again in 2004, in this paper in Nature Genetics, to try to dissect the regulatory logic of a human B cell, both normal and tumor related. What we show here, and I like to call these things ridiculograms or fuzzygrams, because it's almost impossible to make any sense of what they mean but they're very pretty, is the MYC proto-oncogene, which is a transcription factor. In the large circle are the direct targets inferred by ARACNe, the ones that are regulated by MYC, and in the smaller circle the indirect ones, things that appear to be regulated through some intermediary process. What we were pleased to find at the time was that about half of the ones that we had detected, the ones in red and pink, were in fact known in the literature and had already been experimentally validated. So we said, that's great, there is enrichment. But as I said, enrichment no longer allows you to get away with any kind of biological conclusion. And so we took twelve of the white circles, which had never been previously reported as MYC targets, and we did something called ChIP, which essentially measures whether the MYC protein binds upstream, in the promoter region of the corresponding targets. As you can see here, this is the negative control, this is the positive control, and this is the result, showing that in fact for eleven of the twelve genes that we tested at the time we could show very definitely that MYC was binding the promoter region. And this has been largely improved. There are several papers in bioinformatics that really expand on the theoretical component of the analysis, and also one in Nature Protocols, where we introduced bootstrapping and different types of
information estimators. And now, from the exact same data, which was a collection of about three hundred fifty microarray expression profiles from human B cells, we can identify about three hundred or more targets of MYC, and in fact with an even higher validation rate today than what we originally got. OK, so this is great, but you could say: OK, so the MYC protein binds upstream, and that does not mean very much, because MYC probably binds upstream of a lot of different genes. At the time this was considered biochemically good proof that there was regulation by the transcription factor, but in practice we no longer use that type of approach. What we do is, first of all, we don't validate just a few targets but, genome-wide, all the targets for their binding affinity to the protein, using something called ChIP-seq, which is immunoprecipitation followed by sequencing across the entire set of promoters in the human genome. The second thing that we do is we functionally validate the targets by silencing the transcription factor and seeing whether the predicted targets of the transcription factor are in fact affected appropriately by that silencing. So this is an example of a network that we used to study pluripotency and lineage-specific differentiation using a tumor model, which is adult germ cell tumors, embryonal carcinoma in particular. Germ cell tumors, and embryonal carcinoma in particular in that group, completely mimic the developmental paths of an embryonic stem cell, so when exposed to signals such as BMP
they follow exactly the same type of developmental lineages. So we used a large collection of these embryonal carcinomas and other germ cell tumors to build a regulatory network. What you see in another fuzzygram here is just the backbone, which is strictly transcription factor based. So the nodes here are only transcription factors, and the edges indicate relationships between transcription factors. So then we asked, how can we validate these networks? We looked at three genes that are particularly important for pluripotency. In fact, as you know, when you co-express these genes with MYC in a fibroblast, for instance, it dedifferentiates into a pluripotent cell; that's what iPS cells are based on. So these are OCT4, SOX2 and NANOG. Now, this is the type of validation that we do pretty routinely for these methods, and it is purely to give you a sense of how we do this in the lab today. These graphs are a very nice kind of graph that we use for a lot of different things, called GSEA, which is a gene set enrichment analysis plot. On this axis, for instance, you're showing all the genes sorted from the one most differentially expressed following silencing of OCT4 to the one least differentially expressed following silencing of OCT4, and on this axis you're showing in red the targets predicted by ARACNe. What you can see immediately is that the color coding on this part, from dark to light, shows the density of the red bars, and so you can see there's dramatic enrichment in density very close to the targets that have changed most. This is repeated for all three of the transcription factors. And this curve here is a curve that goes up a notch every time you hit a bar and goes down a notch every time you hit a non-bar, with different weights given to bars and non-bars such that you start and end at the same exact place. OK, so this
was developed at the Broad Institute by Pablo Tamayo and colleagues, and it's called gene set enrichment analysis. So you can see here that functionally these inferred targets seem to do exactly what we predicted; that is, in a particular cell line, when we silence them, the targets that we predict move. On this side what you're seeing is again all genes sorted, but this time they're sorted by the probability of their promoter binding the transcription factor, so this is the one with the highest probability of binding OCT4 and this is the one with the lowest probability of binding OCT4, according to ChIP-seq analysis. And again, if you look at the enrichment, there's almost a total black bar there at the top, which means that the density of the red bars is really very, very high right at the beginning. And in fact when we validated individual targets we got ten out of ten, ten out of ten and nine out of ten in one cell line, and the other cell line validated for the same genes. So this gives you a sense of how accurate the predictions of ARACNe are: they typically range from seventy to ninety percent in follow-up biochemical validation. Do they work for every transcription factor? They don't. In fact we published a paper on NOTCH1, and we showed that to predict NOTCH1 targets we actually have to estimate NOTCH1 protein expression, actually the availability of the active peptide in the nucleus, because some of the proteins are heavily post-translationally modified and therefore the mRNA really does not relate to the activity of the protein. But for about seventy percent of transcription factors we're able to reconstruct these networks very accurately.
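The running curve behind the enrichment plot described above can be sketched as follows. This is a minimal, unweighted illustration and not the published GSEA procedure: the function name `running_enrichment`, the toy gene names and the choice of step sizes (1/hits upward, 1/misses downward, so the curve starts and ends at zero) are assumptions made for the example, and no permutation-based significance is computed.

```python
import numpy as np

def running_enrichment(ranked_genes, gene_set):
    """Walk down a ranked gene list, stepping up on every gene that is in
    gene_set (a "bar") and down on every gene that is not, with step sizes
    chosen so that the curve returns to zero at the end."""
    in_set = np.array([g in gene_set for g in ranked_genes])
    n_hit = int(in_set.sum())
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")
    steps = np.where(in_set, 1.0 / n_hit, -1.0 / n_miss)
    running = np.cumsum(steps)
    # Enrichment score: the largest deviation of the curve from zero.
    score = float(running[np.argmax(np.abs(running))])
    return running, score

# Hypothetical ranking: genes sorted from most to least differentially
# expressed after silencing a transcription factor.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
predicted_targets = {"g1", "g2", "g4"}   # hypothetical predicted targets

running, score = running_enrichment(ranked, predicted_targets)
print(np.round(running, 3))
print(round(score, 3))
```

When the predicted targets crowd the top of the ranking, as in this toy list, the curve peaks early and the score is close to 1; targets scattered uniformly through the list would keep the curve near zero.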
OK, so the question now is: does this work only for reconstructing transcriptional networks, or can we use this approach also to reconstruct more complex biochemical networks, for instance signaling networks or metabolic pathways? I won't talk about metabolic pathways; actually Ilya Nemenman and one of my students have a paper that has just been accepted, showing that ARACNe can be used to reconstruct metabolic pathways, using also mass action law conservation. But this is a particularly interesting result because it uses a set of data that was generated by the Michael Comb group at Cell Signaling Technology, who essentially profiled phosphopeptides in non-small cell lung carcinomas. The idea is to take tissue lysate for about two hundred different tumor samples and cell lines, lightly digest it with enzymes, and then immunopurify it with antibodies that detect phosphotyrosine, so these are phosphotyrosine-specific antibodies, and then essentially use tandem mass spectrometry to simply sequence these peptides, which have seven to twenty amino acids. Then you get a map of the abundance of these peptides in the cell, and so you can treat this essentially the same as an expression profile, except this time you're not measuring expression of genes, you're measuring the availability of the phosphopeptide. There are a lot of changes you have to make to ARACNe, because there are a couple of problems. One is that this data is highly discrete, while ARACNe tends to use continuous data. The other one is that there's a lot of degeneracy: the same protein may contain multiple phosphopeptides, and actually the same phosphopeptide in terms of sequence may belong to different proteins. So we have to deal with those issues. But this is kind of the result of the analysis. This is EGFR, one of about eighty-nine kinases that were profiled by this method, and what you see here is the ARACNe prediction for the substrates of EGFR;
notably these five substrates in red and yellow were the ones that were known in the database, in the PhosphoSite database, as targets of EGFR, so you can see that the prediction is a little bit larger than what we actually knew about EGFR. What you're seeing here in green are substrates that were validated by SILAC assays, which measure essentially the change in phosphorylation of the substrate following silencing of the kinase. And so you're seeing that at this stage, with the least restrictive ARACNe threshold, about sixty-two percent of the targets we were identifying by the method were validated by SILAC. If you use a slightly more restrictive threshold for ARACNe you now get into the ninety percent range, and the ones depicted in yellow or red are still all here; they're the known targets of EGFR. So this allows us to build a full complement of tyrosine kinase phosphorylation substrates, which now generates a network for post-translational processes that has typically been quite hard to get. And I'm going to do something now that I always discourage people from doing, which is to illustrate the result of something that hasn't been introduced yet, but since I'm going to introduce it right away, and this is better explained in this context, here it is. So we used this algorithm called master regulator analysis, which I will illustrate later, to now take these phospho-signaling networks and interrogate them to identify what are the potential master regulators of the specific subtypes of the tumor, in each cell line that was available in our analysis and in any tumor sample. So first, this is a particular cell line in which the Michael Comb group found an ALK fusion construct as being causally related to tumorigenesis; there is another one in which MET was identified as the causal lesion, and this is one for EGFR. When we ran master regulator analysis on this network, these are the master regulators that we identified, and as you can see, for these two lines ALK and MET were identified. In this line we did not find EGFR, because in fact the EGFR component of the network is not well profiled; essentially, from the phosphoproteomic data we don't get enough data for EGFR, for one reason or another, and so we can't represent it well in the network. But what was remarkable is that, as is well known, silencing the ALK fusion gene has no effect on the viability of the cell, and so this is a triggering sort of transformation, but it's not necessarily one the tumor is addicted to. However, when, in collaboration with UT Southwestern, with Michael White and John Minna, we went and tested the other regulators that were identified, we found that PTK2 and one other kinase, in isolation but not in combination, seem to have a significant effect on cell viability. So this is an example, and we're now doing it for every single cell line, and we're doing all the possible combinations of these to identify master regulators of lung tumorigenesis. What is interesting here is that we were able to do this at the level of a single cell line, meaning that instead of identifying the master regulators for a subtype of the tumor we could tell: well, these are the proteins the tumor seems to be addicted to in this particular cell line and this particular tumor. And so this is kind of an opening for what we're now trying to do in terms of more personalized approaches to therapy.
OK, so everything I've told you is kind of a big white lie, and a white lie that the entire field keeps repeating, which is that it makes it look like everything in the cell is regulated in pairwise fashion. OK, almost nothing in a cell is regulated in pairwise fashion, and so this network representation that we see, where you see two nodes connected by an edge, really doesn't mean anything, because that edge may or may not be there depending on a variety of other proteins. So we tried to see whether we could still use information theory to dissect these modulation effects, where for instance the ability of a transcription factor to regulate its targets is modulated either post-translationally on the protein or epigenetically on the target gene by a variety of different proteins, including through protein degradation, post-translational modification or phosphorylation, as we've seen, of the transcription factor, or through co-factors, et cetera. So the idea was again from information theory: if you have a conversation between two individuals and it's not going that well, but there's somebody at the phone company, Tony, who starts moving around some dials, at some point the conversation may improve, and when it improves, this is captured by something called the conditional mutual information. What we demonstrated is essentially a theorem that shows that the conditional mutual information is the optimal test to detect this type of three-way relationship, where you have this kind of modulatory effect. And this maps exactly to biochemical networks again, because for instance if you look at MYC, MYC can be transcribed and translated into the MYC protein, but if there's a signal coming through the GSK3 cascade, MYC gets phosphorylated at serine 62 and tagged by ubiquitination, and is therefore very rapidly degraded by the proteasome. As a result, even though you may have a lot of mRNA of MYC, you have very little protein, and so the
correlation between the mRNA of MYC and the mRNA of a target of MYC is pretty bad, exactly like the conversation. If you now ablate that pathway, what happens is that MYC is no longer degraded, and the correlation between the mRNA of MYC and the mRNA of the target becomes very high. So again we devised, and this was published last year in Nature Biotechnology, an algorithm called MINDy that was able to essentially interrogate, genome-wide, all the possible transcription factors and all the possible modulators, to identify which protein seems to modulate which transcription factor. The way it works is actually quite easy to understand. Imagine that each one of these columns in these two heat maps is an expression profile. These two blocks represent the sets of experiments in which you have the thirty-five percent highest and thirty-five percent lowest expression of the modulator you're interested in, in this case serine/threonine kinase 38, STK38. Then what you can do is sort these again according to the expression of the transcription factor of interest, in this case MYC, so low MYC, high MYC, low MYC, high MYC, and what you're seeing here
is really a very interesting pattern. On this axis you have the targets of MYC, and the targets are not regulated by MYC when there are low levels of STK38; they're beautifully repressed and activated when there's a high level of STK38. So the assumption here is that if you see something like this, STK38 seems to be a potential modulator of MYC. Well, we found out that in fact STK38 is a very potent modulator of MYC activity. It binds MYC, and it actually binds MYC differentially, either in the C-terminal domain or in the N-terminal domain, depending on its activity. So if it's phosphorylated it doesn't bind the C-terminal domain, it only binds the N-terminal domain; if it's not phosphorylated it binds the C-terminal domain. When it binds the C-terminal domain it protects MYC from phosphorylation at serine 62, and so MYC is super stable; you get super MYC, it never gets degraded. But when it's on the N-terminal domain it actually phosphorylates MYC at serine 62, possibly through intermediate phosphorylation signals. As a result, this is what happens if you actually silence STK38, or actually if you overexpress it, which is in another paper that is now in review: the MYC protein is completely gone at the protein level but unchanged at the mRNA level. And so this can be done. In fact there were other papers where we actually, in this case, used both MINDy and ARACNe to identify a modulator of N-MYC in neural stem cells, and an effector, DLL3, that completely abrogates the effect of this ubiquitin-conjugating protein called HUWE1. And again you can see that when HUWE1 is silenced, N-MYC protein levels go through the roof while the RNA level is unchanged. Unfortunately there's a lot of light here and you can't see it, but essentially if you lose HUWE1 in the brain you now start having major morphogenesis defects, but if you remove DLL3 you completely rescue the phenotype. So you find the modulator, but you can also find through ARACNe the actual effector.

OK, so here's another one of these. This one is an interesting one; it was published a couple of years ago, and what it shows is the first map where you can actually associate signaling and drug targets to transcription factor regulation. So everything in yellow is a transcription factor, everything in purple is a signaling protein, in particular signaling proteins that are known to be druggable, and what you see is that some of them, like the ABL kinase, are both a transcription factor and a signaling protein. What you're seeing here doesn't make any sense and you cannot interpret it, but it is in fact completely machine interpretable, and it provides you with what we call a druggability matrix for the cell, with which basically you can address two things. If I want to activate or repress a particular transcriptional program, this matrix will tell you how to get it done using drugs; and if you want to actually know which kinase has contributed to a particular phenotype, this can also be interrogated to find out, and I'll show you some results.
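The modulator test described earlier, where samples are split into the 35 percent with the highest and the 35 percent with the lowest expression of a candidate modulator, can be sketched as below. This is a simplified illustration under stated assumptions rather than the MINDy algorithm itself: the histogram estimator, the function names `mutual_information` and `modulator_delta_mi`, the simulated data and the use of a plain difference in mutual information (with no permutation-based significance test) are all hypothetical choices.

```python
import numpy as np

def mutual_information(x, y, bins=6):
    """Histogram estimate of mutual information in bits (illustrative only)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def modulator_delta_mi(tf, target, modulator, tail=0.35, bins=6):
    """Compare MI(TF; target) in the samples with the highest versus the
    lowest modulator expression. A large difference suggests that the
    TF-target relationship depends on the modulator."""
    order = np.argsort(modulator)
    k = int(tail * len(modulator))
    low, high = order[:k], order[-k:]      # equal-sized tails
    mi_low = mutual_information(tf[low], target[low], bins)
    mi_high = mutual_information(tf[high], target[high], bins)
    return mi_high - mi_low, mi_low, mi_high

# Hypothetical data: the target follows the TF only when the modulator is high.
rng = np.random.default_rng(1)
n = 1000
tf = rng.normal(size=n)
modulator = rng.normal(size=n)
target = np.where(modulator > 0, tf, 0.0) + 0.3 * rng.normal(size=n)
unrelated = rng.normal(size=n)             # a gene with no modulatory role

delta, mi_low, mi_high = modulator_delta_mi(tf, target, modulator)
delta_null, _, _ = modulator_delta_mi(tf, target, unrelated)
print(round(mi_low, 3), round(mi_high, 3), round(delta, 3), round(delta_null, 3))
```

Using tails of equal size keeps the estimator's small-sample bias comparable between the two groups, which is why the difference for the unrelated gene stays close to zero.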
So I showed you some interesting results where proteins modulate transcription factor activity, but obviously you can do this for almost any modulatory effect. For instance, you could use it to identify whether a particular SNP changes the ability of a protein to regulate its targets, or you could use it for understanding whether there are third-order or fourth-order regulatory processes. In this particular case what we tried to do is to understand microRNA biogenesis. The idea came up completely serendipitously, from the fact that we could not, for all the possible constructs that we made, express a mature miRNA in a Burkitt lymphoma cell. The precursor was perfectly expressed but the mature form could not form, and when you put the same construct in a different cell it was just working fine. So we thought there must be something that participates in miRNA biogenesis that is specific to this miRNA, because the other miRNAs are working fine in Burkitt, and that is also specific to a cellular context. This is kind of a little map of the biogenesis process for microRNAs. Drosha, Dicer and the RISC complex, et cetera, are very well known; there are about fifteen to twenty proteins that are known to participate, but they're all really not context specific and they're all not really microRNA specific. What we seem to find is that there are a lot of things that seem to be totally microRNA specific. So we used this approach where, in this case, we could estimate the miRNA precursor by taking miRNAs that are contained within host genes, like for instance the locus hosting miR-15 and miR-16, so that we could use the mRNA of the host gene as a proxy for the expression of the precursor of the microRNA, and we could measure the mature miRNA using expression arrays. What you're seeing here is very interesting. When DDX10 is low, the miRNA precursor for miR-218, estimated through the proxy and ranked from low to high, does not clearly correlate with the mature miRNA. But if DDX10 is high you can see an almost perfect correlation, and these results are spectacular: you go from not correlated to completely correlated. This shows an experimental validation; in fact when you silence DDX10 you end up getting a significant difference in the availability of the mature miRNA. Here is another example where you see exactly the same situation. When the modulator is low there is in fact an inverse correlation between the availability of the precursor and the availability of the mature form, but when it is expressed at the highest level you now get a very good positive correlation, and again we validated this by experimental assays. So if this is true for miRNA biogenesis, you can now start thinking about using this also for miRNA activity. For instance, there was a very nice paper that appeared in Nature a couple of months ago, by Pier Paolo Pandolfi, that showed that the PTEN pseudogene, which is completely inactive and has absolutely no function that is known, acts as a decoy for a particular miRNA that silences PTEN. As a result, when there's high expression of the pseudogene, it acts as a decoy and titrates away the microRNA, and PTEN is no longer suppressed. But when the pseudogene is downregulated, then PTEN is very highly suppressed. And so the pseudogene is not without effect: when it's inactivated, PTEN is suppressed, and PTEN is a major tumor suppressor. So we asked the same question. We said, well, if we can now profile the mature miRNA and we can profile its targets, are there genes that modulate the ability of the miRNA to regulate the target? And again we identified both regulators that seem to be agnostic to the miRNA type, which regulate the activity of many miRNAs, and we identified
modulators that are in fact very specific, and this includes both genes that seem to be acting as decoys as well as genes that are RNA-binding genes and may modulate the miR using other mechanisms. This is experimental validation for three of them, including one for miR-106 that essentially binds PTEN, so this may be, for instance, another decoy for miR-106, which was the one identified there; actually, I think that was miR-21. OK, so once we have this completely context-specific information on regulation, we can now start bringing in additional information, and we are very, very cautious about this, because most networks that people have used so far have been basically protein-protein interaction networks. They are based on yeast two-hybrid assays, which are not context-specific and in fact are ex vivo assays, which are very unlikely to predict the ability of two proteins to bind in vivo at physiologic conditions. The proteins are expressed in a yeast host; they are not phosphorylated, because tyrosine kinases are not present in yeast, and so the environment is totally different. And if you believe that the regulatory networks are different in each one of the cells, then what you get ex vivo is just one incarnation of all these possible subtypes of networks. So we like to bring in this information, but we like to bring it in at a second stage, and this is done through evidence integration, essentially with Bayesian approaches; you can use random forests or you can use naive Bayes. It turns out that it really doesn't make much of a difference what you use: the advantage that you get by integrating more than one modality way overshadows whatever algorithm you use for integration. And so we can do it for protein-protein and protein-DNA interactions, and now we also have miR-RNA
interactions, and these are now starting to be very, very specific. We published a paper recently in Molecular Systems Biology where we used these integrative computational approaches both to identify a novel complex that combines the pre-replication complex with a second control complex, which are known to be functionally related but were never shown to be in fact physically interacting, and also to identify the master regulators of germinal center formation, MYB and FOXM1. And we're now funded to apply this type of approach to a variety of tumors and in fact non-tumor-related phenotypes, including germ cell tumors and a number of other stem cell initiatives that were recently funded, as well as B cell, breast cancer, prostate, and non-small-cell lung. All right, so that's great. I hope you now believe that we can make at least a decent effort at trying to reconstruct some of these regulatory interactions. Now what? How exciting is it to be able to say these are the targets of a particular regulator? It may be interesting to a biochemist, but from a biological perspective it tells you not very much; in fact it may even make the entire game more complicated, because now you're staring at these incredible fuzzy-grams and you don't know how to interpret them. So what we wanted to do is to go from an expansive, or let's say analytical, approach to a synthetic approach that would allow us to zoom in on a very, very small number of candidate genes that are causally related to the presentation of a particular phenotype. We introduced this method called master regulator analysis, which has been really much more successful than we expected.
But it is also extraordinarily simple to understand conceptually, and the idea is the following. Imagine that you have two phenotypes: one is normal and one is tumor-related, or maybe two phenotypes in a tumor progression cascade. Imagine that there is one regulator that is responsible for this transformation (one or more, but let's start with one). Then if your network is a very accurate representation of the targets that are repressed or activated by this regulator, either directly, if it is a transcription factor, or indirectly, if it is, say, a kinase, then it is very reasonable to assume that if this is an oncogene, its activated targets should be overexpressed in the tissue and its repressed targets should be underexpressed in the tissue. And you can actually measure this: if this representation is correct, that's what you should see, and you can measure it using these kinds of enrichment analysis approaches. What this is showing is all genes sorted from the most overexpressed in one tissue to the most underexpressed in that tissue, compared to your other phenotype, and the bars are the predicted targets of this transcription factor, TFDP1. Again, what you're seeing is that these targets all very nicely accumulate at one end of the spectrum, and therefore you can associate a certain probability that this transcription factor may be associated with the etiology of this particular disease. If you do this for every single transcription factor, quite a lot will have this type of shape, and so now you have to figure out which one works best. We did this by using a number of methods. One uses a sort of odds-ratio approach that allows us to rank the genes; if you rank the genes by p-value you get horrible results, because the p-value is completely dependent on the size of the regulon.
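The enrichment idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain unweighted Kolmogorov-Smirnov-style running sum as a stand-in for the enrichment statistic, it is not the speaker's actual master regulator algorithm, and the gene names and function names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-style running-sum statistic.

    `ranked_genes` is sorted from most over-expressed to most under-expressed.
    The score is near +1 when `gene_set` piles up at the top of the ranking,
    near -1 when it piles up at the bottom, and near 0 when it is scattered.
    """
    n = len(ranked_genes)
    hits = sum(1 for g in ranked_genes if g in gene_set)
    if hits == 0 or hits == n:
        return 0.0
    step_up = 1.0 / hits          # increment when we meet a target
    step_down = 1.0 / (n - hits)  # decrement when we meet a non-target
    running = 0.0
    extreme = 0.0
    for g in ranked_genes:
        running += step_up if g in gene_set else -step_down
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


def regulator_score(ranked_genes, activated, repressed):
    """Large and positive when activated targets sit near the top of the
    ranking and repressed targets sit near the bottom."""
    return enrichment_score(ranked_genes, activated) - enrichment_score(ranked_genes, repressed)


# Toy signature of ten genes, most over-expressed first (names are invented).
signature = ["g%d" % i for i in range(1, 11)]
score = regulator_score(signature, activated={"g1", "g2", "g3"}, repressed={"g9", "g10"})
```

Because the raw statistic and its p-value both depend on how many targets a regulator has, a real analysis would compare each regulator against a null distribution for regulons of the same size before ranking candidates, which is the concern raised above about ranking by p-value.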
And the other method that we use is a regression method, which basically allows us to formulate a sort of pseudo-kinetics of the regulation process, so that among the genes that come up as candidates, say fifty, we can ask what is the rank of the top five and which one regulates the largest subset of the genes that change in that context. So let's see if this works. What we did here is silence three different transcription factors in a human B cell (this is a Burkitt lymphoma): BCL6, FOXM1, and MYB. And we asked the question: if you eliminate a transcription factor by siRNA, can you infer which one was down-regulated by the silencing? As you can see, we used two different approaches for the gene set enrichment analysis, and it really doesn't make much of a difference. In most cases they all came in at the top; in one case one came in at number two with the original gene set enrichment analysis. The other is a method that we've developed in-house, but it really doesn't make a difference: essentially, no matter what you do, you are always in the top ten, and this is basically where you want to be, because then it's very easy to validate things. Does this work post-translationally? This is a much harder case. So we took the protein that turned out to be the most pleiotropic kinase in a B cell, which is in fact STK38, for which we already had silencing data. And this is setting yourself up for failure, because this protein regulates more than one hundred transcription factors, so the response you're going to have is extraordinarily heterogeneous, and to think that you are able, from your silencing data, to reconstruct which protein had been silenced is a big leap of faith. In this case, what MINDy did for us is that it allowed us to predict which transcription factors are affected, and in particular which targets of those transcription factors should have been affected. Now you have a bigger, what we call
extended regulon, and we could use the same enrichment analysis on these targets rather than on the direct targets of a transcription factor. You see here that STK38 came up as number five out of seven hundred seventy-two proteins, and in fact the one that came up as number one, CDC5, shares more than half of its regulon targets with STK38. So this also gives us something interesting, which tells you that you can now start using this information on how targets are shared to try to reconstruct the topology of the signaling pathways: if a signaling protein sits right on top of another one and regulates it, you would assume that this one would be slightly more pleiotropic than that one, and in fact cover both the targets of that one and some other targets. So this seems to be working fine in vitro, OK? And this is a big leap of faith, because now we have to show that it works well with an actual biological phenotype. So the story that I'm going to tell you was for us a very big leap forward in applying this type of approach to biological systems, and it was motivated by a paper by Ken Aldape at MD Anderson, who discovered about five years ago that patients affected by glioblastoma multiforme, a really bad form of brain tumor (it is very rare, but it is the most frequent form of brain tumor that is malignant), were segregating into three subgroups. So these are patients and these are genes; red means overexpressed in this patient, green means underexpressed in this patient. You see here there was a group of people that were overexpressing proneural genes, a group that were overexpressing mesenchymal genes, and a group that was expressing proliferative genes, and these clusters correlated with a prognostic signature, with the mesenchymal subgroup doing the absolute worst in terms of their life
expectancy. And so a cancer biologist would have asked the question, what are these genes? They would have looked very hard at this group and at this group and said, is there something I know about these that reminds me of something? Whereas what we can now ask, after we reconstruct the regulatory network, is who regulates the signature. Because if you change one single regulator you are going to have a massive effect at the level of the genes that change, and not all of them are going to be causally related to the phenotype. In fact what I'll show you is that the genes that we discovered and demonstrated were related to the phenotype are not in this group, and you would never have found them by just looking at differential expression analysis. So how do we do this? We first used ARACNe to reconstruct a complete transcriptional regulatory network; this is the computational network that was generated. Then we ran MARINa: we used this network and we asked which transcription factors are, according to their regulatory pattern, the master regulators of the mesenchymal subtype. It turns out there are only five or so. Everything in cyan here is a mesenchymal gene in the signature I showed you; these five transcription factors seem to up-regulate the signature, and this one in blue down-regulates the signature; in fact this one is lost in every tumor, it is a typical tumor suppressor gene. Is this fantasy, computational fantasy, or is this reality? It is reality, because we then went and validated all of these transcription factors, both in terms of binding activity (about eighty percent of the tests that we ran, for each one of the four transcription factors, where the last one is a negative control, showed that they were binding upstream of their targets) and in terms of functional silencing: we silenced every one of the transcription factors and showed that in fact the genes that are most differentially expressed are
precisely the ones in the mesenchymal signature. We actually both silenced them in human cells with high expression and overexpressed them in neural cells with very low expression, to see that both were consistent. One thing that was very interesting came when we looked at these five transcription factors and did a little bit of biochemistry on them. ARACNe is not very good at determining direction: if one is a transcription factor and the other one is a target, then we know the regulation can only be going this way, but if both are transcription factors we don't know who regulates whom. So we had to do some ChIP to find out who is binding upstream of the other, and we came up with this topology, which is very interesting because it's completely modular: essentially almost every transcription factor seems to be regulating the others, and it is completely hierarchical, with everything flowing down this way, which we didn't expect. And two genes, C/EBP (which has two subunits, the beta and the delta subunits) and STAT3, which both came out very high in our analysis, seem to be on top. Moreover, what was really remarkable is that when you look at the enrichment of mesenchymal genes in C/EBP targets or STAT3 targets, it was very, very significant, about six percent; but when you look at the enrichment of mesenchymal targets in the intersection of the regulons, it was thirty-six percent: almost half of the genes that they co-regulate were in fact in the signature. And this led us to believe that maybe what we're seeing here is in fact a synergistic effect. So how did we go and validate this? These are murine neural stem cells; when you deprive them of mitogens they develop along a neuronal lineage, so they become neurons. You can see here, when you ectopically co-express STAT3 and C/EBP
beta, but not if you express only one of them, there's a very significant morphological transformation: the cells become very flat and they begin to look like fibroblasts, and when you deprive them of mitogens they develop along a mesenchymal lineage. This is a completely aberrant lineage, because there's no way that physiologically a neural stem cell will ever go into a mesenchymal lineage. And what you see here really blew our minds, because these are major markers of the mesenchymal subtype, and we also did scratch assays and Matrigel invasion assays and showed that these cells become really invasive and migratory when you overexpress both. Here you're seeing these markers of the mesenchymal subtype, and if you overexpress just STAT3 or C/EBP beta they tend to go up a little notch, you know, twenty percent, thirty percent, fifty percent; but if you co-express both of them there is anywhere from a ten-fold to a one-hundred-fold increase in their expression. So this shows that these genes are sufficient to induce mesenchymal transdifferentiation. OK, but are they also necessary in the tumors? Is the tumor addicted to them? What we show here are cell lines from high-grade human glioma, so these are taken directly from patients and cultured. It's not very visible, but the markers in red and green here are again markers of the mesenchymal subtype. You can see that STAT3 silencing removes a little bit of them, C/EBP beta silencing removes a little bit of them, but when you silence both of them they're completely gone; again you see the synergistic type of behavior. So now this shows that they are both necessary and sufficient. This is still not completely satisfactory, because this is again in vitro data, so we went in vivo and did essentially intracranial injections of high-grade glioma cells in mice, and, as you can see from this one,
these are incredibly tumorigenic: they all die in about one hundred twenty days. If you silence STAT3 or C/EBP beta or the combination, you can see that with STAT3 silencing they do a little bit better in the beginning, but essentially they all crash very rapidly and all die at pretty much the same time. With C/EBP beta silencing they do significantly better. We repeated the experiment, for twelve mice in total (this one is the result of the first experiment, six mice), and out of twelve mice three of them died and had signs of a glioblastoma; when you silence both, hardly any of the twelve had any evidence of a tumor. This is a tumor you're seeing here; this is a glioblastoma, and this is what it looks like, essentially like a coastline that has been eroded, because the tumor cells are migrating into the nearby tissue at such a speed that by the time the surgeon resects it, it's already in the other hemisphere of the brain. This one is a completely immobilized tumor; it is not migrating at all into nearby tissue. OK, and so this essentially proved definitively that in vivo these genes completely change the tumorigenic potential of these cells. We also went to human again, in a completely different cohort, and retrospectively checked whether these could be good markers for prognosis, and what we saw is that in fact one hundred percent of the patients that were double positive for STAT3 and C/EBP beta were dead in about one hundred twenty weeks, whereas fifty percent of the patients that were double negative were in fact still alive. So this seems to be one of those really helpful markers for tumorigenesis. So you may ask, is your network rigged? Are you always going to come up with the same master regulators no matter what question you ask? This shows that if you use, instead of the mesenchymal signature, the proneural or the proliferative signature, you get very different answers, and they make a lot of sense; for instance Olig2 is a well-known regulator of neural
targets, and you can see that it is lost in the mesenchymal and gained in the proneural, but most of the other master regulators are essentially completely independent. And the other question you can ask, and in fact this was asked by the reviewers of the Nature paper, is: how robust is this? You know, a lot of people do the exact same experiment with differential expression analysis and the overlap of the differentially expressed genes is like ten percent. How well can you repeat these analyses if you put in different data sets? The answer is that we ran it on different data sets, interrogating a completely new panel, and this shows the overlap in the three predictions: essentially perfect, almost complete overlap. These are the master regulators: twenty-two out of about forty that we identified were in fact found by all three data sets, about fifty percent, and for the ones that are found by at least two out of three that number grows very significantly, so only a minority of them were not found by at least two of the data sets, and in fact the five that we had identified as a module all remain. OK, so now, if you believe this, that is, that one hundred percent of the mesenchymal subtype is controlled by these two genes, you have to ask: are these genes harboring a genetic alteration, or is something upstream of them altered? It changes the name of the game from what we call genome-wide association studies to something that we now call pathway-wide association studies. So, and this is unpublished data, we used both MINDy and ARACNe to go upstream of the master regulators and identify genetic lesions, copy number alterations, et cetera, and epigenetic alterations that were associated with the phenotype.
All right, so now this allows us to formulate a new theory of cancer, and this is where things get fun and, you know, potentially controversial. If you are a pharmaceutical company you have two goals: one goal is to develop appropriate therapeutic targets and the small molecules that modulate them; the other goal is to identify biomarkers that can be used to stratify the population. For instance, HER2 in breast cancer is valuable both as a biomarker and as a target, and if you had to run a Herceptin trial in the general breast cancer population it would fail miserably. So the typical places where you find both biomarkers and targets are these: both can be found in terms of the genes that harbor genetic alterations, but they can also be found in terms of genes that have differential expression. What we are arguing is that this is exactly the wrong place to look. Why? Because if you look here you're over-segregating your disease: only a very small percent of breast cancers have HER2 amplification, only a small percent of lung cancers have EGFR amplification, and likewise only a small percent of cases in other diseases carry a given amplification. So now you're building a drug that works in only ten percent, twenty percent, thirty percent of the cases. And here, in differential expression, it's absolutely idiotic to look for biomarkers, because this is like going to the scene of a plane crash and saying the biggest broken piece is the cause of the crash. And the reality is that, sitting on the NCI board, we reviewed the progress of the Early Detection Research Network, and they had to validate sixteen markers that were produced using this approach, and nine out of sixteen validated in studies and follow-up studies, I believe; this is the wrong way to look for biomarkers. Now they've done about twenty-five.
One of them is validated, but that gives you kind of the ratio between how many things you have to try and how many things are going to validate. However, note the following. There's an almost infinite spectrum of genetic alterations: almost every patient in cancer will have their own particular combination of loss of tumor suppressors and gain of oncogenes, and there's almost no general pattern. There may be some individual gene that has a higher or lower frequency, but in general the pattern is very patient-specific. However, if you look at the molecular phenotype level, things seem to fall into a small number of categories. If that is the case, if you believe this, then there must be some integrative logic, what we call master integrators, that will integrate these genetic signals and epigenetic signals and produce a relatively small number of distinct phenotypes. That is what we're saying, and I hope I've demonstrated it for the glioma case, but I wanted to also give you a couple of other references. So this is the glioma paper in Nature; there's another paper in Nature we published last year, on large B-cell lymphoma, where we showed with Riccardo Dalla-Favera that in fact NF-kB itself is not genetically altered, but it integrates about twenty different mutations in the BCR pathway and in the CD40 pathway that contribute to the difference between the ABC subtype and the GCB subtype of diffuse large B-cell lymphoma, which are very, very different: ABC is very aggressive, and GCB is treated relatively well with current therapy. And the following one is on glucocorticoid resistance in T-ALL. So this basically says this layer here is a great layer to look at, both for therapeutic targets and for biomarkers. How much time do I have?
I mean, it's OK. So let me go very quickly over drug mechanism of action, because now that we have the targets and the biomarkers, can we couple them with potential therapeutic compounds? We looked at whether we could actually use the Connectivity Map, which is a map created by the Broad Institute in which they perturbed cell lines with particular compounds and then built the expression profiles. We tried two different types of approach. One is where we repeat the same perturbation with the same drug but in a different cellular context, in this case a diffuse large B-cell lymphoma, and what we can see is that on average about fifty percent of the drugs could be identified in the Connectivity Map, but about fifty percent really could not be identified by matching the profile. So cell context is very important, and unless you're going to profile every single context you may have some problems. The other thing that we wanted to know is a more important question: this is in vitro; how about in vivo? If I give a particular drug to a mouse and I now take a sample from its prostate, can I match it to the corresponding drug in the Connectivity Map? We are doing this very large project, not to do this but to build an in vivo interactome in prostate cancer, where we took mice from fifteen different genetically engineered strains that are predisposed to prostate cancer and crossed them with a panel of about fifteen drugs, which would therefore produce a lot of different perturbations, both genetic and drug-related, of the prostate tissue. We have now finished quite a number of them; these are all the ones that we've already analyzed. In this project we also want to compare the human interactome and the mouse prostate interactome. What was very interesting is that, out of the drugs that we tested that could be compared with the Connectivity Map, only one, which is rapamycin, which maps to a compound that is also an mTOR inhibitor, could be identified as very highly matching in the Connectivity Map. So when you go in vivo you're even more unrelated to what happens in the in vitro situation. So what I wanted to do is see whether we can actually get a better sense of the mechanism of action of drugs in a way that is independent of the context, so that you find not just the profile, which may involve all sorts of different processes, et cetera, but actually the real mechanism of action of the drug. And we used this other cool idea, published in two thousand and eight, which basically says: if you have a regulatory network and you perturb it with a drug, instead of looking at which genes change, let's look at which interactions change. If you look at the gene you may completely lose it, because if you hit it post-translationally with a drug, or inhibit a kinase, you're never going to see any differential expression of the kinase, unless you see some complex feedback loops; but if you look at the interactions of the kinase, they are going to change quite significantly. So we did this, and basically what it does is: you take two phenotypes, drug-treated and untreated, and you end up seeing which edges in a particular regulon change. We did this for two drugs that have an almost identical mechanism of action: one is called methotrexate and one is called pralatrexate. They inhibit an enzyme called dihydrofolate reductase in the cell, and one is more potent than the other.
The important thing is that pralatrexate also seems to have an amazing result in patients that are refractory to methotrexate in lymphoma. So I wanted to see whether this comparison could actually elucidate a little bit of the difference. What we saw was very interesting. Not only did the method identify dihydrofolate reductase in the top twenty (actually the top twenty-six or thirty) for each one of the two drugs, that is, among the genes that sit in the most dysregulated area of the network, but fifty percent of the other targets were in fact exactly the same; whereas when we looked at differential expression analysis, less than ten percent of the differentially expressed targets were the same. But also, if you look at the ones that are specific only to PDX, you can see that there seems to be a dramatic reactivation of DNA-damage-dependent p53 apoptosis control, or programmed cell death control. And this is interesting because in these B cells, which undergo somatic hypermutation and mutate their own DNA, you want to completely block any DNA damage response, because otherwise they would kill themselves while they're going through selection for the immune system. So this basically is a drug that seems to be able to reactivate that mechanism, and that may be why it is in fact more effective. And there's a number of other changes. So this allows us, now that we have put all these things together, to think about a completely different way in which we can go in terms of translational research and development.
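The edge-level comparison described above (look at which interactions change between untreated and drug-treated samples, rather than which genes change) can be sketched as follows. This is a minimal illustration, not the published method: it assumes Pearson correlation as the measure of interaction strength, and the gene names, toy expression values, threshold, and function names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def dysregulated_edges(edges, control, treated, min_delta=0.5):
    """Return network edges whose correlation changes by at least `min_delta`
    between the control and treated conditions, largest change first.

    `control` and `treated` map gene name -> list of expression values.
    """
    changed = []
    for a, b in edges:
        delta = pearson(treated[a], treated[b]) - pearson(control[a], control[b])
        if abs(delta) >= min_delta:
            changed.append(((a, b), delta))
    return sorted(changed, key=lambda item: -abs(item[1]))


# Toy data: the kinase K has the same expression in both conditions, so a
# differential expression test on K sees nothing, yet its edge to T flips.
control = {"K": [1, 2, 3, 4], "T": [2, 4, 6, 8], "U": [1, 3, 2, 4]}
treated = {"K": [1, 2, 3, 4], "T": [8, 6, 4, 2], "U": [1, 3, 2, 4]}
edges = [("K", "T"), ("K", "U")]
result = dysregulated_edges(edges, control, treated)
```

In this toy case only the K to T edge is reported, which mirrors the point made in the talk: a protein hit post-translationally may show no expression change of its own while its interactions change substantially. A real analysis would need enough samples per condition and a significance test for each edge.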
You see this pathway: this is the path that is normally followed today for bringing drugs into clinical trials. First you have to find a compound, then you have to go into medicinal chemistry and refine it; you need to understand the mechanism of action; you need to understand biomarkers that can be used to identify the population but also to follow efficacy within the population. When you have all of that, which takes six to ten years, you can go into clinical trials. So a group of centers, including my center at Columbia, received money to see whether we could do this in two years, and it includes Stuart Schreiber at the Broad, John Minna and Michael White at UT Southwestern, Scott Powers at Cold Spring Harbor, Bill Hahn at Dana-Farber, and my center at Columbia. And the idea was: can we use these kinds of functional genomics and systems biology approaches to do all of these things at once in two years? We actually signed up to do it for three different cancers in two years, which we think is crazy, but actually the results are already coming after only one year and they look very, very promising, starting with drug resistance and also for GBM. And there's a paper that just came out; it's not a research paper, it's peer-reviewed but it's basically a commentary, co-authored by the PIs of the centers, explaining how you can use this type of approach to very, very quickly identify all the major ingredients that can drive into therapeutic trials. And the way we're doing it at Columbia is by coupling pooled RNAi screens around potential master regulators, the systems biology approach, and evidence integration to define a combined list, and then screening small-molecule libraries.
These are tens of thousands of compounds, and you look there for a particular molecular signature or a particular phenotype, and then we combine these molecules with these mechanisms through methods such as MARINa and IDEA. And just in the last slide I want to show that we have already worked with Stuart Schreiber on compounds that can inhibit C/EBP beta and STAT3: we identified thirty-seven compounds that bind STAT3 very specifically, using a new chemistry that came out of the Schreiber lab, and this shows the kind of synergy between what the various centers are doing. We're now testing them using cells that have been instrumented to report differential activity of these transcription factors. So, a conclusion and a sort of reflection for you to carry away from this talk. I hope I haven't lost too many of you, because obviously there was a lot of material here, but I wanted to really present all of it because if you miss a piece you miss the global story. One piece of the global story is that the current emphasis today is on genes harboring genetic alterations. This is great, because you can find these genes by sequencing or by genotyping; unfortunately, these genes may not be the best entry point for therapy. Number one, if you have a deletion, it is very hard to deal with a gene that is deleted using therapeutics. Number two, if it's amplified, it may be amplified in a very small subset of the population, so it may work, but once you hit the first, second, and third most frequent ones, anything after that starts to be in the two percent range. Are you going to develop a drug for something that hits only two percent of the population with a disease? The second point is that current approaches to biomarker discovery should be re-evaluated, and the reason is, again, going back to the airplane crash analogy, this issue of assuming that bigger is better; in some cases, as I've shown you, the genes that seem to be the most sensitive
to determining the subtype are in fact very dual differential Express. The third point is that when we look. It's much better instead of starting with a Jew OS to end up with a Jew OS once you have the functional data you can now walk upstream pathways and look for lesions or alterations or even variances that are normally an error in the population in a much more targeted directly way within the pathways and finally this is something that we're working on right now with this we have project called a thousand drug project which is a misnomer because it's not needed one thousand or more drugs but it basically says that what we want to do is identify a repertoire of model small molecules that can hit really different and orthogonal targets within the human genome right now we're hitting with all the entire pharmacopoeia of drugs that we have less than two hundred genetic targets and we tremendous side no off target effects if we could have about a thousand compounds that hit distinct I guess we would be able to combine them very quickly and essential drive through combination there so let me just thank a number of people so these are people in my lab and sort of computational components of the lab an experimental component is our collaborators to your account of far as lab of our own and that's relevant all the work in G.B.M. also we can all dopy clear about a shot in my question for the work in prostate cancer in mice the store Schreiber's lab for the again working G.B.M. MICHAEL WHITE working line on cancer. Funding sources Thank you. So. This is. The right ones. All the. Way. Very low so very very non differential expressed right. So. We're. Right. Here. 
No, the point I was trying to make was a different one. For the one that was at four thousand, I can tell you it was not statistically significant. C/EBP and STAT3 were also not: they were at four hundred and fifteen hundred, and they were not statistically significant, OK, at a false discovery rate of five percent. No, I mean, it is not just a t-test, because obviously you have to correct with Bonferroni or false discovery rate, typical things like this, but that's not really the point. The point is that you would never go to the four-hundredth gene to validate it if all you have is a result from differential expression, OK? And you would never go to C/EBP and STAT3. And this one is different still, because it has never been related to cancer before. So if you use typical approaches, even if you went through the list and said "these are all the regulators", so you now go from two thousand to maybe three hundred, it would still be at the end of the list and you would still probably overlook it, OK? While if you use this sort of approach, it actually comes right on top. And at the protein level, with immunohistochemistry, you can see this differential expression, but at the gene expression level there is no differential expression.
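The multiple-testing correction mentioned in this answer (Bonferroni, or a false discovery rate of five percent) can be sketched in a few lines. This is a minimal sketch and not the analysis used in the talk: the function names `bonferroni` and `benjamini_hochberg` and the example p-values are illustrative assumptions, and Benjamini-Hochberg is shown as one standard FDR procedure because the talk does not say which one was applied.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, that is True where
    the hypothesis is rejected at the requested false discovery rate.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten genes (illustration only).
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("Bonferroni:", sum(bonferroni(p)), "genes pass")
    print("BH at 5% FDR:", sum(benjamini_hochberg(p)), "genes pass")
```

With either correction, a gene that ranks in the hundreds or thousands by differential expression would typically not pass, which is the situation the answer describes for genes that only stand out in the network-based analysis.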