So we've got to pick the fruits we want to harvest off the tree, which will be the products. We've got to find the pathways to get there. We've got to look at the feasibility of the process, the control structure, and all the other items listed; that doesn't mean they're afterthoughts, but the resource conservation and environment, safety, and health strategies need to be taken into account as well. Each of those problems is pretty complex and hard to handle, even more so if we start looking at them at the same time, so there's a lot of work in the systems community trying to address even just small combinations of these. Now, getting into the math of it, and this is the part I'm going to skip over pretty quickly, you can generally describe any process or product design problem by a series of five or six equations. In a very simplified form, you'll have some kind of objective function; you'll have a process or a product model, maybe even both; you'll have some equality and inequality constraints; and some structural constraints. There are two main ways you can solve this. You can do it based on heuristics or knowledge, which means you try to come up with a feasible but not necessarily optimal solution, or you can do the brute-force mathematical optimization, where you solve the objective function subject to the constraints. The problem with the optimization approach is that, depending on the complexity of the problem, you may not end up with a solution at all. So another way to do it is a hybrid solution approach, where you use knowledge or heuristics to bound the variables a little before you invoke the numerical solver. However, regardless of which approach you take, the key ingredient is going to be the process or product model. That is always going to be there, and it's that model that dictates the kinds of solutions you can get and the complexity of the solution step. So what I want to do is spend just a couple of minutes on the models at play, particularly the property models, because that's usually where all the complexity is when we're trying to solve these kinds of integrated problems. If you go back to the product tree, different constitutive equations or models cover different areas of the product range. Some will be best at handling a certain class of chemicals, and others will handle another class, so I've tried to illustrate that with three different colors. That raises a couple of questions. When you choose the property model, you have implicitly chosen the search space you can handle as well. Do we know enough to derive one single model? Probably not, and even if we did, the time it would take to develop one that covers everything would probably not be worth the investment compared to using a couple of different hybrid approaches. So the question we would like to address from a systems-level perspective is whether we can solve simulation and/or optimization problems where multiple models represent the same variable. In order to do that, we need to look at what we use models for. Generally, the property model represents the physical properties as a function of the intensive variables, like temperature, pressure, and composition, and the component identities. It basically provides a service to your process model: the process model asks for physical properties, and the property model provides that information in terms of temperature, pressure, composition, and the compound identities.
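By the way, if you write that simplified formulation out, it looks something like the following; the symbols here are generic placeholders of my choosing, not any one specific formulation:

```latex
\begin{aligned}
\min_{x,\,y}\quad & f(x, y) && \text{objective (cost, performance, ...)}\\
\text{s.t.}\quad & h(x, y) = 0 && \text{process and/or product model}\\
& c(x, \theta) = 0 && \text{constitutive (property) models}\\
& g(x, y) \le 0 && \text{equality and inequality constraints}\\
& S\,y \le s,\quad y \in \{0,1\}^n && \text{structural constraints}
\end{aligned}
```

Here x would be the continuous variables, y the discrete structural decisions, and θ the physical properties that the property model returns.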
And the process model is just some box relating the raw materials to the products. If you solve it more than once, then it's a simulation, whether you do that in Aspen or by hand or whatever; any time you solve the same set of equations more than once, it's basically a simulation. In a simulation you use some set of design parameters, guess the unknowns, and iterate; as you vary the design parameters, you evaluate the results, and that you can feed into an algorithm that handles synthesis and design. In this case, the property model provides a service to the process model. Now, back in the eighties or so, a lot of work was done on trying to use synthesis methods as a means of bounding our design targets and the feasibility of the different approaches. When we do that, the property model provides advice on the ranges we can expect the properties to have, and so on. So those have been the two main ways we have used property models in the past. What we would like to do is use them as part of the solution directly, not just as an embedded part of the equations. But in order to do that, we need to reverse the flow of information, so that we can solve the process model in terms of properties, and the property model now returns the intensive variables, meaning the temperature, pressure, and compositions, and/or the compound identities. That is a different way of approaching it from how we've done it in the past. The beauty of it is that, in general, by decoupling the two, the size of the problem becomes significantly reduced, because all the nonlinearity is generally buried inside the property model, not so much in the process model, which mostly consists of balance and constraint equations. So we came up with, and I can't claim all the credit for this, I'd like to but I can't, my Ph.D. advisor was the one who came up with the first part of it, the reverse problem formulation. It basically provides a computationally more efficient solution strategy. What we're doing is a targeting approach, where we try to target the actual solution before we do any complex calculations, and it allows us to decouple the two types of equations; I'll show you that in just a second. The beauty of doing a targeting approach is that it relieves a little bit of the inherent iterative nature that design has. Conceptually, here is how it works: you decouple the constitutive equations, which include all the phenomena and property models, from the balance and constraint equations. When you do that, you take one conventional forward problem and reformulate it as two reverse problems. First you solve the balance and constraint equations to identify the targets, basically doing a reverse simulation; that gives you the design targets, the values of the constitutive variables. The second reverse problem is to solve the constitutive equations to match those targets. The benefit of this is that every time you match the targets, you don't have to re-solve the balance and constraint equations. It also means that you can have any number of different property models representing the same constitutive variables, and you don't need to solve the balance equations again if you know the targets. All you do is decouple the two problems, solve for the target, and then try to match it, and that makes it a lot more efficient from a computational standpoint.
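To give a flavor of that two-step targeting idea in code, here is a minimal sketch; the balance equation and the two interchangeable enthalpy models are made-up placeholders, not models from the actual work:

```python
from scipy.optimize import brentq

# Reverse problem 1: solve the balance equations for the constitutive
# target -- here, the specific enthalpy a stream must reach.
def enthalpy_target(duty, flow, h_in):
    """Energy balance Q = F*(h_out - h_in), solved for the target h_out."""
    return h_in + duty / flow

# Reverse problem 2: match the target with ANY available property model.
# Two interchangeable (hypothetical) enthalpy correlations for the same variable:
def enthalpy_model_a(T):
    return 2.1 * T                  # simple linear cp model

def enthalpy_model_b(T):
    return 1.8 * T + 0.001 * T**2   # quadratic correlation

def match_target(h_model, h_target, lo=250.0, hi=600.0):
    """Invert a property model to find the temperature hitting the target."""
    return brentq(lambda T: h_model(T) - h_target, lo, hi)

# The balance equation is solved once; each property model is then
# inverted independently against the same target -- no re-simulation.
target = enthalpy_target(duty=1680.0, flow=10.0, h_in=630.0)
for model in (enthalpy_model_a, enthalpy_model_b):
    print(match_target(model, target))
```

The nonlinearity stays inside the property models, and the balance equations never have to be revisited when you swap models.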
OK, now, the constitutive equations are one way of representing the physical properties, and that's the part I'm going to focus on today. So why would you want to do design based on properties in the first place? There are a lot of processes that are driven by physical properties rather than by chemical components, and a lot of the motivation for this work came out of my time at Auburn, initially, when we started working with the paper industry. As chemical engineers, all you students are used to basically equating quality with purity. The paper makers can't do that: you can have one hundred percent pure cellulose going into and out of the paper machine, and the product can still be worthless. So standard design techniques that use composition as the measure of quality pretty much fail at designing these kinds of processes, because the quality depends on physical properties like opacity, reflectivity, and other metrics that are not related to the purity of the cellulose but depend on things like fiber length, the condition of the fibers, and so forth. There wasn't a systematic way of handling that, so that's what we wanted to start out trying to address. A lot of performance objectives are described in terms of properties as well, because really, in the design field, you don't care what the name of a compound is until you have to buy it; up until that point, it serves a purpose, meaning it has certain properties or functionalities. If you need to do a solvent extraction, you really don't care, conceptually, whether it's called benzene or toluene until you have to buy it; up until that point you worry about how well it does the job. So those decisions are based on the properties, not so much on the name or identity of the components. And when you get into polymer design in particular, it's all based on properties. So we thought there was an opportunity to look at properties as an interface between these different types of problems. What we came up with was what we call property clusters. It is not meant to replace any of the composition-based techniques that we all know and love, because the types of problems that the composition-based techniques can handle, they handle very well. This is meant to extend those kinds of methodologies to the types of products or problems that we can't solve otherwise. We wanted to come up with a way to reduce the dimensionality of the problem in order to visualize it, because that is very helpful in trying to gain insight. We generally do the property estimation for the molecular design using group contribution, but I'll show you, at least in passing, some examples of other methods we've used. Those of you who watched Professor Visco's seminar here in November will have seen a lot of the work that we've done on that as well; we've used a lot of his methods for representing the nontraditional properties. So basically, this provides us a framework that allows us to approach both the molecular design problem and the process design problem from the properties perspective. I'm not going to detail all the math that goes into this, because it's really not that interesting, and the truth is it's not that complex.
So it's not worth the time, but basically what you do is take the physical properties and rearrange the description into a combination of a Euclidean vector and a scalar function, and you end up with these conserved surrogate properties that all have linear mixing rules. Now, those of you familiar with mixture analysis are going to flinch when you hear the words "linear mixing rules," because that rarely happens, and that is true. What we do instead is write it as a linear mixing of nonlinear terms. The easiest way to explain that is with something like density: density does not have a linear mixing rule, but inverse density does, so you can hide the nonlinearity in the functional description. When you do that, what you basically do is reverse the representation of the composition space. Normally, we would represent, on a ternary diagram, the species on the vertices and the properties inside the space. We basically reverse that: we make one ternary diagram, put the functions of the properties on the vertices, and the components on the inside. That has a lot of benefits. When you have linear mixing rules, even if the terms themselves are nonlinear, all the mixing operations are straight lines, which lets you do simple mixing optimization directly, visually. I'll show you a little bit of the algorithm: basically, there's a linearization step, then a normalization step, and then the original formulation required the identification of this augmented property index, which was named AUP as it was developed. Instead of an API, I can't claim credit for that either; that was Dr. El-Halwagi's idea, but I thought it was comical, so we left it that way. OK, but there are a few conditions that need to be met when you're solving these problems, particularly visually. First, you have to be inside the feasibility region for the given unit you want to feed; that's the necessary condition. Then there's a sufficient condition: at the same point, you have to match the value of this index. The analogy you can use is that a location on the diagram corresponds to a set of three mass fractions; the only way you get a unique solution is if the total mass remains the same for both solutions. There's a similar analogy for the AUP. OK, let me show an example. The first one we did was from the paper-making industry, and it's a simple recycle problem: you have some fibers that go into the paper machine; some of it turns into paper, some of it is rejected and then repulped; some of that goes to waste, and some has potential for recycle. The streams are categorized not by the purity of the cellulose fibers but by their properties. The objectionable material was basically just a mass fraction; the absorption coefficient and reflectivity, however, come from common paper-making theory. Then you're given some information on the properties and flows of the fibers and the broke fibers, and the conditions the paper machine can accept. So there's a lot of data in a table, but no worries; the interesting thing is these three numbers, and the fact that we have two linear and one nonlinear property descriptions. But the beauty of it is, and this is something that Professor Ralph can probably comment on, I'm sure he teaches it in his courses as well, you can look at a lot of this data.
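Just to make the linearization and normalization steps concrete, here is a minimal sketch in code; the operator functions, reference values, and stream data are invented for illustration and are not the actual example data:

```python
# Property operators: nonlinear functions chosen so that they mix linearly.
# Density has no linear mixing rule, but 1/density does -- the nonlinearity
# is hidden inside the operator function itself.
def operators(density, objectionable_frac, reflectivity):
    return [1.0 / density, objectionable_frac, reflectivity**5.92]

REF = [1.0 / 1000.0, 0.01, 0.80**5.92]   # hypothetical reference operators

def to_omega(props):
    """Linearize, then normalize each operator by its reference value."""
    return [op / ref for op, ref in zip(operators(*props), REF)]

def to_clusters(omega):
    aup = sum(omega)                      # augmented property index (AUP)
    return [w / aup for w in omega], aup  # cluster coordinates sum to one

def mix(om_a, flow_a, om_b, flow_b):
    """Flow-weighted mixing in operator space -- a straight line."""
    return [(flow_a * a + flow_b * b) / (flow_a + flow_b)
            for a, b in zip(om_a, om_b)]

fresh = to_omega((980.0, 0.002, 0.85))   # hypothetical fresh stream
broke = to_omega((920.0, 0.030, 0.70))   # hypothetical recycle stream
blend, aup = to_clusters(mix(fresh, 70.0, broke, 30.0))
print(blend, aup)  # the blend lies on the line between the two stream points
```

The three cluster coordinates always sum to one, which is what lets you put any number of components on a single ternary diagram, and the AUP is the extra index you carry along to make the mapping back to real properties unique.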
The beauty of this approach, when you think of it from a systems perspective, is that if you have a paper machine that requires a minimum of one hundred tons going into the feed, and we have a maximum of thirty tons of recyclable material, then no matter how well we design anything, we're still going to need seventy tons of fresh material to be fed in. That's an important target to know, because it's something we can compare all our other solutions against. We have two different ways we can do this: we can do a direct recycle of the fibers, or we can make some kind of change to them and then recycle them. When you take all the data and translate it into the property space, this is what it ends up looking like: the constraints on the paper machine become a region, the fresh fibers are located here, and the recycle fibers are located here. Since our mixing operations are straight lines, it's pretty easy to find which point will maximize the recycle of the broke fibers; it ends up being the point where you have the longest lever arm for the broke fibers. You can then back-calculate the flow rate; it ends up being eighty-three tons an hour. You know that the maximum direct recycle we can do would still require a fresh fiber feed of seventy tons, so we would have to do something to change the properties of the broke fibers in order to get there. You can then solve the problem backwards again with a new target, saying that we want to be able to recycle all of the broke fibers. When you do that, you have to move the properties to this point. Again, this is all solved visually; you don't have to worry about what the properties are, but they're usually back-calculated after that. Now, I've got to admit I'm not a paper person, so I don't know what physical or chemical treatment is necessary to move the absorption coefficient and reflectivity to these values, but the benefit is that we were able to target this without having to do a lot of detailed calculations. The idea was to pass this on to the paper makers, and they can come up with technologies for doing that; they know that a lot better than we do. So once we had the process side, we wanted to link this to the molecular design that we've been doing on other topics. You will notice that the property operators, the functional descriptions of them, look very much like what we're used to seeing in group contribution. Group contribution methods generally take the number of occurrences of a given group times the contribution of that group, and you add it all up to get the total property. So we wanted to figure out a way to combine the two, and the beauty of it is, once you have it in operator form, the rest of the math remains the same. So it should be possible to represent both problems on the same platform, and that's basically what we did. You end up with a different interpretation: the lever arm that you used graphically now becomes the location of the combined molecule when you add two fragments together, rather than the mixing ratio of two streams. Here's one way you can illustrate this. This is a simplified representation of the properties that we want a given molecule to have, and these are the candidate fragments we have available. We need to move inside this region, so let's say we combine, not mix but combine, G1 and G2; that gets us somewhere here. Then we need to move in this direction, so we could add fragment G3,
which moves us into the space that we need. However, in this case we still have, conceptually, one free bond, so we need to add a terminating group, and this would be G4; we end up with a molecule here that lands inside the region. You can do all of this visually by just keeping track of the free bonds, which is pretty straightforward. There are some conditions: you need to have no free bonds left over when you're done, otherwise it's not a complete molecule; the location of the final molecule should be inside the region; and it should match the augmented property index of the region, at least the range of it. The reason for that is that you have some properties in there that are not calculated directly from group contribution. There's only a limited number of properties you can predict using group contribution, and some of them need to be double-checked against the real properties. One of them is vapor pressure; I'll show you that in a second. Vapor pressure can't be predicted from group contribution, but you can use an empirical correlation between vapor pressure and boiling point, and boiling point can be predicted from group contribution; you just need to double-check that the translation between the two doesn't lose any information. So let me show you a molecular synthesis example. This was originally solved as a traditional mixed-integer nonlinear programming problem by Luke Achenie's group when he was at the University of Connecticut: the design of a blanket-wash solvent for resin printing ink. We solved the exact same problem, except we did it all graphically. The first issue is that you have seven possible groups available, and there's a maximum chain length of seven groups in this case. Then there are five properties that we need to match: heat of vaporization, boiling point, melting point, vapor pressure, and one more that, off the top of my head, I can't remember. Anyway, my bad. We wanted to solve it graphically, because that was the intent of the method in the first place, so that means we're limited to three properties. We did three of them and used the remaining two as screening criteria afterwards: we used the heat of vaporization and the two temperatures, boiling and melting, for the graphical approach, and then we left vapor pressure and solubility as the screening criteria. From group contribution, you get the expressions for the different properties; they can then be translated into the molecular property operators with some appropriately chosen reference values, which basically just bring everything to the same order of magnitude when you normalize. So this is where the property constraints of the solvent need to fall; here are the locations of the different groups that Sinha and Achenie had chosen, and then you can basically just start building combinations until you no longer have any free bonds. There's a variety of them. Now, if you're tempted to compare the visual approach with the mixed-integer nonlinear optimization, you can see that most of the fragments are already inside the constraints, so it shouldn't have been a particularly hard optimization problem for them to solve, and it wasn't. But you only get that insight when you look at the constraints visually. We designed eleven candidate molecules doing this. Then there are a couple of additional checks that need to be done. They are all inside the region, so that takes care of the first condition.
They also have no free bonds; that takes care of the second one. Then there are a couple more checks: the bottom three fall out because of the values of the index. The rest of them, all eight, satisfy everything, including the non-group-contribution-based properties, so they're at least candidates for further study. Of the eight candidates, seven are the same ones that Sinha and Achenie came up with, and then there's one more. That one was screened out in their work because it's flammable, and because of what they wanted for the application, they didn't actually include it in the formulation when they solved it with optimization. It does satisfy all the property constraints that were specified, except it would not be feasible for that application; how it fell out of their optimization approach I'm not sure, but the other seven are exactly the same. So now it shouldn't be a surprise that what we want to do is link the two approaches. We've looked at it from the process side, and we've looked at it from the molecular side. What we'd like to do is solve the process design in terms of the property targets, then design the corresponding molecules that we need. OK, so this is an example where we did that. It's a metal degreasing process, where an organic solvent is used in a degreasing unit, and then a lot of it is evaporated as a mixture of off-gases, which in the original formulation was sent to the flare. We wanted to see if we could condense part of it and reuse it, to reduce the amount of fresh solvent we would need. And we wanted to do it graphically, so we chose three properties to represent the streams: the amount of sulfur, for corrosion purposes; the molar volume, for the pumping and recovery aspects; and the vapor pressure. We had some constraints on the unit: how much sulfur could be allowed in there, what the molar volume should be, and the vapor pressure. We had linear mixing rules for the molar volume and the amount of sulfur, and a nonlinear mixing rule for the vapor pressure. You translate the data and end up with something like this: the constraints on the degreasing unit make up this region right here. Then we had data on the properties of the condensate as a function of temperature; if you think of the VOC stream as a mixture of organics, as you change the temperature, different species are going to condense, and that's going to have an effect on the properties of the condensate. For illustration purposes, we chose to run the condenser at five hundred Kelvin, and that gives us this point. Since all the mixing operations are straight lines, that gives us a target for where the boundaries are, the different lines that could go through the degreasing unit region and still mix with the five-hundred-Kelvin condensate. Then we made it simpler for ourselves and said we're not going to have any sulfur in there, by just eliminating any sulfur-containing molecular fragments. That means our property targets are everything between point A and point B on this axis, because sulfur moves in this direction in the ternary diagram. So we now have the property targets. We want to bring in the group-contribution-based properties where we have them; any non-group-contribution-based properties need to be translated; and then we need to solve the molecular design.
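Before I show the rest of the example, that translation step is worth a quick sketch. Boiling point can be estimated from group contributions and then converted to a vapor pressure; the group values below are Joback-style contributions, and the conversion uses a generic Trouton-plus-Clausius-Clapeyron estimate, not the specific empirical correlation used in the actual work:

```python
import math

# Joback-style boiling point contributions (K) for a few common groups.
JOBACK_TB = {"-CH3": 23.58, "-CH2-": 22.88, "-OH": 92.88}

def boiling_point(groups):
    """Joback estimate: Tb = 198.2 K + sum over group contributions."""
    return 198.2 + sum(JOBACK_TB[g] * n for g, n in groups.items())

def vapor_pressure(T, Tb):
    """Rough Pvap (Pa) at temperature T: Trouton's rule gives
    dHvap ~ 88 J/(mol K) * Tb, plugged into the integrated
    Clausius-Clapeyron equation anchored at 1 atm at Tb."""
    dh_vap = 88.0 * Tb          # J/mol
    R = 8.314                   # J/(mol K)
    return 101325.0 * math.exp(-dh_vap / R * (1.0 / T - 1.0 / Tb))

# Example: an ethanol-like fragment set, CH3-CH2-OH.
tb = boiling_point({"-CH3": 1, "-CH2-": 1, "-OH": 1})
print(tb, vapor_pressure(298.15, tb))
```

The point of the double check I mentioned is exactly that this kind of translation is approximate, so a candidate that passes on estimated boiling point still has to be screened against the real vapor pressure constraint.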
So, back to the example: here are our targets. Sulfur, I think, would be way over here, so by not having any sulfur in the groups, that gives us an extra degree of freedom while still being able to solve it graphically; we added the heat of vaporization as another property that we wanted the solvent to have. Like I said before, vapor pressure cannot be predicted directly from group contribution, but you can link it to the boiling point via an empirical expression of that kind. So now we have all the molecular property operators. We have a selection of seven groups that we can use, and this time at least two of them are outside the range. We can design a series of candidates and then check the different kinds of feasibility: they are all inside the feasibility region, and they all have complete valency, so there are no free bonds left. Two fall outside the index range, and two of them, when you actually predicted the vapor pressure for those components, fell outside that range. So we're left with three. Now we can place those back on the original diagram, because we know what their properties are, and we know that all the mixing operations are straight lines. We wanted to maximize the amount of recycle we could do, so that means we pick the one with the shortest lever arm to the fresh solvent, which is the one in the middle in this case. It's a very simple way of pushing the problem from both sides. Now, granted, it doesn't allow you to solve every kind of problem; you need to be able to describe both the process and the molecules from the properties perspective in order to do this. But it does provide at least a first, generally unified framework for addressing the design from both sides. Moving from molecular design to more general product design, which includes the design of structured products: the National Research Council came out with a set of findings about a year and a half ago on the challenges our community faces. We need to be able to predict structures and properties from fundamental principles; that has always been the holy grail for engineers. What we really want to do is complement the experimental and theoretical approaches. Experiments are very expensive and time consuming, and the vast size of the search space makes it almost impossible to test one thing, see if it works, test another, and so on. So we'd like to complement that with some real theory. But then, as a lot of the research going on in this area shows, there is a variety of different scales that we now have to address, which we haven't had to before. Michael Hill wrote a paper a few years before the National Research Council report about what it is the systems community could do: we should be able to combine heuristics and optimization. The one thing that will be different from what we're used to is that we will no longer describe everything in terms of physical properties but rather performance attributes, which are much harder to handle, because a lot of the performance attributes are going to be consumer desires and things like that, which we don't really have a handle on how to predict. What we will need at some point is a new methodology that does not eliminate the need for experimentation but helps guide and focus which experiments to run. And that's where you get into multi-scale systems.
I'm not going to spend a lot of time on that, but really, when you want to do design for products, you're dealing with all the different scales, from both the process side and the molecular design side. So the question is: can we match all the scales at the same time? This is what I showed you before: we basically solve the process design problem in terms of some initial property targets, then solve the molecular design problem and feed the molecules back. From a multi-scale perspective it doesn't quite work that way; it's not that easy. However, we can use the properties as an interface between the different types of problems. You can use fundamental principles on the process design side to generate property information. Consumer attributes can probably be translated to properties using some kind of metrics, and I'll show you an example of that in a little bit. Microstructure is where you get into the more structured products, and there's a lot of work being done on coarse-graining and other techniques for that. We're still left with connectivity indices and the like for the molecular scale, and then the atomistic methods for the molecular modeling I mentioned, and fundamental ab initio methods, which is not something I'm going to talk about at all, but there's a lot of good work being done in those areas. It should be possible to link all of those through the properties, which is not how it's usually looked at. So what we're focusing on is the linkage between the complementary experimental and theoretical approaches, as outlined by the Research Council. We're looking at pathways, and the key piece is the metrics and interfaces, which is the part I'm going to talk about today. Mostly I'll talk about mixture design of experiments; there are also some semi-empirical models that I won't cover, but higher-order group contribution is one of them, along with connectivity indices and other QSAR/QSPR methods, and molecular signatures, which is basically a variation on the theme that Dr. Visco presented in November; we use a lot of the signature descriptors to describe the more nontraditional properties as well. Let me quickly show how the general framework looks. Basically, there's an empirical step where we try to translate the attribute targets into design targets. That is not a trivial thing to do, and all we did, basically, was set up a generalized framework for doing it. It's going to be very case specific, because the linkage between the consumer attributes and the physical properties is more or less case specific: you can't say that, because for paper the tactile feel the consumer wants is related to a certain property, it's going to be the same property when you start talking about a lotion. So there's going to be a lot of case-specific, data-driven information in there. But you can also map some of this down using other information, like spectral information and so on, for the molecular makeup, and specifically for structured products like different kinds of composites there's a lot of data available, but it needs to be processed in a different way. I'm not going to spend any more time on this; what I'd like to show you instead is a bit about the design of experiments. A lot of you probably do experiments, and you've done design of experiments: you try to map out response surfaces and find where your sweet spots are, and so on.
And I know that Dr. Lee and Dr. Ralph are doing some really interesting work on how to handle the uncertainties in this; what I'm going to show you is a way of handling some of the combinatorial explosion instead. Mixture design is the one I want to focus on. What you're dealing with there is that all the fractions of the composition have to add up to one; as you know, that's the difference between a normal DOE and a mixture DOE. What you're trying to do is come up with a way to represent the physical phenomena using the minimum number of experiments. You can use things like Box-Behnken-type designs to explore the search space, but you end up with an expression that relates some property to a function of the amounts of the constituents. So you run some experiments, you collect some data, and now you have data for the responses as a function of a series of chosen points. Then you try to fit a model to that, and there are two main ways of doing it: polynomial models and canonical models. They both try to do the same thing, represent the responses as a function of the compositions of the constituents, and then you use the model for prediction. That's all good. The problem is the combinatorial explosion: if you have seven components, you need around twenty-five independent plots per property that you're trying to evaluate, and having to overlay all of those to find the sweet spot quickly becomes pretty tedious; it's a pretty ineffective way of doing this. So what we wanted to do was visualize this in a different way; it should handle the combinatorial problem, it should be easy to do, and it should be universal in application. Here's an example: you've got seven components and three properties, which means you need seventy-five plots from the mixture DOE regression. That's a lot of plots. Take the same problem, seven components, three properties, and translate it into the cluster space, and it's one plot, because you no longer represent the properties inside the ternary diagram; what you represent in there are the pure components. So it's a lot easier, and much lower dimensionality, to represent this. Plus, if you add another component, you don't add twenty-five more plots; you add another dot. Now, if you need more than three properties, you'd need a multi-dimensional representation, which doesn't work visually. However, you can still solve it mathematically, and it's still a lot easier to solve a set of linear equations with nonlinear terms than to try to solve the fully nonlinear expressions you get from regressing the DOE problem. So that was basically what we came up with. I'll show you a quick example, and I've got to give my student Charlie credit for this. He's an avid sailor, and whenever he can get out of the lab, he wants to race. So the very first problem he wanted to solve was to optimize a polymer blend for use in ropes for sailboats, and that's what he did. The two scales we're dealing with here are the consumer attributes and the molecular design. We pulled a lot of this out of Cornell's book on the design of experiments. Strength of the yarn was one property, thread elongation was another, and then we said the third one, floatability, is probably related in some shape or form through the molar volume.
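Just to make the mixture-model side concrete, here is a minimal sketch of fitting a quadratic Scheffé-type mixture model; the design points and responses are invented for a three-component blend:

```python
import numpy as np
from itertools import combinations

# Quadratic Scheffe mixture model: y = sum(b_i x_i) + sum(b_ij x_i x_j).
# Compositions sum to one, so the model has no intercept term.
def scheffe_design_matrix(X):
    q = X.shape[1]
    cross = [X[:, i] * X[:, j] for i, j in combinations(range(q), 2)]
    return np.column_stack([X] + cross)

# Hypothetical simplex-lattice design points and measured responses.
X = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5],
              [1/3, 1/3, 1/3]])
y = np.array([4.2, 5.1, 3.3, 5.0, 3.6, 4.8, 4.5])  # invented responses

A = scheffe_design_matrix(X)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)   # 3 linear + 3 binary-blend coefficients for q = 3
# For q = 7 components the same quadratic model already needs
# 7 + 21 = 28 coefficients per property -- the combinatorial growth
# in coefficients and overlay plots mentioned above.
```

That per-property pile of coefficients and plots is exactly what the cluster representation collapses into a single diagram.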
Now, whether that molar-volume link is true or not, we don't know, but we had to do something with an attribute as intangible as floatability, and we didn't want to waste Charlie's time by sending him out on the lake and throwing different polymers in the water. So we came up with some values for the property targets, and we had some candidate polymers that we could put in; all of this came straight out of Cornell's book, including how it was originally solved. In the book, there are these eighteen response-surface plots for the problem; when you do it with our method, you have one plot, and this is a blow-up of it, because the points are all clustered pretty close together, no pun intended. This is the Scheffé model, which is one of the forms you can use in the regression; the Scheffé model does not center anything, whereas the Cox model does. In the Scheffé model, it looks like the component right here, number three, should be pretty close to being usable directly as a main component in our blend. But when you translate it to the Cox model, where everything is centered around a standard, what happens is that the relative distance from the standard indicates the strength of the response, how much effect a given component has on the overall response. So in this case, number three has virtually no impact on the response, and it should be used as a filler. You can get that clarity from just these two plots, which would be very hard to get out of those eighteen in any one look. That was kind of the point of this. Whether Charlie actually made the polymer blend, or just went to the boat store and bought a rope, I don't know. OK, here's just a quick two-minute version of some of the work that we're doing on more structured problems. For structured products, we used a study from Sweden where they were trying to design compressible acetaminophen tablets. They have a variety of different ingredients they can put in there, and we wanted to solve this in a more efficient way. These are the properties they used: angle of repose, compressibility, and water content. They have a set of excipients available, and they had already derived all the mixing rules, so that was easy to translate for us. What happens when you regress all this data is that you sometimes end up with negative coefficients, which, when we initially formulated all the math, we didn't take into account; everything used to fit inside this triangle, and now all of a sudden we get negative coefficients, because that's what happens when you just run the regression. We didn't really know what that meant, except when we started to study the actual responses, it turned out that the further away you were, the stronger the effect. So really, the main component that has an effect on this design is number six. When you blow it up, number six is somewhere up on the fourth or fifth plot or something, but you'll notice that in all the feasible formulations you can come up with, it has to be there, and that matches exactly what the experimenters found when they did their trial-and-error approach. So the one thing we're struggling with: what happens if we need new molecules? Right now we're linking the attributes to user components, and we can do that without worrying about the combinatorial explosion. But if we need new molecules that aren't in the training set, what do we do about that?
Also, when you have secondary collinearity, or even very nasty nonlinearity, in the regressions, how do you handle that? One way of doing it is to use subspace mappings, generally using latent-variable models. I'm not going to spend any time on that, but my student Charlie has developed this, and it was published late last year, I think, in I&EC Research. Basically, it allows you to use design of experiments and/or PLS to come up with the attribute-property relationships, then you use principal component analysis to come up with the principal properties, and then there are some estimation steps you can use to determine the structure of the final product. But I will skip that and go to the concluding comments. What I've tried to show you is that the clustering method provides at least a framework that allows you to solve these property-driven problems without having to commit to components until the very last step. It enables visualization, which makes it interesting from a dimensionality perspective. We can map attribute data down to a subspace, which I didn't show you in detail, and with the linear functions, you can guarantee global optimality when you're solving; that's definitely a plus. Where are we going? We need to expand the general property descriptions. Those of you who are doing operations research, or maybe high-level math, may have come across the concept of basis expansion; basis expansion is a mathematical tool that might provide a way of deriving the linearized expressions for the operator functions. We're working on integrating more topological indices; one application is group-contribution-based flowsheet design, which is a cool way of structuring process synthesis, so that you generate a property from the flowsheet and, at the same time, you generate the properties you want the molecules to have. We're also working on including microstructure information, particularly crystal morphology and so on, and a lot of that is coming from Professor Doherty's group at Santa Barbara. And we're looking at a variety of ontology methods as well, for doing network analysis and so on. So, I have to acknowledge the funding from the National Science Foundation and from the Consortium for Fossil Fuel Science, and although I get to do all the talking, and I'm the one that gets treated to the nice dinner and the nice breakfast and lunch, the real credit goes to Charlie Solvason, who was the main one working on this, and Nishanth, who developed the attribute work; I've worked on pronouncing his full name for three years, but I finally got it. And John did all the molecular design work that I presented. They're the ones that really deserve the credit. All right, thank you very much. [Audience question, inaudible] Well, not being a polymer expert, I'm probably going to jam my foot in my mouth on this one, but I'll try. What we really want is to be able to describe the effect of the processing on the final properties of the molecules. Let's say you know that a certain processing step changes the glass transition temperature in a certain manner. If you know that, then you can use that information either as a discrete point per level of processing, or it becomes a trajectory on the process side. At the same time, you can map the corresponding molecules, or fragments of molecules, that you're trying to combine until they basically hit the same point.
So you can solve both at the same time: you can target the amount of processing you want to do to get the property target you want the product to have, or vice versa; the property becomes the interface between the two. You can design a polymer and say, OK, these are the properties it has now, and work out the processing; that's the normal way, so there's no real gain in doing it the way we're proposing. But visually, at least, it would be attractive: you design the molecule that lands at a certain point in the space, and if you have a trajectory that is a function of the processing, then the intersection gives you the amount of processing you would have to do to get there. But you would normally come from the process side, not the molecular side. [Follow-up question, inaudible] Yes. For a given molecule processed in the same way but to two different extents, you would have a locus of points representing the resulting final molecule, in a sense. Now, that means if you go to multiple processing steps that have vastly different effects on the properties, it becomes a higher-dimensional problem, but it's still represented in a linear way. The only place we have to deal with nonlinearities is the final step, where we back-calculate the final properties, so you should still be able to solve it more easily than a full-on, brute-force mixed-integer nonlinear optimization problem. [Question inaudible] Yes, sure. We've actually taken the visual approach and turned it into an algorithmic one. That's very easy to do, since everything is linear, and you can set up the optimization criteria pretty straightforwardly. So we've really only been limited to two or three properties when we want to do it visually. We've also run a couple of examples where we have properties that are functions of multiple properties and used that as a way of handling additional nonlinearity; that is also possible, and we've done it both visually and with the algebraic approach. The hardest part is not that one particular type of property is harder than another; it really depends on how much data you can get, and whether you can get from first principles to the transformation into a linearized description with decent predictive power. The ones that are hard to do, at least from the molecular design standpoint, are the ones that are very different from what can be predicted with group contribution methods or, let's say, topological indices methods. Structural information, like for different kinds of functional nanocomposites, is difficult to do without it being data driven. [Question inaudible] Let me give you the politician's answer first, and then the truthful one. The politician's answer is that the predictions are usually strong regardless of whether it's one class of properties or one class of components or another; since the candidates are all pretty similar, simple straight C6 to C8 hydrocarbons with one functional group, it shouldn't be very different in terms of the quality of the property predictions. The reality is that whether they would actually perform like that, with only three properties to describe the mixing and the behavior, is questionable.
But it might be; we picked the example for that purpose, because those were three very different properties. The reality is you would have to test them all, or at least test the ones that fell inside the feasible space. The point, really, is that we're not trying to solve everything in one shot. I don't believe in the optimization approach of going for the brass ring in one shot. I think the way to approach these kinds of problems, for exactly the reason you point out, is to generate all the alternatives that are feasible and then pick the best one. The ones that are infeasible, the ones that were screened out by what we did, would never work. Whether the ones we ended up with will finally work, we're not sure, but they are candidates; the ones that didn't satisfy the first criteria we set up will never work. So at least we've limited the search space, in this case to three that show promise. The other eight of the original eleven that were screened out would never work, because they would not satisfy the initial constraints; whether the three that survived satisfy every constraint, I don't know, but the other eight definitely would not. [Question inaudible] Most of it comes from data, so it's regressed. Actually, for the ones I showed today, we didn't do the regression ourselves; we found those in other references. But in the work we're trying to do on basis expansion as a method for this, there are analogies to systems identification approaches: in basis expansion you can fix what kind of form you want the expression to have and try to fit the data to those kinds of expressions. And if we use that approach, we're not limited to, let's say, polynomial functions; we can put in whatever we want, as long as we can come up with that linear mixing of nonlinear terms. That's all we need. But the ones we've used so far we haven't really derived ourselves. In the work I mentioned very briefly on PLS and PCA, the regression coefficients are one way of coming up with these kinds of expressions. Now, depending on how you set up the PLS and the PCA, you may not be able, at least in standard software, to hide functional descriptions inside the principal components, so they may end up being simple linear, bilinear, quadratic terms and so on; but then you could always rerun those and try to re-parameterize them into a linear form with nonlinear terms. [Question inaudible] The point is that once you have descriptions of the properties, they're no longer process specific. Some of them might be; the polymer processing one would be polymer specific. But you can probably generate enough of those descriptions at some point to represent, let's say, a class of problems related to polymer processing, because there's a certain effect that a given process will have on the properties of polymers of a certain size or morphology, and once we have that kind of information available, it becomes generalizable. Thank you.