Great, welcome everyone. I'm Aditya Prakash, and welcome to the final seminar of this semester's IDEaS AI seminar series. I'm really pleased to welcome Professor Jesse Thaler from MIT, who will be talking about "Collision Course: Artificial Intelligence Meets Fundamental Interactions." Jesse is the inaugural director of the NSF AI Institute for Artificial Intelligence and Fundamental Interactions. He's a theoretical particle physicist; he joined the MIT physics department in 2010 and is currently an associate professor in the Center for Theoretical Physics. He was a fellow at the Miller Institute at Berkeley, received his PhD in physics from Harvard in 2006, and a bachelor's in math and physics from Brown in 2002. He has been awarded many honors throughout his career: an Early Career Research Award from the DOE, a Presidential Early Career Award for Scientists and Engineers (PECASE) from the White House, and a Sloan Fellowship in 2013. So again, let's welcome Jesse for this talk. I'm really happy to welcome him to the series. Jesse, over to you.

Great, thank you. Thanks for the invitation and for the introduction.

Sorry to interrupt, Jesse, just one thing I wanted to point out to the audience: as usual, we'll take questions through the chat, because Jesse won't be able to see you. Jesse, I'll interrupt you if there are questions that require your urgent attention; otherwise I'll read them off at the end. Thanks, sorry.

Okay, perfect. So I am the director of this new Institute for Artificial Intelligence and Fundamental Interactions, and these two fields are on a collision course. I mean "collision course" in two ways for this talk: one is telling you about the intersection of these two fields; but also, I work in particle physics, where we work with particle colliders, and I'm going to tell you a little about what we do with particle collisions as an example of this intersection. First, some more information about the institute. The NSF AI Institute for Artificial Intelligence and Fundamental Interactions has an unpronounceable acronym, but we've been calling it IAIFI (pronounced "eye-fi"). You can see the intersection between fields in our logo, which either looks like a capital A with a lowercase i on top of it (that's for AI) or like a capital F and a capital I next to each other (that's for fundamental interactions). We really do see artificial intelligence and fundamental physics as flip sides of the same coin. This NSF-funded effort is anchored at MIT, with substantial involvement from Harvard, Northeastern, and Tufts.
What we're trying to do with this institute is advance physics knowledge, from the smallest building blocks of nature to the largest structures in the universe, and galvanize AI research innovation; it's really about making progress both on the physics side and on the AI side, and doing great science along the way. There are many components to this institute, which I'll tell you a little about in this talk. We have efforts in theoretical physics (that's my subdomain), in experimental physics, and in the foundations of AI, three fields that don't necessarily talk to each other as much as they should, so we're trying to build bridges between them via this AI institute. One way we're doing that is via a prize postdoctoral fellowship program, our IAIFI Fellows, who act something like the gluons within the pieces of the institute. On the left-hand side, those red, green, and blue blobs are the way particle physicists represent a proton, with the blobs being like the quarks of the institute; the gluons are the glue that ties together the various elements and binds us into a more complete whole. There are many aspects to our institute. We have training, education, and outreach at the physics-AI intersection. We're aiming to cultivate early-career talent: basically, people who might be deciding between a career in physics or on the computer science side, and this program allows those folks to pursue both interests simultaneously, fostering connections to physics facilities and to industry. And something I'm most excited about is building strong multi-disciplinary collaborations and advocating for shared solutions across subfields. In putting this institute together, it was remarkable to me how much different areas actually share common goals. For example, in particle physics, which I'll be talking about today, our goal is to analyze collision debris, and a number of senior investigators in our institute focus on particle physics and understanding collision debris. But we also have people on the more foundational AI side, in particular my colleague Justin Solomon, who works on geometric data processing and, remarkably, on classifying furniture into various categories. Classifying furniture and analyzing collision debris at colliders like the Large Hadron Collider turn out to share many similarities, and we've already started to build shared solutions to our problems; I'm looking forward to many more over the five years of this institute.
Let me give you a little more background about fundamental physics; I'm going to start as basic as I can. These are the pillars of fundamental physics. On the largest scales, we have Big Bang cosmology, which explains how the universe started in a hot, dense state and evolved, primarily under the influence of gravity, to the universe we see today. Big Bang cosmology involves ingredients like dark matter and dark energy, as well as radiation, neutrinos, and a little bit of matter, and this is one of the areas where exciting things are going on in the physical sciences: trying to understand the origins of our universe, to better understand this Big Bang cosmology, and in particular what came before it. On the shortest distance scales of nature, we have the Standard Model of particle physics, represented here in pie chart form. In orange are the quarks of the Standard Model; in green are the leptons; in blue are the force carriers, the mediators of forces like the electromagnetic force, the strong force, and the weak force. At the center of the Standard Model is the Higgs boson, discovered at the Large Hadron Collider in 2012 and still undergoing intense scrutiny to understand its role in the dynamics of the universe at the shortest distance scales. So we have the longest distance scales in nature on the left and the shortest on the right, and I would say that understanding these two pillars has been a real triumph of human intelligence. How much dynamics is actually explained by the forces of gravity, electromagnetism, the strong force, and the weak force is kind of remarkable, and we'd like to translate the lessons from physical reasoning into other contexts, in particular into artificial intelligence. Hopefully in this talk you'll see an example of that at work: how, if you take physics concepts seriously, you can build new artificial intelligence systems that are more powerful than methods that didn't account for those physical principles.

Just one thing to point out on this slide: my research focus is on the strong force, which is not necessarily something everyone in this audience has heard about, so I'll tell you more about the strong force when explaining my research. But I'm happy to say more in Q&A about anything happening in our institute involving any of the forces of nature, as well as cosmology and the Standard Model.

Okay, so I'm titling my talk "Collision Course," a title I started using in January of 2019, when there was a workshop at the Aspen Center for Physics on theoretical physics for machine learning.
What I started to realize around then was that there really was a collision between two fields that we could exploit. Before I tell you about that, let me explain the picture you're seeing right now. This is an image taken from collisions of two protons at the Large Hadron Collider. Those protons are brought into collision, something happens (that could be dynamics within the Standard Model of particle physics, or dynamics that actually points to some new physics), and from that collision we get these sprays of radiation coming out. Somehow, from these sprays of radiation, we have to figure out what's going on at short distances in nature. To give you a sense of the scope of the problem: every 25 nanoseconds there's a new proton-proton collision, and that's an enormous data volume to sift through. In fact, the data volume is so large that you're not able to store all of these images to tape; you have to define selection criteria to decide which images are interesting to keep and which to throw away. Then, for each individual event you want to study, there are various data analysis strategies you could pursue. The simplest one, which you can see just by eye, is that you have these clusters of radiation, clusters of particles, coming from the central collision point. These clusters are what we call jets. Jets are a physical phenomenon: they tell you how quarks and gluons made at short distances in the Standard Model manifest themselves at long distances. But in machine learning language, jet finding is an example of an unsupervised clustering strategy: given a collection of points, you want to find how to arrange those points into the most collimated sprays you can, and that is your best proxy for the underlying dynamics of the system you want to study.
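[To make the clustering idea concrete, here is a minimal sketch of sequential jet clustering in the anti-kt family, the algorithm family standardly used at the LHC. The talk doesn't name a specific algorithm, and this toy O(n^3) loop with invented momenta is illustrative only; real analyses use optimized libraries such as FastJet.]

```python
import numpy as np

def antikt_jets(particles, R=0.4):
    """Toy anti-kt clustering; particles are (pt, rapidity, azimuth) triples."""
    objs = [list(p) for p in particles]
    jets = []
    while objs:
        # beam distances d_iB = 1/pt_i^2, pairwise d_ij = min(1/pt_i^2, 1/pt_j^2) * dR^2 / R^2
        cands = [(objs[i][0] ** -2, i, None) for i in range(len(objs))]
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                dphi = (objs[i][2] - objs[j][2] + np.pi) % (2 * np.pi) - np.pi
                dR2 = (objs[i][1] - objs[j][1]) ** 2 + dphi ** 2
                cands.append((max(objs[i][0], objs[j][0]) ** -2 * dR2 / R ** 2, i, j))
        _, i, j = min(cands, key=lambda c: c[0])
        if j is None:
            jets.append(objs.pop(i))  # nothing closer than the beam: promote to a jet
        else:
            a, b = objs[i], objs[j]   # merge the closest pair (pt-weighted recombination,
            pt = a[0] + b[0]          # naive about azimuthal wraparound)
            merged = [pt, (a[0] * a[1] + b[0] * b[1]) / pt,
                          (a[0] * a[2] + b[0] * b[2]) / pt]
            objs.pop(j); objs.pop(i)
            objs.append(merged)
    return jets

# two collimated sprays plus one soft stray particle (made-up momenta, in GeV)
event = [(100, 0.00, 0.00), (40, 0.05, 0.10),
         (80, 1.00, 2.00), (30, 1.10, 2.05), (1, -2.0, 3.0)]
print(antikt_jets(event))  # two hard jets, plus the soft particle as its own low-pt jet
```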
So I'm going to tell you a little more about collider physics in this talk; hopefully you'll leave knowing more about collider physics than when you came in. But the second meaning of "collision course" in the title is the collision between the physical sciences and mathematics, statistics, and computer science. This was on display at the Aspen workshop I mentioned, where people who came from a physical sciences background had interesting ways of thinking about the mathematics behind artificial intelligence, and people coming from a mathematics, statistics, and computer science background had an enormous range of tools that we could use to analyze physical data. Part of the institute is figuring out how we can gain new insights into fundamental physics, facilitated by advances in artificial intelligence, and of course vice versa: can we gain new insights into artificial intelligence facilitated by advances in fundamental physics? I put an asterisk here because, of course, it's not just fundamental physics; many scientific domains are benefiting from artificial intelligence. And I put an asterisk next to artificial intelligence because, in my mind, that's a broad umbrella covering all sorts of interesting things happening in computer science, statistics, and data science, and we want to capitalize on all of those advances, of which AI is one component, of course the component in the title of our institute.

Broadly speaking, if you want to state the goal of our institute in one line: can we teach a machine to think like a physicist? This is in contrast to other strategies for engaging with artificial intelligence, for example trying to mimic the creativity you might find in human intelligence and teach a machine to think like a toddler. I have an eight-year-old son, and trying to reason with a child is not the easiest thing to do; my son can't necessarily explain all of the logic that goes into the decisions he's making. What we'd like to do is develop AI techniques that actually incorporate best practices from the physical sciences and teach a machine to do the same type of rigorous physical reasoning that we do as domain experts. When this question was first posed to me, I said: no, no way, how can we possibly teach a machine to think like a physicist? In particular, the revolution that's been happening in deep learning didn't seem to correspond to the way physicists approach problems. But over time I've come to realize that there is a convergence, and there is a way to infuse AI techniques with domain knowledge in a useful way, in particular the domain knowledge of physics. Let me give you the example that changed my own mind, not from physics but from image processing, which convinced me that teaching a machine to do something in an intelligent way actually gives you better answers than you would have gotten without putting that human knowledge, or domain knowledge, into the machine.
The example comes from image processing. This is an in-painting task that you might be familiar with: you're given a corrupted image on the left-hand side, in this case an image of a library with various regions masked out in white, and starting from that corrupted image you want to reconstruct, based on your prior knowledge about how such images are made, your best guess of what the library actually looked like underneath those masks. The standard story I had heard about deep learning's ability to do this type of in-painting task is, first, that we have increased computational power, so you can use that raw computational power to do a much better analysis (a much better likelihood analysis, if you want to think about it that way) than you could with more limited computational resources. The other thing I heard is that you need large data sets: if you imagine having a billion pictures of libraries, well, if you've seen a billion libraries you've seen them all, and you can use that to build up prior knowledge about what libraries in general look like, then find the closest library on the right-hand side to the corrupted one on the left-hand side, and that's your best guess for solving the in-painting task. Using lots of computing power and large data sets, of course, I'm familiar with that in particle physics, where we benefit from increased computational power and gigantic data sets at the Large Hadron Collider. But I'm a theoretical physicist; a lot of my work is pencil and paper, chalk and chalkboard, and I didn't see what role I could play in the deep learning revolution, because it didn't seem to involve domain knowledge in any way. This example from image processing convinced me that you can fuse deep learning with what a theoretical physicist might call deep thinking: you can make progress not just through increased computational power and large data sets, but also by understanding the structure of the problem. What was so inspiring to me about this in-painting task is that this machine learning strategy never saw any example photographs. It was a randomly initialized neural network; it had not been pre-trained on any prior images. Rather, the inference was carried out by understanding the structure of the problem and realizing that when you have an image like a library, you have repeated patterns and basically grid-like structures at different scales in the image.
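[The technique being described appears to be in the spirit of the "deep image prior" of Ulyanov et al.; the talk doesn't name it. Here is a minimal, self-contained sketch of that idea: a randomly initialized CNN, never shown any other photographs, is fit to the unmasked pixels only. The architecture, sizes, and the synthetic image/mask below are illustrative assumptions, not the actual system on the slide.]

```python
import torch
import torch.nn as nn

# Synthetic stand-ins for the corrupted photo and its mask (1 = observed, 0 = masked).
H = W = 64
image = torch.rand(1, 3, H, W)
mask = (torch.rand(1, 1, H, W) > 0.3).float()

# A small randomly initialized CNN; its convolutional structure is the only "prior".
net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)
z = torch.randn(1, 32, H, W)  # fixed random input to the network
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(500):
    opt.zero_grad()
    out = net(z)
    loss = ((out - image) * mask).pow(2).mean()  # fit only the unmasked pixels
    loss.backward()
    opt.step()

restored = net(z)  # the CNN's structural bias fills in the masked regions
```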
You could use a convolutional-neural-network-type structure, even randomly initialized, to do a kind of cut-and-paste task, using the information that is unmasked to figure out what's going on behind the mask, even though you've never seen images of that type before. Because of that, I became convinced that I could take domain knowledge from my field, inject it into a machine learning architecture, and do better on a task even in cases where I didn't necessarily have training data to work with.

Okay, that was an example from image processing. Let me now use an example from my research to tell you about this intersection of artificial intelligence and fundamental interactions. My research is on the short-distance side, so it involves the Standard Model of particle physics, and I want to give you an example of a research program we've been pursuing that leverages our knowledge about collider data to build new types of machine learning strategies that incorporate that knowledge. Here again is the Standard Model of particle physics. A number of these names will be unfamiliar to you, though there are a few you might have heard of: for example, the particles that experience electromagnetic interactions, in particular photons, which carry the electromagnetic force, as well as electrons, which should be very familiar, and the heavier cousin of the electron called the muon. These are elementary particles, which means that if they are produced in the collision debris at the LHC when I slam together two protons, they are objects I can reconstruct directly and actually see in my detector. But everything else in this pie chart is either unstable, meaning you make it and a split second later it disintegrates, or it gets bound up by the strong force, which again is my research area. The quarks in orange and the gluon (the g shown in blue) experience the strong force, and we never see quarks and gluons isolated in nature; rather, they get bound up into composite states. The proton is one example of a composite state, but there are others with funny names: pions, kaons, K-longs, protons, neutrons. These composite states can actually hit our detector, and we can use them to infer that quarks and gluons were produced. But this is a relatively limited palette of things we actually see in our detector. Everything else in this pie chart, for example the Higgs boson or the W and Z bosons, are fundamental particles
whose properties we've inferred from collision debris. You don't see them directly in your collision debris; you only see them via their remnants, in terms of the elementary and composite states that are long-lived enough to hit your detector and be visible.

As if that weren't hard enough as an inference task, measuring this collection of final-state particles and reconstructing the structure of the Standard Model is even more complicated, because these particles run into various detectors, and the detectors we use at particle colliders are very heterogeneous. They have different types of outputs, arriving at different time scales; it's quite a challenging data reconstruction problem. We have tracking detectors that see some particles, two different types of calorimeters that see particles, and a specialized system dedicated just to detecting muons. What you end up getting is collisions that look like this, again every 25 nanoseconds, and somehow, from all of this tracking and calorimetry information, you have to reconstruct the individual particles hitting your detector. Then, as I mentioned, you want to apply clustering strategies to find out which of those particles are related to each other, to form these jet-like objects that act as proxies for the fundamental quarks and gluons you aren't able to see directly. So this is a very challenging data set to deal with. On the other hand, the structure of this data set, in its most abstract form, is actually quite familiar. Once you've done the reconstruction task, then at the end of the day the spray of particles coming out from the collision point can be described as just a collection of points in momentum space: each particle has momentum in the x, y, and z directions, and they're all coming from one central collision point, so I can think of them as points in momentum space, somewhere between a hundred and a thousand points for any individual collider event. And that data structure, a collection of points in space, is quite familiar to people who work in the machine learning or artificial intelligence worlds: it's what would be called a point cloud.
Typical point clouds are collections of points in position space, usually three-dimensional Euclidean space, and that's the kind of data you would get from, say, a self-driving car with lidar point detection, where you have a collection of points in position, in x, y, and z, evolving in time. It's maybe not so surprising that a number of the tasks you'd like to do with a self-driving car have analogs in the collider physics realm. For example, you might want to solve a segmentation task, finding collections of points, represented by these boxes, that correspond to individual objects. That's the direct analog, on the particle physics side, of identifying jets, these collimated sprays of particles, and trying to understand which clusters are optimal for representing distinct objects in your collision event. In the case of jets, you'd like to classify them: did that jet come from a quark, from a gluon, from a W boson, from a Higgs boson? That classification task on the self-driving car side is asking: is that a bicycle, is that a car, is that a pedestrian? Clearly you would like to get that classification right, and some of the same goals in self-driving cars, reliable inference as well as fast inference, are shared on the particle physics side when we do collider studies.

So you might think: all I need to do is take the off-the-shelf point cloud machine learning strategies and use them for particle physics data analysis. If that were the case, there would be nothing fun for our artificial intelligence institute to do. But actually, we can use our domain knowledge, our knowledge of what's really going on in this particle collider, to develop more robust AI techniques, and that's what I'm going to show you, using my own research as one example where you can see this very clearly in action. To do that, I need to tell you a little more about the process of jet formation; this is, again, what happens in the strong force when quarks and gluons bind together. I start with my proton-proton collision. As I mentioned, protons are composite states, bound states of quarks and gluons, and when I slam them together at high energies, I liberate those quarks and gluons. The quarks and gluons then undergo a process called radiation: just like electrons, which generate electromagnetic radiation when they move around, quarks and gluons generate gluonic radiation when they move around.
The key difference between the strong force and the electromagnetic force is that, in the strong force, gluons can create more gluons; in the electromagnetic case, unless you're dealing with something like nonlinear optics, photons don't generate more photons. We don't see those quarks and gluons directly. Eventually the strong force becomes strong, and as I go out from the collision point, the quarks and gluons bind together to form the composite states we call hadrons. The reason the collider is called the Large Hadron Collider is that we're colliding protons, and protons are examples of that composite hadron state. Those hadrons hit my detector, and I've just described how challenging it is to figure out what's going on in your detector. Our goal is to understand the dynamics of quarks and gluons, but we have to view it through what you can think of as a kind of smearing process: the quarks and gluons get smeared out because they're bound into hadrons, and the hadrons get smeared out because detection is imperfect. But I can use my knowledge as a theoretical physicist to ask: in this whole process of going from quarks and gluons, to composite hadrons, to detection, what information about the collision is most robust? And what's most robust about this collision is the flow of energy off to infinity. In quantum mechanical language, there's an operator called the energy flow operator, and those of you with a physics background will recognize the capital T in this formula as the stress-energy tensor: the 0 index means energy, and the 0i component means the flow of energy, the flow of energy from the central collision point out to an idealized theorist's detector at infinity. That information is robust to hadronization and detector effects, which are challenging to model. So if we focus on energy flow, this is the information that I think is most robust, and therefore, if I build an AI architecture around the idea of correctly capturing the energy flow, and not the full information about hadronization and detection, I will be able to make more robust inferences, and in particular inferences that I can connect to first-principles theoretical calculations.

So this is an example of the collision course: a collision between the principles of fundamental physics, in this case the robustness of energy flow, a concept which, for those of you who know something about fundamental physics, comes from quantum field theory, the mathematical structure that underpins the Standard Model of particle physics.
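[For reference, the formula being described can be written as follows; this reconstruction follows the standard quantum-field-theory definition of the energy flow operator rather than the slide itself:

$$\mathcal{E}(\hat{n}) \;=\; \lim_{r \to \infty} r^2 \int_0^\infty dt \;\, \hat{n}^i \, T_{0i}(t, r\hat{n}),$$

where $T_{\mu\nu}$ is the stress-energy tensor: the $0$ index denotes energy, $T_{0i}$ is the flux of energy in direction $i$, and the limit carries that flux out to an idealized detector at infinity in the direction $\hat{n}$.]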
So it's a bedrock principle of fundamental physics that this energy flow is robust. Then, from the other side, you have the power of artificial intelligence, in particular the power of various point cloud learning strategies. What I'm presenting today is based on work developed at Carnegie Mellon, an architecture called Deep Sets. Into this collision you also need two fantastic graduate students: my current student Patrick Komiske, who is graduating soon, and my former student Eric Metodiev, who graduated this year. You also need funding from various agencies, of course including the National Science Foundation, which has funded our new artificial intelligence institute. And out pops a new neural network architecture called energy flow networks, which is the synthesis of this fundamental physics idea with point cloud learning techniques from the literature.

So what can I tell you about energy flow networks? Let me give you a little more technical detail, just to have some meat in this talk. On the technical side, the difference between what we're doing and generic point cloud learning techniques is that we're dealing with what are called weighted point clouds. Each point has a weight associated with it; zero-weight points carry no information, and if two points go in the same direction, then only the total of their weights matters: the individual weights of points sitting on top of each other are irrelevant. What that corresponds to in particle physics language is energy-weighted directions: instead of describing a particle's momentum by px, py, and pz, we talk about the direction nx, ny, and nz, and the energy serves as a weight. An equivalent way of putting this: imagine you put a camera on your detector, looking back down at the spray of particles coming from the central collision point, and you ask how much energy is being deposited where. You can think of this information as an energy density, energy at various points corresponding to points on your detector, and this energy density carries the same information as the weighted point cloud. We want to develop a machine learning architecture that uses this representation as the fundamental representation for understanding what's going on.
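[As a minimal illustration of this representation (toy momenta and a massless-particle assumption, so energy equals the magnitude of the momentum), here is the conversion from momenta to an energy-weighted point cloud, plus a check that an energy-weighted observable is unchanged when a particle is split into two collinear pieces, which is exactly the "only the total weight matters" property just described:]

```python
import numpy as np

def weighted_cloud(p):
    """p: (M, 3) array of momenta; returns energy weights z and unit directions n."""
    E = np.linalg.norm(p, axis=1)          # massless particles: energy = |momentum|
    return E / E.sum(), p / E[:, None]

def safe_observable(z, n, phi):
    # energy-weighted sum over particles: the form an IRC-safe observable takes
    return (z[:, None] * phi(n)).sum(axis=0)

phi = lambda n: n ** 2                     # toy per-particle function
p = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]])

# split the first particle into two collinear halves: same direction, half the energy each
p_split = np.vstack([0.5 * p[0], 0.5 * p[0], p[1]])

z1, n1 = weighted_cloud(p)
z2, n2 = weighted_cloud(p_split)
print(np.allclose(safe_observable(z1, n1, phi), safe_observable(z2, n2, phi)))  # True
```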
With this background, we can now think like a physicist. Apologies for the overly cartoonish version of machine learning: we have a black box, the machine, and of course a huge amount of research has gone into creating effective black boxes to solve a variety of tasks. We input a properly specified problem; in this case I'll show you a jet classification task, classifying jets into the two dominant categories that appear in particle physics, quark jets and gluon jets, the collider equivalent of cat pictures and dog pictures in image processing. That classification task is a properly specified problem, and we have many examples we could use for training. We can also inject two facts from physics into this architecture. One fact comes from quantum mechanics: there's a symmetry under which identical particles are indistinguishable; given two photons, you can't tell which was first and which was second. That means you can't use natural language processing techniques, where sentences have an actual semantic ordering and the order matters: with particles, the order in which you see them carries no information, so you need a permutation symmetry on your inputs if you want to do the learning properly. The second is this energy flow concept, or energy weighting; the technical name for it is "safety," or even more technically, "infrared and collinear safety," and it basically says your points must be treated with this energy weighting. Again, this comes from quantum field theory. We can inject both the permutation symmetry and infrared-and-collinear safety (a symmetry too, if you want to think of it that way) into the machine and get out solutions to the problem I want to solve. But if I really want to think like a physicist and reproduce the workflow I have with my graduate students: I'm not satisfied if my students just give me a solution to the problem I posed; I want some way to verify that they actually carried out the analysis correctly. Building an architecture that has verification built in, the ability to check that the answer is physically sensible, is one of the things we want to do as part of this institute, because verification and quantification of uncertainties is one of the key things one does in the physical sciences.

So let me give you the equation that describes these energy flow networks; it's actually quite simple. When I first entered this area of machine learning and artificial intelligence, I expected things to be very complicated, and while there are many complications associated with neural networks and their training, encoding some of these symmetries turns out to be relatively straightforward, at least in this case. I'm going to solve my problem, acting on jets, in a two-stage process. First I look at individual particles: that's the Φ function on the right-hand side, or rather multiple functions Φ_a labeled by an index a. Each Φ function takes in the direction a particle is going, gets multiplied by the energy of that particle, and then I sum over all particles in the jet I want to study. That sum operation has the permutation invariance we need, and the linear weighting in energy lets us deal with the energy flow, with the safety. And remarkably, if you specialize the Carnegie Mellon Deep Sets result to this case of linear energy weighting, you can actually prove that this functional form describes any safe observable. It also turns out to have excellent jet classification performance, basically state-of-the-art performance, with fewer parameters than other methods in the literature.
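[Written out, the architecture just described takes the form

$$\mathrm{EFN} \;=\; F\!\left( \sum_{i=1}^{M} z_i \, \Phi(\hat{n}_i) \right), \qquad z_i = \frac{E_i}{\sum_j E_j},$$

where $\Phi = (\Phi_1, \ldots, \Phi_\ell)$ maps each particle's direction $\hat{n}_i$ into an $\ell$-dimensional latent space and $F$ acts on the energy-weighted sum. A minimal PyTorch sketch of this structure (layer sizes here are illustrative, not the published configuration):]

```python
import torch
import torch.nn as nn

class EFN(nn.Module):
    """Energy flow network sketch: output = F( sum_i z_i * Phi(n_i) )."""
    def __init__(self, latent_dim=64, n_classes=2):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(2, 100), nn.ReLU(),
                                 nn.Linear(100, latent_dim), nn.ReLU())
        self.f = nn.Sequential(nn.Linear(latent_dim, 100), nn.ReLU(),
                               nn.Linear(100, n_classes))

    def forward(self, z, angles):
        # z: (batch, M) energy fractions; angles: (batch, M, 2) particle directions
        # (e.g. rapidity and azimuth). Summing over particles gives permutation
        # invariance; weighting linearly by z gives infrared/collinear safety.
        latent = (z.unsqueeze(-1) * self.phi(angles)).sum(dim=1)
        return self.f(latent)

model = EFN()
z = torch.rand(8, 50); z = z / z.sum(dim=1, keepdim=True)  # 8 jets, 50 particles each
angles = torch.randn(8, 50, 2)
print(model(z, angles).shape)  # torch.Size([8, 2])
```

[Because the particles enter only through the sum, the output is manifestly permutation invariant; because each particle contributes linearly in its energy fraction, the learned observable respects infrared and collinear safety.]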
Now, I've hidden the neural networks baked into the Φ function, and I've also hidden the neural network baked into the F function; what the F function does is synthesize all of these Φ_a functions into a final solution. And you can think of the Φ_a as verification functions: I can verify that the machine is doing something sensible by looking at this internal representation, which we often call a latent space. Look at the latent-space representation of a jet and try to figure out whether that representation carries the physical meaning we want. For people familiar with convolutional neural networks, this is similar to plotting a filter activation; in this case I'm directly plotting what the verification function looks like. You can think of this verification function as asking the machine: hey, what part of the flow of energy were you focused on? In the pictures I'm going to show momentarily, you'll see a representation of what the machine learned to do for a particular jet classification task, and what information it thought was important for solving that problem.
And what we're going to see is that the way the machine looked at that information has physical meaning. Again, I've already put physics in here: I've put in the energy flow concept, I've put in permutation invariance, and I've capitalized on the machine learning advances on the CS side from the CMU Deep Sets architecture. Now I want to see whether the machine has learned additional physical principles in the way it approached solving this jet classification task. What's really cool is this visualization; it's kind of psychedelic, but it's a representation of what the machine learned, and I'll try to walk you through it. Again, what I'm doing is a very simple binary classification task: tell the difference between sprays of particles coming from quarks and sprays of particles coming from gluons. The sprays of particles are going into the screen here, with the center of each spray at the center of these boxes, and in these boxes I'm representing where the machine is paying attention; each of these rings you should think of as a filled-in blob. What you can see is that the machine has decided to pay attention to small blobs near the center of the jet and larger blobs near the periphery of the jet. The number of these verification functions basically sets the resolution the machine has available to reconstruct the jet: we start with a latent dimension of 8, then 16, then 32, then 64.
You're already starting to see that the machine has learned very interesting patterns. The blobs are arranged symmetrically around the central point: going around in azimuth about that central point, it pays attention to information in a uniform way around the circle, but with finer resolution at the center than at the periphery. And in the scaled-up, most psychedelic image, you see a kind of self-similar, fractal structure. That fractal structure is not an accident: it is the known fractal structure of the strong force, again part of my own domain knowledge. Without me telling the machine that fact, just from some baseline facts about quantum field theory, it in some sense inferred another fact about quantum field theory: there is a logarithmic scaling of information as you go toward the center of the jet. The machine has figured out the scaling of the strong interactions. To give you some jargon, this scaling is called Altarelli-Parisi scaling, or DGLAP evolution; it's been known since the 1970s, it's the kind of thing that's in particle physics textbooks, and it's just completely amazing to me that the machine figured it out. The way you would infer this: take the blobs, ask for each one its radial distance from the center of the jet and the size of the blob or pixel it reconstructed, and fit a scaling relation. Exact Altarelli-Parisi scaling would give a slope of two; the slope of 1.6 here comes from effects we actually understand on the particle physics side. Knowing the scaling, you can make an even more psychedelic representation of the information: there's logarithmic scaling going toward the center of the jet, and if I project the psychedelic image into a space that basically undoes that logarithmic scaling, you get this kind of Jackson Pollock-esque image, showing that the machine has figured out how to pay attention to information in a uniform way, not uniform in the way a standard convolutional neural network would pay attention to things, but uniform in this logarithmic space. And the fact that you see roughly uniform pixels in this representation is evidence that the machine has really picked up on a key feature of the physics problem.
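[As a sketch of the slope measurement just described, here is the log-log fit in miniature; the blob radii and sizes below are invented numbers chosen to mimic the reported behavior, not actual measurements:]

```python
import numpy as np

# Hypothetical blob measurements: radial position r of each latent-space blob and its
# reconstructed size s (made-up values for illustration only).
r = np.array([0.02, 0.05, 0.10, 0.20, 0.40])
s = np.array([0.004, 0.017, 0.055, 0.180, 0.550])

slope, _ = np.polyfit(np.log(r), np.log(s), 1)
print(round(slope, 2))  # ~1.6 here; exact Altarelli-Parisi scaling would give 2
```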
Now we can take this key feature, bake it back into our machine learning architecture, and see what we can learn next. This is an example of teaching a machine to think like a physicist, and it responded by learning something cool: it learned the fractal structure of the strong force.

Now, the fractal structure of the strong force is something that was already known, and in backup slides, for people who are interested, I can show you something new that it taught us, even more technical than what I've said already. But I hope this gives you a flavor of the type of reasoning we're doing in my research program, and that we want to push out to the institute as a whole: incorporating domain knowledge, relearning things we already know but through this machine learning lens, and then using that to discover things we didn't know about our data sets. And just to put a punctuation mark on this point: this collision course is not just the example I showed you here, gaining new insights in fundamental physics facilitated by advances in mathematics, statistics, and computer science. Just within my own research group, with my students Patrick and Eric, we've seen incredible connections. For example, there's a field of machine learning called blind source separation, where you try to learn from unlabeled data sets; leveraging that idea, we were able to rigorously define something that's a little ambiguous even from the perspective of quantum field theory, namely what a quark or a gluon actually is. It turns out that a half century of collider physics data analysis strategies can be translated into the language of optimal transport; for people who know what the earth mover's distance is, it turns out to be connected in a very interesting way to things we've been doing for 50 years in collider physics. Even fields like graph theory turned out to be highly relevant, and this is an example of the vice versa kicking in: we had a problem in collider physics that we could translate into a graph theory picture, and understanding the collider physics allowed us to extend an entry in the Online Encyclopedia of Integer Sequences about a particular property of graphs, using information from the physics side to answer a simple question about graph counting.
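[To give a flavor of the earth mover's distance mentioned above, here it is between two toy "events," treating each as an energy-weighted distribution over a single angular coordinate. The data are invented, and the actual collider application solves a full two-dimensional optimal-transport problem; this one-dimensional version just shows the concept:]

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two toy "events": energy-weighted distributions over one angular coordinate.
angles_a, energies_a = np.array([0.10, 0.30, 0.35]), np.array([50.0, 30.0, 20.0])
angles_b, energies_b = np.array([0.12, 0.28]), np.array([60.0, 40.0])

emd = wasserstein_distance(angles_a, angles_b, u_weights=energies_a, v_weights=energies_b)
print(emd)  # the "work" needed to rearrange one energy distribution into the other
```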
All of these examples with my students illustrate progress being driven by early-career talent with cross-disciplinary expertise: Patrick and Eric had a computer science background that they brought into the physics realm, which we think is quite powerful, and now they're able to take their physics knowledge and push it back out, for example into industry applications.

Trying to reproduce the story I just told you is, in some sense, what this NSF institute is all about. We are 20 physicists and seven AI experts across MIT, Harvard, Northeastern, and Tufts, and part of our excitement about this institute is that the Boston area really has a critical mass for transformative research at the intersection of fundamental physics and AI. At MIT in particular, it involves people working in many different fields, and even though we're all at the same institute, we are, as is true of many places, in our own silos. This institute gives us an opportunity to see the commonalities in our ways of approaching problems, and I'm really excited to see what happens when you take physicists and AI experts, put them together in a room, and see what types of new solutions come out. Internally, we've been calling this research ab initio AI, or AI squared. What do I mean by ab initio artificial intelligence? Well, artificial intelligence and machine learning strategies, that's pretty clear; ab initio means from first principles. So: machine learning that incorporates first principles, best practices, and domain knowledge from fundamental physics is what we're calling ab initio artificial intelligence. In the example I gave you, I invoked the notion of symmetries and I invoked verifiability, and there are many other first principles and best practices from fundamental physics that we hope to incorporate into the machine learning strategies we develop. I should also say that we're not the only ones thinking in this direction. Even if you just go back to convolutional neural networks, one way of stating what CNNs achieve, in symmetry language, is what's known as translational equivariance, or translational invariance, which in a physics context is associated with conservation of momentum; baking that translational symmetry into a CNN explains a lot of the power of CNNs in image recognition and image processing applications.
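[That symmetry statement is easy to check directly: with circular padding, a convolution commutes exactly with translations of the input. A small self-contained demonstration, not tied to any particular model:]

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 8, 3, padding=1, padding_mode="circular", bias=False)
x = torch.randn(1, 1, 32, 32)

shift = lambda t: torch.roll(t, shifts=5, dims=-1)  # translate along one axis
# True: shifting then convolving equals convolving then shifting
print(torch.allclose(conv(shift(x)), shift(conv(x)), atol=1e-6))
```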
In the case of energy flow networks, we took physical principles, the indistinguishability of identical particles from quantum mechanics and infrared-and-collinear safety from quantum field theory, and baked them into the architecture behind those psychedelic images I showed earlier. This is just an example of AI squared. On the ab initio side: a powerful strategy to analyze collisions at the Large Hadron Collider and try to understand the structure of the universe. On the artificial intelligence side: an efficient neural network that can deal with weighted point clouds. You take those two together and you have an example of cross-cutting research that goes across traditional disciplinary boundaries.
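As a hedged sketch of the structure just described, the following toy code mimics the energy flow network decomposition, output = F(sum_i z_i * Phi(particle_i)), with small random matrices standing in for the trained networks Phi and F, and then checks two of the physics properties mentioned above.

```python
import numpy as np

# Toy energy flow network: an energy-weighted Deep-Sets-style sum.
# The random matrices below are stand-ins for trained networks.

rng = np.random.default_rng(1)
W_phi = rng.normal(size=(2, 8))   # per-particle map Phi: R^2 -> R^8
w_f = rng.normal(size=8)          # event-level map F: R^8 -> R

def efn(z, angles):
    """z: (n,) energy weights; angles: (n, 2) angular coordinates."""
    phi = np.tanh(angles @ W_phi)              # Phi on each particle
    latent = (z[:, None] * phi).sum(axis=0)    # energy-weighted sum
    return float(np.tanh(latent @ w_f))        # F on the pooled latent

z = np.array([0.5, 0.3, 0.2])
angles = rng.normal(size=(3, 2))

# Permutation invariance: reordering identical particles changes nothing.
perm = np.array([2, 0, 1])
assert np.isclose(efn(z, angles), efn(z[perm], angles[perm]))

# Infrared safety: appending a zero-energy particle changes nothing.
z_ir = np.append(z, 0.0)
angles_ir = np.vstack([angles, rng.normal(size=(1, 2))])
assert np.isclose(efn(z, angles), efn(z_ir, angles_ir))
```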
I gave you an example from particle physics here, but our institute involves people working in a variety of different fields, and what all of these fields share in common is that they're extremely data rich. For example, gravitational waves, which I'll explain a little more in a moment: this is data streaming into detectors that are listening to the universe, listening for ripples in space-time that might come from colliding black holes. There are first-principles nuclear physics calculations, which I'll also explain a little more, where you have synthetic data: you put the dynamics of quarks and gluons on a computer and try to infer nuclear properties from that. We have people in our institute working in astrophysics; I mentioned particle colliders already; and in mathematical physics, where people who work in string theory need to solve problems in knot theory, for example, and find new ways of classifying knots. If you can do that successfully on the mathematical physics side with AI, you can actually do a better job in string theory research. And then dark matter: dark matter is a mysterious substance that is responsible for a lot of the structure in the universe, even though we haven't seen it directly, and we have researchers taking AI techniques into that domain to try to understand its nature. These areas and more are things we're excited about pursuing in the Boston area, and I'm going to tell you a little bit more about that in my remaining time.

So how are we going to accomplish this? What are our strategies for actually getting people in the same room, to view their data sets through this artificial intelligence lens, or, for people who come from the foundational AI side, to see the types of lessons that can be learned from fundamental-interactions data? One of the things we just launched is a postdoctoral fellowship opportunity, where we want to recruit and train a talented and diverse group of early-career researchers and spark interdisciplinary, multi-investigator, multi-subfield collaborations. We're looking for postdoctoral fellows who sit at the intersection of these various fields, and in particular people who might be overlooked in physics departments or in computer science departments because their work is on the boundary, giving them an opportunity for a postdoctoral experience that really establishes the viability of research at these intersections. We actually just had our first application round for these ifi fellows; we're in the process of reading through exciting research proposals, and it will be really great to see what these early-career researchers do as part of our institute.

In terms of the research that we're doing, I mentioned we have these three thrusts: artificial intelligence for theoretical physics, for experimental physics, and for foundational AI. This covers a wide variety of fields. We decided that intellectual diversity was really important, that people working in disparate areas should come together to use common strategies and techniques; that's how we decided to organize our institute, as opposed to focusing on one particular problem. The big problem we're trying to address is to define what you mean by ab initio artificial intelligence, what you mean by teaching a machine to think like a physicist. I'll give you just three examples in the next three slides, the ones that are color coded here. But just to say: we're working on standard model physics, which is related to what I just talked about, on collider physics, string theory, and astroparticle physics, as well as techniques to discover physical laws in an automated way just from raw data. We have people working on particle physics experiments, again related to what I just talked about in my presentation, on gravitational waves, which I'll describe more in a moment, and on multi-messenger astrophysics. And then on the foundational AI side, this idea of incorporating symmetries and invariances is a very hot topic in AI. There's speeding up control and inference, basically figuring out how you can run some of this AI fast enough to actually be useful in LHC applications, where we need 25-nanosecond cadences (a minimal sketch of the kind of arithmetic involved follows below). I'll mention a bit about physics-informed AI architectures, as well as the theory of neural networks: trying to understand why neural networks generalize as well as they do, using concepts from statistical physics to understand their great performance in terms of generalizability.
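As a flavor of that fast-inference challenge, here is a minimal, assumption-laden sketch of the fixed-point arithmetic that FPGA deployments typically rely on: quantize weights and inputs to 8-bit integers, do integer multiply-accumulates, and rescale once at the end. All sizes and scales are invented toys; real trigger-level deployments are far more involved.

```python
import numpy as np

# Toy int8 quantized layer: the integer matvec is what FPGA fabric
# would compute; only the final rescale uses floating point.

def quantize(x, scale):
    """Map floats to int8 with a fixed (hypothetical) scale factor."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 16)) * 0.1   # toy layer weights
x = rng.normal(size=16)              # toy input features

w_scale, x_scale = 0.01, 0.05        # illustrative quantization scales
W_q = quantize(W, w_scale)
x_q = quantize(x, x_scale)

acc = W_q.astype(np.int32) @ x_q.astype(np.int32)  # integer multiply-accumulate
y_approx = acc * (w_scale * x_scale)               # single float rescale

y_exact = W @ x
print(np.max(np.abs(y_exact - y_approx)))          # small quantization error
```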
So let me spend the next three slides going through those three color-coded examples, to give you a little more flavor of what we're doing.

In the standard model of particle and nuclear physics, I'll give you an example from my colleague Phiala Shanahan: lattice field theory for nuclear and particle physics. I gave you a cartoon of the strong force, but I want to emphasize that the equations governing the strong nuclear force are known precisely, while precision computations using those equations are extremely demanding. If you take all the open supercomputing resources in the United States, around 10 percent, a little more actually, is devoted to numerical calculations of the strong nuclear force. And they're incredibly important in particle physics and nuclear physics; they're important, for example, for understanding dark matter. Dark matter is a mysterious substance that might bump into nuclear matter, and understanding the response of nuclear matter is something you need to do; you can gain insight into that by doing these numerical calculations. What my colleague Phiala did was collaborate with industry, with Google's DeepMind, to develop custom AI tools, custom generative models based on a technique called normalizing flows, that achieve a thousand-fold acceleration in a toy lattice field theory calculation while preserving symmetries and guaranteeing exactness of the results in the asymptotic limit. That's absolutely essential if you want to do precision calculations: to actually have guarantees of exactness, as well as an understanding of your uncertainties. What's really cool about this research is that even though it's aimed at nuclear and particle physics, the tools they designed are relevant for interdisciplinary applications, in particular for some challenges in robotics, where you have joints with mappings such that if you go around an angle by two pi you get back to where you started. That's related in a fun way to some of the symmetries you're trying to preserve in lattice field theory, and this architecture is in fact relevant for robotics even though it certainly wasn't originally designed for that application. So that gives you a little sense of how a challenge in fundamental physics is relevant more broadly.
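The exactness guarantee deserves a concrete illustration. One standard mechanism (stated here as my assumption about the general idea, not a description of the actual gauge-equivariant flows) is to sample from a tractable flow density q(x) and reweight by exp(-S(x))/q(x), so that expectation values converge to the exact answer even when the flow is imperfect. A single-site toy:

```python
import numpy as np

# Toy flow-based sampling with importance reweighting. The "lattice
# theory" is one site with action S(x) = (x - 1)^2 / 2, so <x> = 1
# exactly. The "flow" is a deliberately imperfect affine map of a
# standard normal; reweighting restores exactness asymptotically.

rng = np.random.default_rng(3)

def action(x):
    return 0.5 * (x - 1.0) ** 2

mu, sigma = 0.5, 1.3                  # wrong on purpose
z = rng.normal(size=200_000)
x = mu + sigma * z                    # samples from the "flow"
log_q = -0.5 * z**2 - np.log(sigma * np.sqrt(2 * np.pi))  # flow density

log_w = -action(x) - log_q            # unnormalized importance weights
w = np.exp(log_w - log_w.max())       # stabilize before normalizing
w /= w.sum()

print((w * x).sum())                  # -> approaches the exact value 1.0
```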
We also have work in experimental physics, in particular gravitational-wave interferometry at LIGO. When two black holes merge in some distant galaxy, their merger generates gravitational waves that ring out and hit detectors, and this detection of gravitational waves was awarded the 2017 Nobel Prize. But it's a very challenging problem on many fronts. Once you collect this data, basically the ringing you hear from these black-hole mergers, you have to reduce noise, and we have people in our institute working on reinforcement learning for noise reduction. You have to compare against theoretical calculations, and the theoretical calculations you'd have to do are extremely computationally demanding; we hope some of these techniques might speed that up. And then, if you want to correlate the gravitational-wave signature you saw in one place with what you see elsewhere, for example with follow-up telescope observations, you need to do fast inference. We have people in our institute who are taking machine learning and putting it on FPGAs to make sure these inference tasks can be solved as quickly as possible, so that you don't lose opportunities to do multi-messenger astrophysics, where you correlate black-hole signatures with neutrino signatures, with radio and optical and X-ray observations, and so on. So fast decision-making is something that's relevant in our institute.

Finally, in terms of foundational AI, a technique that appears in many disciplines, not just physics, is deconvolution. We have researchers working on how, from noisy or incomplete data, you do your best inference of what's really going on, and we think the unique features of physics applications and the power of these physical principles offer compelling research opportunities to advance the field of AI research more generally. For example, one of my colleagues, Demba Ba, works in a more neuroscience direction, trying to separate out neuronal signals using a technique he's developing called sparse coding networks, and it turns out that the same kind of mathematics is relevant on the fundamental physics side for trying to image black holes. An experiment called the Event Horizon Telescope, actually multiple experiments, is trying to image the central black holes of galaxies. They have to do a deconvolution task where they have incomplete information and must somehow reconstruct, as best they can, that beautiful ring you might have seen in the news, corresponding to the halo around the black hole. And we can capitalize on physics priors and on the interpretability of our methods for improved robustness.
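To illustrate why deconvolution needs a prior at all, here is a textbook Richardson-Lucy iteration (a standard method, not the institute's and not the EHT pipeline) recovering a sparse one-dimensional signal blurred by a known point-spread function; positivity of the estimate is the simple prior doing the work.

```python
import numpy as np

# Richardson-Lucy deconvolution on a toy 1D signal. Without some
# prior (here: positivity), deconvolution is formally ill-posed.

def cconv(x, k):
    """Circular convolution via FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

n = 64
truth = np.zeros(n)
truth[[20, 30, 45]] = [3.0, 1.5, 2.0]        # a few point sources
g = np.exp(-0.5 * ((np.arange(n) - n // 2) / 2.0) ** 2)
psf = np.roll(g / g.sum(), -(n // 2))        # normalized blur, centered at 0
data = cconv(truth, psf)                     # blurred data (noise-free here)

psf_adj = np.roll(psf[::-1], 1)              # adjoint kernel for circular conv
est = np.ones(n)                             # flat, positive starting guess
for _ in range(300):
    pred = cconv(est, psf)
    est *= cconv(data / np.maximum(pred, 1e-12), psf_adj)

print(np.round(est[[20, 30, 45]], 2))        # peaks re-emerge near 3, 1.5, 2
```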
As I mentioned before, another aspect of foundational AI is leveraging tools from physics to explain the ability of networks to generalize and actually give you robust answers to things like deconvolution, even though at a formal level deconvolution is an ill-posed problem: trying to understand why neural networks work as well as they do for tasks that might otherwise seem ambiguous.

So that gives you a sense of what we're trying to do in our institute, and I'm happy to take more questions about it. We have many other activities in terms of internal and external research engagement, workforce development, digital learning, outreach, broadening participation, knowledge transfer to industry, and shared resources, and I'm happy to talk about any of those that people are interested in. But just to summarize, let me give you some talking points and then a last slide. The talking points we want people to know about our institute: we think we have a compelling vision for the future of physics and AI research. We think that by fusing this deep-learning revolution with the time-tested strategies of deep thinking in physics, we can gain a deeper understanding of our universe as well as of the principles underlying machine intelligence, and we hope that some of these physics ideas will go back into the foundational AI domain. Our goal is to train the next generation of researchers working at the intersection of physics and AI. We have programs like the ifi fellowships, and we're also developing interdisciplinary PhD programs that offer unique opportunities for early-career researchers to pursue their interests, since there seems to be a growing number of students who really want to work at this intersection of the physics and computer science realms. And finally, the research we're doing in ifi has natural synergies with the emerging interdisciplinary computation and data-science institutes that are popping up all around, including places like Georgia Tech. Our view is that machine learning for the physical sciences is growing dramatically, and now is the time to start thinking about new faculty hires in this area; one of the things we hope ifi can do is generate the next generation of talent who could end up as faculty at these kinds of interdisciplinary centers.

So with that, let me leave you with the summary. We're trying to advance physics knowledge from the smallest building blocks of nature to the largest structures of the universe, and galvanize AI research innovation. I've given you one example from my own research in particle physics, and some hints of the other things going on at our institute.
We're excited about building strong multi-disciplinary collaborations and advocating for shared solutions that can work across subfields, and we look forward to collaborations and synergies with the broader AI community. I'm looking forward to your questions about our institute, questions about my research, and potentially even possible collaborative opportunities in the future. So thanks very much, and again, looking forward to your questions.

Great, thanks a lot, Jesse, for that nice talk. I don't think you can hear everyone else clapping. So here's a question from Charming Hill: the scientific projects are very compelling; besides the interdisciplinary postdocs, how does the AI institute allocate resources to select, facilitate, or even integrate these projects, maybe administratively?

Yes, this is an excellent question. What we decided to do, in the way we funded things, is to have shared resources. The ifi fellowship is shared, and we have various programs, like summer schools, conferences, and seminars, that are shared. But the way we were going to help this research happen was to make sure that the training was done at a per-PI level. For each of our senior investigators, we figured out exactly what they wanted to do and how many students and postdocs they would need to pursue it; then we went through trying to make sure we had good coverage of junior people in those areas, and we allocated people according to, in some sense, what the research opportunities were in those areas. Going forward, we're hoping to have students who benefit from a dual-mentorship model, where a student has a mentor on the physics side and a mentor on the AI side at the same time. But because getting a PhD does require specific domain knowledge, we decided that the funding would go to the senior investigators for the support of those junior people directly, with the ifi fellowships as kind of the glue putting it all together. We'll have to revisit that strategy, but that's what we tried to do: make sure each of our programs was well supported at the individual senior-investigator level, with the overarching umbrella and our shared activities as the way we do the synthesis.

Great. There's another question, from David Sherrill, who asks: integrating AI with domain knowledge is very interesting; you mentioned an example of building in permutational symmetries; do you have a general strategy, or strategies, for deciding what kinds of domain information to include and how to go about including it?
Good. So on this idea of permutation equivariance, I should emphasize that we're not the first people to think about integrating symmetries into AI architectures; this is a growing area, and in some sense that's the low-hanging fruit. Figuring out how to do it is very easy from the physics side. Why? Because we have an exhaustive categorization of all possible symmetries in all of our problems. I can tell you the symmetry structures not only of the standard model of particle physics but of all possible universes consistent with quantum field theory. So identifying those symmetries is not the challenge, and in fact incorporating those symmetries is not necessarily that challenging either, because of the advances happening in foundational AI in that area. Of course we still need to do it, and there are many symmetries I rely on that are not yet in AI architectures, but identifying them is not where I see the main challenge.

The main challenge is that I have all these things in gray at the bottom: symmetries, conservation laws, scaling relations, limiting behaviors, locality, causality, unitarity, gauge invariance, entropy, least action, factorization, unit tests, exactness, systematic uncertainties, reproducibility, verifiability, and it goes on. These are all things I do in my research program; these are all things I expect my graduate students to learn. And how do you translate these concepts into an architecture? Symmetries we know how to do. But factorization, where you basically say that certain physical processes can be broken into individual units that are synthesized together in a kind of Markov-chain-like way: how do you build an AI architecture that has that in there? And that has factorization in the precise way we mean it, not a standard Markov chain, but one tied specifically to my domain? It's been really instructive for me to work with foundational AI people and say: look, here is a formula I know is true; how do you build an AI architecture that satisfies that formula? And they say: well, I know how to satisfy that formula, that's no problem, but you don't have enough computing resources to actually implement it. So then you ask: how do you efficiently implement my equation, which obviously can't be the full one, so that it has exact factorization but only approximate information? How do I do that? And that becomes a dialogue.
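A hedged sketch of what "factorization by construction" can mean: a toy model whose joint probability is exactly a product of per-step factors, p(x_1, ..., x_T) = p(x_1) * prod_t p(x_t | x_{t-1}), with an untrained stand-in playing the role of each learned factor. The factorized structure is exact even though each factor is only approximate, which is the flavor of the trade-off described above.

```python
import numpy as np

# Toy Markov-factorized generative model: sampling and an exactly
# factorized log-likelihood. step_mean is a stand-in for a learned
# network; a, b, sigma are illustrative numbers.

rng = np.random.default_rng(4)
a, b, sigma = 0.8, 0.1, 0.5

def step_mean(prev):
    return a * prev + b            # stand-in for a learned per-step map

def sample(T):
    x = [rng.normal(0.0, 1.0)]     # p(x_1): standard normal
    for _ in range(T - 1):
        x.append(rng.normal(step_mean(x[-1]), sigma))
    return np.array(x)

def log_prob(x):
    lp = -0.5 * x[0] ** 2 - 0.5 * np.log(2 * np.pi)
    for prev, cur in zip(x[:-1], x[1:]):
        z = (cur - step_mean(prev)) / sigma
        lp += -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
    return lp                      # exact by construction: sum of factors

x = sample(10)
print(log_prob(x))                 # tractable, exactly factorized likelihood
```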
So essentially, in terms of a strategy, the strategy has been marching through all the things we do and asking, for each one of them, who in our institute might know the analog of that in the AI context, and then vice versa: if you have something useful in the AI context, what is the analog on the physics side? It seems that if you get physicists and AI experts in the same room and give them thirty minutes to just chat, new ideas pop up. That was certainly the case in this example in the middle here, where I talked about a half-century of collider physics connecting to the field of optimal transport. I knew zero about optimal transport; then I talked to my AI friends, and they told me, by the way, your data looks like something that I look at. Then I looked at their equations and realized, wait a second, your equations look a lot like our equations. It took us a couple of years to figure out the translation, but once we did, there were a lot of gains. So I didn't really answer the question, but it's getting people in the same room, articulating what those best practices are, and then trying to find that translation; that's the strategy that's been effective thus far, and we'll see how far it takes us in the future.

Right, so maybe a final question, and I think you're the right person to ask, given that you're a theoretical physicist. A lot of people talk about AI being able to discover laws directly from data. What do you see there: is that on the near horizon, or is it more long term?

Yeah, excellent. One of the things I want to emphasize here, let me just scroll back to the slide where I had this, is that the input we need is a properly specified problem. When you say you would like the machine to learn fundamental physics laws in an automated way, that sounds crazy, and indeed it is crazy, unless you can come up with a properly specified problem whose answer would be the laws of the universe. Let me give you an example from quantum mechanics. There's this famous black-body radiation spectrum, which is a consequence of quantum mechanics: you turn on an oven at a certain temperature, and the radiation that comes out has a very particular frequency spectrum, and that spectrum comes from quantum mechanics.
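For reference, the spectrum being described is Planck's law; in standard notation, the spectral radiance at frequency \(\nu\) and temperature \(T\) is

\[
B_\nu(T) = \frac{2 h \nu^3}{c^2} \, \frac{1}{e^{h\nu/(k_B T)} - 1},
\]

where \(h\) is Planck's constant, \(c\) the speed of light, and \(k_B\) Boltzmann's constant. The \(-1\) in the denominator, which comes from quantized oscillator energies, is where quantum mechanics enters.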
If you gave me all the data in the world on the black-body spectrum, how would I ever learn the laws of quantum mechanics? I would argue that you can't: if all you give me is the black-body spectrum, there's simply not enough information there to require quantum mechanics to explain it. There are many other simpler formulas that would explain the black-body spectrum without getting at its intrinsic structure. But then you go to chemistry, to atomic orbitals and binding, and there you have various tools for understanding those chemical properties, and those tools ultimately trace back to quantum mechanics. If I only had chemical data, I'd also have a difficult time inferring the structure of quantum mechanics. Somehow you need both: you need information about the black-body spectrum, you need information about chemistry, you need information about, for example, superconductors or semiconductors. There are all sorts of different data sets you would need, and asserting that they all have a common origin in quantum mechanics, that's the thing you'd want to learn from many, many example data sets, and then try to find an underlying principle. If you can turn that problem into a properly specified optimization problem, then you might hope to learn those underlying physical principles. So for the people in our institute who are thinking about automated learning of physical principles, what they're doing is asking: what does it mean to discover a new law? What does success look like? What is the output of the machine such that you would say, yes, this corresponds to something that might justifiably be called a new physical law? And then: how do you turn that into an optimization task, such that you can have alternate laws that aren't as good, and the best one is the one that is somehow most simple, explains the most data, generalizes in the appropriate way, has more symmetry, or what not? Coming up with rigorous definitions of what a physical law is, this is part of the deep thinking, and of this fusion of deep learning and deep thinking: first the deep thinking, to specify my problem in a rigorous way, and then comes the deep learning, having specified the problem rigorously, to take these AI tools and actually solve it in ways that are better than what a human could do.

Great, thanks a lot, Jesse, fascinating. I think we're already over time, so this is a good time to end. Thank you again, Jesse, for helping us end this semester's seminar series on a strong note. See you everyone next semester, and good luck, Jesse,
with your institute. Thank you, thanks so much.