I'm really delighted to introduce today's speaker, Dr. Jeffrey Markowitz. He's one of our newest assistant professors in the Department of Biomedical Engineering. He started last March or so — so, less than a year — which means we're getting to hear some things that are hot off the presses today. Jeff received his undergraduate degree in philosophy and writing from Johns Hopkins, with concentrations in math and philosophy from Oxford University. He received his PhD in computational neuroscience at Boston University under the mentorship of Tim Gardner, where he did pioneering work on population dynamics in the songbird motor system. Notably, this research required designing new carbon-fiber electrodes, as well as new statistical methods for analyzing neural activity that have since become much more ubiquitous in neuroscience. For his postdoc, he went to Harvard Medical School, where he was mentored by Bernardo Sabatini and Sandeep Datta and did a really deep and beautiful exploration of how mammals generate natural behavior. For this work, he developed new methods for 3D tracking using machine learning, with concurrent optical and electrophysiological recordings and stimulation in freely moving animals. He's won a number of awards, including the Burroughs Wellcome Career Award and a Packard Fellowship. One thing that's really exciting to me about Jeff's work is that throughout his career he's demonstrated a questions-driven approach: he's willing to try any technique, whether it's brand new and he has to do a bunch of troubleshooting on it himself, or whether he has to invent something new himself. That's something that makes me really excited to have someone like Jeff as a colleague. So I'm really excited for his talk today. I won't step on too much of what he's going to present, but I think you're all going to really enjoy it. So welcome, Jeff.

Can you all hear me okay? Good. Alright. So thank you for the invitation to take, what, about 300 steps from Whitaker down here to BDB. I'm really excited — I think I'm presenting some of this data for the first time to the larger Tech community. The game plan today is that I'm going to talk entirely about unpublished work. This is work that was mostly completed while I was a postdoc with Bernardo Sabatini and Bob Datta. To give you a sense for how I typically prefer to put these things together: it doesn't have to go this way — I'm happy to be a well-oiled machine and kind of regurgitate stuff to you — but I think it would be better for all of us if we keep this as interactive as possible. So if you have any questions, whether clarification, technical, or philosophical, I'm more than happy to indulge. And if there are questions online, I think I'll get flagged in the moment.

Yes? Hi — I think we don't see the screen share online, and I was wondering if you want folks to see the talk. Let's try this again. Okay, excellent, we're good. Alright, so without further ado, why don't we just dive right in. Actually, before I dive in: does anybody have any idea what this thing is here on the title slide? Don't feel bad if you don't know what it is — it also wasn't very well known to me when I came across it. This is actually a reinforcement learning machine that was built back in the early 60s by Donald Michie.
He was actually a bit of a pioneer — he was working on machine learning back before that was even a phrase people commonly knew. And this is a set of matchboxes that can play tic-tac-toe. It's called the Matchbox Educable Noughts and Crosses Engine, which is a very fancy name for something that can play tic-tac-toe. There are little pieces in each one of these matchboxes, and the pieces either get added or subtracted depending on whether the machine wins or loses at tic-tac-toe. After a few rounds of this, it becomes very good and can almost reach human-level performance. So even back in the early 60s, people were thinking hard about some of the things I'm going to talk about today.

What I'm showing you on the left is the first stage of a system I had the pleasure of working on as a postdoc. It's known as Motion Sequencing, or MoSeq for short. This is a system that was originally developed by Alex Wiltschko and Matt Johnson, with later work done by many folks in the lab, including Win Gillis, Sherry Lin, and Scott Linderman. What it involves — it doesn't have to involve this, but what it typically involves — is taking a mouse that's never seen an environment before, so it's just a mouse taken from its home cage and placed into a plastic bucket, which we will sometimes fancifully call an open field arena. The mouse is then left to do whatever it wants to do in that open field; it just mills about. The key here is that we film it from above using a 3D depth sensor — in this case a Microsoft Kinect, but now there are many more sensors you can use to do this sort of experiment for a couple of hundred bucks. You can actually do some pretty interesting things, maybe with pets or children, at home. This video here is 3D depth video — remember, we're filming this mouse with a 3D depth sensor — so every pixel is color-coded by its height from the floor. Cooler colors indicate pixels that are close to the floor; warmer colors indicate pixels that are higher up off the floor. I've preprocessed this data just a little bit: the background has been subtracted, so the mouse is sitting in a black hole, and I've centered and oriented the mouse computationally, so it's always in the middle and facing the same direction. What that means is the only way a pixel can change in this video is if the mouse changes its posture in three dimensions. Now, the real secret sauce here is that we take this data and send it to a probabilistic time series model that, in a fully unsupervised way, can identify everything the mouse does in the open field. Every color in this scrolling visualization indicates a different behavior performed by this mouse — things like runs, turn left, turn right, different phases of grooming, all the things you'd expect a mouse to do if it has no task, no reward, no stimuli, no instruction. I should say that this model has been specifically tuned to focus on sub-second behaviors, on the order of 300 to 700 milliseconds, and in a standard experiment it might identify something like 20 to 40 different behaviors. Now, the seed of what I'm about to tell you today came from looking at many, many mice — mice of different ages, different sexes, whatever — and putting them in the open field. And we noticed something kind of strange, which I'm going to summarize using what we call a state map, or state diagram.
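Before getting to that state map, here is a minimal sketch of the kind of pipeline just described — centered, oriented depth frames reduced with PCA, plus an autoregressive hidden Markov model whose discrete states play the role of behavioral syllables. This is illustrative only, not the actual MoSeq code; the data, dimensions, and parameters are hypothetical stand-ins.

```python
# Illustrative sketch of the MoSeq-style idea described above
# (not the lab's actual code; arrays and parameters are hypothetical).
import numpy as np

rng = np.random.default_rng(0)

# Suppose `frames` holds centered, oriented depth images: (T, H, W).
T, H, W = 1800, 80, 80
frames = rng.random((T, H, W)).astype(np.float32)   # stand-in for real data

# 1) Linear dimensionality reduction (PCA) on the flattened frames.
X = frames.reshape(T, -1)
X = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pcs = X @ Vt[:10].T                                  # first 10 principal components

# 2) The AR-HMM assumption: at each time step a discrete hidden state z_t
#    (the "syllable") selects a linear autoregressive rule for the continuous
#    pose dynamics:  y_t = A[z_t] @ y_{t-1} + b[z_t] + noise.
#    In the real pipeline the AR-HMM is *fit* to `pcs` (EM / Gibbs sampling);
#    here we just sample from the generative model to show its structure.
K, D = 5, 10                                          # syllables, PC dimensions
A = np.stack([np.eye(D) * 0.95 for _ in range(K)])    # per-state dynamics (would differ per state in a real fit)
b = rng.normal(scale=0.1, size=(K, D))                # per-state offsets
P = np.full((K, K), 0.02)                             # transition matrix with
np.fill_diagonal(P, 1 - 0.02 * (K - 1))               # "sticky" self-transitions

z = np.zeros(T, dtype=int)
y = np.zeros((T, D))
for t in range(1, T):
    z[t] = rng.choice(K, p=P[z[t - 1]])               # discrete syllable switch
    y[t] = A[z[t]] @ y[t - 1] + b[z[t]] + rng.normal(scale=0.05, size=D)
```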
What we noticed is that even though these mice are just being placed into an open field and left to do whatever they want to do, there was actually something in common in their behavior, and it has to do with its statistical structure. In this diagram, every circle indicates a different behavior identified by our system — sometimes we'll refer to these behaviors as behavioral syllables, by analogy to the songbird literature. The size of each circle tells you how frequently that behavior is used, and the lines connecting any two behaviors tell you how frequently one behavior transitions into the next. There's a lot going on here, but the one thing you might appreciate is that there's a bunch of stuff in the middle with lots of lines, and then there seem to be not so many lines out on the periphery. What this means is that across the many, many mice we've run through this system over the years, for some reason these behaviors in the middle are more deterministic — in other words, if any mouse is performing this behavior, in all likelihood it's going to transition into this one or this one. But if a mouse finds itself in the hinterlands out here — there are many thin lines that have been duly pruned away — its next action is almost a coin flip, right? And this is consistent across all mice spontaneously moving around an open field. So what is up with that?

Well, I encouraged questions. So can I ask a question? Yeah, absolutely. You said we could ask philosophical questions. How do you distinguish between discrete behaviors versus a continuum of behaviors? It seems like, with more and more discrete behaviors, at some point it becomes a continuum. I'm just curious what you all were thinking during this process.

Yeah. So to rephrase your question: I'm telling you that there are 20 to 40 different behaviors here, which assumes there's some boundary between these behaviors — that they're discrete in some sense. But there's also a continuous aspect to behavior: behavior is continuous in space and time. So how do we deal with this? The way we deal with it is to push that question to the model. The model is an autoregressive hidden Markov model. That means we assume what we observe is continuous — things fluidly move in space and time, the mouse's body doesn't warp in unnatural ways — but that there are discrete boundaries between different modes of moving around continuously. That's the discrete part, the hidden Markov part. We've tested different models — fully continuous models, fully discrete models — and they tend to be less predictive of mouse behavior. Because these models are probabilistic, we can have them generate behavior and then ask: is what they generate normal? Does it look realistic? It turns out this model does a pretty good job, at least within the classes of models we've tried. So it's kind of an empirical answer to your question, but that's where we are at the moment.

Any other questions? Okay. So we noticed this weird thing in common between all the mice that had been run through this system, and given this, we speculated that potentially there's something in the central nervous system that dictates this structure. That led us to an area that we've come to know and love over the past five or six years. It's highlighted here.
It's called the striatum. For the uninitiated, this is a sagittal cartoon of a mouse brain — it's as if we've cut the head in half like this and we're looking from the side. Now, the striatum is an interesting structure. It sits at the interface with cortex: it receives information from motor areas, sensory areas, somatosensory areas, and then it sends all that information along to the rest of these structures that are collectively known as the basal ganglia. Work that I've been involved in, and work from many other groups, has implicated the striatum in the encoding of what behaviors animals perform moment to moment, and it's been demonstrated that this area is causally involved in the process of deciding what actions to perform moment to moment — a process sometimes referred to as action selection. Now, fortuitously, the striatum receives a dense dopaminergic projection from the midbrain, from the substantia nigra pars compacta. Dopamine is a neuromodulator that we all know is sufficient to shape the statistics of an animal's behavior. This comes from work that dates back at least half a century, going back to the pioneering work of Olds and Milner, who put metal wires into a rat brain. If those metal wires happened to be near dopamine neurons, and they gave the rat the ability to pass current through that wire by pressing a lever, the rat would become addicted to pressing the lever. The reason the floor here is metal is that in a subset of experiments they actually electrified the floor to see if the rat would walk over that electrified floor to get to the lever. Turns out they do. So dopamine can get a rat to press a lever at the expense of almost everything else. We can also optogenetically stimulate dopamine neurons — zoom forward to the late 2000s — and it was demonstrated that stimulation of dopamine neurons is sufficient to get a mouse to spend more of its time in one chamber versus another. Then, in some really beautiful work from the Roberts lab, it was demonstrated that optogenetic stimulation of dopamine neurons could get a zebra finch to shift the pitch of a single note in its song either up or down. So across many different species and contexts, it's been clearly shown that dopamine can shape the statistics of what an animal does. But almost everything we know comes from contexts like these, where we focus on a single behavior or a single class of behaviors, or where we focus on animals that have been restrained and can only interact with the experimental apparatus through a single action like a lever press or a nose poke. The question we asked was whether dopamine can modulate the statistics of action selection when the mouse is free to do whatever it wants — when it can pick and choose among 20 to 40 actions in the open field, and it's been given no instruction, no reward, and no stimuli. Does dopamine actually modulate the statistics of what it does? To test this question, we expressed the genetically encoded biosensor dLight. This is a biosensor you can put in a neuron, and when it binds dopamine, it fluoresces green. We expressed it in many neurons in the striatum, and then we put an optical fiber above the cells expressing dLight, so whenever they fluoresce green, we pick up all that fluorescence and get a bulk optical signal that tells us whether dopamine is going up or down. We do this in freely moving mice, and we also do MoSeq.
Now, before I animate this, I should preface it by saying that the dominant idea for why dopamine is so good at shaping animal behavior is that it signals something unexpectedly good or unexpectedly bad. It's thought that dopamine neurons increase their firing rate when something really good happens that you weren't expecting — you get some food reward, awesome, dopamine neurons go up — and when something unexpectedly bad happens — I was expecting food and it didn't arrive — dopamine neurons tend to be inhibited. So what would we expect if a mouse is just spontaneously moving around? We would expect dopamine to be really flat. Well, what we found was something pretty different, which is that dopamine seems to be going pretty crazy when a mouse is just milling around doing its thing in the open field. In fact, this mouse has been in this environment — this plastic bucket with no features in it — about six or seven times, so it's been heavily habituated to this environment, and dopamine is still fluctuating wildly. Now, when we observed this, we thought: okay, we're getting these fast and frequent fluctuations, but maybe these are just really small changes in absolute dopamine concentration; maybe it's not really meaningful to the animal and we're fooling ourselves a little bit. So to test this, we did something pretty simple, which is we food-deprived the animals for about a day and then put chocolate chips in the open field. If you food-deprive a mouse for a day, it's pretty hungry — and they like chocolate chips even when they're not hungry — so these mice really want those chocolate chips. We just aligned our dLight fluorescence to when the animals approached the food they really, really wanted; that's in white. And in orange are the fluctuations we observed spontaneously — these are aligned to nothing, just the mouse moving around doing its thing. It turns out they're about the same magnitude. So this confirms that these fast and frequent spontaneous dopamine transients are on the same order as what we observe when aligned to an unexpected food reward, which means they should be meaningful to the animal.

The next thing we wondered was: sometimes there appear to be more transients than at other times, so what is actually driving the variability in this dopamine signal? Well, we looked at something that many groups have looked at recently in the context of the moving mouse. The Dombeck lab, Rui Costa's lab, and Josh Dudman's lab have all seen very strong and robust correlations between simple movement parameters and dopamine — whether in the context of running on a treadmill, running on a floating ball, or moving in the open field. So we did something very simple, which is we looked at the correlation between our dopamine signal and the animal's speed in the open field, and we looked at that correlation at different timescales: we just binned both data streams at longer and longer bin sizes and ran a correlation. What we found is that at sub-second timescales there's actually a strong negative correlation, and interestingly, as we look at longer and longer timescales, this correlation starts to trend positive. So there are a couple of interesting things going on here. One, we see what other people see — cool, so we think we're on reasonable footing. Two, this correlation actually flips sign as you look at longer and longer timescales.
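To make that binning analysis concrete, here is a minimal sketch of the idea — not the actual analysis code; the sampling rate and the dlight and speed arrays are hypothetical.

```python
# Illustrative sketch of the timescale analysis described above (assumed 30 Hz
# signals; `dlight` and `speed` are hypothetical 1-D arrays of equal length).
import numpy as np

def binned_correlation(dlight, speed, bin_sizes, fs=30.0):
    """Pearson correlation between dLight and speed at coarser timescales."""
    out = {}
    for seconds in bin_sizes:
        n = int(round(seconds * fs))              # samples per bin
        m = (len(dlight) // n) * n                # trim to a whole number of bins
        d = dlight[:m].reshape(-1, n).mean(axis=1)
        s = speed[:m].reshape(-1, n).mean(axis=1)
        out[seconds] = np.corrcoef(d, s)[0, 1]
    return out

# Example usage: sub-second bins through multi-second bins.
# corrs = binned_correlation(dlight, speed, bin_sizes=[0.3, 1, 5, 15, 60])
```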
So there has been a heterogeneity of answers with respect to what the correlation between dopamine and movement is, and we think that's at least in part due to focusing on behavior at different timescales. That's a little bit of an aside, but the timescale at which you look at behavior can deeply impact your answer. Going back to our original question about variability: while we found a correlation between velocity and dopamine, the absolute magnitude of that correlation isn't huge, right? There's still a lot of variability left in the signal. So what else should we look at? We thought: these transients are fast and frequent, and the behaviors we focus on are 300 to 700 milliseconds — both of those seem pretty fast — so maybe dopamine has something to do with these behaviors that we pick up with MoSeq. So what we did was align our dLight signal to the transition — oh yeah, question?

Thanks. How long does dopamine take to have an effect?

Yeah, so your question is: what is a reasonable timescale on which to expect dopamine to have some impact on the physiology of cells, and maybe downstream on behavior? This has become an interesting meta-question in the field, and the reason I say that is that we're focusing on a particular part of the striatum, the sensorimotor striatum, which is in the dorsolateral compartment. If you look in different parts of the striatum, the speed of dopamine release changes dramatically. If you go from dorsolateral to dorsomedial, it gets slower. In dorsolateral it looks like you have a transient maybe once every two seconds, and the on-kinetics look like 60 to 70 milliseconds — very fast. Dorsomedially, the timescale seems to be slower by about 50 percent. If you go ventral, it gets slower by almost an order of magnitude. So there's something very interesting going on with different timescales in different parts of the striatum. Now, if we think about how long it takes dopamine to impact the physiology of a cell, that can be tens of milliseconds — it's known to rapidly depolarize and hyperpolarize cells. It's also known to have impacts on downstream signaling cascades involving PKA, which can take a little bit longer. And it can impact plasticity, which can be on a timescale of maybe tens of seconds to a couple of minutes. There's a lot going on with dopamine. Maybe we can revisit this when we get to the manipulations, where I think it becomes very relevant. So really, it's natural to think about it on all timescales, which makes it a little bit confusing to think about.

Okay, so we did something simple, which is we looked at the transition from any behavior that we pick up to any other behavior — whenever the animal switches into something new; there are hundreds of thousands of transitions in our dataset. And we asked: does dLight seem to go up whenever the animal transitions into something new? The answer is yes — there is this little blip. This green trace is the average onset-aligned dLight signal at all transitions in our data, and we find this nice little biphasic modulation. So that's cool — maybe dLight actually does encode something about these behaviors that we pick up with our system. The next thing we thought was: well, our system doesn't just identify transitions, it identifies what those behaviors are — turn left, turn right, runs, rears, et cetera.
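Here is a minimal sketch of the onset-aligned averaging just described — illustrative only; the trace, onset times, and window sizes are hypothetical stand-ins.

```python
# Sketch of an onset-aligned (event-triggered) average, as used above.
# `dlight` is a hypothetical 1-D trace; `onsets` are sample indices of syllable onsets.
import numpy as np

def onset_aligned_average(dlight, onsets, fs=30.0, pre=0.5, post=1.0):
    """Average the dLight trace in a window around each behavior onset."""
    w_pre, w_post = int(pre * fs), int(post * fs)
    snippets = [
        dlight[t - w_pre : t + w_post]
        for t in onsets
        if t - w_pre >= 0 and t + w_post <= len(dlight)
    ]
    snippets = np.asarray(snippets)               # (n_events, window)
    return snippets.mean(axis=0), snippets        # mean waveform + per-event traces
```

The per-event traces can then be sorted (for example, bottom versus top quartile by peak dLight) to see how variable the signal is for a single behavior.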
And so we looked at the average onset-aligned dopamine waveform for all 37 behaviors that our system picked up, and we sorted this data from the most positive to the most negative — that's why this little heat map goes from red at the top to a little more blue at the bottom. There is some diversity when we look at different behaviors, which is interesting. When we saw this, we thought: okay, maybe what's going on is that dopamine is really high for rears, or runs, or some class of behaviors that dopamine is really relevant for, and maybe it goes down for other types of behaviors. That would be really interesting. It turns out that's completely wrong. While we thought dopamine might encode the kinematics of behavior, once we started digging into this we saw something that initially really confused us. If we take two behaviors — 1 and 37, so one is pretty high up here with very positive modulation, and 37 down here looks pretty negative — those are two types of rearing. Those are behaviors that are ostensibly pretty similar to one another: they're just different ways of lifting your body off the ground and maybe having a little sniff to see what's in the air. In one case you have seemingly a lot of dopamine, and in the other case the dopamine is completely flat. That's really weird. Even weirder: if you take two rows that are very close on this heat map — behavior 2 and behavior 8 — these look very similar in their onset-aligned averages, but they're two totally different behaviors. In one, the mouse is reared up a little and we call it an investigate — it's just looking around. In the other case, it's a sharp turn left. If you look at their velocity profiles, they're polar opposites of one another. So what this tells us, maybe, is that dopamine is not strongly encoding the kinematics of the animal's behavior. It's even worse than that. If we look at the same behavior — in this case, the average onset-aligned dopamine waveform we get for a pause — it turns out there's a lot of dopamine at the onset of this behavior. We can look at individual instances of that pause and sort them by whether dLight was high or low. If there were just a little bit of variability, we might get a low positive peak and a high positive peak. But what we find is that when we sort these individual instances, the bottom 25 percent actually have strongly negative-going dopamine waveforms and the top 25 percent have strongly positive-going dopamine waveforms. So even for the same exact behavior, dopamine is doing totally different things.

Yes? I was wondering, in an animal that's not moving, whether or not these transients look similar. Yeah — so the question is, if we look at behavior more broadly, and the animal is doing stuff at some points in time and not doing stuff at other points, do we find more dopamine when they're doing stuff versus not? For head-fixed animals I can't answer. What I can say is that if the animal isn't moving that much but is still changing its posture slightly, we still see dopamine transients. There is a rough correlation with how frequently they move — that's the minor positive correlation at long timescales — but it does not explain all the variability in the signal; it's a correlation of 0.1 at best. So there is an approximate correlation, but it's not super strong. Any other questions?

Yeah — so if kinematics doesn't seem to explain everything in the signal, what's left?
Well, at the beginning of the talk I alluded to the statistics of behavior. A simple way we can look at the statistics of what a mouse does is how frequently each behavior is used. The way we looked at this was simply to take the average number of times a behavior was used — averaged per mouse and per syllable — and plot that against how much dLight was associated with that behavior around the onset of that syllable. We find a robust, if modest, correlation between these two variables. So this tells us that there is some coding for how frequently a behavior is used, on average, over the course of a 30-minute session. What about within that 30-minute session — is dopamine predicting what the animal does moment to moment? We can crack this open a little bit. In blue, I'm showing you the correlation between dLight and how frequently a behavior is used, for the same behavior, over larger and larger bin sizes. We find that there's a strong correlation that dissipates exponentially with a timescale of about one minute. So it looks like if dopamine is elevated for one behavior, that behavior is used more for about the next minute. And that timescale is much faster than what we see if we look at the autocorrelation of dLight itself, or at the correlation between dLight and velocity. That's really interesting, because it tells us that dopamine has something to do with the statistics of behavior. But how specific is this? Is it signaling the exact behavior associated with that transient, or is dLight also associated with neighboring behaviors, since it seems to spread really far in time? It turns out this correlation is really specific. There's a good reason we focused on behavior at this timescale: if we look at the correlation between dLight and how frequently a behavior is used, the correlation is maximized for the current behavior, and if we look back even two behavioral syllables, that correlation falls back to chance. In other words, if I take behavior and dLight and just slide one of those signals by one or two behaviors, the correlation is completely eliminated. So this tells us that dLight is doing something very temporally specific.

Now, there's another way we can think about the statistics of behavior. I told you that dLight is correlated with how frequently a behavior is deployed, but behaviors can also be used in different sequence contexts. The way we're going to hash this out is by looking at how randomly a behavior is used: we look at the outgoing transitions from each behavior, and then we look at the entropy of those distributions. A low-entropy, or deterministic, behavior might only have a couple of different behaviors it transitions into; a high-entropy behavior might have many downstream partners. In other words, one is more like a coin flip — that's high entropy — and one is more deterministic — that's the low-entropy behavior. It turns out this is also strongly correlated with dLight. If we take the average outgoing transition entropy associated with a behavior, we find that it nicely maps onto the amount of dLight that occurs during that same behavior. And again, we can look within a session, and we find that on a second-by-second basis there is a nice mapping between dLight levels and how randomly a behavior is used.
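For concreteness, here is a minimal sketch of the outgoing-transition entropy just described — illustrative only; the label sequence and syllable count are hypothetical.

```python
# Sketch of outgoing-transition entropy. `labels` is a hypothetical sequence of
# integer syllable IDs (one per frame or per syllable instance).
import numpy as np

def outgoing_entropy(labels, n_syllables):
    """Entropy (bits) of each syllable's outgoing transition distribution."""
    # Collapse runs so we only count transitions between *different* syllables.
    seq = [labels[0]] + [s for a, s in zip(labels[:-1], labels[1:]) if s != a]
    counts = np.zeros((n_syllables, n_syllables))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    probs = counts / np.clip(counts.sum(axis=1, keepdims=True), 1, None)
    # Low entropy = deterministic outgoing transitions; high entropy = coin-flip-like.
    return np.array([-(p[p > 0] * np.log2(p[p > 0])).sum() for p in probs])
```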
But this correlation, unlike what we see with how frequently a behavior is used, decays very quickly — the x-axis here is very zoomed in relative to what I showed you before. Remember, I said that if dopamine is elevated, that behavior is used more for about a minute. Here, it looks like if dopamine goes up, the behavior is more randomly used on the order of seconds. So this is a much faster effect than what we see with frequency.

What I've told you in this first part is that, correlationally, dopamine appears to modulate the statistics of action selection when the mouse is free to do whatever it wants. But we want to do better than that: we want to causally test this idea. And to do that, we need to go in and actually dial in dopamine levels during specific behaviors — we can't just observe what's going on. To do that, we needed to rebuild MoSeq from the ground up. Something I didn't mention when I told you how great MoSeq is, is that it typically involves taking a bunch of data, sending it to a compute cluster, and getting back your results four hours later. But if we want to deliver dopamine during a specific behavior, we need to do what normally takes four hours in about 200 milliseconds, because these behaviors are 300 to 700 milliseconds long. We need to know what the mouse is doing in the moment and then respond to it with some perturbation. I'm not going to belabor the technical details too much unless somebody wants to discuss them. In short, what we did was replace a lot of the offline image processing with — guess what — a deep neural network, in this case a denoising autoencoder. This takes raw, noisy depth frames and spits out a nice clean depth image, and we subject that to the same processing steps we use in our offline version of MoSeq: linear dimensionality reduction via PCA, and then a time series model — as I mentioned in response to Garrett's question, an autoregressive hidden Markov model, which we've modified slightly to be more tolerant of the kind of noise you get with real-time tracking. This system works much, much faster than what we used before, so now we have at least the computational ability to pick up what the mouse is doing and deliver some sort of response to it.

Now, the way we're going to manipulate dopamine during a specific behavior is optogenetically, using transgenic expression of channelrhodopsin. This mouse is the result of crossing a DAT-IRES-Cre mouse with an Allen Institute Ai32 channelrhodopsin reporter, which means all dopamine neurons have channelrhodopsin; we can see the neurons expressing it because it's fused to YFP. So, cool, we have transgenic expression of channelrhodopsin. The way we're going to stimulate these neurons is by parking an optical fiber bilaterally over the axons that originate in the SNc and terminate in the dorsolateral striatum. This triggers dopamine release specifically in the area we were recording from in the first part of the talk. Importantly, we calibrated this excitation to mimic what we see spontaneously, so we're not driving the system into some aberrant, non-physiological mode — we're delivering the same amount of dopamine that this brain area would see during natural spontaneous dopamine fluctuations.
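Stepping back to the closed-loop pipeline itself for a moment, here is a schematic sketch of the control loop just described: grab a depth frame, denoise it, project it onto the principal components, infer the current syllable with the real-time model, and pulse the laser only when the targeted syllable is detected. Every function here is a hypothetical stub standing in for hardware and pre-fit models, not the lab's real code.

```python
# Schematic, stub-based sketch of the closed-loop logic described above.
import numpy as np

TARGET_SYLLABLE = 17            # the one behavior chosen for stimulation
FRAMES_PER_SESSION = 30 * 60 * 30   # 30 minutes at an assumed 30 Hz

def grab_depth_frame():         # stub: would read from the depth sensor
    return np.zeros((80, 80), dtype=np.float32)

def denoise(frame):             # stub: denoising autoencoder inference
    return frame

def project_to_pcs(frame):      # stub: apply the pre-fit PCA basis
    return np.zeros(10)

def infer_syllable(pc_history): # stub: online AR-HMM state estimate
    return 0

def trigger_laser(duration_s=0.25):  # stub: TTL pulse to the laser
    pass

pc_history = []
for _ in range(FRAMES_PER_SESSION):
    frame = denoise(grab_depth_frame())
    pc_history.append(project_to_pcs(frame))
    syllable = infer_syllable(pc_history[-30:])   # short rolling window
    if syllable == TARGET_SYLLABLE:
        trigger_laser()          # calibrated, physiological-scale stimulation
```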
What I'm showing you in blue are the optogenetically evoked dopamine transients we get with our setup, and in green are the transients we see spontaneously. So we dialed things in to mimic the amount of dopamine release we see during a natural, spontaneous transient.

In terms of the actual experiment: every mouse underwent a three-week experimental protocol, with two 30-minute sessions every day. On the first day we simply recorded the mouse's behavior with the 3D depth sensor, just to establish a baseline. On day two, we turned on our closed-loop version of MoSeq, picked one behavior — and one behavior only — and the mice got a little blue laser pulse during that one behavior. In this case, it's a little rear-up-and-investigate. They get pulses during that same behavior both morning and evening. Then on the next day the laser is off, to establish a new baseline. Then we pick a new behavior — in this case, a walk forward — then we record a new baseline, and we rinse and repeat over the course of three weeks. So every mouse that enters this experiment gets closed-loop optogenetic stimulation during the same set of behaviors; we permute the order a little bit from animal to animal so we don't get any ordering effects.

I told you that endogenous dopamine predicts when an animal uses something more. So what happens if we deliver more dopamine during a specific behavior? What we get is a rapid steering of behavior toward the targeted behavior. This is the cumulative number of times the mouse performs the targeted behavior over baseline, plotted against time in minutes. What's really cool is that you can separate our optogenetic mice from controls within about a minute or two. So this learning is very rapid, and it's very specific: we don't see learning spilling over to other kinematically similar behaviors, and we don't see it spilling over to other behaviors nearby in time. It's that one sub-second behavior that we picked, and the mice do it more, within minutes.

Now, what about movements? Remember, I told you that dopamine is correlated with speed in the open field, and this has been a little bit of a controversy in the field. We find that with our calibrated optogenetic stimulation we get no signatures of induced movement. So even though elevated endogenous dopamine is correlated with speed, if we deliver physiological levels of dopamine, we do not induce movements — and we're quantifying that by looking at velocity, acceleration, angular velocity, and height. We've also looked more deeply: we took all the data during stimulation and asked whether we can classify these trials as different from controls, and we can't. So we find no signatures of movement with our calibrated stimulation. But if we crank things up quite a bit, we can get the same effects that other people have seen. If you stimulate for a little bit longer — take the same light level and just extend it out for a few seconds, so we deliver more dopamine — the mice start running more. So non-physiologically elevated dopamine can induce movement. What about sequence randomness? Remember, I told you that sequence randomness is correlated with endogenous dopamine fluctuations. We do find an effect here — and this is back at our physiological optogenetic stimulation level.
We find that for the next couple of transitions after stimulation, there's elevated randomness in the mouse's transitions. So from the closed-loop experiments, we've been able to recapitulate the effects on how frequently a behavior is used and on how random the animal's behavior is.

Now I'm going to talk a little bit about actually trying to link together the first and second parts of this talk. I told you that endogenous dopamine seems to be correlated with changes in the animal's statistics, and that we can go in and recapitulate those changes in closed loop — that's the causal test of the idea. Next, we want to determine whether endogenous dopamine sensitivity can predict exogenous dopamine sensitivity. I'll get a little more specific in a second — question?

So in this experiment, if you were to now stop delivering dopamine, do you still see higher statistics of the specific actions?

Yeah, that's a great question. You're asking: we have a baseline the day after stimulation — once we turn the laser off, are the mice still doing the targeted behavior more? The answer is yes, and it persists for about 12 to 24 hours; there's just a little bit of residue left, and that completely goes away by the second day after stimulation. We did separate experiments where we stimulated and then just recorded behavior for multiple days after stopping, and we find that by day two after stimulation it's all gone. So it looks like we can rapidly modulate behavior; the day after, there's still some learning left; by day two, it's all gone. So it's very fast.

Yes, another question. I was wondering whether there are certain behaviors that are difficult to reinforce and certain behaviors that are easier to reinforce. If only I had included that slide! The answer is yes. Your question is: do we see signatures of differential reinforceability from behavior to behavior? The answer is yes, and it's partially explained by how much endogenous dopamine is associated with that behavior. So you can actually predict which behaviors are easier to reinforce from the endogenous dopamine levels associated with that behavior. In a bit I'm going to tell you about mouse-to-mouse variability, not syllable-to-syllable variability, which is also an awesome question. Interestingly, it doesn't map onto a specific class of behavior — it's not like the grooms were more easily reinforced — it just happened to be the behaviors with more dopamine associated with them on average, which is pretty interesting.

Another question: I was curious whether how prevalent a behavior was before you started manipulating it predicted how easy it was to modify that behavior. Yes — just to repeat your question for the folks online: you're asking, if a behavior was used more before we started stimulation, is that behavior easier or harder to reinforce? It gets tricky, because all of these things are correlated: more usage is correlated with more dopamine at baseline, which is correlated with something being easier to reinforce. But we do find that behaviors used infrequently and frequently, if they have more dopamine associated with them at baseline, are easier to reinforce. Now, the other part is a little more technical: if a behavior is used more, it's going to get more stimulations during the stimulation session. A behavior that's used 200 times in 30 minutes is going to get more stims, and a behavior that's used 30 times is going to get many fewer stimulations, and that's a difficult thing to control for.
So I don't have a perfect answer for that part of the question.

Okay, here's the question I really want to ask. In your behavioral graph, you talked about the entropy from a state — from a state, what's the distribution of states it goes into. You can also look at almost the reverse of that: for a state, what's the entropy of the states that lead into it? Some behaviors, I would expect, are only entered into from a small number of other behaviors, and some behaviors are entered into from a wide range of other behaviors — it's easier, in some sense, to get into that behavior because you can get there from many different settings. Here's what I'd really like to know: is it easier to induce an uptick in a behavior with extra dopamine if the incoming entropy into a state is higher than for another state? Essentially, if it's easier in some sense to transition into a state — I'm thinking about the motor learning papers from Yu and Batista, right, where things that are already within a subspace that's easy to move in are easier to learn and to manipulate. I'm trying to ask if there's a behavioral correlate to something like that.

Yeah. So you're asking whether a behavior that is, in a sense, more modular is more easily reinforced, whether we're looking at incoming transitions or outgoing transitions. The answer is yes. We focused on outbound — it turned out to be more predictive — but inbound is also predictive. The complication, technically, is that inbound is correlated with outbound, so it's a hard thing to untangle unless you have lots and lots of data and you're really careful. But our sense is that modularity is predictive of how easy it is to reinforce a behavior: if it's easier to slot something into any position, it's just easier to upregulate it, and I think our data support that idea. You can see everyone's really interested in this part of the talk.

You may have said this, but have you done the opposite experiment, where you inhibit — can you suppress certain behaviors that way? Yeah — so you're asking, if we flip the sign to negative, what do we see? We did that, and here's the complication. The complication is that, as everybody knows, optogenetically inhibiting terminals bedevils optogenetic experiments — there are a lot of complications for a variety of reasons. So we tried lots of different opsins. It turned out halorhodopsin worked really well at axon terminals, and we did get suppression of the usage of the behavior. We didn't collect enough animals to look at sequence randomness, so I don't have an answer about that. We also did the experiment at cell bodies with GtACR2 and got the same result — you can suppress actions. The paper's provisionally accepted, and that's not going to be in the paper, but we do have the data, and it does look like things go down if you suppress. It's also complicated because you can't suck dopamine out with a vacuum: it's easier to pump a chemical into the extracellular space than it is to get it out. You can stop the cells from firing, but that doesn't remove dopamine as quickly as it goes in. So that's the other part that makes it a little apples-and-oranges. Any other questions? Anything online?
So I think that provided a beautiful segue into this third part, where we're going to talk about whether we can predict what happens in our optogenetic experiments based on what we see in response to endogenous fluctuations in dopamine. More concretely: if we look at all the animals that went through the optogenetic experiments — they're shown here on the left, and what I'm plotting now is the average number of times they perform the targeted behavior over baseline — what we see is that some animals really get it, like gangbusters, and some other animals just are not getting with the program; they don't seem to learn. Some of this could be due to trivial technical variability, and we checked everything we thought we should check: it's transgenic expression of channelrhodopsin, so they have the opsin; we can verify histologically that the fibers are intact; the mice seem normal; they're all age-matched and roughly size-matched. So why is it that some of them don't get it? This is a generic property in behavioral neuroscience, especially with mice, where it seems like some get the behavior and some don't. When we saw this, we speculated that potentially there's some latent variable we weren't carefully considering — maybe some mice are more sensitive to dopamine than others. You could easily imagine this maps onto something like dopamine receptor density, or you could cook up a lot of stories for why this would be the case. You might imagine that for some mice the same amount of dopamine release leads to huge changes in syllable counts and entropy, while in other mice the same amount of dopamine release leads to very muted changes. So maybe there is some mouse-to-mouse variability in sensitivity.

To test this, we had to do a slightly more elaborate experiment: we can't just stimulate or just record, we need to do both at the same time. So we took our DAT-IRES-Cre mice and injected a FLEX red-shifted excitatory opsin, in this case ChrimsonR, into the substantia nigra pars compacta. We also injected our trusty green dLight into the dorsolateral striatum, and we parked an optical fiber above the axon terminals of the cells expressing this virus. What that means is we can deliver red light and get some stimulation, and we can deliver blue light and get some nice green fluorescence. So we have two spectrally separated tools that let us do dopamine stimulation and dopamine recording at the same time.

Okay, so what's the answer? Well, first we needed to make sure that what we were doing made sense. If there is some latent variable like dopamine sensitivity, then the relationship between endogenous dopamine fluctuations and usage, and between endogenous dopamine fluctuations and entropy, should be correlated from animal to animal: if one mouse is more sensitive in terms of its usage, it should be more sensitive in terms of its sequence randomness, if both come from the same place. And we do find a nice correlation from animal to animal. So now the million-dollar question is whether one of these variables actually explains optogenetic reinforcement. The answer is yes. It turns out that if a mouse is more sensitive at baseline — no stimulation here, we're just asking whether endogenous dopamine correlates with changes in usage — and that relationship is stronger for a given mouse, that mouse more easily learned in our optogenetic experiment.
So we do find that there is a latent factor that explains why some mice learn and some don't. We have no idea whether this is a general mechanism, but it's something the Datta lab is actively pursuing.

Lastly, I want to touch a little bit on what we think is going on. I've told you a bunch of interesting things, but what is our actual theory about how dopamine maps onto the mouse's behavior? In words, we think that what the mice are doing is consistent with a reinforcement learning model — no surprise. This is a standard reinforcement learning diagram, but instead of having an RL agent — in this case a mouse — interacting with the environment, we have the mouse interacting with its own internal dopamine levels. What we think is really happening is that mice are chasing their own internal dopamine: if a behavior happens to catch a dopamine wave, they reconfigure what they do in order to perform that behavior more and more. So we have this nice closed loop between the mouse and its own internal dopamine. For an RL model, what you need are states, actions, and rewards. Here, we assume that the state — the context of the agent — is the behavior the animal is currently engaged in, the action is the transition from one behavior to the next, and the reward is the dopamine at that transition. This all seems pretty natural, right? The basal ganglia gets information about what the mouse is doing — it gets lots of projections from motor and somatosensory areas, and the mouse clearly has a sense of what action it's performing; the basal ganglia gets efference copies. It's also involved in at least the partial control of behavior. And the circuit receives a very dense dopamine projection. So it seems like all the information is there in the circuit to do this sort of thing.

We're going to formally test this by doing some very simple simulations. We initialize a reinforcement learning agent in silico, assume every state can go to every action with equal probability, feed in experimentally recorded dopamine, and just run a reinforcement learning model on the data we recorded. Then we ask: does the model recapitulate the behaviors we see in the open field? To make this more specific, this is the transition diagram we observe in all of our mice — we find that some behaviors have a predilection to transition to certain other behaviors, for one reason or another. Now we just run our model, feeding it the dopamine we recorded, and see which behaviors are likely to transition into which others for the model. First we tried a very simple model: reward only, so we just assume the agent optimizes dopamine at the expense of everything else. What we get is a rough match — if you squint your eyes, maybe it's believable; it is statistically significant, for what that's worth, so we do find some match with our experimental data. But then remember that dopamine also impacts how random the mouse's behavior is. We can put that into the model, and now we get a much better match with what the mice actually do. So if we assume the mice are optimizing their own dopamine, and we assume the randomness of their behavior is under the control of dopamine, we get a very nice match between what the model believes about transitions from one state into another and what the mouse actually does.
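Here is a toy sketch of that kind of simulation — a tabular agent whose states are syllables, whose actions are transitions, whose reward is the dopamine at each transition (randomly generated here; experimentally recorded in the real analysis), and whose policy randomness scales with dopamine. The structure mirrors the description above, but the specific update rule, parameters, and scaling are illustrative assumptions, not the published model.

```python
# Toy version of the simulation described above (illustrative only).
# States = current syllable, actions = next syllable, reward = dopamine.
import numpy as np

rng = np.random.default_rng(1)
K = 20                                    # number of syllables
alpha = 0.1                               # learning rate
Q = np.zeros((K, K))                      # value of transitioning state -> action

def softmax(x, temperature):
    z = (x - x.max()) / max(temperature, 1e-6)
    p = np.exp(z)
    return p / p.sum()

# Stand-in for experimentally recorded dopamine at each transition.
recorded_da = rng.normal(size=50_000)

state = 0
base_temp = 0.2
for da in recorded_da:
    # Second ingredient: the policy's randomness scales with dopamine.
    temperature = base_temp * (1.0 + max(da, 0.0))
    action = rng.choice(K, p=softmax(Q[state], temperature))
    # First ingredient: dopamine at the transition acts as the reward.
    Q[state, action] += alpha * (da - Q[state, action])
    state = action

# The transition tendencies implied by the learned values:
learned_P = np.array([softmax(Q[s], base_temp) for s in range(K)])
# In the talk, the full model (both ingredients) matches the empirically
# observed syllable transition matrix much better than a reward-only agent.
```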
To our minds, this at least partially confirms the idea that what the mice do in the open field is consistent with RL. Which, if you think about it, is pretty weird: if you told me that putting a mouse in an open field produces some sort of structure that follows a reinforcement learning model driven by dopamine — these mice are just randomly moving around, they can do whatever they want, they haven't been instructed to do anything — and yet their behavior is still consistent with this reward-based learning framework, which I think is super interesting.

So, to sum up: I've told you that endogenous dopamine, in the absence of a task or an explicit goal, reinforces spontaneous behavior. We've shown this through dopamine recordings, and we've also shown it causally through closed-loop optogenetic stimulation. I've shown you that endogenous dopamine structures action selection over short timescales — dopamine seems to scale very nicely with the randomness of the mouse's behavior. And I've told you that the effect of endogenous dopamine on behavior is consistent with an RL framework and appears to explain variation in optogenetic dopamine reinforcement. A question we're actively thinking about now is where the rubber meets the road: dopamine is getting released onto a bunch of cells and impacting their physiology, and those cells also control what the mouse does. So what do we think is actually happening in those cells? We're now really interested in focusing on this locus where dopamine interfaces with the neurons that control movement.

Yes, question. Yeah, so this is a question from Lena. The question is: what are the implications of your work for L-DOPA treatment in humans? Are there some behaviors that would be more sensitive to elevated baselines of dopamine, making them more prevalent? Could this explain the riskier behaviors that can be seen with L-DOPA treatment in PD? Oh, that's a great question. So: do we have some predictions for L-DOPA treatment in humans based on what we've observed, and do we think there might be differential effects from behavior to behavior and person to person? My prediction would certainly be yes. It's going to be a different way of manipulating dopamine, since with L-DOPA the dopamine elevations are going to be much longer-lived than our optogenetic stimulation — so it's a different type of dopamine perturbation — but I think we would strongly predict that behavior is differentially impacted from behavior to behavior and person to person. What we should do about it is a much harder question.

There's another question, from Dieter Jaeger. The question is: is every mouse showing a completely different sequence of behaviors, akin to performing a random walk, or is there an innate transition matrix — and how can this be explained by your results? Yeah, thanks for the question. So the question is whether behavior is a purely random walk where everything can go to everything, or whether there's some innate core. It's a little bit of both. We do think there is an innate core to behavior. Part of it is driven by biomechanics, which is something we're getting more and more into — some behaviors are going to be biomechanically favored in terms of what they can come from and what they can go into; it's super obvious that a rear-down is going to follow a rear-up, for example.
There are many other behaviors that we think follow similar rules. There are also behaviors that seem to have consistent sequence structure without a simple biomechanical explanation — I can't elaborate more, because I don't know why. But there appears to be a combination of this innate core and then still some variability in the transition matrix. Using the transition matrix alone, you can make reasonable predictions about, say, mouse identity, which is kind of interesting. So I'm being a little wishy-washy, but thank you.

Yes — great, great talk. In my mind I'm trying to marry the reinforcement learning and the randomness, and I'm wondering if you maybe think of them as two separate things that just happen to both be tied to dopamine, or if you think the randomness is an important part of the reinforcement learning — because having variability in the behavior is important to, sort of, map the possibility space and then figure out what gets reinforced. How are you thinking about how those things go together?

Yeah. So I think your question gets at: we have these two pieces in our RL model — how do we think that really maps onto what we see? I'm going to go to a bonus slide to cheat a little and partially answer your question. We think this variability actually does some good in terms of reinforcement learning. To get more specific: in an RL model you have a transition matrix and you have a policy, and the policy tells you how you act on the transition matrix you're given. We think that dopamine is actually setting the randomness in that policy. If you put those two things together, you get a kind of interesting RL model. The slide is a little complicated, but I'll just give you the punchline: we built a model with these two pieces — chase dopamine, and have randomness scale with the amount of reward you receive. First we put it through a perseveration test. What we find is that if a model has its variability set too low — if it doesn't have the ability to scale its randomness — and we give it a large reward, it perseverates, right? It just ends up bouncing back and forth between the same couple of behaviors. And in the high-variability case, sure, it doesn't perseverate, but if we give it a sparse-reward environment, it doesn't learn to exploit that reward. So you have to pick one or the other, and it's hard to calibrate these things to find exactly the right amount of randomness to use. But in the full model, we can set the baseline variability low, and then if you get a big reward, you crank it up. That prevents you from perseverating in the top plot, but you're still able to get rewards in a sparse-reward environment. So I actually think there's a reason you would want to do this: you don't have to preset how random your behavior is. There are other ways you could do it, but this is our simple idea of why this would be computationally useful — and it does map onto what the mice seem to be doing.

Yeah, that's really interesting. I think, overall, variability is an important part of learning that is sometimes neglected. Absolutely, yeah. And I think plenty of folks have shown, in the context of motor learning, that variability does seem to predict how well you can learn to reach a goal in the presence of perturbations, right? That's work from Maurice Smith and colleagues,
done some years ago in humans. And we think we see some signatures of that in mice. People have also seen this idea in songbirds.

Yeah, so along these lines: would that simply mean that for an action that's reward-based — that has been associated with a reward — after learning, you would see a bigger correlation of this action with dopamine than before the association? Does that make sense? It does. I think what you're asking is: after the learning, do we get more usage and more dopamine, so it's now easier to learn — in a way, almost a runaway process? Is that it? Exactly, yeah. So my prediction is yes — and the reason I say prediction is that we haven't run the very-long-timescale version, where you reinforce a behavior and come back to it a couple of days later to see if you get even stronger learning. I suspect the answer is yes. And one thing I didn't mention is that we did some experiments where we optogenetically stimmed the cell bodies, probably with non-physiological dopamine levels, and in the second session — the evening session in that experiment — the learning rate seems to be moderate in the morning and very high in the evening. So there do appear to be some signatures of a runaway process with DA.

Let's say there's no optogenetics, but, as you mentioned, there's chocolate in the field, and let's say there's something systematic — the animal has to perform a very specific action to get the chocolate. I would assume something similar, where this action has a lot more dopamine than if it's not associated with the reward. Does that make sense? Yes, yeah. So you're asking: in a natural context, maybe it's not going to be one behavior, but there's probably a class of behaviors that gets you closer to good stuff — and do we think similar mechanisms are at play? The short answer is yes. We haven't looked; we've wanted to do versions of this experiment that involve clicks and food rewards to see if we can get the same type of learning in that context, in a more controlled way than chucking chocolate chips into the arena, which was mostly done to calibrate what we were seeing in our recordings. So I think it would be a really interesting experiment to try to do more naturally what we've done through optogenetics.

Yeah, I had a quick question. You talked a lot about these dopamine transients, the phasic dopamine, and I was curious whether you think there's a role for tonic dopamine — kind of like what Lena was asking about levodopa treatment, like elevating tonic dopamine. Yeah, so you're asking about DC shifts — more DC shifts in dopamine, the things you might expect from an L-DOPA treatment — what do we think is going on there? There are a couple of answers. One is that, in some sense, maybe tonic is a moving average of phasic release, in which case we can see signatures of what's going on with tonic. The second part of the answer is that it's really freaking hard to measure tonic dopamine with current methods: you either have microdialysis, which has a timescale of minutes at best, or you're left with these single-fluorescent-protein tools where it's really hard to even know what your baseline is. So I don't think we know. We know from, I think, drug treatments in humans — there's a huge literature there.
But I don't think we have effective ways of titrating tonic dopamine in mice other than pharmacological means, and we don't have really good methods to measure it. I think this is just an outstanding problem in the field. People have tried to take stabs at tonic, but it's a really hard thing to do carefully, in my opinion, at least with these methods.

Yeah, for sure it should be involved. It's going to determine things like receptor occupancy and all the other things we think are impacting behavior. But it's just hard to both measure and test right now.

This is a question from Bilal Haider online: does mouse-to-mouse efficacy of endogenous dopamine in reinforcement predict better learners on tasks? Sorry, can you repeat that? Does the mouse-to-mouse efficacy of endogenous dopamine reinforcement predict which mice are better learners? Yes, yes, it does. All the things we saw in the first part of the talk appear to be pretty strong predictors of what happened in the second part, when we do those experiments in the same mice. That said, it was a smaller cohort of mice, so it's hard to say how generalizable this is going to be. We haven't tried these mice in other tasks, but we have some thoughts that the mice with this endogenous dopamine sensitivity might actually learn more readily even in constrained tasks, which would be an interesting area of future exploration.

This relates a little to my first question, though it may have been touched on by other questions. I'm curious about how you define what you call a behavior. We talked a little earlier about discreteness and how many behaviors there should be and so on. I'm wondering if a more complex behavior can be thought of as several different sub-behaviors. And when I think about the entropy part of the talk, where you correlate entropy with dopamine, I wonder if that's somehow tilted toward more complex behaviors, which would have more sub-behaviors representing them, if that makes any sense.

It totally makes sense. Your question is whether entropy is a surrogate for a simpler underlying question, which is: have we broken up behavior in a way where the things that are more kinematically or biomechanically complicated get broken up into more parts, so that their entropy is different according to our model? Open question. Given the constraints we have right now, this is the best model we've come up with. The continuous observed dynamics in the model are linear, so it has certain constraints that other models may not have, and it's possible that entropy does correlate with something like biomechanical complexity. We've looked at correlations with things like acceleration and jerk and not found a dead ringer, nothing like "you missed it, it's just the jerk of the behavior, and you've really broken one thing into ten sub-components, which is why the entropy is low." We don't think it's something as simple as that, but there could be something else we're missing. As the models get more complicated and the recordings of behavior get higher resolution, we should be able to gain more insight into that. But we don't know; that's the best answer I can provide.

We'll let you finish your last few slides. Okay, we're just about done.
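For context on the entropy measure that came up in that last exchange, here is a generic, hedged sketch of the kind of calculation involved: estimating a transition matrix from a sequence of discrete behavior labels and computing each behavior's outgoing transition entropy. This is not the lab's actual pipeline; the synthetic labels, the pseudocount, and the choice to ignore self-transitions are assumptions made for illustration.

```python
import numpy as np

# Generic sketch, not the lab's pipeline: estimate a row-normalized transition
# matrix from a sequence of discrete behavior labels and compute each behavior's
# outgoing transition entropy. The label sequence below is synthetic.
def transition_entropy(labels, n_states, pseudocount=1e-3):
    counts = np.full((n_states, n_states), pseudocount)
    for a, b in zip(labels[:-1], labels[1:]):
        if a != b:                    # count switches only, ignore self-transitions
            counts[a, b] += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)    # outgoing probabilities
    entropy = -(probs * np.log2(probs)).sum(axis=1)       # bits per behavior
    return probs, entropy

# Synthetic example: 20 behaviors; behavior 3 is made "stereotyped" by usually
# transitioning into behavior 4, so its outgoing entropy should be low.
rng = np.random.default_rng(2)
labels = rng.integers(0, 20, size=10_000)
for i in range(labels.size - 1):
    if labels[i] == 3 and rng.random() < 0.8:
        labels[i + 1] = 4

probs, H = transition_entropy(labels, n_states=20)
print("entropy of behavior 3:", round(float(H[3]), 2), "bits")
print("mean entropy across behaviors:", round(float(H.mean()), 2), "bits")
```

A low outgoing entropy for a behavior means its next step is highly predictable, which is the stereotypy-versus-variability axis discussed above; whether low entropy reflects true sequence structure or over-splitting of one movement remains the open question raised in the answer.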
Before I wrap up, there are a lot of people to thank, so you should definitely see who was involved. I was jointly mentored by two amazing postdoc mentors, Bernardo Sabatini and Bob Datta, who supported me through all the adventures during my postdoc. Lots of people really helped; I'm just going to highlight the folks who were involved in what I described today. Win Gillis, a grad student in the Datta lab whom I've worked with for almost a decade now, has been an incredible collaborator. Maya Jay is another graduate student in the Datta lab. Jeff Wood, an undergraduate, ran most of the behavior, and I think his care is why we can look at some of the subtle effects that we did. Scott Linderman provided very capable computational assistance and now runs his own group at Stanford. Alex and Matt created many of the tools that we leveraged here and are now at Google, and folks from the lab made sure we weren't too far off base with our dopamine modeling. I should also thank the folks currently in my lab, which has just opened: an incredibly intrepid technician, Zeynep, who is joining the ranks in these early days. We're actively looking for more folks to join the lab, so if any of this sounds interesting to you, feel free to reach out, whether just to chat about the work or about possibilities of doing work in the lab. And of course I need to thank the funders and our current collaborators. Thank you.

Alright, does anyone have one or two more questions?