[00:00:05] >> Great pleasure to welcome Professor Nick. He. Then went on to this is that for him there is. The Rose brain and the best psychology. And it's a really great pleasure to have him here today; he really is a leader. Models a really. Exciting imaging approach. Attention, perception, learning and memory, and really the intersection between the two. [00:00:40] These days he's particularly interested in statistical learning: how do we build some knowledge base of the structure of our world so that we can make predictions. Our behaviors and I think that's the point for today's. High delivery memories is. There's just about every So please join me in welcoming him. [00:01:07] >> Thank you so much for having me. I've had a really wonderful day. I was telling some people earlier that the first scientific conference I ever went to was here, the cognitive aging conference in 2002, if anybody else was here at that meeting. It was a fantastic meeting, and it actually got me [00:01:30] seriously thinking about going to grad school; at the time I was an undergraduate. So, yeah, thank you for the invitation and for the opportunity. I really enjoyed meeting with everybody; there's a lot of exciting work going on here, and it's a real privilege to get to tell you about some of our recent research. Feel free to interrupt if you have any questions. [00:01:49] This is a lofty, kind of tentative title; I'm not sure that we want to fully revise our understanding of how memory works in the brain, but what I want to talk about today is at least some new findings that call into question the existing framework that people use for understanding memory. So a classic phenomenon in the study of memory that really led to our understanding of memory systems, the idea that there are different brain systems that support different kinds of memory, came from patients like this. Her name is Lonni Sue Johnson; she's
[00:02:29] an amnesic patient; she has complete bilateral hippocampal loss. I'll show you a picture of her brain in a little bit. She goes by her full name; typically neuropsychological patients would go by their initials, but she's a public figure. She's a fascinating case because she had a very rich, interesting life before she suffered this catastrophic brain injury from encephalitis. She was a professional artist; she illustrated several covers of the New Yorker magazine, for example. She plays the viola. She [00:03:07] had a pilot's license and owned two airplanes that she flew around. She owned a dairy farm in upstate New York. She had a very rich, interesting life and was very accomplished in many domains, and then she suffered this illness that destroyed her temporal lobes bilaterally. Once she recovered from this she had classic symptoms of hippocampal amnesia: an inability to form new memories and, in her case, an inability to recall prior experiences. So this is the kind of patient that gave rise to a lot of early theorizing about memory systems in the brain. So I thought I'd play a little video, hopefully the volume's good, it's already started, [00:03:50] of her sister and her mother talking to her about an event in her life from before she got sick.
So this is, you know, obviously a personally heartbreaking episode from her life. The reason I show this video is because it illustrates very clearly both forms of amnesia that she has. She has retrograde amnesia: again, her father died many years before she suffered a brain injury, and it's controversial still whether the hippocampus remains important for remembering long-consolidated information, and as a case study this suggests that it does. So that's retrograde amnesia, for experiences in her life from before she got sick. She also suffers from anterograde amnesia, and that's illustrated in this video because, by her sister's report, this is the 100th time that she was told her father is dead after she recovered from her illness. So despite being told that 100 times, she had been unable to form a new memory of being told that and still expresses surprise. [00:04:50] So it's patients like this, you know, not necessarily from encephalitis like this, in H.M.'s case from surgery, that led to the idea that the hippocampus and the medial temporal lobe memory system are particularly important for this autobiographical memory of one's life. Here's a scan of her brain; this is where your hippocampus would be. [00:05:15] So you can see she's missing her hippocampus bilaterally and also has some lateral temporal damage on one side, and that's fairly typical for encephalitic patients.
So that's what led to this sort of classic taxonomy of memory, with Squire and others suggesting that long-term memory consists of this declarative, explicit component, for events in one's life and also for facts about the world, that depends on the medial temporal lobe, and that this kind of memory for events and facts can be distinguished from other kinds of memory: implicit, non-declarative memories for skills and habits, priming, conditioning, et cetera, each of which depends on different brain systems. So this is the canonical view; this is in every textbook. And here there's an oversimplification, which is that there's a one-to-one mapping between cognitive constructs like episodic memory, semantic memory and priming, and brain systems. There's no reason a priori that the brain has to respect the boundaries between cognitive constructs that we come up with as psychologists. The way the field is currently going, and what I hope to give an example of, is that that sort of simple one-to-one mapping is actually not quite true, and that the medial temporal lobe is involved in other kinds of learning and memory as well, at least in what I'm going to talk about today. [00:06:38] I'm going to focus on this medial temporal lobe memory system and how it supports two different kinds of learning: one that it was known to support, episodic memory, which is what I was talking about before, and also statistical learning, which was mentioned in the introduction and which is traditionally thought to be a cortical kind of learning. Just to try to illustrate the difference between these two kinds of memory, and this is somewhat agnostic to the brain region involved: episodic memory refers to our ability to store and retrieve specific events that happened at a particular moment in space and time. [00:07:11] This kind of memory must be formed rapidly; you only have to experience something once to store it in episodic memory, so it's one-shot learning. And it produces memory traces that are separate
from each other. By separate from each other I mean sort of a technical sense of that, but just as a cartoon version: you might have two memories, and if each of these circles represents a neuron or a neural population, there's no overlap between these two memories even if they refer to related experiences. So for example, when I commute to the office every day and approach this dreadful parking lot, if I parked here yesterday, [00:07:52] that might be stored in that memory trace somewhere in the brain, and if I park here today, that might be stored in a different population of neurons. Now why is that interesting? Well, almost all aspects of these experiences are the same: same time of day, I'm driving the same car, I'm doing the same thing, I'm going to the office, I'm listening to the same radio station, I'm thinking about who I need to meet with, ruminating on various problems in my life and in my lab. I might be wearing similar clothing. I'm going to be seeing all the same buildings, the same intersections, many of the same cars, maybe even the same people. So there's a lot of overlap in these experiences, and yet despite that,
[00:08:37] the memory traces tend to be stored separately for those two specific experiences. The computational reason for this is that storing memory traces separately helps avoid interference, both during encoding and retrieval. So when I go to try to remember where I parked today, I'm able to pull up this particular location and distinguish it from that location where I parked yesterday. Obviously this fails sometimes, but that's the purpose of separate storage of memories: to avoid confusion between memories, or competition between memories. So this is one kind of memory, where you're storing discrete memories of particular moments in space and time, and this can be contrasted with statistical learning, and this is very much related to how this was introduced. So the world is noisy and dynamic. Yesterday when I pulled into the parking lot there was a construction crew that was blocking part of the parking lot; it snowed the night before, so there was snow on the ground; an intersection was closed so I had to navigate a different route; I had to drive my wife's car because my car was in the shop. So there are aspects of yesterday's experience that actually are not representative of my general approach to parking in this parking lot, and that's what I mean by noisy: these are idiosyncratic aspects of a particular experience. And dynamic: the world changes, you know, part of this lot might be closed, a new building gets put up. And so in so far as I'm using my prior experience to guide adaptive behavior in the future, I don't necessarily want to only store these idiosyncratic, specific details, but rather combine across memories to find what's common. What are the patterns in my experience? What are the regularities? What are the things that are always true when I go to park?
[00:10:22] And so in the context of, you know, a parking lot like that, that might be something like learning a spatial map over the parking lot of where there tend to be open spots. So I might learn over time that these are good aisles to head down, over here and here, because there tend to be open spots. This isn't because I'm remembering that one time that there was a spot there, but because on average these are good places to go. What's interesting about this kind of learning is it has the opposite computational requirements to episodic memory in many ways. One is that it's slow: to combine across memories you have to learn gradually over time; you have to integrate over multiple experiences rather than learning just one thing at a time. Moreover, in order to integrate across experiences you have to store those experiences in an overlapping manner, so that the common elements across those experiences can get reinforced. That's what this cartoon shows here, where one circle might correspond to one day and another circle might correspond to the next day. By storing them in an overlapping manner, the common features or elements of those two experiences get reinforced and strengthened over time, and my statistical memory becomes these overlapping features. So again there's this distinction between rapid, separated learning and slow, gradual, overlapping learning. Now to solve this problem in the brain, it's been theorized that you can't do these opposite functions in the same brain system. [00:11:51] In particular, in order to learn quickly you can't learn slowly, and in order to separate memories you can't overlap them, right? And so the episodic memory function, the ability to rapidly store separated memories, has been ascribed to the hippocampus, and someone like Lonni Sue illustrates this inability to form new episodic memories.
And the idea is that the extraction of regularities across episodes happens in the neocortex, broadly defined. So as you encode individual experiences, they are stored in the hippocampus; during offline periods, while you're resting or sleeping, those experiences get replayed out to the rest of your brain; the rest of your brain uses more overlapping representations and is able to extract common aspects of those experiences. So the difference between separated and overlapping representations is the distinction between the hippocampus and the cortex here. This is the standard view: you get rapid learning of specific events in the hippocampus and gradual learning of regularities in the neocortex. [00:12:57] This was put forward in the complementary learning systems theory in the mid-nineties, and I think it is still probably the dominant theoretical view about how these two kinds of learning are supported in the brain. The problem with this is that if you actually study learning of regularities, you tend to get a lot of hippocampal involvement. [00:13:19] So across many different studies using different kinds of tasks, the medial temporal lobe, and the hippocampus more specifically, are consistently implicated in statistical learning, this process of extracting patterns. This is true in our own work, and I'll talk about some designs related to this; that's sequential learning, so learning of sequences and patterns in sequences. But this is also true in other kinds of tasks, like artificial grammar learning tasks, contextual cueing tasks, reaction time motor sequence learning tasks. So across all these domains where there's some underlying structure that guides behavior, the hippocampus seems to be activated.
[00:14:02] Now what does it mean to find a blob of activity in the hippocampus? What does that tell us about what the hippocampus is contributing? This is what we're trying to figure out: what is the hippocampus doing, and how does that fit with this theory that says the work of statistical learning should be in cortex, not in the hippocampus? To answer that question we approached this problem from a different perspective, one that's come to be known as the representational hierarchy view of memory. The idea is that if you take the classic Felleman and Van Essen wiring diagram for visual cortex, down here is your retina, moving into V1, V2, ventral stream areas, dorsal stream areas, and as you move forward, anterior in the ventral visual stream into temporal cortex, gradually it terminates in entorhinal cortex and the hippocampus. The very top of this visual system wiring diagram is that hippocampal structure that I was showing you before. And so the idea of the hippocampus as a sensory region, or as a visual region, has led people to speculate: what kind of visual information would it care about? What kinds of sensory information would it represent? [00:15:18] And so the view, adopting the standard way of thinking about the visual processing hierarchy, says that what it does is a more complex form of processing than what the next stage down does. So whereas these early areas like V1 and V2 might represent spatial locations in the world, or edges, [00:15:38] next up, as sort of mid-level features, you might get representations of texture, color,
shape. And as you move anterior to that, especially in the ventral stream, you get representations of object identity, knowing that this is a computer and that's a chair, or categories, qualitatively distinct kinds of things: living, non-living, faces, buildings, you name it. This is typically where models of the visual system stop. But if you think about what would be the next level of abstraction in visual processing, [00:16:10] and based on the known properties of the hippocampus and the medial temporal lobe in spatial processing and in sequence processing, temporal information, this theory proposes that the medial temporal lobe might be tuned not to any of these features but rather to space and time. Specifically, whereas in these areas two things that have the same identity might be stored similarly, so two chairs or two computers or two cameras, once you move up into these areas what determines whether things are stored similarly, or are processed by the same neurons and populations, is not whether they have the same identity but rather whether they have the same place in space and time. What I mean by that is that if, whenever I walk into a colleague's office, he's sitting there, I might come to represent his face and his office with the same set of units, not because his office and his face look alike, not because they're the same category of objects, but rather because they co-exist in the world. You can think of that more abstractly: computers on desks, people in an auditorium. There's a lot of structure in our world that is not determined by object identity, object category or lower-level visual features but rather by spatiotemporal co-occurrence. So from this perspective, the role of the hippocampus in statistical learning might be to extract regularities in terms of how objects co-occur in space and in time, and the tuning, the selectivity, of the system could be for these spatiotemporal features.
[00:17:51] So the prediction that falls out of this is that objects that reliably appear together in the world should come to be stored more similarly. OK, this is building on some really classic work, actually, in primate inferior temporal cortex and perirhinal cortex that showed these kinds of effects of temporal proximity on the tuning of visual neurons. [00:18:16] So this is the hypothesis that we started with, at least for the first study I'm going to show you: that co-occurrence is going to drive the similarity of responses of this brain region, and that might be why the hippocampus is involved in statistical learning. [00:18:35] And so we looked at this with high-resolution functional magnetic resonance imaging, and we're going to look specifically at different parts of the hippocampus. It's actually a beautifully understood neural circuit: coming from rodent models and primate models, we understand the components of the circuit, we understand their wiring, and with modern imaging techniques we can now extract these different components and look at the circuitry of the system. For these initial studies that's not going to be very relevant, but when I get into some computational stuff later on we're going to try to simulate how the circuit works. [00:19:09] OK. So this is the hippocampus I showed you before, but if you take a coronal section, so if you slice through it and you look from back to front, this would be the hippocampus, and if we zoom in here, this is essentially
[00:19:27] the wiring diagram of at least the core components of the system. The entorhinal cortex provides the input to the hippocampus, and it projects in a few different pathways: one is through the dentate gyrus and CA3, another is directly to CA1. Dentate gyrus, CA3, CA1, subiculum: these are all just different subregions of the hippocampus that are connected via these arrows. Now we can try to measure brain activity within these different subregions to try to understand how they are contributing to this process, and we do that by doing very high-resolution structural imaging, so like a knee MRI but for your brain, at a very high resolution. Using known anatomical landmarks, we can trace the boundaries of these different subregions by hand across the whole length of the hippocampus. You can see the labels here corresponding to the colors in the diagram; this is a more anterior slice of the hippocampus, and this one here is a more posterior slice of the hippocampus. What we get out of this is regions that correspond to different components of the circuit, and then we're going to look in these regions of interest at whether the way the hippocampus stores information is governed by spatial or temporal information. So this is all work from a former graduate student, Anna Schapiro, who's starting her own lab this summer at UPenn. [00:20:50] In one of her early studies, what she had people do was watch a sequence of fractal patterns in which there were some regularities embedded in terms of which fractals tended to follow each other.
Subjects weren't told this; they were told just to watch. In fact they were given a cover task: every once in a while a fractal came up with a little grayscale patch on it, and they had to press a button when they saw a fractal with the grayscale patch. So that's what they're told to do, that's what they think the experiment's about, and in general they don't become aware of the fact that there's some predictability to the sequence. So here's what it might look like. OK, so they watch that for 45 minutes; it's not the most exciting experiment, but they get very passionate about the grayscale patch, which is good. And so if this is a snippet of the sequence that I showed you, [00:21:43] we embedded in the sequence some regularities in terms of which fractals followed each other. In particular, some fractals were part of strong pairs, where the first was followed by the second 100 percent of the time, so whenever you saw this picture you would see this one next in the sequence, or for this one you'd see this one next in the sequence. We're going to compare that to a control condition where there's a weaker probability: this one is followed by this one a third of the time, this one by this one a third of the time, and the two thirds of the time that it wasn't followed by that, it was followed by something else. [00:22:16] So the hypothesis here is that these fractals that are part of these strong pairs are going to come to evoke more similar representations, or at least patterns of activity, in the hippocampus. These are all arbitrary pairings; we grouped them the way we did at random, and so there's no inherent reason why these things should belong together other than the fact that they tend to occur together in sequence.
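The strong-pair versus weak-pair design described above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions: the letter labels, the number of pairs, and the "filler" items that follow a weak first item the other two thirds of the time are all hypothetical, and this is not the stimulus code used in the study. It generates a sequence and then estimates first-order transition probabilities by counting.

```python
import random
from collections import Counter, defaultdict

# Hypothetical item labels standing in for fractal images.
STRONG = {"A": "B", "C": "D"}    # first item is ALWAYS followed by its partner
WEAK = {"E": "F", "G": "H"}      # first item is followed by its partner 1/3 of the time
FILLERS = ["I", "J", "K", "L"]   # assumed items that follow a weak first item otherwise


def generate_sequence(n_events, seed=0):
    """Build a sequence by repeatedly picking a pair's first item at random."""
    rng = random.Random(seed)
    firsts = list(STRONG) + list(WEAK)
    seq = []
    for _ in range(n_events):
        first = rng.choice(firsts)
        seq.append(first)
        if first in STRONG:
            seq.append(STRONG[first])          # deterministic transition
        elif rng.random() < 1 / 3:
            seq.append(WEAK[first])            # weak pair honored
        else:
            seq.append(rng.choice(FILLERS))    # weak pair violated
    return seq


def transition_probabilities(seq):
    """Estimate P(next | current) from adjacent items in the sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1
    return {
        cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for cur, c in counts.items()
    }


sequence = generate_sequence(6000)
probs = transition_probabilities(sequence)
print("P(B | A) =", probs["A"].get("B", 0.0))            # strong pair
print("P(F | E) =", round(probs["E"].get("F", 0.0), 2))  # weak pair
```

In this toy version the only thing distinguishing strong from weak pairs is the conditional probability of the second item given the first, which is the quantity the exposure phase manipulates.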
[00:22:42] To measure the impact of learning, we did that exposure phase in the middle of the study, but both before and after learning we exposed people to all of these individual pictures on their own, one at a time, multiple times, and we measured the pattern of activity in fMRI evoked by each of these pictures. So we get a template of how each fractal is processed in the hippocampus both before and after exposure to regularities. In fact these two phases are identical to each other; the only difference is that in the middle they were exposed to this statistical structure. Critically, the order here and here is random, so there's no structure at this point; we're just measuring the snapshot, the response to each individual fractal. The prediction was that the patterns of activity evoked within the hippocampus by fractals that were strongly paired should be more similar to one another than the patterns of activity evoked by fractals that were part of weak pairs, that didn't belong together as strongly. [00:23:43] Just for the sake of brevity I'm going to show you a key result here. This is the change in neural similarity, the correlation of the pattern of BOLD activity from fMRI, from before to after exposure, for fractals that were strongly paired or weakly paired. As another baseline condition we permuted the labels of the fractals, so this is a nonparametric baseline for how similar two fractals are without any sort of relationship. What I show on the x-axis here is three of these different subregions that we're looking at; at the time, with fMRI, we couldn't distinguish these subregions so we lumped them together; now it's possible to do that. OK, so up here means that after exposure to regularities the fractals came to evoke more similar patterns of activity, that the statistics had governed the similarity structure of this brain region.
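The before-versus-after similarity measure with a permuted-label baseline can be illustrated with simulated data. The following Python sketch is an assumption-laden toy, not the study's analysis pipeline: the voxel count, the number of pairs, and the way a "learning effect" is injected (a shared component added to both members of a pair after exposure) are all made up for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def noise(rng, n):
    return [rng.gauss(0, 1) for _ in range(n)]


rng = random.Random(1)
n_voxels = 200                             # assumed size of the region of interest
pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]   # hypothetical strong pairs
n_items = 8

# Pre-exposure: every item evokes an independent random pattern.
pre = [noise(rng, n_voxels) for _ in range(n_items)]

# Post-exposure: paired items share a common component (simulated learning).
post = [None] * n_items
for a, b in pairs:
    shared = noise(rng, n_voxels)
    post[a] = [s + e for s, e in zip(shared, noise(rng, n_voxels))]
    post[b] = [s + e for s, e in zip(shared, noise(rng, n_voxels))]


def mean_change(pairing):
    """Average (post minus pre) pattern correlation over a set of item pairs."""
    changes = [pearson(post[a], post[b]) - pearson(pre[a], pre[b])
               for a, b in pairing]
    return sum(changes) / len(changes)


observed = mean_change(pairs)

# Nonparametric baseline: permute item labels to form arbitrary pairings.
baseline = []
for _ in range(500):
    items = list(range(n_items))
    rng.shuffle(items)
    shuffled = [(items[i], items[i + 1]) for i in range(0, n_items, 2)]
    baseline.append(mean_change(shuffled))

baseline_mean = sum(baseline) / len(baseline)
print("change for true pairs:    ", round(observed, 3))
print("change for shuffled pairs:", round(baseline_mean, 3))
```

The shuffled pairings occasionally reproduce a true pair by chance, which is expected in a permutation baseline; the comparison of interest is where the observed change falls relative to the whole shuffled distribution.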
[00:24:38] So as a baseline, those shuffled pairs show no change in similarity as a function of exposure. However, the strong pairs, the fractals that were reliably paired together, came to evoke more similar patterns of activity. We assumed that these weak pairs would be somewhere in between: shuffled pairs had a very low random probability of 2 percent, [00:25:00] strong pairs had a 100 percent transition probability, and these weak pairs had a 33 percent chance of transitioning, so we assumed that the weak pairs would be somewhere intermediate between strong pairs and shuffled pairs. Much to our surprise, and I'm not going to spend much time on this today, we actually found that members of weak pairs, from before to after exposure, came to elicit less similar patterns of activity. I have some thoughts about this and would be happy to answer any questions, but this sort of non-monotonic change in the neural representations as a function of a continuous variable, the probability, puts a lot of constraints on some of the learning theories or learning rules that apply. Yeah? [00:25:48] No, they're told just to look for the grayscale patch. In fact the vast majority of them, and we debriefed them very thoroughly afterwards, had no idea that there were regularities embedded in it. There's a lot of randomness in the sequence too, because there are no breaks between the fractals and there are a lot of violations of these weak pairs; it's not a very apparent structure. [00:26:12] OK, so I talked about driving to the parking lot; that's a much more complex kind of structure, with a lot more variability and more complex long-range dependencies, whereas these are very sort of artificial regularities, deterministic pairs where one thing always follows something else. So we've been interested in whether this learning process scales up to more complex regularities, and to do that we've been adopting a graph-theoretic approach. This
[00:26:43] schematic here is a community structure graph. As in all graphs, each of these little circles is a node, and a node in this case is a stimulus, so whenever you visit this node in the graph you would show the subject the stimulus associated with it. The edges, the lines here, connect the nodes, and they tell us, when we're generating a sequence, how we're allowed to transition between the nodes. We picked this diagram because it has interesting properties that I'll talk about in a second, but it's referred to as community structure because there are communities of nodes that tend to transition to each other more than they do to nodes in other communities. Community structure is a complex kind of structure, but it's also a very naturalistic kind of structure. Universities have community structure: faculty in one department tend to talk to each other more than they do to faculty in other departments. It's true of animal colonies; it's a very widespread kind of structure. OK, now we didn't show people this; instead we just showed them a sequence of shapes, but the sequence of shapes they saw was drawn by walking over this graph. So here's how it would go: you'd start somewhere on the graph and you'd move along edges randomly, and when you visit a node you insert that shape in the sequence.
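Generating a sequence by a random walk over a community-structured graph can be sketched as follows. The specific layout here is an assumption for illustration: three communities of five nodes arranged in a ring, with the two boundary nodes of each community not connected to each other, so that every node ends up with exactly four edges. The talk shows the actual graph on a slide, so treat the node counts and function names as hypothetical.

```python
import random


def community_graph(n_communities=3, size=5):
    """Adjacency sets for a ring of communities (assumed layout).

    With size=5, every node has exactly 4 edges: interior nodes connect to all
    other members of their community; each boundary node connects to the
    interior nodes plus one boundary node of a neighboring community.
    """
    edges = {node: set() for node in range(n_communities * size)}
    for c in range(n_communities):
        members = list(range(c * size, (c + 1) * size))
        first, last = members[0], members[-1]      # the two boundary nodes
        for i in members:
            for j in members:
                if i < j and (i, j) != (first, last):
                    edges[i].add(j)
                    edges[j].add(i)
        # Bridge: this community's last node to the next community's first node.
        next_first = ((c + 1) % n_communities) * size
        edges[last].add(next_first)
        edges[next_first].add(last)
    return edges


def random_walk(edges, n_steps, seed=0):
    """Visit nodes by moving along a uniformly chosen edge at each step."""
    rng = random.Random(seed)
    node = rng.choice(sorted(edges))
    walk = [node]
    for _ in range(n_steps - 1):
        node = rng.choice(sorted(edges[node]))
        walk.append(node)
    return walk


graph = community_graph()
walk = random_walk(graph, n_steps=1000)
print("degrees:", sorted({len(nbrs) for nbrs in graph.values()}))
print("first 20 nodes visited:", walk[:20])
```

Each visited node index would then be mapped to a shape; because the walk crosses between communities only through the bridge edges, it tends to linger within one community before moving to the next.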
[00:28:01] So this sequence that we generated is just a sequence of shapes. People are just told to watch the sequence, and every once in a while one of the shapes is going to be rotated 90 degrees, and to press a button when they detect that. But critically, the statistics of the sequence, in so far as they're learned, could allow somebody to infer the underlying generative structure that gave rise to the sequence, because the statistics of the sequence respect the statistics of the underlying graph. The question is: will the hippocampus come to encode the graph, the generative graph that was used to produce the sequence that they watched? If so, that would reflect a more sophisticated kind of statistical learning. [00:28:45] This is an interesting graph partly because the kinds of transition probabilities that I mentioned in the previous experiment are no longer relevant here. Each node has four edges, and each of those edges is equiprobable. The transition here between nodes in two different communities is the same strength as the transition from this node to that node within the same community, so the transition probabilities do not tell you about the community structure. Another way of saying that is that the joint probability of these two appearing in sequence is the same as the joint probability of those two appearing in sequence. So, well, how could you learn the structure then? It's not based on transition probability; it's based on the overlap in the transition partners. Two nodes in the same community tend to transition to an overlapping set of next nodes in the sequence, so both of these nodes will a lot of the time be followed by this, this or this, right? And so by looking at predictive overlap you can learn which nodes should go together. In fact it's simple to model that kind of learning in a neural network: if you train a neural network to predict what comes next in the sequence, in so far as two inputs lead to the same
prediction, they'll be grouped together in the hidden layer. [00:30:01] So this is another kind of learning signal, based on predictive overlap. OK, without going into too much detail, we basically measure the pattern of activity evoked by each of these shapes, so it's the same kind of analysis that I talked about before, in the hippocampus, and we're going to look at the similarity of these activity patterns based on whether you're comparing two nodes in different communities versus two nodes within the same community. In so far as there is some encoding of the community structure, there should be a higher correlation within community than between communities, and indeed that's what we find: a slight increase in the correlation when you're comparing within community. [00:30:43] Now this is just one very reduced way of looking at community structure, right, the within versus between. You can do a broader analysis, trying to look at the similarity space of all of these shapes, that is, how similar all the shapes are to each other in the hippocampus. You can do that by essentially calculating the correlation of every node with every other node, so you get a correlation matrix, and every correlation is then a distance between those shapes in the hippocampus. You can then apply various techniques to that, such as multidimensional scaling, to try to visualize the distance between shapes in the hippocampus, based on the correlation of their activity patterns, in a lower-dimensional projection. When you do that you get out of it a visualization like this, where now the distance in this two-dimensional plane between each circle corresponds to the similarity of the neural activity patterns. What you see is that the nodes that belong to the same community are clustered in space, that is, they evoke more similar patterns of activity: orange goes together, purple goes together, green goes together. You also see that the boundary nodes, in the lighter shades here, are more
internal to each other, so it's not only capturing the communities themselves, it's also capturing the relative roles of nodes within the community. So just having you watch that long sequence has allowed your hippocampus to infer this underlying graph, right? That's the structure that it's extracting merely by passive exposure to the sequence. We're doing other studies like this to try to understand the limits on the kinds of statistics, the kinds of relationships, the capacity for these sorts of relationships and so on. [00:32:28] OK, this is all imaging work showing some role for the hippocampus, but that doesn't mean it's playing a necessary role; it could be that learning is happening elsewhere and is being reflected in the hippocampus. That's where patients like Lonni Sue come in. We've been conducting patient studies in patients with hippocampal damage, epilepsy patients, or other patients with medial temporal lobe deficits to try to understand whether the hippocampus, or the medial temporal lobe at least, is necessary for statistical learning. The jury's still out on this, but the data that we have from Lonni Sue are as follows. We had her do many different statistical learning tasks. In a healthy, neurotypical person you really can only do a statistical learning test once, because once you ask them about the structure you've given up the game, and then if you bring them back in to do another task they're going to try to explicitly learn the structure, right? So they can implicitly learn it once, and then when we tell them about it all bets are off. Because she doesn't remember having ever done this experiment before, every time we go in we get a fresh take on statistical learning. So we tested her 11 times, and every time we tested her we ran a separate age- and education-matched control sample.
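The within- versus between-community comparison and the multidimensional scaling step can be sketched with NumPy. One important assumption: instead of neural activity patterns, this toy uses each node's next-node probability profile as a stand-in pattern, which also makes the "predictive overlap" idea concrete. The graph layout (three communities of five nodes, four edges per node) and all function names are hypothetical, and classical (eigendecomposition-based) MDS is used here as one simple variant; the talk does not say which MDS algorithm was applied.

```python
import numpy as np


def community_adjacency(n_communities=3, size=5):
    """Adjacency matrix for an assumed ring-of-communities graph."""
    n = n_communities * size
    adj = np.zeros((n, n))
    for c in range(n_communities):
        first, last = c * size, (c + 1) * size - 1
        adj[first:last + 1, first:last + 1] = 1    # fully connect the community
        adj[first, last] = adj[last, first] = 0    # except its two boundary nodes
        nxt = ((c + 1) % n_communities) * size
        adj[last, nxt] = adj[nxt, last] = 1        # bridge to the next community
    np.fill_diagonal(adj, 0)
    return adj


def classical_mds(distance, n_dims=2):
    """Classical MDS: double-center squared distances, keep top eigenvectors."""
    n = distance.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (distance ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_dims]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0))


adjacency = community_adjacency()
labels = np.repeat(np.arange(3), 5)                # community of each node

# Stand-in "pattern" per node: its row of next-node transition probabilities.
profiles = adjacency / adjacency.sum(axis=1, keepdims=True)
similarity = np.corrcoef(profiles)                 # node-by-node correlation matrix

same = labels[:, None] == labels[None, :]
off_diagonal = ~np.eye(len(labels), dtype=bool)
within = similarity[same & off_diagonal].mean()
between = similarity[~same].mean()
print("mean within-community correlation: ", round(float(within), 3))
print("mean between-community correlation:", round(float(between), 3))

# Treat (1 - correlation) as a distance and project to two dimensions.
coords = classical_mds(1.0 - similarity)
print("2-D coordinates, one row per node:", coords.shape)
```

With real data, `profiles` would be replaced by a nodes-by-voxels matrix of evoked activity patterns; the rest of the pipeline (correlation matrix, within versus between contrast, low-dimensional projection) would stay the same in outline.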
[00:33:44] So we ran statistical learning tasks with different stimuli as well; we didn't want any deficit to be related to a stimulus-specific deficit. What we found across stimulus types, just to summarize the data, is that whereas the controls, after exposure to these sequences, could discriminate patterns versus non-patterns in a two-alternative forced choice, [00:34:06] Lonni Sue was unable to do that. This is where the structure that she was exposed to was sequences of 3 things that always appeared together in sequence. We also tried it with an even simpler structure, pairs, where we saw again that control participants were able to discriminate statistical pairs versus non-pair [00:34:27] combinations after exposure, and again Lonni Sue is no different from chance. Now, patients fail for a lot of reasons. Maybe she wasn't watching the sequences, maybe she didn't understand our test instructions, and so we wanted a positive control where she could succeed. It's thought that you don't need the hippocampus for familiarity judgments about individual items, so rather than probing her on whether these 3 scenes appeared together in sequence, we tested her on whether this individual scene was something she had experienced in the sequence versus a novel scene that she hadn't seen before. So we had her do an item discrimination task, and that's thought to be supported potentially by areas outside of the medial temporal lobe. [00:35:11] And in fact Lonni Sue was fine at doing item-level discrimination; she just couldn't do discrimination of regularities. So this is tentative evidence that she was at least paying attention to the sequence, she was learning something, and she understood our two-alternative forced-choice tests; she just was unable to extract the patterns or combinations of these items. [00:35:34] This is one way to get at the necessity of the hippocampus.
Another way that we're working on is to do intracranial recordings and stimulation in epilepsy patients. This was a collaboration with Lucia Melloni and Simon Henin, her postdoc, and this was at NYU. We recorded from a total of, I think, 23 patients in a statistical learning task. These patients have both cortical surface electrodes (this is showing all of the patients' electrodes overlaid, so these are implanted on the cortical surface) and depth electrodes that are inserted into the hippocampus, because it's a major source of seizures. So these patients are implanted with those electrodes so that the neurosurgeons can monitor where seizures are coming from in their brain and then excise that tissue in order to improve their epilepsy. But while they're waiting for these seizures to happen, we go in and we can do cognitive tasks and measure neurophysiological responses during statistical learning. [00:36:38] We didn't want to depend on a behavioral measurement in these patients, partly because they're impaired and medicated in various ways, and so instead we're using a neural frequency tagging approach to measure learning. These are basically the same kinds of structure that I showed before, so in this case this was a syllable structure, but we also did this with vision, where you would hear to be Robey dad.
[00:37:03] You go to bureau for 10 minutes, and at first if you hear these streams it just sounds like gobbledygook, but after a minute or 2 you start hearing the words as discrete entities. Even though there are no breaks between the words, it's all in your mind; they're all statistically defined. What's nice about this design is that there are 2 frequencies that are relevant here. One is the syllable rate, that is, the onset of the syllables, which in this case was happening at 4 hertz, so you get 4 syllables per second. You get that oscillation because each syllable has different acoustic properties, right, so there's this oscillation in the acoustics related to the syllables, unrelated to statistical learning. However, insofar as they learn these groupings of 3 syllables, another frequency might emerge at the boundaries between these 3-syllable groupings, so at this point here, or here, or here, and that oscillation is going to be one and a third hertz; that is, every 3 syllables you're going to get a break. We can essentially look at the power at these 2 frequencies in order to infer acoustic processing of the syllables and statistical learning of the word boundaries, and we can compare that to a condition where you're getting the same sequence exposure in terms of syllables but where there's no trisyllabic word structure, right, so we would expect the same syllable frequency but no word frequency in this case. [00:38:33] So these are measurements now across many lateral temporal and frontal electrodes, just to give you a sense of the data. What you see in the structured case, as these patients are listening, is a peak at the syllable frequency, which is again in the acoustics, but you also see a peak at the triplet frequency. [00:38:56] And I think I have a diagram next to show you that that ramps up over time, so we get an online measure of learning as the power at the word frequency increases.
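To make the frequency-tagging logic concrete, here is a minimal sketch on a simulated signal. It is not the analysis pipeline used with the patients (the talk later refers to word-rate coherence); it only shows why a learned 3-syllable grouping would add power at one third of the 4 Hz syllable rate. The sampling rate, duration, amplitudes, and noise level are assumptions chosen for illustration.

```python
import math
import random

SYLLABLE_HZ = 4.0             # syllable rate mentioned in the talk
WORD_HZ = SYLLABLE_HZ / 3.0   # one word boundary every 3 syllables

def simulate_signal(word_amplitude, fs=100, seconds=60, noise_sd=0.5, seed=0):
    """Hypothetical neural signal: a syllable-rate oscillation, an optional
    word-rate oscillation (standing in for learning), and Gaussian noise."""
    rng = random.Random(seed)
    signal = []
    for i in range(fs * seconds):
        t = i / fs
        x = math.sin(2 * math.pi * SYLLABLE_HZ * t)
        x += word_amplitude * math.sin(2 * math.pi * WORD_HZ * t)
        x += rng.gauss(0, noise_sd)
        signal.append(x)
    return signal

def power_at(signal, fs, freq):
    """Normalized power of one Fourier component at `freq` (in Hz)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return (re * re + im * im) / (n * n)

structured = simulate_signal(word_amplitude=0.5)           # triplet structure "learned"
random_stream = simulate_signal(word_amplitude=0.0, seed=1)  # same syllables, no words

word_power_structured = power_at(structured, 100, WORD_HZ)
word_power_random = power_at(random_stream, 100, WORD_HZ)
syll_power_structured = power_at(structured, 100, SYLLABLE_HZ)
syll_power_random = power_at(random_stream, 100, SYLLABLE_HZ)

print("word-rate power    :", word_power_structured, word_power_random)
print("syllable-rate power:", syll_power_structured, syll_power_random)
```

In this toy setup the syllable-rate power is about the same in both conditions, while the word-rate power is present only in the structured stream, which is the dissociation the design relies on.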
Importantly, you don't see the word frequency in the random condition, but you do again get the syllable frequency. You might notice this kind of interesting bump in the middle. We didn't expect that, but it actually corresponds to the frequency of pairs of syllables, and there's some prior behavioral work suggesting that learning of trisyllabic words is actually chaining together of learning of pairs of syllables, and so this might reflect an earlier stage of learning where they're learning pairwise associations and then linking them together into the 3-syllable words. [00:39:36] OK, so here's the time course of the word-rate coherence, so the emergence of power at the word frequency, and what we can see is that after, you know, 3 minutes or so you start seeing a separation between the structured and random conditions in terms of the power at that word frequency. [00:39:54] So this is a way of measuring statistical learning from the cortex, and now, in terms of looking at the necessity of the hippocampus, we've started doing some work where we stimulate the hippocampal electrodes, the depth electrodes in the hippocampus, and then measure how that impacts this word-rate synchronization in the cortex. So we're knocking out the region we think is driving the word learning and seeing how that impacts the cortical synchrony to the word boundaries. [00:40:26] This is very early, very preliminary, but I just wanted to mention that this is an approach we're taking. Here's an example of the depth electrodes. There are multiple contacts, but the last contact tends to terminate in the hippocampus. We also have a new system now with microwires, so out of the tip of the electrode we can push about 8 wires that go out into different parts of the hippocampus, so we can do subfield-specific recordings.
[00:40:49] So one caveat is that these patients tend to have hippocampal dysfunction; they often will have their temporal lobe, their entire temporal lobe including the hippocampus, removed as treatment. So these are not typical people, and the caveat is that what we find here might be somehow dependent on [00:41:08] epilepsy. But regardless, you couldn't do this kind of work unless there was a clinical reason to have these electrodes inserted, and in fact unless there was a clinical reason to do this stimulation, which is that they sometimes want to evoke seizures. They'll stimulate different electrodes, and so we can piggyback on that to causally intervene and look at the effect. [00:41:29] So what we're going to look at is again this word frequency tagging at the one and a third hertz frequency on the cortical surface, as a function of hippocampal stimulation through depth electrodes, and our hypothesis is that hippocampal stimulation will knock out the word rate but not affect the syllable rate, which again is auditory, that's acoustic. The word rate is learned; the auditory syllable rate is just in the stimulus. So this is at baseline, measuring during the stimulation periods the word frequency and the syllable frequency, and if you stimulate the hippocampus (this is measuring from cortex while stimulating the hippocampus), that selectively knocks down the word-rate frequency more than the syllable-rate frequency. So this is only 2 patients, but it suggests that, even in the relatively intact hippocampus, disruption of the hippocampus during statistical learning impairs these cortical signatures of learning. [00:42:35] I think 23 total. I'm not sure how many are in that figure, as we've continued to collect data after that, but we're just about to submit a paper on the data for the recordings, and there's a total of 23 patients for that.
I should say the hippocampus also shows this kind of pattern of frequency tagging to the word boundaries, in addition to the cortical areas, but we don't want to stimulate and record close together, so we're stimulating hippocampus while recording cortex. OK. [00:43:05] So this is ongoing work. So how can we think about this, going back to how we started with complementary learning systems? How can you implement both episodic memory and statistical learning in the hippocampus if they have these different properties? To answer that question we've been conducting some computational simulations using the neural network model that was developed for complementary learning systems. This is work by Randy O'Reilly's group, but others have contributed to this as well. This is a neural network model that's loosely biologically plausible, in the sense that there are layers for different subfields of the hippocampus, and the inhibition and the connectivity of the network are modeled after known physiological properties of the hippocampus. It's not a complete model, it's missing certain subfields, but these are the critical elements at least for today's story. So entorhinal cortex provides the input to the hippocampus. It passes, [00:44:06] in one case, through what's known as the trisynaptic pathway, into dentate gyrus, then into CA3, then to CA1, so in blue. This pathway had previously been shown to encode episodic memories, so a lot of prior simulations of episodic memory behavior were based on this pathway. [00:44:27] The learning rates, the changes to the weights in this pathway, are very fast, allowing for one-shot learning, and the representations are very sparse, so you see very few active units in these layers. Sparse activity allows memories to be stored in separated ways: the fewer units that are active, the less likely the memories are to overlap.
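The link between sparsity and overlap can be checked with a small simulation. This is a toy sketch, not the hippocampal model itself: patterns are random sets of active units, and the layer size and activity levels are invented for illustration. For random patterns with `n_active` of `n_units` units on, the expected fraction of one pattern's active units shared with another is `n_active / n_units`.

```python
import random

def random_pattern(n_units, n_active, rng):
    """A random binary pattern, stored as the set of active unit indices."""
    return set(rng.sample(range(n_units), n_active))

def mean_overlap(n_units, n_active, n_patterns=50, seed=0):
    """Average fraction of active units shared between pairs of patterns."""
    rng = random.Random(seed)
    patterns = [random_pattern(n_units, n_active, rng) for _ in range(n_patterns)]
    total, pairs = 0.0, 0
    for i in range(n_patterns):
        for j in range(i + 1, n_patterns):
            total += len(patterns[i] & patterns[j]) / n_active
            pairs += 1
    return total / pairs

# Hypothetical layer of 1000 units: a sparse code (2% active) versus a
# denser, more overlapping code (25% active).
sparse_overlap = mean_overlap(n_units=1000, n_active=20)
dense_overlap = mean_overlap(n_units=1000, n_active=250)
print(f"sparse: {sparse_overlap:.3f}  dense: {dense_overlap:.3f}")
```

Low overlap keeps individual memories separate; higher overlap is what lets a layer pick up on what different inputs have in common.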
[00:44:51] So people have previously done a lot of modeling of this pathway. What we did that was novel was model this other pathway that goes directly from entorhinal cortex to CA1, and then CA1 outputs its activity back to entorhinal cortex, and that sends it out to the rest of the brain. So essentially what we did was take this model and train it on sequences that had either the pair structure that I told you about before or the community structure that I told you about before. We initialize the model, and then we train the weights in the model by giving it sequences that had the same temporal regularities in them as what we had given to humans, and we're training the weights of the model to reproduce on the output layer what's present on the input layer. OK, so that's the objective here: to train the weights up so that, given some input, the model outputs that same information. The question is, by training the weights up on sequences that contain temporal regularities, how do the weight structures within the model encode the regularities? [00:45:57] And so I think this is a video. It's just showing, as you march through the sequence, you're inputting different things, and that's driving different representations in these different parts of the model. The model's trained at this point, so it's doing a good job of reproducing the input on the output. [00:46:14] Then at test what we do is we just feed the model a single fractal pattern as an input. It's kind of like in fMRI, where we just showed people one picture and measured the neural activity. Here we're going to measure the activity evoked in each of these 3 layers by a single fractal, and then we're going to do the same kind of pattern analysis that I told you about in fMRI. We take the pattern of activity, say, evoked in CA3 by fractal A, then we input fractal B,
measure the pattern of activity in CA3, and correlate those patterns of unit activities as a way of asking: has the model clustered its representations of A and B? Our hypothesis was that whereas this pathway learns quickly and uses sparse representations, the monosynaptic pathway directly to CA1 has a slower learning rate and it also employs more overlapping representations. You can see there's less inhibition, lower sparsity, so that means there's going to be more overlap between different inputs, and that's going to allow it to find regularities better than these pathways that employ very sparse representations. So our hypothesis was that the dentate gyrus and CA3 are not going to learn about regularities but that CA1 is going to learn about regularities. Yes? [00:47:32] Well, it's supervised in the sense that... I didn't talk a lot about how it's actually trained, but it's contrastive Hebbian learning, so it's sort of semi-supervised. There's another connection here directly from entorhinal cortex input to output, so that's the supervision: you tell the output the correct answer and then you try to learn the weights in the model to better achieve that objective. But critically you separately train these 2 pathways, or you alternate between training these 2 pathways. So you do one update by allowing the trisynaptic pathway to drive the output, and then you do an update where you let the monosynaptic pathway drive the output, and you alternate between training those 2 pathways up.
[00:48:21] OK, and so then we're going to look at the patterns of activity in the model layers as a function of this pair structure that I told you about, and I'm going to distinguish between the initial response of the network and the settled response in the network. The important difference here is in the settled response: because of the amount of recurrence in the system, both recurrence out to entorhinal cortex and then back in, but also recurrence within the model itself, because there are multiple redundant pathways to many of the subfields, we expected the information to spread everywhere in the hippocampus. The initial response tells us where the representation is stored, where the statistical information is initiated. And so what you find in terms of the similarity structure of patterns of activity evoked by the fractals is as follows. So there's a correlation matrix, A through H, for the 8 fractals on the X and Y axes. A correlated with A is very correlated; that's not surprising. It's the off-diagonal structure that we're interested in. [00:49:23] A and B were paired, C and D were paired, E and F were paired, and G and H were paired, and what you can see is that in dentate gyrus A is not represented more similarly to B than it is to D, F, or H. And if you notice, D could be followed by A sometimes, F could be followed by A sometimes, because that was the second item in a pair, so all of these bright blue spots are possible transitions in the model, but it's insensitive to the probability, the strength of the relationship. So A and B were much more tightly coupled than A was with D, but there's no sensitivity to that in dentate gyrus or CA3. However, in CA1 you see the pair structure emerge, where the patterns evoked by A and B are now much more similar to each other than the pattern between A and anything else, and the same for C and D, E and F, G and H. [00:50:13] When this information gets
output from CA1 into entorhinal cortex and then loops back in, that allows the model to retrieve A and B in dentate gyrus and CA3, so you see that pair structure spread everywhere, but we think that this means that CA1 is causally responsible for this. [00:50:35] We can do other kinds of neat things in the model that we can't really do in humans easily. Insofar as the monosynaptic pathway is necessary for statistical learning, lesioning that pathway, the connection from entorhinal cortex to CA1, should prevent the model from learning. The way I'm going to plot learning in this case is the probability, given A on the input, of the model retrieving B on the output, right, so insofar as there is some sort of joint representation of A and B internally, when presented with A on the input, B should be reactivated on the output, and that would be evidence of learning. So when you lesion the monosynaptic pathway, A is no more likely to retrieve B after training than it is to retrieve any of the other fractals. So there's no retrieval of B given A when you don't have a monosynaptic pathway. This shows that it is necessary in the model to connect entorhinal cortex to CA1. [00:51:30] In contrast, when you lesion the trisynaptic pathway, the connection between CA3 and CA1 in particular, the model does just fine: given A it retrieves B, given B it retrieves A, at a much higher probability than retrieving anything else. So this shows that the monosynaptic pathway is not only necessary but also sufficient; it can do it on its own without the trisynaptic pathway. [00:51:53] So to summarize this part of the talk, and then I'll just mention 2 brief ongoing areas: we have converging findings from fMRI, patients, and simulations that the hippocampus is involved and may be necessary for this kind of learning.
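The pair structure that the model was trained on can be illustrated by generating a sequence and counting transitions. This is a sketch under assumptions, not the training code for the model: it assumes the 4 pairs are drawn at random with no immediate repetition of the same pair, which the talk does not specify. It shows the asymmetry described above, where B always follows A but A only sometimes follows D.

```python
import random
from collections import Counter, defaultdict

PAIRS = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]

def make_sequence(n_pairs=2000, seed=0):
    """Concatenate randomly ordered pairs; assumes (for illustration only)
    that the same pair never appears twice in a row."""
    rng = random.Random(seed)
    seq, prev = [], None
    for _ in range(n_pairs):
        pair = rng.choice([p for p in PAIRS if p != prev])
        seq.extend(pair)
        prev = pair
    return seq

def transition_probs(seq):
    """Estimate P(next item | current item) from adjacent items."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(seq, seq[1:]):
        counts[cur][nxt] += 1
    return {cur: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
            for cur, ctr in counts.items()}

probs = transition_probs(make_sequence())
p_b_given_a = probs["A"].get("B", 0.0)   # within-pair transition
p_a_given_d = probs["D"].get("A", 0.0)   # across-pair transition
print("P(B | A) =", p_b_given_a)
print("P(A | D) =", round(p_a_given_d, 3))
```

A representation that only registers which transitions are possible treats A then B and D then A alike; one that is sensitive to transition probability separates them, which is the difference described between dentate gyrus or CA3 and CA1 in the model.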
It's a minor revision to the standard theory of human memory based on complementary learning systems, suggesting that the complementarity might exist within the hippocampus itself, across these 2 different pathways. The way I think about this is that we've put all of the burden for statistical learning on cortex, but what this work suggests is that some initial extraction of regularities might happen in the hippocampus itself, and what consolidation might do is not just send individual experiences out to cortex, but it might actually send structured knowledge out to cortex during sleep and offline periods, whereas the hippocampus might be forming, at least on the timescale of the day, hypotheses about the structure of the world that then get consolidated over multiple days to the rest of the brain. [00:52:54] OK, so 2 quick future directions. If episodic memory and statistical learning depend on the same brain system and use some of these same subfields, how do they interact? What you'll notice is that CA1 is being fed both from CA3 and from entorhinal cortex, and so if that's the case, how do they interact with each other? Typically this hasn't been studied. If you're looking at episodic memory you wouldn't embed regularities; if you're studying statistical learning you wouldn't use idiosyncratic details or trial-unique kinds of images like you'd use for episodic memory. So this is work that Brynn Sherman is working on, and her combined episodic memory and statistical learning task has both episodic details and regularities in it, to look at how these kinds of learning interact. The way she did that is she presented scenes where the categories of the scenes followed a structure, so beaches were followed by mountains.
[00:53:49] But each individual picture was trial-unique, so all of the exemplars are only shown once. What this means is we can study statistical learning of the categories and episodic memory for the exemplars. Her hypothesis was that one way that these could interact is that insofar as [00:54:10] a category is predictive of what will come next in the sequence, that allows us to generate an expectation, to retrieve the associated category, the general expectation of that category, and that this retrieval based on statistical learning might inhibit or block encoding of the exemplar of that predictive category. So to put that another way, if beaches predict mountains, given some beach you might be retrieving mountains, expecting mountains, and that might prevent you from encoding the idiosyncratic episodic details of the current beach, right, that retrieval and encoding are competitive with one another in this case. And in fact that's what she found. So this is episodic memory in a subsequent test for the exemplars. They were either the first category in a pair, the second category in a pair, or random items. So relative to both the second items in a pair and the random items, episodic memory for these predictive category exemplars was impaired. [00:55:13] We've done a couple of different versions of this. This is yes/no with confidence, and this is looking at high-confidence judgments. We also ran a source memory version of this, where they get the image and they have to say when in time in the experiment they saw it, and you get the impairments both for high-confidence recognition and for temporal source memory judgments. [00:55:40] Yes, so the rate is all based on repeated exemplars; the novel lures are other exemplars of the same categories in the memory test, and the random items here all the other categories are completely.
OK. We think this is related to prediction, so Brynn ran an imaging study to test the hypothesis that it is the retrieval of the predicted category that's impairing episodic encoding of the current exemplar. To do this she used classification algorithms, machine learning algorithms, on the patterns of activity evoked in the hippocampus and in visual areas that are responsive to these kinds of scenes. So this is a baseline condition: if beaches are followed by mountains, this is your ability to decode mountains during the presentation of mountains; it's just perceptual decoding. [00:56:41] That works OK in visual cortex, so it's reliable in both occipital cortex and parahippocampal cortex, and it works marginally well in the hippocampus itself. But if you look during the A category, so now during beaches, decoding information about mountains versus other categories, you can reliably decode mountains from beaches in the hippocampus but not in visual areas. So we're decoding the category prediction selectively from the hippocampus. And moreover, the more [00:57:13] category decoding there is of the next category in the sequence, the worse the memory for the exemplars of that predictive category, so that is showing that relationship between the amount of prediction and the amount of episodic encoding. So I'll just end by highlighting what I think is an interesting developmental mystery that has guided a lot of our work. We're now developing methods to do these kinds of experiments in infants and toddlers. Statistical learning is thought to be a core building block of the mind, to help us learn language, help us learn object labels, the structure of the environment, spatial knowledge. It's present even in newborns, and yet the dogma is that the hippocampus has a relatively protracted maturation in terms of size, morphology, cytoarchitecture, and connectivity, even into adulthood. So how can you have the same behavior without the same, or at least an identical, brain system?
[00:58:12] And so we've been developing methods to test this and... well, I guess I should probably wrap up, so I won't go into too much detail. I'll just give the theory here. One possibility is that it happens outside of the hippocampus, that you don't need the hippocampus, that greater plasticity in the cortex early in life might allow for rapid binding that looks like statistical learning. You don't want to do this in an adult because that would cause interference, catastrophic interference, with your world knowledge, but in an infant that doesn't have much knowledge built into the cortex, rapid change and rapid plasticity might not be such a bad thing. [00:58:57] And then as you get older and new learning shifts to the hippocampus, the learning rate of cortex would slow down and cortical updates would happen via consolidation. The other story, which I am favoring more these days, is that we actually don't really know what the hippocampus is doing early in life. There are very few studies of the function of the human hippocampus in young kids; in fact I'm not aware of any direct studies. You need to use fMRI for this because it's a deep brain structure; you can't use EEG or near-infrared spectroscopy. [00:59:31] And it is generally not used in the kinds of tasks that I've shown you, in awake
infants who are watching visual information, having behavior recorded, and so on. So we just don't really know what the hippocampus is doing early in life. So one way that it could support statistical learning is that that monosynaptic pathway that I showed you as being important for statistical learning, based on our model, actually develops before the trisynaptic pathway that underlies episodic memory behavior. So when people say that the hippocampus has a protracted maturation, it's often based on the observation that episodic memory behavior has a relatively long maturation. So this is in macaque, where the juvenile macaque has an intact monosynaptic pathway but no trisynaptic pathway, whereas by adulthood they now also have the trisynaptic pathway. So maybe there's a bias early in life in the hippocampus to extract regularities, even before it's able to do fully fledged episodic memory encoding. [01:00:33] So yeah, we've been doing studies now in infants. I won't go through the methods; I'll be happy to answer questions about it. This is the work of a few graduate students in the lab; Cameron Ellis has been leading this particular project. We've now collected, and this is a couple of weeks out of date, about 50 [01:00:56] fMRI data sets from infants. This is age in months on the X
axis, starting at 3 months, up to 36 months. These are painful experiments in terms of how much data we get for the amount of time we put into it, so these tend to be 90-minute scans and we get, you know, 5 to 10 minutes of usable data, but that's a lot of data given that there's currently very, very little data in this age range. We're doing a whole range of experiments, but the most relevant one for today is we repeated a very basic statistical learning task where we show infants looming fractal patterns that have a pair structure or where the sequence is random. In a very first-pass analysis on the babies that have done this task, if you look at occipital cortex, in areas that respond to these kinds of stimuli, [01:01:48] there's no difference between structured and random conditions, either in the first half or the second half of exposure. So these are areas that are visually responsive. This is a difference score between structured and random, so the visually responsive areas can't really distinguish between structured and random, and there's no change over time as they learn. However, if you look in the medial temporal lobe, the difference between structured and random is not there in the first half of exposure, but by the second half of exposure there's this emerging difference between structured and random blocks, suggesting, to the contrary, this idea that the hippocampus and medial temporal lobe are functional early in life; they might be participating in statistical learning. So it's very, very tentative, with a lot more data collection to go, but I think this is pointing to maybe, you know, a greater role for the hippocampus in early learning and memory than previously thought. So I'll stop there for the sake of time, and thank you very much. [01:02:50] We. Run it through questions like you know. Yeah so. Yeah. So. Yeah it's. Very. Them I think it's are great so one thing we have done is look at. We've gotten ratings and sort of.
Tried to get more objective measures of the typicality of all of the scenes we're using. It wasn't specifically to address your question, but our hypothesis was that exemplars that are more typical of the category should be ones that are more likely to be forgotten, because they're going to trigger more of this category prediction. [01:03:54] They're better cues for the category representation but you. Still But that doesn't turn out to be true, so I don't know if that helps with my family. Were very high or yeah. Where you know you have. Yet to figure out. Something. And. Or. You're giving evils or. [01:04:35] Yeah now you can start using that. For learning the sequence. Yeah, yeah, there is a lot of variability. There are 12 categories here and there is variability across those categories in how tight they are, and so I think even within this dataset there could be a way of looking at that. My prediction would be that if you had a highly variable category you would get less learning of the sequence and less prediction effects, because you're still learning, you know, that these things go together lately yeah. [01:05:10] Yeah. Yeah. Yeah. Yeah, I think it's a great prediction and we can manipulate that. You know, I think Brynn sampled images that she thought were representative of the category on purpose, but we could manipulate that and find bad examples and manipulate the range of the category and test that. It's a great idea, yeah. So. [01:05:39] Yeah yeah yeah. If you look at the title plus 3 overtime. Because you expected those clusters are going to get your learning going to be developing on the directory. And they're. Basically just pretty loaded. Model of the hyper Carol cluster and see how those clusters kind of Virgin Islands are all yeah yeah. [01:06:08] Not in this study. This study was actually designed to look at whether.
This kind of structure can give rise to event boundaries, and so during some of the learning phase they were asked to press a button when they thought something had changed. So it's not an ideal data set for looking at learning over time, but yeah, in principle I think [01:06:37] that would be really interesting. I mean, do you have a prediction of how you would expect it to emerge? Like, you know, you can imagine, I showed 2 kinds of information in there, because one was just clustering by community but the other was information about boundary nodes versus internal nodes, right, so you might imagine some trade-off there. Yeah, I don't know. I don't. [01:07:01] Know. About. Yeah yeah. Yeah. Yeah, yeah, I think there's a lot to be done here. Yeah, the Karuza paper I think is only there for into Yeah, and what they manipulated in some of that work is how you do the passes over the graph. Yeah. And there are some subtleties there, so like if you do a Hamiltonian path, where you have to visit every node before you revisit any node, that really changes the statistics and makes it very, very difficult to learn this graph. And so they compared a few different kinds of walks, but this was essentially a random walk. [01:07:50] That it's a great idea. I guess you got a lot. Longer Yeah but you know. Yeah.
Yeah, it's a great question. There's a fundamental challenge with all of these studies. It's very salient, you know, in the pair case, where the reason why we did pre- and post-learning is because we can present things in a random order. If you're looking at the representations during learning there's a temporal confound: the things that are paired are occurring closer together interact was ition And so that which is the is the point like but normally enough designed the opposite of what you want you want things to be as they were thought of as possible. So that's why we did the pre/post thing, which makes it harder to look at the time course of learning, just because of the BOLD response. When we did the community structure one, which was not a pre/post design, we looked at it during learning, not over time, we don't really have the resolution for that, but we just made sure to always equate the distance in time between the things that were being compared. So there are ways of doing it. I think it's a good question. [01:09:21] That. There are, especially in the community structure work, some frontal areas that seem to track community structure as well, that are more related to the kind of parsing behaviors I was talking about before. In the pair structure I don't remember off the top of my head, but yes, from memory there are a few other clusters. [01:09:45] We also often see caudate involvement, and a few other people have observed that. In terms of cortical representations of knowledge, I mean, the standard cortical story would require some amount of consolidation; this is all within session. So there have been a couple of studies; I showed one of them in that summary slide. [01:10:05] Is very using birth year. Yeah, so just a point on this, what I'm not sure of is this paper, a follow-up paper, that has a serial reaction time task, and they scanned that day and the next day, and they found a shift from MTL
to striatum from one to the next. So that's like one example of consolidation to look and really confusing. [01:10:40] Yeah, I mean, my hypothesis when we started this work, before we got to the hippocampus, we were basing this on the work by Miyashita that I referred to before. What he did, he trained monkeys. It was an accidental discovery, I mean, it's one of my favorite series of Nature papers, from this accidental discovery, where he was familiarizing monkeys with these fractal patterns but he always used the same order of stimuli when he was familiarizing them. What he found is that if you looked in inferior temporal cortex, and it turned out it was TE and to some extent also perirhinal cortex, the tuning was not just based on the visual features; it was also based on the neighbors. So the tuning for one fractal, if you picked a neuron and found its preferred stimulus, its next most preferred stimulus was the fractal either before or after in the sequence, and so there was tuning based just on spatial and temporal co-occurrence. But again, those were in visual areas that were selective for the stimulus. So that's how we started off thinking, especially about these higher-level areas, inferior temporal cortex, maybe perirhinal cortex: if we find, you know, stimulus representations, the effect of learning is going to be to push those representations in different directions based on the statistics. That's hard to measure; those are pretty subtle changes. The hippocampus does not [01:12:09] show reliable stimulus representations; its tuning is based purely on the spatiotemporal features. So for example it's very hard to decode these individual fractals from the hippocampus, even if you can find pair information. It did go off. All right. I'll see you at the reception. Thank you.