…bachelor's degree at Stanford, then moved on to do a PhD at Princeton in psychology and neuroscience, and then moved on to a postdoctoral fellow position at Harvard Medical School. In 2019 she accepted an assistant professor position at the University of Pennsylvania in Philadelphia, so she has been there for about a year and a half now, having had to deal with starting up her lab during the pandemic, so hopefully that's been working okay for her. I know she's working on memory and learning, with a particular emphasis on consolidation during awake periods and sleep, and she is also working on neural network modeling. So we're very pleased to have her. Anna, take it away.

Great, hi everybody. Thank you so much for having me, and thank you for the introduction, Dobby. I want to start with this really broad question: how does the brain store new information? And I want to define two ends of a representational spectrum, from localist to distributed representations, which I'm going to illustrate here in terms of their use in neural network models. In a localist representation, some input comes in from the environment, and some set of neurons or units will represent that information across this hidden layer. When I say localist, all I mean is that a new input that comes in, regardless of how similar it might be to that first piece of information, will be represented by a completely non-overlapping set of neurons in this hidden layer. That contrasts with a distributed representation, where you allow overlap across the different inputs you see in the environment. This distributed kind of representation has been crucial to the success of neural network models in machine learning applications, and I think also in thinking about how these models apply to the brain and to behavioral phenomena. But there are pros and cons to using this kind of representation. Maybe the most obvious pro is that an overlapping representation makes generalization really easy and automatic. If you, for example, represent all of your memories of seeing different birds flying with this overlapping representation, it becomes really easy to infer properties of some new thing you see that has a lot of features in common with all the birds you've seen flying before. Maybe you haven't seen this particular bird flying, but you'll immediately understand that this bird probably also has the property of being able to fly. So this kind of representation makes that kind of generalization really easy, whereas a localist representation makes it extremely difficult, because the representations are completely separate.
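To make that distinction concrete, here is a minimal NumPy sketch (illustrative toy vectors of my own, not stimuli or code from the talk) of how shared units in a distributed code support similarity-based generalization, while a localist code provides none:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two activity patterns."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Localist code: every item gets its own unit, regardless of similarity.
robin_loc   = np.array([1., 0., 0., 0.])
sparrow_loc = np.array([0., 1., 0., 0.])

# Distributed code: similar items share active units (here, "bird" features).
robin_dist   = np.array([1., 1., 1., 0.])   # shares two units with the sparrow
sparrow_dist = np.array([1., 1., 0., 1.])

print(cosine(robin_loc, sparrow_loc))    # 0.0   -> no basis for generalization
print(cosine(robin_dist, sparrow_dist))  # ~0.67 -> knowledge about robins
                                         # (e.g., "flies") transfers to sparrows
```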
With separate representations, it's less easy to see those relationships between experiences. On the other hand, there are some kinds of memory tasks where the localist representation is really useful, like if you're trying to remember that penguins swim but don't fly, whereas a very similar-looking bird does fly. In that case the localist representation is useful because it has such low interference: it lets you think about the differences between these objects, whereas with the distributed representation there's going to be much more interference. This interference issue is associated with a behavioral difference between the two representations, which is that distributed representations require information to be learned in an interleaved order; the order of presentation matters, whereas that's not true for localist representations. To give you a quick intuition, if you haven't thought about this before: with a localist representation, in an interleaved presentation situation you can go back and forth between studying sets of information, and that works fine, and if you present information in a blocked order, one set of information before you move on to the second set, that works too. There's no overlap between the representations, so the order of presentation makes no difference. But in the distributed case, while interleaved exposure works really well, blocked exposure has a problem: to the extent that the representations of the second set of information overlap with those of the first, there's a tendency to overwrite the information you initially learned. So distributed representations are really great at finding and storing structure in data efficiently, but they are highly susceptible to interference, and this is the classic catastrophic interference problem that still plagues many neural network models (most, I would say).
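The blocked-versus-interleaved intuition can be demonstrated with a toy error-driven learner (a hedged sketch assuming a simple delta rule over overlapping input patterns; the patterns, targets, and learning rate are arbitrary inventions):

```python
import numpy as np

# Two sets of input->target associations with overlapping input patterns,
# toy stand-ins for similar experiences stored in one distributed network.
X1, y1 = np.array([[1., 1., 0., 0.], [1., 0., 1., 0.]]), np.array([1., 0.])
X2, y2 = np.array([[1., 1., 1., 0.], [0., 1., 1., 1.]]), np.array([0., 1.])

def train(order, lr=0.2, passes=1):
    """Delta-rule training on (x, y) items presented in the given order."""
    w = np.zeros(4)
    for _ in range(passes):
        for x, y in order:
            w += lr * (y - w @ x) * x      # error-driven weight update
    return w

items1 = list(zip(X1, y1))
items2 = list(zip(X2, y2))

# Blocked: all of set 1, then all of set 2. Interleaved: alternate the sets.
w_blocked = train(items1 * 40 + items2 * 40)
w_inter   = train(items1 + items2, passes=40)

err = lambda w: np.mean((y1 - X1 @ w) ** 2)
print(f"set-1 error after blocked:     {err(w_blocked):.3f}")  # large: overwritten
print(f"set-1 error after interleaved: {err(w_inter):.3f}")    # small: retained
```

Training on the second set alone at the end drives the shared weights away from the first set's solution; interleaving lets the same weights come to satisfy both sets at once.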
So what does the brain use? Well, probably it uses some combination of both of these kinds of representations, and that was the proposal of the classic complementary learning systems theory. It basically said: maybe the hippocampus uses these localist-style representations and does really rapid learning of individual episodes, which is a useful kind of representation because it keeps all those experiences separate to avoid interference; then offline, maybe during sleep, the hippocampus replays these experiences in an interleaved order, which allows the neocortex to slowly extract the statistics across these experiences and build up the distributed representations that are so useful for long-term storage of structured information. I think this is basically right and explains a lot of data, but there's a really important missing piece: it doesn't explain how we can generalize quickly, before there's time for offline replay and sleep. We know that we're very good at this in various ways. When you have to generalize across the timescale of seconds or minutes or hours, how do we do that? How does that fit into this picture? We've proposed that the hippocampus is actually capable of this kind of rapid statistical learning as well as the localist-style learning: that it has a kind of intermediate-style representation, where things are somewhat overlapping, and a kind of intermediate learning rate, and that this allows the hippocampus to both extract statistics and learn episodes.

So I'm going to tell you about some of the experiments that got us thinking about the hippocampus in this way, and I'll show you a model of the hippocampus that tries to explain how it's possible for these two different kinds of representations to coexist in one brain area, because they really are in computational tension: how is it that one structure can do both of these things? Then, mostly, today I'd like to focus on our recent work testing predictions of this model: how might we know that this is what's happening, as opposed to some other way of producing rapid generalization? And if I have time, I'll tell you a little bit about what we're thinking about in the sleep domain: once you've encoded these new representations, how do you process that information offline and complete this process of building up the neocortical representations?

Okay, so the experiments that got us thinking about the hippocampus in this way are our visual statistical learning experiments, where participants see a stream of novel images presented one at a time, and there's some hidden temporal structure embedded in the streams. In this first experiment there were pairs of objects that always occurred together, so these objects are seen much more frequently back to back than two objects that happen to appear together in the transition from one pair to the next. We measured the representations of each of the individual objects in the hippocampus, well, in the whole brain, before and after this sequence exposure. So we're just picking up the patterns of activity across the voxels in the hippocampus and other areas that are evoked by each of these individual images, and then we can ask whether images that were paired in the sequence exposure are represented more similarly than images that were not paired during the sequence exposure.
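Schematically, the analysis works like the following sketch (synthetic "voxel" patterns and an invented drift toward the pairmate stand in for real data; this is not the actual analysis pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)

def pattern_similarity(patterns, i, j):
    """Pearson correlation between the evoked patterns of items i and j."""
    return np.corrcoef(patterns[i], patterns[j])[0, 1]

n_items, n_voxels = 8, 50
pre = rng.normal(size=(n_items, n_voxels))         # pre-exposure patterns
post = pre + rng.normal(scale=0.3, size=pre.shape)  # post-exposure patterns
post[0] += 0.5 * pre[1]   # toy "learning": paired items 0 and 1
post[1] += 0.5 * pre[0]   # drift toward each other

pairs     = [(0, 1)]          # items shown back to back in the stream
non_pairs = [(0, 2), (1, 3)]  # items that only met across pair boundaries

for label, idx in [("paired", pairs), ("non-paired", non_pairs)]:
    change = np.mean([pattern_similarity(post, i, j) - pattern_similarity(pre, i, j)
                      for i, j in idx])
    print(f"{label:>10}: mean similarity change = {change:+.3f}")
```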
What we find is that over the course of the sequence exposure there's a change in pattern similarity in the hippocampus: items that were paired together tend to become represented more similarly, and there's no change for items that were not paired together. This suggests that the hippocampus is sensitive to this kind of statistical structure, and also that it's using an overlapping representation of related items, which is maybe not what we're used to seeing in the hippocampus, or what we would expect from an orthogonalized localist representation there.

We think the hippocampus is not just sensitive to these regularities but really critical for this kind of statistical learning. Here we tested a patient with bilateral hippocampal damage on the same kind of paradigm, and we found that she was unable to learn this kind of structure, across not just shapes but also scenes, tones, and syllables. So even with auditory stimuli she was unable to pick up on this kind of statistical structure, and that's been replicated in some additional hippocampal patients.

And it's not just this simple kind of pairwise structure; we think the hippocampus can learn more complex structure as well. This is an example of a temporal structure created from a graph with community structure, which we think is a more realistic form of temporal structure. These sequences can't be parsed based on pairwise co-occurrence frequencies or simple differences in transition probabilities; you have to be sensitive to the higher-order structure. We find that people are sensitive to this structure, and that the hippocampus again represents it in a similar way, where items in the same community end up being represented more similarly than items in different communities. So we think the hippocampus is very good at this kind of picking up on structure over time, and it seems to be using these overlapping representations.

Okay, so it seems like the hippocampus is both separating related experiences (I didn't show you data on that today, but there's quite a lot of literature on it) and also allowing related experiences to overlap as a function of their statistical structure. These two kinds of representations are different, really in tension, so how is it that the hippocampus could be doing both of these things?
To try to understand this, we have been working with a neural network model of the hippocampus that has properties reflecting what we know about the actual hippocampus. It has entorhinal cortex, which provides input and output to the model, and then three hidden layers that represent three of the subfields of the hippocampus: dentate gyrus, CA3, and CA1. If you haven't seen this kind of depiction of a neural network model before, the height and color of the little boxes represent the firing rate of a neuron, or maybe a small population of neurons, and the arrows represent connectivity between those neurons. The connectivity actually looks something more like this, but we use the big arrows to stand in for that fuller connectivity. There are two pathways through this model, and through the actual hippocampus: the trisynaptic pathway and the monosynaptic pathway, and these pathways seem to have different properties. The trisynaptic pathway is really the pathway that implements those famous sparse, localist-style representations that are at the heart of why we think the hippocampus is so good at episodic memory. The reason it has these kinds of representations is that there's a really specialized, sparse, fanning connectivity into dentate gyrus that takes inputs, even very overlapping ones, and projects them to non-overlapping populations of neurons in that area, so you end up with a pattern separation process. The direct pathway that goes from entorhinal cortex to CA1 seems to have different properties: it has less extreme pattern separation, which might be related to the fact that there's less extreme sparsity in its connections with the regions it connects to, and it also seems to have a slower learning rate than the trisynaptic pathway. So this pathway on its own can do slow, incremental learning; it's able to learn over the timescale of tens of trials, but it wouldn't be able to learn in one trial, whereas the trisynaptic pathway can do one-shot learning and is less involved in more incremental learning. Rodent lesion studies have been able to lesion these two pathways separately and demonstrate these different properties. So we're going to present this statistical structure to the model and just see what happens: can it handle parsing these sequences? The way we do this is really simple: we show the model two items at a time, where the current item has full activity and the immediately preceding item has some decayed activity, and we just move that window along the sequence.
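A minimal sketch of that input scheme (my own reconstruction; the decay value is an invented placeholder):

```python
import numpy as np

def windowed_inputs(sequence, n_items, decay=0.5):
    """Yield input vectors: the current item at full activity, the
    immediately preceding item at decayed activity (decay is illustrative)."""
    prev = None
    for item in sequence:
        x = np.zeros(n_items)
        x[item] = 1.0
        if prev is not None:
            x[prev] = decay
        yield x
        prev = item
    # (In the episodic variant described next, both pair members would
    # instead be presented together at full strength, pair by pair.)

# Pair structure: items (0,1), (2,3), (4,5) always occur back to back.
sequence = [0, 1, 4, 5, 2, 3, 0, 1]
for x in windowed_inputs(sequence, n_items=6):
    print(x)
```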
We ask whether the model can figure out, from the statistics over time, that there are pairs embedded in this structure. And we can contrast that with a kind of classic episodic memory simulation, where we just ask the model to directly memorize that there are pairs of things that go together: memorize that A and B go together, then move on to the next pair. Here we're not requiring the model to extract statistics over time; we're just seeing whether it can quickly form bindings between items. Then, after we train the model either on the statistical structure or on the classic episodic variant, we want to test it in a way that's analogous to what we did in our fMRI experiments, which is very easy to do in this kind of model. After learning, we just show the model individual items by themselves and record the pattern of activity evoked across the different layers. So we can say, okay, this is the pattern of activity evoked by item A in region CA3, and we can do the same for all the other items, and then we can run the same kinds of correlation analyses that we do in the human data: are two items that were paired in the sequence represented more similarly than items that were not paired? Same kind of question.
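The model test phase can be sketched the same way (toy activity patterns stand in for the recorded layer activity, and the shared "pair code" is injected by hand purely to show what the analysis detects):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins for the activity patterns recorded from one model layer
# when each of 6 items (pairs AB, CD, EF) is presented alone after learning.
base = rng.normal(size=(6, 40))
pair_code = rng.normal(size=(3, 40))
patterns = base + 2.0 * np.repeat(pair_code, 2, axis=0)  # pairmates share a code

# Item-by-item similarity matrix, the model analogue of the fMRI analysis.
sim = np.corrcoef(patterns)

labels = ["A", "B", "C", "D", "E", "F"]
print("      " + "     ".join(labels))
for lab, row in zip(labels, sim):
    print(lab, " ".join(f"{v:+.2f}" for v in row))
# Strong pair structure shows up as high values in the 2x2 blocks on the
# diagonal (A-B, C-D, E-F); a "checkerboard" would add high off-diagonal
# cells for across-pair transitions (e.g., B-C) as well.
```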
This is just to orient you to the figures I'll show you here. First, the results for the episodic variant of the model, where the pairs were clearly demarcated and there was no need to extract any statistics from the sequences. What we see is that dentate gyrus and CA3, the trisynaptic pathway, are really good at learning this kind of structure, which is what we would expect from the known properties of this pathway and from previous modeling work. Basically, these areas build up a conjunctive representation of each pair, and each item by itself evokes that same conjunctive representation, so you get really strong pair structure in these areas. CA1 does have some sensitivity to the structure, but it's much weaker; basically it would just take more time for CA1 to build up strong representations of these pairs. So this tells us what we already knew about this pathway: it's very good at quick episodic learning. What happens when the model instead has to extract the statistics from the sequence over time? Here we see a totally different pattern of representations in the model, the opposite of what I just showed you. Now CA1 is really good at representing the pair structure, whereas dentate gyrus and CA3 are completely failing. They are doing something interesting, though: this checkerboard structure suggests that these regions are picking up on both the transitions between pairs, like BG, and the actual pairs, like AB, as opposed to a pairing like A and G, which was never actually shown together in the sequence. So this pathway is doing too good a job of memorizing everything it has seen, which includes both the pairs and the transitions between the pairs. That's exactly what you want from an episodic memory system, but it's totally not useful for this learning problem, where you're trying to extract the statistics over time. So these different areas have different properties as a function of the learning problem and the structure of the input, and this shows us that CA1 can solve this task; it seems to be the only part of the hippocampus able to solve this statistical learning problem.

One interesting application of this model is in the domain of development, because infants have very undeveloped hippocampi but seem to be very good at statistical learning. For people who think the hippocampus is important for statistical learning, it's a bit of a puzzle why infants would be so good at it if they don't have developed hippocampi. But it turns out that these two pathways develop at different times, and it may really be the trisynaptic pathway that's undeveloped in infants, because the monosynaptic pathway seems to develop very early on. So we can ask, in the model: if we don't give it access to the trisynaptic pathway, and it has to learn on the monosynaptic pathway alone, what happens? It turns out it's completely fine; in fact it does even a little bit better without the trisynaptic pathway. This suggests that the monosynaptic pathway can act on its own, and it might provide some insight into what's happening in infants, who don't yet have this trisynaptic episodic memory system set up.

So I have made the argument that this pathway has properties that are well suited to extracting structure over time: the slow learning rate is really good at picking up structure over many trials, and the overlapping representations are useful for seeing that there's shared structure across time. And we think these properties are really important not just for temporal statistical learning but for any situation where you need to pick up on structure across experiences.
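In simulation terms, "removing" a pathway, as in the infant simulation just described, can be as simple as silencing the projections along it. A minimal sketch, assuming an invented dictionary of projection weights rather than the model's real data structures:

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented stand-in for the model's connectivity: one matrix per projection.
weights = {
    ("EC", "DG"):   rng.normal(size=(20, 40)),  # trisynaptic pathway...
    ("DG", "CA3"):  rng.normal(size=(40, 30)),
    ("CA3", "CA1"): rng.normal(size=(30, 25)),
    ("EC", "CA1"):  rng.normal(size=(20, 25)),  # ...monosynaptic pathway
}

TRISYNAPTIC = [("EC", "DG"), ("DG", "CA3"), ("CA3", "CA1")]

def lesion(weights, projections):
    """Return a copy of the network with the given projections silenced,
    e.g., modeling an infant's undeveloped trisynaptic pathway."""
    out = dict(weights)
    for proj in projections:
        out[proj] = np.zeros_like(weights[proj])
    return out

infant_model = lesion(weights, TRISYNAPTIC)  # monosynaptic pathway only
print(infant_model[("EC", "DG")].any())      # False: projection is silenced
```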
So we've been applying this model to other kinds of domains. One area where we think this learning strategy should be really useful is category and concept learning, where you're trying to understand some new category by picking up on the structure of exemplars over time, and we know that the hippocampus plays an important role in this kind of learning. I'm going to show you a couple of simulations that Jelena Sucevic has done applying this model to the category learning domain. First she took the classic weather prediction task, where you see a set of abstract cards and you're supposed to categorize into sunshine or rain. We know that amnesics can do this task but are impaired, so we don't think the hippocampus is the only region capable of this kind of learning, but it's certainly playing a role, and we want to think about what that role might be. Here she's breaking the results down by the intact model in green, a version of the model that only has the monosynaptic pathway in orange, and a version that only has the trisynaptic pathway in purple. She finds the model can do this categorization task, and if you take away the trisynaptic pathway and just use the monosynaptic pathway, it seems to do maybe even a little better, like we saw in the statistical learning case; it's really the monosynaptic pathway that's responsible for this, and the trisynaptic pathway totally fails on this prediction task. If you change the task and instead ask the model to recognize particular card combinations that it saw during exposure, it's a very different story: now the trisynaptic kind of learning strategy becomes really useful, and it turns out that the trisynaptic-pathway-only model does best, while the monosynaptic pathway by itself struggles a lot with that kind of task. So both of these pathways are engaged by the task and are doing something, but the actual category learning aspect of it is driven by the monosynaptic pathway.
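For concreteness, here's a toy generator for this kind of probabilistic task (the card validities and the combination rule below are illustrative inventions, not the probabilities of the published task):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy weather prediction task: four cards, each probabilistically
# associated with "sun"; these numbers are illustrative only.
card_strengths = np.array([0.8, 0.6, 0.4, 0.2])

def make_trial():
    """Sample a card combination and a probabilistic sun/rain outcome."""
    while True:
        present = rng.random(4) < 0.5   # which cards appear on this trial
        if present.any():
            break
    p_sun = card_strengths[present].mean()  # evidence from the cards shown
    outcome = rng.random() < p_sun          # 1 = sun, 0 = rain
    return present.astype(float), float(outcome)

# A prediction test scores sun/rain categorization on trials like these;
# a recognition test instead asks which exact combinations were seen.
for _ in range(5):
    cards, outcome = make_trial()
    print(cards, "sun" if outcome else "rain")
```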
She also applied the model to an object category learning paradigm where participants learn about three categories of fake satellite objects. Objects in the same category share most of their features, their parts, but each object also has unique, individuating parts, and participants learn about both of those things. In our imaging experiments that use this task, we find that CA1 is sensitive to this category structure: items from the same category are represented more similarly than items from different categories in CA1, but not in CA3 and dentate gyrus. There's also just more similarity overall in CA1, which I think is consistent with the idea that it's using a different kind of representational scheme. In the model, if you ask it to do a generalization task, where you show it a new satellite it hasn't seen before and ask it to infer some missing feature, the intact model can do this well, the trisynaptic-pathway-only model really struggles, and, interestingly, the monosynaptic-pathway-only model does really well. This is the most obvious example we've seen of what looks like a really strong trade-off between what's happening in these two pathways, and a demonstration that in something like generalization the monosynaptic pathway is doing all of the work. But again, if you change the task and ask the model to remember unique features of individual exemplars, now the monosynaptic pathway fails, and the trisynaptic pathway is completely responsible for that kind of memory. So again, the trisynaptic pathway is doing something: it's sensitive to these objects, and to the extent that you want to remember what's unique about them, it's critical, but it's not helping with the generalization process that requires understanding the structure over time. And if you look at the hidden layer representations of the model, you'll see that CA1 represents the category structure really nicely, which is consistent with what we see in the imaging data.

So I've shown you that the monosynaptic pathway is good at this temporal statistical learning and also at category learning, building up a kind of new semantic memory. Are we done? Have we solved the problem, or are there alternatives, other ways we might think about how the hippocampus contributes to these tasks, and how might we test between those alternatives? To address this question, I'm going to bring up one last task where there's been a lot of theorizing and modeling about how the hippocampus might contribute: the associative inference task. You learn that two things, A and B, go together; you learn that B and C go together; and then there's some kind of test of whether you can make the transitive inference from A to C. We know this task relies on the hippocampus, from rodent lesion data and also from human fMRI data.

So how do you solve associative inference? There have been different proposals.
One proposal was implemented in the REMERGE model, which said: let's preserve and focus on that localist coding scheme in the trisynaptic pathway, and see if we can get inference to emerge from that coding scheme without assuming anything else is happening. The way this works is that there's a trisynaptic-pathway-like layer that has representations of each pair of items, conjunctive episodic representations of each pair, and these are connected to item representations. If you present item A by itself at test, it activates the AB memory that you formed during encoding, which reminds you of the B item, which gets you to your BC memory, and then you can solve the task and get to C. So without having stored the relationship between A and C at encoding, as long as you have these recurrent connections set up correctly, you can use spreading activation to solve the problem at retrieval.
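The retrieval-time mechanism can be sketched in a few lines (a toy reconstruction of the spreading-activation idea, not the published REMERGE implementation; the 0.5 feedback gain is arbitrary):

```python
import numpy as np

items = ["A", "B", "C"]
pairs = [("A", "B"), ("B", "C")]   # what was actually studied

# Bipartite weights: each conjunctive pair unit connects to its two items.
W = np.zeros((len(pairs), len(items)))
for p, (u, v) in enumerate(pairs):
    W[p, items.index(u)] = W[p, items.index(v)] = 1.0

item_act = np.array([1.0, 0.0, 0.0])  # probe with item A at test
for step in range(3):
    pair_act = W @ item_act           # items activate pair memories...
    item_act = np.clip(item_act + 0.5 * (W.T @ pair_act), 0, 1)  # ...and back
    print(f"step {step + 1}: " +
          ", ".join(f"{i}={a:.2f}" for i, a in zip(items, item_act)))
# After a few cycles, C becomes active even though A and C were never
# studied together: the inference emerges at retrieval, not at encoding.
```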
That's one strategy, and it's very different from the strategy I've been telling you about, the strategy of interleaved learning to build up distributed representations. There, items A and C start off with maybe some amount of overlap in their representations, and as you go back and forth between studying AB and BC, you merge those representations, because that's what this kind of learning algorithm tends to do: to the extent that things are similar in the environment, that A and C share this B feature, it tends to push their representations together. Critically, though, the order of presentation is important here: interleaved learning allows this kind of representation to be built up, whereas other presentation orders would not, and that's going to matter. But if you have the interleaved learning and you build up these representations, inference is very easy and automatic, much more so than in the recurrent strategy, because A and C simply share overlapping representations, which makes the transitive inference trivial. There's one other important theory of how you solve associative inference, distinct from these other two, called integrative encoding. It says that you study A and B, and then when you see B and C it reminds you of the AB experience, and you form a kind of conjoined ABC representation that allows you to make the A-to-C inference at test. I'm not going to go into the details of it, but for the purposes of the experiments I'm about to show you, it makes the same predictions as the REMERGE model, namely that inference is not sensitive to the order of presentation of information, whereas the distributed strategy is.

So I'm going to focus on the REMERGE kind of strategy and the distributed representation strategy, and both of these strategies are available to the neural network model I've shown you, because it has both pathways. If you don't allow the model to use its recurrent computation, that's fine: it can solve the problem in CA1, because it has overlapping representations of A and C. But if you're just using the trisynaptic pathway and you allow the model to use recurrence at retrieval, it can also solve the problem that way, in the same way REMERGE does. So it's as if the trisynaptic pathway is implementing REMERGE, while this other pathway can implement the other strategy. So why have two strategies? When might you want to use one or the other? Well, the distributed strategy should be really fast and efficient, which could be really useful in some situations, but it's going to be susceptible to interference. The recurrent strategy should be slower and more effortful; I think of it as being a little more explicit, since you have to think through the different experiences you had in order to solve the inference, but it should be resistant to interference, because the localist representation fights interference problems for you. So there are differences in the behaviors we expect from these two strategies.

Here's a paradigm we've developed to try to tease apart these two strategies. This is a line of experiments led by my grad student Zhenglong Zhou; Marlie Tandoc and Dhairyya Singh in the lab are also involved in this project. We do an associative inference task where we present triads in either an interleaved or a blocked order. So here there's an AB pair and a matching BC pair. In the interleaved case we go back and forth between studying the AB and the BC, and in the blocked case we study all of the ABs before any of the BCs, but everything is mixed together in both conditions. After all that exposure, we do two kinds of tests. The first is a speeded recognition test, where we show participants two objects and ask them to quickly judge whether those two objects were actually presented together during the exposure phase. The answer in this case would be no, because even though they were indirectly related objects, they were not actually seen together.
In the explicit inference test, which is like the standard associative inference test, we show participants an object and make it very explicit: we say that one of these objects was indirectly associated with this first object via some other object, and ask whether they can remember which object it was, and we give them plenty of time to make that judgment. We don't expect any difference between the interleaved and blocked conditions in the explicit inference test, because we think either strategy could solve this task, and there's no reason to think the order of presentation should make a big difference there. But in the speeded recognition test we do predict a difference between the interleaved and blocked conditions. The reason is that if you've built up a distributed, overlapping representation of A and C over the course of the exposure, then it should be confusing to see a trial like this, where you're asked to say that you did not see these two things together: with overlapping representations of A and C, you're going to get confused and think maybe you did see them together. So we predict you'll either be slower to say that you did not see these things together, or you might even false alarm and say that you did see them when you didn't. If you're using a localist representation, which we think you're more likely to be using in the blocked case, then you're less likely to have that kind of problem.

First, here are the data for the explicit, standard associative inference task. Accuracy on this task is on this axis, for the interleaved and for the blocked condition, and we find no difference between the two conditions. It's not that people can't do the task; they do it very well, they're just equally good in the two conditions. But in the speeded recognition case, we find that participants are slower to reject the confusing interleaved ACs relative to blocked ACs, and they also false alarm more: they're more likely to think they saw those AC pairs together when they never actually did.

This was an MTurk experiment, so we decided to pre-register it and run the same thing with more subjects, and we find the same results: no difference in the explicit test, but in the speeded test you're slower for the interleaved condition and also more likely to false alarm. So we think this is a reliable pattern.
So that was an example of a situation where the overlapping representations are hurting behavior, because they make it more confusing to judge whether you've seen particular pairs of objects together. But we'd also like to highlight situations where distributed representations might be useful, and the next couple of experiments show the kinds of things we think building this representation might do for you that would be useful. In this experiment we did the same exposure, either interleaved or blocked presentation of the pairs, and then a generalization kind of test. We first taught people about novel properties of some of the objects; we might say that it turns out this object "fleeces", whatever that means, and then ask which other objects they think also have that property, and we test to what extent they generalize the property to the indirectly related items as a function of the order of exposure. We find that people are more likely to generalize to the interleaved ACs, which is consistent with the idea that you're building up this distributed representation that's so useful for generalization. If you break this down separately into the interleaved and blocked conditions, it looks like this works for the interleaved case but not at all for the blocked case, where people are not generalizing across the indirect pairs at all.

In this last experiment we asked whether there are scenarios where we can make this effect even bigger: scenarios where there's no hope that the localist-style representation could ever solve the task for you. So we thought, what if we take the same paradigm and, instead of asking people to explicitly memorize that two objects go together as a pair, we turn it into a statistical learning experiment, and ask people to both parse the sequence, finding where the pairs are, and understand the indirect relationships. This is the kind of situation we think distributed representations are really crucial for; it's unclear that localist representations can solve this task at all, as you saw in the trisynaptic pathway of the neural network model. And here, indeed, we find that even on the explicit inference test, the standard associative inference test, accuracy for interleaved ACs is much better than for blocked, and there's no evidence that people can learn the structure in the blocked case.
Okay, so interleaved learning seems to benefit rapid inference. We think this is consistent with the idea that the hippocampus might contain these distributed representations, which complement, and are separate from, the pattern-separated localist representations that we still think are very important for avoiding interference in episodic memory. I'm happy to pause and take any questions at this juncture, if that's useful; otherwise I'll just spend a few minutes on our sleep work. Okay, I don't see anything.

Dobby: There are no questions in the chat right now, so maybe you can just keep going.

Anna: Okay, I will power on. I'm going to tell you a little bit about what we're thinking about as the next step here. What happens after these new regularities you're learning in the hippocampus are encoded? What does the transformation look like? If you have no more exposure to information in the environment, it seems like there's still a lot of learning and consolidation that's happening; how does that work? One prediction of this framework, this way of thinking about things, is that over time, and in particular over sleep, there should be an emergence of even more overlapping, distributed representations in neocortex, even without further exposure to information in the environment. So we ran a behavioral experiment using the satellite stimuli I showed you earlier, the object category learning stimuli, where participants learn about three categories of objects, with objects in the same category sharing most of their features but also having unique, individuating features. What we find is that over the course of 12 hours that include either sleep or wake, unique features of the individual satellites are maintained over a night of sleep but forgotten over the same amount of time spent awake. That's consistent with a lot of existing literature on arbitrary pairings and the ability of sleep to prevent forgetting of that kind of information. But in the case of shared features, the features that are shared across members of a category, we find that there's actually an above-baseline improvement in your ability to remember these features across the night of sleep, which suggests that sleep is somehow promoting a better understanding of the shared structure, even without more exposure to that structure in the environment. We think this is consistent with the idea that if you're building up this distributed representation offline in neocortex,
it's going to highlight the shared structure for you and make you understand that structure even better. We also ran an fMRI experiment with this paradigm, looking for offline reactivation of these satellites to see how it relates to changes over time. We're now doing experiments with MEG trying to actually look at sleep replay, but this first experiment was an fMRI experiment where we stuck with awake replay, and we found that awake reactivation of these objects in the hippocampus predicts improvement in your ability to remember them after sleep. So we think that what we were measuring in these awake periods is representative of, or maybe influences, what continues to happen offline during sleep, and that sleep is then especially useful for these memories: the reactivation we observe results in more behavioral improvement over time if that time includes sleep.

So how does replay actually shape cortical representations? This feels like kind of a mysterious thing, and it's really a difficult computational problem, because usually our models rely on input and feedback from the environment in order to build up useful representations. How is it possible that offline learning, based only on already existing representations, could do something so useful? Here we're again doing neural network modeling: we have a model that can sleep, with a hippocampus and a cortex, and we're trying to understand what happens in the interactions between hippocampus and cortex to produce these changes offline.

Learning during sleep, like I said, is a very tricky computational problem, and we've been developing new learning schemes that might allow it to work: learning from your existing representations in a way that does something useful. Here's a video of the model sleeping. It can move from memory to memory; each of these strong yellow states is a particular satellite. The model behaves completely autonomously: we set it running with some random initial activity and then it runs completely on its own, moving from memory to memory, because we have some short-term synaptic depression that forces co-active pairs of units to tire out and move on to the next memory. We also have the model track its own stability, and when states are very stable they get marked as good states, or "plus" states, for learning. So a state like this one, which is stable, gets marked as a good state.
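Here's a cartoon of those replay dynamics, with invented equations in the same spirit as the description (a Hopfield-style network, a fatigue term standing in for short-term synaptic depression, and a single tracked stability variable), not the actual model:

```python
import numpy as np

rng = np.random.default_rng(6)

# Three stored binary memories in a small Hopfield-style network.
memories = rng.choice([-1.0, 1.0], size=(3, 30))
W = (memories.T @ memories) / 30.0
np.fill_diagonal(W, 0.0)

x = rng.choice([-1.0, 1.0], size=30)   # random initial activity
fatigue = np.zeros(30)

for t in range(60):
    prev = x.copy()
    x = np.sign(W @ x - fatigue)       # fatigue pushes units out of attractors
    x[x == 0] = 1.0
    fatigue = 0.9 * fatigue + 0.25 * (x > 0)  # co-active units tire out
    stability = np.mean(x == prev)     # the one tracked variable
    if stability > 0.97:               # stable -> mark as a "plus" state
        which = int(np.argmax(memories @ x))
        print(f"t={t:2d} stable in memory {which} (plus state)")
```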
Then we have oscillations that perturb these states and help reveal the aspects of these memories that need some help. Oscillations are a very prominent perturbation in the sleeping brain, so we can make use of them. The way this works is that we oscillate the levels of inhibition in the model, which controls the amount of overall activity that's possible. When we raise inhibition above baseline, that reveals the parts of a memory that are weaker and in need of strengthening; when we lower inhibition below baseline, that reveals what is connected to this memory and potentially interfering with it, by letting activation spread farther than normal. Then we can contrast our stable states with these perturbed states from the oscillations and do error-driven learning, in this case contrastive Hebbian learning, which allows us to do something useful with these existing representations.
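The resulting learning rule can be written compactly. A hedged sketch, assuming rate-coded activity vectors and treating the inhibition shift as a simple subtraction (the real model would settle the whole network at each inhibition level):

```python
import numpy as np

def perturbed_state(x, inhibition):
    """Crude stand-in for settling under a shifted inhibition level:
    raising inhibition silences weakly active units (exposing the weak
    parts of a memory); lowering it lets extra units come on (exposing
    competitors)."""
    return np.clip(x - inhibition, 0.0, 1.0)

def chl_update(W, x_plus, x_minus_list, lr=0.1):
    """Contrastive Hebbian learning: strengthen the stable (plus) state,
    weaken the average of the oscillation-perturbed (minus) states."""
    minus = np.mean([np.outer(m, m) for m in x_minus_list], axis=0)
    return W + lr * (np.outer(x_plus, x_plus) - minus)

x_plus = np.array([0.9, 0.8, 0.2, 0.0])        # a stable replayed memory
x_minus = [perturbed_state(x_plus, +0.3),       # inhibition above baseline
           perturbed_state(x_plus, -0.3)]       # inhibition below baseline
W = np.zeros((4, 4))
W = chl_update(W, x_plus, x_minus)
print(W.round(2))
```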
We also have sleep stages in the model. We know that the hippocampus and cortex are especially well connected and communicating during slow-wave sleep, and that during rapid eye movement sleep these systems seem to be relatively decoupled, so we let the neocortex run by itself during the rapid eye movement period. What we find is that over the course of the model's sleep, it builds up neocortical representations in which the shared parts of the representations of objects from the same category are enhanced, while the unique parts of the representations are preserved. We're very excited about that, because it matches what we see in our behavioral data, where you're able to remember unique properties but there's this enhancement of the shared structure.

In my last minute here, I'll tell you what we're up to now, or as soon as they let us back in the lab. We have a targeted memory reactivation study running in which participants are again learning about these three categories of objects, but we're actually going to have people listen to the names of the objects as they're learning about them. Then, during sleep, we're going to play these names at the peaks of the slow oscillation, which is the time when we know replay is likely to happen, with the idea that we can try to have some control over what participants are replaying. This targeted memory reactivation technique sounds kind of science-fictiony if you haven't heard of it, but it has been used successfully in many, many sleep studies now, so it's a really exciting way to have some causal control over what's happening in human sleep. We're trying out different manipulations, but one of the things we're really excited about is playing these cues in either interleaved or blocked order: if we think offline replay is most useful when information is interleaved, can we show that you extract the shared structure best when we encourage that replay to be interleaved? Okay, so that's just a sense of what we're up to now.

Let me just sum up overall. I've told you that we think the hippocampus might contain these distributed representations, complementing the well-known localist representations, but more broadly we think this is part of a broader continuum, where during sleep the hippocampus helps the neocortex to further enhance this shared structure with even more distributed representations that are built up slowly over the course of long-term consolidation. Okay, I'll stop there and take any questions if you have them. Thank you.

Dobby: All right, thank you, Anna. If some of you want to unmute and join me in thanking Anna for her talk... I'm sorry for this crazy flashing of my background.

Anna: I think for most of us your picture is sort of small, so the background didn't... okay.

Dobby: Okay, all right. So if anybody wants to ask a question, you can just unmute and ask it directly. All right, while everybody's thinking about questions, maybe I can ask my question. That was super cool, really a ton of interesting stuff, so thank you so much for that. I have a question about some of the stuff you talked about at the very end, that model of sleep. You're saying that some states that are stable get marked as good, and some states that are unstable get marked as bad. Can you talk a little bit more about that, mechanistically? Who is doing the marking, how is it marked, and how is it used later?

Anna: Yeah. So the way it works is that when you first fall into an attractor, the pattern of activity is very stable, which makes it so that we can just track this one variable: how much activity is changing from one time point to the next. We don't have a particular candidate mechanism in mind for what exactly would implement that, but I think there are a few options for how neurons could be sensitive to the amount of change on a short timescale like that, so we assume that something like that is possible. And then we just say, okay, as you fall into an attractor,
[00:49:25.20] and let me just sum up overall. so i've told you that we think the hippocampus might [00:49:31.05] contain these distributed representations that are complementing the well-known localist [00:49:36.12] representations, but more broadly we think this is part of a broader continuum, where during [00:49:42.19] sleep the hippocampus is helping the neocortex to further enhance this shared structure with even [00:49:48.17] more distributed representations that are being built more slowly, over the course [00:49:54.21] of long-term consolidation. okay, i'll stop there and take any questions if you have them. thank you. [00:50:02.00] [00:50:04.17] all right, thank you anna. if some of you want to unmute and [00:50:09.09] join me in thanking anna for her talk... i'm sorry for this crazy flashing of my background. [00:50:16.04] [00:50:22.04] yeah, i think for most of us [00:50:24.10] your picture is sort of small, so the background didn't... okay, okay, all right. so [00:50:32.00] if anybody wants to ask a question, you can just unmute and maybe just ask it directly. [00:50:39.20] [00:50:47.09] all right, while everybody's thinking about questions, maybe i can ask my question. so [00:50:53.03] that was super cool, really a ton of interesting stuff, so thank you so much for that. [00:50:58.17] i have a question about some of the stuff that you talked about at the very end. you were [00:51:05.09] talking about that model of sleep, and you were saying that some states that are stable get [00:51:13.18] marked as good and some states that are unstable get marked as bad. [00:51:19.22] can you talk a little bit more about that? like, mechanistically, who is doing the [00:51:24.19] marking, how is that marked, and how is that used later? yeah, yeah. so the way it works is that [00:51:34.10] when you first fall into an attractor, the pattern of activity is very stable, and that makes [00:51:42.02] it so that we can just track this one variable, which is how much activity is changing from one [00:51:47.14] time point to the next. and we don't have a particular [00:51:53.07] candidate mechanism in mind for what exactly we think would be implementing [00:52:00.10] that, but i think there are a few options for how neurons can be sensitive to how much [00:52:05.11] change there is on a short time scale like that, so we assume that something like that is possible. [00:52:10.21] and then we just say, okay, as you fall into an attractor, things are very stable, and we assume [00:52:15.20] that you can mark that that's true. and then as synaptic depression starts to take [00:52:20.15] hold and the attractor becomes less stable, the oscillations, which are always present, [00:52:26.17] will become stronger, basically just because of that action of the synaptic depression. [00:52:33.14] so oscillations get bigger, and what we can do with this very simple thing is just compare [00:52:38.04] that initial state to the rest of what happens, until things drop below some [00:52:46.19] threshold where we say, okay, we're probably not replaying a memory anymore. and so just by [00:52:51.12] tracking that one variable, that's enough to say we can now contrast the stable state to everything [00:52:57.07] else, which includes both sides of the oscillation, right? it includes both revealing the [00:53:04.10] weak parts that need strengthening and revealing the competitors that need weakening. that all [00:53:10.10] gets baked into the same bad 'minus' state, and the math works out so that we can just contrast [00:53:16.13] that with the stable state, and that's all you need. so you just have to be able to track stability [00:53:20.10] and then compute that contrast, and that's enough to get the learning working in [00:53:25.22] the directions that you need.
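For concreteness, here is one way the stability-tracking idea in that answer could be sketched; the two thresholds, the lumped 'minus' average, and the function names are assumptions for illustration, not the model's actual implementation.

```python
import numpy as np

def stability(prev, curr):
    """The one tracked variable: how little activity changes from one
    time point to the next (1.0 = perfectly stable)."""
    return 1.0 - np.abs(curr - prev).mean()

def plus_minus_from_replay(states, stable_thresh=0.9, replay_thresh=0.5):
    """Split one replay event into a stable 'plus' state and a lumped 'minus' state.

    Early, stable samples of the attractor are marked good (plus). Everything
    after that, until stability falls below `replay_thresh` (probably not
    replaying a memory anymore), is averaged into a single bad 'minus' state,
    which bakes in both sides of the oscillation: the weak parts revealed by
    high inhibition and the competitors revealed by low inhibition.
    Assumes the event starts in a stable state.
    """
    plus, minus = [states[0]], []
    for prev, curr in zip(states, states[1:]):
        s = stability(prev, curr)
        if s < replay_thresh:
            break                                   # the replay event is over
        (plus if s >= stable_thresh else minus).append(curr)
    x_plus = np.mean(plus, axis=0)
    x_minus = np.mean(minus, axis=0) if minus else x_plus
    return x_plus, x_minus  # learn with eta * (outer(x+, x+) - outer(x-, x-))
```

Note that this reuses the contrastive update from the earlier sketch; the only new machinery is the scalar stability signal deciding which samples count as 'plus' and which as 'minus'.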
awesome. and to the extent that you now have a model of [00:53:33.01] the different stages of memory, do you think you can play with, like, what would happen if you [00:53:39.14] just interfere with rem sleep, or what would happen if you just interfere with some other [00:53:44.00] part of sleep? can the model make predictions? yeah, yes, definitely. [00:53:51.09] we're very interested in that. one of the things that we're interested in is [00:53:55.03] the idea that cycling between slow wave and rem might be important. it's hard [00:54:01.12] to have experimental control over sleep stage cycling, or over having [00:54:09.16] or not having particular sleep stages, but we can explore that in the [00:54:14.02] model. and one idea that we've been thinking about is that maybe cycling between slow wave and rem [00:54:20.10] over the course of one night is really useful for integrating new information into existing [00:54:26.12] knowledge structures. so if you assume that cortex contains kind of your [00:54:31.11] long-term memory of everything, whereas the hippocampus might be more biased towards things that have [00:54:35.01] happened more recently, then effectively what you're doing by going back and forth between [00:54:40.10] slow wave and rem is thinking about things that happened more recently and then thinking about [00:54:46.12] the rest of what you know, right? and that's another kind of interleaving process [00:54:51.05] that might be really useful for integrating new information into existing knowledge. and so we can [00:54:56.04] build simulations that either have that interleaving or don't [00:55:01.03] have that interleaving, and it does indeed seem to be the case that in the kind of situation [00:55:06.21] where you need to integrate new knowledge into existing knowledge, that [00:55:12.13] sleep cycling is really important. so yes, we're very interested in that. super cool. so let me [00:55:24.13] provide the opportunity for somebody else to ask questions. [00:55:28.23] this is mike hunter. thank you for the talk, really interesting to see what's going on here. [00:55:36.21] i'm going to ask what's probably a super ignorant question, so forgive it. [00:55:41.07] so you talked about kind of monosynaptic pathways and trisynaptic pathways. is there [00:55:47.20] any reason to limit to those options? you could have four- or five- or 132-synapse pathways [00:56:00.13] if you want, and the model would work the same, if less efficiently. what is the [00:56:08.13] justification for this kind of limited set? [00:56:11.14] [00:56:13.22] yeah, so i actually don't care about the number of synapses in particular. [00:56:20.08] when i use those terms, i just use them as a way of referring to two different pathways in [00:56:26.02] the hippocampus that have different properties. so it could have been that the [00:56:31.14] hippocampus had three synapses on the pathway that had a slower learning rate and more [00:56:37.07] overlapping representations, but that's not the way it was built, and so we built the model [00:56:41.01] to correspond to the known anatomy of the hippocampus. the only thing that i care about [00:56:46.10] is those two properties: how much overlap is there, and what's the learning rate? those are the [00:56:52.04] two properties that vary across those two pathways. and there might be reasons we [00:56:57.16] could talk about for why the hippocampus has set up three synapses on its sparse, high [00:57:03.14] learning rate pathway; there might be a need for multiple transformations of the representation [00:57:09.05] there that's not as important for a statistical learning kind of strategy. but i don't [00:57:15.01] mean to suggest that the number of synapses is important to the theory; it's not. thank you. [00:57:25.20]
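The two properties that answer singles out fit in a few lines. Here is a minimal sketch using the standard anatomical labels for the two hippocampal pathways; the numeric values are purely illustrative, chosen only to show the direction of the contrast.

```python
from dataclasses import dataclass

@dataclass
class Pathway:
    """The only two properties the theory cares about, per pathway."""
    route: str
    overlap: float        # fraction of units active; low = sparse, pattern-separated
    learning_rate: float

# trisynaptic: sparse codes that keep episodes separate, learned quickly
trisynaptic = Pathway(route="EC -> DG -> CA3 -> CA1",
                      overlap=0.02, learning_rate=0.5)

# monosynaptic: overlapping, distributed codes, learned slowly
monosynaptic = Pathway(route="EC -> CA1",
                       overlap=0.20, learning_rate=0.05)
```

On this view, episodic memory versus statistical learning falls out of that contrast rather than out of the synapse count itself.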
[00:57:26.13] i have a question too. by the way, it was a super cool talk and i really enjoyed it, thank you [00:57:33.20] so much. so my question is: it seems like the hippocampus has these two different pathways [00:57:41.16] of learning, doing statistical learning and encoding episodic memories, [00:57:48.23] and i was wondering whether those two systems can work together, or is there any kind of [00:57:58.02] priority, like one happens before the other? yeah, that's a really great question, [00:58:05.07] something we're thinking about a lot right now. so i showed you what looked to [00:58:10.12] be trade-offs between these two pathways, and that brings up a really interesting question, which is: [00:58:17.07] might we have control over the action of these two pathways? we know that there are fluctuations in [00:58:24.08] the strength of the two pathways as a function of theta oscillations and as [00:58:32.04] a function of the amount of acetylcholine in the hippocampus, and so it's plausible that, [00:58:38.08] depending on what you're trying to do, you might be able to route information more [00:58:42.15] strongly through one pathway or the other, which would be very useful: if you're [00:58:46.15] in a situation where you're trying to generalize, or a situation where you really want to [00:58:50.02] focus on the specifics of what you learned, it'd be great if you could have some control over that. and [00:58:55.12] so we're interested in the possibility that medial prefrontal cortex, which has direct connections [00:59:00.19] to ca1, might be able to help route information through these two pathways. [00:59:07.09] but you asked if there was a difference in different phases of learning, is that what you [00:59:12.15] said, between the two pathways? yeah. yeah, we haven't implemented that in the model, and i don't think we've seen evidence [00:59:24.23] for that in data that i can think of, other than the fact that there's a developmental [00:59:31.16] difference, right, where you seem to start off with more of a reliance on statistical [00:59:36.15] learning and then over time add in the episodic memory. but that's different, [00:59:40.19] that's over the course of development. so we haven't explored that. all right, thank you. [00:59:46.10] [00:59:54.19] all right, okay, one last question. thank you. so i have a question about the sleeper effect, [01:00:04.02] which is related to the last part of your presentation. i got quite interested in that [01:00:11.11] sleeper effect, because it almost seems like people weren't paying attention, their brain was actually [01:00:18.04] at rest, but during that process they learned something, which seems connected [01:00:24.06] with the last part of your presentation, where when there is shared structure, [01:00:32.23] people can recall that structure better through sleep, when they wake up. so i wonder whether [01:00:41.16] the two things are actually correlated, or it's just an illusion. are you asking whether we [01:00:52.19] know that replay during sleep is responsible for the memory changes that [01:00:58.04] we see? yeah. yeah, i would say the evidence is pretty strong at this point for that. [01:01:06.08] the strongest evidence probably comes from rodent work, where they see extremely clear evidence of replay, [01:01:16.06] which is of course harder for us with our methods in humans. and you can disrupt that replay, you can specifically see it [01:01:22.15] and stop it in rodents, and find that that impacts memory consolidation. so [01:01:28.23] i would say we have pretty strong evidence at this point [01:01:34.13]
that replay is causally involved in the memory consolidation changes that we see. [01:01:38.17] [01:01:44.13] all right, great. well, thank you again, anna, that was wonderful, [01:01:48.13] and we have some more meetings for you with faculty and students this afternoon. [01:01:53.09] [01:01:55.03] wonderful. well, thank you again, and hopefully we'll see you around during [01:02:01.18] non-pandemic times. yes, perhaps i can come back in person someday. great, thanks everybody. [01:02:18.15]