Thank you so much for the kind introduction. OK, so here's the general idea: today we're going to talk about language in the brain, in particular how you might learn a new language and what that looks like once you've learned it. Language is easy once you've figured it out. Kids have to work really hard to learn language, but if I'm doing a good job today as I talk, language will just fall away; you'll forget that I'm even using language to talk to you. When you're really deep in a conversation, it's like language isn't even part of what you're thinking about; you're thinking about the concepts and the thoughts. That's native language, that's fluency, and that's what a lot of the prior work has focused on. Today, instead, we're going to talk about what it means to learn a language. Learning a language is of course much more difficult: it's effortful, it takes a lot of practice, and it's something you have to work at. Even once you've learned a language, speaking in a language that's not your native language can be hard to do. So we're going to talk about the neural processes behind learning a new language and what that means in the brain. A lot of the previous work has focused on native language understanding: we put people in a scanner, take pictures of their brain while they read words in their native language, and try to figure out what the brain is doing. This is a slight departure (and all of my work up until this point was on that) in that we're going to ask people to learn a new mapping of symbol to word, and we're going to ask whether we can still pick up that word representation in their brain when they see the symbol.
So the questions we're going to answer today are: can we detect learning, that is, can we tell that the person is actually learning the mapping we're asking them to learn, and can we detect the contents of that learning? If I'm showing you a symbol instead of the word "cow," can I tell that you're thinking of the word "cow"? You've learned the mapping from symbol to cow; can I see that new mapping in your brain? We're also interested in how the representation induced by the newly learned mapping differs from the representation induced by the original word form: how is reading the symbol that you've learned means "cow" different from reading the word itself? This is in collaboration with a few of my collaborators at the University of Victoria, which I just left; I moved in July. This is my student Chris Foster, who is going to forever regret putting on the EEG cap and having his picture taken, because I use it in all my talks; I'm sure he doesn't appreciate it. As well as Chad Williams, who is actually the one who collected the data, along with his supervisor. You may not have heard of the University of Victoria, so I thought I would just show you: here is Victoria. It's actually really close to Seattle, and it's actually south of the American mainland border, so it's one of the mildest places in Canada. In addition, it gets less rain than Vancouver, and it gets no snow; it very rarely gets below freezing. That's where I used to live; I lived there for three years, and now I live in Edmonton, which is not south of the mainland border and does get snow. You may wonder why I moved; if you have a question about that, we can talk during the break. But Victoria is beautiful, and my collaborators there are fantastic.
OK, so this is generally the flow of what we'll talk about today. First we'll talk about how we find language in the brain; these are the typical studies I've done up until this point, looking at native language representations in the brain. Then we'll talk about the paradigm we used to get people to learn this new mapping between symbols and their native language, we'll go through the results, and we'll wrap up at the end. At the beginning we need to talk about how computers represent meaning: what would it mean for a computer to understand a word? But first, let's talk about what it means for you to understand a word. Can you tell me a word that's similar to "orange"? Tangerine is a good one. Clementine. You're all really close to that orange point; you're orbiting the orange point in space; you're all naming citrus fruits. If I got you to go longer you might say apple, maybe banana, so things in this realm of things that are sweet and grow on trees, like citrus fruits. Nobody said "car," and nobody said "sandwich," even though you eat sandwiches. So you have an understanding of what semantics is, of what words mean, and I can ask you to sample around a particular point in that space. You have in your mind a representation of the world, and that's what we're asking you to draw from when we talk about word meaning. What we need to do is get computers to also build a similar representation. So here is a representation of the world: we have one axis that is sweetness and one axis that is grows-on-a-tree. All of our fruit is pretty sweet and all grows on trees; sandwiches don't grow on trees, but maybe they're slightly sweeter than cars, so they're a little closer to the fruit group than a car is. This is a very simple vector space model of semantics, a VSM, and this is the computer model we'll be talking about today that represents word meaning.
So in a vector space model of semantics, every word is a point, meaning every word is assigned a list of numbers. Here in the two-dimensional space the list of numbers has two elements: sandwich might be (0.3, 0), because it's a little bit sweet and doesn't grow on a tree. Every word gets a list of numbers, and that's how we tell which words are similar: similar words have similar lists of numbers. You and I know many, many words and we're able to differentiate between all of them, and that's because we have a very high-dimensional semantic space; we're able to tell the difference between many thousands of words. And we define the dimensions of language as we learn the language. My daughter knows what a horse is; if I wanted to tell her what a zebra was, I would just say it's a horse with stripes, and she would understand intuitively that there is a dimension that is stripedness: horses usually don't have stripes, and zebras always do. We learn these dimensions as we go through life, so we need to get a computer to understand these dimensions in the same sort of way. We could create a computer model that has the same dimensions as humans, but there are multiple problems with that. First, it would be tedious, because how many dimensions do you need to split up all the words in your head? Maybe an infinite number. It's also error-prone, in that we have to get somebody to write down all the dimensions, and it's subjective: if you don't own a dog, you might feel differently about dogs than I do, because I own a dog; maybe you have a fear of snakes and feel differently about snakes than other people do. Everybody's own representation of semantics is a little different, so it doesn't really make sense to ask people to write it down.
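The two-axis example above can be sketched in a few lines of code. The coordinates here are illustrative guesses for the sweetness and grows-on-a-tree axes, not values from an actual model:

```python
import numpy as np

# Toy 2-D semantic space with axes [sweetness, grows-on-a-tree].
# The coordinates are illustrative guesses, not the talk's actual values.
words = {
    "orange":    np.array([0.90, 1.0]),
    "tangerine": np.array([0.95, 1.0]),
    "banana":    np.array([0.80, 1.0]),
    "sandwich":  np.array([0.30, 0.0]),
    "car":       np.array([0.05, 0.0]),
}

def cosine(u, v):
    """Cosine similarity: close to 1 when two vectors point the same way."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest(word):
    """The other word whose vector is most similar: 'sampling around
    a point' in the semantic space, as in the audience exercise."""
    return max((w for w in words if w != word),
               key=lambda w: cosine(words[word], words[w]))
```

Asking `nearest("orange")` picks out the other citrus fruit, while car and sandwich sit far away, mirroring the audience exercise.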
It would also take too long, and I'm a computer scientist, so I have no patience; I want computers to do everything quickly. There's also no ground truth: like I said, you and I have different experiences of the world, so it wouldn't really make sense to try to write down the one true vector space model of semantics for all people. What we do instead is get a computer to learn the dimensions automatically, and it assigns each one of the words a point in that space. We're going to process a large text corpus (essentially download all of the web pages we can get our hands on) and then build a model from all those web pages together, so you can sort of think of it as the average semantic space, because many, many people wrote those web pages and we're sort of averaging together their experience. Now, instead of each one of our words having two numbers in its vector, we're going to make these vectors much longer, on the order of hundreds of dimensions, and that allows us to tell the difference between tens of thousands of English words. To create these vectors for each of the words, we're going to look through the corpus for words that are associated with a word of interest. If our word of interest is "banana," we're going to look at all of the times we see "banana" in the corpus and see what words appear nearby. As you can imagine, "banana" often appears with the verb "eat," probably less with the verb "drive"; that makes it similar to the noun "apple" but different from the noun "car."
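A minimal sketch of the nearby-word counting just described, using a toy corpus rather than the web-scale one from the talk:

```python
from collections import Counter, defaultdict

def cooccurrence_counts(tokens, window=2):
    """Count, for each word, which words appear within `window`
    tokens of it: the raw signal distributional models build on."""
    counts = defaultdict(Counter)
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[w][tokens[j]] += 1
    return counts

# Tiny illustrative corpus (not the real web-scale corpus).
corpus = "we eat the banana we eat the apple we drive the car".split()
counts = cooccurrence_counts(corpus, window=2)
```

Even in this tiny corpus, "banana" co-occurs with "eat" and never with "drive," while "car" shows the opposite pattern, which is exactly the signal being leveraged.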
And you can see how the way that we use words implies something about their meaning, and that's what we're leveraging in these computational vector space models of semantics. This has been going on for a long time: 1997 saw one of the first really popular models of this kind, latent semantic analysis, and the one I'll be using today is called skip-gram; it was released in 2013 and it's been a pretty effective model, so I'll tell you a little bit about it. Skip-gram works like this: it is a neural network model, and the task it is trained to do is to predict context words given a central word. You tell the neural network the central word is "banana," and it needs to predict with high probability the words that will occur nearby, so it's going to predict with high probability maybe "yellow," and its ability to predict those nearby words tells us something about the word's meaning. You can use any large body of text; the one we're using today was trained on Google News, a whole bunch of news stories. The hidden layer of this neural network (it has just one hidden layer) becomes the word vectors: the hidden representation that the network learns forms the space in which we place words. These models are kind of remarkable; they can do a lot of different tasks. They can approximate human judgments of word similarity: if you ask a person to judge "midday" and "noon," how similar are they, they might give it a very high score, and the distances between those words in the vector space will correlate with these human judgments of word similarity.
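A toy forward pass showing the skip-gram structure just described: one hidden layer whose input-weight rows become the word vectors. The tiny vocabulary, dimensions, and random weights here are stand-ins; a real model would be trained on something like Google News:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "banana", "is", "yellow", "car"]
V, H = len(vocab), 3          # vocabulary size, hidden dimension (toy sizes)

# The two weight matrices of the one-hidden-layer network.
# After training, each row of W_in is a word's vector.
W_in = rng.normal(size=(V, H))
W_out = rng.normal(size=(H, V))

def predict_context(word):
    """Forward pass: central word -> probability over possible context words."""
    h = W_in[vocab.index(word)]        # hidden layer = the word's vector
    scores = h @ W_out
    e = np.exp(scores - scores.max())  # softmax over the vocabulary
    return e / e.sum()

p = predict_context("banana")          # a proper distribution over vocab
```

Training nudges `W_in` and `W_out` so that actual context words (like "yellow" for "banana") get high probability; the rows of `W_in` are then kept as the word vectors.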
They can also answer TOEFL questions about synonyms: what's a synonym of "levied"? "Imposed" is the synonym, and those two words will be close together in the vector space whereas the other choices will not be. They can actually perform about as well on these sorts of tasks as students taking the test, which you might be impressed by, or not. And they can also predict plausibility for role fillers: given the verb "fire" and the noun "employer," how likely is it that the employer is the agent? Do employers do the firing? Yes, they typically do. And they can do something really neat, which is track the meaning of words over time: if we split our corpus up into decades, we can watch how the meaning of a word changes over decades. For example, the word "broadcast" used to mean to scatter seed, and now of course it means something to do with the transmission of radio. So these vectors are an interesting, surprising tool that matches human behavior pretty well. There are people who will argue that they're not perfect, and of course they're not, but they are a tool that we have. Now I'm going to talk about how these models, trained on corpus data, relate to what the brain is doing. What does the brain do when you think of a word? Does your brain's representation have anything in common with the vector space representation I just told you about? Here we're going to take our vector space model of semantics and try to figure out: does it have anything to do with EEG or MEG collected while a person reads the same word, or fMRI of a person reading the same word? It's been employed in all three of these modalities, and today we'll just talk about EEG. In order to do machine learning, we need to turn our brain imaging data into something a machine learning algorithm understands: we're going to take it and try to predict the vector for that word.
With EEG, we do the simplest thing you could possibly do, which is just to concatenate all the time series for the time period we're interested in, and that becomes a feature vector. This is the simplest thing you could do, and there's lots of room for improvement here. Another thing we do is average all the trials for people reading a particular word: when you read the word "banana," we average all of the trials where you read "banana," plus we average across all the participants. This is a little different from what people had done previously. In previous work, they would build a model for each subject, measure the accuracy for each subject, and then average those accuracies together and report the average accuracy of a subject-specific model. Here we're averaging across subjects, which might be a little surprising for people who work with brain imaging data, because of course people have different sized brains; I think it's because of the smoothness of EEG that this works. So now we have a brain data average that tells us what the brain usually looks like when you read a word like "banana." We're going to train a regression model for each one of the dimensions of our vector space model (that's what these betas represent), and we're going to use the brain imaging data multiplied by that beta vector to predict one of the dimensions. Each one of the beta vectors is independent, which means we have an independent model for every dimension of the vector space. That's another room for improvement, another simple assumption we made, but it actually ends up working really well. Because the brain data is high-dimensional, we use a regularizer, ridge regression, which roughly means we ask the model to rely only on the most important features when making its prediction.
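The per-dimension regression can be sketched with the closed-form ridge solution. The sizes and the synthetic data here are stand-ins for the real averaged EEG feature vectors and word vectors:

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: one weight column per semantic
    dimension, i.e. the independent per-dimension models from the talk."""
    n_feat = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ Y)

rng = np.random.default_rng(0)
n_words, n_feat, n_dim = 60, 40, 10      # illustrative sizes, not the real ones
X = rng.normal(size=(n_words, n_feat))   # averaged EEG features per word
B = rng.normal(size=(n_feat, n_dim))
Y = X @ B + 0.1 * rng.normal(size=(n_words, n_dim))  # synthetic "word vectors"

W = ridge_fit(X, Y, alpha=1.0)
pred = X @ W    # predicted points in the semantic space
```

Because each output column of `W` is fit separately from the others (ridge with a shared penalty decouples across outputs), this matches the talk's assumption of one independent model per vector dimension.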
So what this model does is take brain imaging data as input and produce as output a point in the high-dimensional space. If this were sweetness and grows-on-a-tree, it would predict a point in that space, and then we can tell what word it is by finding the words that are nearby. We're going to test our model on held-out data. In my example here I have seven words; I'm going to hold out the two words "moose" and "house" and try to predict what words I think they are. I take my model, give it the EEG data multiplied by the beta matrix, and predict a point in space for both this brain image and this second brain image. Then I look at them in space: here's prediction one and here's prediction two. I'm asking the model to choose between the two held-out words, telling the model: here are two brain images, the words they correspond to are "moose" and "house," you tell me which is which. So I want to know: is the assignment of prediction one to "moose" and prediction two to "house" better than the vice versa assignment, prediction one to "house" and prediction two to "moose"? Here it's really obvious: the green lines are much longer than the red lines, so we're going to choose the assignment that says prediction one is "moose" and prediction two is "house." This is the 2 vs. 2 test: essentially, we hold out two brain images and ask the computer to decide which one is which.
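A minimal sketch of a single 2 vs. 2 test as just described; the prediction and word vectors here are made up for illustration:

```python
import numpy as np

def two_vs_two(pred1, pred2, true1, true2):
    """2 vs. 2 test: is the correct assignment of the two predictions
    to the two true word vectors closer than the swapped assignment?"""
    d = lambda a, b: np.linalg.norm(a - b)
    correct = d(pred1, true1) + d(pred2, true2)
    swapped = d(pred1, true2) + d(pred2, true1)
    return correct < swapped

# Illustrative 2-D vectors: predictions near their own words pass the test.
moose = np.array([1.0, 0.0])
house = np.array([0.0, 1.0])
pred_moose = np.array([0.9, 0.2])
pred_house = np.array([0.1, 0.8])
```

With these values the correct assignment wins (the summed red distances are shorter than the summed green ones), and swapping the predictions makes the test fail, as expected.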
So 2 vs. 2 accuracy is the percent of those tests that are correct, and we run all possible pairs of words: with sixty words, instead of sixty tests there are about seventeen hundred word pairs you can test. The reason you do this is that you're using two predictions to make one assignment, so even if one of your predictions is quite bad, the other prediction might be able to compensate for it. You have better signal-to-noise, and it's easier to tell whether your model is better than chance using these 2 vs. 2 tests. Chance is about fifty percent, because you're choosing from one of two assignments; if there were no signal at all, if the brain had nothing to do with the vectors, we would see about fifty percent. So overall, this is the flow: we start with a corpus, we run it through the word2vec algorithm to get our vector space model, then we take that vector space model along with the average of all our EEG data and run them through the classification framework (our ridge regression) to predict points in space; we run the 2 vs. 2 tests and get our 2 vs. 2 accuracy. What we'll be talking about today is mostly this 2 vs. 2 accuracy; it just tells us whether there is a relationship between the word vector space and what the brain is doing. This approach has been around for a while; I think the first example was the 2008 Mitchell paper, so it's been used a lot. One thing my group has done is assemble a bunch of freely available brain imaging datasets of people reading words and make them available online, so that people who build vector space models can test against the brain. This is BrainBench: we have a couple of different imaging modalities, as well as English and Italian datasets, and abstract and concrete words, although right now it's nouns only. And it's growing; we're always adding datasets when we can. And here is Chris again with the same headset.
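Extending the single test to the full accuracy measure over all word pairs; the pair count and chance level follow the talk, while the test vectors here are synthetic:

```python
from itertools import combinations
from math import comb
import numpy as np

def two_vs_two_accuracy(preds, trues):
    """Fraction of all held-out word pairs where the correct assignment
    of predictions to true vectors beats the swapped one (chance ~50%)."""
    d = lambda a, b: np.linalg.norm(a - b)
    hits = total = 0
    for i, j in combinations(range(len(preds)), 2):
        correct = d(preds[i], trues[i]) + d(preds[j], trues[j])
        swapped = d(preds[i], trues[j]) + d(preds[j], trues[i])
        hits += correct < swapped
        total += 1
    return hits / total

# 60 words give C(60, 2) = 1770 pairs to test, as in the talk.
n_pairs = comb(60, 2)

# Perfect predictions score 1.0; random ones would hover near 0.5.
rng = np.random.default_rng(0)
trues = rng.normal(size=(20, 5))
perfect = two_vs_two_accuracy(trues, trues)
```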
My student is never going to live down having his picture taken with this headset. And here's skip-gram: it performs quite well, in this case about as well as these other two models, and because it's quite popular in computational linguistics, that's what we'll use today. It's on the order of seventy-seven percent 2 vs. 2 accuracy, and this is across all of those datasets. Any questions before we continue? [Audience question] So far today we've just been talking about single words, not words in context. There are actually pronouns and verbs as well as nouns in what we'll talk about today, so we do have multiple parts of speech. And there's some new work coming out on people reading sentences, and on how a recurrent neural network (a neural network trained to predict the next word in a sequence) has hidden representations that actually relate to the brain. So people are thinking beyond nouns, but for now we're just talking about single words. Any other questions? [Audience question] In BrainBench there are a bunch of different paradigms: sometimes participants answer a question like "is it bigger than a microwave oven?", and sometimes they're just told to visualize the item. In one case the semantic space they used was behavioral, so the vectors are actually made up of the answers to behavioral questions. But often it's just a task to make sure that they're reading and understanding the words; it can even be a one-back task. [Audience question] The question is: what if, instead of the 2 vs. 2 test, you did something more like rank accuracy? So if you had a list of sixty words, you would rank them by their distance to the predicted point. It actually performs on par with the 2 vs. 2 test; the 2 vs. 2 test is essentially rank accuracy with only two elements, so you can think of it that way.
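The rank-accuracy alternative mentioned in that answer might be sketched like this; the candidate vectors are toy values:

```python
import numpy as np

def rank_accuracy(pred, trues, target_idx):
    """Rank every candidate word by distance to the predicted point;
    score 1.0 if the correct word ranks first, 0.0 if it ranks last."""
    dists = np.linalg.norm(trues - pred, axis=1)
    rank = list(np.argsort(dists)).index(target_idx)   # 0 = closest
    return 1.0 - rank / (len(trues) - 1)

# Three candidate "word vectors"; the prediction lands near word 1.
trues = np.eye(3)
pred = np.array([0.1, 0.9, 0.0])
score = rank_accuracy(pred, trues, target_idx=1)
```

With only two candidate words this reduces to the 2 vs. 2 decision: the prediction either ranks the right word first (1.0) or it doesn't (0.0).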
But that would be another interesting thing to do, and you could also use a much bigger set of words. [Audience question] We'll talk a little about the EEG preprocessing when I get to that part, since we're just getting started, but in general, keeping with the theme for today, we do the simplest thing you can, which is just to take the time series. We're not doing any Fourier analysis; we do a little bit of ICA to take out artifacts, but it's pretty low-key. OK, so the way we're going to ask people to learn this new language is using a reinforcement learning paradigm. Twenty-four subjects come into the scanner; they're all fluent in English, with a very high score on an English fluency test, and they have to learn this new language while wearing an EEG cap. Now, at this point I need to be upfront that this "language" is actually an incredibly simple mapping of symbol to word. It's as if you're learning a new vocabulary: you're learning which symbols map to which words. We're not talking about grammar right now, not talking about syntax, none of that; we're just talking about mappings, learned via reinforcement learning, which I'll make clear in a second. The cap is an active cap with sixty-four sensors, sixty-one of which are signal sensors, and we record at five hundred hertz and downsample to two hundred fifty. Here's what it looks like in the scanner: you come in and you see a symbol that looks like this (the symbols are taken from one of two Indian languages), and then you're presented with four choices. The first time you see these four choices, you don't know the answer; you don't know which of these words the symbol corresponds to. So you choose one, say "dog," and you're probably wrong; twenty-five percent of the time you'd be right. The next time you see the symbol, maybe you choose a different word, maybe you choose "run," and you get it right. So this process of guessing and getting feedback is how you learn the language in the scanner.
It goes like this: for five hundred milliseconds you see this funny symbol, then the choices appear on the screen, and you have up to two seconds to make your choice. Once you make your choice it turns white for five hundred milliseconds, there's some inter-trial interval, and then we give you feedback about whether you were correct or not; you see the feedback for a second. We're interested in answering a few questions. The first is: is there a difference between receiving correct and incorrect feedback? Does your brain respond differently to those two feedbacks? We're also interested in whether we can see the semantics of the word you're learning: if this symbol maps to "you" and this is the correct choice, can we see the semantics of the word "you" at the point in time when you're reading the symbol? Our dataset is made of sixty words, randomly assigned to sixty unique symbols, so there's no rhyme or reason to which symbol goes with which word; we didn't choose a particularly cow-looking symbol for "cow." The sixty words are made up of fifty-four nouns, three verbs, and three pronouns, and they're presented in random order, which means that every subject sees the words in a different order. In addition, each word is presented a different random number of times for each participant: some participants see the word "cow" twice, some participants see it twelve times. We're going to go through a few different results. One of them is one of the more traditional analyses, an ERP analysis of the signal, to see if we can detect a reward positivity. We're interested here in asking: what is the difference between receiving positive and negative feedback? Does your brain do something different when you receive positive versus negative feedback? So here is a graph. On the y-axis is the voltage, which tells us roughly how much the neurons are firing, and on the x-axis is time; zero here is the onset of the feedback, the point at which they see either a check or an X. The lines look like this: if the feedback is correct you see a large deflection, and if it's incorrect you see more of a bump here, and if we take the difference of those two waves, we see a big difference between correct and incorrect feedback at about two hundred eighty milliseconds. This is a typical reward feedback response, so it's good to see it here. Another question you could ask is: as you receive more and more feedback (you got it right, "cow" is the symbol's word, and the next time you see that symbol you guess "cow"), is there a change in that difference response? You're expecting the correct feedback, you expect to get it correct, so does your brain's response look different? Again, here's the onset of the feedback, and here reward one is red, then mustard, green, blue, pink. You can see that the curve changes as you learn. This can be interpreted in more than one way: it could be a difference in reward, so it's less rewarding to get it right the second time, or it could be a difference in learning; you've already done a lot of the learning the first time you got it right. What we can do is average across all six of these peaks, find the grand average peak, and ask what the difference is between the conditions for that grand average. It looks like this: the very first time you get correct feedback you have a large response, and then it falls off with diminishing returns. As you continue to get the words correct, your brain is less excited about it.
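The difference-wave analysis described above can be sketched as follows. The single-channel trials here are synthetic, with an effect injected near 280 ms purely for illustration:

```python
import numpy as np

def reward_positivity(correct_trials, incorrect_trials, sfreq=250.0):
    """Grand-average ERPs for correct vs. incorrect feedback, their
    difference wave, and the latency (ms) of its largest deflection."""
    erp_correct = correct_trials.mean(axis=0)     # average over trials
    erp_incorrect = incorrect_trials.mean(axis=0)
    diff = erp_correct - erp_incorrect            # the difference wave
    peak = int(np.abs(diff).argmax())             # sample of largest deflection
    return diff, 1000.0 * peak / sfreq

# Synthetic single-channel trials with a bump injected at ~280 ms.
rng = np.random.default_rng(0)
n_t = 250                                 # 1 s at the talk's 250 Hz rate
t = np.arange(n_t) / 250.0
bump = np.exp(-((t - 0.28) ** 2) / (2 * 0.02 ** 2))   # peaks at 280 ms
correct = rng.normal(0, 0.05, size=(40, n_t)) + bump
incorrect = rng.normal(0, 0.05, size=(40, n_t))
diff, peak_ms = reward_positivity(correct, incorrect)
```

Averaging over trials suppresses the noise, so the recovered peak latency lands near the injected 280 ms effect, the same latency range reported for the reward positivity in the talk.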
Now, we've shown in the behavioral data that our subjects learned the language: on average they get around eighty-three percent accuracy, so they've learned the language by the end. Can we tell from your brain that you actually learned the mapping? Can we see the words the symbols map to using EEG? Here we take all the words that are presented six or more times, because we need a good signal for the actual representation of the word, and we average all of the trials beyond the second repetition, so repetitions three, four, five, and up, as many times as you saw that word. We take the EEG signal from all the sensors (again, we do nothing special here, we just put them all together), from zero to seven hundred milliseconds after the symbol onset. Just like we talked about before, we take the brain data, pass it through the model, and predict a point in the vector space. It turns out that with this setup we get almost eighty percent 2 vs. 2 accuracy, so we're able to tell what word a person is thinking of while they're doing this symbol learning task. This was surprising to me, and exciting, but there are lots of follow-up questions. When do you learn the representation? At what point can we actually detect that you've done this learning? Not on the first trial; remember, on the first trial you have no idea what that symbol means and you're guessing. So maybe by the third trial, or maybe by the sixth? The way we answered this was to average trials together, and this includes correct and incorrect trials: whether or not you got it correct, we average those trials together. The red is averaging trials one to three, then two to four, three to five, and four to six, and we're going to see, as a function of which trials we averaged together, whether we can see the word you should be mapping the symbol to.
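The trial-window averaging just described (repetitions 1-3, 2-4, 3-5, 4-6) might look like this; the trial data here is synthetic:

```python
import numpy as np

def windowed_averages(trials, width=3):
    """Average consecutive repetitions (trials 1-3, 2-4, 3-5, ...),
    the scheme used to ask *when* the mapping is learned."""
    return np.stack([trials[i:i + width].mean(axis=0)
                     for i in range(len(trials) - width + 1)])

# Six repetitions of one symbol, 4 EEG features each (synthetic values).
rng = np.random.default_rng(0)
trials = rng.normal(size=(6, 4))
avgs = windowed_averages(trials)   # four windows: 1-3, 2-4, 3-5, 4-6
```

Each averaged window then gets fed through the same decoding pipeline, so the x-axis of the resulting plot is "which repetitions went into the average."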
The graph looks like this: on the y-axis is 2 vs. 2 accuracy, and what we're seeing is that if we average together the fourth, fifth, and sixth trials, we get almost the same accuracy that we get using all of the data, which means that at this point in time we can already tell that you've learned the mapping. This is exciting: we're able to see quite early on, by trials four, five, and six, that you're already learning this mapping. Another question you can ask involves the behavioral data. We know that some of our participants don't learn as quickly as others, so their behavioral accuracy is just lower; maybe you're having a bad day, maybe you didn't get to have coffee, there are a lot of reasons why this could be. So we want to know: is there a correlation between the 2 vs. 2 accuracy we can get from the data and their behavioral accuracy? If they are behaviorally worse at this task, do they also have worse 2 vs. 2 accuracy? We had twenty-four subjects, and there was a pretty clear division between the seven worst and seven best performers, so we didn't cherry-pick this number. We took all of those with task accuracy below eighty percent, and they get a 2 vs. 2 accuracy of 59.7, so almost sixty percent; this would probably not pass a statistical significance test.
But if you take those with accuracy above eighty-five percent, we get sixty-five percent, so there is a correlation between your behavioral accuracy (how well you perform the task) and our ability to tell from your brain whether you're making the mapping. You might wonder why this sixty-five number is so much lower than the seventy-nine I told you about earlier; that's because we're only averaging together seven subjects here instead of the full twenty-four, so there's a signal-to-noise problem. But we're able to see that those who perform well on this task are actually doing a better job of creating a consistent representation in their minds. And finally: where in the brain, and when, is this new language? Again we have 2 vs. 2 accuracy on the y-axis, and on the x-axis is time. The onset of the symbol is at zero (this is the symbol they're learning the mapping for), and at five hundred milliseconds is the onset of the choices. The graph looks like this. We're using windows of fifty milliseconds and trying to predict what word they're thinking of during each window. During this early part they're seeing only the symbol; at this later point they're seeing the four words, including the word that is their preferred choice, if they're doing the task correctly. We actually get above-chance accuracy very early in time, so we're able to tell what word the symbol maps to starting at about one hundred forty milliseconds (the window spanning one hundred forty to one hundred ninety milliseconds), and then much later in time, after the onset of the choices, we see another peak. So it's as if they have a representation when they see the symbol, and then once they see the word choices they have another sort of refreshed representation of the same word, once the correct choice is on the screen. Any questions? [Audience question]
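Slicing an epoch into the fifty-millisecond decoding windows described above might look like this, assuming the talk's 250 Hz sampling rate and sixty-one signal sensors:

```python
import numpy as np

def sliding_windows(eeg, sfreq=250.0, width_ms=50.0):
    """Split a (sensors x samples) EEG epoch into consecutive 50 ms
    windows; each window gets its own decoding model in the analysis."""
    step = int(sfreq * width_ms / 1000.0)   # 50 ms at 250 Hz -> 12 samples (rounded down)
    return [eeg[:, i:i + step] for i in range(0, eeg.shape[1] - step + 1, step)]

rng = np.random.default_rng(0)
epoch = rng.normal(size=(61, 250))   # 61 sensors, 1 s at 250 Hz (synthetic)
wins = sliding_windows(epoch)
```

Training a separate model per window is what lets you say the decodable signal starts around the 140-190 ms window and peaks again after the choices appear.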
People are remarkably similar in their understanding of words, even if they have different opinions about them. I don't know if anyone has done it with EEG data, but with fMRI people have tried training on one person and testing on another, and it works for some people, though not all. We do agree about a lot of things in the world, especially nouns, the kind of nouns we're talking about: you understand what a chair is, and there's not much disagreement. [Audience question] Yes, so this is everything from trial three on, and we didn't actually select only correct trials, but it mostly contains correct trials at that point. [Audience question] We'll get to that in a second: one of the questions we're also interested in asking is where in the brain this representation shows up. I should mention the caveats here, of course: because we're using EEG, it's a very smooth representation, but that's one of the things we'll talk about. Maybe just one more question. [Audience question] Here, every point in this graph represents a new set of betas. But one good question would be: what if you took the betas you learned during that early peak and applied them everywhere else? What would that look like? That would answer the question of whether the representation of the word when you're reading the symbol is the same as the representation of the word once you see that word on the screen. [Audience question about when subjects first learned the words in English] None of that work was done with EEG, and I guess it kind of gets at the inter-subject question: could we use this across subjects, and sometimes, yes. OK, so we're interested in where in the brain this signal is showing up: at which sensors do we see this representation? Here, each of these points represents an EEG sensor, and we're using not just that sensor but all the sensors in its immediate neighborhood to train the model.
So this is from zero to four hundred milliseconds, the time they're viewing the symbol; this is mostly the time they're viewing the choices; and here's an average over the full one second. I don't know if you can see it, but a few of the points are white, and white points are significantly above chance. During the time they're reading the symbol, none of the sensors by themselves can predict what word the symbol maps to, which means this early representation is pretty weak; it's not as strong as what we see later. Here, during the time they're reading the choices, including the word they know is correct, we see a much more robust, strong representation, and individual sensors over left temporal cortex can by themselves tell us what word they're reading. And if we take the full average, you see an increase in the number of statistically significant points, meaning there's some complementary signal across the two time periods: when we put them together we get a better model. So it's not like everything before four hundred milliseconds is noise; there's some additional benefit to including it in the model.

So I'm going to do a little bit of wrap-up and talk about what we covered today and why it's interesting, because we went through a lot of results and you might wonder why you should care. We can detect the reward positivity as people learn this new vocabulary; that is, we can tell that they have a different brain response when they get a word right versus when they get it wrong. We can also see that the reward positivity diminishes over correct trials: as you go through the paradigm, your brain's response to the correct trials diminishes. This could be either because there's less learning on the subsequent trials, or because the reward is less great because you knew it was coming; it's an anticipated reward.
We can also detect the semantics of the word during this learning trial. This is important because some of the past work doing this word-vector analysis has received criticism, and one of those criticisms is: do these vectors actually have anything to do with semantics? They could just be visual features, and actually there's a correlation between frequency and word length: words that are frequent are shorter. That's just a property of language; it's not something we can get around. So something that's a little bit semantic, like word frequency, has an effect on the word form, and that means there will be fewer white pixels on the screen when you're reading words that are more frequent. So how can we tell that what we're decoding is actually a semantic signal? The work I talked about today, because we're not showing the word but arbitrary symbols, is a good piece of evidence that what we're decoding is semantics, because these symbols are arbitrarily chosen and the mapping doesn't have anything to do with the word form. We're also not coaching them to visualize the word. A lot of previous work asked participants, just beforehand, to imagine all the words, and then in the scanner they were supposed to imagine each word for three seconds; that's not very natural, not what people do when they read. Here we have a very engaging paradigm, because you're doing this learning task, and it's enough to get people to create a semantic representation that we can still pull out with this technique. We can also detect this learned representation as early as trials four, five, and six: even though we're including correct and incorrect trials, trials four through six are enough to tell us what word you were mapping to that symbol.
We also found that behavioral accuracy correlates with 2 vs 2 accuracy, so we can tell how well you're performing on the task just by looking at your 2 vs 2 accuracy. This could be useful for things like determining the difference between a correct guess and somebody who has actually learned. We also see two peaks in the 2 vs 2 accuracy, which I think is one of the more interesting results here. At the very beginning, when they're reading the symbol, we see a short peak, and then once they see that the correct word is on the screen we see a much more robust representation of the word. So the question is: is there a mapping between these two states? Is there a similarity between the representation at the time they're reading the symbol and the representation they pull up when they read the actual words? One way we could test that, and we're working on it right now, is to train a model just on that early peak, during the symbol-reading part, and test it later in time, during the time they're reading the actual word, and see if we can still predict with 2 vs 2 accuracy. This is, you may have heard of it, the temporal generalization technique, where we train and test across different time windows to see how the representation changes over time. Even with what we have, though, we can tell there's a difference between the semantic representation for the symbol and the one they pull up when they're reading the choices. First, the symbol representation appears later, by about forty milliseconds, and it's sustained for much less time: it's just a little peak and it's gone, whereas when they're reading the choices it's a much longer peak, although at a similar accuracy, which was surprising to me. And the peak that happens after you're reading the choices is earlier.
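The temporal generalization analysis mentioned here can be sketched as follows. This is a simplified toy: a plain least-squares decoder with no cross-validation, and the per-window feature matrices and scoring function are placeholders, not the real pipeline:

```python
import numpy as np

def time_generalization(windows_X, y, score_fn):
    """windows_X: list of (trials x features) matrices, one per time
    window; y: trials x dims word vectors (the same across windows).
    Train a least-squares decoder on each window and score it on every
    window, giving a train-time x test-time matrix of scores."""
    n = len(windows_X)
    scores = np.zeros((n, n))
    for i in range(n):
        # decoder trained on window i
        W, *_ = np.linalg.lstsq(windows_X[i], y, rcond=None)
        for j in range(n):
            # evaluated on window j
            scores[i, j] = score_fn(windows_X[j] @ W, y)
    return scores
```

A high off-diagonal score would suggest the symbol-reading and choice-reading representations are similar; scores confined to the diagonal would suggest they are distinct.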
We also saw that the distribution over sensor space told us that the representation during the time you're reading the symbol is more distributed and weaker: none of the sensors by themselves could tell us what the word was; we needed all of them together. That's different from when you're reading the choices, when individual sensors were enough. So the representation during symbol reading is not as robust; it's a weaker signal. In the future there are a few ways we could go with this. One I've already talked about: if we train a classifier during the time you're reading the symbol, does it perform well if we test it during the time you're reading the actual choices? Is there a similarity between those two representations? We could also extend this to other learning scenarios. It would be interesting to see what representations look like when people are learning to generalize to new stimuli: if you'd never heard of a zebra and I told you a zebra is a horse with stripes, what does your learned representation of a zebra look like? We could also look at curriculum presentation: is there a way to present trials in a particular order that helps people learn? And with that I'll stop and take questions, and you can email me if you have any afterwards. And you'll never pronounce my name wrong ever again.

Do I have any predictions? That would be interesting: we would play them the word, and then we'd either play the English word or show them the English word. That would be interesting; I haven't thought about how that would work. It's an interesting idea.
So I guess the question is: what if we encouraged people not to translate? Because what we're doing is actually getting them to translate: we're showing them the word, so we're encouraging them to do this mapping to words. It doesn't work for all of the words, but we could instead show them pictures: for the symbol for cow, a picture of a cow, and maybe that would show a different response. That would be interesting.

Yeah, so there's a whole bunch of work showing that reading is very automatic; the representation you pull up when you read a word comes up quickly and isn't something you have a lot of conscious control over. So I think that's what's happening: once they see that correct word, their brain just does it. They're also seeing three other words, but they're able to focus on the word they know is correct.

So this is a great question. The way we test significance for these models is called a permutation test. We take the assignments of which brain image corresponded to which word and we shuffle them. If we were just learning some correlate of vision, if the word vectors really had nothing to do with semantics and we could sort of hijack them to predict something visual, then when we shuffled, we could still do it. That's what we're testing against when we say it's better than chance: a model trained in the situation where the words were shuffled and the semantics no longer lined up with the actual brain images. And if you're interested, that gives fifty point zero one percent accuracy, so very, very close to fifty percent.
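The shuffling procedure described above can be sketched as a generic label-permutation test; the score function, permutation count, and add-one smoothing are my choices, not details from the talk:

```python
import numpy as np

def permutation_p_value(score_fn, X, y, n_perm=1000, seed=0):
    """Permutation test: shuffle which word vector (row of y) goes
    with which brain image (row of X), re-score, and estimate how
    often a shuffled pairing matches or beats the observed score."""
    observed = score_fn(X, y)
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = y[rng.permutation(len(y))]
        if score_fn(X, shuffled) >= observed:
            hits += 1
    # add-one smoothing so the estimated p-value is never exactly zero
    return (hits + 1) / (n_perm + 1)
```

If the decoder were only exploiting low-level visual confounds that survive shuffling, the shuffled scores would match the observed one and the p-value would be large.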
That's interesting. If instead of showing the words, what came up was a set of four symbols and you had to choose which one is right, you'd be learning an association from symbol to symbol, and I think the science question would be: can you see some of the visual features of the second symbol during the time you're reading the first symbol? That would be interesting. I'm not sure; you might be able to do that in other modalities, but EEG is very smooth.

So, because you're a speaker of an Indian language, you know these letters; if we'd only taken people who speak an Indian language... I forget the two scripts we chose. Yeah. So we both know one of them, that's good.

That's interesting: would the response be different if this were your native language? That's interesting because then you could build sort of mnemonic devices. The thing I would find hard about this task is that you're seeing these symbols, you know, a curvy thing with a stick on top, and you're trying to build these associations; but if you actually knew what the symbol meant in the language, you could build a better representation. That would be interesting.

Yeah. So I think the question is something like: if you asked people, after they see the symbol but before you show them the choices, how confident they are that they'll get it right, would that correlate with the strength of the representation when they read the symbol? I think that's a really interesting question: whether this kind of introspection correlates with how strong your semantic representation is in the EEG. And there's a bit of personality in there too, right, because it depends what kind of person you are; some people just aren't as sure about their answers.
It's a good question: why do you see that one little peak and then it goes away, even though you have to remember that word? It's a good question; I don't have a good answer for it. But you'd think that if we could detect it at one point, and it was still there, we could still detect it, maybe just not as strongly; that might be true. I'm a cautious person, and when I imagine what I would be like doing this task, it's like the person is deliberating internally, and then they see the choices.

So we train them separately. I do have some really preliminary results from training during that very first time window and testing later, and it actually doesn't look like there's a very good match, but we haven't had a chance to dig into that deeper yet.

Great. So our model is, as I said, the simplest model you could possibly train: it doesn't know anything about which sensors are close to each other, and it doesn't do any sort of correlation analysis or try to project into a lower-dimensional space; there's nothing on top of it. Of course, all of that could help here. Ridge regression does have the property that if there is a set of correlated variables, it will spread the weight across that group, but nothing beyond that.

So what would perceptual priming be in this case? I see. And I'm interested in what kind of features. Yes, like a frequency analysis; that would be interesting. There's lots to do here.

So do you mean if we ran the reinforcement learning paradigm for longer and used the data from later? Yeah, I think you're right that if we went for longer, and maybe added an additional task,
I bet it would become more robust, and that would be good, because then maybe we could train at that time and test everywhere else, and that would work.
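As a footnote to the ridge regression point above: the weight-spreading property across correlated variables is easy to demonstrate with a duplicated feature. This is a standalone illustration with made-up data, not the decoder from the talk:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam I)^-1 X'y.
    The L2 penalty is what spreads weight across correlated features."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Two identical (perfectly correlated) copies of one feature.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
X = np.column_stack([x, x])   # duplicated feature
y = 2.0 * x                   # target depends on the shared signal
w = ridge_fit(X, y, lam=1.0)
# Ridge splits the total weight evenly between the two copies,
# where ordinary least squares would be free to pick any split.
```

With a total effect of 2.0 on the shared signal, each copy ends up with a coefficient near 1.0 rather than one copy taking everything.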