[00:00:05] >> You know, we're very pleased to have Dr. Neal Kingston today, and Mary Aaron will introduce him. >> Hi, everyone. Dr. Kingston is a University Distinguished Professor in the Department of Educational Psychology at the University of Kansas. He's also the director of the Achievement and Assessment Institute, and his Ph.D. is in educational measurement from Teachers College, Columbia University. He has worked as a psychometrician and test developer and has managed educational testing programs for both general and alternate assessment, including as an executive director at the Educational Testing Service and senior vice president at Measured Progress. His research focuses on large-scale assessment, with particular emphasis on how it can better support student learning through the use of learning maps and diagnostic classification models. Dr. Kingston, thank you again for accepting our invitation. We're honored to have you here today. >> Well, I'm sorry that I couldn't be on campus with all of you, but given the weather I'm glad to be in Lawrence right now. Hopefully I'll have an opportunity to visit your campus in the future. [00:01:21] Another thing you should all know about me is that I like popular music, and I find themes that I can weave into what I'm doing. I'll mention it more in a couple of minutes, but I actually formed a rock-and-research parody band called The Skew here at the University of Kansas a number of years ago, and we've played at several national conferences. I'll show a short clip of one of our songs in a little bit. [00:01:51] But that led me to this theme for today's presentation, and I'm trying to switch slides. So, this song: hopefully you're going to actually be able to hear through my speakers when I play something. We'll see in a moment, because I want you to hear that. Can you hear the music now? Everyone's frozen, so I don't know if you can hear this, but I'm hoping you can. I'm only going to play it for a few seconds.
[00:02:28] So this was a hit song 33 years ago, and one of the things with music is that sometimes it comes back around: this song, by R.E.M., actually made it back onto the bestseller charts last March. So "It's the End of the World as We Know It" has become a current theme with the pandemic and everything else. But there's a hopeful message associated with this song, and I've applied it to my own thinking and work in measurement. So here are the lyrics that I wrote, the parody lyrics if you will; basically I repositioned the song in terms of what's happening in the world of measurement. [00:03:15] I'm not going to read the whole song to you. You can read it on your own; you saw it in the abstract. And now that you've heard the tune, you can sing along in your mind as you're reading it. If anyone actually wants to record it so I can use it in any future sessions, I'd appreciate it greatly. So if Georgia Tech wants to get a rock-and-research parody band of their own together, I'm looking forward to hearing that. But in any case, [00:03:43] one of the things that I've noticed throughout my career is that even when theory moves fast, practice moves slowly. In the world of education at large, and in measurement and psychometrics as a whole, we'll explore things with research, but the impact on the real world happens much more slowly. I view that as a major problem that we have to face: we have to pick up the pace of the impact that we're having. We're all trying to make our fields better, but if that takes decades to happen, then we're not achieving all that we could be. In any case, I wrote these lyrics, and there's actually meaning to all of them: all of them relate to what's going on in the world of educational testing, but also psychological measurement at large, and at the end I'll make some of these connections.
[00:04:43] So, a little professional history about myself. I started as a high school teacher, went to graduate school at Teachers College, Columbia University, and then went to Educational Testing Service at a time when all sorts of exciting things were going on. Item response theory was first being applied to assessments; even though it was developed in the 1950s or early 1960s, it wasn't applied until the late 1970s. And work was going on in areas such as standard setting. The folks who were at ETS included Fred Lord, Sam Messick, Bill Angoff, Paul Holland, Don Rubin, and countless others, and folks would come to visit from all over the place. That's where I got to meet Professor Susan Embretson for the first time. I listened to a talk of hers, and that got me thinking about some of the things that I thought were shortfalls in the world of measurement and that needed to be addressed. I left ETS, I came back to ETS, I was a private consultant, I think. [00:05:57] I got the feeling that it's not enough to know and understand the technical aspects of measurement; it's also important to have the opportunity to influence. So I moved in a policy direction in my career. I became an associate commissioner of education in the Kentucky Department of Education in the early days of accountability testing, when they had a much better, more robust accountability system, based on assessments but also many other things, than was later required by the federal government. They had to worsen their accountability system to fit in with federal mandates. I became a private consultant, then went to a testing company called at the time Advanced Systems in Measurement and Evaluation and later called Measured Progress, [00:06:43] and then CTB
McGraw-Hill, and finally, 14 years ago, came to the University of Kansas. So one of the things that you can get from a historical perspective of my own career is that I can't hold a job. With that in mind, I have been at the University of Kansas longer, now 14 years, than I've been anywhere previously, and I have very much enjoyed the freedom and flexibility to take some of the ideas that I've had, work with others on them, and move forward some of my thinking. Early on in my career I used to believe every test should have a purpose and every purpose should have its own test. I started in the world of admissions testing. I worked on the GRE, for all of you who have experience taking the Graduate Record Examinations test. I was the director of research and test development for that testing program, but please don't blame me. [00:07:42] I complained about a number of things that I thought weren't working quite right and eventually left that program, and whatever influence I might have had back in the early 1980s has dissipated by now. One of the problems I faced, however, in more recent years (now I'm going about 10 years back): you may think you know what the purpose of the test you have developed is, and you may have had the most wonderful of intentions with what you have done, but the users are going to take what you have done and do what makes sense to them. If what makes sense to them is not your original intent or purpose, that doesn't matter. So this is the graph of test usage for an assessment system that was intended to support formative purposes in classrooms. It was supposed to be about assessment for learning, not assessment of learning. And if you think of a test that's supposed to be
[00:08:43] helping with formative assessment, you would think that the test volumes ought to be fairly stable throughout the year: you start using it in order to provide feedback to students and information to teachers, and it's at a steady state, because you always need it. But that's not what we saw. From the beginning of the school year the volumes were very low, very low, very low, and then picked up at a certain point of the year. Well, it turns out that when they started peaking, when they jumped sky-high, was right before the state accountability assessments were given. Some people were using the test, which was supposed to have these lofty purposes of helping students learn better, as practice for the state assessment, as opposed to for any purpose associated with helping students learn, and it didn't matter how much we talked to the field; this is what was being done. Well, the first step in any 12-step program is to admit that you have a problem, and in all of the assessments that we develop we've got a lot of problems. There are a lot of different pieces that have to come together. [00:09:54] So, as I have thought about measurement: I was once asked to write a book chapter about the most important issues facing psychometrics, what we will wrestle with most in the future. It was clear to me what the three most important issues are. I will give everyone in the audience a few seconds to jot down, in your mind if not on a piece of paper, what you think those three most important issues are. But I won't keep you waiting too long: it's validity, validity, and validity. So we have to always, always think about validity. I told you about our group, The Skew, so now I will present a musical performance from the National Council on Measurement in Education conference about six or seven years ago. [00:10:59] [Recorded performance by The Skew plays.]
[Performance continues; the lyrics are not reliably transcribed.] [00:12:58] That was me on the left-hand side of the stage playing the cowbell, the only musical instrument that I am qualified to play. There is actually a YouTube channel for The Skew, and we have about eight or nine songs, of various quality of recording, from conferences where we performed. But now you know there is a musical group around the field of measurement and psychometrics, and you can relate to it as much as you wish. [00:14:08] I see from some of the comments that I've been missing that Georgia Tech has a science punk band. I am going to listen to that if anyone can post a link for them. There's a link posted, so maybe that's it. And you need a psych band? OK. Well, I do think that every program ought to have its own band, and there once was a battle of the bands at NCME and AERA with three different universities; the University of Massachusetts at Amherst has a band whose name is based on Sam Messick's name. [00:14:53] So I'm glad you have a science one, but you need a psychology one, and I'll keep working on it with you. In any case, I also recommend Newton and Shaw's book on validity, if you haven't already read it. It gives a historical perspective, which among other things will tell you how slowly things have changed in validity.
[00:15:19] And I certainly hope no one ever talks about three different kinds of validity ever again, but instead talks about crafting validity arguments as a unitary idea: whether you call it construct validity or not, nonetheless thinking about validity as evidentiary reasoning. [00:15:40] In any case, that led me to thinking about evidence-centered design, which has been around since the 1990s. I have a quote from Mislevy, Almond, and Lukas describing it as an approach that documents the conceptual support that assessment data provide for inferences or actions, and that helped move my thinking along also. [00:16:06] For those who are not familiar with evidence-centered design, there's a growing consensus that this is the right way to build assessments, regardless of the field. It started in educational assessment but has continued forward into the more psychological realms, and a team that I worked with at the University of Kansas, but also Johns Hopkins and Northern Illinois, developed a model of reading motivation using evidence-centered design. I find it a very strong approach. It changes some of the thinking during the process of test development. I'm not going to go into any detail, but there are steps: domain analysis, domain modeling. [00:16:52] And that's where the argument building really gets done: what is it that you want to see to allow you to infer what you want to infer, and what is the logical argument that connects X and Y? That's at the heart of evidence-centered design: making sure that for every item you create, for every aspect of test design, you are thinking with this evidentiary reasoning model. X is your evidence, Y is the claim that you want that evidence to support, and Z, called the warrant, is the rationale for why seeing X supports Y.
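As a minimal sketch of the X, Y, Z structure described above, each item's argument could be recorded and checked for completeness as follows. The class, field, and function names, and the two example items, are illustrative assumptions and are not part of evidence-centered design's formal vocabulary or of any tool mentioned in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidentiaryArgument:
    """One link in an assessment argument: evidence X supports claim Y because of warrant Z."""

    evidence: str  # X: what we observe the student do
    claim: str     # Y: what we want to infer about the student
    warrant: str   # Z: the rationale for why seeing X supports Y

    def is_complete(self) -> bool:
        # An item lacking any of the three parts has no documented argument behind it.
        return all(part.strip() for part in (self.evidence, self.claim, self.warrant))


def items_missing_an_argument(blueprint: dict[str, EvidentiaryArgument]) -> list[str]:
    """Return the ids of items whose argument is missing X, Y, or Z."""
    return sorted(item_id for item_id, arg in blueprint.items() if not arg.is_complete())


# Hypothetical two-item blueprint, for illustration only.
blueprint = {
    "item_1": EvidentiaryArgument(
        evidence="Student circles the character who is sad",
        claim="Student can identify the feelings of characters in a story",
        warrant="Choosing the right character requires linking story events to feelings",
    ),
    "item_2": EvidentiaryArgument(
        evidence="Student selects option B",
        claim="Student understands the passage",
        warrant="",  # no rationale recorded, so the argument is incomplete
    ),
}
print(items_missing_an_argument(blueprint))
```

The point of the check is only that an item with an empty warrant is flagged before it reaches a test form; what counts as an adequate warrant remains a judgment for the test designers.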
[00:17:40] This opens up assessment development more directly to building in other fields within psychology: the cognitive realms, the neuropsychological realms, theories of curriculum. All of those things get manifested much more clearly when one takes this kind of approach. Then you have the conceptual assessment framework, where you finally define the blueprint that's based on the model, and then you create the assessment and administer the assessment. So the world was moving along nicely. [00:18:21] There are some more examples in the PowerPoint; feel free to look at those. But I'm going to argue, as I have always argued, that even though evidence-centered design is an example of backward design, it's not quite backward enough as it has been implemented. One of the things that has often been missing, and that I still find missing far more often than present, is asking the question: what are we trying to accomplish by having this assessment at all? It becomes necessary to question everything. [00:19:02] I started my undergraduate career as a philosophy major. Descartes was famous for saying "I think, therefore I am," cogito, ergo sum. But actually, I'm told that the argument he was making started with a different phraseology, in French: "I doubt, therefore I am." It was the very act of doubting that led to the basis of his philosophy. So that's one aspect of Cartesian philosophy I still agree with: doubt everything, question everything. [00:19:45] So you can ask a question which I didn't ask back when I was working on the Graduate Record Examinations program: what is the purpose of your graduate admissions test? The operational answer to that was to predict the first-year grades of graduate students, because that was a necessary but not sufficient
[00:20:09] step on the way to success as a graduate student. But I want to question whether first-year grades are even necessary, or meaningful, or meaningful enough. So what is the purpose of the graduate program, and what are we trying to do there? Now, this argument is actually clearer, I think, for undergraduate education than graduate education, and I'm moving [00:20:34] my research focus right now to undergraduate education from K-12 education. When you think of college admissions, very often in the United States the whole idea has been to select the class that will do the best. Why? What is the social good that comes out of each and every university trying to maximize the GPA of its undergraduate students, if that's what it's trying to do? I would argue that the tests that we use, and the other aspects of selection that we use, put a lot of emphasis on first-year grades for undergraduates. An alternative that you could be thinking of is to maximize the improvement of student learning regardless of where students start. Now, I'm not saying that's what every college should do, or even what any college should do. I'm saying you should question what it is that you are trying to achieve, and for every test you have to do that. My thinking started to evolve in the world of K-12 assessment. In the world of public school K-12 assessment,
[00:21:46] really, what we're trying to do is improve student learning. So anything that we do, every aspect of the design of an assessment, has to be created with the idea of improving student learning, and everything else is of secondary value. I'm not saying you can't have a test serve accountability purposes, but if you have to struggle between designing a test to serve accountability purposes versus to serve improvement of student learning, for me (and it doesn't have to be this way for you) it is clear-cut: student learning is what we need to maximize. OK, well, that started me thinking more about what a test would look like if we were to do that, and how the design process, evidence-centered design, would need to change to take that into account. [00:22:43] Well, in a lot of research now, in a lot of grants that you write, a theory of action is required: what is going to happen, what goals are going to be achieved by what it is that you are doing. This is the theory of action that we put together at the beginning of the Dynamic Learning Maps alternate assessment [00:23:08] grant. At the time it was the largest grant at the University of Kansas. It's interesting that an education grant was the largest grant at a major research university, but that was the case, and this was our theory of action. It started with what our beliefs were, what the inputs to the assessment would be, the processes, the outputs, and then outcomes: short-term, intermediate-term, and long-term. [00:23:40] And as we think about those, over here, our long-term outcomes are not about accountability. They're about teachers increasing their understanding to help them teach better, and thinking differently. I'm not saying these are the best outcomes. I would not choose these outcomes now, although I will also admit that the choice of outcomes that I build into a grant is not necessarily 100 percent the same as where my heart wants the outcomes to be.
[00:24:15] I first want to get the grant and use that to change the world and make it a better place, to change the education of students with significant cognitive disabilities in this particular example, and so some of the long-term outcomes are that way but aren't quite as direct. I did, in fact, include a short-term outcome that to me is the most important: students have improved academic outcomes. So we're putting it out there: this test isn't just to measure what students have learned. That is different from how we have often thought about tests before; tests become an instrument that has purposes above and beyond measurement. My point in going back to my example, the chart of the volumes with formative assessment, is that it doesn't matter whether you explicitly make something like this your desired outcome: there will be outcomes that come out of this, often referred to, unfortunately, as unintended consequences, and they can be negative. If you don't build a system to have positive consequences, then the probability of negative consequences is too high. [00:25:29] We saw that in some of the public feedback and pushback associated with No Child Left Behind and Race to the Top and other accountability-based systems, with cartoons like the one that I have here: "You were educated in the public school system. What does that qualify you to do?" "Is that a multiple-choice, true/false, or fill-in-the-blank question?" Education in many American schools stops around March so that test preparation can take over, and that is a problem. [00:26:05] There are other problems associated with assessment that we have to think about, which have gone on through the history of testing, at least in the United States if not internationally. There is a trade-off.
There has been, historically, a trade-off between standardization and validity/reliability. In so many areas of statistics and measurement there's a trade-off between precision and bias. Bayesian statistics can be viewed as accepting a little bit of bias in trade for a lot of precision, and we need to think about these issues again in the systems that we develop: how much are we willing to give up? So this cartoon was based on an internet meme, which we then changed here because we couldn't find who created the original meme, so we created something that we could use without copyright infringement: "For a fair school accountability system, all students have to take the same test. Please climb that tree." So the whole idea of standardization, interestingly, started as a validity argument and eventually became a reliability argument. Many people don't recognize the roots of standardized testing within the United States. [00:27:33] Testing has a much older history in other countries, China for example, but in the United States it started with Horace Mann in 1845. He was the commissioner of education for Massachusetts and recognized that, the way the public schools were working,
the giving of a high school diploma often was based on high school students, at a time when most people didn't graduate high school, going in front of the local board of education and being orally quizzed by members of the board. A high school diploma was conferred if the board thought the answers were sufficiently good, but everyone got different questions. Horace Mann made an argument comparing the process that was being used to two people running a race, one on a dry, level track and one on a muddy hill, and argued that there needs to be standardization in the educational assessment process. In fact, he brought essay examinations to the Massachusetts schools as a way of conferring high school degrees. So it all started with this lack of fairness, this inability to make consistent inferences from the fact that someone was granted a high school diploma based on an oral examination. But perhaps the roots of standardization were forgotten, [00:29:12] and it became used primarily to enhance reliability, and selected-response testing came into place as a way of improving consistency of grading. So, Frederick J. Kelly: I found this interestingly ironic. I came to the University of Kansas and did not know this at the time, but the first operational test with selected-response items was the 1915 Kansas Silent Reading Test, developed by Frederick Kelly. [00:29:48] Frederick Kelly at the time was at what is now Emporia State; it was the Kansas Normal School then. But then he came to KU and was our third dean of the School of Education here. He published the test in 1915 and came to KU in 1916. His dissertation, in 1914 (boy, he moved fast in what he was doing), was on teachers' marks, their variability and standardization, and the whole idea of this was to increase the reliability of grades, teachers' marks meaning the grades that teachers give.
[00:30:23] The picture there is not of Kelly, because I could not find a good picture of Frederick Kelly. There is a painting of him on the wall in the building that I'm in right now, but I couldn't get to it in time to make a copy of it. It is of Kelly's dissertation advisor, E. L. Thorndike. [00:30:41] It is rumored that E. L. Thorndike came up with the multiple-choice item, and I can't find any evidence of that. I asked to search the archives at Teachers College and they wouldn't let me do so; they said they looked and couldn't find it. So either Thorndike created it but only for research purposes, or his student Frederick Kelly came up with the idea. In any case, this was a couple of years [00:31:12] before the Army Alpha, and it was other students of E. L. Thorndike who moved the Army Alpha to become a multiple-choice exam. All of this was to take out the variability that was associated with the grading part of the exams, and so the world started to change. By the way, as an interesting side comment, what was called the Silent Reading Test looked nothing like what a reading test looks like today. It consisted of no passages to be read; it was reading the questions that was the silent reading. An example question (it was not on the actual test; it was the question that helped students understand what the test was going to be like) is: "Below are given the names of four animals. Draw a line around the name of each animal that is useful on the farm: cow, tiger, rat, wolf." So that's what reading tests looked like in 1915. It was not just Kelly's; there were several others that were being developed around that same time. This was the big start of objective testing in educational testing in the United States. [00:32:28] Another thing that was pushing the reliability aspects of assessment was work that E. F. Lindquist did.
Interestingly, item analysis for achievement tests was done very differently in the early years of testing. Basically, what you would do is give a test to the grade that was supposed to study the curriculum and to the grade before that curriculum, and you'd look for items that students who were exposed to the ideas did better on than the students who were not exposed to the ideas. That was the basis of item analysis. But Lindquist changed it so that it was to look at the correlation between an item and the total score on the test. He wasn't the first to come up with this idea, but he moved this into the mainstream, and I would argue that one of the purposes this serves is to increase internal consistency measures, again stressing reliability over validity. [00:33:45] I'm going to skip my side rant because I don't want to run out of time, so consider it skipped. The primacy of reliability concerns over validity concerns began to slowly change in the 1970s, with increased thought and research on criterion-referenced interpretation, item and test bias, generalizability theory (which connected reliability and validity), the unitary theory of validity, and work going on in accessibility. So slowly but surely things were changing. [00:34:24] And that led me to the point in my career, again, when I was working on the Dynamic Learning Maps proposal: what would tests look like if we want them to support student learning and be robust against negative unintended consequences? I actually had a list of six desired characteristics; here is a subset of those.
[00:34:47] The assessments should be based on cognitive and learning theory, as opposed to just on some taxonomy of curriculum. They should model good instructional practice, a feature I call instructional relevance; the idea is that if you know that teachers are going to teach to the test, then you had better have a test that does something good with that instinct that teachers are going to follow. You need to embed the assessments in classroom instruction, and not just at the end of the year. My goal is to have instruction seamlessly joined with assessment for this kind of testing. Not all test purposes are served the same way, but for tests of learning I would argue that this is important. And there is also the idea of modeling node mastery and not focusing on scale scores, so moving to diagnostic classification models and away from scaling models. [00:35:50] That required lots of different ways of thinking about the content. It also led us to developing fine-grained learning maps. But this whole idea of basing assessment on cognitive and learning theory started with thinking about learning trajectories, also called learning progressions. Sometimes they're very linear; sometimes they are multidimensional. [00:36:19] Here's a honeycomb model, where things can connect in lots of different ways. All of that, from my perspective, was insufficient for the complexity of learning, and so our approach was learning maps. This is actually a portion of our mathematics learning map; if I remember correctly, it's about a third of the map. It's a very dynamic map. We're constantly putting time into reviewing, improving, and testing portions of the map, but there are many thousands of nodes and more thousands of connections. We initially envisioned this as a Bayesian network.
[00:37:04] We have lots of challenges in calibrating that network. Many of them, we believe, come from the fact that you can't test large chunks of the network with individual students. These are students with significant cognitive disabilities; you can't ask them more than a handful of questions at any one time before they lose focus on the activities that are going on. [00:37:32] So there are a number of heuristics that we have to put in place. I believe there are also challenges, because we have such short tests, associated with the reliability of the items that are going into the diagnostic classification decisions. We eventually recast that as a DCM, a diagnostic classification model, which, from conversations that I had with Jonathan Templin, we both believe is algebraically equivalent to a Bayesian network approach with some constraints applied to [00:38:11] the form of the probability function; essentially it is the same thing. There are a lot of good aspects of using network theory. Among other things, you ought to be able to do the parameterization locally; you don't have to parameterize the entire map at the same time. [00:38:31] But when you look at what I have just shown you, it was too much; it was too complex. In fact, when we were developing these maps initially, they were going to be inside, driving the algorithms that selected which items to administer to students, but were not going to be shared with teachers at all. When we brought in 100 teachers to validate sections of the map, they told us they wanted to take the maps home and start using them immediately. This was two or three years ahead of schedule, and we never intended for them to use the maps anyway. But we learned from the teachers, and in fact teachers have made good use of the maps, once they made us realize that no one will ever look at a whole map at a time.
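To make the node-mastery idea above concrete, here is a minimal sketch of a Bayesian update for a single node, using slip and guess parameters of the kind found in simple diagnostic classification models. It is not the Dynamic Learning Maps model: the function name, the parameter values, and the assumption that items are conditionally independent given mastery are all assumptions of this sketch.

```python
from math import prod


def mastery_posterior(prior: float, responses: list[int], slip: float, guess: float) -> float:
    """Posterior probability that one node is mastered, given scored responses (1 = correct).

    Sketch assumptions: every item measures only this node, shares the same slip and
    guess probabilities, and responses are independent given mastery status.
    """
    # Likelihood of the responses if the student HAS mastered the node:
    # a correct answer occurs unless the student slips.
    like_mastered = prod((1 - slip) if r == 1 else slip for r in responses)
    # Likelihood of the responses if the student has NOT mastered the node:
    # a correct answer occurs only by guessing.
    like_not_mastered = prod(guess if r == 1 else (1 - guess) for r in responses)
    numerator = prior * like_mastered
    return numerator / (numerator + (1 - prior) * like_not_mastered)


# Hypothetical values: with a handful of items, the evidence for mastery
# accumulates one response at a time.
for n_correct in range(4):
    print(n_correct, mastery_posterior(prior=0.5, responses=[1] * n_correct, slip=0.1, guess=0.2))
```

With no responses the function simply returns the prior, which is one way to see why very short tests leave classification decisions so dependent on item quality: each of the few items carries a large share of the evidence.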
[00:39:14] We have this mental image of our learning maps as like Google Earth: you start in outer space, and then you come in, you come in, you come in, and you see more and more detail. You're not looking at the whole world the whole time and thinking about it as one giant map; you come in closer and closer. And if, for example, this section of the map, which is about constructing understandings of text, is still too large for teachers to make use of, [00:39:43] you can come in further and deal with a small area, such as identifying the feelings of characters in a story, which includes the precursor skills necessary for a child to do that. Now remember, all the maps that I'm showing you, including this one here, are for students with significant cognitive disabilities. It's my belief, with no evidence, that the same map applies to all students, but that for other students learning goes so fast compared to students with significant cognitive disabilities that you cannot separate some of these nodes from each other, and you would need to collapse them because you would not be able to differentiate mastery of all of the nodes from each other. That's still to be tested. [00:40:30] Part of the idea about assessments modeling good instructional approaches included building engagement activities into the assessments. Soon after, or maybe at the same time, Next Generation Science Standards-based tests started to do the same thing, with an underlying phenomenon for a set of questions, so that students could become interested in the phenomenon and try to take their learning and apply it in the context of that particular phenomenon. This forms a basis with which students can reinforce recent learning, anchoring it to preexisting knowledge, skills, and understandings, and providing context that can be used for subsequent tasks.
[00:41:15] Ideas of cohesion that reinforce learning, and building accessibility into this. It's absolutely true for students with significant cognitive disabilities that there are a lot of co-morbidities going on, and so intellectual disability often goes hand in glove with physical disabilities of various kinds. Advances in medicine have led to the point where, [00:41:44] with prenatal problems that would have led to non-viable births, now the children survive and even thrive, but with multiple conditions that need to be addressed in making the learning accessible to them and making the taking of assessments accessible to them. But we've been remarkably successful with students who, it was often felt, were incapable of learning. We've been able to provide the information to teachers so teachers can see that the students are capable of learning, though often at a much slower rate than students in the general population. [00:42:30] The idea of embedding assessments in classroom instruction can be extended. Valerie Shute at Florida State University has been doing work for over a decade on stealth games, where assessment, but I will argue also instruction, is embedded in games. That includes one called Physics Playground, where students' avatars play on a playground and do activities that depend on the laws of physics and allow them to demonstrate their knowledge of the laws of physics. And so this whole idea of embedding assessments in classroom instruction is slowly but surely starting to catch on. [00:43:13] These ideas are very close to what historically has been called formative assessment or assessment for learning, and this is just a Google Ngram of the use of those 2 terms in texts, which has been going up tremendously since the late 1990s.
Even from the year 2000, over the last 2 decades, there's been an explosion of this. Now, this explosion doesn't mean that it is the prominent way people are thinking about measurement in schools, but it is becoming more and more so. [00:43:43] My general argument is that education innovation takes about 30 years, so if things started picking up in 95 or 2000, then we still have a number of decades before it becomes as prominent as it's going to be. And in considering all of this, one has to think, even within formative assessment, what are the goals of this thing I'm calling formative assessment? One of the problems with formative assessment, actually not with formative assessment but with the world in which it's embedded, is that anyone can call anything a formative assessment. It doesn't have any meaning; it is not a term that everyone has agreed upon, despite a lot of [00:44:22] attempts by organizations to do so. So I think it's important to think about the particular goals that you're trying to accomplish with this thing you're calling the formative assessment. One of the problems that we have, again related to this idea that anyone can call anything a formative assessment, is that around the year 2000, maybe 2005, as the concept of formative assessment was starting to pick up faster and faster, testing companies rebranded the assessments that they'd had in place for decades and started calling all of them formative assessments. They didn't change at all; they didn't serve any purposes any better. [00:45:01] But they're being marketed as formative assessments.
And that refers to one of the lyrics in my song, so that's why I want to make sure I make that point. But in any case, think of your goals. If you're going to design a task and you want to set expectations for students, which is a possible use for formative assessments, to allow them to have an advance organizer for the learning that they are about to do, it will look one way; you will design it, make choices, of a certain quality. If you want to focus teacher instruction on student expectations, that will have an impact on your design, and it will have an impact on when you administer this formative assessment. If you want students to be able to self-correct their misconceptions, again that has an impact: make sure that you've built those misconceptions into the system, and make sure that you administer it before completion of the instruction so that it's at the right time for them to master the material. If you want to provide teachers with feedback so they can improve the instruction they're doing, again that has implications for what it is that you're going to design. And then finally, providing teachers feedback so they can improve instruction for future students, that's important too, to me less important than some of these other ideas here, but it has implications for what you design and when you would administer the test. Taking a breath, letting you catch up for a second.
[00:46:25] Another thing that was important to us was the modeling of node mastery, not score scales. Part of this, thinking about backward design again, was starting at the end of the process: what is the information that we are going to provide so that the goals we want to achieve will be achieved? And so there are traditional kinds of tests and test score reports, and this one I think may come from the PARCC exams, maybe from Smarter Balanced, I can't remember, and they provide scores: this child performed at level 2 and earned a score of 529. So if I just told you your child had a score of 529, what does that mean? By itself there is no inherent meaning in that information. If I tell you your student, your child, has achieved a level of 2, there are multiple problems with that. One of the problems is, what does level 2 mean? And you can provide me with information to make it a little bit better and say typically a child at level 2 can do these things, and they do that over here. Here it tells you, in the bottom left, your child performed better above students performing at level 4; at level 4, students demonstrate strong comprehension of grade-level literary texts: poetry, fiction, and drama. What's grade-level literary text? What's strong comprehension?
[00:48:05] What does this mean to a parent? What does this mean even to a student? It's insufficient information. And then the second thing that I want to, well, I'll point out 2 things. One, this is a typical child in this particular category. It's not me, it's not my child, it's sort of the average. Why can't I know what my child can do? And here, in creating these levels, I categorized continuous data. It is not a good thing to categorize continuous data. There's a whole literature on this; some of the work Kris Preacher at Vanderbilt has done points out some of the problems of categorizing continuous data. I'll summarize it in saying: don't do it. [00:48:55] It lowers power, can add biases, can cause problems of all kinds. So this is the typical way that scores are reported: a number that may or may not have any meaning associated with it. Here's just some more information here. Here's an alternative approach that a node mastery approach led us to with Dynamic Learning Maps. [00:49:19] This is a within-year report that parents get several times during the year. Here we start, under the area of determining critical elements of text, with a grade-level expectation that we're aiming to get a student to, which is to answer who and what questions to demonstrate understanding of details in a text. And these are kids with significant cognitive disabilities.
[00:49:43] At the lowest level, what they can do is attend to object characteristics; then they can identify familiar people, objects, places, and events; answer who and what questions; identify details in a familiar story; answer them even if the story is not familiar. And going through them left, at every stage of the way both children and their parents not only know what they've demonstrated they can do but what's the next thing we want the child to be able to do. We did a number of focus groups of parents, and they so much preferred this kind of approach. Historically, they never knew what their child could do. And so we have lots of these associated with lots of the standards that we have. Well, I'm not going to go over how I touched upon all of the different issues [00:50:34] in the song as well as in my talk, but I have advocated. Of them all. I want to commend the number 7 for having set me up one being one of the people who set me down this path so many years ago. And one of the things that is also important to me is that none of us in the fields of assessment, measurement, psychometrics, call it what you want, i also a college on the measurement side of my own psychology, forget the content and context, and here week that has to be addressed with the psychometric approaches that we're using. With all that said, thank you very much for listening to me. [00:51:20] And I'd be happy to answer any questions. Thank you, Dr. Kingston. That actually helps a lot in my own research and also if it's a race some question. Ellen and if anyone doesn't have a question I want to ask my question. You said that you developed the [00:51:45] those nodes based on cognitive theory, right? Oh, yes. So. Writing about that reality and that friend aspects of that is the evidence I see that you have country in the main where there's no one's. Details about everything do you.
Think that other types of evidence, like response processes, are necessary for the purpose of this diagnostic assessment or not? Because the purpose is, I mean, [00:52:19] Chipman of the students and diagnosing them their problem, so you don't need to focus much on the other aspects. I think that process data is very hopeful, and we've got a long way to go to understand how to best use process data, but there's been a lot of research going on for quite some time that can help us think about these things. I also think that, with all of these tasks, the development of the DLM assessments, the Dynamic Learning Maps assessments, had a big team of people spending 4 years on it, and it was nowhere near enough time and money and people to do everything we wanted to do. So while I believe in our proposal we did talk about the use of response process, we also talked about another idea that someday I would like to investigate, which was customization. Many of the students have trouble attending to things; they have strong preferences, and we built some of that into some of the hands-on tasks that they do. I remember attending a conference on autism and being told by some of the experts there: please do not provide standardized materials to be used with some of the tasks, because I may have a student who relates to door knobs but does not relate to jelly beans, and if you want them to.
[00:53:49] Tell something, it better be door knobs, and you need to let me as a teacher know my students and allow them to do this. We had this other idea that at the beginning of the year we would gather and store information about the personal lives of students. That's a little tricky and scary for some people to begin with, but: do you have any siblings, do you have any pets, what's your favorite hobby? And then write test items in ways that use as exemplars things that the child can relate to. It's moving away from standardization in one sense but moving more toward standardization in another sense. If your idea of standardization is everyone doing the exact same thing, you're moving away from standardization, and if it's that you're moving towards choosing stimuli that are equally salient for each student, then they have to be different for each student. So that's a long-winded answer, but yes, process data is very important and we have to be doing more work with it [00:54:47] each year. If there aren't any questions, I'm going to ask another. So still how does the negative. Mind of the cognitive theory behind this research? I mean, is it just a specific model, or did you compare models? So. I won't say we compared models or even had a specific model in a certain sense rules. [00:55:23] The whole.
Understanding of cognition and the research literature around cognition was embedded in specific kinds of tasks that students were doing. So we had a couple of, still do, cognitive psychologists on our staff, one of whom focuses on, and has an area of specialization in, reading and the underlying cognitive processes associated with reading. And there are places where we built that into nodes in the model. There are also places where we didn't build it in yet, where we still have to go. There are lots of ideas around text complexity that fit in with cognitive models of various kinds that are not explicitly in our model. We have held off, again because of complexity reasons, and we've wound up considering text complexity as an orthogonal dimension of the assessment. It's not tightly interconnected with anything. We know its importance. [00:56:30] We, for example, built in the concept of familiar texts versus other texts. When young children are learning to read, first they often learn by having parents read them stories. If any of you in the audience are parents, you will know children want to hear the same story again and again and again; they will not grow tired of any story as quickly as you will grow tired of that story. But that basic idea, before they learn to read, before they are ready for any kind of decoding activities, they have to understand what a story is, how the parts of the story connect with each other. So the idea of a familiar text is part of what we did build in. But again, we've got some work still to do. There's a lifetime of work here. I said, maybe to make myself feel better when we started down this road: I will start this out, but I will not see it finished. [00:57:33] So I need all you people to finish this work. I am not really interested in a job they hate you and me and they know it and everything just an assertion. Ok, thank you very much. If there is no other question.
Carol? Well, I just have a comment. Ok, yeah. [00:57:55] I really like the game-based approach to assessment. I don't know if you're familiar with Kindle, that's been used now in Georgia's hunger development, but the kids are playing games, but they've got like mathematical components, reading components, and so on, and it seems to take care in one of a channel problem that I think can arise. You know, think about playing a game and having interactive responses versus sitting down and looking at a multiple-choice item; you can see a real motivation difference, I think, in how well they will do. I don't know if you agree with that perspective. Yeah, that's absolutely one aspect of it. The early work in games-based assessment and learning basically took traditional kinds of drill-type problems, and you get a point every time you've got and they have to cripple what would pop want to know; if you say the number that added to 4 equals 11, you get a point. That to me is a little bit too simplistic for kids. I fear most kids won't play that game a long time, or play it once for a long time. So we've got a lot more sophisticated with the games and [00:59:28] I think it's good, yes, and it's important for the reasons. Ok, thank you. I think that's set for now and we'll be back again. A graduate meeting with here and also a question meeting with you. So thanks, thank you again for the talk. Bye, everyone.