Actually.

The director of the times.

Thought.

This is all.

For.

Your.

I'm right.

For you that White.

House is popular.

I.

Think so so.

OK, so can everyone hear me? OK. So, just for the accent: it's Australian, but
I've been working on it for quite a bit of time.
As was just said, my name's Patrick. I just started at STATS; I'm Director of
Data Science.

So at STATS we have all this cool data, and today I'm going to show you the
types of stuff that we can do with tracking data. As Director of Data Science,
my aim is to maximize the value of the tracking data.

Before I talk about that, I just want to give you an insight into my
background, because a lot of the stuff that I've done with faces I've
basically ported across to multi-agent tracking data.

As I said before, I'm Australian. I did my undergraduate degree there, and my
thesis was on face recognition. Then after that I did my PhD in audio-visual
speech recognition. As part of this, I spent time at IBM T.J. Watson in New
York, where we wanted to recognize visual speech regardless of head position.

And so after that I took a post-doc at CMU, at the Robotics Institute. I told
my mom I was only going away for six months, but I've been away for seven
years now.

We wanted to basically get an objective measure of human behavior. The idea
here is that there are a lot of subjective things in the medical field: pain,
depression, and facial paralysis. The idea was to see if we could use computer
vision and machine learning to come up with objective measures, OK. As part of
this I worked with Iain Matthews, whom people who have done anything with
faces will know.
He basically did the facial retargeting for Avatar. After he left, he went to
Disney Research and found, well, they had these sports projects, and he didn't
like sport. I love sport, but he didn't like sport; essentially that's how I
got the job. Iain had started this project before, and we continued on with
it. So Disney owns ESPN, OK.

Eighty percent of E.S.P.N.
twenty percent is owned by new Biscoe So

when you get to reading.

Essentially the idea here is that ESPN had a lot of rights to content, OK.
There's so much content there, but it costs a lot of money to generate that
content. So for maybe high school, or other sports such as volleyball and
soccer, they're not going to bring a big production truck out there. So the
motivation here was: if we could build a smart venue where we could track
players in real time, could we move a robotic camera to generate content
automatically? Essentially it's really an artificial intelligence problem: can
we emulate what a human operator does, and then can we do automatic analysis?
So for the past five years I've done a lot of stuff with tracking data, and
basically this is what I'm going to talk about today.

In the NBA, the SportVU system is used in thirty arenas, and essentially what
we get now are these spreadsheets, OK. I'm a stats guy, but I hate the
spreadsheet: it lacks context, but we have a lot of it. You can imagine giving
this to a coach or an analyst, and they're just going to throw it in the bin.
But with the tracking data we have this fine-grained data. It doesn't just
mean we have more of it; it means that we can model specific interactions in
context. So that's basically what we were interested in doing at STATS.
It's a standard paradigm here: essentially we want a human to interact with a
computer. A computer can see every game that's been played. A human can't do
that; there are games happening simultaneously. But a computer is not really
intelligent when it's viewing the game. So what we want to do is combine both
together and create tools to help a human do their job better.

A very simple example I have here is in terms of search. This guy's Merril
Hoge. He's a former Pittsburgh Steelers running back; he's an ESPN NFL
analyst. Back in December 2011 (there's really nothing to do in Pittsburgh
other than watch football) I was watching NFL Matchup one morning, and he was
analyzing the Houston Texans' running game with Arian Foster. He was really
proud of this: he had one play, and he said, "I watched twenty hours of video
to find a similar play."
This guy gets paid a lot of money. He was really proud of this; it's like a
badge of honor. And I said, this is stupid, this is really stupid; we should
be doing this better.

And it begs the question: how do we actually find plays in sports? How do we
search in sports? So say we have this play here: we have Steve Nash, a very
rare occurrence, in a Lakers uniform, hitting a shot. How do we actually
search for that? We can have a YouTube-style interface: "three-point shot".
That's one very, very coarse description, and we have thousands of plays like
that. I'm sorry about that.

So this is just as an exercise, OK. We have six hundred games in this
database, and this brings up twenty thousand plays.

Now that we have the tracking data we can be more specific, OK. We can have
"dribble, two passes, behind the back", in this specific location. But the
problem with this is that there's no ranking: we don't know which plays are
most similar, so we just have to get the human to go through that list. And
also, a picture tells a thousand words. We have ten words there, but there's
so much other information going on in terms of fine-grained motion. So this
begs the question: is the language we're currently using for sport the
correct one? We don't think so.

We can understand this; we understand trajectories. One team is going right
to left, the other is going left to right; most people understand that. If
you're a coach or an analyst, you go to the whiteboard and you draw that. So
why not use that as the input query?
So that's basically what we've done, OK. We replaced using words with just a
chalkboard.

And so this is the idea behind it, OK. Here we have an interactive interface
and a retrieval panel. You get the user to select the game. This could be a
video, since the tracking data is linked up with the video, and you just get
the user to select which game. Then you get the user to set how long the play
is, you know, one to five seconds.
Then you get the user to basically scrub through it, just as you would with a
video, OK. And once something interesting happens, the user can say, "Stop, I
want to find everything like that", press the button for retrieval, and
instead of twenty hours we can do this in a second.

But what's really cool about this is that not all players are important, OK.
We let the user say, "I want to find plays with just these players: the ball
and these two offensive players", run the search, and then you can do that.
And this is the chalkboard: instead of scrubbing through, we could just get a
user to draw the play. I presented this at the start of the month.
If any of you are interested in learning the details about this, or in the
video, you can go to my website, or go to the Disney Research website,
because that was a collaboration between Disney Research and STATS.

But I think we can do much better; this is just scratching the surface of the
cool stuff we can do, OK. Imagine if we could do analytics on the play: now
that we can do search, let's do analytics on the play. So say we have a play
of interest. What if I could click on a player and get the probability of
them scoring in that situation? Or, even better, move them around and see how
that percentage changes? Or put a good defender there, move him around, and
see how that is affected? Or say I switch a player: is that percentage going
to change? These are the types of things that we can do. People can do that in
a video game, right? Can we do that in real sports? That's the goal. And so
essentially here's the engine, OK.

This is the retrieval engine. Most of the magic happens in the pre-processing
step, and basically the big thing is aligning the data.
With respect to computer vision and faces, what's the thing that you have to
do when doing any type of face vision? You have to do good registration:
registration, registration, registration. And this is a similar thing. Then
once we do that, we can do prediction on top.

So that's basically what I'm going to talk about today. First I'm going to
talk about the importance of aligning the tracking data, and then I'm going to
talk about some of the cool stuff we've done for basketball, soccer, and also
tennis. OK, alignment: what do I mean by alignment? I'm a Liverpool fan.
Liverpool were really, really good two years ago, OK. That's when Suárez was
there; I can't believe he went to Barcelona, because that was stupid. So
let's just say we have Liverpool, and we have the starting eleven here. How
do we actually represent this? A feature representation that we could have is
just the spatial location of the ball and the players, in order. Let's just
say Suárez passes it to Sturridge. Sterling's there as well, but I won't talk
about him.

So ten minutes later they do exactly the same thing. How do we measure
similarity? What we can do is take the two feature representations, take the
difference, and get the norm of that. If it's close to zero, it's pretty
similar, OK. Now what's the problem here?

Players tend to switch positions.
Suárez and Sturridge are quite mobile, but a computer doesn't really know
that.
So when we have this switch, we introduce noise. If you take the mean of
that, that's basically twenty metres. And if you look at all the permutations
for only the outfield players, that's ten factorial: there are about three
and a half million permutations here.
So that's a big problem.
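To make the problem concrete, here is a minimal sketch of the identity-ordered comparison described above. The coordinates and the function name `identity_distance` are made up for illustration; only the idea (norm of the difference between two stacked position vectors) comes from the talk.

```python
import math

# Hypothetical (x, y) positions in metres for three players, listed in identity order.
frame_a = [(10.0, 20.0), (30.0, 40.0), (50.0, 20.0)]
# The same shape ten minutes later, but players 0 and 2 have swapped places.
frame_b = [(50.0, 20.0), (30.0, 40.0), (10.0, 20.0)]

def identity_distance(f1, f2):
    """Euclidean norm of the difference between two identity-ordered frames."""
    return math.sqrt(sum((x1 - x2) ** 2 + (y1 - y2) ** 2
                         for (x1, y1), (x2, y2) in zip(f1, f2)))

same_order = identity_distance(frame_a, frame_a)  # identical frames give zero
swapped = identity_distance(frame_a, frame_b)     # same formation, large distance
orderings = math.factorial(10)                    # orderings of ten outfield players
```

The swapped frame occupies exactly the same spots on the pitch, yet its distance is far from zero, and trying every ordering of ten outfield players means 3,628,800 comparisons per frame pair.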

How about we use the language that we use in sport? It doesn't really matter
who they are; it matters where they are relative to everyone else, OK. So we
have Suárez on the right wing and Sturridge on the left wing. Instead of
worrying about who they are, we say, well, which position were they in? And
once we enforce that at every frame, we're basically normalizing for
permutation, OK.
And so this allows for large-scale analysis. The initial idea we had for this
was in 2013. My previous boss, Iain, basically had this idea. It took me six
months to actually realize what he was talking about, but it basically makes
all this work.

And so how do we actually do it? It's actually learning a permutation matrix
at each frame, OK. So we have an identity representation, and then we have a
role representation, or formation representation. To actually get that, we
have to learn, or get, the permutation matrix at that frame. How we do that is
that we get the player detections, and then we have a template, and we align
to the template (I'll talk about how we actually get that template). Once we
do that, we get the cost matrix, use the Hungarian algorithm, and this yields
the permutation matrix. OK, so how do we actually get this formation template?

The idea here, as I said before, is that players switch positions all the
time. Here we have a half for one team, and here are the trajectories across
the whole half: they sit on top of each other. But in terms of structure, we
actually want to get the formation. So how could we actually do this? One way
is that you could go through and hand-label each frame for each role.
I tried it; it takes a long time. If you want to do three hundred and eighty
games in a season, that's really tough. But what's really nice is that we can
actually do this in an unsupervised way. Basically the way that we did this
is a form of k-means; we just run the algorithm. The idea is that we take the
identity representation as an initial guess, and then we iterate on that to
find the formation based on role, not identity, and what we get is these
formations that pop out, OK. I'm pretty proud of this. The big thing in soccer
(I'll talk about this later) is that statistics in soccer suck; they're just
terrible in terms of storytelling. The big thing is the strategic element,
formation and whatnot, and we can actually get this directly from data, OK.
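A rough sketch of that unsupervised loop, under loud assumptions: the frames are invented, the helper names are hypothetical, and the update rule (assign every frame to the current template, then re-average each role) is a plain k-means-style iteration rather than the speaker's exact method. The brute-force assignment again stands in for an efficient solver.

```python
import itertools
import math

def assign(frame, template):
    """Reorder one frame so that entry r is the player assigned to role r."""
    best = min(itertools.permutations(frame),
               key=lambda perm: sum(math.dist(p, t) for p, t in zip(perm, template)))
    return list(best)

def mean_positions(frames):
    """Average (x, y) of each slot across all frames."""
    n = len(frames)
    return [(sum(f[i][0] for f in frames) / n, sum(f[i][1] for f in frames) / n)
            for i in range(len(frames[0]))]

def learn_template(frames, iterations=10):
    template = mean_positions(frames)  # initial guess: identity ordering
    aligned = frames
    for _ in range(iterations):
        aligned = [assign(f, template) for f in frames]  # assignment step
        new_template = mean_positions(aligned)           # update step
        if new_template == template:                     # converged
            break
        template = new_template
    return template, aligned

# Made-up half: players 0 and 1 swap flanks in the last frame.
frames = [
    [(20, 10), (20, 50), (40, 30)],
    [(20, 10), (20, 50), (40, 30)],
    [(20, 10), (20, 50), (40, 30)],
    [(20, 50), (20, 10), (40, 30)],
]
template, aligned = learn_template(frames)
```

Averaging by identity smears the two flank players towards the middle; after one pass of assignment and re-averaging, the template snaps to the three distinct roles and the swapped frame is reordered to match.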

And so here are some visualizations of the formations that pop out, just from
running the algorithm. There are no humans in the loop here; this is
converging automatically. So you're getting 4-4-2s, 4-1-4-1s, all these types
of nice things. And what's really nice is that we get this interaction: we can
visualize a game. Instead of watching a game for forty-five minutes, we can
see the interactions. Here we've used a window of five minutes, and here we
have the covariance based on role, and we can basically see the structure.
This is really cool: we see how teams interact, and we can get this
understanding of the flow.

And so what we did is we said, well, what we get from this are the
formations, so how does this compare to an expert? We got an expert to label
each half: we got the expert to say which formation this team played in that
half. And basically we're getting about eighty-five percent accuracy. Now
there's a lot of ambiguity in labeling, so we got a couple of experts to do
this. They agreed around eighty-five to ninety percent of the time, so we're
reaching that agreement, OK. Now this allows us to do really cool things. As
I said before, statistics in soccer are really, really bad. If you go to ESPN,
or if you go to the BBC, they'll have the average formation.
This is taken from Manchester City. They have the two centre-backs on top of
each other. They use the event data, and they only use when a player touched
the ball, so it just shows you that we have partial information. This isn't a
really good description of what's happening: the two centre-backs are on top
of each other, and we have James Milner on top of Agüero. James Milner is a
winger, but he switched from left wing to right wing, so of course he's going
to be right in the middle. But what we can do now is we have another
contextual feature: we know which position each player was in when they did
that action. And the rate of swaps is also really important. We've done stuff
with basketball, analyzing how teams get open shots: they swap a lot to get
open shots. But this is another contextual feature to drill down into how
teams attack and defend, OK. So this was taken from

a couple of years ago. In all professional sports, the home advantage exists.
Do most people here know about the home advantage? Basically, in all
professional sports the home team is more likely to win than the away team.
It's exacerbated in soccer because soccer is low scoring. There's a really
good book called Scorecasting, and basically they say it's all up to the
referees.
This makes sense, because referees are influenced by conformity and this type
of stuff.
But I've played a lot of soccer, and you hear this thing, you know, "win at
home and draw away games", so people are conservative. We actually wanted to
see if that translated strategically.
And what we found: we analyzed all the teams. Unfortunately we can't disclose
the identity of the league; it's in Europe.

So we found that the home formation of a team is their away formation. We
found that teams are really rigid in the way that they play. We have Team 2
here that actually went from a back four to a back three, but in terms of
formation they're pretty rigid, OK. However, in terms of overall position,
home teams tend to play further up the pitch. Now it's hard to prove
causality, so we had a paper on this and tested it with ANOVA. Basically what
we found is that home teams had more shots on goal, more goals, and more
shots, but had the same passing and the same shooting efficiency. What
actually happens is that home teams have more possession in the forward
third. With more possession in the forward third you're going to have more
shots, but you're also going to get more fouls and this type of stuff.
So that's one cool thing that we can do with it. Not only does it allow us to
do really quick retrieval, it also allows us to do pretty interesting
analysis.

Now let's go to basketball. This is a really nice piece of work that we did
with Yisong Yue. He was with Disney Research; now he's faculty at Caltech.
Essentially what we wanted to do here is, at each frame, to know the
probability of a player shooting, passing, or holding the ball, but we wanted
to do it personalized, OK. So here we have Tim Duncan; he's just posted up
now.
Given that initial frame, he's not very likely to pass or to shoot; he's back
to the basket. But as that game unfolds, we see how these probabilities
change. Essentially what we wanted to do is get a user interface where we
could move players around and see how these probabilities change, OK. I'll
show that at the end.
And so we have tracking data from STATS: really, really nice data. On that
point, we're hopefully going to release some of this data to the academic
community, potentially in a Kaggle competition, which I'm really excited
about. You'll hear more from us about that.

But essentially what we wanted to do is predict six states: whether the ball
handler will shoot, whether the ball handler will pass to one of his four
teammates, or whether the ball handler will hold the ball. But we wanted to
make it personalized.
And we wanted to capture the interaction with all the different other
players, OK, which is really, really tough, because we're never going to have
enough data. Having a situation where you have five players on one team and a
different five players on the other team, and how they interact in different
contexts, is really tough. There are just too many permutations. But what we
actually did here is use a latent factor model, similar to what you use for a
recommendation system, which is a form of collaborative filtering.

So what we wanted to do here: we have this big sparse matrix, and we've
broken it into these factors. The first factor that we wanted to see is
whether we can come up with a signature map of each player in terms of
shooting, OK. Now if we break the court into a series of cells, we get this
big sparse matrix, and essentially what we wanted to do is find these factors.
We used non-negative matrix factorization to do this, and we found we could
describe the shooting based on these ten factors. What's really nice is that
we can describe each player as a linear combination of these ten factors, and
then do some analysis. So this is one player back in 2012-13; he tended to
shoot the corner shot. Carmelo Anthony:
he does what he wants.

Tim Duncan: kind of post-up stuff. This is really nice, because we have this
intuition of what's going on, and it's really nice to get a model to back that
up. But it's based on prediction: if we do the best prediction, these are the
models that we generate. And not only can we do it for shooting, we can do it
in terms of passes: where a player is more likely to get the passes. So here,
with Tony Parker, he tends to cut a lot, OK; we pick that up. A big tends to
post up on the right here.
LeBron James gets the ball wherever he wants.
OK, all these kinds of nice things that we pick up.
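The factorization mentioned above (players by court cells, broken into a small number of non-negative factors) can be sketched with the classic multiplicative-update rules for non-negative matrix factorization. Everything specific here is an assumption for illustration: the 4x4 shot-count matrix, the rank of 2 (the talk uses ten factors on a full court grid), and the function names. It is a toy version, not the model from the talk.

```python
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Sum of squared differences between V and its reconstruction W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (players x cells) into non-negative W (players x rank), H (rank x cells)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(len(h[0]))]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(len(w))]
    return w, h

# Made-up shot counts: rows are players, columns are court cells.
shots = [[9, 8, 0, 0],
         [8, 9, 1, 0],
         [0, 1, 7, 9],
         [0, 0, 8, 7]]
w, h = nmf(shots, rank=2)
error = frobenius_error(shots, w, h)
```

Each row of `h` is a spatial factor over the cells, and each row of `w` gives one player's non-negative weights on those factors, which is the "linear combination of factors" description used in the talk.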

Also we can do this generally: given where the ball is, do we know the flow of
the pass? Here we have X, where the ball is, and where it is most likely to
go: it tends to go to the post or to the top there. And equally, given where
you receive the ball, where is the ball most likely to have come from? So
we've factored these things out and put them into a model, and we can do
these things. I have a slide next which showcases the stuff we can do. Here
we just say: given a player has the ball in a certain location, what's the
probability that he'll shoot, pass, or hold it? If you're in the corner, you
tend to catch and shoot. If you're posting up, you'll have it for a second or
two and then you'll shoot.
Up at the top of the court here, you tend to hold it for much longer.

But what's really nice now is that we have this pre-planning tool, OK. You
can go to Yisong's website (Yisong Yue at Caltech).
You can have two teams: this is the Spurs versus the Lakers, and you can move
players and see how their probabilities change. So here we have Tim Duncan.
This corresponds to the passes to the four teammates.
This corresponds to the shooting probability. When you move closer in, you
can see how these change, which I think is pretty cool. So you can imagine a
coach or an analyst, before the game's being played, can interact with these
things and see how players behave.

So I get really excited about this stuff; it's pretty cool.
Now, soccer. I love soccer; soccer is really cool. Are there any Brazilians
here?
Anyone?
Are there any Germans here?
All right, OK. So I'm going to analyze Brazil versus Germany, OK. Now, if you
look at the statistics for Germany and Brazil, what's really interesting...
and this is a case against buffering, you know, watching sport while
buffering. I was at a machine learning event and I was texting my brother,
and it was like Germany up two or three nil, and I said, "I can't believe
it's three-nil", and he said, "No, it's five now." So I was off by two
minutes, because two hundred people there were streaming the game.

But anyway, here we have Brazil versus Germany. Germany won 7-1. If you look
at the stats, Brazil had more possession and had more shots, OK. I remember
watching this game; I don't remember that. That's kind of crazy. So, having
no life, what I did is I went through every shot. Here we have Oscar in the
first half. Now do you think this is a good quality chance or a poor quality
chance?
Good?
I said it was poor, because he had two defenders, OK. Good location, but two
defenders. However, this is subjective, really very subjective. We want to
come up with an objective measure; that's just my opinion. In the second half
we have a player pretty free; that's a pretty good chance, right?

Of the eighteen chances for Brazil, a lot of them were down at the lower end.
Now let's look at the Germans. This is the first goal.
Crazy: David Luiz leaves Müller free. That's a pretty good chance, in my
opinion.
Now this is what you see in under-sevens.
This is the third or fourth goal, I think. They'd just kicked off.
OK.
I went through all fourteen shots.
Germany had better chances. So the whole goal of this work was: can we
actually get an idea of which team had the better chances? How often do you
see coaches or analysts say, "Well, we had the better chances"? Why can't we
measure that? We should be able to measure that. But this is very subjective.
If we do learning, we can do this in a more objective way, and that's
basically what we wanted to do here. It's kind of similar to basketball, where
you have expected point value; this is what we call expected goal value.

For the season that we analyzed, from a top-tier European league, we analyzed
ten thousand shots, and we looked at the ten-second window before a shot, OK:
pressure, tempo of play, and so on.
We can do something much cooler, but this is just the baseline. You could
treat this as just a supervised learning problem, OK. We have the tracking
data, we extract some features, and then we can throw in logistic regression.
You don't really want to throw in logistic regression, because this is
nonlinear behavior, but this is the initial step.
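As a sketch of that baseline (features in, logistic regression out), here is a one-feature version fitted by plain gradient descent. The shots, the single distance feature, the learning rate, and the function names are all made up for illustration; the talk's features come from the ten-second tracking window and are far richer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.01, epochs=5000):
    """Fit p(goal) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up shots: distance to goal in metres, and whether the shot scored.
distances = [5, 6, 8, 10, 12, 15, 18, 20, 25, 30]
goals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_logistic(distances, goals)
close_chance = sigmoid(w * 6 + b)   # estimated scoring probability from 6 m
far_chance = sigmoid(w * 28 + b)    # estimated scoring probability from 28 m
```

The output is a continuous value between zero and one for each shot, which is the quantity the rest of the talk calls expected goal value. A linear model like this cannot capture the nonlinear behavior the speaker mentions, which is why it is only the starting point.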

Shots: for every ten shots a team has, on average that team will score one
goal, OK, but it varies.
I was talking to John this morning about this. Again, I have no life: I went
through ten thousand shots and I segmented them based on these six categories.

We have open play, based on having possession in the forward third, and
counter-attack, which is obviously a very, very quick transition. What we
found here is that eight point three percent of the time that you get a shot
in open play, you end up with a goal, OK. On the counter-attack it's nearly
double that. That kind of makes sense: it's a function of space, one-on-ones,
or

this kind of stuff. Now, corners: about nine percent of the shots you get
from a corner end in a goal. But this is misleading: how many shots do you
actually see from a corner, OK? Effectively it's about two percent. What
you're actually finding now is that teams are using the defensive corner as
an attacking weapon,

OK, because you're more likely to score on the counter-attack than actually
from the corner. A lot of these nice things pop up. Penalties: in the season
that we analyzed, seventy-one percent were converted. A free kick is basically
a direct shot on goal after a foul. And set pieces, which I define as a cross
into the box after a foul: that's around ten percent. However, one factor
that's a very big thing is proximity.

If we just look at the heat map of the shots, obviously if you're closer to
the goal you're more likely to score, OK. So that's a very important feature.
Then defender proximity: this is the one thing you only get with tracking
data; you don't get it from event data. It's a ridiculous conversation that
you have about the value of tracking data; of course you need tracking data
to do these types of things. We've just encoded this as a point-in-polygon
test, which was much more complicated than I thought, actually: finding
whether a point is in a polygon.
Maybe I'm slow, but it took me a while. But, you know, we can encode this type
of behavior.
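One common way to test whether a point lies in a polygon is ray casting, sketched below. The talk does not say which method was used, and the "shooting triangle" (shooter plus two goalposts) and the defender positions are invented for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray casting: cast a horizontal ray from the point and count edge crossings."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside

# Hypothetical triangle between the shooter and the two goalposts (metres).
shooting_triangle = [(90.0, 34.0), (105.0, 30.0), (105.0, 38.0)]
# Hypothetical defender positions at the moment of the shot.
defenders = [(98.0, 34.0), (98.0, 20.0), (100.0, 35.0)]

defenders_in_lane = sum(point_in_polygon(d, shooting_triangle) for d in defenders)
```

A count like `defenders_in_lane` is one simple way to turn defender proximity into a feature that a shot model can use.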

And it is predictive of actually getting a goal or not. Then you have actions.
For all these soccer plots, the attacking team is red, going left to right;
the defending team is blue, going right to left; and green are the actions,
OK. What I mean by actions is passes, dribbling,
and crosses. We also encode the goalkeeper location, general team flow or
motion, all these types of cool things.
And what's nice is that this is basically like the latent factors: if we can
segment the shots into these different classes, we learn a classifier for
each one of them.
OK, and then we can do better.

I think I have that slide mixed in; so this is it.
You know, people talk about statistical significance and all these things,
but what you really need to do is check the error plots. What we wanted to do:
we have all the shots, and we know whether each is a goal or not, zero or one.
What we want to predict is a continuous value between zero and one. A way
that we can check the error is just using RMSE. So just say we set the
likelihood of scoring a goal at ten percent for every shot.
Most of the time the error is small, around ten percent, but when they do
score we have this peak at ninety percent.
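The arithmetic behind that error plot can be worked through on a toy set of ten shots with one goal. The shot outcomes and the second, sharper set of predictions are invented for illustration.

```python
import math

def rmse(predictions, outcomes):
    """Root-mean-square error between predicted probabilities and 0/1 outcomes."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predictions, outcomes))
                     / len(outcomes))

outcomes = [1] + [0] * 9   # one goal in ten shots (illustrative)
constant = [0.1] * 10      # predict ten percent for every shot

# Per-shot absolute errors: about 0.9 for the goal, about 0.1 for each miss.
per_shot_errors = [abs(p - o) for p, o in zip(constant, outcomes)]
baseline = rmse(constant, outcomes)

# A hypothetical context-aware model that rates the goal-scoring chance highly.
sharper = [0.6] + [0.05] * 9
improved = rmse(sharper, outcomes)
```

With the constant ten percent guess, nine errors sit near ten percent and one sits at ninety percent, which gives an RMSE of 0.3. A model that separates good chances from poor ones moves both groups of errors towards zero and lowers the RMSE.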

Now when we break it into specific contexts, when we get those mean values,
you know, we can get into these modes. But when we use all these features, we
can see that we're basically pushing the error towards the right, and we're
getting this nice result there. We're basically saying we're decoupling all
this behavior, and it's a good indication that we have enough clusters that
we're accounting for. But you also see some blips here at ninety percent or
higher, and that's really nice too, because then we can start to quantify
luck.

Having a number behind this is really, really good, OK. But what I did
initially is I just went through and handcrafted these things. Once we align
the data, we can start clustering and finding these things unsupervised. Here
we have the starting positions of the players ten seconds before a shot, and
each color refers to a position. Now, if you align it, we get the structure.
Then it allows us to cluster, and then we do this unsupervised. It's a very,
very simple idea, but it basically makes all my work over the last five years
relevant.

Now we can actually go through examples. Here we have a counter-attack down
the left-hand side, and then there was a cross to the back stick, and the
model suggested it was around seventy percent likelihood of scoring, OK. Here
we had a counter-attack on the right-hand side, a cross to the back stick:
about fifty-three percent. So this is just a gut check that the model's doing
something reasonable, OK. Here we had a free kick on top of the box; the
goalkeeper parried it and an attacker was on the back stick: still around
fifty percent. Then we had examples of taking shots from mid-range, a long way
out: about four percent, OK. Here we haven't specified the player identity.
This is kind of analogous to what happens in baseball and other fields where
you have wins above replacement: we have a general model of what the average
player in that league would do in that situation. That allows us to do some
nice illustrations, OK. So here we have a team that won the league that year.
They scored seventy-one goals, but the model suggested they should only have
scored fifty-seven. Why do you think that's the case?
The best players, right. We have another team here:
they only scored twenty-six, but they should've scored thirty-six.
They haven't got as good players. Now we can look at defense:
keepers. We have Team 2 here. They conceded sixty-four, but our model
suggested they should only have conceded forty-nine.
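The season comparisons above come down to summing per-shot probabilities and comparing the total with the goals actually scored. A minimal sketch: the per-shot values in `match_shots` are made up, while the season totals are the ones quoted in the talk.

```python
def expected_goals(shot_probabilities):
    """Expected goals for a set of shots: the sum of per-shot scoring probabilities."""
    return sum(shot_probabilities)

# Made-up per-shot probabilities for one team in one match.
match_shots = [0.70, 0.04, 0.10, 0.53, 0.04]
match_xg = expected_goals(match_shots)

# Season totals quoted in the talk: goals actually scored minus goals expected
# from an average player taking the same chances.
league_winner_diff = 71 - 57     # positive: finished better than average
struggling_team_diff = 26 - 36   # negative: finished worse than average
```

Because the model ignores player identity, a positive difference points to better-than-average finishing (or luck), and a negative one to the opposite.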

This is also nice in terms of storytelling.
Now we have Team N versus Team S. It was a derby match. Team N had more
possession and more shots.
But in terms of storytelling, Team S had better chances, OK. So when you're
writing up a story, you can say, well, they had the better chances and were
more likely to win. Here we have another game:
a draw. But now we can shed light on this: this team dominated and should
have scored two goals; this team didn't have any chances. So that's a nice
storyline, OK. You can say, well, the keeper had a great game, or the
attackers were wasteful, or that was just unlucky.

And so again, I apologize for the resolution. Trust me, it looks better on my
laptop.
I didn't have the data for the World Cup, OK, but what we're doing is
data-driven, so now we can actually draw it. So again, having no life, I went
through the thirty-two shots and I drew them.

And they are come up with minus to me so

It would be cool if you went to the STATS website and you had your game, and
you drew plays out, and then we could give you a value. This type of
interactivity would be cool, OK. Now, the last thing I want to talk about is
tennis, and I have to watch my time. We were very lucky: we won best paper at
Sloan last month.
This was done with Xinyu Wei, also known as Felix, and he'll be joining me at
STATS very shortly. On that note, we are hiring: if you're really, really
good and you love sport, come talk to me.
But
if you have no life,
if this doesn't repulse you, you know, you'll fit in with me.
No, hey, it's optional.

This is a good advertisement for me, it's
video, OK? So the motivation here is

that if people watch tennis and are familiar
with the broadcasts, we have Hawk-Eye,

which is great: it tells a bit
of a story.

It helps with umpiring, OK? The other thing
happening with tennis is that we have

IBM SlamTracker, you know, they have keys
to the match. So what are keys to the match?

The three things that you should do
to beat a player, OK? The funny thing

is that these two don't interact, OK?
Hawk-Eye isn't used for this. We have a list

I don't want to know and it's been out for
a very long time. So what we wanted to do

is basically this: we want to know
who's more likely to win a point

during a match. So here we have this
great example of Federer against

Nadal. Federer is my favorite player of all time,
but he can't beat Nadal, and I think there

are always complex things going on, but if
you read Nadal's book, what he says is

that my forehand is much, much better than
his single-handed backhand, so I'm just going

to hit it to his backhand. And we can measure
that, OK? What we want to do is

measure this dominance, objectively measure
these things with the tracking data. And

here we have Nishikori versus Nadal.

And we have this plot, and we see there's
a momentum change, there's a switch. And so

we called this paper "The Thin Edge of
the Wedge": when a player hits a winner,

that's kind of when we come
to know about the dominance, but

I mean, we want to see
what happened beforehand. That's basically

what we're interested in; that's why
we call it the thin edge of the wedge.

Now again, similar to what we did with
soccer, we have this pipeline, but

as I said before, we don't
really want to use

a linear classifier, so this is where we
use a random decision forest. Given

the Hawk-Eye data, it basically comes in
terms of trajectories; the trajectories are

parameterized as a polynomial, and
then we can craft these features from that.

We can get the shot start location, impact
location; then we get these high-level

features, dominance features, which
are important to coaches and analysts. And

basically we put all these features into
a random decision forest; I'll show you how we

did that. But what is really interesting
is if we actually visualize these features.

OK, that's the cool thing with sport: we
can actually visualize these things,

it's interpretable, we know what it
means. So here we have Djokovic, Nadal and

Federer, and here are the probability
distributions when they win the point. So

here we see Federer: he tends to be all
over the court when he hits a winner, OK?

In terms of feet location, he's a serve-and-volleyer;
he tends to be inside the baseline

or he's at the net, whereas if
you look at the other two,

like Djokovic, they're behind the baseline,
which makes sense. But

also Nadal likes for
the player to be on one side, OK? Again,

we're just plotting the probability
distributions here.

And so, in terms of the classifier:

this behavior is really non-linear,
and so we decided to throw a random decision forest

at this, and then we can get these
probabilities, OK? And so what we did:

we had three tournaments' worth of data;
we had about ten thousand points and

forty thousand shots. These we had to
actually label, but it's not trivial. So

how do we best label a shot? It's
like standard win-probability

work. So what we did is that if a player
won the point, we labeled all their shots as one;

when a player lost that point, we labeled their shots zero,
OK? And so we broke that into three

equal sets. And then, what we're really
interested in is modeling player-specific

behavior; because these tournaments are
knockout, we just looked at the top ten.
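
The labeling rule described above (every shot inherits the outcome of its point) and the three-way split can be sketched as follows. This is a minimal sketch, not the speaker's code: the `Shot` record, the function names, and the round-robin split are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record: which point a shot belongs to and who hit it.
Shot = namedtuple("Shot", ["point_id", "player"])


def label_shots(shots, point_winners):
    """Label each shot 1 if its hitter won that point, otherwise 0."""
    return [1 if point_winners[s.point_id] == s.player else 0 for s in shots]


def split_into_folds(items, n_folds=3):
    """Round-robin split into n_folds roughly equal sets."""
    return [items[i::n_folds] for i in range(n_folds)]


shots = [Shot(0, "A"), Shot(0, "B"), Shot(0, "A"), Shot(1, "B"), Shot(1, "A")]
point_winners = {0: "A", 1: "A"}

labels = label_shots(shots, point_winners)
folds = split_into_folds(list(range(7)))
```

Labels like these could then be used as training targets for a classifier such as the random decision forest mentioned in the talk, which outputs a win probability per shot.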

OK And
then we just used to see it there are now.

Putting this presentation together, I
was really interested by this:

A survey when you seven point
nine I could win the point OK but

when we get rid of
the aces, the

returner is actually more likely to
win the point. So I didn't know that, but

that's important in establishing the baseline,
OK? It's not fifty-fifty; it's

actually the baseline's around
about point five three. And so

when we actually look at these individual
features, we show that we do better. But

there are big issues with
this: it's not specific. And so

I'll go back to the
Nadal-Federer example.

We should have a model to model
individual behavior, Federer's, but

his behavior depends on Nadal, so
we should have a model which

can model these two
interacting. There's a problem with that:

for these two we probably have enough data, but
for

a lot of other situations we won't, OK? And
so

the problem is, for these specific contexts
we're never going to have enough

data, and what happens if they haven't
played against each other before? OK, so

this kind of brings in the idea of playing
style, OK? We talk about style all the time:

we talk about style in basketball, in tennis.
What does it actually mean in terms of

modeling? OK, so again, let's take Federer:
how can we describe his behavior? So

we could describe it in
terms of attributes; in

the vision community there's lots of
work been going on in attributes. So

we can say, well, his forehand's amazing; his
backhand is good, but it's not as good as

the other guys' two-hander; great volley,
great approach shots, great overhead

smash, OK? Now let's say he's
playing Nishikori. How does he

prepare? So what he could potentially
do is he could watch tape, and

then he could have a pad and pencil, and
he could say, well, he's got a pretty good

forehand, his backhand's good, his
volley's not that great.

So that's
one way that he could pre-plan to

play him, OK? Or he could say,
well,

Nishikori, he reminds me a lot of
Andy Murray; I've played Andy Murray a lot, so

in terms of planning, what I'm going to do
is use all the examples that I have for

Murray and
use that as my internal classifier.

So that's basically the idea, and

what's really nice is that we can
be more efficient with the data. So

if we want to model everything, well, again,
we're not going to have enough data, but

using this player style we can
basically make the trees shallower, and

also we can get by with having less trees,
OK? So that's the intuition there, and

so essentially what we wanted to do is,
instead of handcrafting these attributes,

we wanted to learn them directly from data. So
we just used trajectory clustering,

k-means, but what we actually
found to be better was using

discriminative clustering.

And so when we do this we actually do
better, and we can model individual context.

Here we found we have fifty shots: we can
describe all these attributes by fifty

shots, and once we have this we can get a
distribution, OK? We can get a histogram, and

once we have a histogram we can
start measuring similarity. So

now we can start measuring
player similarity, and so

what we found is that Djokovic
is very similar in style, for

the Australian Open, to Federer. Murray tends
to be closer to Nishikori, so

that's why I used that example.
Nadal tends to be off on his own;

he's left-handed, however we're just
looking at the raw trajectories.
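
The histogram idea above can be sketched like this: each shot is assigned to one of a fixed set of shot clusters, a player's style is the normalized histogram over those clusters, and two players are compared by how much their histograms overlap. This is a sketch under assumptions: the talk does not name the similarity measure, so histogram intersection is used here purely as an example, and the cluster ids and player data are made up.

```python
from collections import Counter


def style_histogram(shot_cluster_ids, n_clusters=50):
    """Normalized histogram of how often a player uses each shot cluster."""
    if not shot_cluster_ids:
        raise ValueError("need at least one shot")
    counts = Counter(shot_cluster_ids)
    total = len(shot_cluster_ids)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


def histogram_intersection(p, q):
    """Overlap of two normalized histograms: 1.0 identical, 0.0 disjoint."""
    return sum(min(a, b) for a, b in zip(p, q))


# Hypothetical cluster assignments, using 4 clusters to keep the example small.
player_a = style_histogram([0, 0, 1, 2], n_clusters=4)
player_b = style_histogram([0, 1, 1, 2], n_clusters=4)
player_c = style_histogram([3, 3, 3, 2], n_clusters=4)

sim_ab = histogram_intersection(player_a, player_b)
sim_ac = histogram_intersection(player_a, player_c)
```

With a measure like this, data from the most similar players could stand in when two opponents have little or no shared history, which is the intuition behind the Murray and Nishikori example.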

And so what's really nice
about this is that we can actually do

accurate prediction of behavior.

Surely we had that we had the two.

We had the momentum shift earlier.

But here we're using our style for
you to actually find it's actually.

And use in context.

And so, using context: behavior changes
depending on whether you're winning or losing,

break point or not. We actually find that
we change the initial estimates, OK? So

again, the resolution is poor, apologies for
that, but you can go to the paper and

find this. So we can see that even
though Nadal is receiving,

in that particular situation he's
more likely to win the point.

And so I get really excited about
this because, again, we can use

this as a pre-planning tool. So
this is a KDD paper that we had last year

where we wanted to do
serve recommendation. So

the really interesting point here is that
when Nadal plays everyone,

regardless of point, he tends to serve down
the middle the majority of the time. But

on break point he tends to do
the opposite; that's like his go-to serve.

He tends to go right on break point, OK? And

this is the same for Federer and Djokovic;
Murray doesn't tend to change as much. But

you can imagine you can go into an interface and

then you could play a specific scenario,
and then we can give recommendations.

And so what's also cool is that you
can simulate these things. So given

an incoming trajectory, we want to see the
probability of where one of these players

will hit the ball. And then, in terms
of a measurement, we can start using

predictability as a measure; you can
measure predictability with entropy. So

Federer tends to have the most shots, so
he's less

predictable than those other two, OK? So
here I was just promoting

the idea of these new measurements
in terms of predictability.
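
Predictability measured with entropy can be sketched as the Shannon entropy of a player's shot distribution: the more evenly a player spreads shots across types, the higher the entropy and the harder the next shot is to guess. This is a minimal sketch; the shot labels below are hypothetical and the talk does not specify the exact formulation used.

```python
import math
from collections import Counter


def shot_entropy(shot_types):
    """Shannon entropy (in bits) of the empirical shot-type distribution."""
    if not shot_types:
        raise ValueError("need at least one shot")
    counts = Counter(shot_types)
    total = len(shot_types)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical shot sequences: one varied player, one repetitive player.
varied = ["cross", "line", "drop", "lob"]
repetitive = ["cross", "cross", "cross", "line"]

varied_entropy = shot_entropy(varied)
repetitive_entropy = shot_entropy(repetitive)
```

Higher entropy means less predictable, which matches the remark that the player with the most shots is the hardest of the three to predict.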

And this came from: wouldn't it be cool
if we could light up on the court

where we think the next shot
is going before it's hit?

I think that's cool. Imagine you do that in
baseball: this is where I think it's going.

Or in soccer, with a free kick.

You know players notice you know you talk
about people I did planning on what they

were going to do let's show it.

And here's some nice visual Analytics said

tennis is traded as well OK so
that's basically it

I really enjoy talking about this stuff
and I'm really glad you came to see it.

If you have more questions I'll be happy
to answer them, or you can send me an email.

Yeah that's it so so so thanks for

coming.

Yeah, absolutely, absolutely. Yeah, it's just
a function of getting the data in real time;

it's a sensor problem. So if you get
it, we can do it, because the time taken

is in learning the model, but you do that
offline; once it's learned, you run it and away you go.

Yes.

So, very good question;
we've been talking about this a lot. So

you have system one, system
two; this is a training thing.

I think.

So in professional sport it happens it
has to do with speed of decision OK so

if you do things really quickly
then you can't you can't really.

React to that but the better that
you get so OK I'll rephrase it so

when you when you're doing
these one two three.

So they have to predict what
the other player's doing, and

you can only do that by training and
being technically superior, OK? And so

once I do that I can expand that
decision space at that speed OK So

I think it's a function of speed and
also the state space I think those

teams are the best at that OK so you can
have an individual player do whatever but

it doesn't really matter if the other
guys can't predict what they're doing

because those decisions
come up you might not make

sense.

Thank you thank.