Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag 😉
Takeaways:
- Education and visual communication are key in helping athletes understand the impact of nutrition on performance.
- Bayesian statistics are used to analyze player performance and injury risk.
- Integrating diverse data sources is a challenge but can provide valuable insights.
- Understanding the specific needs and characteristics of athletes is crucial in conditioning and injury prevention.
- The application of Bayesian statistics in baseball science requires experts in Bayesian methods.
- Traditional statistical methods taught in sports science programs are limited.
- Communicating complex statistical concepts, such as Bayesian analysis, to coaches and players is crucial.
- Conveying uncertainties and limitations of the models is essential for effective utilization.
- Emerging trends in baseball science include the use of biomechanical information and computer vision algorithms.
- Improving player performance and injury prevention are key goals for the future of baseball science.
Chapters:
00:00 The Role of Nutrition and Conditioning
05:46 Analyzing Player Performance and Managing Injury Risks
12:13 Educating Athletes on Dietary Choices
18:02 Emerging Trends in Baseball Science
29:49 Hierarchical Models and Player Analysis
36:03 Challenges of Working with Limited Data
39:49 Effective Communication of Statistical Concepts
47:59 Future Trends: Biomechanical Data Analysis and Computer Vision Algorithms
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.
Links from the show:
- LBS Sports Analytics playlist: https://www.youtube.com/playlist?list=PL7RjIaSLWh5kDiPVMUSyhvFaXL3NoXOe4
- Jacob on Linkedin: https://www.linkedin.com/in/jacob-buffa-46bb7481/
- Jacob on Twitter: https://x.com/EBA_Buffa
- The Book – Playing The Percentages In Baseball: https://www.amazon.com/Book-Playing-Percentages-Baseball/dp/1494260174
- Future Value – The Battle for Baseball’s Soul and How Teams Will Find the Next Superstar: https://www.amazon.com/Future-Value-Battle-Baseballs-Superstar/dp/1629377678
- The MVP Machine – How Baseball’s New Nonconformists Are Using Data to Build Better Players: https://www.amazon.com/MVP-Machine-Baseballs-Nonconformists-Players/dp/1541698940
Transcript:
This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.
Today I am joined by Jacob Buffa, the senior director of performance science and player development for the Houston Astros.
Growing up with a deep-rooted passion for sports in St. Louis, Missouri, Jacob's journey from aspiring baseball player at Missouri State University to leading player development and performance science is nothing short of inspiring.
Jacob discusses the critical role of nutrition and conditioning in athlete development, emphasizing the innovative use of education and visual communication tools to help athletes understand how their dietary choices impact performance.
He also explains how Bayesian stats play a pivotal role in analyzing player performance and managing injury risks, and delves into how complex concepts like Bayesian analysis are communicated effectively to coaches and players, ensuring they understand the uncertainties and limitations of the models used.
Finally, Jacob and I discuss emerging trends in baseball science, such as biomechanical analysis and the application of computer vision algorithms.
This is Learning Bayesian Statistics, episode 114, recorded June 20, 2024.
Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible.
I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country.
For any info about the show, learnbayesstats.com is Laplace to be. Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com.
If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra.
See you around, folks, and best Bayesian wishes to you all.
And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.
Hello my dear fans, I'm coming to you with fantastic news because Learn Bayes Stats is going live. We're indeed going to have the first two live shows in Learn Bayes Stats history. It's going to happen very soon at StanCon 2024 in Oxford, UK. We're going to have two panel discussions.
We're going to kick things off amazingly on September 10 with Charles Margossian, Steve Bronder and Brian Ward talking about the past, present and future of Stan. And then on September 11, Elizaveta Semenova and Chris Wymant are gonna make science look really cool, because we're gonna talk about how Bayesian stats are used in the very important field of computational biology.
So if that sounds like fun, if you wanna ask us embarrassing questions, if you wanna meet us in person, if you wanna have exclusive LBS stickers, well, get your StanCon tickets now. Honestly, I can't wait to meet you all on September 10 and 11. See you very soon, my dear Bayesians.
Jacob Buffa, welcome to Learning Bayesian Statistics.
Hey Alex, how are you doing?
I am doing very well, thank you so much for being on the show, Jacob. So as I said in the introduction, you work for the Houston Astros, which I'm sure the American listeners know about. For non-baseball listeners, the Houston Astros are a big MLB team, so baseball. And thanks a lot, actually, to JJ Robbie for putting us in contact. JJ was here on the show, damn, a few years ago. I don't even remember the number of the episode, but for people curious about what JJ is doing at the Astros, he was not at the Astros at the time, but you'll get an idea of what he's doing. He's doing an absolutely tremendous job in the R&D department. So yeah, I refer you to that episode. I put the link to the sports analytics playlist in the show notes, and I'm sure if you're into sports, that's going to be worth your time.
But today we have Jacob with us, and I'm having you on the show because you're doing a lot of different things and your background is actually super interesting. So yeah, maybe tell us what you're doing nowadays, but mainly how you ended up working on this, because I know your path is marked by a passion for baseball, sure, but it was still a bit sinuous and random. So I love that.
Yeah. So currently I serve as the senior director of player development and performance science for the Houston Astros. My path to here was definitely unique. I played baseball in high school and then actually went to Missouri State University for baseball as well, but wound up very quickly realizing that I was much smarter than I was good at baseball. So I wound up actually pursuing an interest in just overall human performance. I was very passionate about, basically, you know, training to be bigger, faster, stronger. So I wound up spending a lot of time around the strength and conditioning staff there, wound up doing some internships, and really learning as much as I could, actually outside of school. You know, my degree is actually in marketing, and then I wound up adding economics in there. But I was never really big on learning inside the classroom. It was just something that was not a passion of mine.
So through college, you know, I was able to gain a lot of knowledge around just general kinesiology and strength and conditioning principles. And actually, I think it was my junior year, I was approached by a friend of mine named Denton McRomey, who was like, hey, man, we should start a gym, you know, after college, that's what we should do. And actually at first I was like, you're crazy, starting a business, I don't know how to do that. But he kind of talked me into it. And so, you know, after graduating in 2016, I moved back to St. Louis, Missouri. And we had some connections. We played baseball together in high school. He went off to play at Rockhurst, but we had some connections in the St. Louis area with baseball teams, and we wound up leveraging those to be able to start training some kids, and got a building, and basically, step by step, you know, figured out how to do it, how to run the business.
And one of the things that we did that turned out to be relatively unique there was, you know, we were very passionate about identifying what underlying physical qualities were truly being impacted to help improve on-field performance. Because deadlifting more or squatting more is definitely important, but there's not necessarily a causal relationship to throwing five miles an hour harder. But there are certain first principles that we're trying to impact. So this is where I started to learn a lot about force plate research and just general linear physics. And we purchased a set of force plates ourselves, started jumping athletes, really diving into the movement signatures, the force-velocity profiling. And we started testing guys' bat speed, their throwing velocity, and just started keeping all this information with players. And over the course of a couple of years, you know, we wound up being able to have some research around why we do what we do, and we were really enjoying it.
And then the Houston Astros in 2019 opened up a job called a performance coach. And this was traditionally what baseball would call a fourth coach or a development coach, which coaches first base, you know, maybe coaches defense and base running. But this role, they actually expanded to help in performance science. So the Astros were actually the first organization to have a sports scientist in baseball. So they were very passionate about this. Part of this role was helping to do the sports science testing, helping to do the workload monitoring, a lot of the grunt work, quite frankly, but it definitely gave insights into multiple departments. And given I never actually had a formal degree in kinesiology or anything like that, I felt like this was my first shot to actually work professionally, to work for an organization in sport. So I applied and wound up getting the job, actually. I actually remember Bill Fricke, Pete Petilla and Jose Fernandez were the three who I interviewed with and who wound up hiring me, and I'm forever grateful to all three of them for taking a chance on me, because my resume was nothing spectacular.
And so I wound up doing that for 2019. In 2020, or actually after that season, they offered me a position as a sports science analyst. So I accepted that role, moved down to West Palm Beach with my wife, where the spring training complex is, and was a sports science analyst for two years. And then in 2022, you know, the Astros decided that they wanted to make a bigger, more formal investment in biomechanics and sports science. So they started a performance science department and they asked me to be the director and build it out. So I was very grateful for that opportunity. In 2022 and 2023, I was the director of performance science, building out that team and trying to get that research off the ground. And then just this past year, at the end of the year, they asked me to expand my responsibilities again to oversee player development. So that is the long story of how I wound up where I'm at.
Yeah, I love it. I love it. It's absolutely, absolutely fantastic. And that's also why I wanted to have you on the show. Because as you were saying, you also did quite a lot of weightlifting, which I am personally very interested in. I do that very amateurly in my local gym. And there's something I discovered when digging into the science of weightlifting that was surprising to me, because I didn't know anything about it before doing it myself, diving into the science of it and basically conducting small RCTs on myself at the gym and, you know, coming up with my own macrocycles and so on. Something I was really, really surprised by is the importance of nutrition. You know, when you start a training like that, you're like, yeah, the training is like 90% of the results, right? But actually, I discovered nutrition is extremely important and is an integral part of the training program.
So I'm also curious if that's the case in sports teams like baseball. And then, yeah, basically, how do you apply that kind of knowledge that we have from a much more controlled sport like weightlifting? How does that help you in your job today?
Yeah, that's a really good question. Absolutely, nutrition plays a huge role in, I think, all professional sport, but definitely within our organization we do take it very seriously. And yeah, I think that there are, you know, interesting parallels. You talked about weightlifting specifically. I did spend probably three, four years competitively weightlifting. And one example of something that is just a staple in weightlifting is you have to make a weight class. And so one of the things that you have to do is manipulate your body weight to essentially be at the lowest body weight that you can possibly be while lifting the most weight that you can possibly lift. And, you know, you have weigh-ins at a certain time, so you wind up needing to weigh in at a certain time and then understand: this is what I need to eat, and when, to be able to lift at my fullest capacity X number of hours later.
And while the exact scenario is significantly different from baseball, right, there are no weight classes or anything like that, the general principle of being able to understand what your body needs to perform at its highest level, and how long that takes to get into your system and out of your system, is extremely valuable. And something like that which is certainly applicable is even something as simple as caffeine intake. You know, we play night games, and we know how important sleep is for overall performance. And it could be very easy for someone to take extreme amounts of caffeine before the game because they don't know how long it takes for caffeine to actually get through their system. And so it winds up not being that useful for the first portion of the game, and then they wind up not being able to sleep for several hours post-game. So I think even basic principles like that, understanding what to put in your body and when, are extremely impactful.
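The timing argument above can be made concrete with a very rough sketch. The Python below assumes simple first-order elimination with a single half-life; the default of 5 hours is an assumed, commonly quoted ballpark that varies widely between people, and the function names and the 200 mg and 50 mg figures are hypothetical illustrations, not anything the Astros are described as using.

```python
import math


def caffeine_remaining(dose_mg, hours_since_intake, half_life_h=5.0):
    """Caffeine left in the body under simple exponential decay.

    Assumption: one-compartment, first-order elimination, ignoring the
    absorption delay. half_life_h is an assumed ballpark, not a measured value.
    """
    return dose_mg * 0.5 ** (hours_since_intake / half_life_h)


def hours_until_below(dose_mg, threshold_mg, half_life_h=5.0):
    """Hours needed for a single dose to decay below a chosen threshold."""
    if dose_mg <= threshold_mg:
        return 0.0
    return half_life_h * math.log2(dose_mg / threshold_mg)


# Hypothetical example: a 200 mg dose and a 50 mg "can sleep" threshold.
print(caffeine_remaining(200, 5))
print(hours_until_below(200, 50))
```

Under these assumptions, a dose taken right before a night game is still largely in the system several hours after the final out, which is the trade-off described above.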
Yeah. Yeah. That's a very good example. I'm actually very curious: how do you follow that? Because you cannot be behind the players all the time, right? So here you also have to rely, I guess, on the professional character of the player. So I'm guessing there is variation on that. How do you guys handle that? Because, yeah, in the end, we know about that stuff, but there is also a lot of personal variation, not only, as you were saying, in caffeine intake and timing, but also in the effect of caffeine on people. I'm personally very sensitive to caffeine. So that's cool because it wakes me up in the morning. But I definitely know that if I take caffeine after more or less 12 p.m., I'm gonna have trouble at night. So I'm wondering, yeah, how do you implement the science with the players?
Yeah, you know, I'm not gonna lie and say that we have it down perfectly or that all of our players follow everything to a T. We largely rely on education, and I think that's something that resonates with me. I think for anyone who has kids, it's not significantly different, in that you can tell them the right thing over and over and over again, but until they believe it themselves, they may not do it. And so, yeah, to your point, we don't have oversight over these guys all the time, nor do we want to have to. So the best thing that we can do is educate them on why it's important for their performance and for their careers, try to distill complex science into very simple but impactful infographics, communicate things visually, and get them to believe and understand that if they want to improve their performance, this is something that they should do. And when players do that, you can definitely see it, because they take ownership over their careers, and we see changes on the field as well.
Okay, yeah, I see. That must be super interesting. So you basically get the players together somewhere and you go through the application of the science. I don't know, like, today is about caffeine, tomorrow is about sleep, and next week is going to be about meal timing, stuff like that. Is that how that works?
Yeah, essentially. You know, we have the draft coming up here, July 14th to the 16th, and that's a great example: after the draft, we have an onboarding process for all of our players. And so they will learn about the Astros' philosophies in many areas. They'll learn about what our hitting philosophy is. They'll learn about what our defensive philosophy is. They'll learn about what our strength and conditioning philosophy is. And one of the things that they'll learn about is our nutrition philosophy. And it's definitely on the education side. It's: why are carbs important? Why are fats important? Why is protein important? How much of that should you take in? What are the proper sources?
And ultimately, we always try to tie it back to on-field performance. So, for example, we can educate players that if you're playing in the field, carbs are important for high bouts of energy, right? And one of the key performance indicators of being a good defender in the outfield is how fast you can run and how much ground you can cover. And if multiple balls are hit to you, can you do that multiple times? So it's a non-trivial thing to be able to fuel your body correctly for maximum-effort sprints multiple times over several hours. And if we can tie it back to what they value, I think it has a better chance of landing.
Okay, yeah, that's definitely super interesting. I love that. And yeah, personally, I can see that the timing of things is very interesting. You also have to understand, you know, that there are definitely some moments of the day where I'm more efficient at the gym. I'm definitely much more efficient in the morning than at night, right? So now I almost never train at night or in the evening if I don't have to, and I'd much rather do that in the morning. Also because I have the caffeine boost, you know, and going to the gym after a full day of work is just hard. I much prefer to go for a walk or something like that.
But there's definitely something that resonated with me, and it's very anecdotal. You're saying that there are some late-night games, right? And so you have to take your caffeine at the right moment so that it gives you the boost for the game but at the same time doesn't disturb your sleep. So it's a completely different field, but I do some stand-up from time to time, and stand-up shows are at night. And so I actually have the same issue. I came up with that very nerdy caffeine timing the other day, just before a show, because I wanted to have that boost, but I knew that if I took my caffeine too late, I would not sleep before, like, three or 4 a.m., which has happened to me before. So that made me laugh when you talked about it, because I was like, well, it's not only high-level sports professionals who have that issue. So thank you so much, Jacob, for all the work you do. That's actually useful to many more people than you thought.
That's good to know.
You see? Well, that's actually the same issue. I mean, for any people who have to do some stuff at night where they need to be alert, I guess that will be useful to them.
Now, I'm also curious about the kind of work you do for analyzing player performance and injury risk, because I know these two topics are extremely important for a professional sports team. I'm wondering how Bayesian stats are applied here and how they can be helpful.
Yeah, I think that there's a significant way that just the overall Bayesian framework is applied. If we think about the components of being successful in professional sports, some of them being skill acquisition, cognitive processing, in-game strategy, and then obviously kinesiology, you know, injury risk, kinesiology is probably the most publicly researched area of all of them. If anyone wants some answers, it's easiest, you know, to Google how to make a player bigger, faster, stronger, and you'll get dozens of research articles that are applicable. In my area, that means we can leverage that information as priors and then apply our observations from our population, both to improve the resolution of the insights that we glean from the data that we have and to infer where our specific processes or our specific population might differ from the research population.
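As a minimal sketch of "published research as a prior, our own observations as data", the Python below uses the textbook conjugate normal-normal update with a known observation standard deviation. This is an illustrative assumption, not the Astros' actual model; the function name, the metric, and every number in the example are hypothetical.

```python
def normal_update(prior_mean, prior_sd, observations, obs_sd):
    """Posterior mean and sd for a normal mean with known observation sd.

    prior_mean / prior_sd: what the published literature suggests (assumed).
    observations: measurements from our own population (hypothetical data).
    obs_sd: assumed known measurement spread.
    """
    n = len(observations)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / obs_sd ** 2
    posterior_precision = prior_precision + data_precision
    sample_mean = sum(observations) / n
    # Precision-weighted average of the prior mean and the sample mean.
    posterior_mean = (
        prior_precision * prior_mean + data_precision * sample_mean
    ) / posterior_precision
    posterior_sd = posterior_precision ** -0.5
    return posterior_mean, posterior_sd


# Hypothetical: literature says about 100 units (sd 10); we observe three players.
mean, sd = normal_update(100.0, 10.0, [112.0, 108.0, 115.0], obs_sd=8.0)
print(mean, sd)
```

The posterior mean lands between the literature value and the local sample mean, and the gap between the two is one simple way to see where a specific population might differ from the research population.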
Okay, okay, I see. And what would you say is the state of the science on these fronts? Are we somewhat confident? Or is that something that's really at the frontier and that's evolving almost every year?
I think that there are aspects of it that we are very confident in, and there are aspects of it that are definitely evolving. An example of an aspect that we are confident in is how specific musculature and its functions apply to injuries and human performance. We're very confident in the static state. I think one area where the research is improving is understanding how these function in a dynamic state.
An example of that would be, you know, it's maybe easy to take a look at players' hamstring strength and then track that over a season and see, okay, who hurts their hamstring more or less, right? But there are certainly aspects of sprint mechanics that impact that and that are maybe less obvious, because there's not quite as much quantifiable information on them right now. It also requires getting more nuance and understanding what the muscle is doing at the time of injury. And given that injuries are relatively sparse when compared to non-injured instances, that type of information is tough to come by. But there are definitely people doing good work and trying to understand how coordination fits into injury mitigation. So that's one area where we're improving. But I do think the research is very good, and we're very confident overall in how musculature impacts injury risk.
Okay. What is a question or topic in particular in that realm that you'd love to see answered in the coming months, that you're really curious about?
Well, I guess, you know, I don't know if this is specific enough, and it's definitely not in the coming months. But here is one thing that we are always pursuing in the baseball industry. One of the things that is most important to a pitcher being successful on the field is how hard they throw. That's pretty common knowledge. The harder you throw, generally, the better the results are going to be. However, we also know from external research that how hard you throw is pretty much the driving factor in whether or not you're going to get hurt. You put more torque on the elbow and more strain, and that winds up escalating your injury risk a ton.
And so one of the things that I think we've been trying to look at is tendon and ligament adaptations, trying to understand: can we periodize the workload of a pitcher to maximize their in-season performance and mitigate their injury risk? Because ultimately, the answer of throwing the ball slower is not gonna work. I think baseball has tried to take the approach of just throwing less overall, and injuries continue to increase. So I don't know the answer to the question. I don't think external research will get to it. Hopefully, you know, we're able to get to it internally.
Yeah. Yeah. I guess that would be quite interesting. And I guess you talked a bit about that just now, but how do Bayesian models help in predicting the impact of training loads on the athletes' well-being and performance in general, not only injury?
Yeah. You know, I think when it comes to training loads, we know that there are broad truths about how stress impacts the human body. We also know that there are nuances around how specific people or players adapt to stresses. So we can essentially use those broad truths to overcome sparse data, where we may not have a whole lot of information on any specific player but we do have a few observations. And if we combine that with something like a multi-level model, we can then really glean some robust insights where maybe robust data actually doesn't exist.
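To illustrate how a multi-level model borrows strength when a player has only a few observations, here is a deliberately simplified sketch of partial pooling. It assumes the within-player and between-player standard deviations are known and uses the pooled mean of all observations as the population mean; a full multi-level model would estimate those quantities, for example in PyMC or Stan. All names and numbers are hypothetical.

```python
def partial_pooling(player_obs, within_sd, between_sd):
    """Shrink each player's mean toward the population mean.

    player_obs: dict mapping player name -> list of observations.
    within_sd: assumed spread of repeated measurements on one player.
    between_sd: assumed spread of true player means across the population.
    """
    all_obs = [x for obs in player_obs.values() for x in obs]
    population_mean = sum(all_obs) / len(all_obs)

    estimates = {}
    for player, obs in player_obs.items():
        n = len(obs)
        # Weight on the player's own data grows with the number of observations.
        weight = (n / within_sd ** 2) / (n / within_sd ** 2 + 1.0 / between_sd ** 2)
        player_mean = sum(obs) / n
        estimates[player] = weight * player_mean + (1.0 - weight) * population_mean
    return estimates


# Hypothetical data: one player with a single test, one with three tests.
print(partial_pooling({"a": [10.0], "b": [0.0, 0.0, 0.0]}, within_sd=1.0, between_sd=1.0))
```

The player with a single observation is pulled strongly toward the population mean, while the player with more data keeps an estimate closer to their own average, which is the behavior that makes sparse per-player data usable.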
I see. Yeah. Yeah, for sure. I mean, that's definitely where hierarchical models would be super helpful. If you can relate the different positions, the different players, and the different populations of players, that's definitely super powerful. And you also work on athlete conditioning, right? You use the science of that to improve the training of the players, is that right?
Yes. Yeah.
Okay. So how do you do that? And how do you use Bayesian approaches here?
Yeah. Good question. So I think it's not unrelated to the last answer that I gave, but one of the limitations of overall injury prevention is the amount of data that can be collected. We only have so many players that come through our system in any given year, or even through a couple of years. And then we only have so many samples even within a given player. And especially in an injured population, right? The injured population is significantly smaller than the healthy population, to the point where, if you have one or two players who got injured with what looks like healthy data, it can be difficult to discern.
So we go back to leveraging previous research. If we stick with the example of hamstring strength and hamstring injuries: if we have hamstring strength information on players, we can absolutely take information from research and say, we believe within a certain degree of certainty that this is what a healthy hamstring signature or force profile would actually look like. And we can play around with how confident we are in that, you know, to basically see what gets us closest to the actual outcomes. And that allows us to be more confident in what we're looking at.
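One way to picture "playing around with how confident we are" in a research-based prior is a small prior-sensitivity check. The sketch below swaps in a simpler setting than the force-profile example: a Beta-Binomial model of an injury rate, where the prior strength is varied and each setting is scored against held-out outcomes. This is an assumed illustration, not the method described in the episode, and all names and counts are hypothetical.

```python
import math


def posterior_rate(prior_rate, prior_strength, injuries, players):
    """Posterior mean injury rate under a Beta prior.

    prior_rate: injury rate suggested by published research (assumed).
    prior_strength: how many pseudo-observations the prior is worth.
    """
    alpha = prior_rate * prior_strength + injuries
    beta = (1.0 - prior_rate) * prior_strength + (players - injuries)
    return alpha / (alpha + beta)


def holdout_log_score(rate, injuries, players):
    """Binomial log-likelihood (up to a constant) of held-out outcomes."""
    return injuries * math.log(rate) + (players - injuries) * math.log(1.0 - rate)


def best_prior_strength(prior_rate, strengths, train, holdout):
    """Pick the prior strength whose posterior best predicts the holdout data."""
    scores = {
        k: holdout_log_score(posterior_rate(prior_rate, k, *train), *holdout)
        for k in strengths
    }
    return max(scores, key=scores.get)


# Hypothetical: a noisy small sample (5 injuries in 10 players) versus a larger
# holdout (10 in 100) that happens to agree with the literature rate of 0.1.
print(best_prior_strength(0.1, [1, 10, 100], train=(5, 10), holdout=(10, 100)))
```

With a tiny and noisy sample, trusting the literature more predicts the held-out outcomes better in this toy case; with a population that genuinely differs from the research population, the weaker priors would win instead.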
Yeah, yeah, that makes sense. I mean, that sounds pretty challenging, but that does make sense. So from all that you're saying here, something I can see is that, yes, if you look at one question in particular, the data can be limited. But if you look at the overall amount of data, and definitely in comparison to other sports, baseball is quite rich in data. You have inputs from game statistics, you have player tracking systems, you have physiological data, you have a lot of these sources. So how do you integrate these diverse data sources to then provide interesting insights?
334
Yeah, I think it, from my perspective or just my opinion on it, I think it starts by
layering the data properly, which I think to me means understanding what level of
335
information is important depending on the question being asked or what level of
information do we need to start with.
336
And so, for example, if we want to know, let's say how we can make someone a better
outfielder, right?
337
First, we jump right to, well, what's their reactive strength index from the force plates
and, the reactive strength index is low, hamstring strength is low, I think we're going to
338
lose a lot of people, right?
339
There's not going to be a whole lot of people that will immediately make that connection
and say,
340
yeah, that makes sense.
341
We fixed that, he'll be a better outfielder.
342
But if we start with maybe asking the question, how many runs is this player worth as a
defender?
343
How many runs has he saved as a defender?
344
Which may come from our ball tracking data, right?
345
That may come from understanding which balls were hit to him.
346
How many other defenders would have actually made that play on average?
347
Something simple like that.
348
Then, you know, we can maybe work backwards to the next level of information and say, well,
it looks like maybe he doesn't catch quite as many balls as the average outfielder because
349
he's not as fast.
350
Like he's slower than average as well.
351
And we know that the amount of ground that you can cover is certainly important.
352
Then I think you can make the next step to the physiological data and say, he
doesn't produce a whole lot of force and he's not super strong.
353
So now,
354
people start to actually link the two and say, okay, now I can see how improving his
hamstring strength and his force production qualities makes him a better outfielder.
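
The top layer described above (how many plays a defender makes compared with what average defenders would have made on the same balls) can be sketched as follows. This is an illustrative calculation, not the team's metric: the `BattedBall` type, the `catch_probability` field, and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    # Hypothetical field: share of average defenders expected to make this play,
    # for example estimated from ball tracking data.
    catch_probability: float
    caught: bool


def plays_above_average(balls):
    """Sum of (actual outcome - expected outcome) over all balls hit to a player."""
    return sum((1.0 if b.caught else 0.0) - b.catch_probability for b in balls)


# Made-up sample of four balls hit to one outfielder.
SAMPLE = [
    BattedBall(0.95, True),   # routine play, made
    BattedBall(0.60, True),   # harder play, made
    BattedBall(0.30, False),  # tough play, missed
    BattedBall(0.80, False),  # fairly routine play, missed
]

if __name__ == "__main__":
    print(f"plays above average: {plays_above_average(SAMPLE):+.2f}")
```

A negative total flags a defender who converts fewer chances than average, which is the point where the conversation moves to the next layer (speed) and then to the physiological data.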
355
Okay, yeah, that's fascinating.
356
It's like, yeah, different hints, basically, that you're picking up from the data.
357
Yeah, yeah, essentially, and making sure that basically each one is applied at the right
time.
358
Yeah, yeah. And well, I think that's...
359
And a question I have that's related to that is: what do you think are the
most significant challenges that you face?
360
Not only you, but the whole, you know, science team. Which challenges do
you face when you're applying Bayesian stats in baseball science?
361
And how do you address them?
362
That's a really good question.
363
I mean, actually, I think that
364
the largest challenge is actually getting people that are extremely familiar with Bayesian
methods and fluent in Bayesian methods.
365
And I would not consider myself a Bayesian expert by any means.
366
And the field of sports science doesn't teach this.
367
It's very limited in the statistical methods
368
that it actually teaches.
369
So, I guess maybe another tangent, but it's related:
people in the field of sports science tend to be very tied to the
370
methods that previous research has used, right?
371
So they'll come in and they'll say, I want to do project X, and
372
these two papers were written on this project and they did it this way.
373
So this is exactly how I want to do it.
374
These are the statistical methods that were used.
375
And a lot of times these papers are written by very, very intelligent strength and
conditioning coaches or, you know, exercise physiologists, but
376
they're not written by people with strong stats backgrounds.
377
So I think getting people in the field that are actually familiar with this type of
approach
378
is the largest obstacle.
379
But I do think once we get people with that skill set, there's actually very few barriers
to it, just given, I think, two things.
380
The first one being that the number of tools available to use Bayesian methods across
both Python and R, with very simple syntax and fast computation, has expanded
381
tremendously,
382
you know, just over the last six or seven years since I've been paying attention.
383
And I also think that how we communicate Bayesian stats generally
aligns with how people think.
384
People know that there are uncertainties around every decision that is made.
385
And we know that some uncertainties are wider than others.
386
And depending on our risk tolerance, you know, that may factor in more so than just a
single point estimate.
387
And so I do think that overall, communicating them is actually one of the
strengths of Bayesian approaches.
388
Okay.
389
I see.
390
Yeah, that's very interesting.
391
So it's a mix: not only the data and its availability, and also,
I guess, the importance of having at least a part of the organization focused on that, but
392
there's also a technical side, in the sense that
393
you definitely need people who are able to work with the kinds of methods that
you're using a lot, and Bayesian stats are definitely a very important part of that
394
workflow.
395
Yeah, yeah, absolutely.
396
Yeah.
397
And actually, you have to communicate, as you were saying, your findings
and the results of your models to a lot of different stakeholders.
398
So how do you do that?
399
I know from experience that it can be challenging.
400
So how do you communicate these complex statistical concepts like those from Bayesian
analysis to coaches and players to ensure that they are effectively utilized?
401
Yeah, that's a non-trivial task as well.
402
I do think one of the things that we try and do is
403
communicate it in different ways to different groups of people, right?
404
I think when talking with JJ's group and R&D, we're actually gonna wanna be as technical
as possible because we actually want their input on the methods and they're gonna wanna
405
know, do I trust these results based on the process?
406
If I'm communicating with a coach or a scout,
407
they don't care about that, right?
408
If I'm communicating with them, one of the general approaches
that we take is to distill the information that we have down to as few dimensions as
409
possible.
410
So, oftentimes what that looks like is maybe at most three or four dimensions, where
obviously if we're relaying it in a graph, we have our x and y axes, and then maybe it's,
411
you know, shaded with a color gradient and faceted by different positions, right?
412
So, you know, for example, if we're trying to communicate the elbow injury
risk of a pitcher, we might take a look at, you
413
know, the x axis being shoulder strength, the y axis maybe being lower body strength or
something like that, and it may be color-graded
414
by injury risk or probability, and it may be faceted by how hard you throw.
415
So that way, we can communicate four different variables, but very, very simply put and
hopefully easy to distill down.
416
I see.
417
And, in your experience, what are the most common challenges for consumers of
these models,
418
be they players, coaches, or people from the business side?
419
What do you see as the main difficulties and what would you recommend?
420
What would be your advice to listeners who have to do the same at work?
421
Maybe not for coaches and players, but for other stakeholders who are not part of the
model building team, but have to use the models
422
in their own work?
423
Yeah, so I think that there are two obstacles and they're actually kind of probably
competing obstacles.
424
I mean, the first one is, you know, we want to be as concise, as quick as possible, right?
425
We don't want to say, okay, you know, look at this visual, then this visual, then this
visual, then this visual to make your decision, right?
426
If we can encompass it all in a single visual or
427
you know, a single pillar of philosophy, that's what's going to resonate.
428
Otherwise, you know, they may forget or if it gets too complex, they may not even try and
use it.
429
The second obstacle is actually related to that: when we do that, I think we
run the risk of glossing over or smoothing over a lot of information, maybe
430
meaningful information. It may be nuanced, but nuanced cases do come up, right?
431
And so we don't want to overgeneralize in an effort to simplify too much.
432
So, you know, one of the things that we have tried to do, and I'm not saying that we
are great at it, so maybe other people have better approaches, but we have tried to keep
433
the information or the philosophy or the tagline as simple as possible, but then try and
highlight, you know, these scenarios,
434
maybe possible scenarios where, if the results don't look intuitive to you, it's worth asking
a question.
435
And we just try and highlight where these possible scenarios could go wrong, where
certainly we want people to actually think through it.
436
And if they see a result that says, I don't think this is right, this doesn't make any
sense to me, as an expert in their field, just ask the question, or please just don't take
437
these at face value all the time.
438
Hmm, yeah, definitely.
439
I think it's something very useful.
440
So in my experience, it's about making sure to communicate not only what the model can do, but also,
and maybe most importantly, what it cannot do.
441
That way, you mitigate a lot of these issues of over- or under-confidence in the
model.
442
Because, I mean, it's well documented in the science that
443
humans have a different way of handling uncertainty around algorithmic decisions, right?
444
We tolerate much more the fact that a human is gonna underperform and be wrong in a
special case, but when algorithms are wrong in just one instance, people will
445
lose trust extremely fast
446
in the algorithm.
447
So yeah, I think it's something to be very careful of when we communicate our model,
because, well, people will be way harsher on the model than on a scout,
448
for instance, right?
449
A scout can be wrong many more times than a model for recruiting players can be,
because of that
450
bias that humans have.
451
So I think it's very important to communicate that as you were saying, and also to
communicate that the model is not just a machine.
452
The model is made by humans.
453
So be a bit kinder to it. Yeah.
454
Yeah.
455
I think a good example of that, at least for us, and you put it well, is communicating what the
model can't do.
456
For us, our injury risk models, I think,
457
fall a lot in that category, where, you know, if we
actually communicated the exact probabilities that the model outputs of someone getting
458
hurt, it's almost always going to say that the odds are they don't get hurt, because those
are the true odds, right?
459
At any given time, if someone goes out there and plays, the odds are that they won't
get hurt.
460
And so then, you know, we can communicate that this model that we're using is
461
not necessarily there to predict whether or not this player is going to get
hurt.
462
It's to infer what physical qualities or what features are actually important that we can
impact that lead to more or less risk.
463
And so maybe it's less about whether this player is at 45% risk or 35% risk.
464
It's more about what do we deem as important that would put that player at more or less
risk and then is that worth it?
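
A small illustration of the point above, under stated assumptions: with a low base rate, a logistic risk model can output "probably not injured" for every player and still say something useful about which qualities move risk. The intercept, the coefficient, and the standardized strength feature are invented for the example; they are not values from the Astros' models.

```python
import math


def injury_probability(intercept, coef, strength_z):
    """Logistic model: probability of injury given a standardized strength score."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * strength_z)))


# Assumed parameters: a low baseline risk, and higher strength lowering the log-odds.
INTERCEPT = -3.0
COEF = -0.7

weak = injury_probability(INTERCEPT, COEF, -1.0)    # one SD below average strength
strong = injury_probability(INTERCEPT, COEF, 1.0)   # one SD above average strength

# Odds ratio between the weaker and the stronger player.
odds_ratio = (weak / (1.0 - weak)) / (strong / (1.0 - strong))

if __name__ == "__main__":
    print(f"weak:   {weak:.3f}")
    print(f"strong: {strong:.3f}")
    print(f"odds ratio (weak vs strong): {odds_ratio:.2f}")
```

Both probabilities stay well below 50%, so a yes/no prediction would read "not injured" for both players. The comparison between them is what carries the information about which physical qualities are worth acting on.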
465
Yeah.
466
Yeah.
467
So basically communicating all the uncertainties around the decision to make.
468
Nice.
469
Yeah.
470
Cool.
471
Well, I think I've already asked you a lot of, you know, very precise
science questions.
472
So maybe now, to play us out, let's look a bit more towards the future.
473
Are there any emerging trends
474
that you see in baseball science that you believe will significantly impact how teams
manage training and performance in the near future?
475
And yeah, are there also some breakthroughs that you would really want to see?
476
Yes.
477
So actually, as far as future,
478
you know, I guess, more or less, innovations in this field,
479
one of the things that makes me very excited about my role is I actually do believe that
performance science in general fits into that category.
480
And I guess more specifically as it relates to biomechanical information.
481
So like we've talked a lot about just general kinesiology and physiology in this
conversation.
482
You know, I think it was three or four years ago that
483
Major League Baseball rolled out Hawk-Eye information, which tracks the individual
joints of every single player.
484
And that is where a lot of injury research comes from, especially in the baseball field, around
the torque of the elbow and things like that.
485
So I'm very excited about that data set.
486
And I believe that that's where the arms race
487
in baseball is: who can leverage that information the best.
488
As far as breakthroughs that I'm hoping for, I don't know, maybe I could probably change
my answer if you would like me to change it.
489
But I actually think that not necessarily on the research side, but the quality of the
computer vision algorithms and the player, the tracking, I'm hoping that
490
breakthroughs occur there and maybe even more specifically the speed at which those
algorithms or those models are processed.
491
And I guess that's for two reasons.
492
First of all, when we're talking about elbow torque, the difference of one inch of the
wrist placement is exponentially more in degrees, which is exponentially more in force or
493
torque.
494
And so
495
if a model misses by an inch, that's significant.
496
And that's a very high standard for a computer vision model.
497
It's a high standard for the human eye.
498
But ultimately, if you want to get the most precise information possible, that's where I
think some of the innovation will come from.
499
And then in a practice setting, there's a lot of research around
500
feedback loops and skill acquisition and basically being able to provide a target and then
just providing that player with feedback of whether or not they hit that target and how
501
far they were.
502
And just given the complexity of the computer vision models and the size and the compute
power, those results, those biomechanical results don't come back for an hour or two,
503
which is fast, but it's not, you know...
504
we could use it inside of a minute, right?
505
To really get to apply it in a practice setting.
506
And so, yeah, those are maybe not specific kinesiology or physiology innovations, but I'm
hoping that somebody can figure that out in the next several years.
507
Yeah, I mean, yeah, for sure.
508
I agree, that sounds absolutely amazing.
509
So listeners, you've heard Jacob:
510
get going on that if you're a fan of computer vision algorithms in baseball.
511
That would definitely be used by the Astros,
512
and I'm guessing a lot of other teams.
513
Yeah, that's super cool.
514
I completely agree.
515
And, well, I think that's the show, Jacob.
516
I mean, I think we've already covered a lot of topics.
517
Before we close up,
518
I have the last two questions I ask everybody, of course, at the end of the show.
519
But is there a topic you would have liked to mention but I failed to ask you about?
520
I actually think that we covered it.
521
I mean, these are probably my three favorite topics: baseball, human performance, and
maybe statistical methods.
522
So I think we hit on it all.
523
Well, I'm glad
524
to hear that.
525
So then, let's play us out.
526
First question, if you had unlimited time and resources, which problem would you try to
solve?
527
Yeah, I...
528
Is this specific to my field or just in general?
529
No, just in general.
530
Yeah, with unlimited time and resources, I'd probably dive into that.
531
But then more specifically, like in my field, I would absolutely love to be able to solve
the elbow injury risk with pitchers.
532
I think it's something that is just an extremely complex problem.
533
And I very much enjoy complex problems.
534
And there's an extremely high return on investment.
535
I think, for someone who can help with that.
536
Yeah, I mean, I'm really impressed, because the players play such a number of games per
year.
537
It's absolutely incredible.
538
Me, honestly, I was anchored on European sports teams.
539
So, like, in football,
540
I mean, soccer, they will play, at most,
541
let's say 50, 60 games per season; rugby is less.
542
So like, yeah, when I started working in baseball and I saw the number of games that these
guys play per year at such a high level, I'm honestly surprised that they don't get
543
injured more often.
544
And yeah, like I understand why you're saying the elbow injury because like, yeah, that
was one of my first, that was one of my first questions when I started looking today.
545
It was like, damn, but the pitchers must throw, I don't know how many thousand balls in
each season.
546
And that's not even counting the training.
547
So the amount of joint pain and risk that you have with that is absolutely incredible.
548
I really don't know how they don't get injured more often, to be honest.
549
Yeah, I agree.
550
What they do and what they go through is impressive.
551
Yeah, 162, that's a lot of games.
552
damn.
553
And actually, maybe one last question before the very last question:
is the pitcher position really the one that's the most at risk for injury, or is
554
that pretty much
555
widespread across the positions? Or do you have some positions that are much more
prone to injury?
556
No, I mean, it's pretty centralized at the pitcher position.
557
There are definitely injury risks all over the field, but in terms of the biggest, I mean,
the injury risk on the mound is exponentially higher than any other injury.
558
I think if we look at
559
the game of baseball, the throwing motion is probably the only one that truly pushes
the limits of the human body.
560
You know, in sprinting, no offense to any of my baseball players, love you guys, but
they're not the fastest in the world.
561
You know, they're not pushing that barrier.
562
They're not the strongest, you know, in the world, but that right or left arm and the
delivery is moving, you know, the fastest in the world.
563
And so I think that's the one that pushes the boundaries the most.
564
Yeah.
565
Okay.
566
Interesting.
567
Yeah.
568
I mean, I'm not surprised, but that's good to hear you say that.
569
Yeah.
570
I mean, the, you know, amount of pitches they have to make is just, like, I
can't believe that.
571
It's just absolutely incredible.
572
And also, if you have any baseball players listening to that episode, well done.
573
That's impressive.
574
Like, let me know if you have some
575
Houston Astros players listening to that episode.
576
That's great publicity.
577
We need to, you know, advertise that on social media.
578
It's like that.
579
That'd be quite amazing.
580
Actually, you know, do we have any study about pitchers who retire and
how their joints age?
581
Because I know for US football, for instance, they can still be
at a high injury risk even after their professional career.
582
Is that the case also in baseball?
583
Or do we not know about that?
584
You know, that's a good question.
585
I'm not gonna say that there's not research.
586
I haven't... you know, there could be something that I'm not aware of.
587
But I haven't personally read, you know, any research on it.
588
So, yeah, I'm not aware of any.
589
Okay, yeah.
590
Yeah, I'd be interested in that.
591
If anybody in the audience knows about that, let us know.
592
And well, finally, last question for you, Jacob.
593
If you could have dinner with any great scientific mind, dead, alive, or fictional, who
would it be?
594
Oh, man.
595
You know, I'm going to go with fictional.
596
And I'm going to go with Tony Stark as Iron Man.
597
I'm a big fan of the Marvel movies.
598
And so I think he's the one that I'd like to have dinner with.
599
That's a great answer.
600
I have never had that one on the show.
601
So yeah, you're the first one.
602
But I understand.
603
That's definitely my favorite of all the Marvel superheroes.
604
So yeah, I love that.
605
Yeah, that would
606
definitely be super cool.
607
Would you ask him if you can fly the iron suit?
608
Definitely.
609
And I'd hope that he would say no, but I would have to ask.
610
Yeah, I mean, yeah, for sure.
611
Yeah, I would definitely ask.
612
Like you should probably also ask if you could play baseball with the iron suit.
613
That'd be probably super fun.
614
Yeah, that might be my only chance to make it professionally.
615
Yeah, I mean, with the iron suit,
616
you must throw pretty fast.
617
So, like, you should think about that, Jacob.
618
That would mitigate injury risk a lot.
619
Yeah, probably.
620
Well, on that note, I think it's the perfect time to close.
621
So thank you so much, Jacob.
622
That was a pleasure to have you on the show.
623
Thanks again, JJ, for putting us in contact.
624
As usual,
625
we'll add links to your website and socials and any resource that you think is interesting
for listeners who want to dig deeper and start learning about baseball science, sports
626
science in general, and baseball analytics.
627
Thanks again, Jacob, for taking the time and being on this show.
628
Thank you very much, Alex.
629
I really enjoyed it.
630
This has been another episode of Learning Bayesian Statistics.
631
Be sure to rate, review, and follow the show on your favorite podcatcher, and visit
learnbayesstats.com for more resources about today's topics, as well as access to more
632
episodes to help you reach a true Bayesian state of mind.
633
That's learnbayesstats.com.
634
Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran.
635
Check out his awesome work at bababrinkman.com.
636
I'm your host,
637
Alex Andorra.
638
You can follow me on Twitter at alex_andorra, like the country.
639
You can support the show and unlock exclusive benefits by visiting
patreon.com/learnbayesstats.
640
Thank you so much for listening and for your support.
641
You're truly a good Bayesian. Change your predictions after taking information in, and if
you're thinking I'll be less than amazing,
642
let's adjust those expectations.
643
Let me show you how to be a good Bayesian. Change calculations after taking fresh data in.
Those predictions that your brain is making, let's get them on a solid foundation.