Cluster analysis of T20 batsmen and bowlers

Imran Khan analyses cluster match-ups in T20 cricket. 

T20s offer a concentrated view of the task facing batsmen in cricket: balancing the reward of scoring runs from a limited number of balls against the risk of losing their wicket.  In this article we compare how different clusters of batsmen and bowlers interact.  Batsmen are clustered by their strike rate (runs scored per 100 balls faced) and wicket percentage, i.e. the proportion of balls faced that lead to a dismissal.  Bowlers are clustered by their economy rate (runs conceded per over) and strike rate (balls bowled per wicket).

For example, what happens when high quality batsmen, those with high strike rates and low wicket percentages, come up against high quality bowlers, those with low economy rates and low strike rates?  By how much do high quality batsmen dominate the lowest quality bowlers, those with high economy rates and high strike rates?
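The four measures used above can be sketched as follows. This is a minimal illustration assuming the standard cricket conventions (batting strike rate as runs per 100 balls, economy rate as runs per six-ball over, bowling strike rate as balls per wicket); the function names are hypothetical and not taken from the original analysis.

```python
def batting_strike_rate(runs, balls_faced):
    """Runs scored per 100 balls faced."""
    return 100 * runs / balls_faced


def wicket_percentage(dismissals, balls_faced):
    """Percentage of balls faced that end in a dismissal."""
    return 100 * dismissals / balls_faced


def economy_rate(runs_conceded, balls_bowled):
    """Runs conceded per over, assuming six balls per over."""
    return 6 * runs_conceded / balls_bowled


def bowling_strike_rate(balls_bowled, wickets):
    """Balls bowled per wicket taken."""
    return balls_bowled / wickets
```

For instance, 10,137 runs from 6,807 balls gives a batting strike rate of about 148.9, and 8,784 runs conceded from 6,535 balls with 367 wickets gives an economy rate of about 8.06 and a bowling strike rate of about 17.8, in line with the figures listed for CH Gayle and DJ Bravo in the tables below.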

Data

We consider all batsmen who have faced at least 500 balls in T20 cricket (of which there are 718) and all bowlers with at least 500 balls bowled (771) for our analysis.  The table below shows the top 50 run-scorers with their strike rates and wicket percentages.

batsman            runs   balls faced  strike rate  wicket %
CH Gayle           10137  6807         148.9        3.7
BB McCullum        7690   5555         138.4        4.4
DA Warner          7562   5272         143.4        4.0
BJ Hodge           7337   5590         131.3        3.6
KA Pollard         7309   4834         151.2        5.0
DR Smith           6921   5429         127.5        4.9
Shoaib Malik       6909   5602         123.3        3.3
SK Raina           6872   4936         139.2        4.2
V Kohli            6822   5152         132.4        3.3
RG Sharma          6673   5085         131.2        4.0
LJ Wright          6296   4346         144.9        5.1
G Gambhir          6145   5053         121.6        4.1
DJ Hussey          6097   4579         133.2        4.4
KC Sangakkara      5974   4750         125.8        4.4
AB de Villiers     5806   4030         144.1        4.3
AJ Finch           5706   4167         136.9        4.0
SR Watson          5647   4071         138.7        4.8
RN ten Doeschate   5642   4153         135.9        4.6
OA Shah            5509   4405         125.1        3.8
DPMD Jayawardene   5455   4134         132.0        4.5
RV Uthappa         5447   4046         134.6        4.6
MS Dhoni           5342   3968         134.6        3.7
DJ Bravo           5276   4223         124.9        5.0
KP Pietersen       5258   3843         136.8        3.9
TM Dilshan         5193   4229         122.8        4.5
JP Duminy          5182   4236         122.3        3.2
MJ Guptill         5162   4020         128.4        3.8
SE Marsh           5147   3992         128.9        3.3
LRPL Taylor        5090   3785         134.5        4.3
LMP Simmons        5084   4389         115.8        3.9
S Dhawan           5017   4193         119.7        3.8
CL White           4964   3872         128.2        4.2
RS Bopara          4964   4190         118.5        4.5
MJ Lumb            4955   3589         138.1        5.7
DA Miller          4926   3585         137.4        4.0
Umar Akmal         4902   3715         132.0        4.4
EJG Morgan         4737   3720         127.3        4.7
Ahmed Shehzad      4574   3680         124.3        4.4
MEK Hussey         4569   3658         124.9        3.3
JH Kallis          4416   3927         112.5        3.6
Yuvraj Singh       4413   3345         131.9        5.0
RE Levi            4386   3049         143.9        5.1
Mohammad Hafeez    4378   3552         123.3        5.2
M Klinger          4368   3438         127.1        3.3
YK Pathan          4301   2999         143.4        5.2
Kamran Akmal       4220   3297         128.0        5.0
AM Rahane          4167   3515         118.5        3.9
KD Karthik         4146   3231         128.3        5.2
Azhar Mahmood      4091   3023         135.3        5.3
V Sehwag           4061   2747         147.8        5.5

The corresponding table for bowlers ordered by wickets is below.

player           balls  conceded  wickets  economy rate  strike rate
DJ Bravo         6535   8784      367      8.06          17.8
SL Malinga       5224   5941      321      6.82          16.3
SP Narine        5372   5273      282      5.89          19.0
Yasir Arafat     4702   6344      281      8.10          16.7
Shakib Al Hasan  4923   5560      265      6.78          18.6
Saeed Ajmal      4184   4534      263      6.50          15.9
AC Thomas        4558   5739      263      7.55          17.3
Sohail Tanvir    5317   6438      260      7.26          20.4
Shahid Afridi    5250   5808      259      6.64          20.3
Azhar Mahmood    4825   6143      258      7.64          18.7
DP Nannes        4624   5719      257      7.42          18.0
JA Morkel        4858   6179      240      7.63          20.2
KA Pollard       4094   5519      233      8.09          17.6
SW Tait          3674   4891      218      7.99          16.9
A Mishra         3880   4621      211      7.15          18.4
AD Russell       3985   5367      209      8.08          19.1
Harbhajan Singh  4728   5262      204      6.68          23.2
DW Steyn         3975   4423      201      6.68          19.8
R Ashwin         4150   4665      200      6.74          20.8
PP Chawla        3900   4841      194      7.45          20.1
SR Watson        3897   5054      194      7.78          20.1
Imran Tahir      3425   4028      191      7.06          17.9
Umar Gul         2956   3636      189      7.38          15.6
M Muralitharan   3705   3945      179      6.39          20.7
NLTC Perera      3275   4594      178      8.42          18.4
KK Cooper        3008   3579      176      7.14          17.1
M Morkel         3550   4409      174      7.45          20.4
JP Faulkner      3088   4088      170      7.94          18.2
RS Bopara        3418   4199      170      7.37          20.1
IK Pathan        3475   4378      169      7.56          20.6
S Badree         3585   3439      168      5.76          21.3
JS Patel         3377   3916      167      6.96          20.2
CH Morris        2842   3583      166      7.56          17.1
GR Napier        2905   3831      164      7.91          17.7
A Nehra          2821   3624      162      7.71          17.4
R Vinay Kumar    3185   4216      162      7.94          19.7
BAW Mendis       2617   3008      159      6.90          16.5
Wahab Riaz       2862   3426      157      7.18          18.2
NL McCullum      3162   3784      157      7.18          20.1
PP Ojha          2867   3437      156      7.19          18.4
MJ McClenaghan   2553   3472      155      8.16          16.5
WD Parnell       3093   3902      155      7.57          20.0
DJG Sammy        3276   4287      154      7.85          21.3
Naved-ul-Hasan   2561   3060      153      7.17          16.7
J Botha          3875   4189      152      6.49          25.5
AD Mascarenhas   2647   3133      152      7.10          17.4
DT Christian     3164   4321      150      8.19          21.1
Mohammad Hafeez  3101   3216      148      6.22          21.0
SR Patel         3270   3889      148      7.14          22.1
AU Rashid        2655   3316      147      7.49          18.1

Ball-by-ball data is obtained from 2,190 T20 matches consisting of 493,053 balls.  We only consider those balls involving the batsmen and bowlers specified above resulting in 387,583 balls in total, from which 452,856 runs are scored and 18,867 wickets are taken by bowlers.
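The filtering step described above could look something like the sketch below. It assumes each delivery is a dictionary with hypothetical `batsman` and `bowler` keys, and that every delivery counts as one ball; the article does not say how wides or no-balls are treated, so that part is an assumption.

```python
from collections import Counter


def qualifying_players(balls, min_balls=500):
    """Return the sets of batsmen and bowlers with at least min_balls deliveries.

    `balls` must be a list (it is iterated twice).
    """
    faced = Counter(ball["batsman"] for ball in balls)
    bowled = Counter(ball["bowler"] for ball in balls)
    batsmen = {name for name, n in faced.items() if n >= min_balls}
    bowlers = {name for name, n in bowled.items() if n >= min_balls}
    return batsmen, bowlers


def matchup_balls(balls, batsmen, bowlers):
    """Keep only deliveries where both players are in the qualifying sets."""
    return [
        ball for ball in balls
        if ball["batsman"] in batsmen and ball["bowler"] in bowlers
    ]
```

The threshold is left as a parameter so that the same sketch can be tried on a small sample of deliveries.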

Clusters

Batsmen

For each batsman we specify whether they have a high, average or low strike rate and wicket percentage, which gives us 9 clusters in total.  The splits are made at the one-third and two-thirds percentiles.  For example, the top 239 batsmen (1/3 * 718) ordered by strike rate are deemed to have a high strike rate, the next 239 an average strike rate, and so on.  Each band therefore contains roughly a third of the batsmen; the 9 clusters are of similar size only insofar as strike rate and wicket percentage are unrelated.

The graph above shows the 9 clusters labelled A to I.  The two dark grey lines indicate the median strike rate and wicket percentage across all batsmen.  The majority of batsmen lie within a narrow band of strike rate and wicket percentage, which is why the high and low clusters span such wide ranges.

We can say that the highest quality batsmen lie in cluster I: they score quickly and do not lose their wicket often.  Examples of batsmen in this cluster include: CH Gayle, BB McCullum, DA Warner, BJ Hodge, SK Raina, V Kohli, RG Sharma, DJ Hussey, AB de Villiers, AJ Finch.

On the other hand, cluster A contains the lowest quality batsmen: they score slowly and get dismissed often.  There is, however, an argument that cluster G batsmen are worse in T20s, as they score slowly and use up a lot of balls.  Examples of cluster A batsmen include: J Botha, JL Ontong, CF Hughes, NJ Dexter, GO Jones, GM Smith, AN Kervezee, Yasir Arafat, NRD Compton, R McLaren.
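The banding described in this section can be sketched roughly as below. The rank-based split follows the "top third, next third, remainder" description; the letter layout is an assumption inferred from the clusters the article names (A is low strike rate and high wicket percentage, C is high and high, G is low and low, I is high and low, F is a high strike rate cluster), so the positions of B, D, E and H are a guess. All function names are hypothetical.

```python
def tercile_bands(values):
    """Label each value 'high', 'average' or 'low' by rank, a third at a time."""
    n = len(values)
    third = n // 3
    # Indices ordered from the largest value to the smallest.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    bands = [None] * n
    for rank, i in enumerate(order):
        if rank < third:
            bands[i] = "high"
        elif rank < 2 * third:
            bands[i] = "average"
        else:
            bands[i] = "low"
    return bands


# ASSUMED layout: rows run from high to low wicket percentage,
# columns from low to high strike rate, lettered A to I row by row.
ROW = {"high": 0, "average": 1, "low": 2}   # wicket percentage band
COL = {"low": 0, "average": 1, "high": 2}   # strike rate band


def batsman_cluster(strike_rate_band, wicket_pct_band):
    """Map a pair of bands to a cluster letter under the assumed layout."""
    return "ABCDEFGHI"[3 * ROW[wicket_pct_band] + COL[strike_rate_band]]


def cluster_batsmen(strike_rates, wicket_pcts):
    """Cluster letters for parallel lists of strike rates and wicket percentages."""
    sr_bands = tercile_bands(strike_rates)
    wk_bands = tercile_bands(wicket_pcts)
    return [batsman_cluster(s, w) for s, w in zip(sr_bands, wk_bands)]
```

With 718 batsmen this puts 239 in the high band, 239 in the average band and the remaining 240 in the low band on each measure.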

Bowlers

We categorise bowlers in much the same way – 9 clusters made up of combinations of high, low or average economy rate and strike rate.  The graph below shows the clusters.  The two dark grey lines indicate the median value of the economy rate and strike rate for all bowlers.

The highest quality bowlers are in cluster G – low economy rate and low strike rate.  Examples of bowlers in this cluster are: SL Malinga, Saeed Ajmal, A Mishra, Imran Tahir, KK Cooper, BAW Mendis, Naved-ul-Hasan, AD Mascarenhas, K Santokie, Mohammad Nabi

Cluster C bowlers (high economy rate and high strike rate) are quite clearly the lowest quality and include the likes of: R McLaren, RA Jadeja, MC Henriques, DA Cosker, JEC Franklin, Anwar Ali, KJ Abbott, I Sharma, JDP Oram, NE Mbhalati.

Cluster interactions

We can now investigate what happens when certain clusters of batsmen and bowlers face each other.  So what does happen when the best batsmen face the best bowlers?

It turns out that they pretty much cancel each other out.  Cluster I batsmen score at 6.97 runs per over and average 26.4 against cluster G bowlers, close to the mean economy rate and average across the whole dataset of 7.01 and 24.0.

The best batsmen generally do dominate the worst bowlers (clusters I vs. C).  They score at nearly 8 runs per over and average over 40 – the highest average out of all the 81 cluster combinations.

Interestingly, the highest economy rate (8.20) occurs when batsman cluster C (high strike rate but high wicket percentage) faces bowler cluster I (high economy rate but low strike rate).  The relatively low average of 19.0 is not surprising: these are batsmen who make the most of their brief time at the crease, hitting expensive bowlers who still end up picking up wickets.

Finally, the lowest economy rate occurs between batsman cluster A (low strike rate and high wicket percentage) and bowler cluster D (low economy rate and average strike rate).

The full breakdown is below.

batsman cluster  bowler cluster  balls  runs   dismissals  economy rate  average
A                I               2099   2138   165         6.11          13.0
A                C               1255   1340   78          6.41          17.2
A                B               1433   1497   70          6.27          21.4
A                H               1583   1559   108         5.91          14.4
A                G               1101   1005   87          5.48          11.6
A                E               1969   1884   138         5.74          13.7
A                F               2175   2336   131         6.44          17.8
A                A               1137   1172   54          6.18          21.7
A                D               1380   1218   90          5.30          13.5
B                I               2792   3210   203         6.90          15.8
B                C               1981   2388   113         7.23          21.1
B                E               3115   3334   208         6.42          16.0
B                F               3195   3753   204         7.05          18.4
B                B               2531   2662   144         6.31          18.5
B                H               2551   2759   183         6.49          15.1
B                D               2683   2687   173         6.01          15.5
B                G               1885   1839   147         5.85          12.5
B                A               1680   1741   93          6.22          18.7
C                G               3038   3622   227         7.15          16.0
C                B               3753   4681   187         7.48          25.0
C                D               3732   4274   237         6.87          18.0
C                C               3966   5251   203         7.94          25.9
C                E               5368   6437   340         7.19          18.9
C                A               2749   3166   145         6.91          21.8
C                F               5438   7423   354         8.19          21.0
C                I               5555   7594   400         8.20          19.0
C                H               4511   5611   323         7.46          17.4
D                G               1063   988    74          5.58          13.4
D                H               1175   1171   74          5.98          15.8
D                I               1093   1181   62          6.48          19.0
D                E               1724   1687   87          5.87          19.4
D                D               1524   1375   84          5.41          16.4
D                F               1267   1307   60          6.19          21.8
D                B               1266   1244   63          5.90          19.7
D                A               1211   1169   46          5.79          25.4
D                C               1066   1118   45          6.29          24.8
E                A               3903   3996   175         6.14          22.8
E                F               6883   8135   359         7.09          22.7
E                B               5814   6569   243         6.78          27.0
E                E               6738   7339   358         6.54          20.5
E                I               5808   6688   348         6.91          19.2
E                D               5525   5451   292         5.92          18.7
E                H               4952   5437   260         6.59          20.9
E                G               3599   3751   222         6.25          16.9
E                C               4702   5798   180         7.40          32.2
F                A               4400   5216   202         7.11          25.8
F                D               7492   8393   351         6.72          23.9
F                G               4817   5545   281         6.91          19.7
F                B               7681   9683   316         7.56          30.6
F                H               7710   9800   468         7.63          20.9
F                I               8775   11586  519         7.92          22.3
F                F               9649   12956  480         8.06          27.0
F                C               6523   8820   294         8.11          30.0
F                E               10007  12085  523         7.25          23.1
G                A               2690   2659   99          5.93          26.9
G                D               3677   3482   174         5.68          20.0
G                B               3573   3655   135         6.14          27.1
G                H               3373   3386   162         6.02          20.9
G                I               3169   3519   135         6.66          26.1
G                C               3006   3466   103         6.92          33.7
G                E               4803   4965   207         6.20          24.0
G                G               2718   2554   130         5.64          19.6
G                F               3816   4154   163         6.53          25.5
H                E               10811  12277  460         6.81          26.7
H                H               7718   8415   392         6.54          21.5
H                D               8539   8862   332         6.23          26.7
H                I               8733   10818  396         7.43          27.3
H                A               5918   6428   223         6.52          28.8
H                F               9079   11013  350         7.28          31.5
H                B               9095   10188  320         6.72          31.8
H                G               5544   5908   260         6.39          22.7
H                C               7078   8559   254         7.26          33.7
I                G               6110   7092   269         6.97          26.4
I                D               10542  11851  446         6.75          26.6
I                H               9273   11469  387         7.42          29.6
I                A               6462   7233   197         6.72          36.7
I                I               9777   12893  433         7.91          29.8
I                B               9937   12403  357         7.49          34.7
I                E               12792  16173  529         7.59          30.6
I                F               10123  13525  411         8.02          32.9
I                C               7275   9614   239         7.93          40.2
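A breakdown of this shape can be built by tallying deliveries per cluster pair. The sketch below assumes each delivery has already been reduced to a hypothetical tuple of (batsman cluster, bowler cluster, runs, whether the bowler took a wicket); how the original analysis handled extras is not stated, so the per-ball run value is an assumption.

```python
from collections import defaultdict


def summarise(n_balls, runs, dismissals):
    """Economy rate (runs per six balls) and average (runs per dismissal)."""
    return {
        "balls": n_balls,
        "runs": runs,
        "dismissals": dismissals,
        "economy": round(6 * runs / n_balls, 2),
        # The average is undefined when there are no dismissals.
        "average": round(runs / dismissals, 1) if dismissals else None,
    }


def aggregate_matchups(balls):
    """Tally balls, runs and dismissals for each (batsman, bowler) cluster pair."""
    totals = defaultdict(lambda: [0, 0, 0])  # balls, runs, dismissals
    for bat_cluster, bowl_cluster, runs, is_wicket in balls:
        tally = totals[(bat_cluster, bowl_cluster)]
        tally[0] += 1
        tally[1] += runs
        tally[2] += int(is_wicket)
    return {pair: summarise(*tally) for pair, tally in totals.items()}
```

As a check against the figures quoted in the article, `summarise(7275, 9614, 239)` gives an economy rate of 7.93 and an average of 40.2 for batsman cluster I against bowler cluster C, and `summarise(387583, 452856, 18867)` gives the dataset-wide 7.01 and 24.0.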

Individual batsmen and bowlers

We can also see how individual players perform against certain clusters of batsmen or bowlers.  For example, the table below shows Kieron Pollard’s performance broken down by bowler cluster.

bowler cluster  runs  dismissals  balls  strike rate  average
A               150   6           125    120.0        25.0
B               349   7           218    160.1        49.9
C               322   14          237    135.9        23.0
D               469   26          426    110.1        18.0
E               440   23          398    110.6        19.1
F               626   21          413    151.6        29.8
G               374   14          274    136.5        26.7
H               521   20          362    143.9        26.0
I               435   17          307    141.7        25.6

Pollard scores most heavily against bowler cluster B (average economy rate, high strike rate), with a strike rate of 160.1.  However, his scoring is most restricted by cluster D bowlers (low economy rate, average strike rate), against whom his strike rate falls to 110.1.
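The same comparison can be reproduced from the raw counts in Pollard's table. The sketch below copies the runs, dismissals and balls from that table and recomputes the strike rate per bowler cluster; the variable and function names are illustrative only.

```python
# (bowler cluster, runs, dismissals, balls) copied from Pollard's table
POLLARD = [
    ("A", 150, 6, 125),
    ("B", 349, 7, 218),
    ("C", 322, 14, 237),
    ("D", 469, 26, 426),
    ("E", 440, 23, 398),
    ("F", 626, 21, 413),
    ("G", 374, 14, 274),
    ("H", 521, 20, 362),
    ("I", 435, 17, 307),
]


def strike_rates(rows):
    """Batting strike rate (runs per 100 balls) for each bowler cluster."""
    return {cluster: round(100 * runs / balls, 1) for cluster, runs, _, balls in rows}


rates = strike_rates(POLLARD)
best = max(rates, key=rates.get)    # cluster he scores fastest against
worst = min(rates, key=rates.get)   # cluster that restricts him most
print(best, rates[best])
print(worst, rates[worst])
```

Swapping in economy rate and a bowler's rows would give the equivalent view for a bowler against each batsman cluster.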

batsman cluster  runs  dismissals  balls  economy rate  average
A                39    4           52     4.50          9.8
B                224   19          263    5.11          11.8
C                287   27          316    5.45          10.6
D                34    7           70     2.91          4.9
E                342   21          384    5.34          16.3
F                584   25          578    6.06          23.4
G                166   9           190    5.24          18.4
H                543   18          545    5.98          30.2
I                928   42          927    6.01          22.1

Sunil Narine’s performance against the different batsman clusters is shown above.  He performs noticeably best against cluster D, against whom he averages just 4.9 and has an economy rate of 2.91.  He performs relatively worse against clusters F and I (high strike rate batsmen), although he still concedes only around a run a ball.

Further work

This kind of analysis has shown how we can identify which types of batsmen score quickly and freely, as well as which bowlers can be targeted whilst minimising the risk of losing a wicket.  Teams can tweak their batting order or bowling strategy to ensure the optimal batsman or bowler is in play at any one time.  We can refine this further by considering the type of bowler and the stage of the innings.

Imran Khan is on Twitter @cricketsavant
