What is Cricsheet?

Cricsheet is Retrosheet for cricket. We provide ball-by-ball data for Test matches, One-day internationals, Twenty20 internationals, some other international T20s, and all Indian Premier League seasons.

At the moment we have ball-by-ball information for 2,705 matches, comprising 340 Test matches, 1,162 One-day internationals, 125 other one-day matches, 456 T20 internationals, 103 international T20s, and 517 IPL matches. These matches feature 51 countries, 11 IPL teams, and 2 representative XIs, and go back as far as 2005.

The most recent matches added to the site were all played on the 9th of February, 2016: the India vs Sri Lanka T20 match, the South Africa vs England ODI match, and the Ireland vs Papua New Guinea T20 match.

Currently we have 83.72% coverage across the entire site, with 98.84% coverage of Test matches, 86.46% coverage of One-day internationals, 92.68% coverage of T20 internationals, and 100.00% coverage of IPL matches.

The data

The data is provided in a number of zip files. One contains all of the matches, and the others contain certain sub-sets, such as matches of a particular type, matches for certain countries or teams, or matches from certain periods of time. We also provide (as an experiment) all T20 internationals and all IPL matches in CSV format. Below is the listing of the data grouped by type of match, or you can see the full set of downloads.

All matches: 2,705 matches, 12.67M
Test matches: 340 matches, 4.37M
One-day internationals: 1,162 matches, 5.03M
One-day matches: 125 matches, 0.51M
T20 internationals: 456 matches, 1.15M
Non-official T20 internationals: 103 matches, 262K
Indian Premier League matches: 517 matches, 1.33M
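
If you would rather not unpack a download by hand, the match files can be read straight out of the zip. The following is only a rough sketch using Python's standard zipfile module. It assumes each match is stored as its own file ending in .yaml inside the archive; the helper names (list_match_files, read_match_text) and the member name 123456.yaml are made up for illustration, and the demo builds a tiny in-memory archive rather than using a real download.

```python
import io
import zipfile


def list_match_files(archive, suffix=".yaml"):
    """Return the sorted names of archive members ending in `suffix`.

    Assumption: one match per file, named with a .yaml extension.
    """
    return sorted(name for name in archive.namelist() if name.endswith(suffix))


def read_match_text(archive, name):
    """Return one archive member decoded as UTF-8 text."""
    with archive.open(name) as handle:
        return handle.read().decode("utf-8")


# Build a small stand-in archive in memory so the sketch runs without a download.
# The member names and contents here are placeholders, not real match data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("123456.yaml", "info:\n  match_type: T20\n")
    demo.writestr("README.txt", "placeholder\n")

# Re-open the same bytes for reading, as you would a downloaded zip file.
with zipfile.ZipFile(buffer) as archive:
    names = list_match_files(archive)
    first_text = read_match_text(archive, names[0])
```

For a real download you would pass the path of the zip file to zipfile.ZipFile in place of the in-memory buffer.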

Using the data

What could you do with the data? Well, that’s up to you really. You could investigate who the best and worst value players in the IPL are. Or see how much difference different non-strikers make to the scoring rate of the people they bat with. Or come up with something completely new that revolutionises cricket, such as finding the equivalent of DIPS (Defense independent pitching statistics) from baseball.

The data format

The data is provided in YAML format, a human-readable data format. There are libraries available to parse this in multiple languages. As for the structure of the file, hopefully it is clear enough when you have a look at the data, although a full description of the format is also available.
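
As a rough illustration of working with a parsed match, here is a minimal Python sketch that totals the runs scored off the bat by each batsman. The layout it relies on is an assumption made for this example (a top-level innings list, each innings holding a deliveries list, and each delivery keyed by its over.ball number with batsman and runs fields); please check it against the full description of the format before relying on it. The function names and the sample data are invented for illustration, and load_match uses the third-party PyYAML library.

```python
from collections import defaultdict


def load_match(path):
    """Parse one match file into plain Python dicts and lists.

    PyYAML is a third-party package, so it is imported here rather than at
    the top of the file; the rest of the sketch works without it.
    """
    import yaml

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle)


def runs_by_batsman(match):
    """Sum the runs credited to each batsman across all innings.

    Assumed layout (illustrative only):
    match["innings"] -> list of {innings name: {"deliveries": [...]}}
    each delivery    -> {over.ball: {"batsman": str, "runs": {"batsman": int}}}
    """
    totals = defaultdict(int)
    for innings in match.get("innings", []):
        for details in innings.values():
            for delivery in details.get("deliveries", []):
                for ball in delivery.values():
                    totals[ball["batsman"]] += ball["runs"]["batsman"]
    return dict(totals)


# Invented sample in the assumed layout, standing in for a parsed match file.
sample_match = {
    "innings": [
        {
            "1st innings": {
                "team": "Team A",
                "deliveries": [
                    {0.1: {"batsman": "A Batter", "runs": {"batsman": 4, "extras": 0, "total": 4}}},
                    {0.2: {"batsman": "A Batter", "runs": {"batsman": 1, "extras": 0, "total": 1}}},
                    {0.3: {"batsman": "B Batter", "runs": {"batsman": 2, "extras": 1, "total": 3}}},
                ],
            }
        }
    ]
}

totals = runs_by_batsman(sample_match)
```

With a real file you would call runs_by_batsman(load_match(path)), where path points at one of the extracted match files.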

How can I help?

Spotting errors in the data

The first way to help is to spot any errors in the data. Ideally we won’t have any, but there’s always the chance, and if we can spot the errors we can fix them and write additional validation to ensure that similar ones don’t slip through.

Helping with missing data

The second way to help is to help us get ball-by-ball data for our missing games. This doesn’t even have to involve finding the data yourself: you may know a contact who can shed light on some matches, or know of someone who has the commentary for a match on tape. Even small bits of info might be enough to put us on the right track.

Blog Entries

We have a blog to which we occasionally post about updates to the data format, additions to the site, or random musings. The most recent entry was “Version updated to 0.6” on the 14th of December, 2014.

Getting in touch

You can contact the project at stephen (at) cricsheet (dot) org. Feel free to get in touch; we love hearing about what people are doing with the data.