The format of the data file
The data provided on this site is in a YAML format. This document describes the organisation of the file and attempts to give some idea of the possible contents for each field. The version of the file documented here is version 0.4. You can also see version 0.3 of the format.
meta Required
meta:
  data_version: 0.4
  created: 2009-08-15
  updated: 1265923616
data_version Required
The version of the data format the file contains. For example, the document you’re reading details version 0.4 of the data format. The actual content of this field will be in the form X.Y, where X is the major version number, and Y indicates the revision. For example, 1.2 indicates the file refers to the 2nd revision of version 1.
The major version number will only be updated when there are substantial, or critical, changes to the format of the file. For minor changes, such as the addition of a new field, only the revision will be updated. Ideally minor changes should be able to happen without affecting anything which is using the data.
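As a rough sketch of how a consumer might act on this versioning scheme, the Python below splits the version into its major number and revision. The name parse_data_version and the SUPPORTED_MAJOR constant are hypothetical. The sketch also assumes the value may arrive as a floating-point number, since a YAML parser is likely to load 0.4 that way.

```python
def parse_data_version(value):
    """Split a data_version such as 0.4 into (major, revision)."""
    # A YAML loader may hand back a float, so normalise to text first.
    major, _, revision = str(value).partition(".")
    return int(major), int(revision or 0)


SUPPORTED_MAJOR = 0  # assumption: this consumer was written against 0.x files

major, revision = parse_data_version("0.4")
# Only the major number signals a breaking change; revisions are additive.
compatible = (major == SUPPORTED_MAJOR)
```

One caveat: if the value has already been loaded as a float, a revision with a trailing zero (for example 1.10) cannot be told apart from 1.1, so reading the field as text is the safer choice where the parser allows it.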
created Required
The date on which the data file was first created. This will be in the format YYYY-MM-DD, for example 2010-12-05.
updated Optional
A unix time indicating when this data file was updated. This will only appear if the file has been recreated after the initial creation.
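The two timestamps use different representations, so a reader may want to normalise them. The following is a minimal sketch, assuming the meta section has already been parsed into a Python dict. The dict literal simply mirrors the example above, and interpreting the unix time as UTC is an assumption.

```python
from datetime import date, datetime, timezone

meta = {"data_version": "0.4", "created": "2009-08-15", "updated": 1265923616}

# created is a YYYY-MM-DD date; some YAML loaders already return a date object.
created = meta["created"]
if isinstance(created, str):
    created = date.fromisoformat(created)

# updated is optional, so fall back to None when it is absent.
updated = meta.get("updated")
if updated is not None:
    # Assumption: the unix time is interpreted as UTC.
    updated = datetime.fromtimestamp(updated, tz=timezone.utc)
```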
info Required
info:
  city: Cape Town
  dates:
    - 2011-01-02
    - 2011-01-03
    - 2011-01-04
    - 2011-01-05
    - 2011-01-06
  match_type: Test
  outcome:
    by:
      runs: 6
    winner: Chennai Super Kings
  teams:
    - South Africa
    - India
  toss:
    decision: field
    winner: India
  umpires:
    - IJ Gould
    - SJA Taufel
  venue: Newlands
city Optional
The city in which the game took place.
dates Required
The dates on which the game took place. If there is just one day, for example in a T20 match, then it will be an array containing just that one date.
match_type Required
The type of match this data file refers to. Currently the possible values are Test, ODI or T20.
neutral_venue Optional
If this is in the file then the value will be 1. This indicates that the game was played at a venue that neither team would regard as a home ground.
outcome Required
The data for this field is an associative array which details the outcome of the match this data file refers to. It contains information such as which team won the match, whether the game was a draw, tie, or no result, and any margin of victory.
by Optional
This entry is an associative array which details how much the winning team won by. Its possible keys are innings, runs, and wickets.
innings Optional
If the match was won by an innings (for example, by an innings and a number of runs) then this entry will appear with a value of 1, for the 1 innings.
runs Optional
If the match was won by a number of runs, or an innings and a number of runs, then this will contain the runs.
wickets Optional
If the match was won by a number of wickets, then this will contain the number of wickets.
eliminator Optional
This field will list the winner of any elimination bowl-out which decides a tie in a T20 match.
method Optional
This field will detail any method used to determine the winner where a match has been curtailed for some reason. Currently the only value this will contain is D/L, used when a match is decided by the Duckworth-Lewis method.
result Optional
The result of the match if the match was not won by one of the teams. Currently the possible values are draw, no result or tie.
winner Optional
If a team won the match then the name of the team will be here.
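To show how the outcome fields combine, here is a hedged sketch that turns an outcome associative array into a one-line summary. The function name describe_outcome and the wording of the summaries are illustrative only. It also assumes that an eliminator, when present, accompanies a result rather than a winner.

```python
def describe_outcome(outcome):
    """Turn an outcome associative array into a one-line summary."""
    winner = outcome.get("winner")
    if winner is None:
        # No winner: result holds draw, no result or tie.
        text = outcome.get("result", "unknown")
        if "eliminator" in outcome:
            text += f" ({outcome['eliminator']} won the eliminator)"
        return text

    by = outcome.get("by", {})
    parts = []
    if by.get("innings"):
        parts.append("an innings")
    if "runs" in by:
        parts.append(f"{by['runs']} runs")
    if "wickets" in by:
        parts.append(f"{by['wickets']} wickets")
    margin = " and ".join(parts)
    text = f"{winner} won by {margin}" if margin else f"{winner} won"
    if "method" in outcome:
        text += f" ({outcome['method']} method)"
    return text


# The outcome from the info example above.
summary = describe_outcome({"by": {"runs": 6}, "winner": "Chennai Super Kings"})
```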
overs Optional
If the match_type is either ODI or T20 then this will indicate how many overs there are in each innings. Likely to be 50 for ODI matches, or 20 for T20 matches.
player_of_match Optional
If this field appears then it will contain an array of any players who were adjudged to be the player of the match.
teams Required
An array containing the teams who played in the match. There will always be two entries. If the match is not being played on a neutral venue the first team listed will be the home team.
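The home-team rule can be expressed in a few lines. This is only a sketch: home_team is a hypothetical helper, and it assumes the info section has been parsed into a Python dict.

```python
def home_team(info):
    """Return the home team, or None when the venue is neutral."""
    teams = info["teams"]
    if len(teams) != 2:
        raise ValueError("expected exactly two teams")
    # neutral_venue only appears (with the value 1) for neutral grounds.
    if info.get("neutral_venue"):
        return None
    return teams[0]


host = home_team({"teams": ["South Africa", "India"]})
```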
toss Required
winner Required
This will list the team which won the toss, and will be one of the teams listed as playing this match.
decision Required
The decision made by the team winning the toss. This will be either bat or field.
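Taken together, the two toss fields determine which side bats first. A minimal sketch, assuming a parsed info dict and using the hypothetical helper name team_batting_first:

```python
def team_batting_first(info):
    """Work out which team bats first from the toss details."""
    toss = info["toss"]
    if toss["decision"] == "bat":
        return toss["winner"]
    # The toss winner chose to field, so the other listed team bats first.
    return next(team for team in info["teams"] if team != toss["winner"])


info = {
    "teams": ["South Africa", "India"],
    "toss": {"decision": "field", "winner": "India"},
}
first = team_batting_first(info)
```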
umpires Required
An array containing the umpires who adjudicated in the match. There will always be two entries.
venue Optional
The venue in which the game took place.
innings Required
innings:
  - 1st innings:
      team: Ireland
      deliveries:
        - 0.1:
            batsman: WTS Porterfield
            bowler: IK Pathan
            extras:
              wides: 1
            non_striker: JP Bray
            runs:
              batsman: 0
              extras: 1
              total: 1
        - 0.2:
            batsman: WTS Porterfield
            bowler: IK Pathan
            non_striker: JP Bray
            runs:
              batsman: 0
              extras: 0
              total: 0
  - 2nd innings:
      team: India
      deliveries:
        - 0.1:
            batsman: G Gambhir
            bowler: WB Rankin
            non_striker: RG Sharma
            runs:
              batsman: 4
              extras: 0
              total: 4
An array of associative arrays each representing an innings within the game, in the order in which they took place. Each associative array has a single key (such as ‘1st innings’) which specifies the “name” of the innings and then the value is a further associative array with the details of that innings, such as the team batting, and the deliveries faced within the innings.
A simple example is shown above, covering 2 innings of a one-day match. The first innings shows Ireland facing 2 balls, both with William Porterfield on strike, while the second innings shows 1 ball of the Indian reply.
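Because each innings is wrapped in a single-key associative array, reading the data takes one extra unpacking step. The sketch below walks a trimmed copy of the example above, written out as the Python structure a YAML parser would plausibly produce. The name iter_innings is hypothetical, and the float delivery keys are an assumption about how the parser loads values such as 0.1.

```python
match = {
    "innings": [
        {"1st innings": {"team": "Ireland", "deliveries": [
            {0.1: {"runs": {"batsman": 0, "extras": 1, "total": 1}}},
            {0.2: {"runs": {"batsman": 0, "extras": 0, "total": 0}}},
        ]}},
        {"2nd innings": {"team": "India", "deliveries": [
            {0.1: {"runs": {"batsman": 4, "extras": 0, "total": 4}}},
        ]}},
    ]
}


def iter_innings(match):
    """Yield (name, details) for each innings, in playing order."""
    for entry in match["innings"]:
        # Each entry is a single-key mapping: {name: details}.
        name, details = next(iter(entry.items()))
        yield name, details


summary = [(name, d["team"], len(d["deliveries"]))
           for name, d in iter_innings(match)]
```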
team Required
The team which is batting must be specified in the data for the innings. This will be one of the two teams mentioned in the teams section for the match.
absent_hurt Optional
If this field is provided it will be an array of players who did not take part in the innings due to being absent hurt.
declared Optional
If this is in the file then the value will be 1. This indicates that the innings was declared.
deliveries Required
An array of associative arrays each representing a delivery within the innings, in the order in which they took place. Each associative array has a single key (such as 23.5) which specifies the particular ball (in that case the 5th ball of the 24th over), then the value is a further associative array with the details of that delivery.
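The delivery key can be split into an over number and a ball number as follows. This is a sketch only: split_ball_key is a hypothetical name, and the key is converted to text first because a YAML parser may load 23.5 as a floating-point number.

```python
def split_ball_key(key):
    """Return (over_number, ball_in_over) for a delivery key such as 23.5."""
    over_text, _, ball_text = str(key).partition(".")
    # The part before the dot counts completed overs, so 23.5 is in the 24th over.
    return int(over_text) + 1, int(ball_text)


over, ball = split_ball_key("23.5")
```

If a parser has already turned the key into a float, an over containing ten or more deliveries would be ambiguous (0.10 and 0.1 become the same number), so this helper is most reliable when the keys are read as text.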
batsman Required
The batsman who faced the delivery.
non_striker Required
The player who was the non-striker for the delivery.
bowler Required
The bowler who bowled the delivery.
runs Required
The data for this field is a simple associative array which details the breakdown of the runs scored from the ball. It breaks the runs down to show which the batsman scored, which were extras, and the total scored from the ball. There is also the ability to indicate that a 4 or 6 was not an actual boundary, for example if the batsmen ran a 4.
batsman Required
The total number of runs scored by the batsman off the ball. If the batsman failed to score this will show 0.
extras Required
The total number of runs conceded via extras off the ball. If no extras were conceded this will show 0.
non_boundary Optional
If this is listed against the delivery then the value will be 1. This indicates that the 4 or 6 scored was not via an actual boundary.
total Required
The total number of runs scored off this delivery. If no runs were scored from the delivery then this will display 0.
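The runs breakdown can be summed to give an innings score. The sketch below also counts genuine boundaries by honouring non_boundary. The name innings_total is hypothetical. The check that batsman plus extras equals total is an assumption drawn from the field descriptions, not a guarantee stated by the format.

```python
def innings_total(deliveries):
    """Return (runs scored, true boundaries hit) for a list of deliveries."""
    total = 0
    boundaries = 0
    for entry in deliveries:
        for ball, detail in entry.items():
            runs = detail["runs"]
            # Assumed invariant: batsman runs plus extras make up the total.
            if runs["batsman"] + runs["extras"] != runs["total"]:
                raise ValueError(f"inconsistent runs at ball {ball}")
            total += runs["total"]
            # A 4 or 6 counts as a boundary unless flagged as non_boundary.
            if runs["batsman"] in (4, 6) and not runs.get("non_boundary"):
                boundaries += 1
    return total, boundaries


# Runs taken from the innings example earlier in this document.
ireland = [
    {0.1: {"runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {0.2: {"runs": {"batsman": 0, "extras": 0, "total": 0}}},
]
india = [{0.1: {"runs": {"batsman": 4, "extras": 0, "total": 4}}}]
```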
wicket Optional
If a wicket occurs then this entry will be in the file and will provide details on the wicket, such as which player is out, what type of dismissal it was, and any fielders who were involved.
kind Required
The kind of dismissal that took place. This will be one of bowled, caught, caught and bowled, lbw, stumped, run out, retired hurt, hit wicket, obstructing the field, hit the ball twice, handled the ball, or timed out.
player_out Required
The name of the player who is out.
fielders Optional
Any fielders who were involved in the dismissal. Generally this will be the player who took a catch, or who was involved in a run-out.
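As an illustration of reading the wicket entry, the following sketch collects every dismissal in an innings. The function name dismissals is hypothetical, and the sample data uses invented placeholder names rather than a real match.

```python
def dismissals(deliveries):
    """List (ball, player_out, kind, fielders) for each wicket in an innings."""
    found = []
    for entry in deliveries:
        for ball, detail in entry.items():
            wicket = detail.get("wicket")  # only present when a wicket fell
            if wicket:
                found.append((
                    ball,
                    wicket["player_out"],
                    wicket["kind"],
                    wicket.get("fielders", []),  # fielders is optional
                ))
    return found


# Invented sample deliveries, for illustration only.
sample = [
    {"0.1": {"runs": {"batsman": 0, "extras": 0, "total": 0}}},
    {"0.2": {"runs": {"batsman": 0, "extras": 0, "total": 0},
             "wicket": {"kind": "caught", "player_out": "Batter A",
                        "fielders": ["Fielder C"]}}},
    {"0.3": {"runs": {"batsman": 0, "extras": 0, "total": 0},
             "wicket": {"kind": "bowled", "player_out": "Batter B"}}},
]
wickets = dismissals(sample)
```

Note that retired hurt appears in the list of kinds, so a consumer that needs a count of wickets fallen may want to exclude it.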
extras Optional
If extras were conceded from this delivery then this will indicate how the extras came about. The value of the field will be an associative array with byes, legbyes, noballs, penalty, and wides as the possible keys, and the associated value for each will be the number of runs from each. In the example shown above Irfan Pathan bowled a wide on the first ball of the Irish innings.
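A tally of extras by type across an innings might be sketched like this, using the two Irish deliveries from the example above. The function name extras_by_type is hypothetical.

```python
from collections import Counter


def extras_by_type(deliveries):
    """Sum the extras conceded in an innings, grouped by type."""
    tally = Counter()
    for entry in deliveries:
        for detail in entry.values():
            # extras is optional; a missing entry contributes nothing.
            tally.update(detail.get("extras", {}))
    return tally


ireland = [
    {0.1: {"extras": {"wides": 1},
           "runs": {"batsman": 0, "extras": 1, "total": 1}}},
    {0.2: {"runs": {"batsman": 0, "extras": 0, "total": 0}}},
]
tally = extras_by_type(ireland)
```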