The format of the data file

The data provided on this site is in a YAML format. This document describes the organisation of the file and attempts to give some idea of the possible contents for each field. The version of the file documented here is version 0.4. You can also see version 0.3 of the format.

meta Required

meta:
  data_version: 0.4
  created: 2009-08-15
  updated: 1265923616

data_version Required

The version of the data format the file contains. For example, the document you’re reading details version 0.4 of the data format. The actual content of this field will be in the form X.Y, where X is the major version number, and Y indicates the revision. For example, 1.2 indicates the file refers to the 2nd revision of version 1.

The major version number will only be updated when there are substantial, or critical, changes to the format of the file. For minor changes, such as the addition of a new field, only the revision will be updated. Ideally minor changes should be able to happen without affecting anything which is using the data.
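As a rough illustration of the X.Y scheme, the sketch below splits a version into its two parts and checks only the major number. The function names parse_data_version and is_compatible are hypothetical. It assumes the value is handled as a string: a YAML loader may read 0.4 as a floating-point number, which could not tell revision 1 from revision 10.

```python
def parse_data_version(version):
    """Split a data_version such as "1.2" into (major, revision)."""
    major, revision = str(version).split(".")
    return int(major), int(revision)


def is_compatible(version, supported_major):
    """Treat only a change of major version as breaking (an assumption
    based on the description of minor changes above)."""
    major, _ = parse_data_version(version)
    return major == supported_major
```

With this approach a reader written for version 0.3 would still accept a 0.4 file, since only the revision differs.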

created Required

The date on which the data file was first created. This will be in the format YYYY-MM-DD, for example 2010-12-05.

updated Optional

A Unix time (seconds since 1970-01-01 UTC) indicating when this data file was updated. This will only appear if the file has been recreated after the initial creation.
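The two date fields use different representations, so a reader may want to normalise them. The following is a minimal sketch, assuming the meta section has already been loaded into a mapping; parse_meta_dates is a hypothetical helper, and showing the update time in UTC is a choice made here rather than something the format specifies.

```python
from datetime import date, datetime, timezone


def parse_meta_dates(meta):
    """Return (created, updated) from a meta mapping.

    created may arrive as a string or as an already-parsed date,
    depending on the YAML loader; updated is optional.
    """
    created = meta["created"]
    if not isinstance(created, date):
        created = date.fromisoformat(str(created))
    updated = meta.get("updated")
    if updated is not None:
        # Presented in UTC; this is a display choice of the sketch.
        updated = datetime.fromtimestamp(int(updated), tz=timezone.utc)
    return created, updated


meta = {"data_version": "0.4", "created": "2009-08-15", "updated": 1265923616}
created, updated = parse_meta_dates(meta)
```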

info Required

info:
  city: Cape Town
  dates:
    - 2011-01-02
    - 2011-01-03
    - 2011-01-04
    - 2011-01-05
    - 2011-01-06
  match_type: Test
  outcome:
    by:
      runs: 6
    winner: Chennai Super Kings
  teams:
    - South Africa
    - India
  toss:
    decision: field
    winner: India
  umpires:
    - IJ Gould
    - SJA Taufel
  venue: Newlands

city Optional

The city in which the game took place.

dates Required

The dates on which the game took place. If there is just one day, for example in a T20 match, then it will be an array containing just that one date.

match_type Required

The type of match this data file refers to. Currently the possible values are Test, ODI or T20.

neutral_venue Optional

If this is in the file then the value will be 1. This indicates that the game was played at a venue that neither team would regard as a home ground.

outcome Required

The data for this field is an associative array which details the outcome of the match this data file refers to. It contains information such as which team won the match, whether the game was a draw, tie, or no result, and any margin of victory.

by Optional

This entry is an associative array which details how much the winning team won by. The possible keys are innings, runs, and wickets.

innings Optional

If the match was won by an innings and a number of runs then this entry will appear with a value of 1, for the 1 innings.

runs Optional

If the match was won by a number of runs, or an innings and a number of runs, then this will contain the runs.

wickets Optional

If the match was won by a number of wickets, then this will contain the number of wickets.

eliminator Optional

This field will list the winner of any elimination bowl-out which decides a tie in a T20 match.

method Optional

This field will detail any method used to determine the winner where a match has been curtailed for some reason. Currently the only value this will contain is D/L, when a match uses the Duckworth-Lewis method.

result Optional

The result of the match if the match was not won by one of the teams. Currently the possible values are draw, no result or tie.

winner Optional

If a team won the match then the name of the team will be here.
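To show how the outcome fields fit together, here is a minimal sketch that turns an outcome mapping into a one-line summary. The function describe_outcome and its wording are hypothetical, and it assumes that eliminator appears alongside a result of tie, which the description above suggests but does not state outright.

```python
def _count(n, noun):
    """Format a count with a simple singular/plural noun."""
    return f"{n} {noun}" if n == 1 else f"{n} {noun}s"


def describe_outcome(outcome):
    """Build a one-line summary from an outcome mapping."""
    winner = outcome.get("winner")
    if winner is None:
        # No winner: result is one of draw, no result or tie.
        text = outcome.get("result", "unknown")
        if "eliminator" in outcome:
            text += f" ({outcome['eliminator']} won the eliminator)"
        return text

    by = outcome.get("by", {})
    parts = []
    if "innings" in by:
        parts.append("an innings")
    if "runs" in by:
        parts.append(_count(by["runs"], "run"))
    if "wickets" in by:
        parts.append(_count(by["wickets"], "wicket"))
    text = f"{winner} won"
    if parts:
        text += " by " + " and ".join(parts)
    if "method" in outcome:
        text += f" ({outcome['method']})"
    return text
```

Applied to the outcome in the info example above, this gives "Chennai Super Kings won by 6 runs".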

overs Optional

If the match_type is either ODI or T20 then this will indicate how many overs there are in each innings. Likely to be 50 for ODI matches, or 20 for T20 matches.

player_of_match Optional

If this field appears then it will contain an array of any players who were adjudged to be the player of the match.

teams Required

An array containing the teams who played in the match. There will always be two entries. If the match is not being played at a neutral venue the first team listed will be the home team.
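The home-team rule can be sketched as follows, assuming the info section has been loaded into a mapping. The helper name home_team is hypothetical.

```python
def home_team(info):
    """Return the home team, or None when neutral_venue is set."""
    if info.get("neutral_venue"):
        return None
    # Without neutral_venue, the first listed team is the home team.
    return info["teams"][0]
```

For the info example above this would return South Africa.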

toss Required

winner Required

This will list the team which won the toss, and will be one of the teams listed as playing this match.

decision Required

The decision made by the team winning the toss. This will be either bat or field.

umpires Required

An array containing the umpires who adjudicated in the match. There will always be two entries.

venue Optional

The venue in which the game took place.

innings Required

innings:
  - 1st innings:
      team: Ireland
      deliveries:
        - 0.1:
            batsman: WTS Porterfield
            bowler: IK Pathan
            extras:
              wides: 1
            non_striker: JP Bray
            runs:
              batsman: 0
              extras: 1
              total: 1
        - 0.2:
            batsman: WTS Porterfield
            bowler: IK Pathan
            non_striker: JP Bray
            runs:
              batsman: 0
              extras: 0
              total: 0
  - 2nd innings:
      team: India
      deliveries:
        - 0.1:
            batsman: G Gambhir
            bowler: WB Rankin
            non_striker: RG Sharma
            runs:
              batsman: 4
              extras: 0
              total: 4

An array of associative arrays each representing an innings within the game, in the order in which they took place. Each associative array has a single key (such as ‘1st innings’) which specifies the “name” of the innings and then the value is a further associative array with the details of that innings, such as the team batting, and the deliveries faced within the innings.

A simple example is shown above, showing 2 innings of a one-day match. The first innings shows Ireland facing 2 balls, with William Porterfield on strike, while the second innings shows 1 ball of the Indian reply.
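Because each innings is wrapped in a single-key associative array, reading the list takes one extra unpacking step. The sketch below assumes the innings section has already been loaded into Python lists and dicts; iter_innings is a hypothetical helper, and the delivery details are left empty for brevity.

```python
def iter_innings(innings):
    """Yield (name, details) for each innings, in playing order."""
    for entry in innings:
        # Each entry is a mapping with exactly one key, the innings name.
        # The unpacking raises ValueError if there is more than one key.
        (name, details), = entry.items()
        yield name, details


innings = [
    {"1st innings": {"team": "Ireland", "deliveries": [{"0.1": {}}, {"0.2": {}}]}},
    {"2nd innings": {"team": "India", "deliveries": [{"0.1": {}}]}},
]
summary = [(name, d["team"], len(d["deliveries"])) for name, d in iter_innings(innings)]
```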

team Required

The team which is batting must be specified in the data for the innings. This will be one of the two teams mentioned in the teams section for the match.

absent_hurt Optional

If this field is provided it will be an array of players who did not take part in the innings due to being absent hurt.

declared Optional

If this is in the file then the value will be 1. This indicates that the innings was declared.

deliveries Required

An array of associative arrays each representing a delivery within the innings, in the order in which they took place. Each associative array has a single key (such as 23.5) which specifies the particular ball (in that case the 5th ball of the 24th over), then the value is a further associative array with the details of that delivery.
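The relationship between a delivery key and the over and ball it names can be sketched like this. The helper parse_ball is hypothetical, and it assumes the key is available as text; if a YAML loader has already converted the key to a floating-point number, a key for a 10th ball in an over would be indistinguishable from a 1st ball.

```python
def parse_ball(key):
    """Turn a delivery key such as "23.5" into (over, ball), both 1-based."""
    completed_overs, ball = str(key).split(".")
    # "23.5" means 23 overs completed, so this is the 24th over.
    return int(completed_overs) + 1, int(ball)
```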

batsman Required

The batsman who faced the delivery.

non_striker Required

The player who was the non-striker for the delivery.

bowler Required

The bowler who bowled the delivery.

runs Required

The data for this field is a simple associative array which details the breakdown of the runs scored from the ball. It breaks the runs down to show which the batsman scored, which were extras, and the total scored from the ball. There is also the ability to indicate that a 4 or 6 was not an actual boundary, for example if the batsmen ran a 4.

batsman Required

The total number of runs scored by the batsman off the ball. If the batsman failed to score this will show 0.

extras Required

The total number of runs conceded via extras off the ball. If no extras were conceded this will show 0.

non_boundary Optional

If this is listed against the delivery then the value will be 1. This indicates that the 4 or 6 scored was not via an actual boundary.

total Required

The total number of runs scored off this delivery. If no runs were scored from the delivery then this will display 0.
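A minimal sketch of totalling an innings from the runs entries, assuming the deliveries have been loaded into a list of single-key dicts. The function innings_total is hypothetical, and the check that batsman plus extras equals total is an assumption drawn from the field descriptions above rather than a documented guarantee.

```python
def innings_total(deliveries):
    """Sum runs.total over a list of single-key delivery mappings."""
    total = 0
    for entry in deliveries:
        for ball, details in entry.items():
            runs = details["runs"]
            # Assumed invariant of this sketch: batsman + extras == total.
            if runs["batsman"] + runs["extras"] != runs["total"]:
                raise ValueError(f"inconsistent runs at ball {ball}")
            total += runs["total"]
    return total
```

For the two Irish deliveries in the example above this would give a total of 1.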

wicket Optional

If a wicket occurs then this entry will be in the file and will provide details on the wicket, such as which player is out, what type of dismissal it was, and any fielders who were involved.

kind Required

The kind of dismissal that took place. This will be one of bowled, caught, caught and bowled, lbw, stumped, run out, retired hurt, hit wicket, obstructing the field, hit the ball twice, handled the ball, or timed out.

player_out Required

The name of the player who is out.

fielders Optional

Any fielders who were involved in the dismissal. Generally this will be the player who took a catch, or who was involved in a run-out.
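As an illustration of reading the wicket entries, the sketch below collects every dismissal in an innings. The function fall_of_wickets is hypothetical, and treating a missing fielders field as an empty list is a convenience of the sketch.

```python
def fall_of_wickets(deliveries):
    """Return (ball, player_out, kind, fielders) for each dismissal."""
    wickets = []
    for entry in deliveries:
        for ball, details in entry.items():
            wicket = details.get("wicket")
            if wicket is None:
                continue  # most deliveries have no wicket entry
            wickets.append((
                ball,
                wicket["player_out"],
                wicket["kind"],
                wicket.get("fielders", []),
            ))
    return wickets
```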

extras Optional

If extras were conceded from this delivery then this will indicate how the extras came about. The value of the field will be an associative array with byes, legbyes, noballs, penalty, and wides as the possible keys, and the associated value for each will be the number of runs from each. In the example shown above Irfan Pathan bowled a wide on the first ball of the Irish innings.
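The per-delivery extras entries can be tallied across an innings as in the sketch below, which again assumes the deliveries are already loaded into a list of single-key dicts. The function name extras_by_type is hypothetical.

```python
from collections import Counter


def extras_by_type(deliveries):
    """Total the runs conceded for each kind of extra in an innings."""
    tally = Counter()
    for entry in deliveries:
        for details in entry.values():
            # extras is optional; when present it maps kind -> runs.
            tally.update(details.get("extras", {}))
    return tally
```

For the Irish innings in the example above the tally would contain a single entry, wides with a value of 1.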