Monthnotes: January 2023
Posted: 30th of January, 2023
Inspired by Simon Willison, and his use of weeknotes, I’ve decided to start providing a semi-regular update on what I’ve been doing with regard to Cricsheet. As I know I’m very unlikely to be consistent enough to do this every week, I’m starting out slowly (and hopefully more realistically) and have decided to go with “monthnotes” instead. As we reach the end of the first month of 2023, now seems as good a time as any to start. With that in mind, I’m going to provide an update on what I’ve been working on since late last year. Let’s see how this goes.
Adding previously missing tests
I’ve succeeded in adding the data for two Test matches which I had previously listed as “missing”, the 5th Ashes Test in January 2003, and the 4th India vs Australia Test in January 2004. In the former I was finally able to resolve an issue regarding the runs conceded by Lee and Gillespie in the 1st innings, and in the latter I identified the issue with the total number of boundaries listed for Martyn and Lee in the 4th innings.
I’m cautiously optimistic that the same method I used to resolve those discrepancies can be used for some other missing matches of that period. However, it’s a painstaking process, so I’ll just fit it in when I have nothing else to do.
Adding the Syed Mushtaq Ali Trophy
I’ve added data for 648 matches from the 2016 to 2021 seasons of the Syed Mushtaq Ali Trophy. I was able to do this due to the fortuitous discovery that the BCCI website has ball-by-ball information hidden within the JSON data for some matches on the site. It’s not perfect, and I’ve had to work around some interesting idiosyncrasies, but it was doable and I’m happy to have the match data available.
You may have noticed that I didn’t mention the most recent (2022) season of the Trophy. Sadly the JSON data for every match I’ve checked from that season is incomplete. Only minimal data is provided for each delivery and I can’t fill in the details. I’ll check on a regular basis to see if the situation changes, but until it does I won’t be able to add the most recent iteration of the competition.
In order to add data from the BCCI website I, unsurprisingly, needed to be able to map player ids from the site to existing Cricsheet Register ids. This has resulted in the addition of some new columns to the register data, along with 1,329 ids mapping to existing entries.
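As a rough illustration of that kind of mapping, here is a minimal Python sketch that builds a lookup from a source site's player id to a register identifier. The column names (identifier, name, key_bcci) and every value in the sample are assumptions made up for this example, not the actual register layout.

```python
import csv
import io

# Hypothetical register extract: the column names and all ids below are
# invented for illustration and are not taken from the real register.
REGISTER_CSV = """identifier,name,key_bcci
abc12345,Example Player One,101
def67890,Example Player Two,
"""


def build_bcci_lookup(csv_text):
    """Return a dict mapping a (hypothetical) BCCI id to a register identifier."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        bcci_id = row["key_bcci"].strip()
        # People without a BCCI id simply have an empty cell, so skip them.
        if bcci_id:
            lookup[bcci_id] = row["identifier"]
    return lookup


lookup = build_bcci_lookup(REGISTER_CSV)
```

With a lookup like this, lookup.get("101") gives the register identifier for that player, while an id that has not been mapped yet returns None and can be flagged for manual checking.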
A new outcome method - VJD
Another result of the addition of the Syed Mushtaq Ali Trophy to the site is that the various data formats (for example, JSON) now support another outcome method (VJD), which indicates that the winner of a match was determined by the Jayadevan system, an alternative to the Duckworth-Lewis-Stern system. This system has been used in a number of seasons of the competition, and may well be used in the future, so it is now supported.
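For anyone consuming the JSON files, a sketch of how the outcome method might be read is shown here. It assumes the outcome sits under info.outcome with winner, by and method fields; the sample match is invented, so check the format documentation before relying on the exact field names.

```python
# Hypothetical, trimmed-down "info" section of a match file. The teams and
# margin are invented, and the field names are an assumption for this sketch.
match = {
    "info": {
        "outcome": {
            "winner": "Team A",
            "by": {"runs": 12},
            "method": "VJD",
        }
    }
}


def describe_outcome(info):
    """Build a short human-readable result, noting the method if one is set."""
    outcome = info.get("outcome", {})
    winner = outcome.get("winner")
    if winner is None:
        return "no winner recorded"
    by = outcome.get("by", {})
    margin = ", ".join(f"{value} {unit}" for unit, value in by.items())
    text = f"{winner} won by {margin}" if margin else f"{winner} won"
    method = outcome.get("method")
    if method:
        # e.g. "VJD" when the Jayadevan system decided the result
        text += f" ({method} method)"
    return text


summary = describe_outcome(match["info"])
```

Code that previously special-cased only Duckworth-Lewis-Stern results may want to treat the method as an open-ended string like this rather than a fixed list.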
Twitter and Mastodon
The project has joined the exodus from Twitter and moved to Mastodon. The old account had 1,574 followers and the new one has just 17, but hopefully that will grow and the account can be useful.
The new profile is at https://nitech.online/@cricsheet, or you can follow @email@example.com in your preferred Mastodon client. If you’re not ready to join Mastodon yourself but know about RSS you can subscribe to updates at https://firstname.lastname@example.org.
Adding matches from scorebooks
Finally, I’m working on a small project to allow me to generate ball-by-ball data from scorebooks. This will allow me to convert some of the scorecards from the wonderful Women’s Cricket History into actual data files. I’m still ironing out the kinks, but it’s looking hopeful.