Racing Dossier review

I realised that I haven’t written a product review of a tool that I use daily in my betting, namely the Racing Dossier. I have mentioned it before and even linked to a walkthrough that shows a little of what it is about.

I ended up checking what other reviewers are saying about the product, but there isn’t much there, and what little there is misses the point of the Racing Dossier in my opinion. For example, this review at Betting System Truths takes a couple of example race cards and looks at their profitability, while they were never intended as plug’n’play systems that one just takes and uses without thinking about it at all.

So if they missed the point, what is it about? It is a collection of ratings, that’s what. How any single punter uses those ratings determines the profits or losses incurred. So it does need effort on the user’s part to make it work, but so does everything that has even the remotest chance of success.

Earlier, access to the ratings was provided through an Adobe Air app, but earlier this year they rolled out a new web-based version of the system, which improved usability. The system is organized around race cards that you create yourself. Basically, a race card is a collection of ratings of your choosing that you use to analyse races. Below is a screenshot showing an example with only a handful of ratings.

Racing Dossier example

All in all there are close to 700 different ratings to take advantage of. Granted, a bunch of those are ones that I would call helper ratings, for example those that show the difference from the top rated. As an example, there is a rating called PFP (Current form class level of horse, this rating starts at 1500), which is a Glicko/Elo derived form rating, and it is accompanied by DiffTpPFP, or PFP Difference From Top Rated – Current form class level of horse, as the official description goes. And it does get more complicated than that: for example, there is the rating DiffTpBEPFP, which is BEPFP Difference From Top Rated – Best ever form based PFP rating horse has achieved. All in all there are 152 ratings that begin with Diff, meaning that they are some kind of difference from some base rating.

In addition to form ratings like PFP there is also a big variety of approaches to estimating the speed characteristics of horses. There is SPDFIGLr, or Speed rating in last race, but there is also SHorPro, or Projected speed rating in today’s race, and SHorBE30 for Best speed rating achieved in the last 30 days. I think you get the point. Naturally, different looks at jockey and trainer strike rates are included as well, all the way from Percentage of winners trainer has had to Percentage of wins jockey/trainer have had at today’s course and distance.

Many of the ratings also have a ranking counterpart. For example, PFP is paired up with RnkPFP, which is simply the horse’s ranking in a race based on its PFP score. In total there are 152 ranking ratings, the same number as there are Diff ratings.
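To make the helper ratings a bit more concrete, here is a minimal Ruby sketch of how a difference-from-top-rated value and a rank could be derived from a base rating. It is only an illustration and not how Racing Dossier calculates them: the runner names and PFP values are made up, the `add_diff_and_rank` helper is hypothetical, and both the sign convention (top rated minus runner) and "higher is better" are assumptions.

```ruby
# Hypothetical runners with made-up PFP values.
runners = [
  { name: "Horse A", pfp: 1540 },
  { name: "Horse B", pfp: 1615 },
  { name: "Horse C", pfp: 1498 }
]

# Adds a difference-from-top-rated value and a rank for one rating.
# Assumption: higher is better and the difference is top minus runner.
def add_diff_and_rank(runners, key)
  values = runners.map { |r| r[key] }
  top = values.max
  sorted = values.sort.reverse
  runners.map do |r|
    r.merge(
      "diff_tp_#{key}".to_sym => top - r[key],
      "rnk_#{key}".to_sym => sorted.index(r[key]) + 1 # ties share a rank
    )
  end
end

rated = add_diff_and_rank(runners, :pfp)
rated.each do |r|
  puts "#{r[:name]}: diff #{r[:diff_tp_pfp]}, rank #{r[:rnk_pfp]}"
end
```

The top rated runner gets a difference of zero and rank 1, which is the kind of at-a-glance information these helper ratings are meant to give.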

I don’t have the success rates of different ratings at hand, and in any case it really depends on the race niche you are looking at. The constant advice on the forums for newcomers is to really narrow it down, to concentrate on a subset of a subset and to become a specialist in that before expanding. But a while back I did take a look at how a couple of different speed ratings stack up in All Weather racing. Take a look at that comparison here.

And analysing races one at a time is not the only way to use Racing Dossier. You could do as I do: download a csv export of the next day’s races, import it into a database of your own and do the analysis there. Naturally the previous day’s results are downloadable as well. There are occasional hiccups with the results, especially BSPs, and sometimes it takes until the day after to have the results available. One can also look at that on the winter meeting at Lingfield in 2013 if one so wishes. Exporting is also possible over a maximum of 30 days if one wants to have a bit more reference data.
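As a rough idea of what the import step can look like, here is a small Ruby sketch that reads csv data and groups the runners by race. The sample data is made up and the column names (`race_id`, `horse`, `PFP`, `SPDFIGLr`) are assumptions for illustration, not the real headers of the export. The actual database insert is left out.

```ruby
require "csv"

# Made-up sample standing in for a downloaded export file.
# The column names are assumptions, not the real export headers.
sample = <<~CSV
  race_id,horse,PFP,SPDFIGLr
  1,Horse A,1540,62
  1,Horse B,1615,-4
  2,Horse C,1498,55
CSV

# Parse rows into hashes and convert the rating columns to numbers.
rows = CSV.parse(sample, headers: true).map do |row|
  {
    race_id: row["race_id"].to_i,
    horse: row["horse"],
    pfp: row["PFP"].to_f,
    spdfiglr: row["SPDFIGLr"].to_f
  }
end

# Group runners by race, ready to be written to a database table.
races = rows.group_by { |r| r[:race_id] }
races.each { |id, runners| puts "Race #{id}: #{runners.size} runners" }
```

With a real file one would use `CSV.foreach(path, headers: true)` instead of parsing a string.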

To summarize, Racing Dossier is not for everyone, as it does take effort to learn how to utilize the information. Luckily the Race Advisor forums are helpful, and questions and inquiries are answered by other users as well as staff. But if you invest the time the rewards are there, and it is well worth the price of £49.75 per quarter.

I have been using Racing Dossier for close to two years now and I am not planning on stopping anytime soon, if that is any indication of how I feel about it. And hopefully this review also gave you, the reader, a slightly better idea of what the system is all about.

You can get access to Racing Dossier by clicking this link.

Disclosure: I write for Smartsigger magazine, for which I am compensated, and Smartsigger is published by the same company as Racing Dossier. The link above is an affiliate link.

Neural Network Diary #4: Some more thoughts about inputs and data

Last time I was thinking about how to handle the negative values that are possible in the speed ratings provided by Racing Dossier. Luckily that is not an issue; it is just a matter of using an activation function that supports values from -1 to 1. The activation functions available in FANN can be seen here, and the ones to use in my case are either

FANN_SIGMOID_SYMMETRIC

Symmetric sigmoid activation function, AKA tanh. One of the most used activation functions.

This activation function gives output that is between -1 and 1.

or

FANN_SIGMOID_SYMMETRIC_STEPWISE

Stepwise linear approximation to symmetric sigmoid. Faster than symmetric sigmoid but a bit less precise.

This activation function gives output that is between -1 and 1.

And from those I am going to start with the first one. When thinking about this I also had a new idea on how to handle the presentation of the values. Initially I was planning on using normalised values and two fields, one for each runner. Then I thought about just using the actual values and adjusting them to be between 0 and 1 (or -1 and 1). The current idea is to use only one field for each rating, holding the difference between the two runners’ ratings, and one field for the network’s output, which is 1 when the inside horse came ahead and -1 when the outside horse came ahead.
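The pairwise idea can be sketched in Ruby roughly like this. It is only a sketch under assumptions: the rating values are made up, the `pair_inputs` and `pair_target` helpers are hypothetical, and the divisors used to bring the differences into the -1 to 1 range are placeholders rather than values I have settled on.

```ruby
# Ratings for two runners in the same race (made-up values).
inside  = { shorpro: 58.0, pfp: 1615.0 }
outside = { shorpro: 64.0, pfp: 1540.0 }

# Assumed divisors that keep each difference within -1..1.
scale = { shorpro: 100.0, pfp: 1000.0 }

# One input per rating: the scaled difference inside minus outside,
# clamped to the output range of the symmetric sigmoid.
def pair_inputs(inside, outside, scale)
  scale.map do |key, divisor|
    ((inside[key] - outside[key]) / divisor).clamp(-1.0, 1.0)
  end
end

# Target output: 1 when the inside horse finished ahead, otherwise -1.
def pair_target(inside_position, outside_position)
  inside_position < outside_position ? 1.0 : -1.0
end

inputs = pair_inputs(inside, outside, scale)
target = pair_target(3, 5)
puts "inputs=#{inputs.inspect} target=#{target}"
```

A negative input then simply means that the outside horse has the higher rating, which is exactly the kind of value the symmetric activation function can work with.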

Datawise, I have the dataset for the races that I am going to use in development, from the 1st of June 2012 to the 31st of May 2015. I did exclude maidens and selling or claiming races, but have included both handicaps and non-handicaps. And as I am concentrating on races run over distances of less than 8 furlongs, I have a total of almost 22,000 runs worth of data to use. Next up is dividing them evenly into learning, testing and unseen datasets, so that all courses and all distances are evenly represented in all datasets.
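One way to do that kind of even division is to group the runs by course and distance and deal each group out to the three datasets in turn. The Ruby sketch below shows the idea with made-up runs. The `split_evenly` helper and the fixed seed are my own assumptions, and it splits on the level of single runs, which may not be what the final version does.

```ruby
# Made-up runs; only course and distance matter for the split.
runs = []
%w[Lingfield Kempton].each do |course|
  [5, 6, 7].each do |furlongs|
    9.times { |i| runs << { course: course, furlongs: furlongs, id: i } }
  end
end

# Deal the runs of every course/distance group out in turn so that
# each dataset gets an even share of every group.
def split_evenly(runs, names = %i[learning testing unseen], seed: 42)
  rng = Random.new(seed)
  sets = names.to_h { |n| [n, []] }
  runs.group_by { |r| [r[:course], r[:furlongs]] }.each_value do |group|
    group.shuffle(random: rng).each_with_index do |run, i|
      sets[names[i % names.size]] << run
    end
  end
  sets
end

sets = split_evenly(runs)
sets.each { |name, set| puts "#{name}: #{set.size} runs" }
```

Shuffling inside each group keeps the split random, while the round-robin dealing keeps the group sizes within one run of each other across the datasets.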

Neural Network Diary #3: Thoughts about inputs and ratings

Recently I have been thinking about the inputs that I would use in the neural network and, as mentioned earlier, most will come from the Racing Dossier service. I don’t want to include too many, but then again not too few either. Currently I am planning to include the following list of ratings.

  • Shorpro – Projected speed rating in today’s race
  • SpdfigLR – Speed rating in last race
  • SHorAvD – Average speed rating at today’s race distance
  • PFP – Current form class level of horse, this rating starts at 1500
  • MClSLr – Money Class Shift From Last Race. Prize money of today’s race divided by prize money of last race. Anything greater than 1.07 is a shift up in class, anything less than .93 is a drop in class.
  • Raiform – Rating assessing last three races
  • Course, Distance or Course/Distance winner

I am still thinking that I might add something measuring how successful a horse has been when it comes to prize money.

Originally I was planning on normalising the ratings, but that was before I came up with that list, and now that I think of it, I might just as well use them as they are and divide them by a suitably big number to bring them to less than one. Money Class Shift and Course/Distance winner I am putting in as boolean values.
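In Ruby the scaling and the boolean inputs could look something like the sketch below. The runner values and the divisors are made up for illustration. Splitting Money Class Shift into two separate flags (up in class, down in class) is just one possible reading of "boolean values" and an assumption on my part; the 1.07 and 0.93 thresholds come from the rating description above.

```ruby
# Made-up ratings for one runner.
runner = { shorpro: 58.0, pfp: 1615.0, mclslr: 1.12, cd_winner: true }

# Assumed divisors, picked only so that the results stay below one.
divisors = { shorpro: 200.0, pfp: 3000.0 }

# Scale the numeric ratings by a fixed divisor.
scaled = divisors.to_h { |key, d| [key, runner[key] / d] }

# Class shift as flags, using the thresholds from the rating description:
# above 1.07 is a shift up in class, below 0.93 is a drop in class.
up_in_class   = runner[:mclslr] > 1.07 ? 1.0 : 0.0
down_in_class = runner[:mclslr] < 0.93 ? 1.0 : 0.0
cd_winner     = runner[:cd_winner] ? 1.0 : 0.0

inputs = scaled.values + [up_in_class, down_in_class, cd_winner]
puts inputs.inspect
```

A fixed divisor is simpler than proper normalisation, but it does rely on picking a number that no rating will ever exceed.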

The only problem with that is that the speed figures above can be less than zero, so I need to find a way to handle that.

Neural Network Diary #2: Tools & Data

Before we get to actually build the neural network I am going to go through the tools that I am planning to use during the project. This list is obviously subject to change but this is what I feel at this point that I will need to complete this.

I will need to do a fair bit of data modification, and for that I am using Ruby. Naturally one can use any programming language they wish, but I am most familiar with Ruby and I like how readable and natural-language-like the scripts are. When it is relevant I am going to post the code, or at least snippets of it, in the blog as well. If you are new to Ruby it might be worthwhile to look at this quick start at the official Ruby site or this pretty thorough tutorial at Tutorials Point. In the end, though, what is needed is pretty simple, beginner-level stuff: some calculations and loops mostly.

One could build the neural network software from the ground up, but I am going to rely on an existing library for this purpose. Earlier I used AI4R, but as I mentioned in my post about the new version of Raiform, I have moved on to FANN, or Fast Artificial Neural Network. It seems to do a bit better job even with the same kind of network topology, but what I especially like is a feature called Cascade2. It dynamically builds and trains the topology, and that is what I used to build the network for Raiform 2.0.

Neural networks are a pretty advanced topic, and while it does help if you understand how they work, it is still possible to utilize them even if most of the underlying math is left untouched. FANN has Ruby bindings (in addition to bindings for several other languages) and I am using a Ruby gem called ruby-fann to take advantage of it. FANN has several graphical interfaces as well, but I find it a lot easier to work on the command line (the command line in Windows is a pain to work with, so be warned, or use a proper OS like Linux 🙂 ). If you wish to get a primer on neural networks you could read, for example, this.

The last big building block is data. I am going to use data starting from the beginning of 2012, and all of my data originates from Racing Dossier. I have the data in a database, so it is easy for me to fetch data with the required filters as needed. The actual ratings that I am planning to use I will cover later on. I haven’t decided yet, but it might make sense to build a working database to handle the training and testing data. In the past I have just used csv files for this purpose.