Raiform – version 2.0

After some unsuccessful festival betting, let’s look at some ratings.

I have been in the process of converting my neural network workflow from ai4r to FANN, the Fast Artificial Neural Network library, and one of the larger tasks was to redo my Raiform rating. I originally built it earlier this year and wrote a bit about it in March.
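For the curious, a minimal ruby-fann setup looks something like the sketch below. The layer sizes, toy data and the 0-1000 scaling are placeholders here, not my actual Raiform configuration.

require 'ruby-fann'

# Toy training data: each row is a normalised feature vector, the target
# is 1.0 for a winner and 0.0 otherwise. Real features and sizes differ.
inputs  = [[0.2, 0.5, 0.1, 0.3], [0.8, 0.9, 0.7, 0.6]]
targets = [[1.0], [0.0]]

train = RubyFann::TrainData.new(inputs: inputs, desired_outputs: targets)
net   = RubyFann::Standard.new(num_inputs: 4, hidden_neurons: [8], num_outputs: 1)

# Arguments: training data, max epochs, epochs between reports, desired MSE.
net.train_on_data(train, 1000, 100, 0.001)

# Scale the 0..1 output up to a 0-1000 style rating.
puts (net.run([0.4, 0.6, 0.2, 0.5]).first * 1000).round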

Anyway, here is a first glance at the results. The data is from 2013 and 2014 and includes the same ratio of codes as the training and test datasets. This is unseen data, used neither for training nor for testing. All in all there are some 5,000 races and almost 50,000 runs included.

In the chart the rating goes up to 500, although the theoretical limit is 1,000. There were some figures in the 600 range, but I dropped from the chart all occurrences where the sample size was less than 10, which is why the strikerate lines end at different points for different codes. All in all, strikerates rise pretty consistently as the rating climbs. The highest variance is in Hunter Chases, but the sample sizes there are so small that I wouldn’t worry about it at this time.
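The small-sample filtering is nothing fancy; a simplified sketch of the idea (the Run struct and the 50-point band width are illustrative, not my actual code):

Run = Struct.new(:code, :rating, :won)

# Group runs into 50-point rating bands per code, drop bands with fewer
# than 10 runs, and return the strikerate for each remaining band.
def strikerates(runs, band = 50, min_sample = 10)
  runs.group_by { |r| [r.code, (r.rating / band) * band] }
      .select { |_, rs| rs.size >= min_sample }
      .map { |(code, lo), rs| [code, lo, rs.count(&:won).fdiv(rs.size)] }
end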

This rating is now the only thing I look at when it comes to the past few races. It is easier to say that Horse A with a Raiform of 400 is a more likely winner than Horse B with a rating of 250 than to say that Horse A with a last three of 321 is better than Horse B with 332. I made up those form lines, but I believe the point is clear.

Raiform – A new rating

I have been adding new stuff to my Bayesian system recently, as well as adjusting the criteria for ratings.

One new piece of information I have added I call Raiform. I need to call these things something in my database, and Racealyst AI Form, Raiform for short, sounds fancy enough 🙂

Anyway, it is a neural-network-derived rating which combines a horse’s last three finish positions (adjusted for the number of runners, and under the same code as today’s run), days since last run and the horse’s age, together with whether the race is flat or jumps and whether it is a handicap. The idea for the rating comes from a set of ratings Smartsig ran years back called AI Form. That rating also used the horse’s sex, but for some reason I haven’t deemed that information important enough to include in my database, so I have to live without it. I don’t know how the people at Smartsig arrived at their ratings, but I wanted to do something similar, as I felt that just looking at the days since the last run, or the days since the last good run, was not giving out relevant information.
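To make the inputs concrete, the feature vector fed to the network would look roughly like the sketch below. The accessors (last_runs_under_code, days_since_last_run and so on) and the normalising constants are hypothetical; only the list of inputs is the real one.

# Hypothetical accessors; only the list of inputs comes from the
# description above, the normalisation details are made up.
def raiform_features(horse, race)
  last_three = horse.last_runs_under_code(race.code, 3).map do |run|
    run.position.fdiv(run.runners) # finish adjusted for field size
  end
  last_three.fill(1.0, last_three.size...3) # pad when fewer than 3 runs

  last_three + [
    [horse.days_since_last_run, 365].min / 365.0, # cap and normalise
    horse.age / 15.0,
    race.jumps? ? 1.0 : 0.0,
    race.handicap? ? 1.0 : 0.0
  ]
end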

Below is a chart showing how the rating would have performed in 2014. The line is the strikerate and the bars show return on turnover %. As I expected, ROT is all over the place, but the strikerate holds a nice upward trend without plummeting at any point.

Interestingly, the rating range from 125 to 150 has a return on turnover of over 15% across more than 16,000 selections, with a strikerate of roughly 11%.

Progress in my hunt for a good run

I have finally found a way to calculate whether a run was good, bad or merely OK. I took all races run between 1.10.2012 and 30.9.2014, looked at the last run each runner had had, and determined whether it was good, bad or OK.

Now I was able to check whether my method was up to anything. The reasoning is that if the last run was a good one, that should improve the horse’s odds of winning. And it actually did: the table below shows how performance in the previous race affected strike rate.

LTO    SR%      ROI%
Poor    6.33%  -13.90%
Ok      9.62%   -5.44%
Good   15.34%   -5.69%
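A table like that is produced along these lines (a simplified sketch; the Bet struct is illustrative and I am assuming flat 1-point win stakes for the ROI column):

Bet = Struct.new(:lto, :won, :sp) # sp = starting price as decimal odds

def lto_table(bets)
  bets.group_by(&:lto).map do |klass, bs|
    sr  = bs.count(&:won).fdiv(bs.size) * 100
    roi = bs.sum { |b| b.won ? b.sp - 1 : -1 }.fdiv(bs.size) * 100
    format('%-5s SR %5.2f%%  ROI %7.2f%%', klass, sr, roi)
  end
end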

None of the above made any profit, but at this point I am more interested in ways to forecast the winner, and I am happy with anything that has a better strikerate than choosing randomly. In the races in question, betting randomly would have achieved a strikerate of approximately 10%. My good runs last time out perform better than that and, just as importantly, runs deemed bad perform worse.

I didn’t expect to find profits using only one indicator, so I was surprised that Hunter Chases and NH Flats combined made almost 600 points of profit over two years using just this one indicator. And that is over almost a thousand races with roughly 450 winners. Not too shabby, I’d say.

New Smartsigger and Artificial Neural Networks

The latest issue of Smartsigger was released a few days ago. Among other great articles there is mine, where I document my first forays into Artificial Neural Networks. As I am a relative beginner and wrote that article from that perspective, I am constantly learning new things. One outcome of this continuous learning is that after writing the article I have already changed my toolset: instead of FANN I am now using AI4R, which I actually find suits my workflow a lot better.
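For comparison, a minimal AI4R setup looks something like the sketch below, on a toy problem with made-up data. Unlike FANN, it trains one example at a time, so the epoch loop is written out by hand.

require 'ai4r'

# Toy problem: 4 inputs, one hidden layer of 8 neurons, one output.
net = Ai4r::NeuralNetwork::Backpropagation.new([4, 8, 1])

inputs  = [[0.2, 0.5, 0.1, 0.3], [0.8, 0.9, 0.7, 0.6]]
targets = [[1.0], [0.0]]

# Each call to train feeds one example through the network.
1000.times do
  inputs.each_with_index { |row, i| net.train(row, targets[i]) }
end

puts net.eval([0.4, 0.6, 0.2, 0.5]).first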

Speaking of ANNs, I am almost done with the second version of the network which determines whether a run a horse had was a good, an OK or, in the worst case, a poor one. As I write this I am teaching it flat races, and once that is done I will be able to calculate a result for each last-time-out run in 2013 and have a first look at what effect (if any) this has on strike rate and profits. So, one step closer to the Suitability Score.

What is a good run?

Earlier I wrote about my plan to add a new kind of rating called a Suitability Score. For that I needed to know whether the run a horse had was a good one, even if it didn’t finish first.

Initially I used a concept from Peter May’s book, but it was obviously meant only for flat races, as the system broke down at longer distances, and this led to chases and hurdles having a lot more good runs compared to flat races.

After staring at lengths lost for flat and jumps for a while, I came to the conclusion that a one-size-fits-all solution won’t work here. I also started thinking about the going: it has to have an effect on the lengths a horse lost by, or at least that is the hypothesis at the moment.

I was surprised that there seems to be greater variance between race distances in flat races than in jump races, as can be seen from the two charts below, which show average lengths lost per BSP for different distances in furlongs.

As I have recently dabbled with Artificial Neural Networks, I decided to see if they could help in determining a good run. The plan is to train an ANN to give the expected lengths lost based on distance, going and BSP. I have now done a first version of this with All Weather racing in mind. Simply by applying the criteria that a run where the actual lengths lost was less than half the expected distance behind the winner counts as a good run, and a run where the actual was more than 1.5 times the expected counts as a poor run, I was able to divide runs into roughly 25% good, 50% OK and 25% poor.
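As code, the classification rule boils down to something like this sketch (the trained network is assumed to exist, and its inputs are left as raw values for clarity):

# net is the trained network predicting expected lengths lost;
# only the 0.5x / 1.5x thresholds come from the description above.
def classify_run(net, distance, going, bsp, actual_lengths_lost)
  expected = net.run([distance, going, bsp]).first
  if actual_lengths_lost < 0.5 * expected
    :good
  elsif actual_lengths_lost > 1.5 * expected
    :poor
  else
    :ok
  end
end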

Next up is to check whether this classification has any bearing on how a horse performs next time out.
