New month – New SmartSigger

Image by lhourahane (Flickr), CC BY 2.0, via Wikimedia Commons

It is March and Cheltenham is upon us. Coincidentally, this month's issue of SmartSigger is titled Cheltenham Special in preparation for the festival. The main feature is an article about key trends for each race over the four-day festival.

My own article for the month is about beaten favourites and whether there is any money to be made with that information. The most interesting finding for me was the big difference between the top and bottom success rates for trainers when their beaten favourites run again. For the full list you need to subscribe to the magazine, but below I present the top and bottom five trainers based on strike rates.




Top 5

Trainer          Wins   Runs   Strike rate      P/L       ROI
J Ferguson         25     70        35.71%    53.66    76.66%    11.41    33.91%
D Lanigan          17     50        34.00%    45.10    90.20%     2.85    15.25%
A O'Brien          56    195        28.72%   -34.12   -17.50%    -5.20    -4.10%
W Mullins          83    294        28.23%    21.84     7.43%    -6.43    -3.77%
N Henderson        59    211        27.96%    16.16     7.66%     3.11     3.07%

Bottom 5

Trainer          Wins   Runs   Strike rate      P/L       ROI
M Dods              8     82         9.76%   -42.91   -52.33%    -8.45   -44.05%
H Candy             5     53         9.43%   -36.09   -68.10%    -3.76   -24.06%
R Harris            4     66         6.06%   -20.45   -30.99%    -2.71   -37.34%
Mrs R Carr          3     54         5.56%   -37.74   -69.88%    -6.29   -63.85%
Richard Guest       5    113         4.42%   -72.53   -64.18%    -7.47   -55.23%


Back to the drawing board

I ran a test of my Bayesian system at the Race Advisor forums for a few months and a few thousand selections. It became clear that even though I was profitable, the results depended too much on luck.

My system still seems to favour one selection too much, which means that my odds line is not accurate: on many occasions my top-ranked runner is at odds close to 1.0 and the rest are way overboard. It is not uncommon to see odds of a few thousand for the lower-ranked runners.
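To illustrate how an odds line can end up this lopsided, here is a minimal sketch. It assumes a naive Bayes style calculation in which each runner starts from an equal prior and every likelihood ratio is multiplied in as if it were independent; the function name `odds_line` and the example ratios are made up for illustration and are not my actual system.

```python
from math import prod


def odds_line(likelihood_ratios):
    """Return {runner: (probability, decimal_odds)}.

    Assumption: equal prior for every runner, and all likelihood
    ratios are treated as independent and simply multiplied together.
    """
    prior = 1.0 / len(likelihood_ratios)
    scores = {
        runner: prior * prod(ratios)
        for runner, ratios in likelihood_ratios.items()
    }
    total = sum(scores.values())
    # Normalise so probabilities sum to 1; decimal odds are 1 / probability.
    return {runner: (s / total, total / s) for runner, s in scores.items()}


# Hypothetical race: six modestly different ratios per runner.
ratios = {
    "A": [2.0] * 6,   # slightly favoured on every factor
    "B": [1.0] * 6,   # neutral on every factor
    "C": [0.5] * 6,   # slightly disfavoured on every factor
}
line = odds_line(ratios)
for runner, (prob, odds) in line.items():
    print(f"{runner}: p={prob:.5f} odds={odds:.2f}")
```

Even though no single ratio is extreme, multiplying six of them puts runner A at odds just above 1.0 and runner C in the thousands, which is the kind of pattern described above when correlated factors are counted as if they were independent.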

Thus, I am going to rebuild and rethink my likelihood ratios and add a couple of new ones as well. I am also in the process of redoing my good run neural network, which means that I am going to recalculate all of the ratings that depend on it. That means most of the Suitability Score ratings, of which I have added quite a few.

I also plan to add a better view, maybe a chart of some sort, to visualize which likelihood ratios are the ones that make a certain horse the top-ranked selection.

Trainer consistency per layoff range

So far I have been using days since last run on a per-horse basis, but for a while now I have wanted to explore that further. An article in an old Smartsig sparked an idea (again!). In issue 9.08 from August 2002 (page 30) there was an article about quick-return trainers, i.e. trainers who have been successful with horses that return to action within 7 days.

But I didn't want to look only at quick returners but at all the different day ranges. Of course there are hundreds of different numbers of days after which a horse can return, so I needed to group them somehow. To get some idea of how the runs are distributed, I took data from 2012-2014 and divided the runs evenly into 6 groups. This led to some overlap between certain day counts, and the grouping was also a bit counterintuitive. For that reason I ended up with an almost, but not quite, even grouping as follows.

Days since    Group
No data       -1
1 - 7         1
8 - 14        2
15 - 21       3
22 - 28       4
28 - 50       5
51 - 200      6
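The grouping can be sketched as a small lookup function. This is only an illustration: the name `days_since_group` is made up, a run of exactly 28 days is assumed to belong to group 4, and anything outside the 1 to 200 day range is assumed to be treated like missing data, since the table does not cover those cases.

```python
# Upper bound (inclusive) of each days-since range and its group number.
GROUP_BOUNDS = [(7, 1), (14, 2), (21, 3), (28, 4), (50, 5), (200, 6)]


def days_since_group(days):
    """Map days since last run to a group number from the table.

    Assumptions: None means no previous run on record, day 28 goes to
    group 4, and values below 1 or above 200 fall back to -1.
    """
    if days is None or days < 1:
        return -1
    for upper, group in GROUP_BOUNDS:
        if days <= upper:
            return group
    return -1


print([days_since_group(d) for d in (None, 3, 10, 28, 29, 120, 400)])
```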

After determining the group for each run in my database, I calculated two values for each run from the trainer's success in that days-since group: first, a Suitability Score based on the ratio of good and bad runs, and second, the trainer's strike rate in that days-since group.

The hypothesis was that the higher the suitability score or strike rate in a group, the higher the strike rate these horses would have. And this was true to a certain extent. The two charts below, from 2013 and 2014, show strike rates over a range of -100 to 100 for the suitability score and 0 to 100 for the strike rate.

Interestingly, the strike rate drops as we get to the top end of the range, as was the case with trainer form as well. This might partially be due to the smaller number of samples.
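As a rough sketch of the two values per trainer and days-since group: the strike rate is simply wins divided by runs, while the suitability formula below (good runs minus bad runs, as a percentage of all runs) is only an assumed stand-in that happens to span -100 to 100, not the actual Suitability Score calculation. The function name and the tuple layout are also made up for illustration.

```python
from collections import defaultdict


def trainer_group_stats(runs):
    """runs: iterable of (trainer, group, won, good_run) tuples.

    Returns {(trainer, group): {"strike_rate": ..., "suitability": ...}}.
    The suitability formula is a hypothetical placeholder.
    """
    counts = defaultdict(lambda: {"runs": 0, "wins": 0, "good": 0})
    for trainer, group, won, good_run in runs:
        c = counts[(trainer, group)]
        c["runs"] += 1
        c["wins"] += int(won)
        c["good"] += int(good_run)

    stats = {}
    for key, c in counts.items():
        bad = c["runs"] - c["good"]
        stats[key] = {
            # Percentage of runs that were wins.
            "strike_rate": 100.0 * c["wins"] / c["runs"],
            # Assumed score: +100 if all runs were good, -100 if all were bad.
            "suitability": 100.0 * (c["good"] - bad) / c["runs"],
        }
    return stats


sample = [
    ("Trainer X", 1, True, True),
    ("Trainer X", 1, False, True),
    ("Trainer X", 1, False, False),
    ("Trainer X", 1, False, False),
    ("Trainer Y", 1, False, False),
]
stats = trainer_group_stats(sample)
print(stats)
```

When rating a run historically, the counts should only include runs that happened before it, otherwise the result of the run leaks into its own rating.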


New issue of SmartSigger!

Today marks the release of the February issue of SmartSigger magazine. My article this month is about a system where I combine some Racing Dossier ratings with some of my own creation. A test of that system is ongoing at the Race Advisor forums; so far the results have not been quite up to par, but I am confident that they will improve.

All in all I think it is well worth a read, and if you are not already a subscriber you can get your first month for free to see whether the content is something you wish to keep reading in future months.

I am also excited about the competition Michael Wilding is publishing this month, where competitors need to create a system based on a given historical dataset. The systems are then matched against test data to see whose system gives the best results. Unfortunately you can only participate if you have access to the Race Advisor forums, which can only be gained by buying one of Michael's products.

I already have a few ideas on how to approach the competition and I will write about them here, at least after the fact when I have seen whether the results were there 🙂

Speed figures in All Weather

We are having a discussion at the Race Advisor forums about the different speed figures provided in Racing Dossier and their relative strength in choosing contenders in All Weather races. I for one was giving my opinions based just on a hunch, and I wanted some data to confirm my own thoughts.

I looked at All Weather handicap races run over 6 furlongs (rounded to full furlongs) with a field size of 8 to 10 runners. Data was collected from the years 2012 to 2014.

I looked at rankings based on four different speed figures, which are:

  • SHorPro – Horse's projected speed rating in today's race
  • SpdFigLR – Speed figure earned last time out
  • ShorAvG – Average speed rating on today's going
  • ShorAvD – Average speed rating at today's race distance

I used rankings based on these ratings and not the raw numbers, so the top rated would be ranked 1 and so on. To make the charts more readable I grouped the rankings into three groups: top 3, mid 4 and bottom 3.

The first chart shows strike rates for those three groups, combined across all of the tracks.

We can see that there are interesting variations between the tracks. But one ought to bear in mind that, as the data is only from three years, there is a relatively low number of data points (the complete dataset was only a little over 3000 runs).
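The ranking and grouping step can be sketched like this. It is a minimal illustration under a few assumptions: a higher rating is better, ties keep their input order, and ranks 1 to 3 are "top 3", 4 to 7 are "mid 4" and 8 to 10 are "bottom 3", so in fields of 8 or 9 the bottom group is smaller. The function name `rank_groups` and the ratings are made up.

```python
def rank_groups(ratings):
    """ratings: {runner: rating}, higher is better.

    Returns {runner: (rank, group)} using the assumed rank boundaries
    1-3 = top 3, 4-7 = mid 4, 8 and above = bottom 3.
    """
    # sorted() is stable, so tied ratings keep their input order.
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    result = {}
    for rank, runner in enumerate(ordered, start=1):
        if rank <= 3:
            group = "top 3"
        elif rank <= 7:
            group = "mid 4"
        else:
            group = "bottom 3"
        result[runner] = (rank, group)
    return result


# Hypothetical 10-runner race with one speed figure per runner.
race = {f"Horse {i}": rating for i, rating in
        enumerate([71, 65, 80, 59, 77, 62, 68, 55, 74, 50], start=1)}
groups = rank_groups(race)
for runner, (rank, group) in sorted(groups.items(), key=lambda kv: kv[1][0]):
    print(rank, runner, group)
```

The same function would be applied once per speed figure, giving each runner a rank group for every one of the four ratings.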
