Sorry about the rather lengthy hiatus. Moving to another country and the demands of my day job have throttled my betting-related activities down to a bare minimum. I hope to have a little more time to invest in this project in the near future.
Let’s pick up where we left off at the end of the previous post and look at the code I use to instruct FANN to learn. As FANN does most of the heavy lifting, we only need to tell it what data to use and how to handle it.
# input_qty = number of input nodes
# output_qty = number of output nodes
# savefile = name of the file network is saved to
# neurons = maximum number of neurons the cascade function may add while training
def train_cascade(input_qty, output_qty, savefile, neurons)
  # Create the basis for the network, define number of inputs and outputs
  net = RubyFann::Shortcut.new(:num_inputs => input_qty, :num_outputs => output_qty)

  # As our inputs and outputs can have a value of -1 to 1 we can only use sigmoid_symmetric
  # and that needs to be specified
  net.set_activation_function_hidden(:sigmoid_symmetric)
  net.set_activation_function_output(:sigmoid_symmetric)

  # Search for pairs to be used for training
  pairs = Pair.where(:race_id => Run.where(:dataset => "learning").where.not(:draw => nil).pluck(:race_id).uniq).where.not(:input => nil)

  # I have saved precalculated inputs to the database as one field, thus it needs splitting
  # before they are usable
  inputs = []
  pairs.each do |pair|
    inputs << pair.input.split(",").map(&:to_f)
  end

  # And same done to outputs. Output also needs to be an array, even if it is only one value.
  # Looping over the same pairs keeps inputs and outputs in the same order.
  outputs = []
  pairs.each do |pair|
    outputs << [pair.output]
  end

  # Once we have arrays of inputs and outputs we can combine them into form that FANN
  # can understand
  train = RubyFann::TrainData.new(:inputs => inputs, :desired_outputs => outputs)

  # Finally it is time for some training. This will take a while, depending naturally
  # on how much data one is using.
  # Arguments: training data, max neurons, neurons between reports, desired error
  net.cascadetrain_on_data(train, neurons, 1, 0.05)

  # After training it is important to remember to save the network into a file
  # for further use
  net.save(savefile)
end
That is pretty simple, isn’t it?
Obviously it still needs a single line in another file to actually call this function. Now we can proceed to training and find out whether there actually is something at the end of this exercise.
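If you want to try the data preparation step on its own, without a database or FANN installed, here is a minimal sketch in plain Ruby. The `PairRow` struct and the `build_training_arrays` helper are hypothetical stand-ins I made up for illustration; they are not part of my actual code or of ruby-fann. The sample values are invented as well.

```ruby
# Hypothetical stand-in for a stored pair: only the two fields used here.
PairRow = Struct.new(:input, :output)

# Turn stored rows into the two parallel arrays that the training data needs.
def build_training_arrays(rows)
  # Each input is stored as one comma-separated string, so split and convert.
  inputs = rows.map { |row| row.input.split(",").map(&:to_f) }
  # Each output must be an array, even when it holds a single value.
  outputs = rows.map { |row| [row.output] }
  [inputs, outputs]
end

# Made-up sample rows, all values within -1..1.
rows = [
  PairRow.new("0.25,-0.5,1.0", 1.0),
  PairRow.new("-1.0,0.0,0.75", -1.0)
]

inputs, outputs = build_training_arrays(rows)
p inputs   # => [[0.25, -0.5, 1.0], [-1.0, 0.0, 0.75]]
p outputs  # => [[1.0], [-1.0]]
```

Building both arrays from the same list of rows keeps each input lined up with its output, which is what the training data expects.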
Before we actually get to building the neural network, I am going to go through the tools that I am planning to use during the project. This list is obviously subject to change, but it is what I feel at this point I will need to complete this.
I will need to do a fair bit of data modification, and for that I am using Ruby. Naturally one can use any programming language they wish, but I am most familiar with Ruby and I like how readable and close to natural language the scripts are. When it is relevant, I am going to post the code, or at least snippets of it, in the blog as well. If you are new to Ruby, it might be worthwhile to look at this quick start at the official Ruby site or this pretty thorough tutorial at Tutorials Point. In the end, though, what is needed is pretty simple, beginner-level stuff: mostly some calculations and loops.
One could build the neural network software from the ground up, but I am going to rely on an existing library for this purpose. Earlier I used AI4R, but as I mentioned in my post about the new version of Raiform, I have moved on to FANN, or Fast Artificial Neural Network. It seems to do a slightly better job even with the same kind of network topology, but what I especially like is a feature called Cascade2. It dynamically builds and trains the topology, and that is what I used to build the network for Raiform 2.0.
Neural networks are a pretty advanced topic, and while it does help if you understand how they work, it is still possible to utilize them even if most of the underlying math is left untouched. FANN has Ruby bindings (in addition to bindings for several other languages), and I am using a Ruby gem called ruby-fann to take advantage of it. FANN has several graphical interfaces as well, but I find it a lot easier to work on the command line (the command line in Windows is a pain to work with, so be warned, or use a proper OS like Linux 🙂 ). If you wish to get a primer on neural networks, you could read for example this.
The last big building block is data. I am going to use data starting from the beginning of 2012, and all of my data originates from Racing Dossier. I have the data in a database, so it is easy for me to fetch it with the required filters as needed. I will cover the actual ratings that I am planning to use later on. I haven’t decided yet, but it might make sense to build a working database to handle the training and testing data. In the past I have just used CSV files for this purpose.
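To give an idea of what the CSV approach looks like, here is a rough sketch using Ruby's standard `csv` library. The column layout (every column except the last is an input, the last one is the output), the `load_training_rows` helper and the sample numbers are all assumptions made up for illustration; my real files and ratings are different.

```ruby
require "csv"

# Made-up sample data. Assumed layout: inputs first, output in the last column.
SAMPLE_CSV = <<~DATA
  0.25,-0.5,1.0,1
  -1.0,0.0,0.75,-1
DATA

# Read CSV text and return [inputs, outputs] as arrays of float arrays.
def load_training_rows(csv_text)
  inputs = []
  outputs = []
  CSV.parse(csv_text).each do |row|
    values = row.map(&:to_f)
    # Split off the last value as the output, keep the rest as inputs.
    *ins, out = values
    inputs << ins
    outputs << [out]
  end
  [inputs, outputs]
end

inputs, outputs = load_training_rows(SAMPLE_CSV)
p inputs   # => [[0.25, -0.5, 1.0], [-1.0, 0.0, 0.75]]
p outputs  # => [[1.0], [-1.0]]
```

A database makes it easier to filter by race or dataset, while flat files like this are quick to set up for a first experiment.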