Classifying Animals: Machine Learning in Excel Spreadsheets
Recently DSP held a webinar entitled ‘What’s On Your Watch List? An Azure Machine Learning Analysis’, in which we demonstrated how to create a film recommendation system on Azure Machine Learning [watch here].
Following this, we wanted to demonstrate how Azure Machine Learning can be used to deploy models to Excel, allowing anybody to interact with the model by simply downloading the spreadsheet.
For this project I will use an open-source dataset for classifying animals into seven basic classes: mammal, bird, fish, insect, reptile, amphibian and other. Each animal is described by 16 variables, called explanatory variables, which are used to predict which of these classes it falls into.
These variables are whether the animal has hair, feathers, teeth, a backbone, venom, fins or a tail; whether it is airborne, aquatic, predatory, domesticated or approximately the size of a cat; whether it breathes air, produces milk or lays eggs; and how many legs it has.
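To make the shape of one input row concrete, here is a minimal Python sketch of how an animal could be written down as 16 numbers. The column names are hypothetical shorthand for the variables listed above (the spreadsheet's real headers may differ), and the example answers for a cat are filled in by hand rather than taken from the dataset.

```python
# Hypothetical column names for the 16 explanatory variables; the
# spreadsheet's actual headers may differ.
BOOLEAN_FEATURES = [
    "hair", "feathers", "eggs", "milk", "airborne", "aquatic",
    "predator", "toothed", "backbone", "breathes", "venomous",
    "fins", "tail", "domestic", "catsize",
]
FEATURES = BOOLEAN_FEATURES + ["legs"]  # 15 yes/no flags plus a leg count


def encode_animal(animal: dict) -> list:
    """Turn a dict of answers into a row of 16 numbers (1 = yes, 0 = no)."""
    missing = [name for name in FEATURES if name not in animal]
    if missing:
        raise ValueError(f"missing features: {missing}")
    row = [int(bool(animal[name])) for name in BOOLEAN_FEATURES]
    legs = int(animal["legs"])
    if legs < 0:
        raise ValueError("legs must be zero or more")
    row.append(legs)
    return row


# Illustrative answers for a domestic cat, filled in by hand.
cat = {
    "hair": True, "feathers": False, "eggs": False, "milk": True,
    "airborne": False, "aquatic": False, "predator": True, "toothed": True,
    "backbone": True, "breathes": True, "venomous": False, "fins": False,
    "tail": True, "domestic": True, "catsize": True, "legs": 4,
}
print(encode_animal(cat))
```

Keeping the leg count as a number while everything else is a yes/no flag mirrors the description above: 15 questions with two possible answers and one count.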
The training dataset consists of approximately 100 fully described animals, over half of which are mammals or birds. Very few amphibians, reptiles or insects are included, which may make it harder for the model to be certain whether a given animal falls into one of these classes.
I chose a Neural Network model because these are very good at finding interactions between the explanatory variables, and in this case it coped well with a small dataset like ours. The downside is that a Neural Network is a black-box model, which means it is very difficult to interpret how it arrives at its predictions. It simply takes input data, works some 'magic' and outputs predictions.
Azure Machine Learning displays the process of creating the model as a flowchart, giving an intuitive feel for the modelling process. The model architecture is shown below, where the dataset is ‘zoo.csv’ and a SQL transformation was used to rewrite the class labels, originally “1”, “2”, etc., as “mammal”, “bird”, etc.
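The relabelling step is easy to picture outside Azure too. The sketch below uses plain Python as a stand-in for the SQL transformation, not the query used in the experiment. Only codes 1 and 2 are confirmed by the text above; the order of codes 3 to 7 and the column name `class_type` are assumptions that should be checked against the dataset's documentation.

```python
# Codes 1 and 2 come from the article; the order of codes 3 to 7 is an
# assumption and should be checked against the dataset's documentation.
CLASS_NAMES = {
    1: "mammal",
    2: "bird",
    3: "reptile",
    4: "fish",
    5: "amphibian",
    6: "insect",
    7: "other",
}


def relabel_rows(rows, column="class_type"):
    """Return copies of the rows with the numeric class code rewritten as text.

    `column` is a hypothetical name for the class column.
    """
    return [{**row, column: CLASS_NAMES[int(row[column])]} for row in rows]


rows = [
    {"animal": "cow", "class_type": "1"},
    {"animal": "duck", "class_type": "2"},
]
print(relabel_rows(rows))
```

Writing the labels out as words means the predicted class in Excel reads as “mammal” rather than “1”, which is much friendlier for anyone opening the spreadsheet cold.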
Modelling in Excel
Now that a model has been created in Azure Machine Learning, it can be deployed to Excel. If you want to try this yourself, the Excel model can be downloaded from this link. When you open the file there will be a tab on the right called ‘Azure Machine Learning’. To load the model, simply click the blue text “Experiment created on…” and enter the specified Input and Output cell names, pressing Enter after each. The model is now ready to test.
This is the part where you should get creative and think of the most ridiculous animal you can to test whether the model will correctly classify it. I have included a column for the name of the animal on the left, but the model does not use this for determining the type of animal, so this is for your own reference.
You can check this by entering the name of one animal and the explanatory variables of a completely different one: for example, if you name your animal ‘clown fish’ and enter the details of a cow, the model will still predict that your animal is a mammal.
The model will give outputs quantifying how likely it is that your animal is in each of the classes, and will predict the class of your animal based on which is most likely to be correct.
An example output is shown here with the details entered for a pig. We can see that there is a 99.85% chance that our animal is a mammal, so the predicted class is mammal.
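Picking the predicted class from the scores amounts to taking the label with the highest value. A minimal Python sketch is below. Only the mammal score of 99.85% comes from the pig example above; the other values are made-up placeholders, and the function name is hypothetical.

```python
def predict_class(scores: dict) -> str:
    """Return the label whose score is highest."""
    if not scores:
        raise ValueError("no scores supplied")
    return max(scores, key=scores.get)


# Only the mammal value is taken from the article; the rest are placeholders.
pig_scores = {
    "mammal": 0.9985,
    "bird": 2.1e-05,
    "reptile": 3.4e-04,
    "fish": 1.2e-05,
    "amphibian": 8.0e-05,
    "insect": 5.0e-06,
    "other": 4.0e-05,
}
print(predict_class(pig_scores))
```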
The probabilities default to scientific notation, but with this being Excel we can easily make these more readable by going to ‘Number’ in the Home tab, and changing the formatting to non-scientific or to percentages.
We can see that these probabilities don’t add up to 100%. This is because the model allows for the possibility that an animal fits into multiple classes, though the predicted class is always exactly one of them.
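The same tidying up can be done outside Excel if you copy the outputs elsewhere. The Python sketch below formats scores written in scientific notation as percentages and adds them up to see how far the total is from 100%. The scores are placeholders apart from the 99.85% mammal value, and the function names are hypothetical.

```python
def as_percentages(scores: dict, decimals: int = 2) -> dict:
    """Format each score as a percentage string, e.g. 0.9985 -> '99.85%'."""
    return {label: f"{value:.{decimals}%}" for label, value in scores.items()}


def total_probability(scores: dict) -> float:
    """Add up the scores; the result need not be exactly 1."""
    return sum(scores.values())


# Placeholder scores apart from the mammal value quoted in the article.
scores = {"mammal": 0.9985, "bird": 2.1e-05, "reptile": 3.4e-04}

print(as_percentages(scores))
print(f"total: {total_probability(scores):.4%}")
```

Very small scores round to 0.00% at two decimal places, so increase `decimals` if you want to compare the unlikely classes with each other.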
Breaking the Model
While I have tested the model, it seems inevitable that other users will have different experiences, so if you’d like help getting it working, or just want to share a funny prediction you got from the model, feel free to email me at firstname.lastname@example.org.
Thus far I have only found one animal which the model could not correctly classify: the fire salamander. This is an amphibian but bears a strong resemblance to lizards, and the model classified it as a reptile. Training the model with a wider range of amphibians and reptiles would likely fix this, though I’m very impressed by how accurate the model is given that it was trained on only approximately 100 animals.
Thank you for reading. I hope you have fun with the animal predictor model. If you would like to get in touch about our Machine Learning services (perhaps for something a little more business-oriented than animal classification) then head to www.dsp.co.uk/Machine-Learning-Consulting, or get in touch by emailing email@example.com.