Data Science in SHEQ assurance – AI and Data Science

Part 1

Data Science will seriously influence the impact health and safety professionals can make within their organisation – for the better! Explore how it will transform your safety, health, environmental and quality (SHEQ) assurance processes in our new 5-part blog series…

What is Artificial Intelligence?

Many of you who have kept up with technological advancements over the past decade have most likely heard the terms “artificial intelligence” and “data science”. Let’s dig into both terms and find out what they really mean, whether we’ll have our own Terminator apocalypse in the next few decades, and whether companies like Amazon can predict the exact moment you will want to buy a particular product and suggest it to you just beforehand, or in other words, “read your mind”.

In the perception of the general public, there are essentially two categories of artificial intelligence (AI): one which exists and one which does not. The latter is the kind of AI you see in science fiction movies like Terminator, Eagle Eye and Blade Runner. We call this artificial general intelligence (AGI): AI which can perform general intelligent action (as humans and other animals do) or perhaps even experience a kind of consciousness. The former is the kind of AI you see in software, websites and other applications such as self-driving cars, virtual assistants and those face-changing mobile apps. We call this applied artificial intelligence: AI for studying specific datasets, solving specific problems or performing specific tasks. Some expect that the continued development of applied AI will eventually lead to the emergence of AGI, although whether and when that will happen remains an open question.

The distinguishing mark of the problems we use applied AI to solve is that they are problems we would previously have called on a human (or at least an animal) to solve. For a long time, the examples mentioned above were handled by human drivers, human assistants and human artists. Meanwhile, the natural strength of computers lies in calculation, which is why we typically leave that kind of work to machines.

Now, where does Data Science come into all this?

Data scientists work to create models that use data to solve a business or science problem. These so-called models are, in essence, mathematical (or, to be more precise, statistical) equations. The equations are then taken through a repeated set of trial-and-error steps (defined by an algorithm) until they describe a dataset as closely as possible. For example, let’s imagine we have a dataset describing the shape and colour of a vegetable, but we don’t know what vegetable it actually is – that’s for the machine to determine. If a vegetable is orange and long, it is probably a carrot, right? Emphasis on the “probably”, as there is a small chance that it’s an oddly shaped sweet potato.
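To make the “trial and error until the equation describes the data” idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the four data points are invented, the equation is the simplest possible one (y = slope * x), and the nudge-and-check loop is just one easy way to do the trial and error, not how any particular product does it.

```python
# Toy data invented for illustration: y is roughly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def mean_squared_error(slope):
    """Measure how badly the equation y = slope * x describes the data."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_slope(start=0.0, step=1.0, min_step=1e-6):
    """Trial and error: nudge the slope up or down and keep what helps."""
    slope = start
    while step > min_step:
        improved = False
        for candidate in (slope + step, slope - step):
            if mean_squared_error(candidate) < mean_squared_error(slope):
                slope = candidate
                improved = True
                break
        if not improved:
            step /= 2  # neither nudge helped, so try smaller nudges
    return slope


best_slope = fit_slope()
print(f"y = {best_slope:.2f} * x")
```

With this made-up data the slope settles close to 2, which is exactly the “describe the dataset as closely as possible” step in miniature.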

Machine Learning

“But how can we trick a machine into thinking like this?”, you ask. Enter “machine learning”. So far, we know that the machine is a computer, right? But how can a computer learn to tell a carrot from an onion just from the colour and shape of the vegetable? This is all done by describing, in a mathematical way, the relationship between the features of a vegetable (colour, shape and perhaps many other characteristics) and what kind of vegetable it is. For example, let’s look at the table below:

Shape	Colour	Vegetable
Long	Orange	Carrot
Roundy	Orange	Pumpkin
Roundy	Brown	Potato
Slightly long	Brown/orange	Sweet potato

We would use this table as a training dataset. In other words, a data scientist gives this table to a computer, and the computer runs an algorithm that produces a model describing the relationship between shape, colour and the vegetable they point to (this step is known as training the model). The trained model can then give an answer together with a probability of that answer being correct.
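For the curious, here is a rough sketch of what “training” could look like for the table above, written in Python. It is deliberately the simplest thing that works: the “model” is just a count of which vegetable was seen for each shape and colour pair. The names `train` and `predict` are made up for this example, and real machine learning libraries use far more capable models.

```python
from collections import Counter, defaultdict

# The training table from above: (shape, colour, vegetable).
TRAINING_ROWS = [
    ("long", "orange", "carrot"),
    ("roundy", "orange", "pumpkin"),
    ("roundy", "brown", "potato"),
    ("slightly long", "brown/orange", "sweet potato"),
]


def train(rows):
    """Count which vegetable was seen for each (shape, colour) pair."""
    model = defaultdict(Counter)
    for shape, colour, vegetable in rows:
        model[(shape, colour)][vegetable] += 1
    return model


def predict(model, shape, colour):
    """Return (vegetable, probability), or (None, 0.0) for unseen features."""
    counts = model.get((shape, colour))
    if not counts:
        return None, 0.0
    vegetable, count = counts.most_common(1)[0]
    return vegetable, count / sum(counts.values())


model = train(TRAINING_ROWS)
print(predict(model, "roundy", "orange"))  # ('pumpkin', 1.0)
```

Because every shape and colour pair appears only once in the table, this tiny model is always 100% sure about pairs it has seen and has no answer at all for pairs it has not, which is one reason real training datasets need to be much larger.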

In this scenario, if we asked the machine “I have a roundy, orange vegetable – what vegetable is it?”, the machine would think somewhere along the lines of:

In a robot voice: “It’s roundy, therefore it can be either a pumpkin or a potato.”
“Oh, it’s also orange, so if it’s roundy and orange then it must be a pumpkin – I’m 100% sure!”
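The robot’s two-step reasoning can be sketched in a few lines of Python. This is only an illustration of the narrowing-down idea, assuming exact matches on each feature; the helper name `narrow` is invented for this example.

```python
# The table from above, one dictionary per row.
TABLE = [
    {"shape": "long", "colour": "orange", "vegetable": "carrot"},
    {"shape": "roundy", "colour": "orange", "vegetable": "pumpkin"},
    {"shape": "roundy", "colour": "brown", "vegetable": "potato"},
    {"shape": "slightly long", "colour": "brown/orange", "vegetable": "sweet potato"},
]


def narrow(candidates, feature, value):
    """Keep only the rows whose feature exactly matches the given value."""
    return [row for row in candidates if row[feature] == value]


# Step 1: "It's roundy, therefore it can be either a pumpkin or a potato."
by_shape = narrow(TABLE, "shape", "roundy")
print([row["vegetable"] for row in by_shape])  # ['pumpkin', 'potato']

# Step 2: "Oh, it's also orange, so it must be a pumpkin."
by_colour = narrow(by_shape, "colour", "orange")
print([row["vegetable"] for row in by_colour])  # ['pumpkin']

# Only one candidate is left, so the machine is 100% sure.
confidence = 1 / len(by_colour)
print(f"{confidence:.0%}")  # 100%
```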

But, depending on the training data (the table above), the machine may not always be 100% sure of its answer. If we asked the machine which vegetable is long and orange, it might think something like:

“Hmm, it’s long, so it could be either a carrot or a sweet potato.”
“It’s also orange, but from my training, I remember that it can be either a carrot or a sweet potato. Probably a carrot, but I’m only 75% sure.”
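Where might a “not quite sure” answer like this come from? One way to picture it is to let partial matches count for something: “slightly long” is a bit like “long”, and “brown/orange” is a bit like “orange”. The Python sketch below does exactly that. Note that the 0.5 weight for a partial match is an assumption chosen for this example, so the percentages it produces (80% and 20%) differ from the illustrative 75% quoted above; a different weight would give a different figure.

```python
# The table from above: (shape, colour, vegetable).
TABLE = [
    ("long", "orange", "carrot"),
    ("roundy", "orange", "pumpkin"),
    ("roundy", "brown", "potato"),
    ("slightly long", "brown/orange", "sweet potato"),
]


def match_score(described, wanted):
    """1.0 for an exact match, 0.5 (an assumed weight) for a partial one."""
    if described == wanted:
        return 1.0
    if wanted in described:  # e.g. "long" appears inside "slightly long"
        return 0.5
    return 0.0


def rank(shape, colour):
    """Score every vegetable, then scale the scores so they add up to 1."""
    scores = {}
    for row_shape, row_colour, vegetable in TABLE:
        score = match_score(row_shape, shape) * match_score(row_colour, colour)
        if score > 0:
            scores[vegetable] = score
    total = sum(scores.values())
    # If nothing matched, scores is empty and an empty result is returned.
    return {vegetable: score / total for vegetable, score in scores.items()}


for vegetable, probability in rank("long", "orange").items():
    print(f"{vegetable}: {probability:.0%}")
```

The carrot comes out on top because it matches both features exactly, while the sweet potato only partly matches each one.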

And there, ladies and gentlemen, we have a very simple machine learning model, an example of applied artificial intelligence.

Leaving the “guess the vegetable” exercise aside, imagine how valuable it would be for a company such as Notify Technology to have a team who could use Health & Safety hazard data (such as time and location) to estimate when and where the next hazard is most likely to occur – this could make employees safer at work, couldn’t it?