Analytics and data science form an interdisciplinary field of processes and systems for extracting insights and knowledge from data in different formats.
This extraction of insights from data is increasingly important to complement the descriptive analytics represented by traditional Business Intelligence projects.
Through the insights hidden in the data, we can predict events and therefore act in a timely manner.
What patterns are hidden in my organization’s data?
How can my clients be grouped?
What kind of profiles do my clients have?
How can I classify new data and examples according to those already classified in the past?
What is the level of risk of a new client?
What can I do to enhance the use of my data?
What knowledge can I extract from my data?
Business Analytics Solutions
These solutions can identify patterns hidden in the data. Typically, by grouping the data or by finding associations within it, we can discover behaviours that allow us to characterize the data.
Some typical approaches are:
- Association rules
- Text mining
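As an illustration of association rules, the sketch below computes the two usual measures, support and confidence, for simple one-item rules. The baskets, the thresholds and the function names are hypothetical and chosen only for this example; it is a minimal sketch, not a full rule-mining algorithm.

```python
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """List one-item rules lhs -> rhs that pass both (assumed) thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


rules = pair_rules(transactions)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as "bread -> milk" with high confidence suggests that clients who buy bread also tend to buy milk, which is the kind of behaviour this approach is meant to surface.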
Predictive Analytics covers a wide range of statistical techniques from predictive modelling, machine learning and data mining. These techniques analyse historical facts in order to predict future situations.
With predictive modelling, a model learns decision criteria from past data, which allows us to classify new, previously unseen examples.
In the field of predictive modelling we have essentially two types of predictive approach:
– if what we want to predict is continuous, we have an Estimation (regression) problem
– if what we want to predict is categorical, we have a Classification problem.
The way it works is to provide the model with a set of example data, previously labelled with the correct answers, so that a classifier can be built.
Once this step is complete, we say that the model is “trained”, and we can give it new unclassified examples as input so that it tries to classify them correctly.
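The train-then-classify workflow described above can be sketched with a deliberately simple classifier. The nearest-centroid method, the class name `NearestCentroidClassifier`, and the client data below are all assumptions made for illustration; any real project would choose a classifier suited to its own data.

```python
from collections import defaultdict
from math import dist


class NearestCentroidClassifier:
    """Toy classifier: each class is summarised by the mean of its training examples."""

    def fit(self, examples, labels):
        grouped = defaultdict(list)
        for x, y in zip(examples, labels):
            grouped[y].append(x)
        # "Training" here is just computing one centroid (mean point) per class.
        self.centroids_ = {
            label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in grouped.items()
        }
        return self

    def predict(self, examples):
        # Each new example receives the label of the closest class centroid.
        return [
            min(self.centroids_, key=lambda label: dist(x, self.centroids_[label]))
            for x in examples
        ]


# Hypothetical labelled examples: (monthly income in thousands, number of late payments).
X_train = [(1.0, 5), (1.5, 4), (1.2, 6), (4.0, 0), (5.0, 1), (4.5, 0)]
y_train = ["high risk", "high risk", "high risk", "low risk", "low risk", "low risk"]

# Step 1: train the model on examples that already carry the correct answer.
model = NearestCentroidClassifier().fit(X_train, y_train)

# Step 2: give the trained model new, unclassified examples.
predictions = model.predict([(1.1, 5), (4.8, 0)])
print(predictions)
```

The two steps stay the same whichever classifier is used: `fit` consumes the labelled history, and `predict` is applied to clients the model has never seen.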
In order to correctly classify input data, you must choose a classifier tailored to the problem and the available data.
To this end, there is a relatively wide range of classifiers, some more academic than others, which can be combined in order to obtain the best possible result:
- Naïve Bayes
- Instance-based
- Decision Trees
- Neural Networks
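To make one of these families concrete, here is a minimal sketch of an instance-based classifier, k-nearest neighbours. No explicit model is built: a new example is labelled by a vote among the closest stored examples. The function name `knn_predict`, the value of `k` and the client data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` with the most common label among its k nearest training examples."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical clients: (age, purchases per year), already classified in the past.
train_X = [(22, 2), (25, 3), (27, 1), (45, 12), (50, 15), (48, 11)]
train_y = ["occasional", "occasional", "occasional", "frequent", "frequent", "frequent"]

print(knn_predict(train_X, train_y, (24, 2)))
print(knn_predict(train_X, train_y, (47, 13)))
```

In practice the features would usually be rescaled first, since Euclidean distance is dominated by whichever feature has the largest numeric range.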
In many situations the outcome we are trying to predict needs to be expressed as a continuous rather than categorical value, such as a price, an amount or an age.
For these cases we cannot use classifiers, which only operate on classes, and we need different approaches such as:
- Linear Regression
- Support Vector Machines
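For the simplest of these approaches, linear regression with a single input, the least-squares line can be computed directly. The sketch below assumes hypothetical data (floor area against price) constructed to lie exactly on a straight line, so the fitted coefficients are easy to check; the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: floor area (m2) and price (thousands), built so that price = 2 * area + 10.
areas = [50, 60, 80, 100]
prices = [110, 130, 170, 210]

slope, intercept = fit_line(areas, prices)


def predict_price(area):
    """Continuous prediction for a new, unseen floor area."""
    return slope * area + intercept


print(slope, intercept, predict_price(70))
```

Unlike a classifier, the output here is a number on a continuous scale, so a new example with an area of 70 receives a price rather than a class.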
To improve on the performance of individual classifiers, there are ensemble approaches that combine several different classifiers, for example through a voting system in which the prediction of each classifier counts as one vote.
Some of the methods used are:
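Majority voting is one common way of combining classifiers. The sketch below is a minimal illustration, assuming three hypothetical rule-based classifiers and an invented client record; in a real system the voters would be trained models rather than hand-written rules.

```python
from collections import Counter


def majority_vote(classifiers, example):
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(example) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three hypothetical, hand-written classifiers, each looking at a single attribute.
def by_income(client):
    return "low risk" if client["income"] >= 3.0 else "high risk"


def by_late_payments(client):
    return "low risk" if client["late_payments"] <= 1 else "high risk"


def by_tenure(client):
    return "low risk" if client["years_as_client"] >= 2 else "high risk"


classifiers = [by_income, by_late_payments, by_tenure]

# Invented client record: good income and long tenure, but several late payments.
client = {"income": 3.5, "late_payments": 3, "years_as_client": 5}

print(majority_vote(classifiers, client))
```

Using an odd number of voters on a two-class problem avoids ties; with more classes or an even number of voters, a tie-breaking rule would have to be chosen.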
To make use of these algorithms, there are tools that apply them to the data and generate results accordingly.
Programming languages such as Python and R are often used; their libraries expose these algorithms through APIs, so that they can be applied to the data directly, with good results.