The term data science may not yet have made it into the official Oxford or Webster’s dictionaries, but it won’t be long. Also known as data-driven science, it is an interdisciplinary field concerned with the scientific methods, processes and systems used to extract knowledge or insights from data in its various forms. Used correctly, it has the potential to clear up decades of incremental spaghetti and mismatched data sets and processes, and to significantly improve an organisation’s bottom line.
Furthermore, data science is not just about the analysis of data. It involves the whole data life cycle: generating, collecting, storing, managing, analysing, visualising, and finally interpreting through story-telling, and can be applied to most industries. The abundant volume and variety of data from a plethora of sources, which is commonly referred to as Big Data, enables and propels data science to fulfil what has been envisaged for a long time: real-time insight-driven decision making within the enterprise at all levels.
“We have been talking about data science for a long time,” says Brickendon Partner and Data Specialist Nathan Snyder. “But now we are reaching a tipping point. The data is loaded and the toolsets are understood and available.”
In other words, we are moving into an era where there is an increased understanding of the data available and a corresponding desire to do more with it. Going forward, businesses that harness this data in the most efficient and effective ways will have an advantage that can fuel performance and help them stay ahead of the competition.
The question is how to do this. During a recent panel discussion hosted by interview site The Cube in conjunction with IBM, data science was likened to Batman. Unlike Superman, who uses his super-human powers to rescue heroines and ward off the baddies, Batman’s trademark is the tools and technologies he uses to make the things he wants to happen, happen.
Tools and technologies
In the same way, effective use of data science requires not only an insight into, and understanding of, the data available, but also a certain knowledge of the tools and technologies that can help this process.
For businesses, the key is to focus on using data and the appropriate technologies to predict what will happen and prescribe what should be done, rather than just focusing on the past. Done correctly, this will allow firms to move from managing processes to optimising them. “It is the difference between fixing a broken part and assigning an engineer in advance to a part we know will break,” says Snyder.
Many of these developments are starting to be accepted and the way data is used is beginning to change. There is already a move away from descriptive and diagnostic analytics, which look at what happened and why, towards a predictive and prescriptive focus, which looks at what will happen and what should be done. In other words, this is value-added analytics: data fuels predictive algorithms that businesses can use to drive consumer behaviour and meet consumer needs. This in turn helps fuel advances in science and artificial intelligence.
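To make the distinction concrete, the short Python sketch below walks through Snyder's broken-part example. It is an illustration only: the wear readings, the failure threshold and the engineer lead time are all invented for the example, and a simple straight-line fit stands in for whatever predictive model a firm might actually use.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical weekly wear readings for one machine part (illustrative data).
weeks = [1, 2, 3, 4, 5]
wear_mm = [1.0, 2.0, 3.0, 4.0, 5.0]

FAILURE_THRESHOLD_MM = 8.0  # assumed wear level at which the part breaks
LEAD_TIME_WEEKS = 2         # assumed notice an engineer needs

# Descriptive: what happened? Summarise the readings seen so far.
average_wear = mean(wear_mm)

# Predictive: what will happen? Extend the trend to the failure threshold.
slope, intercept = fit_line(weeks, wear_mm)
predicted_failure_week = (FAILURE_THRESHOLD_MM - intercept) / slope

# Prescriptive: what should be done? Book the engineer before the failure.
schedule_engineer_week = predicted_failure_week - LEAD_TIME_WEEKS

print(f"Average wear so far: {average_wear:.1f} mm")
print(f"Predicted failure in week {predicted_failure_week:.0f}")
print(f"Assign engineer in week {schedule_engineer_week:.0f}")
```

The descriptive step only reports on the past. The predictive and prescriptive steps are what allow the engineer to be assigned before the part fails rather than after.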
It is also no longer all about putting large amounts of data into a big database and hoping everyone uses it. Instead, there is a focus on Logical Data Warehouses (LDWs) and the creation of semantic maps with a view to getting the right targeted data out.
Going forward, the four-tiered approach to data science, incorporating the Data Architecture Framework (DAF), Information Engine (IE), Advanced Analytics Engine (AAE) and Data Portal (DP), could be of great benefit to companies seeking to make the most of their data:
- the architecture stage sets the framework for data to be sourced, integrated and organised into LDWs and Data Lakes, improving access to all data types;
- the Information Engine provides standard analytics capabilities as well as advanced visualisation tools, using drag and click technology such as Tableau and Qlik Sense, to allow business users to play with data that has been sourced, catalogued and assigned trust ratings, and to see what they can find;
- the Advanced Analytics Engine enables the use of machine learning, deep learning and neural networks, facilitating work on large heuristic problems in the predictive and prescriptive space, allowing longer lead times and potentially a bigger impact; while
- the Data Portal offers self-service access and streaming capabilities with Application Programming Interfaces (APIs), User Interfaces (UIs) and reporting solutions, which can utilise any of the underlying information or advanced analytics engines.
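As a rough illustration of how the four tiers might fit together, the Python sketch below models each one as a small class. Every class, method and field name here is hypothetical, the sales figures are invented, and a naive trend forecast stands in for the machine learning that a real Advanced Analytics Engine would run. It is a toy under those assumptions, not a reference design.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Dataset:
    name: str
    rows: list           # list of dicts, one per record
    trust_rating: float  # assumed scale: 0.0 (untrusted) to 1.0 (trusted)


class DataArchitectureFramework:
    """Tier 1: sources and catalogues data so it can be found later."""

    def __init__(self):
        self._catalogue = {}

    def register(self, dataset):
        self._catalogue[dataset.name] = dataset

    def get(self, name, min_trust=0.0):
        dataset = self._catalogue[name]
        if dataset.trust_rating < min_trust:
            raise ValueError(f"{name} is below the required trust rating")
        return dataset


class InformationEngine:
    """Tier 2: standard analytics over catalogued data."""

    def __init__(self, daf):
        self.daf = daf

    def summarise(self, name, column):
        values = [row[column] for row in self.daf.get(name).rows]
        return {"count": len(values), "mean": mean(values), "max": max(values)}


class AdvancedAnalyticsEngine:
    """Tier 3: predictive work (a naive trend stands in for ML here)."""

    def __init__(self, daf):
        self.daf = daf

    def forecast_next(self, name, column):
        values = [row[column] for row in self.daf.get(name).rows]
        deltas = [b - a for a, b in zip(values, values[1:])]
        return values[-1] + mean(deltas)  # last value plus average change


class DataPortal:
    """Tier 4: one self-service entry point over both engines."""

    def __init__(self, ie, aae):
        self.ie = ie
        self.aae = aae

    def query(self, kind, name, column):
        if kind == "summary":
            return self.ie.summarise(name, column)
        if kind == "forecast":
            return self.aae.forecast_next(name, column)
        raise ValueError(f"unknown query kind: {kind}")


# Illustrative data only.
daf = DataArchitectureFramework()
daf.register(Dataset(
    name="sales",
    rows=[{"revenue": 100.0}, {"revenue": 120.0}, {"revenue": 140.0}],
    trust_rating=0.9,
))
portal = DataPortal(InformationEngine(daf), AdvancedAnalyticsEngine(daf))

print(portal.query("summary", "sales", "revenue"))
print(portal.query("forecast", "sales", "revenue"))
```

The point of the layering is that a business user only ever talks to the portal. The sourcing, the trust ratings and the choice of engine stay behind that single interface.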
As with any new development, the key is to know what you have and then to decide how to use it for the best outcome. With the proliferation of data in society today, and the increased regulatory focus on that data, handling it correctly and learning how to use it to your advantage has never been more important.
Data science is no longer just for the business. Lower costs and higher data availability mean that it can now be used for the operational engine of the organisation and not just in the client sales process. In large global organisations this could have an even bigger impact on the bottom line, clearing up decades of incremental spaghetti and mismatched data sets and processes. Data science is the future – now is the time to prepare.