The big data revolution is in full swing. It is currently difficult to read an industry journal or attend a seminar without it getting a mention. Insurers have always been at the forefront of data analytics, and this data-led revolution offers massive potential to those able to harness it.
This article looks at where this revolution may take the insurance industry over the next decade and what insurers and other service providers need to put in place to take advantage of these developments.
We focus on three key questions:
- What exactly is big data and where does it come from?
- In what areas of insurance does big data have the most potential?
- What do insurers and other participants need to do to position themselves to take advantage of this revolution?
We are clearly only at the beginning of this journey, but here we look forward to establish where big data might bear its first fruit for insurance organisations.
Notwithstanding big data’s huge potential, many commentators portray it as being all things to all people – to the extent that it will make traditional data sources and analytical techniques redundant. We need to be careful not to overstate where we are now and where we are likely to be in a decade’s time. Gartner1 declared in 2013 that big data had reached the peak of its “hype cycle” – a point in time when the benefits of these new techniques are perhaps oversold.
What is big data and where does it come from?
The term “big data” means different things in different contexts and there is no rigid definition. At a high level, big data refers to any large unstructured data source that can be collected, organised and analysed to produce business intelligence.
One of the great benefits of the exponential growth in computing power over the past two decades is that it has opened up data sources that were previously inaccessible due to their volume, structure and volatility. These new data sources may be rich with information but they also contain lots of “noise”. The big data revolution is largely about distilling one from the other and using the results to inform decisions.
Where does it come from?
Big data can come from a huge number of sources. Most insurers already store substantial amounts of information about their policyholders and processes that they do not currently harness. Beyond these traditional sources, there is a wealth of external data sources for an insurer to consider – these include consumer behaviour data, credit information, geospatial data and, of course, social media.
In a 2012 survey of nearly 800 organisations2, IBM asked respondents who had indicated they were using big data to identify which sources of data they were collecting and analysing. The results showed that most respondents were still mainly using internal data sources.
How can an organisation harness the endless streams of information propagated by websites such as Facebook, YouTube and Twitter? What infrastructure will they need to do this? All sorts of businesses are asking themselves how these relatively cheap sources of information can help them to address challenges within their businesses and to unlock competitive advantages.
Where might big data have its biggest impact?
Insurance has always been a data-hungry industry. Some insurance functions lend themselves readily to analytic methods – these include pricing, reserving and risk analysis. While these are the traditional playgrounds for data-driven analytics, new data sources and improvements in modelling techniques are opening up a range of possibilities. Other core insurance functions that can start to use big data include marketing, distribution, claims management, customer insights and fraud detection.
In this section we take a look at what will likely be the initial battlegrounds for the use of big data within an insurance organisation.
Fraud detection
Insurers and policyholders have much to gain by minimising fraudulent activity. The Insurance Fraud Bureau of Australia estimates that fraud costs the local insurance industry more than $2.2 billion annually, about 10 per cent of total claims costs3.
Fraud management has been touted as one of the first beneficiaries of the big data revolution. Some large insurers already have substantial fraud detection programs that in some way rely on the analysis of large unstructured data sources.
Until now, most fraud detection programs have relied on a combination of fixed business rules and searching through masses of unstructured data sources – often text fields. They depend heavily on having a pre-existing notion of where and how fraud arises.
More modern techniques are allowing insurers to investigate possible patterns of fraud in an unstructured way. Several large software providers have dedicated fraud detection and management platforms that rely in part on the analysis of big data sources. These methods use “non-traditional” data and seek to identify fraud at a variety of places within the insurance cycle.
Developing these techniques will depend heavily on being able to analyse free-form text fields, voice and video sources, as well as real-time social media feeds.
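As a simplified illustration of the rules-based end of this spectrum, the sketch below scores free-text claim descriptions against indicator phrases. The phrases, weights and threshold are invented for the example; a real fraud program would derive them from investigator feedback and model training rather than a hard-coded list.

```python
# Hypothetical indicator phrases and weights -- illustrative only.
FRAUD_INDICATORS = {
    "cash only": 3,
    "no witnesses": 2,
    "lost receipt": 2,
    "whiplash": 1,
}

def flag_score(claim_text: str) -> int:
    """Score a free-text claim description against indicator phrases."""
    text = claim_text.lower()
    return sum(weight for phrase, weight in FRAUD_INDICATORS.items()
               if phrase in text)

def needs_review(claim_text: str, threshold: int = 3) -> bool:
    """Route high-scoring claims to a human investigator."""
    return flag_score(claim_text) >= threshold
```

The fixed-rules approach depends, as the article notes, on a pre-existing notion of how fraud arises; the more modern techniques discussed above aim to surface patterns that no one thought to encode as a rule.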
Marketing and customer insights
Most marketing departments are already familiar with using both proprietary and external data to assist in attracting the customers they want. Big data sources are opening up a new suite of possibilities in how to go about “measuring” and “attracting” these customers.
One of the most immediate uses of big data sources is in informing marketing activities. Analysing behaviour via social media and other “clickstreams” (the tracking of user behaviour on websites) is fast becoming a powerful means of understanding consumers. The patterns that are uncovered in these processes can assist in answering key marketing questions, such as:
- Where have our customers been (online) before us?
- Where do they go after us?
- What consumer characteristics define our target market?
Targeted and mass marketing campaigns have a lot to gain from rich, big data sources. The ability to attract customers with particular characteristics relies on the marketing department being able to identify and isolate them via descriptive variables.
Social media provides some of the most dynamic forms of big data. With the possible recording of every click and every choice on social media platforms, there is massive potential to gather marketing insights.
Customer lifetime value
Understanding the lifetime value of a customer is becoming a core skill for insurers. This involves combining traditional and new data sources to map out a customer’s journey through the organisation. By analysing customer behaviour, text and voice sources, social media feeds and other external sources, insurers can begin to answer questions relating to a customer’s lifetime value.
By building dynamic models of a customer’s path through an organisation, we can develop metrics that describe their likely profitability. If we are able to capture this dynamic and then identify these characteristics in potential new customers (including using big data sources), we can form targeted marketing strategies at a much more precise level than previously achievable.
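A heavily simplified sketch of the kind of metric involved: discounting expected future margins by a constant annual retention probability. The constant-retention assumption and the parameter values are illustrative only; a genuinely dynamic model would let retention vary with the customer's observed behaviour and path through the organisation.

```python
def expected_lifetime_value(annual_margin: float,
                            annual_retention: float,
                            discount_rate: float = 0.05,
                            horizon_years: int = 20) -> float:
    """Discounted sum of expected annual margins, assuming a constant
    retention probability each year (a strong simplification)."""
    value = 0.0
    for year in range(1, horizon_years + 1):
        survival = annual_retention ** year          # P(customer still active)
        value += annual_margin * survival / (1 + discount_rate) ** year
    return value

# e.g. a customer contributing $100 a year with 80% annual retention
clv = expected_lifetime_value(100.0, 0.8)
```

The point of the big data sources is to estimate the retention and margin inputs at the level of an individual customer rather than a broad segment.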
Underwriting and pricing
Sensor-based pricing is attracting significant interest across a number of insurance applications. Detailed information about the insured is transmitted, stored and analysed to inform the underwriting and pricing processes.
In the life insurance context, sensors are attached to the body and relay vital information about the health of the individual for analysis. This information can be used in a variety of ways for the benefit of both insurer and insured.
While the life insurance industry is only in the very early stages of developing such applications, they have already come to fruition in motor insurance – most notably through telematics-enabled products.
Insurance telematics has become one of the first big data applications for insurance companies. Motor policies that use telematics devices have already become commonplace in the UK and Italy, where the market dynamics promoted rapid acceptance of the product.
Telematics devices have the potential to transmit massive volumes of data, relating to the insured’s driving behaviour as well as the performance and condition of the vehicle. This data can be combed to gain insights about the insured’s risk level. The learnings can also influence marketing and claims functions.
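As a toy illustration of combing this data, the sketch below condenses trip-level telematics events into a weighted events-per-100-km score. The `Trip` fields, weights and event definitions are invented for the example and are not industry values.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    distance_km: float
    harsh_brakes: int      # decelerations beyond some threshold
    speeding_seconds: int  # time spent above the posted limit

def risk_score(trips: list[Trip]) -> float:
    """Illustrative score: weighted driving events per 100 km driven."""
    total_km = sum(t.distance_km for t in trips)
    if total_km == 0:
        return 0.0
    events = sum(2.0 * t.harsh_brakes + 0.01 * t.speeding_seconds
                 for t in trips)
    return 100.0 * events / total_km
```

A production system would work from raw accelerometer and GPS streams and calibrate the weights against observed claims experience; the sketch only shows the shape of the distillation step.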
Beyond these traditional areas, the masses of information transmitted by a telematics device can provide other ancillary benefits to both insurer and insured. It can enable better roadside breakdown assistance, vehicle servicing and claim recoveries. It can also provide further methods of interacting with the policyholder on a daily basis, through upload and download of content.
Claims management
Claims processes are at the very heart of an insurer’s operations. It’s through these processes that the majority of an insurer’s money generally goes out the door. This is also the time when the relationship with the policyholder is at its most sensitive.
The use of big data can improve the claim process in a number of areas: recoveries, loss adjustment, at-fault determination and fraud detection. Another area with substantial big data potential is claims triage and lifecycle management.
Smart claim processing
Some claims progress quickly through the claims process, while others take more time to settle; the time taken does not always relate to the size of the claim. A deep understanding of the path a claim takes can provide valuable insights into the costs of handling claims.
Straight-Through-Processing (STP) of claims is becoming an important component of the claims management process. This involves being able to identify the characteristics of claims that lend themselves to fast processing, payment and settlement. Improving the speed of settlement has benefits for both insurer and insured.
Claim “lifecycle” optimisation takes this a step further and looks to model the entire claim lifecycle and optimise it. The elements to be optimised include a range of metrics such as the financial result (costs of finalising and handling the claim), as well as claimant “satisfaction” levels.
Many large hospitals already use modern analytical techniques (such as network theory) to find the best way to triage patients and manage them in the most effective manner. The similarities with an insurer’s claims process are obvious.
Much of the information that will help develop these new disciplines is in the form of big data. This includes text mining of claim details, voice recognition of claimant interactions and profiling of third party claim providers, such as repairers and adjusters.
Preparing for a brave new world
Even with all the talk around big data, we are only peering in the front door at the moment. Indeed most organisations are not even in a position to consider moving further. Nobody can say how the next decade will pan out for big data in insurance, but insurers need to be preparing themselves so they can adapt as this new frontier develops.
We now consider what an insurer needs to do to get prepared to take advantage of the data revolution. What infrastructure, systems and people are going to be required in this brave new world?
Handling and warehousing data
Before an organisation can position itself to benefit from the explosion of data sources, it needs to be able to cope with them logistically. By definition, big data consists of massive volumes of volatile, unstructured data – and that data needs to be captured, stored and analysed.
High-capacity data storage capabilities form the cornerstone of the big data infrastructure. Most organisations do not currently have the requisite hardware or software in place to perform in a big data environment.
A distributed, “in-memory” environment is required to satisfactorily cope with high-velocity data:
- “In-memory” databases harness the internal memory of a system to process information, rather than reading and writing to and from a disk such as a hard drive. This greatly enhances the volumes and velocity of data that can be processed.
- Distributed computing allows the computers across a network to coordinate their activities and communicate with each other, via specific software. “Clusters” of computers can be arranged to process data simultaneously, to improve the efficiency of calculations.
A number of different platforms have been developed in recent years to meet these growing needs. Hadoop4 has brought distributed computing capabilities to organisations of all sizes – from small consulting firms that maintain a cluster of fewer than five computers, to the likes of Google and eBay, which have thousands of computers tied together in a cluster, crunching data.
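The split/process/combine pattern that platforms such as Hadoop industrialise across a cluster can be sketched in miniature with a worker pool on a single machine. This is a toy illustration of the pattern only, not a substitute for a distributed platform:

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(data, n_chunks):
    """Split the data into roughly equal chunks, one per worker."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

def map_reduce(data, map_fn, reduce_fn, workers=4):
    """Map each chunk in parallel, then reduce the partial results --
    the same split/process/combine shape a cluster framework uses."""
    chunks = chunked(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_fn, chunks))
    return reduce_fn(partials)

# e.g. total premium across a large in-memory list of policy records
premiums = [100.0] * 1_000
total = map_reduce(premiums, sum, sum)
```

On a real cluster the chunks live on different machines and the framework handles scheduling and failure recovery, but the map-then-reduce shape is the same.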
Similarly, NoSQL (“not only SQL”) databases allow data to be stored and retrieved without the tabular relations used by relational databases, which can greatly improve the performance of data analytics over unstructured sources.
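A minimal sketch of the document-store idea behind many NoSQL databases: records are schema-free dictionaries keyed by id, so two records need not share the same fields, unlike rows in a relational table. This is an in-memory toy for illustration, not an example of any particular NoSQL product's API.

```python
class DocumentStore:
    """Toy document store: schema-free records keyed by id."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        self._docs[doc_id] = doc

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, **criteria):
        """Return documents whose fields match all criteria."""
        return [d for d in self._docs.values()
                if all(d.get(k) == v for k, v in criteria.items())]

store = DocumentStore()
# Two claims with entirely different shapes can sit side by side.
store.put("c1", {"type": "motor", "telematics": {"harsh_brakes": 3}})
store.put("c2", {"type": "home", "flood_zone": True})
motor_claims = store.find(type="motor")
```

The flexibility matters for big data precisely because unstructured sources rarely arrive with a stable set of columns.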
Analytical capabilities
Some of the analytical tools traditionally used in the insurance industry won’t cut it in a big data world. While underwriting and pricing functions have always relied heavily on analytics, other areas such as sales, marketing and claims processes have lagged in their use of more sophisticated analytics.
There are a great variety of statistical techniques that are useful in understanding big data sources. Some of these have been around for a while, such as simulation, optimisation and predictive modelling. Other techniques are relatively new and it is in these areas that insurers (and their partners) will likely need to up-skill their workforces.
Making sense of voice and text
One of the key challenges in an insurer’s ability to harness these new data sources is decoding the language we use – both spoken and written – into “computer-speak”. Being able to interact effectively with the spoken and written word in real time will enable the use of big data techniques and deliver learnings down at the “front line”.
Machine learning
Machine-learning techniques are useful when there are no pre-established ideas about the interrelationships between data elements. These methods are essential when approaching data sets so large that prescribing all the “causal” relationships between variables is too difficult – at least in the first instance.
Cluster analysis is not a new technique but is gaining a new life in the deconstruction and segmentation of unstructured data sources. Many of the statistical packages used by large insurers will have “modules” that can perform cluster analysis – the challenge is having the in-house skills to prepare the data and then to interpret and communicate the results.
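To make the idea concrete, here is a bare-bones k-means clustering of a single numeric feature (say, claim size) in plain Python. In practice an insurer would use a statistical package's clustering module over many features at once; this sketch only shows the mechanics of assign-then-recentre.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain k-means on one numeric feature."""
    # Seed centroids by spreading picks across the sorted values.
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to its cluster mean.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# e.g. two obvious groups of claim sizes separate cleanly
centres, groups = kmeans_1d([1, 2, 3, 100, 110, 120], k=2)
```

The hard parts in practice are the ones the article names: preparing the data and interpreting what the resulting segments mean for the business.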
Network theory and link analysis
Link analysis is an area of network theory used to evaluate the relationship between elements of a data set. The techniques can assist by “visualising” patterns in the data, and are an important modern tool in the fraud detection space. Indeed, Suncorp currently uses some of these techniques to assist in its claim triage and management processes.
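A toy sketch of the link-analysis idea: claims that share an attribute value (a phone number, a repairer) are connected, and an unusually large connected group may warrant an investigator's attention. The claims data below is invented for the example and does not reflect Suncorp's or any insurer's actual process.

```python
from collections import defaultdict

# Invented example records: which attributes does each claim carry?
claims = {
    "CL1": {"phone": "555-0101", "repairer": "FastFix"},
    "CL2": {"phone": "555-0101", "repairer": "AutoBody"},
    "CL3": {"phone": "555-0202", "repairer": "AutoBody"},
    "CL4": {"phone": "555-0303", "repairer": "SmithCo"},
}

def build_links(claims):
    """Connect claims that share any attribute value."""
    by_value = defaultdict(set)
    for cid, attrs in claims.items():
        for value in attrs.values():
            by_value[value].add(cid)
    links = defaultdict(set)
    for group in by_value.values():
        for a in group:
            links[a] |= group - {a}
    return links

def connected_component(links, start):
    """Depth-first traversal collecting every claim reachable from start."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in links[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

ring = connected_component(build_links(claims), "CL1")
```

Here CL1 and CL2 share a phone number and CL2 and CL3 share a repairer, so all three fall into one group, while CL4 stands alone. Visualising such graphs is what makes the patterns apparent to a human investigator.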
Clearly only the larger organisations would have the resources to develop all these capabilities internally and much of the investment will need to be undertaken for them by their service providers.
Big data people
Computers can only take insights from data so far. From there, skilled people are needed to interpret and communicate what has been found in the data. And while computers rarely err, the people who configure them and interpret their output can.
Insurers who wish to reap the benefits of big data will need to attract people with a new breed of technical skills. Many of the required skills won’t currently exist on an insurer’s payroll, particularly at smaller insurers.
Initially, IT departments will require staff familiar with the infrastructure required to support big data capabilities. From the back-end to the front-end, there is a requirement for a range of new hardware and software skill sets before an insurer can “harvest” these new data sources.
The analytical capabilities of staff will also need refreshing. The ability to interpret the results of data mining activities and unsupervised learning techniques will become a crucial skill set for an insurer, or its partners, to develop. Organisations need to start investing in the people that will be able to drive a data-led decision framework.
Where are we headed?
We are only at the very beginning of this data-led revolution. Initially the path will be quite steep, but so will the gains for those who are on the front foot in adopting new techniques to assist with business decisions. New data sources and analytical capabilities have the potential to be to the twenty-first century what coal and steel were to the nineteenth – fuelling massive gains in productivity and, in turn, establishing new jobs and industries.
Insurance has always been a data-intense industry and it sits at the forefront of the big data revolution. Marketing, pricing, underwriting and claim management processes all stand to benefit substantially from the development of these new techniques. This will require substantial investment in infrastructure and personnel before an organisation will be in a position to implement business solutions that can benefit from these varied, volatile and high velocity data sources.
1 Gartner Inc. – Hype Cycle Report. 2013.
2 Analytics: The real world use of big data. IBM Global Business Services, Said Business School. 2012.
3 insuranceNEWS.com.au – 4 June 2012.
4 Apache’s Hadoop™ is open source software that allows large data sets to be analysed across a cluster of computers.