(ARTICLE 2 in the Series)
Automation in Pet Insurance
If you think about it, pet insurance should not be much different from human medical insurance. Pets need vaccinations and wellness checks, and they can get sick or hurt just as humans do. They can also suffer from serious illnesses like cancer and diabetes, whose treatments quickly get expensive. Yet the market is still small, as most people simply pay out of pocket. Consequently, standardization and automation of processes have not yet fully taken place.
Sometimes, being behind in the technology curve can be a good thing. One can skip a cycle and leapfrog into the latest wave (as happened with cellular phones in Africa). Larger companies with some level of legacy automation will take longer to adopt the latest technologies. The pet insurance business can leapfrog this cycle instead of playing catch up with the adoption of Machine Learning and AI.
In the world of pets and vets, a pet owner directly submits a claim against incurred medical expenses. The claims may be submitted via mobile apps, web apps, email, fax, etc. Most insurance companies are trying to encourage digital submissions using their online tools so as to avoid paper handling. The claim will include a standard insurance claim form and various receipts and medical reports sent by the doctor and hospital.
The claim is then processed by a claims adjuster who often has to manually re-enter information from the various documents received into the claims application. The adjuster then translates the line items to a category, sub-category codes and validates against the customer’s policy to determine what is covered, amount of co-pays, etc. Larger companies may split the task by using lower-paid contractors for data entry while using the adjuster’s knowledge and experience to focus on the categorization and validation of the claim. Different mechanisms may be used for fraud detection (is only one dog insured while multiple are claimed?).
Businesses also need access to analytics – these are typically derived from the claims database in terms of count of claims processed, amounts claimed vs paid, etc.
How can AI Help?
Artificial Intelligence is the ability of a system to learn and automatically improve itself over time. This goes beyond building Machine Learning models periodically to automate the entire process of learning and tuning the algorithms.
With the use of Machine Learning and Deep Learning technologies (and AI), a significant portion of the claims process can be automated while providing additional new insights. Let’s look at some of them.
1. Invoice Automation
One complexity in automating document ingestion is that invoices and medical records can vary widely across doctors and service providers. Handwritten comments and diagnoses complicate the situation further. Pure OCR alone is insufficient to extract the necessary information from these documents. Machine Learning can improve the extraction process and recognize the various fields in claim and invoice forms, such as the provider's name and address.
For embedded handwritten text, images, etc. deep learning techniques using neural networks are required. With the use of AI, the system can improve itself over time requiring less manual validation.
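To make the field-extraction idea concrete, here is a minimal sketch of pulling fields out of OCR'd invoice text. The invoice text, field labels, and patterns are all hypothetical; a production system would learn field locations from labeled examples rather than rely on fixed labels.

```python
import re

# Hypothetical OCR output from a scanned invoice; labels vary by provider.
ocr_text = """
Provider: Happy Paws Veterinary Clinic
Address: 123 Main St, Springfield
Invoice Total: $245.00
"""

# Simple label-based patterns; an ML-based extractor would generalize
# to invoices that do not use these exact labels.
FIELD_PATTERNS = {
    "provider": re.compile(r"Provider:\s*(.+)"),
    "address": re.compile(r"Address:\s*(.+)"),
    "total": re.compile(r"Total:\s*\$?([\d,]+\.\d{2})"),
}

def extract_fields(text):
    """Return whichever fields the patterns can find in the OCR text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1).strip()
    return fields

print(extract_fields(ocr_text))
```

The rule-based version works only for invoices that match the patterns; the point of applying Machine Learning is to replace these brittle rules with a model trained on many labeled documents.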
2. Medical Records Analysis
It is a well-known fact that not only is a doctor’s handwriting illegible but that no two doctors will describe an ailment in the same way!
With a lack of standardization, it is left to humans to translate the description of the services performed into codes that the claims application and database can understand and process.
This task can be handled by Machine Learning as well. Learning from existing medical records, a model can predict and categorize new medical records correctly – and do so better than a human in many cases! This is especially useful when dealing with complex, rare diagnoses or medicines.
Think about it this way – you buy an exotic vegetable (say, chayote squash) at a grocery store and bring it to the checkout counter. The clerk doesn't know what it is, or even if they do, they have to look up the code manually since it's not an often-purchased item and so isn't retained in memory. However, an image recognition application would instantly recognize it and pull up the relevant code.
3. Claims Analytics
When all the data flows through an AI application, it can provide advanced analytics resulting in new insights.
4. Fraud Detection
Fraud is a huge issue in many forms of insurance and can significantly reduce profits. Today, financial companies have built sophisticated fraud detection algorithms using big data and AI techniques. The insurance industry can do the same. As more claims get processed, big data and machine learning can help predict potential fraud.
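As a minimal illustration of the statistical side of fraud detection, the sketch below flags claim amounts that deviate sharply from a policy's history. The claim amounts are invented, and real detectors combine many signals (claim frequency, provider patterns, shared industry databases), not amount alone.

```python
import statistics

# Hypothetical claim amounts for one policy.
claims = [120.0, 95.0, 140.0, 110.0, 2400.0, 105.0]

def flag_outliers(amounts, threshold=2.0):
    """Flag amounts more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    return [a for a in amounts if abs(a - mean) / stdev > threshold]

print(flag_outliers(claims))
```

An outlier is only a signal for review, not proof of fraud; machine learning models add context (diagnosis, provider, policy terms) before a claim is escalated.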
5. Marketing Analytics
Individual customer, patient and provider behavior can be analyzed as well as an aggregated understanding of regions, states, demographics, etc. Enriching the data with external data sets can provide new insights for marketing and sales organizations. Targeted ad campaigns can reduce expenses while improving customer acquisition.
6. Customer Service
Improving customer service is a goal of every organization. Pet insurance companies are generally ahead of the game, providing easy-to-use web portals and mobile apps. Filing claims through mobile apps with minimal documentation enhances the customer experience.
With Big Data and AI, it is possible to automate the analysis of customer service calls and emails allowing customer service agents to proactively address complaints and issues.
Looking for a database that can both store and retrieve your time-series data efficiently? Many databases today can handle time-series data. We started out using OpenTSDB but then switched to TimescaleDB. In this article, we examine the characteristics of time-series data and the requirements for handling it.
With OpenTSDB, combining the metadata (stored in PostgreSQL) with the time-series data proved clumsy. To remove that burden, we switched to TimescaleDB. We will also compare TimescaleDB against OpenTSDB on key technical features and architecture.
Time-series Data:
Let's have a look at the basic definitions of time-series data:
- A discrete sequence of data points taken at successive, equally spaced points in time, indexed in time order.
- Data that collectively represents how a system, process, or behavior changes over time.
Types of Time-series Data:
- Seasonal effect (Seasonal Variation or Seasonal Fluctuations)
- Other Cyclic Changes (Cyclical Variation or Cyclic Fluctuations)
- Trend (Secular Trend or Long Term Variation)
- Other Irregular Variation (Irregular Fluctuations)
Examples include application monitoring data, weather data, stock market data, etc.
In today's digital environment, a great deal of data is gathered by various devices and applications – for example, current location, browsing data, and personal fitness trackers. In such scenarios, it is important to store this data in an effective time-series database for future predictions and forecasts.
Fig 1: Example of time-series data points
TimescaleDB Overview:
TimescaleDB is the first time-series database specifically designed for scale, ease of use, and complex queries. As an extension of PostgreSQL, it provides the following:
- Automatic partitioning across time and space (partitioning key)
- Full SQL support
- Easy to use; like a relational database
- Fast and parallel ingestion
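Getting these features requires only familiar SQL. Below is a hedged sketch of creating a hypertable; the table and column names are illustrative, and the `create_hypertable` call uses the classic time-plus-space partitioning form.

```sql
-- Hypothetical sensor-metrics table; names are illustrative.
CREATE TABLE conditions (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT        NOT NULL,
  temperature DOUBLE PRECISION
);

-- Convert it into a hypertable, partitioned by time and, as the
-- space dimension, by device_id.
SELECT create_hypertable('conditions', 'time',
                         partitioning_column => 'device_id',
                         number_partitions   => 4);
```

After this call, `conditions` is inserted into and queried like any regular PostgreSQL table; chunking happens transparently underneath.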
Fig 2: PostgreSQL and TimescaleDB – A Comparison of Insert Rates
As the figure above (Fig 2) shows, insert rates in PostgreSQL drop as the dataset grows, while TimescaleDB maintains a steady insert rate irrespective of dataset size. This greatly improves the performance of applications built on top of TimescaleDB.
TimescaleDB executes queries on a hypertable, which comprises many chunks partitioned by time and space yet looks and behaves like a regular table.
“ Time-series data is largely immutable. Writes primarily occur as new appends to recent time intervals, not as updates to existing rows. Both read and writes have a natural partitioning across both time and space.”
- TimescaleDB developers
Data Handling in TimescaleDB:
Hypertable abstraction: The table is abstracted as a hypertable composed of many right-sized chunks partitioned by time and space.
Optimized query execution: During query execution, aggressive constraint exclusion ensures that only the chunks needed to retrieve the data are scanned.
Data model: TimescaleDB follows the wide-table model, which keeps related measurements together in one row and makes it easier to compare and correlate data.
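Constraint exclusion in practice means a time-bounded query never touches old chunks. A sketch, assuming the illustrative `conditions` table with `time`, `device_id`, and `temperature` columns:

```sql
-- Only chunks overlapping the last 24 hours are scanned; older chunks
-- are excluded from the plan entirely.
SELECT time_bucket('1 hour', time) AS hour,
       device_id,
       avg(temperature) AS avg_temp
FROM conditions
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY hour, device_id
ORDER BY hour;
```

`time_bucket` is TimescaleDB's grouping helper for fixed time intervals; running `EXPLAIN` on such a query shows that out-of-range chunks are pruned from the plan.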
Benchmarking TimescaleDB:
Our use case required running complex aggregation queries while also supporting simultaneous ingestion of incoming time-series data. To ensure that the chosen platform can handle this load, we did some benchmarking.
An ingestion application kept pumping data into the database. We ran three types of queries that access varying numbers of rows. Each query was run several times to ensure stable results. Execution times varied only slightly, because each query hits only the chunks that satisfy its filter conditions. The results are shown in the table below. We will describe the individual queries in a future blog post.
Results of TimescaleDB benchmark:
| Records Retrieved | Query 1 Execution Time (seconds) | Query 2 Execution Time (seconds) | Query 3 Execution Time (seconds) |
| --- | --- | --- | --- |
TimescaleDB (v0.4.2) vs. OpenTSDB (v2.3.0)
Based on our use case of handling and storing time-series data for an IoT implementation in a Big Data environment, TimescaleDB compares favorably on features such as partitioning, data retention, access methods, and compatibility with scripting for automation. Moreover, ease of access and simple retrieval of data make it more convenient than the other time-series databases we have used.
(ARTICLE 1 in the Series)
The Rise of Big Data
Big Data technologies made it possible for enterprises to capture and integrate diverse sets of data. They were no longer constrained to data warehouses for analytics and reporting. Big Data allowed the integration of third-party syndicated data sets and social media data such as tweets and blogs. In addition, it helped break down silos between the various divisions within the enterprise, democratizing data access and helping to gain new insights from data.
The enriched big data sets can be used not just to understand the past but to make predictions about the future – which customers are likely to churn, which customers or equipment are most likely to generate new claims, which products are most likely to succeed, etc.
We are now in the next wave of deriving value from data using AI-powered applications. The big breakthrough for this wave is the ability to use AI-powered neural networks to solve a wide variety of problems including autonomous driving vehicles, natural language understanding, image recognition, etc. Translating these technological advancements to real business use cases will result in significant operational benefits – reducing cost, providing faster customer service while creating new business models and sources of revenue.
Let’s look at some of the use cases for AI in insurance.
Underwriting, or new application processing – evaluating applications for new insurance policies – is the first pillar of any type of insurance. The process can be complicated depending on the type, size, prior history, and other components of the application needed to evaluate the risk and enroll the client. It involves communication among multiple parties: the client, the agent, and the underwriter. It is traditionally a manual process, as it requires reviewing many different types of documents from diverse carriers with no standardization that would allow easy automation. Further, many carriers still receive paper documents that are faxed or scanned (or worse, sent via snail mail!)
AI-powered systems can help this step in multiple ways:
- Natural Language Processing (NLP) systems and chatbots can streamline communication between the parties
- AI-driven document extraction systems (Docu-AI) can automate the processing of the various documents using AI and Big Data
- Data from documents can then be analyzed by AI-powered analytics to help the underwriter assess risk
Claims processing forms the core of the business for insurance carriers. When a claim is processed in a timely manner, it improves customer satisfaction and retention. Simultaneously, the processing has to minimize financial loss due to fraud or other factors to maximize profitability. Most companies have focused their energies on improving the claims process using technology.
Many software applications already automate workflows ensuring timely processing and smooth communication with all parties involved. Mobile apps allow users to easily submit claims along with documentation such as photos of the incident, claim form, invoices, etc.
Yet, main parts of the process are heavily manual. Claims adjusters have to frequently go out in the field to make assessments. Even in the case of smaller claims, the adjuster may manually review documents and photos.
How can AI-powered systems help claims processing?
- Image recognition algorithms can help identify and automatically categorize various pieces of information in claim evidence photos such as license plates of vehicles, insurance cards, various types of damages, etc.
- AI-driven document extraction systems (DocuAI) can automate analysis and categorization of line items in medical records, body shop estimates, contractor reports, etc. Using NLP and Deep learning techniques allows these systems to recognize a wide variety of content.
- Robotic Process Automation (RPA) can automate many parts of the processing workflow, working alongside the image recognition and document extraction systems above
Fraud detection is usually part of claims processing, ensuring that no opportunistic fraud has taken place. Fraud is among the biggest sources of loss for insurance companies. Many larger carriers already use predictive analytics to help identify potential fraud in claims. These Machine Learning models use not just a carrier's own data but also databases shared across companies to flag potential fraud.
AI-powered systems can take this a step further. They can use the vast amounts of accumulated data and images to detect more subtle instances of fraud as well as previously intractable ones. With the cost of running these models dropping dramatically, even small claims can be analyzed to detect patterns.
Improving customer service is the goal of every organization. With Big Data and AI, it is possible to automate the analysis of customer service calls and emails allowing customer service agents to proactively address complaints and issues.
AI-driven chatbots are now pervasive on websites and web portals. They provide an easy way of answering customers’ questions while reserving human interaction to handle more complex issues. Mobile apps with the ability to answer spoken natural language queries are now possible using technologies like Siri, Alexa and the same knowledge base used by chatbots and customer service agents.
New Business Models
With IoT enabling the gathering of fine-grained data (how many miles do I drive every day, what is my average trip, how many hours is the property unoccupied), insurance companies are seizing the opportunity to come up with new ways of underwriting policies. AI-powered systems can provide better risk analysis for determining premiums, resulting in new personalized products. These new products can be offered at attractive premiums, driving new business.
I will be giving a talk titled “Anomaly Detection for Predictive Maintenance” at the Global Artificial Intelligence Conference in Seattle on April 27th 2018. If you are going to the conference, please do reach out.
Detecting anomalies in sensor events is a requirement for a wide variety of use cases in the industrial IoT. Examples range from predicting failures of HVAC systems and elevators in property management to identifying early signals of malfunction in aircraft engines in order to schedule preventive maintenance. When sensors number in the tens of thousands or more, as is common in large IoT installations, a scalable model for preventive maintenance is needed.
Unlike prediction models for customer churn, inventory forecasts, etc. that rely on multiple sources of data and a wide range of domain-specific parameters, it is possible to detect anomalies for many types of time-series data using statistical techniques alone.
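As a taste of what "statistical techniques alone" can mean, here is a minimal sketch that flags points deviating sharply from a rolling window of recent readings. The sensor values are invented, and the window and threshold are illustrative tuning knobs.

```python
import statistics

def rolling_anomalies(series, window=5, threshold=3.0):
    """Flag indices where a point deviates from the mean of the preceding
    `window` points by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mean = statistics.mean(recent)
        stdev = statistics.stdev(recent)
        if stdev > 0 and abs(series[i] - mean) / stdev > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical sensor readings with a spike injected at index 8.
readings = [20.0, 20.5, 19.8, 20.2, 20.1, 19.9, 20.3, 20.0, 45.0, 20.1]
print(rolling_anomalies(readings))
```

The same pattern scales to large installations by running the window per sensor in a streaming job; no domain-specific features are required, only the series itself.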
In this session, we will discuss a step by step process for anomaly detection with examples that aid in quick insights for building models for preventive maintenance.
I will be giving a talk titled “Time-series analysis in minutes” at the Global Data Science Conference in Santa Clara on April 2nd at 3:30 PM.
The focus of the talk will be on understanding why and how to analyze time-series data quickly and efficiently. You can read the full abstract here.
An interview given as part of this conference is also available at the conference website.
If you are going to the conference and would like to connect, I would be happy to meet with you.