Case study: Transforming Healthcare with Near Real-Time Analytics
The goal of Modernizing Medicine® is to increase efficiency and improve outcomes by transforming how healthcare
information is created, consumed and utilized. Its data-driven electronic health records (EHR) and practice management (PM) systems transform the clinical, financial and operational aspects of medical practices.
Modernizing Medicine turned to Cloudwick to support this goal. Cloudwick provided the company with its AWS Modern Analytics QuickStart services, aiding with data ingestion from MySQL to Cloudera CDH, integrating the company’s applications for predictive analytics, and providing professional services to ensure Health Insurance Portability and Accountability Act (HIPAA) and Protected Health Information (PHI) compliance.
- Significant improvements in ingest performance
- Organization enjoys near real-time analytics, making it easier to diagnose patients faster and more accurately.
- The bottom line is improved healthcare.
Cloudwick Delivers AWS Solution for Improved Healthcare
Modernizing Medicine has developed a specialty-specific electronic medical records (EMR) platform. The solution incorporates features such as digital record tracking and revenue cycle management, along with advanced analytics that enable managers to make informed decisions about patients and their practices. The EMR application uses MySQL databases running on Amazon Web Services, while analytics are enabled using a mix of Hadoop, Microsoft SQL and MySQL data marts and various visualization tools.
To support visualization and reporting, Modernizing Medicine developed a Sqoop-based data ingestion process that moved data from the sharded MySQL environment into a Cloudera-based Hadoop cluster running on AWS EC2 instances. About 800 tables, totaling roughly 4.7 TB, had to be imported daily. Many of these tables were small, and the Cloudera CDH cluster ran with a fixed number of nodes regardless of the ingestion workload.
These factors resulted in a load time exceeding 48 hours, which limited the company's ability to use this data to manage practices. To eliminate the delay and introduce near real-time predictive analytics, Modernizing Medicine decided to move data ingestion to transient AWS EMR (Elastic MapReduce) Spark clusters, gaining flexibility, performance, cost savings and reliability. The new data ingestion process runs within 3 hours, enabling Modernizing Medicine to make crucial healthcare and business decisions with near real-time data.
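To put the two load times in perspective, the effective ingest throughput can be estimated from the figures quoted above. This is a rough sketch only: it assumes decimal terabytes (1 TB = 10^12 bytes), treats "exceeding 48 hours" as exactly 48 hours, and the function name is illustrative.

```python
# Rough throughput estimate from the figures in the case study.
# Assumption: 1 TB = 10**12 bytes; the old load time is taken as exactly 48 hours.

TB = 10**12


def throughput_mb_per_s(total_bytes, hours):
    """Average throughput in MB/s for moving total_bytes in the given hours."""
    return total_bytes / (hours * 3600) / 10**6


daily_bytes = 4.7 * TB
before = throughput_mb_per_s(daily_bytes, 48)  # Sqoop-based process
after = throughput_mb_per_s(daily_bytes, 3)    # transient EMR Spark process
speedup = after / before

print(f"before: {before:.1f} MB/s, after: {after:.1f} MB/s, speedup: {speedup:.0f}x")
```

Under these assumptions the old process averaged about 27 MB/s and the new one about 435 MB/s, a roughly 16-fold improvement.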
The MySQL environment for Modernizing Medicine consisted of 12 different shards. The redesigned ingest process uses a transient, HIPAA-compliant EMR Spark cluster whose executors connect to the MySQL pods in parallel over Spark JDBC. The loaded Spark DataFrames are saved to disk and then copied to S3 using S3DistCp. In a later step, the Cloudera cluster ingests the data from S3 using DistCp. The entire process takes around 3 hours, compared with more than 48 hours for the earlier Sqoop-based ingestion.
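The flow described above might look roughly like the following PySpark sketch. It is not Modernizing Medicine's actual code. The host naming pattern, database name, credentials handling, Parquet staging format and paths are all assumptions made for illustration; only the shard count, the use of Spark JDBC and the S3DistCp step come from the case study. The `spark` argument is an existing SparkSession, so the helper functions can be inspected without a cluster.

```python
from functools import reduce

NUM_SHARDS = 12  # from the case study; every other value here is hypothetical


def shard_jdbc_options(table, user, password, num_shards=NUM_SHARDS,
                       host_template="shard-{:02d}.mysql.example.internal",
                       database="emr_app"):
    """Build one set of Spark JDBC reader options per MySQL shard."""
    return [
        {
            # sslMode=REQUIRED asks MySQL Connector/J to encrypt the connection.
            "url": f"jdbc:mysql://{host_template.format(i)}:3306/{database}"
                   "?sslMode=REQUIRED",
            "dbtable": table,
            "user": user,
            "password": password,
            "driver": "com.mysql.cj.jdbc.Driver",
            "fetchsize": "10000",
        }
        for i in range(num_shards)
    ]


def stage_table(spark, table, user, password, staging_root):
    """Read one table from every shard and stage it on cluster storage.

    Each per-shard JDBC read yields a single partition, so the union has
    one partition per shard and executors can pull the shards in parallel.
    """
    frames = [
        spark.read.format("jdbc").options(**opts).load()
        for opts in shard_jdbc_options(table, user, password)
    ]
    combined = reduce(lambda left, right: left.unionByName(right), frames)
    target = f"{staging_root}/{table}"
    # Parquet is an assumed staging format; the source only says "disk".
    combined.write.mode("overwrite").parquet(target)
    return target


def s3distcp_command(src, dest):
    """Argument list for an S3DistCp copy from staging storage to S3."""
    return ["s3-dist-cp", "--src", src, "--dest", dest]
```

On an EMR cluster, `stage_table(spark, "patients", user, password, "hdfs:///staging")` would stage one table, and the list returned by `s3distcp_command` could be submitted as a cluster step. The table name and paths here are placeholders.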
Cloudwick chose Amazon Simple Storage Service (Amazon S3) for storing PHI and AWS Elastic MapReduce (EMR) for resizable compute capacity. To achieve HIPAA compliance, Cloudwick set up encryption at rest and encryption in transit for both Cloudera Data Hub and EMR. The transient EMR cluster and the Cloudera Data Hub are secured using AWS VPC, subnet ACLs, route tables, Sentry, IAM roles, LUKS encryption, HDFS transparent encryption, Kerberos authentication and S3 bucket policies.
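As an illustration of how S3 bucket policies can support encryption in transit and at rest, the sketch below builds a policy that denies non-TLS requests and denies uploads that do not request SSE-KMS server-side encryption. This is a common pattern, not the policy Cloudwick deployed. The bucket name and the choice of SSE-KMS are assumptions.

```python
import json


def phi_bucket_policy(bucket):
    """Return a bucket policy dict enforcing TLS and SSE-KMS on uploads."""
    arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Encryption in transit: reject any request not made over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [arn, f"{arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Encryption at rest: reject uploads that do not ask for SSE-KMS.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
        ],
    }


# "example-phi-bucket" is a hypothetical bucket name.
policy_json = json.dumps(phi_bucket_policy("example-phi-bucket"), indent=2)
```

The resulting JSON could be attached to the bucket through the console or an infrastructure-as-code tool. Note that the second statement requires every writer to set the encryption header explicitly, which is stricter than relying on default bucket encryption.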
Modernizing Medicine now enjoys significant improvements in ingest performance, enabling analytics with near real-time data, a crucial capability in medicine. This makes it easier to diagnose patients more accurately and enables the company to make crucial healthcare and business decisions on a day-to-day basis. With the new solution, the company has a modernized, cloud-based business intelligence and analytics infrastructure, managed by Cloudwick. Near real-time predictive analytics give doctors the information they need to treat patients and improve healthcare.