Blog

Why Data Repository is Different

What makes MEDITECH's database uniquely challenging

Read
Blog

Topic:

Blog

Industry trends, regulatory updates, and expert perspective on the MEDITECH ecosystem.

TL/DR

MEDITECH's Data Repository looks like a standard relational database on the surface, but several design choices make it uniquely challenging for report writers.

  • Case-sensitive collation means every query must match exact letter casing
  • Over 16,000 tables across two databases with no foreign keys and no database diagram
  • Schema changes arrive with each ring release or PP update, potentially breaking existing reports

Data Repository differs from typical SQL databases in collation, normalization, schema stability, and sheer complexity of healthcare data.

Article content

Technically speaking, MEDITECH's Data Repository is a typical relational database, in that it stores data in tables, uses primary key columns for referential integrity, and is interacted with using off-the-shelf tools like Microsoft SQL Server Management Studio. There are several aspects that make it unique, and challenging to work with, all of which are important to understand if you're a data analyst or report writer at a MEDITECH hospital.

Why Data Repository is different

Here are some features of DR that make it an atypical database.

Healthcare Content

The sheer complexity of healthcare data can never be underestimated. Every department in a hospital functions as its own business unit with specific data requirements and workflows. Yet everything flows together into the central warehouse that is Data Repository. When you think about it, most databases specialize in a specific software purpose or business discipline. Yet DR must contain data for dozens of applications that must seamlessly work together. DR includes core data from divergent sources that comprise every area within healthcare: pharmacy, diagnostic imaging, billing and accounting, patient registration, the Lab, MIS dictionaries, surgery, HR and payroll, nursing documentation, order management, the Emergency Room, and more. Each category of data has its own unique properties and storage needs.

Collation Settings

In simple terms, collation controls the way strings are stored and compared from within the database. When the Data Repository was originally designed, MEDITECH chose to use the collation setting of Latin General Case Sensitive. Being such a large warehouse with constant write activity, this made sense because the server will have slightly better performance since it doesn't have to compare upper/lower case when reading and writing transactions. SQL databases by default are case insensitive so most T-SQL writers don't need to worry about this. DR developers must learn to write case sensitive code for their statements: an "A" is not the same as an "a." For fast keyboarders, spelling everything correctly is hard enough, adding the upper and lower case gives your backspace key a workout.

Size and Complexity

Data Repository complexity

In the 6.1x and Expanse environments, there are two databases: the M-AT side clocks in with 11,458 tables, while the NPR side has 5,377 tables. They have a combined total of over 145,000 fields! That's quite a bit of data to sort through when trying to build a report. Also, there isn't much help available when trying to figure out which tables to join and how. There's no database diagram. Some of the column names can be confusing and it requires a fair amount of experience to know what to look for. When looking for specific data, sometimes the best analogy I can think of is looking for a needle not in a haystack, but in a stack of needles.

Not Quite Normalized

Since the advent of M-AT (based on the Focus programming language, that premiered with MEDITECH version 6.x) it's getting better, DR is not a truly normalized database. In many cases, the same data can be found in multiple tables depending on which MEDITECH application it lives in, and the native storage structures and hierarchy in NPR and Focus (DPMs and segments, or objects and fields). The real kicker: there are NO foreign key relationships defined in the database. None. As report writers, we infer them and write them in our JOINs using primary key relationships, a lot of experience and a dash of luck.

The Schema Changes Regularly

I've found this point to be one of the more challenging dynamics when working with DR. Every time you take a ring release or a PP update, the database structure may physically change: new tables and columns, renamed objects, deleted objects, changing datatypes, etc. New requirements or enhancements on the front-end cause changes all the way down the chain and these are reflected in DR. These updates can wreak havoc on any of your reports, processes, or data exports. This also makes it difficult for DR developers to share SQL code and queries, since each version of DR can be slightly different.

Check mark

There are other differences between DR and more typical relational databases, but these are the biggies. If you've come to healthcare from another industry with relational database experience, the DR schema will be challenging at first. Once you understand why and how it's different, you can quickly adapt and become a productive report writer.