Working with data these days seems to be all about metrics, measurements, and numbers. Healthcare data of course is no exception, just think of the many different contexts of numeric values possible in the MEDITECH EHR: lab test results, patient counts, drug dosages, account balances and budgets – the list is long. Fortunately, SQL Server gives us some tools to work with and manipulate numeric values to the sometimes very precise formats we need in a clinical context.
Thanks to Acmeware consultants Geoff Grouten and Joe Adams for this blog content!
Using the ROUND function
We often need to round whole or decimal numbers to a certain length. And while we think in terms of rounding up or down, when using the T-SQL ROUND system function, we have to think a little more specifically: to what length (or precision) do we want to round (the number of decimal places), and do we want to round the number to the specified length, or merely truncate any numbers beyond it? I'll provide examples in a moment, but here's the syntax of the ROUND function, directly from Microsoft:
ROUND ( numeric_expression , length [ ,function ] )
You can see that the ROUND function takes two or three arguments: the numeric_expression is the value you're working with (typically, a field name or calculated value from your dataset) and is required; length specifies the precision to which you wish to round the numeric_expression relative to a decimal point and can be specified as a positive or negative number (depending on which side of the decimal point you're working). Finally, the optional function argument is specified as 0 or 1 (not including this third argument is the same as specifying 0). This is the part that can catch you out; when don't include it or use 0, the numeric_expression is rounded; when you use 1, it is truncated at the specified length. Let's see how this works.
Look at how our numeric_expression of 1.526 is represented based on the length and function - you can see the difference between rounding and truncating quite clearly. An example I can provide from a client of ours is from pharmacy medication dosages and dispensing forms: a patient is prescribed a total of 4.5mg of melatonin. The melatonin tablets are 3mg each, so how many should be given? If we're not careful with the ROUND function and don't specify the correct length, we can wind up rounding to 2, when in fact we should give 1.5 tablets for the correct total dosage.
The float and decimal data types
Speaking of decimals, this reminds us that when working with numbers in SQL Server we should always keep in mind the underlying data types for table columns, or those that we define when using system or user-defined functions. Two numeric data types that we see often in use by Data Repository tables are float and decimal. These two data types are similar in that we can use them when we need to calculate or otherwise interact with non-whole numbers (that is, numbers with decimals). However they behave very differently by design in terms of precision: float values are approximate, whereas the decimal data type (which is synonymous with the numeric data type) is exact, in that you can specify the precision and scale. (For more on these and other data types in SQL Server, see Microsoft's Learn pages.)
Let's look at how these data types behave. In the sample below, we declare four variables: the numbers 0.1 and 0.2, but doing so as both float (@f1 and @f2) and decimal (@d1 and @d2) data types:
Now let's compare some results after seeing how SQL Server handles adding the variables together and comparing them. In lines 10-11, you can see we're adding 0.1 and 0.2 together, as both float and decimal datatypes; the first 2 columns in the results show they appear to be the same value (.03 and 0.300). However in lines 12-13, we use a CASE statement to determine if each set of variables added together returns a value of 0.3 - and in the case of float, it does not. To our eyes it does, but behind the scenes in SQL it does not, because the underlying data type is approximate, and handled differently for this operation.
So why use the float data type at all? In theory, what it lacks in precision, it in fact can accommodate a greater range of possible values than decimal. In practice however, we'll never encounter the extreme ranges of those values in a MEDITECH environment, so it's safe to avoid the float data type. (For an excellent and more technical discussion on this topic, see the Learn SQL blog.)