Location metadata storage
Ttenbergen (talk | contribs) |
Revision as of 16:14, 2022 February 9
We track locations in many fields; they are drawn either from s_tmp Boarding Loc entries or from the s_dispo table. We are interested in some information about some of these locations, such as their Level of care hierarchy or how many beds they have.
Flexibility for values changing over time
The actual values for some of this information change over time, e.g. a ward may cease to be a HOBS, or an ICU may increase its number of beds. If we simply included a column for the information, we could store only one value per location and would lose the history. So we need to store this in a linked table instead, where there can be multiple lines with different dates per location. We could include both start and end dates, but start dates alone would be enough to encode this fully, since the start date of the next later line for the same location is automatically the end date of the previous one. Lines with both start and end dates would be easier to use (no linking between lines required), but they carry a risk of overlapping time periods.
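As a minimal sketch of the "start dates only" option, the following Python shows how end dates could be derived from the next later line for the same location. The function name, the tuple layout and the sample locations and values are assumptions made for illustration; they are not taken from the actual database.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter


def derive_end_dates(rows):
    """Derive end dates from start dates.

    rows: iterable of (location, start_dttm, value) tuples (hypothetical layout).
    Returns a list of (location, start_dttm, end_dttm, value) tuples, where
    end_dttm is the start of the next later line for the same location,
    or None for the line that is still current.
    """
    result = []
    # Sort by location, then by start date, so lines for one location are adjacent.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for _location, group in groupby(ordered, key=itemgetter(0)):
        lines = list(group)
        for i, (loc, start, value) in enumerate(lines):
            end = lines[i + 1][1] if i + 1 < len(lines) else None
            result.append((loc, start, end, value))
    return result


# Hypothetical sample data: an ICU that increased its number of beds.
sample = [
    ("ICU_A", datetime(2021, 6, 1), 12),
    ("ICU_A", datetime(2020, 1, 1), 10),
    ("WARD_B", datetime(2019, 3, 15), 24),
]
derived = derive_end_dates(sample)
```

Because the end dates are computed rather than stored, the periods for a location cannot overlap; the cost is that every consumer of the data has to do this linking step.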
Flexibility for what needs to be collected
There also might be information we have not even considered yet that we might need to track. So, we would want a flexible infrastructure. One way to do this would be a table similar to the Entity–attribute–value model we use for the s_tmp table, with the following fields:
- location name (a physical location, so should not change) (the entity)
- Attribute (the place where we would name the thing we want to describe, e.g. "bed number")
- Value (the value for the named attribute, e.g. 10 in the case of beds)
- Start_DtTm (when did this entity-value combination become true)
- End_DtTm (when did the entity-value combination stop being true)
  - End_DtTm is not needed for complete information, and could in fact become inconsistent with the next Start_DtTm, but it will make the data easier for the Statistician to use. Since updates will be rarer than use, the extra difficulty when updating is worth it.
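The field list above could be sketched in Python as follows, together with a lookup of the value that applied at a given time and a check for the overlapping periods that stored end dates make possible. This is only an illustration under stated assumptions: the class and function names, the sample rows, and the convention that a period includes its start but excludes its end are all hypothetical, not the actual table design.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class LocationFact(NamedTuple):
    location: str                  # the entity
    attribute: str                 # e.g. "bed number"
    value: str
    start_dttm: datetime
    end_dttm: Optional[datetime]   # None means "still true" (assumed convention)


def value_at(facts, location, attribute, when):
    """Return the value in effect at `when`, or None if no line covers it.

    Assumes a period includes its start and excludes its end.
    """
    for f in facts:
        if (f.location == location and f.attribute == attribute
                and f.start_dttm <= when
                and (f.end_dttm is None or when < f.end_dttm)):
            return f.value
    return None


def find_overlaps(facts):
    """Return adjacent pairs of lines whose periods overlap for one location/attribute."""
    by_key = {}
    for f in facts:
        by_key.setdefault((f.location, f.attribute), []).append(f)
    overlaps = []
    for group in by_key.values():
        group.sort(key=lambda f: f.start_dttm)
        for earlier, later in zip(group, group[1:]):
            if earlier.end_dttm is None or earlier.end_dttm > later.start_dttm:
                overlaps.append((earlier, later))
    return overlaps


# Hypothetical sample rows.
facts = [
    LocationFact("ICU_A", "bed number", "10", datetime(2020, 1, 1), datetime(2021, 6, 1)),
    LocationFact("ICU_A", "bed number", "12", datetime(2021, 6, 1), None),
]
```

A check like `find_overlaps` could be run whenever the table is updated, so that the convenience of stored end dates does not come at the price of silently inconsistent periods.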
How and where to keep the master data
This data helps external users of our data make sense of it.
We would become aware of changes to this data either when a collector stumbles across the change during their collection activities, or when management informs us, usually as part of requesting data.
We already collect data with a similar requirement: the ICD10 Categories. We could store this data the same way and make use of similar mechanisms to integrate it into our analyses. That is, the data is updated on the wiki by whoever finds out that something has changed, and Tina then runs a process to extract an updated data table from the wiki to include in CCMDB.accdb.
Types of data to consider
- Bed size and possibly other things: Collection location documentation / Site and Location table
- HOBS info - Change_of_remaining_location_names_from_"our"_names_to_EPR/Cognos_names#How_will_we_code_the_HOBS_status_of_a_unit
- Bed numbers as in STB-L2HA
- Level of care hierarchy