Old data in new models
One of the challenges we faced in designing our HR data mart was how to map the data from the old HR and Payroll system into the new data model used by the People & Money system.
As an example of the problem, consider the data about which posts are funded from which sources – such as core funding, research councils, or charities. As with all financial information, this uses a core data structure called the Chart of Accounts. In the old system, each entry in the chart of accounts had three fields, whereas in the new system this has increased to nine fields. Also, the account codes and cost centre codes used in these fields have been changed and reorganised. Mapping an entry from the old chart of accounts to the new can be quite complicated.
The way that salaries are represented has also been changed. In the old system, all salaries were mapped to a central “spine” salary scale; in the new model, that central spine no longer exists. As another example, the new model uses a completely different three-level hierarchy for classifying jobs into categories such as “academic”, “clinical”, “IT”, etc.
As I mentioned in my previous post, the data warehouse has no problem in importing the data from both systems. It stores all the relationships between the data entities, with dates to indicate when these relationships held true. The challenge arises when we want to provide a consistent view of this data over time, so that our BI analysts can produce trend reports covering the whole period.
To help create this consistent view, we store extra information in the EDW, based on mappings created by the People & Money project. Those mappings show, for each value in the old data, the corresponding value in the new model. Our database queries check the source of each record and, if the record came from the old system, store the relevant mapped values in the new model.
Some of the transformations are more complicated than a simple one-to-one mapping. An example is with the change to job categories: at the top level, the mapping is straightforward, but at the more detailed levels, the values in the two models don’t match, and we have had to interpolate values that have the best fit.
Now that we have these transformations in place, the warehouse has data in both the old and new models, and analysts can report on either or both, as the need arises.