Getting data to the analysts who need it, with Power BI Dataflows
With increasing numbers of data analysts across the University using Power BI, the HR Datamart project decided to make strategic HR data available in Power BI workspaces. We have achieved this, but it turned out to be significantly more work than we expected.
Power BI is a data visualisation tool that lets analysts present information using a range of interactive charts and graphics. It works very well when an analyst has all the data available that they need to produce their dashboards and reports. Several areas of the University are using Power BI to better present the information they have available.
The challenge facing us was to make HR strategic data available to analysts in a Power BI workspace, so that they have reliable data that is automatically kept up to date, and to implement access controls so that analysts get the data they need and don’t get access to data they shouldn’t see.
Our first thought was to let Power BI users connect our data warehouse directly, but this would have created a great deal of overhead for users as they would have had to install an Oracle Client on their PCs, and even then we encountered issues in how the client worked with updated versions of Power BI. It was also difficult to configure consistent access control rules using this approach.
Instead, we use a tool called Power BI Dataflows, in conjunction with a Power BI Gateway server. ISG developers configured data flows to extract data from the warehouse, apply access control rules to that data, and make it available to analysts via the server.
This required careful design. We had to analyse the requirements to understand which data was needed by each group of users. For example, core HR staff need to be able to access all personal information, while reporting staff should only see a subset of it. We also had to ensure that data was removed from these feeds when it was no longer needed, for example to meet privacy regulations, and these retention schedules vary depending on the type of data and what it is being used for.
One challenge we have encountered is that Power BI Dataflows are not very good at controlling who can see which entries in a table. To work around this, our dataflows flatten all the data from a database view into what is effectively a single file, so that we can apply the access control rules, and then rebuild the original view again. This was time-consuming to design and implement, and perhaps reflects the fact that Power BI Dataflows is still a new and evolving technology.
I hope I’ve given some idea of the complex design and implementation that was required. The result is that our data analysts are getting up to date, reliable data in the BI tool they chose. Initial feedback seems positive, so it looks like the team’s hard work has paid off. This investment will put us in good stead as we add more data to the warehouse.
You can find more information about Power BI Dataflows here: https://www.sqlshack.com/an-introduction-to-power-bi-dataflows/