PostgreSQL Roles are Cluster-Wide

A role in PostgreSQL is common to all databases in the cluster. This seems to be the result of a design decision made when the former user and group handling was unified under role. Follow these links for reference:

Roles, or rather those roles that do not just represent a specific user, ought instead to be an intrinsic part of the database model. Roles are defined by the kind of access they provide (read, write, etc) and by the relations (table, view, function, etc) they provide that access to. Access control is ideally managed within a database using roles rather than separately within each individual application that uses that database. So it makes sense for the access control rules (the roles and their associated permissions) to be defined alongside the definitions of the relations whose access they control; any changes are then self-contained. The access control model should be part and parcel of the rest of the database model. Individual users (which are also represented as roles in PostgreSQL) are assigned one or more of the roles defined within each particular database model (based on the local enterprise definition of what they need from any particular database).

It makes little sense to represent this kind of role at the cluster level, as the definition of the role is associated specifically with the database where it actually controls access. In PostgreSQL, encapsulating the full functionality of a database requires using not only the system catalog tables specific to that database but also the roles relevant to that database, which are held in the cluster-wide system catalog tables. With the exception of some special cases, like roles for cluster-wide database management, this is an artificial split. Roles that are managed outside of the database model, across all databases in the cluster, make some sense either when there is only one database in the cluster, or when the databases in the cluster are not independent and act together as part of one managed system. Otherwise, roles (again, those that are not just being used to represent a specific user) should be defined and managed at the level of the individual database, or arguably even at the level of each schema within a database.

Below are some ideas for enhancing PostgreSQL to support database specific roles. These ideas build up from limited and simplistic solutions to more flexible and complex ones.

Approach One

This is what we currently do. We prefix every role name with the database name, so roles called customer and salesperson used in a database called orders would actually be called orders_customer and orders_salesperson. This approach relies on all owners of databases in the cluster playing ball, and it needs some custom handling around role DDL statements when changes are made to the database model.
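
As a minimal sketch of the convention (the role, table and user names here are purely illustrative), the database-prefixed roles are just ordinary cluster-wide roles created and granted like any other:

-- Database-specific roles for the "orders" database, following the
-- database-name prefix convention (names are illustrative).
CREATE ROLE orders_customer NOLOGIN;
CREATE ROLE orders_salesperson NOLOGIN;

GRANT SELECT ON ALL TABLES IN SCHEMA public TO orders_customer;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO orders_salesperson;

-- An individual (login) user is then granted whichever roles they need.
GRANT orders_salesperson TO alice;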

Approach Two

This just puts some syntactic sugar around the first approach.

The proposed syntax and behaviour are as follows.

CREATE [DBONLY] ROLE abc
Creates the role abc after first adding a prefix which is the name of the current database, returning an error if the name already exists. Without the DBONLY option it behaves as now. Most other options would be incompatible with the DBONLY option, for example LOGIN, CREATEDB, etc.
DROP [DBONLY] ROLE abc
Drops the role abc with a prefix matching the name of the current database, returning an error if the name is not found (except when IF EXISTS is used). Without the DBONLY option it behaves as now.
GRANT … ON … TO [DBONLY] abc
Grants the privileges to the role abc with a prefix matching the name of the current database, returning an error if the name is not found. Without the DBONLY option it behaves as now.
REVOKE … ON … FROM [DBONLY] abc
Removes the privileges from the role abc with a prefix matching the name of the current database, returning an error if the name is not found. Without the DBONLY option it behaves as now.
DROPDB
Drops a database and all the roles in the cluster that have a prefix matching the given database name.

The cluster wide system catalog table pg_roles is still used. A role created manually with a database prefix (without using the DBONLY option) would be treated the same as if the role had been created database-specific (using the DBONLY option). Role names still have to be unique across the cluster.

With this approach no meta data is retained that identifies a role as cluster-wide or database-specific. A pg_dump will simply dump the role name as now (including the prefix) and a restore would re-create it in the same way as now. The dropdb command would drop all roles that have a prefix matching the database name irrespective of how they were actually created.

The advantage of this approach is that no change is required to the mechanism that, when relations are actually used, looks up roles and checks that the level of access is appropriate for the user holding those roles.

However we can definitely do better while staying simple.

Approach Three

This works much as above except that, instead of adding a prefix to the role name, we add a column to the cluster-wide pg_roles table, catalog_name, holding the name of the database associated with the role when the DBONLY option is used (left null otherwise). The advantage is that the meta data recording how the role was originally defined – database-specific or cluster-wide – is now preserved.

This fixes various concerns with the second approach. DROPDB can be constrained to search for roles matching the database name in the catalog_name column, so cluster roles that just happen to have a matching database name prefix will be ignored. The pg_dump and pg_restore commands can use the DBONLY option in the output as necessary by checking whether the catalog_name column has a value or not. The role name lookups for GRANT and REVOKE remain specifically controlled using the DBONLY option (so as to avoid handling any complex role hierarchy semantics). Still no change is required to the mechanism that looks up roles and checks that the level of access is appropriate to the user when relations are actually used (on the assumption that when GRANT is used the corresponding pg_roles row OID is assigned; the actual name is irrelevant). However, two roles with the same name could now exist, one with catalog_name empty and one (or more) with a database name in catalog_name. This has consequences for displaying the content of system catalog tables in psql (using \du or \dp for example), so the system catalog views would need to be amended in some way.

A downside is that all the roles are still held in one system catalog table for the cluster, but in practice this might not actually matter. The semantics for pg_dump need to be thought about: when dumping a specific database we would want an option to also dump its own specific roles; when dumping all databases, all roles should be dumped – database-specific and cluster-wide.

This approach needs few code changes and is backwards compatible.

Approach Four

This would add a per-database pg_roles system catalog table. This holds the roles specific to that database, and the role names can be the same as role names in other databases or cluster-wide. We would probably need rules to handle prioritisation when names clash with roles held in the cluster-wide pg_roles table. For example, CREATE ROLE would need to check both the database-specific pg_roles and the cluster-wide pg_roles. Similar complexity would need to be handled in GRANT and REVOKE. Access control checks would now need to check both the database-specific pg_roles and the cluster-wide pg_roles, and there may be conflicts to resolve depending on how a role hierarchy is implemented. While having a database-specific pg_roles table makes semantic sense, is necessary for managing who can change which roles, and a thorough implementation of role priority/conflict handling might result in useful behaviour, there is a lot of additional complexity in the implementation.

Schema Handling

A further enhancement should allow roles to be defined separately per schema within a database. Support could be added by allowing schema qualification of the role name in CREATE ROLE, defaulting to public (or to whatever the schema search path determines). In the third approach above this would be implemented by adding a schema_name column to pg_roles.

Personalised Information Portal – Top Level Design

Introduction

Top level design for the Personalised Information Portal, Computing Project #470.

The Personalised Information Portal (PIP) hosts web pages with an individualised cut of data appropriate to the teaching and administrative duties currently held by the particular member of staff viewing those pages. For example: a member of staff who is a PGR supervisor will see a page containing a set of live data on the PGR students they are supervising; a member of staff who is a PGR Selector for an institute will see a page containing a set of live data on the current applications being processed for that institute; a member of staff who is an institute director will see a page containing management reports for things like student numbers and funding information, customised to their institute. Individualised web pages provide direct access to the data we hold in local systems (and links to associated resources) needed by staff performing the associated duties. This is more efficient for staff than having to search for resources held in many different places, helps the School to meet various data protection obligations, and provides a useful resource for new staff (or existing staff taking on new roles) when they first start out. In addition, the content (or a particular view of it) can subsequently be used within a PDR.

Design Principles

The content on the PIP will be predominantly (and at the outset exclusively) provided from Theon (or a Theon Managed Micro Service). So the design must work simply and efficiently when using this source. The design should not however preclude the use of other sources in the future, even if these are not as directly integrated or as easy to add.

The PIP should be completely agnostic to the sources and content of data, so the design must not require any context-sensitive (source-dependent) structure or functionality. The decision on what content to provide on the PIP, and how that content is presented (while remaining consistent with the overall presentation), rests entirely with the originating service, since that is where the data is mastered and where decisions about what to present, and how, are most effectively controlled. This keeps the PIP itself simple to maintain and keeps any evolving logic specific to each originating service local to that service.

The design should be lightweight, robust and fast to implement by re-using or borrowing from as much existing technology and as many standards as possible. For example, a solution like uPortal (which MyEd is based on) is too heavyweight as well as being an outdated approach. The PIP only needs to host static content; it does not need to support embedded applications.

Adding a new external service onto the PIP should be cheap to do on top of the changes that have to be made to the external service anyway in order to provide the appropriate content in a suitable format. Assuming ideal conditions (which depend on the nature of the service), a service owner who has the content they want to put on the PIP ready, and who is familiar with the procedure, should need no more than half a working day to include that content on the PIP.

Terminology

PIP
Personalised Information Portal. The service acting as a proxy to serve up data on an individualised basis from any number of channels supplied by any number of external services. The PIP is essentially a service that collates the content from other services and builds this content into individualised web pages for users (as well as providing the content via an API for re-use). The PIP is content agnostic – it does not need or use contextual information which is specific to each individual external service. In effect the PIP is just a relay. The PIP can provide two-way relaying – some services may want to expose controls for update within the PIP, when bouncing the user back to the full external service interface would be overkill. In this case the PIP pushes content back to the external service, changes can be made, and content is re-generated and relayed back to the user through the PIP. Note that we assume just one PIP exists here; in practice there might be more than one with more specialised roles, and the design does not preclude any number of specialised instances.
satellite
A service that has content that is made available through the PIP.
channel
A single data set presented on the PIP and supplied by a satellite. A satellite may supply more than one channel. A channel may contain embedded callbacks to allow a PIP to remotely initiate specific actions on the satellite. A channel dataset must be supplied raw (unformatted) but can also be supplied formatted (in compliance with the PIP channel specification). A channel will also provide associated meta data, such as generation time and a “backlink” (allowing the channel to provide a link to the satellite so that the user can view additional content or perform any operations not implemented via callback).
supplier
A supplier is a satellite that is providing one or more channels to a PIP in order for that PIP to present the content held in that external service relevant to the individual currently accessing the PIP. A supplier pushes content to a PIP.
receiver
A receiver is a satellite that can accept return data from one or more of its channels in a PIP in order for that PIP to remotely effect change within the corresponding data sets held in that external service. The PIP pushes content to a receiver.
tightly coupled
A satellite that is structurally integrated with the PIP in order to be a supplier and/or a receiver of content for the PIP. A tightly coupled satellite can operate with a closed loop (push and wait for response) or open loop (push and continue) as either supplier or receiver for the PIP.
loosely coupled
A satellite that uses the RESTful API provided by the PIP in order to be a supplier or a receiver of content for the PIP. A loosely coupled satellite can operate with a closed loop (push and wait for response) or open loop (push and continue) as supplier for the PIP, but it cannot operate as receiver for the PIP.
TMMS
Theon Managed Micro-Service. The School Database (Hypatia) is a TMMS, despite it being anything but “micro”. There are others within the School such as Referee, DiceDesk, InfROS. Over time Hypatia itself will be fractured into a separate TMMS to cover each distinct workflow. The PIP will itself be a TMMS. A TMMS provides out of the box: PostgreSQL database model management; a web based user interface (for administration); a RESTful API; internalised page generation for a web interface. Services built using a TMMS, such as the PIP, are fast to implement.
callback
A mechanism for a channel on a tightly coupled satellite to use a PIP to pass over user-affected content and initiate an action on that satellite. A callback can be implemented as a closed loop requiring the satellite to supply updated content for the channel reflecting the results of the action.
closed loop
A closed loop is synchronous. An atomic transaction initiated from the push of data (either from a satellite to the PIP as a supplier or from the PIP to a satellite as a receiver) that must fully complete before control is handed back. In the context of a supplier this ensures that channel content has been transferred and made available to the end user before the transaction is complete. In the context of a receiver this ensures that return content has been transferred and any action associated with it, including perhaps updating the channel content, has completed. A closed loop can easily result in performance problems, or even worse deadlock, in the PIP and a satellite, so care must be taken in the implementation.
open loop
An open loop is asynchronous. Any processing resulting from the push of data (either from a satellite to the PIP as a supplier or from the PIP to a satellite as a receiver) is deferred and control is handed back immediately. In the context of a supplier this means that once the channel content has been transferred the operation is complete – the necessary processing to make the content available to the end user will then happen at some point thereafter. In the context of a receiver this means that once the return content has been transferred the operation is complete – any action associated with it will be down to the satellite to schedule.
notification
A notification is a side band of a channel data set. These are temporary messages that are handled by the PIP differently (but consistently across all channels) from the main content. They are usually, but don’t have to be, generated from running callbacks.

Design

The PIP will have its own local PostgreSQL database. This will hold the content supplied by each satellite. The content in this local database will be used to create individualised web pages. A standard Cosign-authenticated Apache service fronts the content. All the web page HTML is created directly from the database on the fly by aggregating the relevant satellite channel content for the currently authenticated user. This is done by a Python server running under WSGI, which also handles the user's delegated credentials and database connection pooling. Identical functionality already exists to do this, and has been proven robust and performant over many years, in the server for TheonUI. We will simply re-use the core functionality of this to develop a custom server for hosting satellite channel content. As in TheonUI, all authentication and authorisation control will be offloaded to the database itself, as the connection is made using the current user's delegated credentials. The Python server will incorporate a content-agnostic version of the CGI form submission handler currently in use on TheonPortal in order to implement callbacks.

The most recent content supplied by a satellite is always used on the web page on reload. User initiated forced refresh of satellite content can be implemented, where necessary, by using a callback for the originating satellite. Any satellite content which is raw data will, when updated, automatically trigger internal page re-generation (full or partial) within the database – the most recent re-generated page is returned on web page reload. Such content can also be re-generated, where more appropriate, by scheduled run. In most cases though the logic to constrain, build and format channel content is done in each satellite, the PIP just replays and/or reformats relevant fragments of the content into a web page. There will be a CSS stack accessed through Apache used to enforce a consistent style on the PIP and satellite channel content.

The PIP will be implemented as a TMMS, meaning all functionality and authorisation control (apart from the actual web hosting and authenticated user mapping done by Apache) is implemented in the local back-end database. In this way the PIP can: use the built-in RESTful API for direct content access and callback by users and for loosely coupled satellite data upload; use TheonUI for administrative management (global settings).

Authentication of users on the local PostgreSQL database will be done through Prometheus roles and capabilities, using pgluser to manage change. Authorisation rules will be part of the TMMS model. The local PostgreSQL database will take a feed of data from the School Database containing active users' assigned posts, teaching and administrative duties/scope. These attributes are used by satellites to construct appropriately grouped channel content, correlated against users and ultimately constrained within the PIP based on the currently authenticated user.

Users with an administrative role that allows them to see other users' pages (or specific views of those pages) do so by virtue of having the necessary user channels embedded on their own pages.

Adding a satellite and channel(s) to the PIP can be done by anyone making the necessary satellite amendments (described below) and the PIP owner enabling access for that satellite.

Tightly Coupled Satellite

There are three kinds of satellite:

  1. Satellite is a TMMS
  2. Satellite uses PostgreSQL but is not a TMMS
  3. Satellite does not use PostgreSQL or even a database at all

All of these can be configured as a tightly coupled satellite, but in order of increasing complexity. Any of the above could instead be configured as a loosely coupled satellite.

A tightly coupled satellite is connected to the PIP by having shared exchange tables defined in a PostgreSQL database which are each a foreign data wrapper for a corresponding table in the PIP PostgreSQL database. These are used by the satellite to supply data to the PIP or receive data from the PIP in a standardised way.

The satellite PostgreSQL database must be amended to include these FDWs and supporting framework. The satellite must then be adapted to generate the channel content to put into the supply FDW and to handle the channel content taken out of the return FDW. These satellite adaptations to generate and handle channel content would be necessary whatever the format and communication protocol, so there is no overhead to doing this. The PIP provides support to make the addition of the FDW tables to the satellite low overhead (see below).
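
As a rough sketch of what a shared exchange table could look like on the satellite side (the server, table and column names below are illustrative assumptions, not the actual PIP schema), the standard postgres_fdw extension is all that is needed:

-- Declare the PIP database as a foreign server and map a local exchange
-- table onto it (all names below are illustrative).
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER pip_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'pip.example.org', dbname 'pip');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER pip_server
  OPTIONS (user 'satellite_orders', password 'secret');

-- Supply table: the satellite INSERTs channel content here and it lands
-- in the corresponding exchange table in the PIP database.
CREATE FOREIGN TABLE pip_channel_supply (
  channel    text,
  uun        text,
  content    jsonb,
  generated  timestamptz
)
  SERVER pip_server
  OPTIONS (schema_name 'exchange', table_name 'channel_supply');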

When a callback is made through the return FDW the entire transaction can be atomic, forming a closed loop: starting from the initiation within the PIP, through the return content handling and supply content re-generation within the satellite, and finally back within the PIP to re-generate any web pages, including full roll back in both the satellite and the PIP in case of error. This atomicity can also be two-way, so that content supply requires complete loop completion (re-generation of web pages) within the PIP. Using a closed loop can constrain performance as a result of delays in the PIP or satellite, and concurrency can suffer due to locking in the shared exchange tables. Either direction can optionally be configured as an open loop (neither end waits for completion of the other) to avoid problems of this nature where necessary.

Configurations for the three kinds of satellite are set out below.

  1. Satellite is a TMMS:
    Satellite directly includes the PIP factory generator for a “satellite connection” and uses a PIP provided template to auto build the necessary FDW schema. Can also directly include the PIP template headers for data formatting to comply with the standard channel content requirements.
  2. Satellite uses PostgreSQL but is not a TMMS:
    Satellite applies the necessary “satellite connection” DDL from the PIP TMMS manually in order to build the necessary FDW schema (or can use IMPORT FOREIGN SCHEMA, although this requires an authenticated connection first, or can just do “CREATE EXTENSION pip”). Can also optionally use the PIP template headers for data formatting, but could do its own thing entirely if necessary (although must still comply with the standard channel content requirements).
  3. Satellite does not use PostgreSQL or even a database at all:
    Satellite must add a standalone TMMS PIP “satellite connection” database which has the necessary FDW schema. The satellite must push data into this intermediate database and/or handle callbacks from it in whatever way is appropriate to that service. Can also optionally use the PIP template headers for data formatting, but could do its own thing entirely if necessary (although it must still comply with the standard channel content requirements). Depending on the implementation of the interface between the satellite and the intermediate connection database, a closed loop may not be achievable in one or both directions, in which case a loosely coupled satellite is likely to be a simpler approach.

Loosely Coupled Satellite

This kind of satellite will supply content into the PIP PostgreSQL database shared exchange tables indirectly by using the PIP API, or directly using an appropriately authenticated database connection, although that is not recommended. It is not possible to use callbacks in a loosely coupled satellite, so there will never be return content to handle. A loosely coupled satellite is most likely to be open loop but can be closed loop. A loosely coupled satellite can use the PIP template headers for data formatting, but is more likely to use a different implementation (although it must still comply with the standard channel content requirements).

Implementation Summary

Functionality is almost entirely within the back-end PostgreSQL database – the shared exchange tables and associated constrained views, triggers, processing and authorisation control. The TMMS model for this database adds the API and also provides the FDW extension/templates for satellites to use. The Python WSGI server under Apache/Cosign hosts the content. Standard related tools such as pgluser and Prometheus Roles and Capabilities, LCFG and spanning maps are used to configure and setup the rest of the service.

The only custom development involved in implementing this service consists of work on the Python WSGI server (by extension of the existing TheonUI server) and on the back-end database structure and function. All other effort is essentially configuration of existing services. The communication protocol between satellites and the PIP is implemented using the standard postgres_fdw extension.

Further design work will be needed on the actual structure and content of a satellite channel.

Staff Personalized Information Portal

These blog pages will track design and development of the Staff Personalized Information Portal, which is a local computing project.

Description

The Staff Personalized Information Portal contains a personalised cut of data that is appropriate to the teaching and administrative duties currently held by the member of staff. For example: a member of staff that is a PGR supervisor will have a page containing a set of live data on those PGR students that they are supervising; a member of staff that is a PGR Selector for an institute will have a page containing a set of live data on the current applications being processed for that institute; a member of staff that is an institute director will have a page containing management reports for things like student numbers and funding information customised to their institute. The intention is that it provides a one stop shop for access to the data (and links to associated resources) needed to perform those duties (well, that which can be satisfied through our local systems anyway). It is more efficient for staff but also helps us meet various data protection obligations. In addition the content (or a particular view of it) can be subsequently used within a PDR. We also think it would be a useful resource for new staff (or existing staff taking on new roles) when they start.

We produced a rough prototype of what such a portal might look like and examples of what content it might have on it. This was reviewed by a number of administrative staff and was also demonstrated to over ten different academic staff for feedback. There was unanimous agreement that this would be a good thing. A more comprehensive report on the feedback received will be produced as an initial deliverable for this project. This project will define and produce an underlying technical framework, evolved from the prototype and user requirements, into which multiple sources of data can ultimately be incorporated, as well as a basic information portal for academic staff to cover the core known requirements. An initial deliverable of the project will itemise the specific datasets that will be included as well as their source, their construction and whether any new central data is required. However, the actual content on the page is likely to change and grow over time based on what different users in different roles identify that they need (and this would constitute ongoing work beyond this project).

We would like to see a usable prototype in place during Semester 2 of 2018/19 with the final version being available from early in Summer 2019 (so as to cover duty assignment, TSP bid handling and course preparation).


string_to_array behaviour with an empty string

So this caught me out the other day. We had a query which broke out multiple delimited strings, each with one or more elements (but always the same number of elements in each), into multiple records using string_to_array and unnest. In each string any element could potentially have an empty value, but at least one string would always have values for all the elements. This meant there should always be at least one output row. So the query below worked as expected:

SELECT
  UNNEST(STRING_TO_ARRAY(test.s1, '|')) AS col1,
  UNNEST(STRING_TO_ARRAY(test.s2, '|')) AS col2
FROM ( VALUES (('a|b'),('1|2')) ) AS test(s1, s2);

with the following result:

 col1 | col2 
------+------
 a    | 1
 b    | 2
(2 rows)

Below worked entirely as expected as well:

SELECT
  UNNEST(STRING_TO_ARRAY(test.s1, '|')) AS col1,
  UNNEST(STRING_TO_ARRAY(test.s2, '|')) AS col2
FROM ( VALUES (('a|b'),('|')) ) AS test(s1, s2);

with the following result:

 col1 | col2 
------+------
 a    | 
 b    | 
(2 rows)

So did this when we had some data with just the minimum one element in the input:

SELECT
  UNNEST(STRING_TO_ARRAY(test.s1, '|')) AS col1, 
  UNNEST(STRING_TO_ARRAY(test.s2, '|')) AS col2
FROM ( VALUES (('a'),('1')) ) AS test(s1, s2);

with the following result:

 col1 | col2 
------+------
 a    | 1
(1 row)

But then we discovered that there was some data that was being completely lost in the query output. It turns out that this occurred when, as above, there was just one element in the input, but one value was empty:

SELECT
  UNNEST(STRING_TO_ARRAY(test.s1, '|')) AS col1,
  UNNEST(STRING_TO_ARRAY(test.s2, '|')) AS col2
FROM ( VALUES (('a'),('')) ) AS test(s1, s2);

with the following result:

 col1 | col2 
------+------
(0 rows)

So after some investigation it turns out that this happens because string_to_array on an empty string returns an array with no elements, rather than an array with one element which is an empty string. This does actually make sense because, without any application-specific context, it is indeterminate whether an empty string should be treated as having no items or should itself be treated as one item. There is a very full and complete discussion of the reasoning behind the decision to return an empty array:

Nevertheless we needed to fix the query. The approach I took was to force all input strings to have at least two elements by prepending the delimiter and then always discarding the first element:

SELECT
  UNNEST((STRING_TO_ARRAY('|' || test.s1, '|'))[2:100]) AS col1,
  UNNEST((STRING_TO_ARRAY('|' || test.s2, '|'))[2:100]) AS col2
FROM ( VALUES (('a'),('')) ) AS test(s1, s2);

producing the correct result:

 col1 | col2 
------+------
 a    | 
(1 row)

We are using PostgreSQL version 9.5 and we knew an upper bound on the number of elements, so we could get away with doing [2:100]; in PostgreSQL version 9.6 you can now do [2:] to leave the upper bound unspecified, defaulting to the array length, which is obviously nicer!
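
An alternative fix (just a sketch, not what we deployed) is to special-case the empty string before it reaches string_to_array, so that it becomes a single-element array containing an empty string:

-- Treat an empty string as one empty element, then unnest as before.
SELECT
  UNNEST(CASE WHEN test.s1 = '' THEN ARRAY[''] ELSE STRING_TO_ARRAY(test.s1, '|') END) AS col1,
  UNNEST(CASE WHEN test.s2 = '' THEN ARRAY[''] ELSE STRING_TO_ARRAY(test.s2, '|') END) AS col2
FROM ( VALUES (('a'),('')) ) AS test(s1, s2);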

The behaviour of string_to_array on a zero-length string is of course documented at https://www.postgresql.org/docs/9.5/static/functions-array.html, but in a rather oblique way, under a supplementary note documenting the change in behaviour from before PostgreSQL version 9.1, when NULL was returned instead of an empty array.

Hold the exact(ish) “CREATE VIEW” definition in the PostgreSQL System Catalog

It would be rather nice if the exact SQL definition (well almost, see below) supplied with the “create view” statement could be held in the PostgreSQL System Catalog, rather than the re-written optimised version that is currently held. This has come up before, see:

http://www.postgresql-archive.org/idea-storing-view-source-in-system-catalogs-td1987401.html
http://www.postgresql-archive.org/Preserving-the-source-code-of-views-td5775163.html

Back then this idea got stomped on for good reasons but I am not suggesting quite the same.

We would prefer PostgreSQL to store view definitions in the system catalog *AFTER* column expansion (unlike in the discussion above) but *BEFORE* internal parser rewrites. So, currently for example:

CREATE TABLE bar ( x INTEGER, y TEXT );

CREATE VIEW foo(x, y) AS
SELECT * FROM bar WHERE y IN ( 'a', 'b' ) AND x IS NOT DISTINCT FROM 0;

SELECT view_definition FROM information_schema.views
WHERE table_name = 'foo';

SELECT bar.x, bar.y
FROM bar
WHERE ((bar.y = ANY (ARRAY['a'::text, 'b'::text])) AND (NOT (bar.x IS DISTINCT FROM 0)));
(1 row)

The column expansion and corresponding rewrite are mandatory at the time of creation (see below), but the other changes to the definition are not. They make comparing the current definition of a view as it is in a database against an externally maintained definition difficult, which is what we in particular want to be able to do.
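
For reference, the stored (rewritten) definition can also be retrieved directly with the standard pg_get_viewdef function, shown here for the example view above; it is this rewritten text, rather than the original statement, that any external comparison has to work with:

-- Retrieve the stored, parser-rewritten definition of the example view;
-- the second argument requests pretty-printing.
SELECT pg_get_viewdef('foo'::regclass, true);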

The SQL standard (at least the ancient version I have access to) for “information_schema” seems rather open to interpretation on this issue. It says that the view definition column:

“contains a representation of the view descriptors”

What do we take “representation” to be – presumably a completely mangled, internally optimised “blob” would be compliant (but otherwise useless). The spirit of the statement might more reasonably be taken to be “a human-legible representation”. In that case, how much rewriting should actually be allowed – to me the rewritten definition above is already less legible than the original, but maybe others feel differently. I don’t know if it was a design decision that the parser rewrites were folded back into the stored definition or just a side effect of the implementation. Either way, if there is any additional rewriting in the future there must be a point at which the rewrites would be considered to make the stored view definition so illegible that the spirit of the standard, at least, is no longer being followed – or would future rewrites be needlessly constrained by having to preserve the legibility of the stored definition?

The SQL standard is however completely explicit about column expansion:

“NOTE 13: Any implicit column references that were contained in the <query
expression> associated with the <view definition> are replaced by explicit
column references in VIEW_DEFINITION.”

The SQL standard also implies that the definition should be NULL when the length exceeds some limit (I am not sure what that is, but I have a vague feeling from somewhere that it was 4K characters). PostgreSQL does not enforce that limit as far as I know (or simply has a much higher limit), since we have some massive views that are returned entirely – and that seems very sensible to me; I can’t see the point of having a limit these days (except for backwards compatibility with external applications). Maybe this limit has been removed in more recent versions of the standard.

The re-written view definition is also what is output by pg_dump. The consequence of this is that any new or altered parser rewrites made in a newer version of PostgreSQL might not be automatically applied to the view when it is restored from the old version's dump file into that new version. They would, however, always be applied if the view were restored from a dump that used the original raw definition from the CREATE VIEW statement.

I looked at MariaDB for comparison. This does some rewriting as well – it quotes every table and column reference and adds superfluous bracketing, but does not replace IN (it does not have a comparable “NOT DISTINCT” syntax, just <=>, but I suspect it would shift an embedded NOT in similar constructs in the same way as PostgreSQL). I guess YMMV here depending on how much clever rewriting the database does. Don’t know about other databases like Oracle or SQL Server.

Arguably the same principle should be applied to any embedded SQL literal expression used in DDL operations where the parser rewrites it, e.g. in DEFAULT values for table columns or WITH clauses used in trigger conditions. I have no idea what the SQL standard says about these, or indeed whether they are even covered by the information_schema.

Storing the original CREATE VIEW statement post column expansion is obviously of no benefit internally to PostgreSQL itself (excepting the minor pg_dump point above), so I can understand the reluctance to change anything. Reading the earlier discussion of this issue also suggests that this might be a really tricky change to make, I guess due to the way the parser is currently written, so the cost/benefit is poor.

Anyway, for the record at least, we have a use case where it would be better for us if the original CREATE VIEW statement was returned as-is in the “information_schema.view_definition” column, but in compliance with Note 13 any column expansion rewrites would have been applied. Over and above standards compliance, I believe that the column expansion is structurally necessary anyway to facilitate subsequent table/column renaming and the automatic cascade of the renaming to views that use them. The further parser-rewritten definition could continue to be stored in the pg_catalog in addition if it is needed internally within PostgreSQL (i.e. for pg_dump). However, I also suggest that it might be sensible for the original “column expansion only” definition to be used in the pg_dump output, so that the “intent” of the view when originally created is what should (or should not, in the case of any incompatibilities) migrate into a future version (or be restored into the same version).

Mock REF Reviewer System

This project is to provide a simple paper referee and review system for the mock REF exercise. The expectation is that College will produce something for the actual REF submission (in 2021), so in this context we want something that can be thrown together fairly quickly. Also, the workflow is not entirely clear at the outset, so we want something that can be quickly altered. So I decided to use a Theon managed service. This means the database schema and corresponding UI can be prototyped and turned into a production system very quickly, but it also allows the live system to be easily updated to account for any late design changes. Furthermore, we get a RESTful API for free by doing this, and if a more custom UI turns out to be required it will be relatively quick to do using the existing API.

Production Hadoop Service

Chris and I have started working on this project. The idea is for both of us to get up to speed with installing and running a Hadoop cluster in the first instance.

To date we have one node running all the services and successfully running a job. This was using ssh for secure connections. We have started looking at converting this to use Kerberos before trying to build more nodes. The configuration seems to be working but we have an issue using host principals with ssh.

Free Machine Finder – Draft Design and Implementation Proposal

Description

Provide a way for our students to easily locate free DICE desktops in AT. In addition, provide something like the functionality in the student-written MAPP demonstration system that helps students find out which labs their friends are working in.

Proposal

This is the first draft so likely to see some amendments.

User Interface

We will provide an Android App as the primary user interface. If this is well received we would then look at also providing an iOS equivalent. We will use a REST API so as not to preclude other interfaces, such as a desktop web browser or command line, but we will not necessarily implement any of these as part of this project.

The App will have a live home screen widget which simply displays the name of the AT lab that currently has the most available free machines. An additional icon will be shown alongside the lab name if there are “friends” logged onto machines in that lab (see details on the “friends” feature below).

Opening the App will show one main window. This will contain an ordered scrollable list of AT lab names, with those at the top of the list having the most available free machines and those at the bottom having the fewest. An additional icon will be shown alongside each lab name where there are one or more “friends” logged onto machines in that particular lab.

The status of AT lab machine usage and logged in “friends” presented by the App will be refreshed automatically every minute while the App is running.

There will be a “Refresh” icon on the main App window which can be touched to force a manual refresh.

There will be a “Friends” icon on the main App window. This can be touched to open the  App “friends” window. See details below.

Friends Feature

The “friends” feature is configured in the App “friends” window. At the top of the window will be an on/off “opt-in” slide toggle, which will be off by default. Nothing else is shown on the window in this state. When this toggle is off (or switched to off) the user's UUN is not available for other users to search for and invite to be a “friend” (and any previously made associations will then be hidden from those users, see below). Also the “friends” icon will no longer be shown alongside the AT lab name, irrespective of whether previously made associations would normally result in the icon being shown.

When the toggle is slid to on the following additional items are shown.

“Enter Friend UUN” is an entry box where you can enter a UUN. Alongside this is a “Search” icon. Touch this (or touch Return on the onscreen keyboard) to search for the UUN among only those users who have chosen to “opt-in”. If the UUN is not found an error like “UUN not found (invalid) or UUN has not opted-in” will be displayed. If the UUN is found a modal will be shown with the message “UUN (Firstname Lastname) found”. This will have two buttons – “Invite” and “Cancel”. Touching “Invite” will close the modal, add the UUN to the list underneath (see below) and clear the “Enter Friend UUN” box. Touching “Cancel” will close the modal and do nothing else.

Below the “Enter Friend UUN” entry box will be a list of friends, each shown as “UUN (Firstname)”. Those awaiting acceptance will have text alongside saying “Invited”. In addition, some will be separated at the top of the list with an “Accept” button alongside. These are requests from other users to this one. Touch the button to confirm.

Swiping any friend item will display a modal with an option to delete the association or cancel. The entry does not have to be a confirmed association; swipe-to-delete can also reject an invitation or cancel an invitation the user has made (in which case an orphaned acceptance would then be discarded on both sides). Once deleted, the association will need to be requested again from either side and confirmed. There is no explicit notification of removal for the other user (the corresponding UUN will just drop off their list on the next refresh).

A “friend” association is always a two-way pairing; it does not matter whether A invites and B accepts or B invites and A accepts – after this has happened A will have B listed as a friend and B will have A listed as a friend.

Once friend associations exist (which requires invitation AND acceptance), then when any of the associated UUNs in the list are logged into machines this will be indicated to the user by displaying the “friends” icon alongside the corresponding AT lab name. The exception is where a user has subsequently turned off “friends”, in which case their UUN will (silently) not be included in the search (although they will still be listed on the associated users' “friends” windows) – possibly with an “offline” status shown.

Security

For ease of use it would be impractical for the user to go through DICE authentication every time they wanted to use the App. So instead, on first opening the App the user will go through some kind of DICE authentication (possibly by just making an authenticated REST/API call, although it is not entirely clear how this will work from a mobile app yet) for which they receive a unique token (random hash value). The token is stored on their phone and the token/UUN mapping is stored in the back end. Subsequently the App just uses this stored token for authentication (by passing it as an argument in every REST/API call).

Compromise by loss of phone is obviously possible, but if we are informed the relevant token/UUN pair can be revoked by deletion of the record at the back end, requiring the user to re-authenticate in order to use the App. Deletion of the token/UUN pair at the back end would also delete the user's “opt-in” status (reverting to the default, which is off) and any existing “friend” associations. Increased security is possible, for example by constructing the hash based on UUN and phone UDID, but I am not sure that is really necessary here (and it may cause some users concern over tracking).

Implementation

Machine Status Collection

Each desktop in the AT student labs will have a cron job that runs a command every minute. This command will make an authenticated REST/API call to update the back end PostgreSQL database with the current status. This updates the corresponding machine's record with the current console UUN (or empty if no user is logged into the console). It need not wake up the machine to run during sleep – the last reported state should in general indicate no console user (unless they left themselves logged on).

An alternative (maybe preferable) to running the status update command every minute would be to call the command directly on console login and console logout; this could be done through the PAM stack, for example. It would, however, also be necessary to run the command on machine reboot and GUI restart.

Authentication will use the machine hostclient principal. Using an LCFG spanning map and pgluser we will be able to automatically create a mapping role for each host principal on PostgreSQL for each AT lab machine as and when they go in and out of service. We can also use pgluser to grant the role permission to access the “status” table view (below).

On PostgreSQL these roles will use a view that constrains their access to only the record for the machine corresponding to the host principal. This can be trivially achieved since the SESSION_USER will be the “machine name” (derived from host principal) and by using a view with update rules based on SESSION_USER.

In summary:

  • the client will use hostclient/HOSTNAME.inf.ed.ac.uk
  • a spanning map and pgluser will automatically maintain a corresponding “HOSTNAME” user in PostgreSQL for each hostclient principal
  • similarly pgluser will grant suitable access to the “status” table view for each HOSTNAME role (by assigning the relevant access role to each HOSTNAME user role).
  • when the hostclient principal connects the  SESSION_USER will be “HOSTNAME”
  • a “status” table in PostgreSQL will have two columns – “hostname” and “console_uun”
  • a view will be defined on the “status” table that is “select” and “update” constrained to only those rows where “hostname” matches SESSION_USER – hence a given “HOSTNAME” role can only see and change the value of the row in the table which has a matching hostname value. A similar “insert” rule will be defined on the view so that only a row with a matching hostname value can be added. The REST/API will use an “UPSERT” to insert or update the row it owns as appropriate (a sketch follows this list).
  • the “status” table will be unique on hostname to avoid any risk of double entry by accident or design flaw.
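
A minimal sketch of this arrangement is below; it uses an automatically updatable view with a CHECK OPTION rather than explicit rules, and all names (including the access role granted to at the end) are illustrative:

-- The "status" table, unique on hostname.
CREATE TABLE status (
    hostname     text PRIMARY KEY,
    console_uun  text              -- NULL/empty when nobody is on the console
);

-- A view constrained by SESSION_USER: a given HOSTNAME role can only see
-- and change its own row.
CREATE VIEW my_status AS
    SELECT hostname, console_uun
    FROM status
    WHERE hostname = SESSION_USER
    WITH CHECK OPTION;

-- pgluser would grant this to each HOSTNAME role via an access role.
GRANT SELECT, INSERT, UPDATE ON my_status TO lab_machine_access;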

Obviously status updates will not work during online examinations since the connection will be firewalled. This is reasonable behaviour (all locked down labs will show as fully available during the exam).

Unfortunately it will probably not be possible at present to get a feed of timetabled lab room bookings (for tutorials etc) from the central timetabling system as remote programmatic access to that system was de-scoped in the original project. Consequently this service may show rooms having many free machines that are not actually available to students to use as there is a tutorial happening in the room at that time, for example.

Support Data

A feed of data from the Informatics School Database will populate an “active_user” table with known UUNs and corresponding names. The known UUNs will only be those that have a valid account entitlement (effectively the same data as provided via the Prometheus user view). Once a UUN drops off that feed the authenticated REST/API can no longer be used to retrieve a token for that UUN, and any existing token/UUN pairing and associated data will be marked for deletion and will be fully purged after a short (glitch handling) delay. By using an FDW and TheonCoupler all the feed management can be done entirely in PostgreSQL and no supporting infrastructure should be required.

An additional feed of data will be required to populate a “room” table. This will contain machine-to-room mapping data. This data could come from LCFG or the inventory. Directly from the inventory would probably be preferable and can be achieved with a simple REST/API query against that, internalised as an FDW. Alternatively, it may be possible to extract this information from clientreport or LCFG information on the client itself and return it in the status update along with the console login status.

REST/API

Initial Authentication and Connection. When the App has no token it will require DICE authentication. It can then make an authenticated connection to the REST/API as that UUN.

An authenticated URI to REST/API can request a token for the authenticated UUN (or a call to the authenticated URI will first bounce the user through an authentication step). Once the App has the token it no longer uses the authenticated URI.

All other connections to REST/API use the unauthenticated URI always passing the token as one of the arguments.

All REST/API calls return JSON.

The authenticated API simply returns the corresponding token. To do so it will perform an “UPSERT” operation on the “user” table. This table contains three columns: “uun”, “token” and “friend_optin”. The upsert will be view-limited (for the REST/API credentials) so that it can only set the “uun” value – the token will be generated and set automatically and “friend_optin” will be left as NULL. The authenticated API will use credentials held on the server to perform this operation, but that is all these credentials will be able to do. The table will be unique on “uun”, preventing double entry by accident or design flaw. Note that an attempt to re-authenticate (if the user loses their token by deleting the App or getting a new phone) will update the existing record and re-generate the token.
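
A rough sketch of the “user” table and the upsert behind this call is below (the token-generation expression and the example UUN are illustrative assumptions only):

-- The "user" table, unique on uun; friend_optin stays NULL until the user opts in.
CREATE TABLE "user" (
    uun           text PRIMARY KEY,
    token         text NOT NULL DEFAULT md5(random()::text || clock_timestamp()::text),
    friend_optin  boolean
);

-- First authentication creates the row; re-authentication regenerates the token.
INSERT INTO "user" (uun) VALUES ('s1234567')
ON CONFLICT (uun) DO UPDATE
    SET token = md5(random()::text || clock_timestamp()::text)
RETURNING token;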

The unauthenticated REST/API provides the following calls:

  • Get a list of labs and free machines – arguments are “token”. This returns an array of lab objects in order from most free to least free. The App will put the first entry in the homepage widget and display all the entries on the main App window. Each lab object contains the name and a flag indicating if friends are in it. Where the user has not turned on “opt-in” the friends flag will always be false. Where the supplied token does not match a stored token/uun pair the result will always be empty. Where it does, the array of lab objects will be returned. Where it does and the token/uun friend “opt-in” value is true, the “friend” table will be scanned to set the friend flag appropriately. The “friend” table has two “uun” columns (invitor and invitee, although after this point the two are synonymous) and a “status” column which, if true, indicates the association has been confirmed (the invite has been accepted). When scanned, the token/uun is looked for under both columns (i.e. find friends that the user invited and also friends that invited that user). The table also has a “timestamp” column set on row creation, used later for automatic expiry of requests. The App makes this API call automatically every minute or when the “refresh” icon is touched.
  • Enable friends – arguments are “token”. This is called to opt-in to the “friends” feature. Sets the relevant flag in the “user” table and returns confirmation message.
  • Get friend – arguments are “token” and “uun”. This is called to return a match for the passed “uun” value. A match is only returned if the UUN matches an entry in the “user” table where “status” is true. The returned match object includes the corresponding firstname and lastname from the “active_user” table.
  • Invite friend (PUT) – arguments are “token” and “uun”. Adds a record in “friend” table with “uun,friend_uun” (where “uun” is that corresponding to the token and “friend_uun” is the “uun” passed directly as argument) and the default for status boolean (cannot be set via add api) which is NULL (meaning “request”). Only a UUN that would be returned as valid in the “get friend” call above will be added. Only one instance can be added (the table is unique on each uun+friend_uun combination).
  • Approve friend request – arguments are “token” and “uun”. Updates the “friend” table by locating the row where “the uun corresponding to the token” = “friend_uun” and “uun” equals the passed “uun” (the requester) and status is NULL and sets status to TRUE.
  • Decline friend (either an invite, a request for approval or a confirmed association) – arguments are “token” and “uun” (of friend). Searches the “friend” table and deletes the row where the uun corresponding to the token equals “uun” and “friend_uun” equals the passed “uun”, or vice versa.
  • Get friends – arguments are “token”. Returns data from the “friend” table where the current uun (corresponding to the token) matches the “friend_uun” or “uun” columns. An ordered array of objects is returned, suitable for direct display on the “friends” App window. Three types will be returned: “requests”, where status is NULL and the current uun matches the “uun” column; “invites”, where status is NULL and the current uun matches the “friend_uun” column; “associated”, where status is TRUE and the current uun matches either column. Data included in the object will be the names corresponding to the UUN and a type flag. This API call is made when the “friends” App window is opened, every minute while it is open, and immediately after the “enable/invite/approve/decline” requests above.

Note that rows in the “friend” table where status is NULL (invites that have not been accepted) are deleted automatically by a purge function after a period of time, e.g. 7 days.
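
A sketch of the “friend” table and the purge described above follows (column names are taken from the text; the exact definitions are assumptions):

-- Invitor/invitee pairs; status is NULL while pending and TRUE once accepted.
CREATE TABLE friend (
    uun         text NOT NULL,                       -- invitor
    friend_uun  text NOT NULL,                       -- invitee
    status      boolean,
    created     timestamptz NOT NULL DEFAULT now(),  -- the "timestamp" column
    UNIQUE (uun, friend_uun)
);

-- Periodic purge of invitations not accepted within 7 days.
DELETE FROM friend
WHERE status IS NULL AND created < now() - INTERVAL '7 days';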

To alleviate tracking concerns for “friends” data we could consider (as in the demonstration application) hashing all UUN values (with a random salt value held only on the server) within PostgreSQL so that casual internal support access (and accidental end user remote access by API bug or deliberate hack) reveals no useful information.

All the API functionality is implemented directly in PostgreSQL using views and appropriate update rules. The REST/API will connect locally as a specific set user (no connections from outside the server for that user). It would be easily implementable with Python/Flask-RESTful, for example, although the authenticated side might need more work.

The App simply makes the necessary REST/API URI calls and displays the results; it needs little other logic. It should be possible to simulate it with curl calls. It holds no state other than its connection token.

Implementation Sequence

The work below will probably be done in two discrete stages, each running through some or all of the steps below. The first stage would implement the core support for finding free machines (all steps). The second would implement support for the “friends” feature (steps 8 through 12).

  1. Create back end PostgreSQL service.
  2. Add LCFG spanning map and pgluser configuration to map hostclient principals to db users.
  3. Create REST/API for client “status” table update.
  4. Create machine client.
  5. Test then deploy onto all student lab machines.
  6. Monitor content on PostgreSQL to confirm functionality.
  7. Implement and test feed for “active_user” table update.
  8. Implement PostgreSQL functionality for REST/API calls including permissions. Test directly in PostgreSQL.
  9. Implement a Python/Flask REST/API interface.
  10. Test REST/API functionality with Curl.
  11. Implement a full test suite using Curl and a simulated data set of some kind.
  12. Write the Android App.

Adopting Theon – Part 3.4

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part covered the necessary packages and installation. This final part is an introductory tutorial.

In the first part of this tutorial we created a database and started to manage it in Theon.  In the second part we configured TheonCoupler to load data automatically into the database. In the third part we configured TheonUI to allow access to the database and made a custom desktop. In this fourth and final part we will package and distribute the result.

The Theon toolkit has some commands to build and package a ModelLibrary for distribution. The resulting distribution will then have the toolkit as a pre-requisite.

We need to do some work with Git first, as the ModelLibrary is held as a Git repository.

rm -fr derived
git add factory schemat
git commit --message "first pass"

Now we can build a release (which will have version number 1.1 by default).

ttkm release

Check the result.

ttkm display
* >
  Establishing UnitArchive
  - Checking for UnitArchive in Folder[CWD]
    - Found Unit[packages] in Folder[CWD]
    - This[packages]
* packages[CWD] >
  History of Unit[packages]
  - Release Directory:         CWD::.release
  - Release Directory Index:   CWD::.release/index
  - Release SpecFile:          CWD::packages.spec
  - Current Release Version:   packages-1.1
  - Current Release Revision:  07f1a86d37bc74a6b5a2bd0c5157650c60501b3a

Now build a source and binary package for the current release above and list the results.

ttkm package rebuild
ls package
packages-1.1-1.src.rpm                   packages-theon-library-devel-1.1-1.noarch.rpm
packages-theon-library-1.1-1.noarch.rpm  packages-theon-library-ui-1.1-1.noarch.rpm

Install the packages.

rpm -U package/*.noarch.rpm

Now we can test the installation: first drop our existing live database, change directory and install from the distribution.

dropdb packages
cd ~/
ttkm packages install

Feed in the data in the same way but using the coupler configuration in the distribution.

rpm -qa --qf '"%{NAME}","%{VERSION}","%{RELEASE}","%{ARCH}","%{BUILDTIME}"\n'| ttkm packages stream agent rpm refresh

rpm -qa --qf '["%{NAME}","%{ARCH}","%{REQUIRENAME}","%{REQUIREVERSION}"\n]'| ttkm packages stream agent rpm refresh to packdep couple packdep

Finally start up TheonUI to access the content using the configuration and metadata in the distribution.

ttkm packages server ui

Note that the server configuration is built for the distribution using the LiveProfile attached to the ModelLibrary at build time. In most distributions this would more than likely be configured separately.

There is more extensive documentation on Theon, TheonCoupler, TheonToolkit and TheonUI in The Theon Book.

Adopting Theon – Part 3.3

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part covered the necessary packages and installation. This final part is an introductory tutorial.

In the first part of this tutorial we created a database and started to manage it in Theon.  In the second part we configured TheonCoupler to load data automatically into the database. In this third part we will configure TheonUI to allow access to the database and make a custom desktop. In the fourth and final part we will package and distribute the result.

Clear out existing derived content and generate the TheonUI metadata content and configuration file.

rm -fr derived
ttkm derive ui

Start up a server instance using the just generated configuration file and content.

ttkm server ui using config=derived/ui/default/ui.conf

On a browser running on the local machine go to localhost:8000/ui.html to access our instance of TheonUI. Once there we can access the data in the package or rpmdata tables as follows.

From the Choose Desktop menu select Browser.
From the Choose Cluster menu select All Clusters.
From the Choose Table menu select package.
Then the content will be accessible.
Choose the Use Grid option to see the content in tabular form.

We can construct a desktop to show content from multiple related tables simultaneously. However, at the moment our database is too simple for this. So first we will add another table, which is a child of package, and create a stream to load it with data. All aspects of this have been covered in parts one and two of this tutorial, so much of the detail will be glossed over here.

Create a new table related to package and a corresponding data source table, import the new structure into Theon, then undo our changes and upgrade from Theon.

psql packages <<EOF
CREATE TABLE packdep ( name TEXT, arch TEXT, requires TEXT, version TEXT );
CREATE TABLE rpmdeps AS SELECT * FROM packdep;
ALTER TABLE package ADD PRIMARY KEY (name,architecture);
ALTER TABLE packdep ADD FOREIGN KEY (name,arch) REFERENCES package (name,architecture);
EOF
ttkm import
psql packages <<EOF
DROP TABLE packdep;
DROP TABLE rpmdeps;
ALTER TABLE package DROP CONSTRAINT package_pkey;
EOF
ttkm export
ttkm upgrade

Now modify the existing rpm Stream to add a new Couple with a custom source table and feed of data. Start the TheonUI first.

ttkm self server ui

Now enter the necessary data using a locally running browser pointed at localhost:8000/ui.html.

Open the Streaming desktop.
On the first panel (Choose Model) ensure the right model is shown (in this case packages should be the only one) and then open the Stream panel and add one new row into the Couples box:
 Position SourceEntity TargetEntity Name[Info]
 2 rpmdeps packdep rpmpackagedeps
Open the Couple panel, make sure the rpmpackagedeps couple is shown and enter and save the following for the couple (keep quotes):
 Source Origin Value = 'RPM'
 Reload = Pipe
 Then add two new rows in the Bonds box:
 Position Type
 1 Handle
 2 Column

Open the Bond panel and add rows in the Pairs box for each bond type as below:
 For Handle:
 Position SourceColumn TargetColumn
 1 name name
 2 arch arch
 3 requires requires
 For Column:
 Position SourceColumn TargetColumn
 1 version version

For the couple we have defined the source table (rpmdeps) and target table (packdep). We have also defined the handle pairs (name:name + arch:arch + requires:requires) to tie records together between source and target.

Now exit the server process with Control-C.

Export the modified schema and upgrade the live database structures.

ttkm export
ttkm upgrade

Update the stream configuration.

ttkm derive coupler

Run the stream, load the data and process the new couple to load the target table content. Then check that content.

rpm -qa --qf '["%{NAME}","%{ARCH}","%{REQUIRENAME}","%{REQUIREVERSION}"\n]'| ttkm stream agent rpm refresh to packdep couple packdep
psql packages -c "SELECT arch, COUNT(*) AS total FROM packdep GROUP BY arch;"
  arch  | total 
--------+--------
 (none) |      1
 i486   |     99
 noarch |  17006
 i686   |   2631
 x86_64 |  63068
(5 rows)

Now that we have some related tables and data, we can construct a desktop in TheonUI that, in one panel, allows package selection and shows the related package dependencies. Start up TheonUI for the Theon management database.

ttkm self server ui

Now enter the necessary data using a locally running browser pointed at localhost:8000/ui.html.

Open the Reflection desktop.
On the first panel (Choose Model) ensure the right model is shown (in this case packages should be the only one).
Add a new row in the Desktops box:
 Desktop-Label = packages
 Name[Info] = Packages
Open the Panel panel and add one new row:
 Panel-Label = package
 Name[Info] = Package
 ? Base Entity = package
 Selector = Scroll
and save the changes. Then add one new row in the Connectors box:
 Position ?JoiningRelationship Type Label
 1 packdep_name_fkey ForwardNormal Requirements
and save the changes. Then add one new row in the Presenters box:
 Presenter-Label Style Height
 Requirements Grid 16
and save the changes. Then add the following rows in the Fields box:
 Position ?Connector ?Attribute Presenter Width Name[Info]
 1 -NONE- name - 80 Name
 1 -NONE- architecture - 20 Architecture
 1 -NONE- version - 20 Version
 1 -NONE- release - 20 Release
 5 Requirements requires Requirements 80 Name
 6 Requirements version Requirements 20 Version
and save the changes.
Open the Desktop panel and add one new row into the Panels box:
 Position ?Panel Link-Label
 1 package Package

Now exit the server process with Control-C.

Export the modified schema and derive the new UI metadata.

ttkm export
rm -fr derived
ttkm derive ui

Before starting the server, tweak the configuration manually so that the initial data fetch size is limited. Then start up a server instance using the just-generated configuration file and content.

Add at the top of the "config = {" section in derived/ui/default/ui.conf:
  "limit_rows" : 250,
  "limit_rows_conditionally" : True,

ttkm server ui using config=derived/ui/default/ui.conf

On a browser running on the local machine go to localhost:8000/ui.html to access our instance of TheonUI.

From the Choose Desktop menu select Packages.

This is a very simple demonstration desktop. You can use it to look up a package by name at the top, for example, and each pre-requisite package and version will be shown in the lower box. Or you can query for a pre-requisite package name and all the packages dependent on it will be shown.