pg_jts

Create a generalized JSON-table-schema structure from a live postgres database.

The JSON data structure returned from get_database() is a generalization of the JSON-table-schema: The resources in our structure comply with the table definition there (we extend it in allwoed ways). Our structure comprises the whole database. It is the JSON-encoded form of a dictionary with these keys (values being strings, if not otherwise indicated):

  • source: the string ‘PostgreSQL’
  • source_version: the PostgreSQL version returned by the server
  • database_name: the database name
  • database_description: the comment on the database
  • generation_begin_time: begin datetime as returned from PostgreSQL
  • generation_end_time: end datetime as returned from PostgreSQL
  • datapackages: a list of dictionaries, one for each PostgreSQL schema, with these keys:
    • datapackage: the name of the PostgreSQL schema
    • resources: a list of dictionaries, each describing a table within the current PostgreSQL schema and having these keys:
      • name: the name of the table
      • description: the table comment (only those components not part of a weak foreign key definition)
      • primaryKey: the primary key of the table, which is a list of column names
      • fields: a list of dictionaries describung the table columns and having these keys:
        • name: the column name
        • description: the column comment
        • position:
        • type: the PostgreSQL data type, e.g., ‘varchar(100)’ or ‘int4’
        • defaultValue: the default value of the column, e.g., ‘0’, or ‘person_id_seq()’ in case of a sequence
        • constraints: a dictionary describing constraints on the current column, with these keys:
          • required: boolean telling whether the column has a ‘NOT NULL’ constraint
      • indexes: a list of dictionaries, one per index and column, having these keys:
        • name: name of the index
        • columns: a list with the names of the columns used in the index and ordered by priority
        • creation: the SQL statement for creating the index
        • definition: the index definition, e.g., ‘btree (id1, id2)’
        • primary: boolean telling whether the indexed columns form a primary key
        • unique: boolean telling whether the indexed columns are constrained to be unique
      • foreignKeys: a list of foreign keys used by the current table:
        • columns: the names of the columns in the current table which are referencing a remote relation
        • enforced: a boolean telling whether the foreign key constraint is being enforced in PostgreSQL (True), or if it is a weak reference and the constraint is kept only by the application software (False)
        • reference: a dict for specifying the reference target, having these keys:
          • datapackage: the name of the PostgreSQL schema in which the referenced table resides
          • resource: the name of the referenced table
          • name: the name of the foreign key constraint
          • columns: a list of the names of the referenced columns
          • cardinalitySelf: (optional) the cardinality of the foreign key relation (as obtained from a column or table comment) on the side of the current table
          • cardinalityRef: (optional) the cardinality of the foreign key relation (as obtained from a column or table comment) on the side of the remote table
          • label: (optional) a label describing the foreign key relation (as obtained from a column or table comment)

Foreign key syntax

Foreign keys will be recognized where either a (hard) foreign key constraint is present in PostgreSQL, or a table or column comment describes a foreign key relation according to these syntax rules (we call this weak reference):

  • the comment is split at 1) ; followed by a space character or 2) \n, and results in what we call components

  • if a component matches one of the relation_regexps, we try to find a column name, a table name and an optional schema name in it; we match existing names in one of these four formats:

    • schema.table.column
    • table.column
    • schema.table(column1, column2, ..., columnN)
    • table(column1, column2, ..., columnN)
  • if a relation is valid, we also extract both cardinalities on the side of the table (card1) and on the foreign side (card2); the syntax is card1 link card2, where card1 and card2 are values in cardinalities and link is one of --, - with an optional space character on both sides (independently).

  • if a relation is valid, we also extract a label for the relation: when the component contains a string like label="<LABEL>", <LABEL> will be extracted. (On both sides of ‘=’ an arbitrary number of white spaces may appear.

In cases where both a foreign key constraint and a weak reference are present, the weak reference information supplements the constraint, in particular by adding cardinalities (if present).

pg_jts.pg_jts.cardinalities = ['0..1', '1', '0..N', '1..N']

Cardinalities.

These values are allowed in weak references.

pg_jts.pg_jts.get_database(db_conn_str, relation_regexps=None, exclude_tables_regexps=None)[source]

Return a JSON data structure representing the PostgreSQL database.

Returns a JSON string and a list of notifications. The notifications inform about invalid or possibly unwanted syntax of the weak references (contained in the comments).

A valid PostgreSQL connection string (db_conn_str) is required for connecting to a live PostgreSQL database with read permissions.

The resulting data structure is missing some details. Currently mainly these structures are extracted from the database:

  • tables
  • foreign key relations (both constraints and weak references)
  • indexes

The optional arguments have these meanings:

  • exclude_tables_regexps is a list of regular expression strings; if a table name matches any of them, the table and all its relations to other tables are omitted from the result
  • relation_regexps is a list of regular expression strings; if a table comment or a column comment matches any of them, it is parsed for a ‘weak’ foreign key relation (cf. Foreign key syntax)
pg_jts.pg_jts.get_schema_table_column_triples(database)[source]

Return a list of all (schema_name, table_name, column_name)-combinations.

database must have the same structure as obtained from get_database().