pg_jts¶
Create a generalized JSON-table-schema structure from a live postgres database.
The JSON data structure returned from get_database()
is a generalization
of the JSON-table-schema: The resources in our structure comply with the
table definition there (we extend it in allwoed ways). Our structure comprises
the whole database. It is the JSON-encoded form of a dictionary with these keys
(values being strings, if not otherwise indicated):
- source: the string ‘PostgreSQL’
- source_version: the PostgreSQL version returned by the server
- database_name: the database name
- database_description: the comment on the database
- generation_begin_time: begin datetime as returned from PostgreSQL
- generation_end_time: end datetime as returned from PostgreSQL
- datapackages: a list of dictionaries, one for each PostgreSQL schema, with these keys:
- datapackage: the name of the PostgreSQL schema
- resources: a list of dictionaries, each describing a table within the current PostgreSQL schema and having these keys:
- name: the name of the table
- description: the table comment (only those components not part of a weak foreign key definition)
- primaryKey: the primary key of the table, which is a list of column names
- fields: a list of dictionaries describung the table columns and having these keys:
- name: the column name
- description: the column comment
- position:
- type: the PostgreSQL data type, e.g., ‘varchar(100)’ or ‘int4’
- defaultValue: the default value of the column, e.g., ‘0’, or ‘person_id_seq()’ in case of a sequence
- constraints: a dictionary describing constraints on the current column, with these keys:
- required: boolean telling whether the column has a ‘NOT NULL’ constraint
- indexes: a list of dictionaries, one per index and column, having these keys:
- name: name of the index
- columns: a list with the names of the columns used in the index and ordered by priority
- creation: the SQL statement for creating the index
- definition: the index definition, e.g., ‘btree (id1, id2)’
- primary: boolean telling whether the indexed columns form a primary key
- unique: boolean telling whether the indexed columns are constrained to be unique
- foreignKeys: a list of foreign keys used by the current table:
- columns: the names of the columns in the current table which are referencing a remote relation
- enforced: a boolean telling whether the foreign key constraint is being enforced in PostgreSQL (True), or if it is a weak reference and the constraint is kept only by the application software (False)
- reference: a dict for specifying the reference target, having these keys:
- datapackage: the name of the PostgreSQL schema in which the referenced table resides
- resource: the name of the referenced table
- name: the name of the foreign key constraint
- columns: a list of the names of the referenced columns
- cardinalitySelf: (optional) the cardinality of the foreign key relation (as obtained from a column or table comment) on the side of the current table
- cardinalityRef: (optional) the cardinality of the foreign key relation (as obtained from a column or table comment) on the side of the remote table
- label: (optional) a label describing the foreign key relation (as obtained from a column or table comment)
Foreign key syntax¶
Foreign keys will be recognized where either a (hard) foreign key constraint is present in PostgreSQL, or a table or column comment describes a foreign key relation according to these syntax rules (we call this weak reference):
the comment is split at 1)
;
followed by a space character or 2)\n
, and results in what we call componentsif a component matches one of the relation_regexps, we try to find a column name, a table name and an optional schema name in it; we match existing names in one of these four formats:
- schema.table.column
- table.column
- schema.table(column1, column2, ..., columnN)
- table(column1, column2, ..., columnN)
if a relation is valid, we also extract both cardinalities on the side of the table (card1) and on the foreign side (card2); the syntax is
card1
link
card2
, where card1 and card2 are values incardinalities
and link is one of--
,-
with an optional space character on both sides (independently).if a relation is valid, we also extract a label for the relation: when the component contains a string like
label="<LABEL>"
,<LABEL>
will be extracted. (On both sides of ‘=’ an arbitrary number of white spaces may appear.
In cases where both a foreign key constraint and a weak reference are present, the weak reference information supplements the constraint, in particular by adding cardinalities (if present).
-
pg_jts.pg_jts.
cardinalities
= ['0..1', '1', '0..N', '1..N']¶ Cardinalities.
These values are allowed in weak references.
-
pg_jts.pg_jts.
get_database
(db_conn_str, relation_regexps=None, exclude_tables_regexps=None)[source]¶ Return a JSON data structure representing the PostgreSQL database.
Returns a JSON string and a list of notifications. The notifications inform about invalid or possibly unwanted syntax of the weak references (contained in the comments).
A valid PostgreSQL connection string (db_conn_str) is required for connecting to a live PostgreSQL database with read permissions.
The resulting data structure is missing some details. Currently mainly these structures are extracted from the database:
- tables
- foreign key relations (both constraints and weak references)
- indexes
The optional arguments have these meanings:
- exclude_tables_regexps is a list of regular expression strings; if a table name matches any of them, the table and all its relations to other tables are omitted from the result
- relation_regexps is a list of regular expression strings; if a table comment or a column comment matches any of them, it is parsed for a ‘weak’ foreign key relation (cf. Foreign key syntax)
-
pg_jts.pg_jts.
get_schema_table_column_triples
(database)[source]¶ Return a list of all (schema_name, table_name, column_name)-combinations.
database must have the same structure as obtained from
get_database()
.