Firebird News

Friday, January 28, 2005

Global Temporary Tables

From Vlad Horsun:

Hi All !

During last two months i did some research and implementation of
global temporary tables. Here i want to show what i has done and listen
your opinions and critics.


Definition:

Global temporary table (GTT) is base relation with persistent metadata
definition, stored in database catalog (system tables), and temporary data.
GTT's can be two kinds - with data, persistent within connection or within
transaction only. Data from from different connection\transaction are isolated
one from others but metadata is common.


Syntax and semantics:

CREATE GLOBAL TEMPORARY TABLE
[ON COMMIT {PRESERVE | DELETE} ROWS]

Creates metadata for GTT and stored it in database catalog.
If optional ON COMMIT clause is omitted then PRESERVE ROWS is default.

CREATE GLOBAL TEMPORARY TABLE is usual DDL statement and
handled by the engine the same way as CREATE TABLE statement - when
issued all necessary entries made in system tables and when transaction
commited all other work are done by DFW. GTT differs from persistent tables
by two new flags in RDB$RELATIONS.RDB$FLAGS. ODS change not
required.

GTT's can have indexes, triggers, field and table constraints as permanent
tables. All constraints between any kind of permanent and temporary tables
follow the rule below:

a) references between permanent and temporary tables are forbidden
b) GTT with ON COMMIT PRESERVE ROWS can't reference on GTT
with ON COMMIT DELETE ROWS

There is a table for easyer understanding which references are allowed

Master: Permanent Preserve rows Delete rows
Child:
Permanent Allow
Preserve rows Allow
Delete rows Allow Allow

Domain constraints also can't reference on GTT.


Instantiation and cleanup:

GTT instance created when it first referenced, usually at statement
prepare time. Each instance has its own private set of pages on which
data and indexes are stored. When connection\transaction ends all
instance pages are released.


Storage:

Data stored on the same manner as persistent tables. At creation
time GTT has 2 pages allocated - primary pointer page (not necessary
actually) and index root page (all index definitions for GTT stored there).
Let call them "base" pages.

When GTT instance is created, engine allocate new primary pointer page
for it and index root page (this pages not tracked in RDB$PAGES) . "Base"
index root page are readed and all indexes from it created for given GTT
instance. All data\index operations with this GTT instance used this pages.

Storage can be inside database files or outside it. For external storage
purposes introduced concept of "page space" :

"Page space" is a set of pages enumerated from zero to 2^32-1 (SLONG)
on which engine can store all types of pages. Each page space reside in
it's own OS file or set of files. Engine has all necessary info about this files.

All database pages now reside within predefined "base" page space.
Pages of GTT's instances can reside within this "base" page space or within
additional page spaces created and dropped on the fly. Temporary page spaces
creation and deletion performed by the engine automatically. Files created
with "temporary" attribute set (on platforms which have such attribute).
"forced writes" set to "off" for temporary page spaces regardless of database
settings. All pages from all present page spaces are handled within common
page cache.


Blob "problem":

At blob creation time engine not know in which relation (permanent or
temporary) this blob will be attached when materialized. Therefore when blob
will be materialized may arise necessity to move it from initial ("base") page
space into page space of the appropriate relation. To avoid such waste of
time i propose extension of blob parameter block (used when blob created).
This extension allows to set initial page space in which blob pages will be
allocated before materialization. No public API change is necessary, only
few new constants.


Known limitations:

When user create new index already existing GTT's instances will not
know about it (and don't build it of course). Only new instances will read
new index definition from "base" index root page and build it.

This limitation seems not very important to me since this is usually
bad practice to change metadata on working database.


Work progress:

Almost all of said above is implemented in my private tree.

Current implementation allows to have follow kinds of storage for
temporary page spaces:

1. No temporary page spaces - all inside database
2. Page space per attachment
3. Page space per engine process (one common page space for SS or
separate page space per CS process)

Now it is hardcoded but can easy be moved into configuration file.
I have no strong opinion about how files for temporary page spaces must
be configured - where it must be created. Now i simple create each page
space in its own single file with predefined name:

sprintf(file_name, "fb_tmp_%x_%x", pid, hash);

where "pid" is process id and "hash" is hashed full name of primary database
file. Files are placed into GetTempPath(). As temporary solution and for testing
purposes its worked fine but for production use we must introduce some
rules for it.


Regards,
Vlad

No comments: