From Wikipedia, the free encyclopedia
|Developer(s)||PostgreSQL Global Development Group|
|Initial release||1 May 1995|
February 20, 2014
PostgreSQL, often simply "Postgres", is an open-source object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance. It is released under the PostgreSQL License, a free/open-source software license similar to the MIT License. PostgreSQL is developed by the PostgreSQL Global Development Group, consisting of a handful of volunteers employed and supervised by companies such as Red Hat and EnterpriseDB. It implements the majority of the SQL:2011 standard, is ACID-compliant, is fully transactional (including all DDL statements), and has updatable views and extensible data types, operators, index methods, functions, aggregates and procedural languages. A large number of extensions written by third parties are also available. PostgreSQL runs on many operating systems, including Linux, FreeBSD, Solaris, Microsoft Windows and Mac OS X.
The vast majority of Linux distributions have PostgreSQL available in supplied packages. OS X, starting with Lion, has PostgreSQL server as its standard default database in the server edition, and PostgreSQL client tools in the desktop edition.
PostgreSQL is commonly abbreviated as Postgres, its original name. Because of ubiquitous support for the SQL Standard amongst most relational databases, the community considered changing the name back to Postgres. However, the PostgreSQL Core Team announced in 2007 that the product would continue to use the name PostgreSQL. The name refers to the project's origins as a "post-Ingres" database, being a development from University Ingres DBMS (Ingres being an abbreviation for INteractive Graphics REtrieval System).
PostgreSQL evolved from the Ingres project at the University of California, Berkeley. In 1982, the project leader, Michael Stonebraker, left Berkeley to make a proprietary version of Ingres. He returned to Berkeley in 1985 and started a post-Ingres project to address the problems with contemporary database systems that had become increasingly clear during the early 1980s. The new project, POSTGRES, aimed to add the fewest features needed to completely support types. These features included the ability to define types and to fully describe relationships – something used widely before but maintained entirely by the user. In Postgres, the database "understood" relationships, and could retrieve information in related tables in a natural way using rules. Postgres used many of the ideas of Ingres, but not its code.
Starting in 1986, the team published a number of papers describing the basis of the system, and by 1988 had a prototype version. The team released version 1 to a small number of users in June 1989, then version 2 with a re-written rules system in June 1990. Version 3, released in 1991, again re-wrote the rules system, and added support for multiple storage managers and an improved query engine. By 1993 the great number of users began to overwhelm the project with requests for support and features. After releasing version 4—primarily a cleanup—the project ended.
But open-source developers could obtain copies and develop the system further, because Berkeley had released Postgres under an MIT-style license. In 1994, Berkeley graduate students Andrew Yu and Jolly Chen replaced the Ingres-based QUEL query language interpreter with one for the SQL query language, creating Postgres95. The code was released on the web.
In July 1996, Marc Fournier at Hub.Org Networking Services provided the first non-university development server for the open-source development effort. Along with Bruce Momjian and Vadim B. Mikheev, work began to stabilize the code inherited from Berkeley. The first open-source version was released on August 1, 1996.
In 1996, the project was renamed to PostgreSQL to reflect its support for SQL. The first PostgreSQL release formed version 6.0 in January 1997. Since then, the software has been maintained by a group of database developers and volunteers around the world, coordinating via the Internet.
The PostgreSQL project continues to make major releases (approximately annually) and minor "bugfix" releases, all available under the same license. Code comes from contributions from proprietary vendors, support companies, and open-source programmers at large.
PostgreSQL manages concurrency through a system known as multiversion concurrency control (MVCC), which gives each transaction a "snapshot" of the database, allowing changes to be made without being visible to other transactions until the changes are committed. This largely eliminates the need for read locks, and ensures the database maintains the ACID (atomicity, consistency, isolation, durability) principles in an efficient manner. PostgreSQL offers three levels of transaction isolation: Read Committed, Repeatable Read and Serializable. Because PostgreSQL is immune to dirty reads, requesting a Read Uncommitted transaction isolation level provides read committed instead. Prior to PostgreSQL 9.1, requesting Serializable provided the same isolation level as Repeatable Read. PostgreSQL 9.1 and later support full serializability via the serializable snapshot isolation (SSI) technique.
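The snapshot idea can be illustrated with a toy model in Python. This is only a sketch of the visibility rule described above, not PostgreSQL's implementation: the class names `ToyMVCCStore` and `ToyTransaction` are hypothetical, the snapshot is taken once at transaction start (roughly in the spirit of Repeatable Read), and write conflicts, deletes and cleanup of old versions are ignored.

```python
class ToyMVCCStore:
    """Toy multiversion store: each write appends a new version tagged with its transaction id."""

    def __init__(self):
        self.versions = {}      # key -> list of (txid, value), oldest first
        self.committed = set()  # ids of committed transactions
        self.next_txid = 1

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        # Assumption: the snapshot is the set of transactions committed at start time.
        return ToyTransaction(self, txid, frozenset(self.committed))


class ToyTransaction:
    def __init__(self, store, txid, snapshot):
        self.store = store
        self.txid = txid
        self.snapshot = snapshot

    def write(self, key, value):
        # Writers never overwrite in place; they add a new version.
        self.store.versions.setdefault(key, []).append((self.txid, value))

    def read(self, key):
        # Return the newest version written by this transaction or by one
        # that was already committed when the snapshot was taken.
        for txid, value in reversed(self.store.versions.get(key, [])):
            if txid == self.txid or txid in self.snapshot:
                return value
        return None

    def commit(self):
        self.store.committed.add(self.txid)
```

In this model a reader never blocks a writer: a transaction that started before another one committed keeps seeing the older version, while transactions that begin afterwards see the new one.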
Procedural languages allow developers to extend the database with custom subroutines (functions), often called stored procedures. These functions can be used to build triggers (functions invoked upon modification of certain data) and custom aggregate functions. Procedural languages can also be invoked without defining a function, using the "DO" command at SQL level.
Languages are divided into two groups: "Safe" languages are sandboxed and can be safely used by any user. Procedures written in "unsafe" languages can only be created by superusers, because they allow bypassing the database's security restrictions, but can also access sources external to the database. Some languages like Perl provide both safe and unsafe versions.
PostgreSQL has built-in support for three procedural languages:
PostgreSQL includes built-in support for regular B-tree and hash indexes, as well as two further index types: generalized search trees (GiST) and generalized inverted indexes (GIN). Hash indexes are implemented, but discouraged because they cannot be recovered after a crash or power loss. In addition, user-defined index methods can be created, although this is quite an involved process. Indexes in PostgreSQL also support the following features:
Partial indexes, which index only part of a table, can be created by adding a WHERE clause to the end of the CREATE INDEX statement. This allows a smaller index to be created.
Triggers are events triggered by the action of SQL DML statements. For example, an INSERT statement might activate a trigger that checks if the values of the statement are valid. Most triggers are only activated by either INSERT or UPDATE statements.
Triggers are fully supported and can be attached to tables. In PostgreSQL 9.0 and above, triggers can be per-column and conditional, in that UPDATE triggers can target specific columns of a table, and triggers can be told to execute under a set of conditions as specified in the trigger's WHEN clause. As of PostgreSQL 9.1, triggers can be attached to views by using the INSTEAD OF condition. Views in versions prior to 9.1 can have rules, though. Multiple triggers are fired in alphabetical order. In addition to calling functions written in the native PL/pgSQL, triggers can also invoke functions written in other languages like PL/Python or PL/Perl.
In PostgreSQL, all objects (with the exception of roles and tablespaces) are held within a schema. Schemas effectively act like namespaces, allowing objects of the same name to co-exist in the same database. Schemas are analogous to directories in a file system, except that they cannot be nested, nor is it possible to create a "symbolic link" pointing to another schema or object.
By default, databases are created with the "public" schema, but any additional schemas can be added, and the public schema isn't mandatory. A "search_path" determines the order in which schemas are checked on unqualified objects (those without a prefixed schema), which can be configured on a database or role level. The search path, by default, contains the special schema name of "$user", which first looks for a schema named after the connected database user (e.g. if the user "dave" were connected, it would first look for a schema also named "dave" when referring to any objects). If such a schema is not found, it then proceeds to the next schema. New objects are created in whichever valid schema (one that presently exists) is listed first in the search path.
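The lookup order described above can be sketched as a small Python model. It is an illustration only: the functions `resolve` and `creation_schema` and the sample schema contents are hypothetical, and the path `["$user", "public"]` is assumed as the default search path.

```python
def _expand(entry, current_user):
    """Replace the special "$user" entry with the connected user's name."""
    return current_user if entry == "$user" else entry


def resolve(name, search_path, schemas, current_user):
    """Return 'schema.name' for the first schema on the path that contains name."""
    for entry in search_path:
        schema = _expand(entry, current_user)
        if schema not in schemas:
            continue  # schemas that do not exist are skipped
        if name in schemas[schema]:
            return f"{schema}.{name}"
    return None


def creation_schema(search_path, schemas, current_user):
    """Return the first schema on the path that actually exists."""
    for entry in search_path:
        schema = _expand(entry, current_user)
        if schema in schemas:
            return schema
    return None


# Hypothetical database: user "dave" has a schema of his own, "alice" does not.
schemas = {"public": {"orders"}, "dave": {"orders", "notes"}}
path = ["$user", "public"]
print(resolve("orders", path, schemas, "dave"))
print(resolve("orders", path, schemas, "alice"))
print(creation_schema(path, schemas, "alice"))
```

For "dave" the unqualified name resolves to his own schema first, while for "alice" the lookup falls through to "public", which is also where her new objects would be created.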
A wide variety of native data types are supported, including:
In addition, users can create their own data types which can usually be made fully indexable via PostgreSQL's GiST infrastructure. Examples of these include the geographic information system (GIS) data types from the PostGIS project for PostgreSQL.
There is also a data type called a "domain", which is the same as any other data type but with optional constraints defined by the creator of that domain. This means any data entered into a column using the domain will have to conform to whichever constraints were defined as part of the domain.
Starting with PostgreSQL 9.2, data types that represent a range of values, called range types, are available. These can be discrete ranges (e.g. all integer values 1 to 10) or continuous ranges (e.g. any point in time between 10:00am and 11:00am). The built-in range types include ranges of integers, big integers, decimal numbers, time stamps (with and without time zone) and dates.
Custom range types can be created to make new types of ranges available, such as IP address ranges using the inet type as a base, or float ranges using the float data type as a base. Range types support inclusive and exclusive range boundaries using the [] and () characters respectively (e.g. '[4,9)' represents all integers starting from and including 4 up to but not including 9). Range types are also compatible with existing operators used to check for overlap, containment, right of etc.
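The boundary notation can be made concrete with a short Python model of an integer range. This is a sketch under stated assumptions rather than PostgreSQL code: the class `IntRange` and its methods are hypothetical names, and only integer bounds written in the bracket notation above are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    lower: int
    upper: int
    lower_inc: bool = True   # "[" means the lower bound is included
    upper_inc: bool = False  # ")" means the upper bound is excluded

    @classmethod
    def parse(cls, text):
        """Parse notation such as '[4,9)' or '(4,9]'."""
        text = text.strip()
        lower, upper = text[1:-1].split(",")
        return cls(int(lower), int(upper), text[0] == "[", text[-1] == "]")

    def contains(self, value):
        above = value >= self.lower if self.lower_inc else value > self.lower
        below = value <= self.upper if self.upper_inc else value < self.upper
        return above and below

    def canonical(self):
        """Rewrite the range as half-open integer bounds (start, stop)."""
        start = self.lower if self.lower_inc else self.lower + 1
        stop = self.upper + 1 if self.upper_inc else self.upper
        return start, stop

    def overlaps(self, other):
        a_start, a_stop = self.canonical()
        b_start, b_stop = other.canonical()
        if a_start >= a_stop or b_start >= b_stop:
            return False  # an empty range overlaps nothing
        return a_start < b_stop and b_start < a_stop
```

With this model, `IntRange.parse("[4,9)")` contains 4 and 8 but not 9, and it does not overlap `[9,12)` because the shared endpoint 9 is excluded from the first range.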
New types of almost all objects inside the database can be created, including:
Tables can be set to inherit their characteristics from a "parent" table. Data in child tables will appear to exist in the parent tables, unless data is selected from the parent table using the ONLY keyword, i.e. SELECT * FROM ONLY parent_table. Adding a column in the parent table will cause that column to appear in the child table.
Inheritance can be used to implement table partitioning, using either triggers or rules to direct inserts to the parent table into the proper child tables.
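The effect of inheritance on queries, and the way a partitioning trigger or rule redirects inserts, can be sketched in Python. This is a toy model, not how PostgreSQL stores data: `ToyTable`, `insert_routed` and the table names are hypothetical, and `select(only=True)` stands in for the ONLY keyword.

```python
class ToyTable:
    def __init__(self, name, parent=None):
        self.name = name
        self.rows = []
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def select(self, only=False):
        """Return this table's rows, plus rows of all descendants unless only=True."""
        rows = list(self.rows)
        if not only:
            for child in self.children:
                rows.extend(child.select())
        return rows


def insert_routed(parent, row, pick_child):
    """Mimic a partitioning trigger: store the row in the child chosen by pick_child.

    If pick_child returns None, the row stays in the parent table.
    """
    target = pick_child(row) or parent
    target.rows.append(row)
    return target


# Hypothetical partitioning of measurements by year.
measurements = ToyTable("measurements")
by_year = {
    2013: ToyTable("measurements_2013", parent=measurements),
    2014: ToyTable("measurements_2014", parent=measurements),
}
insert_routed(measurements, {"year": 2014, "value": 7}, lambda r: by_year.get(r["year"]))
```

After the routed insert, `measurements.select()` returns the row through the parent, whereas `measurements.select(only=True)` returns nothing because the row physically lives in the 2014 child.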
As of 2010, this feature is not yet fully supported; in particular, not all table constraints are inheritable. Check constraints and not-null constraints on a parent table are automatically inherited by its children, but other types of constraints (unique, primary key, and foreign key constraints) are not inherited.
Inheritance provides a way to map the features of generalization hierarchies depicted in Entity Relationship Diagrams (ERD) directly into the PostgreSQL database.
Rules allow the "query tree" of an incoming query to be rewritten. Rules, or more properly, "Query Re-Write Rules", are attached to a table/class and "Re-Write" the incoming DML (select, insert, update, and/or delete) into one or more queries that either replace the original DML statement or execute in addition to it. Query Re-Write occurs after DML statement parsing, but before query planning.
PostgreSQL, beginning from version 9.0, includes built-in binary replication, based on shipping the changes (write-ahead logs) to slave systems asynchronously.
Version 9.0 also introduced the ability to run read-only queries against these replicated slaves, where earlier versions would only allow that after promoting them to be a new master. This allows splitting read traffic among multiple nodes efficiently. Earlier replication software that allowed similar read scaling normally relied on adding replication triggers to the master, introducing additional load onto it.
Beginning with version 9.1, PostgreSQL also includes built-in synchronous replication, which ensures that, for each write transaction, the master waits until at least one slave node has written the data to its transaction log. Unlike in some other database systems, the durability of a transaction (whether it is asynchronous or synchronous) can be specified per database, per user, per session or even per transaction. This is useful for workloads that do not require such guarantees: synchronous commit may not be wanted for all data, because waiting for confirmation that the transaction has reached the synchronous standby has some negative effect on performance.
There can be a mixture of synchronous and asynchronous standby servers. A list of synchronous standby servers can be specified in the configuration which determines which servers are candidates for synchronous replication. The first in the list which is currently connected and actively streaming is the one that will be used as the current synchronous server. When this fails, it falls to the next in line.
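The selection rule for the current synchronous standby can be written down in a few lines of Python. This is a sketch of the rule as described above, not server code: the function name and the standby names are hypothetical, and the configured list is assumed to correspond to the `synchronous_standby_names` setting.

```python
def current_sync_standby(configured_names, connected_streaming):
    """Return the first configured standby that is connected and streaming.

    configured_names: ordered list of candidate standby names (assumed to
        come from the server configuration).
    connected_streaming: set of standbys currently connected and streaming.
    """
    for name in configured_names:
        if name in connected_streaming:
            return name
    return None  # no synchronous standby is currently available


# Hypothetical cluster with three candidates, listed in priority order.
candidates = ["standby_a", "standby_b", "standby_c"]
print(current_sync_standby(candidates, {"standby_a", "standby_c"}))
print(current_sync_standby(candidates, {"standby_c"}))
```

When "standby_a" disconnects, the role falls to the next connected name in the list; standbys that are connected but not chosen keep replicating asynchronously.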
Synchronous multi-master replication is currently not included in the PostgreSQL core. Postgres-XC, which is based on PostgreSQL, provides scalable synchronous multi-master replication and is licensed under the BSD license.
The community has also written some tools to make managing replication clusters easier, such as repmgr.
There are also several asynchronous trigger-based replication packages for PostgreSQL. These remain useful even after introduction of the expanded core capabilities, for situations where binary replication of an entire database cluster isn't the appropriate approach:
PostgreSQL provides an asynchronous messaging system that is accessed through the NOTIFY, LISTEN and UNLISTEN commands. A session can issue a NOTIFY command, along with the user-specified channel and an optional payload, to mark a particular event occurring. Other sessions are able to detect these events by issuing a LISTEN command, which can listen to a particular channel. This functionality can be used for a wide variety of purposes, such as letting other sessions know when a table has been updated or letting separate applications detect when a particular action has been performed. Such a system removes the need for applications to poll continuously to see whether anything has changed, and so reduces unnecessary overhead. Notifications are fully transactional, in that messages are not sent until the transaction they were sent from is committed. This eliminates the problem of messages being sent for an action which is then rolled back.
Many of the connectors for PostgreSQL provide support for this notification system (including libpq, JDBC, Npgsql, psycopg and node.js) so it can be used by external applications.
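The transactional behaviour of notifications can be illustrated with a toy Python model. It is a sketch only, with hypothetical class names (`ToyNotifier`, `ToyTransaction`); it models just the rule that queued notifications are delivered at commit and discarded on rollback, and does not talk to a database.

```python
class ToyNotifier:
    def __init__(self):
        self.listeners = {}  # channel -> list of inboxes (plain lists)

    def listen(self, channel):
        """Register a listener and return the inbox that receives its messages."""
        inbox = []
        self.listeners.setdefault(channel, []).append(inbox)
        return inbox

    def begin(self):
        return ToyTransaction(self)

    def deliver(self, channel, payload):
        for inbox in self.listeners.get(channel, []):
            inbox.append((channel, payload))


class ToyTransaction:
    def __init__(self, notifier):
        self.notifier = notifier
        self.pending = []  # notifications queued until commit

    def notify(self, channel, payload=""):
        self.pending.append((channel, payload))

    def commit(self):
        # Only now do listeners see the queued notifications.
        for channel, payload in self.pending:
            self.notifier.deliver(channel, payload)
        self.pending.clear()

    def rollback(self):
        # A rolled-back transaction never notifies anyone.
        self.pending.clear()
```

A listener's inbox stays empty while the notifying transaction is still open, receives the message on commit, and receives nothing at all if the transaction is rolled back.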
Security within the database is managed on a per-role basis. A role is generally regarded to be a user (a role that can log in), or a group (a role which other roles are members of). Permissions can be granted or revoked on any object down to the column level, and can also allow/prevent the creation of new objects at the database, schema or table levels.
The sepgsql extension (provided with PostgreSQL as of version 9.1) provides an additional layer of security by integrating with SELinux. This utilises PostgreSQL's SECURITY LABEL feature.
PostgreSQL natively supports a broad number of authentication mechanisms including:
The GSSAPI, SSPI, Kerberos, peer, ident and certificate methods can also use a specified "map" file that lists which users matched by that authentication system are allowed to connect as a specific database user.
These methods are specified in the cluster's host-based authentication configuration file (pg_hba.conf), which determines what connections are allowed. This allows control over which user can connect to which database, where they can connect from (IP address/IP address range/domain socket), which authentication system will be enforced, and whether the connection must use SSL.
As of version 9.1, PostgreSQL can link to other systems to retrieve data via foreign data wrappers (FDWs). These can take the form of any data source, such as a file system, another RDBMS, or a web service. This means regular database queries can use these data sources like regular tables, and even join multiple data sources together.
In order of commit:
PostgreSQL has several forms of interface available and is also widely supported among programming language libraries. These include:
PostgreSQL is available for the following operating systems: Linux (all recent distributions), Windows (Windows 2000 SP4 and later), DragonFly BSD, FreeBSD, OpenBSD, NetBSD, Mac OS X, AIX, BSD/OS, HP-UX, IRIX, OpenIndiana, OpenSolaris, SCO OpenServer, SCO UnixWare, Solaris and Tru64 Unix. As of 2012, support for the following obsolete systems was removed: DG/UX, NeXTSTEP, SunOS 4, SVR4, Ultrix 4, and Univel. Most other Unix-like systems should also work.
PostgreSQL works on any of the following instruction set architectures: x86 and x86-64 on Windows and other operating systems; and, on systems other than Windows, IA-64 Itanium, PowerPC, PowerPC 64, S/390, S/390x, SPARC, SPARC 64, Alpha, ARMv8-A (64-bit) and older ARM (32-bit), MIPS, MIPSel, M68k, and PA-RISC. It is also known to work on M32R, NS32k, and VAX. In addition to these, it is possible to build PostgreSQL for an unsupported CPU by disabling spinlocks.
The primary front-end for PostgreSQL is the psql command-line program, which can be used to enter SQL queries directly, or execute them from a file. In addition, psql provides a number of meta-commands and various shell-like features to facilitate writing scripts and automating a wide variety of tasks; for example tab completion of object names and SQL syntax.
The pgAdmin package is a free and open source graphical user interface administration tool for PostgreSQL, which is supported on many computer platforms. The program is available in more than a dozen languages. The first prototype, named pgManager, was written for PostgreSQL 6.3.2 from 1998, and rewritten and released as pgAdmin under the GPL License in later months. The second incarnation (named pgAdmin II) was a complete rewrite, first released on January 16, 2002. The third version, pgAdmin III, was originally released under the Artistic License and then released under the same license as PostgreSQL. Unlike prior versions that were written in Visual Basic, pgAdmin III is written in C++, using the wxWidgets framework allowing it to run on most common operating systems.
PostgreSQL Studio allows users to perform essential PostgreSQL database development tasks from a web-based console. PostgreSQL Studio allows users to work with cloud databases without the need to open firewalls.
The pgFouine PostgreSQL log analyzer generates detailed reports from a PostgreSQL log file and provides VACUUM analysis.
A number of companies offer proprietary tools for PostgreSQL. They often consist of a universal core that is adapted for various specific database products. These tools mostly share the administration features with the open source tools but offer improvements in data modeling, importing, exporting or reporting.
Many informal performance studies of PostgreSQL have been done. Performance improvements aimed at improving scalability started heavily with version 8.1. Simple benchmarks between version 8.0 and version 8.4 showed that the latter was more than 10 times faster on read-only workloads and at least 7.5 times faster on both read and write workloads.
The first industry-standard and peer-validated benchmark was completed in June 2007 using the Sun Java System Application Server (proprietary version of GlassFish) 9.0 Platform Edition, UltraSPARC T1-based Sun Fire server and Postgres 8.2. This result of 778.14 SPECjAppServer2004 JOPS@Standard compares favourably with the 874 JOPS@Standard with Oracle 10 on an Itanium-based HP-UX system.
In August 2007, Sun submitted an improved benchmark score of 813.73 SPECjAppServer2004 JOPS@Standard. With the system under test at a reduced price, the price/performance improved from $US 84.98/JOPS to $US 70.57/JOPS.
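The size of the improvement between the two submissions follows directly from the figures quoted above. The short Python calculation below uses only those four numbers; the helper name `relative_change` is just for illustration.

```python
def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


# Figures quoted in the text: JOPS@Standard and US$ per JOPS.
throughput_gain = relative_change(778.14, 813.73)
cost_change = relative_change(84.98, 70.57)

print(f"Throughput: {throughput_gain:+.1%}")
print(f"Cost per JOPS: {cost_change:+.1%}")
```

The throughput rose by roughly 4.6 percent, while the cost per JOPS fell by roughly 17 percent, so most of the price/performance gain came from the cheaper system rather than from the higher score.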
The default configuration of PostgreSQL uses only a small amount of dedicated memory for performance-critical purposes such as caching database blocks and sorting. This limitation is primarily because older operating systems required kernel changes to allow allocating large blocks of shared memory. PostgreSQL.org provides advice on basic recommended performance practice in a wiki.
In April 2012, Robert Haas of EnterpriseDB demonstrated PostgreSQL 9.2's linear CPU scalability using a server with 64 cores.
Heroku, a platform as a service provider, has supported PostgreSQL since the start in 2007. They offer value-add features like full database "roll-back" (ability to restore a database from any point in time), which is based on WAL-E, open source software developed by Heroku.
In January 2012 EnterpriseDB released a cloud version of both PostgreSQL and their own proprietary Postgres Plus Advanced Server with automated provisioning for failover, replication, load-balancing, and scaling. It runs on Amazon Web Services.
Although the license allowed proprietary products based on Postgres, the code did not develop in the proprietary space at first. The first main offshoot originated when Paula Hawthorn (an original Ingres team member who moved from Ingres) and Michael Stonebraker formed Illustra Information Technologies to make a proprietary product based on Postgres.
In 2000, former Red Hat investors created the company Great Bridge to make a proprietary product based on PostgreSQL and compete against proprietary database vendors. Great Bridge sponsored several PostgreSQL developers and donated many resources back to the community, but by late 2001 closed due to tough competition from companies like Red Hat and to poor market conditions.
In 2001 Command Prompt, Inc. released Mammoth PostgreSQL, a proprietary product based on PostgreSQL. In 2008 Command Prompt, Inc. released the source under the original license. Command Prompt, Inc. continues to support the PostgreSQL community actively through developer sponsorships and projects including PL/Perl, PL/php, and hosting of community projects such as the PostgreSQL build farm.
In January 2005, PostgreSQL received backing by database vendor Pervasive Software, known for its Btrieve product which was ubiquitous on the Novell NetWare platform. Pervasive announced commercial support and community participation and achieved some success. In July 2006, Pervasive left the PostgreSQL support market.
In mid-2005 two other companies announced plans to make proprietary products based on PostgreSQL with focus on separate niche markets. EnterpriseDB added functionality to allow applications written to work with Oracle to be more readily run with PostgreSQL. Greenplum contributed enhancements directed at data warehouse and business intelligence applications, including the BizGres project.
In October 2005 John Loiacono, executive vice president of software at Sun Microsystems, commented: "We're not going to OEM Microsoft but we are looking at PostgreSQL right now," although no specifics were released at that time. By November 2005 Sun had announced support for PostgreSQL. By June 2006 Sun Solaris 10 (6/06 release) shipped with PostgreSQL.
In August 2007, EnterpriseDB announced EnterpriseDB Postgres, a pre-configured distribution of PostgreSQL including many contrib modules and add-on components. EnterpriseDB Postgres was renamed to Postgres Plus in March 2008. Postgres Plus is available in two versions: Postgres Plus Solution Pack (comprising PostgreSQL delivered in a GUI one-click install plus Solution Pack components that include Postgres Enterprise Manager, Update Monitor, xDB Replication Server, SQL Profiler, SQL Protect, Migration Toolkit and PL/Secure), and Postgres Plus Advanced Server, which has all the features of Postgres Plus Solution Pack plus Oracle compatibility, as well as performance and advanced security features not available in PostgreSQL. Both versions are available for download at no cost and are fully supported. The Solution Pack components and Advanced Server are restricted by a "limited use" license for evaluation purposes only unless purchased through a subscription. In 2011, EnterpriseDB announced Postgres Plus Cloud Database, which provisions PostgreSQL and Postgres Plus Advanced Server databases (with Oracle compatibility) in single instances, high availability clusters, or development sandboxes for Database-as-a-Service environments.
In 2011, 2ndQuadrant became a Platinum Sponsor of PostgreSQL, in recognition of their long-standing contributions and developer sponsorship. 2ndQuadrant employ one of the largest teams of PostgreSQL contributors and provide professional support for open source PostgreSQL.
Many other companies, such as Truviso, Netezza and ParAccel, have used PostgreSQL as the base for their proprietary database projects. In many cases the products have been enhanced so much that the software has been forked, though with some features cherry-picked from later releases.
|Release||First release||Latest minor version||Latest release||Milestones|
|0.01||1995-05-01||0.03||1995-07-21||Initial release as Postgres95|
|1.0||1995-09-05||1.09||1996-11-04||Changed copyright to a more liberal license|
|6.0||1997-01-29||—||—||Name change from Postgres95 to PostgreSQL, unique indexes, pg_dumpall utility, ident authentication.|
|6.1||1997-06-08||6.1.1||1997-07-22||Multi-column indexes, sequences, money data type, GEQO (GEnetic Query Optimizer).|
|6.2||1997-10-02||6.2.1||1997-10-17||JDBC interface, triggers, server programming interface, constraints.|
|6.3||1998-03-01||6.3.2||1998-04-07||SQL92 subselect capability, PL/pgTCL|
|6.4||1998-10-30||6.4.2||1998-12-20||VIEWs and RULEs, PL/pgSQL|
|6.5||1999-06-09||6.5.3||1999-10-13||MVCC, temporary tables, more SQL statement support (CASE, INTERSECT, and EXCEPT)|
|7.0||2000-05-08||7.0.3||2000-11-11||Foreign keys, SQL92 syntax for joins|
|7.1||2001-04-13||7.1.3||2001-08-15||Write-ahead log, Outer joins|
|7.2||2002-02-04||7.2.8||2005-05-09||PL/Python, OIDs no longer required, internationalization of messages|
|8.0||2005-01-19||8.0.26||2010-10-04||Native server on Microsoft Windows, savepoints, tablespaces, exception handling in functions, point-in-time recovery|
|8.1||2005-11-08||8.1.23||2010-12-16||Performance optimization, two-phase commit, table partitioning, index bitmap scan, shared row locking, roles|
|8.2||2006-12-05||8.2.23||2011-09-26||Performance optimization, online index builds, advisory locks, warm standby|
|8.3||2008-02-04||8.3.23||2013-02-07||Heap-only tuples, full text search, SQL/XML, ENUM types, UUID types|
|8.4||2009-07-01||8.4.20||2014-02-20||Windowing functions, default and variadic parameters for functions, column-level permissions, parallel database restore, per-database collation, common table expressions and recursive queries|
|9.0||2010-09-20||9.0.16||2014-02-20||Built-in binary streaming replication, Hot standby, 64-bit Windows, per-column triggers and conditional trigger execution, exclusion constraints, anonymous code blocks, named parameters, password rules|
|9.1||2011-09-12||9.1.12||2014-02-20||Synchronous replication, per-column collations, unlogged tables, k-nearest-neighbor indexing, serializable snapshot isolation, writeable common table expressions, SE-Linux integration, extensions, SQL/MED attached tables (Foreign Data Wrappers), triggers on views|
|9.2||2012-09-10||9.2.7||2014-02-20||Cascading streaming replication, index-only scans, native JSON support, improved lock management, range types, pg_receivexlog tool, space-partitioned GiST indexes|
|9.3||2013-09-09||9.3.3||2014-02-20||Custom background workers, data checksums, dedicated JSON operators, LATERAL JOIN, faster pg_dump, new pg_isready server monitoring tool, trigger features, view features, writeable foreign tables, materialized views, replication improvements|