From Wikipedia, the free encyclopedia - View original article
In computing, ODBC (Open Database Connectivity) is a standard programming language middleware API for accessing database management systems (DBMS). The designers of ODBC aimed to make it independent of database systems and operating systems; an application written using ODBC can be ported to other platforms, both on the client and server side, with few changes to the data access code.
ODBC accomplishes DBMS independence by using an ODBC driver as a translation layer between the application and the DBMS. The application uses ODBC functions through an ODBC driver manager with which it is linked, and the driver passes the query to the DBMS. An ODBC driver can be thought of as analogous to a printer or other driver, providing a standard set of functions for the application to use, and implementing DBMS-specific functionality. An application that can use ODBC is referred to as "ODBC-compliant". Any ODBC-compliant application can access any DBMS for which a driver is installed. Drivers exist for all major DBMSs, many other data sources like address book systems and Microsoft Excel, and even for text or CSV files.
ODBC was originally developed by Microsoft during the early 1990s, and became the basis for the Call Level Interface (CLI) standardized by SQL Access Group in the Unix and mainframe world. ODBC retained a number of features that were removed as part of the CLI effort. Full ODBC was later ported back to those platforms, and became a de facto standard considerably better known than CLI. The CLI remains similar to ODBC, and applications can be ported from one platform to the other with few changes.
The introduction of the mainframe-based relational database during the 1970s led to a proliferation of data access methods. Generally these systems operated hand-in-hand with a simple command processor that allowed the user to type in English-like commands, and receive output. The best-known examples are SEQUEL from IBM and QUEL from the Ingres project. These systems may or may not allow other applications to access the data directly, and those that did used a wide variety of methodologies. The introduction of SQL aimed to solve the problem of language standardization, although substantial differences in implementation remained.
Additionally, since the SQL language had only rudimentary programming features, it was often desired to use SQL within a program written in another language, say Fortran or C. This led to the concept of Embedded SQL, which allowed SQL code to be "embedded" within another language. For instance, a SQL statement like
SELECT * FROM city could be inserted as text within C source code, and during compilation it would be converted into a custom format that directly called a function within a library that would pass the statement into the SQL system. Results returned from the statements would be interpreted back into C data formats like
char * using similar library code.
There were a number of problems with the Embedded SQL approach. Like the different varieties of SQL, the Embedded SQL's that used them varied widely, not only from platform to platform, but even across languages on a single platform - a system that allowed calls into IBM's DB2 would look entirely different than one that called into their own SQL/DS[dubious ]. Another key problem to the Embedded SQL concept was that the SQL code could only be changed in the program's source code, so that even small changes to the query required considerable programmer effort to modify. The SQL market referred to this as "static SQL", as opposed to "dynamic SQL" which could be changed at any time - like the command-line interfaces that shipped with almost all SQL systems, or a programming interface that left the SQL as plain text until it was called. Dynamic SQL systems became a major focus for SQL vendors during the 1980s.
Older mainframe databases, and the newer microcomputer based systems that were based on them, generally did not have a SQL-like command processor between the user and the database engine. Instead, the data was accessed directly by the program - a programming library in the case of large mainframe systems, or a command line interface or interactive forms system in the case of dBASE and similar applications. Data from dBASE could not generally be accessed directly by other programs running on the machine. Those programs may be given a way to access this data, often through libraries, but it would not work with any other database engine, or even different databases in the same engine. In effect, all such systems were static, which presented considerable problems.
By the mid-1980s the rapid improvement in microcomputers, and especially the introduction of the graphical user interface and data-rich application programs like Lotus 1-2-3 led to an increasing interest in using personal computers as the client-side platform of choice in client-server computing. Under this model, large mainframes and minicomputers would be used primarily to serve up data over local area networks to microcomputers that would interpret, display and manipulate that data. For this model to work, a data access standard was a requirement - in the mainframe world it was highly likely that all of the computers in a shop were from a single vendor and clients were computer terminals talking directly to them, but in the micro world there was no such standardization and any client might access any server using any networking system.
By the late 1980s there were a number of efforts underway to provide an abstraction layer for this purpose. Some of these were mainframe related, designed to allow programs running on those machines to translate between the variety of SQL's and provide a single common interface which could then be called by other mainframe or microcomputer programs. These solutions included IBM's Distributed Relational Database Architecture (DRDA) and Apple Computer's Data Access Language. Much more common, however, were systems that ran entirely on microcomputers, including a complete protocol stack that included any required networking or file translation support.
One of the early examples of such a system was Lotus Development's DataLens, initially known as Blueprint. Blueprint, developed for 1-2-3, supported a variety of data sources, including SQL/DS, DB2, FOCUS and a variety of similar mainframe systems, as well as microcomputer systems like dBase and the early Microsoft/Ashton-Tate efforts that would eventually develop into Microsoft SQL Server. Unlike the later ODBC, Blueprint was a purely code-based system, lacking anything approximating a command language like SQL. Instead, programmers used data structures to store the query information, constructing a query by linking many of these structures together. Lotus referred to these compound structures as "query trees".
Around the same time, an industry team including members from Sybase, Tandem Computers and Microsoft were working on a standardized dynamic SQL concept. Much of the system was based on Sybase's DB-Library system, with the Sybase-specific sections removed and several additions to support other platforms. DB-Library was aided by an industry-wide move from library systems that were tightly linked to a particular language, to library systems that were provided by the operating system and required the languages on that platform to conform to its standards. This meant that a single library could be used with (potentially) any programming language on a given platform.
The first draft of the Microsoft Data Access API was published in April 1989, about the same time as Lotus' announcement of Blueprint. In spite of Blueprint's great lead - it was running when MSDA was still a paper project - Lotus eventually joined the MSDA efforts as it became clear that SQL would become the de facto database standard. After considerable industry input, in the summer of 1989 the standard became SQL Connectivity, or SQLC for short,.
In 1988 a number of vendors, mostly from the Unix and database communities, formed the SQL Access Group (SAG) in an effort to produce a single basic standard for the SQL language. At the first meeting there was considerable debate over whether or not the effort should work solely on the SQL language itself, or attempt a wider standardization which included a dynamic SQL language-embedding system as well, what they called a Call Level Interface (CLI). While attending the meeting with an early draft of what was then still known as MS Data Access, Kyle Geiger of Microsoft invited Jeff Balboni and Larry Barnes of Digital Equipment Corporation (DEC) to join the SQLC meetings as well. SQLC was a potential solution to the call for the CLI, which was being led by DEC.
The new SQLC "gang of four", MS, Lotus, DEC and Sybase, brought an updated version of SQLC to the next SAG meeting in June 1990. The SAG responded by opening the standard effort to any competing design, but of the many proposals, only Oracle Corp had a system that presented serious competition. In the end, SQLC won the votes and became the draft standard, but only after large portions of the API were removed - the standards document was trimmed from 120 pages to 50 during this time. It was also during this period that the name Call Level Interface was formally adopted. In 1995 SQL/CLI became part of the international SQL standard, ISO/IEC 9075-3. The SAG itself was taken over by the X/Open group in 1996, and, over time, became part of The Open Group's Common Application Environment.
MS continued working with the original SQLC standard, retaining many of the advanced features that were removed from the CLI version. These included features like scrollable cursors, and metadata information queries. The commands in the API were split into groups; the Core group was identical to the CLI, the Level 1 extensions were commands that would be easy to implement in drivers, while Level 2 commands contained the more advanced features like cursors. A proposed standard was released in December 1991, and industry input was gathered and worked into the system through 1992, resulting in yet another name change to ODBC.
During this time, Microsoft was in the midst of developing their Jet database system. Jet combined three primary subsystems; an ISAM-based database engine (also known as "Jet", confusingly), a C-based interface allowing applications to access that data, and a selection of driver DLLs that allowed the same C interface to redirect input and output to other ISAM-based databases, like Paradox and xBase. Jet allowed programmers to use a single set of calls to access common microcomputer databases in a fashion similar to Blueprint (by this point known as DataLens). However, Jet did not use SQL; like DataLens, the interface was in C and consisted of data structures and function calls.
The SAG standardization efforts presented an opportunity for Microsoft to adapt their Jet system to the new CLI standard. This would not only make Windows a premier platform for CLI development, but also allow users to use SQL to access both Jet and other databases as well. What was missing was the SQL parser that could convert those calls from their text form into the C-interface used in Jet. To solve this, MS partnered with PageAhead Software to use their existing query processor, "SIMBA". SIMBA was used as a parser above Jet's C library, turning Jet into an SQL database. And because Jet could forward those C-based calls to other databases, this also allowed SIMBA to query other systems. Microsoft included drivers for Excel to turn its spreadsheet documents into SQL-accessible database tables.
ODBC 1.0 was released in September 1992. At the time, there was little direct support for SQL databases (as opposed to ISAM), and early drivers were noted for poor performance. Some of this was unavoidable due to the path that the calls took through the Jet-based stack; ODBC calls to SQL databases were first converted from SIMBA's SQL dialect to Jet's internal C-based format, then passed to a driver for conversion back into SQL calls for the database. Digital Equipment and Oracle both contracted Simba to develop drivers for their databases as well.
Meanwhile the CLI standard effort dragged on, and it was not until March 1995 that the definitive version was finalized. By this time Microsoft had already granted Visigenic Software a source code license to develop ODBC on non-Windows platforms. Visigenic ported ODBC to a wide variety of Unix platforms, where ODBC quickly became the de facto standard. "Real" CLI is rare today. The two systems remain similar, and many applications can be ported from ODBC to CLI with few or no changes.
Over time, database vendors took over the driver interfaces and provided direct links to their products. Skipping the intermediate conversions to and from Jet or similar wrappers often resulted in higher performance. However, by this time Microsoft had changed focus to their OLE DB concept, which provided direct access to a wider variety of data sources from address books to text files. Several new systems followed which further turned their attention from ODBC, including DAO, ADO and ADO.net, which interacted more or less with ODBC over their lifetimes.
As Microsoft turned its attention away from working directly on ODBC, the Unix world was increasingly embracing it. This was propelled by two changes within the market, the introduction of GUIs like GNOME that provided the need for access to these sources in non-text form, and the emergence of open software database systems like PostgreSQL and MySQL, initially under Unix. The later adoption of ODBC by Apple for Mac OS X 10.4 using the standard Unix-side iODBC package further cemented ODBC as the standard for cross-platform data access.
Sun Microsystems used the ODBC system as the basis for their own open standard, JDBC. In most ways, JDBC can be considered a version of ODBC for the Java programming language as opposed to C. JDBC-to-ODBC "bridges" allow JDBC programs to access data sources through ODBC drivers on platforms lacking a native JDBC driver, although these are now relatively rare.
ODBC remains largely universal today, with drivers available for most platforms and most databases. It is not uncommon to find ODBC drivers for database engines that are meant to be embedded, like SQLite, as a way to allow existing tools to act as front-ends to these engines for testing and debugging.
However, the rise of thin client computing using HTML as an intermediate format has reduced the need for ODBC. Many web development platforms contain direct links to target databases - MySQL being particularly common. In these scenarios, there is no direct client-side access nor multiple client software systems to support, everything goes through the programmer-supplied HTML application. The virtualization that ODBC offers is no longer a strong requirement, and development of ODBC is no longer as active as it once was.
ODBC is based on the device driver model, where the driver encapsulates the logic needed to convert a standard set of commands and functions into the specific calls required by the underlying system. For instance, a printer driver presents a standard set of printing commands, the API, to applications using the printing system. Calls made to those APIs are converted by the driver into the format used by the actual hardware, say PostScript or PCL.
In the case of ODBC, the drivers encapsulate a number of functions that can be broken down into several broad categories. One set of functions is primarily concerned with finding, connecting to and disconnecting from the DBMS that driver talks to. A second set is used to send SQL commands from the ODBC system to the DBMS, converting or interpreting any commands that are not supported internally. For instance, a DBMS that does not support cursors can emulate this functionality in the driver. Finally, another set of commands, mostly used internally, is used to convert data from the DBMS's internal formats to a set of standardized ODBC formats, which are based on the C language formats.
An ODBC driver enables an ODBC-compliant application to use a data source, normally a DBMS. Some non-DBMS drivers exist, for such data sources as CSV files, by implementing a small DBMS inside the driver itself. ODBC drivers exist for most DBMSs, including Oracle, PostgreSQL, MySQL, Microsoft SQL Server (but not for the Compact aka CE edition), Sybase ASE, and DB2. Because different technologies have different capabilities, most ODBC drivers do not implement all functionality defined in the ODBC standard. Some drivers offer extra functionality not defined by the standard.
Device drivers are normally enumerated, set up and managed by a separate Manager layer, which may provide additional functionality. For instance, printing systems often include functionality to provide spooling functionality on top of the drivers, providing print spooling for any supported printer.
In ODBC the Driver Manager (DM) provides these features. The DM can enumerate the installed drivers and present this as a list, often in a GUI-based form.
But more important to the operation of the ODBC system is the DM's concept of Data Source Names, or DSN. DSNs collect additional information needed to connect to a particular data source, as opposed to the DBMS itself. For instance, the same MySQL driver can be used to connect to any MySQL server, but the connection information to connect to a local private server is different than the information needed to connect to an internet-hosted public server. The DSN stores this information in a standardized format, and the DM provides this to the driver during connection requests. The DM also includes functionality to present a list of DSNs using human readable names, and to select them at run-time to connect to different resources.
The DM also includes the ability to save partially complete DSN's, with code and logic to ask the user for any missing information at runtime. For instance, a DSN can be created without a required password. When an ODBC application attempts to connect to the DBMS using this DSN, the system will pause and ask the user to provide the password before continuing. This frees the application developer from having to create this sort of code, as well as having to know which questions to ask. All of this is included in the driver and the DSNs.
A bridge is a special kind of driver: a driver that uses another driver-based technology.
A JDBC-ODBC bridge consists of a JDBC driver which employs an ODBC driver to connect to a target database. This driver translates JDBC method calls into ODBC function calls. Programmers usually use such a bridge when a particular database lacks a JDBC driver. Sun Microsystems included one such bridge in the JVM, but viewed it as a stop-gap measure while few JDBC drivers existed. Sun never intended its bridge for production environments, and generally recommends against its use. As of 2008[update] independent data-access vendors deliver JDBC-ODBC bridges which support current standards for both mechanisms, and which far outperform the JVM built-in.
An ODBC-JDBC bridge consists of an ODBC driver which uses the services of a JDBC driver to connect to a database. This driver translates ODBC function-calls into JDBC method-calls. Programmers usually use such a bridge when they lack an ODBC driver for a particular database but have access to a JDBC driver.
Microsoft provides an OLE DB-ODBC bridge for simplifying development in COM aware languages (e.g. Visual Basic). This bridge forms part of the MDAC system component bundle, together with other database drivers.