From Wikipedia, the free encyclopedia - View original article
In software engineering, an entity–relationship model (ER model) is a data model for describing the data or information aspects of a business domain or its process requirements, in an abstract way that lends itself to ultimately being implemented in a database such as a relational database. The main components of ER models are entities (things) and the relationships that can exist among them, and databases.
Entity–relationship modeling was developed by Peter Chen and published in a 1976 paper. However, variants of the idea existed previously, and have been devised subsequently such as supertype and subtype data entities and commonality relationships.
An entity-relationship model is a systematic way of describing and defining a business process. The process is modeled as components (entities) that are linked with each other by relationships that express the dependencies and requirements between them, such as: one building may be divided into zero or more apartments, but one apartment can only be located in one building. Entities may have various properties (attributes) that characterize them. Diagrams created to represent these entities, attributes, and relationships graphically are called entity–relationship diagrams.
An ER model is typically implemented as a database. In the case of a relational database, which stores data in tables, every row of each table represents one instance of an entity. Some data fields in these tables point to indexes in other tables; such pointers represent the relationships.
The first stage of information system design uses these models during the requirements analysis to describe information needs or the type of information that is to be stored in a database. The data modeling technique can be used to describe any ontology (i.e. an overview and classifications of used terms and their relationships) for a certain area of interest. In the case of the design of an information system that is based on a database, the conceptual data model is, at a later stage (usually called logical design), mapped to a logical data model, such as the relational model; this in turn is mapped to a physical model during physical design. Note that sometimes, both of these phases are referred to as "physical design". It is also used in database management system.
An entity may be defined as a thing capable of an independent existence that can be uniquely identified. An entity is an abstraction from the complexities of a domain. When we speak of an entity, we normally speak of some aspect of the real world that can be distinguished from other aspects of the real world.
An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. Although the term entity is the one most commonly used, following Chen we should really distinguish between an entity and an entity-type. An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for this term.
Entities can be thought of as nouns. Examples: a computer, an employee, a song, a mathematical theorem.
A relationship captures how entities are related to one another. Relationships can be thought of as verbs, linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem.
The model's linguistic aspect described above is utilized in the declarative database query language ERROL, which mimics natural language constructs. ERROL's semantics and implementation are based on reshaped relational algebra (RRA), a relational algebra that is adapted to the entity–relationship model and captures its linguistic aspect.
Entities and relationships can both have attributes. Examples: an employee entity might have a Social Security Number (SSN) attribute; the proved relationship may have a date attribute.
Entity–relationship diagrams don't show single entities or single instances of relations. Rather, they show entity sets and relationship sets. Example: a particular song is an entity. The collection of all songs in a database is an entity set. The eaten relationship between a child and her lunch is a single relationship. The set of all such child-lunch relationships in a database is a relationship set. In other words, a relationship set corresponds to a relation in mathematics, while a relationship corresponds to a member of the relation.
Certain cardinality constraints on relationship sets may be indicated as well.
Chen proposed the following "rules of thumb" for mapping natural language descriptions into ER diagrams:
|English grammar structure||ER structure|
|Common noun||Entity type|
|Transitive verb||Relationship type|
|Intransitive verb||Attribute type|
|Adjective||Attribute for entity|
|Adverb||Attribute for relationship|
Physical view show how data is actually stored.
In Chen's original paper he gives an example of a relationship and its roles. He describes a relationship "marriage" and its two roles "husband" and "wife".
A person plays the role of husband in a marriage (relationship) and another person plays the role of wife in the (same) marriage. These words are nouns. That is no surprise; naming things requires a noun.
However as is quite usual with new ideas, many eagerly appropriated the new terminology but then applied it to their own old ideas. Thus the lines, arrows and crows-feet of their diagrams owed more to the earlier Bachman diagrams than to Chen's relationship diamonds. And they similarly misunderstood other important concepts.
In particular, it became fashionable (now almost to the point of exclusivity) to "name" relationships and roles as verbs or phrases.
It has also become prevalent to name roles with phrases such as is the owner of and is owned by. Correct nouns in this case are owner and possession. Thus person plays the role of owner and car plays the role of possession rather than person plays the role of, is the owner of, etc.
The use of nouns has direct benefit when generating physical implementations from semantic models. When a person has two relationships with car then it is possible to generate names such as owner_person and driver_person, which are immediately meaningful.
Modifications to the original specification can be beneficial. Chen described look-across cardinalities. As an aside, the Barker-Ellis notation, used in Oracle Designer, uses same-side for minimum cardinality (analogous to optionality) and role, but look-across for maximum cardinality (the crows foot).[clarification needed]
In Merise, Elmasri & Navathe and others there is a preference for same-side for roles and both minimum and maximum cardinalities. Recent researchers (Feinerer, Dullea et al.) have shown that this is more coherent when applied to n-ary relationships of order > 2.
In Dullea et al. one reads "A 'look across' notation such as used in the UML does not effectively represent the semantics of participation constraints imposed on relationships where the degree is higher than binary."
In Feinerer it says "Problems arise if we operate under the look-across semantics as used for UML associations. Hartmann investigates this situation and shows how and why different transformations fail." (Although the "reduction" mentioned is spurious as the two diagrams 3.4 and 3.5 are in fact the same) and also "As we will see on the next few pages, the look-across interpretation introduces several difficulties that prevent the extension of simple mechanisms from binary to n-ary associations."
Chen's notation for entity–relationship modeling uses rectangles to represent entity sets, and diamonds to represent relationships appropriate for first-class objects: they can have attributes and relationships of their own. If an entity set participates in a relationship set, they are connected with a line.
Attributes are drawn as ovals and are connected with a line to exactly one entity or relationship set.
Cardinality constraints are expressed as follows:
Attributes are often omitted as they can clutter up a diagram; other diagram techniques often list entity attributes within the rectangles drawn for entity sets.
Related diagramming convention techniques:
Crow's foot notation is used in Barker's Notation, Structured Systems Analysis and Design Method (SSADM) and information engineering. Crow's foot diagrams represent entities as boxes, and relationships as lines between the boxes. Different shapes at the ends of these lines represent the cardinality of the relationship.
Crow's foot notation was used in the consultancy practice CACI. Many of the consultants at CACI (including Richard Barker) subsequently moved to Oracle UK, where they developed the early versions of Oracle's CASE tools, introducing the notation to a wider audience. The following tools use Crow's foot notation: ARIS, System Architect, Visio, PowerDesigner, Toad Data Modeler, DeZign for Databases, Devgems Data Modeler, OmniGraffle, MySQL Workbench and SQL Developer Data Modeler. CA's ICASE tool, CA Gen aka Information Engineering Facility also uses this notation. Historically XA Systems Silverrun-LDM (logical data model) also supported this notation.
There are many ER diagramming tools. Free software ER diagramming tools that can interpret and generate ER models and SQL and do database analysis are MySQL Workbench (formerly DBDesigner), and Open ModelSphere (open-source). A freeware ER tool that can generate database and application layer code (webservices) is the RISE Editor. SQL Power Architect while proprietary also has a free community edition.
Proprietary ER diagramming tools are Avolution, ER/Studio, ERwin, DeZign for Databases, MagicDraw, MEGA International, ModelRight, Navicat Data Modeler, OmniGraffle, Oracle Designer, PowerDesigner, Prosa Structured Analysis Tool, Rational Rose, Software Ideas Modeler, Sparx Enterprise Architect, SQLyog, System Architect, Toad Data Modeler, and Visual Paradigm.
Free software diagram tools just draw the shapes without having any knowledge of what they mean, nor do they generate SQL. These include Creately, yEd, Calligra Flow, and Dia. LucidChart will generate an ERD from different schema types, but you can't generate SQL from an ERD.
Peter Chen, the father of ER modeling said in his seminal paper:
He is here in accord with philosophic and theoretical traditions from the time of the Ancient Greek philosophers: Socrates, Plato and Aristotle (428 BC) through to modern epistemology, semiotics and logic of Peirce, Frege and Russell. Plato himself associates knowledge with the apprehension of unchanging Forms (The forms, according to Socrates, are roughly speaking archetypes or abstract representations of the many types of things, and properties) and their relationships to one another. In his original 1976 article Chen explicitly contrasts entity–relationship diagrams with record modelling techniques:
An extensional model is one that maps to the elements of a particular methodology or technology, and is thus a "platform specific model". The UML specification explicitly states that associations in class models are extensional and this is in fact self-evident by considering the extensive array of additional "adornments" provided by the specification over and above those provided by any of the prior candidate "semantic modelling languages"."UML as a Data Modeling Notation, Part 2"
|Wikimedia Commons has media related to Entity-relationship models.|