From Wikipedia, the free encyclopedia
Data (DAY-tə or DA-tə, also DAH-tə) are tokens that can be interpreted as some kind of value, usually either as a quantitative measurement of, or a qualitative fact about, some thing. Data are manipulated either as values or variables by encoding them into information. Data that are derived through reason, or that are employed in the course of behaving, are collectively called knowledge.
In computing and data processing, data are often represented in a structure that is tabular (made up of rows and columns), a tree (a set of nodes with parent-child relationship), or a graph (a set of connected nodes). Data are typically the results of measurements, and can be visualised as graphs or images.
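The three structures named above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sample names, values, and helper functions (column, leaves, neighbours) are hypothetical and chosen only to show the shape of each structure.

```python
# Tabular: a list of rows; each key of a row is a column.
# The sample names and values are made up for illustration.
table = [
    {"sample": "a", "value": 1.5},
    {"sample": "b", "value": 2.0},
    {"sample": "c", "value": 0.5},
]

# Tree: every node has exactly one parent (except the root)
# and a list of children.
tree = {
    "label": "root",
    "children": [
        {"label": "left", "children": [
            {"label": "leaf-1", "children": []},
        ]},
        {"label": "right", "children": [
            {"label": "leaf-2", "children": []},
            {"label": "leaf-3", "children": []},
        ]},
    ],
}

# Graph: nodes connected by edges, stored here as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
}


def column(rows, key):
    """Return all values in one column of the table."""
    return [row[key] for row in rows]


def leaves(node):
    """Return the labels of all nodes without children, left to right."""
    if not node["children"]:
        return [node["label"]]
    found = []
    for child in node["children"]:
        found.extend(leaves(child))
    return found


def neighbours(adjacency, node):
    """Return the nodes directly connected to the given node."""
    return adjacency.get(node, [])
```

For example, column(table, "value") collects one column of the table, leaves(tree) walks the parent-child links down to the childless nodes, and neighbours(graph, "A") follows the edges leaving one node.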
Raw data, i.e., unprocessed data, refers to a collection of numbers or characters. "Raw" is a relative term: data processing commonly occurs in stages, and the "processed data" from one stage may be considered the "raw data" of the next. Field data refers to raw data that is collected in an uncontrolled in situ environment. Experimental data refers to data that is generated within the context of a scientific investigation by observation and recording.
The word data is the traditional plural form of the now-archaic datum, neuter past participle of the Latin dare, "to give", hence "something given". In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. This usage is the origin of data as a concept in computer science or data processing: data are accepted numbers, words, images, etc.
Data is also increasingly used in the humanities (particularly in the growing digital humanities), whose highly interpretive nature might oppose the ethos of data as "given". Peter Checkland introduced the term capta (from the Latin capere, “to take”) to distinguish between an immense number of possible data and the subset of them to which attention is oriented. Johanna Drucker has argued that the humanities affirm knowledge production as “situated, partial, and constitutive” and that using data may therefore introduce assumptions that are counterproductive, for example that phenomena are discrete or observer-independent. The term capta, which emphasizes the act of observation as constitutive, is offered as an alternative to data for visual representations in the humanities.
Datum means "an item given". In cartography, geography, nuclear magnetic resonance and technical drawing it often refers to a reference datum from which distances to all other data are measured. Any measurement or result is a datum, though data point is now far more common.
In one sense, datum is a count noun with the plural datums (see usage in the datum article) that can be used with cardinal numbers (e.g. "80 datums"). Data (originally a Latin plural) is not used like a normal count noun with cardinal numbers; it can be treated as a plural with such plural determiners as these and many, or as a singular abstract mass noun with a verb in the singular form. Even when a very small quantity of data is referenced (one number, for example), the phrase piece of data is often used, as opposed to datum. The debate over appropriate usage continues.
The IEEE Computer Society allows usage of data as either a mass noun or plural based on author preference. Some professional organizations and style guides require that authors treat data as a plural noun. For example, the Air Force Flight Test Center specifically states that the word data is always plural, never singular.
Data is most often used as a singular mass noun in educated everyday usage. Some major newspapers such as The New York Times use it either in the singular or plural. In the New York Times the phrases "the survey data are still being analyzed" and "the first year for which data is available" have appeared within one day. The Wall Street Journal explicitly allows this usage in its style guide. The Associated Press style guide classifies data as a collective noun that is singular when a unit and plural when referring to individual items ("The data is sound.", and "The data have been carefully collected.").
In scientific writing data is often treated as a plural, as in "These data do not support the conclusions", but it is also treated as a singular mass entity like information, for instance in computing and related disciplines. British usage now widely accepts treating data as singular in standard English, including everyday newspaper usage, at least in non-scientific use. UK scientific publishing still prefers treating it as a plural. Some UK university style guides recommend using data for both singular and plural use, and some recommend treating it only as a singular in connection with computers.
Data, information and knowledge are closely related terms, but each has its own role in relation to the other. Data are extracted from information, and knowledge is derived from data. For example, the height of Mt. Everest is generally considered to be data. A book on Mt. Everest geological characteristics contains the information from which such data may be extracted, and an understanding based on such data of the most practical way to reach Mt. Everest's peak may be seen as "knowledge".
It is people and computers who collect data and impose patterns on it. These patterns are seen as information which can be used to enhance knowledge. These patterns can be interpreted as truth, and are authorized as aesthetic and ethical criteria. Events that leave behind perceivable physical or virtual remains can be traced back through data. Marks are no longer considered data once the link between the mark and observation is broken. This is nearly the inverse of the more common notion that information is processed to obtain data, which is then processed into knowledge.
Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.
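How familiar representations are built from a binary alphabet can be illustrated with a short Python sketch. It assumes unsigned integers of a fixed width and text encoded as UTF-8 bytes; both are choices made for this example rather than anything the description above prescribes, and the function names are hypothetical.

```python
def to_bits(number, width=8):
    """Write an unsigned integer as a fixed-width string over the alphabet {0, 1}."""
    if number < 0 or number >= 2 ** width:
        raise ValueError("number does not fit in the given width")
    return format(number, "0{}b".format(width))


def from_bits(bits):
    """Read a string of 0s and 1s back as an unsigned integer."""
    return int(bits, 2)


def text_to_bits(text):
    """Encode text as UTF-8 bytes, then write each byte as eight binary symbols."""
    return " ".join(to_bits(byte) for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits: rebuild the bytes and decode them as UTF-8."""
    return bytes(from_bits(chunk) for chunk in bits.split()).decode("utf-8")


print(to_bits(5))          # 00000101
print(text_to_bits("Hi"))  # 01001000 01101001
```

The same sequence of binary symbols can stand for a number or for a letter; which one it means depends on the convention used to read it back, here from_bits or bits_to_text.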
Some special forms of data are distinguished. A computer program is a collection of data, which can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. A similar yet earlier term for metadata is "ancillary data." The prototypical example of metadata is the library catalog, which is a description of the contents of books.
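Both ideas can be sketched in Python under some loud assumptions. The nested-list notation below only imitates the Lisp style of writing a program as a list; it is not Lisp itself, and the evaluate and describe functions, the operator table, and the sample values are all hypothetical.

```python
import operator

# Hypothetical operator table for the toy prefix notation used below.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Evaluate a prefix expression stored as a nested list, e.g. ["+", 1, 2]."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPERATORS[op](result, value)
    return result


def describe(values):
    """Build metadata: a description of the data, not the data itself."""
    return {
        "count": len(values),
        "types": sorted({type(v).__name__ for v in values}),
    }


# The program is an ordinary list, so it can be run...
program = ["+", 1, ["*", 2, 3]]
print(evaluate(program))      # 7

# ...or rewritten by other code like any other data before it is run.
rewritten = ["-"] + program[1:]
print(evaluate(rewritten))    # -5

# Metadata about the program, treated as a plain list of items.
print(describe(program))      # {'count': 3, 'types': ['int', 'list', 'str']}
```

Here the list held in program is read once as instructions (by evaluate) and once as plain data (by describe and by the slicing that builds rewritten), which is the sense in which the two are hard to tell apart in such languages.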