Collection of Database Terminology

By Bijaya Subedi On Feb 13, 2016

Finding the perfect definition of the words you want is always hectic. I have compiled the some of the basic terminologies related to the database based on the reference I got from various tech sites like AboutTech, Raima, eTutorials.org, Techterms.

Database

A database is a collection of information organized into interrelated tables of data and specifications of data objects.

Data Mining

Data mining is the use of automated data analysis techniques to uncover already undetected relationships among data items. Data mining often involves the analysis of data stored in a data warehouse. Three of the major data mining techniques are the regression, classification, and clustering.

Data Warehouse

A data warehouse is a centralized database that captures information from various parts of an organization’s business processes. This information can later be analyzed to determine predictive relationships through the use of data mining techniques.

DB2

DB2 is a relational database system developed by IBM Corporation, originally for use on large mainframe computer systems. It has since been ported to a variety of platforms including SunOS, Solaris, Linux, Windows 95/98/NT/2000 and HP-UX.

DBMS

A database management system (DBMS) is the software that allows a computer to perform database functions like storing, retrieving, adding, deleting and modifying data. Relational database management systems (RDBMS) implement the relational model of tables and relationships. Examples: Microsoft Access, MySQL, Microsoft SQL Server, Oracle and FileMaker Pro are all examples of database management systems.

DDL

(Data Definition Language) DDL is a language used by a database management system (like Oracle) that allows users to define the database and specify data types, structures and constraints on the data. Examples of DDL statements are CREATE TABLE, CREATE INDEX, ALTER, and DROP.

De-Normalization

In a relational database, denormalization is an approach to speeding up read performance (data retrieval) in which the administrator selectively adds back particular instances of redundant data after the data structure has been normalized. A denormalized database should not be confused with a database that has never been normalized.

Dimension Table

In data warehousing, a dimension table is one of the sets of companion tables to a fact table. The fact table contains business facts (or measures), and foreign keys which refer to candidate keys (usually primary keys) in the dimension tables. A dimension table has a primary key column that uniquely identifies each dimension record (row).

DML

Database Manipulation Language. In SQL, such statements as UPDATE, INSERT and DELETE are considered DML.

Fact Table

Fact tables contain the data corresponding to a particular business process. Each row represents a single event associated with that process and contains the measurement data related to that event.

For example, a retail organization might have fact tables related to customer purchases, customer service telephone calls, and product returns. The customer purchases table would likely contain information about the amount of the purchase, any discounts applied and the sales tax paid.

Flat Files

They are the database that consists of a single table. They are data files that contain records with no structured relationships. Additional knowledge is required to interpret these files such as the file format properties.

Lightweight database programs such as the database component in Microsoft Works are sometimes called ‘flat-file managers’ (or list managers) because they can only handle single-table databases. More powerful programs, such as FileMaker Pro, Access, Approach, and Paradox, can handle multi-table databases, and are called relational database managers, or RDBMSs, they use more structured approach to file management (such as one defined by SQL).

Foreign Key

In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table. In simpler words, the foreign key is defined in a second table, but it refers to the primary key in the first table. Referential Integrity is maintained when every foreign key refers to one and only one existing primary key.

Index

An index is the summary table used to locate data quickly within the table. Indexes are created by selecting a particular field in a table (which is commonly searched attribute/s on the table).

Example: In the Customer table which contains customer numbers, customer names, addresses and other details. You can make indexes based on any information, such as the customers’ number, last name + first name (a composite index based on more than one field), or postal code. Then, when you’re searching for a particular customer name or group of customers, you can use the index to speed up the search.

Many-to-One Relationships

A many-to-one relationship is where many entity instances belong to only one entity instance. This is the reverse of a one-to-many relationship.

Many-to-Many Relationships

Many-to-many relationships occur when each record in TableA may have many linked records in TableB and vice-versa.

Metadata

“Data about data.” In a DBMS context, data stored in columns of a table have certain attributes to them such as the type, length, description or other characteristics that allow the DBMS to process the data meaningfully, or the users to understand it better.

Examples: The eXtensible Markup Language (XML) is a metadata format used to define other data objects.

Normalization

The process of structuring data to minimize duplication and inconsistencies. The process usually involves breaking down a single table into two or more tables and defining relationships between those tables. Normalization is usually done in stages, with each stage applying more rigorous rules to the types of information which can be stored in a table. While full adherence to normalization principles increases the efficiency of a particular database, the process can become so esoteric that you need a professional to create and understand the table design. Most people, when creating a database, don’t need to go beyond the third level of normalization, called third normal form. And there’s no need to know the terminology: simply applying the principles is sufficient.

Not- Null Constraint

NOT NULL constraints in Microsoft SQL Server allow you to specify that a column may not contain NULL values. When you create a new NOT NULL constraint on a database column, SQL Server checks the column’s current contents for any NULL values. If the column currently contains NULL values, the constraint creation fails. Otherwise, SQL Server adds the NOT NULL constraint and any future INSERT or UPDATE commands that would cause the existence of a NULL value fail.

One-to-Many Relationships

One-to-many relationships occur when each record in TableA may have many linked records in TableB, but each record in TableB may have only one corresponding record in TableA.

One-to-One Relationships

One-to-one relationships occur when there is exactly one record in the first table that corresponds to exactly one record in the related table.

Online Analytical Processing (OLAP)

Online Analytical Processing software allows for the real-time analysis of data stored in a database. The OLAP server is normally a separate component that contains specialized algorithms and indexing tools to efficiently process data mining tasks with minimal impact on database performance.

Online Transaction Processing (OLTP)

Databases must often allow the real-time processing of SQL transactions to support e-commerce and other time-critical applications. This type of processing is known as online transaction processing (OLTP).

Open Database Connectivity (ODBC)

Open Database Connectivity (ODBC) is a common framework for accessing and altering the contents of databases. It allows developers to use the same coding conventions regardless of the actual database platform implemented on the backend. When a new database type is installed, administrators merely need to install an ODBC driver that supports that platform and existing ODBC software should normally function.

Primary Key

A primary key is a key in a relational database that is unique for each record. It is a unique identifier, such as a driver license number, Social Security Number, telephone number (including area code), or vehicle identification number (VIN). A relational database must always have one and only one primary key.

Primary key constraint

A table usually has a column or combination of columns whose values uniquely identify each row in the table. This column (or columns) is called the primary key of the table and enforces the entity integrity of the table. You can create a primary key by defining a PRIMARY KEY constraint when you create or alter a table.

RDBMS

A database management system (DBMS) is the software that allows a computer to perform database functions of storing, retrieving, adding, deleting and modifying data. Relational database management systems (RDBMS) implement the relational model of tables and relationships.

Referential integrity

A condition in which the foreign key column values in all of the rows in one table have matching rows in the referenced primary key table. Referential integrity is maintained by SQL during the processing of an INSERT and DELETE statement and any UPDATE statement that modifies a foreign or primary key value.

Repository

A repository is a collection of resources that can be accessed to retrieve information. Repositories often consist of several databases tied together by a common search engine.

Query

Queries are the primary mechanism for retrieving information from a database and consist of questions presented to the database in a predefined format. Many database management systems use the Structured Query Language (SQL) standard query format.

Schema

A database schema is a collection of meta-data that describes the relations in a database. It may be graphical or textual. Graphical representations typically involve the use of boxes that represent database tables and arrows that represent inter-table relationships. Textual schema representations utilize Database Definition Language (DDL) statements to describe a database design. The schema is normally described using Structured Query Language as a series of CREATE statements that may be used to replicate the schema in a new database.

Self join

Self-joins allow you to retrieve related records from the same table. The most common case where you’d use a self-join is when you have a table that references itself.

SQL

Structured Query Language (SQL). A computer language designed to organize and simplify the process of getting information out of a database in a usable form, and also used to reorganize data within databases. SQL is most often used on larger databases on minicomputers, mainframes, and corporate servers.

Stored Procedure

Stored Procedure is a set precompiled SQL database access statements that are used to perform a special task and can be executed as a unit via a Call to the Stored Procedure.

Super Key

A super key is a combination of attributes that can be uniquely used to identify a database record. A table might have many super keys. Candidate keys are a particular subset of superkeys that do not have any extraneous information in them.

Examples: Imagine a table with the fields <Name>, <Age>, <SSN> and <Phone Extension>. This table has many possible super keys. Three of these are <SSN>, <Phone Extension, Name> and <SSN, Name>.

Of those listed, only <SSN> is a candidate key, as the others contain information not necessary to uniquely identify records.

Surrogate Key

A surrogate key is any column or set of columns that can be declared as the primary key instead of a “real” or natural key. Sometimes there can be several natural keys that could be declared as the primary key, and these are also called as candidate keys.

Table

A table is a data structure that organizes information into rows and columns. It can be used to both store and display data in a structured format. For example, databases store data in tables so that information can be quickly accessed from specific rows.

Trigger

Triggers are stored procedures that may be configured to execute automatically (or “fire” in tech language) when certain events take place. Each trigger is associated with a single table and is set to fire when records are inserted into, updated in, or deleted from that table.

Unique Constraint

Unique constraints allow database administrators to specify that a column may not contain duplicate values. When you create a new unique constraint, the DBMS checks the column in question to determine whether it contains any duplicate values. If the table contains preexisting duplicates, the constraint creation command fails. Similarly, once you have a unique constraint on a column, attempts to add or modify data that would cause duplicates to exist also fail.

View

A view is an alternative way to present a table (or tables). You might think of a view as a “virtual” table. A view is (usually) defined in terms of one or more tables. When you create a view, you are not storing more data, you are instead creating a different way of looking at existing data. A view is a useful way to give a name to a complex query that you may have to use repeatedly.

XML

Extensible Markup Language. XML documents are much used in the Internet’s World-Wide Web but are also used in many computing contexts in which data needs to be shared.

XPath

XPath is a syntax for defining parts of an XML document. XPath uses path expressions to navigate in XML documents. XPath contains a library of standard functions. XPath is a major element in XSLT. XPath is a W3C recommendation.