SQL
This article is about the database language. For other meanings, see SQL (disambiguation).
SQL (official pronunciation [ɛskjuːˈɛl], but often also [ˈsiːkwəl] after its predecessor SEQUEL; in German also often the German pronunciation of the letters) is a database language for defining data structures in relational databases and for editing (inserting, modifying, deleting) and querying datasets based on them.
The language is based on relational algebra, its syntax is relatively simple and semantically based on English colloquial language. A joint committee of ISO and IEC standardizes the language with the participation of national standardization bodies such as ANSI or DIN. The use of SQL aims to make applications independent of the database management system used.
The term SQL is generally understood as an abbreviation for "Structured Query Language", although according to the standard it is an independent name. The name is derived from its predecessor SEQUEL ([ˈsiːkwəl], Structured English Query Language), which was developed with the participation of Edgar F. Codd (IBM) in the 1970s by Donald D. Chamberlin and Raymond F. Boyce. SEQUEL was later renamed SQL because SEQUEL is a registered trademark of the Hawker Siddeley Aircraft Company.
Language elements and examples
SQL commands can be divided into five categories (assignment according to the theory of database languages in parentheses):
- Data Query Language (DQL) - commands for querying and preparing the information searched for, also classified as a subset of DML.
- Data Manipulation Language (DML) - commands for data manipulation (change, insert, delete records) and read access
- Data Definition Language (DDL) - commands for defining the database schema (creating, changing, deleting database tables, defining primary keys and foreign keys)
- Data Control Language (DCL) - Commands for rights management
- Transaction Control Language (TCL) - Commands for transaction control
The term SQL refers to the English word "query". Queries are used to retrieve the data stored in a database, i.e. to make it available to the user or user software.
The result of a query looks like a table and can often be displayed, edited and used like a table.
See also: Database table
The basic commands and terms are explained using the following example:
ER diagram: |
| ||||||||||||||||||||||||||||||||||||||||||||||||
Relations: |
|
|
|
|
Simple query
lists all columns and all rows of the Student table.
Results:
MatrNo | Name |
26120 | Spruce |
25403 | Jonas |
27103 | Lazy |
Query with column selection
lists the columns VorlNr and Titel of all rows of the table Vorlesung.
Results:
PrelNo | Title |
5001 | ET |
5022 | IT |
5045 | DB |
Query with unique values
lists only different entries of the column MatrNr from the table listens. This shows the matriculation numbers of all students who listen to at least one lecture, whereby matriculation numbers that occur more than once are only output once.
Results:
MatrNo |
25403 |
26120 |
Query with renaming
lists the columns MatrNr and Name of all rows of the table Student. MatrNr is listed as the student number in the display result.
Results:
Matriculation number | Name |
26120 | Spruce |
25403 | Jonas |
27103 | Lazy |
Query with filter
lists the VorlNr and title of all those lines of the table Lecture whose title is 'ET'.
The frequently used instruction structured in such a way is also called "SFW block" after the initial letters.
Results:
PrelNo | Title |
5001 | ET |
Query with filter by content
lists the names of all students whose name begins with F (in the example: Fichte and Fauler).
LIKE can be used with different placeholders: _ stands for a single arbitrary character, % stands for an arbitrary string. Some database systems offer further such wildcard characters, for example for character sets.
Results:
Name |
Spruce |
Lazy |
Query with filter and sorting
lists the first name, last name, street number, postal code and city of all students from the specified postal code area sorted by name.
Query with linked tables
The enumeration after FROM specifies the data sources: at this point, several tables can be linked together using so-called JOINs, so that data from different tables is merged and displayed.
In this example, a natural inner join is used: All records from the Professor and Lecture tables that have the same value in the PersNr field. Professors without a lecture and lectures without a professor are thus not displayed.
This is equivalent to:
Caution: Not all implementations understand both notations, the "Oracle" notation "FROM Professor, Lecture" is considered obsolete and is less common. It also does not conform to the ANSI standard and should therefore be avoided. However, it is still common for historical reasons.
Tables can be linked to each other not only via key fields, but also via any fields, as the following, technically nonsensical example shows:
The result gets the combinations of all professors and all lectures where the name of the professor differs from the title of the lecture - that's just all of them (no lecture is named like a professor):
Title | Name |
ET | Tesla |
ET | Wirth |
ET | Holidaymaker |
IT | Tesla |
IT | Wirth |
IT | Holidaymaker |
DB | Tesla |
DB | Wirth |
DB | Holidaymaker |
Left outer compound
results in all data records of the Professor table connected with the data records of the Lecture table that have the same value in the PersNr field. Professors without a lecture are included, the lecture columns in the result then have the value NULL. Lectures without a professor are not included.
The following query returns only those records for which no matching record exists in the left outer compound (all professors who do not hold lectures):
The same can be achieved by means of a subquery:
Grouping with Aggregate Functions
counts the number of lectures per professor using the aggregate function COUNT.
Note: COUNT(Professor.PersNr) or COUNT(*) would be incorrect (zero values should not be counted).
Summary of a SELECT
In summary, the most important elements of an SQL SELECT query can be described something like this:
Explanation:
- DISTINCT specifies that identical result tuples are to be removed from the result relation. This means that each data record is only output once, even if it occurs more than once in the table. Otherwise, SQL returns a multiset.
- Selection list determines which columns of the source are to be output (* for all) and whether aggregate functions are to be applied. As with all other enumerations, the individual elements are separated by commas.
- Source specifies where the data comes from. Relations and views can be specified and linked with each other as a Cartesian product or as a compound (JOIN, from SQL-92). With the additional specification of a name, relations can be renamed for the query (see examples).
- Where clause determines conditions, also called filters, under which the data should be output. In SQL, it is also possible to specify subqueries here, so that SQL becomes strictly relationally complete.
- Group-by attribute determines whether different values are to be output as individual rows (GROUP BY = grouping) or whether the field values of the rows are to be combined into a result value by means of aggregations such as addition (SUM), average (AVG), minimum (MIN), maximum (MAX), which refers to the grouping.
- Having clause is like Where clause, except that the specified condition refers to the result of an aggregation function, for example HAVING SUM (amount) > 0.
- Sort attribute: after ORDER BY attributes are specified by which to sort. The default is ASC, which means ascending sort, DESC is descending sort.
Set operators can be applied to multiple SELECT queries that have the same number of attributes and where the data types of the attributes match:
- UNION unifies the result sets. In some implementations, result tuples that occur more than once are removed, as with DISTINCT, without having to or being allowed to write "UNION DISTINCT".
- UNION ALL merges the result sets. Result tuples that occur more than once are preserved. However, some implementations interpret "UNION" as "UNION ALL" and may not understand the "ALL" and issue an error message.
- EXCEPT returns the tuples that are contained in a first result set but not in a second result set. Result tuples that occur more than once are removed.
- MINUS is an analog operator like EXCEPT, which is used alternatively by some SQL dialects.
- INTERSECT returns the intersection of two result sets. Result tuples that occur more than once are removed.
Inserting records
inserts four data records into the table Lecture. The values must match the data types of the fields VorlNr, Titel, PersNr.
then returns the result, for example (the order can also be different):
PrelNo | Title | PersNo |
1001 | Databases | 15 |
1000 | Software development 1 | 12 |
1200 | Networks 1 | 20 |
5001 | ET | 12 |
5022 | IT | 12 |
1600 | Algorithms | 12 |
5045 | DB | 15 |
Modify records
changes all records for which PersNr has the value 15. The value of VorlNr is increased by 1000 and the value of PersNr is set to 20.
result of a subsequent SELECT *, possibly with a different order:
PrelNo | Title | PersNo |
1000 | Software development 1 | 12 |
1200 | Networks 1 | 20 |
1600 | Algorithms | 12 |
2001 | Databases | 20 |
5001 | ET | 12 |
5022 | IT | 12 |
6045 | DB | 20 |
Deleting records
deletes all records for which PersNr has the value 12.
Result of a subsequent SELECT *, possibly in a different order:
PrelNo | Title | PersNo |
1200 | Networks 1 | 20 |
2001 | Databases | 20 |
6045 | DB | 20 |
Summary of INSERT, UPDATE, DELETE
Generalized, the change directives look like this:
INSERT statement:
UPDATE statement:
DELETE statement:
Components of SQL
Data definition
Database table
You can create the database table Lecture with the following statement:
The value NULL
is not allowed in any of the fields VorlNr, Titel, PersNr. The foreign key PersNr references the primary key PersNr of the table Professor. This ensures that only data records for which the value of PersNr occurs in the Professor table as a primary key can be inserted into the Lecture table (see referential integrity).
Database index
With the statement
a database index can be defined for the Lecture table, which can be used by the database system to speed up the execution of queries. Whether this makes sense is decided by the database system independently through complex evaluations and analyses for each query again.
View
A view is essentially an alias for a database query. It can be used like a database table. The statement
saves the defined query as a view. The query
uses this view and could, for example, return the following result:
Title | PersNo |
Software development 1 | Wirth |
Networks 1 | Holidaymaker |
Algorithms | Wirth |
Databases | Holidaymaker |
Summary
In summary, the most important elements of the definition of a database table, database index, or view are as follows: