This is not a comparison meant to decide which database is best or has the most valuable features. It is written from the point of view of a programmer who often has to work with a certain database simply because it cannot be changed in the project. Between the lines it is of course also a cry for standards, but as long as the different vendors don't follow the standards, it offers a guideline for workarounds in case of compatibility issues.
ORACLE | MySQL | MS-SQL |
---|---|---|
to_date('28.01.08','dd.mm.yy') | str_to_date('28.01.08','%d.%m.%y') | convert(date,'28.01.2008',104) |
'text ' || string || chr(39) | concat('text ',string, char(39)) | 'text ' + string + char(39) |
instr('long ... text', 'needle') | position('needle' in 'long ... text') | patindex('%needle%', 'long...text') |
GROUP BY ROLLUP(author_id) | GROUP BY author_id WITH ROLLUP | GROUP BY ROLLUP (author_id) |
ELSIF | ELSEIF | n.a. (use nested IF ... ELSE) |
NVL(expr,value) | IFNULL(expr,value) | ISNULL(expr,value) |
CREATE PROCEDURE sp_name( p1 IN NUMBER ) IS | CREATE PROCEDURE sp_name( IN p1 INTEGER ) | CREATE PROCEDURE sp_name @p1 SMALLINT = 1 AS |
var_a INTEGER; BEGIN | BEGIN DECLARE var_a INTEGER; | SET NOCOUNT ON BEGIN TRY DECLARE @var_a INT |
 | DECLARE EXIT HANDLER FOR NOT FOUND SET var_a = 0; | |
var_a := 55; | SET var_a = 55; | SET @var_a = 55; |
SELECT col_a INTO var_a FROM a_table WHERE rownum <= 1; | SELECT col_a INTO var_a FROM a_table LIMIT 1; | SELECT TOP(1) @var_a = col_a FROM a_table; (Note: *) |
EXCEPTION WHEN NO_DATA_FOUND THEN var_a := 0; | | END TRY; BEGIN CATCH SET @var_a = 0; |
END; | END | END CATCH; |
In contrast to the other database engines, T-SQL does not raise an error if the select fetches multiple rows. It walks through the result set and assigns the value of the current row to the variable again and again, so the variable ends up with the column value of the last processed row. This can lead to logical errors in your procedure that are difficult to find. If you want to make sure that you always get the value of the first row, add TOP(1) after the SELECT as demonstrated above.
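A minimal sketch of the pitfall (assuming a hypothetical table a_table whose column col_a contains the values 1, 2 and 3):

```sql
DECLARE @var_a INT
-- T-SQL raises no error here although three rows match;
-- the variable is reassigned for every row of the result set
SELECT @var_a = col_a FROM a_table ORDER BY col_a
-- @var_a now holds the value of the last processed row (here: 3)
```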
Starting with release 11g Oracle offers an aggregation function LISTAGG that is the counterpart of MySQL's GROUP_CONCAT.
SQL> SELECT g_ID, min(f_ID), max(f_ID),
            LISTAGG(f_ID, ',') WITHIN GROUP (ORDER BY f_ID) AS list_of_IDs
     FROM belong_to GROUP BY g_ID;

g_ID  min(f_ID) max(f_ID) list_of_IDs
----- --------- --------- ------------------------------
cb           69        78 69,70,71,72,73,74,75,76,77,78
ndf          79        87 79,80,81,82,83,84,85,86,87
nv           60        68 60,61,62,63,64,65,66,67,68
suk           1         9 1,2,3,4,5,6,7,8,9
wes          29        37 29,30,31,32,33,34,35,36,37
The MySQL syntax looks more consistent because all specifications for the concatenation are inside the parentheses, as part of the parameter of the GROUP_CONCAT function. The additional SEPARATOR keyword also seems more natural than Oracle's WITHIN GROUP keyword.
mysql> SELECT g_ID, min(f_ID), max(f_ID),
              GROUP_CONCAT(f_ID ORDER BY f_ID SEPARATOR ',') AS list_of_IDs
       FROM belong_to GROUP BY g_ID;
+------+-----------+-----------+-------------------------------+
| g_ID | min(f_ID) | max(f_ID) | list_of_IDs                   |
+------+-----------+-----------+-------------------------------+
| cb   |        69 |        78 | 69,70,71,72,73,74,75,76,77,78 |
| ndf  |        79 |        87 | 79,80,81,82,83,84,85,86,87    |
| nv   |        60 |        68 | 60,61,62,63,64,65,66,67,68    |
| suk  |         1 |         9 | 1,2,3,4,5,6,7,8,9             |
| wes  |        29 |        37 | 29,30,31,32,33,34,35,36,37    |
+------+-----------+-----------+-------------------------------+
PostgreSQL achieves the same in a two-step process. There is no aggregation function that returns a concatenated string, but PostgreSQL can return an aggregated array. And since it supports join and split functionality to move from arrays to strings and back, you only need to chain the array_agg and array_to_string functions:
SELECT g_ID, min(f_ID), max(f_ID),
       array_to_string(array_agg(f_ID), ',') AS list_of_IDs
FROM belong_to GROUP BY g_ID;
With PostgreSQL 9 it is possible to use a short form that is similar to what MySQL and Oracle offer. Instead of chaining array_agg and array_to_string you can simply use string_agg(f_ID, ','). It is also possible to add an ORDER BY clause to the aggregate expression of array_agg or string_agg:
array_agg( f_ID ORDER BY f_ID DESC ), string_agg(f_ID, ',' ORDER BY f_ID )
Oracle has a specific syntax for hierarchical queries, CONNECT BY, together with the pseudo column LEVEL. In MS-SQL you need to implement this with a Common Table Expression that references itself. The concept is analogous to linked list structures in C, where you already use the name of the structure while you are still defining it. As in C you need a kind of declaration in advance: you can't use the simple WITH profile_tree AS (...) but have to list the fields of the CTE explicitly.
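For comparison, a sketch of how the same tree walk could look in Oracle (assuming the same profile table with id, owner_id and last_name; the CONNECT BY direction mirrors the join condition t.owner_id = p.id of the CTE):

```sql
SELECT LEVEL, id, owner_id,
       SYS_CONNECT_BY_PATH(last_name, '/') AS path_name
FROM profile
START WITH id = 1            -- start at the root node
CONNECT BY id = PRIOR owner_id;
```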
WITH profile_tree (level, id, owner_id, path_name) AS (
    SELECT 1 AS level, p.id, p.owner_id,
           cast(N'/' COLLATE Latin1_General_CS_AS + p.last_name AS NVARCHAR(60)) AS path_name
    FROM profile p
    WHERE p.id = 1  /* start at the root node (usually a parameter) */
    UNION ALL
    SELECT t.level + 1 AS level, p.id, p.owner_id,
           cast(t.path_name + N'/' + p.last_name AS NVARCHAR(60)) AS path_name
    FROM profile p
    INNER JOIN profile_tree t ON t.owner_id = p.id
) /* This is the end of the Common Table Expression */
SELECT level, id, owner_id, path_name FROM profile_tree
PostgreSQL doesn't require a formal table declaration in advance; instead it needs the special RECURSIVE keyword to prepare the parser for a reference to the CTE inside the definition of the CTE. And it is not so picky about the type of the combined path_name column: it can just be VARCHAR and doesn't need a type cast to fix the length of that field.
WITH RECURSIVE profile_tree AS (
    SELECT 1 AS level, p.id, p.owner_id,
           '/' || p.last_name AS path_name
    FROM profile p
    WHERE p.id = 1  /* start at the root node (usually a parameter) */
    UNION ALL
    SELECT t.level + 1 AS level, p.id, p.owner_id,
           t.path_name || '/' || p.last_name AS path_name
    FROM profile p
    INNER JOIN profile_tree t ON t.owner_id = p.id
)
SELECT level, id, owner_id, path_name FROM profile_tree
The classical WHERE id IN (select max(id) from ...) to figure out the latest (or highest) record in a table might be inefficient because it runs two queries: the select for the maximum value in the subselect, and the outer select. A SELECT ... FROM ... WHERE ... ORDER BY id DESC LIMIT 1 needs only a single pass through the table and should be substantially faster.
But what if you not only want the absolute maximum, but also the relative maximum for each month or each genre? Instead of comparing a scalar value to a one-dimensional list you have to compare a tuple with 2 elements to a 2 x m result set of the subquery.
SELECT b.g_ID, b.f_ID, f.year, f.director
FROM belong_to b
INNER JOIN films f ON b.f_ID = f.f_ID
WHERE (b.g_ID, b.f_ID) IN (SELECT g_ID, max(f_ID) FROM belong_to GROUP BY g_ID);
+------+------+------+-------------------+
| g_ID | f_ID | year | director          |
+------+------+------+-------------------+
| suk  |    9 | 1947 | John Huston       |
| fn   |   28 | 1939 | Raoul Walsh       |
| wes  |   37 | 1993 | Detlev Buck       |
| sc   |   59 | 1985 | Martin Scorsese   |
| nv   |   68 | 1965 | Jean-Luc Godard   |
| ndf  |   87 | 1982 | Michael Verhoeven |
+------+------+------+-------------------+
This concept works in MySQL, PostgreSQL and Oracle, but the MS-SQL server does not support such expressions.
The reason why such a tuple comparison is not allowed in MS-SQL might be that the above query can easily be rewritten as a join query:
SELECT b.g_ID, b.f_ID, f.year, f.director
FROM belong_to b
INNER JOIN films f ON b.f_ID = f.f_ID
INNER JOIN (SELECT g_ID, max(f_ID) AS maxid
            FROM belong_to GROUP BY g_ID) c
        ON b.g_ID = c.g_ID AND b.f_ID = c.maxid
GO

g_ID  f_ID    year   director
----- ------- ------ -------------------------
suk   9       1947   John Huston
fn    28      1939   Raoul Walsh
wes   37      1993   Detlev Buck
sc    59      1985   Martin Scorsese
nv    68      1965   Jean-Luc Godard
ndf   87      1982   Michael Verhoeven
The WHERE ... IN is usually used when we want to avoid that rows in one table appear multiple times because the field or fields in the join condition appear multiple times in the other table. If you have IN (1,2,1) you will get only 2 records, but with a join you will get 3 records. But due to the GROUP BY it is guaranteed in this case that each combination of tuples appears only once.
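The duplicate behaviour can be sketched with a tiny example (assuming a hypothetical table t that contains exactly the two rows id = 1 and id = 2):

```sql
-- IN collapses the duplicate value: 2 result rows
SELECT * FROM t WHERE id IN (1, 2, 1);

-- the equivalent join sees the value 1 twice: 3 result rows
SELECT t.*
FROM t
INNER JOIN (SELECT 1 AS id UNION ALL
            SELECT 2 UNION ALL
            SELECT 1) v ON t.id = v.id;
```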
But this is the straightforward constellation, where the primary key column is used to build the aggregate values. It is different if we don't want the highest ID but the latest year.
SELECT x.g_ID, x.f_ID, f.year, f.director, f.title
FROM belong_to x
INNER JOIN films f ON x.f_ID = f.f_ID
INNER JOIN (SELECT x.g_ID, max(x.f_ID) AS max_id, max(f.year) AS max_year
            FROM belong_to x
            INNER JOIN films f ON x.f_ID = f.f_ID
            GROUP BY x.g_ID) m
        ON x.g_ID = m.g_id AND f.year = m.max_year
GO

g_ID  f_ID    year   director             title
----- ------- ------ -------------------- -----------------
suk   7       1949   Jean Cocteau         Orphée
fn    19      1992   Steven Soderbergh    Kafka
wes   29      1995   Jim Jarmusch         Dead Man
sc    51      1995   Woody Allen          Mighty Aphrodite
nv    66      1994   Gianni Amelio        Lamerica
cb    78      1995   Danny Boyle          Trainspotting
ndf   83      1995   Dieter Wedel         Der Schattenmann
out   90      1995   Wayne Wang           Blue in the Face
out   102     1995   Wayne Wang           Smoke
des   111     1993   Robert Altman        Short Cuts
If the data is post-processed it is very likely that the follow-up process cannot handle the double entry for genre out. Windowing expressions allow us to use two columns for sorting:
SELECT g_ID, f_ID, year, director, title
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY x.g_ID
                              ORDER BY f.year DESC, f.f_ID DESC) row_num,
           x.g_ID, f.*
    FROM belong_to x
    INNER JOIN films f ON x.f_ID = f.f_ID
) src
WHERE row_num = 1
ORDER BY 2
GO

g_ID  f_ID    year   director              title
----- ------- ------ --------------------- -----------------------
suk   7       1949   Jean Cocteau          Orphée
fn    19      1992   Steven Soderbergh     Kafka
wes   29      1995   Jim Jarmusch          Dead Man
sc    51      1995   Woody Allen           Mighty Aphrodite
nv    66      1994   Gianni Amelio         Lamerica
cb    78      1995   Danny Boyle           Trainspotting
ndf   83      1995   Dieter Wedel          Der Schattenmann
out   102     1995   Wayne Wang            Smoke
des   111     1993   Robert Altman         Short Cuts
Another approach uses CROSS APPLY, but this time the genre out is completely missing from the list:
SELECT g.g_ID, src.f_ID, src.year, src.director, src.title
FROM genres g
CROSS APPLY (SELECT TOP(1) b.g_ID, b.f_ID, f.year, f.director, f.title
             FROM belong_to b
             INNER JOIN films f ON b.f_ID = f.f_ID
             WHERE b.g_ID = g.g_ID
             ORDER BY f.year DESC, b.f_ID DESC) src
GO

g_ID  f_ID    year   director              title
----- ------- ------ --------------------- ------------------------
suk   7       1949   Jean Cocteau          Orphée
fn    19      1992   Steven Soderbergh     Kafka
wes   29      1995   Jim Jarmusch          Dead Man
sc    51      1995   Woody Allen           Mighty Aphrodite
nv    66      1994   Gianni Amelio         Lamerica
cb    78      1995   Danny Boyle           Trainspotting
ndf   83      1995   Dieter Wedel          Der Schattenmann
des   111     1993   Robert Altman         Short Cuts
Oracle implements dynamic SQL with the help of the PL/SQL procedural language. And it supports two types of dynamic SQL:
BEGIN
  v_stmt := 'SELECT id, title, director FROM dvds WHERE year BETWEEN :x AND :y';
  h_cursor := DBMS_SQL.OPEN_CURSOR;
  DBMS_SQL.PARSE(h_cursor, v_stmt, DBMS_SQL.NATIVE);
  -- Input variables
  DBMS_SQL.BIND_VARIABLE(h_cursor, ':x', v_oldest);
  DBMS_SQL.BIND_VARIABLE(h_cursor, ':y', v_youngest);
  -- Output variables
  DBMS_SQL.DEFINE_COLUMN(h_cursor, 1, v_id);
  DBMS_SQL.DEFINE_COLUMN(h_cursor, 2, v_title, 40);
  DBMS_SQL.DEFINE_COLUMN(h_cursor, 3, v_director, 30);
  v_ignore := DBMS_SQL.EXECUTE(h_cursor);
  LOOP
    IF DBMS_SQL.FETCH_ROWS(h_cursor) > 0 THEN
      -- get column values of the row
      DBMS_SQL.COLUMN_VALUE(h_cursor, 1, v_id);
      DBMS_SQL.COLUMN_VALUE(h_cursor, 2, v_title);
      DBMS_SQL.COLUMN_VALUE(h_cursor, 3, v_director);
    ELSE
      EXIT;  -- No more rows to copy
    END IF;
  END LOOP;
  DBMS_SQL.CLOSE_CURSOR(h_cursor);
END;
Using the OPEN FOR statement and a cursor variable does the same and is a lot easier. (The cursor variable is the type of variable for storing result-set-like data that we were missing in the above approach.)
DECLARE
  -- Cursor is only needed to easily create a record for the dynamic cursor
  CURSOR cur_spec IS SELECT id, title, director FROM dvds;
  rec_movie cur_spec%ROWTYPE;
  v_cursor  SYS_REFCURSOR;
  v_from    INTEGER;
  v_until   INTEGER;
  v_stmt    VARCHAR2(200);
BEGIN
  v_from  := 1956;
  v_until := 2000;
  v_stmt  := 'SELECT id,title,director FROM dvds WHERE year BETWEEN :x AND :y';
  OPEN v_cursor FOR v_stmt USING v_from, v_until;
  LOOP
    FETCH v_cursor INTO rec_movie;
    EXIT WHEN v_cursor%NOTFOUND;
    dbms_output.put_line('Movie: ' || rec_movie.title);
  END LOOP;
  CLOSE v_cursor;
END;
MySQL does this directly in the mysql terminal with the help of session variables:
mysql> SET @qry = 'SELECT id, title, director, year, length FROM dvds WHERE year BETWEEN ? AND ?';
Query OK, 0 rows affected (0.02 sec)

mysql> PREPARE dynsql FROM @qry;
Query OK, 0 rows affected (0.01 sec)
Statement prepared

mysql> SET @oldest = 1967;
mysql> SET @youngest = 1976;
mysql> EXECUTE dynsql USING @oldest, @youngest;
+----+----------------------------+----------------------+------+--------+
| id | title                      | director             | year | length |
+----+----------------------------+----------------------+------+--------+
| 24 | Spiel mir das Lied vom Tod | Sergio Leone         | 1968 |    159 |
| 27 | Badlands                   | Terence Malick       | 1973 |     90 |
| 58 | The Wild Bunch             | Sam Peckinpah        | 1969 |    139 |
| 71 | Hundstage                  | Sidney Lumet         | 1975 |    119 |
+----+----------------------------+----------------------+------+--------+
4 rows in set (0.00 sec)

mysql> SET @youngest = 1971;
mysql> EXECUTE dynsql USING @oldest, @youngest;
mysql> DEALLOCATE PREPARE dynsql;
(see also Information for MySQL Migrations: Triggers and Stored Procedures)
Oracle's PL/SQL has a fairly complete set of methods to pass structured data to a procedure: you can either use a cursor variable or a collection of records (known as a PL/SQL index-by table) for result-set-like data, or you can just pass a record variable for "one row" values. This is handy if you have to specify a lot of parameters for the procedure. You don't need to declare every single parameter (which turns out to be a maintenance nightmare if you have more than 5 or 6 parameters); instead you can hide the individual parameters as attributes (or members) inside the record variable. You won't need to change the signature of your procedure or function, nor the parameter list at the places where you call the procedure. You only need to add another member to the record definition and recompile your code.
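A hypothetical sketch of this pattern (the package, procedure and member names are made up for illustration):

```sql
CREATE OR REPLACE PACKAGE person_api AS
  -- all "parameters" live inside the record; adding a member
  -- later doesn't change the procedure signature
  TYPE person_rec IS RECORD (
    id         NUMBER,
    firstname  VARCHAR2(40),
    lastname   VARCHAR2(40),
    birth_date DATE
  );
  PROCEDURE save_person(p_person IN person_rec);
END person_api;
/
```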
MS-SQL tends to be a nightmare when you need to create a procedure that depends on several attributes as parameters. In many cases these are some of the values of a table row. Passing the table row as a record is not possible; instead you have to declare a variable for each column value of the table row that you want to pass as a parameter. Table variables make your life easier in such a case. But because MS-SQL doesn't have the concept of packages you can't declare the record, cursor or table variable close to the procedure in the same package; you have to do it at database level:
CREATE TYPE dbo.param_tab AS TABLE (
    some_id BIGINT NOT NULL,
    name    NVARCHAR(255),
    a_date  DATETIME
)

CREATE PROCEDURE process_the_data ( @params dbo.param_tab READONLY )
AS
BEGIN
    SELECT TOP(1) ...
    FROM @params p
    INNER JOIN other_table o ON p.some_id = o.id
    WHERE ....
END
The attribute READONLY is required, and it shows that in contrast to Oracle's cursor or record variables the table variable cannot be changed.
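Calling such a procedure could then look like this (a sketch based on the type defined above; the values are made up):

```sql
DECLARE @p dbo.param_tab
INSERT INTO @p (some_id, name, a_date)
VALUES (4711, N'Smith', GETDATE()),
       (4712, N'Jones', GETDATE())
EXEC process_the_data @params = @p
```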
A similar inconvenience appears in MS-SQL when you fetch from a cursor. You can't fetch into a record variable but have to declare a variable with the appropriate type for each field that is selected in the cursor. Maybe this inconvenience is on purpose, to remind people that cursors and loops are procedural thinking and against the spirit of SQL.
DECLARE @x INT = 1956, @y INT = 2000
DECLARE cur_movies CURSOR FOR
    SELECT id, title, director, year FROM dvds WHERE year BETWEEN @x AND @y
BEGIN
    DECLARE @aValue INT
    DECLARE @id INT, @year INT, @title NVARCHAR(255), @director NVARCHAR(250)
    OPEN cur_movies
    FETCH NEXT FROM cur_movies INTO @id, @title, @director, @year
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- do something with the variables that were filled by the cursor row
        SET @aValue = @year + @id
        FETCH NEXT FROM cur_movies INTO @id, @title, @director, @year
    END
    CLOSE cur_movies
    DEALLOCATE cur_movies
END
The @@FETCH_STATUS cursor attribute variable is set by the last FETCH NEXT. If FETCH NEXT returned a boolean value it could be used directly as the condition for the WHILE. But because it only writes into and modifies the global pseudo variable, fetching and testing the result of the fetch is split into two statements, and hence we need the FETCH statement twice: once before the WHILE and a second time at the end of the WHILE's execution block. And sadly each time we have to give the full list of variables after the INTO. Better don't ask me how often I have forgotten to add a new variable to the second FETCH NEXT statement ...
TABLE(...). PostgreSQL knows about functions that return a set of records (where the type of these records has to be predefined as a custom type) or a TABLE(..colspec..) where the returned composite type is specified in the function signature and can return n x m structured data. Both database engines treat such functions as if they were views, and consequently you can select from them like SELECT * FROM my_func(23).
PostgreSQL has a very similar concept, but because the body of the function doesn't contain any procedural stuff, the language plpgsql is not needed and SQL as language is good enough. The following function generates some basic DDL for the available tables with a query against the information_schema. The sizes that are passed as parameters for the formatting of the generated SQL have to go deep inside the subselect; it is nearly impossible to do this with a simple view.
CREATE OR REPLACE FUNCTION re_create_tables(p_name_size integer, p_colspec_size integer)
RETURNS TABLE(table_spec text) AS $$
  SELECT concat(a.tab_name, a.column_name, a.column_spec) AS table_spec
  FROM (SELECT CASE ordinal_position
                 WHEN 1 THEN concat('CREATE TABLE ', table_name, ' (', chr(10))
                 ELSE ''
               END AS tab_name,
               table_name, ordinal_position,
               rpad(column_name, p_name_size) AS column_name,
               rpad(concat(CASE upper(data_type)
                             WHEN 'CHARACTER VARYING' THEN 'VARCHAR'
                             ELSE upper(data_type)
                           END,
                           CASE
                             WHEN upper(data_type) IN ('CHARACTER VARYING')
                               THEN format('(%s)', character_maximum_length)
                             WHEN upper(data_type) IN ('BIGINT','INTEGER','SMALLINT')
                               THEN ''
                             WHEN upper(data_type) IN ('NUMERIC','DECIMAL')
                               THEN format('(%s,%s)', numeric_precision, numeric_scale)
                             ELSE ''
                           END,
                           CASE is_nullable WHEN 'NO' THEN ' NOT NULL,' ELSE ',' END),
                    p_colspec_size) AS column_spec
        FROM information_schema.columns
        WHERE table_schema = 'public') a
  ORDER BY table_name, ordinal_position
$$ LANGUAGE SQL;

/* Another function that uses plpgsql as language (which is the wrong language here) */
CREATE OR REPLACE FUNCTION get_film(p_pattern VARCHAR)
RETURNS TABLE ( film_title VARCHAR, film_year INT ) AS $$
BEGIN
  RETURN QUERY
    SELECT title, cast(release_year AS integer)
    FROM film
    WHERE title LIKE p_pattern;
END;
$$ LANGUAGE 'plpgsql';
Oracle and PostgreSQL allow you to pass around cursor variables (which can even be scrollable), and that is much cleaner than the MS-SQL and MySQL approach of simply selecting into the output stream inside stored procedures.
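As a sketch, an Oracle function that hands a cursor variable to its caller (the function name is made up; the dvds table is assumed from the earlier examples):

```sql
CREATE OR REPLACE FUNCTION movies_of_year(p_year IN NUMBER)
  RETURN SYS_REFCURSOR
IS
  v_cursor SYS_REFCURSOR;
BEGIN
  OPEN v_cursor FOR
    SELECT id, title, director FROM dvds WHERE year = p_year;
  RETURN v_cursor;  -- the caller fetches from the cursor variable
END;
/
```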
SQL 92 | Oracle | MySQL | PostgreSQL | MS-SQL |
---|---|---|---|---|
coalesce() | Yes since 8i, before nvl() | Yes, since 3.23.3 | Yes | Yes, since 2008, before isnull() |
case ... when | Yes since 9i, before decode() | Yes | Yes | Yes |
extract(.. from ..) | only a few unit specifiers | Yes | Yes | No, use DATEPART(year, date_expr) |
position(.. in ..) | No, use instr() | Yes, as a subset of locate() | Yes | No, use patindex() or charindex() |
auto increment | no (needs sequence/trigger) | INT AUTO_INCREMENT | (BIG)SERIAL | IDENTITY |
 | table(nested table) | | unnest(array-type) | |
INTERSECT | Yes | No, use INNER join (a) | Yes | Yes |
EXCEPT | No, use MINUS | No, use OUTER join (b) | Yes | Yes |
-- (a)
SELECT DISTINCT a.year AS common_years
FROM films a
INNER JOIN films b ON a.year = b.year
INNER JOIN belong_to ab ON a.f_ID = ab.f_ID
INNER JOIN belong_to bb ON b.f_ID = bb.f_ID
WHERE ab.g_ID = 'wes' AND bb.g_ID = 'fn'
+--------------+
| common_years |
+--------------+
|         1992 |
+--------------+

-- (b)
SELECT DISTINCT a.year AS exclusive_years
FROM films a
INNER JOIN belong_to ab ON a.f_ID = ab.f_ID
LEFT JOIN (SELECT b.year
           FROM films b
           INNER JOIN belong_to bb ON b.f_ID = bb.f_ID
           WHERE bb.g_ID = 'wes') n
       ON a.year = n.year
WHERE ab.g_ID = 'fn' AND n.year IS NULL
+-----------------+
| exclusive_years |
+-----------------+
|            1995 |
|            1993 |
|            1948 |
|            1943 |
|            1954 |
|            1957 |
|            1950 |
+-----------------+
There are two approaches for this problem:
Oracle and PostgreSQL both support the RETURNING INTO clause to get hold of the last used ID value. Actually this syntax can also serve other purposes, not only auto increment values.

INSERT INTO persons (firstname,lastname) VALUES ('Peter','Smith')
RETURNING the_id INTO var_id;

MS-SQL uses a different syntax for the same concept:
INSERT INTO persons (firstname,lastname) OUTPUT inserted.the_id VALUES ('Peter','Smith')
The MS-SQL command creates a result set. This means that in the host language you have to treat this insert just like you would treat a select statement. For Oracle and PostgreSQL you have to prepare an output variable that you bind to a specific host variable (or, of course, in PL/SQL and PL/pgSQL a variable declaration in the surrounding block is enough).
For all database engines the listed functions have the scope of the active session. A parallel insert from a second session won't modify the returned value, so they are safe, except if the insert fires one or more triggers that also use auto increment values. Of course the additional inserts made by the triggers happen in the same session (they are even in the same transaction) and will affect the returned value of these functions.
- MySQL has the LAST_INSERT_ID() function for handling this in the database and an analogous function mysql_insert_id() if you need to retrieve this value from a host language like C, Java or Perl.
- MS-SQL's SCOPE_IDENTITY() function ignores the modification by triggers, in contrast to the older method with the server variable @@IDENTITY. And surprise, MS-SQL returns the ID of the last inserted row in case of a multiple row insert.
- PostgreSQL has LASTVAL() for the same purpose.
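In their simplest form the three functions are used right after the insert, in the same session (a sketch, reusing the persons table from above):

```sql
-- MySQL
INSERT INTO persons (firstname, lastname) VALUES ('Peter', 'Smith');
SELECT LAST_INSERT_ID();

-- MS-SQL
INSERT INTO persons (firstname, lastname) VALUES ('Peter', 'Smith');
SELECT SCOPE_IDENTITY();

-- PostgreSQL
INSERT INTO persons (firstname, lastname) VALUES ('Peter', 'Smith');
SELECT LASTVAL();
```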
The MS-SQL method BULK INSERT is located somewhere between Oracle's external tables and the MySQL approach LOAD DATA [LOCAL] INFILE. Both BULK INSERT and external tables need the data files to be located on the server's file system. MySQL is not limited to this if you use the optional keyword LOCAL. But the syntax and concept of BULK INSERT are very similar to LOAD DATA INFILE, as both need an existing "real" table into which the data will be inserted.
But even if you can do it with MySQL, using a file on the client's file system might be against the spirit of these bulk loading tools. It obviously means that the data has to travel over the network in order to get into the database, and this includes quite a lot of network protocol overhead.
BULK INSERT an_existing_table
FROM 'C:\path\on\the\server\file.csv'
WITH (
    DATAFILETYPE = 'char',  -- single byte encoding
    CODEPAGE = 'ACP',       -- Ansi Code Page (windows-1252)
    FIELDTERMINATOR = ';',
    ROWTERMINATOR = '\n'
)
LOAD DATA LOCAL INFILE '/home/rolf/data/voicemail-calls.csv'
INTO TABLE voicemail
FIELDS TERMINATED BY ';'
(id, calling_no, called_no, calling_time, week_day, isdn_cause, proto_err);
BULK INSERT has a cousin, OPENROWSET .. BULK, that works more like the external tables of Oracle. It reads the flat data file every time it is accessed and keeps the data in memory.
SELECT m.empid, m.firstname, m.lastname
FROM OPENROWSET(
    BULK 'C:\path\on\the\server\file.csv',
    FORMATFILE = 'C:\path\on\the\server\employees.fmt') AS m;
Both Oracle and MS-SQL have a TRUNCATE TABLE statement, but there is a crucial difference between them. In Oracle it is a DDL statement and requires the DROP TABLE privilege; like all DDL statements it executes an implicit COMMIT and breaks any transaction border of the surrounding statements.
The MS-SQL server treats a TRUNCATE TABLE statement as a super DELETE that can't have a filter criterion and collects no row-based transaction information. But if the TRUNCATE is inside a transaction there is still the chance to revert the effects of the TRUNCATE later on with a ROLLBACK.
Oracle's approach is more consistent, although many users won't like it because they learned it the hard way and lost their data when using TRUNCATE in an Oracle database. The keyword TABLE should alert you that you are not performing normal data manipulation but are executing a data definition command.
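The difference can be demonstrated with a few lines (a sketch, assuming a throwaway table test_data; in MS-SQL the rows reappear, while in Oracle the implicit COMMIT makes a rollback impossible):

```sql
-- MS-SQL: TRUNCATE participates in the transaction
BEGIN TRANSACTION
TRUNCATE TABLE test_data
ROLLBACK TRANSACTION      -- the rows of test_data are back

-- Oracle: the DDL statement commits implicitly
TRUNCATE TABLE test_data;
ROLLBACK;                 -- too late, the rows are gone
```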
update test_data set label=description, description = label;
The above statement should simply exchange the data in label with the values in description. This works in PostgreSQL, MS-SQL and Oracle, but it doesn't work in MySQL. When MySQL evaluates description = label it doesn't use the original input value of label but the one after processing the assignment of description to label. So in the end both columns label and description contain the value from the original description column.
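A workaround sometimes used in MySQL parks the original value in a user variable before the SET assignments are processed. This is only a sketch: it relies on the WHERE part being evaluated per row before the assignments, and the filter also skips rows whose label is NULL.

```sql
-- @tmp receives the original label in the WHERE part,
-- before MySQL processes the SET assignments for that row
UPDATE test_data
   SET label = description,
       description = @tmp
 WHERE (@tmp := label) IS NOT NULL;
```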
Luckily, most of the time you need min, max or sum as aggregate functions, and they all behave as expected. But if you use avg in MS-SQL you will look at the data and think that the values look strange and wrong.
SELECT avg(n) AS average
FROM (VALUES (0),(1),(1),(1)) AS num(n)

average
-----------
0
On second sight you will realize that the SQL Server works with integer values and even applies integer arithmetic when dividing the sum of all values by the number of rows. So the correct way to build the average requires a type cast to float or decimal first:
SELECT avg(cast(n AS float)) AS average
FROM (VALUES (0),(1),(1),(1)) AS num(n)

average
------------------------
0.75