Wednesday, February 07, 2007

Review topic: SQL, correlated subqueries, and more misc

If you wanted to know all of the customers in an invoice table that haven't been coded in a master customer table, you could code a correlated subquery. DB2, for example, will run the inner subquery for every row of the outer query, which is often inefficient. To find a "non match" you would code a WHERE NOT EXISTS before the subquery, which still has to be run for every row. A more efficient way to code is simply to use the "NOT IN" option and code the second query.

A synonym for the INNER JOIN is "equi-join" (or equijoin).

A UNION will eliminate duplicates but otherwise display all output "rolled up" and sometimes column names will have misleading meaning.

The SELECT DISTINCT ignores NULL, but the COUNT(*) includes NULL.

To define an index, the first part of the composite key must be used. An index cannot be created on a VIEW.

ANSI SQL always recongizes aggregate functions (SUM, AVG) but does not necessarily offer triggers, checkpoints. b-trees, or even stored procedures. But stored procedures are typically very important in many large DB2 shops.

When a NULL is concatenated in any string, you get a NULL result. This may be important in shops that do batch cycle database replication from a mainframe legacy (DB2) to a midtier in another DBMS.

ANSI SQL recognizes CASE structures (as in most languages) and even remainder on division.

To get rid of an entire table, you use the DROP command (not DELETE) and have to be careful about foreign key dependencies (INT REFERENCES) when the table was created.

View names must be unique and may not specify orders, because of the way they are executed. Views created by SELECT DISTINCT can't be updated.

Back in the early 1990s, RBsase (from Microrim) was the first relational dabatase to offer SQL on a home PC (it as a specific menu item). Around 1990 or so, Ashton-Tate created dBase4, where you had to go into SQL mode to use SQL (otherwise it was like dBase III+). RBase made a lot of creating views.

When Microsoft offers testdrive versions of Visual Studio ADO .NET, it supplies a script to create all the tables in SQL Server, with many CREATE TABLE and INSERT statements to create the test databases for the classwork exercises for its Visual C# and Visual Basic, etc. books. The user could conceivably pattern his own database construction after these.

A table can have only one clustered index, but may have other non-clustered indices.

Here is a reference for normal forms. The first normal form would maintain that a column must be an elementary item and not itself a list. The second normal form requires that a item depend on only one key. The third normal form is the "if and only if" rule and removes any "transitive dependency". That is: Data should depend on the key, the whole key, and nothing but the key!

No comments: