Your Ad Here
Basic select queries
query: retrieve/extract data
but sometimes more generally: any operation, e.g. insert/update/delete
"queries" (which often involve an implicit or explicit selection)

result of querying a table is a table.
A table can be queried to produce a table that can be queried to
produce a table that can be queried...
The resultset table can be stored back in the DB as a table.


**********************
Basic SELECT statement:

SELECT values,to,display,columns,constants,expressions
 FROM source_tables
 WHERE row_specifying_expression
 GROUP BY how_to_group
  HAVING expresion
 ORDER BY how_to_sort
 LIMIT row_count;

--all clauses are optional
--if a clause is used, it must be in this order

SELECT * FROM t                --all columns
SELECT f1,f2 ...               --fields f1 and f2 only of selected rows
SELECT DISTINCT f1...          --unique f1 values

--Limit number of rows returned
SELECT ... LIMIT 5;
           LIMIT skip,count       #skip

--sorted order
SELECT ... ORDER BY f1
SELECT ... ORDER BY f1 DESC    --descending order
SELECT ... ORDER BY f1,f2      --primary and secondary sorts
  cannot Order By Count(*)   [use Alias]

--WHERE clause filters rows
SELECT valuesToDisplay FROM t WHERE f1 relOp value [AND|OR...]]
  relOp: < <= > >=  = (single =) !=  (<> too)    
              
--pattern matching:
  WHERE f1 [NOT] LIKE [BINARY] 'string'        
  case-insensitive unless BINARY
  'string' contains pattern matching/wildcard operators:
                   % 0 or more chars,  
                   _ single char:
                       "j%"	#starts with j
		       "%j%"    #contains j   
                       "j____"  #5 chars, starts with j
  Like can be used with non-strings:  date1 LIKE '19%'  #dates starting 19....

--testing for Null.  = and != cannot be used
  WHERE f1 IS [NOT] NULL

--range test
  WHERE f1 [NOT] BETWEEN x AND y         equivalent to: f1>=x AND f1<=y

--set membership
  WHERE f1 [NOT] IN (val1,val2,...)      equivalent to: f1=val1 OR f1=val2 OR...


--calculated/derived values/columns 
SELECT f1*f2+sqrt(f3),f4-f5  FROM t....

--alternate/alias column name
SELECT f1 AS newalias, f1*f2 AS meaningfulname FROM t 
    [ORDER BY meaningfulname]
  alias can not be used in WHERE clause
  AS is optional

--counts:
SELECT COUNT(*) ...          --number of rows
SELECT COUNT(DISTINCT f1)... --number of distinct f1 values

--aggregate/column functions: summarize an entire column
SELECT SUM(f1)...
       AVG(f1)...
       MIN(f1)...
       MAX(f1)...
 can be derived column/expression:  SUM(100*f1/f2)
 Nulls ignored.
 cannot be in Where clause

--grouping/subtotals.  summary per group of rows.
SELECT f1,aggregateFunction(f2) FROM t GROUP BY f1
  output is sorted by the grouping column(s)

--cannot mix column and aggregate function in select list without Group By:
  //syntax error:  select f1,aggFunc(f2) from t;    

--frequencies of each different f1 value
SELECT f1,COUNT(*) FROM t GROUP BY f1
--frequency of each f1,f2 combo
SELECT f1,f2,COUNT(*) FROM t GROUP BY f1,f2    

--HAVING clause modifies Group By clause (further constrains the groups)
SELECT f1,aggfunc(f2) FROM t GROUP BY f1 HAVING aggfunc(f2) relOp value
  can be different aggregate functions.
find duplicate fi's:
SELECT ... FROM t GROUP BY fi HAVING COUNT(*)>1

HAVING and WHERE sometimes interchangable in GROUP BY if condition not
involve aggregate function:
SELECT ... FROM t GROUP BY fi HAVING condition
SELECT ... FROM t WHERE condition GROUP BY fi



Output to text file on server:
SELECT selectList INTO OUTFILE 'outfilename' FROM ...
  must have FILE privilege.
  file must not already exist.
  same default format and options as Load Data Local Infile.
  file is world-writable, owned by mysql.  
  Useful for reloading with LOAD DATA INFILE...  (i.e. not LOCAL)
  use: mysql -e "select ....." >outfilename   to output to file on client

***********************************************************************
Subqueries/nested queries.
is why SQL is "structured" query language.
a Select within another statement (select, insert, update, delete).

SELECT .. FROM .. WHERE f1 relOp (SELECT f2 FROM .. [WHERE..][GROUP BY..][HAVING..])
  parens required around inner Select, which runs before outer Select.

  Usually, subquery produces one column, thus it has only one Select
  item (f2), which is often an aggregate function:
  SELECT .. FROM .. WHERE f1 relOp (SELECT aggFunc(f2) FROM...)
select * from pet where dob=(select max(dob) from pet);
select * from pet where dob=(select max(dob) from pet where species='cat');
select * from pet where dob>(select avg(dob) from pet);
select * from pet where weight>.25*(select sum(weight) from pet);
   #pets that are > 25% of total pets weight
select * from pet as t where 2=(select count(*) from pet 
  where pet.species=t.species);    #species that occur exactly twice

  Cannot have Order By in subquery (would be useless anyway, since result is not seen).

  subquery can return scalar, one column, one row, or table.

  subquery could be another table:
SELECT .. FROM t1 WHERE fi relOp (SELECT .. FROM t2 ..)
 select * from pet where birth<(select min(date) from events);

  subquery can be in select clause:
SELECT fi,fj*(SELECT Avg(fk) FROM t) FROM t

  subquery gets the rows from another table to insert into a table:
INSERT INTO t SELECT ... FROM t2...;   #no parens
 insert into t select name,weight from pet;  #

  subquery used in Update:  must be from different table?
UPDATE t SET fi=expr WHERE fj relOp (SELECT .. FROM t2)  #
  update pet set weight=weight+100 where birth<(select avg(eventdate)
                                                from events)

  subquery used in Delete: must be from different table?
DELETE FROM t WHERE fi relOp (SELECT .. FROM t2)
  delete from pet where birth<(select avg(eventdate) from events)

***********************************************************************
Joining multiple tables
two (or more) tables are joined together, typically on common fields.
Form pairs of rows by matching the contents of related columns.
Joins exercise the data relationships between tables. Are the only way
to exercise cross-table relationships.

"inner join": 
  "equi-join" is an inner join where the joining condition is =
  (i.e. exact match)
  "theta-join": joining condition is inequality <,> etc. Uncommon.

SELECT ... FROM t1,t2 WHERE t1.fi=t2.fj ...
The From clause lists the tables being joined.  The Where clause
 specifies the matching condition, and possibly further restrictions.
Alternate syntaxes:
SELECT ... FROM t1 INNER JOIN t2 ON fi=fj ...
SELECT ... FROM t1 INNER JOIN t2 USING(fi) ...  #if joining columns
                                                 have same name in both tables.
order of t1,t2 makes no difference.

Any pair of columns from the two tables can serve as matching columns,
provided they have comparable data types but most often the common
fields are the primary key of "parent" table and foreign key in the
"child" table that references that primary key.  One-to-many,
parent/child relationship.
SELECT ...FROM parent,child WHERE parent.pk=child.fk ...

[If the names are the same,?] this is sometimes called "natural join".??

Without a Where clause, Cartesian product is formed (all possible
combos of rows from each of the two tables). N*M where N and M are
number of rows of the two tables, respectively.  A join is the product
with the rows that do not meet the matching column condition removed.

Any columnname that's in both tables must be qualified by tablename to
disambiguate it:  t1.fi  if there's a fi in t2 too


Examples:
#combine the pet data into each events:
SELECT * FROM pet,events WHERE pet.name=events.name;
#sex of the pets that had litters duh
SELECT pet.name,sex,date FROM pet,events WHERE pet.name=events.name AND type='litter';
#owner's whose pets have had birthdays:
SELECT DISTINCT owner FROM pet,events WHERE pet.name=events.name AND type='birthday';
#Harold's pets' events:
SELECT pet.name,type,date FROM pet,events WHERE pet.name=events.name AND owner='harold';
#age of pets on visits:
SELECT pet.name,type,DATEDIFF(date,dob)/365 AS age FROM pet,events WHERE pet.name=events.name ;
#pets who visited when less than year old:
SELECT pet.name,type,dob,date FROM pet,events WHERE pet.name=events.name AND DATEDIFF(date,dob)<365;

#name of vendor who supplies product:
SELECT product.code,description,name FROM product,vendor WHERE product.v_code=vendor.code;
#products whose vendor's reorder is 'Y':
SELECT product.code, description,name FROM product,vendor WHERE reorder='y' AND product.v_code=vendor.code;
#products per state:
SELECT state,count(*) as `number of products` FROM vendor, product WHERE product.v_code=vendor.code GROUP BY state ORDER BY `number of products`;




"outer join": mismatches (rows in one table not matched by any row in
the other table) are included too.  superset of inner join.
SELECT ... FROM t1 LEFT JOIN t2 ON fi=fj...
  fi is column in t1, fj is column in t2
left table t1 is reference table; output is produced for each of its
rows, those with matching row(s) in the right table (the same as inner
join) and those with no matching row(s) in the right table t2 and in
which any selected columns of t2 will be displayed as NULL.

SELECT product.code,description,name FROM product LEFT JOIN vendor ON product.v_code=vendor.code;

Parent rows that are not matched by any child: i.e. childless parents.
SELECT parentFields FROM parent LEFT JOIN child ON pk=fk WHERE childField IS NULL;
(parent parent child child)

Useful to find referential integrity problems of imported data: left
table's foreign key does not match primary key of any row of right table.
i.e. orphan child.
SELECT childFields FROM child LEFT JOIN parent ON fk=pk WHERE parentField IS NULL;
(child child parent parent)

SELECT product.code,description,name FROM product LEFT JOIN vendor ON product.v_code=vendor.code WHERE name IS NULL;