|      
This short course consists of three parts, in the first part we explain a few
generalities about SQL using a public domain database server called postgreSQL.
In the second part we will study in more detail the SQL commands. Finally the third
part will show us some of the advanced options of the SQL commands, functions
specific of PostgreSQL that might be of interest for our projects and we will
finally review a small C program putting it all together. 
 IntroductionDuring this quick introduction I will only refer to databases (DB). There are other organisations
of data, but writing about them would be outside the scope of our course.
Until recent, the access to data items was accomplished through entities that
were inter-related by well-defined links of the database scheme. This type of access
had advantages, mainly speed of access, but a big disadvantage: we could only
access data through an existing link, for example: 
     country -> states -> counties
but never : 
     country -> counties
where "->" is the link. 
If we wish to establish that second relationship we would have to redefine the scheme
and compiled it again....
 
In fact, in a hierarchical DB, the relationship among the various entities is static and can only be
modified after alteration of the DB scheme and recompilation of the latter.
 
The basic idea behind relation databases is precisely to link data
during the query instance, without the need for a static link, but instead
using an identifier that permits to link one register with another.
 
What I just wrote probably needs an Aspirin :)
 
Relational database managers do not required static links to allow us go down the 
hierarchy of entities, instead they use a unique code that identify these  entities while establishing
a temporary relation as a result to a query.
 
The identification is nothing but a code. Ex: My phone number is not:
 
1234567
 
but :
 
34 6 1234567
 
Clearly my phone number is identified by the country code (34), state code (6)
and the proper device number  (1234567).
 
In the set of countries, the code 34 (Spain) is unique.
In the set of states, the code 34-6 (Spain/Valencia) is unique.
In the set of devices, the code 34-6-1234567 (Spain/Valencia/my phone number) is unique.
 
Let me set the foundations for the first example that illustrates what I just said.
 
     All counties have a code, belong to a state and a country.All states have a code and belong to a country.
 All countries have a code.
 
 
To find  all the counties in a state we relate the county with the state through the
country and county codes; to find all the counties in a country we relate the county
to the country by the country code. These relationships are temporary and only exist
during the time of my query.
 
This is a bit arid and hard to understand but with the first few examples I hope to make 
clearer the concept of code and belonging.
 
When I send the first query to the DB manager it returns me all the related data items.
But what data am I really receiving? The union of country and county items, for every county 
I am going to receive the related country items.
 
During my first query a new nameless entity is suddenly created, it contains a replica
of countries and counties. This new entity, once again, disappears at the end of my query.
 
Before we used to name a set of data a "file". This are made of registers and
each "register" has a "field". Well, in a relational database, a "file" is known as
a table, a table contains rows and each row has columns, this is just a small
cosmetic change. ;-)
 
It is good to mention at this point that some hierarchical DB managers introduce SQL as
an access language, but this is only anecdote. SQL language is almost exclusively
a property of relational managers.
 
To illustrate the use of SQL we will use the relational managers PostgreSQL.
Although it is not fully compliant with the rules of SQL, it is sufficiently
close for our purposes, it is also a very good manager for more heavy duty
tasks.
 
Let me explain only briefly the installation process, given the goal of this article is SQL.
First download the sources from
www.postgresql.org, as well as any patches available.
Extract the sources (tar   zxvf)  to a directory, cd postgresql-6.3
 
cd src
./configure --prefix=/the/desired/path
make all >& make.log &
tail -f make.log
export PATH=$PATH:/the/desired/path/pgsql/bin
export MANPATH=$MANPATH:/the/desired/path/pgsql/man
export PGLIB=/the/desired/path/pgsql/lib
export PGDATA=/the/desired/path/pgsql/data
initdb
createdb test
psql test
Welcome to the POSTGRESQL interactive sql monitor:
  Please read the file COPYRIGHT for copyright terms of POSTGRESQL
   type \? for help on slash commands
   type \q to quit
   type \g or terminate with semicolon to execute query
 You are currently connected to the database: postgres
test=>
This is the prompt for postgres, now we can start executing commands. 
mytest=>create table mytest (field1 varchar(10));
CREATE
mytest=>insert into mytest values ('hello');
INSERT number 1
mytest=>commit work;
NOTICE:EndTransactionBlock and not inprogress/abort state
END
mytest=>select * from mytest;
field1
------
hello
(1 row)
mytest=>drop table mytest;
DROP
mytest=>Ctrl-d
We have already exit the SQL console.
In case of having trouble compiling and installing Postgres95 correctly
please refer to the INSTALL file in the root directory for the distribution.
 
Let me make another side comment; a relational database server is generally
made of the following parts:
 
As clients we connect to the 4th layer, we send SQL commands which are then
passed to the parser layer. This translates the commands and, in the absence
of errors, sends the requests to the second layer. All the processing
and querying operations take place at this level in collaboration with the 
data access layer obviously: data is collected, errors are transmitted to
the client via the communications layer. The SQL processing layer establishes a
dialogue  with the client while managing the correct transfer of data and 
controlling the transactions and interrupts. Data access layer  SQL processing layer  SQL parser layer  Communications layer  Primer PasoNext I will illustrate with an example what has been described so far, let us
build three tables (or files):
File: countries.sql
create table countries (cod_country integer, name varchar(30));
insert into countries values (1, 'country 1');
insert into countries values (2, 'country 2');
insert into countries values (3, 'country 3');
commit work;
 
File: states.sql
create table states (cod_State int, 
			cod_country int, 
			nam_State varchar(30));
insert into states values (1, 1, 'State 1, Country 1');
insert into states values (2, 1, 'State 2, Country 1');
insert into states values (1, 2, 'State 1, Country 2');
insert into states values (2, 2, 'State 2, Country 2');
insert into states values (1, 3, 'State 1, Country 3');
insert into states values (2, 3, 'State 2, Country 3');
commit work;
 
File: counties.sql
create table counties (cod_country int, 
			cod_state int, 
			cod_county int,
			nam_county varchar(60));
insert into counties values (1, 1, 1, 'County 1, State 1, Country 1');
insert into counties values (2, 1, 1, 'County 2, State 1, Country 1');
insert into counties values (3, 1, 1, 'County 3, State 1, Country 1');
insert into counties values (1, 2, 1, 'County 1, State 2, Country 1');
insert into counties values (2, 2, 1, 'County 2, State 2, Country 1');
insert into counties values (3, 2, 1, 'County 3, State 2, Country 1');
insert into counties values (1, 3, 1, 'County 1, State 3, Country 1');
insert into counties values (2, 3, 1, 'County 2, State 3, Country 1');
insert into counties values (3, 3, 1, 'County 3, State 3, Country 1');
insert into counties values (1, 1, 2, 'County 1, State 1, Country 2');
insert into counties values (2, 1, 2, 'County 2, State 1, Country 2');
insert into counties values (3, 1, 2, 'County 3, State 1, Country 2');
insert into counties values (1, 2, 2, 'County 1, State 2, Country 2');
insert into counties values (2, 2, 2, 'County 2, State 2, Country 2');
insert into counties values (3, 2, 2, 'County 3, State 2, Country 2');
insert into counties values (1, 3, 2, 'County 1, State 3, Country 2');
insert into counties values (2, 3, 2, 'County 2, State 3, Country 2');
insert into counties values (3, 3, 2, 'County 3, State 3, Country 2');
insert into counties values (1, 1, 3, 'County 1, State 1, Country 3');
insert into counties values (2, 1, 3, 'County 2, State 1, Country 3');
insert into counties values (3, 1, 3, 'County 3, State 1, Country 3');
insert into counties values (1, 2, 3, 'County 1, State 2, Country 3');
insert into counties values (2, 2, 3, 'County 2, State 2, Country 3');
insert into counties values (3, 2, 3, 'County 3, State 2, Country 3');
insert into counties values (1, 3, 3, 'County 1, State 3, Country 3');
insert into counties values (2, 3, 3, 'County 2, State 3, Country 3');
insert into counties values (3, 3, 3, 'County 3, State 3, Country 3');
commit work;
 
A file with SQL commands can be executed from pgsql like this:
 
 
\i  file_name
 
We could also insert the commands using a simple cut & paste.
 
Let us see next what counties are available:
 
manu=> select * from counties;
cod_country|cod_state|cod_county|nam_county
-----------+---------+----------+----------------------------
          1|        1|         1|County 1, State 1, Country 1
          2|        1|         1|County 2, State 1, Country 1
          3|        1|         1|County 3, State 1, Country 1
          1|        2|         1|County 1, State 2, Country 1
          2|        2|         1|County 2, State 2, Country 1
          3|        2|         1|County 3, State 2, Country 1
          1|        3|         1|County 1, State 3, Country 1
          2|        3|         1|County 2, State 3, Country 1
          3|        3|         1|County 3, State 3, Country 1
          1|        1|         2|County 1, State 1, Country 2
          2|        1|         2|County 2, State 1, Country 2
          3|        1|         2|County 3, State 1, Country 2
          1|        2|         2|County 1, State 2, Country 2
          2|        2|         2|County 2, State 2, Country 2
          3|        2|         2|County 3, State 2, Country 2
          1|        3|         2|County 1, State 3, Country 2
          2|        3|         2|County 2, State 3, Country 2
          3|        3|         2|County 3, State 3, Country 2
          1|        1|         3|County 1, State 1, Country 3
          2|        1|         3|County 2, State 1, Country 3
          3|        1|         3|County 3, State 1, Country 3
          1|        2|         3|County 1, State 2, Country 3
          2|        2|         3|County 2, State 2, Country 3
          3|        2|         3|County 3, State 2, Country 3
          1|        3|         3|County 1, State 3, Country 3
          2|        3|         3|County 2, State 3, Country 3
          3|        3|         3|County 3, State 3, Country 3
(27 rows)
manu=>
There are 27 rows and pgsql is now waiting the next command,
try this one:
manu=> select * from countries, states;
cod_country|name     |cod_state|cod_country|nam_state
-----------+---------+---------+-----------+------------------
          1|country 1|        1|          1|State 1, Country 1
          2|country 2|        1|          1|State 1, Country 1
          3|country 3|        1|          1|State 1, Country 1
          1|country 1|        2|          1|State 2, Country 1
          2|country 2|        2|          1|State 2, Country 1
          3|country 3|        2|          1|State 2, Country 1
          1|country 1|        1|          2|State 1, Country 2
          2|country 2|        1|          2|State 1, Country 2
          3|country 3|        1|          2|State 1, Country 2
          1|country 1|        2|          2|State 2, Country 2
          2|country 2|        2|          2|State 2, Country 2
          3|country 3|        2|          2|State 2, Country 2
          1|country 1|        1|          3|State 1, Country 3
          2|country 2|        1|          3|State 1, Country 3
          3|country 3|        1|          3|State 1, Country 3
          1|country 1|        2|          3|State 2, Country 3
          2|country 2|        2|          3|State 2, Country 3
          3|country 3|        2|          3|State 2, Country 3
(18 rows)
18  rows ???  We inserted 3 countries and 6 states, all identify a single country.
How is it possible we get 18 rows?
The last command has performed a union of  two tables, we have related the
table of countries with the table of counties, since we have not specify any
union exclusion rule, pgsql returned ALL possible rows of countries
related with  ALL rows of states, i.e. 3 for countries times 6 for states
for  a total of 18. This result is obviously illogic and useless, we better could have done
the following:
 
manu=> select * from countries, states
manu-> where countries.cod_country = states.cod_country;
cod_country|name     |cod_state|cod_country|nam_state
-----------+---------+---------+-----------+------------------
          1|country 1|        1|          1|State 1, Country 1
          1|country 1|        2|          1|State 2, Country 1
          2|country 2|        1|          2|State 1, Country 2
          2|country 2|        2|          2|State 2, Country 2
          3|country 3|        1|          3|State 1, Country 3
          3|country 3|        2|          3|State 2, Country 3
(6 rows)
Well, this begins to appear a bit more reasonable, Six rows, Correct?
Yes, there are six counties and each county is in a country. It is
reasonable to get a number of rows identical to the number of 
counties because country is a qualifier of counties. We just related
the table of countries with the table of counties via the country code.
Remember that countries have a code and counties have the code of
the country they belong to.
 
Why countries.cod_country = states.cod_country ?
 
The country code in the table of countries is  cod_country
and in the table of counties too, therefore:
 
cod_country = cod_country
is illogical, the interpreter will never now which of the two to use and it would return
us an error: 
select * from countries, states
		where cod_country = cod_country;
ERROR:  Column cod_country is ambiguous 
Next, we can use aliases for columns: 
manu=> select * from countries a, states b
manu-> where a.cod_country = b.cod_country;
cod_country|name     |cod_state|cod_country|nam_state
-----------+---------+---------+-----------+------------------
          1|country 1|        1|          1|State 1, Country 1
          1|country 1|        2|          1|State 2, Country 1
          2|country 2|        1|          2|State 1, Country 2
          2|country 2|        2|          2|State 2, Country 2
          3|country 3|        1|          3|State 1, Country 3
          3|country 3|        2|          3|State 2, Country 3
(6 rows)
What does the manager return?:  cod_country,  name,
cod_state, cod_country y nam_state.
Since we query  "select * from  countries, states",  where the 
* is a wild card that stands for EVERYTHING, we obtained 
the two columns for countries and the three for counties. Now we would like
to be more specific:
 
manu=> select a.cod_country, cod_state, name, nam_state
manu-> from countries a, states b
manu-> where a.cod_country = b.cod_country;
cod_country|cod_state|name     |nam_state
-----------+---------+---------+------------------
          1|        1|country 1|State 1, Country 1
          1|        2|country 1|State 2, Country 1
          2|        1|country 2|State 1, Country 2
          2|        2|country 2|State 2, Country 2
          3|        1|country 3|State 1, Country 3
          3|        2|country 3|State 2, Country 3
(6 rows)   
In the last command  we explicitly asked for the country
code, the state code and the name of the country and state.
Notice that some column names are qualified (a.cod_country)
while others are not (nam_state), this is because cod_country 
is repeated in both tables while nam_state exists only in states.
Unique column names do not need extra qualifiers.
Let us make things more complicated:
 
manu=> select a.cod_country, cod_state, name, nam_state
manu-> from countries a, states b
manu-> where a.cod_country = b.cod_country
manu-> and a.cod_country = 3;
cod_country|cod_state|name     |nam_state
-----------+---------+---------+------------------
          3|        1|country 3|State 1, Country 3
          3|        2|country 3|State 2, Country 3
(2 rows)
This time we limited the search to country number 3 only.FunctionsHere is an example of using the row counting function count():
select count(*) from states;
count
-----
   27
(1 row)
It returns the number of rows contained in the table of counties,
next: 
manu=> select cod_country, count(*) from states
manu-> group by cod_country;
cod_country|count
-----------+-----
          1|    2
          2|    2
          3|    2
(3 rows)
It returns the number of rows with IDENTICAL country code,
this is the reason for using the cod_country.
An even better example:
 
manu=> select name, count(*) from countries a, states b
manu-> where a.cod_country = b.cod_country
manu-> group by name;
name     |count
---------+-----
country 1|    2
country 2|    2
country 3|    2
(3 rows)
We still obtained the same three rows but this time the returned information 
is more clear. 
Well until now we have only given an introduction, just warming up :-)
      Concepts ReviewSo far we have seen a few very basic  concepts of SQL. The most relevant thing here
is the concept itself of SQL. We do not work with concrete data anymore but with
data entities. A data entity is an abstract concept of databases. Simplifying it could be
understood as "ONLY RETURN PART OF ALL WHAT IS AVAILABLE"
We have seen several commands:
 
 
| CREATE TABLE | This command  creates a
table with its columns. |  
| DROP TABLE | Erases a table. |  
| SELECT | This command is the foundation of SQL, allows us to create a temporal
table containing the necessary data items only. SELECT can take as
parameters functions or complex statements, as well as  sub_selects: 
select count(*) from states
	where cod_country in (select cod_country from countries);
count
-----
   27
(1 row)
 |  
| COMMIT  WORK | This is another fundamental command.
It tells the DB manager to commit ALL the modifications given since BEGIN WORK.
In our particular DB manager a BEGIN WORK marks the initialisation of a transaction,
in other managers the beginning  of a transaction is mark by the first command that
alters something in the database. In postgreSQL all commands that alter data will 
operate directly unless there was a previous BEGIN WORK. 
NOTE: commands that modify the scheme of the database execute a COMMIT WORK,
therefore  if a transaction is opened and any such commands is executed our transaction 
will be closed immediately and it will be impossible to launch a  ROLLBACK WORK.
 
While a user has an open transaction he can declare the access type to his data
by other users:
 Modified data
 Original Data previous to the transaction
 Block data access
 |  
| COMMIT WORK | Closes a transaction leaving the committing the
modifications introduced. The command ROLLBACK WORK returns the data to their state
previous the current transaction. |  
 The concept of transaction is very important given that it let us 
return to the previous state in case of an error. Let us try this operation,
first a "rollback work" to close any previous transactions:
 
 
manu=> select * from countries;
cod_country|name
-----------+---------
          1|country 1
          2|country 2
          3|country 3
(3 rows)
There are three rows,
 
begin work;
Begins a transaction 
insert into countries values (5, 'Country Not True');
We inserted a row, let us next verify that all the rows are there 
 
manu=> select * from countries;
cod_country|name
-----------+----------------
          1|country 1
          2|country 2
          3|country 3
          5|Country Not True
(4 rows)
All the rows are there. Next
 
rollback work;
this abandons the transaction. 
manu=> select * from countries;
cod_country|name
-----------+---------
          1|country 1
          2|country 2
          3|country 3
(3 rows)
After checking the number of rows we see it went back to the original 3 rows.
 
 
| INSERT | We have seen this command already. It inserts data in a table. |  
| CREATE TABLE | Another important command, the creation of
a table and its columns, let us see now the type of data that can be handle: 
The definitions of the type of data are specific for every kind of SQL manager,
nevertheless there is an SQL standard (the last one is  ANSI/92 or SQL/3) 
that defines a few types with its characteristics. This course we will only see
few types specific of PostgreSQL.
| char(range): | Alfa-numeric data of fixed length 30 bytes. |  
| varchar(rango): | Alfa-numeric data of variable length up to 30 bytes. |  
| int2: | Numeric binary
data of 2 bytes:   2**-15 hasta 2**15 |  
| int4: | Numeric binary 
data of 4 bytes: 2**-31 - 2**31 |  
| money: | Fixed point data number, ex: money(6,3) is a number of six digits, 3 of which are decimal
(3 integers and 3 decimals). |  
| time: | Temporal
data containing hours, minutes, seconds and hundreds of second,
HH:MM:SS:CCC |  
| date: | Date
data containing yer, month and day,  AAAA/MM/DD |  
| timestamp: | Date and time data as AAAA/MM/DD:HH:MM:SS:CCC |  
| float(n): | Real data of single precision. |  
| float3: | Real data of double precision. |  
 |  
| DELETE | deletes rows of a table |  
| UPDATE | modifies the columns of a row in a table. |  SummaryDespite our somewhat erratic style, we have introduced SQL and install a relational
database.
SQL let us build an abstraction layer to our data and permits us manage them
according to our needs.
 
From what we have seen so far, one could ask:
How do I use SQL within an application?
 
The answer will come one step at a time, in our third article we will
review a short C application using SQL.
           |