PostgreSQL Tutorial – Creating Your First Database

Once you’ve installed PostgreSQL (OS X Instructions) (other platforms), you’ll probably want to get started with a new database. Unfortunately PostgreSQL does not include a sample database by default. You can download the pagila sample database from the pgFoundry (direct download link). Pagila is a simple movie rental database, sort of like what you would get if you ran a video rental store. It is also a port of a sample MySQL database with changes made to take advantage of PostgreSQL’s features – you can read about the differences in the README file. Installing pagila is easy:``` createdb -U jeremiah pagila psql -U jeremiah -f pagila-schema.sql pagila

There will be a great deal of text here about creating tables and other database

objects. You can read through the create file if you’d like, but we’re going to be

taking a look around so you can hold off for a minute

psql -U jeremiah -f pagila-insert-data.sql pagila

You should see a lot of output from psql again talking about inserting rows

and altering seqvals. Once again, we’ll be looking at this data over


### createdb

**createdb** is a command line tool that is installed with your PostgreSQL installation. This is one way to create a new database using PostgreSQL. If you supply the -e command line option (`createdb -U jeremiah -e pagila`), you can see the commands that are actually being run against PostgreSQL. In this case it’s as simple as `CREATE DATABASE pagila;`. We need to tell PostgreSQL which user we’re using to connect because only certain users have the appropriate rights to create new databases. This isn’t something that just anyone can do, after all.

### psql

**psql** is a command line tool that is used to connect to your PostgreSQL databases. **psql** has a number of options, but I’m only going to cover the options that we’re directly using to set things up for the pagila database.

#### \-U

The `-U` option is used to specify the user name to use to connect to the database instead of the default user. Unless you’ve changed something during installation, the default user is the `postgres` admin account.

#### \-f \[filename\]

**psql** can operate in two modes: an interactive mode and a file input mode. Passing the `-f` option to **psql** will use the commands in the file _filename_. You could use the input redirection operator (<), but using `-f` has the advantage of processing the file as a single batch and subsequently providing helpful line numbers in the event of an error.

#### The database

The last parameter we’re passing is the name of the database. If you want to connect to a database on a different host, you would use the `-h` or `--hostname` command line flag to change the database host.

### Related Reading

[createdb syntax](http://www.postgresql.org/docs/8.4/static/app-createdb.html) [CREATE DATABASE](http://www.postgresql.org/docs/8.4/static/sql-createdatabase.html) [psql](http://www.postgresql.org/docs/8.4/static/app-psql.html)

### Some Minutiae About PostgreSQL Databases

PostgreSQL, like many other databases, has a vast set of features. Two of these features, collations and tablespaces, aren’t immediately important for understanding what we’re doing here, but it’s a good thing to know about them for when you get into more advanced scenarios.

#### Collations and Locales

When you create a new database you can also specify a collation for the database. Collations are used to determine how text is stored and sorted within PostgreSQL. You can set the collation when PostgreSQL is first configured, during database creation, or when connecting to the database. There are a few gotchas that aren’t important here, but you can read about them in the PostgreSQL documentation on [Local Support](http://www.postgresql.org/docs/8.4/interactive/locale.html).

#### Tablespaces

You can also specify the default tablespace when you create a new database. A tablespace is the place where database objects are stored on disk. If you’re familiar with SQL Server, you will know this as a filegroup. This is a more advanced feature and can be used to achieve incredible performance benefits by spreading physical database objects across multiple physical disks. Like collation, we’re not going to worry about tablespaces for now, but it is important to know that the feature is available.