Monday 10 August 2009

Other key non-database files

Although there are only three types of file which technically make up an Oracle database, there are other disk files floating around the place which are extremely important to the correct operation of an Oracle server. They are not considered to be technically part of the database, though, because if they ever get deleted or damaged, the database itself won't immediately notice and you won't be told about it. Compare that with losing your control file or a data file: the database will definitely report that as an error or may even crash as a consequence. In short, the key difference between a database file and a non-database one is whether or not the file is continually being written to and monitored. The three types of file I'm going to mention here are not continually monitored.

Archived Redo Logs

The online redo logs contain a record of every transaction that has modified data in the database. That 'stream of redo' is there so that if we ever need to re-perform those transactions (during a database recovery), we can. The trouble is that there is only a limited number of online redo log groups and LGWR eventually fills up the last group... and then switches back to the first and over-writes its contents. That means LGWR eventually destroys redo... redo which was supposed to be there to protect our database by making it possible to recover it!

Archived redo logs are the necessary answer to this paradox. Every time LGWR stops writing to one online log and starts writing to another, the ARCn background process makes a replica copy of the just-filled log in some new location on disk. LGWR can eventually do its worst in over-writing the redo in the online log, because we have a copy of its contents stored elsewhere. The stream of redo produced by the database is thus made continuous and permanent by the Archiver process -which means that archived redo logs are essential to truly being able to say, 'I can always recover my database'.

A database therefore operates with just a handful of online logs -but with an ever-increasing number of archives. You do not need to retain archives for ever, however: once they are a week or more old, they can usually be aged out to tape and deleted from the disk to which they were originally written. This means, of course, that whilst archives are great for protecting your database they impose a management and maintenance burden on the DBA, who has to make sure there is enough disk space for the next archive to be written (implying bulk deletion of old archives) but enough logs available on disk to make any conceivable recovery do-able as quickly as possible (implying retaining as many archives as feasible on disk). It's a management balancing act, in short.
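
To see whether a database is producing archives at all, and what it has produced recently, you can query the instance itself. A quick sketch (the view names are standard, but the output will obviously be different on every system):

select log_mode from v$database;
select name, completion_time from v$archived_log order by completion_time;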

Just as we care about redo enough to multiplex our online redo logs, it stands to reason that we should think about multiplexing our archived redo log copies, too -and, indeed, it is possible for Enterprise Edition users to create archives in up to ten different locations. Standard Edition users can only create duplexed copies (that is, archives can be written to only two different locations). It is not compulsory to multiplex your archives, but since they are the only thing (ultimately) standing between you and data loss, it would not usually be a sensible move to neglect to do so.
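
Multiplexing archives is just a matter of setting more than one of the LOG_ARCHIVE_DEST_n parameters. The directory paths below are purely illustrative:

alter system set log_archive_dest_1='LOCATION=/u01/arch' scope=both;
alter system set log_archive_dest_2='LOCATION=/u02/arch' scope=both;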

Password File

Logic dictates that the record of who is allowed to start up instances cannot be held inside the database -because to read data from inside a database you need an instance... but you can't create an instance until you've read the record of who is allowed to start up the instance, which you can't read until you've got an instance... and so ad infinitum!

To break this sort of Catch-22, the details of who is allowed to start an instance are held outside the database, in a small binary file called the password file. The password file permits external authentication of privileged users -where the term "privileged user" means 'someone who can start up and shut down an instance, back up and recover a database, and create a database in the first place'. These are the five basic privileged actions, and if your username and password are found inside the password file, you can perform any of them. If your username and password are instead only stored inside the database itself, then you are merely an "ordinary user" (however many rights and powers you might be granted) and you will never be able to issue the startup or shutdown commands.

Password files need to be created in a given location and with a specific name. The location is $ORACLE_HOME/dbs and the file name must be orapwXXX, where "XXX" is the name of your instance, given by the value assigned to the ORACLE_SID environment variable in force at the time. If you set ORACLE_SID to FRED, for example, then the startup command will create an instance called FRED, provided Oracle can find a password file called orapwFRED in the $ORACLE_HOME/dbs directory. (Note that Windows users have to store their password files in %ORACLE_HOME%\database and the file is called PWDXXX.ora). There is no leeway on the naming or location of the password file, though Unix/Linux users can always store it anywhere they like and call it anything they like, and then leave behind a symbolic link of the right name in the right place that points to where the file really is. Windows users don't have that sort of luxury, though.

Password files are created using a special Oracle-supplied utility called orapwd. You might issue this sort of command, for example:

C:\oracle\product\10.2.0\db_1\database> orapwd file=PWDtkit.ora password=tkit entries=25

C:\oracle\product\10.2.0\db_1\database> dir
Volume in drive C is W2KOS
Volume Serial Number is 24FE-B205

Directory of C:\oracle\product\10.2.0\db_1\database

07/10/2006 10:54a 61 inittkit.ora
17/10/2006 09:08a 4,608 PWDtkit.ora
2 File(s) 39,997 bytes
3 Dir(s) 6,502,473,728 bytes free

This command creates a file of the right name and sets the default privileged user's password to be "tkit". It also causes 24 other 'slots' to be created inside the new file so that details of other users who we later decide should become privileged users can be stored there. If you ever run out of slots and need to make one extra user a privileged user, you will have no choice but to delete the password file and re-create it with a larger value for the entries setting. Similarly, if you are ever unfortunate enough to lose your password file, you simply re-create it using this sort of command. It is clearly a major security problem if any old user can gain access to your file system and delete and re-create password files, so you should take ordinary operating system-related precautions to make sure that only authorised users can delete and create files in the ORACLE_HOME/dbs (or \database) directory. That is a system administrator's function, though: there's nothing the database itself can do to protect the password file.
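
Once the password file exists, you can see which users have actually been recorded in it (and are therefore privileged users) by querying V$PWFILE_USERS from inside a running instance. Initially, only SYS will be listed; other users appear as and when they are granted SYSDBA or SYSOPER:

select * from v$pwfile_users;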

Not everyone will use a password file, because there are alternative ways of performing external authentication of privileged users. It is possible, for example, to assign an operating system user to membership of a special operating system group (on Unix, usually it's called dba; on Windows, it's ORA_DBA). Logging on to the server itself then means you acquire O/S group membership, and Oracle can check for that and allow you to do privileged actions accordingly. But you can only exercise O/S group privileges if you log on to the Oracle server directly. The password file is the only way of authenticating privileged users remotely.
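
The difference shows up in the way you connect as a privileged user from SQL*Plus. The first connection below relies purely on operating system group membership, and only works if you're logged on to the server itself; the second prompts for the SYS password, is checked against the password file, and works across a network (the connect string 'mydb' is just a hypothetical net service name):

connect / as sysdba
connect sys@mydb as sysdba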

Parameter File

I discussed earlier how the SGA consists of a Shared Pool, a Buffer Cache and so on. But the obvious questions then arise: how big should the Shared Pool be? Should there be a Java Pool at all? Where should the database be storing the archived redo logs? These are important questions concerning database configuration and functionality, and the answers to them are recorded, as keyword-value pairs, in a special file called the parameter file. For example, you might set SHARED_POOL_SIZE=600M, and then we know exactly how big your Shared Pool is going to be. If you set DB_CACHE_SIZE=348M, we know your Buffer Cache is going to be 348MB in size, and if we see JAVA_POOL_SIZE=0, we know you don't want a Java Pool at all.

There are 257 different possible parameters that can be set in an Oracle 10g Release 2 database... but, fortunately, only about 20 or 30 of those possibilities commonly need to be set. If you don't set a parameter yourself, it usually has a vaguely-sensible default value. I would say that, to begin with, you absolutely must know (and be able to recite in 'parameter-speak'!) the following core parameters:

DB_BLOCK_SIZE

The size of the Oracle block used by default for every data file.

DB_NAME

The name of the database

SGA_TARGET

The target size of the SGA overall

SGA_MAX_SIZE

The absolute maximum size the SGA can grow to

SHARED_POOL_SIZE

A minimum size for the Shared Pool component of the SGA

DB_CACHE_SIZE

A minimum size for the Buffer Cache component of the SGA

LOG_BUFFER

The absolute size of the Log Buffer component of the SGA

UNDO_MANAGEMENT

Whether the database uses automatic undo management or not

UNDO_TABLESPACE

Where undo is written to if automatic undo management is switched on

UNDO_RETENTION

How long before undo can be over-written

DB_RECOVERY_FILE_DEST

The location of the Flash Recovery Area, where archived redo logs are written by default

DB_RECOVERY_FILE_DEST_SIZE

How much disk space the Flash Recovery Area can consume before declaring 'out of space'

Obviously, there are many other parameters to become familiar with over time, but these are absolutely core competency ones. A full list (together with their meaning and default values, if any) is available from the Oracle website. Note that new parameters get added to the list all the time as each new Oracle version (and its new functionality) gets released. Old parameters also disappear off the list as their functionality becomes deprecated or (finally) obsoleted. If you are trying to learn about what's truly new in a new Oracle version, it's usually a good idea to start by comparing lists of parameters from both versions and seeing what new arrivals there are, and what ones have disappeared.
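
By way of illustration only, here is the sort of thing a small init.ora might contain, using some of the core parameters just listed. Every value shown here is hypothetical; your own settings will certainly differ:

db_name=mydb
db_block_size=8192
sga_target=600M
sga_max_size=800M
undo_management=AUTO
undo_tablespace=UNDOTBS1
undo_retention=900
db_recovery_file_dest=/u01/flash_recovery_area
db_recovery_file_dest_size=10G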

Parameters can be stored in two totally different types of parameter file. You can either use a text file, editable with nothing more than notepad or vi, and if you do, we call that an init.ora, because the file must have a name of initXXX.ora, where "XXX" matches the value assigned to your ORACLE_SID environment variable. Additionally, the file must be stored in the ORACLE_HOME/dbs directory (ORACLE_HOME\database for Windows users). Alternatively, the parameters can be stored in a binary equivalent of the init.ora called the spfile.ora, because the file must have a name of spfileXXX.ora (the "XXX" again matches your ORACLE_SID). Like the init.ora, the spfile.ora must be located in the ORACLE_HOME/dbs directory. Because spfiles are binary files, stored in a proprietary Oracle format, only an instance can be used to edit them. You edit an spfile by connecting to a running instance and issuing commands such as:

alter system set undo_retention=1800 scope=spfile;

If you cannot get an instance running and yet still need to edit the spfile, you must convert the spfile to a pfile, edit the pfile with a standard text editor and then convert the text file back to the spfile, like so:

create pfile from spfile;

--edit the init.ora thus produced--

create spfile from pfile;

If both an init.ora and an spfile.ora exist in the ORACLE_HOME/dbs directory, the spfile.ora is used to configure the instance and the existence of the init.ora is simply ignored. For that reason, I generally make a point of always deleting one or other of the files, so that there's no ambiguity in my mind about which is the 'operative' file.
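
If you're ever unsure which type of parameter file an instance actually started with, a quick check from SQL*Plus settles it: if the spfile parameter comes back with a file name, an spfile is in use; if it comes back empty, the instance was started from an init.ora:

show parameter spfile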

Spfiles were invented in Oracle 9i, and before then the init.ora was the only parameter file that was ever used. A lot of 9i users stuck with the init.ora (me included!), out of habit and old familiarity, but these days you should generally switch to using an spfile and become familiar with how you edit it.

Summary on non-Database Files

Archived redo logs are absolutely critical to being able to recover your database; the password file is essential for anyone wanting to remotely administer a database; the parameter file (of whatever type) is crucial for correctly configuring the nature and capabilities of an instance and database -so none of the three files I've discussed here is in any way trivial or irrelevant! They are all immensely important files, but they are not technically part of the database, because if anything ever happened to any of them, you could simply ignore the loss or re-create the relevant file. You can't re-create archives, of course, but so long as you take a new backup, you shouldn't ever need to use or rely on a lost archive. If you lose an spfile, you can use vi to type up a replacement init.ora, and then create an spfile from that. If the password file is ever corrupted, you merely delete it and use orapwd to re-create it. In other words, loss of these files does not cause you to perform formal database recovery, which is definitely not the case if you lose your control file, one of your data files or your current redo log!

Sunday 9 August 2009

The Database

Whilst an Oracle instance is made up of many different and various memory areas together with countless different background processes, each doing a specific task, an Oracle database consists of just three types of physical disk file. They are all binary files (so don't go opening them in vi or notepad!) and have an internal structure that only an Oracle instance can make sense of. Taking each of the file types in turn, therefore, we have:

Control Files

Every database must have at least one control file. It is a small binary file (typically, no more than about 10MB in size) and has two main jobs. In the first place, it points to the physical location of all the other files that make up the database. That is, it physically stores the full path and file name of every other constituent file. Secondly, it stores the latest checkpoint change number, which is the timestamp written by the CKPT process at the end of each checkpoint event. This is the "CKPT timestamp" I mentioned earlier.

Other files in the database get timestamped by CKPT at every checkpoint -but because the control file knows how to find every other file in the database, it must be the one file which is read first whenever a database is opened. That means the timestamp stored in the control file is read first, and is assumed to be the timestamp to which all the other files in the database should agree. In a sense, therefore, you can say that the control file is the database's master clock.

In short, the control file tells us where everything else in the database can be found and how old it should be when we find it. If you lose your control file, therefore, you are in deep trouble: not only can your instance not find out where every other file is located, but it won't have any idea what checkpoint change number everything should be up to, and therefore SMON has no basis for working out whether the database is clean and consistent. For this reason, although you technically only need to have one control file, sensible people will multiplex their control files, so that the database sees two (or more) copies of it, and will therefore maintain both (or all) copies identically. At that point, if you lose one copy of the control file, there's another copy which you can use to recover with, and no major harm is done.

Note that multiplexing your control files is not the same thing as mirroring them. In a hardware mirror, the instance sees and maintains only one control file copy. Whatever is done to that one file, the hardware kicks in and replicates to a mirror copy. Multiplexing means telling the instance that there are two or more files to write to, and anything written by the instance to one copy is written afresh by the instance to the second copy. To really protect your control file, you really want to do both: multiplex two or three times, and then mirror: potentially, you'll have four or six copies of the control file, and if one copy is lost, there are three or five others you can rely on for recovery purposes.
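
Multiplexing control files is done by listing every copy in the CONTROL_FILES parameter, shutting the database down, physically copying the existing control file to the new location at the operating system level, and starting up again. A sketch, assuming the instance uses an spfile and with purely illustrative paths:

alter system set control_files='/u01/oradata/mydb/control01.ctl',
                               '/u02/oradata/mydb/control02.ctl' scope=spfile;
shutdown immediate
-- copy the existing control file to /u02/oradata/mydb/control02.ctl at the O/S level
startup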

Data Files

Data files are, funnily enough, where your data is stored, in tables, indexes, clusters and other more exotic kinds of storage structures. All Oracle databases start out with at least two data files: one called system and the other called sysaux (sysaux is new in 10g). Both these files store special system-related data, such as the data dictionary tables which tell us what users exist, what privileges they have, what tables have been created and what columns they contain. Although there's nothing stopping you from creating your own tables inside system or sysaux, it's a bad idea to do so: having good access to the data dictionary is extremely important, because it's needed every time a SQL statement gets parsed. Having your own tables being worked on inside the same file would cause I/O issues, and slow down parsing.

Therefore, although every database starts out with just two data files, you will soon begin to add more -and in fact you can add up to another 64,000 or so data files if you really needed to, each up to several Terabytes in size. The entire collection of data files in an Oracle database can therefore get significantly big!

Internally, data files are organised into discrete 'chunks' of data known as Oracle blocks. Blocks are usually 2K, 4K, 8K, 16K or (on some O/Ses only) 32K in size. Once you've specified a block size, that's it: you can't change your mind and ask for an 8K-block data file to re-organise itself so that its blocks are now 16K in size. Whatever block size you decide to use, that is then the smallest unit of I/O that Oracle will use when reading or writing to the data file: ask to see one column of one row in a table, and an entire 16K (or 8K etc.) block is read instead.
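
You can check which default block size a database is using easily enough, since it's just a parameter:

show parameter db_block_size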

The very first blocks of any data file are known as the data file header, and into the header of every data file, CKPT periodically writes the latest checkpoint change number. All data files in a healthy database must have the same checkpoint change number stored in their headers, and all must agree with whatever checkpoint change number CKPT also wrote into the control file. If these conditions are met, we say that the database is consistent. (Note that there are one or two exceptions to the 'every file must have the same checkpoint number' rule: files which are marked read-only, for example, are allowed to be different.)
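
You can watch this agreement for yourself. The first query below reads the checkpoint change number as recorded in the control file; the second reads it from each data file header. In a healthy, open database the numbers all match:

select checkpoint_change# from v$database;
select file#, checkpoint_change# from v$datafile_header;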

Although a data file doesn't care what data is stored inside itself, it is common practice to make sure that each data file only stores one 'type' of data. For example, you will often see a data file which stores only index data, and another one which stores only tables -just as the system data file is supposed to store only the data dictionary information. Such 'specialisation' in the type of data stored in a data file is a management convenience only, and it requires constant vigilance to make sure an index, for example, is not accidentally created inside the tables' usual data file (though no actual harm will arise if such a thing were to occur).

Online Redo Logs

There must be a minimum of two online redo logs, each of which is known as a redo log group. Each group is usually not that big (say, a couple of hundred megabytes), but that obviously depends on the specific requirements of each database installation.

Log Writer saves the contents of the Log Buffer to one of the log groups -which is therefore known as the current log group. When the current log is filled to capacity, Log Writer switches to writing to the other group. When that in turn is filled, Log Writer switches back to the first group and begins to over-write its contents. In this way, a couple of relatively small online logs are re-used in a cyclical fashion.
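
This cyclical re-use is easy to see by querying V$LOG: one group will be marked CURRENT (the one Log Writer is writing to right now), while the others show as ACTIVE or INACTIVE as they wait their turn to be over-written:

select group#, sequence#, status from v$log;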

The contents of the online logs are known generically as redo. Redo lets the database re-perform transactions should the database ever need recovering. Therefore, losing redo is not good -in fact, it's about as close to a disaster as it ever gets with Oracle. Therefore, just as with the control files, each online redo log group should be multiplexed so that Log Writer writes into multiple copies of the same group. Lose one of the copies (known as a redo log member), and the other member can still be used and no redo has been irretrievably lost. Multiplexing of redo log groups is not actually compulsory... but if you care about not losing any of your data, it ought to be considered so.
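
Adding a second member to an existing group is a one-line job; the file name and path here are illustrative only:

alter database add logfile member '/u02/oradata/mydb/redo01b.log' to group 1;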

Whereas data files are internally made up of Oracle blocks, redo logs are internally made up of file system blocks which are almost always 512 bytes in size. That size is a product of how a disk is formatted with a file system and is therefore not configurable from within Oracle at all. Some documentation available on the Internet refers occasionally to 'redo log blocks', but they're really just talking about 'file system blocks'.

Summary of the Oracle Database

Oracle databases can be huge: up to 8 Exabytes (that's eight million terabytes!) in fact. But however big they get, they are always only ever made up of three sorts of file: a 10MB-ish control file, two or more 100MB-ish online redo logs, and large numbers of extremely large data files. (The sizes mentioned here are just typical sorts of values. Specific requirements will mean each database is different, though).

Just as we can say "An instance = SGA + Background Processes", so we can therefore say:

Database = Control Files + Data Files + Online Redo Logs

Each file type has a specific job to do. Data files store data (yours and the database's internal housekeeping variety). Redo logs store redo, to use in case of recovery of transactions. Control files store the location of every other file and the master timestamp for the database as a whole.

All database files are forever being read from, and written to, by background and other processes, and losing any one of them is not an option. If you lose just one of these files, your database must be considered damaged; it will probably cease functioning and the instance managing it will probably crash; and you will have to perform some sort of recovery. We can use multiplexing techniques to minimise the damage done by the loss of one file; we can also use hardware mirroring to protect against hardware failure. A truly safe database will be using both techniques.

Saturday 8 August 2009

Summary of the Oracle Instance

When you run Oracle, you cause a large chunk of memory to be reserved for Oracle's use. That chunk of memory is called the System Global Area (SGA). It is composed internally of several discrete 'pools' of memory, such as the Buffer Cache and the Shared Pool. Each 'pool' is used to store a particular sort of data for particular purposes -sometimes, 'your' data, so you can see rows from a table; sometimes, Oracle's own internal 'housekeeping' data needed to recover a database or parse a SQL statement.

In addition to grabbing chunks of memory, running Oracle also causes many background processes to be launched. The processes run independently of what users happen to be doing (hence the use of the term background). They perform many different housekeeping tasks which are essential for the health and future recoverability of the database, such as writing your modified data back to disk or periodically 'time-stamping' the database so that we can tell it is healthy and consistent at any given point.

An SGA plus the background processes is known as an Oracle instance. The instance is the name we give to 'Oracle running in memory': it's both the memory the Oracle program consumes and the processes (or threads) it needs to carry out its work. If we were to express this concept in a formula, it might read:

Instance = SGA + Background Processes

One instance can only ever open, read, manage and manipulate the data from one Oracle database. But what exactly constitutes a database is something we need to look at next!

Friday 7 August 2009

Processes or Threads?

I've been busy describing DBWn and LGWR as "background processes", but in fact they might not be processes at all. If you are running Oracle on Windows, for example, then they are technically 'threads' within the main process, not independent processes in their own right. On Unix and Linux, however, they are genuinely independent processes and can be listed just like you can list the existence of any operating system process:

[oracle@tkit ~]$ ps -ef | grep ora
oracle 6688 1 0 06:32 ? 00:00:00 ora_pmon_mydb
oracle 6690 1 0 06:32 ? 00:00:00 ora_psp0_mydb
oracle 6692 1 0 06:32 ? 00:00:00 ora_mman_mydb
oracle 6694 1 0 06:32 ? 00:00:02 ora_dbw0_mydb
oracle 6696 1 0 06:32 ? 00:00:01 ora_lgwr_mydb
oracle 6698 1 0 06:32 ? 00:00:01 ora_ckpt_mydb
oracle 6700 1 0 06:32 ? 00:00:02 ora_smon_mydb
oracle 6702 1 0 06:32 ? 00:00:00 ora_reco_mydb
oracle 6704 1 0 06:32 ? 00:00:01 ora_cjq0_mydb
oracle 6706 1 0 06:32 ? 00:00:03 ora_mmon_mydb
oracle 6708 1 0 06:32 ? 00:00:00 ora_mmnl_mydb
oracle 6710 1 0 06:32 ? 00:00:00 ora_d000_mydb
oracle 6712 1 0 06:32 ? 00:00:00 ora_s000_mydb
oracle 6716 1 0 06:32 ? 00:00:00 ora_qmnc_mydb
oracle 6719 1 0 06:32 ? 00:00:00 ora_q000_mydb
oracle 6721 1 0 06:32 ? 00:00:00 ora_q001_mydb

The middle bit of the names in the right-hand column shows you the process abbreviations I've been using -and you can see PMON, DBW0, LGWR and so on listed together with a bunch of those 'other' processes (like S000) I mentioned briefly in passing.

There is no equivalent on Windows to the 'ps' command, however -and even downloadable tools like CurrProcess Explorer won't show you items with identifiable names such as I've been using here. The best you can see is the existence of a single process, called oracle.exe, and that's about it. You can see the process names even on Windows, though, if you query the instance itself by selecting from V$PROCESS:

SQL> select program from v$process;

PROGRAM
-----------------------------------
PSEUDO
ORACLE.EXE (PMON)
ORACLE.EXE (PSP0)
ORACLE.EXE (MMAN)
ORACLE.EXE (DBW0)
ORACLE.EXE (LGWR)
ORACLE.EXE (CKPT)
ORACLE.EXE (SMON)

...but that's as good as it ever gets on Windows.

This difference between Unix and Windows is down to the underlying architecture of the operating system and has no impact on the fundamental capabilities and functions of the Oracle database itself, but it means that, technically, we should talk about "background threads" when discussing Oracle-on-Windows ...but no-one in their right mind ever does so! Everyone still calls them 'background processes', even on Windows. It might not be technically precise but everyone will know what you mean.

Thursday 6 August 2009

The Background Processes

A memory area, like the SGA, is useless unless something (or some many things) can write data into the memory area and read data out of it. You need, in short, a set of processes which know how to interact with the various memory areas which make up an Oracle SGA. There are indeed several such processes, and they are collectively known as background processes, because they run and do their work in the background, not interfering in any obvious way with what the users of the database are trying to achieve in terms of practical work outcomes.

There are five compulsory background processes, and all the rest are optional. Taking them in turn...

SMON (System Monitor)

SMON checks the database for consistency and health every time you start up an instance and try to use it to open a database. If the database is, for whatever reason, inconsistent, SMON will either fix up the problem entirely automatically if it is in its power to do so, or it will prompt you to rectify the problem if it can't do it itself. SMON, in short, either performs or prompts for the performance of some sort of recovery. A recovery that SMON can perform entirely automatically is known as an instance recovery. One which requires manual intervention from the DBA is known as a media recovery.

A reasonable question to ask is, 'On what basis does SMON assess a database as being healthy or in need of recovery?'. To which the answer is: checkpoint change numbers. Periodically, the database is "timestamped" with a new sequence number (the latest number from a pool of ever-increasing sequence numbers). So long as every file in the database has been stamped with the same number, we can say that the database is 'consistent to that point of time'. If all the checkpoint change numbers that SMON reads during the database opening procedure, and which are stored in the headers of every physical file constituting the database, are the same, the database is healthy. If they are not all the same, and if there are no good reasons for the difference, the database is presumed to have suffered some sort of failure and a recovery procedure is therefore invoked.

CKPT (Checkpoint)

It is another reasonable question to then ask, 'What writes the checkpoint change number, which SMON reads, into the headers of each database file?'. To which the answer is, 'Another background process called Checkpoint, abbreviated as CKPT'. The occasions when CKPT writes its information into the headers of all database files are known as checkpoint events. Every transaction that takes place on an Oracle database (that is, every insert, update or delete of a record) causes a system-wide sequence number called the System Change Number or SCN to be incremented. At a checkpoint event, the latest SCN is deemed to be the latest Checkpoint change number, and it is this CCN which is written by CKPT into the headers of all database files.

It is fair to say that, during prolonged database operation, CKPT will forever be writing new checkpoint change numbers into the headers of all the database files; it is SMON's job, at the opening of the database, to ensure that every constituent component of the database agrees on what the latest checkpoint change number should be. If they are all in agreement, the database opens. If there is disagreement, recovery is called for.

To make things even simpler: SMON reads what CKPT writes, and both processes are intimately involved in identifying when databases need recovery, and how much recovery they need, if any.

DBWn (Database Writer)

Database Writer is the process responsible for periodically saving the modified contents of the Buffer Cache back down to the physical database files. By writing modified buffers back to disk, Database Writer makes that data safe, immune from the vagaries of operating system crashes or power failures.

Since a buffer in memory that contains modified data is known as a dirty buffer, and because Database Writer's job is, in effect, to re-synchronise the version of data on disk with the version of data held in the Buffer Cache, we sometimes say that Database Writer's job is to clean buffers.

Furthermore, since a clean buffer only contains data which could simply be re-read off disk (if we had to), it is OK to over-write the contents of a clean buffer. By contrast, a dirty buffer is the only version of a modified buffer -it hasn't yet been saved to disk- so it would not be very sensible to let you over-write the contents of a dirty buffer. Therefore, we can also say that Database Writer's job is to permit the recycling of the Buffer Cache: by 'cleansing' buffers, it makes them re-usable.

Although we call the process 'Database Writer', since 8.0 it's been possible to have multiple independent database writers, and that's why you see an 'n' in its abbreviated name: on a production system where several writers have been configured, you'd see processes DBW0, DBW1, DBW2 and so on.
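
The number of Database Writers is controlled by the DB_WRITER_PROCESSES parameter. It's a static parameter, so changing it means bouncing the instance; the value 4 below is just an example:

alter system set db_writer_processes=4 scope=spfile;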

LGWR (Log Writer)

As Database Writer takes the contents of the Buffer Cache and writes them to disk, so the Log Writer background process periodically saves the contents of the Log Buffer to disk. The Log Buffer, you will recall, stores redo -a series of instructions about how to re-perform (and hence recover) data modifications. Log Writer therefore makes the record of transactions that was stored originally in memory permanently safe by writing them to hard disk.

In the process, Log Writer permits the 'recycling' of the contents of the Log Buffer, just as Database Writer allows data buffers to be re-used: you can't over-write the contents of any part of memory if they're the only record you've got of something happening to the database, but once there's a copy of the old stuff on disk, the memory in the Log Buffer can be re-used by new redo entries.

There is only ever one Log Writer process. Unlike Database Writer, for example, which can have several 'clone' processes sharing the work burden, redo log records must be written to disk sequentially -and that means only one process can be in charge of working out what to write to disk. If that wasn't true, you might come to grief when a series of transactions have dependencies. If you first create a new product record and then raise a new sale record to indicate that the new product has just been sold, for example, you need the product transaction to be recorded before the sale transaction -otherwise, during a recovery, you might end up trying to sell a product that doesn't exist yet. To prevent that logical nonsense from happening, therefore, a single Log Writer process writes everything in the correct sequence.

Whilst it guarantees the correct order of events, though, only having one Log Writer does sometimes mean that Log Writer becomes a bottleneck on the system: there's no way to speed it up by sharing its work amongst other processes, for example. That's why the files to which Log Writer saves data (called redo logs) should ideally be stored on the very fastest hard disks you have available. It's far more important to let Log Writer write as fast as possible than it is to let Database Writer write quickly, for example.

PMON (Process Monitor)

Process Monitor keeps an eye on all the other processes. If any of them hang or die, PMON can try to re-start them -and sometimes it actually succeeds! If it can't succeed, then it takes appropriate action -which is, quite often, to crash the entire instance. PMON also keeps an eye on users: if they are in the middle of a transaction, for example, and their PC hangs or the network cable gets disconnected, PMON will spot the abnormal disconnection of the user and rollback whatever transaction they happened to be in the middle of. That means locks on rows are removed, memory is freed and any other 'database resources' the user was consuming at the time get reclaimed.

ARCn (Archiver)

I've mentioned already that Log Writer allows us to re-use the Log Buffer by periodically saving its contents to disk, by writing them to one of the online redo logs. If it writes enough redo to these logs, Log Writer must eventually run out of space -and must, at that point, begin to overwrite the redo previously stored in the online logs. Put another way: online redo log contents are recycled. That in turn must mean there is a limit to how far back you can go and still hope to be able to re-perform transactions during a recovery -and that is indeed the case.

Log Writer on its own protects the database for only the relatively short amount of time it takes to fill up all online logs. If you want more protection than that, you need to copy the contents of the online logs out to some more permanent store before you over-write them... and that is exactly what the Archiver Process does. Once one online log has filled up, it copies its contents to an archive redo log. The online log can then be over-written whenever LGWR feels like it and yet there is a continual stream of redo available from the archives. Recoveries going back days, weeks or months therefore remain entirely possible.

Archiver therefore makes a permanent record of redo. With that permanent record, you can guarantee recovery of a database without loss of committed data. Without it, you have to pray the redo you want to re-play is still available in the online logs, and it might or might not be. Technically, therefore, although Archiver is an optional background process, every production database that cares about not losing committed data would have it switched on.
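
Switching the Archiver on is a matter of putting the database into archivelog mode, which has to be done while the database is mounted but not open:

shutdown immediate
startup mount
alter database archivelog;
alter database open;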

You'll note that the process name abbreviation includes an 'n', like Database Writer does. That's for the same reason, too: just as you can have up to ten Database Writers, so you can have up to ten Archivers, and hence the processes get named ARC0, ARC1, ARC2 and so on.

Others

I've just listed the six background processes which are most closely involved in saving your data, saving the instructions for how to modify your data and checking the health and consistency of your database. Five of them are compulsory and one (ARCn) should be considered almost so.

There are, however, dozens of additional background processes that might be running in any given Oracle instance, depending on what extra software has been installed or what additional database options have been implemented. If you configure an Oracle RAC, for example, you will immediately get five additional background processes (LMON, LMD, LMS, LCK, DIAG). If you start to schedule database jobs, you'll see a new process called CJQ0 ("coordinator of the job queue"). If you configure to run in what is known as Shared Server mode, you'll see a lot of processes called S000, S001 and so on (shared servers).

When you install or configure these 'exotic' configurations of Oracle, it's as well to know what the extra processes are called and what they do (and how they do it). But for now, you simply need to know that there can be many more background processes running than the quick half dozen I've described to you so far -but that half dozen are 'core', and every other background process you might see is a configurable extra.

Wednesday 5 August 2009

SGA Summary

When the Oracle program runs, it grabs a large chunk of memory in which it can do its work. That 'chunk', in its entirety, is known as the System Global Area, or SGA.

The SGA is not a monolithic chunk of memory, however: it has internal structure, and is internally chopped up into discrete areas of memory which perform specialised tasks. These sub-areas are known as the Shared Pool (which caches execution plans and the data dictionary tables), the Buffer Cache (which caches ordinary data), the Log Buffer (which records changes made to ordinary data), the Large Pool (which is needed for efficient memory allocations to parallel slaves and backup processes), the Java Pool (providing memory to Java stored procedures) and the Streams Pool (unique to 10g and above, and providing memory to replication services).
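
In a 10g database you can see how much memory each of these sub-areas is currently consuming by querying V$SGAINFO:

select name, bytes from v$sgainfo;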

Tuesday 4 August 2009

The Streams Pool

New in 10g, a special area of memory within the SGA is now dedicated to allowing a database to replicate itself to a different location (think of a head office database which needs to be 'copied' up to a branch office out in the boondocks). If you don't have a need to have your database replicate itself, you don't need a Streams Pool. Before 10g, replication took whatever memory it needed from the Shared Pool -and hence the business of preventing hard parses was compromised. Oracle version 10g has recognised the significant memory requirements for effective replication and has thus created the Streams Pool as a dedicated resource which alleviates the burden on the Shared Pool.
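
Giving the Streams Pool some memory is just another parameter setting; the 50M shown here is purely an example value:

alter system set streams_pool_size=50M scope=both;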


Monday 3 August 2009

The Java Pool

Since Oracle 8i, it has been possible to write procedures, functions and other code which gets stored inside the database in the Java programming language. It has since turned out that you would be certifiably insane to do so but nevertheless, the capability remains. (Java is better utilised "in the middle tier" -that is, in an application server, rather than in the backend database itself). If you run Java code in the database itself, it needs its own unique area of memory in which to execute -and that area of memory is called the Java Pool. It can be enormous (in the gigabyte range, for example) -but if you don't run Java in the database at all (all your code is, say, PL/SQL), then it can be zero in size.

Sunday 2 August 2009

The Large Pool

The Large Pool is an optional component of the SGA, though these days it is so useful to have that I doubt many production databases would not be using one: optional it may be, strictly speaking, but I suspect most DBAs these days consider it functionally compulsory.

Whenever you parallelise an operation in Oracle, communication between the individual parallel slaves requires memory access. That memory is usually grabbed from the Shared Pool if nothing else is available: but if a Large Pool has been configured, it gets taken from that instead. You end up using parallel slaves whenever you do a parallel query, perform an RMAN backup or utilise multiple I/O slaves. Parallel query and multiple I/O slaves are slightly exotic (and cost money), but practically every Oracle database on the planet needs backing up with RMAN -so that means practically every Oracle database on the planet needs a Large Pool.

You can do all of these things without a Large Pool, but doing so means the Shared Pool is stressed. It's busy trying to minimise hard parses, and suddenly great gobbets of memory that it needs to do that job are pinched by a backup job or a parallel query! Not a good outcome, in short... and the existence of a Large Pool would mean that such a conflict never arises.

So let me make it simple for you: if you ever back up your database, you need a Large Pool. That should garner a near 100% response, I would have thought!!
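
Configuring one is a single parameter setting; the size shown here is purely illustrative (and, with SGA_TARGET set, it acts as a minimum rather than an absolute value):

alter system set large_pool_size=64M scope=both;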


Saturday 1 August 2009

The Log Buffer

It is impossible to generalise, but if there is such a thing as a 'typical' Oracle database, you might measure the size of the Shared Pool and the Buffer Cache in the several hundred megabytes range, or bigger. In comparison, most Log Buffers anywhere in the World would not be much bigger than, tops, 10M or so.

The Log Buffer is thus a relatively tiny area of memory in which a record of the changes you make to data is stored. Every insert, update or delete that you perform will generate a record of what row was affected and how it was modified. That record is written, in the first instance, to the Log Buffer component of the SGA. From there it is periodically saved to special files on disk, thus making the record of what transactions have taken place on a database permanent.
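
The size of the Log Buffer is fixed by the LOG_BUFFER parameter at instance startup and is easy to check from SQL*Plus:

show parameter log_buffer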

The transaction record is generically known as redo, and we can therefore say that redo is first written (for reasons of speed) into the Log Buffer of the SGA for each and every transaction that users perform.

As the name implies, redo is generated in order to let us 'do again' those transactions which might otherwise have been lost because of hard disk failure or power supply failure or software error or user stuff-up. Redo, in short, is Oracle's recovery mechanism, and the Log Buffer is simply the first place redo is created.