Friday 31 July 2009

The Buffer Cache

The Buffer Cache is a memory area in which the data physically stored in the database is read and modified. When you want to see the employee records in the EMP table, they are first read from disk and loaded into the Buffer Cache. Why not simply fetch the rows off disk and send them straight back to you and your client PC? Because someone else might want to read the EMP records just after you. If we loaded the rows into the Buffer Cache the first time round, the second person won't have to read the rows from disk at all: they'll be able to access the copy of them direct from the Buffer Cache. Reading rows of data off disk involves performing a physical read. Reading rows out of the Buffer Cache copy of them involves performing a logical read. Logical reads don't require physical disk access and so are much quicker to perform than physical reads. The job of the Buffer Cache can therefore be said to be to minimise physical reads and to maximise logical ones.
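
If you want to see the two kinds of read being counted on your own system, the V$SYSSTAT view keeps running totals of both. A quick sketch (the exact statistic names can vary slightly between versions):

SQL> SELECT NAME, VALUE
2> FROM V$SYSSTAT
3> WHERE NAME IN ('physical reads', 'session logical reads');

The wider the gap between logical and physical reads, the more work the Buffer Cache is saving you.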

It's not just reading data, though. If you want to update a record, the modification is made to the version of the data stored in the Buffer Cache. Only later is this modified buffer (known as a dirty buffer) then saved back down to disk. When we write a modified buffer back down to disk, the contents of the in-memory buffer become the same as the on-disk data, and we therefore say that the buffer is now a clean buffer.

Oracle never reads individual rows, though. If you want to update Jim's phone number in the EMP table, we never just get Jim's record and let you at it. The minimum unit of reading and writing is an entire "collection" of rows - usually 2K, 4K, 8K or 16K in size. The idea is that if I am modifying Jim's phone number, you will probably want to update Bob's phone number similarly in just a moment or two... so my request to update Jim's record helps you out, because it causes the entire collection of rows around Jim's row to be loaded into memory at the same time. Your subsequent update to a different row therefore hopefully involves only logical I/O, not physical I/O - and you benefit from my earlier work.

This 2K, 4K, 8K or 16K collection of data is known as the Oracle Block and is the minimum unit of I/O in an Oracle database. Request to see just one byte in a block, and the entire block is read into the Buffer Cache. The term "block" is actually used to describe the collection on disk; when a block from disk is read into memory, we call it a "buffer" instead. Essentially, though, a block and a buffer are the same thing, except that one physically exists and the other is only an in-memory thing. Some other database products (such as SQL Server) call the same thing a "page" of data. Blocks, buffers, pages... they're all really just different words for the same idea: a minimum 'chunk' of data read from disk into memory, and stored in memory for other users to find without having to re-visit the physical disk.
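
Incidentally, you don't have to guess what block size your own database uses; just ask it:

SQL> SHOW PARAMETER db_block_size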


Thursday 30 July 2009

The Shared Pool

When you issue a query such as select * from emp, the database must work out what that means. Do we have a table called EMP? What are its constituent columns? Where, physically, on disk, does the table live? Are there any indexes built on the table which might help us get the results back more quickly? Do you, the user, have rights to query the table? The process of answering all these sorts of questions is known as parsing a SQL statement, and it involves a lot of work going on 'behind the scenes'.

To work out whether a table exists at all, for example, we have to query an in-built system table called TAB$. To work out where it physically exists on disk, we have to query more in-built system tables including one called TS$. To check if you have the rights to view the table, yet another query is made of a table called USER$ (amongst others). You issue a simple select statement, in other words, and the poor database has to issue a dozen queries of assorted system tables before it can even work out what your statement means. We call these 'System SQL statements issued in response to a user SQL statement' recursive SQL, and performing recursive SQL means performing lots of preparatory work before actually answering your SQL requests.

Parsing is therefore very expensive... so we'd really like to only have to do it once and have the results of the parse stored for you and other users to make re-use of every time you re-issue the same basic SQL statement in the future. The result of a parse is known as an execution plan, and it represents a package of pre-worked-out instructions as to how to answer a SQL statement. Each new SQL statement causes a new execution plan to be constructed and stored... and they are stored in a sub-section of the Shared Pool known as the Library Cache. If a second user issues exactly the same SQL statement as someone else happened to issue earlier, therefore, the second user can simply nip into the Library Cache and make use of the execution plan previously worked out for the first user.

If you have to go through the entire rigmarole of working out an execution plan from scratch, that's called 'doing a hard parse'. If you can merely re-use execution plans previously created by other users, that's called 'doing a soft parse', and soft parses are a lot cheaper and quicker to carry out than hard ones.
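
You can watch this re-use happening in the Library Cache itself. The V$SQL view lists the statements currently cached, along with how often each was parsed versus executed. A sketch (the LIKE filter is just for illustration):

SQL> SELECT SQL_TEXT, PARSE_CALLS, EXECUTIONS
2> FROM V$SQL
3> WHERE SQL_TEXT LIKE 'select * from emp%';

A statement showing one parse call and many executions is exactly what you want to see: one hard parse, lots of cheap re-use.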

You can fairly say, therefore, that the job of the Library Cache is to help stop us having to do hard parses.

Incidentally, fetching the contents of system tables like TAB$, USER$ and so on is expensive, too: these tables are collectively known as the data dictionary, and the data dictionary is physically stored on disk in the system datafile. Forever having to access these tables off disk would not be a good idea: lots of disk access generally means "slow performance". Therefore, once we've read some rows from these tables, we store them for future access in another component of the Shared Pool called the Data Dictionary Cache. The data dictionary cache is where in-memory copies of the data dictionary tables, needed to carry out parsing operations, reside.

The job of the dictionary cache is therefore to stop us having to visit the system data file on disk every time we parse.
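
The V$ROWCACHE view shows how well the dictionary cache is doing that job: GETS is the number of requests for dictionary data, and GETMISSES is the number that couldn't be satisfied from memory. A sketch:

SQL> SELECT PARAMETER, GETS, GETMISSES
2> FROM V$ROWCACHE;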

Since the Library Cache and the Data Dictionary Cache are both sub-components of the Shared Pool, and since both are designed to minimise parsing activity, or at least make it an in-memory affair if it does have to happen, it is reasonable to conclude that the job of the Shared Pool overall is to make parsing SQL statements as cheap an affair as possible.


Wednesday 29 July 2009

Oracle Server Components

What is an Oracle Server?

When you edit a document in Word (or whatever word processor you happen to prefer!), you don't type documents directly into a disk file, because doing so would be painfully slow. Instead, documents are initially 'typed into' and created in memory -your computer's RAM- because that works at nanosecond speed and is thus very fast and responsive to work with. Unfortunately, most of today's RAM is volatile: switch off the computer's power supply and you lose the lot -and if you've ever been typing a long, complicated document and had a power cut before you remembered to save it, you'll know the sinking feeling of your brush with volatility!

Protecting against data loss like that is why you're supposed to do a periodic 'File -> Save', of course. Hard disks might be slow and painful to use for constant access, but they make a fine permanent store of things which have initially been created and worked on in RAM. Saving a document in Word simply means 'transfer a copy of it out of volatile memory and onto permanent disk storage'. In this way, your word processor lets you combine the speed of RAM with the permanence of hard disk storage. You therefore end up getting the best of both worlds: speed and safety.

In fact, that goes for any program you care to mention, be it a word processor, a video editing package, a media player, whatever: data is read from its permanent store on hard disk and manipulated and worked on in memory, because one is fast but volatile and the other is slow but safe.

The Oracle Relational Database Management System (RDBMS) is just another program in this respect, because it does exactly the same thing. The data it manages is stored permanently on disk but when someone wants to update it, view it or delete it, it is loaded from disk into memory and it is in memory that all the data manipulation takes place. The permanent store of data on disk is what we call the Oracle database. The chunk of RAM where the database's data is actually worked on is called the Oracle instance.

The relationship between an instance and a database is one-to-one: if an instance has accessed the data from database X, it cannot also access or work on the data from database Y. If you want to work with the data from database Y, you will need to use another instance entirely. This relationship -one instance manages only one database- is always and completely true.

The relationship the other way around is not always like that, though. In an advanced configuration of Oracle known as a Real Application Cluster (or RAC for short), it is possible for one database to be accessed by two or more instances, simultaneously. That's usually done to speed things up and get more work done. But even so, each of the participating instances in a RAC can only be used to store and work on data from one database. So the rule can be more fully stated as:

One instance can only work on one database,
but one database may be opened by many instances

An Oracle Server is nothing more, therefore, than a computer onto which the Oracle software has been installed; on which an Oracle instance is running in memory (providing fast but volatile data management capabilities); and on which an Oracle database has been permanently stored on disk.

Users connect to the instance (not the database, note) and make requests to see or to manipulate data from the database by issuing SQL statements. In response to these SQL statements, the Oracle program loads the requested data into the instance. The user continues to work on the data within the instance. Occasionally, to make sure a power failure doesn't cause the loss of all the work users have been doing, the Oracle software performs, in effect, an automatic 'File -> Save' and writes the data stored in the instance back to the physical database on disk. In this way, users of an Oracle database get all the performance advantages of working with things in fast RAM, but all the permanence and safety advantages of having saved stuff to disk.

The Structure of the Oracle Instance

An Oracle instance is merely an allocation of memory in which to do work plus a bunch of background processes which automate a lot of that work for us. The memory areas are collectively known as the System Global Area (or SGA for short). The constituent 'pools' of memory, which in the aggregate make up an SGA, are called The Shared Pool, The Buffer Cache and The Log Buffer. These three SGA components are compulsory and every SGA on the planet will have all three of them. There are various additional 'pools' which your SGA might or might not have configured, depending on how you are using your database, what software options you've enabled (and paid for!) and so on. You will commonly see, for example, a Java Pool and a Large Pool. You will less commonly see (in 10g and above) a Streams Pool.
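
If you're connected via SQL*Plus, one command will show you the overall carve-up of your own SGA (the figures will obviously vary from system to system):

SQL> SHOW SGA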

No matter what fancy names these pools of memory are given, they are all merely chunks of memory in which different sorts of data are stored. The principal function of all of them is to try and 'cache' that data in memory so that you do not have to fetch it back off disk or work it out from scratch (both of which options are relatively slow and expensive).

Tuesday 28 July 2009

Simple Block Layout

So, with this overhead, free space, and (wow) actual data, how is it all stored in the data block?

SIMPLE BLOCK LAYOUT

The data block header (fixed) is at the beginning of the block. A small tail is at the end of the block (the tail is for block consistency checking). Everything else (variable header and data rows) fits between the beginning and the end.

Variable header info grows from the top down (just below the fixed header) if necessary and rows are inserted from the bottom up (just above the tail).

Everything else is freespace. I will discuss free space and "freelists" in part 2 of the data block.

Ok, in part 2, I will show some more examples of the data block and how the other structures we've talked about use and sometimes mimic the data block structure.

Now it's time to see data blocks in action. For this example I am just going to consider a normal table with rows and columns and nothing more exotic (once you understand this, the various permutations (clusters, partitions, etc.) should be easy to understand).

So, we've done step 1; we've issued a CREATE TABLE statement (there's an example after the aside below). Whether we accepted the default storage settings (by not entering a storage clause explicitly) or we tweaked it for this specific table, a few things happen during the execution of that CREATE TABLE statement:

-Oracle creates a table segment in a tablespace and grabs some initial extents comprised of data blocks that sit physically in one or more datafiles. (reread the earlier posts if you don't get this part)

-Those previously empty datablocks are preloaded with some small amount of overhead information (if you're still wondering why this overhead is good or even needed, don't get discouraged).

-Oracle "registers" this segment in some data dictionary tables (adds rows) for database management purposes (it is a Relational Database Management System after all).

-----------------------------
That last bullet might sound interesting. We have not INSERTed a single row of data yet, but there appears to already be rows loaded in the database. That is correct.

One of the required tablespaces in Oracle is SYSTEM. The SYSTEM tablespace has a whole host of tables, views, indexes, and procedural objects that have nothing to do with your data, but everything to do with your metadata.

You see, even the Oracle code uses the Oracle database as part of normal operations. This is great for two reasons: it makes the system very scalable; and it means that almost any DBA task can be done via simple SQL commands.

------------------------------
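
For the record, a CREATE TABLE statement with an explicit storage clause might look something like this (a sketch only; the table, columns, sizes and tablespace name are made up for illustration):

SQL> CREATE TABLE emp_history (
2>   empno    NUMBER,
3>   ename    VARCHAR2(30),
4>   hiredate DATE)
5> STORAGE (INITIAL 64K NEXT 64K PCTINCREASE 0)
6> TABLESPACE users;

Leave the storage clause off and you simply inherit the defaults of the tablespace you create the table in.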

So now we are ready to insert some data into our table (let's not worry about indexes right now). Let's say we have some data from an application or a flat file and we want to insert 1,000 rows into our new table.

After some parsing (we'll also talk about SQL later), Oracle is ready to store the rows in data blocks. How does Oracle determine which data blocks to use? Freespace is used first.

If Oracle can find some empty space in the segment, it will try and fill those empty spaces with the new data. If there is no freespace left, Oracle will try to add more extents to the segment and use those data blocks for storage.

Because we just created the table, all of the allocated space (minus overhead) should be available. We know it's new and empty, but how does Oracle know?

Well, Oracle could simply keep a list of all new tables that have never held rows, but that would only be of very limited use and could be a source of contention (imagine if everyone inserting data had to check one global listing of empty tables to see if their table was on the list).

Instead, Oracle leverages the overhead in data blocks and does something similar at the segment level. That something is -- SEGMENT HEADERS. From a logical perspective, segment headers are similar in many ways to data block headers: they contain some management data about the segment.

Segment headers contain a list of the extents in the segment (extents table), the FREE LISTS, and the HIGH WATER MARK. The segment header starts in the first block of the first extent in the segment.

FREE LIST

The freelist is simply a list of data blocks currently allocated to extents in the segment that have some free space in them for new rows. There are some interesting things to understand about freelists.

Contention- instead of having everyone in the database wait in line to find free space in segments, freelists move that contention down to the individual segment, so only people manipulating data on that segment (table) have to contend with each other to find free space. If you have a segment (table) that you expect to have a lot of insert and update activity on, you can create multiple freelists to reduce that contention even more. To avoid waits, you can determine how many people will be inserting into that table AT THE EXACT SAME TIME (on the count of three, push your button) and create that many freelists. Unfortunately, many people adjust this number too high (thinking if two is good, a hundred is great) and end up wasting disk space (I'll try to remember to explain this when we start talking about design issues). A good DBA will monitor any freelist contention and adjust AS NECESSARY.
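
Syntax-wise, the freelist count is just another storage parameter set when the segment is created. A sketch (the table and the choice of four freelists are made up for illustration):

SQL> CREATE TABLE orders (
2>   order_id NUMBER,
3>   details  VARCHAR2(100))
4> STORAGE (FREELISTS 4);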

Data block free space- in the first post on data blocks, I talked about free space inside data blocks. This free space is used to expand rows (via update) and for new rows (insert). In OLTP systems, almost every data block will have at least one lonely byte that is not being used. Do we really want to see if we can squeeze a whole row into that single byte of space? No, it will never fit. We only want to try and put data where there is plenty of room. Fortunately, when we created the segment (CREATE TABLE), we told the segment (explicitly or by using defaults) how we wanted to report this free space inside of data blocks.

We did this using the keywords PCTFREE and PCTUSED. These are the keywords that determine whether a table's individual data blocks show up on the freelist or not.

These percentages (PCTxxxx) are just that, parts of 100. PCTUSED puts blocks on the freelist, PCTFREE takes a block off of the freelist (it is no longer accepting new inserts, only updates (DELETE's are always accepted and have no interest in checking freelists)).

The default value for PCTUSED is 40(%) and PCTFREE is 10(%). That means if a block is only 39% used (<40%), it goes on the freelist as a candidate for new rows. It will stay on the freelist after new rows are inserted until the data block is 90% full (<10% free space remaining).

Inserts increase PCTUSED (thereby decreasing PCTFREE), deletes decrease PCTUSED (thereby increasing PCTFREE), and updates can do either depending on if the update is increasing or decreasing the size of the row(s).

Since our new segment currently holds no data, all of the data blocks in our INITIAL extent(s) are well below 40% used, so they are on our freelist. Any of those data blocks can be used to hold our 1,000 rows.

Now, what if we had 1,000,000 empty blocks and were only going to use maybe 100 for this insert, would we really want to keep track of the freespace in all 1,000,000 blocks individually? I wouldn't. It could take longer to maintain and search for free space than just allocating a new extent. But then, like I hinted at above, I would be saving time, but possibly wasting disk space.

So Oracle has another neat item that helps me out (not only here, but when doing things like FULL TABLE SCANS as well).

That thing is...

High Water Mark (HWM) - just like the name sounds, the HWM keeps track of how high the river of data has ever risen.

While just counting the number and size of extents in a segment will tell us how much space has been allocated, the HWM will tell us how much of that space has ever been used (until you rebuild or truncate it). Hopefully, you can already guess at some of the benefits of knowing that.

One of those benefits is in controlling the freelists. The freelist only tracks blocks that are below the HWM. That keeps things very manageable and reduces the chances of wasting space and resources (like by having 1,000,000 blocks with a single row each as opposed to having fewer, more densely packed data blocks).
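
If you want to see where the HWM sits for one of your own segments, the supplied DBMS_SPACE package will tell you. A sketch using the classic SCOTT.EMP demo table as a stand-in (run SET SERVEROUTPUT ON first):

DECLARE
  l_total_blocks  NUMBER;  l_total_bytes  NUMBER;
  l_unused_blocks NUMBER;  l_unused_bytes NUMBER;
  l_file_id NUMBER;  l_block_id NUMBER;  l_last_block NUMBER;
BEGIN
  -- unused_blocks = blocks allocated to the segment but still above the HWM
  DBMS_SPACE.UNUSED_SPACE('SCOTT', 'EMP', 'TABLE',
    l_total_blocks, l_total_bytes, l_unused_blocks, l_unused_bytes,
    l_file_id, l_block_id, l_last_block);
  DBMS_OUTPUT.PUT_LINE('Allocated: ' || l_total_blocks ||
    ' blocks, above the HWM: ' || l_unused_blocks);
END;
/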

Let's say we have 2 extents, each with 10 data blocks that can each hold 50 of our 1,000 rows. Let's also keep our PCTUSED=40 and our PCTFREE=10.

so our block count=20. All of them are empty.

Because we've never inserted rows into this segment, our HWM is 0.

When we insert the first row, the freelist will say "I don't have anything free, let me try to add some empty blocks." The freelist sees that there is a block already allocated to the segment (from one of our 2 extents) that is empty, so it advances the HWM from 0 to 1 and grabs one of those blocks. The row is then inserted.

When we insert the second row, the freelist says "I have a block waiting for data" and tells us where to stick the data (whether it's being rude or not depends on how you treat the database --joke), decreases the PCTFREE in the block and increases the PCTUSED.

This goes on for the next few dozen rows until we try to insert our 46th row (we know that physically we have enough space for 50 rows in each block). By then, something has happened. Our PCTUSED has grown to 90%, but we don't really care about that because the block was already on the freelist (any of our blocks with PCTUSED<40 will be put on the freelist). What we do care about is that the PCTFREE has gone from almost 100 down to 10 when we inserted our 45th row. At that point, the data block said "I'm stuffed, leave me alone." So it was removed from the freelist.

Now the freelist says "I don't have any more space below the HWM, I'd better see what's out there." It sees that there are still 19 data blocks that have been allocated but never used. So, it grabs the next one, puts it on its list and advances the HWM to 2.

If this sounds like overhead to you, you're right. If it sounds like unnecessary (or even undesirable overhead) you're wrong. When we start looking at the big picture, you'll learn to appreciate it.

So now we continue to insert rows in the same fashion, filling up block after block in the first extent (no flamers!) until all the blocks are full (or at least they're 90% full) and the HWM has been moved up to 10. At this point, we have used all of the 10 blocks in our first extent. Now the freelist has to grab some data blocks from another extent (see the no flamers comment).
The freelist checks with the segment and sees that we have more blocks allocated in our second extent that have free space.

We'll continue to insert rows like we have been, moving up the HWM until it gets to 20.

At this point, both of our extents' sets of data blocks have been removed from the freelist because they are 90% full (only 10% free). Now when the freelist looks for allocated, but unused space, it won't find any, so it will extend the segment by allocating a new extent (with its set of 10 data blocks) and advance the HWM to start adding those newly allocated data blocks to the freelist.

We can then insert the remaining 100 rows.
--------------------------------------------
In this example, we've seen how freelists are used to insert data into data blocks. We have seen our segment size grow to the sum of the extent sizes allocated and how the HWM is used in this process. We have seen how data blocks accept new rows and when they reject new rows.

What we haven't seen is the effect this has on the tablespace and the datafile(s).

Since a tablespace is similar in ways to a segment, we know that the tablespace has grown (logically) to equal the sum of the size of its segments and is (physically) still the sum of the size of its datafiles. But what has happened to the datafile(s) themselves? You'll remember, or can look it up, that in Chapter 1 of this series we said that a datafile had a predetermined size. When we created it, we told it what size to be.

That datafile is allocated to the tablespace. Allocated but not used (for data). As we create segments, we create extents which consist of data blocks from datafiles owned by the tablespace that the segment was created in. Whew! Say that 10 times fast.

As we allocate new extents for our segment, does the datafile get bigger? No, not necessarily. The allocation of that space goes from unused (free) to used as a segment needs it. So, both the tablespace and the datafile get filled up. When the datafiles get full (by allocation, not by inserting data) then you have to either add a new datafile or allow an existing one to grow. Tablespaces work in a similar fashion logically with extents. As you approach the MAXEXTENTS setting, you have to increase it or prevent the segment (table) from growing.

-------------------------------------------------

This is not the end of our discussion on data blocks. It is just the beginning. But it is very important that I do a good job of helping you understand the basic structure and role of data blocks (and their chain of custody up to the tablespace) before moving on to all of the simple tasks in Oracle. That's part of why I simplified some of this without trying to mislead you. This is the way it works and you will discover any shortcuts I took as we continue.

Understanding the data block is critical to mastering the Oracle database and all of the tasks that come with developing for and administering the Oracle database.

Monday 27 July 2009

Concurrency - Interested Transaction List

Concurrency really just means how many people can be doing things at the same time in the same place. It directly affects scalability.

Many RDBMS products use a form of centralized "lock manager" to keep track of who's doing what to what. That is a valid solution, but as the number of concurrent users rises (not people just logged on to the database, but the people actively doing things), a lock manager can and does become a point of contention. Oracle decentralizes almost all of this and lets the data blocks themselves deal with concurrent access (two or more people trying to manipulate rows in the same data block).

That is the reason why a well designed Oracle database can scale higher on less hardware than any other database out there.

An interesting test I've performed (because my boss thought Oracle sucked and tried to prove it) is comparing Oracle to Brand X for a single user. Oracle was fine, but in numerous tests it lagged behind its competitor. But, as I started to ramp up to dozens then hundreds of concurrent transactions (different sessions, etc.) I noticed something: the Oracle performance line stayed about the same, while Brand X started to fall behind more and more.

-------------------------------------

The ITL is controlled by the settings INITRANS and MAXTRANS (just accept the defaults unless you really know what you're doing). As you can probably guess from those keywords, you set a starting number and then tell it how big it can grow (how many concurrent transactions you'll track before making people wait to get into the block).
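
In syntax terms they're just two more attributes on the CREATE TABLE (a sketch with illustrative values, not a recommendation):

SQL> CREATE TABLE payments (payment_id NUMBER)
2> INITRANS 2 MAXTRANS 255;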

There is also a row directory that contains data about each row in the block. One interesting thing to know about this (small) component is that once space has been allocated in the row directory, you don't get it back when you just delete a row. It's interesting, as we'll see later, but not a thing to worry about or to feel you're being robbed over.

All in all, the data block overhead is usually only about 100 bytes or so (even if you round it up to 200 bytes, you can see that in a 4k block it's still a small percentage), so it shouldn't play too much of a role in your decision on a data block size.

There are about 2 dozen other things in the overhead that I haven't mentioned, but they are as simple as the rest of it and we will address those later.

I've mentioned the pieces I think you need to know for now.

Sunday 26 July 2009

Parts of the Database (5) - Data Blocks

Data Blocks

If you could only pick one thing in Oracle to know well, you should pick the DATA BLOCK. It is the key to everything worth doing in Oracle.

If you can understand the data block, everything else in the Oracle RDBMS is easy. EVERYTHING.

Another name for an Oracle data block is a "logical block", but you won't hear many people call it that and I'm glad. Because for now I want you to consider it a physical construct in our layers of logical and physical pieces.

That's because data blocks impact/influence/determine storage, backup and recovery, SQL performance, memory usage, I/O, scalability, and probably anything else I haven't mentioned.

And the cool thing is, data blocks are very easy to understand.

---------------------------------------------
A lot of my developer friends never get proper database instruction. As a result they are forced to guess about how an RDBMS like Oracle works. They know that databases store data and that a document server stores data. So it's easy for them I think to make the analogy that a database is just a fancy document storage tool with some fancy retrieval/manipulation tools. And they think, "So what? My app is cooler than that." So they don't invest too much time or attention in this "flat file repository."

The ones that do have the time and interest sometimes are discouraged when they approach a DBA who is busy with his/her own work and may not even have a mastery of the basics themselves (people get mad when I say it, but if you can run a script, you can find someone who will hire you as a DBA).

That's why you hear DBA's say "it's complicated" and developers say "Oracle sucks."

Hopefully I can do a good enough job explaining these basics that you can avoid these retorts.

--------------------------------------------


WHAT IF ORACLE DIDN'T USE DATA BLOCKS?

If Oracle didn't use data blocks, then data would be stored in one big file and developers would be right to be unimpressed. The database would just be a big sequential file, and it would be a mess to get the data in and out: you'd have to scan the whole thing every time you wanted the data.

At the other extreme, you could save every row of data as a separate file. Then you'd...well, it's not hard to see how unworkable that would be.

In short, without data blocks, Oracle really would suck.

Data blocks are the way Oracle groups things like rows into manageable sets. How many rows you can fit into a data block depends on how big your row is and how big you make your data block.

And the beautiful thing is, you never have to deal with data blocks directly. You just have to understand them so you can design how you want to use them.

HOW BIG IS A DATA BLOCK?

As big as you tell it to be (kinda).

Oracle doesn't force you into an arbitrary data block size. You determine what your needs are and pick an appropriate size. The most common sizes I see for data blocks in regular (OLTP) databases are 4k or 8k.

-------------------------------------------------------
NOTE: check the block size of your OS before picking a block size, you want the data block size to be some multiple of the OS block size (1...64?) to avoid unnecessary IO. OS's read from disks in blocks as well and you don't want an uneven relationship between the OS and Oracle block sizes.

Now Oracle even allows you to have multiple data block sizes within the same database, so you can really customize it to your needs (there's an example just after this note).

--------------------------------------------------------
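
A sketch of what a non-default block size looks like (it assumes you've already set aside a matching buffer cache via the DB_16K_CACHE_SIZE parameter; the names, path and size are made up):

SQL> CREATE TABLESPACE rpt_data
2> DATAFILE '/u02/oradata/rpt_data01.dbf' SIZE 500M
3> BLOCKSIZE 16K;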

The basic idea in choosing a size is that if you do a lot of single row operations --you tend to like a smaller block size; and if you're doing primarily bulk operations --you tend to like a larger block size. That's mainly because of performance reasons, but we'll get to those when we start discussing SQL.

For now let's just say that for a transactional database we'll use a 4k block size and for a reporting database we'll use a 16k block size. Now let's forget about the reporting database and just focus on the transactional database for this discussion.

I have data blocks that are 4k in size. That means if I have rows that are 1k each (not a small row, but I'm keeping the math easy), I can squeeze 4 rows into each of those data blocks.

Got it? Ok, now forget it, because you won't be able to get 4 1k rows into that block. That's because Oracle is not a flat file repository. And there are two extra parts of a data block that Oracle gives you to make things better: data block overhead and empty space.

FREE SPACE

--The empty space is easy. The whole idea is that if you have a row of 1k which includes a text (VARCHAR2) column with the value "Dratz is a jerk", what if you want to update that record to a value of "Dratz is a genius, I hope he DOESN'T get eaten by a bear."

Dratz is a jerk.
Dratz is a genius, I hope he doesn't get eaten by a bear.

See how much more space the second sentence takes up? If your data block were full, how could you update it? Oracle would let you; it would just move that row to another block that has room to fit it all in and leave a note behind telling the query engine where to find the real row (row migration). That means two reads instead of one to get that data. So it's a good idea to leave free space in every data block that you think has a chance of being updated to include more bytes. Read-only data is not updated, so you don't have to waste free space on those blocks.

The free space also allows new rows to be added so you don't have to allocate new data blocks. When you delete rows, you are also freeing up space in a data block that can be reused by new rows or expanding existing rows.

OVERHEAD

--The overhead contains some really neat pieces. In later chapters I'll go into some more detail and explain how and why they are used.

First, there are 2 header pieces.

One is fixed in size for every data block and contains information about the data block, like what kind of block it is (data, etc.) and what the unique block address is. That may not seem very useful at this point, but we'll see how important they are later.

The other header piece is variable in size and the important piece to remember here is ITL (Interested Transaction List). This is what allows concurrency.

Saturday 25 July 2009

Parts of the Database (4)

What are Datafiles

Every Oracle database has one or more physical datafiles. The datafiles contain all the database data. The data of logical database structures, such as tables and indexes, is physically stored in the datafiles allocated for a database.

A datafile can be associated with only one database.

Datafiles can have certain characteristics set to let them automatically extend when the database runs out of space.

One or more datafiles form a logical unit of database storage called a tablespace.

Data in a datafile is read, as needed, during normal database operation and stored in the memory cache of Oracle. Modified or new data is not necessarily written to a datafile immediately. To reduce the amount of disk access and to increase performance, data is pooled in memory and written to the appropriate datafiles all at once.
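
You can see all of this for yourself in the DBA_DATA_FILES view. A quick sketch:

SQL> SELECT FILE_NAME, TABLESPACE_NAME, BYTES, AUTOEXTENSIBLE
2> FROM DBA_DATA_FILES;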

Friday 24 July 2009

Parts of the Database (3)

What are Tablespaces

A tablespace is made from at least one datafile. Many datafiles can be within a tablespace. This is a logical grouping of datafiles.

A datafile cannot be split between tablespaces or shared and is exclusively used by a tablespace.

There are two types of tablespace: permanent and temporary.

A permanent tablespace contains data pertinent to the database and is persistent: the data remains.

A temporary tablespace is transient in nature. The data is volatile, and consists of space set aside for large data sorts and intermediate results in the query process.
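
Creating one of each might look like this (a sketch; the names, paths and sizes are made up):

SQL> CREATE TABLESPACE app_data
2> DATAFILE '/u01/oradata/app_data01.dbf' SIZE 100M;

SQL> CREATE TEMPORARY TABLESPACE temp_ts
2> TEMPFILE '/u01/oradata/temp01.dbf' SIZE 100M;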

Thursday 23 July 2009

Parts of the Database (2)

Segments and Extents

Ok, now that we have a basic understanding of datafiles and tablespaces, we are one step closer to discussing data blocks and something I haven't mentioned called "headers".

But before we get there, we need to bridge the gap between D&T and blocks.

That bridge comes in the form of two logical constructs that build off of the block physical layer.

Those constructs are SEGMENTS and EXTENTS.

Easy ways to think of these two are as mapping tools. They don't actually store data, they just keep it organized and accessible. In that way, they are not very dissimilar from tablespaces (just at a more granular level).

SEGMENTS

When you type in the syntax to create a table, one thing you have to provide (among others) is a table name. That table name along with the "create table" keywords gives Oracle everything it needs to create a segment.

A segment is registered with the tablespace as the traffic cop for the table. Like a tablespace, a segment has no real space of its own; it is the sum of its parts. However, one thing you can do with a segment is put a limit on how big it can ever get by limiting its extents.

The segment is a critical part of the RDBMS, but it is not one you lose a lot of sleep over right now. You will, over time, find that you want to give some better instructions or definition to your segments, but most of these can be accomplished via ALTER TABLE statements. For now, just try to think of segments as the logical controller of a table (or index, etc.). There are different segment types, but don't worry about those now.

EXTENTS

If segments are the traffic cops, you can think of extents as parking lot attendants and data blocks as actual parking spaces.

So, the traffic cops tell you which parking lot to use and the parking lot attendant tells you where to park in that lot.

(SEGMENT-> EXTENT-> BLOCK)

Now the job of your parking attendant (extents) is to make it easy to park your car.

If the parking lot attendant were a workaholic and liked to operate in crisis mode, he would just tell each car that entered "You park here, you park way over there, you go up to the next level."

But a good parking attendant will want to make life easier for himself and his customers. If you were part of a caravan, you would probably want to park next to other members of your caravan. And the parking attendant would want to make his life easier by handling cars in groups (like maybe 10 at a time) and say "you next ten cars park in slot 1-10" ( parking attendants don't like to start with zero).

This improves allocation. You see, as Oracle puts it "an extent is a specific number of contiguous data blocks allocated for storing a specific type of information."

So instead of allocating space one data block at a time, extents let you allocate a group of datablocks at a time. Though those blocks get allocated (reserved) for a segment (hence a tablespace), they don't actually contain data until you INSERT data into your table (segment).

If you use DICTIONARY MANAGED TABLESPACES (DMT's) you have lots of options on what your initial extent size is (basically how many blocks), what you want your next extent size to be (when you outgrow your current allocation and need room for new rows) and if you want your new extents to grow in a certain pattern (using PCTINCREASE). This does not change the size of extents you've already allocated, just new ones (unless you rebuild using MOVE or something).

So, if you had INITIAL=32k, NEXT=64k, PCTINCREASE=100

then your first extent would be 32k in size, your second would be 64k, your third would be 128k, and your fourth would be 256k, and so on and so on.
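
In create-table terms, that sizing pattern is just (a sketch; the table is made up):

SQL> CREATE TABLE big_grower (id NUMBER)
2> STORAGE (INITIAL 32K NEXT 64K PCTINCREASE 100);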

As you can see, you would have many different extents of different sizes which can lead to fragmentation and even the possibility that you won't have enough room to even create your next extent (because the size requirement keeps doubling even if you just wanted to add a single row).

Wow, that sounds like it can cause problems, doesn't it? Well, it often does. So why does Oracle allow this to happen?

Because Oracle is an incredibly flexible piece of software. Oracle doesn't give you all these options because the developers (developers wrote it, not DBA's) couldn't figure out what would work best, they did it because they wanted to give you the ability to build your database the way you want to based on your needs.

Unfortunately, too many people don't bother to learn the basics or analyze the implications, they just follow a script they saw somewhere or make some guesses without reviewing them for impact.

Properly used, this flexibility is incredibly powerful. But DMT's have inherent issues that DBA's cannot address. This is basically because of the serial nature in which things HAVE to be allocated.

Oracle has relieved most of this allocation contention by moving allocation management from the entire database to the individual datafiles (where the lines could be a lot shorter). For that and other reasons, almost everybody now uses LOCALLY MANAGED TABLESPACES (LMT's).
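
Creating an LMT is straightforward, and with uniform extent sizing there's no INITIAL/NEXT/PCTINCREASE arithmetic to fret over (a sketch; the name, path and sizes are made up):

SQL> CREATE TABLESPACE app_data_lmt
2> DATAFILE '/u01/oradata/app_data_lmt01.dbf' SIZE 100M
3> EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;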
---------------------------

To summarize, table data management is the responsibility of the table's SEGMENT (logical), which employs EXTENTS (logical) to marshal BLOCKS (physical).

So, how do segments and extents relate to tablespaces and datafiles?
We know that datafiles and blocks are physical constructs while the other three are logical constructs.

Datafiles consist of one or more blocks (I've never seen a single block datafile, but I guess it's possible).

Tablespaces contain one or more segments (a tablespace can contain many tables and indexes).

If you've studied "create tablespace" and "create table" statements, you've seen that each allows for the storage clause (INITIAL, NEXT, MINEXTENTS, MAXEXTENTS, and PCTINCREASE, etc.). This is so you can setup defaults for all tables at the tablespace level or customize storage at the table level.

Extents always belong directly to the segment, but since segments reside in a tablespace, you have this option to set up storage parameters at the tablespace level if you want.

Ok, hopefully this was understandable because we are about to get to the last piece that will enable us to really start understanding most of what you'll ever need to know about how Oracle works (data blocks). Sounds too good to be true? Stay tuned.

Of course, we will come back to these topics again, but by then you'll be a master.

Wednesday 22 July 2009

Parts of the Database (1)

Parts of the Database

Why Oracle Works the Way it Does

Now, the smallest unit Oracle really messes with is the "data block". I'm going to skip over them for now because I think they'll make more sense after we discuss datafiles and tablespaces.

You'll remember (because I asked you to) that Oracle employs an elegant weaving of physical and logical layers. As we go through them, some may wonder "why do all these layers exist?" All these layers (I hope to discuss the most important ones) exist for: performance, scalability, concurrency, and manageability.

Ok, you have a wonderful database with tables and indexes and then, what do those lousy users do? They start using your database and start pumping more and more data into it. Where does all that data go? That's what we are going to talk about today.
--------------------------------
When I use MS Word to write a document, I have 2 options when I'm done working on it: I can throw it all away or I can save it to disk. If I choose to save it to disk, I don't just save it randomly anywhere on the disk, I save it someplace where I can easily find it again. I save it to a directory or folder. I may have multiple directories, one for technical documents, one for various versions of my resume, one for the exciting screenplays I write that will never be made into movies.

I can have as many documents as I want in those directories; I'm only limited by the capacity of the disk.

That makes sense, right?

Ok, so every document I save has a definite size to it. I can browse the directory and see how many files I have, how big each of them is, and what the total size of all my files in a directory is. So how big is my directory? Outside of some minimal space to hold the specification, it's really just a logical construct. The directory itself doesn't really have any size in its own right; it is defined by the size of all of the files in it. How big can it get? As big as it gets with files before I run out of disk space.

Congratulations, you now understand the basic concepts behind DATAFILES and TABLESPACES in Oracle. They don't work exactly that way, but pretty close.

Word doc= datafile
directory = tablespace

(A tablespace is in reality not exactly like an OS-level directory (it just functions in the same kind of way). Datafiles can only belong to a single tablespace, but can be physically located in more than one OS directory.)

All objects in an Oracle database are assigned to a tablespace, and the tablespace knows which datafiles it has to actually store the objects and data.

You don't specify the size of a tablespace; it doesn't really have a max size (unless you want to do the math in 10g to finally come up with thousands of petabytes).

You do, however, specify the size of datafiles. This is like telling a Word doc that you plan on saving a million words in your document so go ahead and reserve that space now. This preallocation is nice, because you don't suffer the overhead of trying to find new space to add every time you type in a new word.

So, what do you do when you start approaching your million word limit and see that you are going to need space for a lot more words? Well, you have two basic options: you can add another file and let the additional words go there, or let the files you have grow to make room.

In Oracle, if you want your datafiles to grow as needed, you use the AUTOEXTEND keyword when creating or altering the datafile (just make sure to set the max size so you don't chew up all the disk space).
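
A sketch of that (the path, increment and cap are made-up values; pick ones that suit your disks):

SQL> ALTER DATABASE DATAFILE '/u01/oradata/app_data01.dbf'
2> AUTOEXTEND ON NEXT 10M MAXSIZE 2G;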

So which is better, several smaller files or fewer larger files? Generically, neither is better but it is often easier to herd a couple of cats than a hundred. And if you're going to be a good DBA, you want to make your life as easy as you can.

If you oversize your datafiles, you could be wasting space (and time). If you undersize them and don't tell them how to grow (using AUTOEXTEND), you will be getting paged a lot with warnings that your tablespace is almost out of room.

Everything you do as a DBA should be done for a reason. If you choose to do no analysis of your environment and your needs, you'll be working a lot harder than you need to.

How many tablespaces should you have? Well, there are a couple of tablespaces every Oracle database requires to function, but we'll discuss those in the future. All we care about now is your data.

You probably already know about tables and indexes, right? Tables hold data and indexes (can) make it a lot quicker and easier to access that data.

A lot (and I mean a lot) of people like to separate their tables from their indexes and put them in different tablespaces (with datafiles on different disks). There's nothing inherently wrong with that if you have a reason to do it, but I want to mention one fallacy that too many people take as not only valid but as an absolute truth: that you MUST separate your indexes from your tables.

The theory goes that you want to reduce I/O contention on your disks by not fighting for read control between scanning indexes and scanning tables (rows). This is wrong on at least three fronts:
-your indexes will often (not always) be cached in memory anyway
-your disk heads are constantly bouncing around unless maybe you have a single user
-it means you don't understand how SQL queries work

I don't want to get into it too much here, but it is germane to the discussion of tablespaces.

When you issue a SELECT statement, the query optimizer will analyze your statement and decide how it wants to get the data. It will decide to use an index or not to use one. If it does use an index, it has to read the index BEFORE it knows where to find the row(s) in the table. First one, THEN the other. Access for this query will not simultaneously read the index and the table. It can't. So split them into different tablespaces if you want to, but don't let anyone tell you that you HAVE to.
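
Don't take my word for it; ask the optimizer. A sketch using the classic EMP demo table (it assumes EMP, an index on EMPNO, and a plan table exist in your schema):

SQL> EXPLAIN PLAN FOR SELECT * FROM emp WHERE empno = 7369;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

The plan shows the index access as a separate step feeding the table access: first one, then the other.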

Some people have lots of tablespaces; some people only have one (outside of the required ones). We'll talk about some possible reasons for each approach in the future. As long as you understand what tablespaces and datafiles are, we can move on for now. The main thing to be able to do at the end of this series is to be able to make intelligent choices. And a lot of times that just means trying something and seeing how it works. Never be afraid to change when you need to.

How does all of this really work?

All of your application's data and other objects will reside inside a tablespace and its related datafiles. If Oracle were not super friendly and easy to use, it would make you manage how that data is stored manually. Fortunately it is super friendly and easy to use. It's even flexible.

In older versions of Oracle you had to use Dictionary Managed Tablespaces (DMT's) (we'll talk about the data dictionary later). This was fine and still is, but it's not the best you can do. Since Oracle 8i you can take advantage of Locally Managed Tablespaces (LMT's).

LMT's offer so many advantages over DMT's (mainly because of things like allocation overhead and fragmentation) that few people choose or can find good reasons to use DMT's nowadays. It doesn't mean you're an idiot if you use DMT's, it just means people will think you are.

We have a lot more to discuss with datafiles and tablespaces, but this is enough to get us started and ready for the next topic.

Tuesday 21 July 2009

The Database

The Database

An Oracle server consists of two parts: the database and the instance. So what is the difference?

To understand the difference between a database and an instance, we need to know the basic definition of each and the relationship between them.

Definition:

A database is a set of files stored on disk.

An instance is a collection of Oracle background processes and shared memory.

Relationship:

An instance can mount and open one and only one database.

A database may be mounted and opened by one or more instances.

For example, in the case of Real Application Clusters (or RAC) a database is mounted and opened by two or more instances simultaneously.
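
A quick way to see the two halves by name on a running system:

SQL> SELECT INSTANCE_NAME FROM V$INSTANCE;

SQL> SELECT NAME FROM V$DATABASE;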

Thursday 16 July 2009

DataGuard

Data Guard

Data Guard is a mechanism for having a copy of one database mirror the activities happening on the main site. This mechanism is particularly applicable where recovery from a backup is unacceptable with regard to time and effort. This solution allows a quick 'failover' from machine to machine with the minimum of effort.

There are various levels of data transfer between live and Data Guard sites -
- Maximum Protection
- Maximum Availability
- Maximum Performance

Types of Data Guard.

Physical standby - The standby database is mounted in constant recovery and is unavailable for query.

Logical standby - The database is open (read only) and available for query. Extra indexes can be added to improve query performance turning the standby database into a reporting database as well. There are limitations to the types of data supported by this type of standby database. Please see the documentation for your particular database version and requirements.

The background.

Data Guard continuously copies archive logs from one machine to 'another' and then applies them to a 'backup' database, which is in a constant state of recovery.

If the main database / machine should fail then the backup machine / database can take over.

All directories and filenames are the same as on the live machine. This allows the elimination of filename conversion and any other associated conversion and the inevitable confusion that arises.

How to Put the Standby Live.

1. Shut down the live database.

2. Copy all redo logs to the backup machine (if possible).

On Athena (the current Data Guard backup machine) issue the following commands when connected as / as sysdba.

Identify and resolve any archived log gaps.
SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
This will show all gaps of log files and what may need to be copied across from the original machine.

Register any unapplied log files just copied across.
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'arcxxxx.log';

Repeat the last two steps until the query returns nothing.

3. Copy in the redo logs to bring the database up to the most recent transaction level.

4. Finish managed recovery.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

5. Switch the backup to take over from the primary database.
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

After issuing this SQL statement, you can no longer use this database as a standby database and subsequent redo logs from the original primary database cannot be applied.

6. SQL> SHUTDOWN IMMEDIATE

7. SQL> STARTUP

To resume the original machine / database combination, the original database should be recreated as a standby database and then you should perform a switchover.

Stopping / Starting the Standby Database.

The data guard database is started and stopped in a slightly different manner to a normal database.

Stopping the database.
Connect / as sysdba
SQL> SHUTDOWN IMMEDIATE;

Starting the database.
Connect / as sysdba
SQL> STARTUP NOMOUNT;
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

The 'disconnect from session' allows the recover command to return you to the command line and run recovery in the background. (Otherwise you would have to leave a SQL window open.)


Basic Commands

To see what activity is taking place on the standby site:
SQL> SELECT * FROM V$MANAGED_STANDBY;

Determine how much current redo data is available in the redo log.

SQL>SELECT THREAD#, SEQUENCE#, BLOCK#, BLOCKS
2> FROM V$MANAGED_STANDBY WHERE STATUS='RECEIVING';

Obtain a list of the archived redo logs that were applied or are currently pending application to the standby database.

SQL> SELECT SUBSTR(NAME,1,25) FILE_NAME, SEQUENCE#, APPLIED
2> FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

Oracle and OS

Oracle Database Architecture.
-----------------------------

Choosing the correct database architecture (hardware platform, operating system) is critical to the success of any new Oracle database application. Architecture decisions were simple when the mainframe was the only architecture available. But architecture selection is not as clear-cut now that Microsoft Windows platforms, UNIX and LINUX are viable database environments. Oracle is more than happy to offer their flagship database on a wide variety of hardware and operating system combinations. As a result, IT shops have more hardware and software choices available to them than ever before. The key to success is choosing the right ones.

A correctly chosen architecture will result in an application that performs to expectations, scales easily to meet increased user demands, and is easily administered. Incorrect architecture decisions may result in poor performance, limited functionality, complex administration and tuning, lack of scalability, poor vendor support, poor reliability and availability and a high total cost of ownership.

Architecture decisions should not be made in a vacuum or based on personal preference. It is important for all large IT shops to create a part-time architecture selection team consisting of members drawn from the following areas: LAN/WAN communications, O/S support, operations, architectures and database administration. This team should be responsible for determining the best architecture for a particular application.

Extensive up-front analysis is critical during the architecture selection process, so let’s continue our discussion on Oracle database architectures by comparing three of the more popular Oracle database platforms: Oracle on Windows, Oracle on LINUX and Oracle on Unix. We’ll conclude with a database architecture comparison chart that compares the Windows, LINUX and Unix architectures.

Oracle on Microsoft Windows Platforms.
--------------------------------------
Microsoft Windows' success as a server operating system began when Microsoft released SQL Server 4.2 for the Windows NT operating system in 1994. Microsoft SQL Server 4.2 was a full-function relational database server with an attractive price tag. The traditional database competitors (Oracle, Informix, Sybase, IBM) realized that in order to compete in the departmental arena, they must begin to market a scaled-down (and cheaper) version of their enterprise UNIX databases. Some of the workgroup database server products were simply the vendor's flagship enterprise database repackaged and re-priced to appeal to a different group of customers.

Database vendor competition is fierce in this environment. Microsoft's attempt to dominate this architecture tier with SQL Server is forcing all vendors to adjust pricing and accelerate the release of new products as well as enhancements to existing products. This high level of competition between vendors allows consumers to take advantage of a high level of functionality at a reasonable cost.

The Windows server operating systems are known for their low cost and ease of use. The Windows operating system is more easily administered than UNIX and Linux. Windows servers use the same type of interface as their Windows client operating system counterparts. Administrators are able to perform a large percentage of their administrative duties through a familiar point-and-click GUI. Windows operating system manuals use familiar PC terminology when discussing concepts of general administration. Some flavors of UNIX have GUI administrative tools, but the majority of administrative duties are still performed via a command line interface using rather cryptic operating system commands (grep, awk, ps -ef, cat).

Database vendors realize that Windows servers oftentimes exist outside of the traditional information processing support framework. As a result, general ease of use is not a competitive advantage but a competitive requirement. Windows database products can be characterized by their easy installation, ease of use, limited tuning options, and good documentation. All vendors offer strong visual administration and monitoring tools that do not require the administration skill and intimate knowledge of the database required by their UNIX and LINUX counterparts.

The fastest Windows database title moves from vendor to vendor due to strong vendor competition and fast-paced advancements in hardware and operating systems. Clustering technologies from Microsoft and the Intel hardware vendors allow technicians to connect multiple hardware platforms together and make them appear to end-users as a single-image environment. Clustering provides Windows environments with both scale-up (scalability within the hardware chassis by adding hardware components) and scale-out (scalability by adding additional hardware chassis) capabilities.

Although not as easily measured as performance, reliability must be considered when evaluating database architectures. In the beginning, few consumers considered Windows NT a viable alternative in this arena. Consumers' faith in Windows reliability is growing as the Windows operating system family matures and new functionality is added by Microsoft, hardware vendors, and third-party software solutions providers. Oracle offers highly available solutions for Windows environments with its Oracle Data Guard, Oracle Fail Safe and Oracle RAC for Windows product sets.

Oracle on LINUX Platforms.
--------------------------
Running Oracle on LINUX/Intel platforms is rapidly gaining in popularity. The Oracle Corporation is fostering LINUX’s credibility as a mission-critical operating system by being the first vendor to market both an enterprise relational database and an enterprise application server for the operating system.

Oracle, Dell and Red Hat have formed a partnership to enhance the LINUX/Intel environment in the areas of performance, reliability, clustering, and manageability. The three partners are collaborating to develop and market enterprise-ready LINUX solutions based on Dell PowerEdge servers, Dell/EMC and PowerVault storage systems, Oracle9i Database Release 2 with Real Application Clusters, and Red Hat LINUX Advanced Server.

Oracle also provides direct support for the Red Hat LINUX Advanced Server operating system. Oracle customers who use the Red Hat LINUX Advanced Server operating system will be able to call Oracle support for both LINUX and Oracle issues.

LINUX combines the ease of use and cost-effectiveness of Intel hardware platforms with the performance, reliability and availability of enterprise-class UNIX servers. If the application requires more processing power, CPUs, memory and disk can be purchased without making a major impact on the business unit’s operating budget. LINUX is designed to be user-friendly and easy to install. LINUX can be installed on a home PC as well as a network server for a fraction of the cost of competing software packages.

Compared to their Windows and UNIX counterparts, the majority of LINUX vendors have only recently begun to expend resources on providing highly available architectures to consumers desiring fault tolerant systems for mission-critical applications. The reasons for this lack of focus are:

The LINUX operating system is relatively new and has only recently become popular with corporate consumers. Historically, large corporations have not considered LINUX a viable corporate operating system. One can only assume that most corporations felt that anything that was free wasn't a viable product. Only when Red Hat and other competing vendors began to formalize and "corporatize" their LINUX offerings did corporate consumers begin to consider LINUX as an alternative to Windows and UNIX.

Consumer perception (or misconception) that the UNIX operating systems are inherently more reliable than their LINUX counterparts resulting in a lack of consumer demand for highly available LINUX servers.

Market analysis showing that product cost, and not high availability, is the primary reason consumers choose LINUX environments.

The cost of highly available LINUX servers approaching the bottom tier prices of enterprise Unix environments that have many fault tolerant features inherent to their architectures.

LINUX hardware server platforms not being perceived by some consumers as highly available when compared to competing UNIX hardware offerings from Sun, HP, IBM, etc.

Will LINUX continue to gain in popularity? The answer is absolutely! LINUX growth will continue to be fed by the rapidly improving price/performance ratio of Intel server hardware and the increasing acceptance of LINUX as an enterprise-ready operating system by both corporate consumers and third-party application vendors.

Oracle on Unix Platforms.
-------------------------
Unlike LINUX, which has only recently become popular, UNIX platforms have been the perennial favorites of shops desiring to build non-mainframe applications. UNIX platforms were the only alternative to mainframes before Microsoft Windows and LINUX evolved into viable alternatives.

Sun, HP, Data General and Sequent were competing for the corporate consumer’s dollar years before the client/server revolution popularized non-mainframe architectures. As a result, there are a large number of third party tools and applications available for UNIX architectures. In addition, competition between hardware and database vendors in this environment is fierce, resulting in rapid advances in technology and features.

UNIX databases lead all other architectures in database software options and add-ons available to consumers. Hardware vendor (HP, SUN, IBM, etc.) competition in this tier is fierce. Because of this competition, hardware vendors are required to release new hardware and O/S features on a regular basis to survive, making this tier the most technically advanced of the three. Performance and reliability are the key advantages of this environment, while cost and complex administration are the drawbacks.

Traditional UNIX vendors (with Sun leading the way) realize that to compete against Windows and LINUX they must make their operating systems easier to use. Most UNIX operating systems now provide GUI administrative tools, but the majority of administrative duties are still performed via command line interface using rather cryptic operating system commands (grep, awk, ps -ef, cat).

Although alike in many ways, each UNIX operating system has its own variations of system commands, administrative processes and procedures. This complicates application development, operating system support, and database administration for shops that run different versions of the UNIX operating system. Operating system administrators are unable to make a seamless transition from one environment to another and administrators must be trained in each operating system to effectively administer it.

Performance is a key selling point for vendors in this environment and, as a result, enterprise UNIX hardware, operating systems and database software are all designed for high performance. High-end enterprise servers have the capability of supporting hundreds of gigabytes of disk and can certainly support many hundreds of users. Until recently, the majority of TPC-C and TPC-D benchmark records were held by hardware platforms running some flavor of the UNIX operating system. Windows and Linux are now beginning to challenge UNIX's position as performance leader, recently capturing several TPC-C and TPC-D benchmarks.

All enterprise UNIX vendors must stress reliability to be competitive in the enterprise marketplace. In addition, many offer additional products that take advantage of UNIX clustering services. UNIX has decades of continuous development behind it, making it the most mature of the three operating systems, with a proven track record of reliability.


Venturing into the Great Unknown – New Database Architectures.
--------------------------------------------------------------
If a new database architecture is selected (one that has never been used before), an additional set of concerns must be addressed:

Venturing into new architectures will increase development time and training costs. Can the additional time and higher application development costs be justified?

If the desired architecture requires support from the development area, can application personnel effectively administer the new architecture?

Is the additional functionality or cost reduction the new architecture provides worth any additional risk to the application? This is often described as the comfort ratio.

What is the business unit’s, application area’s and the IT department’s vision of the future?

Wednesday 15 July 2009

Killing a session in Oracle

Sessions in Oracle can be killed with the command:

SQL> alter system kill session 'sid,serial#';

The values for sid and serial# can be queried using one of the following queries:

select * from v$session
or
select sid, serial#, osuser, program
from v$session;

After this command, the indicated session is marked for kill; the actual kill happens as soon as possible. Sometimes this can take a while (for example, when a lot of rollback is needed).
To kill a session faster you can add the keyword IMMEDIATE, like this:

SQL> alter system kill session 'sid,serial#' immediate;
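As a quick end-to-end illustration, suppose the query of v$session shows an unwanted session with sid 145 and serial# 233 (made-up values for this sketch; substitute the ones your own query returns):

SQL> select sid, serial#, osuser, program
2> from v$session;
-- suppose this returns a row with SID=145 and SERIAL#=233
SQL> alter system kill session '145,233' immediate;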

Tuesday 14 July 2009

The Oracle decode function

The decode function can be used in SQL for an IF-THEN-ELSE construction. It is an alternative to the CASE expression, which was introduced in Oracle 8i.


Syntax:
decode( expression, compare_value, return_value [, compare_value, return_value ] ... [, default_return_value ] )
with:
expression is the value to evaluate
compare_value is the value that can match the evaluated expression
return_value is the value that is returned if compare_value equals expression
default_return_value is the value that is returned if no match is found.

To evaluate this expression, Oracle compares the expression to each compare_value one by one. If expression is equal to a compare_value, Oracle returns the corresponding return_value. If no match is found, Oracle returns the default_return_value. If no default value is specified, the null value will be returned.

Sample code
select id, decode(status, 'A', 'Accepted', 'D', 'Denied', 'Other') status_text
from contracts;

Will return for each id:
If status = 'A' : 'Accepted'
If status = 'D' : 'Denied'
Else : 'Other'

Oracle automatically converts expression and each compare_value to the datatype of the first compare_value. The return value is likewise converted to the datatype of the first return_value. If the first result has the datatype CHAR, or if the first result is null, Oracle converts the return value to the datatype VARCHAR2.
Note: two null values are considered equivalent in the decode statement.
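
For comparison, here is the same logic written with a CASE expression against the same hypothetical contracts table:

select id,
       case status
         when 'A' then 'Accepted'
         when 'D' then 'Denied'
         else 'Other'
       end status_text
from contracts;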

Monday 13 July 2009

Oracle Database Admin

Oracle Database Objects

The following sections provide some interesting information on Oracle database objects (tablespaces, tables, indexes). To keep the reading lively, I'll intersperse some tuning information and hints, tips and tricks in the discussion. This section is not an all-encompassing guide to administering Oracle objects; rather, it is a collection of information that I felt was important to all DBAs, regardless of experience level.

What Database Are You Working In?

Working in the wrong database is a common problem for database experts as well as their less experienced counterparts. How many times have YOU found yourself running statements in the wrong environment? Feel free to include me in that not so select group. The operating system command SET can be used on Windows systems to display environment variables. The ENV command can be used to display the environment variables in UNIX. Many seasoned database administrators change their UNIX shell prompt in their .profile to display the current Oracle SID. Displaying the current Oracle SID in the shell's prompt provides a continuous reminder to the DBA of the database they are working in.
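
A quick check from within SQL*PLUS itself is to ask the database directly; the glogin.sql file shown below uses the first of these queries to build the prompt:

SQL> select instance_name from v$instance;
SQL> select name from v$database;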

GLOGIN.SQL and LOGIN.SQL

Administrators are able to use two configuration files, glogin.sql and login.sql, to customize their SQL*PLUS environment (including the prompt). When a user starts SQL*PLUS and connects to the database, SQL*PLUS will execute the contents of the glogin.sql configuration file in $ORACLE_HOME/sqlplus/admin.

After the glogin.sql file has been executed, SQL*PLUS will execute login.sql. Oracle will look for the login.sql file in the current working directory (where you started SQL*PLUS) and in the directories listed in the SQLPATH operating system environment variable. Remember that the statements in the login.sql file take precedence over glogin.sql. The exact names of these files may be different on some operating systems. Check the Oracle installation and administration guides provided for your operating system for the exact names.

Here's an example of my glogin.sql file on my PC that displays the time and instance name in my SQL*PLUS prompt. The file also contains a few formatting commands to format SQL*PLUS output.

Glogin.sql
-- column formats for common DBA queries
COLUMN file_name FORMAT a44
COLUMN tablespace_name FORMAT a20
COLUMN owner FORMAT a15
COLUMN segment_name FORMAT a20
-- widen the display line and lengthen the page
set lines 132
set pages 100
-- silently fetch the instance name and embed it in the prompt
set termout off
col dbname new_value prompt_dbname
select instance_name dbname from v$instance;
set sqlprompt "&&prompt_dbname> "
set termout on
-- show the current time in the prompt
set time on


Choosing a Database Block Size

Many database specifications can be changed after the database has been created. For releases prior to Oracle9i, one important exception is the database block size. Although Oracle9i allows you to specify different block sizes, choosing the default block size for an Oracle9i database is still critical.

Configuring the Oracle database to use bigger blocks often leads to an increase in performance since bigger blocks allow more data to be transferred per I/O call (the database block is the unit of I/O for the database engine). Larger block sizes also allow more key values to be stored in B-tree index blocks, which reduces the index's height and improves the performance of SQL statements that use the index structures.

Since you are storing more data per block, bigger blocks may increase the number of transactions that access data and index blocks concurrently. If you have a very high number of concurrent users, you may need to adjust the initrans and maxtrans parameters for data objects that have a higher than normal transactional concurrency.
A few quick thoughts on database block sizes:

A block size of 2048 used to be the preferred database block size. This was before the era of high-speed disk drives and controllers. Now that more data can be transferred more quickly, 2048 has been replaced with 8192.

A database block size of 8192 is currently the recommended block size for most database configurations.

Use database block sizes of 16K and 32K for applications that have row sizes greater than 8K. If the application data has row sizes that are greater than 8K, using 16K or 32K block sizes allows more rows to be stored per block and decreases the I/O costs of accessing a single or multiple row(s).

Use database block sizes of 16K and 32K for data warehouses and decision support systems. Decision support systems and data warehouses, by their nature, access large volumes of data to provide users with the information required to make business decisions. You will reduce disk I/O by storing more data in each block.

The Oracle Database Fundamentals I (1Z0-031) and the Oracle Database Performance Tuning (1Z0-033) certification tests will have a few questions on database blocks. Test candidates should know:
The db_block_size parameter
How to specify non standard block sizes in Oracle9i
What tablespaces must use the default Oracle block size
How to change the default Oracle block size
The contents of an Oracle block
What effect the PCTFREE and PCTUSED parameters have on data block free space
The differences between PCTFREE/PCTUSED and automatic segment space management
What row chaining and row migration are and the effect they have on database performance
How to correctly size the default Oracle block size
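
To check the default block size of an existing database, query the parameter directly (a quick sketch; SQL*PLUS's show parameter command gives the same answer):

SQL> show parameter db_block_size
SQL> select value from v$parameter where name = 'db_block_size';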

Copying Databases Between Servers

Don't use the EXPORT/IMPORT utility to copy databases between servers running the same operating system. Execute the following steps to speed the transfer:
1. Execute the ALTER DATABASE BACKUP CONTROLFILE TO TRACE; statement on the source server.
2. Bring the database down and copy the trace file, parameter file, all datafiles, control files and redo logs to the new server.
3. Make the following changes to the trace file created in step 1:
o If you are changing the database name, change the first line of the create statement to reflect the new name and change the REUSE keyword to SET.
o Change NORESETLOGS to RESETLOGS.
o Change directory names if they have changed on the new database server.
o Delete all comments and lines that have a # in front of them (#s aren't comments in all Oracle tools).
4. Connect as INTERNAL and run the SQL statement contained in the trace file. It will start up the database in NOMOUNT stage and recreate the control files. The SQL statement's final step will be to MOUNT and then OPEN the database.

What happens if you can’t take the database down? You’ll have to execute a hot backup, which means your database will have to be in ARCHIVELOG mode. Here are the steps you will have to perform to copy an online database to a new server.

Execute the hot backup, making sure all of the files are backed up. Note the time after the last tablespace is backed up.

Execute ALTER SYSTEM SWITCH LOGFILE; to archive the last redo log that was active during the hot backup.

Execute the ALTER DATABASE BACKUP CONTROLFILE TO TRACE; statement on the source server.

Execute the ALTER DATABASE BACKUP CONTROLFILE TO filename; statement on the source server.

Copy the datafile backups, both controlfile backups (backup to trace, backup to filename) and the archived redo logs that were generated during the hot backups. Make sure you match the directory structures that are on the source platform.

Execute the RECOVER DATABASE UNTIL TIME XXXXX USING BACKUP CONTROLFILE; to recover the database on the new platform. Pick a time just after the hot backup was complete. If the file or directory names will be different on the target server, bring the database to MOUNT stage and issue the ALTER DATABASE RENAME FILE oldfilename TO newfilename command to point the control file to the new locations.

If you need to change the database name, make the following changes to the trace file created by the BACKUP CONTROLFILE TO TRACE statement above:
o Change the first line of the create statement to reflect the new name and change the REUSE keyword to SET.
o Change NORESETLOGS to RESETLOGS.
o Change directory names if they have changed on the new database server.
o Delete all comments and lines that have a # in front of them (#s aren't comments in all Oracle tools).
o Connect as INTERNAL and run the SQL statement contained in the trace file. It will start up the database in NOMOUNT stage and recreate the control files. The SQL statement’s final step will be to MOUNT and then OPEN the database.
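
For reference, here is a minimal sketch of the hot backup step itself, assuming a hypothetical tablespace named USERS; repeat the BEGIN BACKUP / copy / END BACKUP sequence for every tablespace in the database:

ALTER TABLESPACE users BEGIN BACKUP;
-- copy the datafiles belonging to USERS with an operating system utility
ALTER TABLESPACE users END BACKUP;
ALTER SYSTEM SWITCH LOGFILE;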

Oracle Tablespaces

Students that are new to Oracle often become confused when instructors begin discussing Oracle’s logical and physical storage structures. Every Oracle database is made up of data files that contain the database’s data. The logical database structures (tables, indexes, etc.) define and format the data that is physically stored in the datafiles. The logical storage structure that ties the physical structures (files, blocks of data) to the logical structures (tables, indexes, etc.) is the tablespace. The DBA specifies the datafile name, size, and location during tablespace creation. The physical datafile(s) are created when the create tablespace statement is executed. The DBA then assigns various logical objects to a specific tablespace during their creation.
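
For example, here is a minimal create tablespace statement, followed by a table assigned to it (the names, path and sizes are hypothetical):

CREATE TABLESPACE app_data
DATAFILE 'd:\oradata\orcl\app_data01.dbf' SIZE 100M;

CREATE TABLE customers (custid NUMBER(9), name VARCHAR2(30))
TABLESPACE app_data;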

Temporary Tablespaces
Sorting rows as efficiently as possible is one of the keys to high performance database applications. Oracle will automatically perform sorting operations on row data requested by CREATE INDEX, SQL ORDER BY, SQL GROUP BY statements and some join operations. For optimal performance, most sorts should occur in memory. Oracle8i allocates a private sort area (defined by the sort_area_size and sort_area_retained_size parameters) while Oracle9i also allows a shared sort area (pga_aggregate_target parameter) to be defined.
For many applications, sorting on disk cannot be avoided. Administrators are able to increase the performance of disk sorts by creating a tablespace that is optimized for sorting. Data sorted in a permanent tablespace requires many space allocation calls to allocate and deallocate temporary segments. If a sort tablespace is declared to be temporary, all processes requesting a sort operation share one sort segment in that tablespace. The first user process to sort creates the initial sort segment. All other sort operations share that sort segment by taking extents in the segment that was initially created.

Temporary tablespaces improve the performance of sorts that cannot be performed completely in memory. The performance benefits of bypassing the normal space allocation mechanism should not be taken lightly. The processing costs of each space allocation/deallocation execution can be roughly compared to the resources consumed by ten insert/update/delete statements.
You declare a tablespace to be temporary by using the TEMPORARY keyword of the CREATE TABLESPACE and ALTER TABLESPACE commands. It is important to note that temporary tablespaces cannot contain permanent objects (tables, indexes, rollback segments, etc.).
The Oracle Database Fundamentals I (1Z0-031) and the Oracle Database Performance Tuning (1Z0-033) certification tests will have a few questions on temporary tablespaces. Test candidates should have a firm understanding of:
What the temporary tablespace is used for
The differences between a permanent and a temporary tablespace used for sorting
How to create a default temporary tablespace in Oracle9i
The impact default temporary tablespaces have on user administration
Where the user sorts if a default temporary tablespace is not specified
Create Tablespace Temporary vs. Create Temporary Tablespace
The Oracle SQL reference manual provides two different statements to create a temporary tablespace:

CREATE TABLESPACE temp DATAFILE 'd:\oradata\orcl\tempdata.dbf' SIZE 100M TEMPORARY;

This statement creates a tablespace that, although described as "temporary" in dba_tablespaces, has one or more permanent datafiles associated with it. The tablespace is designed to hold sort data and manages segments and extents differently (see above) than its permanent tablespace counterparts.

CREATE TEMPORARY TABLESPACE temp TEMPFILE 'd:\oradata\tempdata.dbf' SIZE 100M;

This statement creates a true temporary tablespace. This eliminates the need to back up the temporary tablespace, which results in faster backups and a reduction in disk space. Information describing true temporary tablespaces is found in the v$tempfile and dba_temp_files views. No information is stored in the dba_data_files and v$datafile views, which describe permanent datafiles.

The database will start even if this file is missing. The only way the administrator will know that the file is gone is when the first sort tries to overflow to disk. A very explicit error message will be generated stating that the sort could not complete due to a missing or invalid sort file. The administrator will be forced to drop and recreate the temporary tablespace before sorting to disk can take place.
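
To verify which tempfiles the database currently knows about, query the view mentioned above:

SQL> select file_name, tablespace_name, bytes from dba_temp_files;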

Partitioning
Data partitioning is an absolute requirement for the administration and management of large database tables and indexes. In Oracle, a partitioned table is divided into components called tablespace partitions. All table partitions have the same logical attributes (columns, datatypes, and integrity constraints). Oracle allows administrators to store each partition in a separate tablespace. Separate tablespaces allow each partition to have different physical storage characteristics (PCTFREE, PCTUSED, PCTINCREASE, etc.).

Partitioning of data into separate tablespaces provides the following advantages:
Increases availability - Data corruption is less likely to occur across multiple tablespaces. If data does become corrupted in a single, partitioned tablespace, all other partitions are still available for queries and DML. In addition, you can perform certain administrative operations against a single tablespace partition. Once again, all other tablespace partitions remain unaffected and are available for access.

Easier administration - Administrative operations (import/export, analyze, backup/recovery and load) can be performed on individual partitions. Remaining partitioned tablespaces continue to be available for access.

Partitioning allows applications to take advantage of "rolling window" data operations. Rolling windows allow administrators to roll off (and unplug, using Oracle's transportable tablespace feature) data that is no longer needed. For example, a DBA may roll off the data in the tablespace containing last April's data as they add this year's data for April. If the data is ever needed again, administrators are able to pull the data from tape and plug it back into the database using the transportable tablespace feature.

Increases performance - Partitioning allows you to distribute data and balance the I/O load across several devices. The Oracle optimizer is partition aware and will create query plans that access only those partitions and subpartitions needed to satisfy the query's request (partition pruning). Partition pruning is critical in providing quick access to data that is logically grouped together (i.e. date, customer id, etc.).

Partition pruning allows administrators to create large data stores and still provide fast access to the data. There is no difference in query performance between a 20 GIG database and a 200 GIG database if the optimizer prunes the data to create access paths to only those partitions required to solve the query. Partitioned tablespaces also increase the performance of bulk data loads. Oracle’s SQL*Loader supports concurrent loading of individual partitions and entire partitioned tables.

Oracle 8 - Range Partitioning
Oracle8 introduced the first partitioning technique called range partitioning. To create a range partitioned table in Oracle, you code a partitioning clause for the table that includes a key-based specification that is used to map rows to specific partitions and a partition description that describes each partition:

CREATE TABLE sales_account_history_data
(acct_no NUMBER(5),
person VARCHAR2(30),
week_no NUMBER(2))
PARTITION BY RANGE (week_no)
(PARTITION p1 VALUES LESS THAN (4) TABLESPACE TSP1,
PARTITION p2 VALUES LESS THAN (8) TABLESPACE TSP2,
-- ... partitions p3 through px-1 omitted ...
PARTITION px VALUES LESS THAN (53) TABLESPACE TSPx);


Rows are placed in the different partitions based on the table's partitioning key. A partitioning key is a column or set of columns whose values determine the partition (and therefore the tablespace) in which each row is stored. The example above shows a table that is partitioned by a range of values based on a number.

With range partitioning, the best partition keys are dates, primary keys or foreign key columns. To prevent the overhead associated with migrating rows among partitions, application programs should not update partitioning key columns. To change a value in one of these columns, the application program should delete the row and then reinsert it with the new values. To define a partition, an upper boundary of partitioning key values is hard-coded into the table’s definition. These upper boundaries should distribute the data (more or less) evenly among the different partitions.

Index Partitioning

Oracle also allows you to create range-partitioned indexes. Oracle uses the same range-partitioning algorithm as it does for tables. Like partitioned tables, Oracle maps rows to specific index partitions based on the index’s partitioning key. The partitioning key of an index must include one or more of the columns that define the index. This allows administrators to create an index that contains multiple columns but partitions the index on a subset of those columns. An index partition is defined exactly like its partitioned table counterpart; an upper boundary of partitioning key values is hard-coded into each tablespace specification.

Equi-Partitioned Objects

An index and a table that have the same number of partitions are said to be equi-partitioned. Multiple tables and multiple indexes can be equi-partitioned if they meet Oracle's equi-partition specifications. Please refer to the Oracle Server Concepts manual for a more complete listing of equi-partitioning specifications.

Equi-partitioned objects improve query performance by reducing the number of rows being sorted and joined. The Oracle optimizer is also partition aware. If the optimizer determines that a single index/table partition combination will satisfy a query, it will create an execution plan that accesses the single index and table partition.

Local Indexes

You can ensure that a table and its associated index are equi-partitioned by specifying the local parameter during index creation. All keys in a local index partition point to rows stored in a single table partition. When creating an index using the local parameter, range specifications and a maximum value do not need to be specified. Oracle automatically partitions the local index on the same partitioning key columns as the table it references, creates the same number of partitions and gives each index partition the same partitioning key boundaries.
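
A sketch of a local index on the range-partitioned sales_account_history_data table created earlier (the index name is hypothetical; no partition boundaries are specified because Oracle derives them from the table):

CREATE INDEX sales_hist_week_idx
ON sales_account_history_data (week_no)
LOCAL;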

Global Indexes
Unlike their local index counterparts, index keys in a global index may point to rows in more than one table partition. To create a global index, you specify the GLOBAL parameter (the default) during index creation. Global indexes can be equi-partitioned with the tables they reference, but Oracle will not take advantage of equi-partitioning when generating query plans or executing maintenance operations; the entire index will be affected. To take advantage of Oracle's equi-partitioned performance improvements, the index must be created as local.

Oracle8i – Hash and Range/Hash Composite Partitioning
Oracle8i enhanced the Oracle database’s partitioning feature by providing two new partitioning techniques, hash and range/hash composite.

Hash Partitioning
Hash partitioning allows data to be evenly striped across devices. Hash partitioning uses a hashing algorithm on the partitioning columns to determine row placement in the tablespaces. The more unique the key values are, the more efficient the hashing algorithm will be. In general, the hash-partitioning key should be unique or near unique.

Hash partitioning attempts to evenly divide data among all available partitions. There is no way for the administrator to logically group data together in a partition. As a result, hash partitioning prohibits "partition pruning" from occurring. Hash partitioning also prevents "rolling window" operations from occurring. Rolling windows allow administrators to roll off data that is no longer needed.
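
A minimal hash-partitioned table sketch (the table, column and tablespace names are hypothetical):

CREATE TABLE sales_hash
(acct_no NUMBER(9),
sale_amt NUMBER(10,2))
PARTITION BY HASH (acct_no)
PARTITIONS 4
STORE IN (tsp1, tsp2, tsp3, tsp4);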

Combining Range and Hash Partitioning – Range/Hash Composite Partitioning
Range/hash composite partitioning combines the best of both approaches. A range of values first partitions data then each partition is sub-partitioned into several hash partitions. This provides administrators with the ability to provide customized partitioning per application. The user specifies ranges of values for the primary partitions of the table or index then specifies a number of hash subpartitions.
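
A sketch of a range/hash composite specification (the names are hypothetical; each range partition is split into four hash subpartitions):

CREATE TABLE sales_composite
(acct_no NUMBER(9),
week_no NUMBER(2))
PARTITION BY RANGE (week_no)
SUBPARTITION BY HASH (acct_no)
SUBPARTITIONS 4 STORE IN (tsp1, tsp2, tsp3, tsp4)
(PARTITION p1 VALUES LESS THAN (27),
PARTITION p2 VALUES LESS THAN (53));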

Oracle9i – List and Range/List Composite Partitioning

Oracle9i continues to add partitioning functionality to the Oracle database by providing two new partitioning algorithms, list and range/list composite.

List Partitioning
List partitioning allows the administrator to map a row to a specific tablespace based on the partitioning key value. The partition is defined by specifying a hard-coded list of individual partition key values. In Oracle9i Release 1, there was no overflow area to hold rows that had keys that were not identified in the list specifications. If an application program attempted to insert a row that had a partitioning key value that did not match any of the specifications, the row was not inserted and an Oracle error code was returned.

Oracle9i Release 2 solves this problem by allowing administrators to specify a default partition. Key values that do not match the list partition specifications are placed in the overflow tablespace. A new DEFAULT attribute has been added to the list partition specification. The example below shows the DEFAULT attribute being specified during partitioned table creation:

CREATE TABLE sales
(salesid NUMBER(9), transdate DATE, state VARCHAR2(30), ...)
PARTITION BY LIST (state)
(PARTITION region_north VALUES ('New York', 'Maine'),
PARTITION region_south VALUES ('Florida', 'Georgia'),
...
PARTITION region_ovflw VALUES (DEFAULT));

Range/List Composite Partitioning
Oracle’s latest release offers a hybrid solution for list partitioning that is a combination of the range and list partitioning techniques. The table is first partitioned by range and then subpartitioned using the list partitioning technique. Unlike composite range-hash partitioning, each subpartition contains a logical division of the data that is specified by the DBA (as opposed to the range-hash technique where the subpartitions are selected by the hashing algorithm itself).

For example, the primary partition could be based on a date range to allow rolling window operations to occur and the second level could be based on a logical grouping (a list of states (i.e. Pennsylvania, New York, Ohio, etc.) is a good example of a logical grouping). The data would be divided according to the date range and divided again according to the state's name. The example below shows a table using the range-list composite partitioning technique:

CREATE TABLE sales
(salesid NUMBER(9), transdate DATE, state VARCHAR2(30), ...)
PARTITION BY RANGE (transdate)
SUBPARTITION BY LIST (state)
SUBPARTITION TEMPLATE
(SUBPARTITION region_north VALUES ('New York', 'Maine'),
SUBPARTITION region_south VALUES ('Florida', 'Georgia'))
(PARTITION q1_2002 VALUES LESS THAN
(TO_DATE('01-APR-2002', 'DD-MON-YYYY')),
PARTITION q2_2002 VALUES LESS THAN ...

The CREATE TABLE statement above uses the SUBPARTITION TEMPLATE clause to define the subpartitions. If all subpartitions have the same definition, administrators are able to code the subpartition specification once, and Oracle will apply the template specification to each main partition that does not override the template with its own subpartition specification. If the template were not used, the subpartition clause would have to be specified after each main partition specification.

Which Partitioning Technique Do I Choose?
Choose range partitioning when the partitioning keys have distinct ranges that can be easily defined and do not vary dramatically. The problem with range partitioning is trying to determine what the range specifications are. The DBA should select a range of values that evenly divides the data among the individual tablespace partitions. Dates are excellent candidates for range partitioning keys because they allow the previously discussed rolling window operations to occur. If range partitioning will result in partitions that vary dramatically in size because of unequal key distribution, the DBA should consider other partitioning techniques.

If the DBA wants to use partitioning for performance and manageability but there are no columns that have distinct key ranges, the hash partitioning technique provides an excellent alternative. Since the rows are mapped based on a hashing algorithm, the higher the cardinality (the number of different values) the partitioning key contains, the more evenly the data will be divided among the different partitions. Many DBAs create an artificial series of numbers that are used as input to the hashing algorithm. These unique values ensure that the hashing algorithm evenly divides the rows among the tablespace partitions.

If you need specific control over the mapping of rows to tablespace partitions, use the list partitioning technique. List partitioning allows the DBA to logically group the data among the different partitions. Sales region would be an excellent example of a column that would lend itself to the list partitioning technique. The DBA would be able to divide the data among the partitions based on the company’s different sales regions.

Range/hash composite partitioning provides the manageability and availability benefits of range partitioning with the data distribution advantages of hash partitioning. Data skew is unlikely, because the user can always add or drop subpartitions within a partition to maintain even distribution of each container. Rolling windows of historical data are easily maintained by adding or dropping primary partitions with no effect on subpartitions in other primary partitions.
The range-list composite partitioning method provides for partitioning based on a two-level hierarchy. For example, the primary partition could be based on a date range to allow rolling window operations to occur and the second level could be based on a logical grouping (a list of state codes (i.e. PA, NY, OH, etc.) is a good example of a logical grouping). The data would be divided according to the date range and divided again according to the state's abbreviation.

The Oracle Database Performance Tuning (1Z0-033) certification test will contain a few questions on tablespace partitioning. Test candidates should understand:
The basic partitioning algorithms (range, list, hash, composite)
How to specify partitioning during table creation
When to use a specific partitioning algorithm

Oracle9i Tablespace Changes
Oracle9i provides the database administrator with a variety (read that bewildering array) of new tablespace parameters and block sizes. Administrators are now able to create Oracle managed tablespaces, user managed tablespaces, locally managed tablespaces and dictionary managed tablespaces, specify AUTOALLOCATE, UNIFORM, PERMANENT or UNDO, as well as select block sizes of 2K, 4K, 8K, 16K, or 32K.

The tablespace definition below combines a few of the aforementioned options:
CREATE TABLESPACE oracle_local_auto DATAFILE SIZE 5M BLOCKSIZE 2K;
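
Two caveats on the statement above, offered as a sketch (the directory path is hypothetical): omitting the datafile name only works when Oracle managed files are enabled via db_create_file_dest, and a nonstandard block size requires that the matching buffer cache parameter be set first:

ALTER SYSTEM SET db_create_file_dest = 'd:\oradata\orcl';
ALTER SYSTEM SET db_2k_cache_size = 16M;
CREATE TABLESPACE oracle_local_auto DATAFILE SIZE 5M BLOCKSIZE 2K;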