Wednesday, 22 July 2009

Parts of the Database (1)

Parts of the Database

Why Oracle Works the Way it Does

Now, the smallest unit Oracle really messes with is the "data block". I'm going to skip over them for now because I think they'll make more sense after we discuss datafiles and tablespaces.

You'll remember (because I asked you to) that Oracle employs an elegant weaving of physical and logical layers. As we go through them, some may wonder "why do all these layers exist?" All these layers (I hope to discuss the most important ones) exist for: performance, scalability, concurrency, and manageability.

Ok, you have a wonderful database with tables and indexes and then, what do those lousy users do? They start using your database and start pumping more and more data into it. Where does all that data go? That's what we are going to talk about today.
--------------------------------
When I use MS Word to write a document, I have 2 options when I'm done working on it: I can throw it all away or I can save it to disk. If I choose to save it to disk, I don't just save it randomly anywhere on the disk, I save it someplace where I can easily find it again. I save it to a directory or folder. I may have multiple directories, one for technical documents, one for various versions of my resume, one for the exciting screenplays I write that will never be made into movies.

I can have as many documents as I want in those directories*, I'm only limited by the capacity of the disk.

That makes sense, right?

Ok, so every document I save has a definite size to it. I can browse the directory and see how many files I have, how big each of them is, and what the total size of all my files in a directory is. So how big is my directory? Outside of some minimal space to hold the specification, it's really just a logical construct. The directory itself doesn't really have any size of its own right and is defined by the size of all of the files in it? How big can it get? As big as it gets with files before I run out of disk space.

Congratulations, you now understand the basic concepts behind DATAFILES and TABLESPACES in Oracle. They don't work exactly that way, but pretty close.

Word doc= datafile
directory = tablespace

(A tablespace is in reality not exactly like a OS level directory (it just functions the same kind of way). Datafiles can only belong to a single tablespace, but can be physically located in more than one OS directory.)

All objects in an Oracle database are assigned to a tablespace, and the tablespace knows which datafiles it has to actually store the objects and data.

You don't specify the size of a tablespace; it doesn't really have a max size (unless you want to do the math in 10g to finally come up with thousands of petabytes).

You do, however, specify the size of datafiles. This is like telling a Word doc that you plan on saving a million words in your document so go ahead and reserve that space now. This preallocation is nice, because you don't suffer the overhead of trying to find new space to add every time you type in a new word.

So, what do you do when you start approaching your million word limit and see that you are going to need space for a lot more words? Well, you have two basic options: you can add another file and let the additional words go there, or let the files you have grow to make room.

In Oracle, if you want you your datafiles to grow as needed, you use the AUTOEXTEND keyword when creating or altering the datafile (just make sure to set the max size so you don't chew up all the disk space).

So which is better, several smaller files or fewer larger files? Generically, neither is better but it is often easier to herd a couple of cats than a hundred. And if you're going to be a good DBA, you want to make your life as easy as you can.

If you oversize your datafiles, you could be wasting space (and time). If you undersize them and don't tell them how to grow (using AUTOEXTEND), you will be getting paged a lot with warnings that your tablespace is almost out of room.

Everything you do as a DBA should be done for a reason. If you choose to do no analysis of your environment and your needs, you'll be working a lot harder than you need to.

How many tablespaces should you have? Well, there are a couple of tablespaces every Oracle database requires to function, but we'll discuss those in the future. All we care about now is you data.

You probably already know about tables and indexes, right? Tables hold data and indexes (can) make it a lot quicker and easier to access that data.

A lot (and I mean a lot) of people like to separate their tables from their indexes and put them in different tablespaces (with datafiles on different disks). There's nothing inherently wrong with that if you have a reason to do it, but I want to mention one fallacy that too many people take as not only valid but as an absolute truth: that you MUST separate your indexes from your tables.

The theory goes that you want to reduce I/O contention on your disks by not fighting for read control between scanning indexes and scanning tables (rows). This is wrong on at least three fronts:
-your indexes will often (not always) be cached in memory anyway
-your disk heads are constantly bouncing around unless maybe you have a single user
-it means you don't understand how sql queries work

I don't want to get into it too much here, but it is germane to the discussion of tablespaces.

When you issue a SELECT statement, the query optimizer will analyze your statement and decide how it wants to get the data. It will decide to use an index or not to use one. If it does use an index, it has to read the index BEFORE it knows where to find the row(s) in the table. First one, THEN the other. Access for this query will not simultaneously read the index and the table. It can't. So split them into different tablespaces if you want to, but don't let anyone tell you that you HAVE to.

Some people have lots of tablespaces; some people only have one (outside of the required ones). Well talk about some possible reasons for each approach in the future. As long as you understand what tablespaces and datafiles are, we can move on for now. The main thing to be able to do at the end of this series is to be able to make intelligent choices. And a lot of times that just means trying something and seeing how it works. Never be afraid to change when you need to.

How does all of this really work?

All of you applications data and other objects will reside inside of a tablespace and it's related datafiles. If Oracle were not super friendly and easy to use, it would make you manage how that data is stored manually. Fortunately it is super friendly and easy to use. It's even flexible.

In older versions of Oracle you had to use Dictionary Managed Tablespaces (DMT's) (we'll talk about the data dictionary later). This was fine and still is, but it's not the best you can do. Since Oracle 8i you can take advantage of Locally Managed Tablespaces(LMT's).

LMT's offer so many advantages to DMT's (mainly because of things like allocation overhead and fragmentation) that few people choose or can find good reasons to use DMT's nowadays. It doesn't mean you're an idiot if you use DMT's, it just means people will think you are.

We have a lot more to discuss with datafiles and tablespaces, but this is enough to get us started and ready for the next topic.

No comments: