Tuesday, 28 July 2009

Simple Block Layout

So, with this overhead, free space, and (wow) actual data, how is it all stored in the data block?

SIMPLE BLOCK LAYOUT

The data block header (fixed) is at the beginning of the block. A small tail is at the end of the block (the tail is for block consistency checking). Everything else (variable header and data rows) fits between the beginning and the end.

Variable header info grows from the top down (just below the fixed header) if necessary and rows are inserted from the bottom up (just above the tail).

Everything else is freespace. I will discuss free space and "freelists" in part 2 of the data block.

Ok, in part 2, I will show some more examples of the data block and how the other structures we've talked about use and sometimes mimic the data block structure.

Now it's time to see data blocks in action. For this example I am just going to consider a normal table with rows and columns and nothing more exotic (once you understand this, the various permutations (clusters, partitions, etc.) should be easy to understand).

So, we've done step 1; we've issued a CREATE TABLE statement. Whether we accepted the default storage settings (by not entering a storage clause explicity) or we tweaked it for this specific table, a few things happen during the execution of that CREATE table statement:

-Oracle creates a table segment in a tablespace and grabs some initial extents comprised of data blocks that sit physically in one or more datafiles. (reread the earlier posts if you don't get this part)

-Those previously empty datablocks are preloaded with some small amount of overhead information (if you're still wondering why this overhead is good or even needed, don't get discouraged).

-Oracle "registers" this segment in some data dictionary tables (adds rows) for database management purposes (it is a Relational Database Management System after all).

-----------------------------
That last bullet might sound interesting. We have not INSERTed a single row of data yet, but there appears to already be rows loaded in the database. That is correct.

One of the required tablespaces in Oracle is SYSTEM. The SYSTEM tablespace has a whole host of tables, views, indexes, and procedural objects that have nothing to do with your data, but everything to do with your metadata.

You see, even the Oracle code uses the Oracle database as part of normal operations. This is great for two reasons: it makes the system very scalable; and it means that almost any DBA task can be done via simple SQL commands.

------------------------------

So now we are ready to insert some data into our table (let's not worry about indexes right now). Let's say we have some data from an application or a flat file and we want to insert 1,000 rows into our new table.

After some parsing (we'll also talk about SQL later), Oracle is ready to store the rows in data blocks. How does Oracle determine which data blocks to use? Freespace is used first.

If Oracle can find some empty space in the segment, it will try and fill those empty spaces with the new data. If there is no freespace left, Oracle will try to add more extents to the segment and use those data blocks for storage.

Because we just created the table, all of the allocated space (minus overhead) should be available. We know it's new and empty, but how does Oracle know?

Well, Oracle could simply keep a list of all new tables that have never held rows, but that would only be of very limited use and could be a source of contention (if everyone inserting data had to check a global listing of empty tables, blah, blah, blah to see if their table was on the list).

Instead, Oracle leverages the overhead in data blocks and does something similar at the segment level. That something is -- SEGMENT HEADERS. From a logical perspective, segment headers are similar in many ways to data block headers: they contain some management data about the segment.

Segment headers contain a list of the extents in the segment (extents table), the FREE LISTS, and the HIGH WATER MARK. The segment header starts in the first block of the first extent in the segment.

FREE LIST

The freelist is simply a list of data blocks currently allocated to extents in the segment that have some free space in them for new rows. There are some interesting things to understand about freelists.

Contention- instead of having everyone in the database wait in line to find free space in segments, freelists moves that contention down to the individual segment so only people manipulating data on that segment (table) have to contend with each other to find free space. If you have a segment (table) that you expect to have a lot of insert and update activity on, you can create multiple freelists to reduce that contention even more. To avoid waits, you can determine how many people will be inserting into that table AT THE EXACT SAME TIME (on the count of three, push your button) and create that many freelists. Unfortunately, many people adjust this number too high (thinking if two is good, a hundred is great) and end up wasting disk space (I'll try to remember to explain this when we start talking about design issues). A good DBA will monitor any freelist contention and adjust AS NECESSARY.

Data block free space- in the first post on data blocks, I talked about free space inside data blocks. This free space is used to expand rows (via update) and for new rows (insert). In OLTP systems, almost every data block will have at least one lonely block that is not being used. Do we really want to see if we can squeeze a whole row into that single byte of space? No, it will never fit. We only want to try and put data where there is plenty of room. Fortunately, when we created the segment (CREATE TABLE), we told the segment (explicitly or by using defaults) how we wanted to report this free space inside of data blocks.

We did this using the keywords PCTFREE and PCTUSED. These are the keywords that determine if a table's individual data blocks shows up on the freelist group or not.

These percentages (PCTxxxx) are just that, parts of 100. PCTUSED puts blocks on the freelist, PCTFREE takes a block off of the freelist (it is no longer accepting new inserts, only updates (DELETE's are always accepted and have no interest in checking freelists)).

The default value for PCTUSED is 40(%) and PCTFREE is 10(%). That means if a block is only 39% used (<40%), it goes on the freelist as a candidate for new rows. It will stay on the freelist after new rows are inserted until the data block is 90% full (<10% free space remaining).

Inserts increase PCTUSED (thereby decreasing PCTFREE), deletes decrease PCTUSED (thereby increasing PCTFREE), and updates can do either depending on if the update is increasing or decreasing the size of the row(s).

Since our new segment currently holds no data, all of the data blocks in our INITIAL extent(s) are well below 40% used, so they are on our freelist. Any of those data blocks can be used to hold our 1,000 rows.

Now, what if we had 1,000,000 empty blocks and were only going to use maybe 100 for this insert, would we really want to keep track of the freespace in all 1,000,000 blocks individually? I wouldn't. It could take longer to maintain and search for free space than just allocating a new extent. But then, like I hinted at above, I would be saving time, but possibly wasting disk space.

So Oracle has another neat item that helps me out (not only here, but when doing things like FULL TABLE SCANS as well).

That thing is...

High Water Mark (HWM) - just like the name sounds, the HWM keeps track of how high the river of data has ever risen.

While just counting the number and size of extents in a segment will tell us how much space has been allocated, the HWM will tell us how much of that space has ever been used (until to rebuild or truncate it). Hopefully, you can already guess at some of the benefits of knowing that.

One of those benefits is in controlling the freelists. The freelist only tracks blocks that are below the HWM. That keeps things very manageable and reduces the chances of wasting space and resources (like by having 1,000,000 blocks with a single row each as opposed to having fewer, more densly packed data blocks).

Let's say we have 2 extents each with 10 data blocks that can each hold 50 of our 1,000 rows. Let's also keep or PCTUSED=40 and our PCTFREE=10

so our block count=20. All of them are empty.

Because we've never inserted rows into this segment, our HWM is 0.

When we insert the first row, the freelist will say "I don't have anything free, let me try to add some empty blocks." The freelist sees that there is a block already allocated to the segment (from one of our 2 extents) that is empty, so it advances the HWM from 0 to 1 and grabs one of those blocks. The row is then inserted.

When we insert the second row, the freelist says "I have a block waiting for data" and tells us where to stick the data (whether its being rude or not depends on how you treat the database --joke), decreases the PCTFREE in the block and increases the PCTUSED.

This goes on for the next few dozen rows until we try to insert our 46th row (we know that physically we have enough space for 50 rows in each block). By then, something has happened. Our PCTUSED has grown to 90%, but we don't really care about that because the block was already on the freelist (any of our blocks with PCTUSED<40 will be put on the freelist). What we do care about is that the PCTFREE has gone from almost 100 down to 10 when we inserted our 45th row. At that point, the data block said "I'm stuffed, leave me alone." So it was removed from the freelist.

Now the freelist says "I don't have anymore space below the HWM, I'd better see what's out there." It sees that there are still 19 data blocks that have been allocated but never used. So, it grabs the next one, puts it on its list and advances the HWM to 2.

If this sounds like overhead to you, you're right. If it sounds like unnecessary (or even undesirable overhead) you're wrong. When we start looking at the big picture, you'll learn to appreciate it.

So now we continue to insert rows in the same fashion, filling up block after block in the first extent (no flamers!) until all the blocks are full (or at least they're 90% full) and the HWM has been moved up to 10. At this point, we have used all of the 10 blocks in our first extent. Now the freelist has to grab some data blocks from another extent (see the no flamers comment).
The freelist checks with the segment and sees that we have more blocks allocated in our second extent that have free space.

We'll continue to insert rows like we have been, moving up the HWM until it gets to 20.

At this point, both of our extents' set of data blocks have been removed from the freelist because they are 90% full (only 10% free). Now when the freelist looks for allocated, but unused space, it won't find any, so it will extend the segment by allocating a new extent (with its set of 10 data blocks) and advance the HWN to start adding those newly allocated data blocks to the freelist.

We can then insert the remaining 100 rows.
--------------------------------------------
In this example, we've seen how freelists are used to insert data into data blocks. We have seen our segment size grow to the sum of the extent sizes allocated and how the HWM is used in this process. We have seen how data blocks accept new rows and when they reject new rows.

What we haven't seen is the affect this has on the tablesapce and the datafile(s).

Since a tablespace is similar in ways to a segment, we know that the tablespace has grown (logically) to equal the sum of the size of it segments and is (physically) still the sum of the size of its datafiles. But what has happened to the datafile(s) themselves? You'll remember, or can look it up, that in Chapter 1 of this series we said that a datafile had a predetermined size. When we create it, we told it what size to be.

That datafile is allocated to the tablespace. Allocated but not used (for data). As we create segments, we create extents which consist of data blocks from datafiles owned by the tablespace that the segment was created in. Whew! Say that 10 times fast.

As we allocate new extents for our segment, does the datafile get bigger? No, not necessarily. The allocation of that space goes from unused (free) to used as a segment needs it. So, both the tablespace and the datafile get filled up. When the datafiles get full (by allocation, not by inserting data) then you have to either add a new datafile or allow an existing one to grow. Tablespaces work in a similar fashion logically with extents. As you approach the MAXEXTENTS setting, you have to increase it or prevent the segment (table) from growing.

-------------------------------------------------

This is not the end of our discussion on data blocks. It is just the beginning. But it is very important that I do a good job of helping you understand the basic structure and role of data blocks (and their chain of custody up to the tablespace) before moving on to all of the simple tasks in Oracle. That's part of why I simplified some of this without trying to mislead you. This is the way it works and you will discover any shortcuts I took as we continue.

Understanding the data block is critical to mastering the Oracle database and all of the tasks that come with developing for and administering the Oracle database.

No comments: