Thursday, 30 July 2009

The Shared Pool

When you issue a query such as select * from emp, the database must work out what that means. Do we have a table called EMP? What are its constituent columns? Where, physically, on disk, does the table live? Are there any indexes built on the table which might help us get the results back more quickly? Do you, the user, have rights to query the table? The process of answering all these sorts of questions is known as parsing a SQL statement, and it involves a lot of work going on 'behind the scenes'.

To work out whether a table exists at all, for example, we have to query an in-built system table called TAB$. To work out where it physically exists on disk, we have to query more in-built system tables including one called TS$ . To check if you have the rights to view the table, yet another query is made of a table called USER$ (amongst others). You issue a simple select statement, in other words, and the poor database has to issue a dozen queries of assorted system tables before it can even work out what your statement means. We call these 'System SQL statements issued in response to a user SQL statement' recursive SQL, and performing recursive SQL means performing lots of preparatory work before actually answering your SQL requests.

Parsing is therefore very expensive... so we'd really like to only have to do it once and have the results of the parse stored for you and other users to make re-use of every time you re-issue the same basic SQL statement in the future. The result of a parse is known as an execution plan, and it represents a package of pre-worked-out instructions as to how to answer a SQL statement. Each new SQL statement causes a new execution plan to be constructed and stored... and they are stored in a sub-section of the Shared Pool known as the Library Cache. If a second user issues exactly the same SQL statement as someone else happened to issue earlier, therefore, the second user can simply nip into the Library Cache and make use of the execution plan previously worked out for the first user.

If you have to go through the entire rigmarole of working out an execution plan from scratch, that's called 'doing a hard parse'. If you can merely re-use execution plans previously created by other users, that's called 'doing a soft parse', and soft parses are a lot cheaper and quicker to carry out than hard ones.

You can fairly say, therefore, that the job of the Library Cache is to help stop us having to do hard parses.

Incidentally, fetching the contents of system tables like TAB$, USER$ and so on is expensive, too: these tables are collectively known as the data dictionary, and the data dictionary is physically stored on disk in the system datafile. Forever having to access these tables off disk would not be a good idea: lots of disk access generally means "slow performance". Therefore, once we've read some rows from these tables, we store them for future access in another component of the Shared Pool called the Data Dictionary Cache. The data dictionary cache is where in-memory copies of the data dictionary tables, needed to carry out parsing operations, reside.

The job of the dictionary cache is therefore to stop us having to visit the system data file on disk every time we parse.

Since both the Library Cache and the Data Dictionary Cache are both sub-components of the Shared Pool, and since both are designed to minimise parsing activity, or at least make it an in-memory affair if it does have to happen, it is reasonable to conclude that the job of the Shared Pool overall is to make parsing SQL statements as cheap an affair as possible.


No comments: