Skip to main content

Should we keep Index and Data in separate tablespace?

Index and Data in Separate Tablespaces

Recently I got caught up in a developer-DBA argument regarding placing of database indexes in separate tablespace from data tablespace. These developers and DBAs belong to a project where all data and indexes are stored in different tablespaces.

But still there are a few indexes on a daily truncate-and-load type temporary table that are managed through the ETL code. Meaning, those indexes are dropped before batch loading and recreated after the load. The trouble here is - the ETL jobs create the index in the data tablespace instead of creating them in the index tablespace.
While DBA wants developers to change the code to create the indexes in the proper tablespace, developers’ argument is why creating indexes in a different tablespace are so important?

Why do you need a separate Tablespace for Indexes?

The DBA’s argument is they need a separate tablespace for indexes for performance reasons.

And that is off course wrong.

Putting all your indexes in a separate tablespace than your data does not increase performance (tablespace is not a ‘performance thing’.) Table data and table indexes are not read simultaneously. Oracle reads index first followed by the data. So what really needs to be achieved for performance is an evenly distributed I/O over all available devices. Separating data and index does not necessarily achieve a balanced I/O. And if one really wants to achieve an evenly distributed I/O – a far better idea will be to take all the available devices – create a really big stripe set and put everything in it.

So DBA is wrong if s/he wants it for performance reason.



But then, is there any other benefit for separating index and data tablespace?

There can be some administrative plus points for the same. For example separating index and data might help in recovery. If you place all of your indexes on a separate disk, you don't have to worry if the disk of backup becomes corrupt because you can rebuild the indexes to restore the device.

Comments

  1. Great! But then what is this,
    http://www.dba-oracle.com/t_segregate_table_index_data_separate_tablespaces.htm

    ReplyDelete
  2. Why put index and data in the same tablespace where as we can easily separate them into different tablespace? Beside that will be better to manage also.

    ReplyDelete
  3. Normally it is a convention to separate them for better management/reporting/backup. Not sure about the performance...

    ReplyDelete

Post a Comment

Popular posts from this blog

Informatica Incremental Aggregation

Saurav has posted a new article here on Incremental Aggregation Using Informatica . The need of incremental aggregation arise when we capture our source data (transactional data) incrementally in a frequency faster than the aggregation period. Take this example, a data warehouse system is refreshed every night from source data. The data warehouse has a monthly aggregated table. So it is obvious that every day's data you need to aggregate and put together in the monthly table. But in stead of loading the monthly table at month end, if you consider loading this monthly table everyday or every week or bi-monthly, then incremental aggregation is possibly the best option for you. Now performance wise, it remains an open question on how good is Informatica in doing incremental aggregation. I think Saurav might consider an other article by putting informatica in test with considerable data volume.

Compare between CTAS, COPY and DIRECT PATH Load in Oracle

OK. Here is a simple task that I am trying to achieve all through out tonight. Loading Huge Table Over DBLINK I have a big (infact very big) table on one database (SRCDB) and I am trying to pull the data from that table to a table in different database (TGTDB). Both SRCDB and TGTDB reside in different HP Unix servers connected over network. And I have only SELECT privilege on the SRCDB table. The table has no index (And I am not allowed to create an index, or any database object for that matter, in the SRCDB). But the SRCDB table has many partitions, only one of which I am supposed to pull from that. Let's suppose the SRCDB table has 10 partitions. Each partition has 500 million records. And as I said above, I need to pull data from only one partition to the target. So what will be the best suited strategy here? My Options When I started to think about this, following options came into my mind: 1. Using Transportable Table Space 2. Using CTAS over DBLink 3. Using direct load path 4