
"How to Update millions of records in a table", version 8.1.7

 

You Asked

Good Morning Tom.

I need your expertise in this regard. I have a table which contains millions of
records. I want to update and commit every so many records (say
10,000 records). I don't want to do it in one stroke as I may end up with rollback
segment issue(s). Any suggestions please!

Murali


and we said...

If I had to update millions of records I would probably opt to NOT update.

I would more likely do:

CREATE TABLE new_table as select <do the update "here"> from old_table;

index new_table
grant on new_table
add constraints on new_table
etc on new_table

drop table old_table
rename new_table to old_table;


you can do that using parallel query, with nologging on most operations
generating very little redo and no undo at all -- in a fraction of the time it
would take to update the data.
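
For illustration, a minimal sketch of that flow -- every name below is hypothetical and
the degree of parallelism is arbitrary:

create table new_table
  nologging
  parallel
as
select pk_col, lower(some_col) some_col, other_col   -- the "update" happens in the select list
  from old_table;

create index new_table_idx on new_table(pk_col) nologging parallel;

-- re-create grants and constraints on new_table, then:
drop table old_table;
rename new_table to old_table;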
Reviews
updating millions of records November 11, 2002
Reviewer: Om from India

Hi Tom,

Could you please elaborate

CREATE TABLE new_table as select <do the update "here"> from old_table;

the above statement with a suitable example please


Followup:
ok, say you wanted to update emp to set ename = lower(ename). Instead, you
could do this:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table new_emp as
2 select empno, LOWER(ename) ename, JOB,
3 MGR, HIREDATE, SAL, COMM, DEPTNO
4 from emp;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> drop table emp;

Table dropped.

ops$tkyte@ORA817DEV.US.ORACLE.COM> rename new_emp to emp;

Table renamed.

Million records update November 11, 2002
Reviewer: Ramesh G from U.K

What if a table has over 100 million records and I only want to update 1
million? If your method is still applicable, could you elaborate?

Many thanks in advance.

Followup:
most likely -- yes. I don't have a 100million row table to test with for you
but -- the amount of work required to update 1,000,000 indexed rows is pretty
large. Fortunately, you are probably using partitioning so you can do this
easily in parallel -- bit by bit.

"How to Update millions or records in a table", version 8.1.7 November 11, 2002
Reviewer: John Bittner from Hunt Valley, MD USA

This is absolutely a viable approach, and one we have used repeatedly. One of
our apps updates a table of several hundred million records.

The cursor..For loop approach for the update was calculated to take 53.7 years
to complete!
We instituted the insert into a dummy table with append and nologging, and were able
to complete the "update" in under 30 minutes.

With nologging, if the system aborts, you simply re-run the 'update' again, as
you have the original data in the main table. When done, we swap the partition
of original data with the 'dummy' table (the one containing new values),
rebuild indexes in parallel, and wha-la! Our update is complete.

i.e, to update field2 in a table:

1) First create your dummy hold table: create table xyz_HOLD as select * from
xyz where rownum<1. Alter table xyz_hold nologging.

2) insert /*+ append parallel (xyzhold,12) */ into xyz_hold xyzhold (field1,
field2, field3) select /*+ parallel (x,12) */ x.field1,
my_new_value_for_field2, x.field3 from xyz x where blah blah blah.

3) when done, either rename the table, or swap the partition if your original
table is partitioned and you only updated one partition, as we do. Obviously
you need to rebuild indexes, etc. as required.

Hope this helps!
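
For step 3, the partition swap John describes would look roughly like this -- a sketch only,
assuming xyz is partitioned and xyz_hold holds the rebuilt data for partition p_nov2002
(all names are made up):

alter table xyz
  exchange partition p_nov2002
  with table xyz_hold
  without validation;

-- then rebuild the local indexes for that partition in parallel, e.g.:
alter index xyz_idx rebuild partition p_nov2002 parallel 12 nologging;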

updating millions of records November 11, 2002
Reviewer: A.Ashiq from Trichy,Tamil Nadu ,India

Hi Tom,

You suggested we create a new table, then drop the original table and rename
the new table to the original name, instead of updating a table with millions of
records. But what happens to dependent objects -- everything will get
invalidated. Of course, they'll recompile themselves when called the next time, but
again the dependent objects have to be parsed. Is that OK?


Followup:
It is OK.

in case of delete November 12, 2002
Reviewer: A reader

We have a similar situation. We delete around 3 million records from a 30 million
row table every day.
There is no logical column to partition on.
I guess the insert into a new table will take considerable time with 27 million
records. Please let me know what is the best approach.


Followup:
wait 10 days so that you are deleting 30 million records from a 60 million
record table and then this will be much more efficient.

Time it some day though. 3 million records on an indexed table will take
considerable time. There is a chance that INSERT /*+ append */ select <rows to
keep> done nologging, parallel with create indexes in parallel could best the
delete.
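
A sketch of that alternative, with a made-up table T, key column KEY_COL and "rows to keep"
predicate:

create table t_keep nologging
as
select * from t where 1=0;

insert /*+ append */ into t_keep
select *
  from t
 where purge_flag = 'N';          -- hypothetical predicate identifying the rows to keep

create index t_keep_idx on t_keep(key_col) nologging parallel;

drop table t;
rename t_keep to t;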

November 12, 2002
Reviewer: A reader

Tom,

Recently I conducted an interview in which one of the DBAs mentioned that they
had a table that might contain 10 million records or might be 1 million. He
meant to say they delete the records and some time later the table will be
populated again, and vice versa.

Tom, would you consider partitioning for such tables, and if yes, which
type of partition?

Thanks.


Followup:
hard to tell -- is the data deleted by something that is relatively constant
(eg: the value in that column doesn't change - so the row doesn't need to move
from partition to partition). If so -- sure, cause we could just drop
partitions (fast) instead of deleting the data.

Agree with John - insert append / CTAS / partition swap is the only way to fly November 12, 2002
Reviewer: Jack Silvey from Richardson, TX

I work with John Bittner, one of the previous reviewers. I second what he said
absolutely. It is the only way to fly. We also have an absolutely incredible
stored procedure that rebuilds all of our indexes concurrently after the load,
using the Oracle job scheduler as the mechanism of allowing separate threads in
pl/sql.

addendum November 13, 2002
Reviewer: A reader

This process was introduced to our environment by a master tuner and personal
friend, Larry Elkins. This was a totally new paradigm for the application, and
one that saved the entire mission-critical application. The in-place updates
would not have worked with the terabytes of data that we have in our database.

How to Update millions of records in a table November 19, 2002
Reviewer: Boris Milrud from San Jose, CA USA

In response to the Jack Silvey (from Richardson, TX ) review, where he wrote "It
is the only way to fly. We also have an absolutely incredible stored procedure
that rebuilds all of our indexes concurrently after the load, using the Oracle
job scheduler as the mechanism of allowing separate threads in pl/sql":

Could you provide more information about that procedure and how to rebuild
multiple same-table indexes concurrently using Oracle job scheduler?

Thanks,
Boris.

Followup:
instead of

begin
execute immediate 'alter index idx1 rebuild';
execute immediate 'alter index idx2 rebuild';
end;

you can code:

declare
l_job number;
begin
dbms_job.submit( l_job, 'execute immediate ''alter index idx1 rebuild'';' );
commit;
dbms_job.submit( l_job, 'execute immediate ''alter index idx2 rebuild'';' );
commit;
end;


Now, just set job_queue_processes > 0 to set the "degree of threads" and in 8i
and before set job_queue_interval to say 60 or so and there you go.
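
For example (the value is illustrative):

alter system set job_queue_processes = 4;
-- in 8i and before, job_queue_interval (e.g. 60 seconds) is set in the init.ora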

How to Update millions of records in a table November 19, 2002
Reviewer: Boris Milrud from San Jose, CA USA

Thanks, Tom.
Your status said that you had a large backlog, so I decided not to wait for your
response and tried myself using dbms_job.submit() calls. At first, it did not
work (job_queue_processes was 0), but after I set it to 12 it started working.
The only one difference between your code and mine is that I issue just one
commit at the end. It should not matter, right?

I selected 1 mln. rows table and rebuild 5 non-partitioned indexes with 'compute
statistics parallel nologging' clause.
Here are the numbers I've got: rebuilding the indexes sequentially consistently took
76 sec., while using dbms_job.submit() calls took around 40 - 42 sec.

I said "around", because the technique I used may not be perfect, though it
served the purpose. I recorded the time right after the commit statement at the
end of the PL/SQL block - that's the start time. Then I kept querying the user_jobs view
every 2 - 3 sec, until the last of the 5 jobs was gone. That was the end
time.

The last question on this topic: is the user_jobs view the right place to look in
order to determine that rebuilding is done and how long it took? In the package I
am writing, I do a massive delete operation, then rebuild indexes, then
start the next routine. What would be the best way to detect the end of
rebuilding, in order to proceed with the next call?

Thanks.
Boris.

Followup:
You can use user_jobs or dba_jobs but -- you might just want to put some
"logging" into your jobs themselves so you can monitor their progress and record
their times.

You can even rebuild all the indexes in a partition simultaneously November 21, 2002
Reviewer: ramakrishna from India

Thanks Tom,
In our environment, we have partitioned tables and we use:
ALTER TABLE table_name MODIFY PARTITION part_name REBUILD UNUSABLE LOCAL INDEXES

and this rebuilds all the indexes in the partition at one shot. We found this
much faster than doing the indexes one by one. We will now try out if we can
submit multiple such jobs in parallel (one for each partition of the table).

regards
ramakrishna
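
A rough sketch of submitting one such rebuild per partition as a job, following Tom's
dbms_job pattern above (the table name is made up):

declare
  l_job number;
begin
  for p in ( select partition_name
               from user_tab_partitions
              where table_name = 'SALES_FACT' )
  loop
    dbms_job.submit( l_job,
       'execute immediate ''alter table sales_fact modify partition ' ||
       p.partition_name || ' rebuild unusable local indexes'';' );
  end loop;
  commit;   -- so the job queue processes can see the jobs
end;
/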

in case of deletes November 29, 2002
Reviewer: carl from moondog, mo

Hi Tom,

Thanks so much for your web site and help. It is our
number 1 reference in times of fear and loathing.


This is what we came up with concerning mass updates

INV 50M
INVINS 10M
INVDEL 7M

There are indexes on INV.KEY and INVDEL.KEY


Execution Plan ( for deletes and updates )
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'INV'
3 1 INDEX (RANGE SCAN) OF 'IX_INVDEL01' (NON-UNIQUE)


alter table INVTMP nologging;

-- INVINS contains delta inserts and updates

insert /*+ APPEND */
into INVTMP
select *
from INVINS t1;

-- INVDEL contains delta deletes and updates

insert /*+ APPEND */
into INVTMP
select *
from INV t1
where not exists ( select null
from INVDEL t2
where t2.KEY = t1.KEY );

alter table INVTMP logging;
drop table INV;
rename INVTMP to INV;

-- build indexes etc

This is what we came up with and it is the fastest
approach we've tested.

Any comments or suggestions are welcome and appreciated.

Followup:
I might try an anti-join on the NOT EXISTS and doing the work in a single
parallel CTAS

create table invtmp as
select * from invins
UNION ALL
select t1.*
from inv t1, invdel t2
where t1.key = t2.key(+)
and t2.key is null;

add parallel /nologging as needed.
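
For instance (the degree of parallelism is left to the default here):

create table invtmp
  nologging
  parallel
as
select * from invins
UNION ALL
select t1.*
  from inv t1, invdel t2
 where t1.key = t2.key(+)
   and t2.key is null;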

in case of deletes - many thanks December 01, 2002
Reviewer: carl from moondog, mo

Ran the test cases at home 500K rows in INV and 50K rows in
INVINS and INVDEL

My way: 16.33 secs
Your way: 08.35 secs


Many thanks.

A bit confused March 17, 2003
Reviewer: Choche from England

I was just wondering why none of the reviews mentioned whether these techniques
could be applied in a multi-user environment where multiple users could be
updating the same table at the same time.


Followup:
sorry -- I thought it obvious that in most cases "no" is the answer. we are
copying the data or locking excessive amounts of it -- or disabling indexes and
the like.

This is a "batch" process here, to update millions of records.

Replacing PK values July 25, 2003
Reviewer: Michael from San Diego, CA, USA

We need to replace the main ID for all entries in the database; it is the PK
in about 100 tables, the largest of which have about 2 mln rows. New and old values
for the ID are stored in a lookup table, and there are about half a million of them.
The original approach was to replace the PK in all tables for one ID value, commit
the changes and move on to the next ID. This way referential integrity would be
guaranteed even if the update fails at any stage, and we do not need long rollback
segments and can restart the process at the point it aborted. But this update is
too lengthy, and in our case would take almost a month of continuous processing.
Do you have any suggestions on how to improve the performance of this critical
update? Would it be better to add the new ID as an additional column to one table
at a time and populate all of it at once as "create table ... as select" (with the
nologging option), and then rename the table and drop the old ID column?


Followup:
sounds like I would:

create table new_table as select <join>;
create table new_table2 as select <join>;
etc etc etc etc

drop old tables
add indexes
add primary/fkeys/constraints etc....


that is -- rebuild in bulk, using nologging (no redo, no undo generated), using
parallel when appropriate.

Update million Rows July 26, 2003
Reviewer: Reaz from Dhaka, Bangladesh

Dear Tom,

We have this situation where we load data from 7 external files.
Next we have to run a transformation process, which checks if the row exists in
the target table and updates the existing row with the new one; otherwise it
inserts the row into the target table.

Our current process is very slow, sometimes taking a night long to complete for
say 200,000 rows.

The procedures discussed so far are meant for updating only, but in our case we need
to update and insert too.

I would appreciate your suggestion.

Thanks.
Reaz.


Followup:
read about MERGE in the sql reference guide.
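
A minimal sketch of such a MERGE, assuming the external files have been loaded into a
staging table STG and the target table TGT is keyed on ID (all names and columns are
hypothetical):

merge into tgt
using stg
   on ( tgt.id = stg.id )
 when matched then
      update set tgt.col1 = stg.col1,
                 tgt.col2 = stg.col2
 when not matched then
      insert ( id, col1, col2 )
      values ( stg.id, stg.col1, stg.col2 );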

A single batch? July 26, 2003
Reviewer: A reader

Tom,
Is it still okay if these steps (create a new table, drop the old one, and then
rename the new table as the old, create indexes etc.) are coded as the batch update
program? Or would you suggest these should be detailed as steps that should
be performed under the supervision of a DBA? What I would like to know is:
can we still have the steps in a single batch routine (as the update would have been),
or should these instead be DBA instructions?
Thanks


Followup:
I would have these performed under the watchful eye of someone.

it is a one time bulk manipulation. it won't be a "pretty program" with tons of
graceful logging/error handling. You'll be doing it "down and dirty" (at
least I would be). It would need someone to watch over it.

What about space consideration July 28, 2003
Reviewer: A reader

Hi Tom,
I had a similar problem where my table had grown big and I wanted to truncate
some old data. My table was big -- more than 20 million records -- and I could not use
the "create temp table, truncate and insert" flow, as the space in the database was
not enough to hold 2 copies of the large table. I thought of doing a partition
on the table and then truncating one partition of the table, but that does not work.
Can you suggest some way which caters for this space constraint also?



Followup:
why didn't the partitioning work for you.

How to update part of a Very Large Table July 28, 2003
Reviewer: Mohan K from Bangalore, India

The update operation can be made simple by separating out the query and update
operations. To do that, create another table which stores the rowids of the
records in the original table which have to be updated, along with the value
to be updated. Then run a PL/SQL script to update the records one by one. To
test the above method the following scripts can be used.

SQL> create table test_update(id number, name varchar2(100), description
varchar2(4000)) storage (initial 48M next 4M);

SQL> declare
v_n number;
v_name number;
v_desc number;
i number;
begin
for i in 1..1000000 LOOP
insert into test_update(id, name, description) values(i, 'Test Name'||i, 'Test
Name '||i||' description ');
END LOOP;
end;
/

Elapsed: 00:04:277.23

The above script will insert 1 million rows.

SQL> select count(*) from test_update where description like '%5 description%';

COUNT(*)
----------
100000

Elapsed: 00:00:02.63

SQL> create table test_update_rowids as select rowid rid, description from
test_update where description like '%5 description%';

Elapsed: 00:00:54.58

The table test_update_rowids stores the rowids and the new values that have to be
updated, i.e. 100000 rows need to be updated.

SQL> declare
begin
for c1 in(select rid, description from test_update_rowids)
LOOP
update test_update set description=c1.description||' after update' where
rowid=c1.rid;
END LOOP;
end;
/

Elapsed: 00:01:82.17

The above script performs the updation.

Mohan


Followup:
the above script performs an "updation" in the worst possible way, consuming the
most resources.

it should have been simply:

update test_update set description = description || ' after update ' where
description like '%5 description%';


Never do procedurally that which can be done in a single sql statement.

Updating 100K of 1 Million rows July 28, 2003
Reviewer: Mohan K from Bangalore, India

I ran the update statement as a single SQL statement and the time elapsed is
slightly more than the above method. It has to be tested against update
statements containing complex queries or join operations.

SQL> update test_update set description = description || ' after update ' where
description like '%5 description%';

Elapsed: 00:01:100.13


Mohan


Followup:

trust me -- that difference is negligible and the amount of time you spent writing
code wasn't included.

Also, you FORGOT to add in the time to populate your temporary table -- no?


if you can do it in a single sql statement (and you almost always can) you
SHOULD.

Lets do you benchmark a little better:

ops$tkyte@ORA920> create table test_update(id number, name varchar2(100),
description varchar2(4000));

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> insert /*+ append */
2 into test_update
3 select rownum, 'Test Name' || rownum, 'Test Name ' || rownum || '
description '
4 from big_table.big_table
5 where rownum <= 1000000;

1000000 rows created.

ops$tkyte@ORA920> commit;

Commit complete.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select count(*)
2 from test_update
3 where description like '%5 desc%' ;

COUNT(*)
----------
100000

ops$tkyte@ORA920>
ops$tkyte@ORA920> select count(*)
2 from test_update
3 where description like '%6 desc%' ;

COUNT(*)
----------
100000

ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_start

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> update test_update
2 set description = description || ' after update'
3 where description like '%6 description%';

100000 rows updated.

ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_middle

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create table test_update_rowids
2 as
3 select rowid rid, description from test_update where description like '%5
description%';

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> begin
2 for c1 in(select rid, description from test_update_rowids)
3 LOOP
4 update test_update
5 set description=c1.description||' after update'
6 where rowid=c1.rid;
7 END LOOP;
8 end;
9 /

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_stop
Run1 ran in 1376 hsecs
Run2 ran in 9806 hsecs
run 1 ran in 14.03% of the time

so, according to me, the SINGLE (efficient, effective, correct way) update
statement runs in about 14% of the time. furthermore:


Name Run1 Run2 Diff
STAT...recursive cpu usage 5 4,662 4,657
STAT...CPU used by this sessio 448 5,397 4,949

only 4,949 MORE cpu seconds using the row by row approach...

STAT...CPU used when call star 448 5,397 4,949
STAT...free buffer inspected 10,162 17,039 6,877
STAT...db block gets 111,705 104,450 -7,255
STAT...dirty buffers inspected 9,016 16,620 7,604
STAT...calls to kcmgas 7,755 68 -7,687
STAT...Cached Commit SCN refer 7,718 0 -7,718
STAT...switch current to new b 7,719 0 -7,719
STAT...cleanouts only - consis 0 7,720 7,720
STAT...immediate (CR) block cl 0 7,720 7,720
STAT...commit txn count during 0 7,721 7,721
STAT...cleanout - number of kt 3 7,726 7,723
STAT...consistent gets - exami 3 7,760 7,757
LATCH.undo global data 221 7,985 7,764
STAT...redo entries 100,700 108,727 8,027
LATCH.redo allocation 100,795 108,945 8,150
STAT...db block changes 202,723 210,907 8,184
STAT...Elapsed Time 1,385 9,825 8,440
STAT...physical reads 7,894 16,476 8,582
LATCH.checkpoint queue latch 26,845 38,012 11,167
LATCH.cache buffers lru chain 35,638 52,827 17,189
STAT...no work - consistent re 7,741 100,072 92,331
STAT...table scan blocks gotte 7,719 107,721 100,002
STAT...table scan rows gotten 1,000,000 1,100,002 100,002
STAT...execute count 27 100,092 100,065
STAT...buffer is not pinned co 7,763 107,836 100,073
STAT...calls to get snapshot s 56 100,241 100,185
LATCH.shared pool 172 100,778 100,606
STAT...session logical reads 119,503 220,165 100,662
STAT...consistent gets 7,798 115,715 107,917
LATCH.library cache pin 233 200,865 200,632
STAT...recursive calls 177 200,870 200,693
LATCH.library cache 332 201,322 200,990
LATCH.cache buffers chains 587,377 835,167 247,790
STAT...redo size 31,323,804 31,836,904 513,100
STAT...session pga memory max 0 524,824 524,824
STAT...session pga memory 0 1,049,112 1,049,112

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
754,602 1,553,986 799,384 48.56%

PL/SQL procedure successfully completed.

and worse, it used 2x the number of latches! (scalability inhibitors)

There is nothing good about row at a time processing, it is something to do ONLY
when a set based operation isn't feasible


This is very good example for updating the records July 28, 2003
Reviewer: Bipin Ganar from INDIA

Hi Tom,
Yes, this is a good example of updating records, but my problem
is I have 1 million records in one table. I want to update and give a hike of 10% to
their salaries in one table, but that table is accessed by any number of people at the
same moment. I cannot drop the table and update, and second, how can I define the
tablespace in this script?
Create Table a as select * from emp_sal;


Followup:
create table a
tablespace foo
as
select * from emp_sal;

slightly different problem! July 28, 2003
Reviewer: Srikanth Adiga from Pune, India

My table:
5M records are inserted each month.

Operations:
5M records inserted on the first day of the month
5M records updated on the rest of the days of the month.
Note that this update updates each row with different values.

I am using OCI and using batch operations. Can you please suggest me what else
can I do within my program to improve the performance of my updates? I tried
multithreading, but that did not help.

anything else?


Followup:
it only takes a couple of minutes to update 5 million rows on my desktop pc --
so, there must be something more involved here.

I'd be able to do that in minutes -- why is it taking you so long -- what is
your bottleneck.

space considerations July 28, 2003
Reviewer: A reader

I could not create another temp table which holds the same data as the table being
partitioned, as there was not enough space on the hard disk to hold 2 (or 1.5, to be
precise) copies of the same huge table. Is there a way to partition without having
to drop or truncate or transfer data to some temp location and then partition?


Followup:
disk is cheap.

time is money.


your time spent just thinking about this problem cost much more than the disk to
solve it would.

penny wise, pound foolish. How big is this 20 million row table (assuming you
are the same "a reader" from above)? Maybe 1-3 gig?

slightly different problem! July 29, 2003
Reviewer: Srikanth Adiga from Pune,India

couple of minutes!!! Oh dear, that's really too good.

Let me reiterate my case:
My table schema is something like this:
SubscriberVsQtyTable:
int client-id
int year
int month
int quantity
primary key - (client-id, year,month)

This table has 5M client ids.

Now every day, I have to update the quantity field with the txns made for that day.
Note that I have to update one subscriber at a time since the number of txns
made by each subscriber is different.

So, even now do you believe you can do this in a couple of minutes? My app takes
about 30 to 50 mins even on bigger machines like an 8 CPU box.

As I said, this is a C++ app, running Solaris + Oracle 8i/9i. We are using OCI.

Thanks,


Followup:
if you are doing that work with more than a single sql statement, I could see it
taking a while.

If you do that work in a single update, no, I don't see it taking more than a
couple of minutes.



Time for Update July 29, 2003
Reviewer: Mohan K from Bangalore, India

It is my understanding that the time for update depends on the following
factors. There may be something else also involved.

1. Row chaining/migration caused by the update
2. The where clause used to select the rows to be updated.
3. The percentage of rows updated in a block. The more the better. Even if one row is
updated in a block containing 10 records, the entire block has to be written by DBWR
and LGWR.

So it is not just the number of records that matters. 100 million records may take
days for one table and less than an hour for another table with few
columns.


Mohan


Followup:
1) updating a quantity (number) field is unlikely to do that.
2) correct
3) correct


no, i disagree with the last part. to update 5million rows, a single unindexed
column, should not take "a really long time". I was answering their direct
question -- they update every row, on every block. so #1 and #3 really don't
come into play. The biggest factor is #2 here


Verifing if neccesary to update, before update July 29, 2003
Reviewer: juan carlos reyes from Bolivia

Hi Tom, I had seen several times that you can
increase the performance by evaluating if you really need to update, or maybe it
was only my impression.

For example
Update table set income = 500
where level = 7
AND NOT INCOME = 500

The AND NOT INCOME = 500 allows you to avoid unnecessary updates;
my point is a question:

How advisable is it -- does it really increase performance?

As you are always thinking about how to increase performance, it could be a good idea
in a newer version of Oracle that if Oracle verifies that the value is already set to
that value, it does not update it.
:)


Followup:
if there are more than a handful of records where income = 500, then yes, this
will definitely increase performance by

a) decreasing the latching
b) decreasing the redo
c) decreasing the undo


we cannot make that optimization for you -- it would stop triggers and other
EXPECTED side effects from happening.

July 29, 2003
Reviewer: A reader

:) Thanks,

Maybe then it could be an optional feature
UPDATE_ONLY_IF_DIFFERENT hint
I think it could interest several people
who don't use trigger, neither constraints ,etc.


July 29, 2003
Reviewer: Stu Charlton from Toronto Canada

"Note that, I have to update one subscriber at a time since the number of txns
made by each subscriber if different."

That still seems doable in 1 SQL statement, even if you wanted to merge the
updates. Couldn't you just do a COUNT(*) grouped by client_id, year, and month?


Followup:
I agree -- most things CAN be done in a single sql statement.

easier said than done in real life situations July 29, 2003
Reviewer: A reader

Hi Tom,
I agree you may think getting a disk is cheaper, but that is only when you are
considering just getting up or going to the store and getting a disk.
In real life conditions, especially in corporates, this is not as easy as it seems, nor
is it less costly.
A simple purchase of a disk first has to be justified
(which is quite difficult -- I don't think I have a justification as yet), then be
approved, then be purchased by the purchasing dept, then be installed or fitted by the
server teams; the server team may want to take the system down, which means more
dollars, much more. These are just things off the top of my head; when we actually initiate
the purchase process, there will definitely be some more steps and processes.

Coming back to my question, am I right in saying that we can't partition an
existing table without copying the data to some temp location? Let me know if there
is any other way.


Thanks



Followup:
justification

you pay me alot
disk is cheap.
think about it.

(sometimes it really can be that simple)

we are talking about a couple of gig (like 1 to 3 -- small, tiny -- my /tmp has
more room than that) here, something my LAPTOP would not have an issue with.

in order to partition -- think about it -- you'll need the SOURCE DATA and the
PARTITIONED DATA -- at the same time, for a period of time. no magic there.

slightly confused ! July 30, 2003
Reviewer: Srikanth Adiga from Pune, India.

Thanks, Stu and the rest for your updates.

Taking this a bit further, since I am confused :)

[Sri]
"Note that, I have to update one subscriber at a time since the number of txns
made by each subscriber if different."
[Sri]

[Stu Charlton]
That still seems doable in 1 SQL statement, even if you wanted to merge the
updates. Couldn't you just do a COUNT(*) grouped by client_id, year, and month?

Followup:
I agree -- most things CAN be done in a single sql statement.
[Stu Charlton]

Sorry, how would one do this?
If my table has two rows;
clientid =1,year=july,year=2003,quantity=10
clientid =2,year=july,year=2003,quantity=20
Now I have to update (clientid=1)'s quantity by 15 and (clientid=2)'s quantity
by 25.
How would you manage this in a single SQL? Like this there would be 5M rows to
be updated.

Btw, my table is indexed on clientid,year and month.


Followup:
what is the OTHER table you are updating from? you must have a detail table
elsewhere from where you derive that 15 and 25. So, assuming something like
this:

ops$tkyte@ORA920> create table t ( clientid int, month int, year int, quantity
int );

Table created.

ops$tkyte@ORA920> create table txns ( clientid int, month int, year int );

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t values ( 1, 7, 2003, 10 );

1 row created.

ops$tkyte@ORA920> insert into t values ( 2, 7, 2003, 20 );

1 row created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into txns select 1, 7, 2003 from all_objects where
rownum <= 15;

15 rows created.

ops$tkyte@ORA920> insert into txns select 2, 7, 2003 from all_objects where
rownum <= 25;

25 rows created.

ops$tkyte@ORA920>
ops$tkyte@ORA920>
ops$tkyte@ORA920> select * from t;

CLIENTID MONTH YEAR QUANTITY
---------- ---------- ---------- ----------
1 7 2003 10
2 7 2003 20

ops$tkyte@ORA920> update t
2 set quantity = quantity + ( select count(*)
3 from txns
4 where txns.clientid = t.clientId
5 and txns.month = t.month
6 and txns.year = t.year );

2 rows updated.

ops$tkyte@ORA920> select * from t;

CLIENTID MONTH YEAR QUANTITY
---------- ---------- ---------- ----------
1 7 2003 25
2 7 2003 45


there you go.


(it is a shame you are using the wrong datatypes -- only a DATE should be used
to hold, well, DATES. Using numbers to hold a year and a month isn't a good
practice)

SCN,CHECK POINT July 30, 2003
Reviewer: AATIF from PAKISTAN

I am asking: what is the difference between the SCN (system commit number / system change
number) and a checkpoint?
On which does Oracle base synchronizing the database -- the SCN or the checkpoint?
When a checkpoint occurs, which number is written to the control file header, datafile
headers and log file headers?
Dear Tom,
your web site tells me not to ask more questions now, but I need to clear up my point.
AATIF.


Followup:
er?

i remember you asking this on another page -- yes.

and I also remember answering this on that other page -- yes.


I fail to see how this relates to this page.


anyway there IS no difference between

scn
system commit number
system change number

they are the same.


as for the rest of the comment -- it doesn't "make sense" (i do suggest a read
through of the server concepts guide, many questions are answered there!)

More clarification August 01, 2003
Reviewer: Srikanth Adiga from Pune,India

>what is the OTHER table you are updating from? you must
>have a detail table elsewhere from where you derive that
>15 and 25.
It is coming from an OCI application after processing some files.


Followup:
use bulk processing - update between 100 and 500 rows at a time in a single call
using array processing.

Or, put the data where it belongs - not in files, in tables -- and process the
data in the database rather than in C
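
A sketch of that array-processing idea in PL/SQL terms, assuming the per-client transaction
counts are first loaded into a staging table (all names are hypothetical); the same batching
pattern applies to OCI array binds:

declare
  type num_tab is table of number index by binary_integer;
  l_ids num_tab;
  l_cnt num_tab;
  cursor c is select clientid, txn_cnt from daily_txn_counts;   -- hypothetical staging table
begin
  open c;
  loop
    fetch c bulk collect into l_ids, l_cnt limit 500;   -- 500 rows per round trip
    exit when l_ids.count = 0;
    forall i in 1 .. l_ids.count
      update subscriber_qty
         set quantity = quantity + l_cnt(i)
       where clientid = l_ids(i)
         and year     = 2003
         and month    = 7;
    commit;
  end loop;
  close c;
end;
/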

August 06, 2003
Reviewer: Srikanth Adiga from Pune, India

>what is the OTHER table you are updating from? you must
>have a detail table elsewhere from where you derive that
>15 and 25.
It is coming from an OCI application after processing some files.

>Followup:
>use bulk processing - update between 100 and 500 rows at a >time in a single
call using array processing.
Yes, that is what we do. It takes about 60 mins to update 5M records on a 2 CPU
machine. Is this expected?

If I do the same in multiple threads I do not see any performance improvement,
i.e. 1M records updated in parallel in each of 5 threads.
Any idea why?


Followup:
are you sure they are going in parallel and not blocking/locking each other.

verify that all 5 sessions are in fact not only ACTIVE (v$session) at the same
time but that you are not blocking yourself.

and you have taken some statspacks during your testing to see if you have any
obvious bottlenecks right?

Explicit commit ! August 07, 2003
Reviewer: Nathan from London

Tom,

Why do we commit explicitly after submitting jobs ?

Thanks
Nathan
/*------------------------------------------------
Followup:
instead of

begin
execute immediate 'alter index idx1 rebuild';
execute immediate 'alter index idx2 rebuild';
end;

you can code:

declare
l_job number;
begin
dbms_job.submit( l_job, 'execute immediate ''alter index idx1 rebuild'';' );
commit;
dbms_job.submit( l_job, 'execute immediate ''alter index idx2 rebuild'';' );
commit;
end;
---------------------------------*/


Followup:
because if you do not YOUR session can "see" the job in the job queue but the
job queue processes cannot!


How many of us have sat there for minutes going "when is the stupid thing going
to run" and until we exit sqlplus it does nothing :)


Explicit Commit August 07, 2003
Reviewer: Nathan from London

Apologies,
I must withdraw my previous question... I was obviously not thinking straight. I
was thinking about a similar situation where a commit was issued ( before
dbms_job.submit ) with a comment /* do not remove */ without any
explanation... I'm still pondering why the commit should be there.

Sorry for the trouble.

Regards
Nathan



I have to take issue.... August 07, 2003
Reviewer: cheaper than disk from USA

Your response about justification for buying disk space is outdated...

"justification

you pay me alot
disk is cheap.
think about it.

(sometimes it really can be that simple)"

Anybody else out there NOT paid alot ANYMORE?!

;-)


Re: Explicit Commit - To Nathan August 07, 2003
Reviewer: A reader

Tom is out till weekend..
"I'm still pondering why the commit should be there. "
The commit after dbms_job is required so that other sessions can see the
job queue... which typically is the
requirement... As usual, Tom can confirm.

Menon:)


Reader August 10, 2003
Reviewer: A reader

"
create table invtmp as
select * from invins
UNION ALL
select t1.*
from inv t1, invdel t2
where t1.key = t2.key(+)
and t2.key is null;
"

I'd like to know why
"
where t1.key = t2.key(+)
and t2.key is null
"
is better than
where t1.key != t2.key with the ALWAYS_ANTI_JOIN hint


Followup:
because there is no "always anti join" hint?


you could of course use the CBO, and a NOT IN subquery in which case it'll do it
itself.

August 14, 2003
Reviewer: Su Baba

So if I have a big table, x, that needs to be updated, I would

create table y nologging as select * from x...
retrieve index definition from x
build indexes on y
drop table x
rename y to x

If I have all these done in a PL/SQL batch job, what's the best way to retrieve
index definition and form create index statement from table x? I want to create
indexes on y that have the exact storage parameter, tablespace, maybe partition,
etc. as the ones on table x.

By the way, the database version is 8.1.7.

thanks



Followup:
you COULD, not WOULD, it is an option.


you don't have your create indexes somewhere?

guess I would exp rows=n, imp indexfile=foo to get them -- or
DBMS_METADATA.get_ddl in 9i
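
In 9i, a sketch of pulling the index DDL with dbms_metadata (owner and table name are
placeholders):

set long 100000
select dbms_metadata.get_ddl( 'INDEX', index_name, owner )
  from dba_indexes
 where owner = 'SCOTT'
   and table_name = 'X';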

How to update part of a Very Large Table August 28, 2003
Reviewer: Mohan K from Bangalore, India

please refer "How to update part of a Very Large Table July 28, 2003"

When updating a large table in a database having a comparatively small buffer cache,
one thing to make sure of is that blocks are updated in a contiguous order. Suppose
some records of block1 are updated first in the beginning and then again after updating
some other blocks; then block1 will be written twice by DBWR and LGWR. After the
first update block1 may get flushed out by the LRU algorithm. If all records in
block1 are updated at one time then block1 is written by LGWR and DBWR only once.
If the update statement is doing a full table scan then blocks are updated in
order. But the order can be different when it is using an index for selecting
the rows.

The SQL statement that makes sure of that is

update (select id, description from test_update where rowid in(select rowid from
(select rowid from test_update order by rowid))) set description=description||'
new val' where description like '%5%';


Followup:
not necessarily so -- first lgwr doesn't write blocks -- it writes redo.
second, dbwr may or may not write the block more than once.


I would never never never use your approach. why would I want to:

a) read the entire table
b) sort the rowids
c) distinct that set (implied)
d) join to it (and you are making the assumption the table in the IN will be the
driving table and that a hash join will not be used and ..... lots of other
stuff)


when a simple:

update T set d = d || 'x' where d like '%5%'

will do it -- if you are worried about the index on d being used - you would
simply FULL hint it -- but doubtful that it would be necessary.

the additional work you add would more than offset the work you are trying to
save.
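
For illustration, the hinted form of that statement would be something like:

update /*+ FULL(t) */ t set d = d || 'x' where d like '%5%';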

update large table August 28, 2003
Reviewer: Vera from D.C.

Hi Tom and thank you again and again for all your work.

My problem is that I cannot simply replace update with creating a new table. The
table I need to update - I'll call it 'pension table' - has about 1.5 mln rows.
I have to update "amount" fields for the 1994, 1995, etc. years, and the way
I update these fields depends on the status of the participant in each particular
year. To determine this status I built a function that looks through other
tables and takes several input parameters, including each particular financial
year, because it determines not just the status of a participant, but the status he/she
had in that particular FY. (This function I call a "box of cockroaches".) Basically,
I ask Oracle to bring me the annual monthly amount for Mr/Mrs 'X' for, let's say,
1994, and if 'X' had status 'retired' in 1994 then I want this amount to be
calculated one way, if 'X' was 'deferred' - another way, and if 'X' was 'dead' - a
third way.
As you already guessed, updates take a long time - about an hour for each year. I
wonder what approach you would use in a situation like this?
Thank you
By the way, what is a status of your new book?


Followup:
I would use a CASE statement in SQL without using PLSQL to compute the value.

optionally -- i would turn it inside out. instead of sql calling plsql, this
might be one of the times when plsql calling sql is more appropriate. build up
procedurally an array of values to update -- forall update them -- and do it
again until the rows are processed. with a little foresight, you can even
parallelize this processing and run your procedure N times -- each instance of
it running against a different slice of the table...
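
As a sketch of the CASE idea -- every table, column and formula below is invented purely
for illustration:

update pension_table p
   set amount_1994 =
       case
         when exists ( select null
                         from status_history s
                        where s.participant_id = p.participant_id
                          and s.fiscal_year    = 1994
                          and s.status         = 'RETIRED' )
              then p.amount_1994 * 1.10                -- the "retired" calculation
         when exists ( select null
                         from status_history s
                        where s.participant_id = p.participant_id
                          and s.fiscal_year    = 1994
                          and s.status         = 'DEFERRED' )
              then p.amount_1994 * 0.50                -- the "deferred" calculation
         else p.amount_1994                            -- e.g. "dead", left unchanged here
       end;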


book is "real soon" -- should be in trucks right now.

update large table August 28, 2003
Reviewer: Luc Gyselinck from Belgium

For large tables, if I have to update more than 20 percent of the rows, I do it
like Tom: write a query to insert the 'updated' records in a new table, using
the /*+ APPEND */ hint. You have no indexes to update, no constraints to
validate and you generate almost no redo. Next, I truncate the old table, after
I have disabled all constraints and triggers (once again, almost no redo, very
fast), and then I inject, again with the /*+ APPEND */ hint, the data from the
new table into the old (very fast, little redo). Indexes get automatically
rebuilt AFTER the extents are populated. I re-enable the triggers, the
constraints (with or without validation, as you wish).

If the new values for the columns being updated must come from other tables, I
NEVER write something like

update t1
set c1 = (select c2 from t2 where t2.c3 = t1.c4)

In fact, you are performing NESTED LOOPs. Using functions to get the new values
for the columns is much the same : NESTED LOOPS (even worse: SQL / PLSQL engine
context switches, open/close cursors in PL/SQL, NO read consistency).

Whenever I find myself in such a situation (typically when writing batch
procedures, during data migrations, data transformations), I make sure my
queries use HASH joins, I give the session more resources (higher
SORT_AREA_SIZE, higher HASH_AREA_SIZE, DB_MULTIBLOCK_READ_COUNT to the max),
avoid the INDEX RANGE SCANS, do FULL (yes FULL) table scans, do FAST FULL INDEX
scans, thus bypassing the buffer pool which otherwise gets flushed by the data
from my queries. And by the way, use LMTs with uniform size extents.

As an example, I rewrote a batch procedure that took 3 days to complete (once,
many years ago), written the old way (see the update statement above), that now
does the same job in only 4 hours, on the same (old) hardware, same database,
but different Oracle version: Oracle 7 / Oracle 8i.


Yes, you are right, August 28, 2003
Reviewer: Vera from D.C.

but right now I don't see how I can perform my task without an update statement and
a function. In other cases I do exactly the same thing you do on large tables -
create a new table rather than update the old one and then rename the tables.
As for LM tablespaces, I have been begging our DBAs to switch to LMTs since I don't know
when, and they keep promising to do it.


Does parallel insert just imply direct? September 08, 2003
Reviewer: Dennis from Missouri, USA

Tom (or anyone),

Does a parallel insert imply direct (append) insert? I read something in the
documentation (or is that forbidden territory) ;) awhile back about the append
hint not being needed on a parallel insert because it was implied.

I issued the following:
INSERT /*+ append, nologging, parallel(ext, 8) */ INTO
hrdmadm.ext_xpayd_pay_details ext
SELECT /*+ parallel(ext2, 8) */ *
FROM hrdmadm.ext_xpayd_pay_details@hip2 ext2

and while I saw the parallel processes spawn on hip2 (for the select), I didn't
notice any spawn where I was.

I was wondering if that was because parallel insert was synonymous with direct
insert, or did I mess up the syntax somehow? Is the insert in fact parallel,
and it's just 'hidden' from us?

Thanks,
Dennis


Followup:

http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96524/c21dlins.htm#10629

quote:

Serial and Parallel Direct-Path INSERT

When you are inserting in parallel DML mode, direct-path INSERT is the
default. In order to run in parallel DML mode, the following requirements must
be met:
......

Thanks! September 08, 2003
Reviewer: Dennis from Missouri, USA

My bad was that I didn't alter session. Thanks Tom for your valuable insight.
I guess I was just too afraid to read down a few more pages when I was in the
documentation, with that being forbidden territory and all ;) Well maybe not
forbidden...just hard to put that 'haunted/fear-inspiring' word that would fit
in there.

Thanks,
Dennis


Optimizing Query September 11, 2003
Reviewer: Mohan

BULK collect and BULK insert can improve the performance. Create SQL objects of
type t1%rowtype and t2%rowtype.
BULK collect 100 records at a time to the collection array. Create seperate
collection arrays to store values that are computed or obtained after searching
the database.
Do BULK insert/update into the target table. Search this site using key words
"bulk insert" or "bulk collect" or "bulk update" for more examples.

Other things are
1. Improve query performance by proper indexing
2. Remove fragmentation by "alter table move" command
3. Rebuild indexes
4. Allow large pctfree if the update causes row migration/row chaining
5. Size rollback segments and rollback tablespaces


Mohan


Followup:
#2 isn't really "a good thing" in general.
#3 is really not "a good thing" in general.

reorgs -- so so overrated.

September 11, 2003
Reviewer: Kailash from NJ

Can you explain which of the 3 operations will be faster. If I want to update
millions of rows,
1. then would delete/reinsert be faster or
2. mere update will be faster
3. the one you suggested will be faster.
Can you advise as to why the method you had suggested will be faster than 1 and
2.

Can you explain why updating millions of rows is not a good idea. Is it based on
the amount of redo generated and what if the columns I update are not part of
the index?

Which operation will be the fastest? I request you to explain the various DML
operations and the amount of redo they generate, i.e. I would appreciate it if you
could explain which DML generates the most redo, followed by the second, followed
by the 3rd which generates the least redo.

Currently we have an application that checks for a condition in a table A and
obtains the unique id. If a record is found, it then flushes all of the dependent
tables for the given id and re-inserts these records back into the dependent
tables with a set of new values. Is this a good approach? If not, what would be
the ideal approach?


Followup:
I can say yes to all three -- each is the fastest.

each is the slowest.

If you have a large bulk update that is done once and hits most of the of rows
-- you may be best off by doing a CTAS, drop old and rename new.

the CTAS, indexing, etc can be done in nologging mode -- no undo, no redo. you
end up getting done much faster.


updating millions of records October 03, 2003
Reviewer: b from austin TX

Hi,
The approach of updating the view instead of a table works great.
But here I want to update multiple columns of the same table based on different
table queries.
We have around 600,000 records in the base table.
What would be the most inexpensive query?
We run Oracle 9i.


UPDATE user1.UserSummary
Set
Column1 = (
SELECT Count(DISTINCT WebOrderID)
FROM user1.OrderTotal ot
WHERE total>0 and
(ot.ostatus <> -1 and
ot.ostatus <> -1) and
ot.UREFITEM = user1.UserSummary.UREFITEM)),
Column2 = (
SELECT NVL(Sum(Total),0) FROM
user1.OrderTotal ot
WHERE (ot.ostatus <> -1 and ot.ostatus <> -1)
and ot.UREFITEM = user1.UserSummary.UREFITEM),
Column3 = (
SELECT min(sodate) FROM orders.salesorders so left join
user1.ordertotal ot on ot.sorefitem = so.sorefitem
WHERE total>0 and (so.ostatus <> -1 and so.ostatus <> -1)
and so.UREFITEM = user1.UserSummary.UREFITEM),
Column4 = (
SELECT max(sodate) FROM orders.salesorders so
left join user1.ordertotal ot on ot.sorefitem = so.sorefitem
WHERE total>0 and (so.ostatus <> -1 and so.ostatus <> -1) and
so.UREFITEM = user1.UserSummary.UREFITEM)
;
the above is the sample of my query

thanks in advance
;;)


Followup:
if you can come up with a single query that pulls all 4 columns together in a
single result set -- we can optimize this. Left as an exercise for you -- see
if you cannot merge the above 4 separate query into a single query

great October 17, 2003
Reviewer: b from tx

thanks
i finally made my query simple


Great October 21, 2003
Reviewer: Raj from San Diego CA USA

Great technique never thought of it.


October 22, 2003
Reviewer: Anthony Reddy from Malaysia, KL


Updating 1 million records October 22, 2003
Reviewer: Kailash from NJ, USA

I have a situation where I have a unique key on 3 columns (c1,c2,c3) on a table
T and I want to update c2 from A to B for some records that match certain
criteria. There are a total of 30 million records in the table and there will be
about 1 million records in the table that match the criteria to be updated. The
table is not partitioned.
1. Your suggestion to drop the old table and to re-create a new table and
re-naming it to the old table may not be possible because I am updating only a
subset of the table. Is there any other method to acheive this ?
2. Which one of the following will be a better solution?
   a. Drop the index, update the rows and recreate the index.
   b. Delete the rows with C2 = A and insert new rows with C2 = B. My DBA says
      that this will create fragmentation and is not a better approach. Is that true?
   c. Do you suggest any better solution under these scenarios, since I keep getting
      these types of questions from my manager often?

Your suggestion in this regard is highly appreciated.


Followup:
have you considered just "update set c2 = b where ...."?

1,000,000 rows out of 30,000,000 isn't alot. if c2 is indexed, you might
benefit from a set unusable, skip unusable, update, rebuild.

but it won't take that long to update 1,000,000 rows. couple of minutes on my
laptop


big_table@ORA920> update big_table set created = created+1;

1833792 rows updated.

Elapsed: 00:02:13.28


not a big deal.
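
A sketch of the "set unusable, skip unusable, update, rebuild" sequence mentioned above
(the index name and the WHERE criteria are made up; note this does not help for unique
indexes enforcing the key):

alter index t_c2_idx unusable;
alter session set skip_unusable_indexes = true;

update t
   set c2 = 'B'
 where c2 = 'A'
   and some_criteria_col = 'X';

alter index t_c2_idx rebuild nologging parallel;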

How about rollback segment problem when updating million rows. October 22, 2003
Reviewer: Ramasamy from NC USA

Tom ,

Thanks lot for your service.

I have to update a single column NUMBER(9) where
the table has got more than 80 columns.
This table has got 30 million rows and i have to update
around 17 million rows each night.

As you always insisted ,
i tried with a single update like

UPDATE t1 SET batchno = (A constant )
WHERE batchno is NULL;

1.)I came across the rollback segment issue.
2.) It is taking around 3 hrs to update.

Your input is greatly appreciated to solve the above issues.

Thanking You,
Ramasamy


Followup:
there is no problem -- other than that you haven't sized your system to do the things
you require it to do!!!!!!!!


1) the rollback segment "issue" is a "non issue". size things for what you do
and there are NO ISSUES. disk is cheap -- compared to what they pay us all to
have this discussion. consider that. the amount of time you've spent thinking
about this already cost your employer much more than the disk it would take to
do this correctly.

2) you are changing a null to a non-null -- increasing the width of the field.
make sure you are not causing massive row migration (check your pctfree, make
sure it is set appropriately here). is batchno "indexed"?

MILLION RECORD INSERT/SELECT/UPDATE TAKES TOO MUCH TIME October 23, 2003
Reviewer: Santosh Das from Bangalore, INDIA.

Hi Tom,
I have a table A, and soon after creating it 1 million records are put into it.
Then, being in a real-time environment, its size goes on increasing by at least half a
million each day. However, a cleanup application cleans half a million
every day. After inserting 1 million at the creation of the table, it is put into the
real-time environment. Whenever a new record is inserted, a continuously listening
application detects it and applies a select operation on table A. Then it sends
the selected items to some table B and then updates table A.
The listening applications may be 3 or 4 in number. However, the select and update
operations are taking a lot of time, and the initial insertion of 1 million
records is taking 11 hours to complete, so what can be a possible architecture?
I tried sending the rowid to a queue/pipe after inserting into table A and then
used it to select and update. But it is still taking a lot of time. Please suggest
a method.
Thanks and regards,
Sandy..


Followup:
sounds like you should just be using AQ -- plop the record in as a message --
have as many dequeuers as you like.

not sure what else to say, not a ton of detail here.

Select and update of a table Having Millions of Records. October 23, 2003
Reviewer: Jack from NY

Hi Tom,
I have the following queries which execute on a table of a million records. That
table has a PK, but it is not used in either of the queries given below. CMTIME, SAD and
STATUS are all NUMBER fields. If I make an index on CMTIME then it takes longer.
Presently I don't have an indexed field. Can you please tell me how I can optimize these
or where the performance bottleneck is?

update SD set CMTIME = 1064929834 where (CMTIME is NULL or 1064929834 - CMTIME
>= 10) and (1064929834 - SAD >= 0) and 1064929834 not in (select CMTIME from SD
where CMTIME is not NULL)

SELECT * from SD where (CMTIME is not NULL and (CMTIME = 1064929834 OR
1064929834 - CMTIME >= 10)) AND (ED - 1064929834 > 0 and (STATUS = 1 OR STATUS =
0) and 1064929834 - SAD >= 0)

Thanks and regards,
Jack.


Followup:
are you using the CBO.


Followup to your question October 23, 2003
Reviewer: Ramasamy from NC USA

2) you are changing a null to a non-null -- increasing the width of the field.
make sure you are not causing massive row migration (check your pctfree, make
sure it is set appropriately here). is batchno "indexed"?

Yes Tom. The batchno is indexed like
index on (BATCHNO, STATUS);

Will it be worth dropping the index and recreating it?

Thanks,
Ramasamy


Followup:
could definitely help, yes.

October 23, 2003
Reviewer: George Lee from Hong Kong

I tried to update 30 million records using Tom's solution.
It works fine. It took approximately 1.5 hours to finish.
George (Broadway Photo Supply Ltd. HK)


whats CBO October 24, 2003
Reviewer: Jack from NY


Followup:
cost based optimizer

No October 27, 2003
Reviewer: jack from NY

We are not using the cost based optimizer.
It's a simple query which we are using to retrieve data from a table having
millions of records, and its performance is not satisfactory.
How do we optimize it so that it runs faster?
jack


Followup:


you are kidding right?

this goes right up there with "my car won't start, why not?"

there is so much missing from this "request" as to make it impossible to say
anything sensible.


November 11, 2003
Reviewer: George Lee from Hong Kong

I have a fact table being partitioned by month. The indexes are built local to
partition.

The data population is working fine.

But the rebuild index fails at the end.

I don't know how to rebuild indexes in the partitions.
Can you help me ?


Followup:
why do you need to rebuild here at all??

but why don't you show us what you are doing -- at the very least -- define what
"fails at the end" means (error codes, messages, details)

GOTO a page to Bookmark Review | Bottom | Top
November 12, 2003
Reviewer: George Lee from Hong Kong

Dear Tom,

Yes, I should provide enough information to you. Sorry for that.

Here is the story,

I create a table as following,

drop table D_DYNRPT_SALES_FACT;

create table D_DYNRPT_SALES_FACT
(
TX_DATE DATE,
SHOP_NO VARCHAR2(20),
ITEM_NO VARCHAR2(12),
PARENT_ITEM_NO VARCHAR2(12),
BRAND_NO VARCHAR2(5),
VENDOR_NO VARCHAR2(20),
ITEM_GROUP VARCHAR2(5),
ITEM_TYPE VARCHAR2(5),
CONSIGNMENT VARCHAR2(1),
ITEM_DESC VARCHAR2(40),
BRAND_DESC VARCHAR2(30),
VENDOR_DESC VARCHAR2(30),
CATEGORY_DESC VARCHAR2(30),
QTY_SOLD NUMBER(14,2),
NET_SALES_AMT NUMBER(14,2),
GROSS_PROFIT_AMT NUMBER(14,2)
)
PARTITION BY RANGE (TX_DATE)
( PARTITION D_DYNRPT_SALES_FACT_2000_09 VALUES LESS THAN
(TO_DATE('01/10/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_09,
PARTITION D_DYNRPT_SALES_FACT_2000_10 VALUES LESS THAN
(TO_DATE('01/11/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_10,
PARTITION D_DYNRPT_SALES_FACT_2000_11 VALUES LESS THAN
(TO_DATE('01/12/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_11,
PARTITION D_DYNRPT_SALES_FACT_2000_12 VALUES LESS THAN
(TO_DATE('01/01/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_12,
PARTITION D_DYNRPT_SALES_FACT_2001_01 VALUES LESS THAN
(TO_DATE('01/02/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_01,
PARTITION D_DYNRPT_SALES_FACT_2001_02 VALUES LESS THAN
(TO_DATE('01/03/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_02,
PARTITION D_DYNRPT_SALES_FACT_2001_03 VALUES LESS THAN
(TO_DATE('01/04/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_03,
PARTITION D_DYNRPT_SALES_FACT_2001_04 VALUES LESS THAN
(TO_DATE('01/05/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_04,
PARTITION D_DYNRPT_SALES_FACT_2001_05 VALUES LESS THAN
(TO_DATE('01/06/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_05,
PARTITION D_DYNRPT_SALES_FACT_2001_06 VALUES LESS THAN
(TO_DATE('01/07/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_06,
PARTITION D_DYNRPT_SALES_FACT_2001_07 VALUES LESS THAN
(TO_DATE('01/08/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_07,
PARTITION D_DYNRPT_SALES_FACT_2001_08 VALUES LESS THAN
(TO_DATE('01/09/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_08,
PARTITION D_DYNRPT_SALES_FACT_2001_09 VALUES LESS THAN
(TO_DATE('01/10/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_09,
PARTITION D_DYNRPT_SALES_FACT_2001_10 VALUES LESS THAN
(TO_DATE('01/11/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_10,
PARTITION D_DYNRPT_SALES_FACT_2001_11 VALUES LESS THAN
(TO_DATE('01/12/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_11,
PARTITION D_DYNRPT_SALES_FACT_2001_12 VALUES LESS THAN
(TO_DATE('01/01/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_12,
PARTITION D_DYNRPT_SALES_FACT_2002_01 VALUES LESS THAN
(TO_DATE('01/02/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_01,
PARTITION D_DYNRPT_SALES_FACT_2002_02 VALUES LESS THAN
(TO_DATE('01/03/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_02,
PARTITION D_DYNRPT_SALES_FACT_2002_03 VALUES LESS THAN
(TO_DATE('01/04/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_03,
PARTITION D_DYNRPT_SALES_FACT_2002_04 VALUES LESS THAN
(TO_DATE('01/05/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_04,
.
.
.
PARTITION D_DYNRPT_SALES_FACT_2004_12 VALUES LESS THAN
(TO_DATE('01/01/2005','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2004_12,
PARTITION D_DYNRPT_SALES_FACT_MAXVALUE VALUES LESS THAN (maxvalue)
TABLESPACE D_DYNRPT_SALES_FACT_MAXVALUE
)
/

CREATE INDEX D_DYNRPT_SALES_FACT_I01 ON D_DYNRPT_SALES_FACT
(TX_DATE) LOCAL;

CREATE INDEX D_DYNRPT_SALES_FACT_I02 ON D_DYNRPT_SALES_FACT
(TX_DATE, ITEM_GROUP, ITEM_TYPE) LOCAL;

CREATE INDEX D_DYNRPT_SALES_FACT_I03 ON D_DYNRPT_SALES_FACT
(TX_DATE, SHOP_NO, ITEM_NO, ITEM_GROUP) LOCAL;

CREATE INDEX D_DYNRPT_SALES_FACT_I04 ON D_DYNRPT_SALES_FACT
(TX_DATE, BRAND_NO, ITEM_GROUP, ITEM_TYPE) LOCAL;

CREATE INDEX D_DYNRPT_SALES_FACT_I05 ON D_DYNRPT_SALES_FACT
(TX_DATE, ITEM_NO) LOCAL;

CREATE OR REPLACE PUBLIC SYNONYM D_DYNRPT_SALES_FACT FOR
BPSADM.D_DYNRPT_SALES_FACT;

then, populate data into the table as the following,

alter index D_DYNRPT_SALES_FACT_I01 unusable;
alter index D_DYNRPT_SALES_FACT_I02 unusable;
alter index D_DYNRPT_SALES_FACT_I03 unusable;
alter index D_DYNRPT_SALES_FACT_I04 unusable;
alter index D_DYNRPT_SALES_FACT_I05 unusable;

alter session set skip_unusable_indexes=true;

@pop_d_dynrpt_sales_fact.sql '2000/09/01' '2000/01/30';
.
.
.
@pop_d_dynrpt_sales_fact.sql '2003/11/01' '2003/11/30';

alter index D_DYNRPT_SALES_FACT_I01 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I02 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I03 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I04 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I05 rebuild nologging;

the data population is working fine, but alter index rebuild nologging fails with
the error as following,

alter index D_DYNRPT_SALES_FACT_I01 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole

alter index D_DYNRPT_SALES_FACT_I02 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole

alter index D_DYNRPT_SALES_FACT_I03 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole

alter index D_DYNRPT_SALES_FACT_I04 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole

alter index D_DYNRPT_SALES_FACT_I05 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole

I checked the reference; it seems to be a syntax problem with my rebuild index
statement -- it should be something like alter index D_DYNRPT_SALES_FACT_I05 rebuild partition ...,
but I don't know the exact statement. Can you help me? Thanks.



Followup:
ops$tkyte@ORA920LAP> CREATE TABLE t
2 (
3 data char(255),
4 temp_date date
5 )
6 PARTITION BY RANGE (temp_date) (
7 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
8 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
9 PARTITION part3 VALUES LESS THAN (to_date('15-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part4 VALUES LESS THAN (to_date('16-mar-2003','dd-mon-yyyy')) ,
11 PARTITION part5 VALUES LESS THAN (to_date('17-mar-2003','dd-mon-yyyy')) ,
12 PARTITION part6 VALUES LESS THAN (to_date('18-mar-2003','dd-mon-yyyy')) ,
13 PARTITION junk VALUES LESS THAN (MAXVALUE)
14 )
15 ;

Table created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> create index t_idx1 on t(temp_date) LOCAL nologging;

Index created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> alter index t_idx1 unusable;

Index altered.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> begin
2 for x in ( select 'alter index ' || index_name ||
3 ' rebuild partition ' || partition_name stmt
4 from user_ind_partitions
5 where index_name = 'T_IDX1' )
6 loop
7 dbms_output.put_line( x.stmt );
8 execute immediate x.stmt;
9 end loop;
10 end;
11 /
alter index T_IDX1 rebuild partition PART1
alter index T_IDX1 rebuild partition PART2
alter index T_IDX1 rebuild partition PART3
alter index T_IDX1 rebuild partition PART4
alter index T_IDX1 rebuild partition PART5
alter index T_IDX1 rebuild partition PART6
alter index T_IDX1 rebuild partition JUNK

PL/SQL procedure successfully completed.


GOTO a page to Bookmark Review | Bottom | Top
December 11, 2003
Reviewer: A reader

Hi Tom,
I have been running the update statement below for the past 24 hours and it's still running.

Table S_REVN has 1.5 million records and table S_OPTY_PROD has 1.2 million. It's
doing a full table scan of both tables since I am using a function in the where clause.
We are using the RULE based optimizer and Oracle 8.1.7.4.

Here is sql statement

Update siebel.S_REVN r
set x_opty_prod_city=(select X_CITY from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id),
X_opty_prod_STATE =(select X_STATE from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id),
X_PR_YEAR_QTY =(select X_PR_YEAR_QTY from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)


0 UPDATE STATEMENT Optimizer=RULE
1 0 UPDATE OF 'S_REVN'
2 1 TABLE ACCESS (FULL) OF 'S_REVN'
3 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'
4 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'
5 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'

Please advise
Thanks


Followup:
you understand that it is doing 3 full scans of S_OPTY_PROD for EACH and EVERY
row in S_REVN, yes???


how about

update t
set (a,b,c) = (select a,b,c from t2 where ... )

at the very least (reduce the full scans from 3 to 1)


at the very most, consider creating an index:

create index
temp_index_to_make_this_complete_before_my_grandchildren_have_grandchildren
on p(replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$'))

and using a hint (to let sighbul use the cbo) to get the index used.

GOTO a page to Bookmark Review | Bottom | Top
December 12, 2003
Reviewer: A reader

Hi Tom,

I am following your advice on update I've changed sql statement


Update /*+ INDEX(S_OPTY_PROD T_TEMP) */ siebel.S_REVN r
set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
(select X_CITY,X_STATE,X_PR_YEAR_QTY from siebel.S_OPTY_PROD p
where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)

created function based index on S_OPTY_PROD

create index T_TEMP
on S_OPTY_PROD(replace(replace(replace(ROW_ID,'ce','='),'-','='), '@', '$'))

BUT the above sql statement is not using the index even though I supply a hint (is it because
the function based index is not recognized by the rule based optimizer?)

Please help
Thanks


Followup:
you put the hint in the wrong place :)

and you might not have met all of the requirements for FBI's. read:


http://asktom.oracle.com/~tkyte/article1/index.html

but it is not the RBO (the well formed hint -> cbo)....


your hint is wrong (must use the correlation name of P) and in the wrong place
(should be in the subquery).

you need query_rewrite_enabled and (depending on version) might need
query_rewrite_integrity (see the above link).

but here is an example:
ops$tkyte@ORA920> create table t1( x_opty_prod_city int, x_opty_prod_state int,
x_pr_year_qty int, row_id varchar2(25) );

Table created.

ops$tkyte@ORA920> create table t2( x_city int, x_state int, x_pr_year_qty int,
row_id varchar2(25) );

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t2_idx on t2(
replace(replace(replace(ROW_ID,'ce','='),'-','='), '@', '$') );

Index created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> delete from plan_table;

4 rows deleted.

ops$tkyte@ORA920>
ops$tkyte@ORA920> ALTER SESSION SET QUERY_REWRITE_ENABLED=true;

Session altered.

ops$tkyte@ORA920>
ops$tkyte@ORA920> explain plan for
2 Update t1 r
3 set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
4 (select /*+ INDEX( t2 t2_idx ) */ X_CITY,X_STATE,X_PR_YEAR_QTY
5 from t2 p
6 where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$')
= r.row_id)
7 /

Explained.

ops$tkyte@ORA920>
ops$tkyte@ORA920> prompt @?/rdbms/admin/utlxpls
@?/rdbms/admin/utlxpls
ops$tkyte@ORA920> set echo off

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
--------------------------------------------------

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |Cost(%CPU)|
-------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 82 | 4346 | 3 (34)|
| 1 | UPDATE | T1 | | | |
| 2 | TABLE ACCESS FULL | T1 | 82 | 4346 | 3 (34)|
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 53 | 2 (50)|
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (50)|
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 -
access(REPLACE(REPLACE(REPLACE("P"."ROW_ID",'ce','='),'-','='),'@','$')=:B
1)

16 rows selected.

GOTO a page to Bookmark Review | Bottom | Top
December 12, 2003
Reviewer: A reader

Hello Tom, I load the table as follows using the parallel hint, but the cost is
different:

sql>insert /*+ append parallel(demo01,4) */ into demo01 select /*+
parallel(customer,4) */ * from customer;
the cost --42s
and the Execution Plan is:
Execution Plan
----------------------------------------------------------
ERROR:
ORA-12838: cannot read/modify an object after modifying it in parallel
SP2-0612: error generating AUTOTRACE EXPLAIN report

Statistics
----------------------------------------------------------
1698 recursive calls
455 db block gets
8728 consistent gets
8075 physical reads
66492440 redo size
630 bytes sent via SQL*Net to client
525 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
11 sorts (memory)
0 sorts (disk)
390126 rows processed

sql>insert into demo02 select * from customer;
cost --26s
and the Execution Plan is:
Execution Plan
----------------------------------------------------------
0 INSERT STATEMENT Optimizer=CHOOSE (Cost=1226 Card=659568 Byt
es=680674176)

1 0 PARTITION RANGE (ALL)
2 1 TABLE ACCESS (FULL) OF 'CUSTOMER' (Cost=1226 Card=659568
Bytes=680674176)

Statistics
----------------------------------------------------------
1939 recursive calls
44876 db block gets
20918 consistent gets
1 physical reads
64954576 redo size
646 bytes sent via SQL*Net to client
467 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
390126 rows processed

The customer table has 0.4 million records and the machine is a 2 CPU SUN
E420, which runs the Oracle database.

Can you tell me why the physical reads increase so much when using the parallel hint in
the DML sql? Thanks


Followup:
well, for a tiny number of records like this, i would not even consider PQ at
this point.


But -- to answer your questions

the cost, well the cost is affected by thousands of things. HINTS definitely
affect the cost (That is in part how hints work -- by playing with costs).
Ignore the cost of two queries that are not the same, they are NOT comparable.


the PIO's -- well, you ran one before the other? That and parallel query
prefers to checkpoint -- and do direct io many times (you might find PQ doing a
global checkpoint before the query begins to get current images onto disk in
order to just all hit slam the data as fast as it can without messing with the
buffer cache)


but for so few records -- it would seem that regular sql is what you want.

GOTO a page to Bookmark Review | Bottom | Top
December 13, 2003
Reviewer: A reader

Hi Tom,
Followed your advice...WORKS LIKE A CHARM..updated 1.5 million records in 10 min.
EXCELLENT!!!!!!!

Thanks a LOTTTTTTTTTTT

Update siebel.S_REVN r
set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
(select /*+ INDEX(S_OPTY_PROD T_TEMP) */ X_CITY,X_STATE,X_PR_YEAR_QTY from
siebel.S_OPTY_PROD p
where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)


GOTO a page to Bookmark Review | Bottom | Top
Online updation of very big table March 06, 2004
Reviewer: Praveen from Delhi

Hi Tom
I want to update a table in one go on an online system. The table has 20 lakh (2 million) records
and has 110 columns. When I issue the update command it takes ~ one hour. I don't
know why it is taking so much time even though an index is created on that particular
field.

select count(*), app_flg from test group by app_flg;
app_flg has only two values, APPROVED or UNAPPROVED. The default is UNAPPROVED.
approved   = 170000
unapproved = 30000
update test set app_flg='APPROVED' where app_flg='UNAPPROVED'
It took 1 hour to update the records; other online users' processing
slowed down and locks started to occur on the table.

Suggest the best solution
pkr


Followup:
It would be the INDEX that slowed it down.

You want to FULL SCAN this table.
You want there to be NO index on the column you are updating.
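
(A minimal sketch of that, assuming the index on the updated column is called
TEST_APP_FLG_IDX -- the name is hypothetical:)

drop index test_app_flg_idx;

update test set app_flg = 'APPROVED' where app_flg = 'UNAPPROVED';

commit;

create index test_app_flg_idx on test(app_flg) nologging;

That way the update is one full scan with no index maintenance, and the index is
rebuilt once at the end.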

GOTO a page to Bookmark Review | Bottom | Top
it gives error March 07, 2004
Reviewer: praveen from india

On dropping the index it gives error ORA-00054.

thanks


Followup:
that'll happen whilst there are outstanding transactions, yes. You'll want to
keep trying until you get a chance to drop it.

GOTO a page to Bookmark Review | Bottom | Top
thanks March 07, 2004
Reviewer: jasdeep,praveen from patiala india

I have solved that problem: a user had locked rows on the table but was no longer
logged on. I killed that session and the index was dropped immediately,
and the updates were as fast as you could wish.

thanks


GOTO a page to Bookmark Review | Bottom | Top
Update taking hrs of time March 09, 2004
Reviewer: Sachin from India

Hi Tom,
I have a query as:
I have two tables Table1(with around 10000 records max) and table2(actually the
GL code combination master table with around 600,000 records). I need to update
three fields in table1 (which is a temp processing table) with a unique value
from table2. T1 has ccid fields which need to be updated, and s1-s4 fields
corresponding to segment1-segment4 fields of table2
The query is like the following:

UPDATE table1 t1
SET t1.ccid1 = (SELECT t2.code_combination_id
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'XXXXX'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.chart_of_accounts_id
and gb.set_of_books_id = 1),
t1.ccid2 = (SELECT t2.code_combination_id
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'zzzzzz'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1),
t1.ccid3 = (SELECT t2.ccid
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'wwwww'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1

But this is taking more than 6 hrs to complete. When I check the table locks, the
table remains locked Row Exclusively. I am committing immediately after the
update statement in the procedure. Could you please tell me why this is happening?


Followup:
not that i don't believe you but -- about the locking issue -- I'm not sure i
believe you.

when you commit -- locks are released. You'd have to show me the info from
v$lock showing that t1 is still "locked" and locked by your session.


you could

set (c1,c2,c3) = ( select max(decode(segment3,'xxxxx',ccid)),
max(decode(segment3,'zzzzz',ccid)),
max(decode(segment3,'wwwww',ccid))
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 in ('xxxxx','zzzzzz', 'wwwww')
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1 )

instead of running a query per column.

if you run the query


select max(decode(segment3,'xxxxx',ccid)),
max(decode(segment3,'zzzzz',ccid)),
max(decode(segment3,'wwwww',ccid))
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = some_value_value
and t2.segment3 in ('xxxxx','zzzzzz', 'wwwww')
and t2.segment2 = some_value_value
and t2.segment1 = some_value_value
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1


how long does that take?

GOTO a page to Bookmark Review | Bottom | Top
Updating millions of rows March 15, 2004
Reviewer: A reader from NJ, USA

Tom,

In this discussion you mention that you would not do an UPDATE/DELETE of
millions of rows, but rather would create a new table (in parallel/nologging)
with the rows that needs to be retained and then create indexes in parallel,
drop the old table and rename the new table to old table. If so, my
understanding is :

1. Delete/UPDATE of tables with more than 1 million row would also have to
maintain the index structure and also will generate considerable rollback and
undo.
2. Other consideration with respect to update/delete is that the blocks will be
moved in and out of the freelists. Please confirm if this is true and if so,
please explain what happens during update/delete of large number of rows with
regard to freelists?
3. Other consideration is that it will cause lot of table/segment fragmentation
and lot of empty blocks below the HWM and all full table scans on the table will
read lot of empty blocks.

Is my understanding correct and if I have missed any other things as to why
DELETE/UPDATE of large tables is not recommended, please let me know.

4. If my table that needs to be updated/deleted is the primary/parent table with
dependent child tables/constraints and triggers, is this approach correct?
Disable all the triggers/constraints
Drop the table
Rename the new table to the old table
Re-enable all constraints and triggers
What to do in such cases?


Followup:
I "might", "probably" if it was most of the records...


1) correct
2) correct, deletes would put them on, updates might put them on, might take
them off.

3) it could, yes.


the delete/update can take lots longer than CTAS + reindex if you are doing MOST
all of the records in a LARGE table with lots of indexes.

4) triggers are not relevant here....

but you would have to drop the fkeys
do the CTAS+drop+rename
add the constraints back with enable/rely (don't need validation)
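
(In outline, with made-up table and constraint names -- the CTAS select list is
where the update logic goes:)

alter table child drop constraint child_parent_fk;

create table parent_new parallel 8 nologging
as
select <columns, with the updated values computed here>
  from parent;

drop table parent;
rename parent_new to parent;

alter table parent add constraint parent_pk primary key (id);

alter table child add constraint child_parent_fk
      foreign key (parent_id) references parent(id)
      rely enable novalidate;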


GOTO a page to Bookmark Review | Bottom | Top
Deletes/Updates March 15, 2004
Reviewer: A reader from NJ, USA

Tom,

In the above discussion you mention :
1. The deletes will put them on because the used space in the block will drop
below PCTUSED. Is it correct?
2. The update will put them on if the updated column makes the used space in
the block fall below PCTUSED or the updated column makes the free space
available in the block less than PCTFREE. Is that correct?

If both of the above understanding is not correct, please explain. Also, please
let me know how we could track the movement of the blocks on and off the
freelist. Any SQL scripts/tools available that I could use?

Thanks


Followup:
1) the deletes could, yes.

2) yes.


to track individual blocks -- not without dumping blocks (and I don't go there).
dbms_space and other tools (search this site for show_space for the script I
use) can show you how MANY things are on the freelist.

GOTO a page to Bookmark Review | Bottom | Top
A small correction March 15, 2004
Reviewer: A reader from NJ, USA

Tom,

In the above scenario for question 2, there is a small correction marked in
CAPITAL letters :
"The update will put them on if the updated column makes the used space in the
block fall below PCTUSED AND WILL TAKE OFF of the freelist if the updated column
makes the free space available in the block less than PCTFREE. "
Is it correct?

Thanks


Followup:
I'll state it simply, forget pctfree/pctused/etc -- in english:

if the update reduces the row size, it can put it back onto the freelist.

if the update increases the row size, it can take it off the freelist.

GOTO a page to Bookmark Review | Bottom | Top
Parallel DML March 16, 2004
Reviewer: A reader

1)On page 630 of your Oracle Expert One on One, you mention that parallel DML
can be used only upon partitions -- in 8i. I'm using 9.2. Has this behavior
changed in the later versions ?

2)I need to do a mass delete on a 12 GB table and it's taking to much time (far
too much, really). How could I speed it up ?

Thanks


Followup:
1) yes, that is a 9ir2 new feature.


http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96531/ch2_9ir2.htm#74432

2) did you read the original answer above?
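
(For (1), a rough sketch of parallel DML against a non-partitioned table in 9iR2 --
the degree of 8 is only an example:)

alter session enable parallel dml;

delete /*+ parallel(t, 8) */ from big_table t where <condition>;

commit;  -- the session must commit (or roll back) before touching the table again

For (2), if the delete removes most of the 12 GB, the CTAS approach described above
(keep the rows you want, drop, rename) will normally beat any delete.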

GOTO a page to Bookmark Review | Bottom | Top
enable PK with parallel clause March 20, 2004
Reviewer: A reader

Hi

I am working in 8.1.7.4.

I want to delete 30 million rows out of 60 million, so I am doing this

1. create a copy of the original table with the good data
2. disable the original table's constraints, primary keys (and child FKs) and
foreign keys
3. truncate the original table
4. make all of the original table's indexes unusable
5. insert append back from the copy to the original; I have to do it this way because I am
not allowed to modify constraint names
6. drop the copy
7. enable constraints. Here, when I enable the PK I cannot provide a parallel clause,
right? I searched the doc but it seems that I can only specify the using index
tablespace clause -- or am I missing something?
8. rebuild all unusable indexes


Followup:
you can create an index in parallel (unique) and then enable/rely/novalidate the
pk.
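
(Something along these lines -- the names are hypothetical; enabling the constraint
picks up the existing unique index:)

create unique index t_pk on t(id) parallel 8 nologging;

alter table t enable novalidate constraint t_pk;
alter table t modify constraint t_pk rely;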

GOTO a page to Bookmark Review | Bottom | Top
How to Update millions or records in a table March 25, 2004
Reviewer: A reader

Hi Tom,
I read your response to Murali's question above and believe there will be a
downtime window for the application. This is because if I want to keep the
index names same as before, I will have to create the new table, drop the old
table, rename the new table to old table name and then create the required
indexes on the same.

Wondering, how if we can create the indexes by some other name on the new table
and rename the indexes after dropping the old table.

As always your valuable advice helps a lot!


Followup:
you can rename the indexes.


alter index supports this.
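
(For example, with hypothetical names:)

alter index new_emp_idx1 rename to emp_idx1;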

GOTO a page to Bookmark Review | Bottom | Top
how to make this restartble? March 25, 2004
Reviewer: A reader

Hi

I want to delete 80% of the rows from a couple of 50-million-row tables. The process
is divided into 10 steps

1. Create TEMP table, CTAS from original table
2. Disable constraints
3. Truncate original table
4. Set indexes to UNUSABLE
5. Insert into original table from TEMP table
6. Drop TEMP table
7. Create PK and UK
8. Enable PK and UK
9. Enable FKs
10. Rebuild indexes

I want to make this process restartable, i.e. if it fails in step 3 and I later
rerun the procedure, it will start from step 3 again. How can we achieve this? I
have been thinking of using a table which stores the table names and process
status, something like the following table, and then updating the status as the
process runs

TABLE: PROCESS_STATUS
TABLE_NAME STATUS
----------- ---------------
EMP 2
DEPT 0

Process_status is the step, 0 means that I need to start from beginning so in
each process I would query this table and see which step should be run. For
example if the process failed at step 3 then STATUS would be 2

select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 0
then
run process 1
end if;

select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 1
then
run process 2
end if;

select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 2
then
run process 3
end if;

select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 2
then
run process 3
end if;

and so on.

Any suggestions :-? What I see is that I will have quite a bit of redundant code...


Followup:
you'd have to keep a state table and have your code query it up, much like you
are suggesting. That would do it, yes.

another "inserting" idea might be to

a) insert the steps to process
b) delete them as you complete them (and commit)
c) to restart, just pick up at the step you wanted

you could insert the procedures to be called and just:

for x in ( select name from procedures order by seq )
loop
execute immediate 'begin ' || x.name || '; end;';
delete from procedures....
commit;
end loop

(just a thought, not fully baked)
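
(To flesh that thought out a little -- the table and procedure names are made up,
and each step would be wrapped in its own stored procedure:)

create table process_steps
( seq  number primary key,
  name varchar2(30) not null );

insert into process_steps values ( 1, 'do_ctas' );
insert into process_steps values ( 2, 'disable_constraints' );
insert into process_steps values ( 3, 'truncate_original' );
-- one row per remaining step...
commit;

begin
    for x in ( select seq, name from process_steps order by seq )
    loop
        execute immediate 'begin ' || x.name || '; end;';
        delete from process_steps where seq = x.seq;
        commit;  -- step finished; a rerun picks up at the first remaining row
    end loop;
end;
/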

GOTO a page to Bookmark Review | Bottom | Top
provide more information about that procedure drop indexes and rebuild with nologging April 07, 2004
Reviewer: mohan from india

Hi Tom
Could you provide more information about that procedure and
how to drop indexes and rebuild with nologging .We are using informatica etl
tool
before loading bulk data into target drop the indexes ( pre-session) and after
load the data rebuilt the indexes with nologging (post-session ), it takes less
amount of time because its generates less undo
Regards
Mohan


Followup:
it is just a drop and create? not sure what you are looking for -- if it is
"syntax", we document that.

GOTO a page to Bookmark Review | Bottom | Top
Problem with Update April 08, 2004
Reviewer: Ram from Bangalore,India

Hi Tom,
This update is not working properly. How do I correct it?

SQL> declare
2 x number;
3 y number;
4 begin
5 execute immediate 'update e set sal = sal*1.1 where ename in
('SMITH','KING') returning sal into :x,:y' using x,y;
6 dbms_output.put_line(x || chr(9)||y);
7* end;
*
ERROR at line 5:
ORA-06550: line 5, column 65:
PLS-00103: Encountered the symbol "SMITH" when expecting one of the following:
. ( * @ % & = - + ; < / > at in is mod not rem return
returning <an exponent (**)> <> or != or ~= >= <= <> and or
like between into using || bulk

Please do reply.
Bye!


Followup:
..... ( ''SMITH'', ''KING'' ) .....

you need two quotes for a quote in a string (until 10g when there is another
way)

GOTO a page to Bookmark Review | Bottom | Top
Updating million records April 08, 2004
Reviewer: Himnish Narang from Chandigarh, India

Hi Tom,

i was just going through the discussion in this subject. In this you have
described that you will create a table instead of going for updates of millions of
records, and this update will be performed as part of the creation of the new table.

Don't you think that the new table will also occupy the same amount of space and
after dropping the table we will have to do a reorg for that tablespace?

please comment.

Thanx.


Followup:
reorgs are so overrated.


no, i would not reorg a tablespace or anything simply because I copied some data
and dropped some old stuff. not a chance.



GOTO a page to Bookmark Review | Bottom | Top
How to Update millions or records in a table April 08, 2004
Reviewer: Ganesh from Visakhapatnam

Hi Tom,
It's very useful, and by creating a new table I saved lots of time.
I have another issue similar to this. We are using 9i. There is a requirement
to modify a primary key's datatype from number to varchar, and it has a lot of
dependents. Is there any option other than rebuilding the table, as the data already exists?

Can you please suggest on this.
Thanks in Advance.
Ganesh


Followup:
that'll be a rebuild pretty much.

GOTO a page to Bookmark Review | Bottom | Top
Please see this April 08, 2004
Reviewer: Ram from Bangalore,India

Hi Tom,
Thanks for your reply, but it's still not working properly.
SQL> declare
2 x number;
3 y number;
4 begin
5 execute immediate 'update e set sal = sal*1.1 where ename in
(''SMITH'',''ALLEN'')
6 RETURNING sal into :x,:y' using x,y;
7 dbms_output.put_line(x ||chr(9) ||y);
8 end;
9 /
declare
*
ERROR at line 1:
ORA-00913: too many values
ORA-06512: at line 5


How to correct this?You say about another way in Oracle 10G.How does that
work?Could you please provide an example.
Please do reply.
Bye!



Followup:
you are returning SAL (one thing) into two things (:x,:y)

different problem all together.

in 10g, the strings could be

q'|how's this for quoting|'


instead of
'how''s this for quoting'

a character string literal that starts with q for quote will use the next two
characters as the start quote and those two characters in reverse for the end
quote.

GOTO a page to Bookmark Review | Bottom | Top
Thanks April 09, 2004
Reviewer: Ram from Bangalore,India

Hi Tom,
Thanks for your reply. I found a way of doing it as follows; do you
have any other option to do it in a better way?
SQL> create or replace type sal_t as table of number;
2 /

Type created.

SQL> declare
2 sals sal_t := sal_t();
3 begin
4 execute immediate 'update emp set sal = sal*1.1 where ename in
(''SMITH'',''KING'')
5 returning sal into :sals' returning bulk collect into
sals;
6 for i in 1..sals.count loop
7 dbms_output.put_line('Updated salary:'||sals(i));
8 end loop;
9 end;
10 /
Updated salary:968
Updated salary:5500

PL/SQL procedure successfully completed.

Please do reply.
Bye!


Followup:
get rid of the dynamic sql, it isn't necessary (fixes your original issue with
the quotes as well). also, would seem to make sense to pull back other columns
so you would be able to see what rows were actually updated:


ops$tkyte@ORA9IR2> declare
2 type numArray is table of number;
3 l_sals numArray;
4 l_empno numArray;
5 begin
6 update emp
7 set sal = sal*1.1
8 where ename in ('SMITH','KING')
9 returning empno, sal bulk collect into l_empno, l_sals;
10
11 for i in 1..l_sals.count loop
12 dbms_output.put_line
13 ('Updated salary for empno ' || l_empno(i) ||
14 ' to: '||l_sals(i));
15 end loop;
16 end;
17 /
Updated salary for empno 7369 to: 880
Updated salary for empno 7839 to: 5500

PL/SQL procedure successfully completed.


GOTO a page to Bookmark Review | Bottom | Top
To Mr. Ram April 09, 2004
Reviewer: A reader

Hi Ram and All,
May I request you all not to pester Tom with inane posts requesting Tom to debug
and program on our behalf.
This forum is not to learn coding.


GOTO a page to Bookmark Review | Bottom | Top
update based on rowid April 16, 2004
Reviewer: john

Tom,

we use non-intelligent primary key to update the table through a stored
procedure.

i.e. when the table is queried, our select procedure returns the non-intelligent
key along with the details like below

select emp_key, other details from emp;

when updating the data, emp_key is passed from the front end to the stored
procedure and table is updated based on emp_key.

is it a bad idea, using rowid instead of non-intelligent key(emp_key)?


Followup:
non-intelligent? interesting choice of terminology.


But in any case -- once upon a time ago, rowids were IMMUTABLE. Once assigned a
row would have a rowid assigned to it and that rowid would live with that row
until you deleted it.

Starting in 8i with support for updates to partition keys that would cause a row
to move from partition A to partition B -- that is no longer true (and then
there are IOT's...).

In 10g, there are even more options for 'row movement' -- an online segment
shrink for example.

So, rowids can change and are changing in more circumstances as time goes on.


Sooooo, if you have lost update detection in place using 'FOR UPDATE' -- rowids
are very safe (forms uses them). What I mean is -- you

a) select a.*, rowid from T a where .... 'normally'
b) you decide to update a row, so you

select * from t where rowid = :that_rowid AND
((c1 = :that_c1 or (c1 is null and :that_c1 is null)) AND
((c2 = :that_c2 or (c2 is null and :that_c2 is null)) AND ...
((cN = :that_cN or (cN is null and :that_cN is null))
for update NOWAIT;

you lock it -- this prevents anything that could change a rowid from doing so
and verifies the data hasn't been changed. If that returns 0 rows, someone
changed the data (or reorg'ed it and moved it). You need to requery to get the
current values before you let the user even think about modifying it. If that
returns a row -- you got it. If that returns ora-54, something has it locked,
you have to decide what you want to do about that.

c) you can then safely update that row by rowid

If you do not have lost update detection in place using FOR UPDATE locks -- then
you should stick with a primary key probably, just to protect yourself in the
future.

GOTO a page to Bookmark Review | Bottom | Top
update based on rowid April 19, 2004
Reviewer: john

thanks a lot Tom.

to understand fully what you said, can you please tell

1. why do we need that where condition?

((c1 = :that_c1 or (c1 is null and :that_c1 is null)) AND
((c2 = :that_c2 or (c2 is null and :that_c2 is null)) AND ...
((cN = :that_cN or (cN is null and :that_cN is null))


2. another question: is there a situation where the rowid of a row could be
assigned to another row of the same table? (because this may cause more danger, as we
would end up updating another row)

thanks again.


Followup:
1) you want to make sure the row hasn't changed.

c1 = :that_c1 will only work if both the column and the bind are not null,
hence the rest is needed to make it so nulls compare "the same"

2) this is tied into 1 again. sure, we could delete from emp where empno =
7788; commit; and then re-insert that row and get the same rowid (rare, but
could happen).

But - if we inserted the same exact values and they got the same exact rowid --
then #1 would make this "safe". the values of the row are the same, for all
intents and purposes it IS the same row.

If the newly inserted row doesn't match column by column the values we expect,
well, then we'll reject it (won't lock it) and all will be well.

GOTO a page to Bookmark Review | Bottom | Top
update large table April 22, 2004
Reviewer: Prasad Chittori from US

I have a very large partitioned table with a DATE column; I would like to remove
the time portion from the date column.
I did the following and it is taking a lot of time and failing with "unable to
extend rollback segment".
Any better ideas without increasing the rollback segments or changing the query?

update t1
set dt = trunc(dt)


Followup:
parallel dml -- each pq slave can get its own RBS, letting you use them all at
the same time, not just one of them.
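
(A sketch of that -- the degree of 8 is only an example:)

alter session enable parallel dml;

update /*+ parallel(t1, 8) */ t1 set dt = trunc(dt);

commit;  -- required before this session reads t1 again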

GOTO a page to Bookmark Review | Bottom | Top
convert delete to insert April 26, 2004
Reviewer: marvin

Hi

We want to delete from several tables of several million rows each. The problem is, the
delete statements are quite simple, but if we want to do this faster we would insert
into a temp table, truncate the original and insert back the good data. We want to
convert the deletes into insert statements, but I think it's very hard.

For example

We have these DELETE statements

DELETE
FROM OM_ANNEX t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
/
COMMIT
/
DELETE
FROM OM_ANNEX t
WHERE EXISTS(SELECT 1 FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL)
/
COMMIT

to convert to that we did

INSERT /*+ APPEND */ INTO OM_ANNEX$TMP
SELECT t.*
FROM OM_ANNEX t
WHERE NOT EXISTS (SELECT 1
FROM LISTA_ORDENES l,
OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
UNION
SELECT t.*
FROM OM_ANNEX t
WHERE NOT EXISTS (SELECT 1
FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL);

This seems incorrect: if I run the queries inside the insert separately I get correct
results, but if we use UNION we get data which we don't want, and if we do it in two
inserts we get duplicate rows.

One approach is this

create a temp table using the delete statements but converted to querys for
example instead of

DELETE
FROM OM_ANNEX t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)

we convert to

select t.*
FROM OM_ANNEX t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)

then simply says

select * from om_annex a
where not exists (select null from temp_table b
where a.annexid = b.annexid)

annexid is PK

the other approach is this

select * from om_annex a
where annexid not in (select annexid
FROM OM_ANNEX t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
union
select annexid
FROM OM_ANNEX t
WHERE EXISTS(SELECT 1
FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL))

They are all very slow because the tables involved have several million rows!

Is there a better approach?


Followup:
why not:

create table new as select rows to keep;
drop table old;
rename new to old;

do the first in parallel, with nologging.
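
(In outline, using your OM_ANNEX table -- the where clause keeps the rows you would
NOT have deleted:)

create table om_annex_new parallel 8 nologging
as
select t.*
  from om_annex t
 where <conditions identifying the rows to keep>;

drop table om_annex;
rename om_annex_new to om_annex;
-- then recreate indexes, grants and constraints on om_annex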

GOTO a page to Bookmark Review | Bottom | Top
err the problem is convert the DELETE to CTAS April 26, 2004
Reviewer: A reader

Hi

create table new as select rows to keep;
drop table old;
rename new to old;
do the first in parallel, with nologging.

That is exactly what I want to do. The problem is that until now we have always done it the
other way round, using plain DELETE (and it takes a week to delete everything!), and
I am not sure how to convert the DELETE to a CTAS. If I want to do the
reverse of the DELETE statements (some tables have 5 DELETE statements!), is it as
simple as writing the DELETE the other way round? For example, how would you change


delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)

delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)

Would you change it to

insert into tmp_x
select *
from tab1
where not exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)


insert into tmp_x
select *
from tab1
where not exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)

Is simple as this?


Followup:
if i had this:

delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)

delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)

I would probably have this:

create table tab1_new
as
select tab1.*
from tab1, tab2, tab3
where tab1.id = tab2.fid(+)
and tab2.id = tab3.id(+)
and NOT ( tab2.fid is not null and tab3.id is not null )
and NOT ( tab2.fid is not null and tab2.id is null )
/

outer join the three tables. Negate the conditions for the where exists.

that is, after outer joining tab1 to tab2, tab3 -- remove the rows

where tab2.fid is not null and tab3.id is not null -- that is subquery one in
your deletes above.

where tab2.fid is not null and tab2.id is null -- that is subquery two in your
deletes above.


GOTO a page to Bookmark Review | Bottom | Top
thank you very much for the outer join tip April 27, 2004
Reviewer: marvin

Hi

I am going to have a look at how to apply the outer join in order to convert the DELETE
to CTAS.

I have a further question, I have another table which undergoes 4 DELETEs


DELETE
FROM SW_PERSON t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES o
WHERE o.order_id=t.swobjectid AND t.swtype='ORDER')
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype='ORDER' AND t.swobjectid IS NULL AND COMP_INST_ID IS NULL
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype IS NULL
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype='ORDER'
AND t.swobjectid IS NULL
AND EXISTS
(SELECT 1
FROM OM_COMPANY_INST c, LISTA_ORDENES l
WHERE c.COMP_INST_ID=t.COMP_INST_ID
AND l.order_id=c.order_id)
/
COMMIT
/


I need to convert this to a CTAS as well; however, I am not sure if this can be
done in a single statement. These DELETEs, for example, can't be converted into one
as follows, right? (because of the commits between them)

DELETE
FROM SW_PERSON t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES o
WHERE o.order_id=t.swobjectid
AND t.swtype='ORDER')
OR (t.swtype='ORDER'
AND t.swobjectid IS NULL
AND COMP_INST_ID IS NULL)
OR t.swtype IS NULL
OR (t.swtype='ORDER'
AND t.swobjectid IS NULL
AND EXISTS (SELECT 1
FROM OM_COMPANY_INST c,
LISTA_ORDENES l
WHERE c.COMP_INST_ID=t.COMP_INST_ID
AND l.order_id=c.order_id));

Can this use the outer join tip as well?

TIA


Followup:
why have any commits in between.

but of course -- any four deletes against a single table can (and if you ask me,
should) be done as a single delete.


the outer join was used in the CTAS, not in a delete.

GOTO a page to Bookmark Review | Bottom | Top
why do you use outer join April 27, 2004
Reviewer: A reader

hi

Why is outer join needed for tab1, tab2 and tab3 :-?

create table tab1_new
as
select tab1.*
from tab1, tab2, tab3
where tab1.id = tab2.fid(+)
and tab2.id = tab3.id(+)
and NOT ( tab2.fid is not null and tab3.id is not null )
and NOT ( tab2.fid is not null and tab2.id is null )
/


Followup:


because we wanted to keep all rows in tab1 -- if there is NO mate in tab2/tab3

since they deleted "where exists in tab2/tab3":

delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)

delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)

rows in tab2/tab3, we need to ALWAYS keep rows that are NOT in tab2/tab3 using
CTAS. outer join is mandatory for that in this example.

GOTO a page to Bookmark Review | Bottom | Top
to marvin April 27, 2004
Reviewer: A reader

hi marvin, you can try this


select * from tab1
where PK not in (select PK
from tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
union
select PK
from tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null))


GOTO a page to Bookmark Review | Bottom | Top
regarding the conditions April 27, 2004
Reviewer: A reader

Hi Tom

Can you shed some light on why

NOT ( tab2.fid is not null and tab3.id is not null )
is same as
exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)

and
NOT ( tab2.fid is not null and tab2.id is null )
is same as
exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)

I can't see why. Thank you


Followup:
it isn't the same.

it is in fact the opposite.


if you outer join T1 to T2 to T3

and before you were looking for (where exists)

a) a match in T2 (tab1.id = tab2.fid)
b) a match in T3 for that T2 (tabe2.id = tab3.id)

then you are saying "if I outer join T1 to T2 to T3, that row would be such
that:

a) tab2.fid is NOT NULL (we found a mate)
b) tab3.id is NOT NULL (we found a mate in t3 for t2)

with the where exists -- we would have deleted that row, hence with the CTAS
(which is finding rows to keep) we simply NEGATE that (with NOT).

Therefore we would keep that row, IF that was "not" satisfied.


Same logic for the second part.

the second where exists says

delete the row if

a) there is a match in T2 (where tab2.fid = tab1.id)
b) the id column in t2 for that match is NULL

that would be (in an outer join)

(tab2.fid is not null) -- we joined to a row
(tab2.id is null) -- and that row's id is null


negate it and keep it.

GOTO a page to Bookmark Review | Bottom | Top
Updating Table having millions of records taking lot of Time.. April 28, 2004
Reviewer: Anand Pandey from INDIA

Hi Tom,
I had a table having millions of records in which two of its columns are NULL.
I just tried to update the null columns with the data from another table, which is
taking around 10-12 hrs for a single day's records, and I have to update it
for 31 days.
Please help me in getting high performance on this update.

Query used is:

UPDATE /*+nologging parallel(4) */ MASTER_tab A SET
(null_col1,null_col2)=(SELECT /*+PARALLEL(4) */ MIN(C1),MIN(C2)
FROM Table2 B WHERE SUBSTR (A.col1_T1, 1, LENGTH (B.C1)) = B.C1
AND SUBSTR(A.col2_T1,1,3)=B.C2)
WHERE c_date='01-Jan-04'

Thanks and Regards,
Anand Pandey


GOTO a page to Bookmark Review | Bottom | Top
Nologging - how does it impact recovery? April 30, 2004
Reviewer: Naresh from Cyprus

Hi Tom,

This is a great chain of discussion. I especially liked the "outer join to
replace the not exists". I am really looking forward to my copy of your first
book that I ordered recently (on its way from amazon).

One question regarding making the table nologging: Does it not have implications
for recovery? What am I missing?


Followup:
you need to schedule a hot backup if you use non-logged operations, yes.

GOTO a page to Bookmark Review | Bottom | Top
db sequential waits on UPDATE May 14, 2004
Reviewer: A reader

Dear Tom,

Due to some processing complexities I am unable to use a single SQL for
Update/Insert.

I use:
LOOP
1. Bulk select (500 rows at a time) from Table A, C with rowids from C
2. Bulk insert
3. Bulk update table C
END LOOP

I am getting a very high number of "db file sequential read" waits on the update
part. Please see relevant portions of Trace File:
********************************************************************************
SELECT /*+ FULL(v) FULL(c) PARALLEL(v,4) PARALLEL(c,4) HASH(v,c) */
c.rowid,
...
FROM vp v,citi c
WHERE v.idno= c.idno

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.07 10 234 0 0
Execute 1 0.32 3.38 3023 156 3 0
Fetch 155 1.96 82.82 0 0 0 77500
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 157 2.32 86.28 3033 390 3 77500

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 4 0.01 0.03
rdbms ipc reply 26 0.00 0.00
enqueue 2 0.02 0.02
PX Deq: Join ACK 7 0.00 0.00
PX Deq: Parse Reply 4 0.23 0.35
PX Deq: Execute Reply 717 2.06 735.45
PX qref latch 4 1.02 2.75
PX Deq: Table Q Normal 5218 0.35 23.95
********************************************************************************

INSERT /*+ APPEND PARALLEL(A,4) */ INTO t_v A
VALUES(:b1,
:b2)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 2 11 0 0
Execute 155 0.32 8.00 552 1116 5567 77500
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 156 0.32 8.02 554 1127 5567 77500

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 552 0.36 6.43
free buffer waits 6 1.02 1.23
********************************************************************************

UPDATE /* PARALLEL(c,4) */ citi c
SET ...
WHERE c.rowid = :b6

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 154 35.26 586.87 75251 77036 79595 77000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 155 35.26 586.87 75251 77037 79595 77000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 75578 1.04 461.09
free buffer waits 242 1.03 96.11
********************************************************************************

Is it not true that if the row has already been selected (and is there in the
buffer cache) no disk IO should happen to update this row?

Please tell me a way to make this faster.

Regards


Followup:
insert /*+ append parallel... */ VALUES hmmmmmm

append = no go with values, only works with "as select"
parallel -- parallel on what exactly?


update /* parallel(c,4) */ hmmmmmm

in search of the fast = true switch. thankfully that hint is malformed and hence
ignored as well -- you are updating by rowid.

parallel is *not* fast=true.
append is not either really, especially with the values clause.

you can review the trace file itself (p1,p2,p3 will tell you file and block
info, along with blocks read). you can use that to verify that it is the very
act of reading the indexes that need to be updated and maintained that is
causing this. If they are not in the cache, well, we'll need to read them into
there.

GOTO a page to Bookmark Review | Bottom | Top
More Info May 16, 2004
Reviewer: A reader

Thanks Tom,
The hint in the update was there by a Developer, it has been rightly disabled.
The insert is not a bottleneck so didn't look into it.
This is a development machine, there are no other jobs running, asynchronous IO
is enabled, the machine is on RAID 0 (no fault tolerance - being a development
one). There are NO INDEXES on Tables being inserted and updated.

The segment on which the WAITS (db sequential read) are happening are that of
the TABLE that is being UPDATED.

Please guide me next.


Followup:
then you are seeing physical IO performed to read into the buffer cache the data
needed to be updated. If it is not cached, we'll have to read it.

GOTO a page to Bookmark Review | Bottom | Top
PARALLEL DML May 28, 2004
Reviewer: Sar from NJ, USA

Tom,

I need to update a table that has 50 million rows but
the number of rows affected are only 1 million rows.

I have a single update statement to do this.

UPDATE acc_proc
SET acc_seq_num = 0
WHERE acc_seq_num > 0
/

The column acc_seq_num is not part of an index.

Can you please suggest me if there is anything better
that I can do on this SQL to run it faster.

Thanks
Sar


Followup:
nope, that is perfect.


shouldn't take very long at all. unless you are getting blocked constantly by
other sessions.

GOTO a page to Bookmark Review | Bottom | Top
do i need to commit after execute immediate ? for dml or ddl ? June 02, 2004
Reviewer: A reader


http://download-east.oracle.com/docs/cd/A87861_01/NT817EE/index.htm


do i need to commit after execute immediate? for dml or ddl? or does it happen
automatically?

I tried to find the answer but could not find it in the doc from the above site; please help


Followup:
DDL is processed conceptually as:


begin
commit; -- commit anything you started, even if DDL fails, you are committed
do the ddl;
commit; -- commit the ddl
exception
when others
then
rollback; -- the ddl
raise;
end;
/


You need not commit DDL, it'll commit 2 times for you.

You need to commit DML as DML does not do that.


This is regardless of the method used to execute SQL, native dynamic sql,
dbms_sql, static sql, whatever -- the rules are the same in all cases.

GOTO a page to Bookmark Review | Bottom | Top
Thanks !! June 02, 2004
Reviewer: A reader


ok,

so i did
1.)

begin
begin
Execute immediate 'SET CONSTRAINTS ALL DEFERRED';
end;

begin
execute immediate 'delete from table ' || t;
end;

begin
execute immediate 'delete from table ' || t1;
end;

exception
when others then
rollback;
raise;
end;
/


I got a message that some constraint of t1 was violated (child record exists).
I did not issue a commit, post or rollback,
and did not perform any DDL, and the constraints are DEFERRED --
so what is the issue? Can you help?


Followup:
...
begin
execute immediate 'set constraints all deferred';
execute immediate 'delete from ' || t;
execute immediate 'delete from ' || t1;
execute immediate 'set constraints all immediate';
end;
/

that is all the code you want -- the "all immediate" will validate the constraints.

if they fail -- then the error (exception) will be propagated back to the client
and the statement (which is whatever the client sent over to the
database -- a plsql block, whatever) will be rolled back.



GOTO a page to Bookmark Review | Bottom | Top
June 02, 2004
Reviewer: A reader

...
begin
execute immediate 'set constraints all deferred';
execute immediate 'delete from ' || t; -- ***** this one
-- execute immediate 'delete from ' || t1;
-- execute immediate 'set constraints all immediate';
end;
/


the question is
as soon as it executes the
'delete from ' || t;
statement, it raises the referential constraint violation exception. Why?
as
1.) constraints all deferred
2.) it NOT a ddl
3.) I am not commiting(or ending the tx) manually


Followup:
the constraint must not be deferrable.

you have to have deferrable constraints in order to defer them. the default is
"not deferrable"

GOTO a page to Bookmark Review | Bottom | Top
7.3.4 - extracting 5 million records from 100 million record table June 15, 2004
Reviewer: A reader

Any suggestions on how to accomplish this on 7.3.4? No concept of nologging,
partitioning, Export/query clause etc in 7.3.4..

How I can extract about 5 million records (based on a condition) into a separate
database/server from a 100million record table? What's the most efficient way?
I'm having to make do with very limited disk/space on both machines.

Thanks.


Followup:
5 million is pretty small.

I'd just

create table as select (the concept of nologging did in fact exist -- unrecoverable);
export it
import it

or use the sqlplus copy command if they are connected via a nice network.
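
A rough sketch of that approach (table, link and condition are made up; on 7.3 the keyword
was UNRECOVERABLE rather than NOLOGGING):

create table extract_5m unrecoverable as
select * from big_table
 where created >= to_date('01-jan-2004','dd-mon-yyyy');

-- then exp/imp the small table, or pull it straight across the network from sqlplus
-- (run on the target database; SOURCE_DB is a made-up tns alias)
copy from scott/tiger@source_db -
create extract_5m -
using select * from big_table where created >= to_date('01-jan-2004','dd-mon-yyyy')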

GOTO a page to Bookmark Review | Bottom | Top
June 16, 2004
Reviewer: A reader

Thanks. Had some LONG columns in there, so 'Create table as select' wouldn't
work. I'm just ending up plsql looping and hitting smaller chunks of the mega
table, then creating smaller target tables.. then exp/imp.

I'm afraid I won't have sufficient temp space to do a sqlplus "Copy from". Also,
the documentation says Copy is not intended for Oracle to Oracle DBs. No
idea why.


Followup:
copy is ONLY for oracle to oracle db's....

GOTO a page to Bookmark Review | Bottom | Top
June 16, 2004
Reviewer: A reader

Per 7.3.4 documentation :

http://download-west.oracle.com/docs/pdf/A42562_1.pdf


Excerpt from it -

Note: In general, the COPY command was designed to be used
for copying data between Oracle and non-Oracle databases. You should use SQL
commands (CREATE TABLE AS and INSERT)
to copy data between Oracle databases.


Followup:
but if you think about it..... doesn't matter what the doc says, it takes two
connections. connections are only to oracle databases. sure you could be using
a gateway -- but even there, it would be true that create table as and insert
would work.

GOTO a page to Bookmark Review | Bottom | Top
Updating and Inserting 1 Million rows Daily with Bitmap indexes June 24, 2004
Reviewer: Sidda from Santa Clara,USA

Hi Tom,
Here we are facing a very big problem. We have a partitioned table with 180 million
records (with 70 columns and 10 bitmap, 10 B-tree indexes). End users need the bitmap
indexes 24/7 so we can't drop and recreate the table and indexes. Daily we have to
update and insert 1 million records. We tried bulk updates but in vain. What
is the best method to follow up?

Thanks in advance
Sidda


Followup:
describe "in vain", what went wrong?

it would be best to do a SINGLE insert into and a SINGLE Update against this
table (not even in bulk - just single statements)

GOTO a page to Bookmark Review | Bottom | Top
Creating Table with aggregated data from another table July 26, 2004
Reviewer: RB from NJ

Tom: Qn related to creating a table with data from another table -

I want to create a table with a few fields and aggregate a few columns from
another table.

Here is a query:

create table SRC200401 as select ID, src, LOC, sum(BYTES) TOTAL, sum
(billable_bytes) TOTAL_BILLABLE from SRC_MONTHLY where ID = 1072915200 group by
ID,LOC, SRC;

I have 128Million record in the SRC_MONTHLY table and it is taking more than 1
hr to get this table created. I have index on ID, src and LOC fields.

Any faster way of getting this table created?


Followup:
so how many records have that id?

GOTO a page to Bookmark Review | Bottom | Top
Great approach, but is it as fast for an IOT table? July 26, 2004
Reviewer: Peter Tran from Houston, TX USA

Hi Tom,

I'm trying the same approach with an IOT table. We have an IOT table
partitioned daily. I want to recreate this table with a monthly partition. I
do a CTAS parallel nologging using the new monthly partition, but it's SLOW.
Then again, the table does have 200 million rows. Is the "Index Organization"
part of the table the slow part?

Thanks,
-Peter


Followup:
what do the inputs/outputs look like (table structure) -- remember, you are
sorting 200,000,000 rows as well!

did you give it a nice juicy sort area size?

GOTO a page to Bookmark Review | Bottom | Top
Unfortunately no. July 26, 2004
Reviewer: Peter Tran from Houston, TX USA

"...nice juicy sort area size" That would be a negative. :( Man, this is
going to take forever isn't it? Ugh...

SQL> show parameter sort_area

NAME TYPE VALUE
------------------------------------ ----------- --------
sort_area_retained_size integer 8388608
sort_area_size integer 8388608

SQL> desc odfrc;
Name Null? Type
----------------------------------------------------- -------- ------------

ODIFID NOT NULL NUMBER(10)
TRPORGN NOT NULL VARCHAR2(5)
TRPDSTN NOT NULL VARCHAR2(5)
POSCOUNTRYCODE NOT NULL VARCHAR2(3)
PAXTYPE NOT NULL VARCHAR2(1)
DCP NOT NULL NUMBER(2)
ODIFDATE NOT NULL DATE
FRCDATE NOT NULL DATE
BKGMEAN NUMBER
BKGMEANINFLUENCED NUMBER
BKGVARIANCE NUMBER
XXLMEAN NUMBER
XXLMEANINFLUENCED NUMBER
XXLVARIANCE NUMBER

Here's my CTAS:

CREATE TABLE ODFRC_MONTHLY (
ODIFID,
TRPORGN,
TRPDSTN,
POSCOUNTRYCODE,
PAXTYPE,
DCP,
ODIFDATE,
FRCDATE,
BKGMEAN,
BKGMEANINFLUENCED,
BKGVARIANCE,
XXLMEAN,
XXLMEANINFLUENCED,
XXLVARIANCE,
CONSTRAINT ODFRC_MONTHLY_PK PRIMARY KEY
(ODIFID, ODIFDATE, TRPORGN, TRPDSTN,POSCOUNTRYCODE, PAXTYPE, DCP, FRCDATE)
) ORGANIZATION INDEX nologging parallel 8
PARTITION BY RANGE (ODIFDATE)
(PARTITION ODFRC_20021130 VALUES LESS THAN (TO_DATE('2002-12-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX01 ,
PARTITION ODFRC_20021231 VALUES LESS THAN (TO_DATE('2003-01-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX02 ,
PARTITION ODFRC_20030131 VALUES LESS THAN (TO_DATE('2003-02-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX03 ,
PARTITION ODFRC_20030228 VALUES LESS THAN (TO_DATE('2003-03-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX04 ,
PARTITION ODFRC_20030331 VALUES LESS THAN (TO_DATE('2003-04-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX05 ,
PARTITION ODFRC_20030430 VALUES LESS THAN (TO_DATE('2003-05-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX06 ,
PARTITION ODFRC_20030531 VALUES LESS THAN (TO_DATE('2003-06-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX07 ,
PARTITION ODFRC_20030630 VALUES LESS THAN (TO_DATE('2003-07-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX08 ,
PARTITION ODFRC_20030731 VALUES LESS THAN (TO_DATE('2003-08-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX09 ,
PARTITION ODFRC_20030831 VALUES LESS THAN (TO_DATE('2003-09-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX10 ,
PARTITION ODFRC_20030930 VALUES LESS THAN (TO_DATE('2003-10-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX11 ,
PARTITION ODFRC_20031031 VALUES LESS THAN (TO_DATE('2003-11-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX12 ,
PARTITION ODFRC_20031131 VALUES LESS THAN (TO_DATE('2003-12-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX13) AS SELECT
ODIFID,
TRPORGN,
TRPDSTN,
POSCOUNTRYCODE,
PAXTYPE,
DCP,
ODIFDATE,
FRCDATE,
BKGMEAN,
BKGMEANINFLUENCED,
BKGVARIANCE,
XXLMEAN,
XXLMEANINFLUENCED,
XXLVARIANCE
FROM odfrc nologging;

The original ODFRC table has daily partition for one year. Anyway, I can
estimate how long this will take?

-Peter


Followup:
the nologging is in the wrong place

from odfrc nologging;

is the same thing as:

from odfrc a;

would you consider

a) create the table (empty, alter it to be nologging)
b) insert /*+ append */ into the individual partitions (instead of all at once)
in parallel sessions?
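
For illustration, something along these lines (a sketch only; it assumes ODFRC_MONTHLY was
created empty with the CREATE above minus the AS SELECT, and the partition names are just
examples):

alter table odfrc_monthly nologging;

-- then, in each of N separate sqlplus sessions, load a different slice, e.g.:
insert /*+ append */ into odfrc_monthly partition ( ODFRC_20021130 )
select *
  from odfrc partition ( ODFRC_20021114 );   -- one source (daily) partition per statement
commit;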

GOTO a page to Bookmark Review | Bottom | Top
Create Table with data from an aggregated sum of few fields from another table July 26, 2004
Reviewer: RB from NJ

Tom: Qn related to creating a table with data from another table -

I want to create a table with a few fields and aggregate a few columns from
another table.

Here is a query:

create table SRC200401 as select ID, src, LOC, sum(BYTES) TOTAL, sum
(billable_bytes) TOTAL_BILLABLE from SRC_MONTHLY where ID = 1072915200 group by
ID,LOC, SRC;

I have 128Million record in the SRC_MONTHLY table and it is taking more than 1
hr to get this table created. I have index on ID, src and LOC fields.

Any faster way of getting this table created?


Followup:
so how many records have that id?

RB: Tom - This number varies - we have many IDs in the master table. If I
pass one id then the query will have an equality predicate on that ID; if more than
one, I was planning to use an IN clause. So I do not know how many records per id
will be in the table at any given point in time.



Followup:
now I'm confused -- the predicate is variant? why create *that* table then?

what is the goal here -- to make queries of the form:

select ID, src, LOC, sum(BYTES) TOTAL, sum
(billable_bytes) TOTAL_BILLABLE from SRC_MONTHLY where ID = :x group by id, loc,
src;

faster in general? (you don't have to duplicate lots of text, it is all right
here)

GOTO a page to Bookmark Review | Bottom | Top
Great suggestion! July 26, 2004
Reviewer: Peter Tran from Houston, TX USA

Hi Tom,

Thanks for the useful suggestion.

1) Part one done (easy).
2) Is this what you mean for part 2? When you say parallel sessions, do you
mean kick off a bunch of them using execute immediate?

INSERT /*+ append */ INTO odfrc_monthly(
ODIFID,TRPORGN,TRPDSTN,POSCOUNTRYCODE,
PAXTYPE,DCP,ODIFDATE,FRCDATE,BKGMEAN,
BKGMEANINFLUENCED,BKGVARIANCE,XXLMEAN,
XXLMEANINFLUENCED,XXLVARIANCE)
SELECT
ODIFID,TRPORGN,TRPDSTN,POSCOUNTRYCODE,
PAXTYPE,DCP,ODIFDATE,FRCDATE,BKGMEAN,
BKGMEANINFLUENCED,BKGVARIANCE,XXLMEAN,
XXLMEANINFLUENCED,XXLVARIANCE
FROM odfrc partition(ODFRC_20021114) nologging;

3) Should I still give it a nice juicy sort_area_size? :)

Thanks,
-Peter


Followup:
2) fire up N sqlplus sessions and run the insert append in each

3) sure, let each session have a big SAS -- alter session can be used.

GOTO a page to Bookmark Review | Bottom | Top
Create Table with data from an aggregated sum of few fields from another table July 26, 2004
Reviewer: RB from NJ

Tom - The user can select one or more ids. If I have more than one ID then I
was planning to use an IN clause in the where clause. The temp table that I am
creating will be used in a later phase of the app for other joins. What I am
looking for is a solution which will be much faster than my current approach. The query
that I have given against a 120M-row table is taking more than 1 hr to create the
aggregated table.


Followup:
what is the query plan in general then.

you would not create a temporary table in oracle -- that would be so sqlserver.
just use that query in the "IN" statement in the first place!!!!!

GOTO a page to Bookmark Review | Bottom | Top
Update... July 27, 2004
Reviewer: Peter Tran from Houston, TX USA

Hi Tom,

I wanted to give you an update on progress. The nice thing about your approach
is I can monitor the progress, but it's not as fast as I thought it would be.

I started 6 sessions and gave each session a sort_area_size = 200M. I then
executed a month's worth of inserts in each session. It's taking about 30 minutes
per partition/day to insert from the source daily table to the monthly table.
Each partition holds around 700K to 800K rows.

Should I expect to run this long?

Thanks,
-Peter



Followup:
sounds long -- can you run one of the sessions with a 10046 level 12 trace and
see what they might be waiting on?

GOTO a page to Bookmark Review | Bottom | Top
sort_area_size, pga_aggregate_target, insert /*+ APPEND*/ July 28, 2004
Reviewer: Dusan Valasek from Czech republic

Hi Tom,

I have two questions about setting of big sort_area_size before direct path
load.
1) How does setting of sort_area_size (from session) affect direct path load? I
thought that this is useful for sort operation and building of indexes.
2) If I use pga_aggregate_target, does setting of sort_area_size (from session)
override pga_aggregate_target?

Thanks,

Dushan


Followup:
1) you are loading an IOT. big sort going on.

2) you would set the workarea policy to manual (in your session) to allow you to
override the sort area size for this "batch operation"
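
A sketch of that session-level override (the sizes are arbitrary):

alter session set workarea_size_policy = manual;
alter session set sort_area_size = 209715200;             -- 200 MB, for example
alter session set sort_area_retained_size = 209715200;

-- now run the big insert /*+ append */ ... select in this session;
-- the IOT sort uses the manual sort area instead of pga_aggregate_target.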

GOTO a page to Bookmark Review | Bottom | Top
July 28, 2004
Reviewer: A reader

Would you suggest re-creating the table while other users want to update it
online?


Followup:
huh?

GOTO a page to Bookmark Review | Bottom | Top
How to change a DATATYPE of column. July 30, 2004
Reviewer: peru

Hi Tom,
How do I change the datatype of a particular column --
say a table with a VARCHAR2 column holding dates,
and now I want to change the datatype to DATE?
Assume that the table has many records and is referenced by
tables, procedures and triggers.
Thank you


Followup:
not really going to happen.

you can add a new column that is a date, update this new column to be to_date()
of the varchar2 and set the varchar2 to null, drop the old column and rename the
newly added column to the old name.
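
A minimal sketch of that sequence (table/column names and the date format are made up;
RENAME COLUMN needs 9iR2 or later -- on older releases you would give the new column its
final name up front instead):

alter table t add ( hiredate_dt date );

update t
   set hiredate_dt = to_date(hiredate, 'yyyy-mm-dd'),
       hiredate = null;

alter table t drop column hiredate;
alter table t rename column hiredate_dt to hiredate;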

GOTO a page to Bookmark Review | Bottom | Top
Update to July 27, 2004 August 22, 2004
Reviewer: Peter Tran from Houston, TX USA

Hi Tom,

Sorry it took awhile to get back with you on this. You wanted to see a 10046
level 12 trace. I wanted to do some research first and I had to rebuild the
table to reproduce the step.

Here's the trace.

insert /*+ append */ into odfrc_monthly select * from odfrc
partition(ODFRC_20040201)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 15.08 5776.28 11657 15874 111333 569939
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 15.09 5776.29 11657 15874 111333 569939

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 34

Rows Row Source Operation
------- ---------------------------------------------------
1 LOAD AS SELECT
569939 INDEX FAST FULL SCAN ODFRC_PK PARTITION: 185 185 (object id 12351)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
control file sequential read 8 0.00 0.00
enqueue 1911 3.00 5726.43
db file sequential read 38 0.03 0.20
db file scattered read 380 0.63 9.50
direct path write 19 0.00 0.00
direct path read 10 0.00 0.00
log buffer space 45 0.61 12.01
log file switch completion 2 0.28 0.33
log file sync 1 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00


Jonathan Lewis wrote in his book that you cannot use direct-path insert using
the APPEND hint into an IOT table. Can this be the reason for the large enqueue
timed event? Are the 6 sessions waiting to lock the index to modify it?

Thanks,
-Peter


Followup:
Umm, you were blocked by someone else for an excessively long period of time
here. Enqueue waits:

enqueue 1911 3.00 5726.43


normal "blocker, blockee" problem here! doh, it was me doing it to you.

only one session at a time can append. only one session at a time can direct
path insert into a table.

sorry -- use normal insert, I goofed.

GOTO a page to Bookmark Review | Bottom | Top
delete 2M records without dropping table. August 23, 2004
Reviewer: Sean from NJ, USA

Hi Tom,

I have to delete 2M records from 6M records and the table has about 400 columns
(Oracle 920, Solaris 9).

I understand your suggestion of creating a new table with the records I
need, then dropping the original table and renaming the new table.

But since our table is not that big and the application is using this table all
the time, we are trying to use the traditional delete method to accomplish this.

(1) If we have enough redo tablespace, what is a good number of rows for each
delete? I tried committing every 10000 records or every 1000 records. Both are quite
slow.
(2)Anything you can suggest to improve the speed of the delete in the following
procedure.

Thanks so much for your help

------------------------------------------------------------
create or replace procedure sp_del_audit(p_total_loop_time number)
is

total_loop_time number := 1;

BEGIN
WHILE total_loop_time <= p_total_loop_time
LOOP
delete /*+ parallel */ from audit a
where rownum < 10000
and exists ( select /*+ parallel */ null from audit_temp b
where a.Pkey = b.Pkey and
a.row_type = b.row_type and
a.update_time = b.update_time);

commit;

total_loop_time := total_loop_time + 1;

END LOOP;
END;
/
------------------------------------------------------------



Followup:
1) assuming you meant UNDO tablespace, there is no such thing as a REDO
tablespace...

removing 2 million rows, with 400 columns (every row in this table is CHAINED,
when you have more than 255 columns...) and probably many indexes -- it is not
going to be what you might term "speedy".

2) do it as a SINGLE delete (deleting 10,000 records in "parallel" doesn't even
begin to make sense here. PQ is for BIG BIG BIG things. this is tiny)

delete from audit
where (pkey, row_type, update_time) in ( select pkey, row_type, update_time
from audit_temp);

nothing more, nothing less. but don't expect this to be super fast if this
table is indexed. removing 33% of the index entries is going to take some
amount of time.

GOTO a page to Bookmark Review | Bottom | Top
Deleting 90% of rows October 13, 2004
Reviewer: A reader

As part of some ETL processing, I need to delete 90% of the rows in a large
(6million rows) table. This table has a PK index so traditional delete takes a
long time. I see the following options

1. Mark the index unusable, delete and rebuild index nologging. This is
significantly faster than plain delete
2. create new table as remaining 10% of the rows, drop, rename, etc
3. Precreate a regular or GTT and do insert /*+ append */ in my batch program.

(2) is not very attractive as I dont want to be doing DDL in my code.

How would you compare (1) and (3) above? Both of them would generate similar
amounts of undo/redo, take roughly same amount of time?

Thanks


Followup:
ETL is 'special' -- it isn't something that runs 5 times a minute.

I'd go with #2 actually, CTAS a newone, drop oldone, rename newone to oldone.

#3 is "dangerous" -- you could lose everything in the event of a failure.

GOTO a page to Bookmark Review | Bottom | Top
October 14, 2004
Reviewer: A reader

"#3 is "dangerous" -- you could lose everything in the event of a failure"

Why would I lose everything? What kind of failure? Instance or media failure?
Oracle guarantees recoverability of committed transactions, right, so why do you
bring that up here?

The only reason I don't want to do (2) is that I usually want to avoid DDL in
my code. The only difference between (2) and (3) is that the table is already
created in (3), right? Yes, (3) would generate much more redo/undo (less if I
create the table as nologging), but I thought using a GTT would cut down on the
redo/undo as well?


Followup:
i saw "use a gtt"

i could only assume you mean:

insert rows to keep into gtt
truncate table
insert rows from gtt into the table

that would be dangerous. if you fail after the truncate, before/during the
insert -- gone, all gone. gtt would be dangerous here.

(truncate is ddl btw...)

GOTO a page to Bookmark Review | Bottom | Top
How can you Display multiple rows in one record October 14, 2004
Reviewer: Mack from USA

Hi Tom,

Let's suppose in deptno 10, there are 3 to 5 employees, I want to see the emp
names like
TOM, KIM, JOHN... and so on.

Is there an easy way to do it in SQL? Not through PL/SQL. The number of records
is unknown; it could be 10, 20 or one hundred. Please advise.


Followup:
collect() in 10g

stragg (search for it -- nasty plsql you install once and use over and over and
over and over and over) in pre 10g

GOTO a page to Bookmark Review | Bottom | Top
Query first and then update October 15, 2004
Reviewer: A reader

We have two tables with approximately 28 million and 35 million records
respectively. These tables are joined to produce data to be displayed to users
using IE browsers. Based on this information, around 10000 records get added to
these tables and around the same updated daily. Our SLA is to display each
screenful (100 rows) in 2-4 seconds.

While partitioning is being reviewed to improve the performance for the queries,
could you let us know if there are any issues regarding partitions? For instance
someone has reported that using global indexes on a partitioned table has
degraded the performance.


Followup:
i seriously doubt partitioning is going to be used to increase the performance
of these queries.

partitioning -- great for speeding up a full scan. are you suggesting to full
scan and still return in 2-4 seconds?


"someone reported using global indexes ... degraded performance"

I just about fell out of my chair on that one. geez.


If you have my book "Effective Oracle by Design" -- I go into the "physics
behind partitioning".

In order to return the first 100 rows to a web based application -- you are
going to be using indexes (or you are not going to be doing what you signed up
to do -- funny, you have an SLA in place but no idea if you can live up to
it...). whether the tables are partitioned or not probably won't have any
bearing on making this faster. Partitioning could certainly make things slower
if you do it wrong, but in a system like this, you would be looking at
partitioning to make administration easier and hopefully not NEGATIVELY impact
performance.

given two tables...
index access
get 100 rows

I personally would be shooting for well under 1 second response times for
everything -- regardless of whether there was 1 row or 1 billion.


don't get the tie into "query first and then update" though.

GOTO a page to Bookmark Review | Bottom | Top
October 18, 2004
Reviewer: A reader

"Our SLA is to display each *screenful* (100 rows) in 2-4 seconds."

In our case the queries, after joining 2 tables of 28 million and 35 million
records, could return 10 screens or 600 screens or many more screens based on
the query parameters. Each screenful (with 100 records in each screen) should
appear in 2-4 seconds.


Followup:
so, you've signed up for an SLA you have no idea if you can meet.


but hey -- using indexes to retrieve 100 rows from 1, 1000, 10000000000000
should be about the same amount of time (and way under subsecond).


but -- getting to "screen 600" is not. Look to google as the gold standard for
searching and web pagination

o totally estimate the number of returned rows -- don't even THINK about giving
an accurate count

o don't give them the ability to go to page "600", pages 1-10 is more than
sufficient

o even if there is a page 600 -- realize it doesn't make sense to go there (no
human could know "what I need is on page 600 -- 6,000 rows into this result
set). Google stops you at page 99

o understand that page 2 takes more time to retrieve than page 1, page 50 more
than 2, and so on (as you page through google --each page takes longer)


But perhaps most importantly -- laugh at people that say things like:

"For instance
someone has reported that using global indexes on a partitioned table has
degraded the performance."


and say -- "yes, it has when improperly applied it can degrade performance,
there have also been sightings of systems where it didn't affect performance at
all, there have also been reported instances of them massively improving
performance. Now what we have to do is understand how the feature works, what
it does, and how it might apply to our problem"

The neat thing about that paragraph -- it is INFINITELY reusable. You can use
it with regards to any feature!


(if you have Effective Oracle by Design -- I go into the "physics" of
partitioning and how -- without the judicious use of global indexes -- your system
could fall apart and run really really slow as well)



GOTO a page to Bookmark Review | Bottom | Top
Senior Statistical Analyst October 18, 2004
Reviewer: Hank Freeman from Atlanta, GA

18 October 2004

Tom,

Thanks for the lob_replace code !!!
It worked wonders when trying to fix about 25,000 clobs with a known error.
Here my detailed discussion in outline and then in detail.

1. What type of data was in the CLOB
2. What went wrong to create the error
3. What was done to correct it.
a. Stored Proc
b. Declare syntax
c. VBA code in Excel


1. What type of data was in the CLOB. The company has about 45,000
internet/intranet web pages stored in an Oracle database table field which is a
CLOB. Meaning instead of calling the website from a file server or physical
location, the entire website -- web pages / html source code -- is held in the
table in this CLOB field.
2. What went wrong to create the error. An error occurred when these records
were being modified to remove a specific piece of information and a replacement
null character was inserted. The null character for some unknown reason did not
work, and the web-page information in the CLOB field got garbage appended to the
end of the web-page after the closing </html> tag.
a. It was easy to spot the records, but a conventional update would not
work because the web-pages were so large, because of the special characters
found within them, and because of the unique nature of working with CLOBs.
b. I contacted a friend who pointed me to your site and URL:
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:1533006062995
c. Upon examination of the last changes you made for a previous person, the
solution came into focus clear on how to update the records and, it had to be
done with a program.
3. What was done to correct the problem.
a. The last stored procedure you listed on your website was used, because it
was applicable to our situation. This procedure allowed the creation of a
specific version of the Declare sql to select the bad part of the CLOB record
and replace it with a “ ” (space character) and not a null.
i. The Stored Proc syntax, is not listed since it is the Tom’s “lob_replace”
syntax verbatim.
ii. The Declare statement is listed below would take the unique record id and
the starting location of the bad information in the CLOB. The sql would then
pass the CLOB, the data to find, and the replacing string (a space) to the
stored procedure.


Declare Sql syntax

declare
l_lob clob;
l_substr varchar2(32767);
begin
select KS_DESCRIPTION into l_lob
from ks_longdescription
WHERE KS_LONGDESCRIPTIONID = 55012 for update;
select SUBSTR(KS_DESCRIPTION, 876, length(KS_DESCRIPTION) - 876) into
l_substr
from ks_longdescription
WHERE KS_LONGDESCRIPTIONID = 55012;
hf_clob_replace (l_lob, l_substr, ' ');
end;


iii. It was decided to use Excel VBA to control the update process. It was
contemplated to write a second procedure for the declare statement and just
execute it with parameters. However, as you noticed, the maximum value for
the l_substr Varchar2 field was declared as 32767. It was already known that
some of the erroneous parts of the CLOB data were larger than 25,000 characters.
Therefore, Excel was used to ascertain which records were going to be within
this limit; those outside a 25,000 character limit would require a manual
fix.
iv. Approximately 6,000 records were loaded into an Excel worksheet and the
following Macro code was written to process/update the records. This was the
optimum method of performing the cleanup because the VBA code was written to
decide:
1. Length of clob
2. Starting location of the junk in the clob
3. The unique record id was already known and, with the aforementioned
starting location being determined by the VBA code, it was a simple task to just
pass these parameters into the VBA Declare SQL string and create a specific run
for each qualifying record.
4. The VBA code would then execute the Declare SQL string, firing up the
stored procedure hf_lob_replace, thus cleaning up the record.
5. If a record could be processed it was; otherwise it was marked for manual
cleanup.
6. During the process it was determined that any l_Substr value greater than
20,000 would cause an Oracle error on the buffer size. Therefore, upon seeing that
the error would occur often, any l_Substr (p_what) value greater than 20,000 was
immediately marked as a "SQL Buffer Failure", meaning it would have to be
manually updated. The important part here was that the process would not stop
if it did not like a specific record; it just kept going.

In summary, it took less than ONE second to fix a record, with auto-commit
turned on, and we knew it was working because the process could be watched. Within
four hours (6,000 per hour) using two systems, the 25,000 records were
corrected with only 150 needing a manual fix -- a highly acceptable result.


In closing, I would like to give credit to the following.
- Mr. Andy Sword, who is/was a subscriber to Tom's web page and submitted to
me the first draft of the Declare statement we used. This was what we really
needed to know !!! I had never worked with CLOBs, so this was a real lesson.
- Tom's web page:
http://asktom.oracle.com/pls/ask/f?p=4950:1:
- Tom Reed:
http://the-big-o.port5.com/index.shtml
where some of the best Excel/VBA code for interfacing with
Oracle can be found.
- I have listed the VBA code below for those who are so inclined to read it.
There is nothing there that is private and confidential. So anyone who can learn
from it may feel free to do so.


Respectfully

Hank Freeman
FCS Ltd
hfreeman@msn.com


GOTO a page to Bookmark Review | Bottom | Top
updating millions of records October 26, 2004
Reviewer: Josphat from KENYA

Your approach is quite ok for a one-line update, but how about a procedure which
updates the next row based on the value of the previous row?
Kindly look at the following scripts and give hard suggestions on how 5 million
rows could be processed in under 10 hours... capturing the business logic
----starting here
---3 Scripts to process over 5 million rows
---and constitute approx 40000 distinct identifiers (id)
---should exec as ordered

drop table temporary_emphist;


create table temporary_emphist
(ID VARCHAR2(15),PAYDATE DATE, AMOUNT NUMBER(10,2),
PPART NUMBER(13,2), IPART NUMBER(13,2), INTRATE NUMBER(5,2), OBAL
NUMBER(13,2),
REF NUMBER(7), NOOFMONTHS NUMBER(13,2), ACCRUED_INT NUMBER(15,2))
/

-----values(ID ,PAYDATE,AMOUNT , PPART, IPART, INTRATE, OBAL, REF, NOOFMONTHS,
ACCRUED_INT)
-----data for different ids being inserted (one id might have as many as 300
entries)
-----but i have put just a few for each id
----set for customer id 10979844

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE, OBAL)
values('10979844','30-jun-2003',2000,2,112000);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-jul-2003',2000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-aug-2003',2000,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-dec-2003',2000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-jan-2004',1950,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-jun-2004',5000,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-aug-2004',4000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10979844','30-oct-2004',3000,2);

----set for customer id 10779844

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE, OBAL)
values('10779844','30-jun-2003',2000,2,99380);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-jul-2003',2000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-aug-2003',2000,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-dec-2003',2000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-jan-2004',1950,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-jun-2004',5000,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-aug-2004',4000,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10779844','30-oct-2004',3000,2);

----set for customer id 10679844
insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE, OBAL)
values('10679844','30-jun-2003',800,4,40390);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-jul-2003',1500,4);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-aug-2003',1600,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-dec-2003',2000,4);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-jan-2004',1950,4);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-mar-2004',1950,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-jul-2004',1000,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-aug-2004',1000,4);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-sep-2004',1992,4);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE)
values('10679844','30-oct-2004',1200,4);


commit;
set lines 1000
select * from temporary_emphist
order by paydate,id;
create index
temporary_emphistid on temporary_emphist(id);
create index temporary_emphistpd on temporary_emphist(paydate);

---script 1 (will number transactions for each customer from 1 to ..n then start
from 1 for the next customer
DECLARE -----Procedure 19
V_2 NUMBER(7) :=1;
V_1 VARCHAR2(18);
v_3 VARCHAR2(18);
v_4 VARCHAR2(18) :='T';
v_5 date;
CURSOR DRECS IS
SELECT ROWID,id,paydate
FROM temporary_emphist
ORDER BY ID,paydate;
BEGIN
OPEN DRECS;
LOOP
FETCH DRECS INTO V_1,v_3,v_5;
EXIT WHEN DRECS%NOTFOUND;

if v_3 <> v_4 then
v_4 := v_3;
v_2 := 1;
end if;
UPDATE temporary_emphist
SET REF = V_2
WHERE ROWID = V_1;
V_2 := V_2+1;
END LOOP;
commit;
CLOSE DRECS;
END;
/
select * from temporary_emphist
order by id,paydate;

-----script 2

----gets the difference in months from one transaction to the next

DECLARE -----Procedure 20
V_1 NUMBER(7) :=0;
V_2 NUMBER(15,2):=0;
V_3 NUMBER(15,2) :=0;
v_4 NUMBER(15,2) :=0;
V_5 NUMBER(15,2) :=0;
V_6 NUMBER(15,2) :=0;
V_7 varchar2(15) :='0';
V_8 NUMBER(15,2) :=0;
v_9 number(15,2);
v10 number(35,10) := ROUND(1/12,10);
V11 number(15,2) ;
V12 DATE;
V14 DATE;
V13 NUMBER(13,2);
V15 number(5);

CURSOR UPLO IS
SELECT REF,OBAL,amount,ID,INTRATE,last_day(PAYDATE) FROM temporary_emphist
where nvl(ref,0)>1
ORDER BY ID, REF;

BEGIN
OPEN UPLO;
LOOP
FETCH UPLO INTO V_1,V_2,V_3,V_7,V11,V12;
EXIT WHEN UPLO%NOTFOUND;
---V13 :=V12;
uPDATE temporary_emphist
SET NOOFMONTHS=mONTHS_BETWEEN(v12,last_day(paydate))
WHERE NVL(REF,0) = V_1-1 and id = V_7 ;


END LOOP;

commit;
CLOSE UPLO;
END;
/
select * from temporary_emphist
order by id,paydate;

---script 3
-------------------------------------------puts interest per month
-------------------------------------------CALCULATES INTEREST ON OUTSTANDING
BALANCE
---as per the numbering above(script 1) value of rows for customer
----will be manipulated to get values for the subsequent row though some input
from the
----current row will be integrated with derivative of the previous to populate
values into the
----current row. This is applied for each customer untill all are cleared


SELECT 'ESTABLISHING BALANCES ON REDUCING BASIS-----Procedure 21 STARTS ' FROM
DUAL;
DECLARE -----Procedure 21
V_1 NUMBER(7) :=0;
V_2 NUMBER(15,2):=0;
V_3 NUMBER(15,2) :=0;
v_4 NUMBER(15,2) :=0;
V_5 NUMBER(15,2) :=0;
V_6 NUMBER(15,2) :=0;
V_7 varchar2(15) ;
V_8 NUMBER(15,2) :=0;
v_9 number(15,2);
v10 number(35,10) := ROUND(1/12,10);
V11 number(15,2) ;
V12 NUMBER(15,2) :=0;
V13 NUMBER(15,2);
v14 number(15,2) :=0; ---interest hold
v15 number(15,2) :=0; ---balance hold
v16 number(15,2) :=0; ---ipart hold
V20 VARCHAR2(18) :='T';
CURSOR UPLO IS
SELECT NVL(REF,0),NVL(OBAL,0),NVL(amount,0),ID,INTRATE,nvl(NOOFMONTHS,1) FROM
temporary_emphist

ORDER BY ID,REF;
BEGIN
OPEN UPLO;
LOOP
FETCH UPLO INTO V_1,V_2,V_3,V_7,V11,V12;
EXIT WHEN UPLO%NOTFOUND;
if v_7<>v20 AND V_1=1 then
v20 :=v_7;
V_4 :=V_2-V_3;
v16 :=ROUND((((v_4)*POWER((1+V11/100),v10*V12)-V_4)/V12),2);
v15 :=v_4;
ELSE

select obal,INTRATE,nvl(NOOFMONTHS,1) into v_4,V11,V12 from temporary_emphist
where NVL(ref,0) = V_1 - 1 AND ID = V_7 ;

select nvl(amount,0) into v_3 from temporary_emphist
where NVL(ref,0) = V_1 AND ID = V_7 ;

v15 := ROUND((v_4)*POWER((1+V11/100),v10*V12),2)-v_3; -------obal
v14 :=ROUND((v_4)*POWER((1+V11/100),v10*V12),2)-V_4; ---accrued interest
v16 :=ROUND((((v_4)*POWER((1+V11/100),v10*V12)-V_4)/V12),2); --ipart
END IF;
UPDATE temporary_emphist
SET OBAL = v15,ACCRUED_INT =v14,IPART=v16
WHERE NVL(REF,0) =V_1 AND ID = V_7 ;
--END IF;
END LOOP;
commit;
CLOSE UPLO;
END;
/
select * from temporary_emphist
order by id,paydate;


I don't know whether there is a faster way to employ the logic... I've seen your
authority.. give it a shot


Followup:
one word for you:

analytics


they rock, they roll. lag/lead -- i frankly did not really read all of your
code (i'm not a compiler, v_1, v_2 etc make it really truly utterly hard to
follow and comprehend) but if you need to update a row based on the prior row(s)
after sorting -- analytics -- lag, lead, sum, whatever.
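
For example -- not a rewrite of the logic above, just a sketch of how lag() exposes the
prior row's values, so one SELECT (or an UPDATE/MERGE built on it) replaces the
row-at-a-time lookups:

select id, ref, paydate, amount, intrate, noofmonths, obal,
       lag(obal)   over (partition by id order by ref) prev_obal,
       lag(amount) over (partition by id order by ref) prev_amount
  from temporary_emphist
 order by id, ref;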

GOTO a page to Bookmark Review | Bottom | Top
database October 26, 2004
Reviewer: cp

Hi Tom,

I have to appear for an interview and need your help.
1. How do you decide on datafile size?
2. How do you decide on redo log file size?
3. Any other factors which you, as an interviewer, would ask about?

Please help me out as it's really urgent
Regards


Followup:
how do you know they'll ask these questions -- not questions I would ask.

datafile size = pure function of amount of data you want to store. I mean,
well, what else could you say?

redolog file size is a function of

o amount of redo generated by most of your transactions
o the number of these transactions you do in a period of time

predominantly (but if you are an experienced dba -- why not just fall back on
your knowledge of what you've done in the past?)

GOTO a page to Bookmark Review | Bottom | Top
problem October 26, 2004
Reviewer: cp

So until and unless we get some opportunity, we can't learn the skills to decide
on the initial size of a database, right?

Please provide some words or factors to think about on this

Thanks!


Followup:
experience.

you get an entry level job (you have to sort of start in the basement) and get
experience.

GOTO a page to Bookmark Review | Bottom | Top
update of millions of records/rows October 29, 2004
Reviewer: Josphat from KENYA

Hi Tom,
Thanks for the decisive debate on this site.

Kindly go through my script because now I have elaborated on
the v_1, v_2.. etc variables, as you will see... SORRY for MY FIRST SUBMISSION.
I will be grateful if your suggestion / code
could reduce my processing time for 5 million records from
15 hrs to 1, 2, 3? .


---Script to process over 5 million rows
---and constitute approx 40000 distinct identifiers (id)

-------------------------------------------puts interest per month
-------------------------------------------CALCULATES INTEREST ON OUTSTANDING
BALANCE
-----EACH customer HAS OWN ROWS STARTING 1..N AND REPRESENTED BY REF COLUMN ON
TABLE
-----MULTIPLIER FACTOR FOR EACH MONTH IS REPRESENTED BY NOOFMONTHS
-----will be manipulated to get values for the subsequent row though some input
from the
-----current row will be integrated with derivative of the previous to populate
values into the
-----current row. This is applied for each customer untill all are cleared
-----PROCESSING IS SEQUENCIAL (I.E FROM REF 1 ..N FOR EACH CUSTOMER)
-----REF =1 FOR EACH CUSTOMER HOLDS A VALUE FOR OBAL WHICH WILL BE USED TO FILL
UP
-----THE SUBSEQUENT ROWS AFTER MANIPULATION WITH MULTIPLIER FACTOR


----work starts next

drop table temporary_emphist;


create table temporary_emphist
(ID VARCHAR2(15),PAYDATE DATE, AMOUNT NUMBER(10,2),
PPART NUMBER(13,2), IPART NUMBER(13,2), INTRATE NUMBER(5,2), OBAL
NUMBER(13,2),
REF NUMBER(7), NOOFMONTHS NUMBER(13,2), ACCRUED_INT NUMBER(15,2))
/


----DATA set for customer id 10979844

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,
OBAL,ref,NOOFMONTHS)
values('10979844','30-jun-2003',2000,2,112000,1,1);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-jul-2003',2000,2,2,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-aug-2003',2000,2,3,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-dec-2003',2000,2,4,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-jan-2004',1950,2,5,5);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-jun-2004',5000,2,6,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-aug-2004',4000,2,7,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10979844','30-oct-2004',3000,2,8,0);

----DATA set for customer id 10779844

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,
OBAL,ref,NOOFMONTHS)
values('10779844','30-jun-2003',2000,2,99380,1,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-jul-2003',2000,2,2,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-aug-2003',2000,2,3,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-dec-2003',2000,2,4,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-jan-2004',1950,2,5,5);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-jun-2004',5000,2,6,2);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-aug-2004',4000,2,7,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10779844','30-oct-2004',3000,2,8,0);

----DATA set for customer id 10679844
insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,
OBAL,ref,NOOFMONTHS)
values('10679844','30-jun-2003',800,4,40390,1,1);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-jul-2003',1500,4,2,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-aug-2003',1600,4,3,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-dec-2003',2000,4,4,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-jan-2004',1950,4,5,2);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-mar-2004',1950,4,6,4);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-jul-2004',1000,4,7,1);


insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-aug-2004',1000,4,8,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-sep-2004',1992,4,9,1);

insert into temporary_emphist(ID ,PAYDATE,AMOUNT , INTRATE,ref,NOOFMONTHS)
values('10679844','30-oct-2004',1200,4,10,0);


commit;
set lines 1000
select * from temporary_emphist
order by paydate,id;

create index temporary_emphistid on temporary_emphist(id);
create index temporary_emphistpd on temporary_emphist(paydate);

---script STARTS


DECLARE
V1_REF temporary_emphist.REF%TYPE;
V2_OBAL temporary_emphist.OBAL%TYPE;
V3_AMOUNT temporary_emphist.amount%TYPE;
v_4 NUMBER(15,2) :=0;
V5 NUMBER(15,2) :=0;
V6 NUMBER(15,2) :=0;
V7_ID Temporary_emphist.ID%TYPE;
V8 NUMBER(15,2) :=0;
v9 number(15,2);
v10 number(35,10) := ROUND(1/12,10); ---CONVERTS ONE MONTH INTO
YEARS
V11_INTRATE temporary_emphist.INTRATE%TYPE;
V12_NOOFMONTHS temporary_emphist.NOOFMONTHS%TYPE;
V13 NUMBER(15,2);
v14 number(15,2) :=0; ---interest hold
v15 number(15,2) :=0; ---balance hold AS CALCULATED
v16 number(15,2) :=0; ---ipart holdING INTEREST AS CALCULATED
V20 VARCHAR2(18) :='T'; ----ARBITRALY KEY TO ENABLE FOCUS ON
DIFFERENT IDS
CURSOR UPLO IS
SELECT NVL(REF,0),NVL(OBAL,0),NVL(amount,0),ID,INTRATE,nvl(NOOFMONTHS,1) FROM
temporary_emphist

ORDER BY ID,REF;
BEGIN
OPEN UPLO;
LOOP
FETCH UPLO INTO V1_REF,V2_OBAL,V3_AMOUNT,V7_ID,V11_INTRATE,V12_NOOFMONTHS;
EXIT WHEN UPLO%NOTFOUND;

----V20 IS VARIABLE KEY FOR CHECKING WHETHER WORKING ON SAME ID

if v7_ID<>v20 AND V1_REF=1 then
v20 :=v7_ID;
V_4 :=V2_OBAL-V3_AMOUNT;
v16
:=ROUND((((v_4)*POWER((1+V11_INTRATE/100),v10*V12_NOOFMONTHS)-V_4)/V12_NOOFMONTHS
),2);
v15 :=v_4;
ELSE

select obal,INTRATE,nvl(NOOFMONTHS,1) into v_4,V11_INTRATE,V12_NOOFMONTHS from
temporary_emphist
where NVL(ref,0) = V1_REF - 1 AND ID = V7_ID ;
select nvl(amount,0) into v3_AMOUNT from temporary_emphist
where NVL(ref,0) = V1_REF AND ID = V7_ID ;
--------V10 IS ONE MONTH COVERTED INTO YEARS
v15 :=
ROUND((v_4)*POWER((1+V11_INTRATE/100),v10*V12_NOOFMONTHS),2)-v3_AMOUNT;
-------CALCULATED obal
v14 :=ROUND((v_4)*POWER((1+V11_INTRATE/100),v10*V12_NOOFMONTHS),2)-V_4;
---CALCULATED accrued interest
v16
:=ROUND((((v_4)*POWER((1+V11_INTRATE/100),v10*V12_NOOFMONTHS)-V_4)/V12_NOOFMONTHS
),2); --ipart
END IF;
UPDATE temporary_emphist
SET OBAL = v15,ACCRUED_INT =v14,IPART=v16
WHERE NVL(REF,0) =V1_REF AND ID = V7_ID ;
--END IF;
END LOOP;
commit;
CLOSE UPLO;
END;
/
select * from temporary_emphist
order by id,paydate;
spool off


Followup:
this isn't the right place for "recode my application".

show me your attempt at using analytics.

(UPDATE temporary_emphist
SET OBAL = v15,ACCRUED_INT =v14,IPART=v16
WHERE NVL(REF,0) =V1_REF AND ID = V7_ID ;

is still really obscure, comments didn't "do much" really)

GOTO a page to Bookmark Review | Bottom | Top
Reader November 21, 2004
Reviewer: dheeraj from UK

Excellent discussion as usual on this site.


GOTO a page to Bookmark Review | Bottom | Top
Selecting data over a database link from 7.3.4 to 10g December 02, 2004
Reviewer: Stewart W. Bryson from Atlanta, GA

Tom:

If I use the "INSERT /*+ APPEND */ ..." statement on a 10g database, but the
"SELECT..." part of the statement is over a database link to a 7.3.4 database,
would the bulk-loader still be invoked, or would the hint be ignored?

Also, if I do a "MERGE... " statement with the "INSERT..." part of the statement
being an "INSERT /*+ APPEND */...", would the MERGE work correctly, and would
the bulk-loader be used?

Thanks for your time.


Followup:
trick question

it isn't going to work -- 9ir2 and later do not interoperate with 7.3 and
earlier.


if it did work

insert /*+ append */ into local_table select * from remote_table@dblink;

could be direct pathed.


insert /*+ append */ is supported only for insert as select -- not insert
values.


ops$tkyte@ORA9IR2> create table t ( x varchar2(25), y varchar2(25) );

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> merge into t
2 using dual
3 on ( t.x = dual.dummy )
4 when matched then update set y = dummy
5 when not matched then insert /*+ append */ values ( dummy, null );

1 row merged.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from t;

X Y
------------------------- -------------------------
X

ops$tkyte@ORA9IR2> insert /*+ append */ into t values ( 'y', null );

1 row created.

ops$tkyte@ORA9IR2> select * from t;

X Y
------------------------- -------------------------
X
y

ops$tkyte@ORA9IR2> insert /*+ append */ into t select 'z', null from dual;

1 row created.

ops$tkyte@ORA9IR2> select * from t;
select * from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


ops$tkyte@ORA9IR2>


only the last insert append was a direct path...

GOTO a page to Bookmark Review | Bottom | Top
A follow-up question December 02, 2004
Reviewer: Stewart W. Bryson from Atlanta, GA

Thanks for the good information Tom.

I've looked and I've looked, but I still cannot find the information about 9iR2
and 7.3.4 database links stated in the documentation. As a matter of fact, the
9iR2 Migration guide lists that 7.3.4 Materialized View sites can interact with
9iR2 master sites, and vice versa:


http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96530/migrep.htm#620714

As Advanced Replication utilizes database links, this seems a little
contradictory. Of course, if you tell me once more (I'm a bit stubborn) that it
won't work, I promise to believe you.

As I am currently architecting a solution to try and bring some of the data from
our European sites into a central repository, and as I don't yet have access to
those sites, I cannot simply test it as we have no 7.3.4 sites (thank God) in
the US.

I appreciate your time and understanding.


Followup:
contact support -- they will confirm


version 7.3 was so far out of any sort of support at the time of 9ir2 being
released that they never even considered testing it. and it does not work.

I believe that documentation to be erroneously left over from 9ir1.

GOTO a page to Bookmark Review | Bottom | Top
December 03, 2004
Reviewer: mg from Germany

Hi Tom,

I have a table T in tablespace TS1.

Every day there are many insertions into table T.

So I need to check whether a particular tablespace size is exceeded, and if so I need to
move the older data to another tablespace.

So could you please tell me the best way to do this?

Many thanks in advance.


Followup:
you sound like you really want a rolling window partitioned table -- a table
partitioned by date.
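
A sketch of such a rolling window table (names and dates are made up); old data is aged
out by dropping or exchanging the oldest partition instead of deleting rows:

create table t
( id    number,
  dt    date,
  data  varchar2(100)
)
partition by range (dt)
( partition p_2004_11 values less than (to_date('2004-12-01','yyyy-mm-dd')),
  partition p_2004_12 values less than (to_date('2005-01-01','yyyy-mm-dd'))
);

-- each month: add the next partition, then drop (or archive/exchange) the oldest
alter table t add partition p_2005_01 values less than (to_date('2005-02-01','yyyy-mm-dd'));
alter table t drop partition p_2004_11;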

GOTO a page to Bookmark Review | Bottom | Top
December 06, 2004
Reviewer: A reader

Can we use a trigger to check the % of tablespace filled?


Followup:
yes, but -- why? just to make the code run really really slow?

GOTO a page to Bookmark Review | Bottom | Top
What about the larger tables? December 15, 2004
Reviewer: Steven from Austin, TX USA

I used to work at a place where the main table had 4 billion rows and growing.
This tracked manufacturing details and rows for an item trickled in and then
were "archived" (written to tape then deleted) as one. Finding a good
partitioning scheme was difficult due to the large variable time that items
could stay in the manufacturing process. That is, two rows inserted on the same
day could be archived on vastly different timescales.

The system had around 40 million rows inserted and deleted on a daily basis
(about 1%). Some of the items could have 1 million or more records; they needed
to be archived at the same time. The table had a PK and a secondary index.

Obviously CTAS was not an option due to the 24x7 availability requirement and
besides we usually didn't have enough space for a full copy of the main table.

Is this a case where you just kick off the delete and take your lumps?

(Hmm, given past growth curves, the table should by now be up to 20 billion rows
and 200 million rows in/out daily.)


Followup:
or, enable row movement....

partition by a date field, the date field represents "record done as of"

when you initially insert it, it has a "done as of" way into the future (in the
active partition(s))

when you are done, you update this field and it moves from active into the "you
can archive me later" partitions.


It would be like the delete but the delete would be happening continuously as
little transactions instead of a big delete all at once.
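
A sketch of that "done as of" idea (names and dates are illustrative only); ENABLE ROW
MOVEMENT is what lets the UPDATE of the partition key physically relocate the row:

create table mfg_detail
( item_id    number,
  done_as_of date,
  payload    varchar2(100)
)
partition by range (done_as_of)
( partition p_2004_12 values less than (to_date('2005-01-01','yyyy-mm-dd')),
  partition p_active  values less than (maxvalue)
)
enable row movement;

-- new work comes in with a far-future "done as of" -> it lands in P_ACTIVE
insert into mfg_detail values ( 1, to_date('9999-01-01','yyyy-mm-dd'), 'in progress' );

-- when the item completes, this update moves the row into a dated, archivable partition
update mfg_detail
   set done_as_of = to_date('2004-12-15','yyyy-mm-dd')
 where item_id = 1;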

GOTO a page to Bookmark Review | Bottom | Top
data warehouse daily load December 16, 2004
Reviewer: shgoh from Malaysia

Dear Tom,

One of my dw developers is having a performance problem with their daily data
load from a legacy system. Approximately 5-8 million records are inserted into a table
(append load table). The table has a lot of constraints and none of us is
familiar with the data given (it could have a constraint violation).

What is the method to speed up the data loading process besides parallel,
direct load and nologging ?

Appreciated if you can share your view. Since we don't know the data and the
constraints well, is it advisable to disable constraints in this case?
Thanks

Rgds
SHGoh


Followup:
parallel
direct
nologging

they are them. how long is "long"

if you are adding 5/8 million records a day -- are you using partitioning? you
can look at disabling portions of partitioned indexes and rebuilding them after
the fact.
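
For illustration, a sketch of that index maintenance idea (index and partition names are
made up):

alter index big_fact_idx modify partition p_2004_12 unusable;

-- ... direct path load the matching table partition ...

alter index big_fact_idx rebuild partition p_2004_12 nologging parallel 4;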

GOTO a page to Bookmark Review | Bottom | Top
Not a partitioned table December 17, 2004
Reviewer: shgoh from Malaysia

Dear Tom,

It is not a partitioned table. Would it help if I go for parallel, nologging,
direct, and disable a portion of the constraints? Thanks

Rgds
Shgoh


Followup:
you are already doing parallel, nologging, direct -- disabling the constraints
only means "gotta enable them".

you asked for additional ideas. partitioning.

GOTO a page to Bookmark Review | Bottom | Top
How about Bulk Update? December 22, 2004
Reviewer: Tony from India

Tom,

I have two tables PO_HDR and PO_DTL having millions of records. I want to join
the two tables, and update a flag in PO_HDR table. Can I use Bulk Update based
on rowid here?

Example:

DECLARE

TYPE tblrowid IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
arr_rowid tblrowid;

CURSOR po_cur
IS SELECT a.rowid FROM po_hdr a, po_dtl b
WHERE a.poseq = b.poseq
AND a.custid = b.custid
AND b.po_qty <> b.del_qty;

BEGIN

OPEN po_cur;
LOOP
FETCH po_cur BULK COLLECT INTO arr_rowid LIMIT 100000;
EXIT WHEN arr_rowid.COUNT = 0;

FORALL i IN arr_rowid.FIRST..arr_rowid.LAST
UPDATE po_hdr
SET del_flg = 'Partial'
WHERE rowid = arr_rowid(i);

COMMIT;
END LOOP;
CLOSE po_cur;

END;
/

Is a rowid-based bulk update fast and reliable? Is there any better and
faster approach?

Please advise me.



Followup:
update po_hdr
set del_flg = 'Partial'
where (poseq,custid) in ( select poseq,custid
from po_dtl
where po_qty <> del_qty );


period.

GOTO a page to Bookmark Review | Bottom | Top
To update or insert by sql statement and not procedurally December 23, 2004
Reviewer: DAV

Hi, Tom,
I have 2 questions:
The first is:
by the update sql statement that you wrote above,
say I want to execute it on millions of records, and I want to commit every 1000
records --
how can I do it?
The second question is:
if I have to use a pl/sql table and I want to load
a lot of records into the table, say 70,000 -- should I do it by:
fetch bulk bind limit 1000
or SELECT ... BULK BIND ....?
Why is it a problem to load many records into pl/sql tables?
Thanks
DAV


Followup:
you don't want to commit every 1000 records, that would take longer, consume
more resources and break your transactional integrity.


if you have 70k rows to process, and want to do it 1k at a time, you'll be using
fetch ... bulk collect ... limit 1000;

can you spell "memory"

GOTO a page to Bookmark Review | Bottom | Top
Tom, sorry I didn't understand you. December 23, 2004
Reviewer: DAV

Why is fetch better than a sql statement in the case written above? You told me
before that sql is better.
Can you please answer the second question that I wrote above?
Thank you very much.
DAV


Followup:
not understanding you.

you asked:

If i have to use pl/sql table and i want to load
alot of records to the table say 70,000- have i do it by:
fetch bulk bind limit 1000
or SELECT ... BULK BIND ....?
Why is it a problem to load many records to pl/sql tables?


and I answered??? the question wasn't "single sql vs procedural code"

you started with "if I HAVE TO USE..."

you would not select BULK COLLECT for 70k records, you would use fetch bulk
collect with the limit clause.

GOTO a page to Bookmark Review | Bottom | Top
I was unclear - let me explain December 23, 2004
Reviewer: A reader

Sorry ,Tom,
i didn't explain myself well.
What i meant was:

First:

if I have to update a million records in a plsql procedure,
is it better to do it with an update sql statement or with procedural code like
fetch.....? I want to commit every n records because of the rollback segment, so if I
use an update sql statement how can I do it?

Second:

General question - IF I USE PROCEDURAL CODE - why is it not good to load say 40k
records into a plsql table by SELECT...BULK BIND... rather than by fetch bulk bind limit
...?
As I know, plsql table records are kept in pga memory, so what happens if there
are 40k records?
Please clarify the theory.
Thank you very much.


Followup:
you need to size your rollback segment large enough to do you job. seriously --
really you do.

a million records is nothing.


big_table@ORA9IR2> update big_table set owner = lower(owner);

1000000 rows updated.

big_table@ORA9IR2> select used_ublk*8/1024 from v$transaction;


USED_UBLK*8/1024
----------------
92.4921875

big_table@ORA9IR2>

93meg, big deal.

If you commit every N records -- you do realize you have to make your code
restartable somehow, so when you inflict the ora-1555 on YOURSELF by committing,
or fail for whatever other reason -- you can restart where you left off. Now we
are talking some code!


If you have to update tons of records and tons of columns AND they are indexed,
disable indexes, consider recreating the table instead of a mass update (if you
are updating millions of records -- this is going to be an offline thing anyway,
you might be supporting queries but you are not going to have "high
concurrency")


can you spell "memory"

and think about streaming -- keeping everyone busy. So you wait for a long time
while oracle gets 40k records, jams them into ram, manages them, then you
process them and do it all over. doing things in bite sized pieces is easier
for everyone - couple hundred in , couple hundred out.

GOTO a page to Bookmark Review | Bottom | Top
Update 1000000 in sql statement December 24, 2004
Reviewer: A reader

Thanks, Tom!
What is the ORA-1555 message? I don't have a guide.
And I really can't find a way to commit every N rows within an UPDATE SQL
statement. Can you give me some idea?
About loading, for example, 80k rows into a PL/SQL table: if I need procedural
code to update at least 1,000,000 records, and not an UPDATE SQL statement,
why can't I avoid the PL/SQL table and just do it with a regular cursor?


Followup:
search this site for 1555, you'll see lots and lots. check out Expert one on
one Oracle -- i wrote about it lots.


YOU DO NOT WANT TO COMMIT THE ROWS INCREMENTALLY. that is my fundamental
premise here.


If you have access to my book "Expert one on one Oracle" -- you can read all
about it.

GOTO a page to Bookmark Review | Bottom | Top
Another question December 24, 2004
Reviewer: A reader

About loading, for example, 80k rows into a PL/SQL table: if I need procedural
code to update at least 1,000,000 records, and not an UPDATE SQL statement,
why can't I avoid the PL/SQL table and just do it with a regular cursor?

Thank's


Followup:
slow by slow is slow by slow (oops, keep doing that don't I -- row by row is
row by row)


GOTO a page to Bookmark Review | Bottom | Top
Sorry,misunderstanding. December 25, 2004
Reviewer: A reader

I didn't explain myself well.
What I meant in my question is a general question:
If I need a cursor (and not a SQL statement) for some process that accesses a
big table, why is it better to load the data into a PL/SQL table and work
against the PL/SQL table than to just fetch records directly from the database
table?




Followup:
I understood, you didn't understand me.


slow by slow processing (a row at a time)

is slower than

o a single sql statement
o bulk collect and forall updates

doing things in BULK is

o faster
o uses less resources

GOTO a page to Bookmark Review | Bottom | Top
index DDL in the midst of transactions December 30, 2004
Reviewer: Nik Malenovic from Chicago, IL

Tom,

this is a followup to your March 07, 2004 followup related to DDL operations on
indexes whilst there are outstanding transactions in progress on the underlying
table. You said

"that'll happen whilst there are outstanding transactions, yes. You'll want to
keep trying until you get a chance to drop it."

We just ran into a similar production problem: the system was slow because we
were missing an index on a column and there were updates running full table
scans (don't ask!). Since there are users banging on the darn thing, there was
no way to shut down and add the index, so we had to do it on the fly. Here's
what we did:

declare
    l_attempt pls_integer := 0;
    l_limit   pls_integer := 2000*10; -- 10 seconds
    l_rindex  binary_integer;
    l_slno    binary_integer;
    l_target  binary_integer;
begin
    l_rindex := dbms_application_info.set_session_longops_nohint;
    l_target := 42;

    dbms_application_info.set_client_info('Vincent The Builder');
    dbms_application_info.set_module('rebuild_index', 'between_transactions');
    for l_attempt in 1 .. l_limit
    loop
        begin
            dbms_application_info.set_session_longops
            ( rindex      => l_rindex,
              slno        => l_slno,
              op_name     => 'rebuilding the index',
              target      => l_target,
              context     => 0,
              sofar       => l_attempt,
              totalwork   => l_limit,
              target_desc => 'attempt',
              units       => 'attempts' );

            execute immediate 'create index RESPONSE_QUESTIONID_INDX on response
                               (questionid) nologging parallel 5';
            dbms_output.put_line('SUCCESS - now rebuilding');
            exit;
        exception
            when others then
                null;
        end;
        if l_attempt = l_limit then
            dbms_output.put_line('FAILED - there was not a phat chance to succeed');
        end if;
    end loop;
    dbms_application_info.set_module(null, null);
    dbms_application_info.set_client_info(null);
end;
/

and in another session we monitored the progress with

select sid, message from v$session_longops where time_remaining > 0;

that's all. This may be useful for somebody in a similar situation, hence the
post.

thanks,


Nik


Followup:
you could try (for maximum "uptime")

create index .. ONLINE;

that would

a) block and wait for outstanding transactions to complete but...
b) block new transactions from starting and
c) once it started creating -- lets the new transactions proceed and
d) at the very very end -- does a/b again to finish

reduces the amount of "downtime" by allowing people to still work in the middle
(yes, they'll be full scanning, but working -- if that was something you
wanted..)

GOTO a page to Bookmark Review | Bottom | Top
CTAS doesn't have compute statistics? January 06, 2005
Reviewer: A reader

create index .... compute statistics;

Why doesn't create table have a similar compute statistics clause?

I have a CTAS that takes 5 hours during which Oracle already knows everything
about the data (similar to create index). Gathering stats on this table later
takes another 5 hours!

Is there a way to gather stats during the CTAS itself?

Thanks


Followup:
Oracle does not already know about everything.

Have you traced a dbms_stats.gather_table_stats to see how many queries, in how
many different ways it gathers stuff?

about the only thing Oracle knows after a CTAS is number of rows and number of
blocks (and so do you, you can set that)

there is no way it was computing histograms and such


I don't believe it takes 5 hours after the fact -- if it does, have you
considered "estimate"

GOTO a page to Bookmark Review | Bottom | Top
January 18, 2005
Reviewer: robin

I am trying to modify your code from above:

/**********************************************/
set (c1,c2,c3) = ( select max(decode(segment3,'xxxxx',ccid)),
max(decode(segment3,'zzzzz',ccid)),
max(decode(segment3,'wwwww',ccid))
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 in ('xxxxx','zzzzzz', 'wwwww')
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1 )
/***************************************************/
to allow me to update one column based on several different where clauses, and
I'm having no luck.

What I want to do is:

update table t1

set t1.c1 = 'X' where t1.c5 and t1.c6 are null

set t1.c1 = 'A' where t1.c4 is not null and t1.c8 > sysdate

set t1.c1 = 'B' where t1.c4 is null and t1.c8 < sysdate

etc....

I've tried a case statement and decode like above, but neither seems to
compile....thanks


Followup:
well, in answer to the "what I want to do is"


update t1
set t1.c1 = (case when c5 is null and c6 is null
then 'X'
when c4 is not null and c8 > sysdate
then 'A'
when c4 is null and c8 < sysdate
then 'B'
end )
where (c5 is null and c6 is null )
or (c4 is not null and c8 > sysdate)
or (c4 is null and c8 < sysdate)

I cannot comment on the other one -- well, because there isn't anything to
comment on :)

GOTO a page to Bookmark Review | Bottom | Top
updating large table January 20, 2005
Reviewer: Tracy from UK

I have table_a with 50 million rows and table_b with 10 million rows.

table_a has a new column added called statid

Table_b has statid and tabaid. tabaid has a unique constraint on it and it is
also a foreign key to the primary key of table_a (id)

I want to update all rows in table_a to set the statid column to match the
statid column from table_b where the tabaid column in table_b = the primary key
in table_a

which of the following two do you think is better, and is there some other
altogether different way which would be better still?

UPDATE
(SELECT a.statid newval
,b.statid oldval
FROM table_a a
,table_b b
WHERE a.id=b.tabaid)
SET newval = oldval

or

UPDATE table_a a SET a.statid =
(SELECT b.statid
FROM table_b b
WHERE a.id=b.tabaid)

I've tested the above on a subset of the data using rownum, and both statements
seem to update at a rate of about 300,000 rows per hour, but I would like to get
it going faster than this.


Followup:
update the join.

expect to see a nice big juicy full scan and hash join. it'll be as good as it
gets.

GOTO a page to Bookmark Review | Bottom | Top
I have a similar situation where I must update millions of records fast February 09, 2005
Reviewer: Ryan from Canada

First I want to say thank you very much for all your help this last week Tom, it
has helped me out a lot. Unfortunately Oracle is not easy so I have another
question.

I am updating millions of records using insert statements.
I am updating the records using insert with values. I'm not updating one table
using another table.

example:
insert into dbname (FirstName, Number) values ('Ryan', 1);

called with different records millions of times.

When I call "commit" at the end of all the insert's it takes more than 17 hours
to commit. Of course this was importing records from a 1.5 gigabyte text file
with 83000 records. The time to call all insert statements was less than a half
an hour. Can you think of any reasons why the commit would be so slow. There is
another program at the office that can import the same data (in another database
format) in less than 20 minutes and that program is calling insert and commit in
a similar fashion so it can't be that oracle is just that slow. I'm also
disabling all triggers at the beginning of the import, and enabling them at the
end. I'm turning off logging and using /*+ APPEND */ and parallel hints but they
probably don't help as I'm not using direct load inserts.

Any help would be greatly appreciated.


Followup:
<quote>
I am updating millions of records using insert statements.
</quote>

really?

sorry -- but I don't believe your comment about commit; you are mistaken.

committing is fast, regardless of the number of records. Your method of
observation is somehow "flawed".


Unless of course you have an "on commit refresh" set of mv's or something.


So, turn on 10046 level 12 trace and trace it, use tkprof to report on it and
see what you see.
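For anyone following along, one way to capture that trace in your own session
(the tracefile identifier is just a label of your choosing):

alter session set tracefile_identifier = 'slow_commit';
alter session set events '10046 trace name context forever, level 12';
-- run the inserts and the commit here ...
alter session set events '10046 trace name context off';

-- then, on the database server:
-- tkprof <your_trace_file>.trc slow_commit_report.txt sys=no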

GOTO a page to Bookmark Review | Bottom | Top
query retrival slow it effects in the whole system February 10, 2005
Reviewer: Praveen from US

hi Tom
When the developer uses an escape sequence in the selection criteria, the
response time is slow -- up to 15 seconds. This delay affects the whole system.
Kindly suggest the appropriate way to resolve this.
select count(*)
from pf_document_rv where
((r_folder_path like '/Jobs/52021/Assembly/UNDER~_BAR/%' escape '~')
and (lower(r_object_id) <> lower('f004000000010471')))
takes about 15 secs to run

select count(*)
from pf_document_rv where
((r_folder_path like '/Jobs/52021/Assembly/NOBAR/%')
and (lower(r_object_id) <> lower('f004000000010471')))
takes less than one second.
Perhaps this is helpful ... if I add the escape '~' specifier to this query it
also runs slowly even though it doesn't use the escape character.
regards
PKR


Followup:
so, why do you believe this has anything to do with escape?


(you do understand these two queries return totally different result sets and
perhaps that -- that is the "difference")

have you tkprof'ed these queries and compared them?

GOTO a page to Bookmark Review | Bottom | Top
RE : Row Migration February 17, 2005
Reviewer: A reader from NJ, USA

Hi Tom,

You mentioned in one of the above discussion threads that while updating a
null-column to a not null value we have to make sure that we have enough PCTFREE
and no row migration happens.
I have a table that gets 2 million rows on a daily basis and I want to update
say a column for only 50000 rows.

1. What should be my approach
A. if the column to update is indexed.
B. if the column to update is not indexed
C. How do I find out row migration that may happen
during the update

Thanks


Followup:
regardless of whether the column is indexed or not -- the approach would be "the
same". the index will make the update take longer as it does lots more work,
but for just 50,000 rows -- it won't be a very big deal.

2,000,000 and 50,000 are only big if it is US dollars and it was your bonus and
paycheck. In a database, in 2005, they are very small numbers.

ops$tkyte@ORA9IR2> set timing on
ops$tkyte@ORA9IR2> update big_table set owner = owner||'x' where mod(id,40) = 0;

50000 rows updated.

Elapsed: 00:00:18.02

ops$tkyte@ORA9IR2> create index big_table_idx on big_table(owner);

Index created.

Elapsed: 00:00:27.84
ops$tkyte@ORA9IR2> update big_table set owner = owner||'y' where mod(id,40) = 0;

50000 rows updated.

Elapsed: 00:00:28.04
ops$tkyte@ORA9IR2> select count(*) from big_table;

COUNT(*)
----------
2000000

Elapsed: 00:00:03.02
ops$tkyte@ORA9IR2>


to discover if rows migrated, you would analyze the table (perhaps using the
list chained rows, if you just analyze it, you'll get chain_cnt filled in the
dictionary)

you cannot find out "before" the update what will happen.
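A short sketch of that check after the fact (BIG_TABLE is a stand-in for your
own table; the CHAINED_ROWS table comes from $ORACLE_HOME/rdbms/admin/utlchain.sql):

-- option 1: list the chained/migrated rows explicitly
@?/rdbms/admin/utlchain.sql
analyze table big_table list chained rows into chained_rows;
select count(*) from chained_rows where table_name = 'BIG_TABLE';

-- option 2: just look at the count gathered by analyze
analyze table big_table compute statistics;
select chain_cnt from user_tables where table_name = 'BIG_TABLE';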

GOTO a page to Bookmark Review | Bottom | Top
RE : Update February 18, 2005
Reviewer: A reader from NJ, USA

Tom,

I don't think I made the last statement regarding the UPDATE of 50000+ rows
clear. Here is my requirement:

1. I have a flat-file coming in to our system around 10.00 AM in the morning
with over 2 million records and has the following fields :
ACCOUNT_NUMBER
SECURITY_ID
COST_BASIS
....
....
2. I have another flat file (a delta file containing only data that has changed)
coming in on a daily-basis around 2.00 AM in the morning with around 50000+
records and has the following fields :
ACCOUNT_NUMBER
SECURITY_ID
.....
.....
which we directly load into a table B with the following fields :
ACCOUNT_NUMBER
SECURITY_ID
COST_BASIS <== which is set to NULL during this load
.....
.....
3. Once this file is loaded, we run a process which carries over unmodified
data (from the prior run, i.e. data which did not appear in the file in step 2)
so that table B always has the full load, which is the 2 million records I was
talking about
4. We need to run a process that will update the COST_BASIS field only for
modified data (i.e. data that appeared in today's delta load, which is the
50000+ records or so that I need to update. For data brought over from the
previous run, we use the prior run's value itself)

In order to perform this, I thought of using the 2 million file as an external
table (say table A) and join this table with B and update the COST_BASIS from A
to B for the modified records. We have the logic to identify the changed records
however. But my concern is that the table A has to be full-scanned for 50000+
records from B and no indexes can be created on A.
I kindly need your suggestion as to what would be the best approach in tackling
this problem.


Followup:
why would A be full scanned more than once in a join?

GOTO a page to Bookmark Review | Bottom | Top
RE : Full Scan February 18, 2005
Reviewer: A reader from NJ, USA

Tom,

This is because I thought that if we join the two tables, it will do a Nested
Loop join. For each row fetched from table B, it would have to full scan A to
get matching rows, and B (the delta table) would be my driving table (since it
has fewer rows to start with).

If my understanding is not correct, please advise.


Followup:
hash join it would do in light of no indexes.

hash join it would do even in the presence of indexes because the CBO is smart
and knows indexes are evil when most or all of the rows are going to be
processed.

it would be a nice big juicy fast hash join.

GOTO a page to Bookmark Review | Bottom | Top
Use MERGE command to combine steps 3 and 4 February 18, 2005
Reviewer: Logan Palanisamy from Santa Clara, CA USA

Combine steps 3 and 4 with the merge command. Your problem is solved. The
External table with 2 million records will be scanned just once.

MERGE INTO B
USING A
ON (B.ACCOUNT_NUMBER = A.ACCOUNT_NUMBER)
WHEN MATCHED THEN UPDATE
SET B.COST_BASIS = A.COST_BASIS
WHEN NOT MATCHED THEN INSERT (B.ACCOUNT_NUMBER, B.SECURITY_ID, B.COST_BASIS, ..)
VALUES (A.ACCOUNT_NUMBER, A.SECURITY_ID, A.COST_BASIS, ..)

"A" is your external table with 2 million records.
"B" is the one that initally contains 50,000+ records. After the merge, B will
have 2 million records.

The INSERT clause is step number 3
The UPDATE clause is step number 4

Hope that helps


GOTO a page to Bookmark Review | Bottom | Top
Delete Record !!! February 28, 2005
Reviewer: Anupam

What is the difference between:

1. DELETE <table> TRUNCATE WHERE <clause>;
2. DELETE <table> WHERE <clause>;

?? Both work!!!


Followup:
just like

ops$tkyte@ORA9IR2> delete t this_is_a_correlation_name where 1=1;

27794 rows deleted.


works -- truncate is just being used as a correlation name there (a table alias)

GOTO a page to Bookmark Review | Bottom | Top
Deleted message March 03, 2005
Reviewer: Franc Uwa from Nigeria

The deleted messages were from today and I am trying to see if you can help me
resend those deleted messages back to my inbox


Followup:
huh?

GOTO a page to Bookmark Review | Bottom | Top
March 10, 2005
Reviewer: Brad Doczkat from Pittsburgh, PA

What do you do when the resident DBAs do not approve of the use of this method?
How do I convince them that this approach is much more streamlined and requires
far less processing time than doing an update of millions of rows the old
fashioned way? For my particular need that brought me to this thread, I only
have to do a one time initialization of a one character column that I want to
put a not null (with default) contraint on. There are 5.5 million rows in the
table. I need to do this one-time as an implementation step, yet the DBAs do not
seem to buy into this approach. They would rather update so many rows a night,
over the course of several nights, doing frequent commits during the update
process.


Followup:
tell the dba's to do the work then?

Sorry -- I just don't know how to make your dba's do something they don't want
to.

GOTO a page to Bookmark Review | Bottom | Top
Help Please I don't Understand March 11, 2005
Reviewer: A Reader from US

Tom:

I have a project due tomorrow that I am getting paid a boat load of money for
and do not have a clue… Could you write it for me without being compensated? Oh
and answer my interview questions for me as well.

Just joking ;)

As always you have the best info on your site (dare we say better than meta
link?).

However I have made it through two threads in just over 9 hours. Have you ever
thought of hiring a few people and extracting: best part of thread ; lessons
learned; sample code (I still have not found the latest and greatest 10g
show_space); and a glossary for items like “CTAS”, “(sic)”, “SLA”, “FBI”?
Perhaps make a book out of it and sell it but it would have to be a searchable
electronic copy. (Hmm, WIKI enable a portion of asktom (naa)?)

In this thread alone I found: q’| quoted strings |’; CTAS for speed; I really
need to start using hints; I need to look into Analytics; I have to check if
my DW really needs partitions…

Thanks for everything.
A Reader.

(Guess response: This is a feature request… Denied :)


Followup:
I do that -- it is called "a book" :)

Effective Oracle by Design -- best of asktom, with scripts as of the time of the
book.

Expert One on One Oracle -- what you need to know, with scripts -- second
edition under construction (in fact I'm going to take April "off" -- from asktom
-- in order to just get it going)

Expert One on One Oracle second will be the 10g version of what you are looking
for.

GOTO a page to Bookmark Review | Bottom | Top
March 12, 2005
Reviewer: Alberto Dell'Era from Milan, Italy

> best of asktom, with scripts
> as of the time of the book.

Yes, but that doesn't mean that the books are just a collection of q&a from
asktom (as i've found many ones believe); they are "regular" books - the q&a are
mainly used as "Customer-driven Topic Selection", so to speak...


Followup:
correct -- definitely not a collection of asktom articles, that would be
disjointed at best. Rather, a flow -- a story -- with examples from the site to
help it along. The things I pick to write about are driven very much by the
questions and trends I see here.

GOTO a page to Bookmark Review | Bottom | Top
March 13, 2005
Reviewer: A reader

Hi Tom,
Regarding...
"in fact I'm going to take April "off" -- from asktom "

What will happen to desperate people who are looking for answers supported by
technical reasons? I am sure that others will share my opinion that this is
one of the greatest helps one can ever get with Oracle, in a timely and
accurate manner...


Followup:
start here:

http://www.jlcomp.demon.co.uk/faq/ind_faq.html

and try:

groups.google.com -> comp.databases.oracle.*

the place where I got hooked on discussing Oracle in the first place way back in
October of 1994. You might get an answer from Jonathan Lewis, Connor McDonald,
or a host of others. I still lurk there from time to time.


otn.oracle.com -> discussion forums

metalink.oracle.com -> discussion forums

the oracle-l list

http://www.freelists.org/archives/oracle-l/
Where Cary Millsap or Steve Adams or Tim Gorman to name but a very few might
answer your question.


You might even find yourself answering some from time to time!

GOTO a page to Bookmark Review | Bottom | Top
dblink March 23, 2005
Reviewer: shiva from india

We are trying to access a stored procedure that exists in another database
through a dblink; in that stored procedure one update stat.. is there, and
because of that the screen hangs

Regards,
Shivachandramouli.R


Followup:
so little data... 'one update stat' what is that. and what have you done to
diagnose the issue?

GOTO a page to Bookmark Review | Bottom | Top
questions in the dependent objects after March 27, 2005
Reviewer: A reader


Hi Tom,

As you suggested, creating a new table, then dropping the original table and
renaming, sounds like a great method: update the table via CTAS, drop the
original table, and then rename the new table to the original name.
Question: all the dependent objects such as packages, views, synonyms... will
become invalid. Do these need to be recompiled? I know all the grants have to
be re-granted.

Thanks.


Followup:
dependent objects will take care of themselves next time they are executed.

GOTO a page to Bookmark Review | Bottom | Top
March 27, 2005
Reviewer: A reader

Hi Tom,


Sounds like a great method to update the table by CTAS, drop the original
table and then rename the new table to the original name.
Question: all the dependent objects such as packages, views, synonyms will
become invalid. Do these need to be recompiled? I know all the grants have to
be re-granted.

Thanks.


GOTO a page to Bookmark Review | Bottom | Top
March 27, 2005
Reviewer: A reader

tom,
you said:

"
Now, just set job_queue_processes > 0 to set the "degree of threads" and in 8i
and before set job_queue_interval to say 60 or so and there you go.
"

what's the "degree of threads"?


Followup:
"degree of threads" is number of concurrent job queue processes, how many
dbms_job jobs can be running simultaneously.

GOTO a page to Bookmark Review | Bottom | Top
commits or not April 01, 2005
Reviewer: A reader

I want to insert 30 million records into a dummy table with nologging, as you
advised:

insert /*+ append parallel(t) */ into t
select * from ...

If I have large rollback segments it can complete without commits during the
insert (for example, a commit every one million inserts) -- is it faster
without the commits?

Thanks a lot


Followup:
to commit or not. Everytime you commit, you will wait, wait for log file sync.

Every time you commit, you say "i'm done". That's ok, if you are restartable!

However, with a big parallel direct path insert it'll be all or nothing (no
point to do N rows, and do the query all over again). Each parallel process
will utilize its own rbs and if you have no indexes, a direct path insert isn't
going to generate MUCH undo at all (it writes above the HWM and doesn't need to
generate UNDO information)
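To make the direct path + parallel point concrete, a sketch of the whole
sequence (table names are placeholders; parallel DML must be enabled, and the
hint references the table alias):

alter table dummy_tab nologging;
alter session enable parallel dml;

insert /*+ append parallel(t,8) */ into dummy_tab t
select /*+ parallel(s,8) */ * from source_tab s;

commit;   -- required before this session can even read dummy_tab again
alter session disable parallel dml;
alter table dummy_tab logging;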

GOTO a page to Bookmark Review | Bottom | Top
April 05, 2005
Reviewer: A reader


GOTO a page to Bookmark Review | Bottom | Top
Let's Tell the CBO About Our Great Idea! April 11, 2005
Reviewer: Doug Winage from New Orleans, LA

Given that it seems there are clear cases in which making and loading a new
table is preferable to updating in place, I wonder if it would be just as well
to make such an approach available to the CBO as a candidate execution plan.
I'll grant that one wouldn't want to leave the choice entirely up to the
optimizer; the safest thing would be to allow enabling and disabling of the plan
option at the system level with a parameter called, say,
"ctas_for_updates_enabled." One could also provide a pair of hints, say, /*+
USE_CTAS */ and /*+ NO_CTAS */, to allow one to override the CBO, if need be,
and expressly demand or prohibit a plan involving recreation of the table.


Followup:
all really good ideas need to go via support and get filed....

GOTO a page to Bookmark Review | Bottom | Top
CTAS instead of Deletes April 11, 2005
Reviewer: GG from Minneapolis, MN

Tom,
Please, this is a relevant follow up to this thread.
I have also included a test case.
Instead of deleting rows from a table with 270 Million records, I have used
CTAS. The delete criteria is based on a join of two fields from two tables. The
delete occurs only on one table (ORDERS). The other table is INVOICE.
If DOC_NUMBER and ORD_ITEM values from both tables match, then delete the
matching records from ORDERS.

Test case:

drop table orders;
create table orders (
DOC_NUMBER number(10) not null
,S_ORD_ITEM number(10) not null
,PROMOTION varchar2(10) not null);


drop table invoice;
create table invoice(
DOC_NUMBER number(10) not null
,S_ORD_ITEM number(10) not null
,OTHER_INFO varchar2(10) not null);


begin
for i in 1..10 loop
insert into orders values(i,i+1,'ORDERS');
end loop;
commit;
end;
/


select * from orders;

DOC_NUMBER S_ORD_ITEM PROMOTION
---------- ---------- ----------
1 2 ORDERS
2 3 ORDERS
3 4 ORDERS
4 5 ORDERS
5 6 ORDERS
6 7 ORDERS
7 8 ORDERS
8 9 ORDERS
9 10 ORDERS
10 11 ORDERS


begin
for i in 1..5 loop
if mod(i,2) = 0 then
insert into invoice values(i,i+1,'INVOICE');
end if;
end loop;
commit;
end;
/


select * from invoice order by 1;

DOC_NUMBER S_ORD_ITEM OTHER_INFO
---------- ---------- ----------
2 3 INVOICE
4 5 INVOICE

The above two records should be deleted from ORDERS:

drop TABLE ORDERS_NEW;
CREATE TABLE ORDERS_NEW
NOLOGGING
AS
SELECT /*+ parallel(ORDERS,4) */ *
from orders ord
where NOT EXISTS
(select null from invoice inv
where inv.DOC_NUMBER = ord.DOC_NUMBER and
inv.S_ORD_ITEM = ord.S_ORD_ITEM);

select * from orders_new order by 1;

DOC_NUMBER S_ORD_ITEM PROMOTION
---------- ---------- ----------
1 2 ORDERS
3 4 ORDERS
5 6 ORDERS
6 7 ORDERS
7 8 ORDERS
8 9 ORDERS
9 10 ORDERS
10 11 ORDERS

IS THIS THE CORRECT WAY OF DOING IT? ESPECIALLY THE SQL WITH "NOT EXISTS"?


Followup:
hints need to use the correlation (alias) names....

I would just use NOT IN; NOT EXISTS with the CBO will be treated conceptually
as NOT IN anyway, so it is just harder to code but has the same net effect.

See the concepts guide for details on parallel CTAS.
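So the hint in the test case above would need to use the alias, along the
lines of:

create table orders_new nologging parallel 4
as
select /*+ parallel(ord,4) */ *
  from orders ord
 where not exists ( select null
                      from invoice inv
                     where inv.doc_number = ord.doc_number
                       and inv.s_ord_item = ord.s_ord_item );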

GOTO a page to Bookmark Review | Bottom | Top
Update millions of records April 12, 2005
Reviewer: Marsboy from USA

Hi Tom,
It helped me so much that my day's work can now be accomplished in
minutes.



GOTO a page to Bookmark Review | Bottom | Top
update partition April 14, 2005
Reviewer: phil

I am sorry for posting a similar question in another thread. The follow up
should have been here!

I am getting a little confused with the technique of updating lots of rows in a
partitioned table.

I have a table partitioned by month. To do an update on a column you suggest
doing a create table as select. This of course will create a new table with the
updated column and the data from the table to be updated.

What I do not understand is how to handle each partition. My new table has
none, just valid data. How do I now get the new table into the structure of the
old one, with all the partitions?

I am sure the answer is here, I just cannot see it!

regards


Followup:
to do a huge bulk update of most of the rows, using a create table as select
could be a way to go. I followed up on the other thread.

GOTO a page to Bookmark Review | Bottom | Top
Partioned tables April 14, 2005
Reviewer: Greg from Toronto

a previous poster was asking about partitioned tables ..

[quote]
I am getting a little confused with the technique of updating lots of rows in a
partitioned table.
[/quote]

Feel free to ignore this question if you answered it already in the other
followup (I wasn't sure which thread you posted it in, and can't seem to find
it .. *sigh*)

Anyway ... if the table is partitioned, can you not just deal with each
partition seperately ?

That is: if you have 15 partitions (ie P01 through P15 ...), and your "update"
only encompassed P03 and P04 ... then could you not just "re-create" those
partitions in the same way you "re-create" the table .. then just "stuff" the
partition(s) back in?

I can't remember all the syntax to pull this little "card-shuffle" off ... but
I'm pretty sure it can be done? can't it?


Followup:
sorry, the other thread is

http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:7143624535091


Yes, if the table is partitioned, you can. (and that is what I was trying to
describe on the other page, I just didn't have sufficient detail the first time
around to make it clear)

GOTO a page to Bookmark Review | Bottom | Top
April 21, 2005
Reviewer: A reader

"
declare
l_job number;
begin
dbms_job.submit( l_job, 'execute immediate ''alter index idx1 rebuild'';' );
commit;
dbms_job.submit( l_job, 'execute immediate ''alter index idx2 rebuild'';' );
commit;
end;


Now, just set job_queue_processes > 0 to set the "degree of threads" and in 8i
and before set job_queue_interval to say 60 or so and there you go.

"
will the dbms job only be executed once?

TIA


Followup:
yes, well, it will execute successfully ONCE.

It might fail over and over and over and over, then run successfully and go
away.

GOTO a page to Bookmark Review | Bottom | Top
How to Update millions or records in a table? May 17, 2005
Reviewer: S.Darshan from Singapore

I have to update one table which has 10 million records. The regular rollback
segment won't be enough to hold this data. In this case, how can we predict how
much rollback area will be used by this update?

Initially I thought of your approach "CREATE TABLE new_table as select <do the
update "here"> from old_table;"

But I have to update about 30% of the columns. The remaining 70% of the columns
should not be touched. Also, this table has a lot of foreign key constraints.
Is there an alternative method? Thanks in advance.

With best regards
S.Darshan


Followup:
if I had to update 1/3 of a 10,000,000 row table -- that is the perfect
candidate for create table as select. foreign keys or not (you can always
disable, ctas and for the columns you have not updated -- enable novalidate,
just put them back on)

but, if you cannot do that, disable indexes on columns to be updated and update
and then rebuild the indexes. to get a grasp on "how much undo", create a copy
of the first 100,000 rows of data for example, update it and see how much undo
you generated.
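A rough outline of what that could look like, with made-up object names (T is
the 10,000,000-row table, FK_CHILD_TO_T a foreign key pointing at it):

alter table child_t disable constraint fk_child_to_t;

create table t_new nologging parallel 4
as
select pk_col,
       upper(col_to_update) col_to_update,   -- the "update" happens here
       untouched_col1, untouched_col2
  from t;

drop table t;
rename t_new to t;

alter table t add constraint t_pk primary key (pk_col);
-- novalidate: trust the data we did not touch rather than re-checking it all
alter table child_t enable novalidate constraint fk_child_to_t;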



GOTO a page to Bookmark Review | Bottom | Top
Parallel DML May 18, 2005
Reviewer: Tamilselvan from USA

I agree with what Tom said.

However, in many real-life situations CTAS may not be executed as we think. For
example, if the table has a LONG column, then CTAS will not work.

Consider Parallel DML. I have successfully implemented parallel dml in a large
DB.

Here is the script:

alter table scott.table_a parallel 8;
alter index scott.table_a_p1 parallel 8;
alter table scott.tamil_temp_table_b parallel 8;
alter index scott.tamil_temp_table_b_idx parallel 8;

commit;
alter session enable parallel dml ;
alter session enable parallel query ;

update /*+ parallel(a,8) */ scott.table_a a
set a.pr_addr_id = ( select addr_per_id
from scott.tamil_temp_table_b c
where c.accnt_id = a.ID)
where exists ( select null
from scott.tamil_temp_table_b c
where c.accnt_id = a.ID) ;
commit;
alter session disable parallel dml ;
alter session disable parallel query ;
alter table scott.table_a noparallel ;
alter index scott.table_a_p1 noparallel ;

Thanks
Tamil


Followup:
<quote>
if the table has a LONG column, then CTAS will not work.
</quote>

convert long to clob and be done with it.
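Two common ways to do that (table and column names here are hypothetical):
convert during the CTAS with TO_LOB, or convert the column in place first
(9i and up) and then CTAS as usual.

-- convert inside the CTAS
create table t_new as
select id, to_lob(long_col) big_text, other_col
  from t_old;

-- or convert the column in place, then CTAS normally
alter table t_old modify (long_col clob);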

GOTO a page to Bookmark Review | Bottom | Top
Excellent Explanation May 18, 2005
Reviewer: A reader from Virginia

This is very helpful, Tom. I have a question I am facing while using the
dbstart utility -- where can I ask that question?


Followup:
suggest you don't use dbstart but rather write your own rc.d script to
startup/shutdown that which you want.

What I do is like this:


http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:1302619965694#12118047114111


GOTO a page to Bookmark Review | Bottom | Top
How to Update millions or records in a table June 08, 2005
Reviewer: Debasish from New Delhi, India

You told us to create a table with the update command, but I have a problem
with that (in 8i).
One of my master tables contains the following fields (approx 500000 records):

1. ID varchar2
2. Pincode number
3. Tag char

Three other tables, bankmaster1, bankmaster2 and bankmaster3, have the fields:
1. pincode number
2. branch varchar2

I have to update the tag column in the master table according to the pincode
distribution in the bankmaster tables, like tag E for bank1, tag C for bank2,
tag V for bank3. When I try it through an update command it works for a small
table, but not for this 500000 record table.

Please let me know in advance.


Followup:
I told you to create a table with the update command? Where and when -- I'll
need to correct that if true, for you cannot create a table with the update
command.


It works for me for a 50,000,000 row table or a 5 row table. You'll have to
tell us why it isn't "working" (and hopefully it is not using a single index)


ops$tkyte@ORA10G> create table t
2 as
3 select id, id pincode, 'x' tag
4 from big_table.big_table
5 where rownum <= 1000000;

Table created.

Elapsed: 00:00:01.36
ops$tkyte@ORA10G>
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> create table t1 as select id pincode, 'xxxxxx' branch from t
where mod(id,3) = 0;

Table created.

Elapsed: 00:00:01.60
ops$tkyte@ORA10G> create table t2 as select id pincode, 'xxxxxx' branch from t
where mod(id,3) = 1;

Table created.

Elapsed: 00:00:01.54
ops$tkyte@ORA10G> create table t3 as select id pincode, 'xxxxxx' branch from t
where mod(id,3) = 2;

Table created.

Elapsed: 00:00:01.49
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T', method_opt=>
'for all indexed columns' );

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T1', method_opt=>
'for all indexed columns' );

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.13
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T2', method_opt=>
'for all indexed columns' );

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T3', method_opt=>
'for all indexed columns' );

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.13
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set timing on
ops$tkyte@ORA10G> update (select tag from t where pincode in (select pincode
from t1)) set tag = 'E';

333333 rows updated.

Elapsed: 00:00:06.13
ops$tkyte@ORA10G> update (select tag from t where pincode in (select pincode
from t2)) set tag = 'C';

333334 rows updated.

Elapsed: 00:00:06.48
ops$tkyte@ORA10G> update (select tag from t where pincode in (select pincode
from t3)) set tag = 'V';

333333 rows updated.

Elapsed: 00:00:06.40
ops$tkyte@ORA10G>


Only takes a couple of seconds.

GOTO a page to Bookmark Review | Bottom | Top
June 10, 2005
Reviewer: A reader from MI

Tom,
I have this query in which I'm trying to update the old POC_ID with the new
POC_ID. However, it doesn't produce the results I would expect. The table does
not get updated with the correct POC_ID. What am I missing here?

begin for c1 in (select w.poc_id new_poc_id, a.poc_id old_poc_id
from wsar_poc w, ap32admin.adas_poc a
where w.poc_responsible_ind = a.poc_responsible_ind)
loop
update wsar_archive_cs
set poc_id = c1.new_poc_id
where poc_id = c1.old_poc_id;
end loop;
end;

Thanks,
Randy


Followup:
need example.

but that looks like it should be at most a SINGLE sql statement, no code
whatsoever...

GOTO a page to Bookmark Review | Bottom | Top
June 10, 2005
Reviewer: A reader

I have a table WSAR_ARCHIVE_CS which contains a column POC_ID. POC_ID is FK to
the ADAS_POC POC_ID PK. I want to update that column with the new POC_ID which
is stored in WSAR_POC POC_ID PK.

Thanks,
Randy


Followup:
create table .....
create table ....
create table ....
insert some data...
insert some more data...

followed by "and this is what I need to do..."

GOTO a page to Bookmark Review | Bottom | Top
June 10, 2005
Reviewer: A reader

create table wsar_poc
(POC_ID              NUMBER       NOT NULL,
 POC_RESPONSIBLE_IND VARCHAR2(50),
 DIVISION_ID         NUMBER       NOT NULL,
 OUSER               VARCHAR2(30),
 ACCESS_ROLE         VARCHAR2(50),
 PHONE_NUMBER        VARCHAR2(20),
 EMAIL               VARCHAR2(200));


create table wsar_archive_cs
(FILE_ID       NUMBER       NOT NULL,
 ONELINER_DATE DATE,
 PRODUCT       VARCHAR2(50),
 COMPONENT     VARCHAR2(400),
 PRIORITY      VARCHAR2(50),
 POC_ID        NUMBER,
 SUBJECT       VARCHAR2(300),
 ISSUE         VARCHAR2(4000));

I have the table adas_poc which contians the old poc_id that is in the
WSAR_ARCHIVE_CS table.

desc adas_poc
POC_ID NOT NULL NUMBER,
POC_RESPONSIBLE_IND VARCHAR2(50),
DIVISION_ID NOT NULL NUMBER,
OUSER VARCHAR2(30),
ACCESS_ROLE VARCHAR2(50),
PHONE_NUMBER VARCHAR2(20),
EMAIL VARCHAR2(200)

I tried to update using the cursor for loop, but no luck. The column does not
get updated with the correct ID.

begin for c1 in (select w.poc_id new_poc_id, a.poc_id old_poc_id
from wsar_poc w, ap32admin.adas_poc a
where w.poc_responsible_ind = a.poc_responsible_ind)
loop
update wsar_archive_cs
set poc_id = c1.new_poc_id
where poc_id = c1.old_poc_id;
end loop;
end;

Thanks,
Randy


Followup:
oh, so close -- you almost had inserts there with data to show

a) what happened
b) explain what you expected

That will definitely join wsar_poc to adas_poc by poc_responsible_ind

it will definitely then update wsar_archive_cs

you need to explain why this logic isn't working, it is rather straight forward.
Demonstrate for us.
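For what it is worth, the loop above can usually collapse into a single
statement roughly like this (an untested sketch against the posted tables; it
assumes poc_responsible_ind maps each old poc_id to exactly one new one):

update wsar_archive_cs cs
   set cs.poc_id = ( select w.poc_id
                       from wsar_poc w, ap32admin.adas_poc a
                      where w.poc_responsible_ind = a.poc_responsible_ind
                        and a.poc_id = cs.poc_id )
 where exists ( select null
                  from ap32admin.adas_poc a
                 where a.poc_id = cs.poc_id );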

GOTO a page to Bookmark Review | Bottom | Top
RE: Parallel Import June 20, 2005
Reviewer: Bakunian

Tom,

I created a range-partitioned table with parallelization enabled, then I used
the import utility to load 5 GB of data into the table; however, I don't see
parallel processes spawned during the load.
Shouldn't import take advantage of this functionality?
I am using 9.2 on Solaris 8.

These are instance parameters.

Thank you for your time.

NAME_COL_PLUS_SHOW_PARAM TYPE VALUE_COL_
--------------------------------------------- ----------- ----------
fast_start_parallel_rollback string LOW
log_parallelism integer 4
parallel_adaptive_multi_user boolean TRUE
parallel_automatic_tuning boolean TRUE
parallel_execution_message_size integer 4096
parallel_instance_group string
parallel_max_servers integer 40
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0


Followup:
imp is a serial process, always has been - it reads a single DMP file
sequentially and loads it.

10g with impdp has a parallel capability.
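For reference, a Data Pump import run in parallel looks something like this
(the directory object, dump file and table names here are placeholders):

impdp scott/tiger directory=dp_dir dumpfile=exp%U.dmp parallel=4 \
      tables=big_partitioned_table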

GOTO a page to Bookmark Review | Bottom | Top
Tome u r great June 30, 2005
Reviewer: shyam

I have a quick question. My statistics are as below; I am loading data.

File_Name                              Blocks_Read  Blocks_Written  Total_IOs
-------------------------------------  -----------  --------------  ---------
/oracle/u03/oradata/dev/data_01.dbf          26507            5240      31747
/oracle/u03/oradata/dev/data_09.dbf          26823            2823      29646
/oracle/u06/oradata/dev/data_08.dbf          24978            3704      28682
/oracle/u04/oradata/dev/data_10.dbf          26288            1831      28119
/oracle/u04/oradata/dev/data_02.dbf          24053            3634      27687
/oracle/u05/oradata/dev/data_11.dbf          26155            1284      27439
/oracle/u05/oradata/dev/data_07.dbf          23125            2865      25990
/oracle/u04/oradata/dev/data_06.dbf          23558            1978      25536
/oracle/u06/oradata/dev/data_25.dbf          21672            2234      23906
/oracle/u05/oradata/dev/data_03.dbf          21299            2574      23873
/oracle/u03/oradata/dev/data_05.dbf          20324            1819      22143
/oracle/u06/oradata/dev/data_04.dbf          19673            1746      21419
/oracle/u06/oradata/dev/data_12.dbf          19574            1306      20880
/oracle/u04/oradata/dev/data_14.dbf          19309             755      20064
/oracle/u03/oradata/dev/data_13.dbf          18807             848      19655
/oracle/u05/oradata/dev/data_24.dbf          17656            1316      18972
/oracle/u06/oradata/dev/data_16.dbf          16748            1730      18478
/oracle/u05/oradata/dev/data_15.dbf          17006             810      17816
/oracle/u03/oradata/dev/data_17.dbf          14810            1350      16160
/oracle/u05/oradata/dev/data_19.dbf          13552            1157      14709
/oracle/u03/oradata/dev/data_21.dbf          13622             911      14533
/oracle/u05/oradata/dev/data_23.dbf          13366             878      14244
/oracle/u04/oradata/dev/data_18.dbf          12644             908      13552
/oracle/u04/oradata/dev/data_22.dbf          12521             741      13262
/oracle/u06/oradata/dev/data_20.dbf          12137             546      12683

Tom u can observe that total reads are more than total writes; as a result the
performance is decreasing. Let me know how I should increase the blocks
written; as a result of this my data loading is slow.


Followup:
who is "u" and why is "u" observing things?


how do you know that reads are the cause of performance decreasing. All I see
are a bunch of numbers.

GOTO a page to Bookmark Review | Bottom | Top
loading data June 30, 2005
Reviewer: shyam

this is the query I used to generate the statistics

select DF.Name File_Name,
FS.Phyblkrd Blocks_Read,
FS.Phyblkwrt Blocks_Written,
FS.Phyblkrd+FS.Phyblkwrt Total_IOs
from V$FILESTAT FS, V$DATAFILE DF
where DF.File#=FS.File#
order by FS.Phyblkrd+FS.Phyblkwrt desc;


Followup:
so, you've just shown the number of reads done since the database was started.
All they are are a bunch of numbers with no meaning.

how do you know read IO is your bottleneck? what made you decide that?

GOTO a page to Bookmark Review | Bottom | Top
performance while loading data June 30, 2005
Reviewer: shyam

datafile = /oracle/u03/oradata/dev/data_01.dbf

Blocks_Read    = 26507
Blocks_Written = 5240
Total_IOs      = 31747


Followup:
so what? (he says again)

you've shown me that the file has been read 26,507 times SINCE STARTUP and
written to 5,240 times SINCE STARTUP

big deal, so what?

were they all during your load?
did they take A LONG TIME <<<===


GOTO a page to Bookmark Review | Bottom | Top
PRECAUTIONS TO TAKE WHEN HUGE INSERTS July 03, 2005
Reviewer: shyam

Tom, can you tell me the tips on how to improve the speed with which data is
being loaded?

To improve data load speed for huge inserts: at present I have 73 tables being
loaded, and it takes 60 minutes for one project.


Followup:
depends on your needs and circumstances. To describe them all would be a small
book.

So, you'd need to be more "specific". How many new rows, how many old rows, is
there lots of validation going on, is this a reload or a load, have lots of
un-necessary indexes, etc etc etc

GOTO a page to Bookmark Review | Bottom | Top
Delete & import 6M records - Data Warehouse July 06, 2005
Reviewer: A reader

Tom,

We currently have v8.0.6 running a data warehouse. We have 6M rows to delete
every day and, at the same time, the same amount to import. It currently takes
us 10 hours to do the deletion part. Data are imported every 5 minutes. Is
there any way we can speed up the deletion and improve the import (using
Oracle imp/exp)?

Thanks so much for your help


Followup:
insufficient data. why no partitioning? maybe a create table as select rows to
keep and drop old table.

but insufficient data to answer properly.

and really old software.
and really archaic tools.

a lot has happened in the time between 8.0.6 and 10gR2 - you'd find many more
options in 2005 than you did in 1997 when 8.0 was current.

GOTO a page to Bookmark Review | Bottom | Top
slowly my data base August 06, 2005
Reviewer: hameed

Hi Tom,
please help me.
I have a program built with Oracle 8i and Developer 6i.
The problem: the database is slow.


GOTO a page to Bookmark Review | Bottom | Top
One large insert Vs multiple smaller inserts August 14, 2005
Reviewer: Sami

Dear Tom,

We had this issue in production during huge data migration.

Environment:-
Oracle 8.1.7.3 Sun 2.6 - 2 Node OPS. 12 CPU on each box.
The table in question here involved in M-M Oracle replication.

There is no other activity on both "12Million_Rec_table" and "1Million_Rec_table
" other than my process.

Process-1
=========
exec dbms_reputil.replication_off();
commit;
insert /*+ append parallel(t,12) */ into 12Million_Rec_table t
select /*+ parallel */ * from 1Million_Rec_table;


Process-2
=========
exec dbms_reputil.replication_off();
commit;
--for loop with diff predicate value
loop
insert /*+ append parallel(t,12) */ into 12Million_Rec_table
select /*+ parallel */ * from 1Million_Rec_table
where predicate=&1; -- there is an Index on Predicate column
end loop


12Million_Rec_table has only a primary key index. There are no other
constraints or LOB columns on this table. For some reason it didn't use a
direct-path insert; I don't know why.


Process-2 executed N times in a sequence. Basically Process-1 is one large
transaction compared to Process-2 (which has N transactions in a loop).

In terms of execution time, Process-1 took longer than Process-2. Process-1
had been running for more than 2 hrs without completing, so we stopped it.
Process-2, with smaller chunks of records, completed very fast.


When Process-1 started the USED_UBLK value in v$rollstat increased rapidly then
slowed down gradually. I have seen
this behaviour many times while running Process-1.

My questions are

1) Why is Process-1 slower than Process-2? [We didn't have any rollback segment
issues while running either Process-1 or Process-2.]

2) Is there any performance issue if the rollback segment goes beyond some
size, say 200M or so?



Followup:
you did not enable parallel DML, to do a parallel direct path insert, you must.

I cannot answer these questions, insufficient data. You could have captured
information while it was running.

There is a *small* chance that what happened was

a) the big insert, because of the primary key index, generated gobs of undo.
b) this caused more undo to be generated than fit in the buffer cache
c) so it had to be flushed by dbwr
d) causing buffer waits by the insert (trying to get more buffers from the cache
for undo)
e) and causing IO contention on the disk

The smaller inserts *MIGHT* have

a) filled up part of the buffer cache with undo blocks
b) committed, allowing the undo blocks in the cache to be reused
c) goto A n times

avoiding the buffer waits and the flushing of the undo to disk.

However, I would not have taken your approach for a small set of one million
rows. It likely would have been much faster to

a) get rid of the primary key index
b) set big table nologging if not already
c) insert /*+ append */ into big_table select * from small_table;
d) recreate the primary key index in parallel (if even needed, just 12 million
rows), nologging
e) put logging attribute back on for the future
f) backup affected datafiles
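Spelled out as a script with placeholder names (and subject to the replication
caveat discussed in the next review):

alter table big_tab drop constraint big_tab_pk;
alter table big_tab nologging;

insert /*+ append */ into big_tab select * from small_tab;
commit;

create unique index big_tab_pk on big_tab(id) parallel 8 nologging;
alter table big_tab add constraint big_tab_pk primary key (id);

alter table big_tab logging;
alter index big_tab_pk noparallel;
-- and back up the affected datafiles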


GOTO a page to Bookmark Review | Bottom | Top
did not get rid of PK index because the table is in M-M replication August 14, 2005
Reviewer: Sami

Dear Tom,

Thanks a lot for your immediate response. You are absolutely
right about why the "smaller chunk transactions" defeated "one large
transaction".

<asktom>
a) get rid of the primary key index
b) set big table nologging if not already
c) insert /*+ append */ into big_table select * from small_table;
d) recreate the primary key index in parallel (if even needed, just 12 million
rows), nologging
e) put logging attribute back on for the future
f) backup affected datafiles
</asktom>

I agree with you 100%. I too prefer your approach [get rid of the primary key
index, load, rebuild the index]. However, the table in question is involved in
Master-Master replication.
We are not sure about getting rid of the primary key index on a table which is
involved in replication. That is why we didn't take this approach.

Tom,thanks for your time & effort.


GOTO a page to Bookmark Review | Bottom | Top
Additional Info August 14, 2005
Reviewer: Sami

Our large pool size is set to 0. I am just wondering whether this value had any
impact on our parallel processes' poor performance.

NAME TYPE VALUE
------------------------------------ ------- ------------------------------
large_pool_size string 0


Followup:
are you sure it is really zero, if you are using automatic SGA memory
management:

ops$xp10gr1/tkyte@ORA10GR1> show parameter large

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
__large_pool_size big integer 4M
large_pool_size big integer 0
ops$xp10gr1/tkyte@ORA10GR1>


I didn't set a large pool, but Oracle has set up a 4m one (that can grow/shrink
as needed)

GOTO a page to Bookmark Review | Bottom | Top
Large pool size August 15, 2005
Reviewer: Sami

Tom, We are still in 8.1.7.3.

Yes, I am sure the large pool size is set to zero on both instances(2 node OPS)


Followup:
it isn't "impacting it" performance wise, it is using memory from elsewhere if,
but the large pool would be a better place since this memory for parallel
execution isn't "LRU" memory like the other stuff in the shared pool -- the
large pool is designed for memory that is allocated and deallocated, unlike
shared pool memory which is designed to be allocated and aged out over time.

GOTO a page to Bookmark Review | Bottom | Top
Undo - With and without index August 15, 2005
Reviewer: Sami

Session #1
=================================================================================

/*

create table t1 as select * from dba_objects;
create index t1_idx01 on t1(object_id);

*/
SQL> select distinct sid from v$mystat;

SID
----------
126

SQL> insert into t1 select * from dba_objects;

48601 rows created.

=================================================================================

Session #2
=================================================================================

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965B9F4 568

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r
wheres.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965AFFC 2

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965B9F4 636


SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965AFFC 2

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

no rows selected

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965B9F4 754

SQL> select taddr,used_ublk from v$transaction t,v$session s,v$rollstat r where
s.taddr=t.addr and t.xidusn=r.usn and s.sid=126;

TADDR USED_UBLK
-------- ----------
1965B9F4 1035

SQL>

=================================================================================


I monitored rollback activities for session #1 (while inserting rows) through
session #2.

While the USED_UBLK is gradually going up for one transaction, I noticed some
other transaction (TADDR=1965AFFC) with USED_UBLK=2.
Is it because of index t1_idx01?

I didn't see this behavior when I tested the same test case without the index.

Sometimes I noticed "no rows selected" in session #2. I assume this is the
time the transaction is switching between TADDR 1965AFFC and 1965B9F4?


Please correct me if I am wrong.


Followup:
ddl commits, there were two transactions there

the create table and the create index.

GOTO a page to Bookmark Review | Bottom | Top
no ddl here August 15, 2005
Reviewer: Sami

Tom,

Sorry for the confusion. I shouldn't have put the create table and create
index statements over there (even though they were commented out).

The create table and create index are outside the scope of this test. The
reason I put them there is that I just wanted to show you that
1) the table is created from dba_objects
2) it has an index on the object_id column


I have tested this behavior in both 8i and 10g. The following output and "no
rows selected" appeared N times (more than 2) WHILE THE INSERT STATEMENT WAS
GOING ON; just for clarity I truncated the output.


TADDR USED_UBLK
-------- ----------
1965AFFC 2


Followup:
then I have no clue what we are in fact looking at there.

I cannot even remotely reproduce - nor is this the way it works.

the used_ublks would in this case increase, until you commit, then they go away.

GOTO a page to Bookmark Review | Bottom | Top
1 big INSERT with joins - OR - multiple UPDATES... August 16, 2005
Reviewer: Sandeep from London, UK

Hi Tom,

I'm trying to archive data from some big tables (5 or 6 of them). Before doing
that I need to "rollup" an "amount"
column within each of those big tables (based on
a "creation date" criteria). The summations of these amount will be grouped by
date (monthly sum-up) and will
need to be put into a reporting table.

The reporting table will have columns like
"Month", "Sum_Amt_BigTable1","Sum_Amt_BigTable2",
"Sum_Amt_BigTable3" and the like.

Initially there was only one BigTable (say BigTable1), so the archiving script
was simple.
Just one - INSERT INTO REPORT_TABLE... SELECT.. FROM BIGTABLE1.. WHERE (the date
criteria)
GROUP BY (the date column).

Now since there are more "BigTables", I have two(?) options:

1. Modify the insert script such that there is JOIN
between all BigTables, say BigTable1..BigTable6 (eeps!)
on the date criteria, and just sum-up the required
columns. For the sake of simplicity, assume that all
the tables contain at least one record for each month
(so no need to outer-join!).

2. Use the original INSERT script to create the initial
records within the REPORT table... then run multiple
UPDATEs on the REPORT table - one for
each bigtable, like this (roughly!)-
UPDATE REPORT_TABLE
SET Sum_Amt_BigTable2 = (SELECT sum(amountxx)
FROM BigTable2 WHERE (the date criteria)
GROUP BY (the date column))
WHERE REPORT_TABLE.month = LAST_DAY(TO_DATE(TO_CHAR(
BigTable2.date_column,'DD-MON-YYYY'),'DD-MON-YYYY'))
;

... and so on - for each remaining BigTable..

Which option do you think will be better in terms of performance? I suspect
option 2, as I'm a bit
apprehensive about joining 6 big tables for one big insert. I will be testing
this out, but wanted
to have your opinion about the same.

Regards,
Sandeep


Followup:
#1 of course, do as much as you can in a single sql statement.

big hash area size, nice juicy full scans -- ready to rock and roll and have it
done fast.

GOTO a page to Bookmark Review | Bottom | Top
big INSERT with joins - OR - multiple UPDATES August 17, 2005
Reviewer: Sandeep from London, UK

Hi,
Was thinking you might actually suggest #2 !!
Having a big join across 6 tables with multimillion
records? hmm... not sure.. but will definitely try this
out.
Did the #2 and it works well... will try to do the
#1 and get some stats to compare... somehow joins
between 6 *huge* tables does not convince me... but...
Thanks,
Sandeep


Followup:
6 tables, 60 tables -- I don't care, make SQL work for you -- NOT the other way
around.


databases were BORN TO JOIN, it is what they do

GOTO a page to Bookmark Review | Bottom | Top
Option #3? August 17, 2005
Reviewer: Bob B from Latham, NY

If there was the likelihood of an outer join in the above scenario, it might be
worth using a union all approach. Something like:

SELECT
'Big Table 1' NAME,
SUM( WHATEVER ) WHATEVER_SUM,
SUM( WHATEVER2 ) WHATEVER2_SUM,
PARAM1,
PARAM2
FROM BIG_TABLE1
WHERE CONDITION1 = 'whatever'
GROUP BY PARAM1, PARAM2
UNION ALL
SELECT
'Big Table 2' NAME,
SUM( WHATEVER ) WHATEVER_SUM,
SUM( WHATEVER2 ) WHATEVER_SUM,
PARAM1,
PARAM2
FROM BIG_TABLE2
WHERE CONDITION1 = 'whatever'
GROUP BY PARAM1, PARAM2
... etc

Essentially, write the report query for each table, union all them together and
then summarize the union all query as necessary:

SELECT
SUM( WHATEVER_SUM ) WHATEVER_SUM,
SUM( WHATEVER2_SUM ) WHATEVER2_SUM,
PARAM1,
PARAM2
FROM (
QUERY ABOVE
)
GROUP BY PARAM1, PARAM2


GOTO a page to Bookmark Review | Bottom | Top
With more clear test case August 17, 2005
Reviewer: sami

<asktom>
then I have no clue what we are in fact looking at there.

I cannot even remotely reproduce - nor is this the way it works.

the used_ublks would in this case increase, until you commit, then they go away.

</asktom>


Dear Tom,

While session#1 was doing a DIRECT PATH INSERT on a table which has a PRIMARY
KEY, I monitored it from SESSION#2 as below.

While the USED_UBLK was gradually going up, there were 2 occurrences where it
reported a different TADDR with USED_UBLK=2.

I just wanted to know what that new TADDR is.

Session#1
#############################################################################

20:37:47 SQL> select distinct sid from v$mystat;

SID
----------
162

20:37:50 SQL> insert /*+ append */ into t1 select * from dba_objects;

48603 rows created.

20:38:23 SQL>
#############################################################################

Session #2

#############################################################################
SQL> desc undo_monitor
Name Null? Type
----------------------------------------- -------- -------------------------

TADDR VARCHAR2(50)
UNDO_UBLK NUMBER
TIME DATE

SQL> truncate table undo_monitor;

Table truncated.

SQL> set time on
SQL>
SQL>
SQL> declare
2 i number :=0;
3 begin
4 loop
5 for c0rec in (select taddr,used_ublk from v$session s,v$transaction t
6 where s.taddr=t.addr
7 and s.sid=&session_to_be_monitored)
8 loop
9 insert into undo_monitor values(c0rec.taddr,c0rec.used_ublk,sysdate);
10 end loop;
11 i := i+1;
12 if i=100000 then
13 exit;
14 end if;
15 end loop;
16 end;
17 /
Enter value for session_to_be_monitored: 162
old 7: and s.sid=&session_to_be_monitored)
new 7: and s.sid=162)

PL/SQL procedure successfully completed.


20:39:49 SQL> select distinct time,taddr,undo_ublk from undo_monitor order by t
ime;

TIME TADDR UNDO_UBLK
--------- -------------------------------------------------- ----------
17-AUG-05 4C64FADC 1
17-AUG-05 4C64FADC 1
17-AUG-05 4C64FADC 1
17-AUG-05 4C64FADC 1
17-AUG-05 4C6504D4 2
17-AUG-05 4C64FADC 5
17-AUG-05 4C64FADC 6
17-AUG-05 4C64FADC 9
17-AUG-05 4C64FADC 10
17-AUG-05 4C64FADC 16
17-AUG-05 4C6504D4 2
17-AUG-05 4C64FADC 17
17-AUG-05 4C64FADC 21
17-AUG-05 4C64FADC 25
17-AUG-05 4C64FADC 27
17-AUG-05 4C6504D4 2
17-AUG-05 4C64FADC 27
17-AUG-05 4C64FADC 30
17-AUG-05 4C64FADC 32
17-AUG-05 4C64FADC 36
17-AUG-05 4C64FADC 38
17-AUG-05 4C64FADC 49
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52
17-AUG-05 4C64FADC 52

80 rows selected.

20:39:56 SQL>

#############################################################################


Followup:
as I said, I cannot even REMOTELY reproduce this at all.

date is only valid down to the second -- perhaps these rows are scrambled.
put i in that last column, not sysdate, and see.


(select to_char(time, 'dd-mon-yyyy hh24:mi:ss') ...) -- see if your times are not
all in the same second there.

GOTO a page to Bookmark Review | Bottom | Top
What if standby database were used? August 18, 2005
Reviewer: Michel from France

Hi Tom,
replacing update with CTAS and/or INSERT /*+ append */ INTO ... is especially
useful because of NOLOGGING.

Now, what should we do if we are using standby databases, so that it is not a
good idea to use operations with NOLOGGING?


Followup:
CTAS and /*+ append */ are done with logging enabled in ARCHIVELOG mode, unless
you specify "alter <segment_name> NOLOGGING" and even then the dba can "force
logging" at the database level.


LOGGING happens with this stuff!!!
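For reference, a minimal sketch of what the DBA can do when a standby must be
protected from NOLOGGING operations (the tablespace name is just an example):

alter database force logging;

-- or, more selectively:
alter tablespace users force logging;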

GOTO a page to Bookmark Review | Bottom | Top
Excellent August 18, 2005
Reviewer: Peter from Philadelphia, PA USA

Hi Tom,

You are the best of best. I visit your site everyday.

I need your expertise in this regard too.

I have a fact table T
( order_id number,
customer_id number,
order_date date,
month_id number,
prd_id number,
prd_name varchar2(30),
quantity number,
zip number,
flag smallint,
swiprd_id number)
with about 5 million records imported every month.
NOTE: one product (prd_name) may have one or more prd_ids. (There are about 200
prd_names and 2000 different prd_ids in this table.)

Oracle SERVER: Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production

T was partitioned by month_id.
A local index is built on T(customer_id, order_date, prd_name).

table T was analyzed using
dbms_stats.GATHER_TABLE_STATS(
OWNNAME=>'USER',
TABNAME=>'T',
ESTIMATE_PERCENT=>dbms_stats.auto_sample_size,
METHOD_OPT => 'for all indexed columns',
cascade => true,
degree => 4);


I need to know the following statistics each month:

1) how many continue customers there are for each prd_id (not just prd_name) in
this month.
A continue customer is defined as a customer who bought a product (with a
prd_id) in this month, also bought the same product (with the same or a
different prd_id) in the past 365 days, and
((previous order_date + 30*previous quantity) >= current order_date).
SET flag = 1

2) how many restart customers there are for each prd_id (not just prd_name) in
this month.
A restart customer is defined as a customer who is not a continue customer,
bought a product (with a prd_id) in this month, also bought the same product
(with the same or a different prd_id) in the past 365 days, and
((previous order_date + 30*previous quantity) < current order_date).

SET flag = 2

3) how many switch customers there are for each prd_id (not just prd_name) in
this month.
A switch customer is defined as a customer who is neither a continue nor a
restart customer, bought a product (with a prd_id) in this month, and also
bought a different product (not just a different prd_id) in the past 365 days.
Also record which prd_id the customer switched from.
SET flag = 3, swiprd_id = previous prd_id

4) how many new customers there are for each prd_id (not prd_name) in this month.
A new customer is defined as a customer who bought a product but never bought
any product in the past 365 days.
SET flag = 4

Current solutions: use a procedure to process each order

CREATE OR REPLACE PROCEDURE get_flag(v_monthid IN NUMBER)
AS
cursor c1 IS
SELECT distinct prd_id
from T
WHERE month_id=v_monthid order by prd_id;

cursor c2 is
SELECT customer_id, order_date, prd_name
FROM T
WHERE prd_id=v_prd_id and month_id = v_monthid
order by customer_id, order_date desc, quantity desc
FOR UPDATE OF flag NOWAIT;

cursor c3 -- for all orders with same customer and same
-- product during lookback
is SELECT order_date, quantity
FROM T
WHERE customer_id = v_customer_id
and prd_name = v_prd_name
AND order_date BETWEEN (v_order_date-366)
AND (v_order_date-1)
order by order_date desc, quantity desc;

cursor c4 -- for all orders with same customer and
-- different product during lookback
is SELECT prd_id
FROM T
WHERE customer_id = v_customer_id
and prd_name <> v_prd_name
AND order_date BETWEEN (v_order_date-366)
AND (v_order_date-1)
order by order_date desc, quantity desc;


TYPE customer_array is TABLE of T.customer_id%type;
TYPE order_date_array is TABLE of T.order_date%type;
TYPE prd_name_array is TABLE of T.prd_name%type;
lv_customer_arr customer_array;
lv_order_date_arr order_date_array;
lv_prd_name_arr prd_name_array;

v_prd_id T.prd_id%type;
v_customer_id T.customer_id%type;
v_order_date T.order_date%type;
v_order_date_tmp T.order_date%type;
v_prd_name T.prd_name%type;
v_quantity T.quantity%type;
v_flag PLS_INTEGER;
v_swiprd_id T.swiprd_id%type;


BEGIN
FOR lv_prd in c1
LOOP
v_prd_id :=lv_prd.prd_id;
v_customer_id :=0 ;
OPEN c2;
LOOP
FETCH c2 BULK COLLECT INTO lv_customer_arr,
lv_order_date_arr, lv_prd_name_arr
LIMIT 100;
FOR i in 1 .. lv_customer_arr.count
LOOP
IF v_customer_id <> lv_customer_arr(i) THEN
v_customer_id := lv_customer_arr(i);
v_order_date := lv_order_date_arr(i);
v_prd_name := lv_prd_name_arr(i);
v_flag := 0;
v_swiprd_id := 0;

OPEN c3;
FETCH c3 INTO v_order_date_tmp, v_quantity ;
IF c3%NOTFOUND
THEN
OPEN c4;
FETCH c4 INTO v_swiprd_id;
IF c4%NOTFOUND
THEN -- bought another product
v_flag :=4;
ELSE
v_flag :=3;
END IF;
CLOSE c4;
ELSE -- bought same product
IF (v_order_date_tmp + 30*v_quantity)
>=v_order_date
THEN
v_flag :=1 ;
ELSE
v_flag :=2;
END IF;
END IF;
CLOSE c3;
-- UPDATE flag and swiprd_id in table T
IF v_flag =3
THEN
UPDATE T
SET flag = v_flag,
swiprd_id = v_swiprd_id
WHERE CURRENT OF c2;
ELSE
UPDATE T
SET flag = v_flag
WHERE CURRENT OF c2;
END IF;
END IF;
END LOOP;

EXIT WHEN c2%notfound;
END LOOP;
CLOSE c2;
END LOOP;

EXCEPTION
WHEN ... THEN ....;
WHEN OTHERS THEN
dbms_output.put_line(SubStr('Error '||TO_CHAR(SQLCODE)||': '||SQLERRM, 1,
255));
RAISE;
END;


After processing, I can run some queries to get the statistics information I need.


This solution is obviously very slow.
(For each record, 1~2 queries are invoked, so in total about 5~7 million queries
are called.)
It will take about a week to finish the whole process.
If I divide the prd_ids into 6 different ranges and run 6 procedures with
different prd_id ranges at the same time, it will take about 2 days. But that is
still too long.

I did go through some threads on your site. You suggested that:

CREATE TABLE new_table as select <do the update "here"> from old_table;
............


but it is very hard to achieve <do the update "here"> here.


My questions are:

1) Is there any better way to achieve this?


2) If this is the way to go, how can I improve the performance?


Thanks,


GOTO a page to Bookmark Review | Bottom | Top
As you requested August 20, 2005
Reviewer: sami

Dear Tom,

As you requested, I put in the sequence number (loop #).
I am able to reproduce this consistently on both 8i Solaris and 10g Windows.

Look at the TADDR difference between
SEQUENCE TADDR UNDO_UBLK TIME
---------- -------------------- ---------- ------------------
18714 4C656458 27 20-AUG-05 09:26:51
18715 4C65695C 2 20-AUG-05 09:26:51


Session#1
==========
SQL> truncate table t1;

Table truncated.

SQL> select index_name from dba_indexes where table_name='T1';

INDEX_NAME
------------------------------
T1_PK

SQL> desc t1
Name Null? Type
----------------------------------------- -------- ----------------------------

OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NOT NULL NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)

SQL> select sid from v$mystat where rownum=1;

SID
----------
122


----*** STARTED MONITORING THIS SESSION FROM HERE ***----

SQL> insert /*+ APPEND */ into t1 select * from dba_objects;

48526 rows created.

SQL>
----*** STOPPED MONITORING THIS SESSION HERE ***----
##########################################################


Session#2
===================
SQL> drop table undo_monitor;

Table dropped.

SQL> create global temporary table undo_monitor(sequence number, taddr varchar2(20), undo_ublk number, time date)
  2  ;

Table created.

SQL> alter session set nls_date_format='DD-MON-YY HH24:MI:SS';

Session altered.

SQL> declare
2 i number :=0;
3 begin
4 loop
5 for c0rec in (select taddr,used_ublk from v$session s,v$transaction t
6 where s.taddr=t.addr
7 and s.sid=&session_to_be_monitored)
8 loop
9 insert into undo_monitor values(i,c0rec.taddr,c0rec.used_ublk,sysdate);
10 end loop;
11 i := i+1;
12 if i=100000 then
13 exit;
14 end if;
15 end loop;
16 end;
17 /
Enter value for session_to_be_monitored: 122
old 7: and s.sid=&session_to_be_monitored)
new 7: and s.sid=122)

PL/SQL procedure successfully completed.

SQL> spool c:/temp/1.log
SQL> set lines 120 pages 300
SQL> select min(undo_ublk),max(undo_ublk) from undo_monitor;

MIN(UNDO_UBLK) MAX(UNDO_UBLK)
-------------- --------------
1 52

SQL>
SQL> select * from undo_monitor where undo_ublk >1 and undo_ublk <52 order by se
quence;
SEQUENCE TADDR UNDO_UBLK TIME
---------- -------------------- ---------- ------------------
[truncated for display purpose]

18696 4C656458 27 20-AUG-05 09:26:51
18697 4C656458 27 20-AUG-05 09:26:51
18698 4C656458 27 20-AUG-05 09:26:51
18699 4C656458 27 20-AUG-05 09:26:51
18700 4C656458 27 20-AUG-05 09:26:51
18701 4C656458 27 20-AUG-05 09:26:51
18702 4C656458 27 20-AUG-05 09:26:51
18703 4C656458 27 20-AUG-05 09:26:51
18704 4C656458 27 20-AUG-05 09:26:51
18705 4C656458 27 20-AUG-05 09:26:51
18706 4C656458 27 20-AUG-05 09:26:51
18707 4C656458 27 20-AUG-05 09:26:51
18708 4C656458 27 20-AUG-05 09:26:51
18709 4C656458 27 20-AUG-05 09:26:51
18710 4C656458 27 20-AUG-05 09:26:51
18711 4C656458 27 20-AUG-05 09:26:51
18712 4C656458 27 20-AUG-05 09:26:51
18713 4C656458 27 20-AUG-05 09:26:51
18714 4C656458 27 20-AUG-05 09:26:51
18715 4C65695C 2 20-AUG-05 09:26:51
18716 4C65695C 2 20-AUG-05 09:26:51
18717 4C65695C 2 20-AUG-05 09:26:51
18718 4C65695C 2 20-AUG-05 09:26:51
18719 4C65695C 2 20-AUG-05 09:26:51
18720 4C65695C 2 20-AUG-05 09:26:51
18721 4C65695C 2 20-AUG-05 09:26:51
18722 4C65695C 2 20-AUG-05 09:26:51
18723 4C65695C 2 20-AUG-05 09:26:51
18724 4C65695C 2 20-AUG-05 09:26:51
18725 4C65695C 2 20-AUG-05 09:26:51
18726 4C65695C 2 20-AUG-05 09:26:51
18727 4C65695C 2 20-AUG-05 09:26:51
18728 4C65695C 2 20-AUG-05 09:26:51
18729 4C65695C 2 20-AUG-05 09:26:51
18730 4C65695C 2 20-AUG-05 09:26:51
18731 4C65695C 2 20-AUG-05 09:26:51
18732 4C65695C 2 20-AUG-05 09:26:51
18733 4C65695C 2 20-AUG-05 09:26:51
18734 4C65695C 2 20-AUG-05 09:26:51
18735 4C65695C 2 20-AUG-05 09:26:51
18736 4C65695C 2 20-AUG-05 09:26:51
18737 4C65695C 2 20-AUG-05 09:26:51
18738 4C65695C 2 20-AUG-05 09:26:51
18739 4C65695C 2 20-AUG-05 09:26:51
18740 4C65695C 2 20-AUG-05 09:26:51
18741 4C65695C 2 20-AUG-05 09:26:51
18742 4C65695C 2 20-AUG-05 09:26:51
18743 4C65695C 2 20-AUG-05 09:26:51
18744 4C65695C 2 20-AUG-05 09:26:51
18745 4C65695C 2 20-AUG-05 09:26:51
18746 4C656458 38 20-AUG-05 09:26:51
18747 4C656458 38 20-AUG-05 09:26:51
18748 4C656458 38 20-AUG-05 09:26:51
18749 4C656458 38 20-AUG-05 09:26:51
18750 4C656458 38 20-AUG-05 09:26:51
18751 4C656458 38 20-AUG-05 09:26:51
18752 4C656458 38 20-AUG-05 09:26:51
18753 4C656458 38 20-AUG-05 09:26:51
18754 4C656458 38 20-AUG-05 09:26:51
18755 4C656458 38 20-AUG-05 09:26:51
18756 4C656458 38 20-AUG-05 09:26:51
18757 4C656458 38 20-AUG-05 09:26:51
18758 4C656458 38 20-AUG-05 09:26:51
18759 4C656458 38 20-AUG-05 09:26:51
18760 4C656458 38 20-AUG-05 09:26:51
18761 4C656458 38 20-AUG-05 09:26:51
18762 4C656458 38 20-AUG-05 09:26:51
18763 4C656458 38 20-AUG-05 09:26:51
18764 4C656458 38 20-AUG-05 09:26:51
18765 4C656458 38 20-AUG-05 09:26:51
18766 4C656458 38 20-AUG-05 09:26:51
18767 4C656458 38 20-AUG-05 09:26:51
18768 4C656458 38 20-AUG-05 09:26:51
18769 4C656458 38 20-AUG-05 09:26:51
18770 4C656458 38 20-AUG-05 09:26:51
18771 4C656458 38 20-AUG-05 09:26:51
18772 4C656458 38 20-AUG-05 09:26:51
18773 4C656458 38 20-AUG-05 09:26:51
18774 4C656458 38 20-AUG-05 09:26:51
18775 4C656458 38 20-AUG-05 09:26:51
18776 4C656458 38 20-AUG-05 09:26:51
18777 4C656458 38 20-AUG-05 09:26:51
18778 4C656458 38 20-AUG-05 09:26:51
18779 4C656458 38 20-AUG-05 09:26:51
18780 4C656458 38 20-AUG-05 09:26:51
18781 4C656458 38 20-AUG-05 09:26:51
18782 4C656458 38 20-AUG-05 09:26:51
18783 4C656458 38 20-AUG-05 09:26:51
18784 4C656458 38 20-AUG-05 09:26:51
18785 4C656458 38 20-AUG-05 09:26:51
18786 4C656458 38 20-AUG-05 09:26:51
18787 4C656458 38 20-AUG-05 09:26:51
18788 4C656458 38 20-AUG-05 09:26:51
18789 4C656458 38 20-AUG-05 09:26:51
18790 4C656458 38 20-AUG-05 09:26:51
18791 4C656458 38 20-AUG-05 09:26:51
18792 4C656458 38 20-AUG-05 09:26:51
18793 4C656458 38 20-AUG-05 09:26:51
18794 4C656458 38 20-AUG-05 09:26:51
18795 4C656458 38 20-AUG-05 09:26:51
18796 4C656458 38 20-AUG-05 09:26:51
18797 4C656458 38 20-AUG-05 09:26:51
18798 4C656458 38 20-AUG-05 09:26:51
18799 4C656458 38 20-AUG-05 09:26:51
18800 4C656458 38 20-AUG-05 09:26:51
18801 4C656458 38 20-AUG-05 09:26:51
18802 4C656458 38 20-AUG-05 09:26:51
18803 4C656458 38 20-AUG-05 09:26:51
18804 4C656458 38 20-AUG-05 09:26:51
18805 4C656458 38 20-AUG-05 09:26:51
18806 4C656458 38 20-AUG-05 09:26:51
18807 4C656458 38 20-AUG-05 09:26:51
18808 4C656458 38 20-AUG-05 09:26:51
18809 4C656458 38 20-AUG-05 09:26:51
18810 4C656458 38 20-AUG-05 09:26:51
18811 4C656458 38 20-AUG-05 09:26:51
18812 4C656458 38 20-AUG-05 09:26:51
18813 4C656458 38 20-AUG-05 09:26:51
18814 4C656458 38 20-AUG-05 09:26:51
18815 4C656458 38 20-AUG-05 09:26:51
18816 4C656458 38 20-AUG-05 09:26:51
18817 4C656458 38 20-AUG-05 09:26:51
18818 4C656458 38 20-AUG-05 09:26:51
18819 4C656458 38 20-AUG-05 09:26:51
18820 4C656458 38 20-AUG-05 09:26:51
18821 4C656458 38 20-AUG-05 09:26:51
18822 4C656458 38 20-AUG-05 09:26:51
18823 4C656458 38 20-AUG-05 09:26:51
18824 4C656458 38 20-AUG-05 09:26:51
18825 4C656458 38 20-AUG-05 09:26:51
18826 4C656458 38 20-AUG-05 09:26:51
18827 4C656458 38 20-AUG-05 09:26:51
18828 4C656458 38 20-AUG-05 09:26:51
18829 4C656458 38 20-AUG-05 09:26:51
18830 4C656458 38 20-AUG-05 09:26:51
18831 4C656458 38 20-AUG-05 09:26:51
18832 4C656458 38 20-AUG-05 09:26:51
18833 4C656458 38 20-AUG-05 09:26:51
18834 4C656458 38 20-AUG-05 09:26:51
18835 4C656458 38 20-AUG-05 09:26:51
18836 4C656458 38 20-AUG-05 09:26:51
18837 4C656458 38 20-AUG-05 09:26:51
18838 4C656458 40 20-AUG-05 09:26:51
18839 4C656458 40 20-AUG-05 09:26:51
18840 4C656458 40 20-AUG-05 09:26:51
18841 4C656458 40 20-AUG-05 09:26:51
18842 4C656458 40 20-AUG-05 09:26:51
18843 4C656458 40 20-AUG-05 09:26:51
18844 4C656458 40 20-AUG-05 09:26:51
18845 4C656458 40 20-AUG-05 09:26:51
18846 4C656458 40 20-AUG-05 09:26:51
18847 4C656458 40 20-AUG-05 09:26:51
18848 4C656458 40 20-AUG-05 09:26:51
18849 4C656458 40 20-AUG-05 09:26:51
18850 4C656458 40 20-AUG-05 09:26:51
18851 4C656458 40 20-AUG-05 09:26:51
18852 4C656458 40 20-AUG-05 09:26:51
18853 4C656458 40 20-AUG-05 09:26:51
18854 4C656458 40 20-AUG-05 09:26:51
18855 4C656458 40 20-AUG-05 09:26:51
18856 4C656458 40 20-AUG-05 09:26:51
18857 4C656458 40 20-AUG-05 09:26:51
18858 4C656458 40 20-AUG-05 09:26:51
18859 4C656458 40 20-AUG-05 09:26:51
18860 4C656458 40 20-AUG-05 09:26:51
18861 4C656458 40 20-AUG-05 09:26:51
18862 4C656458 40 20-AUG-05 09:26:51
18863 4C656458 40 20-AUG-05 09:26:51
18864 4C656458 40 20-AUG-05 09:26:51
18865 4C656458 40 20-AUG-05 09:26:51
18866 4C656458 45 20-AUG-05 09:26:51
18867 4C656458 45 20-AUG-05 09:26:51
18868 4C656458 45 20-AUG-05 09:26:51
18869 4C656458 45 20-AUG-05 09:26:51
18870 4C656458 45 20-AUG-05 09:26:51
18871 4C656458 45 20-AUG-05 09:26:51
18872 4C656458 45 20-AUG-05 09:26:51
18873 4C656458 45 20-AUG-05 09:26:51
18874 4C656458 45 20-AUG-05 09:26:51
18875 4C656458 45 20-AUG-05 09:26:51
18876 4C656458 45 20-AUG-05 09:26:51
18877 4C656458 45 20-AUG-05 09:26:51
18878 4C656458 45 20-AUG-05 09:26:51
18879 4C656458 46 20-AUG-05 09:26:51
18880 4C656458 51 20-AUG-05 09:26:51
18881 4C656458 51 20-AUG-05 09:26:51
18882 4C656458 51 20-AUG-05 09:26:51
18883 4C656458 51 20-AUG-05 09:26:51
18884 4C656458 51 20-AUG-05 09:26:51
18885 4C656458 51 20-AUG-05 09:26:51
18886 4C656458 51 20-AUG-05 09:26:51
18887 4C656458 51 20-AUG-05 09:26:51
18888 4C656458 51 20-AUG-05 09:26:51
18889 4C656458 51 20-AUG-05 09:26:51
18890 4C656458 51 20-AUG-05 09:26:51


Followup:
Recursive SQL.

turn on sql trace and see if you are not doing SQL behind your SQL (space
management, quotas, etc)....

That was likely a recursive sql transaction to update the dictionary.
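One simple way to see that recursive SQL, as a sketch (run in the session doing
the load, assuming you can read the resulting trace file):

alter session set timed_statistics = true;
alter session set sql_trace = true;

insert /*+ APPEND */ into t1 select * from dba_objects;

alter session set sql_trace = false;

-- format the trace file with tkprof and look for recursive (SYS) statements,
-- e.g. space management against the data dictionary.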

GOTO a page to Bookmark Review | Bottom | Top
Which is the best way to update? August 22, 2005
Reviewer: A Reader from India

Hi Tom,

We have a fact table with a few dimension tables. Initially, records are
inserted into the fact table; this way most of the columns in the table are
populated with values. For the remaining columns we have to update the table by
joining with other tables.
1) What will be the best way to update? We are currently using correlated
queries to update, but there is no index on the column being joined in the fact
table. The source table has an index if it is a key column.

2) Should we create indexes on the joining columns in the fact table? If there
are many such columns, is it a good idea to create an index on each of them
before updating? Will this improve performance?

3) We are currently analyzing the fact table before the update. Is this the
right approach even though we are not creating indexes on all required columns
before the update?

4) Should indexes be created on the table before the update, and should the
table then be analyzed?

Your inputs will be highly appreciated and very helpful.

Thanks in advance.


Followup:
I like to update joins

update ( select t1.a, t2.b from t1, t2 where t1.fk = t2.pk )
set a = b;

t2 would need to have a primary key in place for this.
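As a small, hypothetical illustration of that pattern - filling a denormalized
column in a fact table from a dimension table whose dim_key is the primary key:

update ( select f.dim_desc, d.descr
           from fact_table f, dim_table d
          where f.dim_key = d.dim_key )
   set dim_desc = descr;

-- Oracle only allows this if dim_table.dim_key is constrained (primary key or
-- unique), i.e. the join is key-preserved.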

GOTO a page to Bookmark Review | Bottom | Top
Which is the best way to update? August 24, 2005
Reviewer: A Reviewer from India

Tom can you please answer the queries posted in the immediately previous post?


Followup:
I did, update the join by primary key. That was my answer. Use the CBO.

GOTO a page to Bookmark Review | Bottom | Top
October 02, 2005
Reviewer: Allan Sun from IL, USA

I found your answers very helpful and wonder how I can submit my question to
you.
Thanks


Followup:
when taking them, there is a button on the home page.

GOTO a page to Bookmark Review | Bottom | Top
Calculate redolog generation October 04, 2005
Reviewer: A reader

Tom,
Is there any way to calculate the amount of redo log that is going to be
generated when some 2 million records are updated?
I have to update 2 million records (I can't recreate a dummy table, etc.). I
need to know how much redo log will be generated and the OS space required for
the undo datafile.


Followup:
not really, not without simulating at least a portion of it on some system and
measuring the amount of redo.

Table/column widths, row chaining/migration, indexes involved - they'll all play
a role.

I frequently do 1,000,000 to 10,000,000 records on my laptop - I would think you
can test this out on a test system someplace?

GOTO a page to Bookmark Review | Bottom | Top
it's very good December 07, 2005
Reviewer: gopal


GOTO a page to Bookmark Review | Bottom | Top
Delete, then Insert or Update? January 06, 2006
Reviewer: A reader

Tom,
Quick question: while trying to modify a row in a table from a procedure, is it
better (performance-wise) to delete the existing row and then insert it with the
new values, or to update the existing row with the new values, and why?

Thanks,


Followup:
update is less work.

you typically update a subset of the columns.

only the columns that change values and have indexes on them will affect the
indexes.

doing a delete+insert would take more resources, hit all indexes.

GOTO a page to Bookmark Review | Bottom | Top
Large update due to consolidation January 25, 2006
Reviewer: David Piazza from San Mateo, CA

We are consolidating five databases where the schemas are the same into one
database. Currently a test system has been created with five schemas that hold
the previous data from the five databases. It isn't a clean update, as all
primary keys are in the same range, so some number x will have to be added to
the primary (pseudo) keys and child keys in each schema to avoid duplicates.
After reading through this topic, I can see two ways of doing the merge and
updating the first schema. The first: CTAS, incrementing each primary and
foreign key by some value x, then either insert /*+ append */ or use the merge
command to insert/merge the data into the first schema. The second way would be
to directly insert /*+ append */ into the first schema from the tables in each
of the five schemas, adding the appropriate value to the primary/foreign keys.
How would you suggest implementing this?


Followup:
I would try to load the "fixed" data.

Eg: if the keys that are in need of fixing are all numbers - I might multiply
each key by 10 and add 1, 2, 3, 4, or 5 to each during the load.
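A minimal sketch of that idea, with hypothetical parent/child tables and
schema 3 getting the suffix 3 (the child's foreign key is transformed the same
way so it still points at its parent):

insert /*+ append */ into target_schema.parent (parent_id, col1)
select parent_id * 10 + 3, col1
  from schema3.parent;

insert /*+ append */ into target_schema.child (child_id, parent_id, col2)
select child_id * 10 + 3, parent_id * 10 + 3, col2
  from schema3.child;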

GOTO a page to Bookmark Review | Bottom | Top
mass insert code package February 13, 2006
Reviewer: Brett Clark from Des Moines, IA USA

For what it's worth, here is a package that can be used as a generic foundation
for mass loads. It will disable triggers and indexes and (optionally) truncate
the main table before running the insert statement that you provide. Hope this
helps someone.

create or replace package table_support is

TRUNCATE_NO constant boolean := FALSE;
TRUNCATE_YES constant boolean := TRUE;

procedure load_table(G_TABLE_NAME Varchar2, G_INSERT_STMT Varchar2,
TRUNC_ACTION Boolean DEFAULT TRUNCATE_NO );

end table_support;
/
create or replace package body table_support is

ACTION_ENABLE constant boolean := TRUE;
ACTION_DISABLE constant boolean := FALSE;

procedure exec( inStmt Varchar2 ) is
begin
EXECUTE IMMEDIATE inStmt;
end exec;

function exec( inStmt Varchar2 ) return Number is
begin
exec( inStmt );
return( SQL%ROWCOUNT );
end exec;

procedure tbl_triggers(G_TABLE_NAME Varchar2, action Boolean) is
begin
if( action = ACTION_ENABLE ) then
exec( 'ALTER TABLE ' || G_TABLE_NAME || ' ENABLE ALL TRIGGERS' );
elsif ( action = ACTION_DISABLE ) then
exec( 'ALTER TABLE ' || G_TABLE_NAME || ' DISABLE ALL TRIGGERS' );
end if;
end tbl_triggers;

procedure tbl_indexes( G_TABLE_NAME Varchar2, action Boolean ) is
begin
FOR C IN (SELECT INDEX_NAME
FROM USER_INDEXES
WHERE TABLE_NAME = G_TABLE_NAME) LOOP
if( action = ACTION_DISABLE ) then
exec( 'alter index ' || C.INDEX_NAME || ' unusable' );
elsif( action = ACTION_ENABLE ) then
exec( 'alter index ' || C.INDEX_NAME || ' rebuild parallel nologging'
);
end if;
END LOOP;

if( action = ACTION_ENABLE ) then
exec( 'alter session set skip_unusable_indexes=false' );
elsif ( action = ACTION_DISABLE ) then
exec( 'alter session set skip_unusable_indexes=true' );
end if;
end tbl_indexes;


procedure load_table(G_TABLE_NAME Varchar2, G_INSERT_STMT Varchar2, TRUNC_ACTION
Boolean DEFAULT TRUNCATE_NO ) is
rowCount Number;
begin
-- DISABLE ALL TRIGGERS
tbl_triggers( G_TABLE_NAME, ACTION_DISABLE );

-- TRUNCATE TABLE
if( TRUNC_ACTION = TRUNCATE_YES ) then
exec( 'truncate table ' || G_TABLE_NAME );
end if;

-- DISABLE INDEXES
tbl_indexes( G_TABLE_NAME, ACTION_DISABLE );

-- ENABLE PARALLEL DML
exec( 'alter session enable parallel dml' );

-- EXECUTE INSERT STATEMENT
rowCount := exec( G_INSERT_STMT );

DBMS_OUTPUT.PUT_LINE('Inserted ' || rowCount || ' records into ' ||
G_TABLE_NAME);

-- ENABLE/REBUILD INDEXES
tbl_indexes( G_TABLE_NAME, ACTION_ENABLE );

-- ENABLE TRIGGERS
tbl_triggers( G_TABLE_NAME, ACTION_ENABLE );

-- TURN OFF PARALLEL DML
exec( 'alter session disable parallel dml' );

end load_table;


begin
-- Initialization
NULL;
end table_support;
/
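A possible invocation of the package above - the table name and the insert
statement are hypothetical:

begin
  table_support.load_table(
      g_table_name  => 'MY_BIG_TABLE',
      g_insert_stmt => 'insert /*+ append */ into my_big_table ' ||
                       'select * from staging_table',
      trunc_action  => table_support.truncate_yes );
end;
/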


GOTO a page to Bookmark Review | Bottom | Top
mass update February 17, 2006
Reviewer: Hans from Canada

I have a table with 50 columns and 10 million records, and I have a file with 10
million rows. I have to update 5 columns in the table from the file (the update
is matched on a common ID value).
Steps I am performing:
1. open the file using utl_file
2. read each record
3. perform the update on the table
The process is very slow.
I would appreciate better ideas from the group.


Followup:
external table (so you can query the flat file), or sqlldr it into a temporary
table.

and then merge.

If you are in 9i, you'll have to:

merge into db_table
using (select * from external_table where id in (select id from db_table))
on (db_table.id = external_table.id)
when matched then update ...
when not matched then insert (id) values (null); <<<=== never happens


in 10g, you can just

merge into db_table
using external_table
on (db_table.id = external_table.id)
when matched then update ...;
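For completeness, a minimal sketch of the external table assumed above (the
directory object, file name and column list are all hypothetical and would have
to match your flat file):

create or replace directory load_dir as '/data/loads';

create table external_table
( id   number,
  col1 varchar2(50),
  col2 varchar2(50)
)
organization external
( type oracle_loader
  default directory load_dir
  access parameters
  ( records delimited by newline
    fields terminated by ','
  )
  location ('updates.dat')
);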



GOTO a page to Bookmark Review | Bottom | Top
February 18, 2006
Reviewer: A reader


GOTO a page to Bookmark Review | Bottom | Top
Morning TOM. February 19, 2006
Reviewer: sudip from BGALORE, india

What is archieve log mode? why we prefer recovery in Archieve log mode? Please,
will u brief it?


Followup:
"U" is not here
I have never met "U"

If you know how to contact them - I'd really appreciate a pointer on doing so.
They get a lot of requests.


If you don't mind, I'll give you a pointer to what archivelog mode is (until we
can find this "U" person)


http://download-east.oracle.com/docs/cd/B19306_01/backup.102/b14192/strategy003.htm#sthref127

:)

GOTO a page to Bookmark Review | Bottom | Top
February 19, 2006
Reviewer: A reader

To sudip from BGALORE, india

Looks like you are new to this site. Just keep one thing in mind: Tom really
gets ticked off by the "U", so don't use it next time :-)


Followup:
Hey, the page he used to post that actually said "if you do this, I will do what
I just did..."

Which is to poke fun at it :)

It amazes me more than it ticks me off.

Just like in some parts of the world, an "English Accent" adds 10 IQ points
(when I hear a British accent, for whatever reason - I add default IQ points,
maybe it was all of the James Bond movies when I was a kid) - using this instant
message speak takes them away.

GOTO a page to Bookmark Review | Bottom | Top
Large Update Due To Consolidation February 21, 2006
Reviewer: David Piazza from San Mateo, CA

Tom,

Regarding my previous question: they've decided to merge one of the five
clusters at a time. I've got scripts to implement the first cluster as follows:

CTAS NOLOGGING setting the keys to old_key*10+1
insert /*+ append */
DROP old table
RENAME new table
GRANTs
ALTER TABLE add Primary Key
ALTER TABLE add Foreign Key
CREATE INDEX NOLOGGING parallel(degree 4)
ALTER TABLE ADD CONSTRAINT UNIQUE ENABLE
ALTER TABLE Logging

This seems pretty fast. Anything you see that will make it faster?

When I merge in the next clusters, should I:
- Do the same steps as above(*10+2,*10+3 etc.)
- Disable primary keys, make indexes unusable and
Insert /*+ append, and then enable PK's, and indexes
- Drop primary keys, indexes, and constraints, and
INSERT /*+ append, create PK's, Indexes, and constraints

The number of rows in the tables(Approximately the same in each cluster) are:
72338677, 32278464, 25347185, 23834984, 7595305, 4519904, 4371347, 1229877,
1168373, 6669752, 6382030, 97250, 4694, 934, 166

-Thanks


Followup:
why not all of them at once to avoid having to do the disable/enable over and
over.

GOTO a page to Bookmark Review | Bottom | Top
Large Update Due To Consolidation February 22, 2006
Reviewer: David Piazza from San Mateo, CA

That was my suggestion, but the customer wants to do one cluster at a time,
especially the first one, ensuring that it works as before and all the users are
happy before merging another cluster in...


Followup:
then you got it.

You don't really need to disable all constraints, and if you could get it done
in two bits - that would be the least work (load cluster 1, then 2, 3, 4 and 5
after testing 1)

GOTO a page to Bookmark Review | Bottom | Top
Large Update Due To Consolidation February 23, 2006
Reviewer: David Piazza from San Mateo, CA

Ok. Which of the three methods would you suggest to merge in the remaining
clusters, or would you recommend something else?

1) Do the same steps as above, CTAS/DROP old table/Rename
new table to old table/Insert /*+ append(*10+3,*10+4
etc.)/Create constraints,indexes.
OR
2) Disable primary keys, make indexes unusable and
Insert /*+ append, and then enable PK's, and indexes
OR
3) Drop primary keys, indexes, and constraints, and
INSERT /*+ append, create PK's, Indexes, and constraints
OR
4) Something else

I'm leaning towards 2) or 3)

-Thanks


Followup:
I would just be disabling indexes (if I wanted to - insert append does a rather
efficient "merge" of data) and insert appending the other data. I would not
redo the CTAS over.

GOTO a page to Bookmark Review | Bottom | Top
Bulk Insert / Update in single SQL February 24, 2006
Reviewer: Narendra from India

Tom,

I have a target table that stores current as well as history records. Current
records are identified by a field CURRENT_REC having value of 'Y'. History
records have a value of CURRENT_REC of 'N'. At any stage, for a PART_NO, there
can be only a SINGLE record with CURRENT_REC of 'Y' in the target table.
My problem is I want to transfer data (in bulk) from source table to target
table with CURRENT_REC flag as 'Y' and simultaneously update existing records
having CURRENT_REC as 'Y'. Following is table definition:
Source table:
CREATE TABLE PART_SRC
(
PART_NO NUMBER,
PART_NAME VARCHAR2(100),
BALANCE_QTY NUMBER(5),
CREATED_DATE DATE,
CURRENT_REC VARCHAR2(1)
)
target Table:
CREATE TABLE PART_TABLE
(
PART_NO NUMBER,
PART_NAME VARCHAR2(100),
BALANCE_QTY NUMBER(5),
CREATED_DATE DATE,
CURRENT_REC VARCHAR2(1)
)

Is this possible in a single SQL ?


Followup:
I don't get it.

I want to transfer data (in bulk) from source table to target table with
current_rec flag as 'Y'

I understand that.

the "simultaneously update existing records have a current-rec as 'Y'"

which existing records, in which table, and update what exactly (since we are
already looking only at 'Y' records - what exactly are we updating??!?!)

GOTO a page to Bookmark Review | Bottom | Top
Bulk Insert / Update in single SQL February 24, 2006
Reviewer: Michel Cadot from France

If i clearly understand you, you want something like (inspired by "merge delete"
or "insert all" statements):

merge into part_table t
using part_src s
on (t.part_no = s.part_no and t.current_rec='Y')
when matched then
update t.current_rec = 'N'
insert (t.part_no, ...) values (s.part_no, ...)
when not matched then
insert (t.part_no, ...) values (s.part_no, ...)
/

or

merge ALL into part_table t
using part_src s
on (t.part_no = s.part_no and t.current_rec='Y')
when matched then update t.current_rec = 'N'
when TRUE then
insert (t.part_no, ...) values (s.part_no, ...)
/

But this does not exist. ;)

Michel


GOTO a page to Bookmark Review | Bottom | Top
More details February 24, 2006
Reviewer: Narendra from India

Tom,

One correction to table definitions.
Source table does not contain CURRENT_REC field.
What I am trying to achieve is transfer data from source to target. Each time
data is appended to target table. I may transfer updated details of PART_NO
multiple times from source to target. However, I do not want data in target
table to be overwritten. In order to identify the "last loaded record", I am
maintaining CURRENT_REC field in target table. If a PART_NO (from source table)
does not exist in target table, it should simply add that PART_NO to target
table with value of CURRENT_REC = 'Y'. However, if PART_NO already exists i.e.
data has been transferred earlier from source to target for that PART_NO, target
table should already have a single record with CURRENT_REC as 'Y' and may have
multiple records with CURRENT_REC as 'N' for same PART_NO. Now, when I transfer
data for same PART_NO again from source to target, it should be added with
CURRENT_REC as 'Y'. However, at the same time, existing record for same PART_NO,
having CURRENT_REC as 'Y' should be updated with 'N'.
Hope I am clear.

Michel,
MERGE will not help since CURRENT_REC will not be there in my source table.

Thanks


GOTO a page to Bookmark Review | Bottom | Top
More details February 25, 2006
Reviewer: Michel Cadot from France

Narendra,

I understood this and the pseudo-merges don't use the current_rec of source (s)
but only of target (t). It would be clearer if I wrote "insert (t.part_no, ...,
t.current_rec) values (s.part_no, ..., 'YES')".

There is no way to do it in a single SQL statement but it is easy to do it in an
efficient PL/SQL block.

Regards
Michel
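A minimal sketch of such a block, using the column names from the posting above
(two statements, one transaction: flag the existing current rows as history,
then append the new rows as current):

begin
  update part_table t
     set t.current_rec = 'N'
   where t.current_rec = 'Y'
     and t.part_no in (select s.part_no from part_src s);

  insert into part_table
        (part_no, part_name, balance_qty, created_date, current_rec)
  select s.part_no, s.part_name, s.balance_qty, s.created_date, 'Y'
    from part_src s;

  commit;
end;
/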


GOTO a page to Bookmark Review | Bottom | Top
Merging ten large tables March 17, 2006
Reviewer: Jesse Buzzard from VA - USA

Good Evening,

This is my first foray into very large databases and I need some advice.

We have ten tables that contain blob data. The total size of the tables is
approximately 1.8TB. They are stored in the same tablespace. How this came to
be is a long story for another discussion thread. Our task is to merge the
tables into one table. After selecting the table with the largest number of
rows as the target, we came up with the code below to insert the data from
another table into it.
-------- start of code -------
#: Announce start insert

echo "-------------------------------------------------------------"
echo "`date '+ %a %e-%b-%Y %X'` - $0 - started" >> $1
echo "-------------------------------------------------------------"

$ORACLE_HOME/bin/sqlplus /nolog << EOF
connect odirms/odirms

insert /* +APPEND PARALLEL (nto, 8) */ into odirms.t_ompf4 nto
select /* +APPEND PARALLEL (tompf1, 8) */ * from t_ompf1 tompf1;
commit;
exit

EOF

#: End of program
echo "`date '+ %a %e-%b-%Y %X'` - $0 - completed" >> $1
echo "-------------------------------------------------------------"
echo
------------ end of code ----------

At present we are trying to merge 4 of the tables. All logging and indexes are
off. Multi-threading is on. From all I have read, we should have completed this
in a relatively short amount of time. This job has been running for over 28
hours. I can see the space used is growing (about 300MB/minute), but my
supervisors would like to know when we will have all the tables merged.
Where have we gone astray?

Thank You,
Jesse


Followup:
well, you said "all indexes are off" and "we have blobs" - the two concepts do
not go together. every blob has an index associated with it that cannot
possibly be disabled for any reason (short of dropping the blob column!)

So, you do have logging going on, you do have index maintenance going on.

but you don't give any information as to the size of the tables being read (and
hence written into the new table).

I don't think anything has "gone astray" here, I can say that your "hint" isn't
a hint, it is just text you typed in:

ops$tkyte@ORA10GR2> create table t as select * from dual;

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert /* +APPEND */ into t select * from dual;

1 row created.

ops$tkyte@ORA10GR2> select * from t;

D
-
X
X

ops$tkyte@ORA10GR2> insert /*+ APPEND */ into t select * from dual;

1 row created.

ops$tkyte@ORA10GR2> select * from t;
select * from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

See how the /* + didn't really do anything, but /*+ does. If that is your real
script, you likely did not achieve your goal of either append nor parallel
(parallel implies append by the way, if it was in fact using parallel)


GOTO a page to Bookmark Review | Bottom | Top
Merging ten large tables March 20, 2006
Reviewer: Jesse Buzzard from VA - USA

Thank you for your response.
If I may I would like to impose on your wisdom again. We are averaging 1800MB
per hour. Yes, the sample I provied is one of our real scripts, and yes the goal
was work in parallel. At present five of the tables (56% of the rows) have been
merged. The total size of the data, is 1.8 TB in 32,518,931 rows. As you can
see from the description below the amount of data in addition to the blob data
is rather small. If we have gotten the hint correct would we have achieve a
better insert/append rate?

Name Null? Type
----------------------------------------- -------- -----------

OMPF_ID NOT NULL NUMBER
OMPF_UID NUMBER(10)
OMPF_IMG_TYPE_ID CHAR(1)
OMPF_IMG_STYPE_ID CHAR(1)
OMPF_PAGE_NUM NUMBER(5)
OMPF_ACTIVE_FLAG CHAR(1)
OMPF_ADDED_DT DATE
OMPF_INACTIVE_DT DATE
OMPF_IMAGE BLOB
OMPF_LOCK_FLAG NUMBER(5)
OMPF_USER_ID CHAR(8)



Followup:
you might have, you might not have. you sort of have to figure out where you
are spending your time (tracing - a level 12 10046 trace on a bit of this would
go a long way)
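For example, a sketch of getting that trace from the loading session (the trace
file is then formatted with tkprof):

alter session set timed_statistics = true;
alter session set events '10046 trace name context forever, level 12';

-- run a representative slice of the insert /*+ append */ ... select here

alter session set events '10046 trace name context off';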

GOTO a page to Bookmark Review | Bottom | Top
Alternative Suggestion March 21, 2006
Reviewer: VK from Atlanta

Here is an approach based on exchange partitions I used successfully in the
past for combining tables (This will work only if the data in the individual
tables is distinct and there is no overlap - A scenario in which this will work
well is when people in their infinite wisdom create horizontally "partitioned"
tables. An example may be where all customers with ID 1 to one million are kept
in CUST1 table and one million to 5 million in CUST2 table and so on..):

4 tables to be combined:

create table test(a number,b blob);
create table test1(a number,b blob);
create table test2(a number,b blob);
create table test3(a number,b blob);

Now, let's say table test has values 1 to 10,000 in column a and test1 has
10,001 to 20,000, test2 has 20,001 to 30,000 and test3 has 30,001 to 40,000.

Create a partitioned table that looks exactly like the tables being combined -
pick a partition key where the values do not overlap between tables.

create table testpart(a number,b blob)
partition by range(a)
(
partition p1 values less than (10001),
partition p2 values less than (20001),
partition p3 values less than (30001),
partition p4 values less than (maxvalue)
);

You can combine the 4 tables into the partitioned table using exchange
partition:

alter table testpart exchange partition p1 with table test;
alter table testpart exchange partition p2 with table test1;
alter table testpart exchange partition p3 with table test2;
alter table testpart exchange partition p4 with table test3;

At this point, you can use the partitioned table or if you don't want the table
to be partitioned, you can merge all 4 partitions into a single partition and
exchange with a non-partitioned table.

alter table testpart merge partitions p1,p2 into partition p5;
alter table testpart merge partitions p5,p3 into partition p6;
alter table testpart merge partitions p6,p4 into partition p7;

Now we have just one partition (P7) in the table and you can exchange this with
any of the original tables (we'll use test).

alter table testpart exchange partition p7 with table test;

Now we have all 40,000 rows in table test. As you can see, it is a lot of work
to combine 4 small tables but if you have tens/hundreds of millions of rows,
this will be much faster.


GOTO a page to Bookmark Review | Bottom | Top
Merging ten large tables March 21, 2006
Reviewer: Jesse Buzzard from VA - USA

Tom,
VK,

Thank you for your suggestions. They are educational and helpful. The merges
have nearly completed. Now I am off to read about reorganizing the tablespace
to reclaim some of the space formerly used by the smaller tables.

GOTO a page to Bookmark Review | Bottom | Top
Very slow insert into table April 17, 2006
Reviewer: bayaraat from Mongolia

Hi Tom

I need your expertise in this regard.
My table is
create table CDR00
(
CALLDATETIME DATE,
TELNO VARCHAR2(20),
CALLEDNO VARCHAR2(20),
SECONDS NUMBER,
PULSE NUMBER,
PREFIXCODEA CHAR(14),
PREFIXCODEB CHAR(14),
RATECODE CHAR(28),
TOG_AMOUNT NUMBER(20,2),
USD_AMOUNT NUMBER(20,8),
NO_TOG_AMOUNT NUMBER(20,8),
NO_USD_AMOUNT NUMBER(20,2),
TOG_TAX NUMBER(20,8),
USD_TAX NUMBER(20,8),
FNAME VARCHAR2(12),
OPERATOR VARCHAR2(10),
RESERVED VARCHAR2(10),
PRINTED NUMBER(1)
)
tablespace CDR01
pctfree 20
pctused 80
initrans 1
maxtrans 255
storage
(
initial 1M
minextents 1
maxextents unlimited
);
Every day we insert about 2 million records into this table and then truncate
it; these millions of records are then inserted into a partitioned table.
The insert process is very slow - the response time is around 12:00:00.
I want to insert many records quickly. I am using RAID 5.
There are no constraints or triggers. The archive log process runs non-stop,
because the Oracle database server feeds a standby database server. Any
suggestions please...

Followup:
if done correctly, it would only take minutes or less to bulk load 2 million
records.

So, you need to describe HOW you are loading this.

and if you are loading and truncating, it is doubtful that you are updating - so
pctfree=0 would be more appropriate - and perhaps even "compress", but it depends
on how you load after all - which we don't know.
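As one possible pattern (a sketch only - it depends entirely on how the 2
million rows arrive): expose the daily file as an external table, or SQL*Load
it into a staging table, and then direct-path insert in a single statement
rather than row by row. With a standby in place, redo will still be generated
if force logging is set, as discussed earlier in this thread.

insert /*+ append */ into cdr00
select * from cdr00_ext;   -- cdr00_ext: a hypothetical external table over the daily file

commit;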
