Okay math whizzes

Public Out of Character Board.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Okay math whizzes

Unread post by Ashenfury »

I have 5 'redo' logs. When these logs are sized at 250mb they 'switch' every hour during normal load and every minute during maintenance loads. I need them to switch within a range of 15-30 minutes. It takes roughly 5 minutes to 'switch' a log. During this time the original log is writing to 'archive' while a new one is opened. I cannot have more than 3 logs open at a time meaning that the log switch should never be below 15 minutes even during peak loads.

I just increased the size of my logs to 2048mb so that I can make it through the night without high system wait and log file syncs killing critical processes. Tomorrow morning I will have a second metric of how fast the logs switch during low and peak times at a size of 2048 if that helps at all.

Please note that the size of the log and the amount of time that it takes to write them are relative constants. I'm not sure that I'm using the right term but basically if it takes 5 minutes to switch a 250mb log then it will take 10 minutes to switch a 500mb log. So simply doing the match of 250mb x 15minutes during peak load = 3750mb. With a file size of 3750 it will take roughly 75 minutes to write that file meaning that all 4 redo logs go into active mode and the database starts dropping connections.

I remember saying to myself, I'll use math everyday in life but I'll never use Algebra. Boy was I wrong...
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Okay math whizzes

Unread post by Ashenfury »

I forgot to mention that I can have up to 10 redo logs and as few as 3. I prefer to keep it less than or equal to 8.
User avatar
Therean
Posts: 113

Re: Okay math whizzes

Unread post by Therean »

Maths is bad.
Grainger
Posts: 728

Re: Okay math whizzes

Unread post by Grainger »

I don't think its possible given your parameters.

If you're filling 250mb per minute during load times and can have three logs files open then you need to switch every three minutes but it takes 5 mins to switch, i.e. doesn't work.

If the switch time increases linearly with the log size the math doesn't change, its just not doable.

Now, if you speed up the write times by switching to faster SSD drives and get the write time down to under 3 mins then you are good to go. Or, if the write time increases non-linearly with the size of the file.

This ignores the duration of the maintenance window that causes the high load. If that information is known then you only have to span that window with three log files and could size the files to cover that, i.e. 250*(duration_in_minutes)/3. Of course, assuming the final write time can be accomplished before the first regular load log fills up.

I think speeding up the write time during the high load is the answer though - maybe even have the redo logs saving to a ramdisk and then write it out later when the load is lower.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Okay math whizzes

Unread post by Ashenfury »

Customer will not purchase hardware upgrades.

It fills up the entire RAM and SWAP space during the maintenance window (backups, statistics) which is 24g all together (12physical 12 swap). When I run an active session history I see that CPU+Wait and Log File Sync are my top waits. So I'm backing down on the parallel, rescheduling backups to happen at a different time than statistics and, something with memory. I don't want to shrink the SGA but I think I'll have to in order to stop using so much swap. This is a 64bit server/db so no need for ram filesystems.

Hopefully moving both heavy I/O operations to different maintenance windows after hours will balance the I/O enough to allow the numbers to relax enough that I can tune these redos to switch every 15 minutes.

Edit: I'm also upping my redo log count to 6 but that resolves nothing other than log contention.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Okay math whizzes

Unread post by Ashenfury »

Huzzah! Your mention of ramfs made me consider my database cache size. Just noticed that it's set at 64!!! I'll be bumping this up to about 500 in hopes that it greatly reduces the I/O crunch required for maintenance. Also your idea of putting the redo logs on a regular FS and the archivelogs (the part that takes forever to write) on a RAMFS is a DAMN good idea if I could put just a little bit more RAM on the server. /cry
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Okay math whizzes

Unread post by Ashenfury »

Okaaaaaay. I also found out that my redo and backups were on the same filesystem for some reason. That certainly puts a crunch on I/O LOL!!! Okay so now that redo logs are on their own filesystem the write speeds have improved.

Also I noticed that I was commiting at or greater than 1 times per second from my many many importation streams which was also causing redo contention. I've changed my default commit rate during importation from 40 to 100.

I do NOT want to have to work on this over the weekend damnit.

As an aside, ADDMRPT.sql is the best Oracle tool in the world.
Post Reply