Been a while since you've had: AN ORACLE THREAD!

Public Out of Character Board.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Been a while since you've had: AN ORACLE THREAD!

Unread post by Ashenfury »

In the office we have an 11.2 Oracle Enterprise Database that is accessible via WAN. We have another server at our customer location that collects data and stores it in flat files. The remote server then connects to the database over the WAN and imports the data residing in the flat file into the database. The import requires in-line selects for validation before importation of each record.

The end users called and complained that importation cannot keep up with real-time data collection. I ran AWR, ASH, and ADDM reports, and they all show the same finding: "SQL*Net more data from client" is my top wait event at 63%.

After a bit of research I began to delve into SDU, MTU, and array size. Array size, as far as I can tell, can only be set within the session, so I cannot alter it because the import engines are proprietary (and written in C).

So I need to know how to calculate the best SDU and MTU sizes for my connectivity. As a note, the network on both ends is very speedy both up and down (15 Mbps up / 15 Mbps down on both sides).

The ultimate question is: How do I resolve SQL*Net more data from client in a WAN environment?
User avatar
Greebo
Member
Posts: 5896
Location: Far Southern Canuckistan
Contact:

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Greebo »

can you change the architecture and get the loader running on something closer to home? is the connection being shoved through a vpn that might make clever assumptions? can you use a stub db and connect using advanced compression? MTU of 1500 should be fine. jumbo frames are only an issue at 100Mb+
Grisbault, Twice-Made.
The p, s, l, and t are silent, the screams are not.
User avatar
Greebo
Member
Posts: 5896
Location: Far Southern Canuckistan
Contact:

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Greebo »

are you doing this slow-by-slow? batch up the validation requests perhaps. I mean, even if you need to hand roll something and use execute immediate on twenty or fifty rows it'll still be miles faster. or is that part of the crappy 3rd party c code?
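To illustrate the batching idea in a language-neutral way, here's a hypothetical Python sketch (the `ss7_records` table and `call_id` column are made up, not from the actual schema) of turning per-record validation SELECTs into one IN-list query per chunk, so each WAN round trip validates many rows instead of one:

```python
# Hypothetical sketch of batched validation: instead of one SELECT per
# record, group pending records into chunks and issue one IN-list query
# per chunk. Table/column names here are invented for illustration.

def chunk(records, size=50):
    """Split a list of records into fixed-size batches."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def validation_query(batch):
    """Build one IN-list SELECT covering a whole batch of call IDs,
    using numbered bind placeholders (:1, :2, ...)."""
    placeholders = ", ".join(f":{i + 1}" for i in range(len(batch)))
    return f"SELECT call_id FROM ss7_records WHERE call_id IN ({placeholders})"

pending = list(range(1, 121))       # 120 fake call IDs
batches = chunk(pending, 50)
print(len(batches))                 # 120 rows -> 3 round trips instead of 120
print(validation_query(batches[-1]))
```

Same round trips whether it's hand-rolled PL/SQL or OCI array binds — the point is amortizing one 50 ms round trip across 50 rows instead of paying it per row.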
Grisbault, Twice-Made.
The p, s, l, and t are silent, the screams are not.
User avatar
Greebo
Member
Posts: 5896
Location: Far Southern Canuckistan
Contact:

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Greebo »

if you have the flat file and it isn't some god-forsaken binary crap you should be able to improve the insert phase.
Grisbault, Twice-Made.
The p, s, l, and t are silent, the screams are not.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Ashenfury »

can you change the architecture and get the loader running on something closer to home?
Negative. This is our beta test for having a centralized server at our office and selling "collector" nodes to our end users. When we transfer the flat ASCII files to our local server and import them, they complete in 5 minutes. Our ASCII files are broken into hourly files, though, so this would mean we're an hour behind at all times. That would make our product much less attractive, and some features like call bypass would have to be eliminated, so it's not an option.
is the connection being shoved through a vpn that might make clever assumptions?
Negative. The database server is ass open to the world.
can you use a stub db and connect using advanced compression?
Negative. I must reference the actual data within the database. Basically we have 7 data types and we create an eighth record from correlating those objects and then we import the newly generated one if all of the correlation flags line up. We are already using advanced compression.
MTU of 1500 should be fine. jumbo packets are only an issue with 100Mb+
MTU is a new term for me; I just started learning about it today. When you say MTU, do you mean a parameter set on the OS or in the database? Should my MTU perfectly match my SDU?
are you doing this slow-by-slow? batch up the validation requests perhaps. I mean, even if you need to hand roll something and use execute immediate on twenty or fifty rows it'll still be miles faster. or is that part of the crappy 3rd party c code?
It's definitely part of the 3rd-party code. The developer wrote his own import engine using OCI, so while it still meets Oracle best practices, it is a complete turd when it comes to what we call "correlation/usage". He did write an array-size flag into our basic loaders that lets us fit more rows into each packet, but he did not write that flag into the correlation/usage loaders. I sent him an e-mail today to ask why.
if you have the flat file and it isn't some god-forsaken binary crap you should be able to improve the insert phase.
I can insert the flat files in milliseconds. Even when they have thousands of records in them the insert is very fast. The issue comes with this shitty C code that queries the database, sorts those results, verifies them, and THEN imports. My waits are all from this back-and-forth transfer of data.

----

At this point I set my SDU to 32k per Oracle best practices for WAN connections. It feels wrong, though, because that would mean I'm sending half-empty, or even worse, mostly empty packets back and forth.
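For reference, this is where a 32k SDU usually gets set on 11.2 — a sketch with placeholder host and service names. As far as I know, SDU is only an upper bound on Oracle Net's buffer: it flushes whatever data is ready rather than padding to the full size, so a large SDU shouldn't actually produce half-empty packets on the wire.

```
# sqlnet.ora (client and server) - global default, 32767 is the 11.2 max
DEFAULT_SDU_SIZE=32767

# tnsnames.ora - per-connection override
# (ORCL, db.example.com, and the service name are placeholders)
ORCL =
  (DESCRIPTION =
    (SDU = 32767)
    (ADDRESS = (PROTOCOL = TCP)(HOST = db.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )
```

Both ends have to agree — the session negotiates down to the smaller of the client and server values.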
Grainger
Posts: 728

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Grainger »

If you're going over the Internet your traffic is going to get split into ATM cells at some point, so I wouldn't worry about packet size. The fact that you care about packets is a sign that the process isn't written for the Internet in any way, shape or form.

If you're doing a lot of back-and-forth traffic because the code is crap, then lag is going to be your biggest problem by far. (I have a crappy old application that's similar; it does a remote call for everything, and it's hideous on anything other than local Ethernet.)

Even if your latency is good - say 50 ms - that's still a tenth of a second for every round trip (50 ms out, 50 ms back for the ACK). It's completely unacceptable.

The only way to fix it is to batch the transfer so the latency becomes a small part of the transfer time.
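The arithmetic above can be sketched out. A quick back-of-envelope (record count is illustrative; only the RTTs come from the thread):

```python
# Time spent purely on network round trips, ignoring bandwidth entirely:
# one round trip per batch of validation queries.

def transfer_time_s(records, rtt_ms, batch_size=1):
    """Seconds of pure round-trip time for `records` rows at `rtt_ms` RTT,
    validated `batch_size` rows per round trip."""
    round_trips = -(-records // batch_size)   # ceiling division
    return round_trips * rtt_ms / 1000.0

records = 3600                      # say, one hour of data at 1 record/second
print(transfer_time_s(records, 50))        # row-by-row at 50 ms RTT -> 180 s
print(transfer_time_s(records, 50, 50))    # batches of 50 -> 3.6 s
print(transfer_time_s(records, 0.2))       # row-by-row on a 0.2 ms LAN -> 0.72 s
```

Which is why the same code flies on the LAN and dies on the WAN: the per-row round trip is 250x more expensive, and batching is the only knob that divides it back down.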
User avatar
Greebo
Member
Posts: 5896
Location: Far Southern Canuckistan
Contact:

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Greebo »

if you can insert one hour's worth of work in 5 minutes, it means you can do one minute's worth in 5 seconds. so schedule once-a-minute updates.
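A minimal sketch of that schedule (the paths, user, and script name are hypothetical) — cron's one-minute granularity matches exactly, and flock keeps a slow run from stacking on top of the next:

```
# /etc/cron.d/collector-import  (all names here are placeholders)
# Run the loader every minute; flock -n skips a run if the previous
# one is still holding the lock, so imports never overlap.
* * * * * oracle /usr/bin/flock -n /tmp/import.lock /opt/collector/import_latest.sh
```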

if *that* is impossible then you need to sit down the bright spark who thought it was a good idea to bet your new strategy completely on some shitty ass-contractor who has no vested interest in your outcome.

what I meant by a stub DB was a small install on the remote machine that only held enough data to validate the local insert - do you have control over the schema to allow decent partitioning? can you bring any pressure to bear on the third-party contractor? can you stab the person in the throat who signed the contract with this person without any meaningful code performance requirements?

if you are completely hamstrung and the *only* parameters you have control over are the network configuration options then there isn't a whole lot that can be done.

MTU is the size, in bytes, of your Ethernet frames. Increasing it reduces the number of round trips your data needs if your average payload is large; it buys you nothing if your average payload is small. it is purely an Ethernet parameter, however, so the WAN is going to chop the Ethernet frames up anyway - you're kinda SoL there.

happy to discuss this over the phone if you want. my reception is complete ass so I'd need to call you using Google Voice. pm me
Grisbault, Twice-Made.
The p, s, l, and t are silent, the screams are not.
User avatar
Greebo
Member
Posts: 5896
Location: Far Southern Canuckistan
Contact:

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Greebo »

Or, you know, Mumble.
Grisbault, Twice-Made.
The p, s, l, and t are silent, the screams are not.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Ashenfury »

Thanks, guys! I really appreciate the suggestions. I can't do a stub DB or even external tables because it has to reference data that came from another live import stream for validation. Basically we have what we call SS7 (Signaling System 7) data, with 5 records per call. We import those records, and then this usage process that isn't keeping up makes sure that all 5 of those records are present and for the same call. It then generates a new record (we call it an ISR) from those 5 individual records and imports that into the database. So the data used for validation is just as real-time as the ISR we import after validation.

This definitely just comes down to network latency, because it works perfectly in a LAN environment with a 0.2 ms ping and can't keep up in a WAN environment with a 50 ms ping. I can get about 20-30 ms pings to my personal network, so I set up a mock import over the WAN from there to an internal test database at work. The import sped up by almost exactly the ratio of the two latencies.

So even if I tune SQL*Net to be smooth as silk, it will still be too slow for this to work. Our latency either needs to be dramatically reduced or the binary rewritten not to suck. I'm still trying to explore alternatives.

EDIT: The ss7 table used for validation is usually anywhere from 0.2TB to 5TB.
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Ashenfury »

Okay, now we're digging deeeeep. I'm a smacktard when it comes to networking, so I'm exploring some shit I really have no business putting my nose in. Regardless, I'm currently exploring:

Tuning the OS TCP buffers and tuning the SQL*Net send/recv buffers to match them. This stuff is really just a tad over my head when it comes to crunching numbers on packets, windows, frames, and such. So I'm looking at sysctl.conf and itching to just pump the rmem and wmem max parameters to a high number and set my SQL*Net send/recv buffers to match. I'm going to test that out tonight.

There are still a couple of parameters that make me scratch my head, though. For example, what the hell are option memory buffers?! (cat /proc/sys/net/core/optmem_max) I googled the term and it's just not making sense to me. I also don't see any mention of this "auto tuning" parameter in sysctl.conf, so would it be pointless to set it?

# increase Linux autotuning TCP buffer limit
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

What I'm trying to do is resolve the issue that Grainger pointed out. I did a tcpdump with the help of a coworker and it indeed shows more ACKs than it logically should. Even without the Oracle import engine I was seeing thousands of them per second, and that can't be good for network speed. Will increasing my OS TCP buffers help reduce all of these ACKs?
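On sizing those buffers: the usual yardstick is the bandwidth-delay product — the amount of data that can be in flight on the path at once. A quick sketch for the 15 Mbps / 50 ms figures mentioned earlier in the thread (assuming those numbers are accurate):

```python
# Bandwidth-delay product: data that can be "in flight" on the link.
# TCP buffers smaller than this cap a single stream's throughput;
# buffers much larger than this buy nothing for one stream.

def bdp_bytes(bandwidth_mbps, rtt_ms):
    """BDP in bytes for a link of `bandwidth_mbps` megabits/s and `rtt_ms` RTT."""
    return int(bandwidth_mbps * 1_000_000 / 8 * rtt_ms / 1000)

print(bdp_bytes(15, 50))    # 15 Mbps, 50 ms RTT -> 93750 bytes (~92 KB)
```

So the 64 MB maximums in the sysctl snippet above are far beyond what a 15 Mbps / 50 ms path can use, though mostly harmless since autotuning only grows buffers as needed. And to the ACK question: as far as I know, bigger buffers won't reduce the ACK count much — roughly one ACK per one or two segments is just how TCP works. Buffers help throughput, not chattiness; only fewer application round trips fixes the latter.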
User avatar
Ashenfury
Lost
Posts: 2326
Location: Austin, Tx

Re: Been a while since you've had: AN ORACLE THREAD!

Unread post by Ashenfury »

Greebo, Grainger, you guys are amazing. Thanks for the help you two.