Tag Archives: tcp

A collection of useful sysctl snippets packaged as a puppet module

Recently I learned from Marco about /etc/sysctl.d, a folder where you can drop in files instead of changing /etc/sysctl.conf directly. That gave me the idea of building a puppet module for sysctl:

https://github.com/cosimo/puppet-modules/blob/master/sysctl/README

The idea is to assemble a collection of useful sysctl snippets. I started with the usual things we use everywhere:

  • LVS Direct Routing

    # LVS directives for Direct Routing
    # http://www.linuxvirtualserver.org/VS-DRouting.html
    net.ipv4.conf.lo.arp_ignore = 1
    net.ipv4.conf.lo.arp_announce = 2
    net.ipv4.conf.all.arp_ignore = 1
    net.ipv4.conf.all.arp_announce = 2
    
  • TCP performance tuning
    #---------------------------------------------------------------------
    # TCP/IP performance optimization settings compared to debian defaults
    #
    # from http://varnish.projects.linpro.no/wiki/Performance
    #---------------------------------------------------------------------
    
    #net.ipv4.ip_local_port_range = 32768 61000
    net.ipv4.ip_local_port_range = 1024 65536
    # net.core.rmem_max = 131071
    net.core.rmem_max = 16777216
    # net.core.wmem_max = 131071
    net.core.wmem_max = 16777216
    # net.ipv4.tcp_rmem = 4096 87380 4194304
    net.ipv4.tcp_rmem = 4096 87380 16777216
    # net.ipv4.tcp_wmem = 4096 16384 4194304
    net.ipv4.tcp_wmem = 4096 65536 16777216
    # net.ipv4.tcp_fin_timeout = 60
    net.ipv4.tcp_fin_timeout = 20
    # net.core.netdev_max_backlog = 1000
    net.core.netdev_max_backlog = 30000
    # net.ipv4.tcp_no_metrics_save = 0
    net.ipv4.tcp_no_metrics_save = 1
    # net.core.somaxconn = 128
    net.core.somaxconn = 262144
    # net.ipv4.tcp_syncookies = 0
    net.ipv4.tcp_syncookies = 1
    # net.ipv4.tcp_max_orphans = 65536
    net.ipv4.tcp_max_orphans = 262144
    # net.ipv4.tcp_max_syn_backlog = 1024
    net.ipv4.tcp_max_syn_backlog = 262144
    # net.ipv4.tcp_synack_retries = 5
    net.ipv4.tcp_synack_retries = 3
    # net.ipv4.tcp_syn_retries = 5
    net.ipv4.tcp_syn_retries = 3
    

I'm interested in both baseline settings to be applied by default everywhere (ex. vm.swappiness = <n>), and special-purpose settings to be "attached" to server roles, like db, file servers, http servers, etc… I'd love to hear from you.

How to detect TCP retransmit timeouts in your network

Some months ago, while investigating on a problem in our infrastructure, I put together a small tool to help detecting TCP retransmits happening during HTTP requests.

TCP retransmissions can happen, for example, when a client sends a SYN packet to the server, the server responds with a SYN-ACK but, for any reason, the client never receives the SYN-ACK. In this case, the client correctly waits for a given time, called the TCP Retransmission Timeout. In the usual case, this time is set to 3 seconds.

There's probably a million reasons why the client may never receive a SYN-ACK. The one I've seen more often is packet loss, which in turn can have lots of reasons, for example a malfunctioning or misconfigured network switch.

However, you can immediately spot if your timeout/hang problems are caused by TCP retransmission because they happen to cause response times that are unusually frequently distributed around 3, 9 and 21 seconds (and on, of course).

In fact, the TCP retransmission timeout starts at 3 seconds, but if the client tries to resend after a timeout and still receives no answer, it doubles the wait to 6 s, so the total response time will be 9 seconds, assuming that the client now finally receives the SYN-ACK. Otherwise, 3 + 6 + 12 = 21, then 3 + 6 + 12 + 24 = 45 s and so on and so forth.

So, this little tool fires a quick batch of HTTP requests to a given server and measures the response times, highlighting slow responses (> 0.5s). If you see that the reported response times are 3.002s, 9.005s or similar, then you are probably in presence of TCP retransmission and/or packet loss.

Finally, here it is:


#!/usr/bin/env perl
#
# https://gist.github.com/1101500
#
# Fires HTTP request batches at the specified hostname
# and analyzes the response times.
#
# If you have suspicious frequency of 3.00x, 9.00x, 21.00x
# seconds, then most probably you have a problem of packet loss
# in your network.
#
# cosimo@opera.com, sometime in 2011
#

use strict;
use LWP::UserAgent ();
use Time::HiRes ();

$| = 1;

my $ua = LWP::UserAgent->new();
$ua->agent("$0/0.01");

# Tests this hostname
my $server = $ARGV[0] || die "Usage: $0 <hostname>n";

# Picks the URLs to test in this list, one after the other
my @url_pool = qw(
	/ping.html
);

my $total_reqs = 0;
my $total_elapsed = 0.0;
my $n_pick = 0;
my $url_to_fire;

my $max_elapsed = 0.0;
my $max_elapsed_when = '';
my $failed_reqs = 0;
my $slow_responses = 0;
my $terminate_now = 0;

sub output_report {
	print "Report for:            $server at " . localtime() . "n";
	printf "Total requests:        %d in %.3f sn", $total_reqs, $total_elapsed;
	print "Failed requests:       $failed_reqsn";
	print "Slow responses (>1s):  $slow_responses (slowest $max_elapsed s at $max_elapsed_when)n";
	printf "Average response time: %.3f s (%.3f req/s)n", $total_elapsed / $total_reqs, $total_reqs / $total_elapsed;
	print "--------------------------------------------------------------------n";
	sleep 1;
	return;
}

$SIG{INT} = sub { $terminate_now = 1 };

while (not $terminate_now) {

	$url_to_fire = "http://" . $server . $url_pool[$n_pick];

	my $t0 = [ Time::HiRes::gettimeofday() ];
	my $resp = $ua->get($url_to_fire);
	my $elapsed = Time::HiRes::tv_interval($t0);

	$failed_reqs++ if ! $resp->is_success;

	$total_reqs++;
	$total_elapsed += $elapsed;

	if ($elapsed > $max_elapsed) {
		$max_elapsed = $elapsed;
		$max_elapsed_when = scalar localtime;
		printf "[SLOW] %s, %s served in %.3f sn", $max_elapsed_when, $url_to_fire, $max_elapsed;
	}

	$slow_responses++ if $elapsed >= 0.5;
	$n_pick = 0       if ++$n_pick > $#url_pool;
	output_report()   if $total_reqs > 0 and ($total_reqs % 1000 == 0);

}
continue {
    Time::HiRes::usleep(100000);
}

output_report();

# End

It's also published here on Github, https://gist.github.com/1101500. Have fun!