{"id":429,"date":"2012-06-05T16:03:56","date_gmt":"2012-06-05T15:03:56","guid":{"rendered":"http:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/"},"modified":"2012-06-05T16:03:56","modified_gmt":"2012-06-05T15:03:56","slug":"problems-with-bnx2-kernel-module-and-high-traffic","status":"publish","type":"post","link":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/","title":{"rendered":"Problems with bnx2 kernel module and high traffic"},"content":{"rendered":"<p>We&#39;re seeing an &quot;elevated&quot; level of traffic these days on the My Opera servers. As usual with operations matters, it&#39;s difficult to find one exact clear root cause. The rest of the post explains what we found and the fix for it.<\/p>\n<h3>TL;DR<\/h3>\n<p>You want to try <code>options bnx2 disable_msi=1<\/code> in your <code>\/etc\/modprobe.d\/bnx2.conf<\/code> if:<\/p>\n<ul>\n<li>using Debian Squeeze with bnx2 version 2.0.2<\/li>\n<li>you see high traffic (10K+ connections)<\/li>\n<li>you see errors on the public network interface<\/li>\n<li>the server is dropping packets\/connections randomly, or it&#39;s really slow<\/li>\n<\/ul>\n<h3>The gory details<\/h3>\n<p>Last Tuesday, the DDoS attack on the My Opera servers (which is still ongoing) ramped up from ~4k req\/s\/frontend to <strong>~16k+ req\/s\/frontend<\/strong>. Both frontends were dist-upgraded (including a kernel upgrade) on May 23rd, but not rebooted, so the kernel update was <i>armed<\/i> but not actually live.<\/p>\n<p>We started seeing dropped connections and general slowness after the frontend servers were rebooted. They were rebooted because we had been hitting another really weird problem, the <strong>210 days uptime<\/strong> timer bug. 
See <a href=\"https:\/\/bugs.launchpad.net\/ubuntu\/+source\/linux\/+bug\/805341\" rel=\"nofollow\">this<\/a> and <a href=\"http:\/\/www.gossamer-threads.com\/lists\/linux\/kernel\/1445569\" rel=\"nofollow\">this<\/a> bug report for more details.<\/p>\n<p>Anyway, I&#39;m not sure how to verify this, because I didn&#39;t restart the boxes myself, but my theory is that after they were rebooted, the new <code>bnx2<\/code> kernel module, version 2.0.2, was loaded.<\/p>\n<p>Later on we found out about this <a href=\"http:\/\/ubuntuforums.org\/archive\/index.php\/t-1726045.html\" rel=\"nofollow\">very specific bnx2 v2.0.2 bug<\/a>, which only triggers in high-traffic situations (at least on Debian Squeeze and Ubuntu) and causes network interfaces to stop working correctly, dropping traffic.<\/p>\n<p>Long story short, there&#39;s a magic option that prevents this from happening. <i>rmmod&#39;ing<\/i> the bnx2 module and <i>modprobing<\/i> it back with this option has fixed the problem so far.<\/p>\n<pre><code># \/etc\/modprobe.d\/bnx2.conf\r\noptions bnx2 disable_msi=1\r\n<\/code><\/pre>\n<p>Regarding what the option is about, I&#39;m not even going to lie about it. 
I have no idea&#8230; We found it with this search:<\/p>\n<p><a href=\"https:\/\/encrypted.google.com\/search?client=opera&amp;rls=en&amp;q=bnx2+debian+2.0.2+traffic&amp;sourceid=opera&amp;ie=utf-8&amp;oe=utf-8&amp;channel=suggest\" rel=\"nofollow\">https:\/\/encrypted.google.com\/search?client=opera&amp;rls=en&amp;q=bnx2+debian+2.0.2+traffic&amp;sourceid=opera&amp;ie=utf-8&amp;oe=utf-8&amp;channel=suggest<\/a><\/p>\n<p>The first hit is our own Sven from the sysadmin team:<\/p>\n<p>  <a href=\"http:\/\/lists.us.dell.com\/pipermail\/linux-poweredge\/2011-October\/045485.html\" rel=\"nofollow\">http:\/\/lists.us.dell.com\/pipermail\/linux-poweredge\/2011-October\/045485.html<\/a><\/p>\n<p>The second hit is the solution we used:<\/p>\n<p>  <a href=\"http:\/\/ubuntuforums.org\/archive\/index.php\/t-1726045.html\" rel=\"nofollow\">http:\/\/ubuntuforums.org\/archive\/index.php\/t-1726045.html<\/a><\/p>\n<p>We also did some tweaking for the large number of <code>TIME_WAIT<\/code> connections that resulted from this bnx2 bug; namely, we bumped up <code>net.ipv4.tcp_max_tw_buckets<\/code> quite a bit.<\/p>\n<h3>Takeaways<\/h3>\n<ol>\n<li><b>Before<\/b> rebooting a machine, check what&#39;s going to change, for example when the last upgrade was, by looking at <code>\/var\/log\/dpkg.log<\/code>.<\/li>\n<li>If you have firewall rules, save them with <code>iptables-save &gt; \/root\/iptables-rules.YYYYMMDD<\/code> and restore them later if needed with <code>iptables-restore &lt; \/root\/iptables-rules.YYYYMMDD<\/code>.<\/li>\n<li>Always check if the <code>conntrack<\/code> module is enabled. Most of the time you don&#39;t need it, and it will cause performance to drop under very high traffic (of course).<\/li>\n<\/ol>\n<p>In this case, the conntrack module was also accidentally re-enabled by the reboot. We had previously disabled it, but didn&#39;t make the change permanent. 
This is because on My Opera we&#39;re still not using our config management infrastructure&#8230; Looking forward to making that happen. Soon. Hopefully :)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We&#39;re seeing an &quot;elevated&quot; level of traffic these days on the My Opera servers. As usual with operations matters, it&#39;s difficult to find one exact clear root cause. The rest of the post explains what we found and the fix for it. TL;DR You want to try options bnx2 disable_msi=1 in your \/etc\/modprobe.d\/bnx2.conf if: using [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[225,226,224,223,75,227],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v22.9 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Problems with bnx2 kernel module and high traffic - Random hacking<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Problems with bnx2 kernel module and high traffic - Random hacking\" \/>\n<meta property=\"og:description\" content=\"We&#039;re seeing an &quot;elevated&quot; level of traffic these days on the My Opera servers. As usual with operations matters, it&#039;s difficult to find one exact clear root cause. The rest of the post explains what we found and the fix for it. 
TL;DR You want to try options bnx2 disable_msi=1 in your \/etc\/modprobe.d\/bnx2.conf if: using [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\" \/>\n<meta property=\"og:site_name\" content=\"Random hacking\" \/>\n<meta property=\"article:published_time\" content=\"2012-06-05T15:03:56+00:00\" \/>\n<meta name=\"author\" content=\"cosimo\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"cosimo\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\"},\"author\":{\"name\":\"cosimo\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1\"},\"headline\":\"Problems with bnx2 kernel module and high 
traffic\",\"datePublished\":\"2012-06-05T15:03:56+00:00\",\"dateModified\":\"2012-06-05T15:03:56+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\"},\"wordCount\":513,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1\"},\"keywords\":[\"bnx2\",\"disable_msi\",\"kernel\",\"linux\",\"performance\",\"tcpip\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\",\"url\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\",\"name\":\"Problems with bnx2 kernel module and high traffic - Random hacking\",\"isPartOf\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#website\"},\"datePublished\":\"2012-06-05T15:03:56+00:00\",\"dateModified\":\"2012-06-05T15:03:56+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.streppone.it\/cosimo\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Problems with bnx2 kernel module and high 
traffic\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#website\",\"url\":\"https:\/\/www.streppone.it\/cosimo\/blog\/\",\"name\":\"Random hacking\",\"description\":\"Assume nothing. Code defensively. Keep it simple, stupid!\",\"publisher\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.streppone.it\/cosimo\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1\",\"name\":\"cosimo\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/cb1d938720df45a2720724aae99e3bfc?s=96&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/cb1d938720df45a2720724aae99e3bfc?s=96&r=g\",\"caption\":\"cosimo\"},\"logo\":{\"@id\":\"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/image\/\"},\"url\":\"https:\/\/www.streppone.it\/cosimo\/blog\/author\/cosimo\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Problems with bnx2 kernel module and high traffic - Random hacking","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/","og_locale":"en_US","og_type":"article","og_title":"Problems with bnx2 kernel module and high traffic - Random hacking","og_description":"We&#39;re seeing an &quot;elevated&quot; level of traffic these days on the My Opera servers. 
As usual with operations matters, it&#39;s difficult to find one exact clear root cause. The rest of the post explains what we found and the fix for it. TL;DR You want to try options bnx2 disable_msi=1 in your \/etc\/modprobe.d\/bnx2.conf if: using [&hellip;]","og_url":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/","og_site_name":"Random hacking","article_published_time":"2012-06-05T15:03:56+00:00","author":"cosimo","twitter_card":"summary_large_image","twitter_misc":{"Written by":"cosimo","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#article","isPartOf":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/"},"author":{"name":"cosimo","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1"},"headline":"Problems with bnx2 kernel module and high 
traffic","datePublished":"2012-06-05T15:03:56+00:00","dateModified":"2012-06-05T15:03:56+00:00","mainEntityOfPage":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/"},"wordCount":513,"commentCount":0,"publisher":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1"},"keywords":["bnx2","disable_msi","kernel","linux","performance","tcpip"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/","url":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/","name":"Problems with bnx2 kernel module and high traffic - Random hacking","isPartOf":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#website"},"datePublished":"2012-06-05T15:03:56+00:00","dateModified":"2012-06-05T15:03:56+00:00","breadcrumb":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/2012\/06\/problems-with-bnx2-kernel-module-and-high-traffic\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.streppone.it\/cosimo\/blog\/"},{"@type":"ListItem","position":2,"name":"Problems with bnx2 kernel module and high traffic"}]},{"@type":"WebSite","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#website","url":"https:\/\/www.streppone.it\/cosimo\/blog\/","name":"Random hacking","description":"Assume 
nothing. Code defensively. Keep it simple, stupid!","publisher":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.streppone.it\/cosimo\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/c443bedbf6ecf99550d6395620801df1","name":"cosimo","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/cb1d938720df45a2720724aae99e3bfc?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/cb1d938720df45a2720724aae99e3bfc?s=96&r=g","caption":"cosimo"},"logo":{"@id":"https:\/\/www.streppone.it\/cosimo\/blog\/#\/schema\/person\/image\/"},"url":"https:\/\/www.streppone.it\/cosimo\/blog\/author\/cosimo\/"}]}},"_links":{"self":[{"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/posts\/429"}],"collection":[{"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/comments?post=429"}],"version-history":[{"count":0,"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/posts\/429\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/media?parent=429"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.streppone.it\/cosimo\/blog\/wp-json\/wp\/v2\/categories?post=429"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.streppone.it\/cosi
mo\/blog\/wp-json\/wp\/v2\/tags?post=429"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}