  • Gigabit Ethernet Latency with Intel's 1000/Pro Adapters (e1000) (evanjones.ca)
    …or 10G Ethernet, the maximum packet rate is extremely high, and handling one interrupt per packet could be very inefficient. Interrupt coalescing, also called interrupt moderation, is a feature where the network adapter raises one interrupt for a group of packets. My problem was that the old version of the e1000 driver on my Linux systems used a fixed minimum inter-interrupt interval of 125 µs. Thus the client would send the packet, the server would process it and reply, and then the response would sit in memory until the timer expired. In reality the round-trip latency was lower than 125 µs, but the interrupt throttle timer imposed a minimum latency. This interrupt timer does strange things to the performance of a client and server which make many small requests. For example, netperf will frequently measure very close to 8000 round trips per second (the arithmetic behind these rates is sketched below), but it will occasionally measure a smaller value. The reason is that sometimes the timing of the interrupts on the two ends is closely synchronized. This causes two interrupt timer periods to elapse between message receptions: one for the transmit interrupt and another for the receive interrupt. This is probably a performance anomaly which would only rarely happen in reality, since real applications will likely do more than 125 µs of work with the request, so the interrupt timer will be less of an issue. However, for simple benchmarks it can make a huge difference between reliable low-latency performance and performance with additional delay and unpredictable variation. In my more realistic benchmark, where the server does approximately 110 µs of processing for each request, tuning this parameter only makes a small difference. It increases the throughput with a small number of clients significantly (almost double), requiring fewer clients before it saturates (3 instead…

    Original URL path: http://www.evanjones.ca/ethernet-latency.html (2016-04-30)
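
    A quick sanity check of the rates quoted above (an illustrative snippet, not from the article; the class name is mine): 8000 round trips per second is just the reciprocal of the 125 µs throttle interval, and when the timers on the two ends line up, two periods elapse per round trip.

        public class ThrottleMath {
            public static void main(String[] args) {
                double throttle = 125e-6; // e1000 minimum inter-interrupt interval, seconds
                // One throttle period per round trip: the 8000/s netperf ceiling.
                System.out.printf("ceiling: %.0f round trips/s%n", 1.0 / throttle);
                // Interrupt timers on both ends synchronized: two periods
                // (transmit + receive) elapse per round trip, halving the rate.
                System.out.printf("synchronized: %.0f round trips/s%n", 1.0 / (2 * throttle));
            }
        }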

  • Java/C++ Networking Microbenchmark (evanjones.ca)
    …I expect the results could be different. Hence, this benchmark tests a simple echo server. The server reads a client request, a length-prefixed blob of bytes, then echoes that request back (a minimal sketch of the protocol follows below). We wrote four versions: C and Java, epoll and threads. Each client sends one request, then waits for the response. When it gets the response, it immediately sends another request. Each trial ran for 30 seconds, with 5 seconds of warm-up. The graph shows the average of 10 trials, with the 95% confidence interval. The tests were run on the localhost of a 1.8GHz Core2 Quad machine with 8 cores (2 Core2 Quad CPUs) running Linux 2.6.25 in 64-bit mode. Each server was pinned to a single core using numactl, in order to avoid giving the threaded servers more resources than the single-threaded servers. Java was run with the -server and -XX:+UseSerialGC flags, as they were found to improve performance. The results are not very surprising. Generally, C performs slightly better than Java, but only by a small margin. I think that the C version makes fewer copies than the Java code, so the C versus Java comparison here is unfair…

    Original URL path: http://www.evanjones.ca/software/javanetperf.html (2016-04-30)
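
    The echo protocol described above fits in a few lines of Java (an illustration, not the benchmark's actual code; the 4-byte big-endian length prefix and the port number are assumptions):

        import java.io.DataInputStream;
        import java.io.DataOutputStream;
        import java.io.EOFException;
        import java.net.ServerSocket;
        import java.net.Socket;

        // One-client-at-a-time echo server: read a 4-byte length prefix,
        // read that many bytes, then write the prefix and blob back.
        public class EchoServer {
            public static void main(String[] args) throws Exception {
                try (ServerSocket server = new ServerSocket(5000)) {
                    while (true) {
                        try (Socket client = server.accept()) {
                            DataInputStream in = new DataInputStream(client.getInputStream());
                            DataOutputStream out = new DataOutputStream(client.getOutputStream());
                            while (true) {
                                int length = in.readInt();   // length prefix
                                byte[] blob = new byte[length];
                                in.readFully(blob);          // request body
                                out.writeInt(length);        // echo both back
                                out.write(blob);
                                out.flush();
                            }
                        } catch (EOFException e) {
                            // client disconnected; accept the next one
                        }
                    }
                }
            }
        }

    The real benchmark's threaded and epoll variants serve many clients concurrently; this single-client loop only shows the wire protocol.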

  • JSamp: A sampling profiler for Java (evanjones.ca)
    …running a program with it, the program becomes significantly slower. Worse, since it instruments every method call, tiny methods which are normally very cheap become expensive. Hence, I don't trust the results. HProf is a sampling profiler, which impacts the performance less and produces more realistic results (the core sampling idea is sketched below). However, it starts profiling at the beginning of the Java program and finishes at the end. This was inappropriate for my…

    Original URL path: http://www.evanjones.ca/software/jsamp.html (2016-04-30)
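
    The essence of a sampling profiler can be sketched in plain Java (my sketch, not JSamp's implementation): periodically snapshot every thread's stack and count which method is on top.

        import java.util.HashMap;
        import java.util.Map;

        // Sampling loop: hot methods accumulate the most samples. Run this in
        // a daemon thread and interrupt it to stop profiling, which gives the
        // start/stop control that the excerpt says HProf lacks.
        public class Sampler implements Runnable {
            private final Map<String, Integer> counts = new HashMap<>();

            @Override
            public void run() {
                try {
                    while (true) {
                        for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                            if (stack.length > 0) {
                                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                                counts.merge(top, 1, Integer::sum);
                            }
                        }
                        Thread.sleep(10); // sample at roughly 100 Hz
                    }
                } catch (InterruptedException e) {
                    // stop requested: dump the sample counts
                    counts.forEach((method, n) -> System.out.println(n + "\t" + method));
                }
            }
        }

    Because it only observes threads at the sampling instants, this approach adds near-zero cost to tiny, frequently called methods, unlike per-call instrumentation.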

  • TCP Checksums Are Not Enough (evanjones.ca)
    …nice reminder to developers of large-scale, reliable distributed systems: TCP checksums are not enough to protect your application. (Update 2015-10-08: This does happen in reality.) There are two reasons that you can't rely on TCP checksums. The first is that the TCP checksum is very weak and can easily fail to detect errors in packets. This means that it is possible for a packet to be corrupted somewhere between the sender and the receiver without the receiver ever noticing. If you build a large system, this is almost guaranteed to happen occasionally. The second problem is that the TCP checksum happens too late, and only protects the contents of a single packet, which is not sufficient. This is the classic end-to-end argument. If the corruption happens before your data gets to TCP (because of a memory or software error, for example), the TCP checksum can't help. If the corruption happens while reassembling a message from multiple packets, again TCP is no help. However, intelligent use of higher-level checksums can detect these types of errors (one such scheme is sketched below). Amazon's problem was likely more insidious than simply relying on TCP checksums, but it still provides a…

    Original URL path: http://www.evanjones.ca/tcp-checksums.html (2016-04-30)
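
    A minimal sketch of the higher-level checksum idea (mine, not code from the article): compute a CRC32 over the whole application message and verify it at the receiver, so corruption introduced before TCP, or during reassembly, is still caught end to end.

        import java.nio.ByteBuffer;
        import java.util.zip.CRC32;

        public class Framing {
            // Frame a message as: 4-byte length, 4-byte CRC32, payload.
            static byte[] frame(byte[] message) {
                CRC32 crc = new CRC32();
                crc.update(message);
                ByteBuffer buf = ByteBuffer.allocate(8 + message.length);
                buf.putInt(message.length);
                buf.putInt((int) crc.getValue()); // CRC32 fits in 32 bits
                buf.put(message);
                return buf.array();
            }

            // Verify the checksum over the fully reassembled message.
            static byte[] unframe(ByteBuffer buf) {
                byte[] message = new byte[buf.getInt()];
                int expected = buf.getInt();
                buf.get(message);
                CRC32 crc = new CRC32();
                crc.update(message);
                if ((int) crc.getValue() != expected) {
                    throw new IllegalStateException("end-to-end checksum mismatch");
                }
                return message;
            }

            public static void main(String[] args) {
                byte[] framed = frame("hello".getBytes());
                framed[framed.length - 1] ^= 0x01; // corruption TCP's checksum could miss
                try {
                    unframe(ByteBuffer.wrap(framed));
                } catch (IllegalStateException e) {
                    System.out.println("caught: " + e.getMessage());
                }
            }
        }

    A CRC32 is itself fairly weak; a large system may warrant a stronger hash, but the end-to-end structure is the same.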

  • Core Engineering Skill: Analyzing Data (evanjones.ca)
    …farm in the United States nets on average $1,400 per acre; a 1,364-acre farm nets $39 an acre. Why might that be true? There are a number of possibilities. For example, it is possible that the four-acre farms are more efficient, so they produce 36 times more output per acre ($1,400 / $39 ≈ 36) than the larger farms. However, a productivity difference that large seems extremely unlikely. It seems more probable that when growing high-value crops, farms can afford to be smaller. Conversely, if the climate is only appropriate for growing low-cost grain, a huge farm is needed to support the farmer. There is a world of difference between the economics of a small vineyard growing grapes for wine and a huge wheat farm in the American plains. The problem is that the author uses this data to support his argument that small farms are the most productive on earth. However, this is not a valid comparison. A scientific comparison needs to hold as many variables constant as possible. In this case, we need to examine farms of different sizes growing the same crops in the same regions. These numbers, averaged across the United States, don't tell…

    Original URL path: http://www.evanjones.ca/analyzing-data.html (2016-04-30)

  • TCP Performance Gotchas (evanjones.ca)
    …TCP_NODELAY option needs to be set. This was not our performance problem, as TCP_NODELAY was already enabled. Using tcpdump/Wireshark on the proxy, we saw that five packets were being sent to the client for each result from MySQL. The first four packets were one byte long, while the last packet contained a variable amount of data. It turns out that Java's DataOutputStream is not buffered. Even better, the writeInt method makes four separate calls to write on the underlying OutputStream. Hence, the five packets were coming from one call to writeInt and one call to write. Wrapping a BufferedOutputStream around it fixed the problem (the fix is sketched below). However, what was the root cause? More digging in the packet trace revealed that occasionally, after waiting for a long transaction (500 ms), the proxy would only send three one-byte packets. It would then wait for an ACK, which arrived 40 milliseconds later. This delay is Linux's delayed acknowledgment timeout. After receiving the ACK, the proxy sent out one packet containing the fourth byte from writeInt plus the remaining data. This implied that Linux's TCP implementation was waiting for the ACK before sending more data. This happens when the congestion window is full. However, why was it considering the congestion window to be full? It only sent three bytes! There are two reasons. The first is that Linux tracks the TCP congestion window by counting packets, treating each packet as if it is a maximum-length packet (the maximum segment size, or MSS; typically 1460 bytes). However, the packet trace showed that the receiver's congestion window was larger than 3 × 1460 = 4380 bytes. The second reason is that the standard TCP congestion control specification (RFC 2581) states that after the connection has been idle for a while (specifically, the retransmit…

    Original URL path: http://www.evanjones.ca/tcp-performance.html (2016-04-30)
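
    A sketch of the fix described above (the class name, port, and payload are mine, and it assumes a server is listening on localhost:3306): an unbuffered DataOutputStream turns writeInt() into four single-byte write() calls, which become four one-byte packets when TCP_NODELAY is set; buffering coalesces them.

        import java.io.BufferedOutputStream;
        import java.io.DataOutputStream;
        import java.net.Socket;

        public class ProxyWrite {
            public static void main(String[] args) throws Exception {
                try (Socket socket = new Socket("localhost", 3306)) {
                    socket.setTcpNoDelay(true); // disable Nagle, as in the excerpt

                    // Problematic: DataOutputStream directly on the socket stream;
                    // writeInt() makes four 1-byte write() calls -> four packets.
                    // DataOutputStream out = new DataOutputStream(socket.getOutputStream());

                    // Fixed: BufferedOutputStream absorbs the four writes plus the
                    // payload, and flush() sends them as a single packet.
                    DataOutputStream out = new DataOutputStream(
                            new BufferedOutputStream(socket.getOutputStream()));
                    byte[] payload = "result row".getBytes();
                    out.writeInt(payload.length); // length prefix
                    out.write(payload);
                    out.flush();
                }
            }
        }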

  • Extracting Text from Wikipedia (evanjones.ca)
    …licence (the GFDL). Code: wikipedia2text.tar.bz2. Titles of 2722 good articles. XML dump of the good articles (2008-03-12): wikipedia2text-toparticles.xml.bz2 (35 MB compressed, 127 MB uncompressed). Parsed XML from 2615 articles: wikipedia2text-toparticles.tar.bz2 (34 MB compressed, 200 MB uncompressed). Extracted plain text: wikipedia2text-extracted.txt.bz2 (18 MB compressed, 63 MB uncompressed; 10 million words). How to extract text from Wikipedia: Get the Wikipedia articles dump (direct link to English Wikipedia); it is about 3 GB compressed with bzip2, and about 16 GB uncompressed. Get the list of best articles; I used the following shell command: for i in `seq 1 7`; do wget http://en.wikipedia.org/wiki/Wikipedia:Version_1.0_Editorial_Team/Release_Version_articles_by_quality/$i; done. Extract the list of titles using extracttop.py: extracttop.py toparticles* | sort > top.txt. Use MWDumper to filter only the articles you care about; the version in SVN is considerably newer, but the prebuilt version works fine. Warning: it takes 28 minutes for my 3.8GHz P4 Xeon to decompress and filter the entire English Wikipedia pages dump. It produced 127 MB of output for 2722 articles: time bzcat enwiki-20080312-pages-articles.xml.bz2 | java -server -jar mwdumper.jar --format=xml --filter=exactlist:top.txt --filter=latest --filter=notalk > pages.xml. Use xmldump2files.py to split the filtered XML dump into individual files (a rough Java equivalent is sketched below); this only takes about 2 minutes: xmldump2files.py pages.xml files_directory. Use wiki2xml_command.php to parse the wiki text to XML; this can lead to segmentation faults or infinite loops when regular expressions go wrong, and it doesn't always output valid XML, since it passes a lot of the text through directly. This took 90 minutes on my machine: wiki2xml_all.sh files_directory. Use…

    Original URL path: http://www.evanjones.ca/software/wikipedia2text.html (2016-04-30)
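
    For reference, here is a rough Java equivalent of the xmldump2files.py splitting step (my sketch, assuming the standard MediaWiki export schema; the original is a Python script): stream the dump with StAX and write each page's wikitext to its own file.

        import java.io.File;
        import java.io.FileInputStream;
        import java.io.FileWriter;
        import javax.xml.stream.XMLInputFactory;
        import javax.xml.stream.XMLStreamConstants;
        import javax.xml.stream.XMLStreamReader;

        // Usage: DumpSplitter pages.xml files_directory
        public class DumpSplitter {
            public static void main(String[] args) throws Exception {
                XMLStreamReader xml = XMLInputFactory.newInstance()
                        .createXMLStreamReader(new FileInputStream(args[0]));
                new File(args[1]).mkdirs();
                String title = null;
                StringBuilder chars = new StringBuilder();
                while (xml.hasNext()) {
                    switch (xml.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            chars.setLength(0); // start collecting this element's text
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            chars.append(xml.getText());
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            if ("title".equals(xml.getLocalName())) {
                                title = chars.toString();
                            } else if ("text".equals(xml.getLocalName()) && title != null) {
                                // Sanitize the title into a safe file name.
                                String name = title.replaceAll("[^A-Za-z0-9]", "_");
                                try (FileWriter out = new FileWriter(new File(args[1], name + ".txt"))) {
                                    out.write(chars.toString());
                                }
                            }
                            break;
                    }
                }
            }
        }

    Title collisions and non-article namespaces are ignored here for brevity.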

  • Easy Stack Traces (evanjones.ca)
    …revealed a mailing list posting with a preload library that uses glibc's backtrace API to get a stack trace, and then uses libbfd from GNU binutils to resolve file and line numbers. My system doesn't have this library, so I simplified it to only print the addresses. I then use a script to call addr2line to translate the addresses into line numbers. You can grab my code if you think it might be useful: backtrace.c, symbolize.py. I would like to integrate code to resolve files and line numbers, so I did some digging for what libraries can read DWARF debugging info. It turns out that there are a few options out there. Basically, none of them will permit you to use them with commercial code, except FreeBSD's libdwarf, which may not be complete yet. This probably isn't a practical issue for me, but I still find it interesting that there are so few packages out there for this important part of the tool chain. Here is a list of related software projects that I found: binutils (C, GPL): contains libbfd, which is what GDB uses. libunwind (C, X11): get stack traces from current…

    Original URL path: http://www.evanjones.ca/software/stack-traces.html (2016-04-30)