archive-ca.com » CA » E » EVANJONES.CA

Total: 397

  • Evan Jones - Software Engineer | Computer Scientist
    2016 February 12
    Fixing Java's ByteBuffer native memory "leak" (2015 December 27)
    How both TCP and Ethernet checksums fail (2015 October 05)
    Naive Retries Considered Harmful (2015 September 28)
    Go Gotcha: Don't take the address of loop variables (2015 August 03)

    Popular Software and Articles
    The Four Month Bug: JVM statistics cause garbage collection pauses (how I found it) (2015 March 24)
    How to Use UTF-8 with Python (2005 October 01)
    Using Unicode in C/C++ (2006 April 15)
    Implementing a Thread Library on Linux (2003 December 10)
    Getting Starting With ns2 (2005 February 18)
    Efficient Java I/O: byte[], ByteBuffers, and OutputStreams (2009 October 21)
    Farewell to MIT (2011 October 26)

    Employment History
    Twitter (2014-present)
    Mitro, aka Lectorius, Inc. (2012-2014)
    Infix (2011-2012)
    MIT CSAIL database group (2007-2011)
    Google (2006-2007)

    Research Projects
    H-Store: A database system for extremely high-throughput transaction processing
    Relational Cloud: Databases as a service

    Academic Publications
    Aubrey Tatarowicz, Carlo Curino, Evan P. C. Jones, and Samuel Madden. Lookup Tables: Fine-Grained Partitioning for Distributed Databases. In ICDE, Washington, DC, USA, Apr 2012.
    Andrew Pavlo, Evan P. C. Jones, and Stanley Zdonik. On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems. In PVLDB, vol. 5, no. 1, Sep 2011.
    Evan P. C. Jones, Daniel J. Abadi, and Samuel Madden. Low overhead concurrency control for partitioned main memory databases. In SIGMOD, Indianapolis, IN, USA, June 2010.
    Carlo Curino, Evan P. C. Jones, Yang Zhang, and Samuel Madden. Schism: a workload-driven approach to database replication and partitioning. In PVLDB, Singapore, Sep 2010.
    Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alex Rasin, Stanley B. Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. H-Store: a high

    Original URL path: http://www.evanjones.ca/ (2016-04-30)


  • Social networks: If you can't beat them, join them (evanjones.ca)
    is actually ideal: I don't do anything and I still see most of the interesting things my friends post. However, I've also long believed that it is important to promote your own identity online, which is part of the reason I have a personal web site. Today, I'm willing to admit that online audiences have changed, and if I want to reach them, I need to be

    Original URL path: http://www.evanjones.ca/social-network-policy.html (2016-04-30)

  • Evan Jones - Software Engineer | Computer Scientist
    Upper Saddle River, NJ, USA: Addison-Wesley, 1995. ISBN 0201633612 (2006 June 23 21:37)
    Coding Standards (2006 June 15 16:19)
    Meyers, Scott. Effective C++, 3rd edition. Upper Saddle River, NJ, USA: Addison-Wesley, 2005. ISBN 0321334876 (2006 June 15 15:58)
    Migrated to a New Computer (2006 June 08 23:22)
    Thesis: Practical Routing in Delay-Tolerant Networks (2006 June 08 23:21)
    Vonnegut, Kurt. Slaughterhouse Five. New York: Dell, 1969. ISBN 0440180295 (2006 June 08 23:14)
    Stephen King. The Dark Tower. New York: Scribner, 2004. ISBN 9781880418628 (2006 June 08 23:11)
    SimpleXMLParse: XML to Python Objects and Back Again (2006 April 16 13:39)
    Using Unicode in C/C++ (2006 April 15 14:11)
    SimpleScalar Notes (2006 April 08 10:29)
    Towards 2020 Science (2006 April 05 14:22)
    Python Memory Management Part 3: The Saga is Over (2006 March 25 14:52)
    Match Point (2006 March 25 14:50)
    Burn Capacitor Burn (2006 March 10 15:35)
    Coupland, Douglas. Eleanor Rigby. Toronto: Random House, 2005. ISBN 0679313389 (2006 March 08 14:46)
    Vector Graphics and Animation Formats (2006 February 21 21:38)
    Barlow, Maude. Too Close For Comfort. Toronto: McClelland & Stewart Ltd., 2005. ISBN 0771010885 (2006 February 21 21:31)
    Khadra, Yasmina. The Swallows of Kabul. New York: Doubleday, 2004. ISBN 1400033764 (2006 February 21 21:22)
    Steinbeck, John. The Grapes of Wrath. New York: Bantam, 1970. (2006 February 21 21:17)
    PXE Imager (2006 February 10 11:36)
    Programming Languages for Games (2006 February 02 19:37)
    Rethinking the Networking Stack (2006 January 31 14:58)
    Part Two: Things Java Got Wrong (2006 January 10 19:24)
    Part One: Things Java Got Right (2006 January 08 16:48)
    2005 Frequency Allocation Posters (2005 December 03 10:55)
    You and Me and Everyone We Know (2005 December 03 10:47)
    Brown, Dan. The Da Vinci Code. New York: Doubleday, April 2003. ISBN 0385504209 (2005 November 28 14:52)
    Hunt, Andrew, and Thomas, David. The Pragmatic Programmer. Reading, MA, USA: Addison Wesley Longman, Inc., October 1999. ISBN 020161622X (2005 November 17 22:06)
    Tim Bray: Hard Open Problems in Network Computing (2005 November 16 21:10)
    Improving GTK Text Performance (2005 November 05 15:40)
    Brazil (2005 November 05 15:33)
    High Performance Server Architecture (2005 November 02 13:47)
    Primer (2005 November 02 13:37)
    Lessons from the Jikes RVM (2005 November 01 18:08)
    Python Parser (2005 October 30 18:16)
    VMware Player (2005 October 20 19:11)
    Welsh, Irvine. Trainspotting. New York: Norton & Company, 1996. ISBN 0393314804 (2005 October 20 19:04)
    Spyware Trusted Computing Cheating and Online Games (2005 October 14 13:12)
    Buying Movies Online (2005 October 14 12:39)
    Stephenson, Neal. Cryptonomicon. New York: HarperCollins, 1999. ISBN 0380788624 (2005 October 11 17:28)
    Battle for Wesnoth (2005 October 10 22:20)
    Multipath Load Balancing in Multi-hop Wireless Networks (2005 October 08 19:42)
    In America (2005 October 05 22:10)
    AMD's Personal Internet Communicator (2005 October 03 20:06)
    New Web Server Performance Champion (2005 October 03 15:16)
    How to Use UTF-8 with Python (2005 October 01 20:15)
    Debugging SSL Applications in Python (2005 September 29 15:53)
    The Mobile Phone As Home Computer (2005 September 24 12:48)
    Crash (2005 September 24 10:58)
    NdisWrapper Packages for Debian (2005 September 22 16:57)
    Practical Routing in Delay-Tolerant Networks (2005 September 13 22:06)
    Bourdain, Anthony. Kitchen Confidential. New York: Bloomsbury, 2000. ISBN 158234082X (2005 September 06 11:39)
    Tips for Writing a Research Proposal (2005 August 16 17:00)
    Secure Booting the X Box (2005 August 11 07:26)
    Fake Access Points (2005 August 09 17:03)
    McConnell, Steve. Code Complete. Redmond, Washington, USA: Microsoft Press, 1993. ISBN 1556154844 (2005 August 07 18:16)
    Atheros Linux Happy (2005 August 03 13:06)
    Google Map Hacks: Bike Route Distance (2005 August 02 11:53)
    Portable Byte Swapping Functions (2005 August 01 15:45)
    Stupid Bugs: Implementing Comparators (2005 July 27 13:45)
    Context Free (2005 July 14 09:56)
    Martel, Yann. Life of Pi. Toronto: Alfred A. Knopf Canada, 2001. ISBN 0676973760 (2005 July 14 09:21)
    Python Memory Management Part 2 (2005 June 28 18:42)
    Python Worker Queues Thread Pools (2005 June 27 12:27)
    Parallel Python (2005 June 25 16:47)
    Nasar, Sylvia. A Beautiful Mind. New York: Simon and Schuster, 2002. ISBN 0684819066 (2005 June 25 16:40)
    Paper Airplane: An Editable Web (2005 June 15 17:59)
    Canadian Espresso Machine Vendors Part 2 (2005 June 12 22:37)
    CPU Architecture is Dead (2005 June 07 15:23)
    Implementing Python (2005 June 06 14:05)
    HotOS X (2005 June 05 22:01)
    Finding Neverland (2005 June 04 13:07)
    The Canadian Coffee Industry Sucks (2005 June 02 19:27)
    The Inventor of Ethernet (2005 June 01 14:00)
    Useful Python Modules: PyRSS2Gen (2005 May 25 16:23)
    Hornby, Nick. How to be Good. New York: Riverhead Books, July 2001. ISBN 1573221937 (2005 May 24 21:44)
    Chabon, Michael. The Amazing Adventures of Kavalier and Clay. New York: Picador, 2000. ISBN 0312282990 (2005 May 20 18:55)
    Python Challenge (2005 May 19 10:14)
    Convex Hulls (2005 May 18 21:06)
    Why Cell Phones Will Dominate the Future Internet (2005 May 16 16:40)
    Schneier on Spam (2005 May 13 21:03)
    Dynamic Data Visualization (2005 May 13 14:32)
    Fallout 2 (2005 May 12 19:11)
    NFS Shell (2005 May 12 11:28)
    IPv6 API Documentation (2005 May 04 13:21)
    CiteULike (2005 April 19 09:47)
    Billion Transistor CPU Architectures (2005 April 18 20:34)
    Stephen King. Song of Susannah. New York: Scribner, 2004. ISBN 1880418592 (2005 April 14 19:10)
    Stephen King. Wolves of the Calla. New York: Scribner, 2003. ISBN 1880418568 (2005 April 14 19:06)
    The 100 Laptop Project (2005 April 07 15:01)
    Sin City (2005 April 03 15:01)
    Korean Python Documentation (2005 April 01 09:05)
    Speex and Open VoIP 2005

    Original URL path: http://www.evanjones.ca/chronological.html (2016-04-30)



  • Corrupt data over TCP: It was a kernel bug! (evanjones.ca)
    most days, across Twitter's entire machine fleet, the machines never see a packet with a corrupt TCP checksum. However, if a hardware failure occurs, a huge number of packets can be corrupt. The usual error model that assumes corruption is evenly distributed is wrong. In our case, about 10% of packets passing through a bad switch were corrupt, and something like 0.5% had two bits of errors. This

    Original URL path: http://www.evanjones.ca/checksum-failure-is-a-kernel-bug.html (2016-04-30)

  • Fixing Java's ByteBuffer native memory "leak" (evanjones.ca)
    process that would slowly use more and more memory until it hit its limit and was killed. It turns out that Finagle responses are currently contained in heap ByteBuffers, triggering this issue. Finagle will eventually switch to a new version of Netty, which will avoid this issue by using direct ByteBuffers. To work around the problem, Twitter's JVM team added a flag to our internal version to limit the size of this cache. However, it turns out you can easily replace one of the JDK classes for a single program. This makes it easy to avoid this native memory leak by following the steps below. I've sent an email to the nio-dev mailing list to see if we can limit the size of this cache. However, if you are affected by this, you can try my workaround.

    Demonstrating the leak

    I wrote a program that writes to /dev/null from multiple threads, with both heap and direct ByteBuffers. It shows that using direct ByteBuffers works as you expect, where they are garbage collected when they are unused. However, heap ByteBuffers cause direct ByteBuffers to be allocated and cached until the threads exit. You can also use this to show that my quick-and-dirty patch below avoids the leak. I've put the code in a GitHub repository, and the README includes sample output.

    The code behind the leak

    When you pass a ByteBuffer to an I/O API, there are checks to copy heap ByteBuffers to a temporary direct ByteBuffer before making the actual system call. For example, for network I/O you use a SocketChannel, which is actually an instance of sun.nio.ch.SocketChannelImpl. Reading from a socket calls IOUtil.read, and writing calls IOUtil.write. Both methods check if the ByteBuffer is a

    Original URL path: http://www.evanjones.ca/java-bytebuffer-leak.html (2016-04-30)

  • How both TCP and Ethernet checksums fail (evanjones.ca)
    conclusion is that if you are creating a new network protocol, please append a 4-byte CRC. I suggest CRC32C, implemented in hardware on recent Intel, AMD, and ARM CPUs. An alternative is to use an encryption protocol (e.g. TLS), since they include cryptographic hashes, which fixed a similar incident. The rest of this article describes the details about how this is possible, mostly so I don't forget them.

    Properties of the TCP checksum

    The TCP checksum is two bytes long, and can detect any burst error of 15 bits, and most burst errors of 16 bits (excluding switching 0x0000 and 0xffff). This means that to keep the same checksum, a packet must be corrupted in at least two locations, at least 2 bytes apart. If the chance is purely random, we should expect approximately 1 in 2^16 (approximately 0.001%) of corrupt packets to not be detected. This seems small, but on one Gigabit Ethernet connection, that could be as many as 15 packets per second. For details about how to compute the TCP checksum and its error properties, see RFC 1071.

    Properties of the Ethernet CRC

    The Ethernet CRC is substantially stronger, partly because it is twice as long (4 bytes), and partly because CRCs have good mathematical properties, such as detecting all 3-bit errors in 1500-byte Ethernet packets (understanding this is beyond my math skills). It appears that most switches discard packets with invalid CRCs when they are received, and recalculate the CRC when the packet goes back out. This means the CRC really only protects against corruption on the wire, and not inside the switch. Why not just re-send the existing CRC? Modern switch chips have features that modify packets, such as VLANs or explicit congestion notification. Hence, it is

    Original URL path: http://www.evanjones.ca/tcp-and-ethernet-checksums-fail.html (2016-04-30)
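The claim that a packet can be corrupted in two locations and still keep the same TCP checksum can be checked with a short program. The sketch below is an illustration, not code from the article: `internetChecksum` is a hypothetical helper that follows the one's-complement sum described in RFC 1071, and the byte values are made up. One bit is flipped from 0 to 1 in the first 16-bit word and the same bit position is flipped from 1 to 0 in the second word, so the sum, and therefore the checksum, is unchanged. A single flipped bit, by contrast, is detected.

```go
package main

import "fmt"

// internetChecksum computes the 16-bit one's-complement checksum described in
// RFC 1071 over data. An odd trailing byte is treated as the high byte of a
// final 16-bit word. (Hypothetical helper name, for illustration only.)
func internetChecksum(data []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(data); i += 2 {
		sum += uint32(data[i])<<8 | uint32(data[i+1])
	}
	if len(data)%2 == 1 {
		sum += uint32(data[len(data)-1]) << 8
	}
	// Fold any carries back into the low 16 bits.
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return ^uint16(sum)
}

func main() {
	// Made-up payload of three 16-bit words: 0x1234, 0x5679, 0x9abc.
	original := []byte{0x12, 0x34, 0x56, 0x79, 0x9a, 0xbc}
	// Two single-bit errors, 16 bits apart: 0x1234 -> 0x1235, 0x5679 -> 0x5678.
	twoFlips := []byte{0x12, 0x35, 0x56, 0x78, 0x9a, 0xbc}
	// One single-bit error: 0x1234 -> 0x1235.
	oneFlip := []byte{0x12, 0x35, 0x56, 0x79, 0x9a, 0xbc}

	fmt.Printf("original:  %#04x\n", internetChecksum(original))
	fmt.Printf("two flips: %#04x (undetected if equal to original)\n", internetChecksum(twoFlips))
	fmt.Printf("one flip:  %#04x (detected if different from original)\n", internetChecksum(oneFlip))
}
```

Because the checksum is a plain sum of 16-bit words, any pair of errors that cancel each other in the sum goes unnoticed, which is why a CRC is the stronger choice for new protocols.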

  • Naive Retries Considered Harmful (evanjones.ca)
    11:30, the traffic increases dramatically, because another data center failed over for scheduled maintenance. Everything seems fine for nearly 30 minutes, although the traffic from Service B is actually slowly increasing. I believe this increase was caused by Service B timing out and retrying. At 12:00, this service has become slow enough to cause nearly all requests from Service B to time out, causing a massive spike in traffic. This caused most of the instances of this service to die (I no longer recall exactly why). The total traffic then drops, since there is nothing to receive and record it.

    Thankfully, Service B has a policy where after enough requests fail, it stops sending entirely, then slowly ramps back up. This allowed the instances to restart after the failure. However, the service was then in an interesting state, where the traffic from Service B would slowly ramp up, overload the service, then hammer it with a huge spike. This would cause lots of failures, so it would back off and repeat the process. Each spike caused a downstream service to get overloaded and restart. This situation persisted for two hours, until we figured out how to limit the load on the downstream service. This allowed it to survive the next spike by rejecting most of the requests, and the system stabilized.

    Service B's retry policy in this case was both good and bad. The good part is that after many requests fail, it stops sending, then ramps back up slowly, a bit like TCP congestion control. This allowed the service to recover after each spike. The bad part is that each spike sent double the normal traffic and killed our service.

    Update 2015-09-28: As a second real-world example, a few days before I wrote this, Amazon's DynamoDB key-value store service had a serious outage. In their postmortem, they describe how retries during this outage caused heavy load, so that healthy storage servers were having subsequent renewals fail and were transitioning back to an unavailable state. The load was so high that it was preventing Amazon from successfully making the requisite administrative requests when they tried to add more capacity. To prevent this failure in the future, Amazon is reducing the rate at which storage nodes request membership data, which sounds to me like they are adjusting the retry policy.

    Retries are very helpful when there is a temporary error or slowdown. In modern distributed systems, there are a wide variety of things that can cause these kinds of glitches, such as a network or CPU spike in a process on the same machine, an unusually long GC pause, or a hardware failure. The problem is that when the entire system is slow because it is overloaded, retries make things worse. As a result, you must retry intelligently. The challenge is: what does intelligent mean? Currently, I think a good default policy is to send a backup request after a percentile latency target e

    Original URL path: http://www.evanjones.ca/retries-considered-harmful.html (2016-04-30)
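The article's recommended policy is cut off in this excerpt, so the sketch below does not reproduce it. Instead it shows a different, commonly used guard against the "double the normal traffic" failure described above: a retry budget, where retries are only allowed as a fixed fraction of regular requests. All names here (`retryBudget`, `recordRequest`, `tryRetry`) and the 10% figure are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// retryBudget limits retries to a percentage of regular requests.
// Each regular request earns `percent` credits; each retry costs 100 credits.
// (Hypothetical type, not taken from the article or from any library.)
type retryBudget struct {
	mu         sync.Mutex
	credits    int
	percent    int // credits earned per regular request, e.g. 10 means ~10% retries
	maxCredits int // cap so an idle period cannot save up a large burst of retries
}

func newRetryBudget(percent, maxCredits int) *retryBudget {
	return &retryBudget{percent: percent, maxCredits: maxCredits}
}

// recordRequest is called once for every regular (non-retry) request.
func (b *retryBudget) recordRequest() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.credits += b.percent
	if b.credits > b.maxCredits {
		b.credits = b.maxCredits
	}
}

// tryRetry reports whether a retry may be sent, and pays for it if so.
func (b *retryBudget) tryRetry() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.credits < 100 {
		return false
	}
	b.credits -= 100
	return true
}

func main() {
	budget := newRetryBudget(10, 1000)
	requests, retries := 0, 0
	// Simulate a fully overloaded backend: every request fails and wants a retry.
	for i := 0; i < 1000; i++ {
		budget.recordRequest()
		requests++
		if budget.tryRetry() {
			retries++
		}
	}
	// prints: requests=1000 retries=100
	fmt.Printf("requests=%d retries=%d\n", requests, retries)
}
```

With naive retries the same simulation would send 2000 requests. With the budget, the extra load during a total outage is capped at roughly 10%, while an isolated failure can still be retried immediately once some credit has accumulated.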

  • Go Gotcha: Don't take the address of loop variables (evanjones.ca)
    golint, since others have run into the same issue (1, 2, 3). In Go, variables introduced in a for statement are re-used in each iteration (see the specification). This means taking the address of one of these variables always produces a pointer to the same location, which is re-used on each iteration. Consider the following snippet, which creates a slice of pointers to structs from a slice of structs:

        values := []MyStruct{MyStruct{1}, MyStruct{2}, MyStruct{3}}
        output := []*MyStruct{}
        for _, v := range values {
            output = append(output, &v)
        }
        fmt.Println("output: ", output)

    This produces MyStruct{3}, MyStruct{3}, MyStruct{3}, but it should produce MyStruct{1}, MyStruct{2}, MyStruct{3}. To fix it, we have two options, depending on the semantics we want. If the output should contain pointers to a copy of the values, we need to take the address of a new variable. If the output should contain pointers to the original values, we need to take the address of an index expression.

    This problem becomes harder to spot within more complicated expressions (e.g. &v.member also doesn't work) or in longer loops. However, the real challenge is that there are legitimate uses of this expression. In particular, if the pointer is only used inside a single loop iteration, then the value will be what we expect. This means we can accidentally add this bug to some code that previously worked, by accidentally holding on to a pointer for too long. In my case, this is exactly how I ran into this bug.

    A bad automated checker

    I wanted to try finding this bug automatically, so I looked at the source code for go vet's range loop check, as well as errcheck for an example of a standalone tool. My

    Original URL path: http://www.evanjones.ca/go-gotcha-loop-variables.html (2016-04-30)
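The two fixes described in the excerpt (address of a new variable, address of an index expression) can be sketched as follows. The linked playground examples are not part of this excerpt, so this is a reconstruction under assumptions: the field name `N` and the function names `pointersToCopies` and `pointersToOriginals` are invented for illustration. Note also that since Go 1.22, modules that declare `go 1.22` or later get a fresh loop variable on every iteration, so the original bug no longer reproduces there. Both patterns below remain valid in either case.

```go
package main

import "fmt"

// MyStruct mirrors the type named in the article; the field name N is an assumption.
type MyStruct struct{ N int }

// pointersToCopies returns pointers to copies of the values:
// the address is taken of a new variable declared inside the loop body.
func pointersToCopies(values []MyStruct) []*MyStruct {
	output := make([]*MyStruct, 0, len(values))
	for _, v := range values {
		v := v // new variable for this iteration, so each address is distinct
		output = append(output, &v)
	}
	return output
}

// pointersToOriginals returns pointers to the original elements:
// the address is taken of an index expression.
func pointersToOriginals(values []MyStruct) []*MyStruct {
	output := make([]*MyStruct, 0, len(values))
	for i := range values {
		output = append(output, &values[i])
	}
	return output
}

func main() {
	values := []MyStruct{{1}, {2}, {3}}
	copies := pointersToCopies(values)
	originals := pointersToOriginals(values)

	// Mutating the slice is visible through the original pointers only.
	values[0].N = 100
	fmt.Println(copies[0].N, originals[0].N) // prints: 1 100
}
```

Which variant is right depends on whether later changes to the source slice should be visible through the pointers, which is the semantic choice the excerpt refers to.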