
Total: 397

  • Static type checking: More productive for large projects (evanjones.ca)
    ... languages: mostly C at Google, C and some Java at MIT, now Java at Mitro. However, I've written lots of Python in my life, and at Mitro our browser extension is written in JavaScript. In both cases, I occasionally end up with bugs in production because I'm accidentally passing in the wrong arguments to a function, or I'm using the wrong variable or function name. These bugs are particularly amusing after a task has been running for 30 minutes, or when they happen on a customer's machine. It could be that I'm doing it wrong (please tell me how to do it right if that is the case). Many companies write big programs in these languages, so clearly it is possible (Dropbox: Python, Facebook: PHP, Github: Ruby, etc.). However, many of these uses are places where errors can be fixed quickly, like web sites, which makes it cost less to fix runtime errors. For example, Etsy argues that fast deployments are critical to their software engineering productivity. However, even in these cases, I think the additional cost of making the compiler happy pays off by catching some errors when I write them rather than ...

    Original URL path: http://www.evanjones.ca/static_type_checking.html (2016-04-30)


  • A quick and dirty guide to finding Java memory leaks (evanjones.ca)
    ... If you get the error "Unable to open socket file: target process not responding or HotSpot VM not loaded", it probably means the process is running as a different user. Add "sudo -u (process user)" in front of the command line, or run jmap as root with the -F flag. Finally, copy dumpfile to your local machine and run "jvisualvm --openfile dumpfile", or use jvisualvm's File -> Load menu. Once it loads (which is slow), click the Classes tab to see something like the following. This groups objects by class, ordered from most to least instances. If you have a long-running leak, it is probably one of the classes near the top of this list. Sadly, this is the point where you need to use everything you know about the application and the internals of the objects. You need to look for suspicious classes: those that have too many instances or are occupying too much memory. JVisualVM lets you step through all the references to any object, so it gives you the raw tools you need to find the leak, but sometimes it is confusing to understand. As an example, I'll briefly explain how I found this leak. In this case, we have millions of LinkedHashMap.Entry objects, which must be part of some LinkedHashMap. Tracing through some random instances didn't immediately reveal anything suspicious, so instead I re-sorted by size. HashMap.Entry[] (arrays of HashMap.Entry objects) jumped to second place, occupying 32 MB of memory, or 15% of the heap. This seemed promising: for it to jump up so much means there must be some particularly large arrays. Drilling into that class, ordered by size, revealed the following. The left-hand pane shows all instances, with the largest first. The top entry (selected) is ...

    Original URL path: http://www.evanjones.ca/java_memory_leaks.html (2016-04-30)

  • Databases should encrypt unique IDs (evanjones.ca)
    ... object references" or "authorization bypass through user-controlled key". Finally, third parties can learn information about your system by observing the ids (e.g. you can estimate the number of accounts on Github by looking up the id of an account you just created, or learn the time a MongoDB item was created). One way to make these errors more difficult is to use random ids, which need to be much larger than sequential ids (e.g. 128 bits / 16 bytes) and require slow random inserts. A second solution is to make IDs unique for each user, which requires writing application-level code. I think the best solution would be for databases to automatically encrypt ids, with a unique key per table. You could implement this at the application level: Create a table in your database to store keys. For each table in your database, generate a unique 128-bit AES key. Each time you retrieve an object from that table, encrypt the id with the key, which outputs a pseudo-random integer of the same length. When you do a query on the id column, decrypt the id value(s). This provides many of the same benefits of random ...

    Original URL path: http://www.evanjones.ca/encrypted-db-ids.html (2016-04-30)
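    The application-level steps in this excerpt can be sketched roughly as follows. This is an illustration under stated assumptions, not the article's implementation: the excerpt calls for AES, but Python's standard library has no AES, so the sketch swaps in a 4-round Feistel network built on HMAC-SHA256 to get a reversible mapping over 64-bit ids. The names `TABLE_KEYS`, `encrypt_id`, and `decrypt_id` are hypothetical, the keys live in a dict rather than a database table, and the construction has not been reviewed for production use.

```python
import hashlib
import hmac
import os

MASK32 = 0xFFFFFFFF

# Hypothetical stand-in for the "table of keys": one random 128-bit key per table.
TABLE_KEYS = {"accounts": os.urandom(16), "orders": os.urandom(16)}


def _round(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: HMAC-SHA256 truncated to 32 bits."""
    msg = bytes([round_no]) + half.to_bytes(4, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def encrypt_id(key: bytes, row_id: int, rounds: int = 4) -> int:
    """Map a 64-bit sequential id to a pseudo-random 64-bit public id."""
    left, right = (row_id >> 32) & MASK32, row_id & MASK32
    for r in range(rounds):
        left, right = right, left ^ _round(key, r, right)
    return (left << 32) | right


def decrypt_id(key: bytes, public_id: int, rounds: int = 4) -> int:
    """Invert encrypt_id by running the Feistel rounds in reverse order."""
    left, right = (public_id >> 32) & MASK32, public_id & MASK32
    for r in reversed(range(rounds)):
        left, right = right ^ _round(key, r, left), left
    return (left << 32) | right


if __name__ == "__main__":
    key = TABLE_KEYS["accounts"]
    public_id = encrypt_id(key, 42)  # value handed to clients
    assert decrypt_id(key, public_id) == 42  # value used in the WHERE clause
```

    Because a Feistel network is a permutation, distinct row ids always map to distinct public ids, so the database can keep its fast sequential inserts while clients only ever see the encrypted values.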



  • Trouble with transactions (evanjones.ca)
    ... concurrent conflicting updates are so rare in the real world no one notices. (I would love to see some concrete evidence to support or disprove this theory, if anyone wants a research project.) However, since transitioning from building databases to using databases, I've learned that transactions can actually cause problems, even if you ignore potential performance and scalability problems. I gave a lightning talk at Ricon East 2013 with my rough thoughts on this subject (PDF slides, video), but there were technical difficulties with the slides. I would love to hear opinions about using transactions in real applications, both problems and advantages, so I can flesh this out into a full-length, intelligent article. In brief, the problems caused by transactions that we have run into are:
      • Weak consistency defaults. See Peter Bailis's article for details.
      • Indirectly calling functions that abort/commit a transaction in the middle, losing atomicity.
      • Database APIs implicitly start a new transaction, hiding rollbacks or commits that are in the wrong place.
      • Communicating with external systems needs to happen after the transaction commits, complicating program structure.
      • Concurrency errors can happen at any point, so most programs need a top-level error handler.
      • May want ...

    Original URL path: http://www.evanjones.ca/trouble-with-transactions.html (2016-04-30)
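    One of the listed problems, that communication with external systems has to wait until the transaction commits, can be illustrated with a small sketch. It assumes Python's standard sqlite3 module, a hypothetical `accounts` table, and a plain list named `outbox` standing in for the external system; none of these come from the article.

```python
import sqlite3


def create_account(conn: sqlite3.Connection, name: str, outbox: list) -> None:
    """Insert an account and release its welcome message only after commit."""
    pending = []  # side effects collected while the transaction is open
    with conn:  # commits on success, rolls back if the block raises
        pending.append(f"welcome:{name}")
        conn.execute("INSERT INTO accounts (name) VALUES (?)", (name,))
    # Reached only when the commit succeeded; a rollback skips this line.
    outbox.extend(pending)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT UNIQUE)")
outbox = []
create_account(conn, "alice", outbox)
try:
    create_account(conn, "alice", outbox)  # violates UNIQUE, so it rolls back
except sqlite3.IntegrityError:
    pass
print(outbox)
```

    The second call queues a message and then fails, but because delivery is deferred until after the `with` block, nothing is sent for the rolled-back insert. The cost is exactly the structural complication the excerpt describes: every side effect has to be threaded out of the transactional code.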

  • New York State: Unfriendly to small businesses (evanjones.ca)
    ... 14 days. Worse, we had recently moved offices, so when I got the letter we had 2 days until the deadline. After reviewing the rules, I couldn't argue with the facts: according to the law, we didn't have worker's compensation insurance for 6 weeks for 3 employees, so the fine is $4000. However, this seems horribly unjust: we made an honest mistake, and corrected it immediately once we were informed of our error. I decided that we had a reasonable basis to file an appeal, so I wrote a letter explaining the circumstances. Because I didn't want to get assessed late penalties, I also included payment for the fine, assuming that the difference would be refunded once the appeal was resolved. That was my mistake. It turns out that if you pay the fine, the WCB considers that an admission of guilt, despite the fact that the envelope containing the payment also included an appeal. According to the agent I spoke to on the phone, there is literally no way to get this issue re-opened, beyond filing a lawsuit. As a result, we are out $4000 because I tried to do the right thing, and because I misunderstood the directions and erred on the side of caution. There is nothing I hate more than getting penalized for trying to be honest. All it would have taken to prevent this is to have a WCB employee with half a brain open the envelope, notice that I included both a payment and an appeal, and just shred my check and process the appeal instead. My favorite part is that three weeks later, we received a letter from a different department of the worker's compensation board. It turns out we also need disability insurance, and they ...

    Original URL path: http://www.evanjones.ca/nystate-biz-unfriendly.html (2016-04-30)

  • Integer division: Results are language-specific (evanjones.ca)
    ... are multiple ways of handling the case where the division of integers produces a real number, i.e. when the dividend is not an exact multiple of the divisor. There are only two variants used in the programming languages I care about. For computing q = D / d:

    Truncated division: q = truncate(D / d). Effectively rounds towards zero by dropping the fractional part. The sign of the modulus (remainder) is the same as the dividend (D). Used by C99, C++ 2011, Java, Javascript, C#, Go, and hardware.

    Floored division: q = floor(D / d). Effectively rounds towards negative infinity. The sign of the modulus (remainder) is the same as the divisor (d). Used by Python, Ruby, Matlab, R, and Excel.

    Example outputs for ±5 / ±3:

                 Truncated (C)     Floored (Python)
                 div    mod        div    mod
       5 /  3     1      2          1      2
      -5 /  3    -1     -2         -2      1
       5 / -3    -1      2         -2     -1
      -5 / -3     1     -2          1     -2

    Basically, mathematicians prefer the properties of floored division over truncated division, and hence the "mathy" languages use it. Personally, I am more familiar with truncated division and prefer it. I find it less surprising, since changing the signs of the inputs only changes the ...

    Original URL path: http://www.evanjones.ca/integer-division.html (2016-04-30)
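    The two variants can be compared directly in Python, whose built-in `//` and `%` operators implement floored division. The helper names `floored_divmod` and `truncated_divmod` below are hypothetical, and the truncated version is a sketch that emulates C-style behavior with integer arithmetic only.

```python
def floored_divmod(D: int, d: int) -> tuple:
    """Python's native behavior: the quotient rounds toward negative infinity."""
    return D // d, D % d


def truncated_divmod(D: int, d: int) -> tuple:
    """C/Java-style behavior: the quotient rounds toward zero."""
    q = abs(D) // abs(d)
    if (D < 0) != (d < 0):
        q = -q  # operands with different signs give a negative quotient
    return q, D - q * d  # the remainder takes the sign of the dividend


if __name__ == "__main__":
    for D, d in [(5, 3), (-5, 3), (5, -3), (-5, -3)]:
        print(D, d, truncated_divmod(D, d), floored_divmod(D, d))
```

    Both helpers satisfy the identity D = q * d + r; they differ only in which way q is rounded when the operands have opposite signs.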

  • Builds are Complicated: Java (evanjones.ca)
    ... useful, but small enough to still be simple. However, even Ninja makes assumptions that are violated by some languages, such as Java. Here are a bunch of ways that the way Java is compiled is unusual and tends to break build tools like Ninja.

    Output directories: The recommended approach is to put .java files in a directory hierarchy that matches the package structure. However, this is not required. Thus, package/subpackage/A.java could end up creating otherpackage/A.class.

    Multiple, unpredictable output files: A single .java file can produce multiple .class files. Worse yet, you can't figure out the names without parsing the file. While nested types will produce files named name.class, which is relatively predictable, Java allows you to include non-nested, package-private classes in the same file. As an example: public class Wtf { public static final int V = 1; } class PackagePrivate { public static final int U = 2; } Compiling Wtf.java produces both Wtf.class and PackagePrivate.class. This causes a problem for incremental builds, since the build system may be unaware of these extra classes. For example, if PackagePrivate is removed from Wtf.java, the build system needs to know to delete PackagePrivate.class. Otherwise, classes that depend on the PackagePrivate class will continue to compile, instead of generating a compiler error.

    Implicit compilation of dependencies: When javac is used on a single .java file, it will also implicitly compile all the classes it depends on (technically, the transitive closure of all dependencies). This means build tools need to do something complicated to avoid unnecessary recompilation, and it complicates parallel builds. This is actually sort of required by the language: javac must at least parse all dependencies, since Java permits circular dependencies between classes. This can also cause problems when creating .jar files, since ...

    Original URL path: http://www.evanjones.ca/build-java.html (2016-04-30)

  • Python Module Name Clashes (evanjones.ca)
    ... to the module in Python versions 2.7.3 (see below). Thus mymodule first searches in its own package, and finds our own email.py instead of the built-in module. If we change the script to just import email, we get: python script.py mypackage.email imported mymodule mypackage.email <module 'mypackage.email' from 'mypackage/email.pyc'> mymodule email <module 'mypackage.email' from 'mypackage/email.pyc'>

    So how do we get the built-in module? We need to tell Python that we want absolute imports. This is the default in Python 3, and it may become the default for future versions of Python 2.x as well. To do this, we need to use "from __future__ import absolute_import". If we add that line, we now get the following output: script.py mypackage.email imported mymodule imported mypackage.email <module 'mypackage.email' from 'mypackage/email.py'> mymodule email <module 'email' from '/System/.../python2.7/email/__init__.pyc'>

    Victory! We now can access both modules: one as mymodule.email and the other as email. If we explicitly want a relative import, we can use "from . import email as local_email", and then we get: script.py mypackage.email imported mymodule mypackage.email <module 'mypackage.email' from 'mypackage/email.pyc'> mymodule email <module 'email' from '/System/.../python2.7/email/__init__.pyc'> mymodule local_email <module 'mypackage.email' from 'mypackage/email.pyc'>

    Next Problem: Now let's write a unit test for mypackage.mymodule. We create a file called mypackage/mymodule_test.py that imports mypackage.mymodule: mypackage/mymodule_test.py Traceback (most recent call last): File "mypackage/mymodule_test.py", line 4, in <module> import mypackage.mymodule ImportError: No module named mypackage.mymodule

    Ah, right: we need to set our PYTHONPATH so it can find the module. Let's try again: PYTHONPATH=. mypackage/mymodule_test.py mypackage.email imported mypackage.email imported mymodule mypackage.email <module 'mypackage.email' from 'mypackage/email.pyc'> mymodule email <module 'email' from 'mypackage/email.pyc'> mymodule local_email <module 'mypackage.email' from 'mypackage/email.pyc'>

    Wait a second, look at that closely. In mymodule, it found our own email.py module for both import email and import mypackage.email, even though we are specifying that we want absolute imports. Didn't we just fix this problem? Why isn't it still fixed?

    Script Relative Imports: The problem now is that Python puts the script's directory at the beginning of the module search path (sys.path or PYTHONPATH). In Python, the main script is assumed to be at the root of the package tree. Doing anything differently, like trying to put the mymodule_test script inside mypackage, breaks things. The first warning sign here is that we needed to specify our own PYTHONPATH. However, even i ... There are two easy but unsatisfying solutions: either put all the main Python scripts in the actual root of your package hierarchy, or rename files to avoid name clashes with built-in modules. I have, however, found a disgusting hack that fixes ...

    Original URL path: http://www.evanjones.ca/python-name-clashes.html (2016-04-30)
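    The script-directory behavior described at the end of this excerpt still applies to Python 3 by default, and can be reproduced with a short sketch. It builds a throwaway `mypackage` directory containing an `email.py` and a test script, then runs the script in a fresh interpreter. The file contents and the `SHADOW` marker are hypothetical, and the `-E` and `-S` flags are only there so that environment variables and site customizations do not interfere with the demonstration.

```python
import os
import subprocess
import sys
import tempfile


def run_script_next_to_shadowing_module() -> str:
    """Run a script that sits beside email.py and report which email it imported."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "mypackage")
        os.mkdir(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w"):
            pass  # empty file marks the directory as a package
        with open(os.path.join(package_dir, "email.py"), "w") as f:
            f.write("SHADOW = True\n")  # marker the standard library module lacks
        script = os.path.join(package_dir, "mymodule_test.py")
        with open(script, "w") as f:
            f.write("import email\nprint(hasattr(email, 'SHADOW'))\n")
        # The script's own directory becomes sys.path[0], ahead of the stdlib.
        result = subprocess.run(
            [sys.executable, "-E", "-S", script],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_script_next_to_shadowing_module())
```

    Even though `import email` is an absolute import, the lookup starts in the script's directory, so the local file is found first. Moving the script to the root of the package hierarchy, one of the two solutions named in the excerpt, takes `mypackage/` itself off the front of the search path.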