<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.2.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Reasonably Logical</title>
	<link>http://www.philipreames.com/Blog</link>
	<description>Reflections on technology, society, and their infinite interactions</description>
	<pubDate>Sun, 22 Jan 2012 21:44:01 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.1</generator>
	<language>en</language>
			<item>
		<title>Coding Best Practice: 5 Whys applied to bug fixing</title>
		<link>http://www.philipreames.com/Blog/2012/01/22/coding-best-practice-5-whys-applied-to-bug-fixing/</link>
		<comments>http://www.philipreames.com/Blog/2012/01/22/coding-best-practice-5-whys-applied-to-bug-fixing/#comments</comments>
		<pubDate>Sun, 22 Jan 2012 21:44:01 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2012/01/22/coding-best-practice-5-whys-applied-to-bug-fixing/</guid>
		<description><![CDATA[When you find a bug, fixing it is not enough.  You should take a moment or two to think about how the bug was able to occur.  Frequently, a single bug report (particularly in more complex systems) hints at a number of underlying issues.  
I generally try to find at least three [...]]]></description>
			<content:encoded><![CDATA[<p>When you find a bug, fixing it is not enough.  You should take a moment or two to think about how the bug was able to occur.  Frequently, a single bug report (particularly in more complex systems) hints at a number of underlying issues.  </p>
<p>I generally try to find at least three independent fixes to ensure that a given bug couldn&#8217;t occur again.  You could think of this as <a href="http://en.wikipedia.org/wiki/Defense_in_Depth_%28computing%29">defense in depth</a> or <a href="http://en.wikipedia.org/wiki/Defensive_programming">defensive programming</a>, but my original inspiration to apply this came from the <a href="http://en.wikipedia.org/wiki/5_Whys">5-whys principle</a>.  In my case, I use 3, but the exact number isn&#8217;t the important part.  </p>
<p>Good questions to reflect on:<br />
1) Was there an earlier point in code where we could have noticed something was going wrong?  Can I improve the error reporting anywhere along the call trace?<br />
2) Can I refactor the interface or code to make this class of mistakes less likely?  Can I better document what the interface is?<br />
3) Is there another way I could hit this same error case?  If so, can I quickly (with low risk) fix that one too?<br />
4) What test could I write which would find this error?  (Is it worth the time?)<br />
5) Are there simple code changes which would have made debugging this much faster?  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2012/01/22/coding-best-practice-5-whys-applied-to-bug-fixing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Coding Best Practice: The role of comments in source code</title>
		<link>http://www.philipreames.com/Blog/2012/01/16/coding-best-practice-the-role-of-comments-in-source-code/</link>
		<comments>http://www.philipreames.com/Blog/2012/01/16/coding-best-practice-the-role-of-comments-in-source-code/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 18:44:34 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2012/01/16/coding-best-practice-the-role-of-comments-in-source-code/</guid>
		<description><![CDATA[The purpose of a comment in source code is to document why you are doing something or to provide a quick summary of a non-obvious algorithm.  It is not to describe what you are doing.  Generally, a skilled programmer - i.e. hopefully your peers - can figure out what you&#8217;re doing from code [...]]]></description>
			<content:encoded><![CDATA[<p>The purpose of a comment in source code is to document <em>why</em> you are doing something or to provide a <em>quick summary</em> of a non-obvious algorithm.  It is not to describe <em>what</em> you are doing.  Generally, a skilled programmer - i.e. hopefully your peers - can figure out <em>what</em> you&#8217;re doing from code just fine.  The often confusing part is <em>why</em>.  (i.e. &#8220;Is this intentional or a bug?  If it&#8217;s intentional, why is it needed?&#8221;)  </p>
<p><strong>Good example comments:</strong><br />
&#8220;An implementation of merge sort.  See http://en.wikipedia.org/wiki/Merge_sort for an overview.&#8221;</p>
<p>&#8220;There are two obvious implementations here:<br />
a) describe<br />
b) describe<br />
Benchmarking (using the test cases in tests/mybench/*.cxx) shows that option a) is faster by ~15%.&#8221;</p>
<p><strong>Bad example comments:</strong></p>
<p>&#8220;This is broken. (with no explanation or testcase)&#8221;<br />
(some complicated bit of code here)</p>
<p>&#8220;Sorting an array&#8221;<br />
std::sort(vec.begin(), vec.end());</p>
<p>&#8220;(none)&#8221;<br />
(massive block of hard to read code here)</p>
<p>&#8220;(none)&#8221;<br />
(edge case or tricky non-obvious behavior)</p>
<p>If you find yourself writing lots of comments (or none), you&#8217;re probably &#8220;doing it wrong&#8221;.  Go read up on <a href="http://en.wikipedia.org/wiki/Self-documenting">self-documenting code</a> and start practicing.  Remember that self-documenting code isn&#8217;t writing code without comments; it&#8217;s writing code where the comments are executable code themselves.  </p>
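<p>To make the what/why distinction concrete, here is a minimal sketch in Python.  The function, the constant, and the retry scenario are all made up for illustration.  The first comment in the loop merely restates the code; the second explains a decision the code alone can&#8217;t convey.</p>
<pre>
```python
MAX_RETRIES = 3  # hypothetical limit, chosen only for this example


def retry_delays(base_seconds):
    """Return the wait before each retry, doubling every time."""
    delays = []
    delay = base_seconds
    # Bad ("what"): loop MAX_RETRIES times and append delay to the list.
    # Good ("why"): double the wait on each retry so that a struggling
    # server gets progressively more breathing room instead of a burst.
    for _ in range(MAX_RETRIES):
        delays.append(delay)
        delay *= 2
    return delays
```
</pre>
<p>Note that the descriptive names and the docstring already carry most of the &#8220;what&#8221;, which is the point of self-documenting code.</p>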
<p>Please note: Nothing above should be read to discourage the use of function documentation.  That&#8217;s a separate topic which I may mention later.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2012/01/16/coding-best-practice-the-role-of-comments-in-source-code/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Introduction to extending Clang/LLVM</title>
		<link>http://www.philipreames.com/Blog/2012/01/10/introduction-to-extending-clangllvm/</link>
		<comments>http://www.philipreames.com/Blog/2012/01/10/introduction-to-extending-clangllvm/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 19:45:17 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[PL Theory/Design]]></category>

		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2012/01/10/introduction-to-extending-clangllvm/</guid>
		<description><![CDATA[I&#8217;ve spent the first part of today watching some of the videos from the LLVM Dev Meeting that occurred back in November.  (I really wish I&#8217;d been able to attend!)  The first talk I watched was the Extending Clang talk by Doug Gregor with Apple.  If you&#8217;re thinking about playing around with [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve spent the first part of today watching some of the videos from the <a href="http://llvm.org/devmtg/2011-11/">LLVM Dev Meeting that occurred back in November</a>.  (I really wish I&#8217;d been able to attend!)  The first talk I watched was the <a href="http://llvm.org/devmtg/2011-11/#talk4">Extending Clang talk by Doug Gregor with Apple</a>.  If you&#8217;re thinking about playing around with Clang, I strongly suggest you watch this video.  I&#8217;ve spent the last few months hacking on clang for a language extension I&#8217;m working on and this was by far the best introduction I&#8217;ve seen.  I really wish I&#8217;d come across this before I spent hours learning it myself.  <img src='http://www.philipreames.com/Blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Other useful links:</p>
<ul>
<li><a href="http://clang.llvm.org/docs/InternalsManual.html">Clang Internals Manual</a> (a great reference for folks looking to extend clang)</li>
<li><a href="http://clang.llvm.org/docs/InternalsManual.html#AddingAttributes">How to add custom attributes </a>(from the above)</li>
<li><a href="http://blog.llvm.org">The LLVM Project Blog </a>- which is a good way to track major project starts</li>
<li><a href="http://llvm.org/docs/LangRef.html">LLVM Reference Manual</a> - Authoritative documentation on the LLVM language</li>
<li>Some useful docs on the <a href="http://llvm.org/docs/LinkTimeOptimization.html">Link Time Optimization project</a> (LTO)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2012/01/10/introduction-to-extending-clangllvm/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Reflections from a crash course in OpenCL</title>
		<link>http://www.philipreames.com/Blog/2012/01/01/reflections-from-a-crash-course-in-opencl/</link>
		<comments>http://www.philipreames.com/Blog/2012/01/01/reflections-from-a-crash-course-in-opencl/#comments</comments>
		<pubDate>Mon, 02 Jan 2012 04:21:19 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2012/01/01/reflections-from-a-crash-course-in-opencl/</guid>
		<description><![CDATA[Over the last few months, I&#8217;ve had an opportunity to spend some time playing with OpenCL.  In short, we&#8217;re trying to use a GPU to accelerate garbage collection for Java.   (Once the work is published, I&#8217;ll post more here.)  We&#8217;ve implemented a simple graph traversal algorithm on an AMD chip using [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last few months, I&#8217;ve had an opportunity to spend some time playing with OpenCL.  In short, we&#8217;re trying to use a GPU to accelerate garbage collection for Java.   (Once the work is published, I&#8217;ll post more here.)  We&#8217;ve implemented a simple graph traversal algorithm on an AMD chip using OpenCL.  This article doesn&#8217;t talk about that effort directly, but instead focuses on a few of the lessons we learned the hard way while getting up to speed on OpenCL.  (So I remember them for next time!)</p>
<p>This has been a group effort, but the content, opinions, and mistakes herein are all my own.   </p>
<p><strong>Stability &#038; Dev Environment</strong></p>
<p>The first and most important lesson we learned was that <strong>each developer needs a dedicated test machine</strong> which is <em>not</em> their primary development box.  This box needs to be local.  When debugging OpenCL programs on real hardware, it is <em>shockingly</em> easy to lock up the entire box.  On multiple occasions, we had to perform hard power cycles on our test machine to get it into a usable state.  </p>
<p>Even when the box didn&#8217;t lock up entirely, a crashed program with an OpenCL kernel outstanding has a bad tendency to prevent future kernels from being executed.  Supposedly, there should be a timeout that will terminate a runaway program, but we never saw this happen in practice. Instead, we ended up rebooting the box quite frequently.   </p>
<p>In a related vein, we quickly started replacing every while-loop with a for-loop (over a large, but <em>fixed </em>number of iterations).  This allows you to (sometimes) recover from what would otherwise have been an infinite loop without rebooting the box.  </p>
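<p>The pattern looks roughly like the sketch below.  It is written in Python rather than OpenCL C, since the idea is language-independent; the names MAX_STEPS and bounded_walk and the linked-structure walk itself are assumptions made for this example, not code from our project.</p>
<pre>
```python
MAX_STEPS = 1_000_000  # large but fixed bound (illustrative value)


def bounded_walk(next_index, start):
    """Follow next_index links from start until reaching -1.

    Returns (steps_taken, finished).  finished is False when the bound
    was hit first, which usually means the links contain a cycle.
    """
    node = start
    # Instead of "while node != -1", loop a fixed number of times.
    for step in range(MAX_STEPS):
        if node == -1:
            return step, True
        node = next_index[node]
    return MAX_STEPS, False
```
</pre>
<p>The caller can then treat an unfinished walk as an error to report, rather than as a hang.</p>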
<p>Another important note is that the documentation available from <a href="http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/">Khronos</a> is at best incomplete and in a couple of cases potentially wrong.  Many of the function descriptions don&#8217;t provide relevant details about usage and none of them provide useful examples.  (You can get some of the latter from the <a href="http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx">AMD</a> and <a href="http://developer.nvidia.com/opencl-sdk-code-samples">NVIDIA</a> SDKs.)  I strongly suggest searching Google for examples before taking the documentation at its word.  </p>
<p>OpenCL does not appear to support a mechanism to forcibly abort a kernel.  Nor does it support an assertion mechanism.  Nor does it have any form of debug logging (i.e. printf or the like.)  The only way to exit a kernel function is to return from the <em>main </em>kernel function <em>with all threads.</em>  Unfortunately, this means that error reporting - even for cases where you can easily tell what happened - is extremely hard.  I don&#8217;t have a great solution.  We ended up writing data into global memory - so the CPU could access it after termination - and then trying to exit cleanly.  This worked sometimes, but was error prone to say the least.  </p>
<p>I haven&#8217;t played with the various <a href="http://developer.amd.com/tools/gDEBugger/Pages/default.aspx">debuggers </a>and <a href="http://code.google.com/p/ocl-emu/">emulators</a> available, but I suspect that would help greatly in debugging.  </p>
<p><strong>Synchronization</strong></p>
<p><strong>OpenCL has different synchronization models for threads within a workgroup vs across workgroups on the same device.  </strong>As far as I can tell, <strong>there is <em>no </em>synchronization available between kernels running on different devices </strong>on the same machine.  (You can use the CPU to coordinate starting and stopping kernels of course.)  </p>
<p><strong><a href="http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/barrier.html">Barriers</a> apply only to threads within a single workgroup. </strong> The CLK_LOCAL/GLOBAL_MEM_FENCE parameters enforce memory consistency within a single workgroup, not across workgroups.  Note that you can have a barrier - where all threads stop - but not have a consistent view of memory if you don&#8217;t pass the appropriate flags.  </p>
<p><strong><em>ALL </em>threads within a workgroup must encounter the <em>same </em>barrier</strong>.  If even a single thread does not, the program will hang indefinitely.  (And require a hard reboot of the machine.)  This is unpleasant to debug to say the least.  </p>
<p><strong>Atomic operations are the only way to synchronize between workgroups. </strong> To avoid memory contention (and thus serialization of requests), you probably want only a single thread per workgroup to execute the atomic operation.  Doing this requires an additional synchronization (using a barrier within the workgroup and a temporary local memory value) to get all threads within a workgroup consistent.  </p>
<p><strong>Be careful about which versions of the atomic functions you use.</strong>  OpenCL provides 32-bit vs 64-bit and local vs global memory versions.  The ones we used - which unfortunately are extensions not part of the language, but thankfully seem pretty common - were <a href="http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/cl_khr_int64_base_atomics.html">cl_khr_int64_base_atomics</a> and <a href="http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/cl_khr_int64_extended_atomics.html">cl_khr_int64_extended_atomics</a>.  I&#8217;ve read some reports that the atomic_op functions don&#8217;t function the same as the atom_op versions.  I can&#8217;t find confirmation of this in the documentation, but we used the atom_op versions just in case.  Another gotcha is that some cards apparently don&#8217;t support the local versions.  Check your documentation carefully since by some reports the functions will simply fail silently.  </p>
<p>Note that it is unclear whether the atomic operations on global memory are visible to the CPU, different GPUs, or merely different workgroups on the same device.  I haven&#8217;t spent much time digging through the documentation, but if this matters to you, check!  The one thing that is clear from the documentation is that atomic operations executed by different GPUs on a shared address are <em>not guaranteed to be atomic</em>!</p>
<p><strong>Infrastructure</strong></p>
<p>To get good performance - even just to minimize testing time - you should probably be using precompiled files.  (Note: These are not binary files and cannot be moved between machines.  They are purely a caching mechanism.)  You&#8217;ll need a mechanism - hash, command line parameter, build system, etc. - to make sure your cached files stay in sync with your source code of course.  </p>
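<p>As one possible version of the hash approach, the sketch below derives a cache key from the kernel source text and the build options.  It is written in Python and calls no OpenCL API; the function names and the file naming scheme are assumptions for illustration only.</p>
<pre>
```python
import hashlib


def kernel_cache_key(source, build_options=""):
    """Return a hex digest identifying one (source, options) pair."""
    digest = hashlib.sha256()
    digest.update(source.encode("utf-8"))
    # Separator so ("ab", "c") and ("a", "bc") hash differently.
    digest.update(b"\0")
    digest.update(build_options.encode("utf-8"))
    return digest.hexdigest()


def cache_file_name(source, build_options=""):
    """Name of the cached compiled file (hypothetical naming scheme)."""
    return "kernel-" + kernel_cache_key(source, build_options)[:16] + ".bin"
```
</pre>
<p>If the file named by the key exists, load it; otherwise compile and store the result under that name.  Since the cached files can&#8217;t be moved between machines, you would probably also want to mix something device-specific into the key.</p>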
<p>Having a separate program which sanity checks your files - i.e. part of your build system - will save you time in the long run.  If I get time, I&#8217;ll clean the hacky mess I&#8217;ve been using and post it here.  </p>
<p><strong>Generally, the best way to get data from the CPU to the GPU (at least on our setup) is to use CL_MEM_USE_HOST_PTR. </strong> There seems to be a lot of confusion on exactly what this does, the top Google results appear inaccurate, and the <a href="http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clCreateBuffer.html">documentation</a> isn&#8217;t super clear, but some micro benchmarks gave much better results than for either of the other two options.  (As always, you cannot assume that the CPU and GPU have consistent views of this data or that it&#8217;ll be mapped to the same address on both platforms.  All synchronization with the GPU kernels has to be explicit.)  It&#8217;s also unclear to me whether OpenCL is required to copy the data back into the host memory after termination or whether that region can be entirely stale.  That wasn&#8217;t important for our case, so I never tested it.  The documentation is unclear.  The best discussion I&#8217;ve seen is <a href="http://www.khronos.org/message_boards/viewtopic.php?f=41&#038;t=3226">here</a>, but even that&#8217;s somewhat unclear on the finer points.  </p>
<p>Depending on what you&#8217;re doing, you may find some of the various utility libraries useful - <a href="http://www.browndeertechnology.com/coprthr_stdcl.htm">COPRTHR: STDCL</a>, <a href="http://www.bigncomputing.org/Big_N_Computing/Big_N_Computing/Entries/2010/2/22_Small_Brick,_Big_%E2%80%98N%E2%80%99.html">SOCL</a>, or oclUtils from the NVIDIA SDK.  The only one of these I&#8217;ve used is the oclUtils files, which were moderately useful.  </p>
<p><strong>Conclusion</strong></p>
<p>I hope this was useful to you.  If you have corrections, or suggestions, please feel free to <a href="http://www.philipreames.com/#contact">contact me</a>.  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2012/01/01/reflections-from-a-crash-course-in-opencl/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Principles of Good Programming</title>
		<link>http://www.philipreames.com/Blog/2011/12/02/principals-of-good-programming/</link>
		<comments>http://www.philipreames.com/Blog/2011/12/02/principals-of-good-programming/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 21:36:58 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/12/02/principals-of-good-programming/</guid>
		<description><![CDATA[I&#8217;d been planning to write a post for a while now on what I consider core principles of programming, but instead I found someone who said most of what I would.  Rather than repeat what&#8217;s already been said, I&#8217;ll just recommend you go read Christopher Diggins&#8217;s list.  
http://www.artima.com/weblogs/viewpost.jsp?thread=331531
The ones I personally rate most [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d been planning to write a post for a while now on what I consider core principles of programming, but instead I found someone who said most of what I would.  Rather than repeat what&#8217;s already been said, I&#8217;ll just recommend you go read Christopher Diggins&#8217;s list.  </p>
<p><a href="http://www.artima.com/weblogs/viewpost.jsp?thread=331531">http://www.artima.com/weblogs/viewpost.jsp?thread=331531</a></p>
<p>The ones I personally rate most important are: Write Code for the Maintainer, and Embrace Change.  I see quite a few others on his list as being subitems of the first.  For example, KISS, Avoid Premature Optimization, Don’t make me think, Principle of least astonishment, Single Responsibility Principle, and Hide Implementation Details are all about making sure the reader of the code can scan quickly and not get bogged down.  This is extremely important if any large code base is going to be maintained over the long haul.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/12/02/principals-of-good-programming/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Understanding Type Systems</title>
		<link>http://www.philipreames.com/Blog/2011/11/01/understanding-type-systems/</link>
		<comments>http://www.philipreames.com/Blog/2011/11/01/understanding-type-systems/#comments</comments>
		<pubDate>Tue, 01 Nov 2011 05:00:09 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[PL Theory/Design]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/11/01/understanding-type-systems/</guid>
		<description><![CDATA[The purpose of this post is to briefly summarize a few important terms that I&#8217;ve seen thrown around for type systems and try to clarify what they mean.  Usually, I&#8217;d go to Wikipedia for such a thing, but the article on type systems is ill organized and hard to understand.  I&#8217;ve spent the [...]]]></description>
			<content:encoded><![CDATA[<p>The purpose of this post is to briefly summarize a few important terms that I&#8217;ve seen thrown around for type systems and try to clarify what they mean.  Usually, I&#8217;d go to Wikipedia for such a thing, but the <a href="http://en.wikipedia.org/wiki/Type_system">article on type systems</a> is ill organized and hard to understand.  I&#8217;ve spent the last few weeks reading papers on type systems, and I still don&#8217;t understand it!</p>
<p>I&#8217;ve chosen to organize this in terms of several orthogonal axes of typing systems.  This organization is mostly my own, but I&#8217;m freely stealing the best ideas from the papers I&#8217;ve read as well. </p>
<p><strong>Typed vs Untyped</strong> - Typing is an organization of data which classifies fields or records into distinct groups.  These groups can be either predefined or user defined.  A language is typed if there exists a feature which helps to express such typing or if such a distinction is implicit in the language specification.  Note that a language does not need to check or enforce this semantic organization in any way to be typed. </p>
<p>By this definition, even something like assembly language is typed for some aspects.  On most ISAs, there are distinct instructions for operating on floating point numbers and integers.  On the other hand, not all ISAs define separate instructions for operating on signed vs unsigned integers.  This highlights the important point that a language can be typed with respect to some attribute and not typed with respect to another. </p>
<p><strong>Strong vs Weak</strong> - The strength of the typing is essentially just how easy it is to get around.  A strongly typed language has a type system which can not be avoided.  A weakly typed system is one that is more of a suggestion than an enforced rule.  Note that the strength of the typing system says nothing about when the enforcement may happen.  (We&#8217;ll get to that in a second.) </p>
<p>In practice, every language I know of is somewhere in the middle.  A language may be closer to strongly typed or more weakly typed, but it&#8217;s a matter of degree.  As with typing itself, a language can also be more strongly (or weakly) typed <em>with respect to a particular attribute</em>.  </p>
<p><strong>Static vs Dynamic</strong> - Expresses <em>when </em>the type is checked.  A statically checked language is checked at compile time.  A dynamically checked language is checked at execution time.  There&#8217;s been much debate as to which is preferable over the years, but the basic arguments come down to safety &#038; performance (static) vs ease of use &#038; expressiveness (dynamic). </p>
<p>The term <strong>hybrid typing</strong> is an acknowledgment that most practical languages are both statically and dynamically typed.  While the terminology has only entered the academic literature in the last few years, it&#8217;s been around in practice for much longer.</p>
<p><strong>Gradual typing</strong> is another recent invention that explicitly merges static and dynamic typing.  The basic idea is that a program can be written in a dynamic style and moved to a fully statically checked version incrementally.  In terms of real languages, the best example I&#8217;ve seen is <a href="http://cython.org/">Cython</a>.  Cython focuses on incremental performance, but incremental safety and changeability are also reasons to consider gradually typed systems.  Expect a full article on gradual typing in the not too distant future; I&#8217;ve been quite engrossed with the idea. </p>
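<p>Python&#8217;s optional type annotations give a small taste of the incremental migration described above.  In the sketch below (function names made up for this example), the same function is shown before and after annotations are added.  Python itself does not enforce the annotations at runtime; the static checking is assumed to come from an external checker such as mypy.</p>
<pre>
```python
def scale_untyped(values, factor):
    # Step 1: fully dynamic.  Nothing is checked until the code runs.
    return [v * factor for v in values]


def scale_typed(values: list[float], factor: float) -> list[float]:
    # Step 2: same body with annotations added.  A static checker can
    # now flag a call like scale_typed("abc", 2) before the program runs.
    return [v * factor for v in values]
```
</pre>
<p>Both versions behave identically at runtime, so a code base can be annotated one function at a time.</p>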
<p><strong>Nominal vs Structural</strong> - Another important distinction between typing systems is how they define equivalence.  A nominal system defines it by the name of a type.  A structural system defines it by the actual interface and field layout of the respective types. </p>
<p>As a side note, the same language may define equivalence differently for varying stages of validation and/or execution.  One common optimization performed in nominally typed languages is to combine the implementations of structurally equivalent types during code generation.  Some languages also expose this distinction in their syntax. </p>
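<p>Here is a minimal Python sketch of the nominal/structural distinction.  The classes Meters, Feet, and HasMagnitude are invented for this example.  The nominal check asks whether an object was declared as a particular class; the structural check (using typing.Protocol) asks only whether it has the required method.</p>
<pre>
```python
from typing import Protocol, runtime_checkable


class Meters:
    def __init__(self, value):
        self.value = value

    def magnitude(self):
        return self.value


class Feet:
    # Same fields and methods as Meters, but a different name.
    def __init__(self, value):
        self.value = value

    def magnitude(self):
        return self.value


@runtime_checkable
class HasMagnitude(Protocol):
    def magnitude(self): ...


def nominally_meters(obj):
    # Nominal check: only the declared class (or a subclass) matches.
    return isinstance(obj, Meters)


def structurally_ok(obj):
    # Structural check: anything with a magnitude() method matches.
    return isinstance(obj, HasMagnitude)
```
</pre>
<p>Feet fails the nominal check despite being laid out identically to Meters, but passes the structural one.</p>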
<p>A <strong>duck type</strong> system is an odd extension of a structural type system which only considers the structural equivalence at point of use.  In particular, two different paths through the same function may have different required interfaces (and thus types.)  You could also think of duck typing as an extreme form of dependent typing which includes conditions with runtime values. </p>
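<p>The &#8220;different paths, different interfaces&#8221; point can be seen in a few lines of Python.  The classes and the function below are hypothetical, written only to illustrate it.</p>
<pre>
```python
class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def beep(self):
        return "Beep"


def make_noise(thing, loud):
    # Each branch needs a different method, so the interface required
    # of thing depends on the runtime value of loud.
    if loud:
        return thing.quack().upper()
    return thing.beep()
```
</pre>
<p>A Duck is an acceptable argument only when loud is true and a Robot only when it is false; passing the wrong combination fails with an AttributeError at the point of use, not at the call site.</p>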
<p>A related classification is the rules the system defines for when substituting one type for another is legal.  In short, a <strong>strictly classic type</strong> system allows no substitution, a <strong>subtype type</strong> system allows substitution along lines of inheritance, and a <strong>dependent type system</strong> allows substitution if the dependent conditions are met.  Given that this is a highly complex topic which I&#8217;m not sure I fully understand yet, I plan a future post to discuss this separately.  For now, don&#8217;t worry too much if this paragraph didn&#8217;t make a lot of sense. </p>
<p>I also plan on drilling into type conversion in detail.  It has important implications both with regards to practical usability of any type system and their theoretical design.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/11/01/understanding-type-systems/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Coding Best Practice: Make your assumptions explicit</title>
		<link>http://www.philipreames.com/Blog/2011/10/19/coding-best-practice-make-your-assumptions-explicit/</link>
		<comments>http://www.philipreames.com/Blog/2011/10/19/coding-best-practice-make-your-assumptions-explicit/#comments</comments>
		<pubDate>Thu, 20 Oct 2011 02:31:25 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/10/19/coding-best-practice-make-your-assumptions-explicit/</guid>
		<description><![CDATA[This is a topic I&#8217;m going to be expanding on in future posts, but for now let&#8217;s cover the basics.
Assertions are your single best tool for software reliability.  Why?  Unlike tests, you can write assertions which check any property of your system as it runs. Tests can only check the results of a [...]]]></description>
			<content:encoded><![CDATA[<p>This is a topic I&#8217;m going to be expanding on in future posts, but for now let&#8217;s cover the basics.</p>
<p>Assertions are your single best tool for software reliability.  Why?  Unlike tests, you can write assertions which check any property of your system as it runs. Tests can only check the results of a given execution.</p>
<p>Assertions serve several purposes:</p>
<ol>
<li><strong>Checking your assumptions</strong> - If you write new code that does something you don&#8217;t expect, you&#8217;ll find out on the first execution that violates your assumptions.  Ideally, you&#8217;ll be running your code in the debugger at the point this happens and can immediately inspect the entire state around the failure.  </li>
<li><strong>Enforceable documentation of intent and expectations</strong>.  If you&#8217;re working with a team of folks and someone uses a library in a way you didn&#8217;t expect, asserts will tell them this immediately.  You should still document <em>why</em> your asserts are there mind you.  </li>
<li><strong>Fault isolation. </strong> If you&#8217;ve written good assertions, your error statement will be something like &#8220;global state does not comply with expectations&#8221; right after your update function.  This is much easier to debug than noticing a corrupt output thousands of operations later.</li>
<li><strong>Preventing corruption.  </strong>If you&#8217;re using an assertion package which calls abort or triggers an exception, you don&#8217;t need to worry about the line following the assertion running if the assertion has been violated.  This simplifies error handling immensely.</li>
<li><strong>Performance.  </strong>Depending on your compiler, it may be able to take advantage of your assertions to optimize the code that follows.  If the compiler &#8220;knows&#8221; - because you told it - that a loop iteration count must be a multiple of four, it can unroll and generate much more efficient code.</li>
</ol>
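<p>A small Python sketch of the first four points follows.  The account-transfer scenario and the function name are assumptions made for this example.  The two leading asserts document and check the caller&#8217;s obligations; the final one checks an invariant immediately after the update, so a violation is reported next to its cause.</p>
<pre>
```python
def apply_transfer(balances, src, dst, amount):
    """Move amount between two accounts in the balances dict."""
    # Assumptions made explicit: callers must pass a positive amount
    # and may not overdraw the source account.
    assert amount > 0, "transfer amount must be positive"
    assert balances[src] >= amount, "source account would go negative"

    total_before = sum(balances.values())
    balances[src] -= amount
    balances[dst] += amount

    # Fault isolation: check the invariant right after the update.
    assert sum(balances.values()) == total_before, "money created or lost"
    return balances
```
</pre>
<p>Because a failed assert raises before the update runs, a bad call leaves the balances untouched.  (Python drops assert statements when run with the -O flag, so this style is for catching programming errors, not for validating user input.)</p>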
<p>The downsides of assertions - as implemented in C/C++ with &lt;cassert&gt; at least - are that they are extra code which executes at runtime.  Some of your assertions will be pruned by the compiler, but most will remain.  As such, if you add an assertion in the &#8220;wrong&#8221; spot - for example inside a tight loop - you can slow your program down quite a bit.</p>
<p>Before you panic, remember a few classic quotes about optimization:</p>
<ul>
<li>&#8220;More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason — including blind stupidity.&#8221; — W.A. Wulf</li>
<li>&#8220;We should forget about small efficiencies, say about 97% of the time: <strong>premature optimization is the root of all evil</strong>. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only <strong>after that code has been identified</strong>&#8221; — Donald Knuth</li>
<li>&#8220;Bottlenecks occur in surprising places, so don&#8217;t try to second guess and put in a speed hack until you have proven that&#8217;s where the bottleneck is.&#8221; — Rob Pike</li>
</ul>
<p>(Credit: <a href="http://en.wikipedia.org/wiki/Program_optimization#Quotes">Wikipedia</a>, emphasis mine)</p>
<p>You should write your assertions, and only remove (or restructure) those that your profiler tells you are actually at issue.  The tiny amount of performance you might gain by omitting them is not worth hours spent debugging or (more importantly) the lower quality software that would result.  </p>
<p>As a side note: There&#8217;s plenty of ongoing research out there trying to either prove/disprove assertions and/or prune redundant assertions.  Expect your compiler to get substantially better over the next few years about giving compile-time errors on assertion violations and pruning unnecessary assertions before runtime.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/10/19/coding-best-practice-make-your-assumptions-explicit/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Why legacy code is important in research</title>
		<link>http://www.philipreames.com/Blog/2011/10/07/why-legacy-code-is-important-in-research/</link>
		<comments>http://www.philipreames.com/Blog/2011/10/07/why-legacy-code-is-important-in-research/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 05:14:06 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Research]]></category>

		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/10/07/why-legacy-code-is-important-in-research/</guid>
		<description><![CDATA[I ran across this blog entry a few weeks back and have been thinking about the value of transparency in research.  After a bit of reflection, I&#8217;m going to try an experiment here.  I&#8217;m posting a project idea in the &#8220;Open Proposal&#8221; vein.  I&#8217;m going one step further though and posting a [...]]]></description>
			<content:encoded><![CDATA[<p>I ran across <a href="http://blog.regehr.org/archives/568">this blog entry</a> a few weeks back and have been thinking about the value of transparency in research.  After a bit of reflection, I&#8217;m going to try an experiment here.  I&#8217;m posting a project idea in the &#8220;Open Proposal&#8221; vein.  I&#8217;m going one step further though and posting a writeup of an idea that hasn&#8217;t even made it to proposal stage yet.  Depending on how this first attempt works, I may extend it to other projects I&#8217;m working on, or I may not.  We&#8217;ll see.  </p>
<p>Please Note: This is a living document and is mostly a collection of notes and thoughts at this point.  It has not been proofread, cited, fact checked, or otherwise made ready for public consumption.  I&#8217;m going to be revising this periodically as I learn more about the topic and flesh out some ideas.  I welcome feedback, but please don&#8217;t take any of the material in it too seriously just yet.  </p>
<p>Update (1-21-2012): As it turns out, my research focus has narrowed a bit.  (For the moment at least.)  I don&#8217;t expect to get back to this any time in the near future.  I&#8217;ve cleaned up my original post into something that stands alone a bit better, but that&#8217;ll probably be the last change to this for a good while.  </p>
<p> <a href="http://www.philipreames.com/Blog/2011/10/07/why-legacy-code-is-important-in-research/#more-326" class="more-link">(more&#8230;)</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/10/07/why-legacy-code-is-important-in-research/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Coding Best Practice: Todo Lists</title>
		<link>http://www.philipreames.com/Blog/2011/10/06/coding-best-practice-todo-lists/</link>
		<comments>http://www.philipreames.com/Blog/2011/10/06/coding-best-practice-todo-lists/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 04:44:35 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/10/06/coding-best-practice-todo-lists/</guid>
		<description><![CDATA[This may seem slightly obvious, but it&#8217;s worth saying anyways.  If you&#8217;re looking at code and you notice something you&#8217;d like to change or fix, make a note of it.  (If you have time to fix it now, great, but you probably don&#8217;t.)  You don&#8217;t need to be super formal about this [...]]]></description>
			<content:encoded><![CDATA[<p>This may seem slightly obvious, but it&#8217;s worth saying anyway.  If you&#8217;re looking at code and you notice something you&#8217;d like to change or fix, make a note of it.  (If you have time to fix it now, great, but you probably don&#8217;t.)  You don&#8217;t need to be super formal about this (in fact, I recommend against it), but make sure you do record it.  </p>
<p>Generally, the problem you&#8217;re trying to solve involves the following questions:<br />
(New guy on team): What should I be working on?<br />
(Your boss): What do we need to do before the release?<br />
(You): What can I do to kill time during this meeting?  </p>
<p>I&#8217;m going to discuss my system below, but please don&#8217;t get too caught up in it.  While you&#8217;re welcome to use it if you&#8217;d like, the important part is that you find a system which works for you.  I&#8217;ve known developers with radically different systems that worked for them.  Any system or habit which allows you to routinely answer the above without much thought is just fine.  </p>
<p>My personal practice is to maintain three lists:<br />
1) Any serious bugs or security problems go into whatever bug tracker the project is using.  This is only if I can quickly and easily create specific test cases which illustrate an actual problem.  If you don&#8217;t have a formal bug tracker, a simple TODO file in the root directory will do.  The important part is that a) others might fix them while you&#8217;re busy and b) you remember what they are when it&#8217;s time to do a release.<br />
2) Ugly code goes into its own list.  Its main purpose is to shame me into refactoring when the list gets really long.<br />
3) Unconfirmed bugs go into their own list so I can make sure I make time later to either fix them or move them into list 1.  I often do not take the time to investigate them immediately after noticing them because I&#8217;m in the midst of something else; breaking my concentration and workflow would be disruptive.  </p>
<p>A few example &#8220;todo&#8221; items:<br />
1) Use of sprintf in C/C++ &#8212;  First, I take a look to see if this is an obvious bug which needs to get added to list 1.  If it&#8217;s obviously not a bug or vulnerability, it goes in list 2 for later cleanup.  Otherwise, it goes into list 3 and I move on.<br />
2) Function with 10,000 lines of code &#8212; Obvious candidate for list 2<br />
3) Undocumented pointer manipulation, const_cast(s), or reinterpret_cast(s) - List 3 if it takes more than a second to understand, otherwise list 2.  </p>
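<p>For the sprintf item, the eventual cleanup might look something like the sketch below.  The helper name and the &#8220;user:&#8221; format string are made up for illustration.  The only real API used is the standard snprintf, which takes the size of the destination buffer and returns the length the full string would have needed.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats "user:<name>" into a caller-supplied buffer.
// Returns true only if the whole string fit; otherwise the output is truncated.
bool format_user_label(char* out, std::size_t out_size, const char* name) {
    assert(out != nullptr && out_size > 0 && name != nullptr);
    // snprintf writes at most out_size bytes (terminator included) and
    // returns the number of characters the untruncated result would contain.
    int needed = std::snprintf(out, out_size, "user:%s", name);
    return needed >= 0 && static_cast<std::size_t>(needed) < out_size;
}
```

<p>With the buffer size passed explicitly and truncation reported to the caller, the call site no longer depends on an unstated assumption about how long the name can be.</p>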
<p>To get off list 3, an item must have been analyzed to the point that I&#8217;m fairly sure it&#8217;s not a real issue, or have been moved into a real bug report.  I generally try to keep list 3 empty.  I might leave something on it for a couple of days, but that&#8217;s about it.  Once I&#8217;ve assured myself that an item is innocuous, I try to document or restructure the code to make the correctness obvious.*</p>
<p>* At first glance, this might seem altruistic.  After all, I&#8217;m out to help the next guy through the code.  That&#8217;s altruistic, right?  Well, not really.  Most of the time, I&#8217;m the poor schmuck looking at it next time, and it&#8217;s been long enough that I&#8217;ve forgotten my conclusion from the first time!</p>
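<p>As one illustration of restructuring code so that its correctness is obvious, consider an undocumented reinterpret_cast used to read the bits of a float.  The sketch below uses a hypothetical helper name and assumes a platform where float is 32 bits wide.  It replaces the cast with a byte copy and writes the size assumption down where the compiler can check it.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float.
// Before cleanup this might have read:
//     return *reinterpret_cast<std::uint32_t*>(&value);
// which leaves the reader wondering whether the cast is safe.
std::uint32_t float_bits(float value) {
    // The size assumption is now explicit and checked at compile time.
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float_bits assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // byte copy; avoids aliasing questions
    return bits;
}
```

<p>The next reader (probably me) no longer has to work out whether the cast is safe; the assumption it depends on is stated and enforced in the code itself.</p>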
<p>I generally don&#8217;t worry about list 2 unless I&#8217;m really bored, need a simple project to kill a few minutes during a meeting, or get really annoyed by one of its items.  As such, it tends to be a pretty dang long list for any project I&#8217;ve been working on for more than a few days.  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/10/06/coding-best-practice-todo-lists/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Patch design vs change design</title>
		<link>http://www.philipreames.com/Blog/2011/09/06/patch-design-vs-change-design/</link>
		<comments>http://www.philipreames.com/Blog/2011/09/06/patch-design-vs-change-design/#comments</comments>
		<pubDate>Wed, 07 Sep 2011 02:06:44 +0000</pubDate>
		<dc:creator>reames</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.philipreames.com/Blog/2011/09/06/patch-design-vs-change-design/</guid>
		<description><![CDATA[A recent email discussion over on the Clang development mailing list reminded me of a point that I consider rather important, but that seems to be missed in many technical discussions.  To put it simply, the design of the patches used to submit a feature to the community do not need to (and usually [...]]]></description>
			<content:encoded><![CDATA[<p>A recent email discussion over on the Clang development mailing list reminded me of a point that I consider rather important, but that seems to be missed in many technical discussions.  To put it simply, the design of the patches used to submit a feature to the community does not need to (and usually should not) reflect the design and implementation of the feature itself.  Or to put it another way, patch design != change design.  </p>
<p>In good software engineering practice, the development of a new feature is a largely incremental process.  Whether done on a branch or merely within a local directory, the usual process is to pick a small subset of the feature, implement it, then repeat until done.  This has the side effect that unless one is careful, the resulting code can be rather messy: duplicated code, workarounds for previous versions, etc&#8230;  At this point, any good developer (or at least one with a bit of time before this <em>has to be in</em>) goes back and does a bit of refactoring to clean things up.  </p>
<p>A naive developer is tempted to present their patches to the community in the same way.  <em>This is wrong. </em> Your coworkers don&#8217;t care about the order in which you developed this or how many drafts you had.  What they care about is a) having the feature, and b) being able to understand and review the parts as they go in.  If anything, the ability to <em>understand</em> your changes and be <em>sure they aren&#8217;t going to break anything</em> is actually <em>more important</em> than the feature you&#8217;ve implemented.  </p>
<p>This may seem odd at first, but take a step back and think about it.  You&#8217;re contributing to a large project with lots of moving parts.  You&#8217;re adding one (probably relatively small) feature to a code base that already has hundreds if not thousands of features.  If in the process of adding this new feature, something else breaks, do you think your user base is going to be happy?  Even the ones clamoring for this new feature?  </p>
<p>With this in mind, let&#8217;s consider what your goals should be for preparing a patch:<br />
1) it should be small, localized, and obviously correct<br />
2) it should add some bit of logic, structure, or functionality which is obviously useful<br />
3) it should reduce the complexity of the change outstanding so that it has a better chance of meeting the first two goals</p>
<p>I wish I could tell you there was some perfect set of rules for doing this, but there really isn&#8217;t.  This is where experience comes in.  It&#8217;s a bit of an art to design &#8220;the perfect patch set&#8221;.  If you&#8217;re not sure what you&#8217;re doing, talk to your peers.  Most communities are more than willing to mentor someone who asks for help.  </p>
<p>That concludes the core part of my argument.  The rest of this is a collection of side notes that don&#8217;t directly relate to the core idea.<br />
1) I&#8217;ve been saying &#8220;community&#8221; above, but the same applies to commercial software development.  You should think of &#8220;pushing your change to mainline&#8221; as &#8220;publishing&#8221; the change to the community of your fellow developers, testers, and users.<br />
2) From what I&#8217;ve observed in the open source world, the majority of patches that sit for long periods without attention don&#8217;t follow the guidelines I&#8217;ve suggested above.  The ones that do tend to get pretty prompt attention.  If you&#8217;ve ever complained about your patches not getting merged, you might want to think about that for a while.<br />
3) Even if you end up submitting a single massive patch - which you really shouldn&#8217;t btw - running through the exercise of mentally splitting up the change into multiple smaller patches can still be useful.  If you organize your commit message around these smaller pieces, someone viewing the patch has a better chance of understanding it.<br />
4) My argument is tangential to the use of centralized vs distributed version control systems.  While DVCS have major advantages, they actually make this problem <em>worse</em> if anything.  By encouraging their users to cherry pick their changes, they <em>encourage</em> users to push back to the community in terms of the commits they themselves made.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.philipreames.com/Blog/2011/09/06/patch-design-vs-change-design/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

