Inspired by a recent blog post by Daniel Lemire and a couple of recent conversations with friends, I decided to write down a “quick” list of the concepts I feel every software engineer should aim to be familiar with. Note that the key word in that was “aim”. I don’t expect every software engineer to know them all coming out of school, or even at any particular point in their career.
Before diving into this list, a quick caveat. This whole page is organized around factual knowledge. In practice, this is the least important part of software engineering. Of far more importance is the way you think about problems, your conceptual framework for solving them, and your understanding of design. None of those topics are directly covered here. (Edit: I ended up writing about some of this here.)
Debugging Techniques: stepping through (normal) execution, breakpoints, watchpoints, invariants, assertions, checkable preconditions, self-checking builds
Modularity: structural abstraction, scoping, namespaces, components/modules with strong interfaces, thread-local state, singletons, typing at the boundary, preconditions & postconditions (design by contract), immutability, value types (vs mutable types)
Testing: unit testing, acceptance testing, regression suites (as distinct from the previous two)
Bug Finding: fuzzing, mutational fuzzing, differential testing, lightweight static analysis
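To make two of these concrete, here is a minimal sketch that combines mutational fuzzing with differential testing. It assumes a hand-written `insertion_sort` as the code under test and Python's built-in `sorted` as the trusted reference; the `mutate` and `differential_fuzz` helpers are hypothetical names invented for this illustration.

```python
import random


def insertion_sort(items):
    """Code under test: a straightforward insertion sort."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements right until the slot for key is found.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def mutate(seed_input, rng):
    """Mutational step: return a copy of seed_input with one random edit."""
    data = list(seed_input)
    choice = rng.choice(["insert", "delete", "replace"])
    if choice == "insert" or not data:
        data.insert(rng.randint(0, len(data)), rng.randint(-100, 100))
    elif choice == "delete":
        del data[rng.randrange(len(data))]
    else:
        data[rng.randrange(len(data))] = rng.randint(-100, 100)
    return data


def differential_fuzz(rounds=1000, seed=0):
    """Differential step: report every input where the two sorts disagree."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    corpus = [[3, 1, 2]]
    failures = []
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        if insertion_sort(candidate) != sorted(candidate):
            failures.append(candidate)
        corpus.append(candidate)  # later mutations build on earlier ones
    return failures


print(differential_fuzz())
```

The appeal of the differential approach is that no expected outputs need to be written by hand: any disagreement between the two implementations is a bug in at least one of them.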
Abstract Data Types: fixed size arrays, growable arrays, linked list, queue, stack, tree, binary search trees, balanced trees, directed acyclic graphs, graphs, map, set, bag, mathematical vector, mathematical matrix, pure function
Note: Traversal orders on abstract data structures are a classic case where a generator (a limited form of coroutine) is a powerful tool.
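As a minimal sketch of that note, assuming a hypothetical `Node` class for a binary tree, each traversal order can be written as a short recursive generator. The caller then consumes values lazily with an ordinary `for` loop and never sees the recursion.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node: Optional[Node]) -> Iterator[int]:
    """Yield left subtree, then this node, then right subtree."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)


def pre_order(node: Optional[Node]) -> Iterator[int]:
    """Yield this node first, then the left and right subtrees."""
    if node is None:
        return
    yield node.value
    yield from pre_order(node.left)
    yield from pre_order(node.right)


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))

for value in in_order(tree):
    print(value)  # prints 1 through 7, one per line
```

Changing the traversal order only means reordering the three lines in the generator body; the consuming loop stays the same.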
Concrete Data Structures: (all of the classic variants of the above – preferably several implementations of each), bloom filters, hash tables
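For readers who have not met a bloom filter, the following is a small illustrative sketch rather than a tuned implementation. The class name, the default sizes, and the choice to derive several hash positions from salted SHA-256 digests are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("apple")
seen.add("banana")
print(seen.might_contain("apple"))
```

The trade-off to notice is that the structure never stores the items themselves, so memory use is fixed up front at the cost of an occasional false "possibly present" answer.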
Control Flow Constructs: regular control flow (if, while, for, etc…), irregular control flow (goto/labels), switch/match constructs, table lookup, unordered foreach/map, implicit vs explicit parallelism, generators (an actually useful form of coroutines), iterators, recursion, tail recursion, mutual recursion, closures, lambda functions
Concurrency: races, mutual exclusion, atomicity, ordering, consistency, dependence, happens before, purity, side effects, functions vs procedures, deadlocks, failure to make progress, livelock, starvation, unfair scheduling (which is actually quite common), time multiplexing
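A small sketch of the first two items, races and mutual exclusion, using Python's standard `threading` module. The `Counter` class and `run` helper are hypothetical names for this example. Incrementing a shared value is a read-modify-write sequence, so the lock is what makes each increment behave atomically with respect to the other threads.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the updates would be lost.
        with self._lock:
            self.value += 1


def run(num_threads=8, increments=10_000):
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # joining establishes happens-before for the final read
    return counter.value


print(run())  # 8 threads x 10,000 increments
```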
Error handling: error codes, exceptions, exception safety, abort/retry patterns, checkpointing
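As one hedged illustration of an abort/retry pattern, the sketch below retries an operation a bounded number of times with exponential backoff and then gives up by re-raising. `TransientError`, `retry`, and `make_flaky` are hypothetical names introduced for the example, and the delay values are arbitrary.

```python
import time


class TransientError(Exception):
    """Raised by an operation that may succeed if attempted again."""


def retry(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call operation until it succeeds or attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # abort: out of attempts, let the caller decide
            sleep(base_delay * 2 ** attempt)  # back off before retrying


def make_flaky(failures_before_success):
    """Build a test operation that fails a fixed number of times first."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("try again")
        return state["calls"]

    return operation


print(retry(make_flaky(2)))  # succeeds on the third call
```

Only the exception type that signals a retryable failure is caught; anything else propagates immediately, which keeps genuine bugs from being silently retried.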
Design Patterns: transactions, map/filter, task/dependency graphs, compensating actions/RAII, self-checked optimizations, log-structured data, visitors, strategy/policy patterns, stateless request/response, state as arguments, data versioning for consistency in distributed systems, factories, builders for value types, adaptors, proxy objects (and their dangers!), command/action patterns (e.g. undo, easy commit semantics), domain-specific languages and their interpreters and/or just-in-time compilers
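To show why the command/action pattern makes undo easy, here is a minimal sketch. The `AppendText` command and `Editor` class are hypothetical; the point is only that each action is an object that knows how to apply and reverse itself, so undo reduces to popping a history stack.

```python
class AppendText:
    """A command object: carries its own data plus apply/undo logic."""

    def __init__(self, text):
        self.text = text

    def apply(self, document):
        document.append(self.text)

    def undo(self, document):
        document.pop()


class Editor:
    def __init__(self):
        self.document = []
        self._history = []

    def execute(self, command):
        command.apply(self.document)
        self._history.append(command)

    def undo(self):
        # Reverse the most recent command, if there is one.
        if self._history:
            self._history.pop().undo(self.document)


editor = Editor()
editor.execute(AppendText("hello"))
editor.execute(AppendText("world"))
editor.undo()
print(editor.document)  # ['hello']
```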
(Optional) Lightweight Verification: symbolic execution, concolic/directed testing, happens-before race detection, lockset race detectors, weakest precondition, verification conditions
I doubt most programmers will have a clue what any of these are. Before spending two years at Berkeley, I sure didn’t! I think that says unfortunate things about the field and how slow we are to migrate good ideas from academia, but that’s an entirely different discussion.
(Optional) Styles of Design: object oriented programming, functional programming, aspect oriented design
Aside from a few concepts I pulled out separately above, I find the schools of thought around “X programming” to be distracting and unhelpful. I admit that each has good ideas, but the absolutism drives me crazy. Take the strong points, adapt them to what you’re doing, and ignore the rest.
Random Other: hash functions, long tail distributions, rare events happen, public key cryptography, version control, refactoring, wholesale rewriting is nearly always wrong, incremental change management, commit as publication, understanding means and what they do and don’t tell you about probability distributions
This last batch is simply things I didn’t think of the first time through. I expect I’ll keep adding to it and splitting out related topics into their own groupings.
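On the point about means and distributions, a tiny worked example may help. The latency numbers below are made up for illustration: nine ordinary requests and one rare slow one, which is roughly what a long-tail distribution looks like in a small sample.

```python
import statistics

# Hypothetical request latencies in milliseconds: one rare, very slow request.
latencies_ms = [10, 11, 9, 10, 12, 10, 11, 9, 10, 908]

mean = statistics.mean(latencies_ms)
median = statistics.median(latencies_ms)

print(mean)    # 100: dominated by the single outlier
print(median)  # 10.0: what a typical request actually experienced
```

The mean here is ten times larger than what nine out of ten requests saw, and it also hides how bad the tenth one was. Neither number alone describes the distribution.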