Intro ===== Complexity ---------- Modern computer systems are large and complex - this course is about the problems with designing complex systems and how to manage the complexity and these problems. Issues ^^^^^^ Complex systems exhibit 4 basic issues: - emergent properties - properties that show up only when combining individual components - propogation of effects (butterfly effect) - problems in one component affect other components - relationship between components isn't always obvious - incommensurate scaling - not all parts of a system scale at the same rate - tradeoffs - limits mean that increasing one characteristic may harm another - e.g. utilization v. latency tradeoff Symptoms ^^^^^^^^ - large number of components - large number of interconnections - lots of irregularities - long description / high information content - large team of designers, implementers, or maintainers Sources ^^^^^^^ - interactions of requirements - a general purpose tool that can do X and other things is more complex than a tool that can *only* do X - more jobs -> more complexity - more constraints -> more complexity, even if constraints don't apply at the same time - increasing efficiency (or other measure of "goodness") - "low-hanging fruit" is easy to get - additional gains in efficiency require more complexity - same applies to cost Managing Complexity ------------------- We can manage complexity in systems in a few ways: - modularity - abstraction - layering - hierarchy - naming (controls/manages all of these) But these aren't really different things! They're a method of constraining complexity - it makes it easier to understand why code is doing what it's doing. Modularity ^^^^^^^^^^ - break big things into little things (divide and conquer) - allows development of a smaller unit - reduce number of components and interactions - reduce complexity - reduce # of bugs - allows for interchangeable modules - replace a module with a "better" one - replace a broken module - stitching together big pieces that only interact in certain ways However, for people to be able to use modules well, you need a *spec* - a document that details how exactly this module work. That creates a **level of indirection**. Now clients don't care about how your module is implemented - only that it will do what it says in the spec. Abstraction ^^^^^^^^^^^ maybe thought of as one of the means to get to modularity - break into components at logical points - treat an individual component as a black box - inputs and outputs and behaviours - trust that it'll do it - robustness principle - be tolerant of inputs: try and do the right thing - be strict on outputs: underpromise, overdeliver, always follow the spec - mantain a safety margin: fragility is bad - a way to take functions and make them more general Layering ^^^^^^^^ Used to reduce interconnections - build higher layers on top of lower layers creates a "stack" - constrain the ways layers/modules are allowed to interact with one another It's often the design of the layers that defines what modules should look like. Hierarchy ^^^^^^^^^ How to group boxes of boxes into boxes - making bigger modules out of smaller ones - for example, ASM <- LOC <- functions <- class/module <- service - each of these are modules that have specs, but are made up of smaller modules Names ^^^^^ Now we need to understand what boxes are, and how to assemble them - names help us identify things in an independent way - replace a module but keep a name - use names to look up which components to use, where to find them - a way for a component to talk about another component that it's using - **binding**: going from a name to the actual code that's being run Approaches ---------- Iteration ^^^^^^^^^ - you likely won't make the right choices the first time, so make it easy to fix - make small steps, have fallbacks and try a lot of choices KISS ^^^^ Keep it Simple, Stupid - resist temptation to add unnecessary features, which increases complexity and you can't take them away Fundamental Abstractions ------------------------ Memory ^^^^^^ Memory is a fundamental abstraction - it just means storage - basic operations: - WRITE (name, value) - value <- READ (name) - memory can be provided by both hardware and complex systems - RAM - Disk - files - database - use layering and modularity to select specific memory module - main characteristics: - latency: time to r/w 1st byte - bandwidth: how long it takes to r/w X bytes - cost - capacity - volatility/lifetime - rewritability But you might have: - different terms for the units that make up values (e.g. bytes v. sequences) - different mechanisms for naming - different performance **Naming in Memory** For memory, each name needs to be unique - each name must point to at most one value (they can also point to nothing...) Usually, memory names are linearly mapped (e.g. pointers) **Layering in Memory** - file system - inodes are numeric names - file names built on top of inodes - internet - MAC addresses - IP addresses - DNS names - the web combines multiple high-level naming schemes together Coherence ^^^^^^^^^ Related to ordering/timing - that a read will always return the most recent write Atomicity ^^^^^^^^^ reads/writes happen all at once - can't interleave actions between multiple reads/writes Coherence and atomicity seem simple, but can be difficult for many reasons. Interpreter ^^^^^^^^^^^ - an interpreter performs actions in the computer system - do what the program says in the way it says to do it - 3 basic components: - instruction reference - where is the next instruction - repertoire - what can this interpreter do - environment reference - where is the current state Communication Channel ^^^^^^^^^^^^^^^^^^^^^ A way to move information between components Two main components: send, receive For example: - shared memory - might have atomicity issues - message pipeline - might have coherency issues (race)