Naming, Part 2

Naming supports the basic techniques for building systems

  • modularity
  • abstraction
  • layering
  • hierarchy

Naming systems are designed to make systems easier to use and implement, and more reliable and predictable

We can use names to refer to the same thing repeatedly, to refer to something that keeps its name but gets a new implementation, or simply to find something

Modularity

How do we handle 2 instances of different things with the same name?

Context-based resolution/disambiguation

We solve this using a scope: the context in which the name appears

  • different names inside and outside of the scope
    • e.g. a global init() vs. a class-scoped Foo::init() (Foo is a placeholder class name)
    • scope can be dynamic or static
  • allows use of the same name in different contexts
    • name resolves according to predefined rules
    • e.g. nested scopes in Python - search outward through the enclosing scopes (local, enclosing function, global, built-in) until the name resolves in one
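
As a small illustration of context-based resolution, the sketch below binds the same name in three scopes and shows which binding each function sees. The function names are made up for this example.

```python
name = "global"          # bound in the module (global) scope

def outer():
    name = "enclosing"   # a different object bound to the same name

    def inner():
        # No local binding for `name` here, so Python looks outward:
        # local -> enclosing function -> module globals -> built-ins.
        return name

    def shadowing():
        name = "local"   # a local binding wins inside this scope
        return name

    return inner(), shadowing()

def uses_global():
    return name          # no enclosing function scope, so the global is found

print(outer())        # ('enclosing', 'local')
print(uses_global())  # global
```

Python decides which scope a name belongs to from where the code is written, so this is an example of static scoping.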

Metadata

Information about data, for example:

  • object name
  • object context
  • size
  • unique system wide identifier
  • type of data in object
  • timestamps (creation/modification/access)
  • ownership
  • checksum

It’s often possible to change metadata without changing the underlying data

  • ideally, names are just identifiers
  • metadata is used to store “other information”
  • but often names contain some metadata
    • file extension
    • version number (e.g. foo-v2.txt)
    • hierarchy/locator (e.g. the dot-separated parts of a website's domain name)
    • these are considered overloaded names - they contain more than just an identifier
  • names that aren’t overloaded are pure names - they contain no information, so all you can do with one is pass it to a resolver

Obtaining Metadata

  • directly (if stored separately)
    • takes a name and returns metadata
    • e.g. lstat() calls
  • by parsing the name
    • e.g. university name from URL
    • e.g. parent directory from a path name
  • path names are often overloaded - easy to encode information in a directory path
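
The two ways of obtaining metadata can be sketched as follows: one function asks the file system (through os.lstat) and the others parse the name. The file name, the URL, and the helper function names are assumptions chosen for illustration.

```python
import os
import tempfile
from urllib.parse import urlsplit

def stored_metadata(path):
    """Ask the file system for metadata kept separately from the name."""
    st = os.lstat(path)  # lstat does not follow a final symbolic link
    return {"size": st.st_size, "mtime": st.st_mtime, "owner_uid": st.st_uid}

def parsed_metadata(path):
    """Recover metadata that is encoded in the name itself."""
    directory, filename = os.path.split(path)
    stem, extension = os.path.splitext(filename)
    return {"parent": directory, "stem": stem, "extension": extension}

def host_labels(url):
    """Split a URL's host name into its dot-separated labels."""
    return urlsplit(url).hostname.split(".")

# Hypothetical file and URL, used only for illustration.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo-v2.txt")
    with open(path, "w") as f:
        f.write("hello")
    print(stored_metadata(path)["size"])       # 5
    print(parsed_metadata(path)["extension"])  # .txt

print(host_labels("https://www.example.edu/courses/index.html"))
# ['www', 'example', 'edu']
```

Only the first function needs the object to exist. The parsing functions work on the name alone, which is convenient but ties callers to the naming convention.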

Problems

Overloaded names are fragile!

  • e.g. changing the date in a filename may break scripts pointing to that file
  • e.g. renaming a library from foo2.so to foo3.so breaks everything that refers to foo2.so
  • fix: add a level of indirection, e.g. a symbolic link with a stable name that points at the current version
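
A minimal sketch of the symbolic-link fix, assuming a system where unprivileged processes may create symlinks (typical on Linux and macOS). The files foo2.so and foo3.so are plain text stand-ins for libraries, and foo.so and repoint are names invented for this example.

```python
import os
import tempfile

def repoint(link_path, target):
    """Point link_path at target, replacing any previous link."""
    tmp_link = link_path + ".tmp"
    os.symlink(target, tmp_link)
    os.replace(tmp_link, link_path)  # rename the new link over the old one

with tempfile.TemporaryDirectory() as d:
    # Create two "versions" whose names are overloaded with a version number.
    for version in ("foo2.so", "foo3.so"):
        with open(os.path.join(d, version), "w") as f:
            f.write(version)

    stable = os.path.join(d, "foo.so")  # the stable name clients use
    repoint(stable, "foo2.so")          # relative target, resolved inside d
    with open(stable) as f:
        first = f.read()

    repoint(stable, "foo3.so")          # upgrade without touching clients
    with open(stable) as f:
        second = f.read()

print(first, second)  # foo2.so foo3.so
```

Clients open foo.so both times. Only the link changes when the version does.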

Locators

  • Addresses are overloaded names that tell a system where to find an object
  • can be physical or virtual (a virtual address can stay stable even when the physical location changes)
  • address can be used as both name and locator
    • e.g. web URLs are both addresses and names
    • addresses can be numeric or not

Uniqueness

It’s often necessary to assign unique names to things - but how?

  • hand out names centrally
    • e.g. ICANN - hands out large batches of names to lower-level authorities, which distribute them further
  • use a random name from a very large name space
    • random 256-bit name
  • generate name from content
    • e.g. a SHA-1 or SHA-256 hash of the content
    • all objects with same content get same name
    • technically overloaded, but useful if name should depend on content
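
The last two strategies can be sketched with Python's standard library. The helper names random_name and content_name are assumptions made for this example.

```python
import hashlib
import secrets

def random_name():
    """A random 256-bit name, written as 64 hex digits."""
    return secrets.token_hex(32)  # 32 bytes = 256 bits

def content_name(data: bytes):
    """A name derived from content: the SHA-256 digest in hex."""
    return hashlib.sha256(data).hexdigest()

a = content_name(b"same bytes")
b = content_name(b"same bytes")
c = content_name(b"other bytes")
print(a == b, a == c)          # True False
print(len(random_name()) * 4)  # 256 (bits in the name)
```

Neither approach needs a central authority. Random names rely on the name space being large enough that collisions are vanishingly unlikely, and content-derived names rely on the hash function being collision resistant.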

Lifetimes

  • names can have limited lifetimes
    • e.g. the lifetime of a file descriptor (fd) is the time the file is open
    • after the file is closed, the descriptor can be reused
  • limited-life names are often assigned rather than chosen
    • e.g. file descriptor, outgoing port
    • prevents collision
  • short-lived names tend to be local
    • distributing or invalidating a name takes time, which a short-lived name cannot afford
  • long-lived names can be global
    • e.g. domain->IP, url->file
    • changes take even more time to propagate
  • if a reference outlives the name, it’s dangling
  • similarly, if an object outlives all of its names, it’s orphaned (garbage collection reclaims it)
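
The file descriptor example can be observed directly. The sketch below opens and closes a file, shows that the stale descriptor no longer resolves, and checks whether the next open receives the same number. The function name and file names are assumptions for illustration.

```python
import os
import tempfile

def fd_lifetime_demo():
    """Open, close, and reopen files to show a descriptor's limited lifetime."""
    with tempfile.TemporaryDirectory() as d:
        path_a = os.path.join(d, "a.txt")
        path_b = os.path.join(d, "b.txt")
        for p in (path_a, path_b):
            with open(p, "w") as f:
                f.write(p)

        fd_a = os.open(path_a, os.O_RDONLY)  # the OS assigns the name
        os.close(fd_a)                       # the name's lifetime ends here

        try:
            os.read(fd_a, 1)                 # stale reference to a dead name
            dangling_detected = False
        except OSError:
            dangling_detected = True

        fd_b = os.open(path_b, os.O_RDONLY)  # the number may be handed out again
        reused = (fd_b == fd_a)
        os.close(fd_b)
    return dangling_detected, reused

print(fd_lifetime_demo())
```

POSIX systems hand out the lowest unused descriptor, so in a single-threaded run the second value is typically True. Had the stale fd_a been used after the second open, it would have silently referred to b.txt, which is the real hazard of a dangling reference to a reused name.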

Resolution

Resolution is usually multi-part and recursive, with many of the intermediate results cached!

e.g. Hierarchy of a URL:

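  1. protocol - the part before the ://
  2. website (host name) - resolved through a hierarchy of name servers (the edu server, then the next level down, and so on), with results cached along the way
  3. resource on server - the path after the /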
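
The three steps can be sketched as a toy resolver. This is not how DNS is implemented. The nested dictionary TREE stands in for the hierarchy of name servers, and the host names, addresses, and the Resolver class are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical name tree standing in for a hierarchy of name servers.
TREE = {"edu": {"example": {"www": "192.0.2.10", "mail": "192.0.2.20"}}}

class Resolver:
    def __init__(self, tree):
        self.tree = tree
        self.cache = {}   # maps every resolved suffix to its result
        self.lookups = 0  # counts simulated server queries

    def resolve(self, host):
        """Resolve a dotted name right to left, one label per step."""
        if host in self.cache:
            return self.cache[host]
        label, _, parent = host.partition(".")
        # Recurse on the suffix first: "www.example.edu" needs "example.edu".
        context = self.resolve(parent) if parent else self.tree
        self.lookups += 1
        result = context[label]
        self.cache[host] = result
        return result

    def resolve_url(self, url):
        """Split a URL into protocol, resolved host, and resource path."""
        parts = urlsplit(url)
        return parts.scheme, self.resolve(parts.hostname), parts.path

r = Resolver(TREE)
print(r.resolve_url("http://www.example.edu/courses/index.html"))
# ('http', '192.0.2.10', '/courses/index.html')
print(r.lookups)                  # 3: edu, example.edu, www.example.edu
r.resolve("mail.example.edu")
print(r.lookups)                  # 4: the cached example.edu is reused
```

Caching every intermediate suffix is what makes the second host cheap: only the last label requires a new query.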