Naming, Part 2
Naming supports the basic techniques for building systems:
- modularity
- abstraction
- layering
- hierarchy
Naming systems are designed to make systems easier to use and implement, and more reliable/predictable.
We can use names to keep referring to the same thing, to refer to something with the same name but a new implementation, or just to find something.
Modularity
How do we handle 2 instances of different things with the same name?
Context-based resolution/disambiguation
We solve this using a scope: the context in which the name appears
- different names inside and outside of the scope: `init()` vs `std::vector::init()`
- scope can be dynamic or static
- allows use of the same name in different contexts
- name resolves according to predefined rules
- e.g. nested scopes in Python - search outward through the enclosing scopes until the name resolves in one
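As a minimal sketch of the resolution rule above: Python looks a name up in the local scope first, then in any enclosing function scopes, then in the module's globals. The variable and function names here are made up for illustration.

```python
name = "global"          # bound in the outermost (module) scope


def outer():
    name = "enclosing"   # same name, different scope

    def inner():
        # `name` is not bound locally, so the lookup moves outward
        # and stops at the first scope that binds it: outer().
        return name

    return inner()


def reads_global():
    # No local or enclosing binding, so the module-level one is used.
    return name


def shadows():
    name = "local"       # a local binding hides every outer one
    return name
```

The same identifier resolves to three different objects depending only on where it appears, which is what lets independent modules reuse short names like `init` without colliding.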
Metadata
Information about data
- ex:
- object name
- object context
- size
- unique system wide identifier
- type of data in object
- timestamps (creation/modification/access)
- ownership
- checksum
It’s often possible to change metadata without changing the underlying data
- ideally, names are just identifiers
- metadata is used to store “other information”
- but often names contain some metadata
- file extension
- version number (e.g. `foo-v2.txt`)
- hierarchy/locator (e.g. the parts of a website name between dots)
- these are considered overloaded names - they contain more than just an identifier
- names that aren’t overloaded are pure names - they contain no information, so all you can do with one is pass it to a resolver
Obtaining Metadata
- directly (if stored separately)
- takes a name and returns metadata
- e.g. `lstat()` calls
- by parsing the name
- e.g. university name from URL
- parent directory from file name
- path names are often overloaded - easy to encode information in a directory path
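The two approaches above can be sketched side by side. This is only an illustration: the helper names, the `-v` version convention, and the example paths and URL are assumptions, not part of any standard.

```python
import os
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def metadata_from_system(path):
    """Ask the file system directly: name in, metadata out."""
    st = os.lstat(path)  # lstat does not follow a final symbolic link
    return {"size": st.st_size, "modified": st.st_mtime, "owner_uid": st.st_uid}


def metadata_from_name(path):
    """Recover whatever metadata was overloaded onto the name itself."""
    p = PurePosixPath(path)
    # Assumed convention: an optional "-v<number>" suffix before the extension.
    _base, sep, version = p.stem.rpartition("-v")
    return {
        "parent": str(p.parent),
        "extension": p.suffix,
        "version": version if sep else None,
    }


def organization_from_url(url):
    """Second-to-last dot-separated label of the host name."""
    return urlsplit(url).hostname.split(".")[-2]
```

For example, `metadata_from_name("/home/alice/notes/foo-v2.txt")` yields the parent directory, the `.txt` extension, and version `2` without touching the file system at all, which is convenient but only works as long as everyone keeps following the naming convention.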
Problems
Overloaded names are fragile!
- e.g. changing the date in a filename may break scripts pointing to that file
- e.g. a library named `foo2.so` gets renamed to `foo3.so`
- a level of indirection fixes this: a symbolic link with a stable name that points at the current file
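A minimal sketch of the symbolic-link fix, assuming a POSIX-like system where unprivileged symlinks are allowed. The stable name `foo.so` and the file contents are hypothetical; only `foo2.so` and `foo3.so` come from the example above.

```python
import os
import tempfile


def symlink_demo():
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "foo2.so")
        new = os.path.join(tmp, "foo3.so")
        stable = os.path.join(tmp, "foo.so")   # hypothetical stable name

        with open(old, "w") as f:
            f.write("library code")

        # Clients only ever use the stable name.
        os.symlink("foo2.so", stable)          # relative to the link's directory
        with open(stable) as f:
            before = f.read()

        # The overloaded name changes...
        os.rename(old, new)
        # ...so only the link is repointed; clients are untouched.
        os.remove(stable)
        os.symlink("foo3.so", stable)
        with open(stable) as f:
            after = f.read()

        return before, after, os.readlink(stable)
```

Between `os.remove` and the second `os.symlink` the stable name briefly does not exist; a more careful version would create the new link under a temporary name and rename it over the old one.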
Locators
- Addresses are modular names that tell a system where to find an object
- can be physical or virtual (virtual addresses stay stable even if the physical location changes)
- address can be used as both name and locator
- e.g. web URLs are both addresses and names
- addresses can be numeric or not
Uniqueness
It’s often necessary to assign unique names to things - but how?
- hand out names centrally
- e.g. ICANN - hand out large batches of names to servers to distribute
- use a random name from a very large name space
- random 256-bit name
- generate name from content
- e.g. SHA-1 or SHA-256
- all objects with same content get same name
- technically overloaded, but useful if name should depend on content
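The three strategies above can be sketched in a few lines. The class and function names are illustrative assumptions; `secrets` and `hashlib` are from the Python standard library.

```python
import hashlib
import secrets


class CentralAllocator:
    """Central authority that hands out non-overlapping batches of names."""

    def __init__(self):
        self._next = 0

    def allocate_batch(self, size):
        batch = range(self._next, self._next + size)
        self._next += size
        return batch  # the receiver distributes these names itself


def random_name():
    # 32 random bytes = 256 bits, rendered as 64 hex characters.
    # No coordination needed; collisions are possible but astronomically unlikely.
    return secrets.token_hex(32)


def content_name(data: bytes):
    # Same content always yields the same name.
    return hashlib.sha256(data).hexdigest()
```

Central allocation guarantees uniqueness but needs an authority everyone trusts; random names need no coordination; content-derived names additionally let anyone check that an object matches its name.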
Lifetimes
- names can have limited lifetimes
- e.g. the lifetime of a file descriptor (fd) is the time the file is open
- after the file is closed, the descriptor can be reused
- limited-life names are often assigned rather than chosen
- e.g. file descriptor, outgoing port
- prevents collision
- short-lived names tend to be local
- takes time to distribute/invalidate
- long-lived names can be global
- e.g. domain->IP, url->file
- changes take even more time to propagate
- if a reference outlives the name, it’s dangling
- similarly, if an object outlives all names, it’s orphaned (GC fixes this)
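A small sketch of a limited-lifetime name, using file descriptors. It assumes the usual behavior where the operating system assigns the lowest-numbered free descriptor, so the number is typically reused right away; the file names are made up.

```python
import os
import tempfile


def fd_lifetime_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "a.txt")
        path_b = os.path.join(tmp, "b.txt")
        for path, text in ((path_a, "A"), (path_b, "B")):
            with open(path, "w") as f:
                f.write(text)

        fd = os.open(path_a, os.O_RDONLY)  # the system assigns the name
        os.close(fd)                       # the name's lifetime ends here

        # Our copy of the number is now a dangling reference.
        try:
            os.fstat(fd)
            dangling = False
        except OSError:
            dangling = True

        # The same number is usually handed out again for a different file.
        fd2 = os.open(path_b, os.O_RDONLY)
        reused = (fd2 == fd)
        content = os.read(fd2, 1)
        os.close(fd2)
        return dangling, reused, content
```

If `reused` is true, code that held on to the old `fd` after the close would now silently read `b.txt` instead of failing, which is why dangling references to short-lived names are dangerous.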
Resolution
Resolution is usually multipart and recursive, with many of the parts cached!
e.g. Hierarchy of a URL:
- protocol
- website: start at the `edu` server, then so on and so forth, with caching along the way
- resource on server: the part after the `/`
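To make the multipart, recursive, cached structure concrete, here is a toy resolver. It is a sketch only: the hierarchy, the host name, and the address are invented, and real DNS resolution involves network queries and expiry times that are left out here.

```python
from urllib.parse import urlsplit

# Hypothetical name hierarchy: each level maps a label to the next level down.
TOY_ROOT = {
    "edu": {
        "example": {
            "www": "192.0.2.10",  # address from a documentation-only range
        },
    },
}

cache = {}  # tuple of labels (a suffix of the host name) -> resolved node


def resolve_labels(labels, stats):
    """Resolve right to left: the parent suffix first, then one more label."""
    if not labels:
        return TOY_ROOT
    if labels in cache:
        return cache[labels]                     # answered without any lookup
    parent = resolve_labels(labels[1:], stats)   # recursive step
    stats["lookups"] += 1
    node = parent[labels[0]]
    cache[labels] = node
    return node


def resolve_url(url):
    parts = urlsplit(url)                        # protocol, website, resource
    stats = {"lookups": 0}
    address = resolve_labels(tuple(parts.hostname.split(".")), stats)
    return parts.scheme, address, parts.path, stats["lookups"]
```

With an empty cache, `resolve_url("http://www.example.edu/notes/naming.html")` takes three lookups (`edu`, then `example`, then `www`); repeating the call takes none, and a different host under `example.edu` would only need to look up its own last label.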