Zac Blanco     Blog     Education     Projects     About

CSE 222A - Networks

Paper Notes

Lecture 1 - Course Intro

  • Networks have stood the test of time – why do more?
    • requirements changing!
      • billions of IoT devices
      • new usage (5G/mobile)
      • applications (device control, VR/AR)
      • state-level threats
      • policies anonymity/accountability

Lecture 2 - End-to-End Arguments

  • Where to implement particular functionality?
    • reliability
      • must be in the application
    • security
      • should the application also do this entirely?
      • in network-provided encryption where OS handles encryption, anyone in between app and OS –> can see unencrypted data
    • routing/addressing
      • not absolutely necessary for app?
      • application doesn’t really care how it gets to the destination
    • movement of bits
      • 3 pieces of equipment: medium, transmitter, receiver
  • applications have tons of “moving” parts:
    • disks, OS, filesystem, ram, links, power, cpu
    • should we make them all super reliable….or super unreliable?
      • only implement reliable system underneath if you can increase performance

Lecture 3 - Internet Routing

  • Interconnecting generic networks
    • main challenge is heterogeneity of link layers
      • addressing
      • bandwidth
      • latency
      • frame size
      • loss rates
      • service guarantees
  • routers forward packets from source to destination
    • may cross many separate networks along the way
  • all packets use the common internet protocol
    • any underlying data link protocol
    • any higher layer transport protocol
    • IP is the “thin waist”
  • router: store and forwards
    • connected to multiple networks
    • supports multiple data-link layers and makes decisions
    • must be explicitly addressed by incoming frames
    • parses IP header to determine next-hop
  • brief history of routing
    • arpanet with dynamic distance vectors
    • new networks on the ARPAnet (civilian vs military)
      • NSFNet, etc
      • nodes grew exponentially
      • networks with different rules
    • new requirements
      • need to scale to millions of routers
      • varying route metrics
      • must express business policies
  • shortest path is not always the best path
    • must account for “link cost”
  • to respect routing rules - need routing through outside entities separate from within particular domains
    • routing within a domain uses traditional protocols (RIP, OSPF)
    • routing between domains uses exterior gateway protocols (EGPs)
      • only have reachability information, decide routing based on local policy
  • Autonomous Systems
    • internet divided into separate ASes
    • distinct regions of administrative control
    • routers/links managed by a particular institution
    • service provider, university, company, etc
    • disjoint hierarchy of ASes
      • tier 1: national backbone providers
      • tier 2: medium sized regional provider w/ metro-area affinity
      • tier 3: small network w/ single company, university, etc
  • inter-domain routing
    • border routers summarize and advertise routes to neighbors
    • border routers apply policies
    • internal routers can use a notion of default routes
  • path-vector routing
    • give advertise known paths to neighbor ASes
      • prevents loops
    • problems with BGP (path vector implementation)
      • instability (route flapping when a link goes down)
        • long AS-path decision criteria causes DV-like behavior
        • not guaranteed to converge
      • scalability still difficult
        • 500k network prefixes in default-free table

        • tension: want to merge traffic to specific networked/multi-homed networks
      • performance
        • not always optimal and not balanced across paths

Lecture 4

BGP Routing 1. get routes from neighbors 2. select the routes to use (route selection not defined in BGP) 3. decide routes to share with neighbors

  • Three attributes of routing graphs
    • convergence
      • all routers have same information about routes that exist
      • agree on set of routes
    • stability
      • no AS changes its route based on what it knows now
    • safety
      • no matter what happens, we will eventually converge

Lecture 5

  • problems with internet
    • abstractions between theory and reality
    • human error (and animal threats!)
    • policy issues
  • for routing paper…why 37 sites?
    • symmetric routes
    • try to get a representative coverage sample across the entire internet
      • only touched 8% ASes…not very many, but “a lot” for any study at the time
  • routing loops in the wild
    • occurs across national borders!
  • fluttering - 50/50 across links?
    • no rules?
    • also, e2e argument – splitting routing for performance benefit
  • dominant routes
    • prevalent vs persistent paths
      • prevalent paths: most take one path
      • persistent paths: amount of time spent between switching paths
    • found some were very stable
      • others, switched quite often
      • others neither prevalent or persistent

Lecture 6

  • simulation vs prototyping
    • sally floyd– likes simulation, vern likes internet measurement
    • internet is meshing of physical engineering and software engineering
  • simulation gives “an intuition”
    • deployment on internet can’t really be updated once deployed
    • easy to test parameters in simulation
    • deploy vs simulate?
  • torrenting used to be big traffic hog (now it’s netflix)

Lecture 7

  • internet architecture generally secret (company proprietary, national security)
  • measuring the entire internet with all the probes takes way too long
  • CPUs of routers are vulnerable to traceroutes
  • many different network architectures depending on the provider
  • lots of redundant links
  • providers generally have the same set of backbone nodes
    • e.g. ATT hub and spoke might be more prone to failure

Lecture 8

  • 1970s…
    • Network Security - review of network security
    • original TCP/IP design with trusted networks and trusted hosts
    • E2E principle – intelligence at the edges
  • 1980s
    • wait…we can’t trust everyone
    • can’t trust hosts; compromised, untrusted insider, anyone can connect to the public internet
    • network itself still trusted
  • ..today
    • can’t trust the network either
    • network equipment may be compromised
    • untrusted network operators
    • anyone can access physical channel of wireless networks
  • TCP/IP
    • built in trust assumptions
      • net protocols only used as intended (assume correct packet headers and/or consider of other resources)
    • hosts are controlled by trusted administrators
      • random people can’t get onto the network
      • correct information reported by hosts
      • protocols implemented correctly.
  • Attacker models
    • MITM
    • Passive: eavesdrop on traffic
    • off-path: attacker injects traffic into network
  • no confidentiality
    • who can see packets you send?
      • networks (router/switch, AP, …)
      • unprotected wireless network
      • WPA2 PSK: everyone on same network
      • non-switched ethernet: everyone on network
      • switched ethernet: maybe everyone on same network
  • no authentication
    • TCP/IP does not offer packet authentication
    • attacker with direct access to network can spoof source addresses
    • connectionless protocols are especially vulnerable
  • link layer (MAC)
    • physical channels often shared by multiple hosts on local network
    • link layer controls access to the physical medium
    • how to make sure each host only gets frames addressed to it
    • each host is responsible for picking up frames addressed to it, ignoring others
    • filtering typically happens on the network card
    • many support “promiscuous” mode - pick up all frames
    • ARP spoofing
      • attacker on network can impersonate any other hosts
      • mitigation: fixed ARP tables, port binding on switch, higher level auth.
  • problems with addressing
    • how to know what the “real” address is?
    • domain name to IP
  • BGP
    • each BGP node needs to maintain connections to a set of trusted neighbors
    • neighbors share routing information
    • no authorization
      • malicious nodes can provide incorrect routing that redirects IP traffic
    • BGP hijacking
      • pakistan tried to block youtube within the country
        • pakistan claimed ownership of youtube’s IP block via BGP
        • BGP nodes forwarded the routing information
        • youtube sinkholed globally
      • myetherwallet compromised, $100K+ stolen
        • attacker used BGP hijack to claim ownership of AWS route 53 DNS addresses
  • IP spoofing attacks
    • no auth on link layer, TCP more complicated
    • connectionless protocols is simple
  • DDoS attack w/ spoofing
    • millions of packets at some victim with forged source address
    • hard to block+identify actual source
    • reflector attacks
      • identify online services that generate large response packets in response to small requests (e.g., DNS, memcached, NTP)
      • send requests to such services with source address of victim
        • huge bandwidth amplifier
    • partial fix:
      • source address validation
      • edge routers filter outbound packets with source address that could not be generated within a network
  • Denial-of-Service
    • attack against availability, rather than confidentiality, integrity, etc
    • two kinds:
      • logical vulnerabilities
        • ping of death, Land – fix w/ filter+patching
      • resource consumption
        • overwhelm with spurious requests (SYN flood, smurf, bandwidth, overflow)
  • resource consumption of services
    • Server CPU/memory resources
      • consumes connection state, time to evaluate message, cause new connections to be dropped, DB to query a lot
    • network resources
      • many routers packet-per-second processing limit, FIFO queueing
      • if attack is greater than forwarding capacity, good data will be dropped
  • Address spoofing 2
    • puzzles: don’t commit state until client has done a bunch of work for you (solve a computationally tough problem)
      • server provides puzzle to client
      • client must solve puzzle
      • tricky if validation isn’t super cheap
    • traceback
      • router support for tracing packets back to their source
      • probabilistic packet marking
      • packet logging

Lecture 9

  • attacks like SolarWinds and WannaCry still persist
  • networked computers have inherent trust
  • can’t really prevent an attack very well once started - especially before people notice
  • bug with code red propagation - used a fixed random seed for ip propagation
  • admins and orgs should diversify

Lecture 10

  • zmap challenges view of scanning internet as bad
  • opened lots of new opportunities
  • SYN-cookies to put state in the sent-packet
    • packets coming back have same information as sent packet

Lecture 11

  • methods to share physical media: multiple access
    • fixed partitioning
    • random
  • channelizing mechanisms
  • contention-based mechanisms (Aloha)
  • wireless channel problems

  • fixed partitioning
    • need to share media with multiple nodes (n)
      • multiple simultaneous conversations
    • a simple solution: divide channel into multiple separate channels
    • channels are physically separate
      • bitrate of link is split across channels
  • Frequency Division Multiple Access (FDMA)
    • divide bandwidth of f Hz into N channels each with bandwidth of \(f/n\)
    • easy to implemented, but unused subchannels are idle
    • used by traditional analog cell phone service, radio, TV
  • TDMA (Time Division Multiple Access)
    • divide channel into rounds of n time slots
    • assign hosts to particular time slots within round
    • unused slots idle
    • used in GSM and digital cordless phones
  • Code Division (CDMA)
    • do nothing ot physically separate channels
    • all stations transmit at the same time in same frequency bands
    • “spread spectrum techniques”
    • “split up energy”
  • problem with channel partitioning
    • not well suited from random access usage
    • instead, design schemes for more common situations
      • not all nodes want to send all the time
      • don’t have a fixed number of nodes
    • potentially higher throughput for transmissions
      • active nodes get full channel bandwidth
  • Aloha:
    • designed in 1970 to support wireless data connectivity (hawaii)
    • goal: distributed access control
    • aloha in a nutshell
      • send when data is ready
      • if transmission not successful, retransmit after fixed delay
  • elements of a wireless network
    • infrastructure mode:
      • base station connects mobile devices into a wires network
      • handoff: mobile changes changes base station when signal strength changes
    • ad-hoc mode:
      • no base stations
      • nodes can only transmit to other nodes in link coverage
      • nodes organize themselves into a network
  • wireless link characteristics
    • difference from wired links
    • decreased signal strength over distance
    • interference from other sources
    • multipath propagation
      • radio signal reflects of objects
  • wireless link characteristics
    • achieve particular bit error rate based on SNR and transmission rate

Lecture 12

  • not many lecture notes…see paper summary notes
  • exponential backoff when there is interference.

Lecture 13

  • mesh networks - hard to guarantee connectivity
  • network worked pretty well
  • why was their result surprising?
    • packet loss 20% median loss rate
    • interference also plays a role
    • more dense deployments, creates higher packet loss rates
    • data rates with higher loss is better than lower data rate with less loss
    • more losses doesn’t mean bitrate is wrong..just that your network is dense

Lecture 14

  • Peer to peer networks
    • generally, each node is equal and join together to provide a service
  • p2p adoption:
    • client-client file sharing
    • digital currency
    • voice/video telephony
  • classic p2p: bittorrent
    • user clicks download link: get torrent file with content hash, IP of tracker
    • user’s BT client talks to tracker, tracker tells peers who have file
    • user BT client downloads file from one or more peers
    • B client tells tracker it too, has a copy
    • users BT client serves files to others
  • flat-name lookup problem - the true problem behind P2P
    • one solution: semi-p2p where content is distributed, but index is in a DB
      • e.g. Napster. Single DB for index, clients store data
      • O(N) state stored in single server
      • SPOF
    • query flooding: gnutella:
      • ask neighbors for content, they ask neighbors for content, etc…
      • doesn’t scale for large networks..
    • routed DHT queries (Chord)
      • can we make the p2p system robust, maintain reasonable state per node and latency?
      • local hash table: hash, put, get
      • DHT: hash(data), lookup(key) –> IP Address, sendRPC(IP, put, get, key, data), sendRPC(IP, get, key) -> data
      • partitioning of data in a truly large-scale system
      • tuples in a global database engine
      • data blocks in a global FS
      • Files in a P2P file sharing system
  • Why are DHTs hard?
    • decentralized
    • scalable
    • efficient
    • dynamic
  • consistent hashing

Lecture 15

  • Pastry - DHT (key-location network)
    • joining – need to “know someone”
    • security issues: must trust first node in the network
  • failures: designed to survive L/2 failures (leaf failures)
    • tolerate failures by replicating values to nearby leaf nodes
    • byzantine nodes would need to take over multiple leaf nodes that have content in order to take ownership of all replicas for a set of keys
  • pastry requires inherent trust within the network
  • routing via bits: always try to route to another node with more bits
  • locality-based routing
    • want to route towards destination, but try to stay within a node that is “close” (latency-wise)
    • lower latency in getting closer to node
    • DHT potentially very slow compared to a database
  • number of hops not always necessarily correlated to routing distances
    • circuit switches between packet switching routers

Lecture 16

  • Data center networks…very different from the internet
    • everything in a warehouse
    • easier to upgrade
    • fully administered
  • DC network requirements
    • scalability: incrementally add/build new hosts
    • reliability: loop-free forwarding
    • VM migration
    • reasonable management burden…humans? they can manage downtime/misbehaving hardware
  • DC management: L2 vs L3
    • ethernet switching: (layer 2)
      • cheaper switch equipment
      • fixed addresses/auto-configuration
      • seamless mobility, migration, or failover
    • IP routing (layer 3)
      • scalability with hierarchical addressing
      • efficiency through shortest path routing
      • multipath routing through equal-cost multipath
    • datacenters often connect layer-2 islands by IP routers
  • ethernet
    • flat addressing + self-learning w/ autoconfiguration
      • “plug and play”
    • some monitoring apps require server with same role to be on the same VLAN
    • use same IP on dual-homed servers
    • VM migration simpler
  • traditional DC topology
    • L3 router typically at edge of DC-Internet
    • base servers connected by L2 ToR switches
    • aggregation layer with L2/L3 switches to allow inter-server communication
  • ethernet scaling issues…
    • flood-based delivery: frames to unknown destinations are flooded
    • bootstrapping on network relies on broadcasting
      • vulnerable to resource exhaustion attacks
    • inefficient forwarding paths
      • loops are fatal (broadcast storms)
      • forwarding along a single tree is inefficient, and leads to lower utilization
  • traditional topologies
    • over subscription of links higher in the topology
    • tradeoff between cost+provisioning
    • SPOF
  • FAT tree-based topology
    • ALL L3 solution
    • connect hosts with “fat trees”
      • infrastructure consist of cheap devices
      • each port supports same speed as end host
      • all devices can transmit at line speed if packets are distributed along existing paths
      • k-port fat tree can support k^3/4 hosts
    • FAT tree challenges
      • L3 only use one of existing equal cost paths
      • packet re-ordering occurs if L3 blindly takes advantage of path diversity

Lecture 17

  • At the time…
    • datacenters still in their infancy
    • at the time, not many buildings with tons of computers
      • supercomputers
      • enterprise/offices?
    • miranet – originally for HPC, now applied to commodity hardware
  • previous network had a bandwidth issue:
    • single-rooted tree all needs to go through root – the uplink bandwidth from layer 2 to 1 must be equivalent to the aggregate bandwidth of all servers
  • two-level routing table available on commodity switches or not?
    • highest-level prefix matching available on existing routers
      • lowest prefix match on 2nd table maybe implemented with ECMP?
      • ECMP: statically allocates flows of traffic from/to source/destination to a path
      • problem with this new architecture is the need for dynamic path changing depending on large flows

Lecture 18

  • fb DC workload…not necessarily the same as what fat-tree recommended
  • facebook workloads –> not a cloud provider. very specific workloads

Lecture 19

  • software defined network…
  • what is it??
    • control plane of network physically separate from the data plane
    • single logically centralized control plane controls the forwarding devices
  • networking planes…
    • data plane: processing and delivery of packets with local forwarding state
      • forwarding state + header –> forwarding decision
      • filtering, buffering, scheduling
    • control plane: computing the forwarding state in routers
      • determine how/where packets are forwarded
      • routing, traffic engineering, failure detection/recovery
    • management plane: configuring and tuning the network
      • traffic engineering, ACL config, device provisioning
  • control plane
    • compute paths the packets will follow
    • populates forwarding tables
    • traditionally..distributed protocol
    • e.g. Link-state routing (OSPF/IS-IS)
      • flood entire topology
      • each node computes shortest paths –> dijkstra’s
  • inter-domain routing
    • inter-domain routing, BGP artificially constrains routes
      • routing only on destination IP blocks
      • can only influence immediate neighbors
      • difficult to incorporate other information
    • application-specific peering (route video traffic one way, http another)
    • blocking DoS –> drop unwanted traffic upstream

Lecture 20

  • before active networking..everything baked into the switch and logic needs to be pre-programmed
  • active networks allow applications to decide how to route packets
  • if code is included in the packet, then metadata may not be necessary
    • ip metadata somewhat redundant
  • to support in-packet instructions
    • routers would need to have an available instruction set
    • need safeguards
    • router needs its own OS
  • research didn’t “have a home” yet..kind of died
  • applications of active networking
    • bigger issues (e.g. firewalls) is not well-suited for current
    • distributed systems … hard to manage without active networking
      • application developers lose control
  • active networking, not a good home for global internet
    • security issues
    • management issues
  • who would benefit from tight network management?
    • different types: small networks (e.g. doctor’s office), government, ISPs, datacenter/cloud providers. Which ones?
    • government (classified / info compartmentalization)
    • banks/finance companies

Lecture 21

  • ethane– great experiment in actual deployment
  • switch had to be built from an FPGA

Lecture 22

  • congestion control overview
    • challenge: efficiently share network resources?
    • today: TCP
    • Alternate solutions:
      • fair queueing, RED, Vegas, packet fair, rate control, credits
    • no one “true” solution to congestion control
    • too much congestion ==> congestion collapse
  • depends on assumptions and goals:
    • hosts, links
    • fairness
    • backcompat
  • ACK pacing in TCP
    • ACKS open slots in congestion/advertised window
      • bottlenecking link determines the rate than can be sent
      • ACK indicates a packet has left the network
  • problems with ack pacing
    • ack compression
      • vriations in queueing delays on return path change spacing between acks
      • worse with bursty cross traffic
    • what happens after a timeout
      • potentially no acks to time packet transmissions
    • congestion avoidance
      • slow start to last successful rate
      • back to AIMD
  • TCP achieves fairness at the cost of efficiency
  • TCP friendliness
    • problem: many different TCP implementations
    • if implementations change rate differently depending on drops, more aggressive implementations will take more bandwidth (i.e. intertwined MIMD/AIMD)
    • incentive to cause congestion collapse
      • if you are more aggressive grabbing bandwidth, you can have more network
      • …kind of game theory-like

Lecture 23

  • more congestion control…
  • high bandwidth-delay product networks
    • high latency links that can have a lot of packets in flight
    • can have lots of delay, bandwidth, or both
  • examples of high BW-D product network
    • satellite links: latency 100s of ms to 1s links
  • how does XCP deal with the staleness of congestion information in high BWDP networks?
    • XCP gives constant flow of congestion information to receivers
    • senders get updates every 1 RTT

Lecture 24

  • congestion control….again!
  • trend in adding buffers (memory) into switches to buffer large traffic flows
    • this is always good..right? wrong!
    • hard to determine link bandwidth via loss-based congestion control unless the queues are full
    • senders can’t mitigate or change rates until the buffers in the network are full
  • implications of congestion control for large company: lots more money because additional need to retransmit packets.
  • when sending faster than BW-D product…:
    • at BWDP you are optimally using the network
    • above BWDP -> start queueing
  • getting rid of queues could lead to easily measuring BWDP
    • but, if you get rid of them, then we’ll likely have much higher packet loss rates when bursts do come
  • BBR is still loss-based congestion control, but doesn’t require filling up queues in the network to change transmission rates
  • can’t simply measure bandwidth at the receiver side
    • bbr measurement does not give the exact measurement of the link bandwidth
    • measures bandwidth subtracted from what others are using at that time.
  • bbr doesn’t consider fairness of other users of the links
    • bbr at the same time as older versions of TCP, bbr will win over other congestion control algorithms -