BEN-600 - Tiered Caching Video Delivery for Tor Hidden Services
==================================================================


Setup
------

Using Big Buck Bunny as the test media

    ben@milleniumfalcon:/tmp/BEN600$ wget http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_60fps_normal.mp4

Creating an ABR stream with multiple bitrates

    ben@milleniumfalcon:/tmp/BEN600$ ../HLS18test/HLS-Stream-Creator/HLS-Stream-Creator.sh -i bbb_sunflower_1080p_60fps_normal.mp4 \
    -s 2 -b 512,1024,2048,2560,4096,6144,8192 -t Big_Buck_Bunny -p Big_Buck_Bunny

Took a while to transcode...


Initial Assumptions/Decisions
------------------------------

- NGinx doesn't handle caching of byte-range requests particularly well: the client is made to wait until NGinx has retrieved all data from upstream. So despite the change in HLSv5 we're going to stick with separate segment files.
- The initial hypothesis is that the delivery bottleneck on a Client -> Server Hidden Service is the connection to the Tor network itself. On busy text-based Hidden Services this seems to be the case, so it's assumed the same will be true (and perhaps magnified) when delivering video to multiple clients.
- For the purposes of this test (and to save typing), the term CDN refers to the test infrastructure being built.


Topology Design
-----------------

The aim is to have a tiered hierarchy, with a mid-caching tier.

- CDN Edge 1
- CDN Edge 2
-- CDN Mid-Tier 1
--- CDN Origin

Where both edges are configured to talk only to the mid-tier.

At a later date, will look at the possibility of adding additional nodes to both the edge and the mid-tier.


Intra-CDN Comms
----------------

The ultimate aim is to have every tier have its own hidden service descriptor.
Initial tests, though, will probably just use HTTPS between them, in part to gauge what the ultimate impact of using HSs between the tiers is on cold startup times.


NGinx Settings - Caching
--------------------------

The lifetime of cached assets should be determined by the origin (via Cache-Control max-age) and will obviously need to differ between Linear and VoD.

*Linear Video*

The playlist should contain up to 4 segments at a time and have max-age determined by

    (segment length * segment count) * 0.5

So if the segments are 2 seconds apiece

    (2*4) * 0.5 = 4 seconds

Which should ensure that a fresh copy of the manifest is always available when it's needed.

*VoD*

Depending on the player, the manifest itself will only be retrieved once, but takes up very little space in cache. So we'll set the max-age at 1 week. The segments should be cached for 7 days too.

*Available cache space*

NGinx's cache will be on disk, with sufficient space to (theoretically) guarantee CACHE_HITs if all video were to be replayed again.

The exception being during a final set of tests, where the available cache space will be severely restricted in order to try and instigate heavy LRUing of the cache.


NGinx Settings - Logging
--------------------------

To allow easy retrieval of stats from NGinx's logs, the following log format should be used

    log_format main '$remote_addr\t-\t$remote_user\t[$time_local]\t"$request"\t'
                    '$status\t$body_bytes_sent\t"$http_referer"\t'
                    '"$http_user_agent"\t"$http_x_forwarded_for"\t"$http_host"\t'
                    'CACHE_$upstream_cache_status\t$request_time\t$http_x_downstream\t';

Where request_time is the time taken to serve the response.

The lower tiers should include an identifier in X-DOWNSTREAM (see below) so that they can be identified in the logs when connecting via Hidden Service.


NGinx Settings - Upstream Connections
--------------------------------------

Need to identify the best means to have NGinx go upstream to another hidden service.
It doesn't speak SOCKS, so we can't simply tell it to connect to Tor's SOCKS port. Transparent redirection would be one option, I guess.

In either case, an HTTP Keep-Alive session should be used between the tiers with a timeout of 60 seconds (to reduce the overhead of continually creating new circuits etc).

To help in generating stats, all tiers should identify themselves to upstream with the X-DOWNSTREAM header. Note: You probably wouldn't do this outside of a testing environment.

So that gives us something like the following

    location / {
        set $cachename "Edge1";
        proxy_set_header X-DOWNSTREAM $cachename;
        proxy_set_header Host midtier.onion;
        proxy_pass http://midtier.onion:8091;

        # Enable Keep-Alive
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Allow revalidations
        proxy_cache_revalidate on;

        # Collapse concurrent requests for the same uncached item into one upstream fetch
        proxy_cache_lock on;

        # Cache specification
        proxy_cache torcdn-cache;
        proxy_cache_key "$scheme$host$request_uri";
        add_header X-Cache-Status "$cachename-$upstream_cache_status";

        # Make sure our DNS requests hit the Tor Client
        resolver 127.0.0.1:1053;
    }

Upstream port 8091 is used to simplify the iptables config to transparently redirect the request via Tor.


Tor Client config
-------------------

As well as advertising a hidden service, the Tor client needs to be configured to expose its DNS server on port 1053 so that NGinx can use it to resolve the relevant Hidden Service.
We also need TransPort enabled.

So, in addition to the Hidden Service config, torrc needs to contain the following directives

    DNSPort 127.0.0.1:1053
    AutomapHostsOnResolve 1
    VirtualAddrNetwork 10.128.0.0/10
    TransPort 9040
    TransListenAddress 127.0.0.1


Metrics
--------

For each test, I want to gather (at least) the following

- Cache cold - number of bitrate changes (and from/to what) the player makes
- Cache warm - number of bitrate changes (and from/to what) the player makes
- Edge distribution of requests
- Mid-tier hit rate with multiple players
- Apparent throughput at the edge (with multiple players)
- Apparent throughput at the mid-tier (with multiple players)
- Apparent throughput at the origin (with multiple players)

Average delivery times for segments should also be calculated.


Tests
------

All client side (e.g. browser) caches should be cleared between tests.

VoD

- Direct to origin
- 1 edge cache online, proxying to origin (cold cache)
- 2 edge caches online, proxying to origin (cold cache)
- 1 edge cache online, proxying to origin (warm cache)
- 2 edge caches online, proxying to origin (warm cache)
- 1 edge cache online, mid-tier online, proxying to origin (caching)
- 2 edge caches online, mid-tier online, proxying to origin (caching)
- Multiple players, multiple VoD streams (with overlap between players)
- Multiple players, multiple VoD streams, limited cache space (to force LRUing)

Linear Video

- Direct to origin (via HS)
- 1 edge cache online, proxying to origin (no caching)
- 2 edge caches online, proxying to origin (no caching)
- 1 edge cache online, proxying to origin (caching)
- 2 edge caches online, proxying to origin (caching)
- 1 edge cache online, mid-tier online, proxying to origin (caching)
- 2 edge caches online, mid-tier online, proxying to origin (caching)
- Multiple players, multiple linear streams (with overlap between players)
- Multiple players, multiple linear streams, limited cache space (to force LRUing)

Depending on the results, can then move on to lower resolution streams. Will also need to rinse and repeat with a different segment length (10 seconds seems a logical next step).
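

Appendix - Sketch of log analysis
----------------------------------

The hit rate, edge distribution and average delivery time listed under Metrics could be pulled out of the access logs with something like the following. This is only a rough sketch: it assumes the \t escapes in the log_format above are written out as literal tab characters, and that the field positions match that format exactly. The function name summarise and the keys it returns are made up for illustration, not part of any existing tooling.

```python
from collections import Counter

# Zero-based field positions in the tab-separated log_format defined earlier
CACHE_FIELD = 11       # CACHE_$upstream_cache_status
TIME_FIELD = 12        # $request_time, in seconds
DOWNSTREAM_FIELD = 13  # $http_x_downstream


def summarise(lines):
    """Summarise cache status, request time and downstream ID for log lines."""
    statuses = Counter()
    per_downstream = Counter()
    total_time = 0.0
    count = 0

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= DOWNSTREAM_FIELD:
            continue  # not in the expected format, skip it
        try:
            elapsed = float(fields[TIME_FIELD])
        except ValueError:
            continue  # request_time missing or malformed, skip it

        statuses[fields[CACHE_FIELD]] += 1
        per_downstream[fields[DOWNSTREAM_FIELD]] += 1
        total_time += elapsed
        count += 1

    return {
        "requests": count,
        "hit_rate": statuses["CACHE_HIT"] / count if count else 0.0,
        "mean_request_time": total_time / count if count else 0.0,
        "statuses": dict(statuses),
        "per_downstream": dict(per_downstream),
    }
```

Calling summarise() on an open log file handle (for example one tier's access log) returns a dict of totals. Run against the mid-tier's log, per_downstream gives the split of requests between the edges, since each edge sends its name in X-DOWNSTREAM. Requests for manifests and segments are not separated here, so filtering on the request field would be needed before treating mean_request_time as a per-segment figure.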