| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185 |
- # author: David Harrison
- # In an attempt for us to have consistent notions of torrent health
- # across search and our various tools, I thought it made sense
- # to define health stats in BTL.
- def reciprocity( downloaders ):
- # measure of the effectiveness of tit-for-tat.
- # This is heuristic for lack of a better model.
- low_thresh = 5 # traditional tit-for-tat algo makes no sense when there are <= 5.
- high_thresh = 50
- # '+1' includes the local host if it were to join the swarm.
- # This retains consistency with the use of 'downloaders+1' used in the
- # download_rate_health metric.
- if downloaders + 1 < low_thresh:
- gamma = 0.
- elif downloaders + 1 < high_thresh:
- gamma = (downloaders + 1. - low_thresh)/(high_thresh - low_thresh)
- else:
- gamma = 1.
- return gamma
-
- def download_rate_health( seeders, downloaders, nats = 0 ):
- """
- This health metric preserves order based on expected download rates.
- It is not necessarily proportional to expected download rate.
- Let downrate_i = download rate from torrent i.
- Let uprate_i = upload rate to torrent i.
- Let downloaders_i = number of downloadeers in torrent i.
- Let seeders_i = number of seeders in torrent i.
- E[downrate_i] = E[downrate from tit-for-tat_i] + E[downrate from seeders_i] (1)
- Assumption 1: downrate from tit-for-tat is related to uprate by a function gamma_i.
- Let reciprocity gamma_i = effectiveness of tit-for-tat.
- E[downrate from tit-for-tat_i] = gamma_i * uprate_i (2)
- where uprate_i is the local uplink capacity that can be committed to torrent i.
- Assuming uprate is invariant across the torrents that I might join, (2) becomes
- E[downrate from tit-for-tat_i] = gamma_i * uprate (3)
- Assumption 2: my expected downrate for any torrent i is equal to the
- average download rate due to seeders across the torrent swarm for i. Due to
- conservation of bits. The sum download rate due to seeders must equal
- the sum upload rate from seeders. Thus
-
- Sum_k uprate from seeder k
- E[downrate from seeders_i] = -----------------------------
- downloaders_i
- seeders_i * E[uprate from seeder]
- -------------------------------- (4)
- downloaders_i
- Assumption 3: E[update from seeder] = uprate
- This expectation might reasonably hold when the local user is a typical
- residential customer and all seeders are also residential customers.
- This is likely to be far off when their are many infrastructure seeds.
- E[downrate from seeders_i] = seeders_i * uprate / downloaders_i
- and thus combining with (1) and (3) yields
- E[downrate_i] = gamma_i * uprate + seeders_i * uprate / downloaders_i (5)
- Our objective is a health metric H such that
- H_i > H_j ---> E[downrate_i] > E[downrate_j] for all i,j
- Because uprate >= 0,
-
- E[downrate_i] E[downrate_j]
- ------------ > ------------- ---> E[downrate_i] > E[downrate_j] for all i,j
- uprate uprate
- We thus define rate health metric Hr_i
- seeders_i
- Hr_i = gamma_i + ------------- (6)
- downloaders_i
- To take into account the local user in the health metric, assuming the user is a
- potential downloader, this becomes
- seeders_i
- Hr_i = gamma_i + ----------------- (7)
- downloaders_i + 1
- The +1 also eliminates the potential divide by zero error that would arise in
- (6) were downloaders_i = 0.
- If gamma_i = gamma_j for all i,j then we can simplify (7) while preserving order
- by subtracting out gamma_i. This leads to
-
- seeders_i
- Hr_i = ----------------- (8)
- downloaders_i + 1
- From experience, we know that the tit-for-tat algorithm works poorly when there
- are few downloaders in a swarm. At the extreme when there are less than 5,
- the algorithm does nothing: the traditional tit-for-tat algorithm unchokes
- the best 4 fastest plus 1 randomly chosen 'optimistic' unchoke. Reciprocity
- should be some function of the number of downloaders in the swarm. For lack
- of a better study, I suggest the following
- downloaders_i +1
- / ---------------- for downloaders_i < thresh
- | thresh
- gamma_i <
- | 1 otherwise
- \
- Thus the effectivness of reciprocity increases linearly until it hits a threshold
- and reciprocation becomes perfect above that threshold.
- @param seeds: number of non-natted peers that have the entire file
- and are still in the torrent swarm.
- @param downloaders: number of non-natted downloaders in the swarm
- @param nats: number of natted downloaders in the swarm.
- """
- assert downloaders >= 0
- assert seeders >= 0
- gamma = reciprocity(downloaders)
- Hr = health = gamma + seeders / (downloaders + nats + 1.)
- return health
- def download_time_health( seeders, downloaders, nats, filesize ):
- """
- Health metric that perserves order based on download times.
- Smaller is better. (Confusing. I couldn't decide whether
- "smaller is better" or "bigger is better" is more appropriate.)
- Let downtime_i = download time for the entire file from torrent i.
- Thus
- Ht_i > Ht_j --> E[downtime_i] > E[downtime_j] (9)
- filesize_i
- E[downtime_i] = ------------
- E[downrate_i]
-
- Using the same definitions ofr downrate_i as used in derive the
- health metric for download_rate_health, (9) becomes
- filesize_i
- E[downtime_i] = ------------------------------------------------------
- gamma_i * uprate_i + seeders_i * uprate_i / downloaders_i
- Because uprate_i is positive,
-
- E[downtime_i] > E[downtime_j] <----
- filesize_i filesize_j
- ------------------------------ > ------------------------------
- gamma_i + seeders_i / downloaders_i gamma_i + seeders_j / downloaders_j
- Including +1 and using the same gamma as used from download_time_health,
- we define
- filesize_i
- Ht_i = --------------------------------------
- gamma_i + seeders_i / (downloaders_i+1)
- @param seeds: number of non-natted peers that have the entire file
- and are still in the torrent swarm.
- @param downloaders: number of non-natted downloaders in the swarm.
- @param nats: number of natted downloaders in the swarm.
- """
- assert downloaders >= 0
- assert seeders >= 0
- assert filesize >= 0
- gamma = reciprocity(downloaders)
- Hr = gamma + seeders / (downloaders+nats+1)
- Ht = filesize / Hr
- return Ht
-
|