Minor tweaks

2014-04-02 13:37:48 -03:00 · 2014-04-02 13:37:48 -03:00 · a4f4e49cd4
--- a/docs/Deduplication.md
+++ b/docs/Deduplication.md
@@ -59,7 +59,7 @@ find duplicates)
 * where pos is position in file after readline()
 Idle-daily deduplication notes:
-* 86400(seconds in day)*250(submissions per second)*8(bytes per UUID)= 161mb -> 2gb for 12 weeks
+* 86400(seconds per day) x 250(submissions per second) x 8(bytes per UUID) = 161mb per day -> 2gb for 12 weeks
 * leveldb seems well-suited for this sort of workload, but a C++ implementation is trivial: https://github.com/tarasglek/tombstone_maker
 * skiplists(filename:offset)  are called tombstones
 * should have a compressed TOMBSTONE_INDEX for every IDLE_DAILY in a particular release. incoming_data EC2 job should generate those. Since these are basically sets they can be generated in parallel and UNIONED at the end of each EC2 job.
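
The sizing note and the union step above can be sketched in a few lines of Python. This is a rough illustration, not the project's code: the 8 bytes per identifier is taken from the note as written (a full UUID is 16 bytes, so a truncated or hashed ID is presumably meant), retention is assumed to be 12 weeks of uncompressed daily indexes, and the names `daily_index_bytes` and `union_tombstones` as well as the `(filename, offset)` tuple layout are hypothetical.

```python
# Back-of-the-envelope sizing for the idle-daily ID index, plus a sketch of
# merging tombstone sets produced by parallel jobs.
# All names here are illustrative assumptions, not part of the real code base.

SECONDS_PER_DAY = 86400
SUBMISSIONS_PER_SECOND = 250   # rate quoted in the note
BYTES_PER_ID = 8               # as stated in the note; a full UUID is 16 bytes
RETENTION_DAYS = 12 * 7        # assumption: 12 weeks kept, one index per day


def daily_index_bytes(rate=SUBMISSIONS_PER_SECOND, id_bytes=BYTES_PER_ID):
    """Bytes needed to store one day's worth of submission IDs, uncompressed."""
    return SECONDS_PER_DAY * rate * id_bytes


def union_tombstones(partial_indexes):
    """Merge tombstone sets built independently by parallel workers.

    Each tombstone is modelled as a (filename, offset) tuple marking a
    record to skip. Because the indexes are plain sets, the merge is
    order-independent and duplicates collapse automatically.
    """
    merged = set()
    for part in partial_indexes:
        merged |= part
    return merged


if __name__ == "__main__":
    per_day = daily_index_bytes()
    total = per_day * RETENTION_DAYS
    print(f"per day:  {per_day:,} bytes ({per_day / 2**20:.1f} MiB)")
    print(f"12 weeks: {total:,} bytes ({total / 2**30:.1f} GiB)")

    worker_a = {("day1.log", 0), ("day1.log", 4096)}
    worker_b = {("day1.log", 4096), ("day2.log", 128)}
    print(sorted(union_tombstones([worker_a, worker_b])))
```

With these inputs the daily figure is 172,800,000 bytes (about 164.8 MiB), and 84 uncompressed daily indexes total roughly 13.5 GiB, so the figures in the note may rest on compression or other assumptions not stated here.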