staff blogs

distributed.net staff keep (relatively) up-to-date logs of their activities in .plan files. These were traditionally available via finger, but we've put them on the web for easier consumption.

2001-02-28

ivo [28-Feb-2001 @ 20:40]

Filed under: Uncategorized — ivo @ 20:40 +00:00

:: 28-Feb-2001 20:41 (Wednesday) ::

After Decibel’s plan from yesterday, there are some things that need
explanation. We’ve had a lot of complaints from DPC members that argued
that there was no organized megaflush going on and that the accusation
made by the DCTI staff yesterday was premature and partly false.

What seemed to be the cause for the backlog was individual team members
who saved up some blocks flushed them because they feared their 8012 blocks
would become invalid soon. A valid reason to flush, one would say.
Unfortunately it turned out that this caused a lot of blocks flushed at
the same time, blocks that normally would have been distributed over days.

Some remarks on this practice: It’s _not_ good to save up blocks. Even
when you do it only on your own or ‘just for a few weeks’. Why?

Our master is optimized for processing keys from 1 or at most a couple of
subspaces. Every subspace takes up 32MB paged-in memory. A lower number
of paged-in subspaces means a faster master. Due to the optimizations in
the code, even switching between in-core spaces means a huge performance
hit. As soon as the master process has a backlog of a couple of million
blocks from 20 or so different subspaces, it chokes.

Some people justify their flushing by saying: ‘But hey, if you can’t cope
the load of a couple of million blocks more per day, how are you going to
handle this in the future when these loads are normal?’ I hope above
mentioned explanation will show this reasoning is false for once and for
all. We have a perfectly capable master box, its specs are still way
adequate, load testing shows that when blocks are somewhat from the same
subspaces, we can easily handle 1000Gkeys/s with the current hardware.
Normally, blocks from unopened spaces are kept in a separate queue which
is processed at quiet times. When suddenly a lot of blocks come from a
lot of unopened spaces, that theoretical 1000Gkey ceiling jumps down to
a something below our current rate, hence the backlog which is very hard
to get rid of.

With this explanation I hope to have convinced people not to save up too
many blocks. Daily stats are what they are for: Daily rankings. Not for
getting a #1 spot for 1 day because you’ve saved up more or longer than
your friends. If you really want to be #1, just recruit more computers!

We try to suit as many participants as possible and we are very pleased
with all the enthusiasm distributed.net participants show. But if people
keep doing things against the policies set out by distributed.net staff,
we might have to take measures against it, by blocking people from stats
or changing the lifetime of a block. This is not because we suddenly
dislike those participants, but because we want the contests to be
satisfying for _all_ users. without backlogs, so everyone can have their
blocks tallied in time.

One more thing on backlogs: Backlogs don’t mean our system is broken, it
just means our system is handling the load sort of well. We do accept all
blocks, and never give a connection refused on our proxies. We _will_
process all those blocks in the end. Maybe not today, but eventually. So
in the end each and every blocks you flush will be counted. Your stats
total will reflect your total work done. And if we all take care in
flushing to the system as it was designed, backlogs will be kept to a
minimum and daily stats will be correct, too.

Keep on crunching!