:: 30-Dec-1999 21:14 (Thursday) ::
For all of you who want to know the gory details of what happened to stats
last night, here they are.
About a week ago, Sybase skipped a bunch of identities in STATS_Participant,
the main participant info table. This tends to happen when the server is
shut down abnormally, and has happened several times in the past. Unfortunately,
we didn’t catch it when it happened this time, so there were several days of
data with bad participant IDs. This meant re-writing the script we use to
correct this problem.
We re-wrote the script a few days ago, but hadn’t run it yet. Yesterday, Bruce
came up with a change for the statsrun code that would allow us to eliminate
the identity field from STATS_Participant, thereby getting rid of this issue.
Since this would also require rebuilding STATS_Participant, it seemed a perfect
time to fix the identity problem.
With the new statsrun code in place and looking good, I was ready to run the
re-identity script. I almost made a backup copy of STATS_Participant, but
remembered that the re-identity script made a backup copy on its own. Of course,
this breaks an axiom of computers that goes something like ‘Too many backups is
almost enough.’
The script had a minor syntax error, and in the process of trying to debug it, I
ran the script several times. This had the unfortunate side effect of sending all
traces of STATS_Participant to the great bit-bucket in the sky.
Not to worry, we make weekly backups of the database for this very reason. With a
boatload of help from Nugget, we got a copy of STATS_Participant from Dec 27 back
into the database, though it took us three tries to figure out how to get it back
in without having Sybase redo all the identities.
Once that was in, we fixed the identity problem (making loads of backups along the
way this time) and pondered how to recover the participant data and team info that
had been changed in or added to STATS_Participant in the past few days. We decided that
re-running those days from logs would be the easiest.
We extracted the days in question from the master tables and used that info to reset
everyone’s team affiliation to what it was for the 12/28 statsrun (a process that
ended up taking a few hours). We fired off the statsrun and waited. We had to fine-
tune Bruce’s code a bit, but things seem to be going well now and the 12/29 RC5 data
is just about done.
The only thing left to do is re-assign blocks for the past few days to the correct
teams for the handful of people who changed their team membership during those days.
Because that hasn’t been done yet, some of the team stats might look a bit off.
After tonight’s statsrun, everything should be back to normal.
I’m glad we won’t have this problem with STATS_Participant again! }:8)
Thanks as always for your patience and CPU time.