Domino Anti-Pattern: Delete and Recreate
When reviewing agents or troubleshooting misbehaving databases, I keep coming across a popular anti-pattern for Domino: delete and recreate. An anti-pattern is a commonly applied solution to a well-known problem that is known to fail. "Delete and recreate" can be found in report-generating applications or in databases that pull values from other sources such as an RDBMS or flat files. Mostly you find this pattern when developers come from an RDBMS background, where deleting equals the total removal of information. In Domino things are a little different. When you delete a document, something remains behind: a deletion stub. A deletion stub is the DocumentUniqueId plus a flag that says "I'm a deletion stub". The stub is then replicated to other servers or clients to remove the document in those replicas as well. The default life span of a deletion stub is 30 days.
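To make the pattern concrete, here is a minimal sketch of what such an agent typically looks like, written as a Domino Java agent. The form and item names and the inline sample rows are invented for the example; a real agent would pull its rows from an RDBMS or a flat file instead:

```java
import lotus.domino.*;

// Anti-pattern sketch: a scheduled agent that wipes all documents and
// rebuilds them from scratch. Every run of removeAll() leaves one
// deletion stub per deleted document behind.
public class JavaAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            Database db = session.getAgentContext().getCurrentDatabase();

            // Step 1: delete everything - each document becomes a deletion stub
            db.getAllDocuments().removeAll(true);

            // Step 2: recreate all documents from the external source
            // (hypothetical stand-in data for an RDBMS result set)
            String[][] sourceRows = { { "C-0001", "Alice" }, { "C-0002", "Bob" } };
            for (String[] row : sourceRows) {
                Document doc = db.createDocument();
                doc.replaceItemValue("Form", "Customer");
                doc.replaceItemValue("CustomerNo", row[0]);
                doc.replaceItemValue("Name", row[1]);
                doc.save(true, false);
            }
        } catch (NotesException e) {
            e.printStackTrace();
        }
    }
}
```

Run this on a schedule and the stubs pile up exactly as described below.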
In a recent analysis I looked at a database with just 400 documents that was big, slow, and prone to crashes. On closer examination I found an agent performing this pattern on the 400 documents every 30 minutes. Do the math: 400 documents × 48 runs per day × 30 days of stub life span = 576,000 deletion stubs. Quite some baggage for just 400 entries.
So what are the alternatives?
Requirements like this are often solved with another anti-pattern: loop through a big collection and do a dbLookup (or getDocumentByKey) for each iteration. But there is an easier way. Typically there is a sorted key (customer number, part number, record id, etc.) that is available in both source and target. Create a sorted collection for each of source and target and use what I call "The Tango" or "The Wiggle". The pseudo code looks like this:
- Read the first source key and the first target key
- Do until you run out of keys
  - If the source key equals the target key: call a subroutine that compares the two records the way they need to be compared, and update and save the target only if it has changed (but not otherwise); then read the next source and target keys
  - If the source key is bigger than the target key: the target record no longer exists in the source, so get the next target key and then delete the current target document
  - If the target key is bigger than the source key: insert the source record into the target database and read the next source key
- End of the do loop
Of course your production code needs to handle the case that you run out of source keys (delete all remaining target documents) or out of target keys (insert all remaining source documents), as the sketch below does.
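Here is a minimal sketch of "The Tango" as a Domino Java agent, under a few assumptions: the target is a view sorted ascending by the key in its first column, the view name "(CustomersByNo)" and the items "CustomerNo" and "Name" are invented for the example, and an in-memory sorted array stands in for the source (in practice that would be a JDBC result set or a file reader). Calls to recycle() are omitted for brevity:

```java
import lotus.domino.*;

// "The Tango": walk a sorted source and a sorted target side by side.
// Matching records are updated only when they differ; records missing on
// either side are inserted or deleted. Nothing is deleted just to be
// recreated, so no needless deletion stubs accumulate.
public class JavaAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            Database db = session.getAgentContext().getCurrentDatabase();

            // Target: a view sorted ascending by CustomerNo (first column)
            View target = db.getView("(CustomersByNo)");
            Document doc = target.getFirstDocument();

            // Source: stand-in for a sorted RDBMS result set or flat file
            String[][] source = { { "C-0001", "Alice" }, { "C-0003", "Carol" } };
            int i = 0;

            while (i < source.length && doc != null) {
                int cmp = source[i][0].compareTo(doc.getItemValueString("CustomerNo"));
                if (cmp == 0) {
                    // Keys match: update and save only if something changed
                    if (!doc.getItemValueString("Name").equals(source[i][1])) {
                        doc.replaceItemValue("Name", source[i][1]);
                        doc.save(true, false);
                    }
                    doc = target.getNextDocument(doc);
                    i++;
                } else if (cmp > 0) {
                    // Target key missing in source: get next, then delete current
                    Document next = target.getNextDocument(doc);
                    doc.remove(true);
                    doc = next;
                } else {
                    // Source key missing in target: insert it
                    insertRecord(db, source[i]);
                    i++;
                }
            }
            // Ran out of source keys: delete all remaining target documents
            while (doc != null) {
                Document next = target.getNextDocument(doc);
                doc.remove(true);
                doc = next;
            }
            // Ran out of target keys: insert all remaining source records
            for (; i < source.length; i++) {
                insertRecord(db, source[i]);
            }
        } catch (NotesException e) {
            e.printStackTrace();
        }
    }

    private void insertRecord(Database db, String[] row) throws NotesException {
        Document doc = db.createDocument();
        doc.replaceItemValue("Form", "Customer");
        doc.replaceItemValue("CustomerNo", row[0]);
        doc.replaceItemValue("Name", row[1]);
        doc.save(true, false);
    }
}
```

One pass over two sorted collections replaces hundreds of per-document lookups, and a document is only ever touched when it actually has to change.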
This way you don't create unnecessary zombie deletion stubs.
Posted by Stephan H Wissel on 20 September 2007 | Comments (4) | categories: Show-N-Tell Thursday