Domino Anti-Pattern: Delete and Recreate
When reviewing agents or troubleshooting misbehaving databases, I keep coming across a popular anti-pattern for Domino: delete and recreate. An anti-pattern is a commonly applied solution to a well-known problem that is known to fail. "Delete and recreate" can be found in report-generating applications or in databases that pull values from other sources such as an RDBMS or flat files. Mostly you find this pattern when developers come from an RDBMS background, where deleting equals the total removal of information. In Domino things are a little different. When you delete a document, something remains behind: a deletion stub. A deletion stub is the DocumentUniqueId plus a flag that says "I'm a deletion stub". The stub is then replicated to other servers or clients to remove the document in those replicas as well. The default life span of a deletion stub is 30 days.
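To make the pattern concrete, here is a minimal sketch of what such an agent typically looks like, written as a Domino Java agent. The form and item names and the inline sample rows are invented for the example; a real agent would pull its rows from an RDBMS or a flat file instead:

```java
import lotus.domino.*;

// Anti-pattern sketch: a scheduled agent that wipes all documents and
// rebuilds them from scratch. Every run of removeAll() leaves one
// deletion stub per deleted document behind.
public class JavaAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            Database db = session.getAgentContext().getCurrentDatabase();

            // Step 1: delete everything - each document becomes a deletion stub
            db.getAllDocuments().removeAll(true);

            // Step 2: recreate all documents from the external source
            // (hypothetical stand-in data for an RDBMS result set)
            String[][] sourceRows = { { "C-0001", "Alice" }, { "C-0002", "Bob" } };
            for (String[] row : sourceRows) {
                Document doc = db.createDocument();
                doc.replaceItemValue("Form", "Customer");
                doc.replaceItemValue("CustomerNo", row[0]);
                doc.replaceItemValue("Name", row[1]);
                doc.save(true, false);
            }
        } catch (NotesException e) {
            e.printStackTrace();
        }
    }
}
```

Run this on a schedule and the stubs pile up exactly as described below.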
In a recent analysis I looked at a database with just 400 documents that was big, slow, and prone to crashes. On closer examination I found an agent performing this pattern on the 400 documents every 30 minutes. Do the math: 400 documents × 48 runs per day × 30 days of stub life span = 576,000 deletion stubs. Quite some baggage for just 400 entries.
So what are the alternatives?
Requirements like this are often solved with another anti-pattern: loop through a big collection and do a dbLookup (or getDocumentByKey) for each iteration. But there is an easier way. Typically there is a sorted key (customer number, part number, record id, etc.) that is available in both source and target. Create a sorted collection for each of source and target and use what I call "The Tango" or "The Wiggle". The pseudo code looks like this:
- Read the first source key and the first target key
- Do until you run out of keys
  - If the source key equals the target key: call a subroutine that compares the two records the way they need to be compared, and update and save the target only if it has changed (but not otherwise); then read the next source and target keys
  - If the source key is bigger than the target key: the target record no longer exists in the source, so get the next target key and then delete the current target document
  - If the target key is bigger than the source key: insert the source record into the target database and read the next source key
- End of the do loop
Of course your production code needs to handle the case that you run out of source keys (delete all remaining target documents) or out of target keys (insert all remaining source documents), as the sketch below does.
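Here is a minimal sketch of "The Tango" as a Domino Java agent, under a few assumptions: the target is a view sorted ascending by the key in its first column, the view name "(CustomersByNo)" and the items "CustomerNo" and "Name" are invented for the example, and an in-memory sorted array stands in for the source (in practice that would be a JDBC result set or a file reader). Calls to recycle() are omitted for brevity:

```java
import lotus.domino.*;

// "The Tango": walk a sorted source and a sorted target side by side.
// Matching records are updated only when they differ; records missing on
// either side are inserted or deleted. Nothing is deleted just to be
// recreated, so no needless deletion stubs accumulate.
public class JavaAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            Database db = session.getAgentContext().getCurrentDatabase();

            // Target: a view sorted ascending by CustomerNo (first column)
            View target = db.getView("(CustomersByNo)");
            Document doc = target.getFirstDocument();

            // Source: stand-in for a sorted RDBMS result set or flat file
            String[][] source = { { "C-0001", "Alice" }, { "C-0003", "Carol" } };
            int i = 0;

            while (i < source.length && doc != null) {
                int cmp = source[i][0].compareTo(doc.getItemValueString("CustomerNo"));
                if (cmp == 0) {
                    // Keys match: update and save only if something changed
                    if (!doc.getItemValueString("Name").equals(source[i][1])) {
                        doc.replaceItemValue("Name", source[i][1]);
                        doc.save(true, false);
                    }
                    doc = target.getNextDocument(doc);
                    i++;
                } else if (cmp > 0) {
                    // Target key missing in source: get next, then delete current
                    Document next = target.getNextDocument(doc);
                    doc.remove(true);
                    doc = next;
                } else {
                    // Source key missing in target: insert it
                    insertRecord(db, source[i]);
                    i++;
                }
            }
            // Ran out of source keys: delete all remaining target documents
            while (doc != null) {
                Document next = target.getNextDocument(doc);
                doc.remove(true);
                doc = next;
            }
            // Ran out of target keys: insert all remaining source records
            for (; i < source.length; i++) {
                insertRecord(db, source[i]);
            }
        } catch (NotesException e) {
            e.printStackTrace();
        }
    }

    private void insertRecord(Database db, String[] row) throws NotesException {
        Document doc = db.createDocument();
        doc.replaceItemValue("Form", "Customer");
        doc.replaceItemValue("CustomerNo", row[0]);
        doc.replaceItemValue("Name", row[1]);
        doc.save(true, false);
    }
}
```

One pass over two sorted collections replaces hundreds of per-document lookups, and a document is only ever touched when it actually has to change.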
This way you don't create unnecessary zombie deletion stubs.
Posted by Stephan H Wissel on 20 September 2007 | Comments (4) | categories: Show-N-Tell Thursday