One popular principle behind database-driven applications (websites, services, etc) is to never throw anything away. For example, if you’re writing a “To-do list” application, you might be tempted to run a DELETE FROM TODO WHERE ID = 123
whenever things are checked off.
However, this means that you’ve lost the data forever, so if you wanted to mine the data to gain insights, or wanted to provide features like Undo, you’re out of luck. One way to solve this problem is to always use UPDATE
to set a “deleted” flag on the tuple, but this means your deleted data and your active data are in the same table, and you have to rewrite all your SELECT
s to include the delete flag.
An alternative way is to move all the deleted tuples into a shadow1 table using TRIGGER. Here’s how:
CREATE TRIGGER todo_shadow
BEFORE DELETE ON todo
FOR EACH ROW
INSERT DELAYED INTO _todo_shadow values(OLD.col1,OLD.col2,...);
Now, every time a tuple is deleted from your “todo” table, it gets inserted first into the “_todo_shadow” table. “_todo_shadow” is a table that is identical to “todo”, but without any keys, attributes or indexes — MyISAM would work well here since we don’t plan to delete / update on this table. Note the use of DELAYED, an optional optimization2 to defer shadow inserts.
While I use MySQL as an example, you can tweak it to work with Postgres, SQLite and SQL Server as well, all of them support triggers. Note that triggers have varying impacts on sharding, indexes and transactions, so make sure you read up on table locks before you deploy this at a nuclear plant!
—
1 Shadow tables usually store all provenance, not just deletes. For full support, you can create an additional trigger for updates as well. You can even add a timestamp
attribute that defaults to NOW()
to store the time of deletion.
2 See docs for reasons why you may want to omit DELAYED.