A Lack of Context

This is what I wish source systems would tell us, and what they hardly ever do. It is best laid out as an example. Look at this data:

457821, 3 000, 2020-09-20

This alone does not tell us much, so along with this we need context, commonly in the form of column names:

CUSTOMER NUMBER, BALANCE, TIMESTAMP

Fine, this is usually all we get. Let’s shake things up a bit by introducing a second line of data. Now we have:

457821, 16 000, 2020-09-20
457821, 3 000, 2020-09-20

Confusing, but this happens. Is the timestamp not granular enough and these were actually in succession? Is one a correction of the other? Can customers have different accounts and we are missing the account number?
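To make the problem concrete, here is a minimal Python sketch that flags rows sharing an identifier and a timestamp while disagreeing on the value. The function name `find_conflicts` and the tuple layout are my own assumptions for illustration, and the balances are written without the thousands separator. Note that the code can only detect the conflict. It cannot tell which of the questions above applies.

```python
from collections import defaultdict

# The two rows from the example, as (customer_number, balance, timestamp).
rows = [
    (457821, 16000, "2020-09-20"),
    (457821, 3000, "2020-09-20"),
]

def find_conflicts(rows):
    """Return the (identifier, timestamp) pairs that carry more than one distinct value."""
    values_by_key = defaultdict(set)
    for identifier, value, timestamp in rows:
        values_by_key[(identifier, timestamp)].add(value)
    # Keep only the keys where the source disagrees with itself.
    return {
        key: sorted(values)
        for key, values in values_by_key.items()
        if len(values) > 1
    }

conflicts = find_conflicts(rows)
print(conflicts)
```

Whether the two balances are successive updates, a correction, or two different accounts is exactly the context the source did not send.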

Even if you can get all that sorted out, we can shake it up further. Put this in a different context:

PATIENT NUMBER, RADIATION DOSE, TIMESTAMP

Now I feel the need to know more. Were these measurements made by different people, and how certain are they? What is the margin of error? If they were made in succession, what were their durations? If only one of them is correct, which one is it?
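One way to picture what is missing is to write down a record that has a slot for each of those questions. The Python sketch below is only an illustration under my own assumptions: the class `Measurement`, its optional field names, and the helper `missing_context` are hypothetical and not taken from any source system or modeling tool.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Measurement:
    # What the source actually sends.
    patient_number: int
    radiation_dose: float  # the unit is not stated in the example data
    timestamp: str
    # Hypothetical context fields, one per question asked above.
    measured_by: Optional[str] = None
    margin_of_error: Optional[float] = None
    duration_seconds: Optional[float] = None
    certainty: Optional[float] = None

CONTEXT_FIELDS = ("measured_by", "margin_of_error", "duration_seconds", "certainty")

def missing_context(measurement):
    """List the context fields the source left unanswered."""
    return [name for name in CONTEXT_FIELDS if getattr(measurement, name) is None]

# One of the example rows, carrying nothing beyond its three columns.
row = Measurement(457821, 16000.0, "2020-09-20")
print(missing_context(row))
```

With only the three columns, every context field comes back as missing, which is the situation described here.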

More sources should communicate data as if it were a matter of life and death. This is what Transitional modeling is all about.

Published by

Lars Rönnbäck

Co-developer of the Anchor Modeling technique. Programmer of the online modeling tool. Site maintainer. Presenter and trainer.
