Documents (storing and tracing provenance on blockchain)

Storing documents on-chain is currently undervalued. In our Web3 and AI-driven future, humans and automated agents will use DLT systems to publish, sign and exchange documents. In this article we cover the specifics of storing documents and tracking their provenance using blockchain systems.

This is an example of the Document upload functionality in Ethora (old design circa 2022, currently being replaced with a new one):

[Screenshot: Documents upload in the Ethora platform]

User journey and core features when creating a Document on chain

Core features:

1. The User uploads a file (image, PDF, Word document, etc.).

2. The file is stored by the DP / Ethora API using IPFS or Minio as the underlying storage. A special “Document” wrapper entity is created using our Solidity smart contract (a special “Documents” token type), along with a DB caching record.

3. Depending on the App-wide and User’s Privacy & Data settings, the Document will either be publicly available via the User’s profile, or be visible only to the User and anyone they explicitly share it with via the Sharing Links functionality.

4. Each Document is automatically equipped with an Ethereum blockchain token, which means its Provenance (history) is tracked using the immutable ledger. All transactions, such as issuing the document, signing it, sharing it or updating it, leave a blockchain transaction trace.
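
The four steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Document` class, `create_document` and `share` are hypothetical names, not the Ethora API or its Solidity contract, and sharing is simplified to an explicit list of recipients rather than Sharing Links.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Document:
    """Hypothetical wrapper record; not the actual Ethora data model."""
    owner: str                 # wallet address of the uploader
    content_hash: str          # SHA-256 of the file bytes, hex encoded
    storage_uri: str           # where the file itself lives (IPFS, Minio, ...)
    public: bool = False       # outcome of the app-wide and user privacy settings
    shared_with: set = field(default_factory=set)
    provenance: list = field(default_factory=list)

    def record(self, action: str, actor: str) -> None:
        # On a real chain each entry would be a signed transaction.
        timestamp = datetime.now(timezone.utc).isoformat()
        self.provenance.append((timestamp, action, actor))

    def can_view(self, viewer: str) -> bool:
        return self.public or viewer == self.owner or viewer in self.shared_with


def create_document(owner: str, file_bytes: bytes, storage_uri: str,
                    public: bool = False) -> Document:
    """Steps 1, 2 and 4: hash the upload, wrap it, log the issuance."""
    content_hash = hashlib.sha256(file_bytes).hexdigest()
    doc = Document(owner, content_hash, storage_uri, public)
    doc.record("issued", owner)
    return doc


def share(doc: Document, actor: str, recipient: str) -> None:
    """Step 3: only the owner may grant access; the action is logged."""
    if actor != doc.owner:
        raise PermissionError("only the owner can share a private document")
    doc.shared_with.add(recipient)
    doc.record("shared", actor)
```

The point of the model is that the file itself stays in ordinary storage, while the wrapper keeps its hash and an append-only history of actions.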

Importance and security of using the immutable ledger for Documents storage

This is important for audit and proof purposes. It is practically impossible to forge or backdate the creation of a document, or a certain action on it, without being detected. Because of the way cryptographic hashing and Merkle Tree mechanisms work, even a person or organization with access to all blockchain nodes would not be able to, for example, substitute the document in question with another file and persuade others that the new version had been there from the beginning.

The reason is that each block relies on cryptographic computations involving one-way hash functions: the hash (‘checksum’) of the previous block of transactions is used to start each new block of transactions.

These computations run continuously. Within each block, the transactions are summarized in a ‘Merkle Tree’ of hashes, and the blocks themselves form a chain in which each block depends on its predecessor. The state of the whole blockchain network therefore relies on EACH of the past blocks, and the past information in the blockchain is IMMUTABLE (it can’t be changed without alerting the other participants).
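
To make the Merkle Tree idea concrete, here is a minimal sketch of computing a Merkle root over a list of document payloads. It assumes SHA-256 and duplicates the last node on odd-sized levels (a Bitcoin-style convention); Ethereum itself uses Merkle Patricia tries with Keccak-256, so treat this as an illustration of the principle rather than Ethereum's exact structure. The function names are made up for the example.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash the leaves pairwise, level by level, until one root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last node with itself (assumption).
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    documents = [b"contract.pdf bytes", b"certificate.png bytes", b"invoice.docx bytes"]
    print(merkle_root(documents).hex())
```

Changing a single byte in any one leaf changes the root, so one short root hash commits to every document in the set.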

What happens if someone physically changes a Document or its blockchain record?

This doesn’t mean a perpetrator cannot physically change some record or file hash in the blockchain. Just as with conventional databases and file systems, whoever has physical access to the data can physically change or destroy it. What is different and more secure about the blockchain mechanism, however, is that:

(1) such a change is much more difficult to effect, since a blockchain network has multiple nodes which work and verify information independently. Changing information in one node will not break the system, as the system will self-heal via its Consensus mechanism using the trusted information from the remaining nodes.

(2) if someone were able to physically modify a certain historical block in ALL the blockchain nodes, the system would stop validating, which would be noticed immediately. This happens because, for example, replacing a certain Document modifies the cryptographic hash of the file and therefore of the historical block. As a result, all blocks since the intervention fail the standard verification that the blockchain network carries out regularly. The hashes no longer compute, hence all blocks since the intervention are rejected by the network, and the problem is noticed by its users and administrators.
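
The effect described in (2) can be reproduced with a toy hash chain. This is a simplified sketch, not a real blockchain client: the block layout and the names `build_chain` and `first_invalid_block` are assumptions made for the example, and SHA-256 stands in for whatever hash a given network uses.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64


def block_hash(index, prev_hash, doc_hashes):
    """Hash of a block's content, including the previous block's hash."""
    payload = json.dumps({"index": index, "prev": prev_hash, "docs": doc_hashes},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def build_chain(batches):
    """Build one block per batch of document payloads (bytes)."""
    chain, prev = [], GENESIS_PREV
    for index, docs in enumerate(batches):
        doc_hashes = [hashlib.sha256(d).hexdigest() for d in docs]
        current = block_hash(index, prev, doc_hashes)
        chain.append({"index": index, "prev": prev, "docs": doc_hashes, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS_PREV
    for block in chain:
        recomputed = block_hash(block["index"], block["prev"], block["docs"])
        if block["prev"] != prev or recomputed != block["hash"]:
            return block["index"]
        prev = block["hash"]
    return None


if __name__ == "__main__":
    chain = build_chain([[b"doc A"], [b"doc B"], [b"doc C"]])
    chain[1]["docs"][0] = hashlib.sha256(b"forged doc").hexdigest()
    print(first_invalid_block(chain))
```

Even if the attacker also recomputes the hash of the modified block, the next block still points at the old hash, so verification fails one block later instead. Hiding the change would require rewriting every subsequent block on every node.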

Use cases for storing Documents on chain

This makes our Documents mechanism invaluable for use cases such as:

* Anti-tampering protection – the Document has its hash stored and verified on the blockchain, which means nobody can backdate changes to a document, certificate or legal agreement without this being noticed.

* Proof of Authorship – the very same mechanism can be used to prove that a creator produced a certain text, piece of music or artwork on a certain date. All that’s needed is to upload your creation using our Documents or NFT/Items mechanism. This creates an immutable record which can later be used to prove that you possessed this asset on a certain date.

* Certificates – for example, educational certificates, test passes, or health certificates such as a vaccination certificate. Having an immutable Provenance mechanism means anyone can track the history of the document and verify when it was created and by whom.

* Track & Trace (Supply chain documents) – the same mechanism could be used for Supply Chain documents. For example, each batch, or even an individual item such as a tomato on a shelf, could be supplied with a QR code label. When scanned, it reveals the Document or a batch of Documents confirming the provenance of the item. This can be as detailed as necessary, including photographs of it being planted and grown, or any logistics documents such as shipment certificates, bills of lading, etc.

* Self-reporting such as CO2 emissions – we have participated in an EU project where we implemented a similar mechanism for tracing CO2 emissions on chain. An important aspect to note here is that a business can decide which data goes on chain and which remains off-chain. For example, the confidential primary data can remain off-chain while the results of calculations are posted on-chain. The authenticity of the data and the calculation can still be proven, as an auditor can compare the cryptographic hash of the on-chain document with the checksum of the off-chain data.
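
The auditor's check from the last bullet, which is also the basis of the anti-tampering use case, comes down to recomputing a hash and comparing it with the one recorded on chain. A minimal sketch, assuming SHA-256 and a hex-encoded on-chain value; how the on-chain hash is fetched is out of scope, and the function names are illustrative.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hex-encoded SHA-256 of the off-chain data."""
    return hashlib.sha256(data).hexdigest()


def matches_on_chain(off_chain_data: bytes, on_chain_hash_hex: str) -> bool:
    """True if the off-chain data still hashes to the value recorded on chain."""
    return file_digest(off_chain_data) == on_chain_hash_hex.strip().lower()


if __name__ == "__main__":
    recorded = file_digest(b"emissions report, Q3")   # value stored at upload time
    print(matches_on_chain(b"emissions report, Q3", recorded))
    print(matches_on_chain(b"emissions report, Q3 (edited)", recorded))
```

The confidential data never has to leave the off-chain system: the auditor only needs read access to it and to the public hash.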

The same protection mechanisms of course apply to other types of assets or tokens on blockchain. Assuming they are implemented correctly (this is why we have ERC standards and smart contract audits), things such as cryptocurrencies and on-chain collectibles (NFTs) benefit from the same built-in protections, which helps explain their growing popularity. The on-chain creation and tracing of documents, however, is not yet popular at the time of writing (December 2023). This suggests that the potential and possibilities here are not yet understood or appreciated by businesses. This is one of the reasons we have built the Dappros/Ethora platform, which significantly lowers the barrier to entry and makes the creation of an on-chain Document as simple as a drag-and-drop upload in conventional non-DLT systems.

On-chain Documents signing and ‘multi-sig’ feature

Another very useful capability of blockchain systems is cryptographic signing, which lies at the core of blockchain networks. In conventional systems a “user” doesn’t mean anything specific and depends heavily on 3rd party systems that handle account creation and authentication. In blockchain systems, by contrast, all users and smart contracts have a universal and unified way to interact with others on the system. Each ‘account’ or ‘wallet’ is equipped with a ‘crypto pair’: a publicly known address and a secret key linked to that address. This makes it easy and straightforward for users and decentralized applications to sign transactions with their cryptographic pair, and for the other entities in the system to verify and trace such signatures.
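
For readers who want to see what signing with a key pair involves, below is a textbook ECDSA sketch over secp256k1, the curve used by Ethereum. It is for illustration only and makes several simplifying assumptions: it hashes with SHA-256 so that it runs on the standard library alone (Ethereum uses Keccak-256), it derives the nonce in an ad hoc deterministic way rather than per RFC 6979, it does not derive an Ethereum address from the public key, and it is not constant-time. Real applications should use an audited wallet or library.

```python
import hashlib

# secp256k1 domain parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Double-and-add multiplication of a point by a non-negative integer."""
    result = None
    while k > 0:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def digest(message: bytes) -> int:
    # SHA-256 is an assumption of this sketch; Ethereum hashes with Keccak-256.
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def public_key(private_key: int):
    return scalar_mult(private_key, G)


def sign(private_key: int, message: bytes):
    z = digest(message)
    # Ad hoc deterministic nonce in [1, N-1]; a simplification, not RFC 6979.
    seed = private_key.to_bytes(32, "big") + message
    k = int.from_bytes(hashlib.sha256(seed).digest(), "big") % (N - 1) + 1
    r = scalar_mult(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * private_key) % N
    if r == 0 or s == 0:
        raise ValueError("degenerate signature; use a different nonce")
    return r, s


def verify(pub, message: bytes, signature) -> bool:
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    w = pow(s, -1, N)
    point = point_add(scalar_mult(digest(message) * w % N, G),
                      scalar_mult(r * w % N, pub))
    return point is not None and point[0] % N == r
```

Anyone holding only the public key can call `verify`, while producing a valid signature requires the secret key. That asymmetry is what lets other participants trace an action back to a specific account. The sketch needs Python 3.8 or later for modular inverses via `pow(x, -1, m)`.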

Why is this important for Documents? The act of signing is often essential for documents such as legal agreements, educational or vaccination certificates, etc. We should note that blockchain treats signing a bit differently: it is an intrinsic part of all transactions in blockchain (aka DLT or Web3) systems. By merely creating a transaction (or a Document), a user or a decentralized application already signs it. Pretty much every action or transaction on blockchain is cryptographically signed by someone. Everything is automatically verified and filed on the immutable ledger.

For our documents use case this means, first of all, that the creation of the Document can be traced to the specific person or entity who created it. The act of creating the Document has been recorded and signed with their crypto signature automatically. Similarly, other transactions, such as transferring the Document ownership, will be recorded and can be transparently reviewed by anyone who has access to the blockchain records.

Q: Can the conventional “contract” signing be implemented for the Documents? What is a multi-signature mechanism?

A: As explained, each Document is already signed by whoever creates it, and not every Document is a legal agreement that requires signing by multiple parties. If necessary, however, it is easy to enable a signing mechanism that allows multiple parties to sign a Document. Smart contracts allow multiple possibilities here. One of them is the multi-signature mechanism: for example, you can require that the Document is signed by multiple users before it shows as ‘signed’ or ‘approved’.

Alternatively, you may distinguish between the ‘issuance’ and the ‘signing’ or ‘certifying’ of the document. These could be shown as separate actions in the Document’s provenance. For example, an educational certificate could be issued by an authority, or it could be self-issued by the person by merely scanning it and posting it on chain. In the latter case, you could allow the user to display it in their profile as a self-issued Document, while your application interface would only show it as “Approved” or “Certified” once a recognised authority certifies it, which happens as a separate blockchain signing transaction. Other users of the system will be able to clearly see the history of the Document, including when and by whom it was issued, when and by whom it was certified or co-signed, and so on.
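
The two options in this answer, multi-signature approval and a separate certification step, can be modelled in a few lines. This is a plain Python model of the logic only, under the assumption that signer and authority lists are fixed at issuance; `SignableDocument` and its fields are hypothetical names, not the Ethora Documents contract, and an on-chain version would verify each caller's cryptographic signature rather than trust a string.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class SignableDocument:
    issuer: str
    required_signers: FrozenSet[str] = frozenset()
    recognised_authorities: FrozenSet[str] = frozenset()
    provenance: List[Tuple[str, str]] = field(default_factory=list)

    def __post_init__(self):
        # Issuance is the first provenance entry, attributed to the issuer.
        self.provenance.append(("issued", self.issuer))

    def _actors(self, action: str) -> set:
        return {actor for act, actor in self.provenance if act == action}

    def sign(self, signer: str) -> None:
        if signer not in self.required_signers:
            raise PermissionError(f"{signer} is not an expected signer")
        if signer in self._actors("signed"):
            raise ValueError(f"{signer} has already signed")
        self.provenance.append(("signed", signer))

    def certify(self, authority: str) -> None:
        if authority not in self.recognised_authorities:
            raise PermissionError(f"{authority} is not a recognised authority")
        self.provenance.append(("certified", authority))

    @property
    def fully_signed(self) -> bool:
        # 'Signed' only once every required party has signed.
        return bool(self.required_signers) and \
            self._actors("signed") >= set(self.required_signers)

    @property
    def certified(self) -> bool:
        return bool(self._actors("certified"))
```

An application interface could then show a self-issued certificate immediately, and switch its badge to “Certified” only when `certified` becomes true, with the `provenance` list supplying the who-and-when history.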

Q: Should all legal contracts on-chain be “smart contracts”?

A: Smart contracts are quite different from conventional ‘legal prose’ contracts. You can simply upload a signed or unsigned legal agreement on-chain, creating an on-chain Document with it, which will prove its existence and your possession of it as explained earlier. This way, your legal contract is already equipped with a smart contract automatically. However, the main purpose of that smart contract is just to store and trace the history of your contract on-chain. It does not otherwise trigger actions or add functionality.

In the web3 / blockchain world, smart contracts are normally a sort of decentralized software application which implements certain business logic. If you prefer to have contracts signed on chain as part of your business logic / application mechanics, that can be easily done via the signing or multi-sig mechanism explained above. In your application interface you can display the document as signed and display the provenance of all signatures (signed when and by whom), linking to the profiles of the users or organizations who have signed it.

You can go further than that and have your Document perform a certain action or unlock certain mechanics once signed. In blockchain applications, smart contracts are often able to receive and store funds, and release those funds based on certain conditions. An example could be an escrow contract or a trust which releases funds, or portions of funds, once it receives signed requests from the trustees. Another example could be a revenue distribution smart contract. Those functionalities, however, need to be programmed first. Just as you write your legal contract in human language, you need to write your smart contract in a programming language such as Solidity. The main difference is that you have to rely on 3rd parties to enforce your legal contract, while the smart contract enforces everything itself as long as it has access to the funds or assets on-chain.

To conclude, by uploading your document or contract to the Dappros/Ethora platform, it is automatically wrapped in a smart contract; however, our standard Documents contract has limited functionality. It is designed mainly to store, track and display the document on chain, including some standard transfer and signing functionalities. In case you need to implement specific business logic, you will need to have a specific smart contract programmed, which could be based on our Documents smart contract or developed from scratch.
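
As an illustration of the kind of custom business logic mentioned in this answer, here is a small model of the escrow example: funds are released only after enough trustees approve the same request. It is a Python sketch of the rules, not a Solidity contract; the class name, the threshold rule and the way requests are keyed are all assumptions made for the example.

```python
class Escrow:
    """Toy escrow: pays out once `threshold` distinct trustees approve a request."""

    def __init__(self, trustees, threshold: int, balance: int):
        self.trustees = frozenset(trustees)
        if not 1 <= threshold <= len(self.trustees):
            raise ValueError("threshold must be between 1 and the number of trustees")
        self.threshold = threshold
        self.balance = balance
        self.approvals = {}   # (beneficiary, amount) -> set of approving trustees
        self.payouts = []     # completed releases, in order

    def approve_release(self, trustee: str, beneficiary: str, amount: int) -> bool:
        """Record an approval; return True if this approval triggered the payout."""
        if trustee not in self.trustees:
            raise PermissionError(f"{trustee} is not a trustee")
        if amount <= 0 or amount > self.balance:
            raise ValueError("amount must be positive and covered by the balance")
        request = (beneficiary, amount)
        voters = self.approvals.setdefault(request, set())
        voters.add(trustee)   # a repeated approval by the same trustee counts once
        if len(voters) < self.threshold:
            return False
        # Enough approvals: the contract enforces the release itself.
        self.balance -= amount
        self.payouts.append(request)
        del self.approvals[request]
        return True
```

In a real smart contract each approval would be a signed transaction, so the rule “release only when the trustees agree” is enforced by the code holding the funds rather than by a third party.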

I hope this has given you a better understanding of how you or your business may benefit from storing and tracing documents on the blockchain. You can use the open-source Ethora platform to easily build your own branded application utilizing these functionalities.

Taras Filatov, 6th December 2023.