In today’s blog post we will take a closer look at one of the core concepts of IOTA and any crypto currency, namely public key cryptography and digital signatures.
In IOTA, in order to transfer tokens from one address to another, you create a transaction. A transaction contains information about the payer, the receiver, and how many tokens should be transferred. Of course, it is crucial that not just anyone can create such a transaction, but only the owner of the tokens. In the legal world this is done by adding a signature under a contract. In the digital world we have digital signatures exactly for that purpose. However, unlike their pen and paper counterparts, they are even stronger as they also guarantee that not a single letter can been changed in your contract after it has been signed.
In the last 50 years, cryptography researchers around the world have derived many sophisticated mathematical theories that can be used for digital signatures. They all guarantee unforgeable messages, but probably the easiest to understand are hash-based signatures, i.e. signatures that are based on hash functions.
A hash function is a useful tool in cryptography, which takes an input and produces a seemingly unrelated output. If the hash function is good, then it should be very hard to reverse the process. This means that if I only know the output, I cannot guess the original input in any reasonable amount of time. This is why hash functions are referred to as one-way functions.
That is all we need to build the secure digital signatures implemented in IOTA, using the Winternitz one-time signature scheme.
Any IOTA transaction is represented as a sequence of trytes. While in the human world numbers are represented by the digits 0–9, for IOTA in the trinary world they consist of trytes with values between 0 and 26.
For brevity, we will call the hash function f and assume that the message to be signed consists of exactly two trytes: m1 and m2.
Let us say that Alice created a transaction transferring 1i to her friend Bob, and she now wants to prove that this was indeed issued by her. Alice first generates two random numbers priv1 and priv2 and keeps those secret as her private key. She then hashes both numbers exactly 27 times with the hash function f to get f²⁷(priv1) = pub1 and f²⁷(priv2) = pub2.
Figure 1: Starting with the private key generate the public key with enough intermediate steps for each tryte value.
The two resulting numbers pub1 and pub2 form Alice’s public key which she then shares with everyone as her IOTA address. Since f is a one-way function, it is not possible to determine the original priv1 and priv2 based on the public key.
To sign a message with that key, i.e. sign a transaction from that IOTA address, Alice takes the value of the first tryte m1 in the message and hashes her private key exactly that many times to get sign1. So, zero times if the value of m1 is 0, once if m1 has value 1, and so forth. She does exactly the same thing for m2 and adds sign1 and sign2 as the signature to her transaction.
Figure 2: Diagram of how the signature of the example message (26,2) is derived from the private key.
After this, everyone knowing pub1 and pub2 can verify this signature. All they need to do is apply the hash function f to sign1 and sign2 the remaining times to reach 27 (so 27 – m1 and 27 – m2 times). If the result matches pub1 and pub2, then the signature is correct.
Figure 3: The signature from the example in Figure 2 is verified by hashing sign1 once (27 – 26) and sign2 25 times (27 – 2) and comparing the results with the published public key.
Unfortunately, the approach as described so far is still not secure:
As soon as Alice signed the message (26,2) an attacker can use that information to create a valid signature for example for the message (27,3) just by hashing the published sign1 and sign2 of the original message each once more resulting in f(f²⁶(priv1))=f²⁷(priv1) and f(f²(priv2))=f³(priv2) and thus the correct signature for the new message.
This would be disastrous, as an attacker could, for example, modify the receiving address in any signed transaction. However, there is a simple mathematical trick to prevent this: We always assure — by modifying parts of the message — that the message is in some sense normalized: So, if m1 is larger than the middle value of a tryte, i.e. 13, then m2 must be lower than 13. Normalized means that the sum of both message trytes is always 26. For example (24,2) is a valid normalized message and any attacker that wants to increment the first tryte to 25 would also need to decrement the second tryte to 1. It is impossible to find the correct signature part as this would require reversing the last application of f.
But even then, this method can only be used to sign one message per private key. It is a so-called one time signature scheme (OTS). The most obvious case, to emphasize why signing more than once becomes insecure, is the worst-case scenario in which the following two messages have been signed:
Although both messages are normalized, the first message reveals priv1 and the second reveals priv2, which would of course allow an attacker to correctly sign any third message.
Unfortunately, there is no really easy solution and this is exactly the reason why IOTA addresses must not be debited more than once!
Hash-based signatures like the ones used in IOTA rely on the security of the underlying hash function and — unlike signatures based on other assumptions — they’re mostly viewed as resistant to quantum attacks like Shor’s algorithm. Their main drawbacks are that they result in rather long signatures and that they are one-time signature schemes.
IOTA’s motivation in going for these signatures, despite their drawbacks, is the seriousness with which we are taking the quantum computer risk. However, we are considering several ways around the OTS limitation, such as the recent proposal for reusable addresses.
As always, your comments and questions are very welcome, either here or in our Discord channel #tangle-math.