Structure of a simple legacy transaction

The following example is a bitcoin transaction with one input and two outputs that uses p2pkh scripts for unlocking and locking. It reflects the common case of a transaction that spends a single input, pays to a receiving address, and returns the change to an address in the sender's wallet. The fee is one sat per byte of transaction data, the minimum that nodes relay by default.

Raw data

01000000011078e502b9ff083e1b9443ac4be5e94b38a584c6a84f12e4480f85a51701524c000000006b483045022100e2dff5cedd7924e0b8dce70147b6d4081edca367ae4038adebca41edaa8864240220206bf64423dcc39a6075196c418d267fcd92f36d00cdd5606bb3a299ddbf2a2f012102899aa3f9ddbae77c501bf48851a1df8d5d45574a812a866931b3e1f284c8b4b9ffffffff0288130000000000001976a914bc9f9a524fb5ed857ea6062fb2d9ce07ef40538288ac960a0000000000001976a9144a18cad61146796a9602f4db532f39f157eb05a588ac00000000

Components of the transaction

The numbers in the transaction data are encoded in little-endian byte order, either as unsigned fixed-width integers (UInt32 for most fields, 8-byte integers for output amounts) or as variable-width integers (VarInt).
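As a minimal sketch of these encodings, the following Python snippet decodes fixed-width and variable-width little-endian integers, using byte values taken from the example transaction. The helper name `read_varint` is hypothetical and not part of any library.

```python
from typing import Tuple


def read_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode the VarInt at `offset`; return (value, offset after it)."""
    prefix = data[offset]
    if prefix < 0xFD:
        # Values below 0xfd are stored directly in a single byte.
        return prefix, offset + 1
    # 0xfd, 0xfe and 0xff announce a 2-, 4- or 8-byte integer.
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix]
    start = offset + 1
    return int.from_bytes(data[start:start + width], "little"), start + width


# Fixed-width fields: version (4 bytes) and first output amount (8 bytes).
version = int.from_bytes(bytes.fromhex("01000000"), "little")
amount = int.from_bytes(bytes.fromhex("8813000000000000"), "little")

# Variable-width field: the length of the unlocking script.
script_length, next_offset = read_varint(bytes.fromhex("6b"))

print(version, amount, script_length)  # 1 5000 107
```

A value of 300 would not fit in one byte and would be written as `fd2c01`: the prefix `fd` followed by two little-endian bytes.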

Version

01000000

The version number (4 bytes). It is 1 for most transactions.

Number of inputs

      01

The number of inputs encoded as a VarInt (up to 9 bytes).

Inputs

107..fff

In case of more than one input this substructure would repeat.

Number of outputs

      02

The number of outputs is encoded as a VarInt (up to 9 bytes).

Outputs

881..8ac

Since the transaction contains two outputs this substructure repeats.

Locktime

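00000000

The earliest point at which the transaction may be included in a block (4 bytes). Values below 500,000,000 are interpreted as a block number, all other values as a Unix timestamp. A value of 0 means that the transaction is not locked.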
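The components listed above can be read off the raw data one after another. The following Python sketch parses the example transaction under the assumption that the data is in the legacy (non-SegWit) format. The names `Reader` and `parse_legacy_tx` and the dictionary keys are hypothetical and chosen for illustration only.

```python
RAW_HEX = (
    "01000000" "01"
    "1078e502b9ff083e1b9443ac4be5e94b38a584c6a84f12e4480f85a51701524c"
    "00000000" "6b"
    "483045022100e2dff5cedd7924e0b8dce70147b6d4081edca367ae4038adebca41edaa886424"
    "0220206bf64423dcc39a6075196c418d267fcd92f36d00cdd5606bb3a299ddbf2a2f01"
    "2102899aa3f9ddbae77c501bf48851a1df8d5d45574a812a866931b3e1f284c8b4b9"
    "ffffffff" "02"
    "8813000000000000" "1976a914bc9f9a524fb5ed857ea6062fb2d9ce07ef40538288ac"
    "960a000000000000" "1976a9144a18cad61146796a9602f4db532f39f157eb05a588ac"
    "00000000"
)


class Reader:
    """Cursor over the raw transaction bytes."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("unexpected end of transaction data")
        self.pos += n
        return chunk

    def uint(self, n: int) -> int:
        # Fixed-width little-endian integer of n bytes.
        return int.from_bytes(self.take(n), "little")

    def varint(self) -> int:
        prefix = self.uint(1)
        if prefix < 0xFD:
            return prefix
        return self.uint({0xFD: 2, 0xFE: 4, 0xFF: 8}[prefix])


def parse_legacy_tx(raw: bytes) -> dict:
    r = Reader(raw)
    tx = {"version": r.uint(4), "inputs": [], "outputs": []}
    for _ in range(r.varint()):          # number of inputs
        tx["inputs"].append({
            # Reversed here to match the order shown by block explorers.
            "prev_txid": r.take(32)[::-1].hex(),
            "prev_index": r.uint(4),
            "script_sig": r.take(r.varint()).hex(),
            "sequence": r.uint(4),
        })
    for _ in range(r.varint()):          # number of outputs
        tx["outputs"].append({
            "amount": r.uint(8),         # in sats
            "script_pubkey": r.take(r.varint()).hex(),
        })
    tx["locktime"] = r.uint(4)
    if r.pos != len(raw):
        raise ValueError("trailing bytes after locktime")
    return tx


tx = parse_legacy_tx(bytes.fromhex(RAW_HEX))
print([out["amount"] for out in tx["outputs"]])  # [5000, 2710]
```

The sketch deliberately rejects trailing bytes, so a SegWit transaction or a truncated copy of the hex string fails loudly instead of producing misleading fields.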

Definitions

Unit of account

   1 sat

The unit of account in bitcoin transactions is one Satoshi (1 sat). Every input and output amount is a multiple of this base unit. By definition one bitcoin is equivalent to 100,000,000 sats. Whether wallet software denominates amounts in bitcoin or in sats is a user interface decision. The sum of all spendable outputs will never surpass 21 million bitcoin, i.e. 2,100 trillion sats.

Fee

per byte

Under the default relay policy of the network, a transaction has to pay a mining fee of at least 1 sat per byte of transaction data in order to be forwarded. It is up to the user to choose a specific fee above this minimum to incentivize miners to include the transaction in a future block. The fee is not an explicit part of the transaction data. It is calculated during creation and validation of the transaction by subtracting the sum of all output amounts from the sum of all input amounts. The remainder that is neither spent by the user nor returned to the wallet constitutes the fee that the miner is allowed to collect. Calculating the fee therefore requires knowledge of the UTXOs that are spent by the transaction.
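The subtraction described above can be sketched in a few lines of Python. The output amounts (5000 and 2710 sats) and the size (226 bytes) are read from the example transaction. The input amount of 7936 sats is an assumption: it is not contained in the raw data and was chosen so that the result matches the stated rate of one sat per byte. The function name `fee_and_rate` is hypothetical.

```python
from typing import Iterable, Tuple


def fee_and_rate(input_amounts: Iterable[int],
                 output_amounts: Iterable[int],
                 tx_size_bytes: int) -> Tuple[int, float]:
    """Return (fee in sats, fee rate in sats per byte)."""
    fee = sum(input_amounts) - sum(output_amounts)
    if fee < 0:
        # Outputs may never exceed inputs in a valid transaction.
        raise ValueError("outputs exceed inputs")
    return fee, fee / tx_size_bytes


# 7936 sats is an ASSUMED value for the spent UTXO (not in the raw data).
fee, rate = fee_and_rate([7936], [5000, 2710], 226)
print(fee, rate)  # 226 1.0
```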

TransactionID

  [Byte]

The transactionID is not a randomly generated number but the double SHA-256 hash of the raw data of the transaction, i.e. SHA-256 applied twice. This hash is treated as a byte array and not as an integer, so endianness is not really of concern in this context. The transactionID is needed to compare or find a transaction and has to be computed for every transaction since it is not one of the fields of the raw data. To uniquely identify a transaction output it is necessary to specify a transactionID and an output index.

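It is important to note that the raw transaction data stores a transactionID in the byte order produced by the hash function, while block explorers and most other tools display and expect it in reversed byte order. For example, the input above references the bytes 1078e5..01524c, which a block explorer shows as 4c5201..e57810.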
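As a hedged sketch of how the transactionID of the example could be derived, the following Python snippet hashes the raw data with SHA-256 twice and reverses the bytes to obtain the form that block explorers display. The function name `transaction_id` is hypothetical. The resulting value is intentionally not printed as an expected result here, since it should be checked against an independent source.

```python
import hashlib

RAW_HEX = (
    "01000000" "01"
    "1078e502b9ff083e1b9443ac4be5e94b38a584c6a84f12e4480f85a51701524c"
    "00000000" "6b"
    "483045022100e2dff5cedd7924e0b8dce70147b6d4081edca367ae4038adebca41edaa886424"
    "0220206bf64423dcc39a6075196c418d267fcd92f36d00cdd5606bb3a299ddbf2a2f01"
    "2102899aa3f9ddbae77c501bf48851a1df8d5d45574a812a866931b3e1f284c8b4b9"
    "ffffffff" "02"
    "8813000000000000" "1976a914bc9f9a524fb5ed857ea6062fb2d9ce07ef40538288ac"
    "960a000000000000" "1976a9144a18cad61146796a9602f4db532f39f157eb05a588ac"
    "00000000"
)


def transaction_id(raw: bytes) -> str:
    """Double SHA-256 of the raw data, in the byte order explorers display."""
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    # The digest itself is what appears inside raw transaction data;
    # reversing it gives the conventional display form.
    return digest[::-1].hex()


raw_tx = bytes.fromhex(RAW_HEX)
txid = transaction_id(raw_tx)
print(txid)
```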

Unspent transaction output

    UTXO

The tuple (transactionID, outputIndex) that is part of the input of the transaction is a unique identifier for a fixed amount of bitcoin locked in an output of another transaction. This other transaction has to be already included in a block or be part of the pool of transactions waiting for confirmation. Otherwise the existence of this specific output cannot be validated. An available output that has not yet been spent by a confirmed transaction is called an unspent transaction output. All bitcoin full nodes keep track of the set of all UTXOs for the purpose of transaction validation to ensure that no double-spending can occur.
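The idea of a UTXO set can be illustrated with a toy data structure keyed by (transactionID, outputIndex). This is only a sketch of the concept and not how a full node stores its data. The class name `UtxoSet` is hypothetical, and the amount of 7936 sats for the referenced output is an assumption, since the raw data of the example does not contain it.

```python
from typing import Dict, Tuple

OutPoint = Tuple[str, int]  # (transactionID, outputIndex)


class UtxoSet:
    """Toy in-memory set of unspent transaction outputs."""

    def __init__(self) -> None:
        self._unspent: Dict[OutPoint, int] = {}

    def add(self, txid: str, index: int, amount: int) -> None:
        self._unspent[(txid, index)] = amount

    def spend(self, txid: str, index: int) -> int:
        """Remove the output and return its amount; fail if it is not available."""
        try:
            return self._unspent.pop((txid, index))
        except KeyError:
            raise ValueError("output unknown or already spent") from None


# The output referenced by the input of the example transaction.
PREV_TXID = "4c520117a5850f48e4124fa8c684a5384be9e54bac43941b3e08ffb902e57810"

utxos = UtxoSet()
utxos.add(PREV_TXID, 0, 7936)          # ASSUMED amount in sats

spent_amount = utxos.spend(PREV_TXID, 0)
try:
    utxos.spend(PREV_TXID, 0)          # second attempt: a double-spend
    double_spend_rejected = False
except ValueError:
    double_spend_rejected = True

print(spent_amount, double_spend_rejected)  # 7936 True
```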

Pay-to-pubkey-hash

   p2pkh

The most basic lock script to secure bitcoin is a pay-to-pubkey script (p2pk) that can be unlocked by presenting the correct signature corresponding to the public key. The pay-to-pubkey-hash script introduces an additional layer of security by replacing the public key with its double hash (SHA-256 followed by RIPEMD-160). That way the public key is not part of the lock script and is published for the first time when the script is unlocked. This is one of the reasons why address reuse should be avoided.

The resulting 160-bit (20-byte) hash from the hashing operation is encoded using base58 for better legibility. This special representation is called a bitcoin (legacy) address.
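In the example transaction both outputs carry a lock script of this kind. As a minimal sketch, the following Python snippet recognizes the usual p2pkh byte pattern (OP_DUP, OP_HASH160, a 20-byte push, OP_EQUALVERIFY, OP_CHECKSIG) and extracts the public key hash from the first output. The function name `extract_pubkey_hash` is hypothetical.

```python
from typing import Optional


def extract_pubkey_hash(script: bytes) -> Optional[bytes]:
    """Return the 20-byte hash if `script` is a p2pkh lock script, else None."""
    is_p2pkh = (
        len(script) == 25
        and script[:3] == b"\x76\xa9\x14"   # OP_DUP OP_HASH160 push 20 bytes
        and script[23:] == b"\x88\xac"      # OP_EQUALVERIFY OP_CHECKSIG
    )
    return script[3:23] if is_p2pkh else None


# Lock script of the first output of the example transaction.
lock_script = bytes.fromhex("76a914bc9f9a524fb5ed857ea6062fb2d9ce07ef40538288ac")
pubkey_hash = extract_pubkey_hash(lock_script)

print(pubkey_hash.hex())     # bc9f9a524fb5ed857ea6062fb2d9ce07ef405382
print(len(pubkey_hash) * 8)  # 160
```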

Address encoding

  base58

The (legacy) address encoding used by bitcoin is a base58 encoding. This is comparable to the widely used base64 encoding, modified by the removal of characters that are easily confused depending on the font (e.g., "O" and "0"). The main intention was to avoid mistakes when reading and typing addresses. This was furthered by the addition of a checksum to the encoded hash value.
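The encoding with checksum can be sketched as follows. The snippet assumes the common mainnet convention for p2pkh addresses: a version byte of 0x00 is prepended to the hash, and the first four bytes of a double SHA-256 hash are appended as checksum. The function names are hypothetical, and the public key hash is the one from the first output of the example transaction. The resulting address is printed without a stated expected value, since it should be verified against an independent tool.

```python
import hashlib
from typing import Tuple

# base58 alphabet: no "0", "O", "I" or "l".
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]


def base58check_encode(version: bytes, payload: bytes) -> str:
    full = version + payload + checksum(version + payload)
    num = int.from_bytes(full, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    # Each leading zero byte is represented by the character "1".
    zeros = len(full) - len(full.lstrip(b"\x00"))
    return "1" * zeros + encoded


def base58check_decode(address: str) -> Tuple[bytes, bytes]:
    """Return (version, payload); raise ValueError on a bad checksum."""
    num = 0
    for ch in address:
        num = num * 58 + ALPHABET.index(ch)
    zeros = len(address) - len(address.lstrip("1"))
    full = b"\x00" * zeros + num.to_bytes((num.bit_length() + 7) // 8, "big")
    data, check = full[:-4], full[-4:]
    if checksum(data) != check:
        raise ValueError("checksum mismatch")
    return data[:1], data[1:]


pubkey_hash = bytes.fromhex("bc9f9a524fb5ed857ea6062fb2d9ce07ef405382")
address = base58check_encode(b"\x00", pubkey_hash)  # 0x00: ASSUMED mainnet p2pkh
print(address)
```

Decoding the address again returns the version byte and the original hash, while a mistyped character will almost certainly fail the checksum test.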

Sequence number

ffffffff

The sequence number is used for a feature called replace-by-fee that allows a user to "update" a transaction that has already been broadcast on the bitcoin network but has not yet been included in a block. The user can broadcast a new transaction that spends the same inputs but lowers the output amounts, which leads to a higher fee for the miner. The sequence number marks the original transaction as replaceable and thereby helps to distinguish the replacement from an attempted double-spend. Under the opt-in convention of BIP 125, only sequence numbers below 0xfffffffe signal replaceability, so the value ffffffff in the example does not.
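Whether a transaction signals replaceability can be checked from the sequence numbers of its inputs. The following sketch assumes the opt-in rule of BIP 125, under which a transaction is considered replaceable if at least one input has a sequence number of 0xfffffffd or lower. The function and constant names are hypothetical.

```python
from typing import Iterable

# Highest sequence number that still signals replace-by-fee (BIP 125).
MAX_RBF_SEQUENCE = 0xFFFFFFFD


def signals_replace_by_fee(sequences: Iterable[int]) -> bool:
    """True if any input sequence number opts in to replacement."""
    return any(seq <= MAX_RBF_SEQUENCE for seq in sequences)


# Sequence number of the single input in the example transaction.
example_sequence = int.from_bytes(bytes.fromhex("ffffffff"), "little")

print(signals_replace_by_fee([example_sequence]))  # False
print(signals_replace_by_fee([0xFFFFFFFD]))        # True
```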

The sequence number in conjunction with the locktime was originally intended by Satoshi Nakamoto for a "high-frequency trading" feature that would allow two users to make a series of bidirectional transactions without "settling" each individual transaction in a block. The proposed implementation of this feature has been shown to be open to manipulation by a malicious miner who would be able to favor a specific transaction in disregard of the sequence number. A more sophisticated implementation of the same basic idea can be found in the payment channels of the Lightning network.