Opening Pandora's Box: An Analysis of the Usage of the Data Field in Blockchains

State: completed by Sebastian Kung

This research topic aims to analyze the types of data (e.g., images, URLs, hashes, Smart Contract data, and so on) that are stored in the data field of different blockchains, similar to the work of [1]. For example, in Bitcoin, the OP_RETURN field is used to store up to 80 bytes of data, whereas in the Ethereum blockchain, each transaction carries a dedicated field for storing data. As there is no limitation on the type of data that can be stored, such fields are used in a myriad of applications, from storing temperature measurements from Internet-of-Things (IoT) devices to InterPlanetary File System (IPFS) links and images [https://cryptograffiti.info/]. Thus, this topic involves several steps. Firstly, the arbitrary transaction data needs to be retrieved from each transaction of the blockchain and processed; secondly, several algorithms to detect (i.e., decode) different types of data need to be designed and implemented (note that such algorithms must be independent of the blockchain); thirdly, with the use of a big data processing framework (e.g., Apache Hadoop), the data needs to be grouped and classified to provide insights regarding the use of the blockchain.
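To make the first two steps more concrete, the following is a minimal Python sketch, not the method used in the thesis. It assumes the payload has already been read from a transaction as raw bytes. The function names (op_return_payload, classify_payload), the category labels, and the small magic-byte table are hypothetical choices made for illustration. The classifier only looks at bytes, so the same function could be applied to Bitcoin OP_RETURN payloads and to Ethereum data fields alike.

```python
import re
from collections import Counter
from typing import Optional

# Illustrative (incomplete) table of file signatures; labels are our own choice.
MAGIC_BYTES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}
URL_RE = re.compile(rb"^(https?|ipfs)://\S+$")
# IPFS CIDv0: "Qm" followed by 44 base58 characters (46 characters in total).
IPFS_CID_V0_RE = re.compile(rb"^Qm[1-9A-HJ-NP-Za-km-z]{44}$")
HEX_RE = re.compile(r"^[0-9a-fA-F]+$")


def op_return_payload(script: bytes) -> Optional[bytes]:
    """Return the data pushed after OP_RETURN (0x6a), or None if not applicable.

    Only handles a single direct push (1..75 bytes) or OP_PUSHDATA1 (0x4c).
    """
    if len(script) < 2 or script[0] != 0x6A:
        return None
    opcode = script[1]
    if 1 <= opcode <= 75:
        start, length = 2, opcode
    elif opcode == 0x4C and len(script) >= 3:
        start, length = 3, script[2]
    else:
        return None
    data = script[start:start + length]
    return data if len(data) == length else None


def classify_payload(payload: bytes) -> str:
    """Assign a coarse, heuristic category to an arbitrary byte payload."""
    if not payload:
        return "empty"
    for magic, label in MAGIC_BYTES.items():
        if payload.startswith(magic):
            return label
    stripped = payload.strip()
    if URL_RE.match(stripped):
        return "url"
    if IPFS_CID_V0_RE.match(stripped):
        return "ipfs-cid"
    try:
        text = payload.decode("utf-8")
    except UnicodeDecodeError:
        text = None
    if text is not None and text.isprintable():
        # Hex strings with common digest lengths (SHA-1, SHA-256, SHA-512).
        if HEX_RE.match(text) and len(text) in (40, 64, 128):
            return "hash-hex"
        return "text"
    # Raw digests cannot be told apart from random bytes; this is only a guess.
    if len(payload) in (20, 32, 64):
        return "hash-raw"
    return "binary"


if __name__ == "__main__":
    scripts = [bytes([0x6A, 5]) + b"hello", bytes([0x6A, 32]) + bytes(range(32))]
    payloads = [p for p in map(op_return_payload, scripts) if p is not None]
    print(Counter(classify_payload(p) for p in payloads))
```

The per-payload function is deliberately stateless so that it could serve as the map step of a distributed job, with the counting done in the reduce step. The hash categories are heuristics based on length alone and would need validation against known data.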

[1] Matzutt R. et al. (2018) A Quantitative Analysis of the Impact of Arbitrary Blockchain Content on Bitcoin. In: Meiklejohn S., Sako K. (eds) Financial Cryptography and Data Security. FC 2018. Lecture Notes in Computer Science, vol 10957. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58387-6_23


30% Design, 60% Implementation, 10% Documentation
Basic knowledge of blockchains and programming skills in Python

Supervisors: Dr. Eder John Scheid
