Storing information on the blockchain
Currently, the most popular remote storage solutions are cloud services such as Google Drive, Dropbox, and Mega, and databases such as MySQL and MongoDB. However, the companies that run these repositories can control their content, so your information may be censored.
In this article, we review the ways of storing information on the blockchain, along with their pros and cons.
User interaction with the database
In practice, user interaction with the repository comes down to three steps:
1. A user uploads data to a company’s server using a desktop or web application;
2. The company records the new data in its data center;
3. To gain access to their data, the user sends a request to the data center, which grants access.
Undoubtedly, this model has several advantages:
• It supports CRUD, the four basic functions used when working with databases: create, read, update, and delete. This is the standard model of user interaction with a database.
• Often the speed of information processing depends only on the user’s Internet speed.
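As an illustration of the four CRUD operations, here is a minimal sketch using Python's built-in sqlite3 module. The table name "files" and its columns are made up for the example and do not describe any particular provider's schema.

```python
import sqlite3


def demo_crud() -> list:
    """Run create, read, update, and delete against an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, size INTEGER)"
    )
    log = []

    # Create: insert a new record and remember its id.
    cur = conn.execute(
        "INSERT INTO files (name, size) VALUES (?, ?)", ("report.pdf", 1024)
    )
    file_id = cur.lastrowid

    # Read: fetch the record back.
    query = "SELECT name, size FROM files WHERE id = ?"
    log.append(conn.execute(query, (file_id,)).fetchone())

    # Update: change one field in place, then read again.
    conn.execute("UPDATE files SET size = ? WHERE id = ?", (2048, file_id))
    log.append(conn.execute(query, (file_id,)).fetchone())

    # Delete: remove the record and count what is left.
    conn.execute("DELETE FROM files WHERE id = ?", (file_id,))
    log.append(conn.execute("SELECT COUNT(*) FROM files").fetchone()[0])

    conn.close()
    return log


if __name__ == "__main__":
    print(demo_crud())
```

Note that the update and delete steps change or remove data in place, which is exactly what an append-only blockchain does not allow.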
However, such centralized repositories are not the most reliable file storage. Information about the files you upload may be passed to third parties, and centralized servers are frequent targets for hackers.
Data repositories on a blockchain
Recording information directly on a blockchain is not the best idea, since a block, the structural unit of the blockchain, has a limited size. For example, a Bitcoin block is limited to roughly 1 megabyte, so a file larger than that cannot be written to the blockchain in a single block. We also have to take into account the cost of sending the file.
Let’s look at block #637352 of the Bitcoin network.
The fees for adding transactions to this block totaled 0.47462040 BTC, or $4372 at the time. Let’s assume that the block is “full,” that is, 1 megabyte in size. It follows that sending a 1 MB file would cost more than $4000. We also have to remember that the file would be visible to every network participant.
However, the Bitcoin blockchain is well suited to sending short messages. The average English sentence consists of 15–20 words of about 6 characters each. Counting spaces and punctuation, that gives about 140 characters per sentence, or 140 bytes of information.
At the fee rate of the block above, that comes to roughly $0.6 per message, plus the commission for transferring funds.
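The arithmetic behind these estimates can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the fee total from the block discussed above and the article's assumption that the block is a full 1 MB (taken here as 1,000,000 bytes), and it ignores transaction overhead.

```python
BLOCK_FEE_USD = 4372.0        # total fees in block #637352, as quoted in the text
BLOCK_SIZE_BYTES = 1_000_000  # assumption from the text: the block is full at 1 MB
MESSAGE_BYTES = 140           # one average English sentence, as estimated above


def cost_usd(payload_bytes: int) -> float:
    """Estimate the fee for a payload, assuming fees scale linearly with size."""
    return BLOCK_FEE_USD / BLOCK_SIZE_BYTES * payload_bytes


if __name__ == "__main__":
    print(f"1 MB file:        ${cost_usd(BLOCK_SIZE_BYTES):.2f}")
    print(f"140-byte message: ${cost_usd(MESSAGE_BYTES):.2f}")
```

Under these assumptions a 140-byte message comes out at about $0.61. Real fees vary from block to block, so the figure is an order-of-magnitude estimate only.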
Peer-to-peer file systems
The most popular peer-to-peer file system is IPFS, the InterPlanetary File System. IPFS is not a blockchain itself; it borrows ideas from the BitTorrent protocol, breaking files into fragments and storing copies of those fragments on the computers of system participants.
This method has several advantages:
• A file is downloaded only by the users who are interested in it;
• Popular files are downloaded and distributed very quickly;
• The data is content-addressed: a file's address is derived from a hash of its contents, so the contents cannot be faked without changing the address;
• It is a peer-to-peer solution.
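The idea that a file's address depends on its data can be sketched as follows. This is a simplified illustration, not IPFS's actual format: real IPFS uses its own content identifiers and links fragments into a Merkle structure, while this sketch uses a plain SHA-256 hex digest per fragment. The class ContentStore and the tiny CHUNK_SIZE are hypothetical names chosen for the example.

```python
import hashlib

CHUNK_SIZE = 8  # bytes; deliberately tiny so the example is easy to follow


def address(data: bytes) -> str:
    """Derive an address from the content itself (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy fragment store keyed by content hash (stands in for a set of peers)."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put_file(self, data: bytes) -> list[str]:
        """Split data into fragments, store each, and return their addresses."""
        addresses = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            key = address(chunk)
            self.blocks[key] = chunk
            addresses.append(key)
        return addresses

    def get_file(self, addresses: list[str]) -> bytes:
        """Fetch fragments by address, rejecting any that fail the hash check."""
        parts = []
        for key in addresses:
            chunk = self.blocks[key]
            if address(chunk) != key:
                raise ValueError(f"fragment {key[:8]} does not match its address")
            parts.append(chunk)
        return b"".join(parts)


if __name__ == "__main__":
    store = ContentStore()
    manifest = store.put_file(b"hello, distributed world")
    print(store.get_file(manifest))
```

Because the address is recomputed on retrieval, a peer that returns altered data is detected immediately, whoever that peer is.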
As for shortcomings, a file can be served to the network only while the user hosting it is online, and such a system serves only static data. Besides, one can access a file only by knowing its address.
When such a file system is paired with a blockchain, the blockchain acts as an intermediary that connects participants and is responsible for verifying the authenticity and integrity of the files.
Decentralized cloud storage
These are ordinary cloud storage options similar to Dropbox, except that the data is placed not on company servers but on the devices of users who rent them out.
With such solutions, network participants do not have to be constantly online to send information. It is enough to upload the file to the cloud storage once. Such storage is stable, fast, and has huge capacity.
However, they are only suitable for serving static data and do not support search by content. Moreover, they are not free, as participants rent equipment from each other.
Storj and Sia
These companies operate as marketplaces. They promise cheap, prompt, and safe storage; however, this does not mean that their services are cheaper than those of giants such as Google, Amazon, or Dropbox. They simply profit not only from rental rates but also from commissions on the transactions generated by uploading and retrieving data.
In effect, Storj and Sia act as intermediaries between those who lease out hard drive space and those who rent it. The blockchain is used as a register of transactions, for financial settlements, and for authenticating files in the databases. The user data itself is stored outside the blockchain and can be deleted or become inaccessible at any time if the lessors decide to delete the files or simply disconnect their device from the network.
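The division of labor described above, where the blockchain keeps records and a lessor's device keeps the bytes, can be illustrated with a toy model. This is a sketch under stated assumptions and not the actual protocol of Storj or Sia: the classes Ledger and Host, the record fields, and the retrieve function are all hypothetical, and the "ledger" is just an in-memory list.

```python
import hashlib


class Host:
    """A lessor's device: holds the actual bytes, outside the blockchain."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}
        self.online = True

    def fetch(self, file_id: str):
        """Return the stored bytes, or None if offline or the file was deleted."""
        if not self.online:
            return None
        return self.files.get(file_id)


class Ledger:
    """Stand-in for the blockchain: append-only records, never file contents."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    def register(self, file_id: str, data: bytes, renter: str, price: float) -> None:
        """Record who rented storage, at what price, and the file's digest."""
        self.records.append({
            "file_id": file_id,
            "renter": renter,
            "price": price,
            "sha256": hashlib.sha256(data).hexdigest(),
        })

    def digest_of(self, file_id: str) -> str:
        """Look up the most recently registered digest for a file."""
        for record in reversed(self.records):
            if record["file_id"] == file_id:
                return record["sha256"]
        raise KeyError(file_id)


def retrieve(ledger: Ledger, host: Host, file_id: str) -> str:
    """Fetch a file from the host and check it against the ledger's digest."""
    data = host.fetch(file_id)
    if data is None:
        return "unavailable"
    if hashlib.sha256(data).hexdigest() != ledger.digest_of(file_id):
        return "corrupted"
    return "ok"
```

The ledger can prove that a returned file is authentic, but it cannot bring back a file once the host deletes it or goes offline, which is the weakness noted above.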
Filecoin is a platform based on the same ideas as Storj and Sia. It differs in only two details:
- The platform will incentivize nodes of medium capacity, to avoid both the threat of centralization by large players and the instability of small players.
- The system will try to find nodes for storing data as close as possible to the users renting them. This will increase download and upload speeds and reduce the likelihood of errors during data transfer.
With these innovations, as well as a unique consensus algorithm that encourages growth in network disk space, Filecoin intends to surpass Google and Amazon in storage capacity within the next few years.
The main idea of MaidSafe is to create a fully encrypted P2P network that serves as a database for the anonymous exchange of information through encrypted layers. It is an analog of Tor for cloud storage. This is to be achieved through three elements of MaidSafe:
- Self-encryption: data that encrypts itself. When a file is uploaded to the MaidSafe network, it is broken into numerous small fragments that are self-encrypted and distributed throughout the network. In this form, the file is unreadable to anyone except the owner.
- Decentralized data caching. Data in the SAFE Network will be stored worldwide rather than on the servers of one company or group of companies. This will make the platform autonomous and increase the level of information security.
- Data availability. The network continually creates and maintains duplicates of all the files it stores. This redundancy should protect the data from loss when individual nodes disconnect.
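To see why keeping duplicates protects against disconnected nodes, consider this small sketch. It is not MaidSafe's actual placement algorithm, which the article does not describe; the round-robin placement, the node names, and the function names are assumptions made purely for illustration.

```python
def place_fragments(num_fragments: int, nodes: list[str], replicas: int) -> dict[int, list[str]]:
    """Assign each fragment to `replicas` distinct nodes, round-robin style."""
    if replicas > len(nodes):
        raise ValueError("cannot place more replicas than there are nodes")
    return {
        frag: [nodes[(frag + offset) % len(nodes)] for offset in range(replicas)]
        for frag in range(num_fragments)
    }


def is_recoverable(placement: dict[int, list[str]], offline: set[str]) -> bool:
    """A file survives if every fragment has at least one holder still online."""
    return all(
        any(node not in offline for node in holders)
        for holders in placement.values()
    )


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d", "e"]
    placement = place_fragments(num_fragments=4, nodes=nodes, replicas=3)
    print(is_recoverable(placement, offline={"a", "b"}))       # two nodes gone
    print(is_recoverable(placement, offline={"a", "b", "c"}))  # three nodes gone
```

With three copies of each fragment, the file survives the loss of any two nodes; it is lost only when all three holders of some fragment are offline at once. The price is that every byte is stored three times.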
Using a blockchain for information storage has some disadvantages. For example, downloading a file from Sia storage is significantly slower than downloading it from Dropbox. However, this is offset by the security of user data.
Work is currently under way to speed up file transfer and increase the reliability of decentralized file storage. The Filecoin project is working in this direction and raised about $257 million in its 2017 token sale to fund development of the infrastructure.