The idea of the web, and its eventual implementation, dates back only a few decades. In that short span of time, the web has become an important part of our lives, and we use it day in and day out. For consumers and enterprises alike, the web has a critical role to play. It would not be wrong to say that the web controls a large part of our lives, be it watching movies, sharing content or communicating with friends. We consume most of these activities as services on the web. The service providers, which are few, dominant and powerful, offer these services to us at a direct or indirect cost. Because there are so few providers, the services are highly centralized and tightly controlled by them.
Let’s take cloud storage as an example. We generate a vast amount of data, for which local storage such as our computers, phones or hard drives is neither sufficient nor reliable. This has led consumers to store their data in the cloud for a small fee. Cloud storage is easy to use and integrates seamlessly with existing applications. But there is a major concern: data privacy is hard to ensure when data is stored on a service provider’s machines. The data is also within the purview of enforcement agencies, so it can easily be deleted or blocked from access. In summary, data privacy has become the biggest concern for cloud storage. In this article, we will explore an alternative and a possible solution to this problem.
Today, content on the web is accessed using a host-based address, typically called a web address. It acts as a pointer to where a document can be found on the web. The idea of a web address is easy to understand and use. The problem is that although it helps in finding the content, it offers no guarantee that the content is authentic or has not been altered. You must trust the content provider for the integrity of whatever is hosted on the web. The other problem: what if some enforcement agency decides to deny access to the content for specific users, or for all users? Because the content is served from a centralized location, this can be done simply by blocking its web address. The solution to these problems is to move from host-based addressing to content-based addressing.
So, what is content addressing? The idea of content addressing is not new, but applying it to the web is revolutionary. With content addressing, the data or content is addressed not by its location but by its unique fingerprint, called a hash. A specific piece of content has a unique hash that can be used to locate it. In practice, no two different pieces of content will share the same fingerprint. Since the content is placed not at one location but at many, it is difficult to block, as every location would have to be found and cut off. Also, if the content is altered, its hash changes, so the altered content no longer matches the original address.
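
The idea can be illustrated with a minimal sketch. It assumes a plain SHA-256 hex digest as the fingerprint and an in-memory dictionary as the storage; the names `content_address` and `ContentStore` are hypothetical and chosen only for illustration. Real content-addressed systems use their own hash encodings and distribute the data across many machines.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as the content's address (an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store where content is looked up by its hash, not by a location."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        address = content_address(data)
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone can re-hash the data to check that it was not altered.
        if content_address(data) != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
original = b"hello permanent web"
address = store.put(original)

# Changing even one character yields a different address.
altered_address = content_address(b"hello permanent web!")

print(address)
print(altered_address)
```

Storing the same bytes twice returns the same address, while any altered copy gets a different one, which is what lets a reader verify content without trusting the machine that served it.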
Now let’s look at the available technologies and ways to implement the permanent web. One of the technologies we are going to discuss is the InterPlanetary File System (IPFS). IPFS is a peer-to-peer hypermedia protocol for content-addressed storage. The IPFS network is formed by participating user nodes all around the world. A user can run the IPFS daemon on a computer with a supported operating system. For detailed install instructions, refer to https://docs.ipfs.io/introduction/install/. Once the node is up and running, content can be uploaded to the IPFS network from that node. For every piece of uploaded content, we get back a hash that uniquely identifies it on the IPFS network. For demonstration, an image of my laptop was uploaded to the IPFS network, and the resulting hash is QmdcqcayPv3B6Xf37kACzLZmHpLK4re2jGNzcxE9Rjnu7Z. This image can be accessed by providing the hash to any IPFS node or gateway. For example, the image can be accessed with the following IPFS gateways.
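
As a minimal sketch of how a gateway address is formed: gateways commonly serve content under an `/ipfs/<hash>` path. The two gateway hosts below (the public `ipfs.io` gateway and the default gateway port of a locally running node) are assumptions chosen for illustration, and `gateway_url` is a hypothetical helper, not part of any IPFS library.

```python
# Hash of the demonstration image from the article.
CONTENT_HASH = "QmdcqcayPv3B6Xf37kACzLZmHpLK4re2jGNzcxE9Rjnu7Z"

# Assumed gateways: one public, one served by a local IPFS daemon.
GATEWAYS = [
    "https://ipfs.io",
    "http://127.0.0.1:8080",
]


def gateway_url(gateway: str, content_hash: str) -> str:
    """Build the address of a piece of content on a given gateway."""
    return f"{gateway.rstrip('/')}/ipfs/{content_hash}"


# The same hash is requested from every gateway; only the host differs.
urls = [gateway_url(gateway, CONTENT_HASH) for gateway in GATEWAYS]

for url in urls:
    print(url)
```

Opening any of the printed addresses in a browser asks that gateway to locate the content by its hash; whether a particular gateway is reachable, or the content still available, depends on the network at the time.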
Note the difference: compared to the traditional web, the content is accessed not by location but by content hash. The content above is immutable, which means nobody can modify it in place. If the content is modified, the result is a new hash, and the modified content is inaccessible with the original hash above. It is also difficult to take down the content: only the access endpoints (in the above example, one or more gateways) can be shut down, not the computers storing the content.
The above technology has great potential to become the foundation for the permanent web. In the future, we will see the emergence of more related technologies that will eventually take the web as we know it today to the next level.
About Mahadev Gaonkar
Mahadev Gaonkar is a Cloud and Mobile Technical Architect with 15+ years of experience in professional software application development. He has worked with storage, networking and cloud vendors to design and develop enterprise software for datacenter and cloud solutions. He has solid experience in technical pre-sales and solution architecture with emerging technologies.