Cloud-based data synchronization services such as Dropbox, OneDrive, and Google Drive have attracted a huge number of subscribers, which has led to a tremendous amount of shared data. How can we exploit the redundancy in this shared data? We studied this question and proposed a novel network data encoding technique that dynamically eliminates redundant blocks across files by using previously synchronized or shared (i.e., transmitted) data.

IEEE ICNP (International Conference on Network Protocols, acceptance rate: 18.66%) is one of the premier conferences in the computer networking field and publishes rigorously reviewed 10-page papers.


In this work, we ask why the abundant information previously shared between a server and its client is not effectively utilized when exchanging new data that may be highly correlated with the shared data. We formulate this question as an encoding problem that applies to general data synchronization services, including a wide range of Internet services such as cloud data synchronization, web browsing, messaging, and even data streaming. To address this problem, we propose a new encoding technique, SyncCoding, that maximally replaces subsets of the data to be transmitted with coordinates pointing to the matching subsets in the set of relevant shared data, called references. SyncCoding can be easily integrated into a transfer protocol such as HTTP and enables a significant reduction of network traffic. Our experimental evaluations of SyncCoding implemented in Linux show that it outperforms existing popular encoding techniques (Brotli, LZMA, Deflate, and Deduplication) in two practical networking applications: cloud data sharing and web browsing. The gains of SyncCoding over Brotli, LZMA, Deflate, and Deduplication in the encoded size to be transmitted are about 12.4%, 20.1%, 29.9%, and 61.2% in cloud data sharing and about 78.3%, 79.6%, 86.1%, and 92.9% in web browsing, respectively. When Deduplication is applied in advance, the gains of SyncCoding over Brotli, LZMA, and Deflate are about 7.4%, 10.6%, and 17.4% in cloud data sharing and about 79.4%, 82.0%, and 83.2% in web browsing, respectively.
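The idea of replacing parts of a new message with coordinates into already-shared references can be illustrated with a toy sketch. This is not the SyncCoding algorithm from the paper: it is a naive greedy longest-match encoder, and the token layout `(ref_index, offset, length)`, the function names, and the `MIN_MATCH` threshold are all hypothetical choices made for illustration. It also ignores how tokens would be serialized into bits, which is where much of a real encoder's gain or loss is decided.

```python
from typing import List, Tuple, Union

# A token is either literal bytes or a (ref_index, offset, length) coordinate
# pointing into one of the references both sides already hold (assumed layout).
Token = Union[bytes, Tuple[int, int, int]]

MIN_MATCH = 4  # hypothetical threshold: shorter matches are sent as literals


def longest_match(data: bytes, pos: int, references: List[bytes]) -> Tuple[int, int, int]:
    """Return (ref_index, offset, length) of the longest match for data[pos:]."""
    best = (-1, -1, 0)
    for ref_index, ref in enumerate(references):
        # Only try lengths that would beat the best match found so far.
        length = best[2] + 1
        while pos + length <= len(data):
            offset = ref.find(data[pos:pos + length])
            if offset < 0:
                break
            best = (ref_index, offset, length)
            length += 1
    return best


def encode(data: bytes, references: List[bytes]) -> List[Token]:
    """Greedily replace substrings of data with coordinates into references."""
    tokens: List[Token] = []
    literal = bytearray()
    pos = 0
    while pos < len(data):
        ref_index, offset, length = longest_match(data, pos, references)
        if length >= MIN_MATCH:
            if literal:  # flush pending literal bytes before the coordinate
                tokens.append(bytes(literal))
                literal.clear()
            tokens.append((ref_index, offset, length))
            pos += length
        else:
            literal.append(data[pos])
            pos += 1
    if literal:
        tokens.append(bytes(literal))
    return tokens


def decode(tokens: List[Token], references: List[bytes]) -> bytes:
    """Rebuild the original data from tokens and the same references."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, bytes):
            out += token
        else:
            ref_index, offset, length = token
            out += references[ref_index][offset:offset + length]
    return bytes(out)


# Example: both sides are assumed to already share these two references.
shared = [b"the quick brown fox jumps", b"over the lazy dog"]
message = b"a quick brown dog jumps over the lazy fox"
assert decode(encode(message, shared), shared) == message
```

The sketch only works if sender and receiver hold byte-identical references in the same order; how the relevant references are selected and kept in sync is outside what this example covers.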

[ Overview of the system design and the evaluation scenarios for two use cases: 1) cloud data sharing (left) and 2) web browsing (right) ]