SyncCoding: A Compression Technique Exploiting References for Data Synchronization Services
We ask why the abundant information previously shared between a server and its client is not effectively utilized when exchanging new data that may be highly correlated with the shared data. We formulate this question as an encoding problem applicable to general data synchronization services, including a wide range of Internet services such as cloud data synchronization, web browsing, messaging, and even data streaming. To address this problem, we propose a new encoding technique, SyncCoding, that maximally replaces subsets of the data to be transmitted with coordinates pointing to the matching subsets in the set of relevant shared data, called references. SyncCoding can be easily integrated into an application-layer protocol such as HTTP and enables a significant reduction in network traffic. Our experimental evaluations of SyncCoding implemented in Linux show that it outperforms existing popular encoding techniques, namely Brotli, LZMA, Deflate, and Deduplication, in two practical networking applications: cloud data sharing and web browsing. The gains of SyncCoding over Brotli, LZMA, Deflate, and Deduplication in the encoded size to be transmitted are about 12.4%, 20.1%, 29.9%, and 61.2% in cloud data sharing and about 78.3%, 79.6%, 86.1%, and 92.9% in web browsing, respectively. When Deduplication is applied in advance, the gains of SyncCoding over Brotli, LZMA, and Deflate are about 7.4%, 10.6%, and 17.4% in cloud data sharing and about 79.4%, 82.0%, and 83.2% in web browsing, respectively.
[ Overview of the system design and the evaluation scenarios of two use cases: 1) Cloud data sharing (left) and 2) Web browsing (right) ]
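The idea of replacing parts of new data with pointers into previously shared references can be illustrated with a small stand-in. The sketch below is not SyncCoding itself: it assumes that Deflate with a preset dictionary (the `zdict` parameter of Python's `zlib`) is an acceptable analogue, in which the shared data primes the LZ77 window so that matches in the new data are encoded as back-references into it. The function names `encode_with_reference` and `decode_with_reference` and the sample data are hypothetical, and Deflate can only use the last 32 KiB of the dictionary, a limit the abstract does not attribute to SyncCoding.

```python
import random
import zlib


def encode_with_reference(data: bytes, reference: bytes) -> bytes:
    """Compress data so that matches may point into the shared reference."""
    # zdict primes the LZ77 window; only its last 32 KiB is usable by Deflate.
    comp = zlib.compressobj(level=9, zdict=reference)
    return comp.compress(data) + comp.flush()


def decode_with_reference(blob: bytes, reference: bytes) -> bytes:
    """Decompress a blob; the receiver must hold the same reference."""
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical data: a previously shared file with little internal
# redundancy, and a new version of it with one small insertion.
rng = random.Random(0)
shared = bytes(rng.randrange(256) for _ in range(4000))
new_data = shared[:2000] + b"<p>one new paragraph</p>" + shared[2000:]

plain = zlib.compress(new_data, 9)                # no reference available
synced = encode_with_reference(new_data, shared)  # reference shared beforehand

# The receiver recovers the new data from the short blob plus the reference.
assert decode_with_reference(synced, shared) == new_data
print(len(new_data), len(plain), len(synced))
```

Because the new data here is nearly incompressible on its own, the plain encoding stays close to the input size, while the reference-aware encoding mostly consists of pointers into `shared`. Both endpoints must hold an identical copy of the reference for decoding to succeed.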