Partial Restoration (Under the hood)

This article is part of our "Tiger Bridge - Under the hood" series, where we uncover technical details about our product, dive deep into the solutions it offers, and explain which decisions we had to make while designing it.

Dilemma

Tiger Bridge transfers files to the cloud based on policies or manual actions. This can happen when files haven't been accessed recently or when local disk usage has reached a specific threshold. After a while, you may need to open some of them again. Whenever you try to access a file which has been moved to the cloud, a retrieval operation is started. The dilemma here is whether we should download the full file or just the part that we need.

Background

In the simplest case, an application reads a file from start to end every time it opens it. This works for smaller files, such as simple Excel spreadsheets. They are small enough to download quickly and cheaply, so neither the application nor its user sees a significant delay when opening them from the cloud as opposed to directly from local storage.

However, for big files such as high-resolution video, the delay may be significant if we need to wait for every gigabyte to be downloaded before we can play it, and the respective price may also be high. As a general rule, the bigger the file, the longer the wait. This is why applications try to optimize their reading procedures by reading only the parts of a file they need. Content in healthcare, science and media is a good example: some files in these industries allow the application to jump from one location in the file to another without going through everything in between.

Based on this, we examined three options while trying to make a decision.

1. Full restore – Downloading and reading the whole file. This is the best option from a file system perspective.
2. Progressive restore – A hybrid mode where we download the whole file but allow the application to work with it as soon as the download reaches the piece the application requested; the rest is downloaded in the background.
3. Partial restore – Downloading and reading only the part of the file which is requested. Pieces before and after it are blank. We restore only the bare minimum of data.

In options 2 and 3, we must convince the file system that it has the whole file when it actually doesn’t.
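
As a rough illustration of what "convincing the file system" can look like, the sketch below creates a local placeholder that reports the full file size, then writes a single downloaded piece at its original offset. This is not Tiger Bridge code: the function names, the file name and the sizes are made up for the example, and whether the blank ranges actually save disk space depends on the sparse file support of the underlying file system.

```python
import os
import tempfile


def create_placeholder(path, full_size):
    """Create a local stub that reports the full size but holds no data yet."""
    with open(path, "wb") as f:
        # Extending the file leaves a blank range (zero-filled on most systems).
        f.truncate(full_size)


def restore_piece(path, offset, data):
    """Write one downloaded piece at its original offset inside the stub."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def read_range(path, offset, length):
    """Read a byte range the way an application would."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    # Hypothetical 1 MB "video" of which only 5 bytes have been restored.
    path = os.path.join(tempfile.mkdtemp(), "video.bin")
    create_placeholder(path, full_size=1_000_000)
    restore_piece(path, offset=500_000, data=b"FRAME")
    size = os.path.getsize(path)
    restored = read_range(path, 500_000, 5)
    blank = read_range(path, 0, 5)
    return size, restored, blank
```

The application sees a file of the expected size and can read the restored piece, while everything before and after it is still blank. This is also why such files cannot be read correctly without the tiering software in place.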

Pros/Cons Analysis

Here are the pros and cons we considered for the partial restoration capability:

Pros

1. Storage saving – we consume much less expensive local storage; we can use a smaller disk on-premises, which can be a representation of a bigger storage space in the cloud
2. Time to first data – the time it takes to get the first requested data; this is minimum with partial restore, maximum with full restore, and somewhere in the middle with hybrid mode, depending on the location of the piece you want to access
3. Optimization / flexibility options – the user has a tool which can optimize the workflow of the client

Cons

1. Complexity – the implementation is harder; without partial restore, the I/O operation gets blocked for the time the file gets downloaded, and then it gets released
2. Requires file system support – cannot be implemented if the file system does not support sparse files in some form
3. Inconsistent state without tiering – the data on the disk is inconsistent; if the software is not there, there is no way to read the files, and they will appear corrupted

The importance of these arguments grows together with the average size of used files.
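
To make the "time to first data" comparison concrete, here is a back-of-the-envelope model. It is a simplification under stated assumptions, not a measurement: it assumes a constant bandwidth, ignores request latency and cloud retrieval delays, and assumes that progressive restore downloads sequentially from the start of the file. The function name and the numbers are illustrative.

```python
def time_to_first_data(mode, file_size, offset, request_size, bandwidth):
    """Seconds until the requested bytes are available locally.

    Sizes are in bytes, bandwidth is in bytes per second.
    """
    if mode == "full":
        needed = file_size              # wait for the whole file
    elif mode == "progressive":
        needed = offset + request_size  # wait until the download reaches the piece
    elif mode == "partial":
        needed = request_size           # fetch only the requested piece
    else:
        raise ValueError("unknown restore mode: " + mode)
    return needed / bandwidth


GB = 10**9
MB = 10**6

# Hypothetical case: 10 GB file, 100 MB requested at offset 4 GB, 100 MB/s link.
full = time_to_first_data("full", 10 * GB, 4 * GB, 100 * MB, 100 * MB)
progressive = time_to_first_data("progressive", 10 * GB, 4 * GB, 100 * MB, 100 * MB)
partial = time_to_first_data("partial", 10 * GB, 4 * GB, 100 * MB, 100 * MB)
```

Under these assumptions the wait is 100 seconds for full restore, 41 seconds for progressive restore and 1 second for partial restore. The gap widens as files get bigger, which matches the observation above.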

Decision

The three discussed options (full, progressive and partial restore) are all suitable for a solution like ours. Keep in mind that each includes the previous one's capabilities: progressive restore supports both full and progressive restoration, and partial restore supports full, progressive, and partial restoration. Because of the unique capabilities of the partial restore option, it became our preferred course of action.

Arguments

While making our decision, we considered the wait and price that come with each option.

Usually, based on the different Tiger Bridge policies, data which is reclaimed locally and only stored in the cloud is considered infrequently accessed. We do not expect a large number of such files to be requested, which is why cost was not the main factor we paid attention to.

We focused more on the time to first data when accessing that data. A short time to first data translates directly into better performance for the end user, and it lets us provide the closest speed to a local storage solution. Time is of the essence for the mission-critical applications that we serve, which is why we made that choice.

Storage saving is also important, but mostly as an opportunity to avoid downloading unnecessary data. With partial restore, we only download some much smaller parts of large files.

Our decision comes with a price: it adds implementation complexity for our engineers, and it requires file system support (which works in favor of NTFS).

The risk of having inconsistent files is real, but we have developed a decent workaround: if a client wants to stop using our software, there is an option for them to retrieve all of their data first.

Partial restore is not used in every case. For smaller files, a full restore is done instead, because the price and time of downloading the whole file might actually be better.

Our software uses predictive mechanisms as well. When a user accesses a file, Tiger Bridge gets the needed part of that file with the help of its partial restore capability. However, it also predicts which other parts the user may want to see later, so that they are already downloaded when the user gets to them. Tiger Bridge starts downloading extra pieces right after the file is opened and continues doing so until the file is closed or the user stops working with it for a while.
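
The article does not describe how the prediction itself works, so the sketch below uses the simplest possible stand-in: sequential read-ahead. It assumes the file is divided into numbered pieces and that the user is likely to want the pieces right after the one just requested. The function name, the window size and the piece numbering are all hypothetical.

```python
def pieces_to_prefetch(requested, total_pieces, local, window=4):
    """Pick the next pieces to download after the one just requested.

    requested    -- index of the piece the application asked for
    total_pieces -- number of pieces in the file
    local        -- set of piece indexes that are already on disk
    window       -- how many pieces ahead to look (assumed value)
    """
    end = min(requested + 1 + window, total_pieces)
    ahead = range(requested + 1, end)
    # Skip anything that has already been restored.
    return [piece for piece in ahead if piece not in local]


# Piece 12 is already local, so only 11, 13 and 14 are queued.
queue = pieces_to_prefetch(10, 100, {12})

# Near the end of the file the window is clipped to the last piece.
tail = pieces_to_prefetch(98, 100, set())
```

A real predictor could also take the file format or the application's access pattern into account; the point here is only that prefetching is driven by the last requested piece and by what is already local.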

Our software supports many special settings, which make it easier to adapt to even the most sophisticated workflows and environments. Despite being powerful, the solution still needs some care. For example, we have clients who work with cloud files in the fastest Amazon tiers; for them, partial restore is not strictly better than full restore.

By default, Tiger Bridge gives you partial restore with some progressive/predictive mechanisms. We split the file into pieces and read the first ones in parallel. This way, time to first data gets slightly longer, but reading the file is better optimized.
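
The idea of splitting a file into pieces and reading the first ones in parallel can be sketched as follows. This is an illustration only: fetch_range is a stand-in that slices an in-memory bytes object instead of issuing a ranged cloud download, and the piece size and degree of parallelism are arbitrary values, not Tiger Bridge defaults.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_pieces(file_size, piece_size):
    """Return (offset, length) pairs that cover the whole file."""
    return [
        (offset, min(piece_size, file_size - offset))
        for offset in range(0, file_size, piece_size)
    ]


def fetch_range(blob, offset, length):
    """Stand-in for a ranged download; here it just slices local bytes."""
    return blob[offset:offset + length]


def restore_first_pieces(blob, piece_size, parallel):
    """Fetch the first `parallel` pieces concurrently and return them in order."""
    pieces = split_into_pieces(len(blob), piece_size)[:parallel]
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # map() yields results in submission order, so pieces stay in sequence.
        chunks = pool.map(lambda piece: fetch_range(blob, *piece), pieces)
        return b"".join(chunks)


blob = bytes(range(256)) * 4  # 1024 bytes standing in for a cloud object
head = restore_first_pieces(blob, piece_size=100, parallel=3)
```

Fetching several pieces at once means the very first byte may arrive a little later than with a single small request, but the start of the file becomes available sooner overall, which is the trade-off described above.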

Tiger Bridge can also read ahead in a folder, meaning it can read and download another file while you are still accessing the previous one. For certain specific workflows and applications, this can lead to a noticeable performance boost.
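
A minimal sketch of folder read-ahead, assuming that "the next file" simply means the next one in alphabetical order within the same folder. The article does not say how Tiger Bridge chooses the next file, so both the ordering and the function name are assumptions made for the example.

```python
import os


def next_file_in_folder(current_path):
    """Return the path of the file that follows current_path in its folder.

    Files are ordered alphabetically (an assumption). Returns None when
    current_path is the last file in the folder.
    """
    folder, name = os.path.split(current_path)
    folder = folder or "."
    names = sorted(
        entry for entry in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, entry))
    )
    index = names.index(name)
    if index + 1 < len(names):
        return os.path.join(folder, names[index + 1])
    return None
```

While the user works with one file, the result of this lookup could be handed to the restore mechanism so that the next file is already local when it is opened. Workflows that step through numbered frames or image slices are the kind of case where this would help.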

Conclusion

By supporting all available approaches for data restoration, Tiger Bridge allows the most sophisticated applications to utilize the cloud in the best possible way. Each app may require some tuning of special settings, but we are here to help you with that! With its great number of settings, Tiger Bridge offers outstanding flexibility and performance optimization.