This article is part of our "Tiger Bridge - Under the hood" series, where we uncover technical details about our product, dive deep into the solutions it offers, and explain which decisions we had to make while designing it.
Dilemma
We had to decide whether to support dynamic, on-demand population of the file and folder structure, or to populate everything immediately in the event of disaster recovery.
Background
The idea here is similar to the one behind partial restoration. Dynamic population is used extensively in the disaster recovery scenario. It can also be needed during initial population, for example when you attach an empty source drive to the cloud.
In a situation where all data is in the cloud and we have nothing on-premises (because of a natural disaster or any kind of failure), the easiest way to recover would be to build the hierarchy on-premises with a lot of stub files, and to start working with and downloading files on-demand.
However, building stubs would normally require a database with information about all files. As previously discussed, we decided to use the native file system instead. We can rebuild the file system by visiting each individual object in the cloud (there may be 1 billion of them, for example) and creating the missing stub file that links to its cloud version.
Only after you create all 1 billion stub files will you have full information about your dataset, from its total size to everything else you are interested in.
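As a rough illustration of what a full rebuild involves, the sketch below walks a listing of cloud objects and creates one placeholder per object. The names `Stub` and `build_all_stubs`, the sample listing, and the in-memory dictionary are all hypothetical and are not Tiger Bridge APIs; a real rebuild would create stub files on the native file system.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Stub:
    """Hypothetical placeholder: metadata only, no file content."""
    key: str   # object key in the cloud
    size: int  # size in bytes, as reported by the cloud listing


def build_all_stubs(listing):
    """Eagerly create one stub per cloud object.

    `listing` is an iterable of (key, size) pairs (an assumed shape).
    Returns a dict mapping folder path -> {file name: Stub}.
    """
    tree = {}
    for key, size in listing:
        path = PurePosixPath(key)
        folder = str(path.parent)
        tree.setdefault(folder, {})[path.name] = Stub(key, size)
    return tree


# Made-up sample data standing in for a cloud object listing.
listing = [
    ("projects/a/report.docx", 120),
    ("projects/a/data.csv", 4_000),
    ("projects/b/video.mov", 9_000_000),
]

tree = build_all_stubs(listing)

# The dataset size is only known once every object has been visited.
total_size = sum(s.size for files in tree.values() for s in files.values())
```

The loop runs once per cloud object, so the work grows with the number of objects, whether or not anyone ever opens those files.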
The question here is whether we should spend the time and money to generate all stub files up front, or create them on demand with dynamic population, potentially never creating some of them.
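The on-demand alternative can be sketched as follows, assuming a folder is populated the first time it is listed. The class `LazyFolderTree`, the `list_cloud_children` callback, and the fake `CLOUD` data are hypothetical names used only for illustration and do not describe how Tiger Bridge is implemented.

```python
class LazyFolderTree:
    """Populate stubs for a folder only when that folder is first listed."""

    def __init__(self, list_cloud_children):
        # Assumed callback: folder path -> list of (name, is_folder, size).
        self._list_cloud_children = list_cloud_children
        self._populated = {}   # folder path -> {name: (is_folder, size)}
        self.cloud_calls = 0   # number of listing requests sent to the cloud

    def listdir(self, folder):
        """Return child names, fetching them from the cloud on first access."""
        if folder not in self._populated:
            self.cloud_calls += 1
            children = self._list_cloud_children(folder)
            self._populated[folder] = {
                name: (is_folder, size) for name, is_folder, size in children
            }
        return sorted(self._populated[folder])

    def known_size(self):
        """Total size of the files populated so far (may under-report)."""
        return sum(
            size
            for children in self._populated.values()
            for is_folder, size in children.values()
            if not is_folder
        )


# Made-up data standing in for the cloud side.
CLOUD = {
    "": [("projects", True, 0), ("readme.txt", False, 10)],
    "projects": [("a", True, 0), ("b", True, 0)],
    "projects/a": [("report.docx", False, 120), ("data.csv", False, 4_000)],
    "projects/b": [("video.mov", False, 9_000_000)],
}

tree = LazyFolderTree(lambda folder: CLOUD[folder])
root = tree.listdir("")          # first access: one cloud listing call
tree.listdir("")                 # second access: served from local stubs
a = tree.listdir("projects/a")   # only this subfolder gets populated
```

In this sketch `projects/b` is never listed, so no stub is created for its contents, and `known_size()` counts only the files populated so far. That mirrors both the saving and the incomplete folder metrics that come with on-demand population.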
Pros/Cons Analysis
This is what we considered about the dynamic population feature while making our decision:
Pros | Cons
Time to first use: the biggest plus is that you can start using your data immediately, regardless of how big your data set is; in a DR scenario the recovery time objective (RTO), the time you can survive after a disaster without access to your critical data, is minimal | Partial local data: you may never recover the full file system because we only recover on demand; if you ever want to stop using the cloud, you would have to retrieve all data first
Huge cost optimization: in terms of both time and money | Implementation complexity
Optimal for volatile workflows: where the on-premises infrastructure is frequently removed because it is no longer needed (as in a VDI setup) | Potentially inconsistent data metrics: the Properties window of a folder might not show accurate data because it has no information about some of the files and folders inside
Decision
Time to first use is of great importance for mission-critical on-premises workflows. Since we also aim for the greatest possible cost optimization, we decided to implement dynamic population in Tiger Bridge.
Arguments
Even though the implementation was going to be difficult, we still felt it was the best thing to do for our clients. There are some risks involved, but we have the same workaround as with partial restoration: users can retrieve all of their data if they really need to.
However, this will not be required initially. Combined with the partial restore capability, dynamic population lets users start working with files almost immediately, as if they were local.
Conclusion
This was an easy decision for us to make because we are quite familiar with our clients’ needs.