This article is part of our "Tiger Bridge - Under the hood" series, where we uncover technical details about our product, dive deep into the solutions it offers, and explain which decisions we had to make while designing it.
Dilemma
Once we decided to build a software-only solution, we could make it either cross-platform or specific to a single platform: Windows, Linux, or Mac.
Background
Accessing storage typically works like this:
The application requests that a file be opened. The request goes to the file I/O API, which in turn asks the file system to eventually read, for example, 2 bytes from the storage device.
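To make that path concrete, here is a minimal sketch in Python of an application asking the operating system to open a file and read 2 bytes from it. The helper name `read_first_bytes` and the temporary sample file are assumptions made for illustration; they are not part of Tiger Bridge.

```python
import os
import tempfile


def read_first_bytes(path: str, count: int = 2) -> bytes:
    """Open a file through the OS file API and read its first `count` bytes."""
    # os.open hands the request to the operating system's I/O API.
    # The file system then resolves the path and finds the data on the device.
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.read asks for at most `count` bytes from the open file.
        return os.read(fd, count)
    finally:
        os.close(fd)


# Create a throwaway file so the example is self-contained (illustrative only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    sample_path = tmp.name

first_two = read_first_bytes(sample_path)
os.remove(sample_path)
```

The application only sees the open and read calls. Everything below the I/O API is hidden from it, which is what leaves room for a component in the middle to decide where the bytes really come from.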
We wanted to make a hybrid solution that allowed us to access both the local storage and an alternative one, like the cloud. End-user applications should give users the ability to move data between the two seamlessly.
The question was how to achieve this, so we thought of the following options:
- Application-level access (what Apple presents natively). There are not many applications of that sort, and it is not easy to create a new one. Most of them were developed when there was no cloud at all. They can usually implement a hybrid solution with network devices but are generally slow to adopt the cloud.
- Alternative file system (in or out of the kernel). This system acts as a distributor, deciding whether to send the request from the application to the embedded local system or somewhere else.
- Developers like Apple have such an additional file system. It can be called in user mode, and any application can be made to use it. However, while going from user mode to the kernel is fast, the opposite is not. This is why the approach of this so-called “callback FS” requires sacrifices and offers limited functionality (only basic features like opening, browsing, and reading). You will also need to create a whole new file system, which is far from straightforward.
- Filtering approach – between the I/O API and the FS. When you are filtering, you decide which particular calls you want to alter and modify, leaving everything else as it is. This approach leaves you with the original functionalities of the file system – features which have been developed for ages. On top of that, you choose the ones you want to enhance. This results in maximum freedom and zero sacrifices.
The idea for the filtering approach goes back to antivirus software solutions. Even though the first of them scanned for viruses daily, this was not effective due to some viruses already being there when the application started. They had caused damage and it was difficult to get them out of the system. Modern antivirus solutions offer the so-called active protection whenever possible, which is essentially the same as the filtering approach – if you want to open a new file, it will be scanned by the software before it actually gets opened.
Statistically speaking, Windows has been attacked more than any other operating system, so it has developed an architecture that supports this kind of filtering.
There are other solutions at the storage or block level, but they are not covered in this discussion.
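The filtering idea can be sketched as a thin layer that intercepts only the calls it cares about and passes everything else through untouched. The Python model below is a user-space illustration under stated assumptions, not kernel code and not how Tiger Bridge is implemented. The class names, the `STUB` marker for a file whose data lives in the cloud, and the `fetch_from_cloud` callback are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical marker meaning "the real data is stored in the cloud".
STUB = b"<offline>"


class LocalFileSystem:
    """Stand-in for the platform's native file system (hypothetical)."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]

    def listdir(self) -> List[str]:
        return sorted(self.files)


class Filter:
    """Sits between the I/O API and the file system.

    Only `read` is intercepted. Every other call reaches the
    underlying file system unchanged.
    """

    def __init__(self, fs: LocalFileSystem,
                 fetch_from_cloud: Callable[[str], bytes]) -> None:
        self._fs = fs
        self._fetch = fetch_from_cloud

    def read(self, path: str) -> bytes:
        data = self._fs.read(path)
        if data == STUB:
            # Altered call: bring the data back and store it locally.
            data = self._fetch(path)
            self._fs.write(path, data)
        return data

    def __getattr__(self, name: str):
        # Called only for attributes Filter does not define itself,
        # so non-intercepted calls are forwarded as they are.
        return getattr(self._fs, name)


cloud = {"report.txt": b"quarterly numbers"}  # pretend cloud storage
fs = LocalFileSystem()
fs.write("notes.txt", b"local only")
fs.write("report.txt", STUB)

filtered = Filter(fs, cloud.__getitem__)
local_data = filtered.read("notes.txt")
cloud_data = filtered.read("report.txt")
names = filtered.listdir()
```

In this sketch `listdir` and `write` behave exactly as the native file system defines them, while `read` gains the extra step of retrieving cloud data. An antivirus scan on open would fit the same pattern: intercept one call, leave the rest alone.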
Pros/Cons Analysis
The option involving application-level access is the easiest to implement, but it is low-hanging fruit. Although it sacrifices functionality and potentially performance, a lot of vendors are sold on it because it works well in a cloud-first approach. If you will be accessing the cloud regularly, this solution does not present any problems. However, if you try implementing it in an on-premises-first workflow, as most of our clients do, it may become not only impractical but impossible to use. All sorts of mission-critical on-premises applications would simply be incompatible with such a solution.
The second option – alternative file system – is the most universal method, but also the most complicated one. Since it actually leads to the writing of a whole new file system, not everyone would be willing to proceed with it.
The third option allows you to filter or alter a single call, just like antivirus solutions do, or to alter almost all calls. You have the freedom to decide, but keep in mind that writing kernel code requires specialized knowledge that not everyone has.
Decision
Why did we choose Windows? Because, of the three platforms, it is the only one whose architecture supports the filtering approach.
And since the majority of our clients run mission-critical on-premises applications, we decided that the only option that currently works for us is the third one.
Arguments
If someone needs to create a solution for Mac, they need to proceed with the first option. If they want to make it for Linux, they should go with the second one. Windows provides maximum freedom while not sacrificing performance or functionality.
Taking a look at the business side of things, hybrid solutions are most valuable in those types of enterprise workflows where we see most clients using Windows.
Writing an entirely new file system would lead to instability, regardless of how many resources you put into the process. It wouldn’t work well unless you are a Microsoft-scale company with tons of expertise in writing and supporting new file systems.
The filtering approach gives the best of both worlds due to its usability and limitless functionality.
Conclusion
If the first option had worked for us, we would have created a cross-platform solution that supports Mac, Windows, and Linux. A lot of other vendors have done so in order to make their solutions universal. However, their clients are paying the price in lost functionality and performance. This is why we completely ruled out this scenario.
Although we have serious expertise in building file systems, writing a new one is not what we want to be doing.
The filtering approach is less risky both for us and our clients, and it explains why we still don’t have a Linux version.
The cloud is always accessed from user mode, not the kernel. Even when our solution is doing filtering, it communicates with a user mode component which then communicates with the cloud. From that perspective, our solution is just as good and fast for cloud communication as any other.
Everything changes when accessing the local storage. The problem with the first approach is that you go to user mode even if you will not be accessing the cloud. This adds latency to the process.
For on-premises-first hybrid solutions like the ones we have decided to support, the path to the local storage is important, and introducing latency there might prevent the application from working.
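One way to picture the difference is to count how often a local read has to change between user mode and kernel mode on its way to the storage device. The sketch below is a simplified model built on the descriptions in this article. The layer names and sequences are assumptions made for illustration, only the request path is modeled (not the return path), and no timings are implied.

```python
from typing import List, Tuple

KERNEL, USER = "kernel", "user"
Path = List[Tuple[str, str]]

# Assumed layer sequence for a local read with the filtering approach.
FILTER_LOCAL_PATH: Path = [
    ("application", USER),
    ("I/O API", KERNEL),
    ("filter", KERNEL),
    ("local file system", KERNEL),
    ("storage device", KERNEL),
]

# Assumed layer sequence when every request is routed through a
# user-mode component, even if the data is local.
USER_MODE_LOCAL_PATH: Path = [
    ("application", USER),
    ("I/O API", KERNEL),
    ("callback stub", KERNEL),
    ("user-mode component", USER),
    ("I/O API", KERNEL),
    ("local file system", KERNEL),
    ("storage device", KERNEL),
]


def mode_switches(path: Path) -> int:
    """Count every change between user mode and kernel mode along the path."""
    modes = [mode for _, mode in path]
    return sum(1 for a, b in zip(modes, modes[1:]) if a != b)


def kernel_to_user_switches(path: Path) -> int:
    """Count only the kernel-to-user changes, which the article calls slow."""
    modes = [mode for _, mode in path]
    return sum(1 for a, b in zip(modes, modes[1:])
               if a == KERNEL and b == USER)


filter_total = mode_switches(FILTER_LOCAL_PATH)
user_mode_total = mode_switches(USER_MODE_LOCAL_PATH)
filter_slow = kernel_to_user_switches(FILTER_LOCAL_PATH)
user_mode_slow = kernel_to_user_switches(USER_MODE_LOCAL_PATH)
```

Under these assumptions the filtered local read enters the kernel once and stays there, while the user-mode route adds a kernel-to-user hop and a second entry into the kernel before it reaches the local file system.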
With the third option, our solution is much broader and more functional when accessing the cloud. This aligns with our principles and core values.