Are there any practical limits to file virtualisation or caveats to be aware of?
Scalability is a major issue to consider, and there are several dimensions to it. You might increase the number of storage systems or NAS filers in the environment, the number of file systems or the number of files, or the number of servers making storage requests and the volume of I/O operations they generate. However you scale, the file virtualisation platform should maintain adequate levels of performance and availability.
Another issue is interoperability. The trick is to ensure that your file virtualisation platform can work with existing and future storage systems or switches. For example, an NFS-only environment may not be concerned with CIFS compatibility, but some organisations may require support for both.
Test the platform in your own environment and pose the hard questions to your vendors. Every vendor claims to support an unlimited number of devices or unlimited capacity; that's common. But press them for specific limits in terms of storage, file systems, connections and so on. What will they certify and support when you call for help? What will they support in writing?
Are there any problems or issues involved in 'backing out' of file virtualisation? If so, what steps can users take to minimise their headaches?
Removing virtualisation can be highly disruptive to file servers and NAS platforms, but it really depends on what you're using virtualisation for.
As an example, file aggregation can change the way that data is organised, so you may need to unload all of the data, remove the file virtualisation layer and then reformat and reload all of your data from scratch. By comparison, data migration is one of the most transparent file virtualisation functions. You can install the virtualisation and move the data to a new location or storage system. Afterwards, you can remove the virtualisation layer or turn virtualisation off until it's needed again.
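Why migration backs out so cleanly can be sketched with a toy path-mapping table: clients resolve a logical name through the layer, data moves underneath without that name changing, and once clients are repointed at the final location the layer can simply be removed. The class name and API below are hypothetical, not any vendor's product:

```python
import shutil
from pathlib import Path


class MigrationLayer:
    """Minimal sketch of a virtualisation layer used only for data migration.

    Hypothetical API: a logical name maps to a physical location, and
    migration moves the data while repointing the name.
    """

    def __init__(self) -> None:
        self._map: dict[str, Path] = {}  # logical name -> physical location

    def publish(self, name: str, physical: Path) -> None:
        self._map[name] = physical

    def resolve(self, name: str) -> Path:
        return self._map[name]

    def migrate(self, name: str, new_home: Path) -> None:
        """Move the data to new_home and repoint the logical name."""
        new_home.mkdir(parents=True, exist_ok=True)
        old = self._map[name]
        target = new_home / old.name
        shutil.move(str(old), str(target))
        self._map[name] = target  # clients still resolve the same name
```

After the migration completes, clients can be pointed directly at `layer.resolve(name)` and the layer retired, which is why this use case is far less disruptive to back out of than aggregation.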
Replication falls somewhere in between these two extremes. If the virtualisation layer is required to successfully access the replicated data, removing the file virtualisation layer can prove very disruptive unless the replicated data is in a format that applications can work with. It's important to test "back-out" procedures to weigh the disruption involved and determine whether key data will be left stranded.
Can you clarify data movement and migration? How do HSM and ILM differ from file virtualisation and data movement, or are they the same?
It really depends on your definition of file virtualisation. If the definition is broad enough, it might include hierarchical storage management (HSM) capabilities. Embracing information lifecycle management (ILM) is a bit trickier, because ILM is already so broadly defined.
Generally speaking, though, one of the basic premises of HSM is the ability to archive or move data off to another location. HSM leaves a stub file where the data was so that the application "thinks" the data is still there. At the same time, we define virtualisation as having emulation, aggregation and abstraction attributes. Well, we've emulated the file's presence, and I'd say that falls into the realm of virtualisation.
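The stub-file behaviour described above can be sketched in a few lines: archiving moves the data elsewhere and leaves a marker file behind, and reads transparently follow the marker. The marker format and function names here are invented for illustration, not a real HSM product's format:

```python
import shutil
from pathlib import Path

STUB_MARKER = "HSM-STUB:"  # hypothetical marker identifying a stub file


def archive(path: Path, archive_dir: Path) -> None:
    """Move a file to archive storage and leave a stub in its place."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / path.name
    shutil.move(str(path), str(target))
    # The stub records where the data really lives; to the application,
    # a file with the original name still appears to be present.
    path.write_text(f"{STUB_MARKER}{target}")


def read(path: Path) -> bytes:
    """Read a file, transparently following an HSM stub if present."""
    data = path.read_bytes()
    if data.startswith(STUB_MARKER.encode()):
        real_path = Path(data[len(STUB_MARKER):].decode())
        return real_path.read_bytes()  # recall the data from archive storage
    return data
```

The key point the sketch illustrates is the emulation attribute: the application keeps opening the original path, unaware the bytes now live somewhere else.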
This is slightly different from archiving (aka data movement) in which the files and directories are moved completely.