How to migrate mass data (without business downtime)

Maintaining a silo with millions or even billions of documents is a huge challenge. At some point the need might arise to change the system configuration, the metadata and object model, or even the physical environment, be it the hardware or a software version upgrade.

Usually this does not seem like a big deal, since there are plenty of migration tools and products out there designed specifically for such a task. The obvious way is to put the system in read-only mode, make the necessary changes and/or upgrades, move the documents from A to B, and activate the new platform. (In fact, this is not as easy as one may think, but that is another story.)

But what happens if there is no option for a large maintenance window? No option for any system downtime at all? There may be situations where the system must be up and running nearly 24 hours a day, seven days a week…

The good news is that there are two approaches to meeting such requirements. The first one takes considerably more time than the second, but theoretically makes it possible to migrate without a single second of downtime.

The Delta Approach

With this approach the source system is continuously scanned for documents while users keep working on it. Documents are exported and transferred (migrated) into the new target system on a daily basis, in parallel to the running system. Depending on the amount of data this can take weeks, but since it does not affect daily business, the elapsed time hardly matters. Of course, we must keep in mind that two systems are running in parallel during this period.
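To make the scanning idea concrete, here is a minimal sketch of a single delta pass in Python. Everything in it is an assumption for illustration: the Repository interface and its methods (ids_modified_since, export, import_document) are hypothetical stand-ins for whatever adapter a migration tool provides on top of the vendor API (e.g. DFC for Documentum).

```python
from datetime import datetime, timezone
from typing import Iterable, Protocol

class Repository(Protocol):
    """Hypothetical minimal interface a repository adapter would provide."""
    def ids_modified_since(self, since: datetime) -> Iterable[str]: ...
    def export(self, object_id: str) -> dict: ...
    def import_document(self, doc: dict) -> None: ...

def run_delta_pass(source: Repository, target: Repository,
                   last_scan: datetime) -> datetime:
    """One scan/import cycle: copy everything changed since `last_scan`
    while users keep working on the source system. Returns this pass's
    start time, which becomes the cut-off for the next pass."""
    pass_started = datetime.now(timezone.utc)
    for object_id in source.ids_modified_since(last_scan):
        # Export metadata plus content, then create/update it in the target.
        target.import_document(source.export(object_id))
    return pass_started
```

Recording the start time of each pass before scanning ensures that changes made while a pass is running are picked up by the following pass rather than lost.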

When all documents have been transferred initially, another scan followed by an import is done. Depending on the size of the delta (in other words, how many changes have been made to the data in the meantime), additional scans and imports may be needed. The delta shrinks with each cycle, and once it is small enough, the very last scan and import can be done: for this, the source system may be set to read-only mode so the latest changes can be migrated quickly. This last delta migration is usually done on a weekend and may take only minutes. Once it is imported, the new system can go live and the source system can practically be switched off.
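Building on the sketch above, the shrinking-delta cycle could be driven by a simple loop. The cut-over threshold of 1000 objects is an arbitrary example value (in practice it is whatever volume the business can tolerate migrating inside a short freeze window), and set_read_only is a hypothetical call on the source adapter.

```python
def migrate_with_shrinking_deltas(source, target, initial_cutoff,
                                  cutover_threshold=1000):
    """Repeat delta passes until the remaining delta is small enough
    to fit into a short read-only window, then run the final pass."""
    cutoff = run_delta_pass(source, target, initial_cutoff)
    # Keep cycling while the delta is still too large for a quick freeze.
    while sum(1 for _ in source.ids_modified_since(cutoff)) > cutover_threshold:
        cutoff = run_delta_pass(source, target, cutoff)
    # The delta is now small: freeze the source (e.g. over a weekend),
    # migrate the last few changes, and the target can go live.
    source.set_read_only()  # hypothetical call on the source adapter
    run_delta_pass(source, target, cutoff)
```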

Depending on the overall size of the repository, this may of course take a lot of time in total, but the business downtime, if there is any at all, is kept as short as possible.

A slight adaptation of the concept described above, call it delta migration without content, is to copy only the metadata of an object but NOT its content. This speeds up the scan and import because the actual content stays untouched where it is; in the first stage only the objects with their metadata are “moved” (created). Finally, when the last delta of the objects has been imported, the content can either be moved manually to a new location or stay where it is. The corresponding “content location” attribute can be kept as-is or simply changed through transformation rules. This shortens the total migration time dramatically, and another huge advantage is that you can change the whole object type structure at the same time (e.g. alter the attribute set, generate new metadata, or even create new object types). migration-center actually offers exactly this method with its Documentum NCC (no content copy) adapter.
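The following sketch illustrates the metadata-only idea with a simple transformation rule, not the actual NCC adapter: export_metadata, create_object, the content_location attribute name, and the TYPE_MAPPING table are all hypothetical examples.

```python
# Hypothetical mapping from old object types to new ones.
TYPE_MAPPING = {"legacy_invoice": "fin_invoice"}

def migrate_metadata_only(source, target, object_ids,
                          old_store="filestore_01",
                          new_store="filestore_01"):
    """Copy only the metadata of each object; the content files stay
    where they are. A transformation rule keeps or rewrites the content
    pointer, and the attribute set can be reshaped in the same step."""
    for object_id in object_ids:
        meta = source.export_metadata(object_id)   # no content download
        # Rule 1: keep or rewrite the content location attribute.
        meta["content_location"] = meta["content_location"].replace(
            old_store, new_store)
        # Rule 2: map old object types to the new type structure.
        meta["object_type"] = TYPE_MAPPING.get(meta["object_type"],
                                               meta["object_type"])
        target.create_object(meta)  # object is created without copying content
```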

The Clone Approach

The second option is the Clone Approach; in special situations it can also be called the Clone-Delta Approach. This is the fastest way to migrate, or better said “copy”, your documents to a new platform version. The strategy is to clone the underlying database directly. The cloning incurs a downtime of the system that naturally lasts as long as the cloning takes. Once the database has been dumped, the original (cloned) application goes back into use, and the dump can be used to prepare the new environment without hurry.
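A minimal sketch of the freeze/dump/unfreeze sequence might look as follows. The pg_dump command is used purely as an illustration (a Documentum repository would typically sit on Oracle or SQL Server and use that vendor's export tooling instead), and the "content-server" service name is a hypothetical placeholder.

```python
import subprocess
from datetime import datetime

def clone_database(db_name: str, dump_dir: str = "/backup") -> str:
    """Dump the repository database so the copy can be restored on the
    new environment. The system is unavailable only for the duration
    of the dump itself."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dump_file = f"{dump_dir}/{db_name}_{stamp}.dump"
    # Downtime starts here: stop the application so the dump is consistent.
    subprocess.run(["systemctl", "stop", "content-server"], check=True)
    try:
        subprocess.run(["pg_dump", "--format=custom",
                        f"--file={dump_file}", db_name], check=True)
    finally:
        # Downtime ends: the original system goes back into production
        # while the dump is restored elsewhere without hurry.
        subprocess.run(["systemctl", "start", "content-server"], check=True)
    return dump_file
```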

The next step is to set up the new environment. If the hardware and/or software (OS) needs to be changed or upgraded, this is the right moment. When the new server and database are prepared, it is time to import the database dump and install the application components on the server. Any changes, tests, or customizations can be done as if one had all the time in the world, since the “old” (cloned) server is still being used in production. Once the target environment is finally prepared, up, and running, it can become the new productive system. Of course, if there have been changes on the current productive system in the meantime, a last delta migration should be done before the go-live. This can be done easily, as described above.
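As part of those tests, one cheap sanity check before go-live is comparing document counts between the two systems. The count_by_type call below is a hypothetical adapter method, and real validations would of course go deeper (checksums, spot-checking content and metadata).

```python
def verify_counts(source, target) -> bool:
    """Pre-go-live sanity check: compare document counts per object
    type between source and target; report any mismatches."""
    src_counts = source.count_by_type()  # hypothetical adapter call
    tgt_counts = target.count_by_type()
    mismatches = {t: (src_counts[t], tgt_counts.get(t, 0))
                  for t in src_counts
                  if src_counts[t] != tgt_counts.get(t, 0)}
    for obj_type, (src, tgt) in mismatches.items():
        print(f"{obj_type}: source={src} target={tgt}")
    return not mismatches
```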
