Formulating a data migration strategy

Netshock discusses data migration

I am currently working on a data migration strategy, which got me thinking – what are the components of a solid, well thought out migration strategy, that ensures all parties are kept informed throughout the process and that the migration is a success? I’d like to share my thoughts on this topic throughout this post:

Whenever I talk to my technical friends and colleagues they tell me that data migrations are easy, you just lift and shift the data from one location to another. Hmmm…. yes, the whole technical side of the migration may well be that straightforward, but how do you know that you’re moving the right ‘stuff’ and how do you know that it is going to work as expected at the other end?

In my mind, the best migration strategies have a very heavy analysis phase, which should take into consideration a number of things:

What data do you currently have and what systems, departments or people utilize that data? You may well find that you have a table in your database that collects data that you feel isn’t very valuable or important – that doesn’t mean that someone else in the business isn’t using the data for something. As part of the analysis phase of your migration, it makes sense to check whether anyone is using that data before deciding whether you should migrate it or not.

Once you know what data you have and who is using that data, you can start to consider whether you need to move it to the new system and if you do, do you need to take all of it (e.g. do you need to take 10 years worth of data), or can you take 1 month of data? By doing this, you stand a chance of scaling down the migration, making it faster and more straightforward.

Now that you have a definitive list of the data you are going to use, you’ll need to embark on a field mapping exercise. To describe what this is, it’s easiest for me to use the example of Microsoft Excel.

Let’s say that you have two Excel workbooks – one called old.xls and one called new.xls. In the old Excel file, customer name is stored in column B whereas the new Excel file stores customer name in column D. You need to know this when moving data from one Excel sheet to another, otherwise you’d end up with scrambled data that makes no sense! The same is true of all database and system migrations – unless the structure on both sides is identical.

So where are we now? We now know what data we are going to migrate, how much of that data we’re going to migrate and where in the new system we’re going to migrate it to. Once it’s in the new system, how are we going to test that the migration went through with no issues?

This is where you’ll need to start thinking about writing test cases which are essentially scenarios that you’ll need to test to ensure the data output is as expected. For example, one test case might be to run ‘report A’ in both the old and new systems for the same date range. Once you have the report from both systems, you’ll be able to compare the outputs. If there is a variance, it gives you the opportunity to identify the cause of the issue and make adjustments accordingly.

Once you are confident you have the correct data in the new system, you’ll need to consider how you’re going to let everyone know what is going to happen – after all, there is no better person to test the system than someone that uses it all the time. So, you’ll want to communicate with the users, product owners and anyone else that has an interest in the system. The communication should include:

  • What you’re doing
  • Why you’re doing it
  • Any outages or issues that may be faced
  • Any testing you’ll need their help with

By communicating with users, you’ll also be able to get a better understanding of how the data held within the system is used by each of the different parties. As part of this, you can use the migration as an opportunity to audit user access rights too. This will ensure that only authorised individuals can access data held within the new system.

Image used under creative commons

Tagged under: