Apply rain cell tracking to 20 years of rain radar data over Germany

So, I downloaded the 20-year radar data set (Mar to Nov) from the German meteorological service (DWD). It contains about 1.5 million individual radar-derived precipitation fields, one every 5 minutes. I was very excited to run rain cell tracking on it. After handing in my PhD thesis, I went on a short trip to visit family and friends in Germany. Fortunately, I have a home server, so I decided to let it do the heavy work while I was on holiday. It took the full ten days of my absence plus two extra days for the 8-core Intel CPU machine to process all the data. So, let’s see how I did it.

This post is part of the germanRADARanalysis project.


How to process many gridded climate data files in parallel with find, xargs and cdo

Climate data often comes in NetCDF format, and most of the time we have to deal with a large number of files, for instance when the data is split into one file per year. So, what can we do if we want to process all files in the same way?

Luckily, there are tools to accomplish this task easily and even improve the performance by parallel execution. Here, I will show you a simple way to do this. In this example, I will download a small part of a global climate data set and extract a region from it. It’s just a one-liner.
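
To give a rough idea of how find, xargs and cdo fit together, here is a minimal sketch. It is not necessarily the exact one-liner from the full post: the function name `extract_region`, the bounding box (roughly covering Germany), the directory names and the number of parallel jobs are all assumptions chosen for illustration. The cdo operator `sellonlatbox,lon1,lon2,lat1,lat2` cuts out a longitude/latitude box.

```shell
# Illustrative sketch; function name, bounding box and job count are assumptions.
# Usage: extract_region INPUT_DIR OUTPUT_DIR
extract_region() {
  in_dir=$1
  out_dir=$2
  mkdir -p "$out_dir"

  # find lists the NetCDF files, NUL-separated so odd file names are safe.
  # xargs starts up to 4 processes at a time, one per input file.
  # Inside sh -c: $0 is the cdo binary, $1 the input file, $2 the output directory.
  find "$in_dir" -maxdepth 1 -type f -name '*.nc' -print0 |
    xargs -0 -P 4 -I{} sh -c \
      '"$0" sellonlatbox,5,16,47,55 "$1" "$2/$(basename "$1")"' \
      "${CDO:-cdo}" {} "$out_dir"
}
```

With cdo installed, `extract_region data region` would write one cropped file per input file into `region/`. The `CDO` variable is only there so that a different binary can be substituted, for example when trying the function out without cdo.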


How to run multiple instances of a program with different input data in parallel with xargs

Sometimes, we might want to run a single program on many input files, for example to resize or crop a large number of photos to the same dimensions. The long way would be to open the first file in a photo editing program and resize it manually, then open the next one, do it again, and so on. But of course we are faster if we automate the task and, most importantly, use multiple processor cores for parallel processing. Certainly, there are specialized software tools that will batch-resize images for you (most likely not in parallel, though). Here, however, I will demonstrate a structured and flexible way of applying a program to multiple files in parallel on the command line in a Linux environment using xargs.
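
As a small taste of the idea, the following sketch wraps find and xargs in a shell function. The function name `process_in_parallel` and its argument order are hypothetical, made up for this illustration; the key parts are `-n 1` (one file per program call) and `-P` (number of calls running at the same time).

```shell
# Hypothetical helper: run a command once per matching file, several at a time.
# Usage: process_in_parallel DIR PATTERN JOBS COMMAND [ARGS...]
process_in_parallel() {
  dir=$1
  pattern=$2
  jobs=$3
  shift 3   # the remaining arguments are the command to run

  # find prints matching files NUL-separated; xargs appends one file name
  # to the command per invocation and keeps up to $jobs invocations running.
  find "$dir" -maxdepth 1 -type f -name "$pattern" -print0 |
    xargs -0 -n 1 -P "$jobs" "$@"
}

# Example (assumes ImageMagick is installed; mogrify overwrites files in place):
#   process_in_parallel photos '*.jpg' 4 mogrify -resize 800x600
```

Because the command is passed in as ordinary arguments, the same function works for any program that takes the file name as its last argument.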
