Data Reorganization Using PowerShell and Robocopy
Background
As I have mentioned, I onboard new customers. That means I usually inherit problems. This solution was fun to come up with.
This customer purchased a new server and, eventually, a couple of NAS devices. My job was to migrate everything over to the new server, decommission the old one, and make some security improvements along the way.
The old server started life as a domain controller and file server. Along the way, drive space issues were resolved by adding and sharing several multi-terabyte USB drives. Each one was filled to near capacity before moving to the next. As a bonus, the old server was beginning to fail. If a drive got unplugged, it wouldn't come back unless the server was power cycled. And that might cause the network to fail, depending on the moon phase or something. Scary stuff. I could not trust or use the backups, for reasons.
The data consisted of folders for active and inactive customers, each containing hundreds or thousands of files. The files were mixed: lots of documents, spreadsheets, media files, huge 300 GB+ compressed files, and things in between. In all, I was looking at organizing and moving around 30 TB of data.
Preparation
The key to this whole operation was finding a way to make sense of data I had never seen before and had no context for. So I enlisted the help of their office administrator. You know the type, the “glue” that holds the whole operation together. The one behind the email, the phone, and often the front desk. She spent a week or two, maybe more, renaming folders on each drive, adding the word CLOSED to the name. Without this, I’d probably still be moving files.
While she was doing this, I did the normal migration stuff. I built the new server, added it to the domain, and cleaned up the network. I removed the extra DHCP server, fixed DNS and NTP, and checked the firewall for security issues. I joined the NAS devices to the domain so I could use AD groups to manage access to the shares. Then I created shares with familiar names.
Execution
To move the data, I created PowerShell scripts that analyzed each drive and built an exception list. I used Robocopy to move everything to its new location based on the exceptions and the keyword in the folder names. Given the fragile nature of the old server and the network, I operated on only one drive at a time and had to carefully throttle the speeds so that no one would complain and the server wouldn’t crash.
Here’s how I made it work:
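# File move based on name
# John Franklin 8/31/2023
$logDate = Get-Date -Format MM.dd.yyyy
$logFile = Join-Path $Home "Documents\Filecopy.$logDate.txt"
$fileSourceFolder = "C:\Storage\Testsrc"
$fileDestinationFolder = "C:\Storage\Testdest"
$pattern = "(copytest) *"
# Top-level folders whose names match the pattern are the ones to copy
$folders = @(Get-ChildItem $fileSourceFolder -Directory | Where-Object { $_.Name -like $pattern })
# Exception list: every other top-level folder, passed to /XD by full path
$exclude = @(Get-ChildItem $fileSourceFolder -Directory | Where-Object { $_.Name -notlike $pattern } | Select-Object -ExpandProperty FullName)
$exclude += "System Volume Information"
# One pass is enough because /XD already skips everything on the exception list.
# /MIR implies /E and also purges destination files that are missing from the source.
if ($folders.Count -gt 0) {
    robocopy $fileSourceFolder $fileDestinationFolder /XD $exclude /XF Thumbs.db /MIR /TEE /NP /MT:8 /W:1 /R:0 "/LOG:$logFile"
}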
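The script above uses /MT:8 and no bandwidth limit, so it does not show the speed throttling described earlier. As a rough sketch, and not the exact command used in this migration, Robocopy's /IPG switch (inter-packet gap, in milliseconds) is one way to slow a copy down, and /L gives a dry run that only lists what would be copied. The helper name Get-ThrottledCopyArgs, the 50 ms default, and the test paths are assumptions for illustration.

```powershell
# Hypothetical helper: builds a Robocopy argument list for a throttled copy.
function Get-ThrottledCopyArgs {
    param(
        [string]$Source,
        [string]$Destination,
        [int]$GapMs = 50,     # assumed starting value, tune it for your network
        [switch]$ListOnly     # adds /L so nothing is actually copied
    )
    # /MT is deliberately left out: Robocopy documents it as incompatible with /IPG.
    $copyArgs = @($Source, $Destination, '/E', '/XF', 'Thumbs.db',
                  '/NP', '/TEE', '/R:0', '/W:1', "/IPG:$GapMs")
    if ($ListOnly) { $copyArgs += '/L' }
    return $copyArgs
}

# Dry run first; repeat without -ListOnly once the listing looks right.
$dryRun = Get-ThrottledCopyArgs -Source 'C:\Storage\Testsrc' -Destination 'C:\Storage\Testdest' -ListOnly
robocopy @dryRun

# Robocopy exit codes of 8 or higher mean at least one failure.
if ($LASTEXITCODE -ge 8) {
    Write-Warning "Robocopy reported failures (exit code $LASTEXITCODE)."
}
```

A larger gap means a slower, gentler copy. Since /IPG rules out /MT, the trade-off is between multithreaded speed and a predictable load on a fragile server.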
How did it turn out? Well, I think it turned out great. The impact on the customer will be minimal: drive mappings will be updated to connect to the new locations while retaining familiar names. Instead of living on consumer-grade portable hard drives, the data now sits on dedicated RAID storage with space to grow and the option to upgrade down the road if they continue to expand. And, as part of the project, I improved their backup solution.
Bonus
Here’s how I figured out how much space was in use by the folders, based on a keyword:
$targetDrive = "C:\" # Change this to the drive you want to search
$keyword = "CLOSED"
# Top-level folders with the keyword anywhere in the name
$matchingFolders = Get-ChildItem -Path $targetDrive -Directory | Where-Object { $_.Name -like "*$keyword*" }
$totalSize = 0
foreach ($folder in $matchingFolders) {
    # -File keeps directories (which have no Length) out of the sum; -Force includes hidden files
    $folderSize = (Get-ChildItem -Path $folder.FullName -Recurse -File -Force -ErrorAction SilentlyContinue | Measure-Object -Property Length -Sum).Sum
    $totalSize += $folderSize
}
$humanReadableSize = "{0:N2} GB" -f ($totalSize / 1GB)
Write-Host "Total space consumed by folders with '$keyword' in their name: $humanReadableSize"
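A per-folder breakdown can also help when planning which folders to move first. The following is a minimal sketch along the same lines as the total above. The function name Get-KeywordFolderSizes and its parameters are made up for this example, not part of the original scripts.

```powershell
# Hypothetical helper: one row per matching top-level folder, largest first.
function Get-KeywordFolderSizes {
    param(
        [string]$Path,
        [string]$Keyword = 'CLOSED'
    )
    Get-ChildItem -Path $Path -Directory |
        Where-Object { $_.Name -like "*$Keyword*" } |
        ForEach-Object {
            $dir = $_
            # Sum the length of every file under this folder, skipping unreadable items
            $bytes = (Get-ChildItem -Path $dir.FullName -Recurse -File -Force -ErrorAction SilentlyContinue |
                Measure-Object -Property Length -Sum).Sum
            [pscustomobject]@{
                Folder = $dir.Name
                SizeGB = [math]::Round(($bytes / 1GB), 2)
            }
        } |
        Sort-Object -Property SizeGB -Descending
}

# Change the path to the drive you want to inspect
Get-KeywordFolderSizes -Path 'C:\' | Format-Table -AutoSize
```

Sorting largest first makes it easy to spot the folders that will dominate the copy time.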