Posts Tagged ‘Blob Storage’

Accessing properties of Azure blobs with PowerShell

July 13, 2015

In the previous articles in this series talking about Azure blob storage and PowerShell, I’ve covered uploading and downloading blobs, copying blobs, and deleting blobs. Now let’s talk about how to read the properties of a blob, and how to update them.

I’m not going to discuss all of the properties of a blob, or the properties of a blob’s Properties. (Yes, a blob has a property called Properties, and that property has properties of its own; it’s a little confusing.) If you want to know all of the properties (and Properties properties) of a blob, please check out this article that I wrote for RedGate, and then come back here to see how to read and modify them using PowerShell.

Setup

As we’ve seen before, first you need a storage account context. This is the same as in the previous posts. To define a storage account context, you need the storage account name and key, so set those variables first. You can get both from the Storage Account in the Azure Portal.

$StorageAccountName = "yourStorageAccountName"
$StorageAccountKey = "yourStorageAccountKey"

Now let’s set up the storage account context.

$ctx = New-AzureStorageContext -StorageAccountName $StorageAccountName `
         -StorageAccountKey $StorageAccountKey

Now we need to set the name of the container that holds the blob that we want to use. Set this to your container name, and specify the name of the blob. I’m going to use the same values I was using in the previous post.

$ContainerName = "images"
$BlobName = "gizmodo_groundhog_texting.jpg"

Get a reference and check the properties

To examine or modify a blob’s properties, first you have to get a reference to the blob through the Azure Blob Service. You can use the cmdlet Get-AzureStorageBlob to access the blob – the object it returns has an ICloudBlob property, which is typed as the generic ICloudBlob interface. You then have to cast that to a CloudBlockBlob object to access the properties and Properties of the actual blob.

$Blob = Get-AzureStorageBlob -Context $ctx -Container $ContainerName -Blob $BlobName

$CloudBlockBlob = [Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob] $Blob.ICloudBlob

Now that we have a reference to the CloudBlockBlob object that represents the actual blob, we can look at the properties using the IntelliSense in PowerShell. In the command window, type in $CloudBlockBlob and a period (.), and it will show all the properties you can access. You can type in the variable with the property to see the value ($CloudBlockBlob.BlobType), or use the Write-Host command as displayed here.

Write-Host "blob type = " $CloudBlockBlob.BlobType
Write-Host "blob name = " $CloudBlockBLob.Name
Write-Host "blob uri = " $CloudBlockBlob.Uri

[Screenshot: output of the Write-Host commands showing the blob type, name, and URI]

To see the Properties properties, fetch the attributes of the blob first. This will populate these properties.

$CloudBlockBlob.FetchAttributes()
Write-Host "content type = " $CloudBlockBlob.Properties.ContentType
Write-Host "size = " $CloudBlockBlob.Properties.Length

[Screenshot: output showing the blob’s content type and size]

The files were uploaded with a default content type of “application/octet-stream”. Let’s change the content type to “image/jpeg”. To do this, we can just assign a new value to the ContentType property, then call SetProperties to save the change. In the following example, I’m changing the content type, then using the Write-Host statements from above to see the new value.

$ContentType = "image/jpeg"
$CloudBlockBlob.Properties.ContentType = $ContentType
$CloudBlockBlob.SetProperties()

[Screenshot: output showing the updated content type]

Another important property is the metadata. This is a set of key/value pairs that you can attach to the blob. Here’s an example of how to set the metadata: assign a value for each key, save the changes by calling SetMetadata, and then display the results.

$CloudBlockBlob.Metadata["filename"] = "original file name"
#author is the key, "RobinS" is the value
$CloudBlockBlob.Metadata["author"] = "RobinS"
$CloudBlockBlob.SetMetadata()
$CloudBlockBlob.Metadata

[Screenshot: output showing the saved metadata key/value pairs]

To clear the metadata, you can call the Clear method on the Metadata, then save the changes.

#blank out the metadata
$CloudBlockBlob.Metadata.Clear()
#save the changes
$CloudBlockBlob.SetMetadata()
#view the saved data
$CloudBlockBlob.Metadata

[Screenshot: output showing the metadata is now empty]

Summary

In this post, I showed you how to access the properties of a blob in Azure Storage, and how to set the properties and the metadata. In the next post, I’ll show how to take a snapshot of a blob and then how to view the available snapshots for a blob.

Deleting and copying files in Azure Blob Storage with PowerShell

July 12, 2015

In my previous post, I showed you how to upload and download files to and from Azure blob storage using the Azure PowerShell cmdlets. In this post, I’ll show you how to delete blobs, copy blobs, and start a long-term asynchronous copy of a large blob and then check the operation’s status until it’s finished. I’m going to continue with the container and blobs that I used in the previous post.

Setup

First, you need a storage account context. This is the same as in the previous post. To define a storage account context, you need the storage account name and key, so set those variables first. You can get both from the Storage Account in the Azure Portal.

$StorageAccountName = "yourStorageAccountName"
$StorageAccountKey = "yourStorageAccountKey"

Now let’s set up the storage account context.

$ctx = New-AzureStorageContext -StorageAccountName $StorageAccountName `
         -StorageAccountKey $StorageAccountKey

Now we need to set the name of the container that holds the blob(s) that we want to delete. Set this to your container name.

$ContainerName = "images"

Delete a blob

To delete a blob, you need the storage context, container name, and blob name. So let’s set the blob name, and then use the PowerShell cmdlet to delete a blob.

$BlobName = "GuyEyeingOreos.png"

Remove-AzureStorageBlob -Blob $BlobName -Container $ContainerName -Context $ctx

Now get the list of blobs and you’ll see that the one you deleted is no longer there.

Get-AzureStorageBlob -Container $ContainerName -Context $ctx | Select Name

[Screenshot: blob listing with the deleted blob no longer present]
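By the way, you don’t have to delete blobs one at a time. You can pipe the output of Get-AzureStorageBlob straight into Remove-AzureStorageBlob. Here’s a sketch that deletes every blob whose name matches a wildcard – the “temp_” prefix is just an example I made up, and be careful, because everything the wildcard matches will be deleted.

#delete every blob in the container whose name matches the wildcard
#  (the "temp_" prefix is just an example)
Get-AzureStorageBlob -Container $ContainerName -Context $ctx -Blob "temp_*" | Remove-AzureStorageBlob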

Copy a blob

Now let’s copy a blob to another blob with a different name. We’ll use the Start-AzureStorageBlobCopy cmdlet to do this. The copy is done asynchronously. If it’s a huge file, we can check the progress of the copy. I’ll show you how to do that after this. Here, I’m going to copy one of my blobs to another blob, then get a list so I can see if it worked.

Set the blob name of the source blob, then set the blob name of the target blob. Then do the copy.

$BlobName = "gizmodo_groundhog_texting.jpg"
$NewBlobName = "CopyOf_" + $BlobName

Start-AzureStorageBlobCopy -SrcBlob $BlobName -SrcContainer $ContainerName `
        -DestContainer $ContainerName -DestBlob $NewBlobName -Context $ctx

Get-AzureStorageBlob -Container $ContainerName -Context $ctx | Select Name

[Screenshot: blob listing that now includes the CopyOf_ blob]

You can see the new blob in the listing now.

Copy a blob asynchronously and check its progress

What if you have a really large blob, like a VHD file, and you want to copy it from one storage account to another? Maybe you even want to copy it to a storage account in another region, so you can recreate the VM in that region?

You may have noticed that there’s not a synchronous Copy-Blob cmdlet per se – the cmdlet is Start-AzureStorageBlobCopy. This is because the blob copy runs asynchronously. For small blobs, you don’t generally need to check the status of a copy operation – it will probably finish before you have a chance to check the status, depending on the size. For large blobs, though, like a VHD file that is 127 GB, it will take a while to copy. So you want to be able to kick off the copy and then check the status later. Let’s see how to do that.

Set up the variables for the operation

I have a VHD file in the container “vhds” in one storage account and I’m going to copy it to a different storage account. Let’s start by setting the variables for the container name and blob name for the source. We will use the same blob name for the source and the destination.

$sourceContainer = "vhds"
$BlobName = "testasynccopysvc-testasynccopy-2015-06-16.vhd "

Next I’ll set up the name and key for the destination storage account. Fill these in with your own information.

$destStorageAccountName = "yourStorageAccountName"
$destStorageAccountKey = "yourStorageAccountKey"

We need a storage account context for the destination storage account.

$destctx = New-AzureStorageContext -StorageAccountName $destStorageAccountName `
        -StorageAccountKey $destStorageAccountKey

Let’s set up the destination container. I’ll set the name and then create the container.

$destContainer = "copiedvhds"
New-AzureStorageContainer -Name $destContainer -Context $destctx

Asynchronous copy and status check

Now let’s kick off the copy operation, then check the status by calling Get-AzureStorageBlobCopyState. You have to assign this to a variable representing the destination blob so you can check the status of the resulting blob. Notice that this copy command is a little bit different than the previous one – this one specifically provides the destination context because it’s different from the source context.

$BlobResult = Start-AzureStorageBlobCopy -SrcBlob $BlobName -SrcContainer $sourceContainer `
        -DestContainer $destContainer -DestBlob $BlobName -Context $ctx `
        -DestContext $destctx

$status = $BlobResult | Get-AzureStorageBlobCopyState
# show the status
$status

[Screenshot: copy state showing Status = Pending and 0 bytes copied]

This shows that it is just starting the copy operation – the bytes copied is 0. Wait a few seconds and check the status again…

[Screenshot: copy state showing the bytes copied is now greater than 0]

Now it shows the number of bytes copied is greater than 0.

This PowerShell code will loop continuously until the status is not “Pending”. It will retrieve the status, display it, sleep 10 seconds, then loop around and check the status again. When the status no longer equals “Pending”, it will exit the loop.

While ($status.Status -eq "Pending") {
    $status = $BlobResult | Get-AzureStorageBlobCopyState
    $status
    Start-Sleep 10
}
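If you want friendlier output than dumping the whole status object each time, here’s a variation on that loop (just a sketch) that uses the BytesCopied and TotalBytes properties of the copy state to display a percentage:

While ($status.Status -eq "Pending") {
    $status = $BlobResult | Get-AzureStorageBlobCopyState
    #calculate and display the percent complete
    $percent = [math]::Round(($status.BytesCopied / $status.TotalBytes) * 100, 1)
    Write-Host "Copying... $percent% complete"
    Start-Sleep 10
}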

When it’s completed, the final status will look like the following:

[Screenshot: final copy state showing Status = Success with all bytes copied]

You can see that it has copied all the bytes, and the copy operation was successful.

Summary

In this post, we went over how to delete a blob and how to copy a blob to another blob. We also learned how to do an asynchronous copy of a large blob from one storage account to another, and check the status repeatedly until it was finished. In my next post, we will learn how to take a snapshot of a blob using PowerShell.

Uploading and downloading files to Azure Blob Storage with PowerShell

July 8, 2015

In my previous post, I showed you how to set up PowerShell so you can use it to perform commands against blob storage. In this post, I’ll show you how to create a container in blob storage and then upload files from the local machine to blob storage, and how to download files from blob storage to the local machine.

Setup for transferring files

When using hardcoded values (such as storage account name), I tend to put them in variables, and then use the variable in the script. This makes it easier to change the hardcoded values, especially if they are used in multiple places, which makes your script more flexible for different storage accounts, containers, etc.

First, set up a folder with some pictures or files in it that you can upload. Then set up a folder to which you can download blobs. I’m going to use “D:\_Temp\_AzureFilesDownloaded” for my target directory for downloads, and “D:\_Temp\_AzureFilesToUpload” to hold pictures that I can upload.

Now run PowerShell ISE as an administrator like you did in the previous post. If PowerShell is on your task bar, you can right-click on it and select “Run ISE as administrator.” We’re going to write a script in the script window and run the commands as we need to. When we’re done, you’ll have a script you can use to upload and download files. I tend to do this because it’s good to create your own library of samples that you can reuse.

Next, let’s set up variable names for the storage account name and key. I’m going to use $StorageAccountName and $StorageAccountKey. (If you didn’t know it, the $ prefix indicates a variable name). Fill your storage account name and key in. You can get these from the Storage Account in the Azure Portal.

$StorageAccountName = "yourStorageAccountName"
$StorageAccountKey = "yourStorageAccountKey"

When you access a storage account from PowerShell, you need to use what’s called a context. You create this using the storage account name and key, and pass it in with each PowerShell command so it knows which account to use when running the command. Here’s the command:

$ctx = New-AzureStorageContext -StorageAccountName $StorageAccountName `
         -StorageAccountKey $StorageAccountKey

Note the backtick (`) after $StorageAccountName. This is the line-continuation character in PowerShell.

When accessing blobs, you have to specify which container the blob is in. So let’s specify a variable for container name. This can be a container that already exists or a new container (I’ll show you how to create a new container in a minute.)

$ContainerName = "images"

This is how you create a new container with public access to the blobs. If you try this and the container already exists, it will give you an error.

New-AzureStorageContainer -Name $ContainerName -Context $ctx -Permission Blob
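If you want the script to be re-runnable without that error, you can check for the container first and only create it when it’s missing. Here’s a minimal sketch using Get-AzureStorageContainer; the -ErrorAction setting keeps PowerShell from complaining when the container isn’t there yet.

#create the container only if it doesn't already exist
$container = Get-AzureStorageContainer -Name $ContainerName -Context $ctx `
        -ErrorAction SilentlyContinue
if ($container -eq $null)
{
    New-AzureStorageContainer -Name $ContainerName -Context $ctx -Permission Blob
}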

So now we have the storage context and the container name set up. Set a variable for the local file directory:

$localFileDirectory = "D:\_Temp\_AzureFilesToUpload\"

Upload blobs from local folder

One of the files in my folder is called “SnowyCabin.jpg”. I’m going to set a variable for $BlobName – the actual name of the blob – and then append it to $localFileDirectory to create the path to the local file. Then we’ll use the Set-AzureStorageBlobContent cmdlet to upload the file. To use that, you specify the path to the local file, the name of the container, the name of the blob, and the storage context.

$BlobName = "SnowyCabin.jpg"
$localFile = $localFileDirectory + $BlobName
Set-AzureStorageBlobContent -File $localFile -Container $ContainerName `
        -Blob $BlobName -Context $ctx

So now I’ve uploaded that one file. I can check blob storage and I’ll see the file there in the container called “images”. (To check blob storage, you can use one of the Azure Portals, Visual Studio’s Azure Explorer, or a storage explorer product like Azure Management Studio from Cerebrata.) I can also query the storage account for the blobs in that container using the cmdlet Get-AzureStorageBlob.

Get-AzureStorageBlob -Container $ContainerName -Context $ctx

[Screenshot: Get-AzureStorageBlob output showing the uploaded blob]

Let’s upload some more blobs and do another list. I’m going to upload 3 more pictures and then get a list of blobs in the container. This is the same PowerShell as above, repeated 3 times with different files. I’ve also added “ | Select Name” to the end of Get-AzureStorageBlob. That first character is a pipe symbol; it sends the output of Get-AzureStorageBlob to Select, which displays just the name of each blob.

$BlobName = "BluebellsAndBeechTrees.jpg"
$localFile = $localFileDirectory + $BlobName
Set-AzureStorageBlobContent -File $localFile -Container $ContainerName `
        -Blob $BlobName -Context $ctx

$BlobName = "gizmodo_groundhog_texting.jpg"
$localFile = $localFileDirectory + $BlobName
Set-AzureStorageBlobContent -File $localFile -Container $ContainerName `
        -Blob $BlobName -Context $ctx

$BlobName = "GuyEyeingOreos.png"
$localFile = $localFileDirectory + $BlobName
Set-AzureStorageBlobContent -File $localFile -Container $ContainerName `
        -Blob $BlobName -Context $ctx

Get-AzureStorageBlob -Container $ContainerName -Context $ctx | Select Name

[Screenshot: blob listing showing the names of all four uploaded blobs]

Download blobs to local disk

First, we need to set a variable for the target directory on the local machine.

$localTargetDirectory = "D:\_Temp\_AzureFilesDownloaded"

To download a blob, use the cmdlet Get-AzureStorageBlobContent. I’m going to download 3 of the files I uploaded.

$BlobName = "BluebellsAndBeechTrees.jpg"
Get-AzureStorageBlobContent -Blob $BlobName -Container $ContainerName `
        -Destination $localTargetDirectory -Context $ctx

$BlobName = "gizmodo_groundhog_texting.jpg"
Get-AzureStorageBlobContent -Blob $BlobName -Container $ContainerName `
        -Destination $localTargetDirectory -Context $ctx

$BlobName = "GuyEyeingOreos.png"
Get-AzureStorageBlobContent -Blob $BlobName -Container $ContainerName `
        -Destination $localTargetDirectory -Context $ctx
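If you’d rather pull down everything in the container instead of naming each blob, you can pipe the blob listing into Get-AzureStorageBlobContent. This is just a sketch, but it saves a lot of repetition.

#download every blob in the container to the local target directory
Get-AzureStorageBlob -Container $ContainerName -Context $ctx |
    Get-AzureStorageBlobContent -Destination $localTargetDirectory -Context $ctx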

Now if I look in that local directory, I see those files.

Summary

In this post, I showed you how to upload files to Azure blob storage from the local disk, and how to download files from Azure blob storage to the local disk. In my next post, I’ll show you how to delete a blob and how to copy a blob.

Azure Blob Storage, Click Once deployment, and recursive file uploads

July 17, 2014

In this blog post, I am going to show you how to upload a folder and all of its contents from your local computer to Azure blob storage, including subdirectories, retaining the directory structure. This can have multiple uses, but I want to call out one use that people still using Click Once deployment might appreciate.

I used to be the (only) Click Once MVP, and still blog about it once in a while. Click Once is a Microsoft technology that allows you to host the deployment of your client application, console application, or VSTO add-in on a file share or web site. When updates are published, the user picks them up automatically. This can be very handy for those people still dealing with these technologies, especially since Microsoft removed the Setup & Deployment package feature from Visual Studio after VS2010 and replaced it with a lame version of InstallShield (example of lameness: it wouldn’t let you deploy 64-bit applications). But I digress.

I wrote a blog article showing how you can host your Click Once deployment in Azure blob storage very inexpensively. (It’s even cheaper now.) The problem is you have to get your deployment up to Blob Storage, and for that, you need to write something to upload it, use something like Cerebrata’s Azure Management Studio, or talk the Visual Studio team and ClickOnce support into adding an option to the Publish page for you. I tried the latter — what a sales job I did! “Look! You can work Azure into Click Once and get a bunch of new Azure customers!” “Oh, that’s a great idea. But we have higher priorities.” (At least I tried.)

Having been unsuccessful with my rah! rah! campaign, I thought it would be useful if I just provided you with the code to upload a folder and everything in it. You can create a new Windows Forms app (or WPF or Console, whatever makes you happy) and ask the user for two pieces of information:

  • Local folder name where the deployment is published. (For those of you who don’t care about ClickOnce, this is the local folder that you want uploaded.)
  • The name of the Blob Container you want the files uploaded to.

Outside of a Click Once deployment, there are all kinds of uses for this code. You can store some of your files in Blob Storage as a backup, and use this code to update the files periodically. Of course, if you have an excessive number of files, you are going to want to run the code in a background worker and have it send progress back to the UI and tell it what’s going on.
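To give you an idea of what that might look like, here’s a rough sketch using a BackgroundWorker. The progressBar1 control is a hypothetical ProgressBar on the form, and GetListOfFilesToUpload and UploadFiles are the methods explained below.

// a sketch: run the upload off the UI thread and report progress
// (progressBar1 is a hypothetical ProgressBar control on the form)
BackgroundWorker worker = new BackgroundWorker();
worker.WorkerReportsProgress = true;
worker.DoWork += (s, e) =>
{
    List<string> files = GetListOfFilesToUpload(folderPath);
    int total = files.Count;
    for (int i = 0; i < total; i++)
    {
        //upload one file at a time so we can report progress between files
        UploadFiles(new List<string> { files[i] }, folderPath);
        worker.ReportProgress((i + 1) * 100 / total);
    }
};
worker.ProgressChanged += (s, e) => progressBar1.Value = e.ProgressPercentage;
worker.RunWorkerAsync();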

Show me the code

I’m going to assume you understand recursion. If not, check out the very instructive Wikipedia article. Or put the code in and just step through it. I think recursion is really cool; I once wrote a program in COBOL that would simulate recursion that went up to 10 levels deep. (For you youngsters, COBOL is not a recursive programming language.)

In your main method, you need to add all of the following code (up until the next section).

First, you need to set up your connection to the Azure Storage Account and to the blob container that you want to upload your files to. Assuming you have the connection string to your storage account, here’s what you need to do.

First you’re going to get an instance of the CloudStorageAccount you’re going to use. Next, you get a reference to the CloudBlobClient for that storage account. This is what you use to access the actual blob storage. And lastly, you get a reference to the container itself.

The next thing I always do is call CreateIfNotExists() on the container. This does nothing if the container exists, but it saves you the trouble of creating the container out in Blob Storage in whatever account you’re using if you haven’t already created it (or have forgotten to). If you add this, it ensures that the container exists and the rest of your code will run.

//get a reference to the container where you want to put the files, 
//  create the container if it doesn't exist
CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse(connectionString);
CloudBlobClient cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();
CloudBlobContainer cloudBlobContainer = cloudBlobClient.GetContainerReference(containerName);
cloudBlobContainer.CreateIfNotExists();

I also set the permissions on the container. If this code actually creates the container, the default access is private, and nobody will be able to get to the blobs without either using a security token with the URL or having the storage account credentials.

//set access level to "blob", which means user can access the blob 
//  but not look through the whole container
//  this means the user must have a URL to the blob to access it
BlobContainerPermissions permissions = new BlobContainerPermissions();
permissions.PublicAccess = BlobContainerPublicAccessType.Blob;
cloudBlobContainer.SetPermissions(permissions);

So now we have our container reference (cloudBlobContainer) set up and the container is ready for use.

Next we need to get a list of files to upload, and then we need to upload them. I’ve factored this into multiple methods. Here are the top commands:

List<string> listOfFiles = GetListOfFilesToUpload(folderPath);
string status = UploadFiles(listOfFiles, folderPath);

After this finishes, all of your files are uploaded. Let’s look at the methods called.

GetListOfFilesToUpload(string folderName)

This is the hard part – getting the list of files. This method calls the recursive routine. It starts by instantiating the list of files that will be the final result. Then it retrieves the list of files in the requested directory and adds them to the list. This is “the root”. Any files or directories in this root will be placed in the requested container in blob storage. [folderName] is the path to the local directory you want to upload.

//this is going to end up having a list of the files to be uploaded
//  with the file names being in the format needed for blob storage
List<string> listOfFiles = new List<string>();

//get the list of files in the top directory and add them to our overall list
//these will have no path in blob storage because they go in the root, so they will be like "mypicture.jpg"
string[] baseFiles = Directory.GetFiles(folderName);
for (int i = 0; i < baseFiles.Length; i++)
{
    listOfFiles.Add(Path.GetFileName(baseFiles[i]));
}

Files will be placed in the same relative path in blob storage as they are on the local computer. For example, if D:\zAzureFiles\Images is our “root” upload directory, and there is a file with the full path of “D:\zAzureFiles\Images\Animals\Wolverine.jpg”, the path to the blob will be “Animals/Wolverine.jpg”.

Next we need to get the directories under our “root” upload directory, and process each one of them. For each directory, we will call GetFolderContents to get the files and folders in each directory. GetFolderContents is our recursive routine. So here is the rest of GetListOfFilesToUpload:

//call GetFolderContents (the recursive routine) for each folder to retrieve everything under the top directory
string[] directories = Directory.GetDirectories(folderName);
for (int i = 0; i < directories.Length; i++)
{
    // an example of a directory: D:\zAzureFiles\Images\NatGeo (root is D:\zAzureFiles\Images)
    // topDir gives you just the directory name, which is NatGeo in this example
    string topDir = GetTopDirectory(directories[i]);
    //GetFolderContents is recursive
    List<string> oneList = GetFolderContents(directories[i], topDir);
    //you have a list of files with blob storage paths for everything under topDir (incl subfolders)
    //  (like topDir/nextfolder/nextfolder/filename.whatever)
    //add the list of files under this folder to the overall list
    //eventually, when it works its way back up to the top,
    //  it will end up with a complete list of files
    //  under the top folder, with relative paths
    foreach (string fileName in oneList)
    {
        listOfFiles.Add(fileName);
    }
}

And finally, return the list of files.

return listOfFiles;

GetTopDirectory(string fullPath)

This is a helper method that just pulls off the last directory. For example, it reduces “D:\zAzureFiles\Images\Animals” to “Animals”. This is used to pass the folder name to the next recursion.

private string GetTopDirectory(string fullPath)
{
    int lastSlash = fullPath.LastIndexOf(@"\");
    string topDir = fullPath.Substring(lastSlash + 1, fullPath.Length - lastSlash - 1);
    return topDir;
}

GetFolderContents(string folderName, string blobFolder)

This is the recursive routine. It returns a List<string> containing all of the files in and below the folderName passed in, which is the full path to the local directory being processed, like D:\zAzureFiles\Images\Animals\.

This is similar to GetListOfFilesToUpload; it gets a list of files in the folder passed in and adds them to the return object with the appropriate blob storage path. Then it gets a list of subfolders to the folder passed in, and calls GetFolderContents for each one, adding the items returned from the recursion in to the return object before returning up a level of recursion.

This sets the file names to what they will be in blob storage, i.e. the relative path to the root. So a file on the local computer called D:\zAzureFiles\Images\Animals\Tiger.jpg would have a blob storage path of Animals/Tiger.jpg.

returnList is the List<String> returned to the caller.

List<String> returnList = new List<String>();
            
//process the files in folderName, set the blob path
string[] fileLst = Directory.GetFiles(folderName);
for (int i = 0; i < fileLst.Length; i++)
{
    string fileToAdd = string.Empty;
    if (blobFolder.Length > 0)
    {
        //prefix the file name with its relative folder path, using the
        //  forward slash that blob storage expects as a path delimiter
        fileToAdd = blobFolder + "/" + Path.GetFileName(fileLst[i]);
    }
    else
    {
        fileToAdd = Path.GetFileName(fileLst[i]);
    }
    returnList.Add(fileToAdd);
}

//for each subdirectory in folderName, call this routine to get the files under each folder
//  and then get the files under each folder, etc., until you get to the bottom of the tree(s) 
//  and have a complete list of files with paths
string[] directoryLst = Directory.GetDirectories(folderName);
for (int i = 0; i < directoryLst.Length; i++)
{
    List<String> oneLevelDownList = new List<string>();
    string topDir = blobFolder + "/" + GetTopDirectory(directoryLst[i]);
    oneLevelDownList = GetFolderContents(directoryLst[i], topDir);
    foreach (string oneItem in oneLevelDownList)
    {
        returnList.Add(oneItem);
    }
}
return returnList;

UploadFiles(List<string> listOfFiles, string folderPath)

This is the method that actually uploads the files to Blob Storage. This assumes you have a reference to the cloudBlobContainer instance that we created at the top.

[listOfFiles] contains the files with relative paths to the root. For example “Animals/Giraffe.jpg”. [folderPath] is the folder on the local drive that is being uploaded. In our examples, this is D:\zAzureFiles\Images. Combining these gives us the path to the file on the local drive. All we have to do is set the reference to the location of the file in Blob Storage, and upload the file. Note – the FileMode.Open refers to the file on the local disk, not to the mode of the file in Blob Storage.

internal string UploadFiles(List<string> listOfFiles, string folderPath)
{
    string status = string.Empty;
    //now, listOfFiles has the list of files you want to upload
    foreach (string oneFile in listOfFiles)
    {
        CloudBlockBlob blob = cloudBlobContainer.GetBlockBlobReference(oneFile);
        string localFileName = Path.Combine(folderPath, oneFile);
        blob.UploadFromFile(localFileName, FileMode.Open);
    }
    status = "Files uploaded.";
    return status;
}

Summary

So you have the following:

  • The code for the calling routine that sets the reference to the cloudBlobContainer and makes sure the container exists. This calls GetListOfFilesToUpload and UploadFiles to, well, get the list of files to upload and then upload them.
  • GetListOfFilesToUpload calls GetFolderContents (which is recursive), and ultimately returns a list of the files as they will be named in Blob Storage.
  • GetFolderContents – the recursive routine that gets the list of files in the specified directory, then calls itself for each subdirectory it finds, all the way down the tree.
  • UploadFiles is called with the list of files to upload; it uploads them to the specified container.

If the files already exist in Blob Storage, they will be overwritten. For those of you doing ClickOnce, this means it will overlay the appname.application file (the deployment manifest) and the publish.htm if you are generating it.

One other note to those doing ClickOnce deployment – if you publish your application to the same local folder repeatedly, it will keep creating versions under Application Files. This code uploads everything from the top folder down, so if you have multiple versions under Application Files, it will upload them all over and over. You might want to move them or delete them before running the upload.

This post provided and explained the code for uploading a folder and all of its sub-items to Azure Blob Storage, retaining the folder structure. This can be very helpful for people using ClickOnce deployment and hosting their files in Blob Storage, and for anyone else wanting to upload a whole directory of files with minimal effort.

Moving Azure blob storage for the Rebranding/AzureSDK2.0 Project

July 13, 2013

In my last post, I talked about the work required to upgrade all of the cloud services to SDK 2.0 and rebrand “goldmail” as “pointacross” in the code. After all these changes were completed and turned over to QA, the real difficulties began – moving the data. Let’s talk about moving blob storage.

Storage accounts and data involved

We set up new storage accounts in US West with “pointacross” in the name, to replace the “goldmail” storage in US North Central. We decided to leave the huge amounts of diagnostics data behind, because we can always retrieve it later if we need to, but it’s not worth the trouble of moving it just to have it in the same Windows Azure storage tables and blob storage as the new diagnostics. So here’s what we had to move:

GoldMail main storage (includes the content libraries and diagnostics)
GoldMail Cloud Composer storage
GoldMail Saved Projects

When you use the PointAcross application (previously known as the Cloud Composer), you upload images and record audio. We store those assets, along with some other application information, in blob storage. This yields many benefits, but primarily it means you can work on a PointAcross project logged in from one computer, then go somewhere completely different and log in, and all of your assets are still there. You can also save your project after you publish your message, or while working on it – these go into the Saved Projects storage. Those two storage accounts have a container for each customer who has used that application or feature. Fortunately, we’re just in the beginning phases of releasing the PointAcross project widely, so there are only about a thousand containers in each storage account.

The main storage includes a bunch of data we use here and there, and the content library assets. There are a lot of files, but only about 20 containers.

Option 1: Using AzCopy

So how do we move this data? The first thing we looked at was AzCopy, from the Windows Azure Storage team. This is a handy-dandy utility that you run from the command window, and we used it to test migrating the data from one storage account to the other. Here’s the format of the command:

AzCopy [1]/[2] [3]/[4] /sourcekey:[5] /destkey:[6] /S

    [1] is the URL to the source storage account, like http://mystorage.blob.core.windows.net
    [2] is the container name to copy from
    [3] is the URL to the target storage account, like http://mynewstorage.blob.core.windows.net
    [4] is the container name to copy to
    [5] is the key to the source storage account, either primary or secondary
    [6] is the key to the destination storage account, either primary or secondary
    /S means to do a recursive copy, i.e. include all folders in the container.

I set up a BAT file called DoTheCopy and substituted %1% for the container name in both places, so I could pass it in as a parameter. (For those of you who are too young to know what a BAT file is, it’s just a text file with a file extension of .bat that you can run from the command line.) This BAT file had two lines and looked like this (I’ve chopped off the keys in the interest of space):

ECHO ON
AzCopy http://mystorage.blob.core.windows.net/%1% http://mynewstorage.blob.core.windows.net/%1%

I called this for each container:

E:\AzCopy> DoTheCopy containerName

I got tired of this after doing it twice (I hate repetition), so I set up another BAT file to call the first one repeatedly; its contents looked like this:

DoTheCopy container1
DoTheCopy container2
DoTheCopy container3

The AZCopy application worked really well, but there are a couple of gotchas. First, when it sets up the container in the target account, it makes it private. So if you want the container to be private but the blobs inside to be public, you have to change that manually yourself. You can change it programmatically or using an excellent tool such as Cerebrata’s Azure Management Studio.

The second problem is that those other two storage accounts have over a thousand containers each. So now I either have to type all of those container names in (not bloody likely!), or figure out a way to get a list of them. So I wrote a program to get the list of container names and generate the BAT file. This creates a generic list full of the command lines, converts it to an array, and writes it to a BAT file.

//sourceConnectionString is the connection string pointing to the source storage account
CloudStorageAccount sourceCloudStorageAccount = 
    CloudStorageAccount.Parse(sourceConnectionString);
CloudBlobClient sourceCloudBlobClient = sourceCloudStorageAccount.CreateCloudBlobClient();
List<string> outputLines = new List<string>();
IEnumerable<CloudBlobContainer> containers = sourceCloudBlobClient.ListContainers();
foreach (CloudBlobContainer oneContainer in containers)
{
    string outputLine = string.Format("DoTheCopy {0}", oneContainer.Name);
    outputLines.Add(outputLine);
}
string[] outputText = outputLines.ToArray();
File.WriteAllLines(@"E:\AzCopy\MoveUserCache.bat", outputText);

That’s all fine and dandy, but what about my container permissions? So I wrote a program to run after the data was moved. This iterates through the containers and sets the permissions on every one of them. If you want any of them to be private, you have to hardcode the exceptions, or fix them after running this.

private string SetPermissionsOnContainers(string dataConnectionString)
{
  string errorMessage = string.Empty;
  string containerName = string.Empty;
  try
  {
    CloudStorageAccount dataCloudStorageAccount = CloudStorageAccount.Parse(dataConnectionString);
    CloudBlobClient dataCloudBlobClient = dataCloudStorageAccount.CreateCloudBlobClient();

    int i = 0;

    IEnumerable<CloudBlobContainer> containers = dataCloudBlobClient.ListContainers();
    foreach (CloudBlobContainer oneContainer in containers)
    {
      i++;
      //hang on to the name so the catch block can report which container failed
      containerName = oneContainer.Name;
      System.Diagnostics.Debug.Print("Processing container #{0} called {1}", i, containerName);

      CloudBlobContainer dataContainer = dataCloudBlobClient.GetContainerReference(containerName);
      //set permissions
      BlobContainerPermissions permissions = new BlobContainerPermissions();
      permissions.PublicAccess = BlobContainerPublicAccessType.Blob;
      dataContainer.SetPermissions(permissions);
    }
  }
  catch (Exception ex)
  {
    errorMessage = string.Format("Exception thrown trying to change permission on container '{0}' "
        + "= {1}, inner exception = {2}",
      containerName, ex.ToString(),
      ex.InnerException == null ? "(none)" : ex.InnerException.ToString());
  }
  return errorMessage;
}

Option 2: Write my own solution

Ultimately, I decided not to use AzCopy. By the time I’d written this much code, I realized it was just as easy to write my own code to move all of the containers from one storage account to another, setting the permissions as it iterated through the containers, and I could add trace logging so I could see the progress. I could also hardcode exclusions if I wanted to. Here’s the code for iterating through the containers. When getting the list of containers, if it is the main account, I only want to move specific containers, because I moved some static ones ahead of time. So for that condition, I just set up a hardcoded list of container names to process. For the other accounts, it retrieves a list of all containers and processes all of them.

private string CopyContainers(string sourceConnectionString, string targetConnectionString, 
  string accountName)
{
  string errorMessage = string.Empty;
  string containerName = string.Empty;
  try 
  {
    CloudStorageAccount sourceCloudStorageAccount = CloudStorageAccount.Parse(sourceConnectionString);
    CloudBlobClient sourceCloudBlobClient = sourceCloudStorageAccount.CreateCloudBlobClient();
    CloudStorageAccount targetCloudStorageAccount = CloudStorageAccount.Parse(targetConnectionString);
    CloudBlobClient targetCloudBlobClient = targetCloudStorageAccount.CreateCloudBlobClient();

    int i = 0;
    List<string> containersToDo = new List<string>();
    if (accountName == "mainaccount")
    {
      containersToDo.Add("container1");
      containersToDo.Add("container2");
      containersToDo.Add("container3");

      foreach (string oneContainer in containersToDo)
      {
        i++;
        //hang on to the name so the catch block can report which container failed
        containerName = oneContainer;
        System.Diagnostics.Debug.Print("Processing container #{0} called {1}", i, containerName);
        MoveBlobsInContainer(containerName, accountName, targetCloudBlobClient, sourceCloudBlobClient);
      }
    }
    else
    {
      IEnumerable<CloudBlobContainer> containers = sourceCloudBlobClient.ListContainers();
      foreach (CloudBlobContainer oneContainer in containers)
      {
        i++;
        containerName = oneContainer.Name;
        System.Diagnostics.Debug.Print("Processing container #{0} called {1}", i, containerName);
        MoveBlobsInContainer(containerName, accountName, targetCloudBlobClient, sourceCloudBlobClient);
      }
    }
  }
  catch (Exception ex)
  {
    errorMessage = string.Format("Exception thrown trying to move files for account '{0}', " +
      "container '{1}' = {2}, inner exception = {3}",
      accountName, containerName, ex.ToString(),
      ex.InnerException == null ? "(none)" : ex.InnerException.ToString());
  }
  return errorMessage;
}

And here’s the code that actually moves the blobs from the source container to the destination container.

private string MoveBlobsInContainer(string containerName, string accountName, 
  CloudBlobClient targetCloudBlobClient, CloudBlobClient sourceCloudBlobClient)
{
  string errorMessage = string.Empty;
  try
  {
    long totCount = 0;
    //first, get a reference to the container in the target account, 
    //  create it if needed, and set the permissions on it 
    CloudBlobContainer targetContainer = 
      targetCloudBlobClient.GetContainerReference(containerName);
    targetContainer.CreateIfNotExists();
    //set permissions
    BlobContainerPermissions permissions = new BlobContainerPermissions();
    permissions.PublicAccess = BlobContainerPublicAccessType.Blob;
    targetContainer.SetPermissions(permissions);

    //get list of files in sourceContainer, flat list
    CloudBlobContainer sourceContainer = 
      sourceCloudBlobClient.GetContainerReference(containerName);
    foreach (IListBlobItem item in sourceContainer.ListBlobs(null, 
      true, BlobListingDetails.All))
    {
      totCount++;
      System.Diagnostics.Debug.Print("Copying container {0}/blob #{1} with url {2}", 
        containerName, totCount, item.Uri.AbsoluteUri);
      CloudBlockBlob sourceBlob = sourceContainer.GetBlockBlobReference(item.Uri.AbsoluteUri);
      CloudBlockBlob targetBlob = targetContainer.GetBlockBlobReference(sourceBlob.Name);
      targetBlob.StartCopyFromBlob(sourceBlob);
    }
  }
  catch (Exception ex)
  {
    errorMessage = string.Format("Exception thrown trying to move files for account '{0}', "
      + "container '{1}' = {2}, inner exception = {3}",
        accountName, containerName, ex.ToString(),
        ex.InnerException == null ? "(none)" : ex.InnerException.ToString());
  }
  return errorMessage;
}

You can hook this up to a fancy UI and run it in a background worker and pass progress back to the UI, but I didn’t want to spend that much time on it. I created a Windows Forms app with one button. When I clicked the button, it ran some code that set the connection strings and called CopyContainers for each storage account.
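For the record, the click handler was nothing fancy. Here’s a sketch of the shape of it – the connection string variables and the account names are placeholders, not the real production values.

// a sketch of the one-button UI: set the connection strings (placeholders here)
//   and call CopyContainers once per storage account being moved
private void btnMoveStorage_Click(object sender, EventArgs e)
{
    string errorMessage = CopyContainers(sourceMainConnString, targetMainConnString, "mainaccount");
    if (string.IsNullOrEmpty(errorMessage))
        errorMessage = CopyContainers(sourceComposerConnString, targetComposerConnString, "composeraccount");
    if (string.IsNullOrEmpty(errorMessage))
        errorMessage = CopyContainers(sourceProjectsConnString, targetProjectsConnString, "projectsaccount");
    MessageBox.Show(string.IsNullOrEmpty(errorMessage) ? "All containers moved." : errorMessage);
}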

Did it work?

When we put all of the services in production, our Sr. Systems Engineer, Jack Chen, published all of the services to PointAcross production, and I ran this to move the data from the goldmail storage accounts to the pointacross storage accounts. It worked perfectly. The only thing left at that point was moving the Windows Azure SQL Databases (the database previously known as SQL Azure).