Posts Tagged ‘Logging’

Implementing SLAB in an Azure service

July 19, 2014

In my previous post,  I talked about design considerations when using SLAB, and presented a class to be changed from regular trace logging to semantic logging using SLAB. In this post, I’m going to show you how to set up the event logging with SLAB, how to write the entries to Windows Azure Table Storage, and how to implement the design to change the class to use SLAB instead of regular trace logging.

To implement SLAB, first you need to download the NuGet package and install it. So in Visual Studio, you can click Solution / Tools / NuGet Package Manager / Manager Packages and search for “semantic logging application block”. You will get several entries – one for the EventSource analyzer, one for each sink, one for the out-of-process service, and one that is just the semantic logging application block. Select the last one and the Azure Sink, which will allow us to write our results to Table Storage.

image_thumb[2]

Now we need to add a class that inherits from EventSource. I’m going to call it CustomerSvcEventSource. In the bare bones of the class, I will add a static method to call to Log events, a static field instance to provide access to an instance of the class, and a private constructor. Here’s the bare bones of the class.

public class CustomerSvcEventSource : EventSource
{        
  //create a static field instance that provides access to an instance of this class
  private static readonly Lazy<CustomerSvcEventSource> Instance
    = new Lazy<CustomerSvcEventSource>(() => new CustomerSvcEventSource());

  //create a static property called Log that returns the current value 
  //of the Instance field of the event source. This value is 
  //determined by the custom event methods called in the Customerservices application.
  private CustomerSvcEventSource() { }

  public static CustomerSvcEventSource Log
  {
    get { return Instance.Value; }
  }
}

The next thing we need to do is to set up the listener. This needs to start up when the application starts up, and close when the application closes. Since this is a WCF service, I’m going to put this in the Application_Start method and the Application_End method in Global.asax.cs.

I need these two using clauses:

using Microsoft.Practices.EnterpriseLibrary.SemanticLogging;
using Microsoft.WindowsAzure.ServiceRuntime;

I define the listener in the Global.asax.cs class:

public ObservableEventListener listener;

Now put the code to start up the listener in Application_Start. “CustomerSvc” is the name of the listener. The

“RoleEnvironment…” gets the connection string to the storage account I want to write the logs to. In EnableEvents, I provide the name of the Event Source log method, the logging level, and keywords. You can have your listener just write errors by changing the EventLevel, or just write specific Keywords. you can also set up multiple listeners and send the results of each to different tables. (You can specify the table name when instantiating the listener.)

//slab listener
listener = new ObservableEventListener();
listener.LogToWindowsAzureTable("CustomerSvc", 
  RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
listener.EnableEvents(CustomerSvcEventSource.Log, EventLevel.LogAlways, Keywords.All);

Put the code to stop the listener in Application_End:

if (listener != null)
{
  listener.DisableEvents(CustomerSvcEventSource.Log);
  listener.Dispose();
}

Now we can go back and update our Event Source with keywords, tasks, opcodes, and then finally the actual events. Here are the results of our analysis in the previous post on what events we’d like to log, including the keywords, tasks, and logging levels:

  • Database retry errors
    • Event id 101, Keyword=Database, Task=DatabaseRetry, Level = Error
  • Database messages that are errors/exceptions
    • Event id 102, Keyword=Database, Task=[DatabaseRead or DatabaseWrite], Level = Error
  • Database messages that are informational
    • Event id 103, Keyword=Database, Level = Informational
  • Errors with the input parameters
    • Event id 104, Keyword = ParameterError, Level = Warning
  • GetCustomerFavorites Entry Point (with params in)
    • Event id 105, Keyword = EntryPoint, Level = Informational
  • Output parameters (i.e. results)
    • Event id 106, Level = Informational
  • General exception
    • Event id 107, Level = Error

Let’s set up the Keywords. You can have up to 64, and they have to be in a nested class in the EventSource class. Each keyword value is a 64-bit integer, which is treated as a bit array, so they all have to be powers of 2 – 1, 2, 4, 8, 16, 32, etc.

From the information above, we have 3 keywords. I’m adding this in to our EventSource class:

public class Keywords
{
  public const EventKeywords Database = (EventKeywords)1;
  public const EventKeywords ParameterError = (EventKeywords)2;
  public const EventKeywords EntryPoint = (EventKeywords)4;
}

Now let’s add the Tasks. We have three. These do not have to be a multiple of 2.

public class Tasks
{
  public const EventTask DatabaseRead = (EventTask)1;
  public const EventTask DatabaseWrite = (EventTask)2;
  public const EventTask DatabaseRetry = (EventTask)3;
}

What’s left? We didn’t use any OpCodes, but they are set up the same way as the tasks. We assign the event id’s and levels when we add the events to the EventSource class. So let’s go back to our bare bones Event Source class and start adding the methods for writing the events.

The first event I’m going to write is the EntryPoint. You put the event id in the attribute of the method, as well as the Message, Level, and Keywords. You call WriteEvent inside the method, passing the parameters passed in. WriteEvent will write a line to table storage. The message will be written, and the parameters in json format.

[Event(105, Message = "[GetCustomerFavorites] Called/entry point. FirstName = {0}, LastName = {1}", 
  Level = EventLevel.Informational, Keywords = Keywords.EntryPoint)]
internal void GetCustomerFavoritesCalled(string firstName, string lastName)
{
  if (this.IsEnabled()) this.WriteEvent(105, firstName, lastName);
}

The event id is 105.  I’ve put if (this.IsEnabled()) in front of my WriteEvent in case the Event Source is not enabled. This will keep it from throwing an exception if for some reason the Event Source is not enabled.

So now we need to call this from the entry point in our class, the code for which is posted in the previous blog post. It will look like this:

CustomerSvcEventSource.Log.GetCustomerFavoritesCalled(firstName, lastName);

The next one I’m going to add is for the parameter errors. I’m going to make this one generic, and pass in the method name to be written as part of the payload. Then I can call this from anywhere.

[Event(104, Message = "Parameter error in method {0}, msg = {1}", Level = EventLevel.Error, 
  Keywords = Keywords.ParameterError)]
internal void ParameterError(string methodName, string errorMessage)
{
  if (this.IsEnabled()) this.WriteEvent(104, errorMessage);
}

Here is an example of calling this:

errorMessage = "First and last name are blank.";
CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);

I’m calling this in the same places I’m calling the trace logging for the invalid parameters. (I’m going to post the entire class when I’m finished.)

One thing to note here – this shows the advantage of using this method of logging. You don’t have a whole bunch of trace statements that essentially do the same thing, which open you up to typos. You’re making the message once in the event definition and using it repeatedly. Of course, this means if you have a typo in the event definition, it will be displayed in every event message. But as a friend of mine once said about programming rules: Right or wrong, be consistent.

Here is a method for the case where I read the database and should find data, but I don’t. Note that this one has a Task attached to it, and the keyword is Database. (You could argue whether or not this should have the Database keyword, but we’ll leave it “as is” for an example.) 

[Event(102, Message = "Method {0}; no data found, but should be. Input Info = {1}.",
  Level = EventLevel.Warning, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
internal void DatabaseRecordsNotFound(string methodName, string extraInfo, string errorMessage)
{
  if (this.IsEnabled()) this.WriteEvent(102, methodName, extraInfo, errorMessage);
}

And here’s what the call looks like:

CustomerSvcEventSource.Log.DatabaseRecordsNotFound("GetCustomerFavorites", 
  string.Format("FirstName = {0}, LastName = {1}", firstName, lastName),
  errorMessage);

Note – I’m consolidating a bit here. If you wanted to truly make this 100% SLAB-ified, you could make this less generic, and call it something like GetCustomersDataRNF and pass the first and last name in as parameters, and only use this for this one case. This goes for some of hte other methods you’re going to see below. You get to decide how much granularity you want to have.

Here is a method for database data conversion errors. Note that this one has a Task attached to it, and the keyword is Database. (Again, you could argue whether or not this should have the Database keyword, since it’s a coding error.)  This is called when trying to convert the parameters to local variables. This is an addition to the original design from the previous blog post. When working on this, flexibility is important. When you decide to implement your event source, you may find you want different information than originally decided.

[Event(108, Message = "Error converting the database fields to output parameters in method {0}. "
  + "ExtraInfo = {1}, ex = {2}",
  Level = EventLevel.Error, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
internal void DatabaseFieldConversionError(string methodName, string extraInfo, string errorMessage)
{
  if (this.IsEnabled()) this.WriteEvent(108, methodName, extraInfo, errorMessage);
}

And here’s what the caller looks like: 

CustomerSvcEventSource.Log.DatabaseFieldConversionError("GetCustomerFavorites",
  string.Format("firstName = {0}, lastName = {1}", firstName, lastName),
  CustomerSvcEventSource.FormatException(ex));

I realized I am going to need to be able to log exceptions. I want all of them to have the same format, so I have added a FormatException method to my Event Source class that I can call from anywhere. You  might want to add the InnerException to it; this is just a basic example.

public static string FormatException(Exception exception)
{
  return exception.GetType().ToString() + Environment.NewLine + exception.Message;
}

I’m actually calling this when sending the information in to write the event. This way, if I want to format exceptions differently depending on where they are, I can just set up another FormatException type method and call that instead.

Here is the SQL DB Retry error event method. By including the TryCount as a parameter, I can search my logs for entries where TryCount > 3, and find the places that didn’t work at all because it could not connect to the database within 3 retries.

[Event(101, Message = "DB Retry in method {0}, MiscInfo = {1}, TryCount = {2, "
  + "RetrySleepTime = {3}, ex = {4}",
  Level = EventLevel.Error, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
internal void DatabaseRetry(string methodName, string miscInfo, int tryCount, int sleepTime, 
  string exceptionInfo)
{
  if (this.IsEnabled()) this.WriteEvent(101, methodName, miscInfo, tryCount, sleepTime, 
    exceptionInfo);
}

And this is the caller, which is in the catch for the SQL Exception:

CustomerSvcEventSource.Log.DatabaseRetry("GetCustomerFavorites",
  string.Format("FirstName = {0}, LastName = {1}", firstName, lastName),
  tryCount, GlobalStaticProperties.retrySleepTime[tryCount - 1], 
  CustomerSvcEventSource.FormatException(ex));

Again, I’ve genericized this so I can use it repeatedly, but you can make it specific to this method.

You should have the general idea by now. After finishing all the code in the class, here is the final EventSource class:

public class CustomerSvcEventSource : EventSource
{        
  //create a static field instance that provides access to an instance of this class
  private static readonly Lazy<CustomerSvcEventSource> Instance
    = new Lazy<CustomerSvcEventSource>(() => new CustomerSvcEventSource());

  //create a static property called Log that returns the current value 
  //of the Instance field of the event source. This value is 
  //determined by the custom event methods called in the Customerservices application.
  private CustomerSvcEventSource() { }

  public static CustomerSvcEventSource Log
  {
    get { return Instance.Value; }
  }

  public class Keywords
  {
    public const EventKeywords Database = (EventKeywords)1;
    public const EventKeywords ParameterError = (EventKeywords)2;
    public const EventKeywords EntryPoint = (EventKeywords)4;
  }

  public class Tasks
  {
    public const EventTask DatabaseRead = (EventTask)1;
    public const EventTask DatabaseWrite = (EventTask)2;
    public const EventTask DatabaseRetry = (EventTask)3;
  }

  [Event(101, Message = "DB Retry in method {0}, MiscInfo = {1}, TryCount = {2}, "
    + "RetrySleepTime = {3}, ex = {4}",
    Level = EventLevel.Error, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
  internal void DatabaseRetry(string methodName, string miscInfo, int tryCount, int sleepTime, 
    string exceptionInfo)
  {
    if (this.IsEnabled()) this.WriteEvent(101, methodName, miscInfo, tryCount, sleepTime, 
      exceptionInfo);
  }

  [Event(102, Message = "Method {0}; no data found, but should be. Input Info = {1}.",
    Level = EventLevel.Warning, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
  internal void DatabaseRecordsNotFound(string methodName, string extraInfo, string errorMessage)
  {
    if (this.IsEnabled()) this.WriteEvent(102, methodName, extraInfo, errorMessage);
  }

  [Event(103, Message = "Message from method {0} = {1}", Level = EventLevel.Informational,
    Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
  internal void WriteDBInfoMsg(string methodName, string infoMessage)
  {
    if (this.IsEnabled()) this.WriteEvent(103, methodName, infoMessage);
  }

  [Event(104, Message = "Parameter error in method {0}, msg = {1}", Level = EventLevel.Error,
    Keywords = Keywords.ParameterError)]
  internal void ParameterError(string methodName, string errorMessage)
  {
    if (this.IsEnabled()) this.WriteEvent(104, methodName, errorMessage);
  }

  [Event(105, Message = "[GetCustomerFavorites] Called/entry point. FirstName = {0}, LastName = {1}",
    Level = EventLevel.Informational, Keywords = Keywords.EntryPoint)]
  internal void GetCustomerFavoritesCalled(string firstName, string lastName)
  {
    if (this.IsEnabled()) this.WriteEvent(105, firstName, lastName);
  }

  [Event(106, Message = "Method {0}, output params = {1}", Level = EventLevel.Informational, 
    Task = Tasks.DatabaseRead)]
  internal void OutputParams(string methodName, string paramString)
  {
    if (this.IsEnabled()) this.WriteEvent(106, methodName, paramString);
  }

  [Event(107, Message = "Unexpected exception in method {0}. Extra info = {1}. Ex = {2}", 
    Level = EventLevel.Error)]
  internal void UnexpectedException(string methodName, string miscInfo, string exceptionInfo)
  {
    if (this.IsEnabled()) this.WriteEvent(107, methodName, miscInfo, exceptionInfo);
  }

  [Event(108, Message = "Error converting the database fields to output parameters in method {0}. "
    + "ExtraInfo = {1}, ex = {2}",
    Level = EventLevel.Error, Keywords = Keywords.Database, Task = Tasks.DatabaseRead)]
  internal void DatabaseFieldConversionError(string methodName, string extraInfo, string errorMessage)
  {
    if (this.IsEnabled()) this.WriteEvent(108, methodName, extraInfo, errorMessage);
  }

  public static string FormatException(Exception exception)
  {
    return exception.GetType().ToString() + Environment.NewLine + exception.Message;
  }
}

And here is the final GetCustomerFavorites. I’ve left in the original trace logging, but commented out, so you can see how much cleaner the new calls are.

internal class CustomerFavorites
{
  int maxNameLength = 50;

  internal string GetCustomerFavorites(out string favoriteMovie, out string favoriteLanguage,
    string firstName, string lastName)
  {
    //Trace.TraceInformation("[GetCustomerFavorites] called. FirstName = {0}, LastName = {1}",
    //  firstName, lastName);

    CustomerSvcEventSource.Log.GetCustomerFavoritesCalled(firstName, lastName);

    string errorMessage = string.Empty;
    favoriteMovie = string.Empty;
    favoriteLanguage = string.Empty;

    if (string.IsNullOrWhiteSpace(firstName) && string.IsNullOrWhiteSpace(lastName))
    {
      //Trace.TraceError("[GetCustomerFavorites] called. First and last name are blank, "
      //    + "no data returned.");
      errorMessage = "First and last name are blank.";

      CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);
    }
    else if (string.IsNullOrWhiteSpace(firstName))
    {
      //Trace.TraceError("[GetCustomerFavorites] called. First name is blank, no data returned."
      //  + " LastName = {0}", lastName);
      errorMessage = "First name is blank; both fields must be provided.";

      CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);
    }
    else if (string.IsNullOrWhiteSpace(lastName))
    {
      //Trace.TraceError("[GetCustomerFavorites] called. First name is blank, no data returned."
      //  + " LastName = {0}", lastName);
      errorMessage = "First name is blank; both fields must be provided.";

      CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);
    }
    else if (firstName.Length > maxNameLength)
    {
      //Trace.TraceError("[GetCustomerFavorites] called. First name is too long, no data " 
      //  + "returned. FirstName = {0}, LastName = {0}", firstName, lastName);
      errorMessage = string.Format("First name is too long. Can not be more than {0} "
        + "characters long.", maxNameLength);

      CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);
    }
    else if (lastName.Length > maxNameLength)
    {
      //Trace.TraceError("[GetCustomerFavorites] called. Last name is too long, no data "
      //  + "returned. FirstName = {0}, LastName = {0}", firstName, lastName);
      errorMessage = string.Format("Last name is too long. Can not be more than {0} "
        + "characters long.", maxNameLength);

      CustomerSvcEventSource.Log.ParameterError("GetCustomerFavorites", errorMessage);
    }
    if (!string.IsNullOrWhiteSpace(errorMessage))
      return errorMessage;

    int tryCount = 0;
    bool success = false;

    try
    {
      do
      {
        errorMessage = string.Empty; 
        tryCount++;
        try
        {
          using (SqlConnection cnx = 
            new SqlConnection(GlobalStaticProperties.dbConnectionString))
          {
            cnx.Open();
            using (SqlCommand cmd = new SqlCommand("Customer_GetByName", cnx))
            {
              cmd.CommandType = CommandType.StoredProcedure;

              SqlParameter prm = new SqlParameter("@FirstName", SqlDbType.NVarChar, 50);
              prm.Direction = ParameterDirection.Input;
              prm.Value = firstName;
              cmd.Parameters.Add(prm);

              prm = new SqlParameter("@LastName", SqlDbType.NVarChar, 50);
              prm.Direction = ParameterDirection.Input;
              prm.Value = lastName;
              cmd.Parameters.Add(prm);

              SqlDataAdapter da = new SqlDataAdapter(cmd);
              DataTable dt = new DataTable();
              da.Fill(dt);
              success = true;

              if (dt == null || dt.Rows.Count <= 0)
              {
                errorMessage = string.Format("Error retrieving favorites; record not found "
                  + "for first name '{0}', last name '{1}'.", firstName, lastName);

                CustomerSvcEventSource.Log.DatabaseRecordsNotFound("GetCustomerFavorites", 
                  string.Format("FirstName = {0}, LastName = {1}", firstName, lastName),
                  errorMessage);
              }
              else
              {
                try
                {
                  DataRow dr = dt.Rows[0];
                  favoriteMovie = dr["FavoriteMovie"].ToString();
                  favoriteLanguage = dr["FavoriteLanguage"].ToString();
                  //Trace.TraceInformation("[GetCustomerFavorites] FirstName = {0}, "
                  //  + "LastName = {1}, FavoriteMovie = {2}, FavoriteLanguage = {3}",
                  //    firstName, lastName, favoriteMovie, favoriteLanguage);

                  string outputs = string.Format("favoriteMovie: {0}, favoriteLanguage: {1}", 
                    favoriteMovie, favoriteLanguage);
                  CustomerSvcEventSource.Log.OutputParams("GetCustomerFavorites", outputs);
                }
                catch (Exception ex)
                {
                  //Trace.TraceError("[GetCustomerFavorites] FirstName = {0}, LastName = {1}, "
                  //  + "Exception thrown trying to get favorites from SQL record = {2}", 
                  //  firstName, lastName, ex.ToString());
                  errorMessage = "Error retrieving customer favorites.";

                  CustomerSvcEventSource.Log.DatabaseFieldConversionError("GetCustomerFavorites",
                    string.Format("firstName = {0}, lastName = {1}", firstName, lastName),
                    CustomerSvcEventSource.FormatException(ex));
                }
              }
            }//using SqlCommand
          } //using SqlConnection
        }
        catch (SqlException ex)
        {
          errorMessage = "Error retrieving customer favorites.";
          //Trace.TraceError("[GetCustomerFavorites] firstName = {0}, lastName = {1}, "
          //  + "Try #{2}, will sleep {3}ms. SQL Exception = {4}",
          //    firstName, lastName, tryCount, 
          //    GlobalStaticProperties.retrySleepTime[tryCount - 1], ex.ToString());

          CustomerSvcEventSource.Log.DatabaseRetry("GetCustomerFavorites",
            string.Format("FirstName = {0}, LastName = {1}", firstName, lastName),
            tryCount, GlobalStaticProperties.retrySleepTime[tryCount - 1], 
            CustomerSvcEventSource.FormatException(ex));

          if (tryCount < GlobalStaticProperties.MaxTryCount && 
            GlobalStaticProperties.retrySleepTime[tryCount - 1] > 0)
              Thread.Sleep(GlobalStaticProperties.retrySleepTime[tryCount - 1]);
        }
      } while (tryCount < GlobalStaticProperties.MaxTryCount && !success);
    }
    catch (Exception ex)
    {
      //Trace.TraceError("[GetCustomerFavorites] firstName = {0}, lastName = {1}, "
      //  + "Overall Exception thrown = {2}", firstName, lastName, ex.ToString());
        errorMessage = "Error getting customer favorites.";

        CustomerSvcEventSource.Log.UnexpectedException("GetCustomerFavorites", 
          String.Format("FirstName = {0}, LastName {1}", firstName, lastName), 
          CustomerSvcEventSource.FormatException(ex));
    }
    return errorMessage;
  }
}

I’ve run the service, and set it to have a bunch of errors, which were all logged to the default table for SLAB logging – SLABLogsTable. The table has a column for EventID, Keywords, Level, Message, Opscode, Task, Version, Payload, and then each parameter in the payload across all of the error messages: Payload_firstName, Payload_lastName, Payload_methodName, etc. I’ve exported the file to CSV and Excel if you want to take a look at them.

If you use the same field names, they will be put in the same column. So if I went back and changed this, I would change MiscInfo and ExtraInfo  and paramstring to all have the same names, they will all show up in the same column. This is not something you’re likely to search on, it’s just some extra information to identify the records being handled.

Finally, there re two things you want to be sure of:

  • The id # in the attribute of the method used to write the event is the same as the id# in the WriteEvent call
  • Don’t use the same id# more than once.

You can include the EventSourceAnalyzer in your project and write a unit test that will confirm your EventSource doesn’t have any configuration errors.

Summary

In this post, I showed how to set up an Event Source and a listener for it, and then take a class with trace logging and “SLAB-ify” it. This provided us with structured output and multiple fields we can query against directly, making it easier to get our data in a consistent format that we can later use for data mining.

SLAB design considerations

July 18, 2014

In my previous post, I talked about the features available in the Semantic Logging Application Block. In this post, I will show you a class with regular trace logging, and talk about things to consider when converting it to implement the Semantic Application Logging Block (SLAB) to create structured logging. You can just do a straight conversion and not think about your trace logging methodology, but this kind of goes against the whole point of using SLAB. You should put some thought into it so that what you end up with isn’t just trace logging, but is structured in a way that will allow you to search for patterns or do data mining.

Show me some code

Here is a class called CustomerFavorites that is similar to one you might use in a data access layer or a WCF service. Input parameters are first name and last name; output parameters are favorite movie and favorite language. I’ve written this in a way that allows me to put in a lot of trace logging, like validating that the input fields aren’t blank, and rolling my own SQL retry code. This displays regular unstructured trace logging.

internal class CustomerFavorites
{
  int maxNameLength = 50;

  internal string GetCustomerFavorites(out string favoriteMovie, out string favoriteLanguage,
    string firstName, string lastName)
  {
    Trace.TraceInformation("[GetCustomerFavorites] called. FirstName = {0}, LastName = {1}", 
      firstName, lastName);

    string errorMessage = string.Empty;
    favoriteMovie = string.Empty;
    favoriteLanguage = string.Empty;

    if (string.IsNullOrWhiteSpace(firstName) && string.IsNullOrWhiteSpace(lastName))
    {
      Trace.TraceError("[GetCustomerFavorites] called. First and last name are blank, "
          + "no data returned.");
        errorMessage = "Must provide name to get favorites.";
    }
    else if (string.IsNullOrWhiteSpace(firstName))
    {
      Trace.TraceError("[GetCustomerFavorites] called. First name is blank, no data returned."
        + " LastName = {0}", lastName);
      errorMessage = "First name is blank; both fields must be provided.";
    }
    else if (string.IsNullOrWhiteSpace(lastName))
    {
      Trace.TraceError("[GetCustomerFavorites] called. First name is blank, no data returned."
        + " LastName = {0}", lastName);
      errorMessage = "First name is blank; both fields must be provided.";
    }
    else if (firstName.Length > maxNameLength)
    {

      Trace.TraceError("[GetCustomerFavorites] called. First name is too long, no data " 
        + "returned. FirstName = {0}, LastName = {0}", firstName, lastName);
      errorMessage = string.Format("First name is too long. Can not be more than {0} "
        + "characters long.", maxNameLength);
    }
    else if (lastName.Length > maxNameLength)
    {
      Trace.TraceError("[GetCustomerFavorites] called. Last name is too long, no data "
        + "returned. FirstName = {0}, LastName = {0}", firstName, lastName);
      errorMessage = string.Format("Last name is too long. Can not be more than {0} "
        + "characters long.", maxNameLength);
    }

    int tryCount = 0;
    bool success = false;

    try
    {
      do
      {
        errorMessage = string.Empty; 
        tryCount++;
        try
        {
          using (SqlConnection cnx = 
            new SqlConnection(GlobalStaticProperties.dbConnectionString))
          {
            cnx.Open();
            using (SqlCommand cmd = new SqlCommand("Customer_GetByName", cnx))
            {
              cmd.CommandType = CommandType.StoredProcedure;

              SqlParameter prm = new SqlParameter("@FirstName", SqlDbType.NVarChar, 50);
              prm.Direction = ParameterDirection.Input;
              prm.Value = firstName;
              cmd.Parameters.Add(prm);

              prm = new SqlParameter("@LastName", SqlDbType.NVarChar, 50);
              prm.Direction = ParameterDirection.Input;
              prm.Value = lastName;
              cmd.Parameters.Add(prm);

              SqlDataAdapter da = new SqlDataAdapter(cmd);
              DataTable dt = new DataTable();
              da.Fill(dt);
              success = true;

              if (dt == null || dt.Rows.Count <= 0)
              {
                errorMessage = string.Format("Error retrieving favorites; record not found "
                  + "for first name '{0}', last name '{1}'.", firstName, lastName);
              }
              else
              {
                try
                {
                  DataRow dr = dt.Rows[0];
                  favoriteMovie = dr["FavoriteMovie"].ToString();
                  favoriteLanguage = dr["FavoriteLanguage"].ToString();
                  Trace.TraceInformation("[GetCustomerFavorites] FirstName = {0}, "
                    + "LastName = {1}, FavoriteMovie = {2}, FavoriteLanguage = {3}",
                      firstName, lastName, favoriteMovie, favoriteLanguage);
                }
                catch (Exception ex)
                {
                  Trace.TraceError("[GetCustomerFavorites] FirstName = {0}, LastName = {1}, "
                    + "Exception thrown trying to get favorites from SQL record = {2}", 
                    firstName, lastName, ex.ToString());
                  errorMessage = "Error retrieving customer favorites.";
                }
              }
            }//using SqlCommand
          } //using SqlConnection
        }
        catch (SqlException ex)
        {
          errorMessage = "Error retrieving customer favorites.";
          Trace.TraceError("[GetCustomerFavorites] firstName = {0}, lastName = {1}, "
            + "Try #{2}, will sleep {3}ms. SQL Exception = {4}",
              firstName, lastName, tryCount, 
              GlobalStaticProperties.retrySleepTime[tryCount - 1], ex.ToString());
          if (tryCount < GlobalStaticProperties.MaxTryCount && 
            GlobalStaticProperties.retrySleepTime[tryCount - 1] > 0)
              Thread.Sleep(GlobalStaticProperties.retrySleepTime[tryCount - 1]);
        }
      } while (tryCount < GlobalStaticProperties.MaxTryCount && !success);
    }
    catch (Exception ex)
    {
      Trace.TraceError("[GetCustomerFavorites] firstName = {0}, lastName = {1}, "
        + "Overall Exception thrown = {2}", firstName, lastName, ex.ToString());
        errorMessage = "Error getting customer favorites.";
    }
    return errorMessage;
  }
}

One word about the SQL retry code: There is an array loaded in a global static variable called retrySleepTime that provides the retry information – how many seconds to wait before retrying. This is usually an increasing number; I think mine are 0, 2.5, 5, and 7.5. When it fails the first time, it retries immediately. The second time, it waits 2.5 seconds and then retries, etc. This kind of retry is necessary because the Azure SQL Database occasionally has connection problems, depending on a number of things. The best practice is not to roll your own retry code, but to use the Transient Fault Handling Application block from the Azure CAT team and the Microsoft Patterns and Practices team. This is part of Enterprise Library 6, which also contains the Semantic Logging Application Block.

Design considerations

Looking at the code above, we have the following kinds of messages we want to log:

  • Entry point, which also traces the input and output variables.
  • Input errors (variables or blank or too long)
  • No matching records found in the database and there should be
  • Exception converting the DataRow parameters to real numbers
  • SQL Exception with retry code
  • Unexpected exception

If you have a big WCF service, you might have lots of methods that look very similar to this, but with different database actions and different variables.

A note about keywords: When you set up your listener, you can filter the logging by the keywords and the Message Level. For example, you can turn off all but DatabaseError, or all but Errors. If you have something you’re going to be logging 1000 times per second, add a keyword for it, so you can turn it off if needed.

How would you gather this data? How are you going to look at the resulting logs?? When you set up your event source class, you can assign values to the following:

  • Event ID
    • This is an arbitrary number provided by you.
  • Keyword (defined by you) (limited to 64)
    • When you set up your listener, you can filter the logging by the keywords. For example, you can turn off logging for a specific keyword. If you have something you’re going to be logging 1000 times per second, add a keyword for it, so you can turn it off if needed.
  • EventOpcode (1-10 are taken up by .NET, opcodes 11 and above are defined by you)
    • You can filter the results by EventOpcode.
  • Task (defined by you)
    • You can filter the results by task.
  • Message
    • This is your message that uses the parameters sent in to be used when writing the event log; it’s the “pretty” format. So you might send in method name, and your message might be “Entry point for {0}.” The parameters make up the payload, and if writing to Azure Table Storage, you can query on the parameters individually.
  • Message Level (Informational, Critical, Error, etc)
    • You can filter the results by the message level; you can also turn the logging on/off. For example, you could turn off all of the information-only logging.

So here’s some of the kinds of events you might want to be able to query from this class and others like it:

  • Database retry errors
  • Database messages that are errors/exceptions
  • Database messages that are informational
  • Errors with the input parameters
  • GetCustomerFavorites Entry Point (with params in)
  • Output parameters (i.e. results)
  • General exception

Looking at this list, how would you assign event id’s, keywords, opcodes, and tasks?

  • You can only have up to 64 keywords, so you want to be careful about using them. You could set up keywords for Database, ParameterError, and EntryPoint.
  • You could set up tasks for DatabaseRead, DatabaseUpdate, DatabaseRetry.
  • You can tell error/exception from information by the event level.

So let’s look at that list again and categorize our events for this class. Each event has a different event id that you assign arbitrariliy.

  • Database retry errors
    • Event id 101, Keyword=Database, Task=DatabaseRetry, Level = Error
  • Database messages that are errors/exceptions
    • Event id 102, Keyword=Database, Task=[DatabaseRead or DatabaseWrite], Level = Error
  • Database messages that are informational
    • Event id 103, Keyword=Database, Level = Informational
  • Errors with the input parameters
    • Event id 104, Keyword = ParameterError, Level = Warning
  • GetCustomerFavorites Entry Point (with params in)
    • Event id 105, Keyword = EntryPoint, Level = Informational
  • Output parameters (i.e. results)
    • Event id 106, Level = Informational
  • General exception
    • Event id 107, Level = Error

You get the general idea. Note that the parameters passed in when writing the event get written as separate key/value pairs, and can be queried on if you’re writing to Azure Table Storage. So if you pass in the method name, you will be able to query for all events having that method name.

Notice how quickly the event id’s add up. You might want to consider putting them in partial classes and grouping them that way so they are easier to maintain. Of course, you could always use regions, which people either love or hate. If you do this, find out which way your boss feels before deciding whether or not to use them. ;-)  (I had a lead once who removed all comments I put in my code. She said “it makes it look untidy.” So you never know what’s going to make someone happy.) (Yes, this drove me nuts.)

Summary

In this post, I showed an example of a class with trace logging, and discussed design considerations for converting this to semantic logging. In my next post, I will show you how to implement our design.

Diagnostics logging and structured logging

July 18, 2014

This post is going to be about one of my favorite things, diagnostics logging. When I worked for a startup, I was responsible for the entire backend for all of the products. This comprised of a lot of WCF services running in Web Roles, an Azure SQL Database, and some Worker Roles. By habit, I always include a lot of trace logging. I thought maybe I had overdone it in this case, as I was tracing entry points for all methods in the WCF services called by their clients, and all of the variables as well, and some additional information, but then I went to this talk by the Halo group, and they talked about how much logging they do, and I realized I’m an amateur in comparison!

Diagnostics logging in action

Included in all of my tracing was the user GUID, which allowed us to pull together a bunch of information about what a particular customer was doing, to help troubleshoot when he had a problem. You could see when he logged in, when he called the backend and what for. This was very helpful when a customer had a problem. Additionally, the back end was largely developed when we migrated from a traditional hosting center to Azure, and it was .NET 2.0 asmx web services. This was back in 2010 before Azure was mainstream, and we didn’t have the capabilities we have today to attach the debugger in Visual Studio and debug the service running in the cloud. If you wanted to know what was going on, you added trace logging.

For example, we had a worker role that would pull requests from a queue to resize the photos from the customer into mobile-sized pictures. This role would read the queue, pull messages off of the queue and deconstruct them into whatever structure was needed, download the pictures involved, resize them, then upload them again. Every step was a failure point. If you have a hundred pictures, that’s 100 downloads, 100 resizes, 100 uploads, and there was deconstructing the message – what if it was in an invalid format and unrecognizable? I put in trace logging throughout the code, so if something happened, I would know exactly where it went wrong and why.

All of my trace logging was done straight into the Windows Azure Diagnostics Logs table. A couple of years later, I discovered ETW (Event Tracing for Windows) and SLAB – the Semantic Logging Application Block. I would have rewritten all of our logging, but at a startup, that’s not a luxury you have! It’s definitely something I would recommend and use going forward.

What’s wrong with plain ordinary trace logging

The problem with ordinary trace logging is that you create millions of rows of unstructured data containing the identifying information about each row (like timestamp and partition key), and a single field for whatever message you want to convey. This made it very time-consuming and difficult to look through the logs and find what I needed. I used the Cerebrata Azure Management Studio, which did some built-in filtering by service, which was helpful in reducing the amount of information I had to wade through, but I still had to search millions of records for the user GUID to find out what was going on.

There were some services, like the one that continuously polled our CRM system for changes and when found, migrated them to the SQL Server database, that generated so much logging, you had to be very determined to find what you wanted. Aside from troubleshooting, it would have been really helpful to have some kind of structure to that logging so we could gather some information, consolidate it, and learn something from it. But all you have with WADiagnostics is a single string field.

Rather than a single field that said something like “user [GUID] reloaded project abc”, how much more useful would it be to have something that has fields like this that you can query directly: { user:”guid”, project:”abc”, action:”reload” }. This would be possible if all the logging for the services used structured logging and was stored in  Azure Table Storage. This would have allowed us to query the table for specific values of GUID, projects, and actions. Think of the metrics you could gather – how many customers are accessing project “abc”, how many are reloading the project versus abandoning it and creating a new project, etc.

This is what structured logging is all about – making it easier to consume logs.

The problem is that structured logging is a completely different paradigm, and you don’t just put in random trace statements whenever you want to. You have to think about what you want to log, where you want to log it, and how. How do you want to query the data? Do you want to be able to retrieve all database retry errors together, all method entry points together, all data for a specific user GUID? These are some of the things you have to think about.

Summary

This post talked about the practice of diagnostics logging, why structured logging is a good idea, and how it can help you do troubleshooting and data mining. In my next post, I will talk specifically about the features of ETW and SLAB.