Saturday, 1 August 2015

Semantic Logging Application Block - Issues with SQL database provider customization

Enterprise Library's Semantic Logging Application Block (SLAB) is the right way to capture structured information (trace, audit and error events). Enterprise Library 6.0 ships with an out-of-the-box Windows service that lets you capture these events in an out-of-process manner. This capability helps in two ways:

1. SLAB uses the high-performance ETW (Event Tracing for Windows) infrastructure to publish events. Multiple sinks can be configured to subscribe to the desired events and publish them to the desired destinations, e.g. flat file, SQL Server database, SQL Azure database, email, etc. In fact, the out-of-the-box service comes prepackaged with listeners for flat files and SQL databases (on-premises and Azure). This also means that your application's core code is never blocked waiting for the publishing to complete, so it can perform better.

2. When you use the Windows-service-based out-of-process listener host, a single host can cater to multiple applications, which is an awesome capability. If you have a good telemetry design and structure in place, it becomes really easy to have all of your enterprise's applications dump information into a single destination and to analyze the captured data for better decisions.
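To make the rest of this post concrete, here is a minimal sketch of the kind of event source an application would expose to SLAB. The class name, event source name and event IDs are hypothetical placeholders, not anything SLAB ships with; the out-of-process service only needs the event source name to subscribe to these events.

using System.Diagnostics.Tracing;

// Hypothetical event source; the Name attribute is what the listener subscribes to.
[EventSource(Name = "MyCompany-MyApp")]
public sealed class MyAppEventSource : EventSource
{
    public static readonly MyAppEventSource Log = new MyAppEventSource();

    // Structured events: the strongly typed parameters become the event payload.
    [Event(1, Level = EventLevel.Informational, Message = "Order {0} received")]
    public void OrderReceived(int orderId) { this.WriteEvent(1, orderId); }

    [Event(2, Level = EventLevel.Error, Message = "Order {0} failed: {1}")]
    public void OrderFailed(int orderId, string reason) { this.WriteEvent(2, orderId, reason); }
}

// Application code simply raises the event and returns immediately; ETW and the
// out-of-process listener host take care of buffering and publishing to the sinks:
// MyAppEventSource.Log.OrderReceived(42);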

Now, let us switch to the issues. Most of us would eventually decide to dump the trace, error and audit events into a single database (because we are too lazy and it is super easy to write a T-SQL query). However, there are a couple of gotchas:

1. The out-of-the-box database listener/sink gives us the ability to rename the destination table and have things work seamlessly; you can specify the table name in the configuration file. That being said, if you change the schema (say, add new columns or rename existing ones, e.g. rename ProviderID to MyProviderId), you will also have to change the dbo.WriteTraces SP to adhere to the new schema. That works (mostly), but you would start to notice the intermittent occurrence of the following warnings in the event viewer:

EventId : 102, Level : Warning, Message : A database sink discarded XXX events due to failures while attempting to publish a batch., Payload : [numberOfEntries : XXX] , EventName : DatabaseSinkPublishEventsFailedAndDiscardsEntriesInfo, Timestamp : 
EventId : 103, Level : Warning, Message : A database sink discarded an event with index 0 due to a failure while attempting to publish a batch. Message: The given ColumnMapping does not match up with any column in the source or destination., Payload : [message : The given ColumnMapping does not match up with any column in the source or destination.] [entryOrder : 0] , EventName : DatabaseSinkPublishEventsFailedAndDiscardSingleEntryInfo, Timestamp : 

If you run SQL Profiler to figure out the root cause of the "ColumnMapping" mismatch, you would notice the intermittent occurrence of the following command:

select @@trancount; SET FMTONLY ON select * from Trace SET FMTONLY OFF exec ..sp_tablecollations_100 N'.[Trace]'

And you wonder why. You modified the SP to take care of the mismatches, so things should work. Well, look into the implementation of the database listener/sink and you will know the reason.

The out-of-the-box implementation has the following function:

private async Task<int> PublishEventsAsync(IList<EventRecord> collection)
{
    int publishedEvents = collection.Count;

    try
    {
        // Batches of more than 128 records are pushed with SqlBulkCopy;
        // smaller batches go through the dbo.WriteTraces stored procedure.
        if (collection.Count > 128)
        {
            await this.UseSqlBulkCopy(collection).ConfigureAwait(false);
        }
        else
        {
            await this.UseStoredProcedure(collection).ConfigureAwait(false);
        }

        return publishedEvents;
    }
    catch (OperationCanceledException)
    {
        return 0;
    }
    catch (Exception ex)
    {
        if (this.cancellationTokenSource.IsCancellationRequested)
        {
            return 0;
        }

        // The failure is logged to SLAB's own event source and rethrown; the events in
        // the batch end up being discarded (the EventId 102/103 warnings shown above).
        SemanticLoggingEventSource.Log.DatabaseSinkPublishEventsFailed(ex.ToString());
        throw;
    }
}

This means that if the application is publishing a large number of events (a batch of more than 128), the default behavior of the database sink is to use SqlBulkCopy to copy the data to the destination table, bypassing the stored procedure entirely. Bulk copy can therefore fail if your new table schema is not compatible with the structure assumed in the sink's code. In fact, in my case the bulk copy operation failed due to a case mismatch in a column name, e.g. EventId did not work while EventID did :) - apparently SqlBulkCopy matches column names in its mappings case-sensitively. There is another option by which you can force the service to always use the new SP that you wrote: change the maximum buffer batch size in the configuration file to 128, which guarantees the bulk copy path is never taken. But think about the impact of this change before doing it - bulk copy is pretty efficient for transferring large amounts of data :).

One piece of good news is that SLAB lets you specify the batch size for each sink and the events it subscribes to, so you can make this trade-off selectively rather than globally.
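For reference, here is how those two knobs (the custom table name from gotcha 1 and a batch size that keeps the sink on the stored procedure path) look when the SQL sink is wired up in-process. The parameter names are from the SLAB 2.0 API as I recall it, and the instance name, connection string and table name are placeholders; if you use the out-of-process Windows service, the same settings live as attributes on the SQL database sink element in its XML configuration file. MyAppEventSource is the sketch from earlier in this post.

using System;
using System.Diagnostics.Tracing;
using Microsoft.Practices.EnterpriseLibrary.SemanticLogging;

internal static class LoggingBootstrap
{
    // Hypothetical bootstrap helper; call once at application start-up.
    public static ObservableEventListener Start(string connectionString)
    {
        var listener = new ObservableEventListener();
        listener.EnableEvents(MyAppEventSource.Log, EventLevel.LogAlways, EventKeywords.All);

        listener.LogToSqlDatabase(
            instanceName: "MyInstance",
            connectionString: connectionString,
            tableName: "Trace",                          // the renamed destination table
            bufferingInterval: TimeSpan.FromSeconds(15),
            bufferingCount: 128);                        // <= 128 keeps the stored procedure path

        return listener;
    }
}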

2. If one of the events in the batch carries a payload of more than 4000 characters, the whole batch of events will get discarded by the SQL provider due to a schema mismatch. That happens because the dbo.WriteTraces SP uses a table type parameter whose payload column limits the payload size to 4000 characters. So be careful!!
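Until you widen that column (which in practice means dropping and recreating the table type and the SP that uses it), a cheap defensive measure is to keep large strings out of the payload on the application side. The helper below is just a sketch of that idea and not part of SLAB; note that the 4000 character limit applies to the whole serialized payload, not to a single argument, so leave yourself some headroom.

// Hypothetical helper, not part of SLAB: trims oversized strings before they are
// passed to an event method, so the serialized payload stays under the limit
// imposed by the table type parameter used by dbo.WriteTraces.
private static string TrimForPayload(string value, int maxLength = 3500)
{
    if (string.IsNullOrEmpty(value) || value.Length <= maxLength)
    {
        return value;
    }

    return value.Substring(0, maxLength) + "...[truncated]";
}

// Usage: MyAppEventSource.Log.OrderFailed(42, TrimForPayload(ex.ToString()));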

Happy logging.
