How LogJoint parses XML files

Log as string

LogJoint treats an XML log file as one big string. This is a logical representation; of course, LogJoint does not physically load the whole file into a string in memory. A string here means a sequence of Unicode characters. To convert a raw log file to Unicode characters, LogJoint uses the encoding specified in your format's settings. The XML file does not have to be pretty-printed to look nice in LogJoint.
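The idea of a "logical string" over a file that is never fully loaded can be sketched in Python. This is only an illustration, not LogJoint's actual implementation: the function name iter_decoded_chunks, the chunk size, and the choice of UTF-16 for the sample are assumptions made for the example.

```python
import codecs
import io


def iter_decoded_chunks(raw_stream, encoding, chunk_size=4096):
    """Yield Unicode text pieces from a binary stream, one small chunk at a time."""
    # An incremental decoder keeps incomplete byte sequences between calls,
    # so a character split across two chunks is still decoded correctly.
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        raw = raw_stream.read(chunk_size)
        if not raw:
            break
        text = decoder.decode(raw)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Hypothetical input: one event from the sample log, stored as UTF-16 bytes.
ORIGINAL = '<event timestamp="2017-02-03 14:56:12.654" severity="info" thread="6d12">Hi there</event>'
raw_log = ORIGINAL.encode("utf-16")

# A deliberately tiny, odd chunk size forces characters to straddle chunk borders.
chunks = list(iter_decoded_chunks(io.BytesIO(raw_log), "utf-16", chunk_size=7))
logical_string = "".join(chunks)
```

Joining the chunks gives back the original characters, although no single read ever held the whole file.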

Suppose we have this log file:

<event timestamp="2017-02-03 14:56:12.654" severity="info" thread="6d12">Hi there</event>
<event timestamp="2017-02-03 14:56:13.002" severity="error" thread="6d12">Ups! Error occurred!
  <exception-info>
    <message>Can not commit transaction</message>
    <method>Foo.Bar()</method>
    <inner-exception>
      <message>Invalid argument</message>
      <method>Foo.VerifyArgs()</method>
    </inner-exception>
  </exception-info>
</event>

The log contains two messages, each represented by an event element. Time, thread, and severity are stored in separate attributes. The event's textual content is unstructured. The second message has severity error and includes exception information in a child XML element.

Header regular expression

LogJoint uses a user-provided regular expression to split the input XML string into individual log messages. This regex is called the header regular expression. It is supposed to match the beginning of each message. Using regexps against XML text may look unnatural. The reason for this approach is efficiency: with the regex in hand, LogJoint can read an arbitrary part of a potentially huge input file and start splitting that part. In our example the header regular expression may look like this:
<event       # opening XML tag
\s+          # whitespace
timestamp=   # mandatory attribute

Note that LogJoint ignores unescaped white space in patterns and treats everything after # as a comment. Programmers can read about the IgnorePatternWhitespace, ExplicitCapture, and Multiline flags that are actually used here in MSDN: RegexOptions Enumeration.

LogJoint applies the header regular expression many times to find all the messages in the input string. In our example the header regex will match two times:

Thick black lines show message boundaries. After applying the header regex, LogJoint knows where the messages begin and where they end. A message ends where the next message begins.
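The splitting step can be sketched in Python as follows. This is a minimal illustration, not LogJoint's code: LogJoint uses the .NET regex engine, and here Python's re.VERBOSE stands in for IgnorePatternWhitespace and re.MULTILINE for Multiline (Python has no direct counterpart of ExplicitCapture). The helper names split_messages and next_message_start are made up for the example.

```python
import re

# The header regex from the example, compiled with Python's closest flags.
HEADER_RE = re.compile(
    r"""
    <event       # opening XML tag
    \s+          # whitespace
    timestamp=   # mandatory attribute
    """,
    re.VERBOSE | re.MULTILINE,
)

SAMPLE_LOG = """<event timestamp="2017-02-03 14:56:12.654" severity="info" thread="6d12">Hi there</event>
<event timestamp="2017-02-03 14:56:13.002" severity="error" thread="6d12">Ups! Error occurred!
  <exception-info>
    <message>Can not commit transaction</message>
    <method>Foo.Bar()</method>
    <inner-exception>
      <message>Invalid argument</message>
      <method>Foo.VerifyArgs()</method>
    </inner-exception>
  </exception-info>
</event>
"""


def split_messages(text):
    """Cut text at every header match; a message ends where the next one begins."""
    starts = [match.start() for match in HEADER_RE.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[start:end] for start, end in zip(starts, ends)]


def next_message_start(text, offset):
    """Find the first message beginning at or after an arbitrary offset."""
    match = HEADER_RE.search(text, offset)
    return match.start() if match else None


messages = split_messages(SAMPLE_LOG)
print(len(messages))  # 2
```

next_message_start shows why the regex approach suits large files: a reader can jump to any offset, even one in the middle of a message, and resynchronize on the next header without parsing anything before it.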

Normalization with XSL transformation

In the next step, LogJoint applies a user-provided normalization XSL transformation to each message separated out in the previous step. The output of this XSL transformation must be one XML element with the following schema:

<m d="datetime: yyyy-MM-ddTHH:mm:ss.fffffff" t="thread id string" s="severity: i, w, e">Log message</m>
			

Only the d attribute is mandatory.

LogJoint knows how to interpret and display the transformation output. Basically, your XSL transformation tells LogJoint where to find the date, thread, severity, and text of each message.

For the sample log above, the transformation might look like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:lj="http://logjoint.codeplex.com/">

	<xsl:output method="xml"/>

	<xsl:template match='event'>
		<m>
			<xsl:attribute name='t'>
				<xsl:value-of select='@thread'/>
			</xsl:attribute>
			<xsl:attribute name='d'>
				<xsl:value-of select='lj:TO_DATETIME(@timestamp, "yyyy-MM-dd HH:mm:ss.fff")'/>
			</xsl:attribute>
			<xsl:attribute name='s'>
				<xsl:choose>
					<xsl:when test="@severity='error'">e</xsl:when>
					<xsl:when test="@severity='warning'">w</xsl:when>
					<xsl:otherwise>i</xsl:otherwise>
				</xsl:choose>
			</xsl:attribute>
			
			<xsl:value-of select="lj:TRIM(text())"/>
			<xsl:apply-templates select="exception-info"/>
		</m>
	</xsl:template>

	<xsl:template match='exception-info'>
		<xsl:value-of select="lj:NEW_LINE()"/>
		<xsl:text>Exception: </xsl:text>
		<xsl:value-of select="message"/> at <xsl:value-of select="method"/>
		<xsl:apply-templates select="inner-exception"/>
	</xsl:template>

	<xsl:template match='inner-exception'>
		<xsl:value-of select="lj:NEW_LINE()"/>
		<xsl:text>Inner exception: </xsl:text>
		<xsl:value-of select="message"/> at <xsl:value-of select="method"/>
		<xsl:apply-templates select="inner-exception"/>
	</xsl:template>

</xsl:stylesheet>

Within XSLT code you can use standard XSL functions as well as those from the lj: namespace. The latter are helper functions that LogJoint adds to the XSLT processor. See the functions reference.
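To see what the stylesheet above is expected to produce for the second sample message, here is a rough Python emulation of the same mapping. It does not run XSLT and is not how LogJoint works internally; it only mirrors the three templates by hand. Two assumptions are made: lj:NEW_LINE() is taken to yield "\n", and lj:TO_DATETIME is taken to yield the yyyy-MM-ddTHH:mm:ss.fffffff form given in the schema. The names normalize and describe_exception are invented for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Mirrors the xsl:choose block: anything not listed falls back to "i".
SEVERITY_CODES = {"error": "e", "warning": "w"}


def describe_exception(node, label):
    """Mirror the exception-info and inner-exception templates, recursively."""
    lines = [f"{label}: {node.findtext('message')} at {node.findtext('method')}"]
    inner = node.find("inner-exception")
    if inner is not None:
        lines.extend(describe_exception(inner, "Inner exception"))
    return lines


def normalize(message_xml):
    """Build the normalized m element for one event message."""
    event = ET.fromstring(message_xml)
    when = datetime.strptime(event.get("timestamp"), "%Y-%m-%d %H:%M:%S.%f")
    m = ET.Element("m", {
        # %f gives six fractional digits; pad to the seven used in the schema.
        "d": when.strftime("%Y-%m-%dT%H:%M:%S.%f") + "0",
        "t": event.get("thread"),
        "s": SEVERITY_CODES.get(event.get("severity"), "i"),
    })
    lines = [(event.text or "").strip()]
    info = event.find("exception-info")
    if info is not None:
        lines.extend(describe_exception(info, "Exception"))
    m.text = "\n".join(lines)  # assumption: lj:NEW_LINE() is "\n"
    return m


SECOND_MESSAGE = """<event timestamp="2017-02-03 14:56:13.002" severity="error" thread="6d12">Ups! Error occurred!
  <exception-info>
    <message>Can not commit transaction</message>
    <method>Foo.Bar()</method>
    <inner-exception>
      <message>Invalid argument</message>
      <method>Foo.VerifyArgs()</method>
    </inner-exception>
  </exception-info>
</event>
"""

normalized = normalize(SECOND_MESSAGE)
print(ET.tostring(normalized, encoding="unicode"))
```

Under these assumptions the resulting m element carries d="2017-02-03T14:56:13.0020000", t="6d12", s="e", and a three-line text: the trimmed event text, the exception line, and the inner exception line.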