I’ve had various stabs over the years at tools that will dump out the whole connector space, or just the pending exports, and convert it into a CSV file for easy analysis. They often fall down on two things: the XML file produced by CSExport can be very large (way too big for Get-Content), and the whole file is all on one line. I’ve now taken the approach of breaking the XML out into multiple files which I can then parse easily.
Step one is to tackle the single-line problem. Because the XML file produced by CSExport is all on a single line, I can’t use a StreamReader to read it line by line. I looked into various other reading options (chunks and characters), but eventually decided to use an XSLT stylesheet to insert a line break after each <cs-object> node.
The stylesheet looks like this (saved as CSExportSplitLines.xslt):
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="//cs-object">
    <xsl:copy-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
Next I use PowerShell to create a copy of the CSExport file with the line breaks added:
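$XSLTPath = "CSExportSplitLines.xslt"
$SourceFile = "AD.XML"
$TargetFile = "AD_SplitLines.XML"
# .NET resolves relative paths against the process working directory,
# which may differ from the PowerShell location, so full paths are safer.
# XslTransform is marked obsolete in later .NET versions; XslCompiledTransform
# exposes the same Load and Transform calls.
$xslt = New-Object System.Xml.Xsl.XslTransform
$xslt.Load($XSLTPath)
$xslt.Transform($SourceFile, $TargetFile)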
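To see what the stylesheet does without running it against a real export, here is a minimal sketch that applies it to a tiny in-memory sample. The sample XML and its attribute values are made up. The xsl:output line is my addition to keep the XML declaration out of the result, and I've swapped in XslCompiledTransform with the line break written as &#10;, so treat it as an illustration rather than the exact code above.

```powershell
# Stylesheet from the post, plus an xsl:output line (an addition for this demo)
$xsltText = @'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>
  <xsl:template match="//cs-object">
    <xsl:copy-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
'@

# Made-up sample standing in for a CSExport file: everything on one line
$sample = '<cs-objects><cs-object cs-dn="CN=A" object-type="user"/><cs-object cs-dn="CN=B" object-type="group"/></cs-objects>'

# Load the stylesheet from the string instead of from a file
$xslt = New-Object System.Xml.Xsl.XslCompiledTransform
$xslt.Load([System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $xsltText)))

# Transform the sample into a StringWriter so the result can be inspected
$source = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $sample))
$output = New-Object System.IO.StringWriter
$xslt.Transform($source, $null, $output)

# Split the result into non-empty lines; each should hold one <cs-object> element
$lines = @($output.ToString() -split '\r?\n' | Where-Object { $_ })
$lines
```

The wrapping <cs-objects> element is dropped because only <cs-object> nodes are copied, which is fine here since each line is going to become its own file anyway.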
Then it’s a simple matter to read the new XML file one line at a time, writing out a temporary file for each one (note the snippet below uses $TempFolder which must be defined):
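$reader = [System.IO.File]::OpenText($TargetFile)
$i = 0
# Check EndOfStream before reading so an empty file produces no output files
while (-not $reader.EndOfStream) {
    $line = $reader.ReadLine()
    # Zero-padded names (0000000000.XML, 0000000001.XML, ...) keep the files in order
    $line | Out-File (Join-Path $TempFolder ($i.ToString().PadLeft(10,'0') + ".XML"))
    $i += 1
}
$reader.Close()
Remove-Item $TargetFile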
Depending on the size of your CSExport file this may produce a lot of files! But they’re all small and easy enough to loop through and load with Get-Content.
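As a rough sketch of that last step, the loop below loads each small file with Get-Content, casts it to XML, and collects one row per object for a CSV. The two sample files are made up so the snippet runs on its own, and the attribute names cs-dn and object-type are what I'd expect on a <cs-object> node, so check them against your own export. $CsvPath and the column names are just illustrative.

```powershell
# Made-up stand-ins for two of the per-object files produced by the split step
$TempFolder = Join-Path ([System.IO.Path]::GetTempPath()) ("CSExportDemo_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $TempFolder | Out-Null
'<cs-object cs-dn="CN=Alice,OU=Demo" object-type="user"/>'  | Out-File (Join-Path $TempFolder '0000000000.XML')
'<cs-object cs-dn="CN=Staff,OU=Demo" object-type="group"/>' | Out-File (Join-Path $TempFolder '0000000001.XML')

# Load each small file, pull out a couple of attributes, and build one row per object
$rows = foreach ($file in Get-ChildItem -Path $TempFolder -Filter *.XML | Sort-Object Name) {
    [xml]$doc = Get-Content -Path $file.FullName -Raw
    $cs = $doc.DocumentElement
    [pscustomobject]@{
        DN         = $cs.GetAttribute('cs-dn')        # assumed attribute name
        ObjectType = $cs.GetAttribute('object-type')  # assumed attribute name
    }
}

# Write the collected rows out as a CSV for analysis
$CsvPath = Join-Path $TempFolder 'objects.csv'
$rows | Export-Csv -Path $CsvPath -NoTypeInformation
```

Because each file holds a single object, a bad or unexpected node only affects its own file, and the loop can be extended to pick out whatever attributes or child elements you need.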