Welcome PowerShell User! This recipe is just one of the hundreds of useful resources contained in the PowerShell Cookbook.

9.5 Parse and Manage Text-Based Logfiles

Problem

You want to parse and analyze a text-based logfile using PowerShell’s standard object-based commands.

Solution

Use the ConvertFrom-String cmdlet described in Recipe 5.15 to work with text-based logfiles. With your assistance, it converts streams of text into streams of objects, which you can then easily work with using PowerShell’s standard commands.

Discussion

The ConvertFrom-String cmdlet primarily takes two arguments when you’re parsing logfiles:

  • A regular expression that describes how to break the incoming text into groups

  • A list of property names that the cmdlet then assigns to those text groups
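
To make these two arguments concrete, here is a rough hand-rolled sketch of the same idea: split a line wherever a regular expression matches, then pair the resulting groups with property names. The Convert-LineToObject function and the shortened sample line are hypothetical, and this is not how ConvertFrom-String is implemented internally.

```powershell
# Hypothetical helper: illustrates the delimiter + property-name idea only.
function Convert-LineToObject {
    param(
        [string] $Line,
        [string] $Delimiter = '\s+',   # regex that separates the text groups
        [string[]] $PropertyNames      # names assigned to the groups, in order
    )

    # Break the line into groups wherever the delimiter regex matches
    $groups = @($Line.Trim() -split $Delimiter)

    # Pair each group with its property name, preserving order
    $properties = [ordered] @{}
    for ($i = 0; $i -lt $PropertyNames.Count -and $i -lt $groups.Count; $i++) {
        $properties[$PropertyNames[$i]] = $groups[$i]
    }

    [pscustomobject] $properties
}

# Shortened sample record (assumption: only the first four fields)
$entry = Convert-LineToObject -Line '2020-12-22 15:49:56 ALLOW UDP' `
    -PropertyNames date, time, action, protocol
$entry.action    # ALLOW
```

Groups beyond the supplied names are simply dropped in this sketch, which keeps it short but differs from the cmdlet, which generates names such as P4 for extra groups.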

As Example 9-1 demonstrates, you can use firewall logs from the Windows directory. If enabled, these logs track inbound and outbound network connections on a machine.

Example 9-1. Examining the Windows firewall log
PS C:\WINDOWS\system32> Get-Content .\Logfiles\Firewall\pfirewall.log -Head 10
#Version: 1.5
#Software: Microsoft Windows Firewall
#Time Format: Local
#Fields: date time action protocol src-ip dst-ip src-port dst-port size tcpflags tcpsyn

2020-12-22 15:49:56 ALLOW UDP 192.168.1.132 208.67.222.222 51411 53 0 - - - - SEND
2020-12-22 15:49:57 ALLOW TCP 192.168.1.251 192.168.1.132 43223 32400 0 - 0 0 RECEIVE
2020-12-22 15:50:00 ALLOW TCP 192.168.1.251 192.168.1.132 43231 32400 0 - 0 0 RECEIVE
2020-12-22 15:50:01 ALLOW UDP 192.168.1.132 208.67.222.222 49998 53 0 - - - - SEND
2020-12-22 15:50:02 ALLOW TCP 192.168.1.132 168.62.58.130 58406 443 0 - 0 0 0 SEND
(...)

Like most logfiles, the format of the text is very regular but hard to manage. In this example, you have 10 fields that seem to be filled out, and some that aren’t.

Fortunately, this logfile documents its fields, so we can store them in an array:

$fields = -split ("date time action protocol src-ip dst-ip src-port dst-port size " +
    "tcpflags tcpsyn tcpack tcpwin icmptype icmpcode info path")

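The unary -split operator breaks a string on runs of whitespace, which is why one long string of names becomes an array. As a small sketch, the same operator can pull the names straight from a #Fields: header line. The $header value below is copied from the header as printed in Example 9-1, which lists fewer names than the full array, and $headerFields is a name chosen only for illustration.

```powershell
# Header line as it appears in Example 9-1 (shorter than the full field list)
$header = '#Fields: date time action protocol src-ip dst-ip src-port dst-port size tcpflags tcpsyn'

# Strip the "#Fields:" prefix, then let unary -split break on whitespace
$headerFields = -split ($header -replace '^#Fields:\s*', '')

$headerFields.Count    # 11
$headerFields[0]       # date
```
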
We don’t care about the first four lines because they’re just headers, so we can use Select-Object to skip those:

PS C:\WINDOWS\system32> Get-Content .\Logfiles\Firewall\pfirewall.log -Head 10 |
>>     Select-Object -Skip 4

2020-12-22 15:49:56 ALLOW UDP 192.168.1.132 208.67.222.222 51411 53 0 - - - - - SEND
2020-12-22 15:49:57 ALLOW TCP 192.168.1.251 192.168.1.132 43223 32400 0 - 0 0 0 RECEIVE
2020-12-22 15:50:00 ALLOW TCP 192.168.1.251 192.168.1.132 43231 32400 0 - 0 0 0 RECEIVE
2020-12-22 15:50:01 ALLOW UDP 192.168.1.132 208.67.222.222 49998 53 0 - - - - - SEND
2020-12-22 15:50:02 ALLOW TCP 192.168.1.132 168.62.58.130 58406 443 0 - 0 0 0 - SEND
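
Skipping a fixed number of lines works when the header size is known. As an alternative sketch, you could filter by content instead, dropping blank lines and any line that starts with #. The $sampleLines array below is a shortened, hypothetical stand-in for the start of the logfile.

```powershell
# Shortened sample lines standing in for the start of a logfile (assumption)
$sampleLines = @(
    '#Version: 1.5'
    '#Software: Microsoft Windows Firewall'
    '#Time Format: Local'
    '#Fields: date time action protocol'
    ''
    '2020-12-22 15:49:56 ALLOW UDP'
    '2020-12-22 15:49:57 ALLOW TCP'
)

# Keep only non-empty lines that do not start with '#'
$records = @($sampleLines | Where-Object { $_ -and $_ -notmatch '^#' })

$records.Count    # 2
```

This variant does not depend on the header being exactly four lines long, at the cost of testing every line against a pattern.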

And then finally let ConvertFrom-String parse the results based on whitespace:

PS C:\WINDOWS\system32> Get-Content .\Logfiles\Firewall\pfirewall.log -Head 10 |
>>     Select-Object -Skip 4 | ConvertFrom-String -PropertyNames $fields


date     : 2020-12-22
time     : 15:49:56
action   : ALLOW
protocol : UDP
src-ip   : 192.168.1.132
dst-ip   : 208.67.222.222
src-port : 51411
dst-port : 53
size     : 0
tcpflags : -
tcpsyn   : -
tcpack   : -
tcpwin   : -
icmptype : -
icmpcode : -
info     : -
path     : SEND

Once we’re happy with the results, we can remove the -Head 10 parameter to Get-Content to have PowerShell parse the whole logfile.

If this input weren’t so regular, we could also use a custom parsing expression on these records. For example, if we wanted to capture only the protocol (TCP or UDP) and whether it was a SEND or RECEIVE, we could do the following:

PS C:\WINDOWS\system32> $parseExpression = '.*(UDP|TCP).*(SEND|RECEIVE)'
>> Get-Content .\Logfiles\Firewall\pfirewall.log -Head 10 |
>>     Select-Object -Skip 4 |
>>     ConvertFrom-String -Delimiter $parseExpression -PropertyNames Ignored,
>>     Protocol,Direction

Ignored Protocol Direction P4
------- -------- --------- --
        UDP      SEND
        TCP      RECEIVE
        TCP      RECEIVE
        UDP      SEND
        TCP      SEND
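
If you would rather not carry placeholder columns such as Ignored, a similar result can be sketched with the -match operator and named capture groups. This is a swapped-in technique rather than the ConvertFrom-String approach shown above; the $sampleLines records are shortened, hypothetical versions of the log entries.

```powershell
# Shortened sample records (assumption: middle fields omitted for brevity)
$sampleLines = @(
    '2020-12-22 15:49:56 ALLOW UDP 192.168.1.132 208.67.222.222 SEND'
    '2020-12-22 15:49:57 ALLOW TCP 192.168.1.251 192.168.1.132 RECEIVE'
)

# Named groups become keys in the automatic $Matches hashtable
$pattern = '\b(?<Protocol>UDP|TCP)\b.*\b(?<Direction>SEND|RECEIVE)$'

$results = @(foreach ($line in $sampleLines) {
    if ($line -match $pattern) {
        [pscustomobject] @{
            Protocol  = $Matches.Protocol
            Direction = $Matches.Direction
        }
    }
})

$results[1].Direction    # RECEIVE
```

Lines that do not match the pattern are silently skipped here, so malformed records never become half-filled objects.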

We can now easily query those objects using PowerShell’s built-in commands. For example, you can find the IP addresses your system is communicating with the most:

$allConnections = Get-Content .\Logfiles\Firewall\pfirewall.log |
    Select-Object -Skip 4 | ConvertFrom-String -PropertyNames $fields
$allConnections | Group-Object dst-ip
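
Group-Object does not order its groups by count, so ranking the busiest destinations takes one more step. The sketch below sorts the groups by their Count property and keeps the top few; the $sampleConnections objects are hypothetical stand-ins for the parsed log entries.

```powershell
# Hypothetical stand-ins for the objects produced by parsing the log
$sampleConnections = @(
    [pscustomobject] @{ 'dst-ip' = '208.67.222.222' }
    [pscustomobject] @{ 'dst-ip' = '192.168.1.132' }
    [pscustomobject] @{ 'dst-ip' = '208.67.222.222' }
    [pscustomobject] @{ 'dst-ip' = '168.62.58.130' }
    [pscustomobject] @{ 'dst-ip' = '208.67.222.222' }
)

# Group by destination, rank by how often each one appears, keep the top three
$top = @($sampleConnections |
    Group-Object -Property 'dst-ip' |
    Sort-Object -Property Count -Descending |
    Select-Object -First 3 -Property Count, Name)

$top[0].Name     # 208.67.222.222
$top[0].Count    # 3
```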

Using this technique, you can work with most text-based logfiles.

For extremely large logfiles, handwritten parsing tools may not meet your needs. In those situations, specialized log management tools can prove helpful. One example is Microsoft’s free Log Parser. Another common alternative is to import the log entries into a SQL database, and then perform ad hoc queries on the database tables instead.

See Also

Recipe 5.15, “Convert Text Streams to Objects”

Appendix B, Regular Expression Reference