You want to write a script that generates a large report or large amount of data.
The best approach to generating a large amount of data is to take advantage of PowerShell’s streaming behavior whenever possible. Opt for solutions that pipeline data between commands:
Get-ChildItem C:\*.txt -Recurse | Out-File c:\temp\AllTextFiles.txt
rather than collect the output at each stage:
$files = Get-ChildItem C:\*.txt -Recurse
$files | Out-File c:\temp\AllTextFiles.txt
If your script generates a large text report (and streaming is not an option), use the StringBuilder
class:
$output = New-Object System.Text.StringBuilder
Get-ChildItem C:\*.txt -Recurse | ForEach-Object { [void] $output.AppendLine($_.FullName) }
$output.ToString()
rather than simple text concatenation:
$output = ""
Get-ChildItem C:\*.txt -Recurse | ForEach-Object { $output += $_.FullName }
$output
In PowerShell, combining commands in a pipeline is a fundamental concept. As scripts and cmdlets generate output, PowerShell passes that output to the next command in the pipeline as soon as it can. In the Solution, the Get-ChildItem
commands that retrieve all text files on the C: drive take a very long time to complete. However, since they begin to generate data almost immediately, PowerShell can pass that data on to the next command as soon as the Get-ChildItem
cmdlet produces it. This is true of any commands that generate or consume data and is called streaming. The pipeline completes almost as soon as the Get-ChildItem
cmdlet finishes producing its data and uses memory very efficiently as it does so.
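You can watch this streaming behavior for yourself with a quick, illustrative sketch (not part of the Solution): pipe the results through another cmdlet, and the file paths appear on the screen as Get-ChildItem discovers them, rather than all at once when the scan finishes.
Get-ChildItem C:\*.txt -Recurse | Select-Object -ExpandProperty FullName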
The second Get-ChildItem
example (which collects its data) prevents PowerShell from taking advantage of this streaming opportunity. It first stores all the files in an array, which, because of the amount of data, takes a long time and an enormous amount of memory. Then, it sends all those objects into the output file, which takes a long time as well.
However, most commands can consume data produced by the pipeline directly, as illustrated by the Out-File
cmdlet. For those commands, PowerShell provides streaming behavior as long as you combine the commands into a pipeline. For commands that do not support data coming from the pipeline directly, the ForEach-Object
cmdlet (with the aliases of foreach
and %) lets you work with each piece of data as the previous command produces it, as shown in the StringBuilder
example.
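If you prefer the terse form at the prompt, here is an equivalent sketch of that example using the % alias rather than the full cmdlet name; the script block still runs once per file as Get-ChildItem produces it:
$output = New-Object System.Text.StringBuilder
Get-ChildItem C:\*.txt -Recurse | % { [void] $output.AppendLine($_.FullName) }
$output.ToString()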
When you generate large reports, it’s common to store the entire report into a string, and then write that string out to a file once the script completes. You can usually accomplish this most effectively by streaming the text directly to its destination (a file or the screen), but sometimes this isn’t possible.
Since PowerShell makes it so easy to add more text to the end of a string (as in
$output
+= $_.FullName
), many initially opt for that approach. This works great for small-to-medium strings, but it causes significant performance problems for large strings.
As an example of this performance difference, compare the following:
PS > Measure-Command {
    $output = New-Object Text.StringBuilder
    1..10000 | ForEach-Object { $output.Append("Hello World") }
}
(...)
TotalSeconds      : 2.3471592

PS > Measure-Command {
    $output = ""
    1..10000 | ForEach-Object { $output += "Hello World" }
}
(...)
TotalSeconds      : 4.9884882
In the .NET Framework (and therefore PowerShell), strings never change after you create them. When you add more text to the end of a string, PowerShell has to build a new string by combining the two smaller strings. This operation takes a long time for large strings, which is why the .NET Framework includes the System.Text.StringBuilder
class. Unlike normal strings, the StringBuilder
class assumes that you will modify its data—an assumption that allows it to adapt to change much more
efficiently.
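You can see this immutability at work with a short, hypothetical example: appending to a string creates an entirely new string object, so any variable that still refers to the original is unaffected.
$original = "Hello"
$copy = $original
$original += " World"   # builds a brand-new string; $copy still refers to the old one
$copy                   # Hello
$original               # Hello World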