New REXX toy: rxsparse.rex converts a specified file on NTFS to a sparse file with FSUTIL SPARSE SETFLAG and FSUTIL SPARSE SETRANGE. Obviously you need fsutil.exe for this business, and it only works on NTFS.
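As a rough illustration of the two FSUTIL calls involved, here is a minimal Python sketch (not the REXX source of rxsparse) that builds the command lines for a list of zero runs. The function names and the example path are made up for this sketch; only the `fsutil sparse setflag` and `fsutil sparse setrange` subcommands come from the text above.

```python
import subprocess


def fsutil_sparse_commands(path, zero_runs):
    """Return fsutil command lines that flag `path` as sparse and then
    deallocate each (offset, length) run; both values are in bytes."""
    commands = [["fsutil", "sparse", "setflag", path]]
    for offset, length in zero_runs:
        commands.append(
            ["fsutil", "sparse", "setrange", path, str(offset), str(length)]
        )
    return commands


def run_commands(commands):
    """Execute the commands; only meaningful on Windows with an NTFS volume.
    check=True raises CalledProcessError if fsutil reports a failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical file name, shown only to print the resulting command lines.
for command in fsutil_sparse_commands(r"C:\temp\example.bin", [(126976, 4096)]):
    print(" ".join(command))
```

Building the command list separately from running it makes it easy to inspect what would be done before touching a real file.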
NTFS files can be compressed and decompressed on the fly on Windows NT when the corresponding file attribute is set, and with inheritance this can be arranged for complete subdirectory trees. This feature replaced the obscure DOS DoubleSpace approach, good riddance. Compression on the fly can be nice, but of course file accesses will be slower. Sparse files are an alternative: long runs of zero bytes in a file are not physically stored, but emulated. Unlike compression, that is fast, because only the position and length of the sparse ranges have to be noted.
For NTFS a long run is not an arbitrary length, but a multiple of the cluster size. As with FAT file systems, the NTFS cluster size is 1, 2, 4, …, or 128 sectors. Wikipedia claims that a long run consists of 64 KB, and rxsparse can check this theory in its self test:
    SPARSE SETFLAG C:\etc\bin\REXX\rxsparse.tmp
    655360 = 5*131072 bytes with 8, 16, 32, 64, 128 "zero"-sectors at the end:
    Bereich geringer Datendichte:
    SPARSE SETRANGE 126976 4096
    SPARSE SETRANGE 253952 8192
    SPARSE SETRANGE 376832 16384
    SPARSE SETRANGE 491520 32768
    SPARSE SETRANGE 589824 65536
    Bereich geringer Datendichte:

    SPARSE SETFLAG C:\etc\bin\REXX\rxsparse.tmp
    655360 = 5*131072 bytes with 8, 16, 32, 64, 128 "zero"-sectors at the begin:
    Bereich geringer Datendichte:
    SPARSE SETRANGE 0 4096
    SPARSE SETRANGE 131072 8192
    SPARSE SETRANGE 262144 16384
    SPARSE SETRANGE 393216 32768
    SPARSE SETRANGE 524288 65536
    Bereich geringer Datendichte:
    Bereich geringer Datendichte:

    Test okay, maybe delete C:\etc\bin\REXX\rxsparse.tmp
    The shown ranges [offset] [size] contain non-zero bytes; check that
    there are 1..4 remaining ranges. The smallest hidden zero-range size
    should be at least 4096 (otherwise edit BLKLEN in the source).
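The SETRANGE offsets in this self test can be reproduced with a short Python sketch. It is not rxsparse's actual REXX code: it only assumes, from the output above, five chunks of 131072 bytes, 512-byte sectors, and a block length of 4096 (the name BLKLEN is borrowed from the self-test message, every other name is invented here). Unlike rxsparse, the sketch treats any all-zero block as a run and applies no minimum run length.

```python
SECTOR = 512     # assumed sector size in bytes
BLKLEN = 4096    # assumed block (cluster) size, 8 sectors
CHUNK = 131072   # size of each test chunk in bytes


def build_test_image(zero_sectors=(8, 16, 32, 64, 128), zeros_at_end=True):
    """Concatenate chunks of non-zero bytes, each with a zero run at one end."""
    image = bytearray()
    for sectors in zero_sectors:
        zeros = bytes(sectors * SECTOR)
        data = b"\xff" * (CHUNK - len(zeros))
        image += (data + zeros) if zeros_at_end else (zeros + data)
    return bytes(image)


def find_zero_runs(image, blklen=BLKLEN):
    """Return (offset, length) for each maximal run of all-zero blocks,
    looking only at whole blocks aligned to multiples of blklen."""
    zero_block = bytes(blklen)
    runs = []
    start = None
    for offset in range(0, len(image) - blklen + 1, blklen):
        if image[offset:offset + blklen] == zero_block:
            if start is None:
                start = offset          # a new zero run begins here
        elif start is not None:
            runs.append((start, offset - start))
            start = None
    if start is not None:               # run reaches the last whole block
        end = len(image) - len(image) % blklen
        runs.append((start, end - start))
    return runs


for offset, length in find_zero_runs(build_test_image()):
    print("SPARSE SETRANGE", offset, length)
```

Calling `find_zero_runs(build_test_image(zeros_at_end=False))` gives the offsets of the second half of the self test.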
The German gibberish is the output of FSUTIL SPARSE QUERYRANGE on a German Windows 7 x64 SP1; "Bereich geringer Datendichte" translates roughly to "range of low data density". Oddly QUERYRANGE is the opposite of SETRANGE: it shows the ranges containing non-zero bytes. An NTFS cluster size of 4096 = 8×512 bytes is perfectly normal, but the definition of a long run could still differ for other NTFS cluster and sector sizes.
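Because QUERYRANGE reports the allocated ranges rather than the holes, the zero ranges have to be derived as the gaps between them. A minimal Python sketch of that complement follows; the function name is invented, and the example ranges are hypothetical values for the first test file under the assumption that holes are tracked at 4096-byte granularity, not output captured from FSUTIL.

```python
def zero_ranges(file_size, allocated):
    """Given allocated (offset, length) ranges sorted by offset, return the
    gaps between them (and up to file_size) as (offset, length) pairs."""
    holes = []
    position = 0
    for offset, length in allocated:
        if offset > position:
            holes.append((position, offset - position))
        position = max(position, offset + length)
    if position < file_size:
        holes.append((position, file_size - position))
    return holes


# Hypothetical allocated ranges, assuming 4096-byte granularity.
allocated = [
    (0, 126976),
    (131072, 122880),
    (262144, 114688),
    (393216, 98304),
    (524288, 65536),
]
for offset, length in zero_ranges(655360, allocated):
    print(offset, length)
```

If the file system only deallocates larger units, fewer and larger allocated ranges would come back, which is exactly what the self test is meant to reveal.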
Major caveat: When rxsparse has found the end of what it considers a long run of zero sectors, it has to close the file before using FSUTIL SPARSE SETRANGE. If another process manages to write non-zero bytes into the critical range before FSUTIL gets write access, these non-zero bytes are lost. Presumably REXX could somehow use a decent .NET API to avoid this potential race condition with the FSUTIL command line tool, but I didn't bother to figure it out.
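One way to avoid the close-and-reopen gap is to issue the file system control codes on the handle that is already open. The following is an untested Python sketch using the Win32 `DeviceIoControl` call through ctypes, not the .NET route mentioned above and not anything rxsparse does. The function names are invented; the control codes are composed the same way the Windows `CTL_CODE` macro does it. Note that a file opened with Python's `open` still permits other writers, so this narrows the window rather than closing it; excluding them would need a handle opened without write sharing.

```python
import struct
import sys


def ctl_code(device_type, function, method, access):
    """Compose an I/O control code like the Windows CTL_CODE macro."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


FILE_DEVICE_FILE_SYSTEM = 0x9
METHOD_BUFFERED = 0
FILE_SPECIAL_ACCESS = 0
FILE_WRITE_DATA = 0x2

FSCTL_SET_SPARSE = ctl_code(FILE_DEVICE_FILE_SYSTEM, 49,
                            METHOD_BUFFERED, FILE_SPECIAL_ACCESS)
FSCTL_SET_ZERO_DATA = ctl_code(FILE_DEVICE_FILE_SYSTEM, 50,
                               METHOD_BUFFERED, FILE_WRITE_DATA)


def zero_data_request(offset, length):
    """Pack FILE_ZERO_DATA_INFORMATION: FileOffset, then BeyondFinalZero."""
    return struct.pack("<qq", offset, offset + length)


def punch_hole(fileobj, offset, length):
    """Flag an open, writable file as sparse and deallocate one range
    without closing it in between (Windows only)."""
    if sys.platform != "win32":
        raise OSError("DeviceIoControl is only available on Windows")
    import ctypes
    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    handle = wintypes.HANDLE(msvcrt.get_osfhandle(fileobj.fileno()))
    returned = wintypes.DWORD(0)
    requests = ((FSCTL_SET_SPARSE, b""),
                (FSCTL_SET_ZERO_DATA, zero_data_request(offset, length)))
    for code, payload in requests:
        ok = kernel32.DeviceIoControl(handle, code, payload or None,
                                      len(payload), None, 0,
                                      ctypes.byref(returned), None)
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
```

A caller would open the file with `open(path, "r+b")`, verify that the range is still all zero, and then call `punch_hole` on the same file object.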
Sadly my use case "FAT32 VHD image on NTFS" does not work as expected for the VHD variant known as fixed VHD (type 2). For NTFS VHD images it is also pointless: DISKPART can handle NTFS VHDs after defragmentation and precompact in a VPC VM guest OS supporting the precompact VM addition, notably Windows 2000 or better. Any "defrag" and "wipe unused disk space" tool for FAT file systems on another guest OS presumably has the same effect as precompact, but DISKPART supports only NTFS for its part of the VHD compaction magic. Not yet tested: maybe rxsparse makes sense for dynamic FAT VHD images.
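To tell a fixed VHD (type 2) from a dynamic one (type 3) before trying anything, the disk type can be read from the VHD footer. This Python sketch is not part of rxsparse and its function names are invented. It assumes the footer layout of the published VHD format: the last 512 bytes of the file, starting with the cookie "conectix", with the disk type stored as a big-endian 32-bit value at offset 60.

```python
import struct

VHD_DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}


def vhd_disk_type(footer):
    """Return the disk type number stored in a 512-byte VHD footer."""
    if len(footer) != 512 or footer[:8] != b"conectix":
        raise ValueError("not a VHD footer")
    (disk_type,) = struct.unpack_from(">I", footer, 60)
    return disk_type


def read_vhd_disk_type(path):
    """Read the footer from the end of an image file and return its type."""
    with open(path, "rb") as image:
        image.seek(-512, 2)  # the footer occupies the last 512 bytes
        return vhd_disk_type(image.read(512))
```

`VHD_DISK_TYPES.get(read_vhd_disk_type(path), "unknown")` then gives a readable name for the variant.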