xyzzy: 2011

2011-11-28

FAT16 fun

Continuing the FAT fun saga with an unplanned sequel: rexxfat used partition type 05 instead of 06 for BigFAT, i.e., FAT16 with 2**16 or more sectors. I've fixed this stupid bug in the latest rexxfat version. The virtual hard disk image now works fine in a Windows 2000 virtual machine.
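
As a minimal sketch of the rule (Python, not the REXX source), the partition type for a CHS-addressed FAT16 volume could be picked as shown here. The function name is hypothetical, and type 04h for the smaller volumes is the conventional value, not something stated in this entry.

```python
def fat16_partition_type(total_sectors: int) -> int:
    """Return the classic MBR partition type for a FAT16 volume (sketch)."""
    if total_sectors < 2**16:
        return 0x04  # small FAT16, fewer than 65536 sectors (assumed convention)
    return 0x06      # BigFAT: FAT16 with 2**16 or more sectors


# Type 0x05 marks an extended partition, so it is never a valid answer here.
```
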

The odd size 3759 MB matches what I have on a memory card. Copying the disk image to the memory card is rather tricky: either rawcopy doesn't support copies to a physical device, or it was not obvious to me how to manage this. Using diskpart to unmount the one and only volume on this partitioned memory card wasn't good enough for write access; eventually the diskpart delete volume magic did the trick. Presumably that also explains my difficulties with other tools including dskprobe and HxD.

Looking for explanations I stumbled over SysInternals SDelete. That article doesn't explain how Windows 7 manages volumes on memory cards, but it answered another open question: compressed, encrypted and sparse files are managed by NTFS in 16-cluster blocks. That matches my observation for rxsparse.rex, 16 clusters × 4 KB = 64 KB, and if the SysInternals folks got that right Wikipedia also got it right. Hint: it is very likely that SysInternals got it right, after all they were the first to offer a working DOS NTFS driver permitting write access.

2011-11-24

REXX FAT

After creating a somewhat dubious MBR in disk images formatted by rexxfat I've now also added a dummy VBR doing more than only INT 18h (hex. CD 18). For an ordinary FAT12, FAT16, or FAT32 volume this VBR displays the OEM Id. (8 characters at offset 3 of the VBR), the file system type string (8 characters at offset -8 from the begin of the VBR code), the volume label (11 char.s, offset -19, default NO␠NAME␠␠␠␠), and the wannabe-unique volume Id. (4 bytes shown as 8 little endian hex. digits, offset -23).

Wikipedia claims that the file system type string and the volume label only exist for an extended boot signature introduced by 29h at offset -24 from the beginning of the code. In other words, for 28h only the volume Id. can be expected to exist, and for anything else these 1+4+11+8 bytes won't exist at all, notably on media formatted with PC DOS 3.x or earlier. The rexxfat dummy VBR code is designed to work for any 29h FAT layout with sector size 256, 512, or more. Check out the script for the OpenWatcom WASM assembler MBR and VBR sources in two REXX comments.
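
The offsets above can be tried out with a small Python sketch (not the WASM source in rexxfat). It assumes the usual layouts where the boot code starts at offset 62 for FAT12/FAT16 and at offset 90 for FAT32, and it guesses the layout from the 16-bit sectors-per-FAT field at offset 22; the function name and the dictionary keys are made up for this illustration.

```python
import struct


def read_extended_bpb(vbr: bytes) -> dict:
    """Extract the fields the dummy VBR displays, using offsets relative to the code."""
    # Assumption: a zero 16-bit sectors-per-FAT field (offset 22) means a FAT32 layout.
    fat16_size = struct.unpack_from("<H", vbr, 22)[0]
    code = 62 if fat16_size else 90          # assumed start of the boot code
    info = {"oem_id": vbr[3:11].decode("ascii", "replace")}
    signature = vbr[code - 24]               # extended boot signature, 28h or 29h
    info["signature"] = signature
    if signature in (0x28, 0x29):
        (vol_id,) = struct.unpack_from("<I", vbr, code - 23)
        info["volume_id"] = f"{vol_id >> 16:04X}-{vol_id & 0xFFFF:04X}"
    if signature == 0x29:                    # label and type only exist for 29h
        info["label"] = vbr[code - 19:code - 8].decode("ascii", "replace")
        info["fs_type"] = vbr[code - 8:code].decode("ascii", "replace")
    return info
```

For a signature other than 28h or 29h the sketch returns only the OEM Id. and the signature byte, in line with the Wikipedia claim quoted above.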

Apparently Darwin VBRs expect that SI points to their partition table entry in the MBR, and just in case rexxfat now guarantees that when it boots an active primary partition. Looking at a Darwin MBR source I think this is only used in its magic to boot a logical disk in an extended partition, i.e., the relative number of hidden sectors from the begin of the EBR to the logical disk VBR is updated on the fly to yield the absolute number of hidden sectors from MBR to VBR, and that's of course the LBA. This won't help normal FAT VBR code expecting the number of hidden sectors in its own BPB, and the rexxfat MBR limits its efforts to initialize SI as expected for primary partitions. Patching loaded FAT VBRs in an extended partition is left as an exercise for the EBR code — and so far rexxfat only formats one primary partition with a FAT. More convoluted tasks are a job for GRUB, BCDEDIT, diskpart, parted, or what you have.

YAM (Yet Another Mail)

The ways of the IETF are completely bizarre, and the Tao of IETF does not really explain why. In the latest episode the YAM WG is about to be closed after moving only two of four e-mail related RFCs to STD status:

STD 72 is RFC 6409 and covers "mail submission". That's what ordinary users like you and me do when they send e-mail using their ISP or another ESP such as Gmail, GMX, or Hotmail.
STD 71 is RFC 6152 and covers "8BITMIME". That's what everybody uses since about 1993 outside of pure US-ASCII 7bit e-mail.

Oddly YAM didn't finish the work on a 5321bis and a 5322bis. That leaves us with STD 10 (SMTP) and STD 11 (message format) in limbo, unless you still consider RFCs 821 and 822 (vintage 1982) as the ultimate truth in IETF e-mail standards.

For SPF it's quite interesting to look at the "reverse path" fine print of RFC 821, e.g.:

The first host in the <reverse-path> should be the host sending this command. (RFC 821 about MAIL FROM).
In any case, the SMTP adds its own identifier to the reverse-path. (RFC 821 about Relaying).

Sadly RFC 4408 specifying SPF also claims that this RFC 821 feature is archaic. Of course today nobody uses reverse and forward paths as designed about thirty years ago for e-mail routing, but for SPF the MAIL FROM address is still supposed to indicate the responsible party for technical problems with an e-mail, also known as bounce address. And e-mails forwarded to a third party need a new envelope sender address, otherwise SPF checks at this third party will fail, because IP-addresses authorized to send e-mail from the first party won't match IPs used by the forwarding second party. That's as simple as it can be, ignoring the minor detail that RFC 1123 broke it about twenty years ago.

By the way, the work on 4408bis is supposed to start really soon now, but my confidence in the IETF is somewhat limited.

Google ±

Google+ wants me to join Google+, because I'm on Google+.

2011-11-08

YAM (Yet Another MBR)

The Starman and Ray Knights disassembled various MBRs from ancient PC DOS to current Windows 7 versions, also covering LILO and GRUB MBRs. The code used by TestDisk — when you are forced to fix a broken MBR — is a variant of the source published by Neil Turton. The TestDisk variant tries to load the VBR for the active partition, and if that doesn't work the user can pick 1234A to load the VBR of partition 1…4 or floppy drive A:. The original code has a timeout allowing to select 1234A, and in that variant the active partition is only the default.

A very simple MBR for the old INT 13h CHS interface is distributed with parted. MBR CHS addressing works only for partitions starting in cylinder 0…1023 for any geometry, or in logical sector number (LBA) 0…16515071 in a (virtual) 1024×256×63 geometry. Some interesting details found in these MBRs and related sources:
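
For readers who want to check such limits, here is a minimal Python sketch of the usual CHS/LBA conversion. The function names are hypothetical, and the default geometry is simply the virtual 1024×256×63 one mentioned above.

```python
HEADS, SECTORS = 256, 63  # the (virtual) geometry used in the text


def chs_to_lba(c: int, h: int, s: int, heads: int = HEADS, spt: int = SECTORS) -> int:
    """CHS sector numbers start at 1, LBA numbers start at 0."""
    return (c * heads + h) * spt + (s - 1)


def lba_to_chs(lba: int, heads: int = HEADS, spt: int = SECTORS) -> tuple:
    c, rest = divmod(lba, heads * spt)
    h, s0 = divmod(rest, spt)
    return c, h, s0 + 1


def fits_mbr_chs(lba: int, heads: int = HEADS, spt: int = SECTORS) -> bool:
    """True if the sector can be addressed with a 10-bit cylinder number."""
    return lba_to_chs(lba, heads, spt)[0] <= 1023
```

The same functions work for other geometries, e.g., `lba_to_chs(8, heads=1, spt=6)` for a pseudo-geometry with one head and six sectors per track.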

All MBRs first define stack, data and extra segments, and then allow interrupts. On entry SS:SP can be 0000:0400h at the end of the interrupt vector table, and picking something better is a good plan. The new MBR code in rexxfat.rex simply uses 0000:0000h as stack top, i.e., the top of segment 0000h.
The entry address for both MBRs and VBRs is 0000:7C00h. To load any VBR at this address the MBR needs to move its own code and data to another address. Allegedly some odd BIOS INT 19h code manages to start the MBR at the alias 07C0:0000h. In other words, assume CS:nothing, or fix CS with a far jump. The rexxfat MBR is designed for sector sizes from 256=100h up to 4096=1000h bytes. Therefore the MBR copies itself to 0000:EC00h. Adding 1000h yields a stack bottom at 0000:FC00h; and a stack with 1024 bytes should be enough for INT 13h LBA read operations.
Allegedly some odd BIOS INT 19h code does not report the disk number for INT 13h in DL correctly. Old MBR code apparently used the active partition flag 80h to populate DL, i.e., any non-zero flag instead of 80h could be used. Variants of this scheme accept 80h…FFh as active, use DL as set by the BIOS, and reject 01h…7Fh as invalid partition table. The minimalistic rexxfat MBR accepts only 80h as active and does not look for trouble in the form of more than one boot partition. Actually it scans the flags backwards checking its own 55AAh magic first.
Most MBR variants support both CHS and LBA addressing. The minimalistic rexxfat MBR always uses LBA to load six instead of only one VBR sector. If something goes wrong, e.g., there is no active partition, or the loaded VBR has no 55AAh magic, an error is reported. At that point the user can press space, 0…9, a (lower case "A"), or any other key. 0…9 tries to load the MBR for DL=80h…89h, and a tries DL=00h for the first floppy drive. This won't work in emulation modes for say USB, where only 80h for this partitioned medium is supported. Pressing space triggers INT 18h, a modern BIOS would then try to boot the next configured boot medium. Any other key triggers INT 19h to boot the first configured boot medium.
Various Windows MBRs have special code for the FAT32 backup boot sectors and partition types 0Bh or 0Ch. Windows 7 MBRs support encrypted disks. If you need these features use a hex. editor and copy the code in the first sector up to offset 1B8h. The remaining 72 bytes are the disk ID (four bytes), two unclear nulls, 4×16 bytes for the four partitions, and the magic 55AAh. Little endian whiners, it's 0xAA55 in your debugger ;-)
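
The 72 bytes after offset 1B8h can be decoded with a few lines of Python. This is only a sketch for looking at disk images, not the rexxfat MBR logic; the function names and dictionary keys are invented, and the flag classification merely mirrors the variants described in the list above.

```python
import struct


def classify_flag(flag: int) -> str:
    """Boot flag interpretation as described for the various MBR variants."""
    if flag == 0x00:
        return "inactive"
    if flag == 0x80:
        return "active"
    if flag > 0x80:
        return "active for lenient MBRs only"  # 81h..FFh
    return "invalid"                           # 01h..7Fh


def parse_mbr_tail(sector: bytes) -> dict:
    tail = sector[0x1B8:0x200]                 # 72 bytes
    disk_id, nulls = struct.unpack_from("<IH", tail, 0)
    partitions = []
    for i in range(4):
        entry = tail[6 + 16 * i:22 + 16 * i]   # 4 x 16 bytes
        start_lba, size = struct.unpack_from("<II", entry, 8)
        partitions.append({
            "flag": classify_flag(entry[0]),
            "type": entry[4],
            "start_lba": start_lba,
            "size": size,
        })
    return {
        "disk_id": f"{disk_id >> 16:04X}-{disk_id & 0xFFFF:04X}",
        "partitions": partitions,
        "magic_ok": tail[70:72] == b"\x55\xAA",  # bytes 55h AAh, i.e. 0xAA55 little endian
    }
```
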

Allegedly MBRs and VBRs for sector sizes 1024, 2048, and 4096 are supposed to have their magic at the same offset as for sector size 512. At the moment the rexxfat MBR expects that the 72 bytes from offset 1B8h to 1FFh contain the disk ID, partition table, and magic. Flipping one byte in the code from 01h to 00h for the offset of the magic in theory allows sector size 256. In practice VHD images only allow 512, real hard disks used 512, and El Torito or as the name says 512e emulate 512.

2011-11-03

FAT32 fun

After NTFS and EXFAT now some FAT32 fun. The FORMAT and DISKPART tools on Windows NT have some limitations, therefore I created rexxfat.rex for experiments with disk image files and the FAT file system. In essence rexxfat can create a VFD image for a given number of sectors, example:

c:\etc\bin>rexxfat 4194304
1:   32264 * 2 FAT32 sectors;   7 +   0 SYS;   1 *   4129769 data:      4129769
2:   16257 * 2 FAT32 sectors;   8 +   0 SYS;   2 *   2080891 data:      4161782
3:    8161 * 2 FAT32 sectors;  10 +   0 SYS;   4 *   1044493 data:      4177972
4:    4089 * 2 FAT32 sectors;  14 +   0 SYS;   8 *    523264 data:      4186112
5:    2047 * 2 FAT32 sectors;  18 +   0 SYS;  16 *    261887 data:      4190192
6:    1024 * 2 FAT32 sectors;  32 +   0 SYS;  32 *    131007 data:      4192224
7:     128 * 2 FAT16 sectors;   1 + 127 SYS; 128 *     32765 data:      4193920
Pick 1..7 to create c:\etc\bin\rexxfat.vfd with 4194304 sectors

FAT16 cluster size 128 (hex. 80h) might be a bad idea for ancient tools. On the other hand FAT32 is not supported by ancient DOS versions up to MS DOS 6.22 or PC DOS 7.

FAT12 and FAT16 have a static root directory in the system area. For FAT32 the root directory is a dynamic part of the data area typically starting in the first data cluster. Consequently FAT12 and FAT16 never require unused sectors at the end of the data area, e.g., if a given cluster size cs=2**n yields u=1…cs-1 unused data sectors, then the data area can be reduced by u sectors, and the root directory can be increased by u sectors. rexxfat uses a default minimum of 6 sectors for the root directory, for sector size 512 that allows a minimum of 96 root directory entries.
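
A small Python sketch of this idea follows. It is not the rexxfat algorithm, only one straightforward way to get the same effect: take as many clusters as fit next to two FATs, one reserved sector and the minimal root directory, then give every leftover sector to the root directory. The function name and its parameters are hypothetical, and the FAT16 cluster limit is not checked.

```python
def fat16_layout(total: int, cs: int, sector_size: int = 512,
                 reserved: int = 1, fats: int = 2, min_root: int = 6) -> tuple:
    """Return (sectors per FAT, root directory sectors, clusters) for a FAT16 volume."""
    entries_per_sector = sector_size // 2            # 16-bit FAT entries
    clusters = (total - reserved - min_root) // cs   # optimistic start value
    while True:
        # Two extra FAT entries (0 and 1) are reserved, hence clusters + 2.
        fat = -(-(clusters + 2) // entries_per_sector)   # ceiling division
        root = total - reserved - fats * fat - cs * clusters
        if root >= min_root:
            return fat, root, clusters
        clusters -= 1
```

With `fat16_layout(4194304, 128)` this yields 128 sectors per FAT, 127 root directory sectors, and 32765 clusters, the same numbers as in line 7 of the rexxfat output above.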

For FAT32 any unused sectors are added to the reserved sectors at the beginning of the system area. Normally FAT32 has 12 or more reserved sectors consisting of four sets:

00..02 Three boot sectors (01 is actually used as FSinfo)
03..05 Three unused sectors (nulls, no magic 55AAh)
06..08 Three backup boot sectors (the FSinfo pointer in 06 is 01, not 07)
09..11 Three or more unused sectors

Fortunately this madness can be trimmed, for starters zero or more unused reserved sectors in the last set are good enough. It is also possible to move the backup boot sectors from 06..08 to 03..05, simply modify the backup pointer in 00 and its backup.
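
The FAT32 lines in the rexxfat output near the top of this entry can be reproduced with a short Python sketch. It assumes two FATs, 32-bit FAT entries, a minimum of 7 reserved sectors, and that every sector left over goes to the reserved area; whether rexxfat computes it in this way is an assumption, and the function name is made up.

```python
def fat32_layout(total: int, cs: int, sector_size: int = 512,
                 min_reserved: int = 7, fats: int = 2) -> tuple:
    """Return (sectors per FAT, reserved sectors, clusters) for a FAT32 volume."""
    entries_per_sector = sector_size // 4        # 32-bit FAT entries
    avail = total - min_reserved
    clusters = avail // cs                       # upper bound, FATs not yet counted
    while True:
        # Two extra FAT entries (0 and 1) are reserved, hence clusters + 2.
        fat = -(-(clusters + 2) // entries_per_sector)   # ceiling division
        unused = avail - fats * fat - cs * clusters
        if unused >= 0:
            # Leftover sectors are added to the reserved sectors.
            return fat, min_reserved + unused, clusters
        clusters -= 1
```

For example `fat32_layout(4194304, 8)` gives 4089 sectors per FAT, 14 reserved sectors, and 523264 clusters. The simple countdown loop is slow for huge volumes with cluster size 1, but good enough for experiments.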

There be dragons: If sector 00 is unreadable the backup pointer is also unreadable, and tools trying to fix or bypass this issue would assume that 06 is the backup. It is still possible to use 03..05 as official backup and keep 06 as unofficial third backup. For many purposes that should be good enough, with one notable exception: FAT32 boot code for MS DOS 7 (Windows 95, 98, ME) actually uses the third boot sector, and the Windows NT master boot code assumes that the backup boot sectors for FAT32 partitions start in relative sector 06. Obviously this cannot work as expected if MS DOS 7 backup boot code expected in sector 08 only exists in sector 05.

Fast forward, this is 2011, I've never used MS DOS 7, the dummy boot code in rexxfat consists of two bytes CD18h (INT 18h), and this works for all sector sizes in sectors 00, 03, or even 06. There is an option to enforce a classic layout with 9 or more reserved sectors.

For my purposes the odd layout with only 7 reserved sectors is better: adding one sector for the MBR yields 8, and removable media typically consist of a multiple of 8 sectors. That means no stupid gaps between the MBR and the FAT32 partition, and also no unused sectors after this partition or at its end. DISKPART and similar tools would skip 63-1 or 2048-1 sectors between MBR and partition, but for removable media with only one partition this is unnecessary: if some decent MBR code can load sector 63 or 2048 it can also load sector 1. There is no "second track" beginning in sector 63, and the new MS Vista alignment on MB boundaries (2048) is nice if you need to find a lost partition on a huge disk, but not required.

To create a VHD image instead of a VFD with rexxfat specify a negative total sector number as argument. One sector is used as MBR, the rest is used for the FAT partition. If the sector size is 512 a classic fixed VHD footer (511 bytes) is added. Sadly fixed VHDs cannot be created as NTFS sparse files, and also do not allow other sector sizes (128, 256, 1024, 2048, or 4096). Here's the checkmbr output for a VHD with 66558 sectors:

...\bin\rexxfat.vhd (assuming geometry CHS 11093   1  6) id. [16F4-1407]
   MBR CHS     0   0  1 at          0, end     0   0  1, size          1
1: 0C: CHS     0   0  2 at          1, end 11092   0  6, size      66557 FAT32
2: 00: CHS     0   0  0 at          0, end     0   0  0, size          0 unused
3: 00: CHS     0   0  0 at          0, end     0   0  0, size          0 unused
4: 00: CHS     0   0  0 at          0, end     0   0  0, size          0 unused
                                                        total      66558

 FAT32 CHS     0   0  2 at          1, end 11092   0  6, size      66557
  boot CHS     0   0  2 at          1, end     0   0  4, size          3
backup CHS     0   0  5 at          4, end     1   0  1, size          3 boot
  rest CHS     1   0  2 at          7, end     1   0  2, size          1 boot
 FAT32 CHS     1   0  3 at          8, end    86   0  4, size        512 #1
 FAT32 CHS    86   0  5 at        520, end   171   0  6, size        512 #2
  data CHS   172   0  1 at       1032, end 11092   0  6, size      65526
[16F4-1408] (cluster size   1, number      65526)       total      66557

This is work in progress, the pseudo-geometry in the VHD footer fits into 28=16+4+8 bits for ATA, but an incomplete last cylinder in a geometry with 24=10+8+6 bits for INT 13h would be better for less than 16515072=1024×256×63 sectors. After all there are no cylinders in a VHD image, and the checkmbr output is designed for INT 13h CHS tuples in MBRs, not for ATA CHS tuples in VHDs.

2011-11-02

SideWiki shutdown

Google SideWiki: good riddance. Just for fun I've saved my last (of five) entries below; they were all futile attempts to get in contact with a human being interested in fixing Google bugs. Next stop, let's kill Knol.

'Meta-searching' Google
The alleged "meta search" is an ordinary "cached" link offered in an ordinary Google search result, where the found Google groups article does not contain the searched `+"bigfat" fat16 -forum`, and I hoped that at least the cached result might be related to my query.
Instead I get a runabout ending in this "sidewiki" known to be yet another Dave Null at his support work. If you do not want cached Google groups search hits to be clicked just do not offer the links.
About Unusual traffic from your computer network - Web Search Help

Go, Duck, go

Reader redesign: Terrible decision, or worst decision? and another article discuss the next step of what will be known as the demise of Google: Now they have crippled Google Reader. It doesn't even work anymore in Chrome.

What used to be a single click Like or Share in Reader is now gone, and the new Reader UI is a disaster, as one of the creators of the original UI states. In theory I could join the G+ beta tests, but as it happens I'm not interested in risking my Google account including Blogger and Gmail with an acknowledgement of the G+ beta test ToS.

Check out Duck Duck Go lite while it's still a free no-nonsense search engine. And do not waste time with Google APIs or services, sadly they will be gone or munged until no-good before you begin to grok what they were about.

2011-10-30

EXFAT fun

For what it's worth checkmbr.rex can now handle EXFAT partitions. Partition type 07h is still tagged as NTFS, but the reported details now match EXFAT if applicable. Example:

       CHS     0   0  1 at          0, end     0 158 47, size      10000
hidden or total sectors 63 10000 do not match 0 10000
  boot CHS     0   0  1 at          0, end     0   0 12, size         12
backup CHS     0   0 13 at         12, end     0   0 24, size         12 boot
  rest CHS     0   0 25 at         24, end     0   2  2, size        104 boot
 EXFAT CHS     0   2  3 at        128, end     0   2 18, size         16 #1
unused CHS     0   2 19 at        144, end     0   4  4, size        112
  data CHS     0   4  5 at        256, end     0 158 46, size       9744
[6722-4357] (cluster size   8, number       1218)       total      10000

The Windows 7 FORMAT /FS:exFAT tool used an obscure number 63 for the hidden sectors in this unpartitioned VFD image also known as superfloppy, and checkmbr.rex dutifully reports that 63 is not 0. As long as you don't try to boot from a superfloppy or ordinary partition these hidden sectors are irrelevant.

It is interesting to see that FORMAT reserved 128=2×12+104 sectors for the 2×12 boot sectors. Most of the 12 boot sectors are already unused, and the boot checksum sector with 128 copies of the same 32bits checksum is hilarious. So what is the idea of the 104 additional sectors?

While at it Microsoft decided that two "FAT" copies are for cowards, and creates only one "FAT". It is not really a FAT, EXFAT uses a bit map for allocations, the "FAT" is only used for purposes where a bit is not good enough, i.e., bad clusters or fragmented cluster chains. And after saving 16 sectors for a second FAT there is another set of 112 apparently unused sectors, 128=16+112.

I'd get the idea if subsections of the system area are padded to max(4096/SS,CS) sectors: At some point in time we'll want to use 512e aligned to physical sector size 4096. But for that 56=2×12+2×16 instead of 256=2×128 would be good enough. For one FAT there are apparently 216=104+112 unused sectors, and instead of 1218 there could be 1245=1218+27 clusters (27=216/8).

SANS published a brilliant reverse engineering paper about EXFAT, but I'm not yet ready to outsmart FORMAT /FS:exFAT. Remotely related, checkmbr survived the forensic extended partition test case with two primary partitions in an extended partition. And I've fixed the output for zero FAT12 clusters. ToDo: checkmbr should report that 6 of the 16 FAT sectors are overkill for 1218 clusters, after all it does this already for FAT12/16/32.
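
The arithmetic in this entry is easy to redo in Python. The sketch assumes 512-byte sectors, 32-bit entries in the exFAT "FAT", and two reserved entries (0 and 1) in front of the first cluster; the names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def exfat_fat_sectors(clusters: int, sector_size: int = 512) -> int:
    """Sectors needed for one exFAT FAT: 4 bytes per entry, 2 reserved entries."""
    return ceil_div((clusters + 2) * 4, sector_size)


CLUSTER_SIZE = 8                     # sectors per cluster in the example above
UNUSED = 104 + 112                   # apparently unused sectors in the system area
EXTRA_CLUSTERS = UNUSED // CLUSTER_SIZE
POSSIBLE_CLUSTERS = 1218 + EXTRA_CLUSTERS
OVERKILL = 16 - exfat_fat_sectors(1218)   # FAT sectors that are not needed
```

Even with the extra clusters one FAT would still fit into the same number of sectors, i.e., `exfat_fat_sectors(POSSIBLE_CLUSTERS)` equals `exfat_fat_sectors(1218)`.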

NTFS fun

New REXX toy: rxsparse.rex converts a specified file on NTFS to a sparse file with FSUTIL SPARSE SETFLAG and FSUTIL SPARSE SETRANGE. Obviously you need fsutil.exe for this business, and it only works on NTFS.

NTFS files can be compressed and decompressed on the fly on Windows NT when the corresponding file attribute is set, and with inheritance this can be arranged for complete subdirectory trees. This feature replaced the obscure DOS DoubleSpace approach, good riddance. Compression on the fly can be nice, but of course file accesses will be slower. Sparse files are an alternative: long runs of zero bytes in a file are not physically stored, but emulated. Unlike compression that's fast, because only the position and length of the sparse ranges have to be recorded.
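
The basic idea can be sketched in Python as follows. This is not a translation of rxsparse.rex: the function names are invented, the data is scanned in memory instead of reading a file, and the block size of 4096 bytes is an assumption. Only the FSUTIL SPARSE SETFLAG and SETRANGE sub-commands are taken from the description above; the sketch builds the command lines but does not run them.

```python
def zero_runs(data: bytes, block: int = 4096) -> list:
    """Return (offset, length) pairs for runs of all-zero blocks."""
    runs, start = [], None
    for off in range(0, len(data), block):
        chunk = data[off:off + block]
        is_zero = chunk.count(0) == len(chunk)
        if is_zero and start is None:
            start = off                      # a new zero run begins here
        elif not is_zero and start is not None:
            runs.append((start, off - start))
            start = None
    if start is not None:                    # zero run reaching the end of the data
        runs.append((start, len(data) - start))
    return runs


def fsutil_commands(path: str, runs: list) -> list:
    """Build (but do not execute) the fsutil command lines for the given runs."""
    cmds = [["fsutil", "sparse", "setflag", path]]
    cmds += [["fsutil", "sparse", "setrange", path, str(off), str(length)]
             for off, length in runs]
    return cmds
```

On Windows the resulting lists could be passed to `subprocess.run`, one at a time.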

For NTFS a long run is not just any number of zero bytes, but a multiple of the cluster size. As for FAT file systems the NTFS cluster size is 1, 2, 4, …, or 128 sectors. Wikipedia claims that a long run consists of 64 KB, and rxsparse can check this theory in its self test:

SPARSE SETFLAG C:\etc\bin\REXX\rxsparse.tmp
655360 = 5*131072 bytes with 8, 16, 32, 64, 128 "zero"-sectors at the end:
Bereich geringer Datendichte: [0] [655360]
SPARSE SETRANGE 126976 4096
SPARSE SETRANGE 253952 8192
SPARSE SETRANGE 376832 16384
SPARSE SETRANGE 491520 32768
SPARSE SETRANGE 589824 65536
Bereich geringer Datendichte: [0] [589824]

SPARSE SETFLAG C:\etc\bin\REXX\rxsparse.tmp
655360 = 5*131072 bytes with 8, 16, 32, 64, 128 "zero"-sectors at the begin:
Bereich geringer Datendichte: [0] [655360]
SPARSE SETRANGE 0 4096
SPARSE SETRANGE 131072 8192
SPARSE SETRANGE 262144 16384
SPARSE SETRANGE 393216 32768
SPARSE SETRANGE 524288 65536
Bereich geringer Datendichte: [0] [524288]
Bereich geringer Datendichte: [589824] [65536]

Test okay, maybe delete C:\etc\bin\REXX\rxsparse.tmp

The shown ranges [offset] [size] contain non-zero
bytes; check that there are 1..4 remaining ranges.
The smallest hidden zero-range size should be at
least 4096 (otherwise edit BLKLEN in the source).

The German gibberish is the output of FSUTIL SPARSE QUERYRANGE on a German Windows 7 x64 SP1. Oddly this is the opposite of SETRANGE, it shows ranges containing non-zero bytes. NTFS cluster size 4096=8×512 is perfectly normal, the definition of long run could still differ for other NTFS cluster and sector sizes.
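
One way to read the self test result is that only complete, aligned 64 KB units inside a zero range are released. The Python sketch below models that reading; it is an assumption drawn from the output above, not documented NTFS behaviour, and the function name is hypothetical.

```python
UNIT = 16 * 4096  # 16 clusters of 4096 bytes = 64 KB (assumed granularity)


def deallocated_part(offset: int, length: int, unit: int = UNIT):
    """Return the (offset, length) of the aligned units inside a zero range, or None."""
    first = -(-offset // unit) * unit            # round the start up to a unit boundary
    last = (offset + length) // unit * unit      # round the end down to a unit boundary
    return (first, last - first) if last > first else None
```

In this model `deallocated_part(589824, 65536)` returns the complete range, while the four smaller ranges of each self test return `None`, matching the QUERYRANGE output.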

Major caveat: When rxsparse finds the end of what it considers a long run of zero sectors it has to close the file before using FSUTIL SPARSE SETRANGE. If another process manages to write non-zero bytes into the critical range before FSUTIL gets write access, these non-zero bytes are lost. Presumably REXX could somehow use a decent .NET API to avoid this potential race condition with the FSUTIL command line tool, but I didn't bother to figure it out.

Sadly my use case "FAT32 VHD image on NTFS" does not work as expected for the VHD variant known as fixed VHD (type 2). For NTFS VHD images it is also pointless, DISKPART can handle NTFS VHDs after defragmentation and precompact in a VPC VM guest OS supporting the precompact VM addition, notably Windows 2000 or better. Any "defrag" and wipe unused disk space tool for FAT file systems on another guest OS has presumably the same effect as precompact, but DISKPART supports only NTFS for its part of the VHD compactification magic. Not yet tested, maybe rxsparse makes sense for dynamic FAT VHD images.

2011-08-12

Customised Search Engines

More than four years ago I explained the secrets of LP and AH in a Google CSE query parameter cof=FORID:0%3BAH:center%3BLP:0. The percent-encoded hex. %3B is a semicolon separating FORID:0, AH:center, and LP:0 in this example going back to Google's ancient free site search.
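
Decoding such a cof= value takes only the Python standard library. A tiny sketch, with a made-up function name:

```python
from urllib.parse import unquote


def parse_cof(value: str) -> dict:
    """Split a percent-encoded cof= value into its KEY:value pairs."""
    return dict(item.split(":", 1) for item in unquote(value).split(";"))
```

For the example above `parse_cof("FORID:0%3BAH:center%3BLP:0")` returns a dictionary with the keys FORID, AH, and LP.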

This undocumented cruft is now seriously dead, or rather, AH:center and LP:0 now have no visible effect. Maybe the cof= parameter was removed while some new CSE features were introduced since June, see an entry on the Custom Search Blog — I've no idea what the Element might be, and the officially deprecated features didn't mention any undocumented cruft, but there, it's dead, and the new layout for CSE search results hosted by Google works fine for the xyzzy CSE. Fortunately I modified it to work on the CSE layout test page some months ago, and as it happens that is now apparently good enough for new search result pages without AH:center and LP:0.

I'll remove any remaining obsolete cof= parameters where I find them, it might be hidden in obscure places such as the template for this blog, persistent URL (PURL) redirections, rel="search" link relations in the header of my web pages, and a tiny CSE googlet (PURL of XML source).

BTW, if you were always looking for an IETF-related search engine test the xyzzy CSE as shown near the bottom of all web pages for this blog. I still update this CSE when I find new promising sites, one of the last additions was a site tracking link relation registration requests. I added it after the registration of link relation rel="canonical" on the corresponding IETF expert review mailing list.

And if you were always looking for a REXX-related search engine check out my second (and last) REXX CSE — I rarely use it, but fix it when I stumble over broken links on my KEXX page. For a recipe to convert CSEs given by their cx= ID to OpenSearch Description Documents see another four years old entry on this blog, or just adopt one of the OSDDs on my googlet page.

2011-07-29

MD5 test suite version 1.8

The MD5 test suite 1.8 finally mentions RFC 6151, fixes a minor exit code bug in version 1.7, supports the web-safe base64 encoding specified in RFC 4648 as base64url alphabet, uses the USCYBERCOM easter egg as MD5 streaming test case, and covers the collision announced by Tao Xie and Dengguo Feng in December 2010 (2 of 512 bits modified). Read their PDF (one page) for details of the offered bounty for anybody who finds another collision of this type.
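
Two of the mentioned features, the base64url alphabet and streaming, can be illustrated with Python's hashlib and base64 modules. This sketch is unrelated to the test suite code; the function names are hypothetical, and stripping the trailing padding is an assumption.

```python
import base64
import hashlib


def md5_base64url(data: bytes) -> str:
    """MD5 digest in the web-safe base64url alphabet of RFC 4648."""
    digest = hashlib.md5(data).digest()
    # Assumption: the trailing "=" padding is dropped.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def md5_streamed(chunks) -> str:
    """Feed the input piece by piece; the result must match a one-shot hash."""
    h = hashlib.md5()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()
```

A streaming test then boils down to comparing `md5_streamed([b"a", b"bc"])` with `hashlib.md5(b"abc").hexdigest()`.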

2011-07-14

checkMBR.rex

Some of my web pages are still messy and on their way to a new hoster, but there is a new version of checkmbr.rex (REXX script) maybe interesting for users of TestDisk. The script can analyze MBR disks on Windows platforms using \\.\PHYSICALDRIVEn for n=0..9, and identifies unused sectors caused by generous partitioning or by FAT and NTFS sizes not exactly fitting into their given partition. While at it the script creates base 64 backups of various boot sectors. Fixed bugs and new features:

The max. cluster numbers for FAT12/16/32 were "off by one". The hardwired limits are now 4084, 65524, and 268435444.
Added partition type 27h "WinRE" for a hidden windows recovery partition and file system NTFS.
exFAT uses partition type 07h also used by NTFS. This is unsupported and should result in lots of "NTFS errors" for exFAT (untested).
UEFI disks start with a protective MBR (partition type EEh). This is now reported, but checkMBR analyzes only MBR disks.
VFD (virtual floppy disk) and fixed VHD (virtual hard disk) files can now be given instead of a physical drive number.
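
The hardwired cluster limits from the first item translate into a simple classification. A Python sketch with a hypothetical function name (checkmbr.rex itself is REXX and may be organised differently):

```python
FAT12_MAX, FAT16_MAX, FAT32_MAX = 4084, 65524, 268435444


def fat_type(clusters: int) -> str:
    """Determine the FAT variant from the number of data clusters."""
    if clusters <= FAT12_MAX:
        return "FAT12"
    if clusters <= FAT16_MAX:
        return "FAT16"
    if clusters <= FAT32_MAX:
        return "FAT32"
    raise ValueError("too many clusters for any FAT variant")
```
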

The new VHD feature would fail for ooREXX 3.x and VHDs greater than 2 GB, get ooREXX 4.x. Many identified partition types are untested or ambiguous, e.g., I've never seen an EFI FAT type EFh roughly corresponding to a Windows system NTFS partition on MBR disks, or the partition types allegedly used by the FreeDOS FDISK tool to hide FAT, NTFS, or extended partitions.

2011-07-09

about: ftp:

There are I-Ds for two URI schemes:

about: URIs tackle about:blank and similar beasts.
ftp: URIs are another missing piece in the quest to finally get rid of RFC 1738.

Please review these drafts and send any feedback to the relevant IETF mailing lists — subscriptions are cheap, they cost you a working e-mail address. ;-)

So far nobody volunteered to finish the existing file: URI I-D (2005). There are some dragons with respect to different browsers and operating systems, notably IE vs. Firefox and Windows vs. Linux, but really, almost any file: URI scheme RFC based on the work for other schemes would be better than RFC 1738 vintage 1994 today. And you are not forced to emulate some negative records created by me for RFC 5538. I think file: and ftp: URIs are the last missing pieces in this puzzle. Check out the mailserver: (RFC 6196) and tn3270: (RFC 6270) URI schemes for pieces added in 2011.

Completely unrelated, and not deserving a separate article: I've recently added a CC-BY-SA license for this blog. That is mostly an experiment to show my support for Creative Commons and Wikimedia Commons as explained on my commons page with a fascinating (but long) video of a CCC lecture by Lawrence Lessig in 2006. Clearly I can't revert this license to some more restrictive form for existing pages, but maybe I'll change it to public domain in the future if I feel like it, or if somebody asks for it with a sound reason (from my POV).

2011-06-25

Firefox 5 in W2K under Windows 7 x64

Now that is really something new (for me) in Windows 7 x64:

What you see is the bottom of a 1152×768 virtual Windows 2000 SP4 desktop with quick links for Firefox 5, IE6, KeditW 1.6, etc. The host system is a Windows 7 x64 SP1 Home Premium Sony VAIO (1600×900, ATI Mobility Radeon, Catalyst 11.6). Sadly XP mode is not supported for Home Premium, or rather, I couldn't test a manual XP activation, because I have no XP license key.

Some invisible details:

VPC does not really work for guests older than XP SP3, you have to install the virtual machine additions 2004.
Virtual PC guy posted a picture of the real network card virtualized in VPC for up to four networks. For my purposes "internal NAT" allows me to share an existing mobile broadband WWAN connection. This is not the VPC default, change it in the VM settings while the VM is not running — hibernating might be good enough.
Without the VPC integration features I have not yet found a simple way to exchange files between guest and host (including simple things like the clipboard); WebDAV on a remote hoster clearly doesn't qualify as simple.
Using virtual floppies (VFD) is a major pain with Windows 7 VPC, but once you have attached a VFD to a VM it sticks. You can even format the VFD.
I didn't know how to "capture" (if that's the correct term) a W2K VHD from a genuine system, and used a public VHD. That beast required a lot of work (tons of missing updates, removal of obscure "bars" installed by the publisher, adding A/V-software, full scan with the latest MSRT, Firefox, Flash, Silverlight, 7Zip, XnView, DirectX, Secunia PSI 1.5x, etc.).
Just in case: Yes, WU still works for W2K (only up to June 2010 for OS + IE6 updates, but still for any MS Office stuff). Sadly the still working monthly MSRT requires manual downloads. Avira announced that they'll stop supporting Avira A/V Personal 10 under W2K in July 2011, which is kind of stupid: Anything better than W2K can and IMO should use MSE.
Somewhat unrelated, normally you cannot get ATI Catalyst 11.6 for Sony VAIO from AMD, but I found an unrestricted official download link in an obscure forum. Same procedure as for direct Flash AX downloads without Adobe's GetPlusPlus ~~mal~~Adware.
The latest Intel PIU fails to install under W2K, and an older version (working for W2K) died with an obscure error in VPC. FWIW installing DirectX 9 in the virtual W2K worked, but of course there isn't much to accelerate in a VPC. My real W2K is far slower than the virtual W2K: The real box has 256 MB RAM, the virtual box has 512 MB, and the Windows 7 host has about 3950 MB.
Sometimes the virtual W2K hangs and needs a hard reset (= close VM). I'm not sure when this started, among my suspects are Firefox 5, Avira Personal 10, and Secunia PSI 1.5x. It's also possible that my (failed) attempts to install the VPC integration features, or my inconclusive attempts to install the VPC 2007 VMA screwed up the VHD. There is a suspicious unknown device in the device manager, claiming to be working, and using IRQ7 — is this some kind of time synchronization? To break out of a restart hanging VM loop modify the VM settings to "unconditional close" or "always ask". Normally I'd prefer "auto-hibernate on close", but that doesn't help if the VM hangs.
While a VHD is not running or hibernating it is relatively simple to mount it as a virtual disk in Windows 7. Unmounting can be slightly tricky, but it's necessary to start any VM using the VHD.
Having fun with VMs I stumbled over a simple VFD driver and virtual CD control panel for Windows x86 platforms. Check out Elby CloneDrive for serious virtual CD applications. Apparently the MS XP virtual CD control panel 21 also works under W2K.
At some point in time I'll have to grok the diskpart manual — it would be nice to associate .vhd with a simple attach/detach (mount or unmount) script without going to a command line or the device manager.
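A possible starting point for such a script, sketched in Python. It only builds the text that diskpart would read from a script file. The select vdisk, attach vdisk, and detach vdisk commands and the /s switch belong to the Windows 7 diskpart; the function names and the example path are made up for illustration, and actually running diskpart needs an elevated prompt.

```python
import subprocess
import tempfile


def diskpart_script(vhd_path: str, attach: bool = True) -> str:
    """Build the two-line diskpart script for one VHD file."""
    action = "attach" if attach else "detach"
    return f'select vdisk file="{vhd_path}"\n{action} vdisk\n'


def run_diskpart(script: str) -> int:
    """Write the script to a temporary file and hand it to diskpart /s."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(script)
    return subprocess.call(["diskpart", "/s", handle.name])


if __name__ == "__main__":
    # Hypothetical path; print the script instead of running it.
    print(diskpart_script(r"C:\vm\w2k.vhd", attach=True), end="")
```

Associating .vhd with a wrapper around run_diskpart() would still require the VM to be closed first, as noted above.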

2011-06-06

Shortcut icon mysteries

Tweak My Blogger proposes to use three link relations to replace the default Blogger favicon with a custom favicon. Historical background:

There can be various link relations in the <head> element of HTML or XHTML pages, e.g., <link rel="search" … /> for OpenSearch descriptions. For favicons IE originally used the old Windows .ico format. That's a kind of container for related Windows .bmp images in various sizes and with different numbers of colours. A favicon .ico should include an image with size 16×16 and up to 256 colours, and it is not required to offer other sizes. If compatibility with say IE5 is not your main problem 32×32 or 64×64, and more than 256 colours, might also work: Applications are supposed to pick the best image offered in an .ico for their needs, and scale it up or down if necessary. If an automatically scaled down image does not work for your icon the .ico format allows to include optimized smaller versions, notably 16×16.
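To see which sizes an .ico actually offers, the directory at the start of the file can be read directly. The following Python sketch assumes the usual Microsoft layout (a 6-byte header followed by one 16-byte entry per image); the function name is invented here, and the sample data is a hand-made directory without pixel data.

```python
import struct


def ico_images(data: bytes):
    """Return (width, height, bits_per_pixel) for each image in an .ico."""
    reserved, kind, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or kind != 1:
        raise ValueError("not an .ico file")
    images = []
    for index in range(count):
        # Each directory entry is 16 bytes and follows the 6-byte header.
        width, height, _colours, _reserved, _planes, bits, _size, _offset = (
            struct.unpack_from("<BBBBHHII", data, 6 + 16 * index)
        )
        # A stored 0 stands for 256 pixels.
        images.append((width or 256, height or 256, bits))
    return images


if __name__ == "__main__":
    # Hand-made directory: a 16x16 8-bit entry and a 32x32 32-bit entry.
    header = struct.pack("<HHH", 0, 1, 2)
    entries = struct.pack("<BBBBHHII", 16, 16, 0, 0, 1, 8, 0, 38)
    entries += struct.pack("<BBBBHHII", 32, 32, 0, 0, 1, 32, 0, 38)
    print(ico_images(header + entries))
```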

Years ago there was no proper MIME type for these beasts, but type="image/x-icon" was widely supported. Later Microsoft registered type="image/vnd.microsoft.icon" for this purpose, and modern browsers are supposed to know this type as far as they support the odd .ico format at all.

Web servers might be still configured to associate .ico with image/x-icon, but that does not affect a correct image/vnd.microsoft.icon in the <head> of pages. Without a link relation IE simply tried to fetch a file favicon.ico from the relevant directory. For various reasons that was a bad idea, and modern browsers rely on explicit link relations instead of default locations for favicons and other purposes.

Other browsers and other platforms were not eager to support the odd Microsoft .ico format, but liked the idea of shortcut icons. A much better format is .png, but some old browsers including IE did not support it, or had issues with certain PNG-features. Somehow these differences resulted in two link relations, rel="icon" and rel="shortcut", for in essence the same purpose. Only rel="icon" is registered at the moment. It is possible to list more than one relation as in rel="shortcut icon", and today that should be good enough if the favicon is an .ico.

If you seriously hate this image format try rel="icon" type="image/png". GIF instead of PNG is also okay, of course excluding animated GIFs. Other formats are either unsuited for a favicon, e.g., JPEG and WebP, or still less widely supported, e.g., SVG, which will be an ideal solution for most kinds of icons.

So much for the theory of the favicon business. The practice on Blogger and other sites can be slightly more complex. IIRC you can't use the name favicon.ico, just pick something else, e.g., href="http://example.org/my-icon.ico" if this really is an .ico with MIME type="image/vnd.microsoft.icon". Whatever Blogger does, the type="icon/ico" recommended in Tweak My Blogger makes no sense, and at least in theory (not tested) more than one link relation is unnecessary. My Blogger template contains the following line immediately before the <title>:

<link href='http://purl.net/xyzzy/xyzzy.ico' rel='shortcut icon' type='image/vnd.microsoft.icon' />

Sadly Google Reader still shows the Blogger icon for the feed, and unsurprisingly a Google profile still shows the Blogger icon for a Blogger profile, but otherwise it works as expected. Maybe create a 57×57 PNG for I-Phones and I-Pads, and use the (unregistered) link relation rel="apple-touch-icon", it can't get odder. I have no I‑Panything to test this.

2011-05-01

URL fragment vs. query part

Has Google finally reached the ultimate Microsoft state, "breaking standards for fun and for profit" ? In a recent entry on the Google Operating System blog (not affiliated with Google) two screen shots show #fragments instead of ?queries. My attempt to post a comment had no visible effect, maybe comments are moderated or the blogger template is broken or whatever the problem might be, zero feedback to users is just wrong. Google Chrome lost the content of the comment form before I could reproduce it verbatim here, in essence it said:

In URLs a hash mark "#" introduces the optional fragment at the end of the URL, and a question mark "?" introduces the optional query part — also at the end of the URL, but before any fragment.

Fragments are the local business of the client (browser) and depend on the document (MIME type). Queries are the business of the server, e.g., many servers expect name=value pairs separated by ampersands "&", other servers also permit semicolons ";" as separator, and so on.

The query syntax depends on the scheme, here http:, the details depend on the server, but in both cases the overall URL syntax would require to percent-encode "#" if it is a part of the query not intended to start the fragment.

Notably fragments do not participate in redirections from one scheme or server to another scheme or server, the target location never sees the source fragment — browsers are not expected to transmit a fragment in requests such as HTTP GET.

In other words, something is wrong in the googlesystem screen shots.
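The split between query and fragment described in that comment can be checked with Python's standard urllib.parse. This is only a sketch: the example.org URLs and the helper name are invented for illustration.

```python
from urllib.parse import quote, urlsplit


def query_and_fragment(url: str):
    """Return the (query, fragment) pair of a URL."""
    parts = urlsplit(url)
    return parts.query, parts.fragment


if __name__ == "__main__":
    # Query first, fragment last.
    print(query_and_fragment("http://example.org/search?q=rexx&hl=en#results"))
    # An unencoded "#" ends the query and starts the (here empty) fragment.
    print(query_and_fragment("http://example.org/search?q=c#"))
    # Percent-encoded as %23 it stays inside the query.
    print(query_and_fragment("http://example.org/search?q=" + quote("c#", safe="")))
```

Only the part before the "#" would be sent to the server in an HTTP GET, which is why a server never sees what the screen shots show after the hash mark.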

2011-03-18

Google - you get what you paid for

With a few exceptions Google services have the unacceptable feature to be associated with an account forever. So if you test say "feedburner" once, and then decide that this is yet another data kraken and never use it again (for years), it will still be listed in the Google accounts dashboard. In theory you could delete the complete account including blogs, mails, images, videos, docs, sites, etc. — in practice folks trying this often show up in the blogger help forum whining about their lost blog and images.

Episode 1: I sent a complaint about "google moderator" to Google's privacy officer. This is actually an obscure HTML form creating no confirmation mail and no ticket number. After weeks my privacy complaint still got no reply. I'll test it again when I get around to it, complete with screenshots of the HTML form, as this issue might require an intervention by Hamburg's privacy officer.

Episode 2: My entry in "places" is associated with an off by two photo in "maps". After a considerable amount of time I figured out how to report this issue. This actually resulted in a timely mail answer, but the mail came from a noreply bounce address offering no way to answer. My report was that they swapped the photos for house X and X+2. Sh*t happens, this was incorrect, they systematically show X+2 for X for the given street. I have no way to fix my semi-erroneous report, this is stupid. Never send noreply mail if you are not a mail-bot, this infuriates customers and users. Why should they use valid and valuable mail-addresses, when the other side uses bounce-addresses?

Episode 3: The new Chrome "apps" store apparently records all installed apps, even after they were removed. I submitted a complaint about this, because I do not recall where or when I permitted this violation of my privacy, and I certainly do not know how to withdraw this permission.

2011-03-10

mailto: URLs

If you think that mailto: URLs are rather boring in comparison with http: you have a point, most mailto: URLs in the wild are straightforward and simple. But the syntax specified in RFC 2368 was based on

the RFC 1738 URL specification (1994),
the old STD 11 in RFC 822 (1982).

Updating the mailto: specification therefore had to replace the old (and tricky) RFC 822 syntax by the new RFC 5234 ABNF in STD 68 published 26 years later (2008). The actual content of RFC 822 is the Internet Message Format as used in e-mail and now specified in RFC 5322, the successor of RFC 2822. For URIs we are now at STD 66 in RFC 3986 (2005) with various subtle differences from the eleven years older RFC 1738. You won't often see these differences, but clearly there was no such thing as IPv6 in URIs back in 1994. UTF-8 was rarely used, supported 31-bit code points, and permitted "overlong" encodings, not exactly the STD 63 rules in RFC 3629 (2003) we use today.

The general syntax for URIs in STD 66 and the specific syntax of e-mail addresses in RFC 5322 are not directly compatible, some characters permitted in addresses are not permitted "as is" in URIs and have to be "percent-encoded" based on the hex. UTF-8 encoding. It's a miracle that the new mailto: URI specification in RFC 6068 managed to close these gaps in 2010, twelve years after RFC 2368. There are lots of interesting examples in RFC 6068, my favourite is a clever use of In-Reply-To e-mail header fields. The complicated examples are also fascinating. If you create a complicated e-mail address your chances that it works anywhere in a mailto: URL are now better. Well, at least it is specified, implementing it at the interface of browsers and mail user agents is another story. Well done, one "like" from me to RFC 6068.
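A rough idea of the percent-encoding involved, as a Python sketch. It is not a full RFC 6068 implementation: the helper name is made up, and it simply encodes everything except unreserved characters and "@", which is more than strictly required but harmless. The addresses are example.org style placeholders.

```python
from urllib.parse import quote


def mailto(address: str, headers=None) -> str:
    """Build a mailto: URI, percent-encoding based on the UTF-8 octets."""
    uri = "mailto:" + quote(address, safe="@")
    if headers:
        fields = [
            quote(name, safe="") + "=" + quote(value, safe="@")
            for name, value in headers.items()
        ]
        uri += "?" + "&".join(fields)
    return uri


if __name__ == "__main__":
    # A "%" in the local part has to become %25.
    print(mailto("gorby%kremvax@example.com"))
    # Angle brackets of a message id become %3C and %3E.
    print(mailto("list@example.org",
                 {"In-Reply-To": "<3469A91.D10AF4C@example.com>"}))
    # Non-ASCII characters are encoded as UTF-8 octets.
    print(mailto("user@example.org", {"subject": "café"}))
```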

2011-03-02

Service Status Feeds

My favourite tool for 2009 and 2010 (while I was mostly offline) is Google Reader, the feed reader offered by Google for "popular browsers" (read: IE6, FF2, or better), because it dutifully kept all my subscriptions, and still works like a charm.

In a Gmail blog article Google reported a bug in the Gmail service, apparently my account belongs to the 99.98% not affected by this issue. The article mentions an Apps Status Dashboard showing the published issues for various Google services including Gmail. Some years ago I would have added this page to my bookmarks, and checked it when something went wrong. Today it is much simpler to subscribe to the relevant RSS feed — hopefully this results in no updates when there are no known issues.

Two other examples of status feeds are Blogger and DynDNS.

2011-02-22

RFC 6055: IAB Thoughts on Encodings for Internationalized Domain Names

The new RFC 6055 is quite interesting, it discusses issues of IDNA vs. UTF-8 encodings in the DNS and other namespaces.

By definition any valid IDNA A-label, i.e., an ASCII compatible label starting with xn-- and following the IDNA rules specified in RFC 5890 and RFC 5891 (among others), has a corresponding Unicode U-label, i.e., a label containing non-ASCII characters. The IDNA encoding mechanism punycode tries hard to create short A-labels, because there is an upper limit of 63 octets per label, and another upper limit of 255 octets per FQDN DNS query consisting of zero or more labels.

RFC 6055 does not propose to use redundant UTF-8 DNS entries for raw U-labels in addition to the IDNA A-labels, so I guess there are too many cases where this would anyway not work, e.g., when an otherwise valid U-label consists of more than 63 UTF-8 octets.
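The two encodings of one label can be compared with Python's built-in punycode codec. This is a sketch under assumptions: it skips all IDNA validity and mapping rules, the helper names are invented, and it only makes sense for labels with at least one non-ASCII character.

```python
MAX_LABEL_OCTETS = 63  # DNS limit per label


def a_label(u_label: str) -> str:
    """Punycode-encode a non-ASCII label and add the xn-- prefix (no IDNA checks)."""
    return "xn--" + u_label.encode("punycode").decode("ascii")


def label_octets(u_label: str):
    """Return the octet counts of the raw UTF-8 label and of its A-label."""
    return len(u_label.encode("utf-8")), len(a_label(u_label))


if __name__ == "__main__":
    print(a_label("bücher"), label_octets("bücher"))
    # 32 two-octet characters no longer fit into one raw UTF-8 label.
    utf8_octets, ace_octets = label_octets("ü" * 32)
    print(utf8_octets > MAX_LABEL_OCTETS, ace_octets > MAX_LABEL_OCTETS)
```

For short labels the A-label is often the longer form; the compact encoding only pays off for longer runs of non-ASCII characters.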

When I get "a round tuit" I'll look into a now long expired EAI SPF draft, where using raw UTF-8 labels was the only viable solution for an EAI extension of SPF. Even if nobody implements this draft and/or if EAI never really makes it the last "tombstone" version should reference RFC 5321 (SMTP), RFC 5890, RFC 5891, and RFC 6055.

John Klensin is the author or a co-author of all RFCs mentioned here. I should check what else he published between RFC 5321 and RFC 6055, but it is nice to see that for at least one person in the world "RFC" still works as a "Request For Comments". :-)

2011-02-01

MD5 1.7: verified errata

All pending errata for the MD5 test suite are now verified by the IESG; thanks to Alexey Melnikov. The oldest erratum 749 about an MD5 example in RFC 2069 was submitted 2005-02-06 and verified 2010-07-11. Now I feel less bad about dropping out for two years.
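For readers without a REXX interpreter, the idea of such a test suite can be sketched in a few lines of Python with hashlib. This is not the REXX script: it only checks four of the well-known test vectors from the RFC 1321 appendix, and the names are invented here.

```python
import hashlib

# Message -> expected MD5 digest, taken from the RFC 1321 test suite.
RFC1321_VECTORS = {
    b"": "d41d8cd98f00b204e9800998ecf8427e",
    b"a": "0cc175b9c0f1b6a831c399e269772661",
    b"abc": "900150983cd24fb0d6963f7d28e17f72",
    b"message digest": "f96b697d7cb7938d525a2f31aaf161d0",
}


def failed_vectors(vectors=RFC1321_VECTORS):
    """Return the messages whose computed digest differs from the expected one."""
    return [
        message
        for message, expected in vectors.items()
        if hashlib.md5(message).hexdigest() != expected
    ]


if __name__ == "__main__":
    print("failed:", failed_vectors())
```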

For historical reasons the MD5 test suite is one of my REXX scripts still published as cmd-file instead of a rex-file. On an OS/2 box REXX is the default scripting language using file extension cmd. An ordinary OS/2 cmd.exe-shell script never starts with "/*"; any script starting with "/*" is interpreted as REXX.

Good old PC DOS 7 uses extension bat for command.com-shell scripts, and the text editor KEDIT uses kex for its macros; both also identify REXX by "/*" in line 1.

Please do not feed OS/2 REXX cmd-scripts to the Windows NT cmd.exe-shell; you would get numerous errors. For NT simply rename md5.cmd to md5.rex and let ooREXX interpret it.
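The "/*" convention is simple enough to check before renaming anything. A small Python sketch, with invented helper names, and assuming the comment really starts at the very first character of line 1:

```python
def looks_like_rexx(script_text: str) -> bool:
    """True if the script starts with a REXX comment, i.e. "/*" in line 1."""
    return script_text.startswith("/*")


def suggested_extension(script_text: str) -> str:
    """Pick .rex for REXX scripts on NT, and keep .cmd for shell scripts."""
    return ".rex" if looks_like_rexx(script_text) else ".cmd"


if __name__ == "__main__":
    print(suggested_extension("/* REXX */ say 'hello'"))
    print(suggested_extension("@echo off\r\necho hello"))
```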

2011-01-30

WebHop DDoS

DynDNS reports a DDoS attack against their Webhop services. This also affects xyzzy.webhop.info and xn--80akhbyknj4f.boldlygoingnowhere.org. The first address used to be an HTTP redirection to my mirror site, at the moment it is a redundant redirection to xyzzy.

The second address is used in some IDNAbis experiments with IRIs; the actual content is now in subpages of xyzzy/home/googlets.

2011-01-28

videotex

Instead of a thank you to the news.t-online.de team in RFC 5538 I added the historic videotex: URI scheme to the IANA registry.

When the WWW with http: and https: started to replace gopher: and various videotex: systems (BTX, minitel, Prestel) this URI scheme allowed links to videotex: resources for browsers supporting it. It was a nice system for limited clients including DOS boxes, but the software to create videotex: pages was very expensive.

Is it only me, or do we really invent the wheel again and again ? Now we have WAP and a .mobi TLD, but what was wrong with telnet: and gopher: ? At least ftp: is still alive and kicking. And maybe news: survives.

If you know a public specification for an unregistered URI scheme, please submit it to IANA following the rules in RFC 4395 — in a nutshell this is a public review on an expert mailing list.

2011-01-26

RFC 1849

The famous son-of-1036 Internet Draft about the Netnews article format was published as historic

RFC 1849: "Son of 1036": News Article Format and Transmission

Son-of-1036 (also known as 1036bis) was the first Internet Draft I ever saw, 15 years ago. Admittedly I didn't know the exact difference between draft and RFC at that time, let alone the various RFC series and status flags.

RFC 1849 was immediately obsoleted by the work of the IETF USEFOR WG:

RFC 5536: Netnews Article Format
RFC 5537: Netnews Architecture and Protocols

Folks interested in Usenet & Netnews can be rather stubborn, one of the reasons why this WG needed more than a decade to finish its work (presumably a new IETF record). Thanks to Charles Lindsey, Russ Allbery, Ken Murchison, Henry Spencer, Alexey Melnikov, Harald Alvestrand, Lisa Dusseault, and many other contributors on the USEFOR mailing list.

Thanks also to anybody helping to publish…

RFC 5538: The 'news' and 'nntp' URI Schemes

… while I was offline for more than two years. That this RFC took so long was clearly my fault. Anonymous credits I couldn't add in this RFC: The news.t-online.de team triggered my interest in an Internet Draft by A. Gilman about news:, nntp:, and snews: URLs. Later Paul Hoffman started the work to obsolete all old URL schemes in RFC 1738 by new RFCs based on the Internet Standard for URLs (STD 66, RFC 3986).

There are still some schemes in RFC 1738 not yet obsoleted by fresh RFCs, notably file:. Maybe Martin Dürst could finish his work on the awfully complex mailto: scheme, I have not yet checked this. If you find no old URL scheme to work on try dict:.

