Great, RFC 5064 about the Archived-At mail and news message header field got its number, just in time to count as an early Xmas gift ;-)
REXX, SPF, Internet drafts
- Google Code custom search
- en×de translations by LEO
- es×de translations by LEO
- fr×de translations by LEO
- About Flash googlet (version, links, search)
- Atomic clock googlet (JAVA applet of the PTB)
- Tiny map search googlet (local search)
The language subtag registry defined in BCP 47 was updated; it now contains the new region codes BL and MF, and Suppress-Script: Latn for the Frisian languages frr, frs, and fy, for the Sorbian languages dsb and hsb, and for the Low German and Swiss German languages nds and gsw.
The mis entry in ISO 639 was updated some months ago; its description is now "uncoded languages". I've created new experimental XML versions of the registry; for other formats check out the Language Tags site.
RFC 4234bis was approved, so we'll soon see a new Internet Standard (STD) about ABNF, the syntax used in many RFCs. An RFC defining the Archived-At header field in e-mail and news was approved earlier, while I was in essence incommunicado after a system crash... :-(
I've submitted new versions of the news and nntp URI draft, adding an appendix with a detailed example of the relations between Archived-At, Message-ID, Xref, news-, and nntp-URLs. With two typos in version 06 fixed, this draft has reached a point where it's easier to spoil it than to improve it.
Admittedly it's almost impossible to fix this bug based on a DTD: renaming %URI; to %IRI; (as the related XML schema renames anyURI) has the same effect for DTD validators as renaming it to %FOO;. The datatype is still CDATA, or in other words (almost) anything goes.
Hopefully even DTD validators will soon be fixed to check URIs. Broken URLs are abused for attacks; ironically that was a side effect of better URI tests, as several applications failed to check the generic RFC 3986 syntax. All valid URIs match this generic syntax; scheme-specific URIs are proper subsets of it. URI "producers" including MediaWiki as well as URI "consumers" including validators have to get this right, otherwise bad things happen.
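As a rough illustration of what such a check could look like: a sketch, not a full RFC 3986 validator, with the character classes heavily simplified. It only verifies that every character is either a percent-encoded triplet or one of the unreserved/reserved URI characters, which is enough to catch the "unencoded" IRIs mentioned above:

```python
import re

# Simplified sketch: percent-encoded triplets plus the unreserved,
# gen-delims, and sub-delims characters of RFC 3986; anything else,
# e.g. a raw "ä" or a space, is not a valid URI character.
URI_CHARS = re.compile(r"^(?:%[0-9A-Fa-f]{2}|[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=])*$")

def looks_like_uri(candidate: str) -> bool:
    """True if the string contains only characters allowed in a URI."""
    return bool(URI_CHARS.match(candidate))

print(looks_like_uri("http://example.org/?q=caf%C3%A9"))  # True
print(looks_like_uri("http://example.org/?q=café"))       # False, raw IRI character
```

A real validator would of course also have to check the scheme-specific syntax, not only the character repertoire.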
The folks at validome hope to get this right soon; schema validators have an advantage here. They already identify the IDNwiki and its E-mail test as invalid, a big oops for accessibility tests (IANAL).
Update: The IDNwiki pages were fixed 2007-11-21.
Version 2.1 of rxwhois knows the new TLDs BL, MF, and TEL. The corresponding whois servers are not yet running, but whois.iana.org already supports these TLDs. The corresponding region codes BL and MF will be added to the IANA language subtag registry in the next weeks.
The worst issue so far from my POV is that popular XHTML validators like the W3C validator don't check the URL syntax in attributes like href="URI"; see bug 4916. Many users will be misled into creating invalid pages with "unencoded" IRIs in document types like HTML 4.01 and XHTML 1.0, where that's not allowed.
Now the bad news: the issue with two md5-sess examples in draft smith-sipping-auth-examples might in fact be precisely what RFC 2617 says, as reported in a semi-official erratum. If that's correct, the md5-sess in RFC 2831 would be different. Hopefully draft melnikov-digest-to-historic will shed some light on this before it moves RFC 2831 to historic. For more about this see the IETF SASL WG mailing list.
For now the MD5 test suite still uses only the binary x2c(HA1) form instead of the hexadecimal HA1 form in its md5-sess calculation.
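For illustration, here is the difference in Python with hashlib; the user, realm, password, and nonces are made-up sample values, only the structure matters. The RFC 2831 reading concatenates the 16 raw octets of the inner hash, the other reading its 32 hex digits:

```python
import hashlib

def md5(data: bytes) -> bytes:
    return hashlib.md5(data).digest()

def md5hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

# made-up sample values, only the structure matters here
user, realm, passwd = b"chris", b"example.org", b"secret"
nonce, cnonce = b"OA6MG9tE", b"OA6MHXh6"

inner = md5(user + b":" + realm + b":" + passwd)

# RFC 2831 reading: x2c(HA1), i.e. the 16 binary octets of the inner MD5
ha1_bin = md5hex(inner + b":" + nonce + b":" + cnonce)

# the other reading: hex. HA1, i.e. the 32 lowercase hex digits
ha1_hex = md5hex(inner.hex().encode() + b":" + nonce + b":" + cnonce)

print(ha1_bin == ha1_hex)  # False: the two readings disagree
```

Both produce syntactically valid digests, which is exactly why the ambiguity went unnoticed for so long.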
Admittedly RFC 2070 is old, and its status is historic. But it was the first HTML specification with I18N based on Unicode, and the last HTML specification published by the IETF.
So far its DTD had to be extracted manually from the RFC, now IANA hosts an official master copy with the public identifier urn:ietf:params:xml:pi:-:IETF:DTD+HTML+i18N:EN. The urn:ietf:params:xml:pi registry was created by RFC 3688 for DTDs developed by the IETF. Of course HTML i18n is still SGML, not XML, but its DTD is now the first registered IETF DTD.
If you have old HTML i18n documents you can use a URL of this DTD as system identifier like this:
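Something along these lines should do; the public identifier is the one from RFC 2070, while the system URL is only my guess at how IANA might expose the registered DTD, so adjust it to whatever IANA actually publishes:

```html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML i18n//EN"
  "http://www.iana.org/assignments/xml-registry/pi/-/IETF/DTD+HTML+i18N/EN">
```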
Many OS/2 SAA REXX programs work almost as is under WindowsNT ooRexx: the RexxUtil functions are similar, the WindowsNT CMD shell is similar, and the RxSock interface is almost identical. Some OS/2 RexxUtil functions are not yet, or no longer, supported under WindowsNT, e.g. SysGetMessage, SysProcessType, and SysQueryProcessCodePage.
Version 2.0.5 of rxwhois.cmd now also works as WindowsNT ooRexx script, just rename it to rxwhois.rex. I've adopted additional local character sets from utf-8.cmd including codepage 923 (ISO 8859-15, Latin 9) and 878 (KOI8-R), but I've only tested 858 (pc-multilingual-850+euro) and 1252 (windows-1252).
As always some whois-servers for ccTLDs had to be updated, for details see the source. After a system crash of my OS/2 box in June I was unable to check anything beyond the whois servers already known by rxwhois.cmd, whois.iana.org, and whois-servers.net. Just for fun I've added the eleven IDN TLDs for the test beginning in September 2007:
- Chinese, simplified
- Chinese, traditional
Known issue: rxwhois.cmd expects UTF-8 as the charset of whois servers, but whois.iana.org uses Latin-1 for at least one TLD entry, ht (Haiti). The IANA folks told me that they intend to use ASCII data for the eleven IDN test TLDs.
This CSE is identified by cx=003258325049489668794:ru2dpahviq8. The &cx=-parameter is used in links and anything else related to this CSE. The left hand side 003258325049489668794 is related to the Google account and the up to 5000 annotations (e.g. sites and URL patterns) associated with this account. The right hand side ru2dpahviq8 is related to the actual CSE context including details of its layout, references to the associated annotations also known as background labels, etc.
It's not my CSE, so I can ignore most technical details relevant only to the CSE creator. One detail is probably important: this CSE uses FORID:1, unlike my own CSEs with FORID:0. The value is visible in the monstrous URL of search results; it's part of the &cof= parameter.
Most other layout details noted in &cof= are set by Google on the fly based on the CSE definition a.k.a. context. For my own CSEs I force LP:0 and AH:center with &cof=FORID%3A0%3BLP%3A0%3BAH%3Acenter, but that's arguably pointless: opensearch only works with Firefox 2, IE7, or better, and these browsers have no issues with the default LP:1 logo position and AH:left aligned header on result pages.
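The percent-encoded cof= value is nothing magic, it's just ordinary URL encoding of the colons and semicolons. A quick sketch of how such a result URL is built (the query string and parameter order are of course only examples):

```python
from urllib.parse import urlencode

params = {
    "cx": "003258325049489668794:ru2dpahviq8",  # account:context, as above
    "cof": "FORID:0;LP:0;AH:center",            # forced layout details
    "q": "example query",
}
url = "http://www.google.com/cse?" + urlencode(params)
print(url)
# the colons and semicolons end up percent-encoded:
# ...cof=FORID%3A0%3BLP%3A0%3BAH%3Acenter...
```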
CSEs refuse to return &output=xml or &output=xml-no-dtd results, therefore the opensearch description needs only one type="text/html" template. Just in case I added...
<Attribution> Google CSE by Jason Kersey </Attribution>
...anyway, after all, the search results are Google results; in this case results filtered and rearranged as defined by the CSE creator Jason Kersey. With up to three searched sites in a CSE, Google allegedly also shows its supplemental results.
Putting it all together I arrived at this mozillazine opensearch description. I've no clue how and where Firefox or IE7 might use the Tags or Description, most likely these details are irrelevant for search results on ordinary type="text/html" pages. The validator wants a Query example as specified by opensearch.org, just for fun I picked about%3Aconfig.
One last detail, the icon: fortunately kb.mozillazine.org has a type="image/vnd.microsoft.icon" 16×16 favicon needing less than 10 KB, so this should work as is (http:-URL instead of data:-URL) for Firefox. It's tricky to get the icon right with *.googlepages.com; the Google Page Creator won't let you have your own favicon.ico. Just use another name.
One way to use opensearch descriptions is to add a link in the header of (X)HTML pages. The title in the link should match the ShortName in the description, otherwise browsers won't know if the corresponding search is already installed. I've done that here in my blogger-template:
Another way is the window.external.AddSearchProvider function: Firefox 2 users can then simply click on a link to copy an opensearch description to their browser. I haven't tested IE7; maybe it uses the same method, i.e. "copy description". Last step: test this OpenSearch description with Firefox 2 or better.
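The two variants look roughly like this in (X)HTML; the file name, URL, and title are placeholders, and the title should match the ShortName in the description:

```html
<!-- auto-discovery link in the page header -->
<link rel="search" type="application/opensearchdescription+xml"
      title="MozillaZine KB" href="http://example.org/mozillazine.xml" />

<!-- explicit install link for Firefox 2 (and possibly IE7) -->
<a href="javascript:window.external.AddSearchProvider('http://example.org/mozillazine.xml')">
  add this search</a>
```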
For another example see my googlets page.
Yet another mailto command line tool: rxmailto.cmd v0.1 can so far send one text mail to one receiver via a Mail Submission Agent (MSA) at port 587 supporting SMTP AUTH with CRAM-MD5 and 8bitMIME. That's the minimum I could get away with after the spam flood finally drowned my old mailbox.
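CRAM-MD5 itself is simple, it's just a keyed MD5 (HMAC) over the server's challenge, per RFC 2195. A Python sketch of the client side; user name, password, and challenge are made-up sample values:

```python
import base64
import hashlib
import hmac

def cram_md5_response(user: str, password: str, b64_challenge: str) -> str:
    """Answer a CRAM-MD5 challenge as sent by the MSA after AUTH CRAM-MD5."""
    challenge = base64.b64decode(b64_challenge)
    digest = hmac.new(password.encode(), challenge, hashlib.md5).hexdigest()
    return base64.b64encode(f"{user} {digest}".encode()).decode()

# made-up challenge in the usual msg-id style
challenge = base64.b64encode(b"<1896.697170952@postoffice.example>").decode()
print(cram_md5_response("tim", "tanstaaftanstaaf", challenge))
```

The response, once base64-decoded, is the user name, a space, and 32 lowercase hex digits.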
Various details are far from perfect. E.g. if a run of words with non-ASCII characters in the subject is longer than 56 octets, the subject encoder will emit a folded line longer than 76 characters, and that's not permitted by RFC 2047. On the other hand the script won't break UTF-8 characters in the subject on platforms with UTF-8 as the local charset. You get what you pay for, less than 30 KB. ;-)
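For comparison, Python's email.header module implements the RFC 2047 folding the script skimps on; a sketch with a sample subject full of non-ASCII characters:

```python
from email.header import Header

# sample subject with plenty of non-ASCII characters
subject = "Grüße aus Köln und noch mehr Wörter mit Umlauten äöü ßßß"

# maxlinelen=76 enforces the RFC 2047 limit, header_name accounts
# for the "Subject: " prefix on the first folded line
encoded = Header(subject, charset="utf-8", maxlinelen=76,
                 header_name="Subject").encode()
print(encoded)
```

Each folded line of the output stays within 76 characters, and UTF-8 characters are never split across encoded words.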
The MD5 test suite version 1.2 finally supports streaming and bit string input:
hash = MD5( bytes )          ==> MD5 of an octet string
ctxt = MD5( bytes, ''   )    ==> init. new MD5 context
ctxt = MD5( bytes, ctxt )    ==> update old MD5 context
hash = MD5( /**/ , ctxt )    ==> finalize MD5 context
hash = MD5( bytes, /**/, n ) ==> MD5 of n zero-fill bits
ctxt = MD5( bytes, '' , n )  ==> init. MD5 bit context
ctxt = MD5( bytes, ctxt, n ) ==> update MD5 bit context
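For comparison, the same init/update/finalize pattern with Python's hashlib (octet input only; hashlib has no bit string interface):

```python
import hashlib

ctxt = hashlib.md5()                     # init. a new MD5 context
ctxt.update(b"The quick brown fox ")     # update with a first chunk
ctxt.update(b"jumps over the lazy dog")  # update with a second chunk
digest = ctxt.hexdigest()                # finalize the context

# streaming yields the same hash as the whole octet string at once
print(digest == hashlib.md5(b"The quick brown fox jumps over the lazy dog").hexdigest())  # True
```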
Also added: APR1 can compute the hashed passwords used by BSD and Apache htpasswd. The same function is offered by openssl passwd -1 and openssl passwd -apr1; for details see the manual of the openssl command line tool.
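For a quick cross-check with the openssl command line tool; the fixed salt makes the output reproducible, and the password is of course just a sample:

```shell
# MD5-based BSD password hash, recognizable by its $1$ prefix
openssl passwd -1 -salt lz6a9Gx3 secret

# Apache htpasswd variant, recognizable by its $apr1$ prefix
openssl passwd -apr1 -salt lz6a9Gx3 secret
```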
Unrelated, Google's page creator rewrites an uploaded sitemap 0.90 automagically into a sitemap 0.84 removing all <lastmod> elements. Or rather it did that last week for e.g. this sitemap, maybe it's one of the experimental features.
RFC 4590 contains four examples for Auth Digest. That's in essence the same as Digest-MD5 defined in RFC 2831, only based on the older RFC 2617. The examples were apparently copied as is into RFC 4590bis drafts. I've added the 2*2 examples (INVITE+rspauth, GET+rspauth) to md5.cmd (1.1).
The RFC 4590 examples still fail in my MD5 test suite; or rather, my attempt to guess the used password failed. There's also an oddity in these examples not yet supported by the REXX script:
RFC 2617 states that a client sending any qop= parameter, for the RFC 4590 examples that's qop=auth, MUST also send a cnonce= (client nonce) together with an NC= (nonce counter). In the RFC 4590 examples the client doesn't do that, causing a trap in my REXX script.
There are two plausible ways to fix this: either use the RFC 2069 fallback algorithm, or simply omit the missing NC and CNONCE. In simplified REXX the second solution would be:
return MD5( HA1 || ':' || NONCE || ':auth:' || MD5( XURL ))
The first (2069) solution would use a colon : instead of :auth:. The "official" RFC 2617 string instead of :auth: is:
':' || NC || ':' || CNONCE || ':' || QOP || ':'
Other variants of what RFC 4590 actually wants could use an empty CNONCE with a dummy NC, along the lines of :00000001::auth:. As always, Digest-MD5 is messy.
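The candidate calculations side by side, sketched in Python with made-up values; HA2 stands for MD5(XURL) as in the REXX snippet above, and all credentials are invented:

```python
import hashlib

def md5hex(s: str) -> str:
    return hashlib.md5(s.encode()).hexdigest()

# made-up sample values, only the structure of the strings matters
HA1   = md5hex("user:example.org:secret")
HA2   = md5hex("GET:sip:user@example.org")   # i.e. MD5( XURL )
NONCE = "3bada1a084"

# RFC 2069 fallback: a plain colon, no QOP, NC, or CNONCE at all
resp_2069 = md5hex(HA1 + ":" + NONCE + ":" + HA2)

# second guess: keep qop=auth but silently omit NC and CNONCE
resp_auth = md5hex(HA1 + ":" + NONCE + ":auth:" + HA2)

# official RFC 2617 calculation with NC, CNONCE, and QOP present
NC, CNONCE, QOP = "00000001", "0a4f113b", "auth"
resp_2617 = md5hex(HA1 + ":" + NONCE + ":" + NC + ":" + CNONCE + ":" + QOP + ":" + HA2)

print(len({resp_2069, resp_auth, resp_2617}))  # 3, all three responses differ
```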
Related: an old 2069-erratum still rots in the pending errata mbox. I'm now confident that the 2069-code in md5.cmd works at least with the IETF tools server. I've not yet submitted an erratum for RFC 2983; three out of four 2983-examples are fine.
- The default position of the search form on the result page is "left", set by AH:left. Fans of the old free site search form know this "Align Header" parameter; it's (kind of) documented on my lab page. It's also straightforward to modify: add ;AH:center to the cof=FORID:0 parameter, where 0 might be something else depending on the used form.
- The free site search result pages show my logo immediately above the search form. The parameters L: for the logo URL and S: for the site URL are still the same for CSEs; it's just no longer necessary to specify all these odd values as part of the cof= parameter, Google inserts them on the fly. Unfortunately it also uses some CSS magic (style sheets) to get this right, failing miserably with browsers not supporting CSS. After some experiments I found the culprit: there's a new LP:1 "Logo Position" parameter, which has to be disabled by LP:0 to get the desired effect.
<input type="hidden" name="cof" value="FORID:0;AH:center;LP:0" />
leo-dict.xml uses Content type html for Googlets with target="_top" as required by LEO, resulting in an <iframe> on an iGoogle-page or wherever it's used. Of course this only works with browsers and devices supporting <iframe>.
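A gadget with Content type html boils down to a small XML file; a hypothetical sketch in that spirit (the form URL and field names are invented, target="_top" as mentioned above):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Module>
  <ModulePrefs title="LEO dict" />
  <Content type="html"><![CDATA[
    <form action="http://dict.leo.org/ende" method="get" target="_top">
      <input type="text" name="search" size="20" />
      <input type="submit" value="en-de" />
    </form>
  ]]></Content>
</Module>
```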
A misrepresentation of some uses of ß in upper case headlines, on books, and on tombstones resulted in a proposal to add an "upper case ß" to Unicode as code point U+1E9E (PDF).
I've got a warning that this PDF might crash some PDF viewers, but exceptionally AcroReader 3 survived this attack.
The next likely step could be demands to permit ß in I18N domain labels because it suddenly got an upper case variant.
Of course there's also the "minor" problem of upgrading fonts and software for a fictitious "upper case ß" worldwide, as far as they support German.
The Google Gadgets are a cute idea: they make it possible to encapsulate Web forms in pieces of XML, which can then be added to personal Google start pages and similar services.
The name add.gif for the link icon isn't very intuitive, so I renamed my copies to googlet.gif. My second experiment after the xyzzy "custom search engine" is a LEO dict search form, but so far Google claims that it's unavailable or empty when I try to add it.
The location of the sitemap can now be given in a robots.txt file, see their protocol page.
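With a hypothetical site URL, the relevant robots.txt line looks like this (the sitemaps.org protocol requires the full URL):

```
Sitemap: http://example.org/sitemap.xml
```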
Any service that goes beyond merely finding the sitemap, like Google's webmaster tools, still requires some form of submission proving that the submitter has write access to the submitted site.
History: this isn't my first attempt at a "mail2blog" service; for two years mailbucket.org did what I wanted.
The Blogger help pages are messy: many links are broken, SSL login fails miserably with old browsers, the feedback link without login doesn't work, etc. If you're looking for a blog hoster, try to find a better service. I want the "blog-by-mail" feature available here, so it's my own fault.
Known issue: the application/atom+xml media type for the Atom feed apparently degenerates into text/html after the first access. This might be a cache issue; it results in a warning by the W3C validator.