HTML source obfuscation with character entities

If you're able, view the source of this webpage. All the text in it will be just HTML decimal character entities, such as &#105;&#102; for "if". Even spaces are encoded as &#32;. Only the angle brackets, ampersand, and single and double quotes are encoded with their usual named entities.

What's the point?

ａｒｔ． I guess.

I wanted to experiment with taking this technical ability, this workaround, this hack, to the extreme. I wanted to make a page that's difficult to change after it's created, because its source is human-unreadable. An unclear message about the dichotomy between what's accessible to humans and what's accessible to machines – your browser will have no problem decoding this and showing it to you. At the mercy of the ~algorithm~.

And sure, this wastes bytes, I get that. I'm expecting HTTP-level gzip compression to help a bit here; the entropy per character of decimal-encoded text is necessarily lower than that of the text itself, so it should do a decent job. And don't get me wrong: this isn't something I endorse. It's a pretty horrible and pointless thing to do, because browsers' dev tools will do entity decoding anyway, so only readers of raw source get hurt.
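
A rough way to check the gzip expectation is to compress the same text both ways and compare sizes. The Python sketch below is only an illustration: the sample sentence and the helper name to_decimal_entities are made up for the example, and the helper encodes every character, including the few that this page keeps as named entities.

```python
import gzip


def to_decimal_entities(text: str) -> str:
    """Encode every character as a decimal entity (simplified: no named entities)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


# Hypothetical sample text; any ordinary prose will do.
sample = (
    "I wanted to make a page that is difficult to change after it is created, "
    "because its source is human-unreadable. Your browser will have no problem "
    "decoding this and showing it to you."
)

plain = sample.encode("utf-8")
encoded = to_decimal_entities(sample).encode("ascii")

# Byte counts before and after gzip, for both versions of the text.
sizes = {
    "plain": len(plain),
    "plain, gzipped": len(gzip.compress(plain)),
    "encoded": len(encoded),
    "encoded, gzipped": len(gzip.compress(encoded)),
}

for label, size in sizes.items():
    print(f"{label:>17}: {size} bytes")
```

The exact numbers depend on the text and on the compression level a server actually uses, so treat the output as a ballpark and not a measurement of this page.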

How did I do it?

I'm not such a masochist as to manually type hundreds of decimal codes; of course I used a program to help. Very simple: loop over the lines of input, loop over the characters of each line, turn each character (unless it's in the set of forbidden characters) into its decimal representation, stick it between &# and a semicolon, and print.
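
The loop described above could look something like this in Python. This is a sketch, not the actual program used for this page: the function names are invented, and it assumes that the "forbidden" characters are the five mentioned earlier and that they get swapped for named entities (the choice of &apos; for the single quote is also a guess).

```python
# Characters that are not turned into decimal entities.
# Assumption: they map to the usual named entities instead.
NAMED = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
}


def encode_line(line: str) -> str:
    """Encode one line, character by character."""
    pieces = []
    for ch in line:
        if ch in NAMED:
            pieces.append(NAMED[ch])
        else:
            # ord() gives the code point; wrap it between "&#" and ";".
            pieces.append(f"&#{ord(ch)};")
    return "".join(pieces)


def encode_text(text: str) -> str:
    """Encode a multi-line string, leaving the line breaks themselves alone."""
    return "\n".join(encode_line(line) for line in text.splitlines())


print(encode_line("if"))  # &#105;&#102;
```

To run it over a whole file, something like print(encode_text(open("input.txt").read())) would work, where input.txt is a placeholder name.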

My point about difficult editability still stands, though: to change text, I first need to find where the old text is, and the old text is still encoded.
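
One way around that, sketched in Python: encode the phrase being searched for the same way, then look for the encoded form in the source. The helper names here are hypothetical, and the simplified encoder ignores the named-entity characters, so phrases containing quotes, ampersands, or angle brackets would need extra handling. It also assumes the phrase doesn't span a line break in the source.

```python
def to_decimal_entities(text: str) -> str:
    """Encode every character as a decimal entity (simplified: no named entities)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def find_encoded(source: str, phrase: str) -> int:
    """Return the index in the encoded source where the phrase starts, or -1."""
    return source.find(to_decimal_entities(phrase))


def replace_encoded(source: str, old: str, new: str) -> str:
    """Replace every encoded occurrence of old with the encoded form of new."""
    return source.replace(to_decimal_entities(old), to_decimal_entities(new))


# Tiny stand-in for a page's encoded source.
page = to_decimal_entities("the old text")

print(find_encoded(page, "old"))
print(replace_encoded(page, "old", "new") == to_decimal_entities("the new text"))  # True
```

Because every entity starts with & and ends with a semicolon, a match can only begin and end on whole characters, so there's no risk of matching half of one code.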

Created by the author of oatcookies.neocities.org on 2021-05-01.