Such as in the case of some subtitle files that have been obtained by OCR'ing images written in a font that doesn't distinguish between l and I.
These don't cover every case (they miss ls at the start of lines but in the middle of sentences, and might convert some proper nouns that should begin with an I), nor have I tested them for languages other than English, but they do an effective job of getting most of the easy cases.
Remember to use regex search mode in your text editor, and remember to set case-sensitive mode on! Otherwise you'll just replace every lowercase i with an l.
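To illustrate why case sensitivity matters, here is a minimal Python sketch using the first search pattern from this post. Python's re module is case-sensitive by default, so the re.IGNORECASE flag stands in for an editor whose case-sensitive mode is switched off. The sample sentence is made up for the demonstration.

```python
import re

# First search/replacement pair from this post.
pattern = r"([a-z']+)I([a-z]*)"
replacement = r"\1l\2"

sample = "this is fine"  # correct text with no uppercase I at all

# Case-sensitive (Python's default): nothing matches, text is untouched.
case_sensitive = re.sub(pattern, replacement, sample)

# Case-insensitive: the literal I now also matches a lowercase i,
# so correct words get mangled.
case_insensitive = re.sub(pattern, replacement, sample, flags=re.IGNORECASE)

print(case_sensitive)    # this is fine
print(case_insensitive)  # thls is flne
```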
If you have the reverse problem, of all uppercase Is being represented as lowercase ls, my suggestion is to first globally replace l→I, then run the replacements below. (I haven't tested it yet as of first posting this, though.)
The replacements here use backslash syntax (\1, \2 etc.) to represent capture groups in replacements. Of course, if your regex language uses dollar signs instead ($1, $2 etc.), then replace backslashes with dollars.
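For batch work outside a text editor, the replacements in this post can be sketched in Python, whose re.sub accepts the backslash syntax as-is. This is only a sketch under some assumptions: the function names (to_dollar_syntax, fix_ocr_l, fix_all_lowercase_l, preview_matches) are made up for illustration, and re-running each rule until nothing changes is a choice of this sketch, not something the post prescribes. It is there because a single pass only replaces non-overlapping matches, so a doubled II (as in we'II) is only half fixed the first time.

```python
import re

# Search/replacement pairs copied from this post, in the order given.
RULES = [
    (r"([a-z']+)I([a-z]*)", r"\1l\2"),
    (r"([A-Z]+)I([a-z]+)", r"\1l\2"),
    (r'(([a-z",]|I) [a-z]*)I([a-z]+)', r"\1l\3"),
    (r"([a-z]['-]+)I([a-z]*)", r"\1l\2"),
]


def to_dollar_syntax(replacement):
    # Rewrite backslash group references (\1) as dollar ones ($1)
    # for regex flavours that expect dollars.
    return re.sub(r"\\(\d+)", r"$\1", replacement)


def fix_ocr_l(text):
    # Python's re is case-sensitive by default, which is what we need.
    for pattern, replacement in RULES:
        # Assumption of this sketch: repeat each rule until the text
        # stops changing. Every change turns one I into an l, so the
        # loop always terminates.
        while True:
            fixed = re.sub(pattern, replacement, text)
            if fixed == text:
                break
            text = fixed
    return text


def fix_all_lowercase_l(text):
    # Reverse problem: every I was read as l. Replace l with I globally,
    # then run the normal rules. Checked here on one tiny example only.
    return fix_ocr_l(text.replace("l", "I"))


def preview_matches(pattern, text):
    # List what a rule would touch, e.g. to note proper nouns beforehand.
    return [m.group(0) for m in re.finditer(pattern, text)]


print(fix_ocr_l("we'II"))             # we'll
print(fix_ocr_l("Call me Ishmael"))   # Call me lshmael (the proper-noun caveat)
print(preview_matches(RULES[2][0], "Call me Ishmael"))  # ['e Ishmael']
```

Running preview_matches with the third pattern before fixing anything gives a list of the spots where a proper noun might get mangled, which makes the manual cleanup afterwards easier.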
Search: ([a-z']+)I([a-z]*)
Replace: \1l\2
Fixes I in the middle or end of otherwise lowercase words, or right after an apostrophe (like in words such as we'II → we'll).

Search: ([A-Z]+)I([a-z]+)
Replace: \1l\2
Fixes any I that sits between an uppercase letter and a lowercase letter.

Search: (([a-z",]|I) [a-z]*)I([a-z]+)
Replace: \1l\3
Fixes words that begin with I, but are followed by lowercase letters, and which follow a word ending in lowercase characters, or punctuation that could reasonably be found in the middle of a sentence. This can wrongly convert proper nouns that begin with I (e.g. Call me Ishmael → Call me lshmael). The workaround is to just fix those manually afterwards (or, first do a search for text matching this regex that would be replaced, and take a note of all the names that it finds, so they're easier to fix); there are usually fewer proper nouns like these than there are other words that begin with l. Note that \1 in the replacement refers to the outer capture group, the one that covers everything before the I, not the smaller one within.

Search: ([a-z]['-]+)I([a-z]*)
Replace: \1l\2
Fixes an I that follows some punctuation which itself follows a lowercase letter (e.g. fleur-de-Iis → fleur-de-lis). Looking for a lowercase letter is important, because otherwise this could replace correct capital Is that are at the beginning of speech lines that are marked with a dash.

First posted 2025-08-20, last edited 2025-08-20.