cleanText function (no dependencies)
Normalizes line endings, repeated spaces, smart quotes, and invisible characters. Line endings are always converted to \n; every other cleanup is toggled by its own option.
typescript
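// All six flags are required; set each one explicitly.
type TextCleanerOptions = {
  removeMultipleSpaces: boolean;
  removeExtraLineBreaks: boolean;
  trimEdges: boolean;
  normalizeQuotes: boolean;
  removeInvisibleCharacters: boolean;
  collapseEmptyLines: boolean;
};
export function cleanText(input: string, options: TextCleanerOptions): string {
  // Always applied: convert CRLF and lone CR line endings to LF.
  let output = input.replace(/\r\n?/g, "\n");
  if (options.removeInvisibleCharacters) {
    // Strip C0 control characters (tab and newline are kept), DEL,
    // zero-width characters (U+200B to U+200D), and the BOM (U+FEFF).
    output = output.replace(
      /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F\u200B-\u200D\uFEFF]/g,
      "",
    );
  }
  if (options.normalizeQuotes) {
    // Curly, low-9, and reversed quotes become straight ASCII quotes.
    output = output.replace(/[‘’‚‛]/g, "'").replace(/[“”„‟]/g, '"');
  }
  if (options.removeExtraLineBreaks) {
    // Strip non-newline whitespace on either side of each line break.
    // This also removes indentation on every line after the first.
    output = output.replace(/[^\S\n]+\n/g, "\n").replace(/\n[^\S\n]+/g, "\n");
  }
  if (options.removeMultipleSpaces) {
    // Collapse runs of two or more non-newline whitespace characters into one space.
    output = output.replace(/[^\S\n]{2,}/g, " ");
  }
  if (options.collapseEmptyLines) {
    // Allow at most one empty line in a row.
    output = output.replace(/\n{3,}/g, "\n\n");
  }
  if (options.trimEdges) {
    // Remove leading and trailing whitespace, including line breaks.
    output = output.trim();
  }
  return output;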
}
Usage notes
- Pass an object with every option set to true for a full cleanup, or turn off the ones you don't want.
- The function doesn't mutate the input: it returns a new string.
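Because every field of TextCleanerOptions is required, writing out all six flags on each call gets repetitive. One way to handle that, sketched here as an assumption rather than as part of the original code, is a preset with every option on plus a small helper that switches individual steps off. The names FULL_CLEANUP and cleanupOptions are hypothetical, and the type is repeated only so the snippet compiles on its own.

```typescript
// Same shape as TextCleanerOptions above, repeated so this snippet is self-contained.
type TextCleanerOptions = {
  removeMultipleSpaces: boolean;
  removeExtraLineBreaks: boolean;
  trimEdges: boolean;
  normalizeQuotes: boolean;
  removeInvisibleCharacters: boolean;
  collapseEmptyLines: boolean;
};

// Hypothetical preset (not in the original code): every cleanup enabled.
const FULL_CLEANUP: TextCleanerOptions = {
  removeMultipleSpaces: true,
  removeExtraLineBreaks: true,
  trimEdges: true,
  normalizeQuotes: true,
  removeInvisibleCharacters: true,
  collapseEmptyLines: true,
};

// Hypothetical helper: start from the full preset and override selected flags.
// The spread creates a new object, so FULL_CLEANUP itself is never modified.
function cleanupOptions(
  overrides: Partial<TextCleanerOptions> = {},
): TextCleanerOptions {
  return { ...FULL_CLEANUP, ...overrides };
}

// Full cleanup except quote normalization, e.g. to keep typographic quotes.
const keepQuotes = cleanupOptions({ normalizeQuotes: false });
```

The result would then be passed as the second argument, as in cleanText(rawText, keepQuotes), while cleanText(rawText, cleanupOptions()) runs every step.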
Limitations
- Works on strings; it doesn't open or read files.
- It doesn't fix spelling or grammar, only normalizes formatting.