Script Archive
Some of these scripts are not yet available. If you care to write replacements,
or downloaded them from the old grab-bag, please send them and they will
be included.
Some versions of sed do not accept comments (lines beginning with #),
so it is sometimes necessary to remove them before running a script.
Filename manipulation |
File conversion |
HTML utilities |
Text formatting |
Beautifiers |
Mathematical |
Information extraction / tabulation |
Desktop information
Filename manipulation
- Lowercase filenames (filter)
Uppercase filenames (filter)
- Lowercase/uppercase list of filenames supplied from STDIN. Makes
a list of mv commands.
Example:
find /mnt/zeus/docs | tolower.sed | sh -x
- Lowercase filenames (application)
Uppercase filenames (application)
- Lowercase/uppercase list of filenames supplied as command line arguments.
Again, makes a list of mv commands.
This version operates on files in current directory only.
Example:
down *.HTM *.INC *.sed
- Print basename of files
- Remove the directory prefix from a file path, and print remaining element.
Like UNIX basename, but reads data from a file or stdin.
Could easily be adapted for DOS conventions.
- Print path of files
- Remove the filename from a file path, and print remaining elements.
Like UNIX dirname, but reads data from a file or stdin.
Easily adapted to DOS conventions.
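The basename and dirname filters described above can be sketched as single substitutions. This is a minimal illustration, not the archive's own scripts: the function names sed_basename and sed_dirname are hypothetical, and unlike the real utilities the sketch does not handle trailing slashes or paths with no directory part.

```sh
# Hypothetical helpers for illustration; the archive's scripts may differ.

# Print the last element of each path read from stdin (like basename).
sed_basename() {
    # Delete everything up to and including the last slash.
    sed 's|.*/||'
}

# Print each path with its last element removed (like dirname).
sed_dirname() {
    # Delete the last slash and everything after it.
    sed 's|/[^/]*$||'
}
```

For example, `find . -type f | sed_basename` would list bare filenames. Adapting the sketch to DOS conventions would mean matching a backslash instead of a slash.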
File conversion
- Convert DOS files for UNIX
- Changes DOS end-of-lines to UNIX end-of-lines (to be
run under UNIX).
- Convert UNIX files for DOS
- Changes UNIX end-of-lines to DOS end-of-lines (to be
run under UNIX).
- Split digest
- Recreates original email messages from a list digest.
The author says this should work `at least for digests generated by
Majordomo and #listserv, and FAQs using minimal digest format.'
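The two end-of-line conversions at the top of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the archive's own scripts: the names CR, dos2unix_sed and unix2dos_sed are hypothetical, and the carriage return is placed in a shell variable because an escape such as \r inside a sed expression is not understood by every version of sed.

```sh
# Hypothetical helpers for illustration; the archive's scripts may differ.

# A literal carriage return, kept in a variable for portability.
CR=$(printf '\r')

# DOS -> UNIX: delete a carriage return at the end of each line.
dos2unix_sed() {
    sed "s/${CR}\$//"
}

# UNIX -> DOS: append a carriage return to the end of each line.
# Lines that already end in a carriage return will get a second one.
unix2dos_sed() {
    sed "s/\$/${CR}/"
}
```

Both functions read stdin and write stdout, for example `dos2unix_sed < report.txt > report.unix`, where the filenames are placeholders.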
HTML utilities
- Text -> HTML
- Converts preformatted text to HTML ready for viewing.
- <SCRIPT> -> HTML
- make <SCRIPT> contents viewable in HTML
- HTML -> <SCRIPT>
- undo previous operation
- ISO8859-1 -> HTML
- Convert ISO Latin 1 characters (eg: é, £, ¥, ½)
to their equivalent HTML character entities.
- HTML -> ISO8859-1
- Convert HTML character entities to their ISO Latin 1 equivalent.
- Lowercase HTML tags
Uppercase HTML tags
- Change case of HTML tags, preserving attributes.
- Strip HTML comments
- Remove all commented material from HTML
- Extract URLs from HTML
- Print all URLs (even commented ones) and associated
ALT comments found in an HTML file, formatted as: URL|comment.
- Extract title from HTML
- Print the TITLE (or the first H[0-7]
heading located) of an HTML document.
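As an illustration of the Text -> HTML conversion listed at the top of this section, the sketch below escapes the three characters that are special in HTML and wraps the result in a PRE element. It is a minimal example, not the archive's script: the function name text2html is hypothetical, and the choice of PRE as the wrapper is an assumption.

```sh
# Hypothetical helper for illustration; the archive's script may differ.
text2html() {
    # Escape & first, so the ampersands introduced by the
    # later substitutions are not escaped a second time.
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e '1s/^/<pre>/' \
        -e '$s/$/<\/pre>/'
}
```

The order of the substitutions matters: escaping < before & would turn the resulting &lt; into &amp;lt;.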
Text formatting
Capitalise words (i)
Capitalises the first letter of each word.
Capitalise words (ii)
A simple optimization to the above.
Capitalise words (iii)
How the real seder does it: faster, and much harder to understand!
Reverse text
Reverses the order of characters on each line of input.
Reverse text
A faster version.
Reverse file
Reverses the line order of a file, subject to the size of
the hold buffer.
Join lines
Joins all input on a single line.
Un-double-space lines
Change double-spaced lines to single-spaced.
Centre lines
Centres lines for an 80-column device. Easily adapted to different widths.
Centre lines
A different and more CPU-intensive approach.
Squeeze blank lines
Replace consecutive blank lines with one line, so that at most
one empty line separates two non-empty lines. Like cat -s.
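Several of the text-formatting entries above have well-known one-line forms, sketched here for orientation. These are common sed idioms rather than the archive's own scripts, the function names are hypothetical, and the character-reversal idiom relies on "." matching an embedded newline, which may not hold in every version of sed. Joining with a single space is an assumption; the archive's script may join differently.

```sh
# Hypothetical helpers for illustration; the archive's scripts may differ.

# Reverse the characters on each line: move the leading character
# behind an appended newline, one character per pass.
rev_chars() {
    sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
}

# Reverse the order of lines: prepend each new line to the text
# accumulated in the hold space, then print on the last line.
rev_lines() {
    sed -n '1!G;h;$p'
}

# Join all input lines into one line, separated by single spaces.
join_lines() {
    sed -e :a -e '$!N' -e '$!ba' -e 's/\n/ /g'
}

# Keep at most one blank line after each block of text
# (leading blank lines are removed entirely).
squeeze_blank() {
    sed '/./,/^$/!d'
}
```

Because rev_lines and join_lines hold the whole input in one buffer, they are subject to the same buffer-size limits noted for the Reverse file entry.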
Beautifiers
- Intel assembler -> UNIX assembler
- Converts Intel 386 assembly (MASM) code to Unix 386 assembly (gas) code.
- Strip C comments
- Strips comments from C source. Beware that code following a comment
is also removed. See next item...
- Strip C/C++ comments
- Strips comments from C/C++ source. Handles comments surrounded by code.
- Beautify directory listing (UNIX)
- Indents the output of ls -lR according
to the depth of each directory. Makes output far easier to read.
- Directory tree (UNIX)
- Indents the output of find -type d into a nice tree
format. Thanks to Stewart Ravenshall.
- File polisher (troff)
- Very comprehensive suite of filters by Robert Marks
which perform a large number of beautifying operations on text files prior
to processing by troff. These scripts were used to produce camera-ready
output for the Australian School of Management between
1985 and 1995.
You can download a
gzipped tar archive
of the scripts, or individual scripts:
polish0.sed,
polish1.sed,
polish2.sed,
polish3.sed,
polish4.sed,
polish5.sed,
polish6.sed,
polish7.sed,
polish8.sed,
polish9.sed,
or visit
Robert's Web site.
- Horizontal banner
- Rotates the vertical output of banner to produce horizontal output.
The script assumes a screen size of 80x60, although it could be adapted to other sizes.
- Number lines
- A short script to display output lines preceded by line numbers.
This is similar to the UNIX nl command, or cat -n.
- Number lines
- This version demonstrates a technique for manually calculating numbers.
- Number non-empty lines
- A short script to display output lines, preceding non-empty lines
with a line number. Empty lines are not numbered but still advance the
count. This is not the same as cat -bn, which does not count empty lines.
- Number non-empty lines
- This version demonstrates a technique for manually calculating numbers;
it emulates cat -bn exactly.
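The first variant of each line-numbering entry above can be sketched with the "=" command, which prints the current line number on a line of its own; a second sed then joins each number to the line that follows it. This is a common idiom, not necessarily what the archive's scripts do, and the function names and the single-space separator are assumptions.

```sh
# Hypothetical helpers for illustration; the archive's scripts may differ.

# Number every line, like cat -n but with a single space separator.
number_lines() {
    sed = | sed 'N;s/\n/ /'
}

# Number only non-empty lines. Empty lines are printed without a
# number but still advance the count, so this differs from cat -b.
number_nonempty() {
    sed '/./=' | sed '/./N;s/\n/ /'
}
```

With input consisting of "a", an empty line, and "b", number_nonempty labels "b" as line 3, whereas cat -b would label it 2.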
Mathematical
- Desktop calculator
- This script from sed guru Greg Ubben
is a full implementation of the UNIX desktop calculator dc.
dc is an arbitrary precision, multi-base, stacking calculator.
Read how it's done in Greg's
analysis.
- Add decimals
- This impressive script adds a list of decimal numbers.
It pulls this off by transforming successive digits in each number
into an analogue format, where a=1, aa=2, aaa=3, etc, concatenating the
two analogue numbers, resolving carry, and transforming the numbers back
into decimal. Similar to (but not the same as) the script explained in
Greg Ubben's Adding a list of decimal
numbers on the tutorials page.
- Increment a number
- Interesting script to increment numbers.
- Commify numbers (i)
- Formats numbers by inserting a comma between each group of three digits, counting from the right (eg: 1,200,573).
- Commify numbers (ii)
- A more compact script for versions of sed which recognise extended regular expressions.
- Commify numbers (iii)
- Compare with (i). This script expects 100% numeric input.
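The comma-insertion idea behind the commify entries can be sketched as a loop that repeatedly splits off the rightmost ungrouped run of three digits. This is a common idiom using basic regular expressions, not necessarily any of the three archive scripts, and the function name commify is hypothetical. It assumes integer input: digits after a decimal point would be grouped as well.

```sh
# Hypothetical helper for illustration; the archive's scripts may differ.
commify() {
    # Find a digit followed by three more digits, as far right as
    # possible, and put a comma between them. Repeat until no
    # substitution is made.
    sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e ta
}
```

For example, `echo 1200573 | commify` prints 1,200,573.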
Information extraction / tabulation
- Find anagrams
- Search for dictionary words in a string.
- Find anagrams
- Search for anagrams in a list of words (one word per line).
- Indexer
- This script collates a list of references to produce an index
suitable for a book or magazine. A
detailed description of the way it works, along with alternative
versions of the script, is available on the
tutorials page. The script was used
by Cornerstone magazine to create an index for a book after
typesetting.
- Show make targets
- Extracts targets for a file from a makefile.
- Sort/delimit/number a list of names
- Sort, partition and number a list of names.
A thorough analysis of this script is given by the author in
A lookup-table counter on the
tutorials page.
- Display beginning of file
- Display first 10 lines of a file. Like head.
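The head emulation above has a very short common form, sketched here. The function name head_sed is hypothetical and the archive's script may be written differently.

```sh
# Hypothetical helper for illustration; the archive's script may differ.
head_sed() {
    # Print lines as usual, then quit after the tenth.
    sed 10q
}
```

Quitting with q, rather than deleting lines 11 onwards, means sed stops reading as soon as the tenth line has been printed.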
Desktop information
- Display a calendar
- Display a simple calendar for the current month,
à la the UNIX command cal. Only date
is required; the math is done directly in sed.
Updated 29 Oct 1998