Script Archive

Some of these scripts are not yet available. If you care to write replacements, or downloaded them from the old grab-bag, please send them and they will be included.

It is sometimes necessary to remove comments (text preceded by #) from a script, since not every version of sed accepts them.

Filename manipulation | File conversion | HTML utilities | Text formatting | Beautifiers | Mathematical | Information extraction / tabulation | Desktop information

 

Filename manipulation

Lowercase filenames (filter)
Uppercase filenames (filter)
Lowercase/uppercase list of filenames supplied from STDIN. Makes a list of mv commands.
Example:
  find /mnt/zeus/docs | tolower.sed | sh -x
Lowercase filenames (application)
Uppercase filenames (application)
Lowercase/uppercase list of filenames supplied as command line arguments. Again, makes a list of mv commands. This version operates on files in current directory only.
Example:
  down *.HTM *.INC *.sed
Print basename of files
Remove the directory prefix from a file path, and print remaining element. Like UNIX basename, but reads data from a file or stdin. Could easily be adapted for DOS conventions.
Print path of files
Remove the filename from a file path, and print remaining elements. Like UNIX dirname, but reads data from a file or stdin. Easily adapted to DOS conventions.
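The filters in this section can be approximated with very short sed commands. The following is a minimal sketch, not the archived scripts themselves: the function names path_base, path_dir and mv_lower are made up for illustration, and mv_lower assumes plain filenames with no directory part, double quotes or newlines.

```shell
# Hypothetical helpers; the archive's own scripts may work differently.

# Like basename, but for a list of paths on stdin:
# delete everything up to and including the last "/".
path_base() {
    sed 's|.*/||'
}

# Like dirname, but for a list of paths on stdin.
# A name with no "/" becomes "."; a file directly under "/" becomes "/".
path_dir() {
    sed -e '/\//!s|.*|.|' \
        -e 's|^/[^/]*$|/|' \
        -e t \
        -e 's|\(.\)/[^/]*$|\1|'
}

# Print one mv command for every name that is not already lowercase.
# The original name is kept in the hold space, the pattern space is
# lowercased, and the two are then compared and combined.
mv_lower() {
    sed '
h
y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
G
/^\(.*\)\n\1$/d
s/^\(.*\)\n\(.*\)$/mv "\2" "\1"/
'
}
```

Example:
  ls | mv_lower | sh -x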

File conversion

Convert DOS files for UNIX
Changes DOS end-of-lines to UNIX end-of-lines (to be run under UNIX).
Convert UNIX files for DOS
Changes UNIX end-of-lines to DOS end-of-lines (to be run under UNIX).
Split digest
Recreates original email messages from a list digest. The author says this should work `at least for digests generated by Majordomo and listserv, and FAQs using minimal digest format.'
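The two end-of-line conversions listed in this section come down to removing or adding a carriage return at the end of each line. A minimal sketch, to be run under UNIX, with the hypothetical function names dos_to_unix and unix_to_dos. The carriage return is produced with printf because not every sed understands \r.

```shell
# Hypothetical helpers, not the archived scripts.
CR=$(printf '\r')

# DOS -> UNIX: strip one trailing carriage return from each line.
dos_to_unix() {
    sed "s/${CR}\$//"
}

# UNIX -> DOS: append a carriage return to each line.
# Running this on a file that is already in DOS format doubles the CRs.
unix_to_dos() {
    sed "s/\$/${CR}/"
}
```

Example:
  dos_to_unix < report.txt > report.unix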

HTML utilities

Text -> HTML
Converts preformatted text to HTML ready for viewing.
<SCRIPT> -> HTML
Makes the contents of a <SCRIPT> element viewable in HTML.
HTML -> <SCRIPT>
Undoes the previous operation.
ISO8859-1 -> HTML
Convert ISO Latin 1 characters (e.g. é, £, ¥, ½) to their equivalent HTML character entities.
HTML -> ISO8859-1
Convert HTML character entities to their ISO Latin 1 equivalent.
Lowercase HTML tags
Uppercase HTML tags
Change case of HTML tags, preserving attributes.
Strip HTML comments
Remove all commented material from an HTML file.
Extract URLs from HTML
Print all URLs (even commented ones) and associated ALT comments found in an HTML file, formatted as: URL|comment.
Extract title from HTML
Print the TITLE (or the first H[0-7] heading located) of an HTML document.
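Several of the HTML utilities rest on simple substitutions. A minimal sketch of two of them, with hypothetical function names: html_escape covers only the three characters that are special in HTML text, and html_title assumes the whole TITLE element sits on one line and does not fall back to headings.

```shell
# Hypothetical helpers, not the archived scripts.

# Text -> HTML: escape markup characters. "&" is handled first,
# otherwise the "&" of "&lt;" and "&gt;" would be escaped again.
html_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print the contents of the first line-contained TITLE element,
# matching the tag name in upper, lower or mixed case.
html_title() {
    sed -n 's/.*<[Tt][Ii][Tt][Ll][Ee]>\(.*\)<\/[Tt][Ii][Tt][Ll][Ee]>.*/\1/p' | sed 1q
}
```

Example:
  html_escape < notes.txt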

Text formatting

Capitalise words (i)
Capitalises the first letter of each word.
Capitalise words (ii)
A simple optimization to the above.
Capitalise words (iii)
How the real seder does it: faster, and much harder to understand!
Reverse text
Reverses the order of characters on each line of input.
Reverse text
A faster version.
Reverse file
Reverses the line order of a file, subject to the size of the hold buffer.
Join lines
Joins all input on a single line.
Un-double-space lines
Change double-spaced lines to single-spaced.
Centre lines
Centres lines for an 80-column device. Easily adapted to different widths.
Centre lines
A different and more CPU-intensive approach.
Squeeze blank lines
Replace consecutive blank lines with one line, so that at most one empty line separates two non-empty lines. Like cat -s.
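Several of the line-oriented operations in this section have well-known one-line counterparts. These sketches are not the archived scripts; the function names are hypothetical, and join_lines assumes that lines should be joined with a single space.

```shell
# Hypothetical helpers, not the archived scripts.

# Reverse the characters of each line (like rev).
reverse_text() {
    sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
}

# Reverse the order of lines (like tac). The whole input is
# accumulated in the hold buffer, so its capacity limits the file size.
reverse_file() {
    sed -n '1!G;h;$p'
}

# Join all input lines into one line.
join_lines() {
    sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n/ /g'
}

# Collapse each run of empty lines into a single empty line (like cat -s).
squeeze_blank() {
    sed '/^$/N;/\n$/D'
}
```

Example:
  reverse_file < log.txt | squeeze_blank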

Beautifiers

Intel assembler -> UNIX assembler
Converts Intel 386 assembly (MASM) code to Unix 386 assembly (gas) code.
Strip C comments
Strips comments from C source. Beware that code following a comment is also removed. See next item...
Strip C/C++ comments
Strips comments from C/C++ source. Handles comments surrounded by code.
Beautify directory listing (UNIX)
Indents the output of ls -lR according to the depth of each directory. Makes output far easier to read.
Directory tree (UNIX)
Indents the output of find -type d into a nice tree format. Thanks to Stewart Ravenshall.
File polisher (troff)
Very comprehensive suite of filters by Robert Marks which perform a large number of beautifying operations on text files prior to processing by troff. These scripts were used to produce camera-ready output for the Australian School of Management between 1985 and 1995.
You can download a gzipped tar archive of the scripts, or individual scripts: polish0.sed, polish1.sed, polish2.sed, polish3.sed, polish4.sed, polish5.sed, polish6.sed, polish7.sed, polish8.sed, polish9.sed, or visit Robert's Web site.
Horizontal banner
Rotates the vertical output of banner to produce horizontal output. The script assumes a screen size of 80x60, though it could be adapted to other sizes.
Number lines
A short script to display output lines preceded by line numbers. This is similar to the UNIX nl command, or cat -n.
Number lines
This version demonstrates a technique for manually calculating numbers.
Number non-empty lines
A short script to display output lines, preceding non-empty lines with a line number. Empty lines affect the count. This is not the same as cat -bn, which does not count empty lines.
Number non-empty lines
This version demonstrates a technique for manually calculating numbers; it emulates cat -bn exactly.
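Line numbering of the simpler kind can be approximated with sed's = command, which prints each line number on a line of its own; a second sed then joins each number to the line that follows it. A minimal sketch with hypothetical function names. It does not right-align the numbers as nl or cat -n do, and it does not attempt the manual counting technique mentioned above.

```shell
# Hypothetical helpers, not the archived scripts.

# Number every line: "=" prints the line number on its own line,
# then the second sed joins each number/line pair with a space.
number_lines() {
    sed = | sed 'N;s/\n/ /'
}

# Number only non-empty lines. Empty lines are printed without a
# number but still advance the count (unlike cat -b).
number_nonempty() {
    sed = | sed 'N;s/^[0-9]*\n$//;s/\n/ /'
}
```

Example:
  number_lines < tolower.sed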

Mathematical

Desktop calculator
This script from sed guru Greg Ubben is a full implementation of the UNIX desktop calculator dc. dc is an arbitrary precision, multi-base, stacking calculator. Read how it's done in Greg's analysis.
Add decimals
This impressive script adds a list of decimal numbers. It pulls this off by transforming successive digits in each number into an analogue format, where a=1, aa=2, aaa=3, etc., concatenating the two analogue numbers, resolving carry, and transforming the numbers back into decimal. Similar to (but not the same script as) the one explained in Greg Ubben's Adding a list of decimal numbers on the tutorials page.
Increment a number
Interesting script to increment numbers.
Commify numbers (i)
Formats numbers by inserting a comma before each group of three digits, counting from the right (e.g. 1,200,573).
Commify numbers (ii)
A more compact script for versions of sed which recognise extended regular expressions.
Commify numbers (iii)
Compare with (i). This script expects 100% numeric input.
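Two of the smaller numeric operations can be sketched as loops of substitutions. These are illustrations under stated assumptions rather than the archived scripts: commify expects whole numbers, increment expects one non-negative integer per line, and both function names are hypothetical.

```shell
# Hypothetical helpers, not the archived scripts.

# Insert commas from the right: each pass splits off the last group of
# three digits that still has a digit in front of it, until none is left.
commify() {
    sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

# Add 1 to each number. Trailing 9s are first turned into "_" place
# holders, then the last remaining digit is bumped (or a leading 1 is
# added if only place holders remain), and finally each "_" becomes 0.
increment() {
    sed '
:nines
s/9\(_*\)$/_\1/
t nines
s/^\(_*\)$/1\1/
t done
s/8\(_*\)$/9\1/
t done
s/7\(_*\)$/8\1/
t done
s/6\(_*\)$/7\1/
t done
s/5\(_*\)$/6\1/
t done
s/4\(_*\)$/5\1/
t done
s/3\(_*\)$/4\1/
t done
s/2\(_*\)$/3\1/
t done
s/1\(_*\)$/2\1/
t done
s/0\(_*\)$/1\1/
t done
:done
y/_/0/
'
}
```

Example:
  echo 1200573 | commify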

Information extraction / tabulation

Find anagrams
Search for dictionary words in a string.
Find anagrams
Search for anagrams in a list of words (one word per line).
Indexer
This script collates a list of references to produce an index suitable for a book or magazine. A detailed description of the way it works, along with alternative versions of the script, is available on the tutorials page. The script was used by Cornerstone magazine to create an index for a book after typesetting.
Show make targets
Extracts targets for a file from a makefile.
Sort/delimit/number a list of names
Sort, partition and number a list of names. A thorough analysis of this script is given by the author in A lookup-table counter on the tutorials page.
Display beginning of file
Display first 10 lines of a file. Like head.
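Two of the extraction tasks in this section are small enough to sketch in a line each. The function names are hypothetical, and make_targets reflects one reading of "targets": it lists single target names that start a rule line, and ignores variable assignments, pattern rules and lines naming several targets.

```shell
# Hypothetical helpers, not the archived scripts.

# Print the first N lines (default 10), then quit without reading the rest.
first_lines() {
    sed "${1:-10}q"
}

# List rule targets in a makefile: a name at the start of a line followed
# by ":" (but not ":=", which is a variable assignment).
make_targets() {
    sed -n 's|^\([A-Za-z0-9_.][A-Za-z0-9_./-]*\) *:\([^=].*\)\{0,1\}$|\1|p'
}
```

Example:
  make_targets < Makefile | first_lines 5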

Desktop information

Display a calendar
Display a simple calendar for the current month, à la the UNIX command cal. Only the date command is required; the arithmetic is done directly in sed.



Updated 29 Oct 1998