http://goof.com/pcg/os2/tips.html
Tips and tricks
1. The basic question everybody asks right after downloading PGCC is "how can I get all those shitty-gritty-nice optimizations you talk so much about?".
The answer is simple: PGCC has three additional optimization levels, -O4 through -O6. Each level above -O3 enables an additional set of optimizations.
Here is the complete list (as of pgcc-2.95.3, taken directly from the source code):
-O4
    -fschedule_insns_after_reload
    -fswap_for_agi
    -frisc
    -frisc_const = 2;
    -finterleave_stack_non_stack
    -fschedule_stack_reg_insns
-O5
    -fruntime_lift_stores /* big space penalty */
    -fomit_frame_pointer
-O6
    -fall_mem_givs
    -fdo_offload
    -frisc_mem_dest
2. How to optimize for speed: by default GCC doesn't unroll loops, while most other C compilers do (VisualAge C++, for example). So if you compare
benchmarks compiled with PGCC and VACPP you might conclude that PGCC is slower... that's not true. If you enable loop unrolling (the best way is
-funroll-loops; there is also the -funroll-all-loops switch, which generally gives worse results), PGCC builds programs that are at least as fast as (if not
faster than) those built with VACPP. For example, the UFC-Crypt library (a DES encryption implementation) went on my machine from 4387 to 4795
crypt()s per second after I added the above switch (with -O6 in both cases).
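To make it clearer what -funroll-loops does on your behalf, here is a plain loop next to a hand-unrolled version of it. This is only an illustrative sketch: the function names sum_plain and sum_unrolled4 are made up for this example, and the code the compiler actually generates will differ.

```cpp
#include <cstddef>

// Plain loop: one addition and one loop test per element.
long sum_plain(const int *data, std::size_t n)
{
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Hand-unrolled by a factor of four: one loop test per four elements.
long sum_unrolled4(const int *data, std::size_t n)
{
    long total = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i)      // leftover elements when n is not a multiple of 4
        total += data[i];
    return total;
}
```

Both functions return the same sum. With the switch enabled you keep writing the plain form and let the compiler decide where unrolling pays off.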
If you're using an optimization level below 5 (-O1..-O4) you may also want to use the -fomit-frame-pointer switch. This reduces code size and increases speed
as well. The downside is that you can't debug programs compiled in this mode (in any case, I personally prefer to debug programs compiled without
any optimization at all). -O5 and -O6 imply -fomit-frame-pointer.
Another thing to try if you use floating-point math is the -ffast-math switch. The downside is that floating-point arithmetic won't be 'IEEE compliant' anymore,
for whatever that matters (basically it means you'll sometimes get slightly different results on different platforms - *VERY* slightly). In return, many things will
run much faster (for example, sqrt() is almost twice as fast). The reason is that in -ffast-math mode the compiler doesn't generate floating-point operand
normalization sequences after calculations (thus you can get "+0" as well as "-0" and so on). This generally doesn't matter anyway.
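The "+0" versus "-0" remark can be made concrete. The sketch below builds a negative zero under strict IEEE rules and shows that it still compares equal to an ordinary zero; only the sign bit differs, which is why the distinction rarely matters. The helper names is_negative_zero and make_negative_zero are made up for this example, and it assumes a compiler that provides std::signbit. In -ffast-math mode either zero may turn up, so code should not depend on the sign.

```cpp
#include <cmath>

// True if x is a zero with the sign bit set, i.e. IEEE "-0".
bool is_negative_zero(double x)
{
    return x == 0.0 && std::signbit(x);
}

// Under IEEE rules, multiplying +0 by a negative number yields -0.
double make_negative_zero()
{
    volatile double zero = 0.0;   // volatile: keep the multiply at run time
    return zero * -1.0;
}
```

Note that the comparison `x == 0.0` is true for both zeros; any code that only compares values, and never inspects the sign bit, cannot tell them apart.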
3. How to optimize for size: with previous versions of GCC it was advised to use the '-O2' switch to optimize for size. Since EGCS 1.1 there is a new
switch called '-Os' (optimize for size). It is not just -O2; it applies a number of other size optimizations, such as lowering the threshold for choosing a
library call over an inline implementation, and so on.
You may also want to use -fomit-frame-pointer together with -Os. This reduces code size as well, and increases execution speed. In the above example
(UFC-Crypt) I got a 9% speed increase after adding -fomit-frame-pointer.
4. By default PGCC for OS/2 uses frame-unwind-info exceptions. This means that with -fexceptions (which is on by default) gcc generates partial debug
info, which is then used for frame unwinding when an exception is thrown. Because of this, files become larger (sometimes by up to ~20%!), but this
exception mechanism has the advantage of almost no overhead at execution time. That is, try {...} catch () {...} generates almost no code at all, so the
program runs as fast as before. The disadvantage is executable size, of course. Nothing comes from nothing. So, if you DO NOT use exceptions, you SHOULD
turn them off (with the -fno-exceptions switch) - this will greatly reduce your code size.
5. There is also an alternative method for handling exceptions. It is called the setjmp/longjmp method, and as you might guess it uses the setjmp()/longjmp()
functions. When a try {} is encountered, a setjmp() is executed, and when an exception is thrown it longjmps to the last setjmp(). This method generates
LOTS of messy code (most other PC C compilers use the same method - I've taken a look at Watcom and BCPP), which increases execution time but
occupies less space than frame-unwind-info. To use this exception handling method, use the -fsjlj-exceptions switch.
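The setjmp/longjmp idea can be sketched by hand. The code below is not what the compiler actually emits; it is a minimal imitation under stated assumptions: all names (current_handler, sketch_throw, checked_divide, divide_or_negative_code) are made up, only int values can be "thrown", and no destructors are involved. It does show why this method costs time on every pass through a try block, even when nothing is thrown.

```cpp
#include <csetjmp>

// Innermost active "try" block; a real runtime keeps a chain of these.
static std::jmp_buf *current_handler = 0;
static int thrown_value = 0;

// Stand-in for "throw value;": jump back to the last setjmp().
// Must only be called while some "try" is active.
void sketch_throw(int value)
{
    thrown_value = value;
    std::longjmp(*current_handler, 1);
}

// A function that "throws" on bad input.
int checked_divide(int a, int b)
{
    if (b == 0)
        sketch_throw(42);
    return a / b;
}

// Stand-in for:
//   try { return checked_divide(a, b); } catch (int e) { return -e; }
int divide_or_negative_code(int a, int b)
{
    std::jmp_buf env;
    std::jmp_buf *saved = current_handler;  // remember the outer "try"
    int result;
    if (setjmp(env) == 0) {                 // entering "try": paid on every pass
        current_handler = &env;
        result = checked_divide(a, b);
    } else {                                // reached via longjmp: the "catch" part
        result = -thrown_value;
    }
    current_handler = saved;                // leaving the "try"/"catch"
    return result;
}
```

The setjmp() call and the handler bookkeeping run even when no error occurs, which is the run-time overhead described above; the frame-unwind-info method moves that cost into tables instead.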
6. Please do not mix code compiled with sjlj-exceptions and code compiled with frame-unwind-info exceptions! Exceptions won't cross the boundary between
code using different exception handling styles, so most likely you will get an 'unhandled exception'. In practice this means you cannot use the pre-compiled
libstdc++ with -fsjlj-exceptions (it was compiled with frame-unwind-info exception handling).
7. In general, my opinion is that you'd better not use exceptions at all - the mechanism is neat but it costs too much; without hardware support for exceptions it
doesn't work well. If you care about speed and code size, forget exceptions; use good old return codes.
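For readers who have not written in that style for a while, here is a small sketch of the return-code approach. The ParseStatus values and the parse_decimal function are hypothetical names invented for this example; overflow is deliberately not checked to keep it short.

```cpp
// Hypothetical status codes for this example.
enum ParseStatus { PARSE_OK = 0, PARSE_EMPTY = 1, PARSE_BAD_DIGIT = 2 };

// Parses an unsigned decimal string into *out.
// Failure is reported through the return value instead of a throw;
// *out is left untouched on failure. Overflow is not checked.
ParseStatus parse_decimal(const char *text, unsigned long *out)
{
    if (text == 0 || *text == '\0')
        return PARSE_EMPTY;
    unsigned long value = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p < '0' || *p > '9')
            return PARSE_BAD_DIGIT;
        value = value * 10 + static_cast<unsigned long>(*p - '0');
    }
    *out = value;
    return PARSE_OK;
}
```

The caller tests the status right at the call site, for example `if (parse_decimal(arg, &n) != PARSE_OK) return 1;`. Code written this way contains no try, catch or throw, so it can be built with -fno-exceptions.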
8. If you do not use exceptions, you'd better do the link pass either with gcc instead of g++, or with the -fno-exceptions switch *even when linking*. The reason
is that the front-end links your program against either gcc.a or gpp.a; the latter is the same as gcc.a but with frame-unwind info, so the resulting executable
will be bigger (typically by ~10K).
9. If you use frame-unwind-info exceptions, you should not use -fomit-frame-pointer, since the stack unwinding functions rely on the frame pointer. Keep in mind
that -O5 and -O6 automatically enable frame pointer omission, so if you use -O6 you should also use -fno-omit-frame-pointer.
10. If you want to use -Zcrtdll but don't want your executables to depend on the dynamic version of libgcc (gccXXXXX.dll), add the "-lgcc" option
to the command line. This links your program against "gcc.(a|lib)" rather than "gccdll.(a|lib)".
Your executables will become a bit larger, but this is the better choice if you have only a small number of executables compiled with the current version of pgcc.
If you installed pgcc on a temporary basis (i.e. in /emx/bin.new, /emx/lib.new and so on), consider the following: since by default gcc always looks
for libraries in "/emx/lib" first, it will find the old libgcc as "/emx/lib/gcc.a". To avoid this, copy /emx/lib/gcc.a and /emx/lib/gcc_p.a into
/emx/lib/st and /emx/lib/mt, and then remove them from /emx/lib.