Saturday, December 11, 2010

gcc (page 7)



DEC Alpha Options

These -m options are defined for the DEC Alpha implementations:

-mno-soft-float
-msoft-float
Use (do not use) the hardware floating-point instructions for
floating-point operations. When -msoft-float is specified,
functions in libgcc.a will be used to perform floating-point
operations. Unless they are replaced by routines that emulate the
floating-point operations, or compiled in such a way as to call
such emulation routines, these routines will issue floating-point
operations. If you are compiling for an Alpha without floating-
point operations, you must ensure that the library is built so as
not to call them.

Note that Alpha implementations without floating-point operations
are required to have floating-point registers.

-mfp-regs
-mno-fp-regs
Generate code that uses (does not use) the floating-point register
set. -mno-fp-regs implies -msoft-float. If the floating-point
register set is not used, floating point operands are passed in
integer registers as if they were integers and floating-point
results are passed in $0 instead of $f0. This is a non-standard
calling sequence, so any function with a floating-point argument or
return value called by code compiled with -mno-fp-regs must also be
compiled with that option.

A typical use of this option is building a kernel that does not
use, and hence need not save and restore, any floating-point
registers.

-mieee
The Alpha architecture implements floating-point hardware optimized
for maximum performance. It is mostly compliant with the IEEE
floating point standard. However, for full compliance, software
assistance is required. This option generates fully IEEE
compliant code except that the inexact-flag is not maintained (see
below). If this option is turned on, the preprocessor macro
"_IEEE_FP" is defined during compilation. The resulting code is
less efficient but is able to correctly support denormalized
numbers and exceptional IEEE values such as not-a-number and
plus/minus infinity. Other Alpha compilers call this option
-ieee_with_no_inexact.
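
As a rough illustration of how source code might react to the macro
described above, the following sketch reports whether a translation
unit was compiled with -mieee and tests a value for being
denormalized. The function names built_with_ieee_support and
is_denormal are hypothetical; only the "_IEEE_FP" macro comes from the
text, and on non-Alpha targets it is normally not defined.

```c
#include <float.h>

/* Hypothetical helper: returns 1 when "_IEEE_FP" was defined at
   compile time (i.e. -mieee was in effect on Alpha), 0 otherwise. */
int built_with_ieee_support(void)
{
#ifdef _IEEE_FP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: returns 1 if x is a nonzero denormalized
   double, i.e. its magnitude is below the smallest normal value. */
int is_denormal(double x)
{
    double magnitude = x < 0.0 ? -x : x;
    return magnitude != 0.0 && magnitude < DBL_MIN;
}
```

A program could call built_with_ieee_support() before relying on
denormalized inputs being handled correctly.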

-mieee-with-inexact
This is like -mieee except the generated code also maintains the
IEEE inexact-flag. Turning on this option causes the generated
code to implement fully-compliant IEEE math. In addition to
"_IEEE_FP", "_IEEE_FP_EXACT" is defined as a preprocessor macro.
On some Alpha implementations the resulting code may execute
significantly slower than the code generated by default. Since
there is very little code that depends on the inexact-flag, you
should normally not specify this option. Other Alpha compilers
call this option -ieee_with_inexact.

-mfp-trap-mode=trap-mode
This option controls what floating-point related traps are enabled.
Other Alpha compilers call this option -fptm trap-mode. The trap
mode can be set to one of four values:

n This is the default (normal) setting. The only traps that are
enabled are the ones that cannot be disabled in software (e.g.,
division by zero trap).

u In addition to the traps enabled by n, underflow traps are
enabled as well.

su Like u, but the instructions are marked to be safe for software
completion (see Alpha architecture manual for details).

sui Like su, but inexact traps are enabled as well.

-mfp-rounding-mode=rounding-mode
Selects the IEEE rounding mode. Other Alpha compilers call this
option -fprm rounding-mode. The rounding-mode can be one of:

n Normal IEEE rounding mode. Floating point numbers are rounded
towards the nearest machine number or towards the even machine
number in case of a tie.

m Round towards minus infinity.

c Chopped rounding mode. Floating point numbers are rounded
towards zero.

d Dynamic rounding mode. A field in the floating point control
register (fpcr, see Alpha architecture reference manual)
controls the rounding mode in effect. The C library
initializes this register for rounding towards plus infinity.
Thus, unless your program modifies the fpcr, d corresponds to
round towards plus infinity.
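
To make the rounding directions listed above concrete, here is a small
software sketch that rounds an integer quotient in each direction. It
only illustrates the arithmetic; it does not touch the fpcr or any
floating-point hardware. The enum and the function round_quotient are
hypothetical names, and ROUND_PLUS stands for the round towards plus
infinity behaviour that the text says d normally amounts to.

```c
/* Illustrative rounding directions (hypothetical names). */
enum rounding {
    ROUND_NORMAL,   /* n: nearest, ties to even */
    ROUND_MINUS,    /* m: towards minus infinity */
    ROUND_CHOPPED,  /* c: towards zero */
    ROUND_PLUS      /* towards plus infinity */
};

/* Round num/den to an integer in the given direction.
   Assumes den > 0. */
long round_quotient(long num, long den, enum rounding mode)
{
    long q = num / den;   /* C division truncates towards zero */
    long r = num % den;   /* remainder has the sign of num */

    if (r == 0)
        return q;         /* exact result, nothing to round */

    switch (mode) {
    case ROUND_CHOPPED:
        return q;
    case ROUND_MINUS:
        return r < 0 ? q - 1 : q;
    case ROUND_PLUS:
        return r > 0 ? q + 1 : q;
    case ROUND_NORMAL: {
        long floor_q = r < 0 ? q - 1 : q;   /* floor of num/den */
        long rem = r < 0 ? r + den : r;     /* 0 < rem < den */
        if (2 * rem > den)
            return floor_q + 1;
        if (2 * rem < den)
            return floor_q;
        /* Exact tie: pick the even neighbour. */
        return (floor_q % 2 == 0) ? floor_q : floor_q + 1;
    }
    }
    return q;
}
```

For example, 7/2 rounds to 4 under ROUND_NORMAL and ROUND_PLUS, and to
3 under ROUND_MINUS and ROUND_CHOPPED.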

-mtrap-precision=trap-precision
In the Alpha architecture, floating point traps are imprecise.
This means without software assistance it is impossible to recover
from a floating point trap and program execution normally needs to be
terminated. GCC can generate code that can assist operating system
trap handlers in determining the exact location that caused a
floating point trap. Depending on the requirements of an
application, different levels of precisions can be selected:

p Program precision. This option is the default and means a trap
handler can only identify which program caused a floating point
exception.

f Function precision. The trap handler can determine the
function that caused a floating point exception.

i Instruction precision. The trap handler can determine the
exact instruction that caused a floating point exception.

Other Alpha compilers provide the equivalent options called
-scope_safe and -resumption_safe.

-mieee-conformant
This option marks the generated code as IEEE conformant. You must
not use this option unless you also specify -mtrap-precision=i and
either -mfp-trap-mode=su or -mfp-trap-mode=sui. Its only effect is
to emit the line .eflag 48 in the function prologue of the
generated assembly file. Under DEC Unix, this has the effect that
IEEE-conformant math library routines will be linked in.

-mbuild-constants
Normally GCC examines a 32- or 64-bit integer constant to see if it
can construct it from smaller constants in two or three
instructions. If it cannot, it will output the constant as a
literal and generate code to load it from the data segment at
runtime.

Use this option to require GCC to construct all integer constants
using code, even if it takes more instructions (the maximum is
six).

You would typically use this option to build a shared library
dynamic loader. Itself a shared library, it must relocate itself
in memory before it can find the variables and constants in its own
data segment.
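
The idea of constructing a constant "using code" instead of loading it
from the data segment can be sketched in C as follows. This is only a
simplified model: it assembles a 64-bit value from four 16-bit pieces
with shifts and ORs, and is not the instruction sequence GCC actually
emits. The function name build_constant is hypothetical.

```c
#include <stdint.h>

/* Assemble a 64-bit constant from four 16-bit pieces, most
   significant piece first, using only shifts and ORs. No memory
   load of a literal is needed. */
uint64_t build_constant(uint16_t p3, uint16_t p2,
                        uint16_t p1, uint16_t p0)
{
    uint64_t value = p3;
    value = (value << 16) | p2;
    value = (value << 16) | p1;
    value = (value << 16) | p0;
    return value;
}
```

Calling build_constant(0x1234, 0x5678, 0x9abc, 0xdef0) yields
0x123456789abcdef0 without reading any data-segment literal.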

-malpha-as
-mgas
Select whether to generate code to be assembled by the vendor-
supplied assembler (-malpha-as) or by the GNU assembler (-mgas).

-mbwx
-mno-bwx
-mcix
-mno-cix
-mfix
-mno-fix
-mmax
-mno-max
Indicate whether GCC should generate code to use the optional BWX,
CIX, FIX and MAX instruction sets. The default is to use the
instruction sets supported by the CPU type specified via -mcpu=
option or that of the CPU on which GCC was built if none was
specified.

-mfloat-vax
-mfloat-ieee
Generate code that uses (does not use) VAX F and G floating point
arithmetic instead of IEEE single and double precision.

-mexplicit-relocs
-mno-explicit-relocs
Older Alpha assemblers provided no way to generate symbol
relocations except via assembler macros. Use of these macros does
not allow optimal instruction scheduling. GNU binutils as of
version 2.12 supports a new syntax that allows the compiler to
explicitly mark which relocations should apply to which
instructions. This option is mostly useful for debugging, as GCC
detects the capabilities of the assembler when it is built and sets
the default accordingly.

-msmall-data
-mlarge-data
When -mexplicit-relocs is in effect, static data is accessed via
gp-relative relocations. When -msmall-data is used, objects 8
bytes long or smaller are placed in a small data area (the ".sdata"
and ".sbss" sections) and are accessed via 16-bit relocations off
of the $gp register. This limits the size of the small data area
to 64KB, but allows the variables to be directly accessed via a
single instruction.

The default is -mlarge-data. With this option the data area is
limited to just below 2GB. Programs that require more than 2GB of
data must use "malloc" or "mmap" to allocate the data in the heap
instead of in the program's data segment.

When generating code for shared libraries, -fpic implies
-msmall-data and -fPIC implies -mlarge-data.

-msmall-text
-mlarge-text
When -msmall-text is used, the compiler assumes that the code of
the entire program (or shared library) fits in 4MB, and is thus
reachable with a branch instruction. When -msmall-data is used,
the compiler can assume that all local symbols share the same $gp
value, and thus reduce the number of instructions required for a
function call from 4 to 1.

The default is -mlarge-text.

-mcpu=cpu_type
Set the instruction set and instruction scheduling parameters for
machine type cpu_type. You can specify either the EV style name or
the corresponding chip number. GCC supports scheduling parameters
for the EV4, EV5 and EV6 family of processors and will choose the
default values for the instruction set from the processor you
specify. If you do not specify a processor type, GCC will default
to the processor on which the compiler was built.

Supported values for cpu_type are

ev4
ev45
21064
Schedules as an EV4 and has no instruction set extensions.

ev5
21164
Schedules as an EV5 and has no instruction set extensions.

ev56
21164a
Schedules as an EV5 and supports the BWX extension.

pca56
21164pc
21164PC
Schedules as an EV5 and supports the BWX and MAX extensions.

ev6
21264
Schedules as an EV6 and supports the BWX, FIX, and MAX
extensions.

ev67
21264a
Schedules as an EV6 and supports the BWX, CIX, FIX, and MAX
extensions.

Native Linux/GNU toolchains also support the value native, which
selects the best architecture option for the host processor.
-mcpu=native has no effect if GCC does not recognize the processor.
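
The cpu_type values and their instruction set extensions listed above
can be summarized as a lookup table. The following sketch encodes
exactly that list; the bit flags, the table, and the function
alpha_cpu_extensions are hypothetical names used for illustration and
are not part of GCC.

```c
#include <stddef.h>
#include <string.h>

/* One bit per optional instruction set extension. */
enum { EXT_BWX = 1, EXT_CIX = 2, EXT_FIX = 4, EXT_MAX = 8 };

struct cpu_entry {
    const char *name;
    unsigned extensions;
};

/* Values taken from the -mcpu list above. */
static const struct cpu_entry cpu_table[] = {
    { "ev4", 0 }, { "ev45", 0 }, { "21064", 0 },
    { "ev5", 0 }, { "21164", 0 },
    { "ev56", EXT_BWX }, { "21164a", EXT_BWX },
    { "pca56", EXT_BWX | EXT_MAX },
    { "21164pc", EXT_BWX | EXT_MAX },
    { "21164PC", EXT_BWX | EXT_MAX },
    { "ev6", EXT_BWX | EXT_FIX | EXT_MAX },
    { "21264", EXT_BWX | EXT_FIX | EXT_MAX },
    { "ev67", EXT_BWX | EXT_CIX | EXT_FIX | EXT_MAX },
    { "21264a", EXT_BWX | EXT_CIX | EXT_FIX | EXT_MAX },
};

/* Returns the extension mask for cpu_type, or -1 if the name is
   not in the table. */
int alpha_cpu_extensions(const char *cpu_type)
{
    size_t i;
    for (i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++) {
        if (strcmp(cpu_table[i].name, cpu_type) == 0)
            return (int)cpu_table[i].extensions;
    }
    return -1;
}
```

For instance, alpha_cpu_extensions("ev56") has only the EXT_BWX bit
set, matching the description of ev56 above.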

-mtune=cpu_type
Set only the instruction scheduling parameters for machine type
cpu_type. The instruction set is not changed.

Native Linux/GNU toolchains also support the value native, which
selects the best architecture option for the host processor.
-mtune=native has no effect if GCC does not recognize the
processor.

-mmemory-latency=time
Sets the latency the scheduler should assume for typical memory
references as seen by the application. This number is highly
dependent on the memory access patterns used by the application and
the size of the external cache on the machine.

Valid options for time are

number
A decimal number representing clock cycles.

L1
L2
L3
main
The compiler contains estimates of the number of clock cycles
for "typical" EV4 & EV5 hardware for the Level 1, 2 & 3 caches
(also called Dcache, Scache, and Bcache), as well as to main
memory. Note that L3 is only valid for EV5.
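
The accepted forms of the time argument can be sketched as a small
classifier. This is a hypothetical helper, not GCC's own option
parser: it only distinguishes a decimal cycle count from the four
symbolic names, and it does not check that L3 is used with EV5 only.

```c
#include <ctype.h>
#include <string.h>

enum latency_kind {
    LATENCY_INVALID,
    LATENCY_CYCLES,   /* a decimal number of clock cycles */
    LATENCY_L1,
    LATENCY_L2,
    LATENCY_L3,
    LATENCY_MAIN
};

/* Classify the argument of -mmemory-latency=time. */
enum latency_kind classify_memory_latency(const char *time)
{
    const char *p;

    if (strcmp(time, "L1") == 0) return LATENCY_L1;
    if (strcmp(time, "L2") == 0) return LATENCY_L2;
    if (strcmp(time, "L3") == 0) return LATENCY_L3;
    if (strcmp(time, "main") == 0) return LATENCY_MAIN;

    if (*time == '\0')
        return LATENCY_INVALID;
    for (p = time; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return LATENCY_INVALID;
    }
    return LATENCY_CYCLES;
}
```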

DEC Alpha/VMS Options

These -m options are defined for the DEC Alpha/VMS implementations:

-mvms-return-codes
Return VMS condition codes from main. The default is to return
POSIX style condition (e.g. error) codes.

FR30 Options

These options are defined specifically for the FR30 port.

-msmall-model
Use the small address space model. This can produce smaller code,
but it does assume that all symbolic values and addresses will fit
into a 20-bit range.

-mno-lsim
Assume that run-time support has been provided and so there is no
need to include the simulator library (libsim.a) on the linker
command line.

FRV Options

-mgpr-32
Only use the first 32 general purpose registers.

-mgpr-64
Use all 64 general purpose registers.

-mfpr-32
Use only the first 32 floating point registers.

-mfpr-64
Use all 64 floating point registers.

-mhard-float
Use hardware instructions for floating point operations.

-msoft-float
Use library routines for floating point operations.

-malloc-cc
Dynamically allocate condition code registers.

-mfixed-cc
Do not try to dynamically allocate condition code registers, only
use "icc0" and "fcc0".

-mdword
Change ABI to use double word insns.

-mno-dword
Do not use double word instructions.

-mdouble
Use floating point double instructions.

-mno-double
Do not use floating point double instructions.

-mmedia
Use media instructions.

-mno-media
Do not use media instructions.

-mmuladd
Use multiply and add/subtract instructions.

-mno-muladd
Do not use multiply and add/subtract instructions.

-mfdpic
Select the FDPIC ABI, that uses function descriptors to represent
pointers to functions. Without any PIC/PIE-related options, it
implies -fPIE. With -fpic or -fpie, it assumes GOT entries and
small data are within a 12-bit range from the GOT base address;
with -fPIC or -fPIE, GOT offsets are computed with 32 bits. With a
bfin-elf target, this option implies -msim.

-minline-plt
Enable inlining of PLT entries in function calls to functions that
are not known to bind locally. It has no effect without -mfdpic.
It's enabled by default if optimizing for speed and compiling for
shared libraries (i.e., -fPIC or -fpic), or when an optimization
option such as -O3 or above is present in the command line.

-mTLS
Assume a large TLS segment when generating thread-local code.

-mtls
Do not assume a large TLS segment when generating thread-local
code.

-mgprel-ro
Enable the use of "GPREL" relocations in the FDPIC ABI for data
that is known to be in read-only sections. It's enabled by
default, except for -fpic or -fpie: even though it may help make
the global offset table smaller, it trades 1 instruction for 4.
With -fPIC or -fPIE, it trades 3 instructions for 4, one of which
may be shared by multiple symbols, and it avoids the need for a GOT
entry for the referenced symbol, so it's more likely to be a win.
If it is not, -mno-gprel-ro can be used to disable it.

-multilib-library-pic
Link with the (library, not FD) pic libraries. It's implied by
-mlibrary-pic, as well as by -fPIC and -fpic without -mfdpic. You
should never have to use it explicitly.

-mlinked-fp
Follow the EABI requirement of always creating a frame pointer
whenever a stack frame is allocated. This option is enabled by
default and can be disabled with -mno-linked-fp.

-mlong-calls
Use indirect addressing to call functions outside the current
compilation unit. This allows the functions to be placed anywhere
within the 32-bit address space.

-malign-labels
Try to align labels to an 8-byte boundary by inserting nops into
the previous packet. This option only has an effect when VLIW
packing is enabled. It doesn't create new packets; it merely adds
nops to existing ones.

-mlibrary-pic
Generate position-independent EABI code.

-macc-4
Use only the first four media accumulator registers.

-macc-8
Use all eight media accumulator registers.

-mpack
Pack VLIW instructions.

-mno-pack
Do not pack VLIW instructions.

-mno-eflags
Do not mark ABI switches in e_flags.

-mcond-move
Enable the use of conditional-move instructions (default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-cond-move
Disable the use of conditional-move instructions.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mscc
Enable the use of conditional set instructions (default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-scc
Disable the use of conditional set instructions.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mcond-exec
Enable the use of conditional execution (default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-cond-exec
Disable the use of conditional execution.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mvliw-branch
Run a pass to pack branches into VLIW instructions (default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-vliw-branch
Do not run a pass to pack branches into VLIW instructions.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mmulti-cond-exec
Enable optimization of "&&" and "||" in conditional execution
(default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-multi-cond-exec
Disable optimization of "&&" and "||" in conditional execution.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mnested-cond-exec
Enable nested conditional execution optimizations (default).

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-mno-nested-cond-exec
Disable nested conditional execution optimizations.

This switch is mainly for debugging the compiler and will likely be
removed in a future version.

-moptimize-membar
This switch removes redundant "membar" instructions from the
compiler generated code. It is enabled by default.

-mno-optimize-membar
This switch disables the automatic removal of redundant "membar"
instructions from the generated code.

-mtomcat-stats
Cause gas to print out tomcat statistics.

-mcpu=cpu
Select the processor type for which to generate code. Possible
values are frv, fr550, tomcat, fr500, fr450, fr405, fr400, fr300
and simple.

GNU/Linux Options

These -m options are defined for GNU/Linux targets:

-mglibc
Use the GNU C library instead of uClibc. This is the default
except on *-*-linux-*uclibc* targets.

-muclibc
Use uClibc instead of the GNU C library. This is the default on
*-*-linux-*uclibc* targets.

H8/300 Options

These -m options are defined for the H8/300 implementations:

-mrelax
Shorten some address references at link time, when possible; uses
the linker option -relax.

-mh Generate code for the H8/300H.

-ms Generate code for the H8S.

-mn Generate code for the H8S and H8/300H in the normal mode. This
switch must be used either with -mh or -ms.

-ms2600
Generate code for the H8S/2600. This switch must be used with -ms.

-mint32
Make "int" data 32 bits by default.

-malign-300
On the H8/300H and H8S, use the same alignment rules as for the
H8/300. The default for the H8/300H and H8S is to align longs and
floats on 4 byte boundaries. -malign-300 causes them to be aligned
on 2 byte boundaries. This option has no effect on the H8/300.
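
The effect of the two alignment rules on structure layout can be
worked out with a little arithmetic. The sketch below assumes a
structure holding one "char" followed by one 4-byte "long"; the 4-byte
size and the helper names are assumptions made for illustration.

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment. */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

/* Offset of a "long" member placed right after a single "char". */
size_t long_offset_after_char(size_t long_alignment)
{
    return align_up(1, long_alignment);
}

/* Total size of struct { char c; long l; }, assuming a 4-byte
   "long" and padding the struct to the alignment of "long". */
size_t struct_size(size_t long_alignment)
{
    size_t end = long_offset_after_char(long_alignment) + 4;
    return align_up(end, long_alignment);
}
```

With 4-byte alignment the "long" lands at offset 4 and the structure
occupies 8 bytes; with the 2-byte alignment selected by -malign-300 it
lands at offset 2 and the structure occupies 6 bytes.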

HPPA Options

These -m options are defined for the HPPA family of computers:

-march=architecture-type
Generate code for the specified architecture. The choices for
architecture-type are 1.0 for PA 1.0, 1.1 for PA 1.1, and 2.0 for
PA 2.0 processors. Refer to /usr/lib/sched.models on an HP-UX
system to determine the proper architecture option for your
machine. Code compiled for lower numbered architectures will run
on higher numbered architectures, but not the other way around.

-mpa-risc-1-0
-mpa-risc-1-1
-mpa-risc-2-0
Synonyms for -march=1.0, -march=1.1, and -march=2.0 respectively.

-mbig-switch
Generate code suitable for big switch tables. Use this option only
if the assembler/linker complain about out of range branches within
a switch table.

-mjump-in-delay
Fill delay slots of function calls with unconditional jump
instructions by modifying the return pointer for the function call
to be the target of the unconditional jump.

-mdisable-fpregs
Prevent floating point registers from being used in any manner.
This is necessary for compiling kernels which perform lazy context
switching of floating point registers. If you use this option and
attempt to perform floating point operations, the compiler will
abort.

-mdisable-indexing
Prevent the compiler from using indexing address modes. This
avoids some rather obscure problems when compiling MIG generated
code under MACH.

-mno-space-regs
Generate code that assumes the target has no space registers. This
allows GCC to generate faster indirect calls and use unscaled index
address modes.

Such code is suitable for level 0 PA systems and kernels.

-mfast-indirect-calls
Generate code that assumes calls never cross space boundaries.
This allows GCC to emit code which performs faster indirect calls.

This option will not work in the presence of shared libraries or
nested functions.

-mfixed-range=register-range
Generate code treating the given register range as fixed registers.
A fixed register is one that the register allocator cannot use.
This is useful when compiling kernel code. A register range is
specified as two registers separated by a dash. Multiple register
ranges can be specified separated by a comma.

-mlong-load-store
Generate 3-instruction load and store sequences as sometimes
required by the HP-UX 10 linker. This is equivalent to the +k
option to the HP compilers.

-mportable-runtime
Use the portable calling conventions proposed by HP for ELF
systems.

-mgas
Enable the use of assembler directives only GAS understands.

-mschedule=cpu-type
Schedule code according to the constraints for the machine type
cpu-type. The choices for cpu-type are 700, 7100, 7100LC, 7200,
7300 and 8000. Refer to /usr/lib/sched.models on an HP-UX system
to determine the proper scheduling option for your machine. The
default scheduling is 8000.

-mlinker-opt
Enable the optimization pass in the HP-UX linker. Note this makes
symbolic debugging impossible. It also triggers a bug in the HP-UX
8 and HP-UX 9 linkers in which they give bogus error messages when
linking some programs.

-msoft-float
Generate output containing library calls for floating point.
Warning: the requisite libraries are not available for all HPPA
targets. Normally the facilities of the machine's usual C compiler
are used, but this cannot be done directly in cross-compilation.
You must make your own arrangements to provide suitable library
functions for cross-compilation.

-msoft-float changes the calling convention in the output file;
therefore, it is only useful if you compile all of a program with
this option. In particular, you need to compile libgcc.a, the
library that comes with GCC, with -msoft-float in order for this to
work.

-msio
Generate the predefine, "_SIO", for server IO. The default is
-mwsio. This generates the predefines, "__hp9000s700",
"__hp9000s700__" and "_WSIO", for workstation IO. These options
are available under HP-UX and HI-UX.

-mgnu-ld
Use GNU ld specific options. This passes -shared to ld when
building a shared library. It is the default when GCC is
configured, explicitly or implicitly, with the GNU linker. This
option does not have any effect on which ld is called, it only
changes what parameters are passed to that ld. The ld that is
called is determined by the --with-ld configure option, GCC's
program search path, and finally by the user's PATH. The linker
used by GCC can be printed using which `gcc -print-prog-name=ld`.
This option is only available on the 64 bit HP-UX GCC, i.e.
configured with hppa*64*-*-hpux*.

-mhp-ld
Use HP ld specific options. This passes -b to ld when building a
shared library and passes +Accept TypeMismatch to ld on all links.
It is the default when GCC is configured, explicitly or implicitly,
with the HP linker. This option does not have any effect on which
ld is called, it only changes what parameters are passed to that
ld. The ld that is called is determined by the --with-ld configure
option, GCC's program search path, and finally by the user's PATH.
The linker used by GCC can be printed using which `gcc
-print-prog-name=ld`. This option is only available on the 64 bit
HP-UX GCC, i.e. configured with hppa*64*-*-hpux*.

-mlong-calls
Generate code that uses long call sequences. This ensures that a
call is always able to reach linker generated stubs. The default
is to generate long calls only when the distance from the call site
to the beginning of the function or translation unit, as the case
may be, exceeds a predefined limit set by the branch type being
used. The limits for normal calls are 7,600,000 and 240,000 bytes,
respectively for the PA 2.0 and PA 1.X architectures. Sibcalls are
always limited at 240,000 bytes.

Distances are measured from the beginning of functions when using
the -ffunction-sections option, or when using the -mgas and
-mno-portable-runtime options together under HP-UX with the SOM
linker.

It is normally not desirable to use this option as it will degrade
performance. However, it may be useful in large applications,
particularly when partial linking is used to build the application.

The types of long calls used depends on the capabilities of the
assembler and linker, and the type of code being generated. The
impact on systems that support long absolute calls, and long pic
symbol-difference or pc-relative calls should be relatively small.
However, an indirect call is used on 32-bit ELF systems in pic code
and it is quite long.
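
The default rule described above for -mlong-calls can be modelled as
a distance check against the stated limits. This is a sketch of the
rule as worded in the text, not GCC's implementation; the enum and
function names are hypothetical.

```c
enum pa_arch { PA_1_X, PA_2_0 };

/* Limit in bytes, as given in the text: 7,600,000 for normal calls
   on PA 2.0, 240,000 on PA 1.X, and 240,000 for all sibcalls. */
long call_distance_limit(enum pa_arch arch, int is_sibcall)
{
    if (is_sibcall || arch == PA_1_X)
        return 240000L;
    return 7600000L;
}

/* A long call sequence is needed when the distance from the call
   site exceeds the limit for the branch type. */
int needs_long_call(long distance, enum pa_arch arch, int is_sibcall)
{
    return distance > call_distance_limit(arch, is_sibcall);
}
```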

-munix=unix-std
Generate compiler predefines and select a startfile for the
specified UNIX standard. The choices for unix-std are 93, 95 and
98. 93 is supported on all HP-UX versions. 95 is available on HP-
UX 10.10 and later. 98 is available on HP-UX 11.11 and later. The
default values are 93 for HP-UX 10.00, 95 for HP-UX 10.10 through to
11.00, and 98 for HP-UX 11.11 and later.

-munix=93 provides the same predefines as GCC 3.3 and 3.4.
-munix=95 provides additional predefines for "_XOPEN_UNIX" and
"_XOPEN_SOURCE_EXTENDED", and the startfile unix95.o. -munix=98
provides additional predefines for "_XOPEN_UNIX",
"_XOPEN_SOURCE_EXTENDED", "_INCLUDE__STDC_A1_SOURCE" and
"_INCLUDE_XOPEN_SOURCE_500", and the startfile unix98.o.

It is important to note that this option changes the interfaces for
various library routines. It also affects the operational behavior
of the C library. Thus, extreme care is needed in using this
option.

Library code that is intended to operate with more than one UNIX
standard must test, set and restore the variable
__xpg4_extended_mask as appropriate. Most GNU software doesn't
provide this capability.

-nolibdld
Suppress the generation of link options to search libdld.sl when
the -static option is specified on HP-UX 10 and later.

-static
The HP-UX implementation of setlocale in libc has a dependency on
libdld.sl. There isn't an archive version of libdld.sl. Thus,
when the -static option is specified, special link options are
needed to resolve this dependency.

On HP-UX 10 and later, the GCC driver adds the necessary options to
link with libdld.sl when the -static option is specified. This
causes the resulting binary to be dynamic. On the 64-bit port, the
linkers generate dynamic binaries by default in any case. The
-nolibdld option can be used to prevent the GCC driver from adding
these link options.

-threads
Add support for multithreading with the dce thread library under
HP-UX. This option sets flags for both the preprocessor and
linker.

Intel 386 and AMD x86-64 Options

These -m options are defined for the i386 and x86-64 family of
computers:

-mtune=cpu-type
Tune to cpu-type everything applicable about the generated code,
except for the ABI and the set of available instructions. The
choices for cpu-type are:

generic
Produce code optimized for the most common IA32/AMD64/EM64T
processors. If you know the CPU on which your code will run,
then you should use the corresponding -mtune option instead of
-mtune=generic. But, if you do not know exactly what CPU users
of your application will have, then you should use this option.

As new processors are deployed in the marketplace, the behavior
of this option will change. Therefore, if you upgrade to a
newer version of GCC, the code generated by this option will change to
reflect the processors that were most common when that version
of GCC was released.

There is no -march=generic option because -march indicates the
instruction set the compiler can use, and there is no generic
instruction set applicable to all processors. In contrast,
-mtune indicates the processor (or, in this case, collection of
processors) for which the code is optimized.

native
This selects the CPU to tune for at compilation time by
determining the processor type of the compiling machine. Using
-mtune=native will produce code optimized for the local machine
under the constraints of the selected instruction set. Using
-march=native will enable all instruction subsets supported by
the local machine (hence the result might not run on different
machines).

i386
Original Intel i386 CPU.

i486
Intel's i486 CPU. (No scheduling is implemented for this
chip.)

i586, pentium
Intel Pentium CPU with no MMX support.

pentium-mmx
Intel PentiumMMX CPU based on Pentium core with MMX instruction
set support.

pentiumpro
Intel PentiumPro CPU.

i686
Same as "generic", but when used as "march" option, PentiumPro
instruction set will be used, so the code will run on all i686
family chips.

pentium2
Intel Pentium2 CPU based on PentiumPro core with MMX
instruction set support.

pentium3, pentium3m
Intel Pentium3 CPU based on PentiumPro core with MMX and SSE
instruction set support.

pentium-m
Low power version of Intel Pentium3 CPU with MMX, SSE and SSE2
instruction set support. Used by Centrino notebooks.

pentium4, pentium4m
Intel Pentium4 CPU with MMX, SSE and SSE2 instruction set
support.

prescott
Improved version of Intel Pentium4 CPU with MMX, SSE, SSE2 and
SSE3 instruction set support.

nocona
Improved version of Intel Pentium4 CPU with 64-bit extensions,
MMX, SSE, SSE2 and SSE3 instruction set support.

core2
Intel Core2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3
and SSSE3 instruction set support.

k6 AMD K6 CPU with MMX instruction set support.

k6-2, k6-3
Improved versions of AMD K6 CPU with MMX and 3dNOW! instruction
set support.

athlon, athlon-tbird
AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and SSE
prefetch instructions support.

athlon-4, athlon-xp, athlon-mp
Improved AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and
full SSE instruction set support.

k8, opteron, athlon64, athlon-fx
AMD K8 core based CPUs with x86-64 instruction set support.
(This supersets MMX, SSE, SSE2, 3dNOW!, enhanced 3dNOW! and
64-bit instruction set extensions.)

k8-sse3, opteron-sse3, athlon64-sse3
Improved versions of k8, opteron and athlon64 with SSE3
instruction set support.

amdfam10, barcelona
AMD Family 10h core based CPUs with x86-64 instruction set
support. (This supersets MMX, SSE, SSE2, SSE3, SSE4A, 3dNOW!,
enhanced 3dNOW!, ABM and 64-bit instruction set extensions.)

winchip-c6
IDT Winchip C6 CPU, dealt in same way as i486 with additional
MMX instruction set support.

winchip2
IDT Winchip2 CPU, dealt in same way as i486 with additional MMX
and 3dNOW! instruction set support.

c3 Via C3 CPU with MMX and 3dNOW! instruction set support. (No
scheduling is implemented for this chip.)

c3-2
Via C3-2 CPU with MMX and SSE instruction set support. (No
scheduling is implemented for this chip.)

geode
Embedded AMD CPU with MMX and 3dNOW! instruction set support.

While picking a specific cpu-type will schedule things
appropriately for that particular chip, the compiler will not
generate any code that does not run on the i386 without the
-march=cpu-type option being used.

-march=cpu-type
Generate instructions for the machine type cpu-type. The choices
for cpu-type are the same as for -mtune. Moreover, specifying
-march=cpu-type implies -mtune=cpu-type.

-mcpu=cpu-type
A deprecated synonym for -mtune.

-mfpmath=unit
Generate floating point arithmetics for selected unit unit. The
choices for unit are:

387 Use the standard 387 floating point coprocessor present in the
majority of chips and emulated otherwise. Code compiled with
this option will run almost everywhere. The temporary results
are computed in 80-bit precision instead of the precision
specified by the type, resulting in slightly different results
compared to most other chips. See -ffloat-store for a more
detailed description.

This is the default choice for i386 compiler.

sse Use scalar floating point instructions present in the SSE
instruction set. This instruction set is supported by Pentium3
and newer chips, in the AMD line by Athlon-4, Athlon-xp and
Athlon-mp chips. The earlier version of SSE instruction set
supports only single precision arithmetics, thus the double and
extended precision arithmetics is still done using 387. A later
version, present only in Pentium4 and the future AMD x86-64
chips, supports double precision arithmetics too.

For the i386 compiler, you need to use -march=cpu-type, -msse
or -msse2 switches to enable SSE extensions and make this
option effective. For the x86-64 compiler, these extensions
are enabled by default.

The resulting code should be considerably faster in the
majority of cases and avoid the numerical instability problems
of 387 code, but may break some existing code that expects
temporaries to be 80 bits wide.

This is the default choice for the x86-64 compiler.

sse,387
sse+387
both
Attempt to utilize both instruction sets at once. This
effectively doubles the number of available registers and, on
chips with separate execution units for 387 and SSE, the
execution resources too. Use this option with care, as it is
still experimental: the GCC register allocator does not model
separate functional units well, resulting in unstable
performance.
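
The effect of the selected unit on intermediate precision can be
observed from C through the C99 macro FLT_EVAL_METHOD in <float.h>.
The following is a minimal sketch and not part of GCC; the helper
names eval_method_name and report_eval_method are made up for
illustration. On x86, a build using 387 arithmetic is expected to
report 2 and a build using SSE arithmetic for all types is expected
to report 0, but the exact value depends on the compiler version and
flags.

```c
#include <float.h>
#include <stdio.h>

/* Describe a C99 FLT_EVAL_METHOD value in words. */
const char *eval_method_name(int method)
{
    switch (method) {
    case 0:
        return "each operation is evaluated in the precision of its type";
    case 1:
        return "float and double operations are evaluated as double";
    case 2:
        return "all operations are evaluated as long double";
    default:
        return "indeterminate";
    }
}

/* Print the evaluation method selected by the current compiler flags. */
void report_eval_method(void)
{
    int method = (int)FLT_EVAL_METHOD;

    printf("FLT_EVAL_METHOD = %d: %s\n", method, eval_method_name(method));
}
```

Calling report_eval_method() from a program built once with
-mfpmath=387 and once with -mfpmath=sse -msse2 is a quick way to
check which behavior a given build actually has.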

-masm=dialect
Output asm instructions using the selected dialect. Supported
choices are intel or att (the default). Darwin does not support
intel.

-mieee-fp
-mno-ieee-fp
Control whether or not the compiler uses IEEE floating point
comparisons. These handle correctly the case where the result of a
comparison is unordered.
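
As an illustration of what "unordered" means here, the sketch below
builds a NaN at run time and tests whether two values compare as
unordered, that is, whether none of <, == and > holds. The function
names is_unordered and make_nan are hypothetical, and the code
assumes default IEEE behavior (no -ffast-math).

```c
#include <stdbool.h>

/* True when none of x < y, x == y and x > y holds, which under IEEE
   rules happens exactly when at least one operand is a NaN. */
bool is_unordered(double x, double y)
{
    return !(x < y) && !(x == y) && !(x > y);
}

/* Produce a NaN at run time; the volatile keeps the compiler from
   folding the division away. */
double make_nan(void)
{
    volatile double zero = 0.0;

    return zero / zero;
}
```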

-msoft-float
Generate output containing library calls for floating point.
Warning: the requisite libraries are not part of GCC. Normally the
facilities of the machine's usual C compiler are used, but this
can't be done directly in cross-compilation. You must make your
own arrangements to provide suitable library functions for cross-
compilation.

On machines where a function returns floating point results in the
80387 register stack, some floating point opcodes may be emitted
even if -msoft-float is used.

-mno-fp-ret-in-387
Do not use the FPU registers for return values of functions.

The usual calling convention has functions return values of types
"float" and "double" in an FPU register, even if there is no FPU.
The idea is that the operating system should emulate an FPU.

The option -mno-fp-ret-in-387 causes such values to be returned in
ordinary CPU registers instead.

-mno-fancy-math-387
Some 387 emulators do not support the "sin", "cos" and "sqrt"
instructions for the 387. Specify this option to avoid generating
those instructions. This option is the default on FreeBSD, OpenBSD
and NetBSD. This option is overridden when -march indicates that
the target cpu will always have an FPU and so the instruction will
not need emulation. As of revision 2.6.1, these instructions are
not generated unless you also use the -funsafe-math-optimizations
switch.

-malign-double
-mno-align-double
Control whether GCC aligns "double", "long double", and "long long"
variables on a two word boundary or a one word boundary. Aligning
"double" variables on a two word boundary will produce code that
runs somewhat faster on a Pentium at the expense of more memory.

On x86-64, -malign-double is enabled by default.

Warning: if you use the -malign-double switch, structures
containing the above types will be aligned differently than the
published application binary interface specifications for the 386
and will not be binary compatible with structures in code compiled
without that switch.
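
The layout change described in the warning can be inspected with
offsetof. This is a minimal sketch; struct sample and the helper
names are hypothetical. On a 32-bit x86 target the offset of the
"double" member is expected to be 4 by default and 8 with
-malign-double; other targets may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* A hypothetical structure whose layout depends on the alignment
   chosen for "double". */
struct sample {
    char tag;
    double value;
};

/* Offset of the "double" member within struct sample. */
size_t double_member_offset(void)
{
    return offsetof(struct sample, value);
}

/* Print the offset and total size so two builds can be compared. */
void report_sample_layout(void)
{
    printf("offset of value = %zu, sizeof(struct sample) = %zu\n",
           double_member_offset(), sizeof(struct sample));
}
```

If two object files disagree on these numbers, they cannot safely
exchange struct sample values, which is the incompatibility the
warning refers to.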

-m96bit-long-double
-m128bit-long-double
These switches control the size of "long double" type. The i386
application binary interface specifies the size to be 96 bits, so
-m96bit-long-double is the default in 32 bit mode.

Modern architectures (Pentium and newer) would prefer "long double"
to be aligned to an 8 or 16 byte boundary. In arrays or structures
conforming to the ABI, this would not be possible. So specifying
-m128bit-long-double will align "long double" to a 16 byte boundary
by padding the "long double" with an additional 32 bit zero.

In the x86-64 compiler, -m128bit-long-double is the default choice
as its ABI specifies that "long double" is to be aligned on 16 byte
boundary.

Notice that neither of these options enables any extra precision
over the x87 standard of 80 bits for a "long double".

Warning: if you override the default value for your target ABI,
structures and arrays containing "long double" variables will
change their size, and the calling convention for functions taking
"long double" arguments will be modified. Hence they will not be
binary compatible with arrays or structures in code compiled
without that switch.
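
The distinction between storage size and precision can be checked
directly. The sketch below uses hypothetical helper names. On x86
the storage size is expected to be 96 or 128 bits depending on the
switch, while the significand width stays at 64 bits in both cases;
on other targets both numbers may be different.

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of bits of storage occupied by a "long double", padding
   included. */
size_t long_double_storage_bits(void)
{
    return sizeof(long double) * CHAR_BIT;
}

/* Number of significand bits of "long double"; this does not change
   with the amount of padding. */
int long_double_significand_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Print both values so two builds can be compared. */
void report_long_double(void)
{
    printf("storage: %zu bits, significand: %d bits\n",
           long_double_storage_bits(), long_double_significand_bits());
}
```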

-mlarge-data-threshold=number
When -mcmodel=medium is specified, data larger than threshold
is placed in the large data section. This value must be the same
across all objects linked into the binary and defaults to 65535.

-mrtd
Use a different function-calling convention, in which functions
that take a fixed number of arguments return with the "ret" num
instruction, which pops their arguments while returning. This
saves one instruction in the caller since there is no need to pop
the arguments there.

You can specify that an individual function is called with this
calling sequence with the function attribute stdcall. You can also
override the -mrtd option by using the function attribute cdecl.

Warning: this calling convention is incompatible with the one
normally used on Unix, so you cannot use it if you need to call
libraries compiled with the Unix compiler.

Also, you must provide function prototypes for all functions that
take variable numbers of arguments (including "printf"); otherwise
incorrect code will be generated for calls to those functions.

In addition, seriously incorrect code will result if you call a
function with too many arguments. (Normally, extra arguments are
harmlessly ignored.)
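
The per-function attributes mentioned above can be sketched as
follows. The macro names CALLEE_POPS and CALLER_POPS and both
functions are made up for illustration, and the attributes are only
applied on 32-bit x86, where they are meaningful. Note that the
variadic function is fully prototyped, as required above.

```c
#include <stdarg.h>

#if defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall))
#define CALLER_POPS __attribute__((cdecl))
#else
/* The attributes are specific to 32-bit x86; expand to nothing
   elsewhere. */
#define CALLEE_POPS
#define CALLER_POPS
#endif

/* Fixed number of arguments: with stdcall the function itself pops
   its two arguments when it returns. */
CALLEE_POPS int add_pair(int a, int b)
{
    return a + b;
}

/* Variable number of arguments: marked cdecl so that the caller pops
   the arguments even when the file is built with -mrtd. */
CALLER_POPS int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```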

-mregparm=num
Control how many registers are used to pass integer arguments. By
default, no registers are used to pass arguments, and at most 3
registers can be used. You can control this behavior for a
specific function by using the function attribute regparm.

Warning: if you use this switch, and num is nonzero, then you must
build all modules with the same value, including any libraries.
This includes the system libraries and startup modules.
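
Applying the convention to a single function with the regparm
attribute avoids rebuilding every module. A minimal sketch, with the
macro PASS_IN_REGS and the function clamp invented for the example;
the attribute is only applied on 32-bit x86.

```c
#if defined(__i386__)
/* Pass up to three integer arguments in registers. */
#define PASS_IN_REGS __attribute__((regparm(3)))
#else
#define PASS_IN_REGS
#endif

/* Limit value to the range [low, high]. Callers must see this same
   declaration, otherwise they would pass the arguments on the stack. */
PASS_IN_REGS int clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```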

-msseregparm
Use SSE register passing conventions for float and double arguments
and return values. You can control this behavior for a specific
function by using the function attribute sseregparm.

Warning: if you use this switch then you must build all modules
with the same value, including any libraries. This includes the
system libraries and startup modules.

-mpc32
-mpc64
-mpc80
Set 80387 floating-point precision to 32, 64 or 80 bits. When
-mpc32 is specified, the significands of results of floating-point
operations are rounded to 24 bits (single precision); -mpc64 rounds
the significands of results of floating-point operations to 53 bits
(double precision) and -mpc80 rounds the significands of results of
floating-point operations to 64 bits (extended double precision),
which is the default. When this option is used, floating-point
operations in higher precisions are not available to the programmer
without setting the FPU control word explicitly.

Setting the rounding of floating-point operations to less than the
default 80 bits can speed some programs by 2% or more. Note that
some mathematical libraries assume that extended precision (80 bit)
floating-point operations are enabled by default; routines in such
libraries could suffer significant loss of accuracy, typically
through so-called "catastrophic cancellation", when this option is
used to set the precision to less than extended precision.
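
The correspondence between the three switches and significand widths
can be written down as a small lookup. This is only an illustration
of the numbers given above; the function names are hypothetical.

```c
#include <float.h>

/* Significand width, in bits, selected by -mpc32, -mpc64 or -mpc80.
   Returns -1 for any other value. */
int significand_bits_for_pc(int pc)
{
    switch (pc) {
    case 32:
        return 24;
    case 64:
        return 53;
    case 80:
        return 64;
    default:
        return -1;
    }
}

/* Narrowest setting (32, 64 or 80) whose significand can hold the
   requested number of bits, or -1 if none can. */
int narrowest_pc_for_bits(int bits)
{
    if (bits <= 24)
        return 32;
    if (bits <= 53)
        return 64;
    if (bits <= 64)
        return 80;
    return -1;
}

/* Setting needed to keep every bit of a "double" significand. */
int pc_for_double(void)
{
    return narrowest_pc_for_bits(DBL_MANT_DIG);
}
```

A library routine that relies on more than 53 significand bits in
its intermediates therefore needs the default -mpc80 setting.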

-mstackrealign
Realign the stack at entry. On the Intel x86, the -mstackrealign
option will generate an alternate prologue and epilogue that
realigns the runtime stack if necessary. This supports mixing
legacy code that keeps a 4-byte aligned stack with modern code
that keeps a 16-byte aligned stack for SSE compatibility. See also the
attribute "force_align_arg_pointer", applicable to individual
functions.

-mpreferred-stack-boundary=num
Attempt to keep the stack boundary aligned to a 2 raised to num
byte boundary. If -mpreferred-stack-boundary is not specified, the
default is 4 (16 bytes or 128 bits).

-mincoming-stack-boundary=num
Assume the incoming stack is aligned to a 2 raised to num byte
boundary. If -mincoming-stack-boundary is not specified, the one
specified by -mpreferred-stack-boundary will be used.

On Pentium and PentiumPro, "double" and "long double" values should
be aligned to an 8 byte boundary (see -malign-double) or suffer
significant run time performance penalties. On Pentium III, the
Streaming SIMD Extension (SSE) data type "__m128" may not work
properly if it is not 16 byte aligned.

To ensure proper alignment of these values on the stack, the stack
boundary must be as aligned as that required by any value stored on
the stack. Further, every function must be generated such that it
keeps the stack aligned. Thus calling a function compiled with a
higher preferred stack boundary from a function compiled with a
lower preferred stack boundary will most likely misalign the stack.
It is recommended that libraries that use callbacks always use the
default setting.

This extra alignment does consume extra stack space, and generally
increases code size. Code that is sensitive to stack space usage,
such as embedded systems and operating system kernels, may want to
reduce the preferred alignment to -mpreferred-stack-boundary=2.
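
The arithmetic behind num can be sketched as below: the boundary is
2 raised to num bytes, and a frame that must preserve the boundary
is rounded up to a multiple of it. The function names are
hypothetical, and the rounding shown is a simplification of what a
real prologue does.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Boundary in bytes for a given -mpreferred-stack-boundary value. */
size_t stack_boundary_bytes(unsigned num)
{
    assert(num < sizeof(size_t) * CHAR_BIT);
    return (size_t)1 << num;
}

/* Round a frame size up to the next multiple of the boundary. */
size_t round_up_to_boundary(size_t size, unsigned num)
{
    size_t boundary = stack_boundary_bytes(num);

    return (size + boundary - 1) & ~(boundary - 1);
}
```

With the default of 4, a frame that needs 20 bytes is rounded up to
32; with -mpreferred-stack-boundary=2 it stays at 20, which is where
the stack space saving comes from.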

-mmmx
-mno-mmx
-msse
-mno-sse
-msse2
-mno-sse2
-msse3
-mno-sse3
-mssse3
-mno-ssse3
-msse4.1
-mno-sse4.1
-msse4.2
-mno-sse4.2
-msse4
-mno-sse4
-mavx
-mno-avx
-maes
-mno-aes
-mpclmul
-mno-pclmul
-msse4a
-mno-sse4a
-msse5
-mno-sse5
-m3dnow
-mno-3dnow
-mpopcnt
-mno-popcnt
-mabm
-mno-abm
These switches enable or disable the use of instructions in the
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, SSE4A, SSE5,
ABM or 3DNow! extended instruction sets. These extensions are also
available as built-in functions: see X86 Built-in Functions, for
details of the functions enabled and disabled by these switches.

To have SSE/SSE2 instructions generated automatically from
floating-point code (as opposed to 387 instructions), see
-mfpmath=sse.

GCC suppresses SSEx instructions when -mavx is used. Instead, it
generates new AVX instructions or AVX equivalents for all SSEx
instructions when needed.

These options will enable GCC to use these extended instructions in
generated code, even without -mfpmath=sse. Applications which
perform runtime CPU detection must compile separate files for each
supported architecture, using the appropriate flags. In
particular, the file containing the CPU detection code should be
compiled without these options.
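
The layout recommended for runtime detection can be sketched in a
single file as follows. All names are hypothetical. In a real build
sum_generic and sum_sse2 would live in separate files, only the
latter compiled with -msse2, and the value passed to choose_sum
would come from detection code compiled without these options. The
last function relies on the predefined macro __SSE2__, which GCC
defines when SSE2 code generation is enabled.

```c
/* Portable fallback; belongs in a file compiled without -msse*. */
long sum_generic(const int *values, int count)
{
    long total = 0;

    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Stand-in for a variant from a file compiled with -msse2. The body
   is plain C here; the compiler flags make the difference. */
long sum_sse2(const int *values, int count)
{
    long total = 0;

    for (int i = 0; i < count; i++)
        total += values[i];
    return total;
}

typedef long (*sum_fn)(const int *, int);

/* Select an implementation from the result of CPU detection. */
sum_fn choose_sum(int cpu_has_sse2)
{
    return cpu_has_sse2 ? sum_sse2 : sum_generic;
}

/* Report whether this translation unit was built with SSE2 enabled. */
int built_with_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}
```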

-mcld
This option instructs GCC to emit a "cld" instruction in the
prologue of functions that use string instructions. String
instructions depend on the DF flag to select between autoincrement
or autodecrement mode. While the ABI specifies the DF flag to be
cleared on function entry, some operating systems violate this
specification by not clearing the DF flag in their exception
dispatchers. The exception handler can be invoked with the DF flag
set which leads to wrong direction mode, when string instructions
are used. This option can be enabled by default on 32-bit x86
targets by configuring GCC with the --enable-cld configure option.
Generation of "cld" instructions can be suppressed with the
-mno-cld compiler option in this case.

-mcx16
This option will enable GCC to use the CMPXCHG16B instruction in
generated code. CMPXCHG16B allows for atomic operations on 128-bit
double quadword (or oword) data types. This is useful for high
resolution counters that could be updated by multiple processors
(or cores). This instruction is generated as part of atomic built-
in functions: see Atomic Builtins for details.

-msahf
This option will enable GCC to use the SAHF instruction in
generated 64-bit code. Early Intel CPUs with Intel 64 lacked the
LAHF and SAHF instructions supported by AMD64 until the
introduction of the Pentium 4 G1 step in December 2005. LAHF and
SAHF are load and store instructions, respectively, for certain
status flags. In 64-bit mode, the SAHF instruction is used to
optimize the "fmod", "drem" or "remainder" built-in functions: see
Other Builtins for details.

-mrecip
This option will enable GCC to use RCPSS and RSQRTSS instructions
(and their vectorized variants RCPPS and RSQRTPS) with an
additional Newton-Raphson step to increase precision instead of
DIVSS and SQRTSS (and their vectorized variants) for single
precision floating point arguments. These instructions are
generated only when -funsafe-math-optimizations is enabled together
with -finite-math-only and -fno-trapping-math. Note that while the
throughput of the sequence is higher than the throughput of the
non-reciprocal instruction, the precision of the sequence can be
decreased by up to 2 ulp (i.e. the inverse of 1.0 equals
0.99999994).
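
The Newton-Raphson step mentioned above can be written in plain C.
This is the textbook form of the refinement, shown under the
assumption that an approximate estimate is already available (in
generated code it would come from RCPSS or RSQRTSS); the exact
instruction sequence GCC emits may be arranged differently. The
function names are hypothetical.

```c
/* One Newton-Raphson refinement of an estimate x0 of 1/a. */
float refine_reciprocal(float a, float x0)
{
    return x0 * (2.0f - a * x0);
}

/* One Newton-Raphson refinement of an estimate y0 of 1/sqrt(a). */
float refine_rsqrt(float a, float y0)
{
    return y0 * (1.5f - 0.5f * a * y0 * y0);
}

/* Absolute difference between a computed value and a reference. */
float abs_error(float value, float reference)
{
    float diff = value - reference;

    return diff < 0.0f ? -diff : diff;
}
```

Each step roughly squares the relative error of the estimate, which
is why a single step after a low-precision hardware estimate comes
close to, but does not always reach, the correctly rounded result.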

-mveclibabi=type
Specifies the ABI type to use for vectorizing intrinsics using an
external library. Supported types are "svml" for the Intel short
vector math library and "acml" for the AMD math core library style
of interfacing. GCC will currently emit calls to "vmldExp2",
"vmldLn2", "vmldLog102", "vmldPow2", "vmldTanh2", "vmldTan2",
"vmldAtan2", "vmldAtanh2", "vmldCbrt2", "vmldSinh2", "vmldSin2",
"vmldAsinh2", "vmldAsin2", "vmldCosh2", "vmldCos2", "vmldAcosh2",
"vmldAcos2", "vmlsExp4", "vmlsLn4", "vmlsLog104", "vmlsPow4",
"vmlsTanh4", "vmlsTan4", "vmlsAtan4", "vmlsAtanh4", "vmlsCbrt4",
"vmlsSinh4", "vmlsSin4", "vmlsAsinh4", "vmlsAsin4", "vmlsCosh4",
"vmlsCos4", "vmlsAcosh4" and "vmlsAcos4" for the corresponding
function type when -mveclibabi=svml is used and
"__vrd2_sin", "__vrd2_cos", "__vrd2_exp", "__vrd2_log",
"__vrd2_log2", "__vrd2_log10", "__vrs4_sinf", "__vrs4_cosf",
"__vrs4_expf", "__vrs4_logf", "__vrs4_log2f", "__vrs4_log10f" and
"__vrs4_powf" for corresponding function type when -mveclibabi=acml
is used. Both -ftree-vectorize and -funsafe-math-optimizations have
to be enabled. An SVML or ACML ABI-compatible library will have to
be specified at link time.

-mpush-args
-mno-push-args
Use PUSH operations to store outgoing parameters. This method is
shorter and usually as fast as the method using SUB/MOV operations,
and is enabled by default. In some cases disabling it may improve
performance because of improved scheduling and reduced
dependencies.

-maccumulate-outgoing-args
If enabled, the maximum amount of space required for outgoing
arguments will be computed in the function prologue. This is
faster on most modern CPUs because of reduced dependencies,
improved scheduling and reduced stack usage when preferred stack
boundary is not equal to 2. The drawback is a notable increase in
code size. This switch implies -mno-push-args.

-mthreads
Support thread-safe exception handling on Mingw32. Code that
relies on thread-safe exception handling must compile and link all
code with the -mthreads option. When compiling, -mthreads defines
-D_MT; when linking, it links in a special thread helper library
-lmingwthrd which cleans up per thread exception handling data.

-mno-align-stringops
Do not align destination of inlined string operations. This switch
reduces code size and improves performance in case the destination
is already aligned, but GCC doesn't know about it.

-minline-all-stringops
By default GCC inlines string operations only when the destination
is known to be aligned to at least a 4 byte boundary. This option
enables more inlining and increases code size, but may improve
performance of code that depends on fast memcpy, strlen and memset
for short lengths.

-minline-stringops-dynamically
For string operations of unknown size, inline runtime checks so
that inline code is used for small blocks, while a library call is
used for large blocks.

-mstringop-strategy=alg
Override the internal decision heuristic for the algorithm used to
inline string operations. The allowed values are "rep_byte",
"rep_4byte", "rep_8byte" for expanding using the i386 "rep" prefix
of the specified size, "byte_loop", "loop", "unrolled_loop" for
expanding an inline loop, and "libcall" for always expanding a
library call.

-momit-leaf-frame-pointer
Don't keep the frame pointer in a register for leaf functions.
This avoids the instructions to save, set up and restore frame
pointers and makes an extra register available in leaf functions.
The option -fomit-frame-pointer removes the frame pointer for all
functions which might make debugging harder.

-mtls-direct-seg-refs
-mno-tls-direct-seg-refs
Controls whether TLS variables may be accessed with offsets from
the TLS segment register (%gs for 32-bit, %fs for 64-bit), or
whether the thread base pointer must be added. Whether or not this
is legal depends on the operating system, and whether it maps the
segment to cover the entire TLS area.

For systems that use GNU libc, the default is on.

-mfused-madd
-mno-fused-madd
Enable automatic generation of fused floating point multiply-add
instructions if the ISA supports such instructions. The
-mfused-madd option is on by default. The fused multiply-add
instructions have a different rounding behavior compared to
executing a multiply followed by an add.
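
The rounding difference can be demonstrated without any fused
hardware instruction. The sketch below, with hypothetical function
names, compares a multiply-add that rounds the product to "float"
first with one that keeps the product exact by computing in
"double". The product of two "float" values is always exact in
"double"; the final sum is exact for the inputs used here, but in
general this emulation can round twice and is not a replacement for
a true fused instruction.

```c
/* Multiply, round the product to float, then add: two roundings. */
float multiply_add_unfused(float a, float b, float c)
{
    volatile float product = a * b; /* volatile forces the rounding */

    return product + c;
}

/* Keep the product exact in double and round once at the end, which
   mimics the behavior of a fused multiply-add for these inputs. */
float multiply_add_single_rounding(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1 + 1/4096 and c = -(1 + 1/2048), the exact product is
1 + 1/2048 + 1/16777216. Rounding it to "float" drops the last term,
so the unfused form returns 0, while the single-rounding form
returns 1/16777216.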

-msse2avx
-mno-sse2avx
Specify that the assembler should encode SSE instructions with VEX
prefix. The option -mavx turns this on by default.

These -m switches are supported in addition to the above on AMD x86-64
processors in 64-bit environments.

-m32
-m64
Generate code for a 32-bit or 64-bit environment. The 32-bit
environment sets int, long and pointer to 32 bits and generates
code that runs on any i386 system. The 64-bit environment sets int
to 32 bits and long and pointer to 64 bits and generates code for
AMD's x86-64 architecture. For darwin only the -m64 option turns
off the -fno-pic and -mdynamic-no-pic options.
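
The type sizes of the two environments can be checked from C. A
minimal sketch with hypothetical function names; ILP32 and LP64 are
the customary names for the two size combinations described above.

```c
#include <stddef.h>
#include <stdio.h>

/* Name the data model for a given combination of type sizes in
   bytes. */
const char *data_model_name(size_t int_size, size_t long_size,
                            size_t pointer_size)
{
    if (int_size == 4 && long_size == 4 && pointer_size == 4)
        return "ILP32";
    if (int_size == 4 && long_size == 8 && pointer_size == 8)
        return "LP64";
    return "other";
}

/* Print the data model of the current build. */
void report_data_model(void)
{
    printf("int=%zu long=%zu pointer=%zu: %s\n",
           sizeof(int), sizeof(long), sizeof(void *),
           data_model_name(sizeof(int), sizeof(long), sizeof(void *)));
}
```

A build with -m32 is expected to report ILP32 and a build with -m64
is expected to report LP64.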

-mno-red-zone
Do not use a so-called red zone for x86-64 code. The red zone is
mandated by the x86-64 ABI; it is a 128-byte area beyond the
location of the stack pointer that will not be modified by signal
or interrupt handlers and therefore can be used for temporary data
without adjusting the stack pointer. The flag -mno-red-zone
disables this red zone.

-mcmodel=small
Generate code for the small code model: the program and its symbols
must be linked in the lower 2 GB of the address space. Pointers
are 64 bits. Programs can be statically or dynamically linked.
This is the default code model.

-mcmodel=kernel
Generate code for the kernel code model. The kernel runs in the
negative 2 GB of the address space. This model has to be used for
Linux kernel code.

-mcmodel=medium
Generate code for the medium model: The program is linked in the
lower 2 GB of the address space. Small symbols are also placed
there. Symbols with sizes larger than -mlarge-data-threshold are
put into large data or bss sections and can be located above 2 GB.
Programs can be statically or dynamically linked.

-mcmodel=large
Generate code for the large model: This model makes no assumptions
about addresses and sizes of sections.

IA-64 Options

These are the -m options defined for the Intel IA-64 architecture.

-mbig-endian
Generate code for a big endian target. This is the default for HP-
UX.

-mlittle-endian
Generate code for a little endian target. This is the default for
AIX5 and GNU/Linux.

-mgnu-as
-mno-gnu-as
Generate (or don't) code for the GNU assembler. This is the
default.

-mgnu-ld
-mno-gnu-ld
Generate (or don't) code for the GNU linker. This is the default.

-mno-pic
Generate code that does not use a global pointer register. The
result is not position independent code, and violates the IA-64
ABI.

-mvolatile-asm-stop
-mno-volatile-asm-stop
Generate (or don't) a stop bit immediately before and after
volatile asm statements.

-mregister-names
-mno-register-names
Generate (or don't) in, loc, and out register names for the stacked
registers. This may make assembler output more readable.

-mno-sdata
-msdata
Disable (or enable) optimizations that use the small data section.
This may be useful for working around optimizer bugs.

-mconstant-gp
Generate code that uses a single constant global pointer value.
This is useful when compiling kernel code.

-mauto-pic
Generate code that is self-relocatable. This implies
-mconstant-gp. This is useful when compiling firmware code.

-minline-float-divide-min-latency
Generate code for inline divides of floating point values using the
minimum latency algorithm.

-minline-float-divide-max-throughput
Generate code for inline divides of floating point values using the
maximum throughput algorithm.

-minline-int-divide-min-latency
Generate code for inline divides of integer values using the
minimum latency algorithm.

-minline-int-divide-max-throughput
Generate code for inline divides of integer values using the
maximum throughput algorithm.

-minline-sqrt-min-latency
Generate code for inline square roots using the minimum latency
algorithm.

-minline-sqrt-max-throughput
Generate code for inline square roots using the maximum throughput
algorithm.

-mno-dwarf2-asm
-mdwarf2-asm
Don't (or do) generate assembler code for the DWARF2 line number
debugging info. This may be useful when not using the GNU
assembler.

-mearly-stop-bits
-mno-early-stop-bits
Allow stop bits to be placed earlier than immediately preceding the
instruction that triggered the stop bit. This can improve
instruction scheduling, but does not always do so.

-mfixed-range=register-range
Generate code treating the given register range as fixed registers.
A fixed register is one that the register allocator cannot use.
This is useful when compiling kernel code. A register range is
specified as two registers separated by a dash. Multiple register
ranges can be specified separated by a comma.
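
The syntax of the argument can be made concrete with a small
checker. This is an illustrative sketch, not GCC's own parser; the
function name is hypothetical, and register names are not validated,
only the dash and comma structure described above.

```c
/* Count the ranges in a -mfixed-range style specification such as
   "f12-f15,f32-f127". Returns -1 if any item is not of the form
   REGISTER-REGISTER or if the specification is empty. */
int count_register_ranges(const char *spec)
{
    int ranges = 0;
    const char *p = spec;

    while (*p != '\0') {
        int dashes = 0;
        int left = 0;  /* characters before the dash */
        int right = 0; /* characters after the dash */

        for (; *p != '\0' && *p != ','; p++) {
            if (*p == '-')
                dashes++;
            else if (dashes == 0)
                left++;
            else
                right++;
        }
        if (dashes != 1 || left == 0 || right == 0)
            return -1;
        ranges++;
        if (*p == ',') {
            p++;
            if (*p == '\0')
                return -1; /* trailing comma */
        }
    }
    return ranges > 0 ? ranges : -1;
}
```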

-mtls-size=tls-size
Specify bit size of immediate TLS offsets. Valid values are 14,
22, and 64.

-mtune=cpu-type
Tune the instruction scheduling for a particular CPU. Valid values
are itanium, itanium1, merced, itanium2, and mckinley.

-mt
-pthread
Add support for multithreading using the POSIX threads library.
This option sets flags for both the preprocessor and linker. It
does not affect the thread safety of object code produced by the
compiler or that of libraries supplied with it. These are HP-UX
specific flags.

-milp32
-mlp64
Generate code for a 32-bit or 64-bit environment. The 32-bit
environment sets int, long and pointer to 32 bits. The 64-bit
environment sets int to 32 bits and long and pointer to 64 bits.
These are HP-UX specific flags.

-mno-sched-br-data-spec
-msched-br-data-spec
(Dis/En)able data speculative scheduling before reload. This will
result in generation of the ld.a instructions and the corresponding
check instructions (ld.c / chk.a). The default is 'disable'.

-msched-ar-data-spec
-mno-sched-ar-data-spec
(En/Dis)able data speculative scheduling after reload. This will
result in generation of the ld.a instructions and the corresponding
check instructions (ld.c / chk.a). The default is 'enable'.

-mno-sched-control-spec
-msched-control-spec
(Dis/En)able control speculative scheduling. This feature is
available only during region scheduling (i.e. before reload). This
will result in generation of the ld.s instructions and the
corresponding check instructions chk.s. The default is 'disable'.

-msched-br-in-data-spec
-mno-sched-br-in-data-spec
(En/Dis)able speculative scheduling of the instructions that are
dependent on the data speculative loads before reload. This is
effective only with -msched-br-data-spec enabled. The default is
'enable'.

-msched-ar-in-data-spec
-mno-sched-ar-in-data-spec
(En/Dis)able speculative scheduling of the instructions that are
dependent on the data speculative loads after reload. This is
effective only with -msched-ar-data-spec enabled. The default is
'enable'.

-msched-in-control-spec
-mno-sched-in-control-spec
(En/Dis)able speculative scheduling of the instructions that are
dependent on the control speculative loads. This is effective only
with -msched-control-spec enabled. The default is 'enable'.

-msched-ldc
-mno-sched-ldc
(En/Dis)able use of simple data speculation checks ld.c. If
disabled, only chk.a instructions will be emitted to check data
speculative loads. The default is 'enable'.

-mno-sched-control-ldc
-msched-control-ldc
(Dis/En)able use of ld.c instructions to check control speculative
loads. If enabled, in case of control speculative load with no
speculatively scheduled dependent instructions this load will be
emitted as ld.sa and ld.c will be used to check it. The default is
'disable'.

-mno-sched-spec-verbose
-msched-spec-verbose
(Dis/En)able printing of the information about speculative motions.

-mno-sched-prefer-non-data-spec-insns
-msched-prefer-non-data-spec-insns
If enabled, data speculative instructions will be chosen for
schedule only if there are no other choices at the moment. This
will make the use of the data speculation much more conservative.
The default is 'disable'.

-mno-sched-prefer-non-control-spec-insns
-msched-prefer-non-control-spec-insns
If enabled, control speculative instructions will be chosen for
schedule only if there are no other choices at the moment. This
will make the use of the control speculation much more
conservative. The default is 'disable'.

-mno-sched-count-spec-in-critical-path
-msched-count-spec-in-critical-path
If enabled, speculative dependencies will be considered during
computation of the instruction priorities. This will make the use
of the speculation a bit more conservative. The default is
'disable'.

M32C Options

-mcpu=name
Select the CPU for which code is generated. name may be one of r8c
for the R8C/Tiny series, m16c for the M16C (up to /60) series,
m32cm for the M16C/80 series, or m32c for the M32C/80 series.

-msim
Specifies that the program will be run on the simulator. This
causes an alternate runtime library to be linked in which supports,
for example, file I/O. You must not use this option when
generating programs that will run on real hardware; you must
provide your own runtime library for whatever I/O functions are
needed.

-memregs=number
Specifies the number of memory-based pseudo-registers GCC will use
during code generation. These pseudo-registers will be used like
real registers, so there is a tradeoff between GCC's ability to fit
the code into available registers, and the performance penalty of
using memory instead of registers. Note that all modules in a
program must be compiled with the same value for this option.
Because of that, you must not use this option with the default
runtime libraries gcc builds.
