ArmEabiPort


Overview

The ARM EABI port is the default port in Debian for the ARM architecture, named armel. The old (OABI) port (named "arm") was last released with 5.0.x (Lenny). An even newer port targeted at newer hardware with another ABI ("armhf") is currently under development and is expected to ship with 7.0 (Wheezy) - see ArmHardFloatPort. See ArmPorts for more links and an overview.

In a nutshell
Terminology
GCC view
ARM floating points
1. GCC preprocessor macros for floating point
Struct packing and alignment
1. Stack alignment
2. 64-bit data type alignment
3. Enum sizes
4. System call interface
Choice of minimum CPU
1. Thumb interworking suggests armv4t
2. Other scenarios
Why a new port
Status
Build daemons
Supported hardware
1. ixp4xx
2. iop3xx
3. Marvell orion
Others
Porting to new platforms
References

In a nutshell

EABI is the new "Embedded" ABI by ARM Ltd. EABI is actually a family of ABIs, and one of the "sub-ABIs" is GNU EABI, for Linux. The effective changes for users are:

Floating point performance is much faster, with or without an FPU, and mixing soft-float and hard-float code is possible
Structure packing is not as painful as it used to be
More compatibility with various tools (in future - currently linux-elf is well supported)
A more efficient syscall convention
At present (with gcc-4.1.1) it works with ARMv4t, ARMv5t processors and above, but supporting ARMv4 (e.g., StrongARM) requires toolchain modifications. See "Thumb interworking" below.

Terminology

Strictly speaking, both the old and new ARM ABIs are subsets of the ARM EABI specification, but in everyday usage the term "EABI" is used to mean the new one described here and "OABI" or "old-ABI" to mean the old one. However, there are one or two programs that sometimes describe an old ABI binary as "EABI".

To add to the confusion, powerpc has also had an ABI called "EABI" for some time, which has nothing to do with this one.

GCC view

The new ABI is not only a new ABI field; it is also a new GCC target.

Legacy ABI

ABI flags passed to binutils: -mabi=apcs-gnu -mfpu=fpa
gcc -dumpmachine: arm-unknown-linux
objdump -x for compiled binary:

private flags = 2: [APCS-32] [FPA float format] [has entry point]

"file" on compiled Debian binary:

ELF 32-bit LSB executable, ARM, version 1 (ARM), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), for GNU/Linux 2.2.0, stripped

"readelf -h | grep Flags":

Flags: 0x0

Arm EABI:

ABI flags passed by gcc to binutils: -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
gcc -dumpmachine: arm-unknown-linux-gnueabi
objdump -x for compiled binary:

private flags = 4000002: [Version4 EABI] [has entry point]

"file" on compiled binary (under Debian):

ELF 32-bit LSB executable, ARM, version 1 (SYSV), for GNU/Linux 2.4.17, dynamically linked (uses shared libs), for GNU/Linux 2.4.17, stripped

"readelf -h | grep Flags":

Flags: 0x4000002, has entry point, Version4 EABI

Furthermore, as well as the usual __arm__ and maybe also __thumb__ symbols, the C preprocessor symbol __ARM_EABI__ is also defined when compiling into EABI, while __ARMEL__ is predefined under both the new and old ABIs.
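The preprocessor symbols mentioned above can be used to tell the two ABIs apart at compile time. The following is only a minimal sketch: the function name arm_abi_name is made up for illustration, and on a non-ARM compiler it simply falls through to the last branch.

```c
/* Report which ARM ABI this translation unit was compiled for.
 * Only the symbols named in the text (__arm__, __ARM_EABI__) are tested.
 * arm_abi_name is a hypothetical helper, not part of any library. */
const char *arm_abi_name(void)
{
#if defined(__arm__) && defined(__ARM_EABI__)
    return "ARM EABI";            /* new ABI, e.g. Debian armel */
#elif defined(__arm__)
    return "ARM old ABI";         /* legacy ABI, e.g. Debian arm */
#else
    return "not an ARM target";   /* compiled on some other architecture */
#endif
}
```

Calling arm_abi_name() from a test program built with each toolchain should make the difference between the two targets visible without inspecting the ELF flags.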

ARM floating points

The Debian arm/OABI port generates hard-float FPA instructions. FPA stands for "Floating Point Accelerator". Since the FPA floating point unit was implemented in only a few ARM cores, FPA instructions are typically emulated in the kernel via illegal-instruction faults. This is of course very inefficient: about 10 times slower than -msoft-float for a FIR test program. The FPA unit also has the peculiarity of mixed-endian doubles, which is usually the biggest grief for ARM porters, along with structure packing issues.

ARM has now introduced a new floating point unit, VFP (Vector Floating Point), which uses a different instruction set from FPA and stores floats in natural-endian IEEE-754 format. VFP is implemented in some new ARM9/10/11 and Cortex-A cores, such as the new TI OMAP2 family. It seems likely that ARM cores without VFP will remain popular, since in many places where ARM is used floating point is unnecessary.

To complicate things further, ARM processors are being integrated with many other FPUs and DSPs, each of which adds its own set of instructions to the ARM set:

Cirrus Logic's EP93XX series integrates an ARM920T core with a MaverickCrunch FPU. This also uses IEEE-754, though it uses a different instruction set from VFP. Current ARM-Debian users cannot use their Maverick FPUs at all except by programming in assembler or using an alternative compiler. GCC has flags to generate Maverick FP instructions (-mfpu=maverick), but the .o files cannot be linked with the standard Debian GCC startup files or libraries.
Intel's iWMMXt unit is used in their PXA270 processor with an XScale main core. This adds integer SIMD and some other instructions, but there is currently no iWMMXt processor with hardware floating point capabilities. iWMMXt processors are incompatible with FPA due to opcode overlap, while they could have a VFP coprocessor in principle. That said, iWMMXt instructions should make soft-float fairly quick anyway. Again, GCC support exists (-march=iwmmxt) for this but is also currently unusable within standard Debian.
Texas Instruments' OMAP, OMAP2, DaVinci DM644x series and numerous other products integrate an ARM9/ARM11 core with their own DSP core for multimedia acceleration and/or telecommunication signal processing, though it does fixed-point math and its DSP code is completely separate from the ARM code. In Linux, DSP Gateway or proprietary solutions are used to load code for execution on the c55x/c6xx and provide a way for ARM and DSP code to communicate.

For a general-purpose distribution like Debian, targeting binary compatibility, EABI lets us have our cake and eat it. We can make a soft-float distribution using IEEE-754 with FPU-specific versions of packages (linux-kernel-2.6.x-vfp, libc6-iwmmxt, mediaplayer-maverick, etc) where needed. This also enables individual packages to do runtime FPU detection and call code compiled for different FP scenarios (in liboil for example).
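As a rough illustration of runtime FPU detection, the sketch below picks a code variant from a list of CPU feature names. It assumes the caller has already read the value part of the "Features" line from /proc/cpuinfo into a string, and that the kernel names the units "vfp" and "iwmmxt" there. The function names has_cpu_feature and pick_fp_variant are hypothetical; liboil's real mechanism may differ.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `flag` occurs in `features` as a whole whitespace-separated
 * token, so that "vfp" does not match inside "vfpv3". */
int has_cpu_feature(const char *features, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = features;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == features) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts && ends)
            return 1;
        p += len;   /* keep searching after this partial match */
    }
    return 0;
}

/* Choose which build of an FP-critical routine to call (illustrative only). */
const char *pick_fp_variant(const char *features)
{
    if (has_cpu_feature(features, "vfp"))
        return "vfp";
    if (has_cpu_feature(features, "iwmmxt"))
        return "iwmmxt";
    return "soft-float";   /* safe default that runs everywhere */
}
```

A package could map the returned name to a function pointer once at start-up, so that the soft-float build remains the fallback on CPUs without any of the optional units.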

The major FP variants worth supporting as alternative versions of FP-critical packages seem to be:

the current arm arch supporting armv3 with or without FPA and armv4 processors.
EABI for generic ARM (>= armv4t) using IEEE soft-float
EABI for lowest common denominator VFP (there is now more than one "extended" VFP variant)
EABI for MaverickCrunch FPU
EABI for iWMMXt using iWMMXt-specific soft-float

GCC preprocessor macros for floating point

When porting code to ARM EABI, the following preprocessor macros are interesting:

__VFP_FP__ means that the floating point format in use is that of the ARM VFP unit, which is native-endian IEEE-754.
__MAVERICK__ means that the floating point format is that of the Cirrus Logic MaverickCrunch, which is also IEEE-754 and is always little-endian.
__SOFTFP__ means that instead of floating point instructions, library calls are being generated for floating point math operations so that the code will run on a processor without an FPU.

__VFP_FP__ and __MAVERICK__ are mutually exclusive. If neither is set, that means the floating point format in use is the old mixed-endian 45670123 format of the FPA unit.
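To make the mixed-endian format concrete, here is a small sketch that converts an 8-byte double image between the two layouts. It assumes the digits in "45670123" name the byte positions of the little-endian IEEE-754 layout (01234567), so that the conversion amounts to swapping the two 32-bit words. The function name swap_double_words is made up for illustration.

```c
#include <string.h>

/* Swap the two 32-bit halves of an 8-byte double image in place.
 * Applied to a little-endian IEEE-754 image (bytes 0..7) this yields
 * the order 4,5,6,7,0,1,2,3; applying it again restores the original. */
void swap_double_words(unsigned char image[8])
{
    unsigned char tmp[4];

    memcpy(tmp, image, 4);        /* save the first word */
    memcpy(image, image + 4, 4);  /* move the second word to the front */
    memcpy(image + 4, tmp, 4);    /* put the saved word at the back */
}
```

Code that reads binary files written by old-ABI programs would need a conversion along these lines for every stored double.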

Note that __VFP_FP__ does not mean that VFP code generation has been selected. It only speaks of the floating point data format in use and is normally set when soft-float has been selected. The correct test for VFP code generation, for example around asm fragments containing VFP instructions, is

#if (defined(__VFP_FP__) && !defined(__SOFTFP__))

Paradoxically, -mfloat-abi=softfp does not set the __SOFTFP__ macro, since it selects real floating point instructions while using the soft-float ABI at function-call interfaces.

By default in Debian armel, both __VFP_FP__ and __SOFTFP__ are defined.
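The macro rules in this section can be collected into two small helpers. This is a sketch only: the function names fp_data_format and vfp_codegen_enabled are hypothetical, and on a non-ARM compiler neither format macro is expected to be defined.

```c
/* Describe the floating point data format, following the rules above. */
const char *fp_data_format(void)
{
#if defined(__VFP_FP__)
    return "VFP format (native-endian IEEE-754)";
#elif defined(__MAVERICK__)
    return "MaverickCrunch format (IEEE-754)";
#elif defined(__arm__)
    return "FPA format (mixed-endian doubles)";
#else
    return "unknown (not an ARM target)";
#endif
}

/* Return 1 only when VFP instructions are really being generated,
 * i.e. the VFP format is in use and soft-float library calls are not. */
int vfp_codegen_enabled(void)
{
#if defined(__VFP_FP__) && !defined(__SOFTFP__)
    return 1;
#else
    return 0;
#endif
}
```

With the Debian armel defaults described above, vfp_codegen_enabled() would be expected to return 0 even though fp_data_format() reports the VFP format.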

Struct packing and alignment

With the new ABI, default structure packing changes, as do some default data sizes and alignments (which also have a knock-on effect on structure packing). In particular, under the old ABI the minimum size and alignment of a structure was 4 bytes. Under the EABI there is no minimum, and the alignment is determined by the types of the components the structure contains. This will break programs that know too much about the way structures are packed, and can break code that writes binary files by dumping and reading structures.
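One way to avoid depending on structure layout is to encode each field explicitly instead of dumping the structure. The sketch below shows the idea for a made-up two-field record; the structure, the 5-byte little-endian wire format and the function names are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; its in-memory size depends on the ABI's padding. */
struct record {
    uint8_t  kind;
    uint32_t length;
};

#define RECORD_WIRE_SIZE 5   /* 1 byte kind + 4 bytes length, no padding */

/* Write the record field by field, little-endian, independent of padding. */
size_t record_encode(const struct record *r, unsigned char out[RECORD_WIRE_SIZE])
{
    out[0] = r->kind;
    out[1] = (unsigned char)(r->length & 0xff);
    out[2] = (unsigned char)((r->length >> 8) & 0xff);
    out[3] = (unsigned char)((r->length >> 16) & 0xff);
    out[4] = (unsigned char)((r->length >> 24) & 0xff);
    return RECORD_WIRE_SIZE;
}

/* Read the record back from the same fixed wire format. */
void record_decode(const unsigned char in[RECORD_WIRE_SIZE], struct record *r)
{
    r->kind = in[0];
    r->length = (uint32_t)in[1] | ((uint32_t)in[2] << 8) |
                ((uint32_t)in[3] << 16) | ((uint32_t)in[4] << 24);
}
```

Files written this way should read back identically whether the program was built for the old ABI or for EABI, because sizeof(struct record) never enters the file format.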

Stack alignment

The ARM EABI requires 8-byte stack alignment at public function entry points, compared to the previous 4-byte alignment.

64-bit data type alignment

"One of the key differences between the traditional GNU/Linux ABI and the EABI is that 64-bit types (like long long) are aligned differently. In the traditional ABI, these types had 4-byte alignment; in the EABI they have 8-byte alignment. As a result, if you use the same structure definitions (in a header file) and include it in code used in both the kernel and in application code, you may find that the structure size and alignment differ."

-- from the CodeSourcery ARM GNU toolchain FAQ
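The effect on structure layout can be worked through with a little arithmetic. The sketch below computes the layout of struct { int a; long long b; } for a given long long alignment, assuming a 4-byte int and an 8-byte long long. The function names are made up for illustration.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Layout of: struct { int a; long long b; }
 * ll_align is the alignment of long long: 4 under the traditional ABI,
 * 8 under the EABI. Assumes sizeof(int) == 4 and sizeof(long long) == 8. */
void layout_int_then_ll(size_t ll_align, size_t *offset_b, size_t *total)
{
    size_t struct_align = ll_align > 4 ? ll_align : 4;

    *offset_b = round_up(4, ll_align);              /* b follows the 4-byte a */
    *total = round_up(*offset_b + 8, struct_align); /* pad to struct alignment */
}
```

With ll_align set to 4 the member b lands at offset 4 and the structure is 12 bytes; with ll_align set to 8 it lands at offset 8 and the structure is 16 bytes, which is the size mismatch the FAQ warns about.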

Enum sizes

The EABI defines an optional system for controlling the size of C enumerated types. For arm-linux it was decided to keep the existing behaviour (enums are at least the same size as an int) for consistency with other Linux systems.

This is also reflected in the -mabi=aapcs or -mabi=aapcs-linux switches to GCC: aapcs defines enums to be a variable sized type, while with aapcs-linux they are always ints (4 bytes).
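Code that depends on enum size can check the assumption directly. A minimal sketch, with a made-up enum and function name; it only reports what the compiler in use actually chose.

```c
/* A small enum whose values would fit in a single byte. */
enum small_range { SMALL_A, SMALL_B, SMALL_C };

/* Return 1 if the compiler gave this enum the size of an int, as the
 * text says is the case for arm-linux (-mabi=aapcs-linux), or 0 if a
 * smaller or larger type was chosen. */
int enum_is_int_sized(void)
{
    return sizeof(enum small_range) == sizeof(int);
}
```

A build that unexpectedly used variable-sized enums would make this return 0, which is a cheap sanity check for code sharing structures with other Linux systems.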

System call interface

Two things change in the system call interface: alignment of 64-bit parameters passed in registers and the way the system call number itself is passed.

With EABI, 64-bit function parameters passed in registers are aligned to an even-numbered register instead of using the next available pair.

Here's an explanation from Russell King, 12 Jan 2006:

We have r0 to r6 to pass 32-bit or 64-bit arguments. With EABI,
64-bit arguments will be aligned to an _even_ numbered register.
Hence:

    long sys_foo(int a, long long b, int c, long long d);

will result in the following register layouts:

        EABI                        Current
r0      a                           a
r1      unused                      \_ b
r2      \_ b                        /
r3      /                           c
r4      c                           \_ d
r5                                  /
r6      ... out of space for 'd'    ... room for one other int.
r7      syscall number
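The register assignment rule can be simulated in a few lines. The sketch below is only a model of the rule as described, not kernel or compiler code; the function name assign_arg_regs and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Assign argument registers starting at r0.
 * words[i] is 1 for a 32-bit argument or 2 for a 64-bit argument.
 * If align_64bit is non-zero (EABI), a 64-bit argument must start in an
 * even-numbered register; otherwise (old ABI) it takes the next free pair.
 * first_reg[i] receives the register number where argument i starts.
 * Returns the number of the next free register. */
int assign_arg_regs(const int *words, size_t n, int align_64bit, int *first_reg)
{
    int next = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (align_64bit && words[i] == 2 && (next & 1))
            next++;              /* skip the odd register */
        first_reg[i] = next;
        next += words[i];
    }
    return next;
}
```

For sys_foo(int a, long long b, int c, long long d) the old rule places d in r4/r5 and leaves r6 free, while the EABI rule would need r6/r7 for d, which is why it runs out of argument registers.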

Since this already causes an incompatible change in the system call interface, the opportunity has been taken to slip in a more efficient, totally incompatible way of doing system calls: instead of using the swi __NR_SYSCALL_BASE(==0x900000)+N instruction, where N is the number of the system call, swi 0 is always used with the system call number stashed in register r7. This is more efficient because the kernel no longer has to go and fish N out of the instruction stream(*), which used to have a negative impact on the efficiency of processors with separate text and data caches (i.e. most ARMs).
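The difference between the two conventions can be seen in the instruction word itself. The sketch below builds and decodes the old-style swi instruction, assuming the standard ARM encoding in which swi with the "always" condition is 0xEF000000 plus a 24-bit immediate. The function and macro names are hypothetical, and the syscall number 20 used in the note that follows is just an example value.

```c
#include <stdint.h>

#define SWI_ALWAYS        0xEF000000u  /* swi, condition code "always" */
#define OABI_SYSCALL_BASE 0x900000u    /* __NR_SYSCALL_BASE from the text */

/* Old ABI: the syscall number is embedded in the instruction. */
uint32_t oabi_swi_insn(uint32_t nr)
{
    return SWI_ALWAYS | (OABI_SYSCALL_BASE + nr);
}

/* What an old-ABI kernel has to do: read the instruction back from
 * memory and extract the number from its 24-bit immediate field. */
uint32_t oabi_syscall_from_insn(uint32_t insn)
{
    return (insn & 0x00FFFFFFu) - OABI_SYSCALL_BASE;
}

/* EABI: the instruction is always swi 0; the number travels in r7. */
uint32_t eabi_swi_insn(void)
{
    return SWI_ALWAYS;
}
```

For syscall number 20, oabi_swi_insn gives 0xEF900014, whereas the EABI instruction is 0xEF000000 regardless of the call being made, so the kernel never needs to load the instruction as data.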

Fortunately, the two schemes can coexist, and EABI kernels have an option to support the old syscall interface (including old structure layout rules) for running old-ABI binaries. However, some features (e.g., ALSA, MD (RAID) and system calls from Thumb mode) do not work correctly from old-ABI binaries.

Some third party EABI toolchains (e.g., CodeSourcery 2005q3) use the old kernel interface via userspace shims in glibc. This is now obsolete and no longer supported by glibc.

(*) This is only true if the old-ABI compatibility option is disabled.

See this article for more details.

Choice of minimum CPU

Thumb interworking suggests armv4t

The EABI includes thumb interworking, which means that 16-bit Thumb and normal 32-bit ARM instructions can be mixed at function-level granularity.

Thumb interworking is mandatory according to the ARM EABI spec and requires every return and indirect function call to execute a BX instruction to set the core to the correct state, which is only present in armv4t cores and above. Gcc, too, only supports thumb interworking for armv4t and above.

So Debian armel runs on a minimum CPU of ARMv4t and by default the Debian armel GCC generates code for armv4t (rather than the usual default ARM target of armv5t).

Other scenarios

However, a lower entry-level CPU is possible using different function return sequences, which vary in speed and which work and/or allow Thumb interworking on different selections of the ARM architectures.

0. mov pc,lr

This is what GCC currently emits for -march=armv4. It works on ARMv4 and above but is only Thumb interworking-safe from ARMv7.

1. bx lr

This is what GCC emits for -march=armv4t. It works on ARMv4t and above, and Thumb interworking is possible on ARMv4t and above. It excludes armv4 (the StrongARM cores, which are very common) and some armv5 users, though armv5 without the t seems a rare processor. GCC needs modifying to implement any of the following choices.

2. tst lr, #1; moveq pc, lr; bx lr

This was suggested by Paul Brook as an alternative to a bare BX. It works on ARMv4 and above, and Thumb interworking is possible on ARMv4t and above, at the extra cost of two instructions per indirect call or function return, which is in line with the run-on-minimum-hardware Debian way.

Here is a patch by Richard K. Pixley which illustrates what is needed, but which had not been tested as of April 2007: http://lists.debian.org/debian-arm/2007/05/msg00015.html

This is problematic because hand written assembly has to be manually fixed.
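The logic of the three-instruction sequence in (2) can be modelled in C. This is only a sketch of the decision the sequence makes, relying on the interworking convention that bit 0 of the return address is set when the caller is Thumb code; the enum and function names are made up.

```c
#include <stdint.h>

enum return_path {
    RETURN_VIA_MOV_PC,  /* moveq pc, lr: taken for ARM-state callers */
    RETURN_VIA_BX       /* bx lr: only reached for Thumb-state callers */
};

/* Model of: tst lr, #1 ; moveq pc, lr ; bx lr */
enum return_path select_return_path(uint32_t lr)
{
    if ((lr & 1u) == 0)
        return RETURN_VIA_MOV_PC;  /* bit 0 clear: BX is never executed */
    return RETURN_VIA_BX;          /* bit 0 set: switch to Thumb state */
}
```

On an ARMv4 core without Thumb there are no Thumb callers, so bit 0 is always clear and the unsupported bx instruction is never reached.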

3. ldm/ldr

Works on ARMv4 and above but Thumb interworking is only possible on ARMv5t and above, excluding ARMv4t users from using Thumb code with Debian. Gcc currently emits this for non-leaf functions on ARMv4 and ARMv5 (but not ARMv4t, where it uses BX, the only way to do interworking on v4t). Although a single instruction, this method may be slower than the three-instruction sequence because of the memory accesses it requires.

4. Drop Thumb interworking

A final option would be simply to compile the standard Debian repo --with-arch=armv4 --with-no-thumb-interwork. This would work on all processors without the dangers inherent in modifying GCC and, according to the GCC manual page, saves a slight size and speed overhead caused by being thumb-interworkable.

There is significant discussion of the technical merits of these various schemes in the debian-arm mailing list thread "Re: ARM EABI port: minimum CPU choice", of which the above is a partial summary.

5. Compile for armv3

tat says that simply compiling for armv3 would work, though the code would be relatively slow on the most common, later CPUs. armv3 hardware is fairly rare (for example, the Psion 5).

6. Kernel emulation traps

It may be possible to catch, in the kernel, the illegal instruction faults generated by the missing "BX" instruction, in the same way that missing hard floating point instructions can be emulated. It wouldn't be that fast on armv4 hardware (causing a context switch per function call/return), but such a kernel hack would allow the current repository to be used unmodified on armv4 hardware.

7. Linker fixups

The EABI provides mechanisms (R_ARM_V4BX relocations) for the linker to fix up bx instructions. Currently the linker only knows how to convert these to mov pc instructions, so you have to choose between armv4 or interworking at static link time. However, the linker could be taught how to convert these into branches to a tst;moveq;bx stub. This has the advantage of also working for hand-written assembly. It may be desirable to get the compiler to also generate this triplet inline for performance reasons.

This is implemented in recent binutils. Code should be compiled with -march=armv4, with --fix-v4bx-interworking passed to the linker and --fix-v4bx passed to the assembler. A gcc patch may also need backporting to avoid an earlier sanity check.

If you pass --fix-v4bx to the linker it will generate plain v4 binaries, which are not interworking safe, so should not be used on later cores (which may have Thumb libraries). --fix-v4bx-interworking generates code that works on armv4, and is also interworking safe on later hardware.
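The plain --fix-v4bx rewrite can be illustrated at the level of instruction words. The sketch below assumes the standard ARM encodings (bx rN is 0x012FFF10 plus the register number, mov pc, rN is 0x01A0F000 plus the register number, each with the condition code in the top four bits). The function name fix_v4bx is made up and this is a model of the transformation, not the binutils source.

```c
#include <stdint.h>

/* Rewrite "bx rN" into "mov pc, rN", keeping the condition code and the
 * register number. Any other instruction is returned unchanged. */
uint32_t fix_v4bx(uint32_t insn)
{
    if ((insn & 0x0FFFFFF0u) != 0x012FFF10u)
        return insn;                            /* not a bx instruction */
    return (insn & 0xF000000Fu) | 0x01A0F000u;  /* cond + mov pc, rN */
}
```

For example, bx lr (0xE12FFF1E) becomes mov pc, lr (0xE1A0F00E), which runs on plain ARMv4 but can no longer return to a Thumb caller.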

The limitation of this option is that any bx instruction will clobber the condition codes. The ABI specifies that condition codes are normally call clobbered, so normally this is not a problem. AFAIK there is only one exception. The libgcc cfcmp* routines (gcc/config/arm/ieee-{sf,df}.S) need to preserve the C and Z flags.

The linker fixup does introduce some additional overhead, so it may be desirable to also implement (2). Care should be taken to avoid double fixups.

Why a new port

In Debian, we want to ensure complete binary compatibility. Since the old ABI is not compatible with the new one, we can't allow packages built with the old ABI to link against new-ABI libs, or vice versa.

Status

Armel (EABI) was released with Lenny as it was in good shape by then. That release thus contained both arm and armel. Arm was dropped in Squeeze.

Build daemons

See ArmEabiBuildd

Supported hardware

Debian/armel can run on ARMv4t and any newer system for which a Linux kernel exists and which has enough memory (32MB+) and storage space (1GB+). The official Debian kernel supports only a tiny subset of these systems.

ixp4xx

ixp4xx is a low-end XScale-based ARM network processor from Intel. It includes Ethernet and USB on the same chip, allowing manufacturers to create cheap, if somewhat slow, network-attached Linux systems.

Linksys NSLU-2, nicknamed "slug", the most popular Debian/ARM device (install instructions)
Freecom FSG-3

A number of other ixp4xx platforms are supported in the Debian kernel, but their status and install instructions are unknown at the moment.

iop3xx

The IOP 3xx series is a 400-600 MHz XScale-based ARM core from Intel. The IOP series of chips is a particularly interesting platform to run Debian on, since it has a focus on storage and therefore usually offers SATA and plenty of memory, unlike most embedded systems.

Thecus N2100 (install guide)
IO-Data GLAN Tank (install guide)
Intel boards: EM7210 (SS4000E NAS), IQ31244, IQ80321, EP80219. Support is included in the kernel; status unknown.

Other NAS devices based on Intel IOP3xx chips exist, such as the Thecus N4100, but their kernel support has not been included in mainline (Linus's kernel) yet.

Marvell orion

Orion is a system on a chip (SoC) from Marvell that integrates an ARM CPU, Ethernet, SATA, USB, and other functionality in one chip. There are many Network Attached Storage (NAS) devices on the market that are based on an Orion chip. They make nice home servers that are fairly powerful, pretty quiet and don't consume too much electricity. Orion based devices are a great platform to run Debian and together with Debian they can be a very powerful environment.

Qnap TS-109, TS-209, TS-409. Install instructions
HP mv2120 Install instructions
Buffalo Kurobox Linkstation pro/live
D-link DNS323 (kernel only)

A number of other Orion platforms are due to be included in mainline Linux soon.

Others

These platforms are not officially supported (at least not yet). However, instructions and user communities exist.

Sharp Zaurus (pxa2xx): install instructions
Openmoko Freerunner (Samsung ARM core): install instructions
Nokia N800/N810 tablets (TI omap2): install instructions
Nokia N900 tablet / mobile phone: work in progress, see the Nokia N900 Packaging Team
Texas Instruments Beagleboard (TI omap3): install instructions
Toshiba AC100 Netbook: work in progress, see this thread on the Debian ARM mailing list and especially this mail with installation instructions

Porting to new platforms

Unlike x86, each and every ARM platform boots in a slightly different way. Thus, most of the work of getting Debian running involves dealing with the bootloader and kernel, which is not really Debian-specific work. After that, people can start porting debian-installer to the system in question.

References

Definition of EABI: ARM Architecture Procedure Call Standard plus CodeSourcery's ARM GNU/Linux Application Binary Interface Supplement
ELF for the ARM Architecture (PDF by ARM Ltd)
GNU Toolchain for ARM FAQ by CodeSourcery


Category: Linux Kernel
Published: 2023-10-07 17:10:18
Link: https://www.kaopuke.com/article/k-p-k_14_uzokfx_13__23__6_3.html
