How to Convert C++ to ARM Assembly
- Use the GCC Compiler to Convert C++ to ARM Assembly
- Create a MOD (Assembly-Time Modulus) Function to Convert C++ to ARM Assembly
- Use the arm-linux-gnueabi-gcc Command to Convert C++ to ARM Assembly
- Use the armclang Command in ARM Compiler for Linux to Convert C++ to ARM Assembly
- Use the __asm Keyword to Convert C++ to ARM Assembly
Interfacing C++ with ARM assembly is useful in many ways, and it is a straightforward process: it lets C++ code access functions and variables defined in assembly language, and vice versa. This tutorial will teach you how to convert C++ code or functions to ARM assembly.
Programmers can link separate assembly code modules with C++-compiled modules, use assembly variables and inline assembly embedded in C++, or modify the assembly code that the compiler produces.
Before converting C++ to ARM assembly, make sure the following conditions hold:
- Any dedicated registers modified by a function are preserved.
- Interrupt routines save all the registers they use.
- Functions return values correctly according to their C++ declaration.
- No assembly module uses the .cinit section.
- The compiler is allowed to assign link names to all external objects.
- Every object and function that is accessed or called from C++ is declared in the assembly module with the .def or .global directive.
In the C++ file, give every function that is called from assembly language C linkage by prototyping it as extern "C". Define variables that are shared with assembly in the .bss section, or assign them a linker symbol, so that you can later identify which ones require conversion.
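As a minimal sketch of the C linkage described above, the following C++ fragment defines a function and a variable with unmangled names, so that an assembly module could refer to them by name. The names sample_gain and scale_sample are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical names for illustration. With C linkage the symbols are emitted
// as plain "sample_gain" and "scale_sample" instead of C++-mangled names,
// so an assembly module can reference them directly.
extern "C" {

// A global variable that assembly code could load or store by name.
std::int32_t sample_gain = 3;

// A function that assembly code could call by name.
std::int32_t scale_sample(std::int32_t sample) {
  return sample * sample_gain;
}

}  // extern "C"
```

On the assembly side, these symbols would still need to be declared with the directive your assembler expects before they can be used.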
Use the GCC Compiler to Convert C++ to ARM Assembly
GCC can emit the intermediate outputs it produces while compiling C++ code. The -S option stops after the compilation stage, before the code is sent to the assembler, and writes the generated assembly to a file.
Its syntax is gcc -S your_program.cpp, so you can convert a simple C++ program with a single command. Although this is one of the simplest approaches, its output is verbose and can be hard to understand, even for intermediate-level programmers. Note that gcc emits assembly for the target it was built for, so the result is ARM assembly only on an ARM machine or with an ARM cross-compiler.
GNN.cpp file:
#include <iostream>
using namespace std;
int main() {
  int i, u, div;
  i = 2;
  u = 10;
  div = i / u;  // integer division: 2 / 10 yields 0
  cout << "Answer: " << div << endl;
}
Run this command on GCC in Microsoft Windows:
gcc -S GNN.cpp
Within a C++ program, you can use a single asm statement or a series of asm statements to insert lines of assembly code into the assembly file that the compiler creates. A series of asm statements places sequential lines of assembly code into the compiler output with no intervening code.
However, take care to maintain the C++ environment, because the compiler does not check or analyze the inserted instructions. Avoid inserting labels or jumps into C++ code, as they may produce unpredictable results by confusing the register-tracking algorithms that the code generator uses.
Furthermore, asm statements are not a good place for assembler directives that change the assembly environment. When you compile with the symdebug:dwarf or -g option, do not change the assembly environment or create assembly macros in C++ code, because doing so can affect the debug information of the C++ environment.
Create a MOD (Assembly-Time Modulus) Function to Convert C++ to ARM Assembly
ARM assembly has no MOD instruction, so you can build a modulus routine from repeated subs (subtract) instructions and use it when converting C++ to ARM assembly. To read a variable, first load its memory address with ldr reg, =var, then load the value stored at that address with a second ldr through the same register, for example ldr r0, =carry followed by ldr r0, [r0].
Where the core supports hardware division, prefer sdiv (or udiv for unsigned values), because it is much faster than a subtract loop except for very small inputs, where the loop only runs once or twice.
Concept:
; Precondition:  R0 % R1 is the required computation
; Postcondition: R0 has the result of R0 % R1
;                R2 has R0 / R1
; Example comments for 10 % 7
UDIV R2, R0, R1      ; 1 <- 10 / 7        ; R2 <- R0 / R1
MLS  R0, R1, R2, R0  ; 3 <- 10 - (7 * 1)  ; R0 <- R0 - (R1 * R2)
#include <iostream>
using namespace std;
int main() {
  int R0 = 10;  // dividend
  int R1 = 7;   // divisor
  int Sol1, Sol2;
  Sol1 = R0 / R1;           // UDIV: quotient  <- R0 / R1
  Sol2 = R0 - (R1 * Sol1);  // MLS:  remainder <- R0 - (R1 * quotient)
  cout << Sol1 << endl;
  cout << Sol2;
}
Output:
1
3
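The two approaches mentioned in this section, a subtract loop and a divide followed by multiply-subtract, can be sketched side by side in C++. The function names are hypothetical, and both functions assume a non-zero divisor.

```cpp
#include <cstdint>

// Remainder by repeated subtraction, mirroring a subs-and-branch loop.
// Assumes divisor != 0; slow when dividend is much larger than divisor.
std::uint32_t mod_by_subtraction(std::uint32_t dividend, std::uint32_t divisor) {
  while (dividend >= divisor) {
    dividend -= divisor;
  }
  return dividend;
}

// Remainder by divide, multiply, subtract, mirroring the UDIV + MLS pair.
// Assumes divisor != 0.
std::uint32_t mod_by_division(std::uint32_t dividend, std::uint32_t divisor) {
  std::uint32_t quotient = dividend / divisor;  // UDIV R2, R0, R1
  return dividend - divisor * quotient;         // MLS  R0, R1, R2, R0
}
```

For 10 % 7 the loop runs once, which is the small-input case where the subtract loop is competitive; for large quotients the division-based version does a fixed amount of work instead.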
Use the arm-linux-gnueabi-gcc Command to Convert C++ to ARM Assembly
The arm-linux-gnueabi-gcc command is a cross-compiler that converts C++ to ARM assembly on x86 and x64 machines. The native gcc on those machines has no ARM target, so you cannot use it for this purpose; the regular gcc produces ARM assembly only when you are on an ARM system.
In the complete command arm-linux-gnueabi-gcc -S -O2 -march=armv8-a GNN.cpp, the -S option tells gcc to output assembly, and -O2 enables optimization, which also reduces clutter in the result. The -O2 option is optional. The -march=armv8-a option tells the compiler to use the ARMv8-A target while compiling; without it, the compiler uses the default architecture it was configured with.
You can change the ARM target by choosing a different ARMv8 variant, including armv8-a, armv8.1-a to armv8.6-a, armv8-m.base, armv8-m.main, and armv8.1-m.main. Each one is slightly different, so compare them and select the one that matches your hardware.
The GNN.cpp argument in the command names the file to compile. If you do not specify an output file with an option such as -o output.asm, the assembly is written to a file with the same base name, GNN.s.
The arm-linux-gnueabi-gcc command is an alternative to compiling on an ARM machine, where the regular gcc already produces ARM assembly.
GCC lets programmers specify the target architecture with -march=xxx. On systems that use apt, you also need to identify the cross-compiler package that matches your target in order to install the right toolchain.
GNN.cpp file:
#include <iostream>
using namespace std;
// Recursively computes x raised to the power y (returns 0 for negative y).
int power(int x, int y) {
  if (x == 0) {
    return 0;
  } else if (y < 0) {
    return 0;
  } else if (y == 0) {
    return 1;
  } else {
    return x * power(x, y - 1);
  }
}
int main() {
  int x, y, sum;
  x = 2;
  y = 10;
  sum = power(x, y);
  cout << sum;
}
arm-linux-gnueabi-gcc -S -O2 -march=armv8-a GNN.cpp
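Output (the listing below is x86-64 assembly in Intel syntax, as an x86 build of GCC produces without optimization; the ARM cross-compiler emits ARM instructions and registers instead):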
power(int, int):
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
cmp DWORD PTR [rbp-4], 0
jne .L2
mov eax, 0
jmp .L3
.L2:
cmp DWORD PTR [rbp-8], 0
jns .L4
mov eax, 0
jmp .L3
.L4:
cmp DWORD PTR [rbp-8], 0
jne .L5
mov eax, 1
jmp .L3
.L5:
mov eax, DWORD PTR [rbp-8]
lea edx, [rax-1]
mov eax, DWORD PTR [rbp-4]
mov esi, edx
mov edi, eax
call power(int, int)
imul eax, DWORD PTR [rbp-4]
.L3:
leave
ret
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 2
mov DWORD PTR [rbp-8], 10
mov edx, DWORD PTR [rbp-8]
mov eax, DWORD PTR [rbp-4]
mov esi, edx
mov edi, eax
call power(int, int)
mov DWORD PTR [rbp-12], eax
mov eax, DWORD PTR [rbp-12]
mov esi, eax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
leave
ret
__static_initialization_and_destruction_0(int, int):
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
cmp DWORD PTR [rbp-4], 1
jne .L10
cmp DWORD PTR [rbp-8], 65535
jne .L10
mov edi, OFFSET FLAT:_ZStL8__ioinit
call std::ios_base::Init::Init() [complete object constructor]
mov edx, OFFSET FLAT:__dso_handle
mov esi, OFFSET FLAT:_ZStL8__ioinit
mov edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
call __cxa_atexit
.L10:
nop
leave
ret
_GLOBAL__sub_I_power(int, int):
push rbp
mov rbp, rsp
mov esi, 65535
mov edi, 1
call __static_initialization_and_destruction_0(int, int)
pop rbp
ret
Alternatively, you can use the ARM Compiler for Linux. Load the module for the compiler by running module load arm<major-version>/<package-version>, where <package-version> is <major-version>.<minor-version>{.<patch-version>}, for example: module load arm21/21.0.
The armclang -S <source>.c command then compiles your source and produces assembly code, where -S requests assembly output and <source>.s is the file that will contain the converted code.
Use the armclang Command in ARM Compiler for Linux to Convert C++ to ARM Assembly
You can produce annotated assembly code using the ARM C++ compiler, which is the first step to learning how the compiler vectorizes loops. An ARM Compiler for Linux installation is a prerequisite for generating the assembly code from C++.
First, load the module for the ARM compiler by running module load arm<major-version>/<package-version>, where <package-version> is <major-version>.<minor-version>{.<patch-version>}, for example: module load arm21/21.0.
Then compile your source code using the armclang -S <source>.cpp command, replacing <source>.cpp with the name of your source file.
At suitable optimization levels, the ARM compiler can use SIMD (Single Instruction Multiple Data) instructions and registers to vectorize the code, so its output can look quite different from scalar assembly.
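As a rough illustration of the kind of loop a compiler may vectorize, here is a sketch of an element-wise array subtraction. The function name subtract_elementwise is hypothetical, and whether SIMD instructions are actually emitted depends on the compiler, the optimization level, and the target.

```cpp
#include <cstddef>

// Hypothetical helper: computes c[i] = a[i] - b[i] for n elements.
// Each iteration is independent and walks contiguous memory, which is the
// usual shape of a loop that an optimizing compiler can turn into SIMD code.
void subtract_elementwise(const int* a, const int* b, int* c, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    c[i] = a[i] - b[i];
  }
}
```

Compiling a file that contains this function with -S at different optimization levels and comparing the resulting .s files is one way to see when vector instructions start to appear.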
GNN.cpp file:
#include <iostream>
using namespace std;
void subtract_arrays(int a, int b, int c) {
  int sum = 0;  // initialize before accumulating; reading it uninitialized is undefined behavior
  for (int i = 0; i < 5; i++) {
    a = (b + c) - i;
    sum = sum + a;
  }
  cout << sum;
}
int main() {
  int a = 1;
  int b = 2;
  int c = 3;
  subtract_arrays(a, b, c);
}
armclang -O1 -S -o source_O1.s GNN.cpp
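Output (the listing below is x86-64 assembly in Intel syntax; on an ARM target, armclang emits ARM instructions and registers instead):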
subtract_arrays(int, int, int):
push rbp
mov rbp, rsp
sub rsp, 32
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-24], esi
mov DWORD PTR [rbp-28], edx
mov DWORD PTR [rbp-8], 0
jmp .L2
.L3:
mov edx, DWORD PTR [rbp-24]
mov eax, DWORD PTR [rbp-28]
add eax, edx
sub eax, DWORD PTR [rbp-8]
mov DWORD PTR [rbp-20], eax
mov eax, DWORD PTR [rbp-20]
add DWORD PTR [rbp-4], eax
add DWORD PTR [rbp-8], 1
.L2:
cmp DWORD PTR [rbp-8], 4
jle .L3
mov eax, DWORD PTR [rbp-4]
mov esi, eax
mov edi, OFFSET FLAT:_ZSt4cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
nop
leave
ret
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 1
mov DWORD PTR [rbp-8], 2
mov DWORD PTR [rbp-12], 3
mov edx, DWORD PTR [rbp-12]
mov ecx, DWORD PTR [rbp-8]
mov eax, DWORD PTR [rbp-4]
mov esi, ecx
mov edi, eax
call subtract_arrays(int, int, int)
mov eax, 0
leave
ret
__static_initialization_and_destruction_0(int, int):
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
cmp DWORD PTR [rbp-4], 1
jne .L8
cmp DWORD PTR [rbp-8], 65535
jne .L8
mov edi, OFFSET FLAT:_ZStL8__ioinit
call std::ios_base::Init::Init() [complete object constructor]
mov edx, OFFSET FLAT:__dso_handle
mov esi, OFFSET FLAT:_ZStL8__ioinit
mov edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
call __cxa_atexit
.L8:
nop
leave
ret
_GLOBAL__sub_I_subtract_arrays(int, int, int):
push rbp
mov rbp, rsp
mov esi, 65535
mov edi, 1
call __static_initialization_and_destruction_0(int, int)
pop rbp
ret
Use the __asm Keyword to Convert C++ to ARM Assembly
This is often the most direct approach: the compiler provides an inline assembler that lets you write assembly code inside your C++ source code and access features of the target processor that are not available from C++.
The __asm keyword lets you write inline assembly code in a function using the GNU inline assembly syntax.
However, the inline assembler does not support legacy assembly code written in armasm assembly syntax, so such code has to be migrated to GNU syntax before it can be used inline.
The statement __asm [volatile] (code); /* Basic inline assembly syntax */ shows the general form of an __asm statement. There is also an extended version of the inline assembly syntax, which the example code below uses.
Use the volatile qualifier for assembler instructions that have processor side effects the compiler might be unaware of. The qualifier disables certain compiler optimizations that could otherwise lead to the compiler removing the code block.
The volatile qualifier is optional, but without it the compiler may remove the assembly code block when compiling with -O1 or above.
#include <stdio.h>
int add(int x, int y) {
int sum = 0;
__asm("ADD %[_sum], %[input_x], %[input_y]"
: [_sum] "=r"(sum)
: [input_x] "r"(x), [input_y] "r"(y));
return sum;
}
int main(void) {
int x = 1;
int y = 2;
int z = 0;
z = add(x, y);
printf("Result of %d + %d = %d\n", x, y, z);
}
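Output (the listing below uses x86-64 registers, so the three-operand ADD line only shows where the inline instruction is inserted; on an ARM target the operands are ARM registers):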
add(int, int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-24], esi
mov DWORD PTR [rbp-4], 0
mov eax, DWORD PTR [rbp-20]
mov edx, DWORD PTR [rbp-24]
ADD eax, eax, edx
mov DWORD PTR [rbp-4], eax
mov eax, DWORD PTR [rbp-4]
pop rbp
ret
.LC0:
.string "Result of %d + %d = %d\n"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 1
mov DWORD PTR [rbp-8], 2
mov DWORD PTR [rbp-12], 0
mov edx, DWORD PTR [rbp-8]
mov eax, DWORD PTR [rbp-4]
mov esi, edx
mov edi, eax
call add(int, int)
mov DWORD PTR [rbp-12], eax
mov ecx, DWORD PTR [rbp-12]
mov edx, DWORD PTR [rbp-8]
mov eax, DWORD PTR [rbp-4]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
In the basic __asm statement, code is the assembly instruction. In the extended form, code_template is a template for the instruction; if you specify a code_template rather than code, you must specify the output_operand_list before the optional input_operand_list and clobbered_register_list.
The output_operand_list is a comma-separated list of output operands, and each operand consists of a symbolic name in square brackets, a constraint string, and a C++ expression, in the format [result] "=r" (res).
You may use inline assembly to define symbols, like __asm (".global __use_no_semihosting\n\t");, or to define labels by putting the : sign after the label name, like __asm ("my_label:\n\t");.
Furthermore, you can write multiple instructions within the same __asm statement, and you can write embedded assembly using the __attribute__((naked)) function attribute.
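As a hedged sketch of several instructions in one __asm statement, the function below adds two values and doubles the sum with two ADD instructions separated by "\n\t". The name add_then_double is hypothetical. The assembly path assumes a 32-bit ARM target (A32 or Thumb-2) and a GCC- or Clang-compatible compiler, and a plain C++ fallback is used elsewhere so the file still builds on other machines.

```cpp
// Hypothetical helper: returns 2 * (x + y).
int add_then_double(int x, int y) {
  int result = 0;
#if defined(__arm__)
  // Assumption: 32-bit ARM target (A32 or Thumb-2).
  // Two instructions in one statement, separated by "\n\t".
  __asm volatile(
      "ADD %[res], %[a], %[b]\n\t"   // res = a + b
      "ADD %[res], %[res], %[res]"   // res = res + res
      : [res] "=r"(result)           // output operand list
      : [a] "r"(x), [b] "r"(y));     // input operand list
#else
  // Fallback for non-ARM targets: same arithmetic in plain C++.
  result = (x + y) * 2;
#endif
  return result;
}
```

The symbolic names in square brackets tie each placeholder in the template to its operand, which keeps multi-instruction blocks readable as they grow.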
The Microsoft C++ compiler (MSVC) can provide different results on the ARM architecture than on x86 or x64 machines or architectures for the same C++ source code, and you may encounter many migration or conversion issues.
The issues can invoke undefined, implementation-defined, or unspecified behavior and other migration issues attributed to hardware differences between ARM and x86 or x64 architectures that interact with the C++ standard differently.
Hassan is a Software Engineer with a well-developed set of programming skills. He uses his knowledge and writing capabilities to produce interesting-to-read technical articles.