JHHK

欢迎来到我的个人网站
行者常至 为者常成

LLVM(二):编译过程

目录

编译过程查看

main.m文件内容如下

#import <Foundation/Foundation.h>
#import "Person.h"

#define AGE (18)

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        int a = 10;
        int b = 20;
        int c = a+b+AGE;
        printf("c = %d\n",c);
    }
    return 0;
}

查看main.m文件的执行过程

clang -ccc-print-phases main.m

执行过程如下

$ clang -ccc-print-phases -framework Foundation main.m -o main
0: input, "Foundation", object                //导入Foundation框架
1: input, "main.m", objective-c               //导入源文件
2: preprocessor, {1}, objective-c-cpp-output  //预处理
3: compiler, {2}, ir                          //编译生成IR(中间代码)
4: backend, {3}, assembler                    //汇编器生成汇编代码
5: assembler, {4}, object                     //生成机器码
6: linker, {0, 5}, image                      //链接
7: bind-arch, "x86_64", {6}, image            //生成Image,也就是最后的可执行文件

接着,就可以在终端直接运行这个程序了:

$ ./main

编译过程

一、预处理

指令

clang -E main.m -o emain.m

emain.m文件如下

@interface Person : NSObject

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        int a = 10;
        int b = 20;
        int c = a+b+(18);
        print("c = %d\n",c);
    }
    return 0;
}

替换了Person.h文件的内容和宏定义

二、词法分析,生成Token

clang -fmodules -E -Xclang -dump-tokens main.m

部分代码

at '@'	 [StartOfLine]	Loc=<./Person.h:15:1>
identifier 'interface'		Loc=<./Person.h:15:2>
identifier 'Person'	 [LeadingSpace]	Loc=<./Person.h:15:12>
colon ':'	 [LeadingSpace]	Loc=<./Person.h:15:19>
identifier 'NSObject'	 [LeadingSpace]	Loc=<./Person.h:15:21>
at '@'	 [StartOfLine]	Loc=<./Person.h:19:1>
identifier 'end'		Loc=<./Person.h:19:2>
int 'int'	 [StartOfLine]	Loc=<main.m:14:1>
identifier 'main'	 [LeadingSpace]	Loc=<main.m:14:5>
l_paren '('		Loc=<main.m:14:9>
int 'int'		Loc=<main.m:14:10>
identifier 'argc'	 [LeadingSpace]	Loc=<main.m:14:14>
comma ','		Loc=<main.m:14:18>

三、语法分析,生成语法树(AST,AbstractSyntax Tree)

$clang -fmodules -fsyntax-only -Xclang -ast-dump main.m

部分代码

-FunctionDecl 0x7f7fde294740 <main.m:14:1, line:22:1> line:14:5 main 'int (int, const char **)'
| |-ParmVarDecl 0x7f7fde2944e8 <col:10, col:14> col:14 argc 'int'
| |-ParmVarDecl 0x7f7fde294600 <col:20, col:38> col:33 argv 'const char **':'const char **'
| `-CompoundStmt 0x7f7fdf01e488 <col:41, line:22:1>
|   |-ObjCAutoreleasePoolStmt 0x7f7fdf01e440 <line:15:5, line:20:5>
|   | `-CompoundStmt 0x7f7fdf01e410 <line:15:22, line:20:5>
|   |   |-DeclStmt 0x7f7fde294908 <line:16:9, col:19>
|   |   | `-VarDecl 0x7f7fde294888 <col:9, col:17> col:13 used a 'int' cinit
|   |   |   `-IntegerLiteral 0x7f7fde2948e8 <col:17> 'int' 10
|   |   |-DeclStmt 0x7f7fde2949b8 <line:17:9, col:19>
|   |   | `-VarDecl 0x7f7fde294938 <col:9, col:17> col:13 used b 'int' cinit
|   |   |   `-IntegerLiteral 0x7f7fde294998 <col:17> 'int' 20

四、生成中间代码IR

LLVM IR有3种表示形式(但本质是等价的,就好比水可以有气体,液体,固体等3中形态)

text:便于阅读的文本格式,类似于汇编语言,扩展名.ll

clang -S -emit-llvm main.m -o main.ll

main.ll文件的部分内容

Function Attrs: noinline optnone ssp uwtable
define i32 @main(i32, i8**) #0 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i8**, align 8
  %6 = alloca i32, align 4
  %7 = alloca i32, align 4
  %8 = alloca i32, align 4
  store i32 0, i32* %3, align 4
  store i32 %0, i32* %4, align 4
  store i8** %1, i8*** %5, align 8
  %9 = call i8* @llvm.objc.autoreleasePoolPush() #1
  store i32 10, i32* %6, align 4
  store i32 20, i32* %7, align 4
  %10 = load i32, i32* %6, align 4
  %11 = load i32, i32* %7, align 4
  %12 = add nsw i32 %10, %11
  %13 = add nsw i32 %12, 18
  store i32 %13, i32* %8, align 4
  %14 = load i32, i32* %8, align 4
  %15 = call i32 (i8*, i32, ...) bitcast (i32 (...)* @print to i32 (i8*, i32, ...)*)(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i32 0, i32 0), i32 %14)
  call void @llvm.objc.autoreleasePoolPop(i8* %9)
  ret i32 0
}

bitcode:二进制格式,扩展名为.bc

clang -c -emit-llvm main.m

查看main.bc文件

hexdump -C main.bc  | head
00000000  de c0 17 0b 00 00 00 00  14 00 00 00 90 0d 00 00  |................|
00000010  07 00 00 01 42 43 c0 de  35 14 00 00 07 00 00 00  |....BC..5.......|
00000020  62 0c 30 24 96 96 a6 a5  f7 d7 7f 5d d3 b4 4f fb  |b.0$.......]..O.|
00000030  f7 ed d7 fb 4f 0b 51 80  4c 01 00 00 21 0c 00 00  |....O.Q.L...!...|
00000040  ee 02 00 00 0b 02 21 00  02 00 00 00 16 00 00 00  |......!.........|
00000050  07 81 23 91 41 c8 04 49  06 10 32 39 92 01 84 0c  |..#.A..I..29....|
00000060  25 05 08 19 1e 04 8b 62  80 14 45 02 42 92 0b 42  |%......b..E.B..B|
00000070  a4 10 32 14 38 08 18 4b  0a 32 52 88 48 70 c4 21  |..2.8..K.2R.Hp.!|
00000080  23 44 12 87 8c 10 41 92  02 64 c8 08 b1 14 20 43  |#D....A..d.... C|
00000090  46 88 20 c9 01 32 52 84  18 2a 28 2a 90 31 7c b0  |F. ..2R..*(*.1|.|

memory:内存格式

五、生成汇编

clang -S -fobjc-arc main.m -o main.s

main.s部分内容

_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	subq	$48, %rsp
	movl	$0, -4(%rbp)
	movl	%edi, -8(%rbp)
	movq	%rsi, -16(%rbp)
	callq	_objc_autoreleasePoolPush
	movl	$10, -20(%rbp)
	movl	$20, -24(%rbp)
	movl	-20(%rbp), %edi
	addl	-24(%rbp), %edi

六、生成目标文件

clang -fmodules -c main.m -o main.o

七、生成可执行文件,这样就能够执行看到输出结果

clang main.o -o main

或者

clang main.m -o main -framework Foundation

八、执行

./main

执行结果

c = 48

行者常至,为者常成!





R
Valine - A simple comment system based on Leancloud.