[gentoo-commits] gentoo-projects commit in extreme-security/solutions/branches/pappy001/docs/airs_linker_docs: A.txt B.txt C.txt D.txt E.txt - gentoo-commits

From:	"Alexander Gabert (pappy)" <pappy@g.o>
To:	gentoo-commits@l.g.o
Subject:	[gentoo-commits] gentoo-projects commit in extreme-security/solutions/branches/pappy001/docs/airs_linker_docs: A.txt B.txt C.txt D.txt E.txt
Date:	Tue, 27 Nov 2007 19:06:45
Message-Id:	`E1Ix5lS-0005oO-NH@stork.gentoo.org`

pappy       07/11/27 19:06:34

  Added:                A.txt B.txt C.txt D.txt E.txt
  Log:
  adding preliminary versions of ebuild, patch, profile for GXS glibc and documentation about linkers

Revision  Changes    Path
1.1                  extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/A.txt

file : http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/A.txt?rev=1.1&view=markup
plain: http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/A.txt?rev=1.1&content-type=text/plain

Index: A.txt
===================================================================


[4]Airs - Ian Lance Taylor

[5]Linkers part 4

   August 27, 2007 at 10:47 pm · Filed under [6]Programming

   Shared Libraries

   We've talked a bit about what object files and executables look like,
   so what do shared libraries look like? I'm going to focus on ELF shared
   libraries as used in SVR4 (and GNU/Linux, etc.), as they are the most
   flexible shared library implementation and the one I know best.

   Windows shared libraries, known as DLLs, are less flexible in that you
   have to compile code differently depending on whether it will go into a
   shared library or not. You also have to express symbol visibility in
   the source code. This is not inherently bad, and indeed ELF has picked
   up some of these ideas over time, but the ELF format makes more
   decisions at link time and is thus more powerful.

   When the program linker creates a shared library, it does not yet know
   which virtual address that shared library will run at. In fact, in
   different processes, the same shared library will run at different
   addresses, depending on the decisions made by the dynamic linker. This
   means that shared library code must be position independent. More
   precisely, it must be position independent after the dynamic linker has
   finished loading it. It is always possible for the dynamic linker to
   convert any piece of code to run at any virtual address, given
   sufficient relocation information. However, the relocation
   computations must be performed every time the program starts, implying
   that it will start more slowly. Therefore, any shared library system
   seeks to generate position independent code which requires a minimal
   number of relocations to be applied at runtime, while still running at
   close to the runtime efficiency of position dependent code.

   An additional complexity is that ELF shared libraries were designed to
   be roughly equivalent to ordinary archives. This means that by default
   the main executable may override symbols in the shared library, such
   that references in the shared library will call the definition in the
   executable, even if the shared library also defines that same symbol.
   For example, an executable may define its own version of malloc. The C
   library also defines malloc, and the C library contains code which
   calls malloc. If the executable defines malloc itself, it will override
   the function in the C library. When some other function in the C
   library calls malloc, it will call the definition in the executable,
   not the definition in the C library.
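
   The lookup order behind this override can be sketched as a toy model.
   This is only an illustration: the function resolve and the dictionaries
   standing in for symbol tables are invented here, and a real dynamic
   linker works on ELF symbol tables rather than Python dicts.

```python
# Toy model of ELF-style symbol lookup: the executable is searched first,
# then each shared library in load order. All names here are hypothetical.

def resolve(name, search_order):
    """Return (module_name, definition) for the first module defining name."""
    for module_name, symbols in search_order:
        if name in symbols:
            return module_name, symbols[name]
    raise LookupError("undefined symbol: " + name)

executable = ("a.out", {"main": "main code", "malloc": "custom allocator"})
libc = ("libc.so", {"malloc": "libc allocator", "strdup": "calls malloc"})

# strdup lives in libc, but its reference to malloc goes through the same
# global search order, so it reaches the executable's definition.
search_order = [executable, libc]
owner, _ = resolve("malloc", search_order)
```

   In this model owner ends up naming the executable, even though the
   library carries its own malloc.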

   There are thus different requirements pulling in different directions
   for any specific ELF implementation. The right implementation choices
   will depend on the characteristics of the processor. That said, most,
   but not all, processors make fairly similar decisions. I will describe
   the common case here. An example of a processor which uses the common
   case is the i386; an example of a processor which makes some different
   decisions is the PowerPC.

   In the common case, code may be compiled in two different modes. By
   default, code is position dependent. Putting position dependent code
   into a shared library will cause the program linker to generate a lot
   of relocation information, and cause the dynamic linker to do a lot of
   processing at runtime. Code may also be compiled in position
   independent mode, typically with the -fpic option. Position independent
   code is slightly slower when it calls a non-static function or refers
   to a global or static variable. However, it requires much less
   relocation information, and thus the dynamic linker will start the
   program faster.

   Position independent code will call non-static functions via the
   Procedure Linkage Table or PLT. This PLT does not exist in .o files. In
   a .o file, use of the PLT is indicated by a special relocation. When
   the program linker processes such a relocation, it will create an entry
   in the PLT. It will adjust the instruction such that it becomes a
   PC-relative call to the PLT entry. PC-relative calls are inherently
   position independent and thus do not require a relocation entry
   themselves. The program linker will create a relocation for the PLT
   entry which tells the dynamic linker which symbol is associated with
   that entry. This process reduces the number of dynamic relocations in
   the shared library from one per function call to one per function
   called.
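
   The saving can be made concrete with a small count. The sketch below
   is a toy model, with invented function names, that treats a library as
   nothing more than a list of call sites.

```python
# Toy count of dynamic relocations for a list of call sites.
# Without a PLT every call site needs its own relocation; with a PLT
# only each distinct callee needs one (attached to its PLT entry).

def relocs_without_plt(call_sites):
    return len(call_sites)

def relocs_with_plt(call_sites):
    return len(set(call_sites))

calls = ["malloc", "free", "malloc", "printf", "malloc", "free"]
```

   For these six call sites to three distinct functions, the first count
   is six and the second is three.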

   Further, PLT entries are normally relocated lazily by the dynamic
   linker. On most ELF systems this laziness may be overridden by setting
   the LD_BIND_NOW environment variable when running the program. However,
   by default, the dynamic linker will not actually apply a relocation to
   the PLT until some code actually calls the function in question. This
   also speeds up startup time, in that many invocations of a program will
   not call every possible function. This is particularly true when
   considering the shared C library, which has many more function calls
   than any typical program will execute.

   In order to make this work, the program linker initializes the PLT
   entries to load an index into some register or push it on the stack,
   and then to branch to common code. The common code calls back into the
   dynamic linker, which uses the index to find the appropriate PLT
   relocation, and uses that to find the function being called. The
   dynamic linker then initializes the PLT entry with the address of the
   function, and then jumps to the code of the function. The next time the
   function is called, the PLT entry will branch directly to the function.
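
   The control flow of lazy binding can be modelled in a few lines. This
   is a sketch under loose assumptions: the class LazyPlt and its fields
   are invented for illustration, Python callables stand in for machine
   code, and the lookups counter exists only to make the laziness visible.

```python
# Toy model of lazy PLT binding. A PLT entry starts out pointing at a
# resolver stub; the first call looks the symbol up and patches the entry.

class LazyPlt:
    def __init__(self, relocations, symbol_table):
        self.relocations = relocations      # index -> symbol name
        self.symbol_table = symbol_table    # symbol name -> function
        self.lookups = 0                    # how often the resolver ran
        # Every entry initially routes through the resolver with its index.
        self.entries = [self._make_stub(i) for i in range(len(relocations))]

    def _make_stub(self, index):
        def stub(*args):
            target = self._resolve(index)
            self.entries[index] = target    # patch: later calls go direct
            return target(*args)
        return stub

    def _resolve(self, index):
        self.lookups += 1
        return self.symbol_table[self.relocations[index]]

    def call(self, index, *args):
        return self.entries[index](*args)

plt = LazyPlt(["double", "negate"],
              {"double": lambda x: 2 * x, "negate": lambda x: -x})
first = plt.call(0, 21)    # resolver runs, entry 0 is patched
second = plt.call(0, 4)    # goes straight to the function
```

   After both calls the resolver has run only once, and the entry for
   negate has not been resolved at all.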

   Before giving an example, I will talk about the other major data
   structure in position independent code, the Global Offset Table or GOT.
   This is used for global and static variables. For every reference to a
   global variable from position independent code, the compiler will
   generate a load from the GOT to get the address of the variable,
   followed by a second load to get the actual value of the variable. The
   address of the GOT will normally be held in a register, permitting
   efficient access. Like the PLT, the GOT does not exist in a .o file,
   but is created by the program linker. The program linker will create
   the dynamic relocations which the dynamic linker will use to initialize
   the GOT at runtime. Unlike the PLT, the dynamic linker always fully
   initializes the GOT when the program starts.
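
   The two loads can be sketched with memory modelled as a dictionary
   from address to value. The helper names, addresses and slot offsets
   below are all made up for the example.

```python
# Toy model of a GOT access: memory is a dict from address to value.
# Reading a global takes two loads: GOT slot -> address, address -> value.

def load_global(memory, got_base, slot_offset):
    address = memory[got_base + slot_offset]   # first load: from the GOT
    return memory[address]                     # second load: the variable

def init_got(memory, got_base, slots, symbol_addresses):
    """Fill every GOT slot at startup, as the dynamic linker does."""
    for offset, name in slots.items():
        memory[got_base + offset] = symbol_addresses[name]

memory = {0x5000: 1234}                 # the variable 'counter' lives here
init_got(memory, 0x2000, {8: "counter"}, {"counter": 0x5000})
value = load_global(memory, 0x2000, 8)
```

   Only the GOT slot depends on where the variable ended up; the code
   that performs the two loads is the same for any load address.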

   For example, on the i386, the address of the GOT is held in the
   register %ebx. This register is initialized at the entry to each
   function in position independent code. The initialization sequence
   varies from one compiler to another, but typically looks something like
   this:

   call __i686.get_pc_thunk.bx
   add $offset,%ebx

   The function __i686.get_pc_thunk.bx simply looks like this:

   mov (%esp),%ebx
   ret

   This sequence of instructions uses a position independent sequence to
   get the address at which it is running. Then it uses an offset to get
   the address of the GOT. Note that this requires that the GOT always be
   a fixed offset from the code, regardless of where the shared library is
   loaded. That is, the dynamic linker must load the shared library as a
   fixed unit; it may not load different parts at varying addresses.
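
   The arithmetic behind the fixed offset is short enough to write out.
   The offsets below are invented numbers, and the function name is
   hypothetical; the point is only that the constant added to %ebx is
   chosen at link time and stays valid wherever the library is loaded.

```python
# Toy arithmetic for the %ebx setup: the thunk yields the return address
# (the address of the add instruction), and a link-time constant is added.

CALL_RETURN_OFFSET = 0x1005   # offset of the 'add' instruction in the library
GOT_OFFSET = 0x3000           # offset of the GOT in the library

# The constant baked into 'add $offset,%ebx' is fixed at link time.
ADD_CONSTANT = GOT_OFFSET - CALL_RETURN_OFFSET

def got_address_at_runtime(load_address):
    pc = load_address + CALL_RETURN_OFFSET   # what the thunk puts in %ebx
    return pc + ADD_CONSTANT                 # what %ebx holds after the add
```

   Whatever load_address is passed in, the result is load_address plus
   GOT_OFFSET, which is why no relocation is needed for this sequence.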

   Global and static variables are now read or written by first loading
   the address via a fixed offset from %ebx. The program linker will
   create dynamic relocations for each entry in the GOT, telling the
   dynamic linker how to initialize the entry. These relocations are of
   type GLOB_DAT.

   For function calls, the program linker will set up a PLT entry to look
   like this:

   jmp *offset(%ebx)
   pushl #index
   jmp first_plt_entry

   The program linker will allocate an entry in the GOT for each entry in
   the PLT. It will create a dynamic relocation for the GOT entry of type
   JMP_SLOT. It will initialize the GOT entry to the base address of the
   shared library plus the address of the second instruction in the code
   sequence above. When the dynamic linker does the initial lazy binding
   on a JMP_SLOT reloc, it will simply add the difference between the
   shared library load address and the shared library base address to the
   GOT entry. The effect is that the first jmp instruction will jump to
   the second instruction, which will push the index entry and branch to
   the first PLT entry. The first PLT entry is special, and looks like
   this:

   pushl 4(%ebx)
   jmp *8(%ebx)

   This references the second and third entries in the GOT. The dynamic
   linker will initialize them to have appropriate values for a callback
   into the dynamic linker itself. The dynamic linker will use the index
   pushed by the first code sequence to find the JMP_SLOT relocation. When
   the dynamic linker determines the function to be called, it will store
   the address of the function into the GOT entry referenced by the first
   code sequence. Thus, the next time the function is called, the jmp
   instruction will branch directly to the right code.

   That was a fast pass over a lot of details, but I hope that it conveys
   the main idea. It means that for position independent code on the i386,
   every call to a global function requires one extra instruction after
   the first time it is called. Every reference to a global or static
   variable requires one extra instruction. Almost every function uses
   four extra instructions when it starts to initialize %ebx (leaf
   functions which do not refer to any global variables do not need to
   initialize %ebx). This all has some negative impact on the program
   cache. This is the runtime performance penalty paid to let the dynamic
   linker start the program quickly.

   On other processors, the details are naturally different. However, the
   general flavour is similar: position independent code in a shared
   library starts faster and runs slightly slower.

   More tomorrow.


[9]Linkers part 3

   August 24, 2007 at 10:25 pm · Filed under [10]Programming

   Continuing notes on linkers.

   Address Spaces

   An address space is simply a view of memory, in which each byte has an
   address. The linker deals with three distinct types of address space.

   Every input object file is a small address space: the contents have
   addresses, and the symbols and relocations refer to the contents by
   addresses.

   The output program will be placed at some location in memory when it
   runs. This is the output address space, which I generally refer to as
   using virtual memory addresses.

   The output program will be loaded at some location in memory. This is
   the load memory address. On typical Unix systems virtual memory
   addresses and load memory addresses are the same. On embedded systems
   they are often different; for example, the initialized data (the
   initial contents of global or static variables) may be loaded into ROM
   at the load memory address, and then copied into RAM at the virtual
   memory address.

   Shared libraries can normally be run at different virtual memory
   addresses in different processes. A shared library has a base address
   when it is created; this is often simply zero. When the dynamic linker
   copies the shared library into the virtual memory space of a process,
   it must apply relocations to adjust the shared library to run at its
   virtual memory address. Shared library systems minimize the number of
   relocations which must be applied, since they take time when starting
   the program.

   Object File Formats

   As I said above, an assembler turns human readable assembly language
   into an object file. An object file is a binary data file written in a
   format designed as input to the linker. The linker generates an
   executable file. This executable file is a binary data file written in
   a format designed as input for the operating system or the loader (this
   is true even when linking dynamically, as normally the operating system
   loads the executable before invoking the dynamic linker to begin
   running the program). There is no logical requirement that the object
   file format resemble the executable file format. However, in practice
   they are normally very similar.

   Most object file formats define sections. A section typically holds
   memory contents, or it may be used to hold other types of data.
   Sections generally have a name, a type, a size, an address, and an
   associated array of data.

   Object file formats may be classed in two general types: record
   oriented and section oriented.

   A record oriented object file format defines a series of records of
   varying size. Each record starts with some special code, and may be
   followed by data. Reading the object file requires reading it from the
   beginning and processing each record. Records are used to describe
   symbols and sections. Relocations may be associated with sections or
   may be specified by other records. IEEE-695 and Mach-O are record
   oriented object file formats used today.
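
   A reader for such a format has to walk the file front to back. The
   sketch below uses a record layout invented for this example (a 1-byte
   code, a 2-byte little-endian length, then the data); it is not the
   layout of IEEE-695 or Mach-O.

```python
import struct

# Hypothetical record layout, invented for this sketch: a 1-byte record
# code, a 2-byte little-endian length, then that many bytes of data.

def read_records(blob):
    """Walk the file from the beginning, yielding (code, data) pairs."""
    pos = 0
    while pos < len(blob):
        code, length = struct.unpack_from("<BH", blob, pos)
        pos += 3
        yield code, blob[pos:pos + length]
        pos += length

blob = (bytes([1]) + struct.pack("<H", 3) + b"foo"
        + bytes([2]) + struct.pack("<H", 0))
records = list(read_records(blob))
```

   There is no way to jump to the second record without first reading
   the length of the first one.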

   In a section oriented object file format the file header describes a
   section table with a specified number of sections. Symbols may appear
   in a separate part of the object file described by the file header, or
   they may appear in a special section. Relocations may be attached to
   sections, or they may appear in separate sections. The object file may
   be read by reading the section table, and then reading specific
   sections directly. ELF, COFF, PE, and a.out are section oriented object
   file formats.
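
   For ELF, locating the section table takes a single read of the file
   header. The sketch below assumes a little-endian ELF64 file and uses
   the standard 64-byte header layout; the function name is made up, and
   the extended numbering used when a file has a very large number of
   sections is not handled.

```python
import struct

# Layout of the 64-byte ELF64 file header (little-endian):
# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

def section_table_info(header_bytes):
    """Return (offset, entry_size, count) of the section header table."""
    fields = ELF64_HEADER.unpack_from(header_bytes)
    ident = fields[0]
    # Magic number, then EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian).
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]
    return e_shoff, e_shentsize, e_shnum
```

   Given the first 64 bytes of a file opened in binary mode, the result
   says where the section table starts and how many entries it has, so
   any one section can then be read directly.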

   Every object file format needs to be able to represent debugging
   information. Debugging information is generated by the compiler and
   read by the debugger. In general the linker can just treat it like any
   other type of data. However, in practice the debugging information for
   a program can be larger than the actual program itself. The linker can
   use various techniques to reduce the amount of debugging information,
   thus reducing the size of the executable. This can speed up the link,
   but requires the linker to understand the debugging information.

   The a.out object file format stores debugging information using special
   strings in the symbol table, known as stabs. These special strings are
   simply the names of symbols with a special type. This technique is also
   used by some variants of ECOFF, and by older versions of Mach-O.

   The COFF object file format stores debugging information using special
   fields in the symbol table. This type information is limited, and is
   completely inadequate for C++. A common technique to work around these
   limitations is to embed stabs strings in a COFF section.

   The ELF object file format stores debugging information in sections
   with special names. The debugging information can be stabs strings or
   the DWARF debugging format.

   More next week.


[13]Linkers part 2

   August 23, 2007 at 10:18 pm · Filed under [14]Programming

   I'm back, and I'm still doing the linker technical introduction.

   Shared libraries were invented as an optimization for virtual memory
   systems running many processes simultaneously. People noticed that
   there is a set of basic functions which appear in almost every program.
   Before shared libraries, in a system which runs multiple processes
   simultaneously, that meant that almost every process had a copy of
   exactly the same code. This suggested that on a virtual memory system
   it would be possible to arrange that code so that a single copy could
   be shared by every process using it. The virtual memory system would be
   used to map the single copy into the address space of each process
   which needed it. This would require less physical memory to run
   multiple programs, and thus yield better performance.

   I believe the first implementation of shared libraries was on SVR3,
   based on COFF. This implementation was simple, and basically assigned
   each shared library a fixed portion of the virtual address space. This
   did not require any significant changes to the linker. However,
   requiring each shared library to reserve an appropriate portion of the
   virtual address space was inconvenient.

   SunOS4 introduced a more flexible version of shared libraries, which
   was later picked up by SVR4. This implementation postponed some of the
   operation of the linker to runtime. When the program started, it would
   automatically run a limited version of the linker which would link the
   program proper with the shared libraries. The version of the linker
   which runs when the program starts is known as the dynamic linker. When
   it is necessary to distinguish them, I will refer to the version of the
   linker which creates the program as the program linker. This type of
   shared libraries was a significant change to the traditional program
   linker: it now had to build linking information which could be used
   efficiently at runtime by the dynamic linker.

   That is the end of the introduction. You should now understand the
   basics of what a linker does. I will now turn to how it does it.

   Basic Linker Data Types

   The linker operates on a small number of basic data types: symbols,
   relocations, and contents. These are defined in the input object files.
   Here is an overview of each of these.

   A symbol is basically a name and a value. Many symbols represent static
   objects in the original source code; that is, objects which exist in a
   single place for the duration of the program. For example, in an object
   file generated from C code, there will be a symbol for each function
   and for each global and static variable. The value of such a symbol is
   simply an offset into the contents. This type of symbol is known as a
   defined symbol. It's important not to confuse the value of the symbol
   representing the variable my_global_var with the value of my_global_var
   itself. The value of the symbol is roughly the address of the variable:
   the value you would get from the expression &my_global_var in C.

   Symbols are also used to indicate a reference to a name defined in a
   different object file. Such a reference is known as an undefined
   symbol. There are other less commonly used types of symbols which I
   will describe later.

   During the linking process, the linker will assign an address to each
   defined symbol, and will resolve each undefined symbol by finding a
   defined symbol with the same name.
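
   As a rough sketch of that resolution step, object files can be
   modelled as dictionaries. The structure, the function name and the
   error messages below are invented for illustration; real linkers also
   handle symbol kinds (weak, common and so on) that are ignored here.

```python
# Toy symbol resolution. Each object file is modelled as a dict with
# 'defined' (name -> offset in that file's contents) and 'undefined'
# (a set of names). This structure is invented for the sketch.

def build_symbol_table(objects, base_addresses):
    table = {}
    # Assign an address to each defined symbol.
    for obj, base in zip(objects, base_addresses):
        for name, offset in obj["defined"].items():
            if name in table:
                raise ValueError("multiple definition of " + name)
            table[name] = base + offset
    # Check that every undefined symbol found a definition by name.
    for obj in objects:
        missing = obj["undefined"] - set(table)
        if missing:
            raise LookupError(
                "undefined reference to " + ", ".join(sorted(missing)))
    return table

main_o = {"defined": {"main": 0}, "undefined": {"helper"}}
util_o = {"defined": {"helper": 0, "my_global_var": 16}, "undefined": set()}
symbols = build_symbol_table([main_o, util_o], [0x1000, 0x2000])
```

   Here the value stored for my_global_var is its address (0x2010 in
   this model), not whatever the variable itself holds.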

   A relocation is a computation to perform on the contents. Most
   relocations refer to a symbol and to an offset within the contents.
   Many relocations will also provide an additional operand, known as the
   addend. A simple, and commonly used, relocation is "set this location
   in the contents to the value of this symbol plus this addend." The
   types of computations that relocations do are inherently dependent on
   the architecture of the processor for which the linker is generating
   code. For example, RISC processors which require two or more
   instructions to form a memory address will have separate relocations to
   be used with each of those instructions; for example, "set this
   location in the contents to the lower 16 bits of the value of this
   symbol."
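
   The two computations quoted above can be written down directly. In
   this sketch the relocation type names ABS32 and LO16 and the
   dictionary layout are invented; real relocation types are defined
   per processor, and the address used is an arbitrary example value.

```python
import struct

# Two toy relocation types, loosely modelled on the ones described above:
#   "ABS32": store (symbol + addend) as a 32-bit little-endian word.
#   "LO16":  store the low 16 bits of (symbol + addend).
# The type names are invented for this sketch.

def apply_relocation(contents, reloc, symbols):
    value = symbols[reloc["symbol"]] + reloc["addend"]
    if reloc["type"] == "ABS32":
        struct.pack_into("<I", contents, reloc["offset"], value & 0xFFFFFFFF)
    elif reloc["type"] == "LO16":
        struct.pack_into("<H", contents, reloc["offset"], value & 0xFFFF)
    else:
        raise ValueError("unknown relocation type: " + reloc["type"])

contents = bytearray(8)
symbols = {"my_global_var": 0x08049F10}
apply_relocation(contents, {"type": "ABS32", "offset": 0,
                            "symbol": "my_global_var", "addend": 4}, symbols)
apply_relocation(contents, {"type": "LO16", "offset": 4,
                            "symbol": "my_global_var", "addend": 0}, symbols)
```

   The first call writes the full address plus four into bytes 0 to 3;
   the second writes only the low half of the address into bytes 4 and 5.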

   During the linking process, the linker will perform all of the
   relocation computations as directed. A relocation in an object file may
   refer to an undefined symbol. If the linker is unable to resolve that
   symbol, it will normally issue an error (but not always: for some
   symbol types or some relocation types an error may not be appropriate).

   The contents are what memory should look like during the execution of
   the program. Contents have a size, an array of bytes, and a type. They
   contain the machine code generated by the compiler and assembler (known
   as text). They contain the values of initialized variables (data). They
   contain static unnamed data like string constants and switch tables
   (read-only data or rdata). They contain uninitialized variables, in
   which case the array of bytes is generally omitted and assumed to
   contain only zeroes (bss). The compiler and the assembler work hard to
   generate exactly the right contents, but the linker really doesn't care
   about them except as raw data. The linker reads the contents from each
   file, concatenates them all together sorted by type, applies the
   relocations, and writes the result into the executable file.

   Basic Linker Operation

   At this point we already know enough to understand the basic steps used
   by every linker.
     * Read the input object files. Determine the length and type of the
       contents. Read the symbols.
     * Build a symbol table containing all the symbols, linking undefined
       symbols to their definitions.
     * Decide where all the contents should go in the output executable
       file, which means deciding where they should go in memory when the
       program runs.
     * Read the contents data and the relocations. Apply the relocations
       to the contents. Write the result to the output file.
     * Optionally write out the complete symbol table with the final
       values of the symbols.
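
   The steps above can be strung together into a very small toy linker.
   This is a sketch only: the object file representation is invented,
   each file has a single contents blob of one type, and the only
   relocation supported is a 32-bit "symbol plus addend".

```python
import struct

# A toy linker following the steps above. Object files are plain dicts
# (a format invented for this sketch) with one contents blob each:
#   {"contents": bytes, "defined": {name: offset},
#    "relocs": [(offset, symbol, addend)]}   # each means: 32-bit S + A

def link(objects, base_address):
    # Measure the contents and decide where each blob goes in memory.
    placements, address = [], base_address
    for obj in objects:
        placements.append(address)
        address += len(obj["contents"])

    # Build the symbol table with final addresses.
    symbols = {}
    for obj, start in zip(objects, placements):
        for name, offset in obj["defined"].items():
            symbols[name] = start + offset

    # Copy the contents, apply the relocations, assemble the output image.
    output = bytearray()
    for obj, start in zip(objects, placements):
        blob = bytearray(obj["contents"])
        for offset, symbol, addend in obj["relocs"]:
            if symbol not in symbols:
                raise LookupError("undefined reference to " + symbol)
            struct.pack_into("<I", blob, offset, symbols[symbol] + addend)
        output += blob

    # Hand back the final symbol table alongside the image.
    return bytes(output), symbols

a = {"contents": bytes(8), "defined": {"main": 0}, "relocs": [(4, "helper", 0)]}
b = {"contents": bytes(4), "defined": {"helper": 0}, "relocs": []}
image, symbols = link([a, b], 0x1000)
```

   In this run helper lands at 0x1008, and that address is what gets
   written into the four bytes at offset 4 of the first file's contents.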

   More tomorrow.


[17]Linkers part 1

   August 22, 2007 at 11:30 pm · Filed under [18]Programming

   I've been working on and off on a new linker. To my surprise, I've
   discovered in talking about this that some people, even some computer
   programmers, are unfamiliar with the details of the linking process.
   I've decided to write some notes about linkers, with the goal of
   producing an essay similar to my existing one about the GNU configure
   and build system.

   As I only have the time to write one thing a day, I'm going to do this
   on my blog over time, and gather the final essay together later. I
   believe that I may be up to five readers, and I hope y'all will accept
   this digression into stuff that matters. I will return to random
   philosophizing and minding other people's business soon enough.

   A Personal Introduction

   Who am I to write about linkers?

   I wrote my first linker back in 1988, for the AMOS operating system
   which ran on Alpha Micro systems. (If you don't understand the
   following description, don't worry; all will be explained below). I
   used a single global database to register all symbols. Object files
   were checked into the database after they had been compiled. The link
   process mainly required identifying the object file holding the main
   function. Other object files were pulled in by reference. I reverse
   engineered the object file format, which was undocumented but quite
   simple. The goal of all this was speed, and indeed this linker was much
   faster than the system one, mainly because of the speed of the
   database.

   I wrote my second linker in 1993 and 1994. This linker was designed and
   prototyped by Steve Chamberlain while we both worked at Cygnus Support
   (later Cygnus Solutions, later part of Red Hat). This was a complete
   reimplementation of the BFD based linker which Steve had written a
   couple of years before. The primary target was a.out and COFF. Again
   the goal was speed, especially compared to the original BFD based
   linker. On SunOS 4 this linker was almost as fast as running the cat
   program on the input .o files.

   The linker I am now working on, called gold, will be my third. It is
   exclusively an ELF linker. Once again, the goal is speed, in this case
   being faster than my second linker. That linker has been significantly
   slowed down over the years by adding support for ELF and for shared
   libraries. This support was patched in rather than being designed in.
   Future plans for the new linker include support for incremental
   linking, which is another way of increasing speed.

   There is an obvious pattern here: everybody wants linkers to be faster.
   This is because the job which a linker does is uninteresting. The
   linker is a speed bump for a developer, a process which takes a
   relatively long time but adds no real value. So why do we have linkers
   at all? That brings us to our next topic.

   A Technical Introduction

   What does a linker do?

   It's simple: a linker converts object files into executables and shared
   libraries. Let's look at what that means. For cases where a linker is
   used, the software development process consists of writing program code
   in some language: e.g., C or C++ or Fortran (but typically not Java, as
   Java normally works differently, using a loader rather than a linker).
   A compiler translates this program code, which is human readable text,
   into another form of human readable text known as assembly code.
   Assembly code is a readable form of the machine language which the
   computer can execute directly. An assembler is used to turn this
   assembly code into an object file. For completeness, I'll note that
   some compilers include an assembler internally, and produce an object
   file directly. Either way, this is where things get interesting.

   In the old days, when dinosaurs roamed the data centers, many programs
   were complete in themselves. In those days there was generally no
   compiler (people wrote directly in assembly code) and the assembler
   actually generated an executable file which the machine could execute
   directly. As languages like Fortran and Cobol started to appear,
   people began to think in terms of libraries of subroutines, which meant
   that there had to be some way to run the assembler at two different
   times, and combine the output into a single executable file. This
   required the assembler to generate a different type of output, which
   became known as an object file (I have no idea where this name came
   from). And a new program was required to combine different object files
   together into a single executable. This new program became known as the
   linker (the source of this name should be obvious).

   Linkers still do the same job today. In the decades that followed, one
   new feature has been added: shared libraries.

   More tomorrow.


References

   Visible links
   1. http://www.airs.com/blog/feed/
   2. http://www.airs.com/blog/feed/rss/
   3. http://www.airs.com/blog/feed/atom/
   4. http://www.airs.com/blog
   5. http://www.airs.com/blog/archives/41
   6. http://www.airs.com/blog/archives/category/programming/
   7. http://www.airs.com/blog/archives/41
   8. http://www.airs.com/blog/archives/41#comments
   9. http://www.airs.com/blog/archives/40
  10. http://www.airs.com/blog/archives/category/programming/
  11. http://www.airs.com/blog/archives/40
  12. http://www.airs.com/blog/archives/40#comments
  13. http://www.airs.com/blog/archives/39
  14. http://www.airs.com/blog/archives/category/programming/
  15. http://www.airs.com/blog/archives/39
  16. http://www.airs.com/blog/archives/39#comments
  17. http://www.airs.com/blog/archives/38
  18. http://www.airs.com/blog/archives/category/programming/
  19. http://www.airs.com/blog/archives/38
  20. http://www.airs.com/blog/archives/38#comments
  21. http://www.airs.com/blog/archives/37
  22. http://www.airs.com/blog/archives/category/politics/
  23. http://www.airs.com/blog/archives/37
  24. http://www.airs.com/blog/archives/37#comments
  25. http://www.airs.com/blog/page/13/
  26. http://www.airs.com/blog/page/11/
  27. http://www.airs.com/blog/archives/date/2007/11/
  28. http://www.airs.com/blog/archives/date/2007/10/
  29. http://www.airs.com/blog/archives/date/2007/09/
  30. http://www.airs.com/blog/archives/date/2007/08/
  31. http://www.airs.com/blog/archives/date/2007/02/
  32. http://www.airs.com/blog/archives/date/2007/01/
  33. http://www.airs.com/blog/archives/date/2006/12/
  34. http://www.airs.com/blog/archives/date/2006/07/
  35. http://www.airs.com/blog/archives/date/2006/06/
  36. http://www.airs.com/blog/archives/date/2006/04/
  37. http://www.airs.com/blog/archives/date/2006/02/
  38. http://www.airs.com/blog/archives/date/2006/01/
  39. http://www.airs.com/blog/archives/date/2005/12/
  40. http://www.airs.com/blog/archives/date/2005/11/
  41. http://www.airs.com/blog/archives/category/random/
  42. http://www.airs.com/blog/archives/category/money/
  43. http://www.airs.com/blog/archives/category/philosophy/
  44. http://www.airs.com/blog/archives/category/books/
  45. http://www.airs.com/blog/archives/category/politics/
  46. http://www.airs.com/blog/archives/category/programming/
  47. http://web.elastic.org/~fche/blog2/
  48. http://www.dberlin.org/blog/
  49. http://blog.360.yahoo.com/blog-f4oLexAlc6l3W_YHZF1IXYDu
  50. http://tromey.com/blog
  51. http://airs.com/ian/
  52. http://airs.com/ian/essays/
  53. http://lessig.org/blog/
  54. http://schneier.com/blog/
  55. http://www.airs.com/blog/wp-register.php
  56. http://www.airs.com/blog/wp-login.php
  57. http://www.airs.com/blog/feed/
  58. http://www.airs.com/blog/comments/feed/
  59. http://wordpress.org/
  60. http://beccary.com/
  61. http://weblogs.us/
  62. http://validator.w3.org/check/referer
  63. http://jigsaw.w3.org/css-validator/check/referer

584

585

   Hidden links:

586

  64. http://www.airs.com/blog/page/11/

587

  65. http://www.airs.com/blog/page/11/

588

  66. http://www.airs.com/blog/page/13/

589

  67. http://www.airs.com/blog/page/13/

1.1                  extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/B.txt

file : http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/B.txt?rev=1.1&view=markup
plain: http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/B.txt?rev=1.1&content-type=text/plain

Index: B.txt
===================================================================

[4]Airs - Ian Lance Taylor

[5]Linkers part 9

   September 5, 2007 at 9:30 pm · Filed under [6]Programming

   Symbol Versions

   A shared library provides an API. Since executables are built with a
   specific set of header files and linked against a specific instance of
   the shared library, it also provides an ABI. It is desirable to be able
   to update the shared library independently of the executable. This
   permits fixing bugs in the shared library, and it also permits the
   shared library and the executable to be distributed separately.
   Sometimes an update to the shared library requires changing the API,
   and sometimes changing the API requires changing the ABI. When the ABI
   of a shared library changes, it is no longer possible to update the
   shared library without updating the executable. This is unfortunate.

   For example, consider the system C library and the stat function. When
   file systems were upgraded to support 64-bit file offsets, it became
   necessary to change the type of some of the fields in the stat struct.
   This is a change in the ABI of stat. New versions of the system library
   should provide a stat which returns 64-bit values. But old existing
   executables call stat expecting 32-bit values. This could be addressed
   by using complicated macros in the system header files. But there is a
   better way.

   The better way is symbol versions, which were introduced at Sun and
   extended by the GNU tools. Every shared library may define a set of
   symbol versions, and assign specific versions to each defined symbol.
   The versions and symbol assignments are specified in a version script
   passed to the program linker when creating the shared library.

   When an executable or shared library A is linked against another shared
   library B, and A refers to a symbol S defined in B with a specific
   version, the undefined dynamic symbol reference S in A is given the
   version of the symbol S in B. When the dynamic linker sees that A
   refers to a specific version of S, it will link it to that specific
   version in B. If B later introduces a new version of S, this will not
   affect A, as long as B continues to provide the old version of S.

   For example, when stat changes, the C library would provide two
   versions of stat, one with the old version (e.g., LIBC_1.0), and one
   with the new version (LIBC_2.0). The new version of stat would be
   marked as the default: the program linker would use it to satisfy
   references to stat in object files. Executables linked against the old
   version would require the LIBC_1.0 version of stat, and would therefore
   continue to work. Note that it is even possible for both versions of
   stat to be used in a single program, accessed from different shared
   libraries.

   As you can see, the version effectively is part of the name of the
   symbol. The biggest difference is that a shared library can define a
   specific version which is used to satisfy an unversioned reference.
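
   To make the binding rule concrete, here is a minimal sketch in C of a
   version-aware lookup. It is a toy model, not the dynamic linker's real
   data structures: struct versioned_def, toy_libc and resolve_symbol are
   hypothetical names, and impl_id merely stands in for a symbol address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One versioned definition exported by a toy shared library. */
struct versioned_def {
    const char *name;
    const char *version;
    bool is_default;   /* the version bound to unversioned references */
    int impl_id;       /* stands in for the symbol's address */
};

/* Hypothetical library exporting two versions of stat. */
static const struct versioned_def toy_libc[] = {
    { "stat", "LIBC_1.0", false, 1 },
    { "stat", "LIBC_2.0", true,  2 },
};

/* Resolve a reference. version == NULL means an unversioned reference,
   which binds to the default version. Returns NULL if nothing matches. */
const struct versioned_def *resolve_symbol(const struct versioned_def *defs,
                                           size_t count, const char *name,
                                           const char *version)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(defs[i].name, name) != 0)
            continue;
        if (version == NULL ? defs[i].is_default
                            : strcmp(defs[i].version, version) == 0)
            return &defs[i];
    }
    return NULL;
}
```

   An old executable asking for stat with version LIBC_1.0 gets the first
   entry, while a plain reference to stat gets the LIBC_2.0 default.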

   Versions can also be used in an object file (this is a GNU extension to
   the original Sun implementation). This is useful for specifying
   versions without requiring a version script. When a symbol name
   contains the @ character, the string before the @ is the name of the
   symbol, and the string after the @ is the version. If there are two
   consecutive @ characters, then this is the default version.
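
   The naming convention is easy to show in code. The following C sketch
   splits a symbol name according to the rule above; parse_versioned_symbol
   and struct parsed_symbol are names I made up for illustration, and the
   fixed 64-byte buffers are an arbitrary simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct parsed_symbol {
    char name[64];
    char version[64];   /* empty string when the symbol is unversioned */
    bool is_default;    /* true for the name@@version form */
};

/* Split "name", "name@version" or "name@@version".
   Returns false if a part does not fit in the fixed buffers. */
bool parse_versioned_symbol(const char *text, struct parsed_symbol *out)
{
    const char *at = strchr(text, '@');
    size_t name_len = at ? (size_t)(at - text) : strlen(text);

    if (name_len >= sizeof out->name)
        return false;
    memcpy(out->name, text, name_len);
    out->name[name_len] = '\0';
    out->version[0] = '\0';
    out->is_default = false;
    if (at == NULL)
        return true;            /* no version at all */

    at++;                       /* skip the first @ */
    if (*at == '@') {           /* a second @ marks the default version */
        out->is_default = true;
        at++;
    }
    if (strlen(at) >= sizeof out->version)
        return false;
    strcpy(out->version, at);
    return true;
}
```

   Given "stat@@LIBC_2.0" this yields the name stat, the version LIBC_2.0,
   and the default flag set; "stat@LIBC_1.0" yields the same name with a
   non-default version.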

   Relaxation

   Generally the program linker does not change the contents other than
   applying relocations. However, there are some optimizations which the
   program linker can perform at link time. One of them is relaxation.

   Relaxation is inherently processor specific. It consists of optimizing
   code sequences which can become smaller or more efficient when final
   addresses are known. The most common type of relaxation is for call
   instructions. A processor like the m68k supports different PC relative
   call instructions: one with a 16-bit offset, and one with a 32-bit
   offset. When calling a function which is within range of the 16-bit
   offset, it is more efficient to use the shorter instruction. The
   optimization of shrinking these instructions at link time is known as
   relaxation.

   Relaxation is applied based on relocation entries. The linker looks for
   relocations which may be relaxed, and checks whether they are in range.
   If they are, the linker applies the relaxation, probably shrinking the
   size of the contents. The relaxation can normally only be done when the
   linker recognizes the instruction being relocated. Applying a
   relaxation may in turn bring other relocations within range, so
   relaxation is typically done in a loop until there are no more
   opportunities.
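
   Here is a rough sketch of that loop in C. It is a model rather than
   code from any real linker: struct insn, the 6-byte and 4-byte call
   sizes, and measuring the offset from the start of the call instruction
   are all assumptions made to keep the example small.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SHORT_CALL = 4, LONG_CALL = 6 };   /* hypothetical encodings */

struct insn {
    int size;     /* current size in bytes */
    int target;   /* index of the call target, or -1 if not a call */
};

/* Address of an instruction: the sum of the sizes before it. */
static long insn_address(const struct insn *code, size_t index)
{
    long addr = 0;
    for (size_t i = 0; i < index; i++)
        addr += code[i].size;
    return addr;
}

static bool fits_in_16_bits(long offset)
{
    return offset >= INT16_MIN && offset <= INT16_MAX;
}

/* Shrink long calls whose target is in 16-bit range, repeating until a
   full pass changes nothing. Returns the number of calls relaxed. */
int relax_calls(struct insn *code, size_t count)
{
    int relaxed = 0;
    bool changed = true;

    while (changed) {
        changed = false;
        for (size_t i = 0; i < count; i++) {
            if (code[i].target < 0 || code[i].size != LONG_CALL)
                continue;
            long offset = insn_address(code, (size_t)code[i].target)
                        - insn_address(code, i);
            if (fits_in_16_bits(offset)) {
                code[i].size = SHORT_CALL;   /* later code moves down */
                relaxed++;
                changed = true;
            }
        }
    }
    return relaxed;
}
```

   The loop terminates because instructions only ever shrink. Each pass
   recomputes addresses from the current sizes, which is how shrinking
   one call can bring another call's target within range on the next
   pass.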

   When the linker relaxes a relocation in the middle of the contents, it
   may need to adjust any PC relative references which cross the point of
   the relaxation. Therefore, the assembler needs to generate relocation
   entries for all PC relative references. When not relaxing, these
   relocations may not be required, as a PC relative reference within a
   single contents will be valid wherever the contents winds up. When
   relaxing, though, the linker needs to look through all the other
   relocations that apply to the contents, and adjust the PC relative ones
   where appropriate. This adjustment will simply consist of recomputing
   the PC relative offset.
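
   The recomputation itself is simple arithmetic, sketched below in C.
   The function names are hypothetical, and the sketch assumes that
   neither the reference nor its target lies inside the deleted bytes.

```c
/* New address of a byte after `len` bytes are deleted starting at `cut`.
   Assumes addr is not inside the deleted range itself. */
static long address_after_delete(long addr, long cut, long len)
{
    return addr > cut ? addr - len : addr;
}

/* Recompute a PC relative offset (target minus reference) after the
   linker deletes `len` bytes at `cut` within the same contents. */
long adjusted_pcrel_offset(long ref_addr, long target_addr,
                           long cut, long len)
{
    return address_after_delete(target_addr, cut, len)
         - address_after_delete(ref_addr, cut, len);
}
```

   Only references which cross the deletion point change: if both ends
   are before it, or both are after it, the offset stays the same.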

   Of course it is also possible to apply relaxations which do not change
   the size of the contents. For example, on the MIPS the position
   independent calling sequence is normally to load the address of the
   function into the $25 register and then to do an indirect call through
   the register. When the target of the call is within the 18-bit range of
   the branch-and-call instruction, it is normally more efficient to use
   branch-and-call, since then the processor does not have to wait for the
   load of $25 to complete before starting the call. This relaxation
   changes the instruction sequence without changing the size.

   More tomorrow. I apologize for the haphazard arrangement of these
   linker notes. I'm just writing about ideas as I think of them, rather
   than being organized about that. If I do collect these notes into an
   essay, I'll try to make them more structured.

[9]Linkers part 8

   September 4, 2007 at 10:42 pm · Filed under [10]Programming

   ELF Segments

   Earlier I said that executable file formats were normally the same as
   object file formats. That is true for ELF, but with a twist. In ELF,
   object files are composed of sections: all the data in the file is
   accessed via the section table. Executables and shared libraries
   normally contain a section table, which is used by programs like nm.
   But the operating system and the dynamic linker do not use the section
   table. Instead, they use the segment table, which provides an
   alternative view of the file.

   All the contents of an ELF executable or shared library which are to be
   loaded into memory are contained within a segment (an object file does
   not have segments). A segment has a type, some flags, a file offset, a
   virtual address, a physical address, a file size, a memory size, and an
   alignment. The file offset points to a contiguous set of bytes which
   are the contents of the segment, the bytes to load into memory. When
   the operating system or the dynamic linker loads a file, it will do so
   by walking through the segments and loading them into memory (typically
   by using the mmap system call). All the information needed by the
   dynamic linker (the dynamic relocations, the dynamic symbol table,
   etc.) is accessed via information stored in special segments.

   Although an ELF executable or shared library does not, strictly
   speaking, require any sections, it normally does have them. The
   contents of a loadable section will fall entirely within a single
   segment.

   The program linker reads sections from the input object files. It sorts
   and concatenates them into sections in the output file. It maps all the
   loadable sections into segments in the output file. It lays out the
   section contents in the output file segments respecting alignment and
   access requirements, so that the segments may be mapped directly into
   memory. The sections are mapped to segments based on the access
   requirements: normally all the read-only sections are mapped to one
   segment and all the writable sections are mapped to another segment.
   The address of the latter segment will be set so that it starts on a
   separate page in memory, permitting mmap to set different permissions
   on the mapped pages.

   The segment flags are a bitmask which defines access requirements. The
   defined flags are PF_R, PF_W, and PF_X, which mean, respectively, that
   the contents must be made readable, writable, or executable.
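
   As an illustration, a loader that maps segments with mmap might
   translate the flags along these lines. This is a sketch which assumes
   a GNU/Linux system where elf.h and sys/mman.h are available;
   mmap_protection is a hypothetical helper, not a library function.

```c
#include <elf.h>        /* PF_R, PF_W, PF_X */
#include <stdint.h>
#include <sys/mman.h>   /* PROT_READ, PROT_WRITE, PROT_EXEC, PROT_NONE */

/* Translate ELF segment flags into mmap protection bits. */
int mmap_protection(uint32_t p_flags)
{
    int prot = PROT_NONE;

    if (p_flags & PF_R)
        prot |= PROT_READ;
    if (p_flags & PF_W)
        prot |= PROT_WRITE;
    if (p_flags & PF_X)
        prot |= PROT_EXEC;
    return prot;
}
```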

   The segment virtual address is the memory address at which the segment
   contents are loaded at runtime. The physical address is officially
   undefined, but is often used as the load address when using a system
   which does not use virtual memory. The file size is the size of the
   contents in the file. The memory size may be larger than the file size
   when the segment contains uninitialized data; the extra bytes will be
   filled with zeroes. The alignment of the segment is mainly informative,
   as the address is already specified.
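
   The relationship between file size and memory size can be sketched in
   C as follows. A real loader would use mmap rather than copying;
   load_segment is a hypothetical helper which only models the rule that
   the bytes beyond the file size are zero filled. It assumes the
   Elf64_Phdr structure from elf.h on a GNU/Linux system.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy one segment from a file image into dest, which must have room
   for p_memsz bytes. Returns false if the header is inconsistent. */
bool load_segment(const Elf64_Phdr *ph, const unsigned char *file,
                  size_t file_len, unsigned char *dest)
{
    if (ph->p_filesz > ph->p_memsz)
        return false;
    if (ph->p_offset > file_len || ph->p_filesz > file_len - ph->p_offset)
        return false;

    /* The first p_filesz bytes come from the file... */
    memcpy(dest, file + ph->p_offset, ph->p_filesz);
    /* ...and the rest of the memory image is filled with zeroes. */
    memset(dest + ph->p_filesz, 0, ph->p_memsz - ph->p_filesz);
    return true;
}
```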

   The ELF segment types are as follows:
     * PT_NULL: A null entry in the segment table, which is ignored.
     * PT_LOAD: A loadable entry in the segment table. The operating
       system or dynamic linker loads all segments of this type. All other
       segments with contents will have their contents contained
       completely within a PT_LOAD segment.
     * PT_DYNAMIC: The dynamic segment. This points to a series of dynamic
       tags which the dynamic linker uses to find the dynamic symbol
       table, dynamic relocations, and other information that it needs.
     * PT_INTERP: The interpreter segment. This appears in an executable.
       The operating system uses it to find the name of the dynamic linker
       to run for the executable. Normally all executables will have the
       same interpreter name, but on some operating systems different
       interpreters are used in different emulation modes.
     * PT_NOTE: A note segment. This contains system dependent note
       information which may be used by the operating system or the
       dynamic linker. On GNU/Linux systems shared libraries often have an
       ABI tag note which may be used to specify the minimum version of
       the kernel which is required for the shared library. The dynamic
       linker uses this when selecting among different shared libraries.
     * PT_SHLIB: This is not used as far as I know.
     * PT_PHDR: This indicates the address and size of the segment table.
       This is not too useful in practice as you have to have already
       found the segment table before you can find this segment.
     * PT_TLS: The TLS segment. This holds the initial values for TLS
       variables.
     * PT_GNU_EH_FRAME (0x6474e550): A GNU extension used to hold a sorted
       table of unwind information. This table is built by the GNU program
       linker. It is used by gcc's support library to quickly find the
       appropriate handler for an exception, without requiring exception
       frames to be registered when the program starts.
     * PT_GNU_STACK (0x6474e551): A GNU extension used to indicate whether
       the stack should be executable. This segment has no contents. The
       dynamic linker sets the permission of the stack in memory to the
       permissions of this segment.
     * PT_GNU_RELRO (0x6474e552): A GNU extension which tells the dynamic
       linker to set the given address and size to be read-only after
       applying dynamic relocations. This is used for const variables
       which require dynamic relocations.
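
   For reference, here is a small C sketch which names the segment types
   listed above. It relies on the PT_ constants from elf.h as shipped
   with glibc; segment_type_name is a hypothetical helper of the sort a
   readelf-style tool might contain.

```c
#include <elf.h>
#include <stdint.h>

/* Map a p_type value to the name used in the list above. */
const char *segment_type_name(uint32_t p_type)
{
    switch (p_type) {
    case PT_NULL:         return "PT_NULL";
    case PT_LOAD:         return "PT_LOAD";
    case PT_DYNAMIC:      return "PT_DYNAMIC";
    case PT_INTERP:       return "PT_INTERP";
    case PT_NOTE:         return "PT_NOTE";
    case PT_SHLIB:        return "PT_SHLIB";
    case PT_PHDR:         return "PT_PHDR";
    case PT_TLS:          return "PT_TLS";
    case PT_GNU_EH_FRAME: return "PT_GNU_EH_FRAME";
    case PT_GNU_STACK:    return "PT_GNU_STACK";
    case PT_GNU_RELRO:    return "PT_GNU_RELRO";
    default:              return "unknown";
    }
}
```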

   ELF Sections

   Now that we've done segments, let's take a quick look at the details of
   ELF sections. ELF sections are more complicated than segments, in that
   there are more types of sections. Every ELF object file, and most ELF
   executables and shared libraries, have a table of sections. The first
   entry in the table, section 0, is always a null section.

   ELF sections have several fields.
     * Name.
     * Type. I discuss section types below.
     * Flags. I discuss section flags below.
     * Address. This is the address of the section. In an object file this
       is normally zero. In an executable or shared library it is the
       virtual address. Since executables are normally accessed via
       segments, this is essentially documentation.
     * File offset. This is the offset of the contents within the file.
     * Size. The size of the section.
     * Link. Depending on the section type, this may hold the index of
       another section in the section table.
     * Info. The meaning of this field depends on the section type.
     * Address alignment. This is the required alignment of the section.
       The program linker uses this when laying out the section in memory.
     * Entry size. For sections which hold an array of data, this is the
       size of one data element.

   These are the types of ELF sections which the program linker may see.
     * SHT_NULL: A null section. Sections with this type may be ignored.
     * SHT_PROGBITS: A section holding bits of the program. This is an
       ordinary section with contents.
     * SHT_SYMTAB: The symbol table. This section actually holds the
       symbol table itself. The section contents are an array of ELF
       symbol structures.
     * SHT_STRTAB: A string table. This type of section holds
       null-terminated strings. Sections of this type are used for the
       names of the symbols and the names of the sections themselves.
     * SHT_RELA: A relocation table. The link field holds the index of the
       section to which these relocations apply. These relocations include
       addends.
     * SHT_HASH: A hash table used by the dynamic linker to speed symbol
       lookup.
     * SHT_DYNAMIC: The dynamic tags used by the dynamic linker. Normally
       the PT_DYNAMIC segment and the SHT_DYNAMIC section will point to
       the same contents.
     * SHT_NOTE: A note section. This is used in system dependent ways. A
       loadable SHT_NOTE section will become a PT_NOTE segment.
     * SHT_NOBITS: A section which takes up memory space but has no
       associated contents. This is used for zero-initialized data.
     * SHT_REL: A relocation table, like SHT_RELA but the relocations have
       no addends.
     * SHT_SHLIB: This is not used as far as I know.
     * SHT_DYNSYM: The dynamic symbol table. Normally the DT_SYMTAB
       dynamic tag will point to the same contents as this section (I
       haven't discussed dynamic tags yet, though).
     * SHT_INIT_ARRAY: This section holds a table of function addresses
       which should each be called at program startup time, or, for a
       shared library, when the library is opened by dlopen.
     * SHT_FINI_ARRAY: Like SHT_INIT_ARRAY, but called at program exit
       time or dlclose time.
     * SHT_PREINIT_ARRAY: Like SHT_INIT_ARRAY, but called before any
       shared libraries are initialized. Normally shared library
       initializers are run before the executable initializers. This
       section type may only be linked into an executable, not into a
       shared library.
     * SHT_GROUP: This is used to group related sections together, so that
       the program linker may discard them as a unit when appropriate.
       Sections of this type may only appear in object files. The contents
       of this type of section are a flag word followed by a series of
       section indices.
     * SHT_SYMTAB_SHNDX: ELF symbol table entries only provide a 16-bit
       field for the section index. For a file with more than 65536
       sections, a section of this type is created. It holds one 32-bit
       word for each symbol. If a symbol's section index is SHN_XINDEX,
       the real section index may be found by looking in the
       SHT_SYMTAB_SHNDX section.
     * SHT_GNU_LIBLIST (0x6ffffff7): A GNU extension used by the prelinker
       to hold a list of libraries found by the prelinker.
     * SHT_GNU_verdef (0x6ffffffd): A Sun and GNU extension used to hold
       version definitions (I'll talk about symbol versions at some
       point).
     * SHT_GNU_verneed (0x6ffffffe): A Sun and GNU extension used to hold
       versions required from other shared libraries.
     * SHT_GNU_versym (0x6fffffff): A Sun and GNU extension used to hold
       the versions for each symbol.
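
   The SHT_SYMTAB_SHNDX rule is the least obvious item in that list, so
   here is a short C sketch of it. It assumes the Elf64_Sym and
   Elf64_Word types and the SHN_XINDEX constant from elf.h on a GNU/Linux
   system; symbol_section_index is a hypothetical helper, and
   shndx_table stands for the contents of the SHT_SYMTAB_SHNDX section
   (NULL if the file has none).

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Return the real section index of symbol number `symbol`. */
uint32_t symbol_section_index(const Elf64_Sym *symtab,
                              const Elf64_Word *shndx_table,
                              size_t symbol)
{
    /* The 16-bit field holds an escape value: the real index is in
       the parallel table, one 32-bit word per symbol. */
    if (symtab[symbol].st_shndx == SHN_XINDEX && shndx_table != NULL)
        return shndx_table[symbol];
    return symtab[symbol].st_shndx;
}
```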

   These are the types of section flags.
     * SHF_WRITE: Section contains writable data.
     * SHF_ALLOC: Section contains data which should be part of the loaded
       program image. For example, this would normally be set for a
       SHT_PROGBITS section and not set for a SHT_SYMTAB section.
     * SHF_EXECINSTR: Section contains executable instructions.
     * SHF_MERGE: Section contains constants which the program linker may
       merge together to save space. The compiler can use this type of
       section for read-only data whose address is unimportant.
     * SHF_STRINGS: In conjunction with SHF_MERGE, this means that the
       section holds null terminated string constants which may be merged.
     * SHF_INFO_LINK: This flag indicates that the info field in the
       section holds a section index.
     * SHF_LINK_ORDER: This flag tells the program linker that when it
       combines sections, this section must appear in the same relative
       order as the section in the link field. This can be used to ensure
       that address tables are built in the expected order.
     * SHF_OS_NONCONFORMING: If the program linker sees a section with
       this flag, and does not understand the type or all other flags,
       then it must issue an error.
     * SHF_GROUP: This section appears in a group (see SHT_GROUP, above).
     * SHF_TLS: This section holds TLS data.
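
   The section flags are what the program linker consults when it sorts
   sections into segments by access requirement. Below is a deliberately
   simplified C sketch of that decision. segment_flags_for_section is a
   hypothetical helper, and the rule that every loaded section is
   readable is my simplifying assumption; real linkers take more into
   account.

```c
#include <elf.h>
#include <stdint.h>

/* Derive segment permission flags from section flags.
   Returns 0 for sections which are not part of the loaded image. */
uint32_t segment_flags_for_section(uint64_t sh_flags)
{
    uint32_t flags;

    if (!(sh_flags & SHF_ALLOC))
        return 0;               /* e.g. a SHT_SYMTAB section */
    flags = PF_R;               /* assumption: loaded means readable */
    if (sh_flags & SHF_WRITE)
        flags |= PF_W;
    if (sh_flags & SHF_EXECINSTR)
        flags |= PF_X;
    return flags;
}
```

   Sections which yield the same result are candidates for sharing a
   segment, which is how the read-only sections end up in one segment
   and the writable sections in another.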

[13]Linkers part 7

   September 3, 2007 at 9:52 pm · Filed under [14]Programming

   As we've seen, what linkers do is basically quite simple, but the
   details can get complicated. The complexity is because smart
   programmers can see small optimizations to speed up their programs a
   little bit, and sometimes the only place those optimizations can be
   implemented is the linker. Each such optimization makes the linker a
   little more complicated. At the same time, of course, the linker has to
   run as fast as possible, since nobody wants to sit around waiting for
   it to finish. Today I'll talk about a classic small optimization
   implemented by the linker.

   Thread Local Storage

   I'll assume you know what a thread is. It is often useful to have a
   global variable which can take on a different value in each thread (if
   you don't see why this is useful, just trust me on this). That is, the
   variable is global to the program, but the specific value is local to
   the thread. If thread A sets the thread local variable to 1, and thread
   B then sets it to 2, then code running in thread A will continue to see
   the value 1 for the variable while code running in thread B sees the
   value 2. In POSIX threads this type of variable can be created via
   pthread_key_create and accessed via pthread_getspecific and
   pthread_setspecific.

   Those functions work well enough, but making a function call for each
   access is awkward and inconvenient. It would be more useful if you
   could just declare a regular global variable and mark it as thread
   local. That is the idea of Thread Local Storage (TLS), which I believe
   was invented at Sun. On a system which supports TLS, any global (or
   static) variable may be annotated with __thread. The variable is then
   thread local.
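
   Here is a small C sketch of the behaviour described above, using
   __thread together with POSIX threads. The names counter, worker and
   value_seen_by_caller are made up for the example, and it assumes a
   compiler which accepts __thread, such as gcc.

```c
#include <pthread.h>
#include <stddef.h>

static __thread int counter;   /* each thread gets its own copy */

static void *worker(void *arg)
{
    counter = 2;               /* only touches this thread's copy */
    return arg;
}

/* Set the variable to 1 in the calling thread, let a second thread set
   it to 2, and report what the calling thread sees afterwards.
   Returns -1 if the thread could not be created. */
int value_seen_by_caller(void)
{
    pthread_t thread;

    counter = 1;
    if (pthread_create(&thread, NULL, worker, NULL) != 0)
        return -1;
    pthread_join(thread, NULL);
    return counter;
}
```

   The calling thread still sees 1 after the worker has stored 2 in its
   own copy. Depending on the C library version, building this may
   require passing -pthread to the compiler.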

   Clearly this requires support from the compiler. It also requires
   support from the program linker and the dynamic linker. For maximum
   efficiency (and why do this if you aren't going to get maximum
   efficiency?) some kernel support is also needed. The design of TLS on
   ELF systems fully supports shared libraries, including having multiple
   shared libraries, and the executable itself, use the same name to refer
   to a single TLS variable. TLS variables can be initialized. Programs
   can take the address of a TLS variable, and pass the pointers between
   threads, so the address of a TLS variable is a dynamic value and must
   be globally unique.

   How is this all implemented? First step: define different storage
   models for TLS variables.
     * Global Dynamic: Fully general access to TLS variables from an
       executable or a shared object.
     * Local Dynamic: Permits access to a variable which is bound locally
       within the executable or shared object from which it is referenced.
       This is true for all static TLS variables, for example. It is also
       true for protected symbols, which I described back in part 5.
     * Initial Executable: Permits access to a variable which is known to
       be part of the TLS image of the executable. This is true for all
       TLS variables defined in the executable itself, and for all TLS
       variables in shared libraries explicitly linked with the
       executable. This is not true for accesses from a shared library,
       nor for accesses to TLS variables defined in shared libraries
       opened by dlopen.
     * Local Executable: Permits access to TLS variables defined in the
       executable itself.

   These storage models are defined in decreasing order of flexibility.
   Now, for efficiency and simplicity, a compiler which supports TLS will
   permit the developer to specify the appropriate TLS model to use (with
   gcc, this is done with the -ftls-model option, although the Global
   Dynamic and Local Dynamic models also require using -fpic). So, when
   compiling code which will be in an executable and never be in a shared
   library, the developer may choose to set the TLS storage model to
   Initial Executable.

   Of course, in practice, developers often do not know where code will be
   used. And developers may not be aware of the intricacies of TLS models.
   The program linker, on the other hand, knows whether it is creating an
   executable or a shared library, and it knows whether the TLS variable
   is defined locally. So the program linker gets the job of automatically
   optimizing references to TLS variables when possible. These references
   take the form of relocations, and the linker optimizes the references
   by changing the code in various ways.

   The program linker is also responsible for gathering all TLS variables
   together into a single TLS segment (I'll talk more about segments
   later, for now think of them as a section). The dynamic linker has to
   group together the TLS segments of the executable and all included
   shared libraries, resolve the dynamic TLS relocations, and build TLS
   segments dynamically when dlopen is used. The kernel has to make it
   possible for access to the TLS segments to be efficient.

   That was all pretty general. Let's do an example, again for i386 ELF.
   There are three different implementations of i386 ELF TLS; I'm going to
   look at the gnu implementation. Consider this trivial code:

     __thread int i;
     int foo() { return i; }

   In global dynamic mode, this generates i386 assembler code like this:

     leal i@TLSGD(,%ebx,1), %eax
     call ___tls_get_addr@PLT
     movl (%eax), %eax

   Recall from part 4 that %ebx holds the address of the GOT table. The
   first instruction will have a R_386_TLS_GD relocation for the variable
   i; the relocation will apply to the offset of the leal instruction.
   When the program linker sees this relocation, it will create two
   consecutive entries in the GOT table for the TLS variable i. The first
   one will get a R_386_TLS_DTPMOD32 dynamic relocation, and the second
   will get a R_386_TLS_DTPOFF32 dynamic relocation. The dynamic linker
   will set the DTPMOD32 GOT entry to hold the module ID of the object
   which defines the variable. The module ID is an index within the
   dynamic linker's tables which identifies the executable or a specific
   shared library. The dynamic linker will set the DTPOFF32 GOT entry to
   the offset within the TLS segment for that module. The __tls_get_addr
   function will use those values to compute the address (this function
   also takes care of lazy allocation of TLS variables, which is a further
   optimization specific to the dynamic linker). Note that __tls_get_addr
   is actually implemented by the dynamic linker itself; it follows that
   global dynamic TLS variables are not supported (and not necessary) in
   statically linked executables.
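
   To show how a module ID and an offset turn into an address, here is a
   toy model in C. It is emphatically not the dynamic linker's code:
   struct toy_thread, struct toy_tls_index, toy_block_size and
   toy_tls_get_addr are invented for the sketch, the block sizes are
   made up, and blocks are zero filled rather than copied from the
   initial TLS image.

```c
#include <stddef.h>
#include <stdlib.h>

enum { TOY_MAX_MODULES = 4 };

/* The two GOT words described above: module ID and offset. */
struct toy_tls_index {
    size_t module;
    size_t offset;
};

/* Per-thread table of TLS blocks, one slot per module. */
struct toy_thread {
    char *blocks[TOY_MAX_MODULES];
};

/* Size of each module's TLS segment (made-up values). */
static const size_t toy_block_size[TOY_MAX_MODULES] = { 0, 16, 32, 8 };

/* Return the calling thread's address for the variable named by ti,
   or NULL if the index is out of range or allocation fails. */
void *toy_tls_get_addr(struct toy_thread *thread,
                       const struct toy_tls_index *ti)
{
    if (ti->module >= TOY_MAX_MODULES
        || ti->offset >= toy_block_size[ti->module])
        return NULL;
    if (thread->blocks[ti->module] == NULL) {
        /* Lazy allocation: create the block on first access. */
        thread->blocks[ti->module] = calloc(1, toy_block_size[ti->module]);
        if (thread->blocks[ti->module] == NULL)
            return NULL;
    }
    return thread->blocks[ti->module] + ti->offset;
}
```

   The same module ID and offset give a different address in each
   thread, and a module's block is only allocated in threads which
   actually touch one of its variables.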

   At this point you are probably wondering what is so inefficient
   about pthread_getspecific. The real advantage of TLS shows when you see
   what the program linker can do. The leal; call sequence shown above is
   canonical: the compiler will always generate the same sequence to
   access a TLS variable in global dynamic mode. The program linker takes
   advantage of that fact. If the program linker sees that the code shown
   above is going into an executable, it knows that the access does not
   have to be treated as global dynamic; it can be treated as initial
   executable. The program linker will actually rewrite the code to look
   like this:

     movl %gs:0, %eax
     subl $i@GOTTPOFF(%ebx), %eax

   Here we see that the TLS system has coopted the %gs segment register,
   with cooperation from the operating system, to point to the TLS segment
   of the executable. For each processor which supports TLS, some such
   efficiency hack is made. Since the program linker is building the
   executable, it builds the TLS segment, and knows the offset of i in the
   segment. The GOTTPOFF is not a real relocation; it is created and then
   resolved within the program linker. It is, of course, the offset from
   the GOT table to the address of i in the TLS segment. The movl (%eax),
   %eax from the original sequence remains to actually load the value of
   the variable.

1071

1072

   Actually, that is what would happen if i were not defined in the

1073

   executable itself. In the example I showed, i is defined in the

1074

   executable, so the program linker can actually go from a global dynamic

1075

   access all the way to a local executable access. That looks like this:

1076

1077

     movl %gs:0,%eax

1078

     subl $i@TPOFF,%eax

1079

1080

   Here i@TPOFF is simply the known offset of i within the TLS segment.
   I'm not going to go into why this uses subl rather than addl; suffice
   it to say that this is another efficiency hack in the dynamic linker.

   If you followed all that, you'll see that when an executable accesses a
   TLS variable which is defined in that executable, it requires two
   instructions to compute the address, typically followed by another one
   to actually load or store the value. That is significantly more
   efficient than calling pthread_getspecific. Admittedly, when a shared
   library accesses a TLS variable, the result is not much better than
   pthread_getspecific, but it shouldn't be any worse, either. And the
   code using __thread is much easier to write and to read.
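
   For comparison, here is a minimal sketch of the source side of a TLS
   variable, assuming GCC or Clang and the __thread keyword. The function
   names bump_counter and counter_address are made up for illustration.
   Which access sequence the compiler and linker actually pick depends on
   how the code is built, so treat this as an illustration of the syntax
   rather than of any particular code sequence.

```c
/* One copy of this variable exists per thread. __thread is a GCC/Clang
   extension; C11 spells the same idea _Thread_local. */
static __thread int i;

/* Increment the calling thread's copy of i and return the new value. */
int bump_counter(void)
{
    i = i + 1;
    return i;
}

/* Return the address of the calling thread's copy of i. Each thread
   that calls this gets a different address. */
int *counter_address(void)
{
    return &i;
}
```

   No key has to be created and no lookup call appears in the source;
   the variable is used like any other static int.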

   That was a real whirlwind tour. There are three separate but related
   TLS implementations on i386 (known as sun, gnu, and gnu2), and 23
   different relocation types are defined. I'm certainly not going to try
   to describe all the details; I don't know them all in any case. They
   all exist in the name of efficient access to the TLS variables for a
   given storage model.

   Is TLS worth the additional complexity in the program linker and the
   dynamic linker? Since those tools are used for every program, and since
   the C standard global variable errno in particular can be implemented
   using TLS, the answer is most likely yes.


[17]Linkers part 6

   August 29, 2007 at 8:52 pm · Filed under [18]Programming

   So many things to talk about. Let's go back and cover relocations in
   some more detail, with some examples.

   Relocations

   As I said back in part 2, a relocation is a computation to perform on
   the contents. And as I said yesterday, a relocation can also direct the
   linker to take other actions, like creating a PLT or GOT entry. Let's
   take a closer look at the computation.

   In general a relocation has a type, a symbol, an offset into the
   contents, and an addend.

   From the linker's point of view, the contents are simply an
   uninterpreted series of bytes. A relocation changes those bytes as
   necessary to produce the correct final executable. For example,
   consider the C code g = 0; where g is a global variable. On the i386,
   the compiler will turn this into an assembly language instruction,
   which will most likely be movl $0, g (for position dependent
   code; position independent code would load the address of g from the
   GOT). Now, the g in the C code is a global variable, and we all more or
   less know what that means. The g in the assembly code is not that
   variable. It is a symbol which holds the address of that variable.

   The assembler does not know the address of the global variable g, which
   is another way of saying that the assembler does not know the value of
   the symbol g. It is the linker that is going to pick that address. So
   the assembler has to tell the linker that it needs to use the address
   of g in this instruction. The way the assembler does this is to create
   a relocation. We don't use a separate relocation type for each
   instruction; instead, each processor will have a natural set of
   relocation types which are appropriate for the machine architecture.
   Each type of relocation expresses a specific computation.

   In the i386 case, the assembler will generate these bytes:

     c7 05 00 00 00 00 00 00 00 00

   The c7 05 are the instruction (movl constant to address). The first
   four 00 bytes are the address. The second four 00 bytes are the 32-bit
   constant 0. The assembler tells the linker to put the value of the
   symbol g into the address bytes by generating (in this case) a R_386_32
   relocation. For this relocation the symbol will be g, the offset will
   be to the four address bytes which follow c7 05, the type will be
   R_386_32, and the addend will be 0 (in the case of the i386 the addend
   is stored in the contents rather than in the relocation itself, but
   this is a detail). The type R_386_32 expresses a specific computation,
   which is: put the 32-bit sum of the value of the symbol and the addend
   into the offset. Since for the i386 the addend is stored in the
   contents, this can also be expressed as: add the value of the symbol to
   the 32-bit field at the offset. When the linker performs this
   computation, the address in the instruction will be the address of the
   global variable g. Regardless of the details, the important point to
   note is that the relocation adjusts the contents by applying a specific
   computation selected by the type.

   An example of a simple case which does use an addend would be

     char a[10]; // A global array.
     char* p = &a[1]; // In a function.

   The assignment to p will wind up requiring a relocation for the symbol
   a. Here the addend will be 1, so that the resulting instruction
   references a + 1 rather than a + 0.
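
   The computation described above can be sketched in C. This is only a
   model of what a linker does for an R_386_32 style relocation whose
   addend lives in the contents; the struct layout, the RELOC_ABS32
   constant, and the function names are assumptions made for the example,
   not the data structures of any real linker.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for R_386_32; the numeric value is arbitrary here. */
enum { RELOC_ABS32 = 1 };

/* A simplified relocation: type, symbol (already resolved to a value),
   and offset into the contents. The addend is stored in the contents. */
struct reloc {
    int type;
    uint32_t symbol_value;
    size_t offset;
};

/* Read a 32-bit little-endian field from the contents. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian field into the contents. */
static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Add the symbol value to the 32-bit field at the offset.
   Returns 0 on success, -1 for an unknown type or a bad offset. */
int apply_reloc(uint8_t *contents, size_t size, const struct reloc *r)
{
    if (r->type != RELOC_ABS32 || size < 4 || r->offset > size - 4)
        return -1;
    uint32_t addend = read_le32(contents + r->offset);
    write_le32(contents + r->offset, r->symbol_value + addend);
    return 0;
}
```

   With an in-place addend of 0 the field ends up holding the value of
   the symbol; with an in-place addend of 1, as in the a + 1 case, it
   ends up holding the value of the symbol plus one.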

   To point out how relocations are processor dependent, let's consider g
   = 0; on a RISC processor: the PowerPC (in 32-bit mode). In this case,
   multiple assembly language instructions are required:

     li 1,0 // Set register 1 to 0
     lis 9,g@ha // Load high-adjusted part of g into register 9
     stw 1,g@l(9) // Store register 1 to address in register 9 plus
                  // low adjusted part of g

   The lis instruction loads a value into the upper 16 bits of register 9,
   setting the lower 16 bits to zero. The stw instruction adds a signed 16
   bit value to register 9 to form an address, and then stores the value
   of register 1 at that address. The @ha part of the operand directs the
   assembler to generate a R_PPC_ADDR16_HA reloc. The @l produces a
   R_PPC_ADDR16_LO reloc. The goal of these relocs is to compute the value
   of the symbol g and use it as the store address.

   That is enough information to determine the computations performed by
   these relocs. The R_PPC_ADDR16_HA reloc computes (SYMBOL >> 16) +
   ((SYMBOL & 0x8000) ? 1 : 0). The R_PPC_ADDR16_LO computes SYMBOL &
   0xffff. The extra computation for R_PPC_ADDR16_HA is because the stw
   instruction adds the signed 16-bit value, which means that if the low
   16 bits appear negative we have to adjust the high 16 bits
   accordingly. The offsets of the relocations are such that the 16-bit
   resulting values are stored into the appropriate parts of the machine
   instructions.
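
   The two formulas are easy to check in C. The sketch below computes
   both halves and then models what the lis and stw pair does with them;
   the function names are invented for this example, and the model only
   covers the address arithmetic, not the instruction encoding.

```c
#include <stdint.h>

/* R_PPC_ADDR16_HA: the high 16 bits, plus one if the low half will be
   treated as negative when it is sign-extended. */
uint16_t ppc_addr16_ha(uint32_t symbol)
{
    return (uint16_t)((symbol >> 16) + ((symbol & 0x8000u) ? 1u : 0u));
}

/* R_PPC_ADDR16_LO: the low 16 bits. */
uint16_t ppc_addr16_lo(uint32_t symbol)
{
    return (uint16_t)(symbol & 0xffffu);
}

/* Model of the address formed by lis followed by stw: the high half
   shifted into place, plus the sign-extended low half. */
uint32_t ppc_effective_address(uint16_t ha, uint16_t lo)
{
    uint32_t high = (uint32_t)ha << 16;
    int32_t low = (lo & 0x8000u) ? (int32_t)lo - 0x10000 : (int32_t)lo;
    return high + (uint32_t)low;
}
```

   For a symbol value such as 0x12348000 the low half is 0x8000, which
   stw treats as -0x8000, so the high half has to be 0x1235 rather than
   0x1234 for the sum to come out right.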

   The specific examples of relocations I've discussed here are ELF
   specific, but the same sorts of relocations occur for any object file
   format.

   The examples I've shown are for relocations which appear in an object
   file. As discussed in part 4, these types of relocations may also
   appear in a shared library, if they are copied there by the program
   linker. In ELF, there are also specific relocation types which never
   appear in object files but only appear in shared libraries or
   executables. These are the JMP_SLOT, GLOB_DAT, and RELATIVE relocations
   discussed earlier. Another type of relocation which only appears in an
   executable is a COPY relocation, which I will discuss later.

   Position Dependent Shared Libraries

   I realized that in part 4 I forgot to say one of the important reasons
   that ELF shared libraries use PLT and GOT tables. The idea of a shared
   library is to permit mapping the same shared library into different
   processes. This only works at maximum efficiency if the shared library
   code looks the same in each process. If it does not look the same, then
   each process will need its own private copy, and the savings in
   physical memory and sharing will be lost.

   As discussed in part 4, when the dynamic linker loads a shared library
   which contains position dependent code, it must apply a set of dynamic
   relocations. Those relocations will change the code in the shared
   library, and it will no longer be sharable.

   The advantage of the PLT and GOT is that they move the relocations
   elsewhere, to the PLT and GOT tables themselves. Those tables can then
   be put into a read-write part of the shared library. This part of the
   shared library will be much smaller than the code. The PLT and GOT
   tables will be different in each process using the shared library, but
   the code will be the same.

   I'll be taking a vacation for the long weekend. My next post will most
   likely be on Tuesday.


[21]Linkers part 5

   August 28, 2007 at 11:24 pm · Filed under [22]Programming

   Shared Libraries Redux

   Yesterday I talked about how shared libraries work. I realized that I
   should say something about how linkers implement shared libraries. This
   discussion will again be ELF specific.

   When the program linker puts position dependent code into a shared
   library, it has to copy more of the relocations from the object file
   into the shared library. They will become dynamic relocations computed
   by the dynamic linker at runtime. Some relocations do not have to be
   copied; for example, a PC relative relocation to a symbol which is
   local to the shared library can be fully resolved by the program
   linker, and does not require a dynamic reloc. However, note that a PC
   relative relocation to a global symbol does require a dynamic
   relocation; otherwise, the main executable would not be able to
   override the symbol. Some relocations have to exist in the shared
   library, but do not need to be actual copies of the relocations in the
   object file; for example, a relocation which computes the absolute
   address of a symbol which is local to the shared library can often be
   replaced with a RELATIVE reloc, which simply directs the dynamic linker
   to add the difference between the shared library's load address and its
   base address. The advantage of using a RELATIVE reloc is that the
   dynamic linker can compute it quickly at runtime, because it does not
   require determining the value of a symbol.
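
   As a rough sketch of why a RELATIVE reloc is cheap, the loop below
   models a dynamic linker applying a batch of them. The image is
   simplified to an array of 32-bit words and each reloc to a word index;
   those simplifications and the function name are assumptions made for
   the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Apply RELATIVE style relocs: add (load address - base address) to
   each word named by a reloc. No symbol lookup is involved. */
void apply_relative_relocs(uint32_t *image,
                           const size_t *word_index, size_t count,
                           uint32_t load_address, uint32_t base_address)
{
    uint32_t delta = load_address - base_address;
    for (size_t k = 0; k < count; k++)
        image[word_index[k]] += delta;
}
```

   Every reloc costs one addition, and words which are not named by a
   reloc are left alone.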

   For position independent code, the program linker has a harder job. The
   compiler and assembler will cooperate to generate special relocs for
   position independent code. Although details differ among processors,
   there will typically be a PLT reloc and a GOT reloc. These relocs will
   direct the program linker to add an entry to the PLT or the GOT, as
   well as performing some computation. For example, on the i386 a
   function call in position independent code will generate a R_386_PLT32
   reloc. This reloc will refer to a symbol as usual. It will direct the
   program linker to add a PLT entry for that symbol, if one does not
   already exist. The computation of the reloc is then a PC-relative
   reference to the PLT entry. (The 32 in the name of the reloc refers to
   the size of the reference, which is 32 bits.) Yesterday I described how
   on the i386 every PLT entry also has a corresponding GOT entry, so the
   R_386_PLT32 reloc actually directs the program linker to create both a
   PLT entry and a GOT entry.

   When the program linker creates an entry in the PLT or the GOT, it must
   also generate a dynamic reloc to tell the dynamic linker about the
   entry. This will typically be a JMP_SLOT or GLOB_DAT relocation.

   This all means that the program linker must keep track of the PLT entry
   and the GOT entry for each symbol. Initially, of course, there will be
   no such entries. When the linker sees a PLT or GOT reloc, it must check
   whether the symbol referenced by the reloc already has a PLT or GOT
   entry, and create one if it does not. Note that it is possible for a
   single symbol to have both a PLT entry and a GOT entry; this will
   happen for position independent code which both calls a function and
   also takes its address.
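
   The bookkeeping can be sketched as a pair of optional indexes per
   symbol. Everything here is hypothetical (the struct names, the
   NO_ENTRY marker, the functions); a real linker also has to emit the
   matching dynamic relocs, and on the i386 a PLT entry brings its own
   GOT slot with it, neither of which this sketch models.

```c
#include <stddef.h>

#define NO_ENTRY ((size_t)-1)

/* Per-symbol record of which table entries have been created. */
struct symbol_entries {
    size_t plt_index;   /* NO_ENTRY until a PLT reloc is seen */
    size_t got_index;   /* NO_ENTRY until a GOT reloc is seen */
};

/* Running sizes of the two tables being built. */
struct linker_tables {
    size_t plt_count;
    size_t got_count;
};

/* Start with no entries for the symbol. */
void symbol_entries_init(struct symbol_entries *s)
{
    s->plt_index = NO_ENTRY;
    s->got_index = NO_ENTRY;
}

/* Called for a PLT reloc: create the entry only the first time. */
size_t ensure_plt_entry(struct linker_tables *t, struct symbol_entries *s)
{
    if (s->plt_index == NO_ENTRY)
        s->plt_index = t->plt_count++;
    return s->plt_index;
}

/* Called for a GOT reloc: create the entry only the first time. */
size_t ensure_got_entry(struct linker_tables *t, struct symbol_entries *s)
{
    if (s->got_index == NO_ENTRY)
        s->got_index = t->got_count++;
    return s->got_index;
}
```

   A function which is both called and has its address taken ends up
   with both indexes set, which is the case noted above.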

   The dynamic linker's job for the PLT and GOT tables is to simply
   compute the JMP_SLOT and GLOB_DAT relocs at runtime. The main
   complexity here is the lazy evaluation of PLT entries which I described
   yesterday.

   The fact that C permits taking the address of a function introduces an
   interesting wrinkle. In C you are permitted to take the address of a
   function, and you are permitted to compare that address to another
   function address. The problem is that if you take the address of a
   function in a shared library, the natural result would be to get the
   address of the PLT entry. After all, that is the address to which a
   call to the function will jump. However, each shared library has its
   own PLT, and thus the address of a particular function would differ in
   each shared library. That means that function pointers generated in
   different shared libraries may compare as different when they should be
   the same. This is not a purely hypothetical problem; when I did a port
   which got it wrong, before I fixed the bug I saw failures in the Tcl
   shared library when it compared function pointers.

   The fix for this bug on most processors is a special marking for a
   symbol which has a PLT entry but is not defined. Typically the symbol
   will be marked as undefined, but with a non-zero value: the value will
   be set to the address of the PLT entry. When the dynamic linker is
   searching for the value of a symbol to use for a reloc other than a
   JMP_SLOT reloc, if it finds such a specially marked symbol, it will use
   the non-zero value. This will ensure that all references to the symbol
   which are not function calls will use the same value. To make this
   work, the compiler and assembler must make sure that any reference to a
   function which does not involve calling it will not carry a standard
   PLT reloc. This special handling of function addresses needs to be
   implemented in both the program linker and the dynamic linker.
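
   The lookup rule just described can be written down as a small
   predicate. This is a simplified model, not the code of any real
   dynamic linker: the struct, its fields, and the function name are
   assumptions for the example, and real symbol lookup involves a good
   deal more (search order, versions, and so on).

```c
#include <stdint.h>

/* What one object says about a symbol. */
struct dyn_symbol {
    int defined;      /* nonzero if this object defines the symbol */
    uint32_t value;   /* address if defined; for the special marking,
                         the address of the PLT entry */
};

/* Decide whether this symbol satisfies a lookup. Returns 1 and stores
   the address in *out if so, or 0 to keep searching other objects. */
int symbol_satisfies_lookup(const struct dyn_symbol *sym,
                            int is_jmp_slot_reloc, uint32_t *out)
{
    if (sym->defined) {
        *out = sym->value;
        return 1;
    }
    /* Undefined but with a non-zero value: use the PLT address, except
       when resolving a JMP_SLOT reloc, which needs the real function. */
    if (!is_jmp_slot_reloc && sym->value != 0) {
        *out = sym->value;
        return 1;
    }
    return 0;
}
```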

   ELF Symbols

   OK, enough about shared libraries. Let's go over ELF symbols in more
   detail. I'm not going to lay out the exact data structures; go to the
   ELF ABI for that. I'm going to talk about the different fields and what
   they mean. Many of the different types of ELF symbols are also used by
   other object file formats, but I won't cover that.

   An entry in an ELF symbol table has eight pieces of information: a
   name, a value, a size, a section, a binding, a type, a visibility, and
   undefined additional information (currently there are six undefined
   bits, though more may be added). An ELF symbol defined in a shared
   object may also have an associated version name.

   The name is obvious.

   For an ordinary defined symbol, the section is some section in the file
   (specifically, the symbol table entry holds an index into the section
   table). For an object file the value is relative to the start of the
   section. For an executable the value is an absolute address. For a
   shared library the value is relative to the base address.

   For an undefined reference symbol, the section index is the special
   value SHN_UNDEF, which has the value 0. A section index of SHN_ABS
   (0xfff1) indicates that the value of the symbol is an absolute value,
   not relative to any section.

   A section index of SHN_COMMON (0xfff2) indicates a common symbol.
   Common symbols were invented to handle Fortran common blocks, and they
   are also often used for uninitialized global variables in C. A common
   symbol has unusual semantics. Common symbols have a value of zero, but
   set the size field to the desired size. If one object file has a common
   symbol and another has a definition, the common symbol is treated as an
   undefined reference. If there is no definition for a common symbol, the
   program linker acts as though it saw a definition initialized to zero
   of the appropriate size. Two object files may have common symbols of
   different sizes, in which case the program linker will use the largest
   size. Implementing common symbol semantics across shared libraries is a
   touchy subject, somewhat helped by the recent introduction of a type
   for common symbols as well as a special section index (see the
   discussion of symbol types below).
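
   The merging rules for common symbols can be sketched as a small state
   update, applied once per object file which mentions the name. The enum,
   struct, and function names below are invented for the example, and the
   sketch deliberately ignores multiple definition errors and shared
   libraries.

```c
#include <stddef.h>

enum sym_kind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

/* What the linker currently believes about one symbol name. */
struct sym_state {
    enum sym_kind kind;
    size_t size;
};

/* Fold one more sighting of the symbol into the accumulated state. */
void merge_symbol(struct sym_state *acc, const struct sym_state *seen)
{
    if (acc->kind == SYM_DEFINED)
        return;                 /* a later common acts as a reference */
    if (seen->kind == SYM_DEFINED) {
        *acc = *seen;           /* a definition wins over a common */
        return;
    }
    if (seen->kind == SYM_COMMON &&
        (acc->kind != SYM_COMMON || seen->size > acc->size)) {
        acc->kind = SYM_COMMON; /* keep the largest common size */
        acc->size = seen->size;
    }
}

/* At the end of the link, a common symbol with no definition becomes a
   zero-initialized definition of the chosen size. */
void finalize_symbol(struct sym_state *acc)
{
    if (acc->kind == SYM_COMMON)
        acc->kind = SYM_DEFINED;
}
```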

   The size of an ELF symbol, other than a common symbol, is the size of
   the variable or function. This is mainly used for debugging purposes.

   The binding of an ELF symbol is global, local, or weak. A global symbol
   is globally visible. A local symbol is only locally visible (e.g., a
   static function). Weak symbols come in two flavors. A weak undefined
   reference is like an ordinary undefined reference, except that it is
   not an error if a relocation refers to a weak undefined reference
   symbol which has no defining symbol. Instead, the relocation is
   computed as though the symbol had the value zero.

   A weak defined symbol is permitted to be linked with a non-weak defined
   symbol of the same name without causing a multiple definition error.
   Historically there are two ways for the program linker to handle a weak
   defined symbol. On SVR4 if the program linker sees a weak defined
   symbol followed by a non-weak defined symbol with the same name, it
   will issue a multiple definition error. However, a non-weak defined
   symbol followed by a weak defined symbol will not cause an error. On
   Solaris, a weak defined symbol followed by a non-weak defined symbol is
   handled by causing all references to attach to the non-weak defined
   symbol, with no error. This difference in behaviour is due to an
   ambiguity in the ELF ABI which was read differently by different
   people. The GNU linker follows the Solaris behaviour.
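
   Both flavors of weak symbol can be produced from C. The sketch below
   assumes GCC or Clang targeting ELF, where __attribute__((weak)) is
   available; optional_hook, call_hook_if_present, and default_level are
   hypothetical names, and nothing in the example defines optional_hook.

```c
/* A weak undefined reference: the link succeeds even if no object
   defines optional_hook, in which case its address is zero. */
extern int optional_hook(void) __attribute__((weak));

/* Call the hook only if some object actually provided it. */
int call_hook_if_present(void)
{
    if (optional_hook != 0)
        return optional_hook();
    return -1;
}

/* A weak defined symbol: another object may supply a non-weak
   default_level without causing a multiple definition error. */
__attribute__((weak)) int default_level(void)
{
    return 1;
}
```

   When nothing else in the link defines either name,
   call_hook_if_present takes the fallback path and default_level is the
   weak definition given here.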

   The type of an ELF symbol is one of the following:
     * STT_NOTYPE: no particular type.
     * STT_OBJECT: a data object, such as a variable.
     * STT_FUNC: a function.
     * STT_SECTION: a local symbol associated with a section. This type of
       symbol is used to reduce the number of local symbols required, by
       changing all relocations against local symbols in a specific
       section to use the STT_SECTION symbol instead.
     * STT_FILE: a special symbol whose name is the name of the source
       file which produced the object file.
     * STT_COMMON: a common symbol. This is the same as setting the
       section index to SHN_COMMON, except in a shared object. The program
       linker will normally have allocated space for the common symbol in
       the shared object, so it will have a real section index. The
       STT_COMMON type tells the dynamic linker that although the symbol
       has a regular definition, it is a common symbol.
     * STT_TLS: a symbol in the Thread Local Storage area. I will describe
       this in more detail some other day.

   ELF symbol visibility was invented to provide more control over which
   symbols were accessible outside a shared library. The basic idea is
   that a symbol may be global within a shared library, but local outside
   the shared library.
     * STV_DEFAULT: the usual visibility rules apply: global symbols are
       visible everywhere.
     * STV_INTERNAL: the symbol is not accessible outside the current
       executable or shared library.
     * STV_HIDDEN: the symbol is not visible outside the current
       executable or shared library, but it may be accessed indirectly,
       probably because some code took its address.
     * STV_PROTECTED: the symbol is visible outside the current executable
       or shared object, but it may not be overridden. That is, if a
       protected symbol in a shared library is referenced by other code in
       the shared library, that other code will always reference the
       symbol in the shared library, even if the executable defines a
       symbol with the same name.
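
   For readers who do want a glimpse of the encoding, here is a small
   sketch of how binding, type, and visibility are unpacked from a symbol
   table entry. The constants and bit layout follow the ELF gABI (binding
   in the high four bits of st_info, type in the low four, visibility in
   the low two bits of st_other); the function names are made up, and the
   constants are defined locally so that the example does not depend on a
   system elf.h header.

```c
/* Binding values. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };

/* Type values. */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2, STT_SECTION = 3,
       STT_FILE = 4, STT_COMMON = 5, STT_TLS = 6 };

/* Visibility values. */
enum { STV_DEFAULT = 0, STV_INTERNAL = 1, STV_HIDDEN = 2,
       STV_PROTECTED = 3 };

/* The binding lives in the high four bits of st_info. */
unsigned sym_binding(unsigned char st_info)
{
    return st_info >> 4;
}

/* The type lives in the low four bits of st_info. */
unsigned sym_type(unsigned char st_info)
{
    return st_info & 0xfu;
}

/* The visibility lives in the low two bits of st_other; the other six
   bits are the currently undefined ones. */
unsigned sym_visibility(unsigned char st_other)
{
    return st_other & 0x3u;
}
```

   For example, an st_info byte of 0x12 decodes as a global function,
   and an st_other byte of 2 as hidden visibility.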

   I'll describe symbol versions later.

   More tomorrow.


References

   Visible links
   1. http://www.airs.com/blog/feed/
   2. http://www.airs.com/blog/feed/rss/
   3. http://www.airs.com/blog/feed/atom/
   4. http://www.airs.com/blog
   5. http://www.airs.com/blog/archives/46
   6. http://www.airs.com/blog/archives/category/programming/
   7. http://www.airs.com/blog/archives/46
   8. http://www.airs.com/blog/archives/46#comments
   9. http://www.airs.com/blog/archives/45
  10. http://www.airs.com/blog/archives/category/programming/
  11. http://www.airs.com/blog/archives/45
  12. http://www.airs.com/blog/archives/45#comments
  13. http://www.airs.com/blog/archives/44
  14. http://www.airs.com/blog/archives/category/programming/
  15. http://www.airs.com/blog/archives/44
  16. http://www.airs.com/blog/archives/44#comments
  17. http://www.airs.com/blog/archives/43
  18. http://www.airs.com/blog/archives/category/programming/
  19. http://www.airs.com/blog/archives/43
  20. http://www.airs.com/blog/archives/43#comments
  21. http://www.airs.com/blog/archives/42
  22. http://www.airs.com/blog/archives/category/programming/
  23. http://www.airs.com/blog/archives/42
  24. http://www.airs.com/blog/archives/42#comments
  25. http://www.airs.com/blog/page/12/
  26. http://www.airs.com/blog/page/10/
  27. http://www.airs.com/blog/archives/date/2007/11/
  28. http://www.airs.com/blog/archives/date/2007/10/
  29. http://www.airs.com/blog/archives/date/2007/09/
  30. http://www.airs.com/blog/archives/date/2007/08/
  31. http://www.airs.com/blog/archives/date/2007/02/
  32. http://www.airs.com/blog/archives/date/2007/01/
  33. http://www.airs.com/blog/archives/date/2006/12/
  34. http://www.airs.com/blog/archives/date/2006/07/
  35. http://www.airs.com/blog/archives/date/2006/06/
  36. http://www.airs.com/blog/archives/date/2006/04/
  37. http://www.airs.com/blog/archives/date/2006/02/
  38. http://www.airs.com/blog/archives/date/2006/01/
  39. http://www.airs.com/blog/archives/date/2005/12/
  40. http://www.airs.com/blog/archives/date/2005/11/
  41. http://www.airs.com/blog/archives/category/random/
  42. http://www.airs.com/blog/archives/category/money/
  43. http://www.airs.com/blog/archives/category/philosophy/
  44. http://www.airs.com/blog/archives/category/books/
  45. http://www.airs.com/blog/archives/category/politics/
  46. http://www.airs.com/blog/archives/category/programming/
  47. http://web.elastic.org/~fche/blog2/
  48. http://www.dberlin.org/blog/
  49. http://blog.360.yahoo.com/blog-f4oLexAlc6l3W_YHZF1IXYDu
  50. http://tromey.com/blog
  51. http://airs.com/ian/
  52. http://airs.com/ian/essays/
  53. http://lessig.org/blog/
  54. http://schneier.com/blog/
  55. http://www.airs.com/blog/wp-register.php
  56. http://www.airs.com/blog/wp-login.php
  57. http://www.airs.com/blog/feed/
  58. http://www.airs.com/blog/comments/feed/
  59. http://wordpress.org/
  60. http://beccary.com/
  61. http://weblogs.us/
  62. http://validator.w3.org/check/referer
  63. http://jigsaw.w3.org/css-validator/check/referer

   Hidden links:
  64. http://www.airs.com/blog/page/10/
  65. http://www.airs.com/blog/page/10/
  66. http://www.airs.com/blog/page/12/
  67. http://www.airs.com/blog/page/12/

1.1                  extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/C.txt

file : http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/C.txt?rev=1.1&view=markup
plain: http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/C.txt?rev=1.1&content-type=text/plain

Index: C.txt
===================================================================


[4]Airs - Ian Lance Taylor

[5]Linkers part 14

   September 17, 2007 at 10:01 pm · Filed under [6]Programming

   Link Time Optimization

   I've already mentioned some optimizations which are peculiar to the
   linker: relaxation and garbage collection of unwanted sections. There
   is another class of optimizations which occur at link time, but are
   really related to the compiler. The general name for these
   optimizations is link time optimization or whole program optimization.

   The general idea is that the compiler optimization passes are run at
   link time. The advantage of running them at link time is that the
   compiler can then see the entire program. This permits the compiler to
   perform optimizations which cannot be done when source files are
   compiled separately. The most obvious such optimization is inlining
   functions across source files. Another is optimizing the calling
   sequence for simple functions, e.g., passing more parameters in
   registers, or knowing that the function will not clobber all registers;
   this can only be done when the compiler can see all callers of the
   function. Experience shows that these and other optimizations can bring
   significant performance benefits.

   Generally these optimizations are implemented by having the compiler
   write a version of its intermediate representation into the object
   file, or into some parallel file. The intermediate representation will
   be the parsed version of the source file, and may already have had some
   local optimizations applied. Sometimes the object file contains only
   the compiler intermediate representation, sometimes it also contains
   the usual object code. In the former case link time optimization is
   required, in the latter case it is optional.

   I know of two typical ways to implement link time optimization. The
   first approach is for the compiler to provide a pre-linker. The
   pre-linker examines the object files looking for stored intermediate
   representation. When it finds some, it runs the link time optimization
   passes. The second approach is for the linker proper to call back into
   the compiler when it finds intermediate representation. This is
   generally done via some sort of plugin API.

   Although these optimizations happen at link time, they are not part of
   the linker proper, at least not as I defined it. When the compiler
   reads the stored intermediate representation, it will eventually
   generate an object file, one way or another. The linker proper will
   then process that object file as usual. These optimizations should be
   thought of as part of the compiler.

   Initialization Code

1623

1624

   C++ permits globals variables to have constructors and destructors. The

1625

   global constructors must be run before main starts, and the global

1626

   destructors must be run after exit is called. Making this work requires

1627

   the compiler and the linker to cooperate.

   The a.out object file format is rarely used these days, but the GNU
   a.out linker has an interesting extension. In a.out symbols have a one
   byte type field. This encodes a bunch of debugging information, and
   also the section in which the symbol is defined. The a.out object file
   format only supports three sections: text, data, and bss. Four symbol
   types are defined as sets: text set, data set, bss set, and absolute
   set. A symbol with a set type is permitted to be defined multiple
   times. The GNU linker will not give a multiple definition error, but
   will instead build a table with all the values of the symbol. The table
   will start with one word holding the number of entries, and will end
   with a zero word. In the output file the set symbol will be defined as
   the address of the start of the table.

   For each C++ global constructor, the compiler would generate a symbol
   named __CTOR_LIST__ with the text set type. The value of the symbol in
   the object file would be the global constructor function. The linker
   would gather together all the __CTOR_LIST__ functions into a table. The
   startup code supplied by the compiler would walk down the __CTOR_LIST__
   table and call each function. Global destructors were handled
   similarly, with the name __DTOR_LIST__.
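
   As a rough illustration of the table layout described above (a count
   word, the entries, then a zero word), the walk done by the startup
   code might look like the sketch below. The names Word, InitFn,
   make_table and run_table are hypothetical, and the table is built by
   hand here; in a real a.out link the GNU linker would build it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for two global constructor functions.
int calls = 0;
void ctor_a() { calls += 1; }
void ctor_b() { calls += 10; }

using Word = std::uintptr_t;  // one machine word of the table
using InitFn = void (*)();

// Build the table by hand: count word, entries, terminating zero word.
std::vector<Word> make_table() {
  return {2, reinterpret_cast<Word>(&ctor_a),
          reinterpret_cast<Word>(&ctor_b), 0};
}

// Skip the count word and call each entry until the zero word is reached.
// Returns the number of functions called.
int run_table(const std::vector<Word>& table) {
  int called = 0;
  for (const Word* p = table.data() + 1; *p != 0; ++p) {
    reinterpret_cast<InitFn>(*p)();
    ++called;
  }
  return called;
}
```

   Calling run_table(make_table()) invokes ctor_a and then ctor_b, in
   table order.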

   Anyhow, so much for a.out. In ELF, global constructors are handled in a
   fairly similar way, but without using magic symbol types. I'll describe
   what gcc does. An object file which defines a global constructor will
   include a .ctors section. The compiler will arrange to link special
   object files at the very start and very end of the link. The one at the
   start of the link will define a symbol for the .ctors section; that
   symbol will wind up at the start of the section. The one at the end of
   the link will define a symbol for the end of the .ctors section. The
   compiler startup code will walk between the two symbols, calling the
   constructors. Global destructors work similarly, in a .dtors section.

   ELF shared libraries work similarly. When the dynamic linker loads a
   shared library, it will call the function at the DT_INIT tag if there
   is one. By convention the ELF program linker will set this to the
   function named _init, if there is one. Similarly the DT_FINI tag is
   called when a shared library is unloaded, and the program linker will
   set this to the function named _fini.

   As I mentioned earlier, there are also DT_INIT_ARRAY, DT_PREINIT_ARRAY,
   and DT_FINI_ARRAY tags, which are set based on the SHT_INIT_ARRAY,
   SHT_PREINIT_ARRAY, and SHT_FINI_ARRAY section types. This is a newer
   approach in ELF, and does not require relying on special symbol names.
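
   Seen from the user's side, all of this machinery exists so that code
   like the following sketch works. Tracer and event_log are made-up
   names for illustration; which mechanism the toolchain uses underneath
   (.ctors, an init array, or something else) is not visible from the
   source.

```cpp
#include <string>
#include <vector>

// Hypothetical log shared by all constructors. A function-local static
// avoids depending on the initialization order of other globals.
std::vector<std::string>& event_log() {
  static std::vector<std::string> log;
  return log;
}

struct Tracer {
  explicit Tracer(const std::string& name) {
    event_log().push_back("ctor " + name);
  }
};

// Global objects: their constructors run before main starts, in
// definition order within this translation unit.
Tracer first("first");
Tracer second("second");
```

   By the time main begins, event_log() already holds one entry per
   global object.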

   More tomorrow.


[9]Linkers part 13

   September 14, 2007 at 8:47 pm · Filed under [10]Programming

   Symbol Versions Redux

   I've talked about symbol versions from the linker's point of view. I
   think it's worth discussing them a bit from the user's point of view.

   As I've discussed before, symbol versions are an ELF extension designed
   to solve a specific problem: making it possible to upgrade a shared
   library without changing existing executables. That is, they provide
   backward compatibility for shared libraries. There are a number of
   related problems which symbol versions do not solve. They do not
   provide forward compatibility for shared libraries: if you upgrade your
   executable, you may need to upgrade your shared library also (it would
   be nice to have a feature to build your executable against an older
   version of the shared library, but that is difficult to implement in
   practice). They only work at the shared library interface: they do not
   help with a change to the ABI of a system call, which is at the kernel
   interface. They do not help with the problem of sharing incompatible
   versions of a shared library, as may happen when a complex application
   is built out of several different existing shared libraries which have
   incompatible dependencies.

   Despite these limitations, shared library backward compatibility is an
   important issue. Using symbol versions to ensure backward compatibility
   requires a careful and rigorous approach. You must start by applying a
   version to every symbol. If a symbol in the shared library does not
   have a version, then it is impossible to change it in a backward
   compatible fashion. Then you must pay close attention to the ABI of
   every symbol. If the ABI of a symbol changes for any reason, you must
   provide a copy which implements the old ABI. That copy should be marked
   with the original version. The new symbol must be given a new version.

   The ABI of a symbol can change in a number of ways. Any change to the
   parameter types or the return type of a function is an ABI change. Any
   change in the type of a variable is an ABI change. If a parameter or a
   return type is a struct or class, then any change in the type of any
   field is an ABI change; that is, if a field in a struct points to
   another struct, and that struct changes, the ABI has changed. If a
   function is defined to return an instance of an enum, and a new value
   is added to the enum, that is an ABI change. In other words, even minor
   changes can be ABI changes. The question you need to ask is: can
   existing code which has already been compiled continue to use the new
   symbol with no change? If the answer is no, you have an ABI change, and
   you must define a new symbol version.
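
   A small sketch of why a struct change is an ABI change: the two
   namespaces below stand for two hypothetical releases of a library
   header (the Options type and its fields are invented for this
   example). Code compiled against v1 has the old size and field offsets
   baked in, so it would misread a v2 object.

```cpp
#include <cstddef>
#include <cstdint>

namespace v1 {  // hypothetical original release
struct Options {
  std::int32_t flags;
  std::int32_t timeout;
};
}  // namespace v1

namespace v2 {  // hypothetical later release: a field was inserted
struct Options {
  std::int32_t flags;
  std::int32_t retries;
  std::int32_t timeout;
};
}  // namespace v2

// True only if already-compiled callers would still find `timeout`
// at the offset they were built with.
constexpr bool timeout_offset_unchanged =
    offsetof(v1::Options, timeout) == offsetof(v2::Options, timeout);

// True only if already-compiled callers allocate enough space.
constexpr bool size_unchanged = sizeof(v1::Options) == sizeof(v2::Options);
```

   Both constants come out false here, so any function taking Options
   would need a new symbol version alongside a copy implementing the old
   layout.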

   You must be very careful when writing the symbol implementing the old
   ABI, if you don't just copy the existing code. You must be certain that
   it really does implement the old ABI.

   There are some special challenges when using C++. Adding a new virtual
   method to a class can be an ABI change for any function which uses that
   class. Providing the backward compatible version of the class in such a
   situation is very awkward: there is no natural way to specify the name
   and version to use for the virtual table or the RTTI information for
   the old version.

   Naturally, you must never delete any symbols.

   Getting all the details correct, and verifying that you got them
   correct, requires great attention to detail. Unfortunately, I don't
   know of any tools to help people write correct version scripts, or to
   verify them. Still, if implemented correctly, the results are good:
   existing executables will continue to run.

   Static Linking vs. Dynamic Linking

   There is, of course, another way to ensure that existing executables
   will continue to run: link them statically, without using any shared
   libraries. That will limit their ABI issues to the kernel interface,
   which is normally significantly smaller than the library interface.

   There is a performance tradeoff with static linking. A statically
   linked program does not get the benefit of sharing libraries with other
   programs executing at the same time. On the other hand, a statically
   linked program does not have to pay the performance penalty of position
   independent code when executing within the library.

   Upgrading the shared library is only possible with dynamic linking.
   Such an upgrade can provide bug fixes and better performance. Also, the
   dynamic linker can select a version of the shared library appropriate
   for the specific platform, which can also help performance.

   Static linking permits more reliable testing of the program. You only
   need to worry about kernel changes, not about shared library changes.

   Some people argue that dynamic linking is always superior. I think
   there are benefits on both sides, and which choice is best depends on
   the specific circumstances.

   More on Monday. If you think I should write about any specific linker
   related topics which have not already been mentioned in the comments,
   please let me know.


[13]Linkers part 12

   September 13, 2007 at 10:47 pm · Filed under [14]Programming

   I apologize for the pause in posts. We moved over the weekend. Last
   Friday at&t told me that the new DSL was working at our new house.
   However, it did not actually start working outside the house until
   Wednesday. Then a problem with the internal wiring meant that it was
   not working inside the house until today. I am now finally back online
   at home.

   Symbol Resolution

   I find that symbol resolution is one of the trickier aspects of a
   linker. Symbol resolution is what the linker does the second and
   subsequent times that it sees a particular symbol. I've already touched
   on the topic in a few previous entries, but let's look at it in a bit
   more depth.

   Some symbols are local to a specific object file. We can ignore these
   for the purposes of symbol resolution, as by definition the linker will
   never see them more than once. In ELF these are the symbols with a
   binding of STB_LOCAL.

   In general, symbols are resolved by name: every symbol with the same
   name is the same entity. We've already seen a few exceptions to that
   general rule. A symbol can have a version: two symbols with the same
   name but different versions are different symbols. A symbol can have
   non-default visibility: a symbol with hidden visibility in one shared
   library is not the same as a symbol with the same name in a different
   shared library.

   The characteristics of a symbol which matter for resolution are:
     * The symbol name.
     * The symbol version.
     * Whether the symbol is the default version or not.
     * Whether the symbol is a definition or a reference or a common
       symbol.
     * The symbol visibility.
     * Whether the symbol is weak or strong (i.e., non-weak).
     * Whether the symbol is defined in a regular object file being
       included in the output, or in a shared library.
     * Whether the symbol is thread local.
     * Whether the symbol refers to a function or a variable.

   The goal of symbol resolution is to determine the final value of the
   symbol. After all symbols are resolved, we should know the specific
   object file or shared library which defines the symbol, and we should
   know the symbol's type, size, etc. It is possible that some symbols
   will remain undefined after all the symbol tables have been read; in
   general this is only an error if some relocation refers to that symbol.

   At this point I'd like to present a simple algorithm for symbol
   resolution, but I don't think I can. I'll try to hit all the high
   points, though. Let's assume that we have two symbols with the same
   name. Let's call the symbol we saw first A and the new symbol B. (I'm
   going to ignore symbol visibility in the algorithm below; the effects
   of visibility should be obvious, I hope.)
    1. If A has a version:
          + If B has a version different from A, they are actually
            different symbols.
          + If B has the same version as A, they are the same symbol;
            carry on.
          + If B does not have a version, and A is the default version of
            the symbol, they are the same symbol; carry on.
          + Otherwise B is probably a different symbol. But note that if A
            and B are both undefined references, then it is possible that
            A refers to the default version of the symbol but we don't yet
            know that. In that case, if B does not have a version, A and B
            really are the same symbol. We can't tell until we see the
            actual definition.
    2. If A does not have a version:
          + If B does not have a version, they are the same symbol; carry
            on.
          + If B has a version, and it is the default version, they are
            the same symbol; carry on.
          + Otherwise, B is probably a different symbol, as above.
    3. If A is thread local and B is not, or vice-versa, then we have an
       error.
    4. If A is an undefined reference:
          + If B is an undefined reference, then we can complete the
            resolution, and more or less ignore B.
          + If B is a definition or a common symbol, then we can resolve A
            to B.
    5. If A is a strong definition in an object file:
          + If B is an undefined reference, then we resolve B to A.
          + If B is a strong definition in an object file, then we have a
            multiple definition error.
          + If B is a weak definition in an object file, then A overrides
            B. In effect, B is ignored.
          + If B is a common symbol, then we treat B as an undefined
            reference.
          + If B is a definition in a shared library, then A overrides B.
            The dynamic linker will change all references to B in the
            shared library to refer to A instead.
    6. If A is a weak definition in an object file, we act just like the
       strong definition case, with one exception: if B is a strong
       definition in an object file. In the original SVR4 linker, this
       case was treated as a multiple definition error. In the Solaris and
       GNU linkers, this case is handled by letting B override A.
    7. If A is a common symbol in an object file:
          + If B is a common symbol, we set the size of A to be the
            maximum of the size of A and the size of B, and then treat B
            as an undefined reference.
          + If B is a definition in a shared library with function type,
            then A overrides B (this oddball case is required to correctly
            handle some Unix system libraries).
          + Otherwise, we treat A as an undefined reference.
    8. If A is a definition in a shared library, then if B is a definition
       in a regular object (strong or weak), it overrides A. Otherwise we
       act as though A were defined in an object file.
    9. If A is a common symbol in a shared library, we have a funny case.
       Symbols in shared libraries must have addresses, so they can't be
       common in the same sense as symbols in an object file. But ELF does
       permit symbols in a shared library to have the type STT_COMMON
       (this is a relatively recent addition). For purposes of symbol
       resolution, if A is a common symbol in a shared library, we still
       treat it as a definition, unless B is also a common symbol. In the
       latter case, B overrides A, and the size of B is set to the maximum
       of the size of A and the size of B.
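
   The object-file cases above (steps 4 through 7) can be sketched as a
   small function. This is only an illustration under simplifying
   assumptions: versions, visibility, thread-local symbols and shared
   libraries are left out, and the names Kind, Symbol and resolve are
   hypothetical rather than taken from any real linker. An undefined B
   is assumed to leave A unchanged in every case.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Hypothetical, simplified model of a symbol seen in a regular object file.
enum class Kind { Undefined, Strong, Weak, Common };

struct Symbol {
  Kind kind;
  std::size_t size;
};

// a is the symbol seen first, b the newly seen symbol with the same name.
// Returns the symbol that survives resolution.
Symbol resolve(const Symbol& a, const Symbol& b) {
  if (b.kind == Kind::Undefined) return a;  // a reference adds nothing new
  switch (a.kind) {
    case Kind::Undefined:
      return b;  // step 4: resolve A to the definition or common symbol B
    case Kind::Strong:
      if (b.kind == Kind::Strong)
        throw std::runtime_error("multiple definition");
      return a;  // step 5: weak or common B is ignored
    case Kind::Weak:
      if (b.kind == Kind::Strong) return b;  // step 6: Solaris/GNU behaviour
      return a;
    case Kind::Common:
      if (b.kind == Kind::Common)
        return Symbol{Kind::Common, std::max(a.size, b.size)};  // step 7
      return b;  // A is treated as an undefined reference
  }
  return a;  // not reached; keeps compilers quiet
}
```

   For example, resolve({Kind::Common, 4}, {Kind::Common, 16}) yields a
   common symbol of size 16, while two strong definitions throw.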

   I hope I got all that right.

   More tomorrow, assuming the Internet connection holds up.


[17]Linkers part 11

   September 7, 2007 at 11:09 pm · Filed under [18]Programming

   Archives

   Archives are a traditional Unix package format. They are created by the
   ar program, and they are normally named with a .a extension. Archives
   are passed to a Unix linker with the -l option.

   Although the ar program is capable of creating an archive from any type
   of file, it is normally used to put object files into an archive. When
   it is used in this way, it creates a symbol table for the archive. The
   symbol table lists all the symbols defined by any object file in the
   archive, and for each symbol indicates which object file defines it.
   Originally the symbol table was created by the ranlib program, but
   these days it is always created by ar by default (despite this, many
   Makefiles continue to run ranlib unnecessarily).

   When the linker sees an archive, it looks at the archive's symbol
   table. For each symbol the linker checks whether it has seen an
   undefined reference to that symbol without seeing a definition. If that
   is the case, it pulls the object file out of the archive and includes
   it in the link. In other words, the linker pulls in all the object
   files which define symbols which are referenced but not yet defined.

   This operation repeats until no more symbols can be defined by the
   archive. This permits object files in an archive to refer to symbols
   defined by other object files in the same archive, without worrying
   about the order in which they appear.

   Note that the linker considers an archive in its position on the
   command line relative to other object files and archives. If an object
   file appears after an archive on the command line, that archive will
   not be used to define symbols referenced by the object file.
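
   The repeated scan of a single archive can be sketched as below. This
   is a simplified model, not any real linker's code: Member and
   scan_archive are hypothetical names, each member carries its own sets
   of defined and referenced symbols instead of a shared archive symbol
   table, and common symbols are ignored.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one object file stored in an archive.
struct Member {
  std::string name;
  std::set<std::string> defines;
  std::set<std::string> references;
};

// Pull in members that define currently undefined symbols, repeating
// until a full pass adds nothing. Returns member names in inclusion order.
std::vector<std::string> scan_archive(const std::vector<Member>& archive,
                                      std::set<std::string>& defined,
                                      std::set<std::string>& undefined) {
  std::vector<std::string> included;
  std::vector<bool> taken(archive.size(), false);
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < archive.size(); ++i) {
      if (taken[i]) continue;
      const Member& m = archive[i];
      bool wanted = false;
      for (const std::string& sym : m.defines)
        if (undefined.count(sym) != 0) { wanted = true; break; }
      if (!wanted) continue;
      taken[i] = true;
      changed = true;
      included.push_back(m.name);
      for (const std::string& sym : m.defines) {
        defined.insert(sym);
        undefined.erase(sym);
      }
      // New references may make other members of the archive necessary.
      for (const std::string& sym : m.references)
        if (defined.count(sym) == 0) undefined.insert(sym);
    }
  }
  return included;
}
```

   Because the undefined set is whatever has been seen so far on the
   command line, an object file processed after the archive cannot cause
   any further members to be pulled in.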

   In general the linker will not include archives if they provide a
   definition for a common symbol. You will recall that if the linker sees
   a common symbol followed by a defined symbol with the same name, it
   will treat the common symbol as an undefined reference. That will only
   happen if there is some other reason to include the defined symbol in
   the link; the defined symbol will not be pulled in from the archive.

   There was an interesting twist for common symbols in archives on old
   a.out-based SunOS systems. If the linker saw a common symbol, and then
   saw a common symbol in an archive, it would not include the object file
   from the archive, but it would change the size of the common symbol to
   the size in the archive if that were larger than the current size. The
   C library relied on this behaviour when implementing the stdin
   variable.

   My next posting should be on Monday.


[21]Linkers part 10

   September 6, 2007 at 11:33 pm · Filed under [22]Programming

   Parallel Linking

   It is possible to parallelize the linking process somewhat. This can
   help hide I/O latency and can take better advantage of modern
   multi-core systems. My intention with gold is to use these ideas to
   speed up the linking process.

   The first area which can be parallelized is reading the symbols and
   relocation entries of all the input files. The symbols must be
   processed in order; otherwise, it will be difficult for the linker to
   resolve multiple definitions correctly. In particular all the symbols
   which are used before an archive must be fully processed before the
   archive is processed, or the linker won't know which members of the
   archive to include in the link (I guess I haven't talked about archives
   yet). However, despite these ordering requirements, it can be
   beneficial to do the actual I/O in parallel.

   After all the symbols and relocations have been read, the linker must
   complete the layout of all the input contents. Most of this cannot be
   done in parallel, as setting the location of one type of contents
   requires knowing the size of all the preceding types of contents. While
   doing the layout, the linker can determine the final location in the
   output file of all the data which needs to be written out.

   After layout is complete, the process of reading the contents, applying
   relocations, and writing the contents to the output file can be fully
   parallelized. Each input file can be processed separately.
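
   The split between the sequential layout step and the independent
   per-file step can be sketched as follows. Input, lay_out, write_input
   and link_contents are hypothetical names; no relocations are applied
   and no threads are actually started. The point is only that each
   write_input call touches a disjoint part of the output, so the calls
   could be handed to separate threads.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of one input file's contribution to the output.
struct Input {
  std::vector<unsigned char> contents;
  std::size_t output_offset = 0;
};

// Sequential: each offset depends on the sizes of all preceding inputs.
// Returns the total output size.
std::size_t lay_out(std::vector<Input>& inputs) {
  std::size_t offset = 0;
  for (Input& in : inputs) {
    in.output_offset = offset;
    offset += in.contents.size();
  }
  return offset;
}

// Independent: writes only the range assigned to this input.
void write_input(const Input& in, std::vector<unsigned char>& output) {
  std::copy(in.contents.begin(), in.contents.end(),
            output.begin() + static_cast<std::ptrdiff_t>(in.output_offset));
}

std::vector<unsigned char> link_contents(std::vector<Input>& inputs) {
  std::vector<unsigned char> output(lay_out(inputs));
  for (const Input& in : inputs) write_input(in, output);
  return output;
}
```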

   Since the final size of the output file is known after the layout
   phase, it is possible to use mmap for the output file. When not doing
   relaxation, it is then possible to read the input contents directly
   into place in the output file, and to relocate them in place. This
   reduces the number of system calls required, and ideally will permit
   the operating system to do optimal disk I/O for the output file.

   Just a short entry tonight. More tomorrow.


References

   Visible links
   1. http://www.airs.com/blog/feed/
   2. http://www.airs.com/blog/feed/rss/
   3. http://www.airs.com/blog/feed/atom/
   4. http://www.airs.com/blog
   5. http://www.airs.com/blog/archives/51
   6. http://www.airs.com/blog/archives/category/programming/
   7. http://www.airs.com/blog/archives/51
   8. http://www.airs.com/blog/archives/51#comments
   9. http://www.airs.com/blog/archives/50
  10. http://www.airs.com/blog/archives/category/programming/
  11. http://www.airs.com/blog/archives/50
  12. http://www.airs.com/blog/archives/50#comments
  13. http://www.airs.com/blog/archives/49
  14. http://www.airs.com/blog/archives/category/programming/
  15. http://www.airs.com/blog/archives/49
  16. http://www.airs.com/blog/archives/49#comments
  17. http://www.airs.com/blog/archives/48
  18. http://www.airs.com/blog/archives/category/programming/
  19. http://www.airs.com/blog/archives/48
  20. http://www.airs.com/blog/archives/48#comments
  21. http://www.airs.com/blog/archives/47
  22. http://www.airs.com/blog/archives/category/programming/
  23. http://www.airs.com/blog/archives/47
  24. http://www.airs.com/blog/archives/47#comments
  25. http://www.airs.com/blog/page/11/
  26. http://www.airs.com/blog/page/9/
  27. http://www.airs.com/blog/archives/date/2007/11/
  28. http://www.airs.com/blog/archives/date/2007/10/
  29. http://www.airs.com/blog/archives/date/2007/09/
  30. http://www.airs.com/blog/archives/date/2007/08/
  31. http://www.airs.com/blog/archives/date/2007/02/
  32. http://www.airs.com/blog/archives/date/2007/01/
  33. http://www.airs.com/blog/archives/date/2006/12/
  34. http://www.airs.com/blog/archives/date/2006/07/
  35. http://www.airs.com/blog/archives/date/2006/06/
  36. http://www.airs.com/blog/archives/date/2006/04/
  37. http://www.airs.com/blog/archives/date/2006/02/
  38. http://www.airs.com/blog/archives/date/2006/01/
  39. http://www.airs.com/blog/archives/date/2005/12/
  40. http://www.airs.com/blog/archives/date/2005/11/
  41. http://www.airs.com/blog/archives/category/random/
  42. http://www.airs.com/blog/archives/category/money/
  43. http://www.airs.com/blog/archives/category/philosophy/
  44. http://www.airs.com/blog/archives/category/books/
  45. http://www.airs.com/blog/archives/category/politics/
  46. http://www.airs.com/blog/archives/category/programming/
  47. http://web.elastic.org/~fche/blog2/
  48. http://www.dberlin.org/blog/
  49. http://blog.360.yahoo.com/blog-f4oLexAlc6l3W_YHZF1IXYDu
  50. http://tromey.com/blog
  51. http://airs.com/ian/
  52. http://airs.com/ian/essays/
  53. http://lessig.org/blog/
  54. http://schneier.com/blog/
  55. http://www.airs.com/blog/wp-register.php
  56. http://www.airs.com/blog/wp-login.php
  57. http://www.airs.com/blog/feed/
  58. http://www.airs.com/blog/comments/feed/
  59. http://wordpress.org/
  60. http://beccary.com/
  61. http://weblogs.us/
  62. http://validator.w3.org/check/referer
  63. http://jigsaw.w3.org/css-validator/check/referer

   Hidden links:
  64. http://www.airs.com/blog/page/9/
  65. http://www.airs.com/blog/page/9/
  66. http://www.airs.com/blog/page/11/
  67. http://www.airs.com/blog/page/11/

1.1                  extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/D.txt

file : http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/D.txt?rev=1.1&view=markup
plain: http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/D.txt?rev=1.1&content-type=text/plain

Index: D.txt
===================================================================
   #[1]RSS 2.0 [2]RSS .92 [3]Atom 0.3

[4]Airs - Ian Lance Taylor

[5]Linkers part 19

   September 24, 2007 at 9:02 pm · Filed under [6]Programming

   I've pretty much run out of linker topics. Unless I think of something
   new, I'll make tomorrow's post be the last one, for a total of 20.

   __start and __stop Symbols

   A quick note about another GNU linker extension. If the linker sees a
   section in the output file whose name can be part of a C variable name
   (the name contains only alphanumeric characters or underscores), the
   linker will automatically define symbols marking the start and stop of
   the section. Note that this is not true of most section names, as by
   convention most section names start with a period. But the name of a
   section can be any string; it doesn't have to start with a period. And
   when that happens for section NAME, the GNU linker will define the
   symbols __start_NAME and __stop_NAME to the addresses of the beginning
   and the end of the section, respectively.

   This is convenient for collecting some information in several different
   object files, and then referring to it in the code. For example, the
   GNU C library uses this to keep a list of functions which may be called
   to free memory. The __start and __stop symbols are used to walk through
   the list.

   In C code, these symbols should be declared as something like extern
   char __start_NAME[]. For an extern array the value of the symbol and
   the value of the variable are the same.
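
   A minimal sketch of the idiom, assuming an ELF target, a GNU-compatible
   linker, and the GCC/Clang section and used attributes. The section
   name my_handlers, the Handler type and the REGISTER_HANDLER macro are
   invented for this example; typed arrays are declared instead of char
   arrays so that the entries can be walked directly.

```cpp
#include <cstddef>

// Hypothetical record type collected from several places in the program.
struct Handler {
  int id;
  int priority;
};

// GCC/Clang extension: place the object in the section "my_handlers",
// whose name is a valid C identifier; "used" keeps the compiler from
// discarding an object nothing refers to by name.
#define REGISTER_HANDLER(ident, id_value, prio) \
  static const Handler ident                    \
      __attribute__((section("my_handlers"), used)) = {id_value, prio}

REGISTER_HANDLER(handler_a, 1, 10);
REGISTER_HANDLER(handler_b, 2, 20);

// Defined by the linker, not by any source file.
extern "C" const Handler __start_my_handlers[];
extern "C" const Handler __stop_my_handlers[];

// Walk every entry the linker gathered into the section.
int total_priority() {
  int total = 0;
  for (const Handler* h = __start_my_handlers; h != __stop_my_handlers; ++h)
    total += h->priority;
  return total;
}
```

   Entries registered in other object files of the same link would land
   in the same output section and be visited by the same loop.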

   Byte Swapping

   The new linker I am working on, gold, is written in C++. One of the
   attractions was to use template specialization to do efficient byte
   swapping. Any linker which can be used in a cross-compiler needs to be
   able to swap bytes when writing them out, in order to generate code for
   a big-endian system while running on a little-endian system, or
   vice-versa. The GNU linker always stores data into memory a byte at a
   time, which is unnecessary for a native linker. Measurements from a few
   years ago showed that this took about 5% of the linker's CPU time.
   Since the native linker is by far the most common case, it is worth
   avoiding this penalty.

   In C++, this can be done using templates and template specialization.
   The idea is to write a template for writing out the data. Then provide
   two specializations of the template, one for a linker of the same
   endianness and one for a linker of the opposite endianness. Then pick
   the one to use at compile time. The code looks like this; I'm only
   showing the 16-bit case for simplicity.

     // glibc headers providing bswap_16, __BYTE_ORDER and uint16_t.
     #include <byteswap.h>
     #include <endian.h>
     #include <stdint.h>

     // Endian simply indicates whether the host is big endian or not.

     struct Endian
     {
      public:
       // Used for template specializations.
       static const bool host_big_endian = __BYTE_ORDER == __BIG_ENDIAN;
     };

     // Valtype_base is a template based on size (8, 16, 32, 64) which
     // defines the type Valtype as the unsigned integer of the specified
     // size.

     template<int size>
     struct Valtype_base;

     template<>
     struct Valtype_base<16>
     {
       typedef uint16_t Valtype;
     };

     // Convert_endian is a template based on size and on whether the host
     // and target have the same endianness. It defines the type Valtype
     // as Valtype_base does, and also defines a function convert_host
     // which takes an argument of type Valtype and returns the same value,
     // but swapped if the host and target have different endianness.

     template<int size, bool same_endian>
     struct Convert_endian;

     template<int size>
     struct Convert_endian<size, true>
     {
       typedef typename Valtype_base<size>::Valtype Valtype;

       static inline Valtype
       convert_host(Valtype v)
       { return v; }
     };

     template<>
     struct Convert_endian<16, false>
     {
       typedef Valtype_base<16>::Valtype Valtype;

       static inline Valtype
       convert_host(Valtype v)
       { return bswap_16(v); }
     };

     // Convert is a template based on size and on whether the target is
     // big endian. It defines Valtype and convert_host like
     // Convert_endian. That is, it is just like Convert_endian except in
     // the meaning of the second template parameter.

     template<int size, bool big_endian>
     struct Convert
     {
       typedef typename Valtype_base<size>::Valtype Valtype;

       static inline Valtype
       convert_host(Valtype v)
       {
         return Convert_endian<size, big_endian == Endian::host_big_endian>
           ::convert_host(v);
       }
     };

     // Swap is a template based on size and on whether the target is big
     // endian. It defines the type Valtype and the functions readval and
     // writeval. The functions read and write values of the appropriate
     // size out of buffers, swapping them if necessary.

     template<int size, bool big_endian>
     struct Swap
     {
       typedef typename Valtype_base<size>::Valtype Valtype;

       static inline Valtype
       readval(const Valtype* wv)
       { return Convert<size, big_endian>::convert_host(*wv); }

       static inline void
       writeval(Valtype* wv, Valtype v)
       { *wv = Convert<size, big_endian>::convert_host(v); }
     };

2272

2273

   Now, for example, the linker reads a 16-bit big-endian value using

2274

   Swap<16,true>::readval. This works because the linker always knows how

2275

   much data to swap in, and it always knows whether it is reading big- or

2276

   little-endian data.
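
   As a usage illustration, here is a condensed, self-contained sketch of
   the same idea. It is not the linker's actual source: it folds Convert
   into Swap, and it assumes the GCC/Clang predefined macros
   __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ and the builtin
   __builtin_bswap16 in place of the glibc names used above. The helpers
   read_be16 and write_le16 are hypothetical names added for the example.

```cpp
#include <stdint.h>
#include <cstring>

// Host endianness, taken from compiler-predefined macros (an assumption;
// the code in the article uses glibc's __BYTE_ORDER instead).
struct Endian
{
  static const bool host_big_endian = __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__;
};

template<int size>
struct Valtype_base;

template<>
struct Valtype_base<16>
{ typedef uint16_t Valtype; };

template<int size, bool same_endian>
struct Convert_endian;

// Host and target agree: the value passes through untouched.
template<int size>
struct Convert_endian<size, true>
{
  typedef typename Valtype_base<size>::Valtype Valtype;
  static inline Valtype convert_host(Valtype v) { return v; }
};

// Host and target disagree: swap the two bytes.
template<>
struct Convert_endian<16, false>
{
  typedef Valtype_base<16>::Valtype Valtype;
  static inline Valtype convert_host(Valtype v)
  { return __builtin_bswap16(v); }
};

// Condensed Swap: the choice of specialization happens at compile time.
template<int size, bool big_endian>
struct Swap
{
  typedef typename Valtype_base<size>::Valtype Valtype;
  typedef Convert_endian<size, big_endian == Endian::host_big_endian> Conv;

  static inline Valtype readval(const Valtype* wv)
  { return Conv::convert_host(*wv); }

  static inline void writeval(Valtype* wv, Valtype v)
  { *wv = Conv::convert_host(v); }
};

// Hypothetical helper: decode a big-endian 16-bit field from raw bytes.
inline uint16_t read_be16(const unsigned char* p)
{
  uint16_t raw;
  std::memcpy(&raw, p, sizeof raw);  // copy first to avoid unaligned access
  return Swap<16, true>::readval(&raw);
}

// Hypothetical helper: encode a 16-bit value as little-endian bytes.
inline void write_le16(unsigned char* p, uint16_t v)
{
  uint16_t raw;
  Swap<16, false>::writeval(&raw, v);
  std::memcpy(p, &raw, sizeof raw);
}
```

   On a host whose endianness matches the target, the calls reduce to
   plain loads and stores; only the mismatched case pays for a swap.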

2277

2278

   [7]Permalink [8]Comments (2)

2279

2280

[9]Linkers part 18

2281

2282

   September 21, 2007 at 11:09 pm · Filed under [10]Programming

2283

2284

   Incremental Linking

2285

2286

   Often a programmer will change a single source file and recompile

2287

   and relink the application. A standard linker will need to read all the

2288

   input objects and libraries in order to regenerate the executable with

2289

   the change. For a large application, this is a lot of work. If only one

2290

   input object file changed, it is a lot more work than really needs to

2291

   be done. One solution is to use an incremental linker. An incremental

2292

   linker makes incremental changes to an existing executable or shared

2293

   library, rather than rebuilding them from scratch.

2294

2295

   I've never actually written or worked on an incremental linker, but the

2296

   general idea is straightforward enough. When the linker writes the

2297

   output file, it must attach additional information.

2298

     * The linker must create a mapping of object files to areas in the

2299

       output file, so that an incremental link will know what to remove

2300

       when replacing an object file.

2301

     * The linker must retain all the relocations for each input object

2302

       which refer to symbols defined in other objects, so that it can

2303

       reprocess them when symbols change. The linker should store the

2304

       relocations mapped by symbol, so that it can quickly find the

2305

       relevant relocations.

2306

     * The linker should leave extra space in the text and data segments,

2307

       to allow for object files to grow to a limited extent without

2308

       requiring rewriting the whole executable. It must keep a map of

2309

       where this extra space is, as it will tend to move over time over

2310

       the course of incremental links.

2311

     * The linker should keep a list of object file timestamps in the

2312

       output file, so that it can quickly determine which objects have

2313

       changed.

2314

2315

   With this information, the linker can identify which object files have

2316

   changed since the last time the output file was linked, and replace

2317

   them in the existing output file. When an object file changes, the

2318

   linker can identify all the relocations which refer to symbols defined

2319

   in the object file, and reprocess them.
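
   The bookkeeping described above could be modelled roughly as follows.
   This is only a sketch under assumptions: every type and function name
   here (OutputRange, SavedReloc, IncrementalState, changed_objects,
   relocs_to_reprocess) is hypothetical, as is the defined_symbols map,
   and a real linker would store this data in the output file rather
   than in memory.

```cpp
#include <stdint.h>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of an area of the output file.
struct OutputRange
{
  uint64_t offset;
  uint64_t size;
};

// Hypothetical saved relocation: a place in the output that depends on a
// symbol defined in some other object.
struct SavedReloc
{
  std::string from_object;
  uint64_t output_offset;
};

// The extra information attached to the output file.
struct IncrementalState
{
  // Object file -> areas it occupies in the output.
  std::map<std::string, std::vector<OutputRange> > object_ranges;
  // Symbol name -> relocations that refer to it.
  std::map<std::string, std::vector<SavedReloc> > relocs_by_symbol;
  // Padding left in the text and data segments.
  std::vector<OutputRange> free_space;
  // Object file -> timestamp seen at the last link.
  std::map<std::string, int64_t> timestamps;
  // Object file -> symbols it defines (an assumption of this sketch).
  std::map<std::string, std::vector<std::string> > defined_symbols;
};

// Objects that are new or whose timestamp differs from the recorded one.
std::vector<std::string>
changed_objects(const IncrementalState& st,
                const std::map<std::string, int64_t>& current)
{
  std::vector<std::string> changed;
  for (const auto& entry : current)
    {
      auto it = st.timestamps.find(entry.first);
      if (it == st.timestamps.end() || it->second != entry.second)
        changed.push_back(entry.first);
    }
  return changed;
}

// Relocations that must be reprocessed when `object` is replaced: all
// those which refer to a symbol that object defines.
std::vector<SavedReloc>
relocs_to_reprocess(const IncrementalState& st, const std::string& object)
{
  std::vector<SavedReloc> result;
  auto defs = st.defined_symbols.find(object);
  if (defs == st.defined_symbols.end())
    return result;
  for (const std::string& sym : defs->second)
    {
      auto relocs = st.relocs_by_symbol.find(sym);
      if (relocs != st.relocs_by_symbol.end())
        result.insert(result.end(),
                      relocs->second.begin(), relocs->second.end());
    }
  return result;
}
```

   Keying the relocations by symbol name is what makes the second lookup
   cheap: the linker never has to scan relocations of unrelated objects.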

2320

2321

   When an object file gets too large to fit in the available space in a

2322

   text or data segment, then the linker has the option of creating

2323

   additional text or data segments at different addresses. This requires

2324

   some care to ensure that the new code does not collide with the heap,

2325

   depending upon how the local malloc implementation works.

2326

   Alternatively, the incremental linker could fall back on doing a full

2327

   link, and allocating more space again.

2328

2329

   Incremental linking can greatly speed up the edit/compile/debug cycle.

2330

   Unfortunately it is not implemented in most common linkers. Of course

2331

   an incremental link is not equivalent to a final link, and in

2332

   particular some linker optimizations are difficult to implement while

2333

   acting incrementally. An incremental link is really only suitable for

2334

   use during the development cycle, which is of course the time when the

2335

   speed of the linker is most important.

2336

2337

   More on Monday.

2338

2339

   [11]Permalink [12]Comments (2)

2340

2341

[13]Linkers part 17

2342

2343

   September 20, 2007 at 11:52 pm · Filed under [14]Programming

2344

2345

   Warning Symbols

2346

2347

   The GNU linker supports a weird extension to ELF used to issue warnings

2348

   when symbols are referenced at link time. This was originally

2349

   implemented for a.out using a special symbol type. For ELF, I

2350

   implemented it using a special section name.

2351

2352

   If you create a section named .gnu.warning.SYMBOL, then if and when the

2353

   linker sees an undefined reference to SYMBOL, it will issue a warning.

2354

   The warning is triggered by seeing an undefined symbol with the right

2355

   name in an object file. Unlike the warning about an undefined symbol,

2356

   it is not triggered by seeing a relocation entry. The text of the

2357

   warning is simply the contents of the .gnu.warning.SYMBOL section.

2358

2359

   The GNU C library uses this feature to warn about references to symbols

2360

   like gets which are required by standards but are generally considered

2361

   to be unsafe. This is done by creating a section named

2362

   .gnu.warning.gets in the same object file which defines gets.
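
   A minimal sketch of how a library might attach such a warning to one
   of its own functions, assuming GCC or Clang attribute syntax on an ELF
   target. The names old_api, new_api and old_api_warning are
   hypothetical. The function is declared extern "C" so that its symbol
   name is exactly old_api; with C++ name mangling the section name would
   have to use the mangled name instead. The GNU C library wraps the same
   idea in a more elaborate macro.

```cpp
// Hypothetical legacy function that callers should stop using.
extern "C" int old_api(int x)
{
  return x + 1;
}

// The section name encodes the symbol; the section contents are the text
// of the warning. "used" keeps the compiler from dropping the otherwise
// unreferenced array.
static const char old_api_warning[]
  __attribute__((used, section(".gnu.warning.old_api")))
  = "old_api is deprecated; use new_api instead";
```

   The warning is only expected when some other object file in the link
   contains an undefined reference to old_api; code in the defining
   object does not trigger it.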

2363

2364

   The GNU linker also supports another type of warning, triggered by

2365

   sections named .gnu.warning (without the symbol name). If an object

2366

   file with a section of that name is included in the link, the linker

2367

   will issue a warning. Again, the text of the warning is simply the

2368

   contents of the .gnu.warning section. I don't know if anybody actually

2369

   uses this feature.

2370

2371

   Short entry today, more tomorrow.

2372

2373

   [15]Permalink [16]Comments (1)

2374

2375

[17]Linkers part 16

2376

2377

   September 19, 2007 at 10:51 pm · Filed under [18]Programming

2378

2379

   C++ Template Instantiation

2380

2381

   There is still more C++ fun at link time, though somewhat less related

2382

   to the linker proper. A C++ program can declare templates, and

2383

   instantiate them with specific types. Ideally those specific

2384

   instantiations will only appear once in a program, not once per source

2385

   file which instantiates the templates. There are a few ways to make

2386

   this work.

2387

2388

   For object file formats which support COMDAT and vague linkage, which I

2389

   described yesterday, the simplest and most reliable mechanism is for

2390

   the compiler to generate all the template instantiations required for a

2391

   source file and put them into the object file. They should be marked as

2392

   COMDAT, so that the linker discards all but one copy. This ensures that

2393

   all template instantiations will be available at link time, and that

2394

   the executable will have only one copy. This is what gcc does by

2395

   default for systems which support it. The obvious disadvantages are the

2396

   time required to compile all the duplicate template instantiations and

2397

   the space they take up in the object files. This is sometimes called

2398

   the Borland model, as this is what Borland's C++ compiler did.

2399

2400

   Another approach is to not generate any of the template instantiations

2401

   at compile time. Instead, when linking, if we need a template

2402

   instantiation which is not found, invoke the compiler to build it. This

2403

   can be done either by running the linker and looking for error messages

2404

   or by using a linker plugin to handle an undefined symbol error. The

2405

   difficulties with this approach are to find the source code to compile

2406

   and to find the right options to pass to the compiler. Typically the

2407

   source code is placed into a repository file of some sort at compile

2408

   time, so that it is available at link time. The complexities of getting

2409

   the compilation steps right are why this approach is not the default.

2410

   When it works, though, it can be faster than the duplicate

2411

   instantiation approach. This is sometimes called the Cfront model.

2412

2413

   gcc also supports explicit template instantiation, which can be used to

2414

   control exactly where templates are instantiated. This approach can

2415

   work if you have complete control over your source code base, and can

2416

   instantiate all required templates in some central place. This approach

2417

   is used for gcc's C++ library, libstdc++.
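
   For illustration, explicit instantiation looks roughly like this. The
   template twice and the function use_twice are hypothetical names. The
   extern template form was a GNU extension at the time of writing and
   was later standardized in C++11. Everything is shown in one file here
   so that the sketch compiles on its own; in practice the three parts
   would live in a header, in ordinary source files, and in one central
   source file respectively.

```cpp
// Part 1: the template definition, normally in a header.
template<typename T>
T twice(T v)
{
  return v + v;
}

// Part 2: in ordinary source files, declare that twice<int> is
// instantiated elsewhere, so the compiler need not emit a copy here.
extern template int twice<int>(int);

// Part 3: in exactly one source file, the explicit instantiation
// definition. This is where the code for twice<int> is generated.
template int twice<int>(int);

// A caller simply uses the template as usual.
int use_twice(int v)
{
  return twice(v);
}
```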

2418

2419

   C++ defines a keyword export which is supposed to permit exporting

2420

   template definitions in such a way that they can be read back in by the

2421

   compiler. gcc does not support this keyword. If it worked, it could be

2422

   a slightly more reliable way of using a repository when using the

2423

   Cfront model.

2424

2425

   Exception Frames

2426

2427

   C++ and other languages support exceptions. When an exception is thrown

2428

   in one function and caught in another, the program needs to reset the

2429

   stack pointer and registers to the point where the exception is caught.

2430

   While resetting the stack pointer, the program needs to identify all

2431

   local variables in the part of the stack being discarded, and run their

2432

   destructors if any. This process is known as unwinding the stack.

2433

2434

   The information needed to unwind the stack is normally stored in tables

2435

   in the program. Supporting library code is used to read the tables and

2436

   perform the necessary operations. I'm not going to describe the details

2437

   of those tables here. However, there is a linker optimization which

2438

   applies to them.

2439

2440

   The support libraries need to be able to find the exception tables at

2441

   runtime when an exception occurs. An exception can be thrown in one

2442

   shared library and caught in a different shared library, so finding all

2443

   the required exception tables can be a nontrivial operation. One

2444

   approach that can be used is to register the exception tables at

2445

   program startup time or shared library load time. The registration can

2446

   be done at the right time using the global constructor mechanism.

2447

2448

   However, this approach imposes a runtime cost for exceptions, in that

2449

   it takes longer for the program to start. Therefore, this is not ideal.

2450

   The linker can optimize this by building tables which can be used to

2451

   find the exception tables. The tables built by the GNU linker are

2452

   sorted for fast lookup by the runtime library. The tables are put into

2453

   a PT_GNU_EH_FRAME segment. The supporting libraries then need a way to

2454

   look up a segment of this type. This is done via the dl_iterate_phdr

2455

   API provided by the GNU dynamic linker.
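
   A small sketch of that lookup, assuming a glibc-style <link.h>. It is
   not the unwinder's actual code: it merely walks the program headers of
   every loaded object and counts the PT_GNU_EH_FRAME segments. The
   function names count_eh_frame_segments and eh_frame_segment_count are
   hypothetical.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info, PT_GNU_EH_FRAME
#include <cstddef>

// Callback invoked once per loaded object (executable or shared library).
static int
count_eh_frame_segments(struct dl_phdr_info* info, size_t, void* data)
{
  int* count = static_cast<int*>(data);
  for (int i = 0; i < info->dlpi_phnum; ++i)
    {
      if (info->dlpi_phdr[i].p_type == PT_GNU_EH_FRAME)
        ++*count;
    }
  return 0;  // returning zero asks dl_iterate_phdr to keep iterating
}

// Number of PT_GNU_EH_FRAME segments visible in the running process.
int
eh_frame_segment_count()
{
  int count = 0;
  dl_iterate_phdr(count_eh_frame_segments, &count);
  return count;
}
```

   A real unwinder would stop at the object whose address range contains
   the program counter of interest, then use that object's segment to
   find the sorted lookup table.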

2456

2457

   Note that if the compiler believes that the linker will generate a

2458

   PT_GNU_EH_FRAME segment, it won't generate the startup code to register

2459

   the exception tables. Thus the linker must not fail to create this

2460

   segment.

2461

2462

   Since the GNU linker needs to look at the exception tables in order to

2463

   generate the PT_GNU_EH_FRAME segment, it will also optimize by

2464

   discarding duplicate exception table information.

2465

2466

   I know this section is rather short on details. I hope the general

2467

   idea is clear.

2468

2469

   More tomorrow.

2470

2471

   [19]Permalink [20]Comments

2472

2473

[21]Linkers part 15

2474

2475

   September 18, 2007 at 10:40 pm · Filed under [22]Programming

2476

2477

   COMDAT sections

2478

2479

   In C++ there are several constructs which do not clearly live in a

2480

   single place. Examples are inline functions defined in a header file,

2481

   virtual tables, and typeinfo objects. There must be only a single

2482

   instance of each of these constructs in the final linked program

2483

   (actually we could probably get away with multiple copies of a virtual

2484

   table, but the others must be unique since it is possible to take their

2485

   address). Unfortunately, there is not necessarily a single object file

2486

   in which they should be generated. These types of constructs are

2487

   sometimes described as having vague linkage.

2488

2489

   Linkers implement these features by using COMDAT sections (there may be

2490

   other approaches, but this is the only one I know of). COMDAT sections are

2491

   a special type of section. Each COMDAT section has a special string.

2492

   When the linker sees multiple COMDAT sections with the same special

2493

   string, it will only keep one of them.

2494

2495

   For example, when the C++ compiler sees an inline function f1 defined

2496

   in a header file, but the compiler is unable to inline the function in

2497

   all uses (perhaps because something takes the address of the function),

2498

   the compiler will emit f1 in a COMDAT section associated with the

2499

   string f1. After the linker sees a COMDAT section f1, it will discard

2500

   all subsequent f1 COMDAT sections.
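
   In source terms the situation looks like the sketch below. The type
   F1_ptr and the function address_of_f1 are hypothetical names. With a
   typical GCC and ELF setup, one would expect the out-of-line copy of f1
   to land in a COMDAT group keyed by its mangled name, _Z2f1i, in every
   object file compiled from a source file like this one.

```cpp
// Normally defined in a header that several source files include.
inline int f1(int x)
{
  return x * 2;
}

typedef int (*F1_ptr)(int);

// Taking the address means the call cannot simply be inlined away, so
// this object file needs an out-of-line copy of f1.
F1_ptr address_of_f1()
{
  return &f1;
}
```

   Because the linker keeps only one copy, every object file that takes
   the address of f1 ends up seeing the same address.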

2501

2502

   This obviously raises the possibility that there will be two entirely

2503

   different inline functions named f1, defined in different header files.

2504

   This would be an invalid C++ program, violating the One Definition Rule

2505

   (often abbreviated ODR). Unfortunately, if no source file included both

2506

   header files, the compiler would be unable to diagnose the error. And,

2507

   unfortunately, the linker would simply discard the duplicate COMDAT

2508

   sections, and would not notice the error either. This is an area where

2509

   some improvements are needed (at least in the GNU tools; I don't know

2510

   whether any other tools diagnose this error correctly).

2511

2512

   The Microsoft PE object file format provides COMDAT sections. These

2513

   sections can be marked so that duplicate COMDAT sections which do not

2514

   have identical contents cause an error. That is not as helpful as it

2515

   seems, as different compiler options may cause valid duplicates to have

2516

   different contents. The string associated with a COMDAT section is

2517

   stored in the symbol table.

2518

2519

   Before I learned about the Microsoft PE format, I introduced a

2520

   different type of COMDAT sections into the GNU ELF linker, following a

2521

   suggestion from Jason Merrill. Any section whose name starts with

2522

   ".gnu.linkonce." is a COMDAT section. The associated string is simply

2523

   the section name itself. Thus the inline function f1 would be put into

2524

   the section ".gnu.linkonce.f1". This simple implementation works well

2525

   enough, but it has a flaw in that some functions require data in

2526

   multiple sections; e.g., the instructions may be in one section and

2527

   associated static data may be in another section. Since different

2528

   instances of the inline function may be compiled differently, the

2529

   linker cannot reliably and consistently discard duplicate data (I

2530

   don't know how the Microsoft linker handles this problem).

2531

2532

   Recent versions of ELF introduce section groups. These implement an

2533

   officially sanctioned version of COMDAT in ELF, and avoid the problem

2534

   of ".gnu.linkonce" sections. I described these briefly in an earlier

2535

   blog entry. A special section of type SHT_GROUP contains a list of

2536

   section indices in the group. The group is retained or discarded as a

2537

   whole. The string associated with the group is found in the symbol

2538

   table. Putting the string in the symbol table makes it awkward to

2539

   retrieve, but since the string is generally the name of a symbol it

2540

   means that the string only needs to be stored once in the object file;

2541

   this is a minor optimization for C++ in which symbol names may be very

2542

   long.
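
   The keep-or-discard rule for groups can be pictured with a toy model.
   This is a sketch under assumptions, not linker code: Group and
   select_groups are hypothetical names, and the model ignores everything
   about a real SHT_GROUP section except its signature string and its
   list of member section indices.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model of one section group found in some input object.
struct Group
{
  std::string signature;     // the string found via the symbol table
  std::vector<int> members;  // indices of the sections in the group
};

// Given the groups in link order, decide which ones to keep. The first
// group with a given signature is kept; later ones are discarded, and
// all of their member sections go with them.
std::vector<bool>
select_groups(const std::vector<Group>& groups)
{
  std::set<std::string> seen;
  std::vector<bool> keep;
  for (const Group& g : groups)
    {
      // insert() reports whether the signature was new.
      bool first_time = seen.insert(g.signature).second;
      keep.push_back(first_time);
    }
  return keep;
}
```

   Since the decision is made per group rather than per section, the
   code and the static data of one inline function always come from the
   same object file.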

2543

2544

   More tomorrow.

2545

2546

   [23]Permalink [24]Comments (2)

2547


2594

2595

References

2596

2597

   Visible links

2598

   1. http://www.airs.com/blog/feed/

2599

   2. http://www.airs.com/blog/feed/rss/

2600

   3. http://www.airs.com/blog/feed/atom/

2601

   4. http://www.airs.com/blog

2602

   5. http://www.airs.com/blog/archives/56

2603

   6. http://www.airs.com/blog/archives/category/programming/

2604

   7. http://www.airs.com/blog/archives/56

2605

   8. http://www.airs.com/blog/archives/56#comments

2606

   9. http://www.airs.com/blog/archives/55

2607

  10. http://www.airs.com/blog/archives/category/programming/

2608

  11. http://www.airs.com/blog/archives/55

2609

  12. http://www.airs.com/blog/archives/55#comments

2610

  13. http://www.airs.com/blog/archives/54

2611

  14. http://www.airs.com/blog/archives/category/programming/

2612

  15. http://www.airs.com/blog/archives/54

2613

  16. http://www.airs.com/blog/archives/54#comments

2614

  17. http://www.airs.com/blog/archives/53

2615

  18. http://www.airs.com/blog/archives/category/programming/

2616

  19. http://www.airs.com/blog/archives/53

2617

  20. http://www.airs.com/blog/archives/53#comments

2618

  21. http://www.airs.com/blog/archives/52

2619

  22. http://www.airs.com/blog/archives/category/programming/

2620

  23. http://www.airs.com/blog/archives/52

2621

  24. http://www.airs.com/blog/archives/52#comments

2622

  25. http://www.airs.com/blog/page/10/

2623

  26. http://www.airs.com/blog/page/8/

2624

  27. http://www.airs.com/blog/archives/date/2007/11/

2625

  28. http://www.airs.com/blog/archives/date/2007/10/

2626

  29. http://www.airs.com/blog/archives/date/2007/09/

2627

  30. http://www.airs.com/blog/archives/date/2007/08/

2628

  31. http://www.airs.com/blog/archives/date/2007/02/

2629

  32. http://www.airs.com/blog/archives/date/2007/01/

2630

  33. http://www.airs.com/blog/archives/date/2006/12/

2631

  34. http://www.airs.com/blog/archives/date/2006/07/

2632

  35. http://www.airs.com/blog/archives/date/2006/06/

2633

  36. http://www.airs.com/blog/archives/date/2006/04/

2634

  37. http://www.airs.com/blog/archives/date/2006/02/

2635

  38. http://www.airs.com/blog/archives/date/2006/01/

2636

  39. http://www.airs.com/blog/archives/date/2005/12/

2637

  40. http://www.airs.com/blog/archives/date/2005/11/

2638

  41. http://www.airs.com/blog/archives/category/random/

2639

  42. http://www.airs.com/blog/archives/category/money/

2640

  43. http://www.airs.com/blog/archives/category/philosophy/

2641

  44. http://www.airs.com/blog/archives/category/books/

2642

  45. http://www.airs.com/blog/archives/category/politics/

2643

  46. http://www.airs.com/blog/archives/category/programming/

2644

  47. http://web.elastic.org/~fche/blog2/

2645

  48. http://www.dberlin.org/blog/

2646

  49. http://blog.360.yahoo.com/blog-f4oLexAlc6l3W_YHZF1IXYDu

2647

  50. http://tromey.com/blog

2648

  51. http://airs.com/ian/

2649

  52. http://airs.com/ian/essays/

2650

  53. http://lessig.org/blog/

2651

  54. http://schneier.com/blog/

2652

  55. http://www.airs.com/blog/wp-register.php

2653

  56. http://www.airs.com/blog/wp-login.php

2654

  57. http://www.airs.com/blog/feed/

2655

  58. http://www.airs.com/blog/comments/feed/

2656

  59. http://wordpress.org/

2657

  60. http://beccary.com/

2658

  61. http://weblogs.us/

2659

  62. http://validator.w3.org/check/referer

2660

  63. http://jigsaw.w3.org/css-validator/check/referer

2661

2662

   Hidden links:

2663

  64. http://www.airs.com/blog/page/8/

2664

  65. http://www.airs.com/blog/page/8/

2665

  66. http://www.airs.com/blog/page/10/

2666

  67. http://www.airs.com/blog/page/10/

1.1                  extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/E.txt

2671

2672

file : http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/E.txt?rev=1.1&view=markup

2673

plain: http://sources.gentoo.org/viewcvs.py/gentoo-projects/extreme-security/solutions/branches/pappy001/docs/airs_linker_docs/E.txt?rev=1.1&content-type=text/plain

2674

2675

Index: E.txt

2676

===================================================================

2677

   #[1]RSS 2.0 [2]RSS .92 [3]Atom 0.3

2678

2679

[4]Airs - Ian Lance Taylor

2680

2681

[23]Linkers part 20

2682

2683

   September 26, 2007 at 12:14 am · Filed under [24]Programming

2684

2685

   This will be my last blog posting on linkers for the time being.

2686

   Tomorrow my blog will return to its usual trivialities. People who are

2687

   specifically interested in linker information are warned to stop

2688

   reading with this post.

2689

2690

   I'll close the series with a short update on gold, the new linker I've

2691

   been working on. It currently (September 25, 2007) can create

2692

   executables. It cannot create shared libraries or relocatable

2693

   objects. It has very limited support for linker scripts: enough to read

2694

   /usr/lib/libc.so on a GNU/Linux system. It doesn't have any interesting

2695

   new features at this point. It only supports x86. The focus to date has

2696

   been entirely on speed. It is written to be multi-threaded, but the

2697

   threading support has not been hooked in yet.

2698

2699

   By way of example, when linking a 900M C++ executable, the GNU linker

2700

   (version 2.16.91 20060118 on an Ubuntu based system) took 700 seconds

2701

   of user time, 24 seconds of system time, and 16 minutes of wall time.

2702

   gold took 7 seconds of user time, 3 seconds of system time, and 30

2703

   seconds of wall time. So while I can't promise that it will stay as

2704

   fast as all features are added, it's in a pretty good position at the

2705

   moment.

2706

2707

   I'm the main developer on gold, but I'm not the only person working on

2708

   it. A few other people are also making improvements.

2709

2710

   The goal is to release gold as a free program, ideally as part of the

2711

   GNU binutils. I want it to be more nearly feature complete before doing

2712

   this, though. It needs to at least support -shared and -r. I doubt gold

2713

   will ever support all of the features of the GNU linker. I doubt it

2714

   will ever support the full GNU linker script language, although I do

2715

   plan to support enough to link the Linux kernel.

2716

2717

   Future plans for gold, once it actually works, include incremental

2718

   linking and more far-reaching speed improvements.

2719

2720

   [25]Permalink [26]Comments (3)

2721


2768

2769

References

2770

2771

   Visible links

2772

   1. http://www.airs.com/blog/feed/

2773

   2. http://www.airs.com/blog/feed/rss/

2774

   3. http://www.airs.com/blog/feed/atom/

2775

   4. http://www.airs.com/blog

2776

   5. http://www.airs.com/blog/archives/61

2777

   6. http://www.airs.com/blog/archives/category/books/

2778

   7. http://www.airs.com/blog/archives/61

2779

   8. http://www.airs.com/blog/archives/61#comments

2780

   9. http://www.airs.com/blog/archives/60

2781

  10. http://www.airs.com/blog/archives/category/politics/

2782

  11. http://www.airs.com/blog/archives/37

2783

  12. http://www.airs.com/blog/archives/60

2784

  13. http://www.airs.com/blog/archives/60#comments

2785

  14. http://www.airs.com/blog/archives/59

2786

  15. http://www.airs.com/blog/archives/category/money/

2787

  16. http://www.airs.com/blog/archives/59

2788

  17. http://www.airs.com/blog/archives/59#comments

2789

  18. http://www.airs.com/blog/archives/58

2790

  19. http://www.airs.com/blog/archives/category/philosophy/

2791

  20. http://www.airs.com/ian/essays/

2792

  21. http://www.airs.com/blog/archives/58

2793

  22. http://www.airs.com/blog/archives/58#comments

2794

  23. http://www.airs.com/blog/archives/57

2795

  24. http://www.airs.com/blog/archives/category/programming/

2796

  25. http://www.airs.com/blog/archives/57

2797

  26. http://www.airs.com/blog/archives/57#comments

2798

  27. http://www.airs.com/blog/page/9/

2799

  28. http://www.airs.com/blog/page/7/

2800

  29. http://www.airs.com/blog/archives/date/2007/11/

2801

  30. http://www.airs.com/blog/archives/date/2007/10/

2802

  31. http://www.airs.com/blog/archives/date/2007/09/

2803

  32. http://www.airs.com/blog/archives/date/2007/08/

2804

  33. http://www.airs.com/blog/archives/date/2007/02/

2805

  34. http://www.airs.com/blog/archives/date/2007/01/

2806

  35. http://www.airs.com/blog/archives/date/2006/12/

2807

  36. http://www.airs.com/blog/archives/date/2006/07/

2808

  37. http://www.airs.com/blog/archives/date/2006/06/

2809

  38. http://www.airs.com/blog/archives/date/2006/04/

2810

  39. http://www.airs.com/blog/archives/date/2006/02/

2811

  40. http://www.airs.com/blog/archives/date/2006/01/

2812

  41. http://www.airs.com/blog/archives/date/2005/12/

2813

  42. http://www.airs.com/blog/archives/date/2005/11/

2814

  43. http://www.airs.com/blog/archives/category/random/

2815

  44. http://www.airs.com/blog/archives/category/money/

2816

  45. http://www.airs.com/blog/archives/category/philosophy/

2817

  46. http://www.airs.com/blog/archives/category/books/

2818

  47. http://www.airs.com/blog/archives/category/politics/

2819

  48. http://www.airs.com/blog/archives/category/programming/

2820

  49. http://web.elastic.org/~fche/blog2/

2821

  50. http://www.dberlin.org/blog/

2822

  51. http://blog.360.yahoo.com/blog-f4oLexAlc6l3W_YHZF1IXYDu

2823

  52. http://tromey.com/blog

2824

  53. http://airs.com/ian/

2825

  54. http://airs.com/ian/essays/

2826

  55. http://lessig.org/blog/

2827

  56. http://schneier.com/blog/

2828

  57. http://www.airs.com/blog/wp-register.php

2829

  58. http://www.airs.com/blog/wp-login.php

2830

  59. http://www.airs.com/blog/feed/

2831

  60. http://www.airs.com/blog/comments/feed/

2832

  61. http://wordpress.org/

2833

  62. http://beccary.com/

2834

  63. http://weblogs.us/

2835

  64. http://validator.w3.org/check/referer

2836

  65. http://jigsaw.w3.org/css-validator/check/referer

2837

2838

   Hidden links:

2839

  66. http://www.airs.com/blog/page/7/

2840

  67. http://www.airs.com/blog/page/7/

2841

  68. http://www.airs.com/blog/page/9/

2842

  69. http://www.airs.com/blog/page/9/

--

2847

gentoo-commits@g.o mailing list

Gentoo Archives: gentoo-commits