[gentoo-commits] proj/hardened-docs:master commit in: xml/integrity/ - gentoo-commits

From:	Sven Vermeulen <sven.vermeulen@××××××.be>
To:	gentoo-commits@l.g.o
Subject:	[gentoo-commits] proj/hardened-docs:master commit in: xml/integrity/
Date:	Mon, 30 Jul 2012 19:24:35
Message-Id:	`1343676169.380cd1dcbd2b712ca5a850f77cb7aedbe83818d9.SwifT@gentoo`

commit:     380cd1dcbd2b712ca5a850f77cb7aedbe83818d9
Author:     Sven Vermeulen <sven.vermeulen <AT> siphos <DOT> be>
AuthorDate: Mon Jul 30 19:22:49 2012 +0000
Commit:     Sven Vermeulen <sven.vermeulen <AT> siphos <DOT> be>
CommitDate: Mon Jul 30 19:22:49 2012 +0000
URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/hardened-docs.git;a=commit;h=380cd1dc

Adding concepts guide for integrity subproject

---
 xml/integrity/concepts.xml |  524 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 524 insertions(+), 0 deletions(-)

diff --git a/xml/integrity/concepts.xml b/xml/integrity/concepts.xml
new file mode 100644
index 0000000..c8859f9
--- /dev/null
+++ b/xml/integrity/concepts.xml
@@ -0,0 +1,524 @@

20

+<?xml version='1.0' encoding='UTF-8'?>

21

+<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">

22

+<!-- $Header$ -->

23

+

24

+<guide lang="en">

25

+<title>Integrity - Introduction and Concepts</title>

26

+

27

+<author title="Author">

28

+  <mail link="swift"/>

29

+</author>

30

+

31

+<abstract>

32

+Integrity validation is a wide field in which many technologies play a role.

33

+This guide aims to offer a high-level view on what integrity validation is all

34

+about and how the various technologies work together to achieve a (hopefully)

35

+more secure environment to work in.

36

+</abstract>

37

+

38

+<!-- The content of this document is licensed under the CC-BY-SA license -->

39

+<!-- See http://creativecommons.org/licenses/by-sa/3.0 -->

40

+<license version="3.0" />

41

+

42

+<version>1</version>

43

+<date>2012-07-30</date>

44

+

45

+<chapter>

46

+<title>It is about trust</title>

47

+<section>

48

+<title>Introduction</title>

49

+<body>

50

+

51

+<p>

52

+Integrity is about trusting components within your environment, and in our case

53

+the workstations, servers and machines you work on. You definitely want to be

54

+certain that the workstation you type your credentials on to log on to the

55

+infrastructure is not compromised in any way. This "trust" in your environment

56

+is a combination of various factors: physical security, system security patching

57

+process, secure configuration, access controls and more.

58

+</p>

59

+

60

+<p>

61

+Integrity plays a role in this security field: it tries to ensure that the
+systems have not been tampered with by malicious people or organizations. This
+tamper resistance extends to a wide range of components that need to be
+validated. You probably want to be certain that the binaries that are run (and
+the libraries that are loaded) are those you built yourself (in the case of
+Gentoo) or were provided to you by someone (or something) you trust, and that
+the Linux kernel you booted (and the modules that are loaded) are those you
+made, not those of someone else.

69

+</p>

70

+

71

+<p>

72

+Most people trust themselves and see integrity as proof that things are still
+as they built them. But to support this claim, the systems you use to ensure
+integrity need to be trusted too: you want to make sure that whatever system
+gives you the final yes/no on integrity only uses trusted information (did it
+really validate the binary?) and trusted services (is it not running on a
+compromised system?). To support these claims, many ideas, technologies,
+processes and algorithms have been developed over the years.

79

+</p>

80

+

81

+<p>

82

+In this document, we will talk about a few of those, and how they fit into the
+Gentoo Hardened Integrity subproject's vision and roadmap.

84

+</p>

85

+

86

+</body>

87

+</section>

88

+</chapter>

89

+

90

+<chapter>

91

+<title>Hash results</title>

92

+<section>

93

+<title>Algorithmically validating a file's content</title>

94

+<body>

95

+

96

+<p>

97

+Hashes are a primary method for validating that a file (or other resource) has
+not been changed since it was first inspected. A hash is the result of a
+mathematical calculation on the content of a file (most often a number or
+ordered set of numbers), and exhibits the following properties:

101

+</p>

102

+

103

+<ul>

104

+  <li>

105

+    The resulting number has a <e>small (often fixed-size) length</e>. This is
+    necessary to allow fast verification of whether two hash values are the
+    same, but also to allow storing the value in a secure location (which is,
+    more often than not, much more restricted in space).

109

+  </li>

110

+  <li>

111

+    The hash function always <e>returns the same hash</e> (output) when the file it
+    inspects has not been changed (input). Otherwise it would be impossible to
+    ensure that the file content hasn't changed.

114

+  </li>

115

+  <li>

116

+    The hash function is fast to run (the calculation of a hash result does not
+    take up too much time or too many resources). Without this property, it
+    would take too long to generate and validate hash results, leaving users
+    dissatisfied (and more likely to disable the validation altogether).

120

+  </li>

121

+  <li>

122

+    The hash result <e>cannot be used to reconstruct</e> the file. Although this is
+    often seen as a consequence of the first property (small length), it is
+    important because hash results are often also used as a "public validation"
+    of data that is otherwise private in nature. In other words, many processes
+    rely on the inability of users (or hackers) to reverse-engineer information
+    from its hash result. A good example is password databases, which
+    <e>should</e> store hashes of the passwords, not the passwords themselves.

129

+  </li>

130

+  <li>

131

+    Given a hash result, it is near impossible to find another file with the

132

+    same hash result (or to create such a file yourself). Since the hash result

133

+    is limited in space, there are many inputs that will map onto the same

134

+    hash result. The power of a good hash function is that it is not feasible to

135

+    find them (or calculate them) except by brute force. When such a match is

136

+    found, it is called a <e>collision</e>.

137

+  </li>

138

+</ul>

139

+

140

+<p>

141

+Compared with checksums, hashes try to be more cryptographically secure (and as
+such more effort is put into the last property, to make sure collisions are
+very hard to obtain). Some even try to generate hash results in such a way
+that the time needed to calculate a hash cannot be used to obtain information
+about the data (such as whether it contains more 0s than 1s).

146

+</p>
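+
+<p>
+The first two properties are easy to observe yourself. The following is a
+minimal sketch, assuming the GNU coreutils <c>sha256sum</c> command is
+available; the input strings are arbitrary examples.
+</p>
+
+<pre caption="Observing the fixed length and repeatability of a hash">
+<comment>(A short and a longer input both give 64 hexadecimal digits plus a newline)</comment>
+~$ <i>printf 'a' | sha256sum | cut -d' ' -f1 | wc -c</i>
+65
+~$ <i>printf 'a much longer input string' | sha256sum | cut -d' ' -f1 | wc -c</i>
+65
+<comment>(Hashing the same input twice gives the same result)</comment>
+~$ <i>if [ "$(printf 'a' | sha256sum)" = "$(printf 'a' | sha256sum)" ]; then echo same; fi</i>
+same
+</pre>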

147

+

148

+</body>

149

+</section>

150

+<section>

151

+<title>Hashes in integrity validation</title>

152

+<body>

153

+

154

+<p>

155

+Integrity validation services are often based on hash generation and validation.

156

+Tools such as <uri link="http://www.tripwire.org/">tripwire</uri> or <uri

157

+link="http://aide.sourceforge.net/">AIDE</uri> generate hashes of files and

158

+directories on your systems and then ask you to store them safely. When you want

159

+the integrity of your systems checked, you provide this information to the

160

+program (most likely in a read-only manner since you don't want this list to

161

+be modified while validating) which then recalculates the hashes of the files

162

+and compares them with the given list. Any changes in files are detected and can

163

+be reported to you (or the administrator).

164

+</p>

165

+

166

+<p>

167

+A popular hash function is SHA-1 (which you can generate and validate using the
+<c>sha1sum</c> command), which gained momentum after MD5 (using <c>md5sum</c>)
+was found to be less secure (nowadays collisions in MD5 are easy to generate).
+SHA-2 also exists (but is less popular than SHA-1) and is available through
+the commands <c>sha224sum</c>, <c>sha256sum</c>, <c>sha384sum</c> and
+<c>sha512sum</c>.

173

+</p>

174

+

175

+<pre caption="Generating the SHA-1 sum of a file">

176

+~$ <i>sha1sum ~/Downloads/pastie-4301043.rb</i>

177

+6b9b4e0946044ec752992c2afffa7be103c2e748  /home/swift/Downloads/pastie-4301043.rb

178

+</pre>
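+
+<p>
+Validating files against a stored list, as the tools above do on a larger
+scale, can be sketched with the <c>-c</c> option of the same command. This
+assumes GNU coreutils; the <path>integrity-demo</path> directory and the file
+names are hypothetical, and in practice the list would be kept somewhere safe.
+</p>
+
+<pre caption="Checking a file against a stored SHA-1 list">
+~$ <i>mkdir integrity-demo</i>
+~$ <i>printf 'original content\n' &gt; integrity-demo/example.conf</i>
+~$ <i>sha1sum integrity-demo/example.conf &gt; integrity-demo/hashes.sha1</i>
+~$ <i>sha1sum -c integrity-demo/hashes.sha1</i>
+integrity-demo/example.conf: OK
+<comment>(Any later change to the file makes the check fail)</comment>
+~$ <i>printf 'tampered content\n' &gt; integrity-demo/example.conf</i>
+~$ <i>sha1sum -c integrity-demo/hashes.sha1</i>
+integrity-demo/example.conf: FAILED
+sha1sum: WARNING: 1 computed checksum did NOT match
+</pre>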

179

+

180

+</body>

181

+</section>

182

+<section>

183

+<title>Hashes are a means, not a solution</title>

184

+<body>

185

+

186

+<p>

187

+Hashes, in the field of integrity validation, are a means to compare data and
+integrity in a relatively fast way. However, by themselves hashes cannot
+provide integrity assurance to the administrator. Take the use of
+<c>sha1sum</c> by itself, for instance.

191

+</p>

192

+

193

+<p>

194

+You have no guarantee that the <c>sha1sum</c> application itself behaves
+correctly, since it might have been tampered with. You can't use
+<c>sha1sum</c> against itself, since a maliciously modified command can simply
+return (print out) the expected SHA-1 sum rather than the real one. A way to
+thwart this is to provide the binary together with the hash values on read-only
+media.

200

+</p>

201

+

202

+<p>

203

+But then you're still not certain that it is that application that is executed:
+a modified system might have you think it is executing that application, while
+it is in fact using a different one. To provide this level of trust, you
+need assurance from a higher-positioned, trusted service that the right
+application is being run. Running with a trusted kernel helps here (but might
+not provide 100% certainty), and you most likely need assistance from the
+hardware (we will talk about the Trusted Platform Module later).

210

+</p>

211

+

212

+<p>

213

+Likewise, you are not guaranteed that it is still your file with hash results

214

+that is being used to verify the integrity of a file. Another file (with

215

+modified content) may be bind-mounted on top of it. To support integrity

216

+validation with a trusted information source, some solutions use HMAC digests

217

+instead of plain hashes.

218

+</p>

219

+

220

+<p>

221

+Finally, checksums should not only be taken of file content, but also of file
+attributes (which are often used to provide access controls or even toggle
+particular security measures on or off for a file, as is the case with PaX
+markings), directories (holding information about directory updates such
+as file additions or removals) and privileges. These are things that a program
+like <c>sha1sum</c> doesn't offer (but tools like AIDE do).

227

+</p>
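+
+<p>
+A small sketch of this limitation, assuming GNU coreutils (<c>sha1sum</c> and
+<c>stat -c</c>) and hypothetical file names: a permission change leaves the
+content hash untouched, so a check on content alone does not notice it.
+</p>
+
+<pre caption="A content hash does not cover file permissions">
+~$ <i>printf 'content\n' &gt; example.conf</i>
+~$ <i>chmod 600 example.conf</i>
+~$ <i>sha1sum example.conf &gt; before.sha1</i>
+~$ <i>chmod 666 example.conf</i>
+<comment>(The content hash still matches, although the permissions changed)</comment>
+~$ <i>sha1sum -c before.sha1</i>
+example.conf: OK
+~$ <i>stat -c '%a %n' example.conf</i>
+666 example.conf
+</pre>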

228

+

229

+</body>

230

+</section>

231

+</chapter>

232

+

233

+<chapter>

234

+<title>Hash-based Message Authentication Codes</title>

235

+<section>

236

+<title>Trusting the hash result</title>

237

+<body>

238

+

239

+<p>

240

+In order to trust a hash result, some solutions use HMAC digests instead. An
+HMAC digest combines a regular hash function (and its properties) with a
+secret cryptographic key. As such, the function generates the hash of the
+content of a file together with the secret cryptographic key. This not only
+provides integrity validation of the file, but also a signature telling the
+verification tool that the hash was made in the past by a trusted application
+(one that knows the cryptographic key) and has not been tampered with since.

247

+</p>

248

+

249

+<p>

250

+By using HMAC digests, malicious users will find it more difficult to modify
+code and then present a "fake" hash results file, since they do not know the
+secret cryptographic key that is needed to generate the new hash results.
+When you see a term like <e>HMAC-SHA1</e>, it means that the SHA-1 hash
+function is used together with a cryptographic key.

255

+</p>
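+
+<p>
+As an illustration only, an HMAC-SHA1 digest can be generated with the
+<c>openssl dgst</c> command and its <c>-hmac</c> option, assuming OpenSSL is
+installed. The file name and both keys below are made up for this sketch, and
+the key is passed on the command line purely for demonstration purposes.
+</p>
+
+<pre caption="Generating HMAC-SHA1 digests with two different keys">
+~$ <i>printf 'some file content\n' &gt; example.conf</i>
+<comment>(Prints the HMAC-SHA1 digest of the file for this key)</comment>
+~$ <i>openssl dgst -sha1 -hmac 'demo-secret-key' example.conf</i>
+<comment>(Same file, different key: the printed digest differs)</comment>
+~$ <i>openssl dgst -sha1 -hmac 'another-key' example.conf</i>
+</pre>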

256

+

257

+</body>

258

+</section>

259

+<section>

260

+<title>Managing the keys</title>

261

+<body>

262

+

263

+<p>

264

+Using keys to "protect" the hash results introduces another level of complexity:

265

+how do you properly, securely store the keys and access them only when needed?

266

+You cannot just embed the key in the hash list (since a tampered system might

267

+read it out when you are verifying the system, generate its own results file and

268

+have you check against that instead). Likewise you can't just embed the key in

269

+the application itself, because a tampered system might just read out the

270

+application binary to find the key (and once compromised, you might need to

271

+rebuild the application completely with a new key).

272

+</p>

273

+

274

+<p>

275

+You might be tempted to just provide the key as a command-line argument, but
+then you cannot be certain that no malicious user is idling on your system,
+waiting to capture this valuable information from the output of <c>ps</c> or
+similar tools.

278

+</p>

279

+

280

+<p>

281

+Again, the need arises to trust a higher-level component. When you trust the
+kernel, you might be able to use the kernel key ring for this.

283

+</p>

284

+

285

+</body>

286

+</section>

287

+</chapter>

288

+

289

+<chapter>

290

+<title>Using private/public key cryptography</title>

291

+<section>

292

+<title>Validating integrity using public keys</title>

293

+<body>

294

+

295

+<p>

296

+One way to work around the risk of a malicious user getting hold of the secret
+key is to not rely on that key for the authentication of the hash result in
+the first place when verifying the integrity of the system. This can be
+accomplished if, instead of using just an HMAC, you also encrypt the HMAC
+digest with a private key.

301

+</p>

302

+

303

+<p>

304

+During validation of the hashes, you decrypt the stored HMAC digests with the
+public key (not the private key) and compare them with the HMAC digests you
+generate again.

306

+</p>

307

+

308

+<p>

309

+In this approach, an attacker cannot forge a fake HMAC since forgery requires
+access to the private key, and the private key is never needed on the system
+to validate signatures. And as long as no collisions occur, the attacker also
+cannot reuse the encrypted HMAC values (which you could consider to be a
+replay attack).

313

+</p>
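+
+<p>
+The same idea can be tried out with a standard digital signature. Note that
+this sketch swaps in a different technique: <c>openssl dgst -sign</c> signs a
+plain SHA-256 hash of the file rather than encrypting an HMAC digest as
+described above. It assumes OpenSSL is installed, and the key and file names
+are hypothetical.
+</p>
+
+<pre caption="Signing with a private key, verifying with the public key">
+~$ <i>printf 'some file content\n' &gt; example.conf</i>
+<comment>(Key generation; informational output omitted)</comment>
+~$ <i>openssl genrsa -out private.pem 2048</i>
+~$ <i>openssl rsa -in private.pem -pubout -out public.pem</i>
+<comment>(Done once, on a trusted system holding the private key)</comment>
+~$ <i>openssl dgst -sha256 -sign private.pem -out example.conf.sig example.conf</i>
+<comment>(On the system being validated, only public.pem is needed)</comment>
+~$ <i>openssl dgst -sha256 -verify public.pem -signature example.conf.sig example.conf</i>
+Verified OK
+</pre>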

314

+

315

+</body>

316

+</section>

317

+<section>

318

+<title>Ensuring the key integrity</title>

319

+<body>

320

+

321

+<p>

322

+Of course, this still requires that the public key cannot be modified by a
+tampered system: a fake list of hash results can be made using a different
+private key, and the moment the tool wants to decrypt the encrypted values, the
+tampered system replaces the public key with its own public key, and the system
+is again vulnerable.

327

+</p>

328

+

329

+</body>

330

+</section>

331

+</chapter>

332

+

333

+<chapter>

334

+<title>Trust chain</title>

335

+<section>

336

+<title>Handing over trust</title>

337

+<body>

338

+

339

+<p>

340

+As you've noticed from the methods and services above, you always need to have

341

+something you trust and that you can build on. If you trust nothing, you can't

342

+validate anything since nothing can be trusted to return a valid response. And

343

+to trust something means you also want to have confidence that that system

344

+itself uses trusted resources.

345

+</p>

346

+

347

+<p>

348

+For many users, the hardware level is something they trust. After all, as long

349

+as no burglar has come in the house and tampered with the hardware itself, it is

350

+reasonable to expect that the hardware is still the same. In effect, the users

351

+trust that the physical protection of their house is sufficient for them.

352

+</p>

353

+

354

+<p>

355

+For companies, the physical protection of the working environment is not

356

+sufficient for ultimate trust. They want to make sure that the hardware is not

357

+tampered with (or different hardware is suddenly used), specifically when that

358

+company uses laptops instead of (less portable) workstations. 

359

+</p>

360

+

361

+<p>

362

+The less you trust, the more things you need to take care of in order to
+be confident that the system has not been tampered with. In the Gentoo Hardened
+Integrity subproject we will use the following "order" of resources:

365

+</p>

366

+

367

+<ul>

368

+  <li>

369

+    <e>System root-owned files and root-running processes</e>. In most cases

370

+    and most households, properly configured and protected systems will trust

371

+    root-owned files and processes. Any request for integrity validation of

372

+    the system is usually applied against user-provided files (no-one tampered

373

+    with the user account or specific user files) and not against the system

374

+    itself.

375

+  </li>

376

+  <li>

377

+    <e>Operating system kernel</e> (in our case the Linux kernel). Although some
+    precautions need to be taken, a properly configured and protected kernel can
+    provide a higher trust level. Integrity validation at the kernel level can
+    offer greater trust in the system's integrity, although you must be aware
+    that most kernels still reside on the system itself.

382

+  </li>

383

+  <li>

384

+    <e>Live environments</e>. A bootable (preferably) read-only medium can be

385

+    used to boot up a validation environment that scans and verifies the

386

+    integrity of the system-under-investigation. In this case, even tampered

387

+    kernel boot images can be detected, and by taking proper precautions when

388

+    running the validation (such as ensuring no network access is enabled from

389

+    the boot up until the final compliance check has occurred) you can make

390

+    yourself confident of the state of the entire system.

391

+  </li>

392

+  <li>

393

+    <e>Hypervisor level</e>. Hypervisors are seen by many organizations as
+    trusted resources (the isolation of a virtual environment is hard to break
+    out of). Integrity validation on the hypervisor level can therefore provide
+    confidence, especially when "chaining trusts": the hypervisor first
+    validates the kernel to boot, and then boots this (now trusted) kernel which
+    loads up the rest of the system.

399

+  </li>

400

+  <li>

401

+    <e>Hardware level</e>. Whereas hypervisors are still "just software", you
+    can lift trust up to the hardware level and use the hardware-offered
+    integrity features to provide you with confidence that the system you are
+    about to boot has not been tampered with.

405

+  </li>

406

+</ul>

407

+

408

+<p>

409

+In the Gentoo Hardened Integrity subproject, we aim to eventually support all

410

+these levels (and perhaps more) to provide you as a user the tools and methods

411

+you need to validate the integrity of your system, up to the point that you

412

+trust. The less you trust, the more complex a trust chain might become to

413

+validate (and manage), but we will not limit our research and support to a

414

+single technology (or chain of technologies).

415

+</p>

416

+

417

+<p>

418

+Chaining trust is an important aspect to keep things from becoming too complex

419

+and unmanageable. It also allows users to just "drop in" at the level of trust

420

+they feel is sufficient, rather than requiring technologies for higher levels.

421

+</p>

422

+

423

+<p>

424

+For instance:

425

+</p>

426

+

427

+<ul>

428

+  <li>

429

+    A hardware component that you trust (like a <e>Trusted Platform Module</e>

430

+    or a specific BIOS-supported functionality) verifies the integrity of the

431

+    boot regions on your disk. When ok, it passes control over to the

432

+    bootloader.

433

+  </li>

434

+  <li>

435

+    The bootloader now validates the integrity of its configuration and of the

436

+    files (kernel and initramfs) it is told to boot up. If it checks out, it

437

+    boots the kernel and hands over control to this kernel.

438

+  </li>

439

+  <li>

440

+    The kernel, together with the initial ram file system, verifies the

441

+    integrity of the system components (and for instance SELinux policy) before

442

+    the initial ram system changes to the real system and boots up the

443

+    (verified) init system.

444

+  </li>

445

+  <li>

446

+    The (root-running) init system validates the integrity of the services it

447

+    wants to start before handing over control of the system to the user.

448

+  </li>

449

+</ul>

450

+

451

+<p>

452

+An even longer chain can be seen with hypervisors:

453

+</p>

454

+

455

+<ul>

456

+  <li>

457

+    Hardware validates boot loader

458

+  </li>

459

+  <li>

460

+    Boot loader validates hypervisor kernel and system

461

+  </li>

462

+  <li>

463

+    Hypervisor validates kernel(s) of the images (or the entire images)

464

+  </li>

465

+  <li>

466

+    Hypervisor-managed virtual environment starts the image

467

+  </li>

468

+  <li>

469

+    ...

470

+  </li>

471

+</ul>

472

+

473

+</body>

474

+</section>

475

+<section>

476

+<title>Integrity on serviced platforms</title>

477

+<body>

478

+

479

+<p>

480

+Sometimes you cannot trust higher positioned components, but still want to be

481

+assured that your service is not tampered with. An example would be when you are

482

+hosting a system in a remote, non-accessible data center or when you manage an

483

+image hosted by a virtualized hosting provider (I don't want to say "cloud"

484

+here, but it fits).

485

+</p>

486

+

487

+<p>

488

+In these cases, you want a level of assurance that your own image has not been

489

+tampered with while being offline (you can imagine manipulating the guest image,

490

+injecting trojans or other backdoors, and then booting the image) or even while

491

+running the system. Instead of trusting the higher components, you try to deal

492

+with a level of distrust that you want to manage.

493

+</p>

494

+

495

+<p>

496

+Providing you with some confidence at this level too is our goal within the

497

+Gentoo Hardened Integrity subproject.

498

+</p>

499

+

500

+</body>

501

+</section>

502

+<section>

503

+<title>From measurement to protection</title>

504

+<body>

505

+

506

+<p>

507

+When dealing with integrity (and trust chains), the idea behind the top-down

508

+trust chain is that higher level components first measure the integrity of the

509

+next component, validate (and take appropriate action) and then hand over

510

+control to this component. This is what we call <e>protection</e> or

511

+<e>integrity enforcement</e> of resources.

512

+</p>

513

+

514

+<p>

515

+If the system cannot validate the integrity, or the system is too volatile to

516

+enforce this integrity from a higher level, it is necessary to provide a trusted

517

+method for other services to validate the integrity. In this case, the system

518

+<e>attests</e> the state of the underlying component(s) towards a third party

519

+service, which <e>appraises</e> this state against a known "good" value.

520

+</p>

521

+

522

+<p>

523

+In the case of our HMAC-based checks, there is no enforcement of integrity of
+the files, but the tool itself attests the state of the resources by generating
+new HMAC digests and validating (appraising) them against the list of HMAC
+digests it took before.

527

+</p>

528

+

529

+</body>

530

+</section>

531

+</chapter>

532

+

533

+<chapter>

534

+<title>An implementation: the Trusted Computing Group functionality</title>

535

+<section>

536

+<title>Trusted Platform Module</title>

537

+<body>

538

+

539

+</body>

540

+</section>

541

+</chapter>

542

+

543

+</guide>

Gentoo Archives: gentoo-commits