Re: [gentoo-user] Laptop Install Issue - gentoo-user

From:	Devon Miller <devon.c.miller@×××××.com>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] Laptop Install Issue
Date:	Mon, 12 Dec 2005 14:58:34
Message-Id:	`c52221f0512120648p457e2f13xd5db3885c052c827@mail.gmail.com`
In Reply to:	Re: [gentoo-user] Laptop Install Issue by "Mariusz Pękala"

Do you have CONFIG_CPU_FREQ defined in your kernel config?

I have an HP laptop where I have seen similar behavior. After dealing with
it for some time, I tracked it down to a problem with changing the CPU's
frequency. For a very short period after the clock is changed, the thermal
sensor reads back nonsense. I've seen readings like "69... 69... 95...
70..." and that's with 0.5 second sampling. I've found 2 workarounds:

1) The quick and easy way:
       /etc/init.d/powernowd stop
       Now, build x.org
       /etc/init.d/powernowd start

       Of course, you'll need to replace powernowd with whatever power
       management daemon you have emerged.
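
To make sure the daemon comes back even when the build fails, the three steps above can be wrapped in a small function. This is only a sketch: `with_daemon_stopped` is a name made up for illustration, and it assumes the init script accepts `stop` and `start` as in the commands above.

```shell
# with_daemon_stopped INITSCRIPT COMMAND [ARGS...]
# Hypothetical helper: stop the daemon, run COMMAND, then start the daemon
# again whether or not COMMAND succeeded. Returns COMMAND's exit status.
# The body runs in a subshell, so its variables stay private.
with_daemon_stopped() (
    initscript=$1
    shift

    # Do not run the command if the daemon could not be stopped.
    "$initscript" stop || exit 1

    "$@"
    status=$?

    "$initscript" start
    exit "$status"
)

# Example (run as root):
#   with_daemon_stopped /etc/init.d/powernowd emerge xorg-x11
```

Returning the build's own exit status keeps the wrapper usable in scripts that need to know whether the emerge succeeded.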

2) The uglier, but potentially more useful fix:
        Save this as thermal.diff:
-----------------------------------------------------------------------------------------
--- orig/drivers/acpi/thermal.c 2005-07-07 22:37:42.000000000 -0400
+++ new/drivers/acpi/thermal.c  2005-06-15 18:30:43.000000000 -0400
@@ -61,7 +61,8 @@
 #define ACPI_THERMAL_MODE_ACTIVE       0x00
 #define ACPI_THERMAL_MODE_PASSIVE      0x01
 #define ACPI_THERMAL_MODE_CRITICAL     0xff
-#define ACPI_THERMAL_PATH_POWEROFF     "/sbin/poweroff"
+//#define ACPI_THERMAL_PATH_POWEROFF   "/sbin/poweroff"
+#define ACPI_THERMAL_PATH_POWEROFF     "/sbin/overheat"
 
 #define ACPI_THERMAL_MAX_ACTIVE        10
 #define ACPI_THERMAL_MAX_LIMIT_STR_LEN 65
-----------------------------------------------------------------------------------------

        Patch the kernel by cd'ing to /usr/src/linux and typing:
                patch -p1 < <path-to>/thermal.diff

        This will cause the kernel to call /sbin/overheat instead of
        /sbin/poweroff if your laptop hits a critical temperature.
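
Before touching the tree it can be worth checking that the diff applies cleanly. The following is a sketch using GNU patch's `--dry-run` option; `apply_checked` is a hypothetical helper name, the path in the example is made up, and the patch file must be given as an absolute path because the function changes directory first.

```shell
# apply_checked SRCDIR PATCHFILE
# Hypothetical helper: apply PATCHFILE (absolute path) inside SRCDIR with
# -p1, but only if a dry run reports no problems.
# The body runs in a subshell, so the cd does not affect the caller.
apply_checked() (
    cd "$1" || exit 1
    # --dry-run (GNU patch) checks the hunks without modifying any file.
    patch -p1 --dry-run < "$2" > /dev/null || exit 1
    patch -p1 < "$2"
)

# Example (the patch location is only an illustration):
#   apply_checked /usr/src/linux /root/thermal.diff
```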

        Save this as /sbin/overheat:
-----------------------------------------------------------------------------------------
#!/bin/bash

# Init script of the power management daemon; adjust to the one you use.
POWER_MGT_COMMAND=/etc/init.d/powernowd

# Act only if the daemon is currently running.
if ${POWER_MGT_COMMAND} status > /dev/null ; then
    ${POWER_MGT_COMMAND} stop

    # Pin the CPU to its lowest supported frequency.
    cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq \
        > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
    # 0 selects the active cooling mode (ACPI_THERMAL_MODE_ACTIVE).
    echo -n 0 > /proc/acpi/thermal_zone/THRM/cooling_mode
    # Tell logged-in users what happened, with the current reading.
    (
        echo System switched to low power mode for cooling
        cat /proc/acpi/thermal_zone/THRM/temperature
    ) | wall
fi
-----------------------------------------------------------------------------------------

        Make /sbin/overheat executable by typing:
            chmod 755 /sbin/overheat

        Now, when the thermal sensor reports crazy values, my laptop just
        slows way down instead of completely stopping.
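
A userspace monitor can also be made less sensitive to the one-sample spikes described at the top of this mail (69, 69, 95, 70) by acting on the median of the last three readings instead of on each raw value. This does not change what the kernel does at the critical trip point; it only illustrates the filtering idea. A minimal sketch, where `median3` is a hypothetical helper and the readings are assumed to be whole degrees:

```shell
# median3 A B C
# Print the median of three integer readings. A single outlier among the
# three can never be the median, so one-sample spikes are ignored.
median3() {
    printf '%s\n' "$1" "$2" "$3" | sort -n | sed -n '2p'
}

# With the readings quoted above:
#   median3 69 69 95   prints 69
#   median3 69 95 70   prints 70
```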

61

62

On my todo list:

63

        o  After the temperature comes down, reenable power management

64

        o  If the temperature does not come down in a reasonable period,

65

then shut it down.

66

        o  A better patch that takes into account cpufreq changes and

67

disable the thermal faults for a few ms after a frequency change. I need to

68

get a better idea of how long the sensor gives erroneous readings.
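
The first two items on this list could be sketched roughly as follows. Everything here is an assumption rather than something tested on the laptop described in this mail: `wait_for_cooldown` and `read_temp` are hypothetical names, and the 80 degree threshold, the 60 tries and the 5 second interval are arbitrary.

```shell
# wait_for_cooldown READ_CMD THRESHOLD TRIES INTERVAL
# Call READ_CMD (which must print the temperature as an integer) up to
# TRIES times, INTERVAL seconds apart. Exit status 0 as soon as a reading
# is below THRESHOLD, 1 if it never is, 2 if READ_CMD fails.
# The body runs in a subshell, so its variables stay private.
wait_for_cooldown() (
    read_cmd=$1 threshold=$2 tries=$3 interval=$4
    i=0
    while [ "$i" -lt "$tries" ]; do
        temp=$("$read_cmd") || exit 2
        [ "$temp" -lt "$threshold" ] && exit 0
        sleep "$interval"
        i=$((i + 1))
    done
    exit 1
)

# Possible use at the end of /sbin/overheat (read_temp is hypothetical and
# assumes the file looks like "temperature:   69 C"):
#   read_temp() { awk '{ print $2 }' /proc/acpi/thermal_zone/THRM/temperature; }
#   if wait_for_cooldown read_temp 80 60 5; then
#       /etc/init.d/powernowd start
#   else
#       /sbin/poweroff
#   fi
```

Passing the reader as a command keeps the waiting logic independent of where the temperature comes from, which also makes it easy to try out with a fake reader.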

dcm

On 12/12/05, Mariusz Pękala <skoot@××.pl> wrote:
>
> > On Sunday, 11 December 2005 11:42, C. Beamer wrote:
> > > My issue is this:  The computer powered off in the middle of the
> > > install of xorg-x11.  This has happened a couple of times.  I haven't
> > > been having problems with the laptop, so I'm pretty sure the issue has
> > > something to do with power management since I built power management
> > > into the kernel, but didn't emerge acpid.  Anyway, since the emerge of
> > > xorg-x11 has bombed a couple of times, is there anything that I should
> > > do in the way of clean up before trying to emerge it again?
> > > Colleen
>
> > On 2005-12-11 17:32:46 +0100 (Sun, Dec), Rafael Fernández López wrote:
> > I can't make any sense of that issue: I can't understand what would
> > make your computer turn off during a compilation.
> >
> > Well... I'm afraid of temperature. I hope that's not the reason, but it
> > is the first thing that came to my mind. Maybe your laptop (I have an
> > Amilo Fujitsu Siemens, and it gets really hot when compiling OO or KDE)
> > turns off for safety reasons when it reaches a certain temperature.
> >
> > I cannot find any other reason.
>
> I vote for temperature issues too. That is my experience with an
> Aristo laptop - it gets very hot very easily and powers off when the
> temperature exceeds 85 C.
>
> You may try to run something like this while emerging:
> # while sleep 5 ; do cat /proc/acpi/thermal_zone/THM0/temperature >> /tmp/temper ; done &
>
> and hope that part of that file will survive the poweroff - you will see
> whether the temperature was rising before the end.
>
> Or you may put something like:
> ... do cat /proc/acp..... | tee -a /tmp/temper ; done &
> in the background in the session in which emerge runs and observe the
> temperature between compilation lines.
>
> The exact path to the temperature file may differ; it will be something
> like /proc/acpi/thermal_zone/*/temperature - and it will exist only if
> your kernel has the necessary drivers compiled in (or modules loaded).
>
> The /proc/acpi/thermal_zone/*/temperature file has about 30 bytes, so
> 35 thousand copies make a 1 MB file. Your loop may therefore run for about
> 9 hours if storing one copy every second, or about 48 hours if appending
> one copy every 5 seconds, before the file reaches 1 MB.
>
> HTH.
>
> --
> No virus found in this outgoing message.
> Checked by 'grep -i virus $MESSAGE'
> Trust me.
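
The logging loop from the quoted message can be written so that it finds the zone by itself and flushes each sample to disk, which improves the odds that the last readings survive a sudden poweroff. A sketch only: `first_thermal_zone` and `log_temps` are hypothetical names, and /proc/acpi/thermal_zone is the location used in this thread (it may not exist on other kernels).

```shell
# first_thermal_zone [BASEDIR]
# Print the first directory under BASEDIR that has a readable
# "temperature" file; fail if there is none.
first_thermal_zone() (
    for zone in "${1:-/proc/acpi/thermal_zone}"/*; do
        if [ -r "$zone/temperature" ]; then
            printf '%s\n' "$zone"
            exit 0
        fi
    done
    exit 1
)

# log_temps ZONE OUTFILE COUNT INTERVAL
# Append COUNT timestamped readings from ZONE to OUTFILE, INTERVAL seconds
# apart, calling sync after each one so the data reaches the disk.
log_temps() (
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(cat "$1/temperature")" >> "$2"
        sync
        sleep "$4"
        i=$((i + 1))
    done
)

# Example: one sample every 5 seconds for an hour, in the background:
#   log_temps "$(first_thermal_zone)" /tmp/temper 720 5 &
```

The timestamps make it possible to line the readings up with the build output afterwards.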
