A Proof on Why GPU-Z Must Be Misreading My RAM and SHADER Temps as Unreasonably High

sirRealist

Member
Joined
Apr 21, 2006

Before you run away because of the length of the post, I plead with you to read through and try to help me out. I don't understand what's going on here, and any help would be appreciated. Thanks in advance!


I've posted about this in a couple threads along with many other questions,
but I thought I might concentrate my last remaining question(s) (my temp reads) in a new thread with all the newest info.

Before I go on, let me say that I am using an ASUS TOP 4850,
on stock volts, with an Accelero S1 w/ 120mm fan mounted on it, with Thermalright copper sinks on all the ram, choke and vreg chips.

Okay, now to the problem. When running tests/games, here are my GPU-Z readings:
GPU (DISPIO): 48C
RAM: (MEMIO): 98C :eek:
SHADER (SHADERCORE): 74C :eek:


So now I can see only two possibilities:
Either the GPU-Z temp reads are right, or they are wrong.

Now I'd like to address this by assuming that the reads are correct, then presenting what I believe to be ample evidence that this assumption is incorrect, and thus showing that the other possibility must be correct.

---------------------------------------------------------------------------------------------
The GPU-Z Temperature Reads are Accurate

(A) In stating that the temp reads are correct, there is an underlying assumption that there is no reason to doubt these reads. The first piece of evidence for this is that, to my knowledge, no one else with the same make/model/PCB as me is having any sort of misread. I also don't know of any history of there being "random" or even ANY misreads in the past (though admittedly this is my first time using the program). The program has produced "expected" RAM and SHADER temperature reads for others with the same make/model/PCB, and there is no reason to assume that mine is somehow different. Also, the GPU temp reads appear to be accurate and "expected".

(B) The heatsinks are not properly dissipating heat. This can come in two flavors: either the heat is transferring to the sinks properly but is not being removed from the sinks, OR the heat is not transferring to the sinks. I touched the sinks on the RAM, chokes and vregs after the card had been at load for 10min and the temps were stable at the numbers given at the top of this post. The sinks were warm to the touch, but not overly hot, and certainly not too hot for me to hold my finger against them. To bring the importance of this finding into perspective, let's convert the Celsius temperatures to Fahrenheit, for those who don't think in terms of Celsius:
98C ~= 208.4F
74C ~= 165.2F
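
For anyone who wants to check the conversion, here is a minimal Python sketch using the standard formula F = C * 9/5 + 32. The function name `c_to_f` and the dictionary are just illustrative names; the temperatures are the load readings quoted at the top of this post.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Load readings reported by GPU-Z in this post (degrees C).
load_temps_c = {
    "GPU (DISPIO)": 48,
    "RAM (MEMIO)": 98,
    "SHADER (SHADERCORE)": 74,
}

for sensor, temp_c in load_temps_c.items():
    print(f"{sensor}: {temp_c}C = {c_to_f(temp_c):.1f}F")
```

Either way, both the RAM and SHADER figures are far above what a finger could comfortably rest on, which is the point of the comparison.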

If the heat is transferring to the sinks properly, then they should be hot; VERY HOT. But, they are not.

But what if the heat isn't transferring to the sinks? Well, I guess that's possible, but I have used these exact sinks before, and had no problems with them in the past. They are made of copper and should take heat readily from the chips they are on. Maybe it's the thermal adhesive that sticks them to the chips? Maybe, but that would mean the thermal adhesive is pretty much insulating the copper sink from the chip, and I doubt that is the case. PLUS, if the chips were actually that hot, then I would think that the thermal adhesive would start bubbling, or smoking, or SOMETHING to indicate that they were heating to 208F! Plus the sinks would probably fall off, as the thermal adhesive (just sticky stuff) would probably melt and the weight of the copper sink hanging upside down would cause it to fall off.

(C) Perhaps I have not put the sinks on all the places I should, or have improperly applied them. To adhere them to the chips, I heated the sticky side of the sink with a blow-dryer, then heated the chip, then stuck it on and held for about a minute. I repeated this for all the sinks I applied, and all have stayed on despite being upside down (thus, the contact is pretty good). None of them are barely hanging on, or show any sign of bad contact. Now, below is an image of the card I have:

eahhd4850htdi512mla7.jpg



I have put the copper sinks on all the black RAM chips you can't see, plus one sink for each of the 3 gray choke chips towards the back, plus one for each of the 3 sets of 3 tiny black vregs behind them (all were covered), plus one more on the smaller gray chip (idk what it is) slightly above and to the left of the uppermost choke chip. Unless I am mistaken, that is everything that should be covered with sinks.


(D) My card will overclock to 760/1175 STABLE in games/tests. Again, this is on stock volts. I could be wrong, but I find it hard to believe that this card would go to those speeds at those temperatures and stay reliably/repeatedly stable in every game/test I have tried.

---------------------------------------------------------------------------------------------

In summary:

Evidence that The RAM/SHADER Temp Reads Are Correct:

1) Others get "expected" temp reads using the same make/model/PCB as me, showing that there doesn't appear to be any inherent trouble with GPU-Z reliably reading temps off these cards.
2) To my knowledge, there is no history of GPU-Z misreading temps
3) GPU-Z appears to be reading my GPU temp correctly.


Evidence that The RAM/SHADER Temp Reads Are Not Correct:

1) There are copper heatsinks on all the "hot spot" chips on the board.
2) Doing so gives others reasonable temp reads
3) The sinks are properly adhered/installed
4) The sinks have worked for me in the past
5) The sinks are not overly hot to the touch
6) There is no other evidence that the chips are hot (adhesive isn't bubbling, or making a weird smell or anything, and sinks are not falling off)
7) The card is stable at 760/1175


So, there you have it. Of course, this isn't really a proof on why misreads MUST be true. It could be that the reads are correct, that the temps really are that high, and there is just some evidence that I am missing.

PLEASE let me know if you have any thoughts on what might be going on, flaws in my reasoning, alternate explanations etc etc etc.

This is driving me crazy and it just doesn't make sense.
 
More Info:

Idle temps:
GPU: 32C
RAM: 56C
SHADER: 45C


The RAM and SHADER temps also seem higher than I would expect at idle.
Also (and this might be important), when I put the card to load, the RAM temp shoots up to ~94C within 1-2 seconds. I don't remember what the "heating curve" is like for the GPU or SHADER. This info suggests an alternate hypothesis:

Whatever GPU-Z uses to obtain its temperatures is faulty.

This would explain the high/faulty reads for RAM and SHADER, the seemingly correct or "expected" GPU temp, and why it seems to only happen to me.
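
To put a number on how fast that RAM jump is, here is a rough Python sketch. It assumes the reading started near the idle value of 56C listed above (the post only gives the ~94C endpoint and the 1-2 second window), and `rise_rate` is just an illustrative helper name.

```python
def rise_rate(start_c: float, end_c: float, seconds: float) -> float:
    """Average rate of temperature rise in degrees C per second."""
    return (end_c - start_c) / seconds


idle_ram_c = 56   # idle MEMIO reading from this post (assumed starting point)
spike_ram_c = 94  # approximate MEMIO reading shortly after load starts

# The post gives the rise time as 1-2 seconds, so check both ends of that range.
for seconds in (1, 2):
    rate = rise_rate(idle_ram_c, spike_ram_c, seconds)
    print(f"{seconds}s: about {rate:.0f} C per second")
```

Under that assumption the reading climbs somewhere between roughly 19 and 38 degrees per second, which is worth keeping in mind when weighing the explanations.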

Thoughts?
 
Yes, more than likely Asus is using different ic taps to report temps ..........doesn't asus have a program with the card for reporting temps and for adjusting speeds? ..........if so check there and compare with gpuz ..........or buy a temp probe to check temps on your memory
 
Yes, more than likely Asus is using different ic taps to report temps

Possibly, but like I said above:

Others with the same make/model/pcb as me are not having this issue.

..........doesn't asus have a program with the card for reporting temps and for adjusting speeds? ..........if so check there and compare with gpuz ..........or buy a temp probe to check temps on your memory

Good idea on trying the ASUS program (though I am loath to install that kind of vendor crap)
 
Go private message the creator of GPU-Z...he would have a boat load more information for you.
 
All of the releases with 4850 support that I have used, including the most recent release, have been very buggy with regard to temps.

On my crossfire setup the top card, which has worse cooling than the second card, seems accurate. When I switch to the second card it does one of three things:

1) shows temps of 100C+ on core, 125C+ on shader and 150C+ on memory IO
2) shows the same temps as above and then crashes my computer
3) crashes my computer instantly.

I've felt my card and it's nowhere near those temps, and it plays all games just fine. CCC shows it between 32-53C depending on load conditions
 
I have spoken with the maker of GPU-Z and I'm convinced it is reporting correctly. I also made inquiries to ATI and Visiontek hoping to get an explanation of why these temps vary so much. ATI and Visiontek refused to disclose information, as they consider it proprietary. But after all of that effort I found the link on the ATI RV770 architecture to be the best source of information.

This link below is to the RV770 (ATI 4870/4850) Architecture overview.

http://www.rage3d.com/reviews/video/atirv770/architecture/

This GPU has 3 temperature sensors which are:

DISPIO
MEMIO
SHADERCORE

Although we cannot know for sure where the sensors are, as ATI says that's proprietary information, they do imply the sensors are in the GPU. Many think they may be elsewhere because of the variation in temps.

I think these 3 temp sensors are only on the 4870/4850 GPU.

It also has 3 VRM (Voltage Regulator Module) VDD temperature sensors:
VDD#1
VDD#2
VDD#3

rv770-diagram.jpg

If you look at the above RV770 diagram closely, you will find the shader cores (SHADERCORE) in the middle of the GPU, the display controller (DISPIO) at the upper right, and the memory controllers (MEMIO) at the bottom.

Through experimenting with my twin turbo coolers, I was able to close the gap between the 3 temps by simply getting the cooler to lie as flat as possible on the GPU and using a very thin layer of AS5. I think the GPU is so small that unless you take great care in cross-tightening the cooler, it won't be flat enough, and so you get the large temp variations.
 
Sorry if you mentioned it in the other post, but I didn't see it here. Have you used any other programs to read temps, such as CCC's Overdrive panel? I know there are some others, but CCC is the one I use, honestly. For overclocking and cooling my dual 4870s I have yet to find any reason to use anything else. Also, are you using the most up-to-date Rivatuner? I know the new version supports up to 8.11.
 
Well, it's reassuring that it's not just me.
 
Yes, the CCC temp corresponds to the GPU temp in GPU-Z (DISPIO) and like I said, that's not the temp I'm concerned with. Unless I'm missing something, CCC doesn't show the SHADER (SHADERCORE) or RAM (MEMIO) temps
 
It could outright be bad sensors. I'm not sure of the exact schematic design of their sensors, but if there is any flaw in the design of the actual sensor, it could cause readings to be way off. I had a northbridge chip read way too hot to be normal once, so I put a handheld sensor to it while watching the temp in the BIOS, and there was a 30C discrepancy between the two. I really couldn't see your RAM functioning at that temperature, let alone being able to clock it up. The shader core may not be way off though; mine generally reads 10-15C higher than the GPU core.

Edit: I just saw that your shader core is a good deal more than 10C above the GPU under load. Here are my idle/load temps:

GPU 46/61
Mem 47/62
Shader 54/70
 
I would also suggest finding a second program, like the possible asus one, to verify the temps that gpu-z is showing. This would be a quick way to verify whether gpu-z is showing the right temps.
 
Thank you very much for all your info! At first I was disheartened by your reply, since it seems to indicate that there is in fact a problem with my GPU cooling. But then I thought more about it, and if there IS a problem, I DEFINITELY want to know about it.

Your post seems to indicate/suggest that the MEMIO and SHADERCORE sensors are inside the GPU core itself... is that correct? If so, that would explain why the temps could be so hot without the RAM sinks being hot, since it's not actually the RAM temp that the sensor is reporting, but the memory controller temp.

Come to think of it, this might explain what's going on. When I put the video card in my computer, the card was sagging at its edge due to the weight of the cooler, so I used a zip tie to connect the Accelero to my CPU cooler, and tightened it until the card was on a flat plane again (not sagging). Maybe during this process, part of the Accelero footprint lifted from the core or something, and now I'm getting uneven cooling.

I guess the only thing I can do is to remove the card, completely reseat the Accelero (I use Ceramique), put it back in and just let the card sag. Maybe that will even out my temp readings...
 
Thanks for your input/temps. I think I need to reseat my Accelero and see if anything changes.... the heat curve of my 3 sensors seems to indicate that part of the core isn't being cooled properly. For example:

difference between your idle/load:
GPU: 15
RAM: 15
SHADER: 16

A very consistent increase. Now Mine:
GPU: 16
RAM: 42
SHADER: 29
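
Here is a small Python sketch of the subtraction behind those two lists, in case anyone wants to plug in their own numbers. The (idle, load) pairs are copied from the two posts above; `load_rise` and the dictionary names are just illustrative.

```python
# (idle, load) readings in degrees C, copied from the two posts above.
other_card = {"GPU": (46, 61), "RAM": (47, 62), "SHADER": (54, 70)}
my_card = {"GPU": (32, 48), "RAM": (56, 98), "SHADER": (45, 74)}


def load_rise(readings):
    """Return the idle-to-load temperature rise for each sensor."""
    return {sensor: load - idle for sensor, (idle, load) in readings.items()}


print("other card:", load_rise(other_card))
print("my card:   ", load_rise(my_card))
```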


As you can see, my GPU holds true to yours, but the SHADER is off by about double, and the RAM by almost triple.


Now, let's compare the temperature differences relative to the GPU temp:

         IDLE   LOAD
GPU       +0     +0
RAM       +1     +1
SHADER    +8     +9


and mine:

         IDLE   LOAD
GPU       +0     +0
RAM      +24    +50
SHADER   +13    +26
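
The offsets relative to the GPU sensor can be checked the same way. This is only a sketch using the readings posted in this thread, and `offsets_from_gpu` is an illustrative name. By this arithmetic my SHADER load offset works out to 74 - 48 = 26.

```python
# (idle, load) readings in degrees C, copied from the posts in this thread.
other_card = {"GPU": (46, 61), "RAM": (47, 62), "SHADER": (54, 70)}
my_card = {"GPU": (32, 48), "RAM": (56, 98), "SHADER": (45, 74)}


def offsets_from_gpu(readings):
    """Return each sensor's (idle, load) offset from the GPU sensor."""
    gpu_idle, gpu_load = readings["GPU"]
    return {
        sensor: (idle - gpu_idle, load - gpu_load)
        for sensor, (idle, load) in readings.items()
    }


for name, card in (("other card", other_card), ("my card", my_card)):
    for sensor, (idle_off, load_off) in offsets_from_gpu(card).items():
        print(f"{name:10s} {sensor:7s} idle {idle_off:+d}  load {load_off:+d}")
```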



As you can see, my heat curve is way more dramatic than yours. Now, let's assume the GPU sensor is on the opposite side of the core from the RAM sensor, with the SHADER sensor in between. Now assume that the base of the Accelero is making full contact with the core on the side where the GPU sensor is, but is lifted slightly where the RAM sensor is. The cooling would be best at the GPU sensor, worse at the SHADER sensor, and worse still at the RAM sensor. This fits the data. Now look at the picture Alient posted: we have the GPU sensor in the upper right, the SHADER in the middle, and the RAM at the bottom... this could definitely fit the description and possible problem I've outlined.

w00t!!!
 
Realist,

You have got quite an analytical mind there. What kind of work do you do for a living?

Right now I work as a Network Operations Manager for a number of small businesses in NYC. I'm taking the LSAT in Dec and considering law school (figured my analytical mind might be of use there).

Thanks for asking!


I'll be going away for thanksgiving weekend, so I won't be able to work on this until next week at the earliest. But, my LSAT is on Dec 6th, so I'll probably be spending all next week prepping when I'm not working, so I may not be able to even try and look at this until Dec 8th :-/
 
Okay, so I had a little time on my hands last night, so I did a little vid card work.

I opened up the system, added another 120mm fan strapped to my Accelero S1, tightened the mounting screws for the Accelero, did a little dusting and cable maintenance, and then put it all back together. The MEMIO and SHADERCORE temps are STILL very very high. :cry:

I guess my only option now is to completely remove the Accelero S1, inspect the Ceramique footprint (it might give an indication of uneven seating of the cooler), remount the whole assembly, and hope that things get fixed. :-/


EDIT: I also played Fallout 3 for 3 hours at the 760/1175 clocks I've talked about, with no anomalies, no "display driver has stopped responding and recovered" errors, and nothing else out of the ordinary.
 
I have a weird temp reading as well. It's on one board, where all my temps are reading 8-15C higher than on my other card. Reseating the heatsinks, swapping the heatsinks and even changing slot locations doesn't make any difference.
 