Name: NVIDIA Tegra 4 Architecture Deep Dive, Plus Tegra 4i, Icera i500 & Phoenix Hands On
Item: NVIDIA Tegra 4 Architecture Deep Dive, Plus Tegra 4i, Icera i500 & Phoenix Hands On

NVIDIA Tegra 4 Architecture Deep Dive, Plus Tegra 4i, Icera i500 & Phoenix Hands On

by Anand Lal Shimpi & Brian Klug on 2/24/2013 3:00 PM EST

Post Your Comment
Please log in or sign up to comment.

Comments Locked

75 Comments

Back to Article

tipoo - Sunday, February 24, 2013 - link
Under 500 in Sunspider, about twice as fast as anything else ARM. But then again, it's a few months newer than that, and actually still not shipping. And as usual with Nvidia they're early to each party (first to dual core, first to quad core), but not always the best performing. We'll see if other Cortex A15 designs beat it.

I'd love to see four of those cores paired with SGXs upcoming 600/Rogue series.
jeffkibuule - Sunday, February 24, 2013 - link
SunSpider is so software sensitive that a Tegra 3 @ 1.2 Ghz on Windows RT beats a Snapdraon S4 Pro @ 1.5Ghz on Nexus 4 using Chrome. It's a terrible benchmark because its so dependent on underlying kernel optimizations in the Android phone market.
tipoo - Sunday, February 24, 2013 - link
True, other benchmarks are similarly impressive though.
karasaj - Sunday, February 24, 2013 - link
Psh it has nothing on my desktop! 125ms on sunspider... Nvidia so behind.

Anyways, still looks impressive. I really want to see some Krait 600/800 benchmarks.
tipoo - Sunday, February 24, 2013 - link
The fact that they're getting well below an order of magnitude slower than desktops is impressive in itself too. Even with iPad 2 level performance I still was reluctant to do most of my web browsing on a tablet for the performance. Maybe with Tegra 4 and beyond hardware speed that will change.
Mumrik - Sunday, February 24, 2013 - link
As someone with heavily tabbed browsing habits, I don't think I'll ever make that jump (and I own a tablet).
tipoo - Sunday, February 24, 2013 - link
Also true, that's my other thing. I like to open a bunch of background tabs and have them ready as I go through each one. Right now, tablets don't do background loading, as far as I know, and if they did they wouldn't be powerful enough to keep the main tab smooth while doing it.
Tarwin - Monday, February 25, 2013 - link
Tablets DO do background loading, as long as they're android. The only performance I've seen is from lack of RAM on my phone and lack of bandwidth on the phone and tablet but those things affect any computer as well. One observation to ne made, they do load in the background but things like audio and video playback will pause if you switch to another tab.
von Krupp - Monday, February 25, 2013 - link
Even Windows Phone 7.5 and 8 do background loading. I haven't used it, but I'd wager that RT does as well, if even the gimpy mobile OS can.
tuxRoller - Sunday, February 24, 2013 - link
As someone who had, until recently, over 40 tabs open on my chrome browser (Nexus 4), the critical problem has been memory. With enough memory, and good enough task management, these problems tend to go away.
Of course, maybe you are than 0.00001% who has hundreds or thousands of tabs open in which case I pity any computer you are likely to own.
darkich - Monday, February 25, 2013 - link
You should know better than to compare different platforms on sunspider.
It's more software than hardware dependant benchmark.
Read the jefkibuule's post
danielfranklin - Wednesday, February 27, 2013 - link
With my own testing ive managed to get my Nexus 10 at between 500-600ms on Sunspider. Clocked at roughly 2ghz. It depends more on the browser, the stock android browser is much faster at this than Chrome and doesnt come with the Nexus 10 or Nexus 4.
ilihijan - Sunday, March 3, 2013 - link
I just got paid $6784 working on my laptop using these simple steps leaked on this web page. Make up to $85 per hour doing simple tasks that are so easy to do that you won't forgive yourself if you don't check it out! Weekly payments! Here is what I've been doing Epic2.c(om)
GiantPandaMan - Sunday, February 24, 2013 - link
Given the vastly different conclusions and what not, I think it would be interesting if Charlie and Anand had a roundtable discussion about the SoC space, both phone and tablet. Has Tegra had a noticeable lack of design wins? Has nVidia overpromised and underdelivered three times in a row? Or is Charlie exaggerating far too much?

I'm making no judgement myself, since I really know very little about how phone and tablet manufacturers view the various SoC's.

Have you guys reached out to manufacturers and gotten their takes at all?
s44 - Sunday, February 24, 2013 - link
What, Charlie pushing anti-Nvidia storylines? Who'd have imagined that.
lmcd - Sunday, February 24, 2013 - link
I feel like T2 underperformed because the software on it underperformed (see DX2), as Honeycomb was a pretty terrible release.

Tegra 3 didn't do poorly at all. It performed phenomenally as a cheap chip (though the high-clocked ones on high-end phones made no sense). 28nm was a must-have for a high-end chip that generation.

Tegra 4 looks about where everyone expected it. No one should have been surprised with any of those units on the performance levels.
lmcd - Sunday, February 24, 2013 - link
DX2 and Honeycomb not being the same subject of course.

But 2.3 was equally bad
rahvin - Monday, February 25, 2013 - link
A lot of what Charlie said is easily check-able. Tegra2 had a ton of design wins and almost no actual sales. Tegra3 has done phenomenally, but only in tablets and it's already been replaced in one or two. I think the most prophetic thing he said is the most obvious, unlike every previous generation they didn't announce a single design win for Tegra4. That to me speaks volumes.

We'll know in time if it's just the vendetta or if his sources are correct. I've never heard of a chip maker doing a reference design and personally I just don't see that having any effect or why they would even do it. The manufacturers like to differentiate and the reference design takes that away, which again speaks to lack of manufacturer interest. Charlie tends to over exaggerate things but IMO he's been fairly spot on. Even with the highest revenue in their history profit was down almost 25% (which I attribute to the change to paying for wafer instead of good chips).

Again, time will tell.
Kidster3001 - Thursday, February 28, 2013 - link
FFRD is popular for companies that only produce the chips and not any phones themselves. Samsung has no need to do it, Apple either. Who's left? Qualcomm sells (almost) reference designs with their MDP, Intel's first two phones (Lava Xolo and Orange Santa Clara) were basically rebadged reference devices. Now NVidia's doing it. One advantage to having an FFRD is so that the customer can bring it to market faster and cheaper. OEM's like that and it also allows for chip manufacturers to get their stuff into the hands of smaller OEMs who don't have large R&D budgets.

All the Tegra chips have had higher power consumption than their peers from other manufacturers. It looks like Tegra4 is no exception. They work well in tablets where it is less important, but poor battery life is a really good reason for OEM's to not make phones based on your chips.

My personal opinion is that A15 (ARMs core) will never be a really good design for a phone. It has really high performance but the power envelope just isn't going to work. Those who design custom cores will come out ahead in the phone battle: Apple, Qualcomm, Intel and perhaps even NVidia if they move away from ARM IP with their Denver design.
TheJian - Monday, February 25, 2013 - link
Charlie has hated NV forever. He did the same crap at theINQ for years. At least he named his site accurately...ROFL. Actually I enjoy reading (used to) some of his stuff, but when he speaks about NV I'd say his site should have been named usuallynotaccurate.com

Now he's actually charging for semiaccurate articles...LOL

Seriously? If it was that important I'd rather pay for something like MPR. Charlie is usually good for a laugh and that's about it regarding NV.

Though I've written some stuff about this site's bias recently (my titan article posts and the 660ti article comments), I don't think anandtech and semiaccurate sit at the same table. Anandtech isn't making stuff up, they're just leaving out 3/4 of the story IMHO (regarding my comments on the 14games etc that should be in the game suite & the two that shouldn't). Charlie just throws darts at a board for a large portion of his articles. IF you keep his articles (I did for a long time) and go back over them he's only right about 50%. Either he's getting WMD like UK info (ala bush and iraq, though I think they just moved them to syria...LOL we gave them ages to move them) or he just makes it up himself ("my deep mole in x company said blah blah"). There's no proof until ages later when most forget what he even said, right or wrong. Note there is NO COMMENT section on the site now. They're all blocked :) Ubergizmo called his site "half accurate". My data of old articles used to say the same :) I expect more than flipping a coin results in reporting. He gets credit for things like breaking the news on the bapco fiasco, but I'd say Van Smith gets credit for exposing not only that Intel OWNED the land they had their building on (they paid rent to Intel), Intel OWNED their domain name, and even had a hand in WRITING the code as Intel software engineers were on hand next door. Van covered it all YEARS before Charlie. Look up Van smith and vanshardware, a lot of that crap and the biased intel reporting forced Van to leave and probably dropped the price of Toms site to 1/4 of it's value when tom dumped it. He was worth MORE than anandtech before that stuff. Not sure of the value today, I'm talking back then.

Biased reporting gets you killed if the right people keep pointing it out with DATA backing it. IF Tom's hadn't gone down that Intel love-in route he probably could have sold for much more. There's a reason Otellini said in 2006 that toms was his favorite tech site ;) Then in turn dumped the site as it's credibility tanked. Bapco was, and still is a sham. AMD/NV/VIA all left the consortium for a reason. I don't put much stock in anything from them (futuremark either). Tom's treatment of Van (even removed his bylines on stories) was downright disgusting. I stopped reading toms for about 5 years due to that crap in ~2001. He replaced every article the guy wrote with "THG staff". Total BS. Charlie does the same with NV hate as toms with Intel love. This crap costs credibility.

Anandtech is coming close to the same thing on NV gpu's; Ryan's AMD love anyway, I'll bang away until he stops :) Funny how they never attack the data I provide here. I link to them at toms forums too, eventually that will begin to hurt as people look at the evidence and draw their own conclusions about his articles and in turn this site's credibility. If he continues on the next reviews (700's and 8000's) I'll get on a lot more forums linking the comments after the data dismantling (polite critique of course Ryan :)).

The "Jian" is a double edged sword Ryan ;) Thin, light and very maneuverable...LOL
https://en.wikipedia.org/wiki/Jian
In Chinese folklore, it is known as "The Gentleman of Weapons"
I'm not hostile Ryan ;) Wikipedia says so. :)
Google this: thejian anandtech
Data piles up don't it? I save all my posts (before posting) anyway, but google does too.
Death666Angel - Wednesday, February 27, 2013 - link
Get a life.
StormyParis - Sunday, February 24, 2013 - link
I'd go with Anand, anytime. Charlie is a raving bitch.
mayankleoboy1 - Sunday, February 24, 2013 - link
Except that this raving bitch has accurately predicted the future course of most companies months before anybody.
Avalon - Monday, February 25, 2013 - link
If by accurate you mean he made many predictions for every company and when one of the predictions came true everyone forgot about all the wrong ones. He guesses.
Kiste - Monday, February 25, 2013 - link
Confirmation bias ahoy!
AmdInside - Monday, February 25, 2013 - link
Are you kidding me? Even congress lies less than Charlie does.
jjj - Sunday, February 24, 2013 - link
I wouldn't expect a huge downclock for phones , they do need to limit heat, not going with POP for the RAM helps ,some actual cooling (air gap or metal) could also be used so they will most likely allow 1-2 cores to go pretty high and maybe all 4 for short periods of time (so the usual tricks to get more out of it).
R3MF - Sunday, February 24, 2013 - link
re tegra 4 gpu architecture.

how did you get through this many words without mentioning OpenCL?

lack of ES 3.0 is only half the problem.
cmikeh2 - Sunday, February 24, 2013 - link
He does reference it when discussing the Chimera ISP:

"At the same time, the elephant in the room is OpenCL (and its current absence on Tegra 4) and what direction the industry will take that to leverage GPU compute for some computational photography processing."
guidryp - Sunday, February 24, 2013 - link
The Icera acquisition was a brilliant one. This gives NVidia the complete mobile package. It will be very interesting to see how this works out in practice. NVidia is a fierce competitor, Qualcomm should be worried.
klmccaughey - Sunday, February 24, 2013 - link
Definitely. All good for us too! :)
twotwotwo - Sunday, February 24, 2013 - link
> In the PC industry we learned that there’s no real downside to quad-core as long as you can power gate individual cores, and turbo up to higher frequencies when fewer than four cores are active, there’s no real tradeoff other than cost.

I'm not completely sure, because there are always other possible uses for die area.

You could do the big/little thing with A7 'companion' cores, like Samsung. You could use even more area for GPU, like Apple. Wiki suggests you could double the L2 cache to 4MB (though more cache would always be eating power, even with only one core turned on).

But in favor of quad-core: software might start using cores a little more effectively w/time--Google and Apple are apparently trying to make WebKit able to do things like HTML parsing and JavaScript garbage collection in the background, and Microsoft's browser team backgrounds JavaScript compilation. And the other uses of space are also only sort-of useful, and cores (like GHz) are handy for marketing. I can't say I know what the right tradeoff for NVidia is, only that there were were other seemingly-interesting options.
guidryp - Sunday, February 24, 2013 - link
"there are always other possible uses for die area"

Yes, in the case of Tegra 3, they could certainly have used extra GPU power more than 4 CPU cores. But they seem to have remedied that this time.
twotwotwo - Monday, February 25, 2013 - link
Def possible, and what they disclosed in this presentation would suggest they've handled it.

All that's working against them, GPU-wise, is that user expectations increased since last gen, and Mali/PowerVR improved. So now T4i needs to drive 1080p phone screens and T4 needs to drive screens like the Nexus 10's, if they want to be the most bleeding-edge, anyway.

But they did talk about large integer-factor improvements in the GPU, so maybe they haven't merely built the GPU that would've been nice to have last gen, but moved up enough to be great this gen.
sosadsohappy - Sunday, February 24, 2013 - link
Samsung has just said it is doing A15-A7 pairing. Saying out the future plans just to keep the crowd excited is not new. That does not rule out the possibility of Qualcomm or Nvidia going for similar big.LITTLE designs. They are for the next-gen I would think. (Tell me if I am wrong but have anyone sampled big.LITTLE based SoC yet?)

And talking about die area, what is impressive about Nvidia is how their chips are always smaller. Quad-core A15 is about 80mm^2 while you can check for the sizes of Qualcomm's or Apple's chips! FWIW Apple's are not in 28nm but still they don't scale equally.

I am excited to see the 60mm^2 (right?) chip (Tegra4i). If it is what they claim, it should have great battery life for a smartphone.
s44 - Monday, February 25, 2013 - link
4+1 is Nvidia's version of big.LITTLE. The 1 low-power A15 is about the same die space as the 4 A7s on the next Exynos...
sosadsohappy - Monday, February 25, 2013 - link
Yes. The only difference is that the big.LITTLE will sport different architectures on the big and LITTLE while NV's 4+1 will have the same arch (A15 for both).

And personally I think 4+1 is better as of now until we have Atlas and Apollo combination of big.LITTLE because (correct me if I'm wrong) A7 does not have as much of memory parallelism, it is to weak as well...

No matter what, it has been impressive that Nvidia chips have significantly lower die size than the competition's dual-core chips!
Krysto - Monday, February 25, 2013 - link
Too weak? For what? Receiving notifications? We'll see if Tegra 4 is more energy efficient than Samsung's Exynos 5 Octa later this year. Then we might get a better idea whether Nvidia or ARM's implementation is better.

And I agree. Nvidia managed to have the same graphics performance + a quad core Cortex A15 CPU in 80mm2 vs Apple with a dual core CPU and same graphics performance in 120 mm2. That's pretty impressive, even if it arrives half a year late.

I still wish Nvidia would actually want to compete at the high-end though, with a 120mm2 chip, and beat Apple. It annoys me that they are still trying to build only "good enough for most people" chips. They should be trying to be the king of mobile graphics. They are freaking Nvidia, and they can't even beat a mobile GPU maker? Come on, Nvidia.
name99 - Monday, February 25, 2013 - link
> In the PC industry we learned that there’s no real downside to quad-core as long as you can power gate individual cores, and turbo up to higher frequencies when fewer than four cores are active, there’s no real tradeoff other than cost.

Sony Ericsson recently released a paper claiming this was not true, even apart from the die area issues. In particular they claimed that with current technology, coupling capacitance, ground plane issues, communication (with the L2, including coherence) and suchlike, quad-core imposed something like a 25% reduction in peak MHz possible for two cores, compared to those same two cores isolated rather than on a quad-core die.

Now obviously any company publication is talking up its book, but I imagine they're not going to make a statement that is blatantly false in a technical publication, implying there is some truth to what they say.
Wilco1 - Wednesday, February 27, 2013 - link
Given Tegra 4i achieves 2.3GHz in a quad core with shared L2, way more than Krait which uses per-CPU L2, I think the claim that a shared L2 is clock limiting seems more marketing than substance.
xsacha - Saturday, March 23, 2013 - link
Tegra4i uses Cortex-A9. Krait is similar to Cortex-A15. The Krait obviously uses way more power and gives way more performance clock-for-clock. So you are comparing apples and oranges here. The 1.9GHz Krait quad-core is roughly equivalent to 2.5GHz+ in a Tegra 4i.
name99 - Monday, February 25, 2013 - link
"But in favor of quad-core: software might start using cores a little more effectively w/time--Google and Apple are apparently trying to make WebKit able to do things like HTML parsing and JavaScript garbage collection in the background, and Microsoft's browser team backgrounds JavaScript compilation"

It would be wise to design for the technology we have today, not the dream of technology we may one day have. As I have stated elsewhere, there is ample evidence that on the desktop, even today, multiple threads running on more than two cores at once is very rare. (More precisely
- many apps are multithreaded, but those threads tend to be mostly async IO type threads, mostly waiting
- there is a mild win to having three cores available, but it's not much advantage over two cores
- the situation has improved a little over ten years ago (when the first SMT P4s first started appearing) and when there was little advantage to two cores over one. But most of the improvement is the result of OS vendors moving as much stuff as possible of what they do (GUI, IO, etc) onto the second core.)

The only real code that utilizes multiple cores is video-encoding. In particular both games and photo processing do not use nearly as much multi-core as people imagine.

The situation for mobile is the same, only a little worse because there is less of simultaneous heavyweight apps running.

Given these facts, and the way code is actually structured today, 4 cores makes very little sense.
SMT makes sense, mainly in that its power and area footprint is very low, so it's a win on those occasions when the OS can make use of it. Beyond that, if you have excess transistors available, beefed up vectors (wider registers, and wider units) probably makes more sense. You'll notice that these recommendations parallel what Intel has done over the past few years --- they are not idiots, and desktop code is very similar to mobile code.

As for parallel web browsing, people have been publishing about it for years now; but the real world results remain unimpressive. It remains an unfortunate fact that the things that have been converted to parallel don't seem to be, for most sites, the things that are actually gating performance. A similar problem exists with PDF display (still not as snappy as I would like on an iPad3) --- the simple and obvious things you can imagine for parallelizing the rendering aren't the things that are usually the problem.

In both cases, the ideal situation would be to restart with totally redesigned file formats that are non-serial in nature; but that seems to be a "boil-the-ocean" strategy that no-one wants to commit to yet. (Though it would be nice if Apple and Adobe could get together to redefine a PDF2.0 file format that was explicitly parallel, and that seems rather easier than fixing the web.)
Krysto - Sunday, February 24, 2013 - link
It seems Nvidia really pulled off making Tegra 4's GPU 6x faster than Tegra 3, and with 5 Cortex A15 cores and 6x more GPU cores, all in the same size. Pretty impressive. But still quite disappointing for lack of OpenGL ES 3.0 and OpenCL support. I really hope they plan on supporting them in Tegra 5 along with the new 64 CPU and Maxwell-based GPU cores.
Mike1111 - Sunday, February 24, 2013 - link
I would really like to see an analysis/comparison of companion core (Nvidia) vs. big.LITTLE (Samsung).
lmcd - Sunday, February 24, 2013 - link
BIG.little (fixed it for ARM) isn't even in reference device stage yet is it?
Krysto - Monday, February 25, 2013 - link
No need to fix it. The "opposite" style naming is intentional. It's ironic. Get it?
phoenix_rizzen - Monday, February 25, 2013 - link
Exynos 5 Octa, which is A15/A7 big.LITTLE, has been demoed. Tegra 4, which is A15 plus a companion core, has been demoed.

Neither are commercially available, neither are in shipping products, neither are available to consumers.

IOW, the Cortex-A15 variations for bit.LITTLE have passed the reference stage, and are in the "find companies to use them to build devices" stage. They'll be in consumers' grubby little hands before Christmas 2013.
tviceman - Sunday, February 24, 2013 - link
GPU performance ended up better than I thought it would after the subdued announcement and leaked early prototype benchmarks. Good to see.
wongwarren - Monday, February 25, 2013 - link
I wonder which is faster. This or the Snapdragon 600.
varad - Monday, February 25, 2013 - link
Snapdragon 600:
http://www.anandtech.com/show/6792/lg-optimus-g-pr...

Tegra 4:
http://www.anandtech.com/show/6787/nvidia-tegra-4-...

So if the metric is simply raw performance [since you asked "faster"], looks like the Tegra 4 will win easily against the Snapdragon 600.

A better/fair comparison would be when we have performance numbers for Snapdragon 600 in a tablet or Tegra 4 in a phone.
Krysto - Monday, February 25, 2013 - link
S600 is just a slightly overclocked S4 Pro with the same GPU.

The real competitor of Tegra 4 will be S800. We'll see if it wins in CPU performance (it might not), and I think there's a high chance it will lose in GPU performance, as Adreno 330 is only 50% faster than Adreno 320 I think, and Tegra 4 is about twice as fast.

Qualcomm has always had slower graphics performance than Nvidia actually. The only "gap" they found in the market was last fall with the Adreno 320, when Nvidia didn't have anything good to show. But Tegra 3 beat S4 with its Adreno 225.
watersb - Monday, February 25, 2013 - link
I'm amazed at the depth of this NVIDIA data-dump. Brilliant work.

Anand's observation re: die size, cost strategy, position in the market and how this buys them time to consolidate... Wow.

Clearly, Nvidia is in this game for the long haul.
djgandy - Monday, February 25, 2013 - link
So OpenGL ES 3.0 doesn't matter, but quad core A15 does? Why do people suck up to Nvidia and their marketing BS so much?

T4i still single channel memory? What a joke configuration.
djgandy - Monday, February 25, 2013 - link
Also a 9 page article about a mobile SoC without a single reference to the word "battery".
varad - Monday, February 25, 2013 - link
Read the article before you write such comments. The very first page is "Introduction & Power" where they do mention some numbers and their thoughts.
djgandy - Tuesday, February 26, 2013 - link
Yeah its all smoke and mirrors under lab test conditions. Where is the real battery life? Is this not for battery powered devices?
Krysto - Monday, February 25, 2013 - link
Personally, I think all 2013 GPU's should have support for OpenGL ES 3.0 and OpenCL. I was stunned to find out Tegra 4 was not going to support it as they haven't even switched to a unified shader architecture.

That being said, Anand is probably right that it was the right move for Nvidia, and they are just going to wait for the Maxwell architecture to streamline the same custom ARMv8 CPU from Tegra 5 to Project Denver across product line-ups, and also the same Maxwell GPU cores.

If that's indeed their plan, then switching Tegra 4 to Kepler this year, only to switch again to Maxwell next year wouldn't have made any sense. GPU architectures barely change even every 2-3 years, let alone 1 year. It wouldn't have been cost effective for them.

I do hope they aren't going to delay the transition again with Tegra 5 though, and I also do hope they follow Qualcomm's strategy with S4 last year of switching IMEMDIATELY to the 20nm process, instead of continuing on 28nm with Tegra 5, like they did with Tegra 3 on 40nm. But I fear Nvidia will repeat the same mistake.

If they put Tegra 5 on 20nm, and make it 120mm2 in size, with Maxwell GPU core, I don't think even Apple's A8X will stand against it next year in terms of GPU performance (and of course it will get beaten easily in CPU performance, just like this year).
djgandy - Tuesday, February 26, 2013 - link
Tegra is smaller because it lacks features and also memory bandwidth. The comparison is not really fair to assume you can just throw more shaders at the problem. You'll need wider memory bus for a start. You'll need more TMU's and in the future it's probably smart to have a dedicate ROP unit. Then also are you seriously going to just stick with FP20 and not support ES 3.0 and OpenCL? OEMs see OpenCL as a de facto feature these days, not because it is widely used but because it opens up future possibilities. Nvidia has simply designed an SoC for gaming here.

Your post focuses on performance, but these are battery powered devices. The primary design goal is efficiency, and it would appear that is why apple went swift and not A15. A15 is just too damn power hungry, even for a tablet.
metafor - Tuesday, February 26, 2013 - link
If the silicon division of Apple were its own business, they'd be in the red. Very few silicon providers can afford to make 120mm^2 chips and still make a profit; let alone one with as little bargaining clout in the mobile space as nVidia.

Numbers are great but at the end of the day, making money is what matters.
milli - Monday, February 25, 2013 - link
nVidia is trying hard but Tegra still isn't making them any money ...
PingviN - Monday, February 25, 2013 - link
Tegra made an operating loss of $150 million for fiscal year 2012, despite getting into both the Nexus 7 (the refresh coming this year has been lost to Qualcomm) and the Surface RT. Sales prognosis cut almost in half for the fiscal year 2013. To date, Nvidia hasn't had any profit coming out of Tegra and now it's in limbo mode until Tegra 4 is released because Tegra 3 gets smashed by it's competition.

It's been a pretty crappy year for Tegra.
guilmon14 - Tuesday, February 26, 2013 - link
I don't know anything about this company "tegra", but have you heard about Nvidia? I heard they're doing great!

http://nvidianews.nvidia.com/Releases/NVIDIA-Repor...

According to this Nvidia is up in income, revenue, and equity.

If you wanted to check the easy way just look at nvidia's wikipedia page, gives you all the nice money numbers.
http://en.wikipedia.org/wiki/Nvidia
trajan2448 - Monday, February 25, 2013 - link
5 years down the road phones will be cooking our dinner. It's amazing how fast the tech is advancing now.
Scannall - Monday, February 25, 2013 - link
If they don't hustle right along, SOC's with the PowerVR 6 series (Rogue) will beat them to market. And considering their GPU just barely squeaks by the iPad as it is, it will be behind early on.
Khato - Monday, February 25, 2013 - link
Was it specifically stated that the Tegra 4 SPECint/W figure was running on the high speed cores? As is mentioned later on the page, a SPECint2000 of 520 is within reach of the power optimized companion core, so the only reason I'd expect NVIDIA to not use the companion core for this data is if they explicitly stated that it wasn't.

Part of the cause for my suspicion is that the Power vs DMIPS chart that Samsung recently provided for the Exynos 5 Octa shows 8k DMIPS at 1 watt... and from the press coverage back in 2009 for the A9 hard macros there's both the 10k DMIPS at 1.9 watts and 2GHz with the speed speed optimized and 4k DMIPS at 250 mW and 800 MHz for the power optimized. Which equate to 5.26 DMIPS/mW and 8 DMIPS/mW, respectively. Now the 2GHz data point should be even worse off than Tegra 3 and yet it only shows the Samsung Exynos 5 Octa as being 52% more efficient.

Going into estimating rather than published numbers, if we up the efficiency of Tegra 3 a bit compared to that 2GHz figure then it's likely going to be closer to A15 being 30% more efficient... which you then add the known ~40% efficiency bump going from a performance to power implementation and you get the kind of drastic increase NVIDIA is touting.
Wilco1 - Monday, February 25, 2013 - link
It doesn't matter whether they used the 5th core or one of the fast ones. By definition the cross over point is where the 5th core uses as much power as a fast core. Since that is ~800MHz, the power efficiency is the same. The 5th core can likely clock to well over 1GHz, but then it uses more power than a fast core.

You are basically right that some of the 73% MIPS/W improvement comes from the 40-28nm process change. However the combined improvement of process and micro architecture means that you can use the low power core far more often. The 5th core in Tegra 4 is effectively more than 3 times as fast than the one in Tegra 3. So that means lots of tasks which needed 1-2 fast Tegra 3 cores can now run on the 5th Tegra 4 core. That means the power efficiency will actually improve by what NVidia suggests.
Khato - Monday, February 25, 2013 - link
Mind sharing the source for that? The wording in this article implies differently - "That 825MHz mark ends up being an important number, because that’s where the fifth companion Cortex A15 tops out at." Given 1.9GHz for the performance-optimized cores, something around 800 MHz sounds about right for the max frequency of a power-optimized version.

Anyway, there's no question that Tegra 4 will be quite a bit more power efficient simply by virtue of being able able to run more workloads exclusively on the companion core. As said before, in exchange for a much lower cap on maximum frequency a power optimized synthesis gives at least a 40% bump in efficiency... and now that power optimized core will still deliver respectable performance.
Wilco1 - Monday, February 25, 2013 - link
Read http://www.nvidia.com/content/PDF/tegra_white_pape... it explains the difference between leakage and active power on low power and high performance transistors. It explicitly says the 5th core in Tegra is capped at 500MHz as that is where it is as power efficient as a fast core. The graphs and the word capped suggest the 5th core can go faster but there is no point.

Note that Tegra 3 uses a different process with low power transistors for the 5th core rather than a low power synthesis (not that they couldn't have done that too, but it is never mentioned and the 5th core looks pretty much the same in the die plots). I presume Tegra 4 does the same on the 28nm process.
Khato - Tuesday, February 26, 2013 - link
Okay, so your commentary is based on the Tegra 3 which is using an entirely different approach to power savings for the companion core. Note that all of the data I was referencing for the difference in efficiency between ARM's two A9 hard macros was on the same process and hence is more applicable to the case of Tegra 4. As you correctly state, Tegra 3 gains its power efficiency for the companion core by using the LP process rather than a low power synthesis, likely due to it being a simpler and faster route to the desired end result and equally effective for their design goals.

Tegra 4 isn't playing process games for the companion core. How do you gain efficiency on the same process? You loosen timings to allow for the usage of smaller transistors, less flop stages, so on so forth. The end result being that you sacrifice maximum switching speed to reduce both leakage and dynamic power. From all the information that NVIDIA has made available it's a completely different implementation from Tegra 3.
Wilco1 - Tuesday, February 26, 2013 - link
Tegra 4 does exactly the same as Tegra 3. According to NVidia's white paper on Tegra 4 (http://www.nvidia.com/docs/IO/116757/NVIDIA_Quad_a... it also uses low power transistors for the 5th core. Again if you look at the die photos of Tegra 4 all 5 cores are identical just like Tegra 3. So that seems to exclude a different synthesis.

The way NVidia get a low power core is by using low power transistors. TSMC 28nm process supports several different transistor libraries, from high performance high leakage to low performance low leakage. Based on the information we have all they have done is swap the transistor libraries.
TheJian - Monday, February 25, 2013 - link
http://www.anandtech.com/show/6472/ipad-4-late-201...
ipad4 scored 47 vs. 57 for T4 in egypt hd offscreen 1080p. I'd say it's more than competitive with ipad4. T4 scores 2.5x iphone5 in geekbench (4148 vs. 1640). So it's looking like it trumps A6 handily.

http://www.pcmag.com/article2/0,2817,2415809,00.as...
T4 should beat 600 in Antutu and browsermark. IF S800 is just an upclocked cpu and adreno 330 this is going to be a tight race in browsermark and a total killing for NV in antutu. 400mhz won't make up the 22678 for HTC ONE vs. T4's 36489. It will fall far short in antutu unless the gpu means a lot more than I think in that benchmark. I don't think S600 will beat T4 in anything. HTC ONE only uses 1.7ghz the spec sheet at QCOM says it can go up to 1.9ghz but that won't help from the beating it took according to pcmag. They said this:
"The first hint we've seen of Qualcomm's new generation comes in some benchmarks done on the HTC One, which uses Qualcomm's new 1.7-GHz Snapdragon 600 chipset - not the 800, but the next notch down. The Tegra 4 still destroys it."

Iphone5 got destroyed too. Geekbench on T4=4148 vs. iphone5=1640. OUCH.

Note samsung/qualcomm haven't let PC mag run their own benchmarks on Octa or S800. Nvidia is showing now signs of fear here. Does anyone have data on the cpu in Snapdragon800? Is the 400cpu in it just a 300cpu clocked up 400mhz because of the process or is it actually a different core? It kind of looks like this is just 400mhz more on the cpu with an adreno330 on top instead of 320 of S600.
http://www.qualcomm.com/snapdragon/processors/800-...

https://www.linleygroup.com/newsletters/newsletter...
"The Krait 300 provides new microarchitecture improvements that increase per-clock performance by 10–15% while pushing CPU speed from 1.5GHz to 1.7GHz. The Krait 400 extends the new microarchitecture to 2.3GHz by switching to TSMC's high-k metal gate (HKMG) process."

Anyone have anything showing the cpu is MORE than just 400mhz more on a new process? This sounds like no change in the chip itself. That article was Jan23 and Gwennap is pretty knowledgeable. Admittedly I didn't do a lot of digging yet (can't find much on 800 cpu specs, did most of my homework on S600 since it comes first).

We need some Rogue 6 data now too :) Lots of post on the G6100 in the last 18hrs...Still reading it all... ROFL (MWC is causing me to do a lot of reading today...). About 1/2 way through and most of it seems to just brag about opengl es3.0 and DX11.1 (not seeing much about perf). I'm guessing because NV doesn't have it on T4 :) It's not used yet, so I don't care but that's how I'd attack T4 in the news ;) Try running something from DX11.1 on a soc and I think we'll see a slide show (think crysis3 on a soc...LOL). I'd almost say the same for all of es3.0 being on. NV was wise to save die space here and do a simpler chip that can undercut prices of others. They're working on DX9_3 features in WinRT (hopefully MS will allow it). OpenGL ES3.0 & DX11.1 will be more important next xmas. Game devs won't be aiming at $600 phones for their games this xmas, they'll aim at mass market for the most part just like on a pc (where they aim at consoles DX9, then we get ports...LOL). It's a rare game that's aimed at GXT680/7970ghz and up. Crysis3? Most devs shoot far lower.
http://www.imgtec.com/corporate/newsdetail.asp?New...
No perf bragging just features...Odd while everyone else brags vs. their own old versions or other chips.

Qcom CMO goes all out:
http://www.phonearena.com/news/Qualcomm-CMO-disses...
"Nvidia just launched their Tegra 4, not sure when those will be in the market on a commercial basis, but we believe our Snapdragon 600 outperforms Nvidia’s Tegra 4. And we believe our Snapdragon 800 completely outstrips it and puts a new benchmark in place.

So, we clean Tegra 4′s clock. There’s nothing in Tegra 4 that we looked at and that looks interesting. Tegra 4 frankly, looks a lot like what we already have in S4 Pro..."

OOPS...I guess he needs to check the perf of tegra4 again. PCmag shows he's 600 chip got "DESTROYED" and all other competition "crushed". Why is Imagination not bragging about perf of G6100? Is it all about feature/api's without much more power? Note that page from phonearena is having issues (their server is) as I had to get it out of google cache just now from earlier. He's a marketing guy from Intel so you know, a "blue crystals" kind of guy :) The CTO would be bragging about perf I think if he had it. Anand C is fluff marketing guy from Intel (but he has a masters in engineering, he's just marketing it appears now and NOT throwing around data just "i believe" comments). One last note, Exynos octa got kicked out of Galaxy S4 because it overheated the phone according to the same site. So Octa is tablet only I guess? Galaxy S4 is a superphone and octa didn't work in it if what they said is true (rumored 1.9ghz rather than 1.7ghz HTC ONE version).
fteoath64 - Wednesday, February 27, 2013 - link
@TheJian: "ipad4 scored 47 vs. 57 for T4 in egypt hd offscreen 1080p. I'd say it's more than competitive with ipad4. T4 scores 2.5x iphone5 in geekbench (4148 vs. 1640). So it's looking like it trumps A6 handily."

Good reference!. This shows T4 doing what it ought to in the tablet space as Apple's CPU release cycle tends to be 12 to 18 months, giving Nvidia lots of breathing room. Also besides, since Qualcomm launched all their new ranges, the next cycle is going to be a while. However, Qualcomm has so many design wins on their Snapdragons, it leaves little room for Nvidia and others to play. Is this why TI went out of this market ?. So could Amazon be candidate for T4i on their next tablet update ?.

PS: The issue with Apple putting quad PVR544 into iPad was to ensure the performance overall with retina is up to par with the non-retina version. Especially the Mini which is among the fastest tablet out there considering it needs to push less than a million pixels yet delivering a good 10 hours of use.
mayankleoboy1 - Tuesday, February 26, 2013 - link
Hey AnandTech, you never told us what is the "Project Thor" , that JHH let slip at CES..
CeriseCogburn - Thursday, February 28, 2013 - link
This is how it goes for nVidia from, well we know whom at this point, meaning, it appears everyone here.

" I have to give NVIDIA credit, back when it introduced Tegra 3 I assumed its 4+1 architecture was surely a gimmick and to be very short lived. I remember asking NVIDIA’s Phil Carmack point blank at MWC 2012 whether or not NVIDIA would standardize on four cores for future SoCs. While I expected a typical PR response, Phil surprised me with an astounding yes. NVIDIA was committed to quad-core designs going forward. I still didn’t believe it, but here we are in 2013 with NVIDIA’s high-end and mainstream roadmaps both exclusively featuring quad-core SoCs. NVIDIA remained true to its word, and the more I think about it, the more the approach makes sense."

paraphrased: " They're lying to me, they lie, lie ,lie ,lie ,lie. (pass a year or two or three) Oh my it wasn't a lie. "
Rinse and repeat often and in overlapping fashion.

Love this place, and no one learns.
Here's a clue: It's AMD that has been lying it's yapper off to you for years on end.
Origin64 - Tuesday, March 12, 2013 - link
Wow. 120mbps LTE? I get a fifth of that through a cable at home.

NVIDIA Tegra 4 Architecture Deep Dive, Plus Tegra 4i, Icera i500 & Phoenix Hands On

Post Your Comment

75 Comments

Back to Article

tipoo - Sunday, February 24, 2013 - link

jeffkibuule - Sunday, February 24, 2013 - link

tipoo - Sunday, February 24, 2013 - link

karasaj - Sunday, February 24, 2013 - link

tipoo - Sunday, February 24, 2013 - link

Mumrik - Sunday, February 24, 2013 - link

tipoo - Sunday, February 24, 2013 - link

Tarwin - Monday, February 25, 2013 - link

von Krupp - Monday, February 25, 2013 - link

tuxRoller - Sunday, February 24, 2013 - link

darkich - Monday, February 25, 2013 - link

danielfranklin - Wednesday, February 27, 2013 - link

ilihijan - Sunday, March 3, 2013 - link

GiantPandaMan - Sunday, February 24, 2013 - link

s44 - Sunday, February 24, 2013 - link

lmcd - Sunday, February 24, 2013 - link

lmcd - Sunday, February 24, 2013 - link

rahvin - Monday, February 25, 2013 - link

Kidster3001 - Thursday, February 28, 2013 - link

TheJian - Monday, February 25, 2013 - link

Death666Angel - Wednesday, February 27, 2013 - link

StormyParis - Sunday, February 24, 2013 - link

mayankleoboy1 - Sunday, February 24, 2013 - link

Avalon - Monday, February 25, 2013 - link

Kiste - Monday, February 25, 2013 - link

AmdInside - Monday, February 25, 2013 - link

jjj - Sunday, February 24, 2013 - link

R3MF - Sunday, February 24, 2013 - link

cmikeh2 - Sunday, February 24, 2013 - link

guidryp - Sunday, February 24, 2013 - link

klmccaughey - Sunday, February 24, 2013 - link

twotwotwo - Sunday, February 24, 2013 - link

guidryp - Sunday, February 24, 2013 - link

twotwotwo - Monday, February 25, 2013 - link

sosadsohappy - Sunday, February 24, 2013 - link

s44 - Monday, February 25, 2013 - link

sosadsohappy - Monday, February 25, 2013 - link

Krysto - Monday, February 25, 2013 - link

name99 - Monday, February 25, 2013 - link

Wilco1 - Wednesday, February 27, 2013 - link

xsacha - Saturday, March 23, 2013 - link

name99 - Monday, February 25, 2013 - link

Krysto - Sunday, February 24, 2013 - link

Mike1111 - Sunday, February 24, 2013 - link

lmcd - Sunday, February 24, 2013 - link

Krysto - Monday, February 25, 2013 - link

phoenix_rizzen - Monday, February 25, 2013 - link

tviceman - Sunday, February 24, 2013 - link

wongwarren - Monday, February 25, 2013 - link

varad - Monday, February 25, 2013 - link

Krysto - Monday, February 25, 2013 - link

watersb - Monday, February 25, 2013 - link

djgandy - Monday, February 25, 2013 - link

djgandy - Monday, February 25, 2013 - link

varad - Monday, February 25, 2013 - link

djgandy - Tuesday, February 26, 2013 - link

Krysto - Monday, February 25, 2013 - link

djgandy - Tuesday, February 26, 2013 - link

metafor - Tuesday, February 26, 2013 - link

milli - Monday, February 25, 2013 - link

PingviN - Monday, February 25, 2013 - link

guilmon14 - Tuesday, February 26, 2013 - link

trajan2448 - Monday, February 25, 2013 - link

Scannall - Monday, February 25, 2013 - link

Khato - Monday, February 25, 2013 - link

Wilco1 - Monday, February 25, 2013 - link

Khato - Monday, February 25, 2013 - link

Wilco1 - Monday, February 25, 2013 - link

Khato - Tuesday, February 26, 2013 - link

Wilco1 - Tuesday, February 26, 2013 - link

TheJian - Monday, February 25, 2013 - link

fteoath64 - Wednesday, February 27, 2013 - link

mayankleoboy1 - Tuesday, February 26, 2013 - link

CeriseCogburn - Thursday, February 28, 2013 - link

Origin64 - Tuesday, March 12, 2013 - link

Log in