The Snapdragon 855 Performance Preview: Setting the Stage for Flagship Android 2019
by Andrei Frumusanu on January 15, 2019 8:00 AM EST
Posted in: Mobile, Qualcomm, Smartphones, SoCs, 7nm, Snapdragon 855
As we get ever closer to the releases of various Android vendors’ newest-generation devices, Qualcomm continues its tradition of showcasing the performance of its newest SoC ahead of time, giving us a glimpse of what to expect of the upcoming commercial devices. This year, the company has shuffled the schedule around a bit: instead of hosting the event at its San Diego headquarters, Qualcomm gave us hands-on time with the Snapdragon 855 at CES before the show had started. Today we’ll be looking in quite extensive detail at how exactly the new Snapdragon chipset performs, as well as shedding some light on some important unanswered technical questions about the SoC.
It’s been a little over a month now since Qualcomm officially announced the Snapdragon 855 in Hawaii, and we posted extensive coverage of the event and the various announcements that took place:
Full Coverage
Day One
- Qualcomm Tech Summit, Day 1 Live Blog (Starts at 2pm ET)
- Qualcomm Tech Summit, Day 1: Announcing 5G Partnerships and Snapdragon 855
- The Verizon and AT&T 5G Setups at Qualcomm’s Snapdragon Summit
- 5G Mobile Hotspots: Netgear for AT&T and inseego for Verizon
- A Quick Look at Qualcomm’s 5G Smartphone Reference Design
- Implementing 5G in the UK: EE Trial Sites and First Cities
- Samsung’s 5G Smartphone Prototype: Don’t Mention The Notch
- Moto 5G Mod at Snapdragon Summit: Lots of Antennas, and S855 Inside?
Day Two
- Qualcomm Tech Summit, Day 2 Live Blog (Starts at 2pm ET)
- Qualcomm Tech Summit, Day 2: Snapdragon 855 Mobile Platform
- The Qualcomm Snapdragon 855 Pre-Dive: Going Into Detail on 2019's Flagship Android SoC
- OnePlus: Dual Product Lines, Cost of 5G, and Translation Errors
- Snapdragon 855 Power Consumption Demo, But
- Snapdragon 855: 802.11ax-Ready / Wi-Fi 6 Demonstrations
Day Three
- Qualcomm Tech Summit, Day 3 Live Blog (Starts at 2pm ET)
- Spotted: Qualcomm Snapdragon 8cx Wafer on 7nm
- Qualcomm Tech Summit, Day 3: Snapdragon 8cx, the New ACPC SoC
For anyone who might have missed it, if there’s any one specific article that I would recommend reading before heading in directly to the results of the benchmarking session, then it’s our in-depth piece summarising the disclosed specifications of the new Snapdragon 855.
SoC Overview & CPU Recap
Qualcomm Snapdragon Flagship SoCs 2018-2019
SoC | Snapdragon 855 | Snapdragon 845
CPU | 1x Kryo 485 Gold (A76 derivative) @ 2.84GHz, 1x512KB pL2; 3x Kryo 485 Gold (A76 derivative) @ 2.42GHz, 3x256KB pL2; 4x Kryo 485 Silver (A55 derivative) @ 1.80GHz, 4x128KB pL2; 2MB sL3 | 4x Kryo 385 Gold (A75 derivative) @ 2.8GHz, 4x256KB pL2; 4x Kryo 385 Silver (A55 derivative) @ 1.80GHz, 4x128KB pL2; 2MB sL3
GPU | Adreno 640 @ ?MHz | Adreno 630 @ 710MHz
Memory | 4x 16-bit CH @ 2133MHz LPDDR4x, 34.1GB/s; 3MB system level cache | 4x 16-bit CH @ 1866MHz LPDDR4x, 29.9GB/s; 3MB system level cache
ISP/Camera | Dual 14-bit Spectra 380 ISP; 1x 48MP or 2x 22MP | Dual 14-bit Spectra 280 ISP; 1x 32MP or 2x 16MP
Encode/Decode | 2160p60 10-bit H.265; HDR10, HDR10+, HLG; 720p480 | 2160p60 10-bit H.265; 720p480
Integrated Modem | Snapdragon X24 LTE (Category 20); DL = 2000Mbps, 7x20MHz CA, 256-QAM, 4x4; UL = 316Mbps, 3x20MHz CA, 256-QAM | Snapdragon X20 LTE (Category 18/13); DL = 1200Mbps, 5x20MHz CA, 256-QAM, 4x4; UL = 150Mbps, 2x20MHz CA, 64-QAM
Mfc. Process | 7nm (N7) | 10nm LPP
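The memory bandwidth figures in the table can be reproduced with a little arithmetic. The sketch below assumes the listed clock is the LPDDR4x I/O clock with two transfers per cycle (double data rate) and that GB/s is decimal; the function name is purely illustrative.

```python
def peak_bandwidth_gbs(channels: int, bits_per_channel: int, clock_mhz: float) -> float:
    """Theoretical peak DRAM bandwidth in decimal GB/s for a double-data-rate bus."""
    transfers_per_second = clock_mhz * 1e6 * 2          # assumption: two transfers per clock
    bytes_per_transfer = channels * bits_per_channel / 8  # total bus width in bytes
    return transfers_per_second * bytes_per_transfer / 1e9


s855 = peak_bandwidth_gbs(channels=4, bits_per_channel=16, clock_mhz=2133)
s845 = peak_bandwidth_gbs(channels=4, bits_per_channel=16, clock_mhz=1866)

print(f"Snapdragon 855: {s855:.1f} GB/s")
print(f"Snapdragon 845: {s845:.1f} GB/s")
print(f"Ratio 855/845:  {s855 / s845:.3f}")
```

Since both chips use the same 4x 16-bit layout, the bandwidth ratio reduces to the ratio of the memory clocks.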
Summarising the Snapdragon 855 quickly, at the heart of the new 7nm chipset we find four new Kryo 485 CPUs based on Arm’s new Cortex A76 cores. The new CPU cores are derivatives of the designs Arm makes available by default: Qualcomm takes advantage of the “Built on Arm Cortex Technology” license, which allows vendors to request changes to the IP based on a set of microarchitectural tuneables that Arm makes available. Even though the Snapdragon 855’s Kryo 485 is the third generation of such a derivative core from Qualcomm, it represents the first instance of the company actually talking about what kind of microarchitectural changes have been made to the CPUs. The Kryo 485 Cortex A76 derivatives increase the re-order buffer from Arm’s default 128 entries to a higher, undisclosed figure. Qualcomm has also revealed that the prefetchers have been optimised for better efficiency, although again we don’t have any details on how exactly this has been achieved. Finally, as I’ve understood it, the branch data structures have also seen an improvement over the default A76 configuration.
Although the SoC still comes in a “4+4” big.LITTLE high-level core configuration, the big cores in the S855 aren’t actually all equal: Qualcomm has gone for a rather exotic 1+3+4 configuration, in which one of the big cores is implemented with a higher-frequency physical design reaching 2.84GHz, while also employing a larger L2 cache of 512KB. The remaining three big cores max out at 2.42GHz and are only equipped with 256KB L2 caches. Qualcomm clocks the two groups of this 1+3 configuration independently on asynchronous clock planes; however, the cores still share the same voltage plane. The company explains that this is a compromise between power efficiency benefits on the one hand, and implementation complexity and the cost of supporting power delivery components on the other (adding an extra voltage plane adds another PMIC rail with inductors and capacitors).
The Kryo 485 in the S855 promises to be Qualcomm’s biggest performance jump ever, with the company advertising a 45% jump in performance compared to the Snapdragon 845. As we discussed in our launch article, this figure seems to coincide with the performance jump the Kirin 980 saw over the Snapdragon 845, which makes sense given that the two new SoCs both employ the newest-generation A76 CPU at similar clocks.
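To put the advertised figure in context, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the 45% claim applies to single-threaded performance on the fastest core of each chip (the source does not say how the figure was measured) and uses the peak clocks from the table above.

```python
def per_clock_gain(perf_ratio: float, old_ghz: float, new_ghz: float) -> float:
    """Performance ratio left over after dividing out the clock-speed ratio."""
    return perf_ratio / (new_ghz / old_ghz)


# Assumption: 1.45x overall, 2.80GHz (Snapdragon 845) vs 2.84GHz (Snapdragon 855).
gain = per_clock_gain(perf_ratio=1.45, old_ghz=2.80, new_ghz=2.84)
print(f"Implied per-clock improvement: {(gain - 1) * 100:.1f}%")
```

Under those assumptions, almost all of the claimed uplift would have to come from the microarchitecture and memory subsystem rather than from frequency, since the peak clocks differ by less than 2%.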
The Snapdragon 855 QRD
Today’s testing platform is Qualcomm’s new Snapdragon 855 QRD (Qualcomm Reference Design). This year’s QRD is probably Qualcomm’s most “commercial-device-like” ever: other than the more robustly built bezels, you would have a hard time noticing that this is a reference platform, at least on the hardware side. The company stated that the QRD as tested should be a close representation of what to expect of commercial devices, but of course vendors might still diverge from this as they may change the underlying software.
We’ve had limited hands-on time with the devices, and although we managed to do a lot with them, some of the more time-intensive testing, such as GPU sustained performance testing, was out of the scope of today’s session.
The Big Question For The Snapdragon 855: Memory Latency
One of the more odd developments that Qualcomm introduced with the Snapdragon 845 last year was the addition of a new 3MB system level cache that sits in front of the memory controllers as a new cache hierarchy. This new block is meant to serve as a SoC-wide buffer for the various IP blocks, reducing the amount of expensive DRAM memory accesses and thus improving power efficiency of the system. As well as improving power efficiency, it could in theory also serve as a booster for performance. The most famous use of such a system cache is Apple’s own A-series chips where we’ve seen this kind of IP block used ever since the A7.
Qualcomm’s introduction of such an IP block in the Snapdragon 845 was exciting to see, however this came as a double-edged sword as it had also introduced a 30% regression (In comparison to the Snapdragon 835) in DRAM memory latency, which may have limited some of the performance aspects of the A75-based CPU last year.
The new Snapdragon 855 unfortunately doesn’t seem to have made any major improvements in this regard as Qualcomm had confirmed that the IP block is the same as the one used in the Snapdragon 845.
To investigate the differences between the Snapdragon 845, 855 and also the Kirin 980, we make use of a latency test. First off, let’s view the memory hierarchy in a logarithmic latency graph, which better visualises the various latency jumps between hierarchy levels:
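The article does not describe how its latency test is implemented. A common approach for this kind of measurement is pointer chasing: each load depends on the result of the previous one, so the hardware cannot overlap them, and the working-set size is swept in power-of-two steps to expose each cache level. The following is a minimal sketch of that idea only; all names are hypothetical, and in Python the interpreter overhead dominates the timings, so a real test would be written in C or assembly with elements spaced at cache-line granularity.

```python
import random
import time


def make_chain(n_slots: int, seed: int = 0) -> list[int]:
    """Build one random cycle over all slots: chain[i] is the slot visited after i."""
    order = list(range(n_slots))
    random.Random(seed).shuffle(order)
    chain = [0] * n_slots
    # Link each slot to its successor in the shuffled order, wrapping at the end.
    for here, there in zip(order, order[1:] + order[:1]):
        chain[here] = there
    return chain


def chase(chain: list[int], steps: int) -> tuple[int, float]:
    """Follow the chain for `steps` dependent loads; return (final slot, ns per step)."""
    idx = 0
    start = time.perf_counter()
    for _ in range(steps):
        idx = chain[idx]  # the next index is unknown until this load completes
    elapsed = time.perf_counter() - start
    return idx, elapsed * 1e9 / steps


def log_spaced_sizes(lo: int, hi: int) -> list[int]:
    """Powers-of-two sweep from lo to hi inclusive, matching a logarithmic x-axis."""
    sizes = []
    size = lo
    while size <= hi:
        sizes.append(size)
        size *= 2
    return sizes


if __name__ == "__main__":
    for n in log_spaced_sizes(1 << 10, 1 << 18):
        _, ns = chase(make_chain(n), 100_000)
        print(f"{n:>8} slots: {ns:7.1f} ns/step")
```

The random single-cycle ordering is the important design choice: it defeats stride prefetchers and guarantees that every slot in the working set is touched before any is revisited.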
Between the Snapdragon 845 and the new 855, we now see the increased L2 cache of the Kryo 485 Prime core at 512KB, an increase over the 256KB size on the previous generation performance core. What is also visible is that the L2 latency has improved, even though both chipsets are clocked near to each other at ~2.8GHz.
The DSU L3 caches on the Snapdragon 845 and 855 are identical at 2MB in size. Again, what’s interesting to see here is that the latency of this part of the cache hierarchy seems to be identical between both SoCs. This is in contrast to the 4MB L3 of the Kirin 980 which, albeit bigger, seems to be 20% slower.
Moving onto the linear graph, here we can see the difference in DRAM latency in a clearer fashion. The Snapdragon 855 does seem to slightly improve memory latency over the 845, however this might just be an effect of the newer 2133MHz LPDDR4X memory that represents a 14% speed boost over the 1866MHz memory of last year’s devices.
The critical comparison, though, is with the Kirin 980, which employs the same CPU microarchitecture and the same 2133MHz memory as the Snapdragon 855. Here Qualcomm still shows a notable memory latency regression relative to the competition, whose effects will be interesting to analyse in the following pages.
132 Comments
cknobman - Tuesday, January 15, 2019 - link
So better power consumption but performance wise it looks like a swing and a miss.
Nothing too meaningful over the 845.
IGTrading - Tuesday, January 15, 2019 - link
To be honest, this is good enough for me and most of us.
I'd be happy to see Qualcomm focusing more on server CPUs and computers/notebook running Windows on AMR chips.
It's been something like 15 or even 20 years since coders/developers stopped worrying about optimizations, performance improvements and now they only rely on the much improvement hardware being available year after year.
We were building optimized web pages 20 years ago, that looked good and loaded in less than 10 seconds on a 5,6 KB connection.
Now idiots build sites where the Home Page is 300 MB heavy and complain about mobile CPUs and mobile networks not being fast enough.
bji - Tuesday, January 15, 2019 - link
"t's been something like 15 or even 20 years since coders/developers stopped worrying about optimizations, performance improvements and now they only rely on the much improvement hardware being available year after year."
Speaking as a software developer, I will say that your statement is bullshit. I have yet to work on any product where performance wasn't considered and efforts to improve efficiency and performance weren't made.
bji - Tuesday, January 15, 2019 - link
Also everything your browser does now is 10,000 times more complicated than anything that browsers did 20 years ago. All of the effort that has gone into developing these technologies didn't go nowhere. You are just making false equivalencies.
And if a page took 5 seconds to load in 2019, let alone 10 seconds, you'd be screaming about how terrible the experience is.
name99 - Tuesday, January 15, 2019 - link
It's usually the case that people talking confidently about what computers were like 20 yrs ago (especially how they were faster than today...) are in the age range from not yet born to barely five years old at the relevant time.
Those of us who lived through those years (and better yet, who still own machines from those years) have a rather more realistic view.
rrinker - Wednesday, January 16, 2019 - link
Really? What's the 'realistic' view? For background, the first computer I had regular access to was a TRS-80 Model 1 when they first came out in 1977, so I've been doing this a LONG time. Software today is a bloated mess. It's not all the programmers' fault though, there is this pressing need for more and more features in each new version - features that you're lucky if 1% of the users actually even utilize. Web pages now auto start videos on load and also link a dozen ads from sites with questionable response times. That would have been unthinkable in the days 56k and slower dialup, and it just wasn't done. I even optimized my BBS in college - on campus we had (for the time) blazing fast 19.2k connections between all dorm rooms and the computing center, at a time when most people were lucky to have a 1200bps modem, and the really lucky ones had the new 2400s. So I set up my animated ANSI graphic signons in a way that on campus users at 19.2k would get the full experience and off campus users, connecting via the bank of 1200 baud modems we had, would get a simple plain text login. In today's world, there is a much grater speed disparity in internet connections. I have no problem with pretty much any site - but I have over 250mbps download on my home connection. Go visit family across the state - the best they can get a a DSL connection that nets about 500k on a good day on a speed test - and so many sites fail to load, or only ever partially load. But there are plenty of sites that don;t try to force graphics and videos down your throat that still work fine.
No, things weren't faster back in the day - but because the resources were more limited, both for apps running on the local computer in terms of RAM, storage, and video performance as well as external connectivity, programs had to be more efficient. Heck, the first computer I actually owned had a whole 256 bytes of RAM - to do anything I had to be VERY efficient.
Klinky1984 - Friday, January 18, 2019 - link
So pay per minute slow internet, the non-standard compliance of Netscape 2.0 and IE 3.0, an internet without any video streaming, were there "good ol days"? Sorry but I remember bloated pages that took a minute plus to download or never loaded. I remember waiting 3 minutes for one single high res jpeg to download... They were not glory days. Can your 256 byte computer even handle Unicode? No way.
seamadan - Tuesday, January 22, 2019 - link
I bet your pages looked REALLY good. Like REALLY REALLY good. I'm in awe and I haven't even seen them
Krysto - Tuesday, January 15, 2019 - link
That bold has sailed. They've already given all the server IP on a silver platter to their forced Chinese "partner".
That said, Snapdragon 8cx for notebooks does look quite intriguing, mainly because of its 10MB shared cache.
Krysto - Tuesday, January 15, 2019 - link
boat*