Austin31415

>Regarding recent reports, Samsung Electronics reached out to us and said, "The recent media report that Samsung has established an internal team dedicated to CPU core development is not true. Contrary to the news, we have long had multiple internal teams responsible for CPU development and optimization, while constantly recruiting global talents from relevant fields."


somdipdey

I can confirm that Samsung has always had multiple teams on CPU development and optimisation. I used to work in the AI/DI team at the Samsung R&D Institute U.K., where we focused on developing software mechanisms to optimise CPU efficiency, and we worked with other teams to improve it further.


Austin31415

That sounds like an awesome job!


somdipdey

It was one of the best places I worked at. 😊😊😊


[deleted]

[deleted]


Soundwave_47

Chipmaking is somewhat of a dark alchemic art. It requires a really strong base of experienced talent and cogent direction at all levels. Not to mention supply chain optimization, part constraints, etc.


BLUEGLASS__

It's not like Samsung lacks the resources to assemble the necessary talent; it's just a matter of whether it makes any actual sense to do so.


Soundwave_47

It takes a while and Intel, TSMC, Apple, AMD, NVIDIA have a lot of electrical and computer engineering talent. Samsung isn't really known for that.


somdipdey

This is not true. Samsung has some of the best talent in the world. I'm not saying this because I used to work there in the AI/DI team on CPU/GPU optimisation, but because I have worked first-hand with truly brilliant people - reputed scientists and graduates from top unis including Oxford, Cambridge, and Imperial. Samsung's R&D institutes and AI institutes have some of the best talent around the world.


Ok_Check_1152

True, people don't really see the lack of talent down the pipeline. I work in the RF field; my team is a third the size of the big players', and even a fraction of their bigger competition's. My team accomplished more in 3 years than other competitors would dream of in 5. Without saying much (yay NDA), it takes a lot of talent to make something like this happen, and a small core of talented people can be more essential at each stage than the majority of replaceable, typical hires. At each stage you hope no one headhunts your core talent. And those people are not known outside the industry, and they're driving new technology that isn't even out to the public and media yet.


dentistwithcavity

So why is Jim Keller always the one and only guy pushing the semiconductor industry forward? He's always behind some major improvement, and as soon as he leaves, the product stops being competitive.


somdipdey

Unfortunately, I’m not legally allowed to comment based on my past contract. 😊😊😊 Don’t wanna get sued based on my opinion.


soupiejr

That's what throwaway accounts are for! Just saying...


[deleted]

[deleted]


[deleted]

[deleted]


somdipdey

Yes, they can. Moreover, I'm way too busy to handle multiple accounts. 😅😅😅


Narcotras

I can't wait for stupid non-competes and NDAs covering opinions to get thrown out; they're fucking stupid.


jamesnyc1

Then why are you not there anymore?


somdipdey

I run my own tech company now - Nosh Technologies. That’s why I left. 😊😊😊


jamesnyc1

Oh cool. Good for you.


Exist50

But teams for CPU **core** development? Because Samsung liquidated the Mongoose team years ago. You're talking about a completely different team/skill set.


[deleted]

[deleted]


Artistic-Toe-8803

wow


kfthebest97

Seems like they want to keep the peace with ARM, so they don't end up in legal trouble like Qualcomm.


Austin31415

That was my first reaction too. I do think ARM cores are getting better and better; I just hope Samsung actually goes all out on the L2 cache. My second reaction is that I don't know if Samsung would admit it publicly even if they were designing their own CPU cores, and I'm sure they are looking into RISC-V cores (especially for the Chinese market).


somdipdey

Btw, I am not confirming or commenting on anything related to Samsung's business, but I would like to mention that it's good to explore cutting-edge/latest technologies. RISC-V is great for research purposes at the moment but still needs some time before mass die production. There are several good applications built on RISC-V; however, because of the expensive investment involved in the chip integration process, companies around the world usually test out a lot of different methodologies before they finalise the one they mass-produce, in order to optimise production cost.


Austin31415

Oh absolutely! I meant this as an early investment for research on a possible long-term strategy.


Exist50

Or a much simpler explanation - that the fundamentals haven't changed since they disbanded their old team years ago. Qualcomm has a wider market than Samsung, and they bought an entire company. Plus, Samsung is having more fundamental issues with their SoCs.


[deleted]

[deleted]


Austin31415

I think they're better off focusing on a killer GPU and NPU, while further developing ARM stock cores in the near future. It's all exciting stuff, but fab rumors are a dime a dozen so I'm not getting my hopes up.


GeneralChaz9

Agreed; not that I want CPU performance to stall, but I think the next steps are co-processors for AI/ML/etc. functions. Qualcomm is showing that GPU leaps are still possible at this power usage, so I think there is still a ton of improvement to be had. CPU efficiency and thermal output should be prioritized over benchmark numbers right now, imo. Sustained performance could use some help.


BLUEGLASS__

Imagine if Samsung fully expands DeX into a full PC operating system available on laptops, tablets, and all-in-ones, and starts fighting Apple with a unified Android-based ecosystem, all integrated with their phones.


Austin31415

I don't see Samsung wanting to own a desktop-class OS on top of Android. That's a lot of work for something that gets easier to do each year on existing platforms. They're honestly better off just using Windows and working with Microsoft. I just don't think Samsung is that interested in owning a platform anymore. While the dream of a unified phone/tablet/desktop OS has been around for a while, even Apple is far from unifying iPadOS and macOS. Google is really the only one with a reason to do this, and they also don't seem that interested in combining OS platforms anytime soon.


BLUEGLASS__

Ok but wouldn't it be funny


Put_It_All_On_Blck

I'm not that optimistic about it. Samsung's 3nm GAE was supposed to debut last year; they said it did, but we never saw a single product ship with it. Now there are rumors that the node was bunk and they're hoping 3nm GAP will save them. But Samsung's foundries have been behind for years, so it's unlikely they will be competitive. The RDNA 2 architecture in the Xclipse 920 was garbage, worse than if they had stuck with Mali, while Qualcomm hit a home run with the Adreno 740. RDNA 3 is a smaller advancement than RDNA 2 was, which is part of why Nvidia is running away with the GPU market with a bigger lead over AMD than in previous years. And again with the modem: I'm sure it will improve, but Qualcomm makes the gold standard for mobile modems, and Samsung has put out a lot of bad ones. I'd much rather just stick with a Qualcomm modem, since you know it's going to be good.


uKnowIsOver

>Samsung's 3nm GAE was supposed to debut last year, they said it did, but we never saw a single product ship with it. Now there's rumors that the node was bunk and they are hoping 3nm GAP

It entered [risk production](https://fuse.wikichip.org/news/6932/samsung-3nm-gaafet-enters-risk-production-discusses-next-gen-improvements/) last year. Also, it's not uncommon to axe a node and move to another one; TSMC did the same with their N3 node, which was delayed multiple times (it should have come out last year) because of poor yield rates and axed in favor of the less complex N3E.

>The RDNA 2 architecture in the Xclipse 920 was garbage, worse than if they stuck with Mali. Meanwhile Qualcomm hit a home run with Adreno 740.

It wasn't the architecture so much as the fact that it didn't implement a proper OpenGL driver and instead relied on ANGLE. Hilariously enough, [it beats Adreno 740](https://i.imgur.com/xR087yU.jpg) in RT performance, which was this generation's new GPU gimmick, and outperforms [Adreno 730](https://browser.geekbench.com/v5/compute/6463169) in [pure Vulkan](https://browser.geekbench.com/v5/compute/6464589) and in [pure OpenCL compute](https://browser.geekbench.com/v5/compute/6465458) vs [Adreno 730](https://browser.geekbench.com/v5/compute/6466366).


somdipdey

Btw, I am not commenting here in the sense of an ex-Samsung employee, but we need to keep a few things in mind. R&D institutes can develop the latest cutting-edge technology, but development happens with a lot of underlying assumptions, which might not generalise well. Just because a technology performs well during R&D doesn't mean it performs similarly at a production level.

This is especially true for chip development: thousands of different smartphone devices are currently available around the world, and a chip technology that works wonders for one specific smartphone build might not perform well for many others. So the issue here is the generalisation of an R&D technology into mass production. Also, since the silicon die manufacturing process is very expensive due to the associated initial investment cost, from a business point of view it often needs to be decided whether the ROI of mass-producing an R&D technology would be beneficial in the long run.


somdipdey

I read the comments on this post and I personally feel that a lot of people might have a bit of a misconception about chip technologies and their performance. Note: I'm not commenting on this as an ex-employee of Samsung who used to work in the AI/DI team on CPU/GPU optimisation; I'm commenting as a researcher in the field of CPU/GPU optimisation. A few things I need to clarify:

Modern MPSoCs are extremely powerful, and the majority of mobile apps out there can't even utilise such powerful CPUs to their full extent. For example, the Samsung Exynos 9810 MPSoC has 4 Mongoose big CPUs and 4 A55 LITTLE CPUs. Let's focus only on the big CPUs, as they are the most powerful and consume the most energy. The big CPUs can theoretically scale from 650 MHz to 2.7 GHz, but at the hardware level, CPU scaling as part of the DVFS mechanism (Dynamic Voltage and Frequency Scaling) is restricted to 1.9 GHz max. In software, max scaling is usually only allowed up to 1.7 GHz on that MPSoC, and there's a big reason for that: Amdahl's law. No matter how fast you make a CPU, if portions of the executing code can't utilise that speed, or are not parallelised correctly to utilise the operating speed of the processor, there will be a bottleneck in performance. So even if CPU frequency scaling were allowed up to 2.7 GHz, barely any mobile app would actually utilise that speed. In practice, operating the CPU at a high frequency only leads to high energy consumption and high thermal output of the MPSoC (not taking thermal throttling into account). To be fair, thermal throttling is applied to a CPU operating at such theoretically high frequencies in a passively cooled embedded device such as a smartphone, so that the chip doesn't burn, and it helps with thermal management.

When it comes to DVFS, most modern apps, including resource-intensive mobile games, can still run at the highest specs/performance at a much lower CPU/GPU operating frequency than the MPSoC can support. As one example, my research team and I built an AI/machine-learning model called EOptomizer that learns to set the desired CPU/GPU operating frequency while consuming far less power and with much better thermal behaviour. In fact, the qualitative performance difference while running games with our AI model driving the CPU/GPU at a far lower frequency, compared to the CPU/GPU being driven by the schedutil scheduler, was negligible, yet the power consumption and thermal output of the former were significantly lower than with the default frequency scaling done by schedutil. Using such a mechanism, we were able to achieve up to 30% reduced power consumption on Google Pixel 6 smartphones without losing any qualitative performance. You can read a bit about this work here: https://www.essex.ac.uk/events/2022/07/11/eoptomizer-launch-and-workshop

Now, why did I mention all of this? A lot of people complained about Samsung's Game Optimizing Service (GOS). In reality, if you do a qualitative analysis, the wide majority of users will not notice any difference in game performance (FPS) when the optimisation is on. People only complained because they checked the scores on benchmarks such as Geekbench while GOS was on. Keep in mind that these benchmarks run specific computations to determine a score, which often doesn't exactly reflect the real usage behaviour of a smartphone user. While GOS is on, the score achieved by benchmarks can be lower, but most users cannot tell the qualitative difference in performance between GOS on and off. On the other hand, GOS can help reduce power consumption and thermal output while maintaining high qualitative performance. From a business standpoint, why should a company invest millions or billions of dollars to build the fastest processor on paper (or for benchmark scoring) if apps cannot feasibly make full use of such a processor in practice? That is not good business.

Some related research that might be interesting on this topic:

1) https://repository.essex.ac.uk/27546/1/User%20Interaction%20Aware%20Reinforcement%20Learning.pdf
2) https://repository.essex.ac.uk/27614/1/conference_041818.pdf
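If anyone wants to see the Amdahl's-law point above with numbers, here's a tiny back-of-the-envelope sketch in Python. Everything in it (the 1.7 to 2.7 GHz operating points, the voltage bump, the "clock-bound" fractions) is an illustrative assumption, not a measurement of any real SoC:

```python
# Back-of-the-envelope sketch of why chasing peak clocks has diminishing
# returns on phones. Every number below is an illustrative assumption,
# not a measurement of any real SoC.

def speedup(clock_bound_fraction: float, clock_ratio: float) -> float:
    """Amdahl's law: only the clock-bound fraction of the runtime scales
    with a higher frequency; memory stalls, I/O waits and poorly
    parallelised code do not."""
    return 1.0 / ((1.0 - clock_bound_fraction) + clock_bound_fraction / clock_ratio)

def dynamic_power_ratio(clock_ratio: float, voltage_ratio: float) -> float:
    """Crude CMOS dynamic-power model: P scales roughly with f * V^2."""
    return clock_ratio * voltage_ratio ** 2

if __name__ == "__main__":
    base_ghz, boost_ghz = 1.7, 2.7        # assumed operating points
    clock_ratio = boost_ghz / base_ghz    # ~1.59x higher clock
    voltage_ratio = 1.25                  # assume ~25% more voltage at the top bin

    power = dynamic_power_ratio(clock_ratio, voltage_ratio)
    for frac in (0.3, 0.5, 0.8):
        print(f"{frac:.0%} clock-bound workload -> "
              f"{speedup(frac, clock_ratio):.2f}x faster for ~{power:.1f}x dynamic power")
```

Under these made-up numbers, a ~59% clock bump buys somewhere between a 1.1x and 1.4x real speedup for roughly 2.5x the dynamic power, which is exactly the trade-off DVFS governors (and things like GOS) are constantly navigating.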


dentistwithcavity

You are failing to account for the fact that almost every time you improve your peak clocks, you inadvertently also improve the mid-range performance of your CPU. This is why Apple's SoCs, AMD, and even Intel's 13th gen perform so well at mid-level clocks like the 1.6 GHz you mentioned, while Samsung and Qualcomm have always been behind the pack. This has also led to much greater perf/watt for Apple's little cores, which to this date outperform every single core on the market. Their chase for the best peaks led them to the most efficient down-clocked cores and thus the best battery life in both handhelds and laptops.


zephepheoehephe

The reason Apple's little cores are better is because they aren't that little. ARM still licenses their low-power core for embedded applications and whatnot.


dentistwithcavity

Whatever the die size is, they get almost 2x the performance at the same wattage as the A510.


[deleted]

But die size and manufacturing process are integral parts of the chip's cost. You can't dismiss that when making a comparison between two CPUs.


zephepheoehephe

That is in fact what happens when you aggressively gate a larger chip.


somdipdey

It's partially true but not completely. Improving the peak clock does not inadvertently improve the mid-range performance of the CPU. That said, Apple's SoCs (MPSoCs such as the M1 and M2 chips) are comparatively more efficient because of hardware/software co-design: Apple designs the MPSoC based on its software requirements and, in turn, designs the software to utilise the hardware capacity of the MPSoC to its full extent. Apple's MPSoCs are based on the RISC architecture and are influenced by Apple's own software - macOS/iOS.

By comparison, Samsung and Qualcomm mostly license CPU/GPU IP from ARM (also RISC based) to build their MPSoCs and separately modify Android (via the Android Open Source Project), which they get from Google, to work optimally with the manufactured MPSoC. So for Samsung and Qualcomm the problem becomes: take off-the-shelf hardware IP and an off-the-shelf operating system and make them work together. Reaching the most optimal result through hardware/software co-design becomes a bit more challenging that way. One thing to note is that in the Android space, Google, with their Tensor processors, has reached really good performance (and efficiency in terms of power consumption and thermal behaviour) with Android, because Google owns the Android development team and modifies ARM's CPU IPs (branded as Google Tensor) to get more optimal performance. This is Google's way of doing HW/SW co-design to develop their Pixel smartphones.

AMD and Intel produce CPUs mostly in the CISC domain. Intel invested in RISC-V based CPUs/chips, but there has been news that they terminated their work in that area, and there are rumours that AMD might invest in RISC-V in the future, but they don't yet. A CISC CPU architecture is far more complex and power hungry compared to a typical RISC CPU architecture (and these days CISC based CPUs are mostly used in general-purpose and server computing systems instead). So comparing AMD's and Intel's CISC CPUs with Samsung's/Google's/Qualcomm's RISC CPUs might be like comparing apples to oranges, which might not be fair.


dentistwithcavity

Again, you're mostly incorrect. Hardware/software co-design only factors in for specific applications: e.g., ML running on Tensor, or very specific video encoding/decoding like Apple does with ProRes on the media engine, but it doesn't help when you use other codecs, and the same goes for their Rosetta translation. I'm talking about general-purpose use cases with general-purpose languages, and Apple's "hardware/software co-design" isn't better than what Intel and AMD have been putting out recently. In fact, I'm pretty confident that once Nuvia's cores are added to Snapdragon, they'll be on par with or exceed Apple's performance. You can also verify that by running general-purpose workloads on Apple's SoCs: Chrome gets the same performance uplift even though it's not part of the "hardware/software co-design", or conversely, run Asahi Linux and, as long as drivers aren't an issue, general-purpose workloads like compiling perform exactly the same or even a little better: https://old.reddit.com/r/linux/comments/tj12vw/hugo_runs_twice_as_fast_in_asahi_linux_than_macos/


somdipdey

My research is on RISC-based architectures, so I won't comment on Intel's and AMD's progress in CPU/chip development even though I am aware of what they are up to. However, I would point out that you should refer to the following paper to understand the HW/SW co-design process a bit better: [https://ieeexplore.ieee.org/abstract/document/558708](https://ieeexplore.ieee.org/abstract/document/558708) Apple has a leg up in this area as both the hardware and the OS for their devices are mostly built/designed in-house, compared to many other smartphone manufacturers out there. The closest competitor in the area of HW/SW co-design is Google, as they produce their own processors based on ARM IPs and the Android team is part of Google. But this doesn't mean that other companies such as Samsung and Qualcomm don't do this or lag far behind. I am only stating this based on the technology currently available.


dentistwithcavity

It's a paid paper, I can't access it. Reading the abstract, I don't get the point. Of course an ASIC will perform 1000x better than a general-purpose CPU, but that's the whole thing I'm debating against. Run a Tensor ML workload on the Tensor GPU and of course it runs better than on Nvidia, but you can't say games will perform just as well on Tensor vs Nvidia. Compare general-purpose workloads on a general-purpose CPU and this "hardware/software co-design" doesn't come into play. As already shown with proof, cases like Hugo, Rust, and Golang work much better under Asahi Linux on the same M1 Mac than they do on macOS.


somdipdey

The HW/SW co-design approach was introduced for SoCs because the biggest concerns for embedded systems using SoCs (smartphones in our case) are optimising performance, energy consumption, and thermal behaviour. These concerns are usually not a big focus for general-purpose computers, but for embedded systems using SoCs - or, these days, MPSoCs - they remain primary. Apple moved to MPSoCs even for their laptops because it enables them to create lighter devices (compared to using Intel-based chips that require separate computing subsystems) while still meeting the performance requirements of modern applications. It took the industry some time, but it finally started to see the benefits of RISC-based processor systems, especially in MPSoCs.

As I mentioned earlier, personal devices such as laptops, tablets, smartphones, and phablets all use MPSoCs these days and run on batteries rather than a dedicated continuous power supply. Yes, a person can keep a device connected to a power source all the time, and then energy consumption might not seem to be an issue. But in the larger sense, as the world focuses on net-zero emission goals and sustainability, the main concern for such battery-operated devices is how to make them more efficient in terms of performance, energy consumption, and thermal behaviour. One of the best ways to achieve this is the HW/SW co-design process, where the hardware can meet all the requirements of the executing applications while catering for the desired energy consumption and thermal behaviour.

Note: HW/SW co-design is just a design concept and is a vast area of research/integration in EDA (Electronic Design Automation). A free-to-read survey paper in this domain, which shows different ways performance, energy, and thermal behaviour can be optimised in MPSoCs and might help: [https://repository.essex.ac.uk/27441/1/DT_DTSI_2019_08_0091.R1_Singh.pdf](https://repository.essex.ac.uk/27441/1/DT_DTSI_2019_08_0091.R1_Singh.pdf)

However, I think we might be going off topic from the original post/point. My main point was that when a company designs its own HW and SW to depend on each other, it has an edge over the rest of the industry in achieving performance and efficiency in its devices.
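For anyone curious what the software half of that optimisation loop actually looks at on a Linux/Android device, here's a minimal read-only sketch using the mainline cpufreq and thermal sysfs interfaces. The paths are standard kernel ABIs, but vendor kernels often rename or restrict them, so treat this as a sketch rather than something guaranteed to work on every phone:

```python
# Minimal, read-only sketch of the knobs and sensors a DVFS or thermal
# policy typically works with on Linux/Android. The paths follow the
# mainline cpufreq and thermal sysfs ABIs; vendor kernels may rename,
# restrict, or omit them, so expect gaps on real devices.
from pathlib import Path

def read_sysfs(path: Path) -> str:
    try:
        return path.read_text().strip()
    except OSError:
        return "n/a"

def main() -> None:
    # Per-policy CPU frequency state: active governor plus current/max
    # frequency in kHz (one policy usually covers one CPU cluster).
    cpufreq = Path("/sys/devices/system/cpu/cpufreq")
    for policy in (sorted(cpufreq.glob("policy*")) if cpufreq.is_dir() else []):
        print(policy.name,
              "governor:", read_sysfs(policy / "scaling_governor"),
              "cur:", read_sysfs(policy / "scaling_cur_freq"), "kHz",
              "max:", read_sysfs(policy / "scaling_max_freq"), "kHz")

    # Thermal zones: sensor name plus temperature in millidegrees Celsius.
    thermal = Path("/sys/class/thermal")
    for zone in (sorted(thermal.glob("thermal_zone*")) if thermal.is_dir() else []):
        print(zone.name, read_sysfs(zone / "type"), read_sysfs(zone / "temp"), "m°C")

if __name__ == "__main__":
    main()
```

A governor or an optimiser along the lines described above essentially closes the loop over exactly these kinds of readings, trading frequency against temperature and power.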


[deleted]

[deleted]


somdipdey

It would actually depend on the specific CISC and RISC architectures; I described the above in general terms. By definition, "The CISC approach attempts to minimize the number of instructions per program, sacrificing the number of cycles per instruction. RISC does the opposite, reducing the cycles per instruction at the cost of the number of instructions per program." - [https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/](https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/) So the associated school of thought in execution is different. That said, in terms of actual implementation, the differences between CISC and RISC for specific architectures might not be that large, but they are still different.
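To put toy numbers on that definition: execution time is roughly instruction count × cycles per instruction × clock period, and CISC and RISC just trade the first two factors against each other. The figures below are made up purely to illustrate the trade-off, not taken from any real ISA:

```python
# Toy arithmetic for the classic CISC-vs-RISC framing quoted above.
# Execution time = instruction count * cycles per instruction * clock period.
# The numbers are illustrative only; real ISAs and microarchitectures
# blur this distinction heavily.

def exec_time_ns(instructions: float, cpi: float, clock_ghz: float) -> float:
    return instructions * cpi / clock_ghz  # ns, since 1 / GHz = 1 ns per cycle

if __name__ == "__main__":
    clock_ghz = 2.0
    # "CISC-style": fewer, more complex instructions, higher CPI.
    cisc = exec_time_ns(instructions=1.0e6, cpi=4.0, clock_ghz=clock_ghz)
    # "RISC-style": more, simpler instructions, lower CPI.
    risc = exec_time_ns(instructions=1.6e6, cpi=1.2, clock_ghz=clock_ghz)
    print(f"CISC-style: {cisc / 1e6:.2f} ms, RISC-style: {risc / 1e6:.2f} ms")
```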


dentistwithcavity

Yeah, this guy doesn't sound like he worked on SoCs at all. I've had plenty of conversations with friends from the likes of Apple's SoC team, Nuvia, AMD, and Qualcomm, and he sounds like a junior guy at best. No wonder Samsung's Exynos is a failure.


Madlonewolf

Thank god


grasspopper

What is Exynos then?


HistoricalInstance

SoC != CPU core. Samsung has used ARM cores for a while now.


grasspopper

Is ARM different from Apple's M/A series of chips? I'm just trying to understand all these concepts.


Rhed0x

ARM can mean two things. It can mean the instruction set, so basically the """language""" the CPU understands, or it can mean actual ARM **Cortex** CPU cores that are designed by ARM. Both Samsung and Qualcomm use ARM Cortex CPU cores in their SoCs. Apple designs its own CPUs using only the ARM instruction set. Qualcomm is also moving in that direction, and their custom CPU cores are expected to launch either this year or next year.
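If you're curious which of the two your own device uses: on ARM Linux/Android, /proc/cpuinfo reports a per-core "CPU implementer" field, where 0x41 means an Arm-designed (Cortex) core and other codes indicate custom designs. A rough sketch of reading it; the ID-to-name table here is a partial, best-effort mapping, not an authoritative list:

```python
# Rough sketch: report who designed each CPU core on an ARM Linux/Android
# device by parsing /proc/cpuinfo. The implementer table is a partial,
# best-effort mapping of MIDR implementer codes.
ARM_IMPLEMENTERS = {
    "0x41": "Arm (Cortex cores)",
    "0x51": "Qualcomm (custom cores)",
    "0x53": "Samsung (custom cores, e.g. Mongoose)",
    "0x61": "Apple (custom cores)",
}

def main() -> None:
    core = "?"
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("processor"):
                    core = line.split(":", 1)[1].strip()
                elif line.startswith("CPU implementer"):
                    impl = line.split(":", 1)[1].strip()
                    who = ARM_IMPLEMENTERS.get(impl, "unknown / other")
                    print(f"core {core}: implementer {impl} ({who})")
    except OSError:
        print("no /proc/cpuinfo here (not Linux?)")

if __name__ == "__main__":
    main()
```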


grasspopper

okay thanks!


somdipdey

ARM usually develops the CPU/GPU IPs and then licenses them to other companies such as Samsung, Apple, Qualcomm, Google, and Xilinx. These companies then develop their own MPSoCs (Multiprocessor Systems-on-Chip) based on the CPU/GPU and other processing-element IPs from ARM. An MPSoC consists of multiple processing elements such as CPUs, GPUs, DSPs, neural processing elements, etc., and other components/subsystems such as RAM, communication modules, etc. An MPSoC is basically a whole computing system built on a single chip. Sometimes companies also develop their own CPU/GPU by modifying ARM's RISC architecture IP, as was the case with Samsung's Mongoose CPUs. P.S. Note: I did not comment on this as an ex-employee of Samsung, but rather as a researcher in the field of embedded systems/embedded machine learning for RISC architectures.


iDontSeedMyTorrents

ARM is an instruction set, like x86 or RISC-V. Arm (the company) develops complete ARM CPU designs that others can license and use. This is what most companies using ARM CPUs currently do (or they buy complete chips from someone who does). Some companies license just the instruction set and instead build their own custom CPUs that run it. This is what Apple does and what Qualcomm is attempting to do soon with their acquisition of Nuvia.


grasspopper

Okay, thank you! I had no idea about Nuvia


HistoricalInstance

Apple uses the same ARM instruction set, similar to how both AMD and Intel use x86. And like AMD and Intel, Apple and ARM (ARM not only provides the instruction set, but also core designs for others to license) have different CPU designs. Samsung used to have its own CPU-core design (named "Mongoose", iirc) as well, but now licenses ARM-designed cores like Qualcomm and MediaTek do.


somdipdey

Exynos is the brand name of the MPSoCs (Multiprocessor Systems-on-Chip) produced by Samsung. An MPSoC consists of multiple processing elements such as CPUs, GPUs, DSPs, neural processing elements, etc. It's basically a whole computing system built on a single chip. 😊😊😊


Maxo112

No Exynos in the Samsung Galaxy S series for Europe anymore? I hate that thing; we love Snapdragon!!!


PoliteLunatic

Implementation of something Pluton-ish?


Berkoudieu

They'd better not be true; let me replace my S22U next year before they put Exynos shit inside new phones again. Writing this message on said device btw, lost 2% battery doing so, and it's heating my hands...


RangerLt

Samsung: Reports of groups of people within the company working together towards a common goal have been exaggerated.