Comment by adrian_b

12 hours ago

While the frontend of Intel Skymont, which includes instruction fetching and decoding, is very original and unlike that of any other CPU core, the backend of Skymont, which includes the execution units, is extremely similar to that of Arm Cortex-X4 (also known as Neoverse V3 in its server variant and as Neoverse V3AE in its automotive variant).

This similarity consists in the fact that both Intel Skymont and Arm Cortex-X4 have the same number of execution units of each kind (and there are many kinds of execution units).

Therefore it can be expected that for any application whose performance is limited by the CPU core backend, Intel Skymont and Arm Cortex-X4 (or Neoverse V3) should have very similar performance.

Moreover, Intel Skymont and Arm Cortex-X4 have the same die area, i.e. around 1.7 square mm (for both cores, this figure includes 1 MB of L2 cache). Therefore the 2 cores not only should have about the same performance for backend-limited applications, but they also have the same cost.

Before Skymont, all the older Intel Atom cores had been designed to compete with the medium-size Arm Cortex-A7xx cores, even if the Intel Atom cores have always lagged behind Cortex-A7xx in performance by a year or two. For instance, Intel Tremont had very similar performance to Arm Cortex-A76, while Intel Gracemont and Crestmont have a core backend extremely similar to that of the series from Cortex-A78 to Cortex-A725 (like Gracemont and Crestmont, the 5 cores in the series Cortex-A78, Cortex-A710, Cortex-A715, Cortex-A720 and Cortex-A725 have only insignificant differences in the execution units).

With Skymont, Intel has made a jump in E-core size, positioning it as a match for Cortex-X, rather than for Cortex-A7xx as its predecessors were.

>positioning it as a match for Cortex-X

Well, the recent Cortex X5 or 925 is already at around 3.4 mm², so that comparison isn't exactly accurate. But I would love to test and see results on Skymont compared to X4. But I don't think they are available yet (as an individual core).

I am really looking forward to Clearwater Forest which is Skymont on 18A for Server.

And I know I am going to sound crazy, but I wouldn't mind a small SoC based on Skymont and Xe2 graphics for smartphones and tablets.

  • As I have said, Intel Skymont is a very close match for Cortex-X4, not for Cortex-X925.

    With Cortex-X925, Arm has made a big jump in core size, departing from the previous Cortex-X series. This has allowed a good increase in IPC, greatly improving the results of single-threaded benchmarks, but it has been paid for with a much worse performance per area, making Cortex-X925 completely unsuitable for multi-threaded applications. Therefore Cortex-X925, like Intel Lion Cove, is useful only when it is accompanied by smaller cores that handle the multi-threaded workloads.

    So unlike with previous Arm cores, Cortex-X925 has not made Cortex-X4 obsolete, as demonstrated e.g. in MediaTek Dimensity 9400, which includes 1 Cortex-X925 to get good single-threaded benchmark scores, together with 3 Cortex-X4 to get good multi-threaded benchmark scores.

    It is not clear what Arm's intentions are for the evolution of the Cortex-X series. The rumors are that the next core configuration for smartphones is intended to be like that already deployed by Qualcomm with its custom cores, i.e. to have a big core that is 3 times bigger than the medium-size core and to use 2 big Cortex-X930 cores + 6 medium-size Cortex-A730 cores, for an even split in die area between the big cores and the medium-size cores.

    For this to work well, Cortex-X930 must provide a good improvement in performance per area over Cortex-X925, because otherwise it would be hard to justify a 2+6 arrangement, when in the same die area one could have implemented a 1+9 configuration, with the same single-threaded performance, but with better multi-threaded performance.

    I believe that a small SoC with only 4 Skymont cores and Xe2 graphics would provide performance, battery lifetime and cost for a smartphone that would be completely competitive with any existing Qualcomm, MediaTek or Samsung SoC.

    This would be less obvious in a benchmark like GeekBench 6, where Cortex-X925 or Qualcomm Oryon L would show a greater single-threaded score, but the difference would not be great enough to actually matter in real usage. Also for multi-threaded performance measured by GB6, only 4 Skymont cores would seem to be a little slower than the current flagships, but that would be misleading, because 4 Skymont cores could run at full speed for long durations within the smartphone power constraints, while the current 8-core flagships can never run all 8 cores at the 100% performance recorded by GB6, without overheating after a short time.

    An 8-core Skymont SoC would be excellent for a cheap tablet with long battery lifetime and great performance, even if again, such a configuration would be penalized by GB6, which favors having 1 huge core, like Cortex-X925, for the ST score, together with an over-provisioned set of medium-size cores, which can run all together only for the short time required to complete the GB6 sub-benchmarks, but in real prolonged usage must never be all completely busy at the same time, in order to avoid overheating.
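The 2+6 versus 1+9 comparison above is just area arithmetic, and can be checked with a rough sketch like the one below. It assumes only the rumored ratio mentioned in the comment (one big core taking the area of 3 medium-size cores); the function names are made up for illustration, and area is counted in relative "medium-core units", not in square mm.

```python
from fractions import Fraction

# Assumption taken from the rumor quoted above: one big core occupies
# the area of 3 medium-size cores. This is not a measured figure.
BIG_TO_MEDIUM_AREA_RATIO = 3


def total_area(big_cores, medium_cores, ratio=BIG_TO_MEDIUM_AREA_RATIO):
    """Total core area, expressed in medium-core units."""
    return big_cores * ratio + medium_cores


def big_core_share(big_cores, medium_cores, ratio=BIG_TO_MEDIUM_AREA_RATIO):
    """Fraction of the total core area that is spent on big cores."""
    return Fraction(big_cores * ratio, total_area(big_cores, medium_cores, ratio))


if __name__ == "__main__":
    for big, medium in [(2, 6), (1, 9)]:
        print(
            f"{big}+{medium}: total area {total_area(big, medium)} units, "
            f"big-core share {big_core_share(big, medium)}"
        )
```

Under that assumption both configurations occupy 12 medium-core units. The 2+6 layout spends half of that area on big cores, while 1+9 spends a quarter, leaving 3 more medium-size cores for multi-threaded work at the same single-threaded performance.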