
Everyone deserves a better computer


AheadComputing, Inc. is a semiconductor start-up built on the belief that everyone deserves a better computer. We are designing a unique architecture with the ultimate goal of delivering the best per-core performance available. Per-core performance, achieved with state-of-the-art performance per watt, is the foundation of multi-processor system efficiency. Our plan is to develop and license breakthrough, high-performance 64-bit RISC-V processor cores to benefit the emerging computing era and to meet the growing demand for general-purpose compute driven by the advent of deep learning models.


Iron Law of Performance

What makes a computer faster? The frequency of your computer (nowadays 4, 5, or even 6 gigahertz) is the speed at which its circuits operate, measured in billions of cycles per second. The amount of work your computer performs each cycle is its instructions per cycle (IPC). For a given program, the Iron Law of Performance says that your computer's performance is proportional to its frequency multiplied by the number of instructions it executes each cycle.



Performance ∝ Frequency × IPC

Equation 1: Iron Law of Performance
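
To make the relationship concrete, here is a back-of-the-envelope sketch in Python; the instruction count, IPC, and frequency below are hypothetical, illustrative numbers rather than measurements of any real core.

```python
# Iron Law of Performance: execution time = instructions / (IPC * frequency)
# All numbers below are hypothetical, for illustration only.

instructions = 2_000_000_000   # dynamic instructions executed by the program
ipc = 4.0                      # instructions retired per cycle
frequency_hz = 5.0e9           # 5 GHz clock

execution_time = instructions / (ipc * frequency_hz)
print(f"Execution time: {execution_time * 1000:.1f} ms")   # 100.0 ms

# Doubling IPC at the same frequency halves the execution time, which is the
# lever that per-core (rather than more-core) design improvements pull on.
print(f"With 2x IPC:    {instructions / (2 * ipc * frequency_hz) * 1000:.1f} ms")   # 50.0 ms
```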

Quite conveniently, even while constrained by the Iron Law of Performance, the computer industry has been able to reduce the time it takes to perform common operations to less than a second. For example:

  • changing the page in a PDF document,

  • rendering a web page,

  • or unzipping a compressed file.


With modern CPUs, all these operations happen faster than your impatience threshold, so most users are not asking for a faster CPU and feel that the status quo is good enough.


However, at AheadComputing we believe this satisfaction with computer performance will not persist! We know the world is at an inflection point, with more demanding use cases that need far more performance in both the personal computer and server markets.


General Purpose Compute Demand Will Soar

Along with the proliferation of deep learning models in science, engineering, and business comes an explosion in demand for general-purpose compute to wrangle the data that informs those models. Much of that demand will be fueled by the everyday data-wrangling work that surrounds deep learning models, known as Extract, Transform, and Load (ETL). Today's ETL covers a huge range of data processing and management tasks, such as tokenization, decompression, relational databases, vector databases, normalization, and data cleaning. These workloads, created by and required by deep learning models, will run on general-purpose computing platforms. Our team is set to deliver the microarchitectural innovations to meet this demand for performance increases.
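
As a loose illustration of the scalar, branchy character of this kind of work, here is a toy ETL pipeline in Python; the data and the simple keyed store are invented for the example and do not represent any particular ETL stack.

```python
# Toy Extract-Transform-Load pipeline: decompress, tokenize, clean, normalize, load.
# Illustrative only; real ETL stacks use dedicated databases and libraries.
import gzip

raw = gzip.compress(b"Alice,  91\nBOB , 78\nalice, 95\n")   # pretend this came off disk

def extract(blob: bytes) -> list[str]:
    """Decompress and split into records."""
    return gzip.decompress(blob).decode("utf-8").splitlines()

def transform(lines: list[str]) -> list[tuple[str, int]]:
    """Tokenize each record, strip stray whitespace, and normalize names to lowercase."""
    rows = []
    for line in lines:
        name, score = (token.strip() for token in line.split(","))
        rows.append((name.lower(), int(score)))
    return rows

def load(rows: list[tuple[str, int]]) -> dict[str, int]:
    """Load into a simple keyed store, keeping the best score per name."""
    table: dict[str, int] = {}
    for name, score in rows:
        table[name] = max(score, table.get(name, 0))
    return table

print(load(transform(extract(raw))))   # {'alice': 95, 'bob': 78}
```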


We believe computers should be more intelligent! It's that simple. Who has not struggled to get their computer to do what they want, due to confusing menus, obscure applications, or cryptic programming languages? Deep learning models are wonderful tools here, giving every user the ability to generate the programs they need to get the job done. The industry has only begun to scratch the surface of letting users express their desired result while the computer figures out how to accomplish the task.


A computer with an interface that helps a user explore new tips and tricks is an example of an Intelligent User Interface (IUI). IUI software has two elements that the computer runs to predict a user's intent and then turn their idea into reality with a generated program:

  • A deep learning model interpreting a user’s intent and predicting the correct actions to take.

  • Generated programs, application programming interface calls, and scripts that perform bookkeeping, arithmetic, memory stores, screen updates, network connections, and other traditional computing tasks to calculate the desired answer.


As vendors make these capabilities available, everyone with a computer will be able to ask their computer to solve problems for them. It may be a task as simple as adding up a list of numbers or a complicated research query such as identifying which US president presided over the greatest increase in median wages. When you ask your computer to conduct such a task and expect a correct answer, the principled approach is to have the deep learning model predict exactly what you asked for, attempt to accomplish it, and test whether the result is right. In this scenario, the CPU needs to do its job: arithmetic, file operations, program compiling, or code interpretation. These two parts of your computer work together to give you the answer you need, with the deep learning model directing the CPU to run various small programs that produce the desired result.
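
One way to picture that division of labor is the hypothetical sketch below; `predict_intent` is a stub standing in for a real deep learning model, and the generated one-line program is invented for the example.

```python
# Hypothetical sketch of an Intelligent User Interface loop: a model interprets
# intent and emits a small program; the CPU compiles, runs, and checks it.
# The "model" here is a hard-coded stub, not a real deep learning model.

def predict_intent(request: str) -> str:
    """Stand-in for a deep learning model: maps a request to generated Python source."""
    if "add up" in request:
        return "result = sum(numbers)"
    raise NotImplementedError("this toy model only understands one request")

def run_generated_program(source: str, numbers: list[float]) -> float:
    """The CPU-side work: compile and execute the generated program."""
    scope = {"numbers": numbers}
    exec(compile(source, "<generated>", "exec"), scope)   # traditional sequential compute
    return scope["result"]

def verify(result: float, numbers: list[float]) -> bool:
    """Test whether the generated program produced the right answer."""
    return result == sum(numbers)

numbers = [3.5, 10.0, 6.5]
program = predict_intent("add up this list of numbers")   # the model predicts intent
answer = run_generated_program(program, numbers)          # the CPU does the arithmetic
print(answer, verify(answer, numbers))                    # 20.0 True
```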


The History of Performance

The productivity promised by deep learning models to democratize programming can be unlocked with high-performance, general-purpose CPU cores like AheadComputing's. Generated programs require higher per-core performance than ever before, yet traditional CPU core companies are slowing down their delivery of performance gains. Traditional CPU manufacturers are attempting to mitigate this deficiency with increases in core count that are useful only to engineers running very compute-intensive physical simulations or to hosts running many virtual machines pretending to be dual-core or quad-core systems. The increasing number of cores in traditional CPUs will not help your daily computing tasks. Few applications can use all those cores today, and the situation is getting worse: the applications that can are getting less and less benefit from simply adding cores with each successive generation. This is due to two fundamental laws of computer architecture:

  • Workload scalability, because no multi-threaded workload scales perfectly on a multi-processor system.

  • Amdahl’s Law, which states that the performance improvement gained by optimizing the multi-threaded part of a program is limited by the fraction of time spent executing that portion (a small worked example follows this list).
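
Here is that worked example of Amdahl’s Law in Python; the 80% parallel fraction is a hypothetical, optimistic assumption chosen only to show the shape of the curve.

```python
# Amdahl's Law: speedup(N) = 1 / ((1 - p) + p / N), where p is the fraction of
# runtime that parallelizes and N is the core count. p here is hypothetical.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

p = 0.80  # assume 80% of the runtime parallelizes perfectly (optimistic)
for cores in (1, 2, 4, 8, 16, 64):
    print(f"{cores:>3} cores -> {amdahl_speedup(p, cores):.2f}x")

# Output:
#   1 cores -> 1.00x
#   2 cores -> 1.67x
#   4 cores -> 2.50x
#   8 cores -> 3.33x
#  16 cores -> 4.00x
#  64 cores -> 4.71x
# No matter how many cores are added, the speedup can never exceed 1 / (1 - p) = 5x.
```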


We will do a deeper dive into those two topics in an upcoming blog post. For now, though, let's accept that these barriers make it very difficult for traditional CPU designs with high core counts to increase the performance of your favorite application.


Our company mission at AheadComputing, Inc. is simple:

Breaking the boundaries of computing performance

AheadComputing will demonstrate leadership in CPU performance and performance per watt in a very short timeframe, and will then start building a second generation of products that shows our commitment to a roadmap with large generation-over-generation gains in performance. Additionally, we feel that building our technology on RISC-V is an advantage, as it is well positioned to take market share from the current performance leaders, such as x86 and ARM.


The Future of Performance

Finally, we must get the Iron Law of Performance working for our applications again. The way to do that is to rethink how we deliver performance! If the performance and efficiency gains of the multi-core scaling era are slowing down, then it's time for CPU designers to find a different way to use the additional gates that new process technologies provide. CPU designers must look toward IPC. That means increasing the capability of each core rather than increasing the number of cores. If we do this intelligently, AheadComputing will provide performance improvements regardless of workload parallelism!


We are entering a new era of mass-market machine learning and deep learning models, which means computer usage is changing rapidly. We are going to make it much faster for computer users and developers to generate computer programs. Soon, the computer will no longer need the user to discover what it can do, because the computer will be able to explain its own capabilities. That capability comes down to two pieces: a deep learning model, and a collection of bookkeeping tasks.


Deep learning models are highly parallel workloads, featuring a huge number of independent operations. They are very suitable for execution on architectures with a huge number of available cores, like a GPU.


The bookkeeping tasks, on the other hand, are not highly parallel. It is difficult even for expert humans to write general-purpose, parallel code, and it is beyond the capability of current tools to create highly parallel, performant, correct programs except in a few limited problem domains like machine learning.
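
To see the contrast, here is a minimal sketch using toy Python loops; the specific computations are invented for illustration, and only the dependency structure matters.

```python
# The highly parallel kind of work: every element is independent, so it maps
# well onto a machine with many cores (or a GPU).
pixels = [p % 256 for p in range(1_000_000)]
brightened = [min(p + 16, 255) for p in pixels]   # each element computed independently

# The bookkeeping kind of work: each step consumes the state produced by the
# previous one, so it runs as a single sequential chain. Adding cores does not
# shorten the chain; only a faster core does.
state = 0
for p in pixels:
    state = (state * 31 + p) % 1_000_003          # loop-carried dependency (a toy rolling hash)

print(len(brightened), state)
```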


The parallelism inherent to our traditional applications isn't going to change. But, crucially, many more people are going to be able to create an elaborate spreadsheet with the aid of IUIs, just like Graphical User Interfaces (GUIs) enabled so many more people to harness the power of a computer than Command Line Interfaces (CLIs) ever could.


Given all that, the bottleneck in the system will not be the highly parallel portion, the deep learning model. Everyone is going to be waiting on the sequential portion, because programs will be generated faster than they can be compiled and executed. Everyone will want a better computer. Everyone deserves a better computer.

