Posted by rosscomputerguy 2 days ago
But all open FPGA projects lack the I/O required for a good design: they have neither SerDes hardware nor DDR I/O cells.
If those numbers are at all right, it puts it in useful territory. Very much so for a first spin.
For a first spin it looks pretty useful overall. The only nitpick I have is that `operation` on the DSP tile should be driven from fabric instead of config (hardcoded in the bitstream); otherwise I don't see a convenient way of resetting the accumulator(?)
I know that I/O is really the second thing that sells FPGAs. I did design a basic SerDes that should just work for this first generation. I do want to do DDR I/O cells in the future.
How fast will the SerDes run, 50 MHz? It is not clear to me from the serdes_tile.dart source code. Can you share the Verilog files?
On the I/O side, getting even a basic 400 MHz oversampled SerDes into a first-gen test chip puts this way ahead of most academic open FPGA efforts.
Really looking forward to seeing the Terra family expand and how the test chips perform.
On these multi-party shuttle projects this gets simplified into a price list where they quote you a high ball-park number that covers the cost of your test chips by a wide margin. The actual cost is never disclosed, certainly not on price lists.
A mask-set maker and a chip fab create half of your product; they own that intellectual product and they won't even tell you what it cost them. They merge their product with yours, and now they co-own your product. There are only a few competing companies worldwide (and fewer every year), and they compete on all this non-disclosed stuff. Prices above all. Never believe what you read on the internet, especially in the chip-war industry.
If you want to make better chips, like the low-power Apple Silicon for example, you create your own EDA software tools to enable the innovation. Creating a new transistor like the CFET [1] means writing new physics simulation tools, for example.
The outdated, buggy 1990s-style OpenLane software, for example, limits what kind of RAM transistors you can make and the complexity of your design.
My team makes asynchronous chips, free-space optics photonics, ultra-dense two-transistor SRAM, niobium SFQ chips, and wafer-scale integrations. All require bespoke software: simulation tools, netlist-rewriting tools, cross-reticle stepper exposure software (a software change in a $400 million machine), etc. Making hardware with near-atomic-size structures is mostly a software job. Hardware is software crystallized early, as Alan Kay quips.
[1] https://www.imec-int.com/en/articles/imec-puts-complementary...
The problem is you can make test chips like Aegis for around $10 each (depending on the yield, i.e. how many of the first 1000 chips actually work), but they are just that: test chips.
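The per-chip figure depends directly on yield; a minimal sketch of that amortization (the $10,000 run cost and 1000-chip batch are hypothetical numbers for illustration, not figures from the thread):

```python
def cost_per_working_chip(run_cost, chips_per_run, yield_fraction):
    """Amortize one shuttle run's cost over only the chips that work."""
    return run_cost / (chips_per_run * yield_fraction)

# Hypothetical: a $10,000 run producing 1000 test chips
print(cost_per_working_chip(10_000, 1000, 1.0))  # 10.0 per chip at 100% yield
print(cost_per_working_chip(10_000, 1000, 0.5))  # 20.0 per chip at 50% yield
```

Halving the yield doubles the effective cost per working chip, which is why "around $10" carries the yield caveat.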
In the case of Morphle Logic we make wafer-scale integrations (WSI) with 10 billion transistors at 180 nm for $750. That yields around 300 million 'gates'; the largest commercial FPGAs barely get to 3 million. So our Morphle Logic WSI is the largest and fastest (up to 12 GHz) FPGA you could get, if we can find a few hundred buyers who want to pay up front (crowdfunding). Please email me if you are interested in such an enormous, fast FPGA.
I'll buy an Aegis FPGA test chip just to find out how hard it is to test a test chip.
Good luck RossComputerGuy, I hope you get working chips back. The same fab and supplier lost our first taped-out chips in the mail... and then they went bankrupt.
I struggled a bit to understand the explanation on github, but eventually got to something that made sense. It would have helped me if it said up front that
- 0, 1, N and Y pass the input signal on (works like a | or - in the input direction)
- when a circuit has both a 0 and a 1 output value, the output becomes 0 (which is why 11 is an AND and not an OR)
Hopefully that's correctly understood? If so, maybe consider updating the explanation for the next person.
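To make my reading concrete, here's a toy Python model of those two rules (the function names and the drive-list representation are mine, not from the repo, and it only holds if I've understood the docs correctly):

```python
def resolve(drives):
    """Conflict rule as I understand it: if a circuit is driven with
    both a 0 and a 1, the 0 wins and the circuit reads as 0."""
    return 0 if (0 in drives and 1 in drives) else drives[0]

def cell_11(a, b):
    """A '11' cell: each input drives the shared circuit with its own
    value, so with the 0-wins rule the result is AND, not OR."""
    return resolve([a, b])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", cell_11(a, b))  # matches the AND truth table
```

With a 1-wins rule instead, the same wiring would be an OR, which is why the conflict-resolution direction matters so much in the explanation.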
Also, a question: Does a 0 and 1 on the same circuit consume more power than two 0s or two 1s due to the conflicting values? Or is it solved with transistors at the cost of propagation delay? Or something else?
We made seven different implementations of Morphle Logic, some of which are lower power, use fewer transistors, do asynchronous logic in different ways, or are based on superconducting Josephson junctions instead of transistors.
In this particular case the two tokens probably consume the same amount of power regardless of their value, but only measurements will tell.
Morphle Logic WSI has over 47,169,811 yellow cells. You could say that a single yellow Morphle Logic cell is more complex than ten Versal cells, but it's an apples-and-oranges comparison. However you count it, the $500 Morphle Logic WSI (cost price) has 10 billion transistors, while the AMD Versal Premium costs over $100,000 and is effectively smaller in terms of gates, LUTs, or cells even though it has 138 billion transistors.
If I made the Morphle Logic WSI in 2 nm TSMC, it would have more than 52 trillion transistors [1], at least 245,283,018,867 yellow cells, and cost over $22,500. You could easily emulate several AMD Versal Premium VP1902 FPGAs on the wafer.
I'll also note that it has a ton of SRAM onboard which doesn't shrink well, so I'm not convinced just by that extrapolation that you could eclipse it with a simple lithography shrink. Unless you really meant several per wafer, which doesn't really feel like a hard target...
Today the manufacturing process could be optimized better than 25 years ago, so some logic circuits much simpler than a 64-bit CPU (the earlier ones were 32-bit CPUs for integers, but they had 64-bit/80-bit FPUs working at full speed), i.e. with far fewer gate delays per pipeline stage, might be able to reach 12 GHz.
However, something like a 64-bit ALU will certainly not reach 12 GHz. Even a 32-bit ALU is very unlikely to reach 12 GHz. Simple things, like shift registers and Galois-field counters, might reach such speeds, or even higher.
The next CMOS process generation, i.e. 130 nm, already allows making complex processors with more than half of the maximum clock frequency of the fastest processors of today. It also allows making analog amplifiers and mixers for the 5 GHz WiFi frequency bands.
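A back-of-the-envelope check of why a wide ALU at 12 GHz is implausible at such a node (the 0.36 ps-per-nm FO4 delay is a textbook rule of thumb and the "10+ FO4 per ALU stage" figure is a rough estimate, both my assumptions, not numbers from the thread):

```python
period_ps = 1e12 / 12e9   # one 12 GHz clock period, ~83.3 ps
fo4_ps = 0.36 * 130       # rule-of-thumb FO4 inverter delay at 130 nm, ~47 ps
fo4_per_cycle = period_ps / fo4_ps
print(round(fo4_per_cycle, 1))  # ~1.8 FO4 delays per cycle
# A 32/64-bit ALU typically needs on the order of 10+ FO4 per stage,
# so only trivially short logic paths (shifters, LFSR-style counters)
# could fit in a 12 GHz cycle at this node.
```

Under these assumptions, fewer than two FO4 delays fit in a cycle, consistent with the claim that only very simple structures reach such speeds.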
At 110 nm I measured a transistor switching the next transistor on its output. I can prove it; can you disprove it?
A consistent 12 GHz signal cascade was (repeatedly) tested and confirmed on a 28 nm asynchronous chip [1].
Why would it be impossible? [2].
We measured 800 GHz and terahertz clocks on niobium superconducting Josephson junctions [3,4,5].
[1] https://byrdsight.com/asynchronous-technology-has-its-time-f... (see also the slides in the video talk)
[2] "If an elderly but distinguished scientist says that something is possible, he is almost certainly right; but if he says that it is impossible, he is very probably wrong." - Arthur C. Clarke
[3] Ivan Sutherland keynote: Single Flux Quantum (SFQ) Digital Electronics, digital circuits totally distinct from Quan https://www.youtube.com/watch?v=KMVV3ErGSVY
[4] https://www.researchgate.net/profile/Jerome-Pety/publication...
[5] https://scholar.google.com/scholar?hl=en&as_sdt=0,5&qsp=3&q=...
It's very common to X-ray the dies, especially for debugging. Also common is to etch a die layer by layer, take photos, and rebuild the circuit schematic, mainly for reverse engineering, but I've seen companies do it to their own dies too.
Things get more blurry at the board level, the combinations of suppliers and service providers are endless.