@jesse_m assembly is assembly, buuut the assemblers vary. The docs will be specific to the assembler you're going to use.
https://montcs.bloomu.edu/Information/LowLevel/Assembly/assembly-tutorial.html this has a zip version
@jesse_m what architecture are you building for? I'm just curious because a lot of people have told me there's no point in x86/x64 assembly development unless you're making an OS or a driver, but I've got this theory. Let's say you're making a backend for a bank or a government, something that requires stupid high security. With assembly you can make your backend AS the OS. Just don't compile what you don't need. Most security vulnerabilities occur at the OS level, so a pretty good solution is: don't use any OS at all. What do you think?
@skanman This is for good ol' PowerPC bootloader stuff.
Yeah I'm a big fan of boiling high assurance components down to the bare minimum. However, would this backend you're talking about be handling network traffic? You've basically got an OS when you start adding in a scheduler, memory management, drivers, etc. I feel like super loop stuff gets complex fast. At that point you might as well go with an RTOS or something and invest the time in setting up isolation.
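For reference, here is a minimal sketch of the "super loop" shape mentioned above, simulated in Python rather than on bare metal. The task names (`poll_network`, `process_work`), the `run_super_loop` helper, and the `state` dictionary are all made up for illustration. On real hardware the outer loop would run forever and the tasks would touch device registers.

```python
from collections import deque

def poll_network(state):
    # Hypothetical task: move one pending packet, if any, into the work queue.
    if state["rx"]:
        state["work"].append(state["rx"].popleft())

def process_work(state):
    # Hypothetical task: handle one queued item per pass so no task hogs the loop.
    if state["work"]:
        state["done"].append(state["work"].popleft().upper())

def run_super_loop(state, tasks, passes):
    # A super loop has no scheduler: every task runs to completion,
    # in a fixed order, on every pass. Bare metal would use `while True`.
    for _ in range(passes):
        for task in tasks:
            task(state)
    return state

state = {"rx": deque(["ping", "get /balance"]), "work": deque(), "done": []}
run_super_loop(state, [poll_network, process_work], passes=3)
```

Every new responsibility (timeouts, retries, priorities) has to be threaded through this one loop by hand, which is roughly where the complexity concern comes from.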
@jesse_m it would be handling network traffic and memory management, but it may not necessarily need real drivers. If it's written with exactly one hardware type in mind, you can drop most of that, because 99% of what the OS does is there to be universally compatible. Even at the network level, keeping it bare bones eliminates so much: no need for handling subnets, most workstation boards come with Intel PROSet Ethernet, no display drivers, no audio drivers, hell, you don't even need terminal access if it's done right. It would be just a simple data appliance. And even if you want to start thinking about compatibility, you only have to modify the basic memory address sets for a handful of components. So let's say you wanted to make a cluster out of old phones. Qualcomm has HDKs readily available, and you don't care about the touch screen, display, Wi-Fi, GSM, or haptics, just the processor, memory, and network. From there, a server made of 8 used Chinese phones, each with 8 cores, 8 GB RAM, and 128 GB storage at 5 watts per unit, works out to 64 cores, 64 GB RAM, and about 1 TB of storage, consuming 40 watts, with battery backup built in. The payoff seems worth it. Old Chinese Qualcomm phones cost next to nothing: $60.00 per replacement board. That's $480.00 for a 40-watt number-crunching monster. To turn around and port the software to something Intel powered would only require modifying RAM addresses and storage addresses. The main caveat is that once the hardware is configured, that's the only job that hardware can do. But again, if you have 2 boards programmed for DB, 2 for requests, 2 for storage distribution, and 2 for processing delegation, that's a fully redundant server that couldn't be hacked even by the developer, because there's no IO for it once it's connected and running. And almost 99% of the hardware resources would be available with no OS. I'm gonna task this to one of my teams to start development.
I'll open a new git repo and post you the results as they come in, in case you wanna play with it. But seriously, I can't imagine a more powerful/secure/stable configuration per $. Pfff, and then there's the size/cooling benefits. A data center could be 10 sqm instead of a warehouse.
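The per-unit figures quoted for the phone cluster can be totalled with a quick sketch. The constant names and the `cluster_totals` helper are made up for illustration, and the numbers are simply the ones given in the message (8 units, 8 cores, 8 GB RAM, 128 GB storage, 5 W, $60 each). Note that 8 x 128 GB of storage comes to 1024 GB.

```python
UNITS = 8
CORES_PER_UNIT = 8
RAM_GB_PER_UNIT = 8
STORAGE_GB_PER_UNIT = 128
WATTS_PER_UNIT = 5
COST_USD_PER_UNIT = 60.00

def cluster_totals(units=UNITS):
    # Multiply each per-unit figure by the number of boards in the cluster.
    return {
        "cores": units * CORES_PER_UNIT,
        "ram_gb": units * RAM_GB_PER_UNIT,
        "storage_gb": units * STORAGE_GB_PER_UNIT,
        "watts": units * WATTS_PER_UNIT,
        "cost_usd": units * COST_USD_PER_UNIT,
    }

totals = cluster_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```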
@skanman yeah that'd definitely be interesting to keep an eye on! But when you say no OS, do you mean just not a full-blown distribution? You'd still have to use something, right?
@jesse_m well, let's first consider what an OS does. It's a translator: it lets other software be written in a simpler language, then passes that on to various combinations of hardware in a more complex language. But if the application already understands the hardware configuration and rules, then the OS layer may not be necessary. This might seem overly complex, but that's only because of the bloat of options available for various IO devices. If most of them will never be used, obviously there's no need to make them accessible. Think of the BIOS as a server that serves hardware availability. It already automatically looks for code to execute in certain places, typically bootloaders. But instead of loading a kernel, which begins loading translation layers for the OS, it can load the application instead. If the application already knows where the hardware addresses for its job are, it has no need for translation. If you wanna throw all caution to the wind and risk everything: theoretically, if the application is small enough, and it really does understand its target hardware addresses, you could probably flash it into the BIOS and start the service at machine POST time. I've rewritten a few BIOSes so I could install crap like OS X where it didn't belong. I would think installing apps directly into the BIOS would be a later phase of the same project. Oh, my old PowerPCs serve rolling hashes similar to RSA clocks, but they also delete and replace an offset bit that rolls with the hash. This prevents x86/x64-based decryption because the PowerPC is behaving like an x126 architecture, which is really x128 but with 2 bits shredded at positions set by its own internal clock. In order to know those positions you have to turn it off and restart it, which conveniently resets the clock to a new position, so even if you find the old positions, the keys that roll outside the machine won't match again until they are reset in synchrony with the PowerPC.
Something they don't teach you in school about security algorithms: the best ones contain flaws intentionally injected at specified random positions that only work on occasion. Mapping the occasions when they work then yields a key, and it's the best key ever, because unless the key is used at the precise time interval it was created, it's not even a key for the same architecture anymore. PowerPC does this with pretty low effort, pretending it's different architectures. Sure, other architectures can emulate other architectures using things like QEMU, but they can't really change architecture on the fly quickly. I've got 2 POWER8 machines doing this pretty fast. But eventually my dream is to move them to the Cell architecture, so the cryptography can just scale with the times. No more coffee for me 😂
@jesse_m oh and PowerPC screams running native code. Worth it. Unrelated but related: I've got a Pentium Pro where I was able to forward the NT kernel commands directly to the extended instruction set. That machine was 200 MHz, and as old as it is, it boots Windows 2000 Server to the desktop in 1.5 seconds with all services running off an old-school IDE HDD. I know that's pretty old crap, but the performance was insane. I did something similar once with a Power Mac G5 that I got to run BSD natively, and good God, it's still in service today. Forcing hardware to eliminate backwards compatibility does magic. If you need help loading anything on your PowerPC lmk.
@skanman Oh wow yeah that sounds wild. What are you using that Power Mac for?
Thanks, yeah, I have the machine booting. There were just some little things I was digging into to make sure I understood what was going on, and I had to keep flipping back and forth to the programmer's manual. There were a couple of instructions I kept forgetting the meaning of.
@jesse_m oh the Power Macs do something similar. They create keys with missing bits synchronously, but with different missing bits, then they subtract the difference from each other, creating a negative key. This key is obviously smaller, so it repeats the process till the key grows to the size of its parent keys, giving you 2 broken keys in different locations on earth that together make up 1 good key. Then I took a box cutter to the bus of the main board for all user IO like USB, display, serial, etc. Soldered the CPU and BIOS and destroyed all bus paths for write voltage to the BIOS. Soldered a SCSI controller to the board and 2 SSDs to the SCSI card next to the cache. These 2 Power Mac G5s create security keys so strong and so large that the keys literally never end, so they stream out continuously in chunks for all eternity. I can't think of better security for the amount of money they cost. If the client wants terabit strength, even petabit, all they gotta do is wait longer for the key chunks. A strange but great byproduct of this is that, due to the streaming nature of the broken keys originating from multiple servers, even if you packet sniff both of the connections, you still don't know where the client started and stopped listening in the stream to make their key. It's also pretty awesome that either server can validate a key and return a simple 1 or 0 for good or bad almost instantly. Thanks Motorola and Apple 👍
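The idea of two incomplete keys in different places that only form a usable key together resembles secret splitting. The sketch below is not the scheme described above. It swaps in a standard two-share XOR split purely to illustrate the "neither half is useful alone" property. The function names `split_key` and `combine_shares` are made up for illustration.

```python
import secrets

def split_key(key: bytes):
    # Share A is pure randomness; share B is the key XORed with share A.
    # Either share on its own reveals nothing about the key.
    share_a = secrets.token_bytes(len(key))
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b

def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    # XORing the two shares cancels the randomness and recovers the key.
    return bytes(a ^ b for a, b in zip(share_a, share_b))

key = secrets.token_bytes(32)           # a 256-bit example key
share_a, share_b = split_key(key)       # would live on two separate machines
recovered = combine_shares(share_a, share_b)
```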
@skanman That's true, I was mainly looking at the disassembly output from objdump. It lined up pretty well with the programmer's manual. The mnemonics seem fairly consistent between the manual and the disassembly.
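As a small illustration of how a disassembler maps bits to the mnemonics in the manual, here is a sketch that decodes one PowerPC D-form instruction, `addi` (primary opcode 14), from a 32-bit word. The helper names `decode_d_form` and `disassemble_addi` are made up for illustration, only this one instruction is handled, and objdump's exact output formatting may differ.

```python
def decode_d_form(word: int):
    # D-form layout, most significant bit first:
    # 6-bit primary opcode, 5-bit RT, 5-bit RA, 16-bit signed immediate.
    opcode = (word >> 26) & 0x3F
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rt, ra, imm

def disassemble_addi(word: int) -> str:
    opcode, rt, ra, imm = decode_d_form(word)
    if opcode != 14:
        raise ValueError("not an addi instruction")
    if ra == 0:
        # With RA = 0, addi uses the literal value 0, hence the
        # simplified mnemonic "li" (load immediate).
        return f"li r{rt},{imm}"
    return f"addi r{rt},r{ra},{imm}"

print(disassemble_addi(0x38600001))
print(disassemble_addi(0x3821FFF0))
```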