#excel cpu
Text
youtube
"I designed my own 16-Bit Computer in Microsoft Excel without using Visual Basic scripts, plugins, or anything other than plain Excel. This system on a spreadsheet is based off of a custom Instruction Set Architecture that has a total of 23 instruction mnemonics and 26 opcodes.
The main design of the CPU is broken into a fetch unit, control unit, arithmetic logic unit, register file, PC unit, several multiplexers, a memory control unit, a 128KB RAM table, and a 128x128 16-color display."
@sztupy
10 notes
Text
appleiphone
Apple’s latest iPhone release has once again created a buzz in the tech world. Known for its innovation and premium quality, Apple has introduced several new features and enhancements in this iPhone series. From design upgrades to advanced performance capabilities, the new iPhone (https://pricewhiz.pk/) is making headlines. Let's dive into what makes this new iPhone stand out.
Design and Display:
The design of the new iPhone continues Apple’s legacy of combining elegance with durability. The latest model features a sleek glass and me… giving it a premium look and feel. The Super Retina XDR OLED display offers stunning visuals with improved brightness and contrast, ensuring a vibrant and immersive experience. Available in different sizes, the new iPhone caters to various user preferences, whether you prefer a compact phone or a larger display.
Processor and Performance:
At the heart of the new iPhone is the A16 Bionic chipset, Apple’s most powerful chip to date. This 6-core CPU and 5-core GPU deliver lightning-fast performance, making multitasking, gaming, and content creation smoother than ever. With its advanced machine learning capabilities, the iPhone adapts to your usage patterns, optimizing performance and enhancing overall efficiency.
Camera System:
Apple has always excelled in mobile photography, and the new iPhone takes it a step further. The upgraded 48-megapixel primary camera captures stunningly detailed photos, even in challenging lighting conditions. Low-light photography has seen significant improvements, allowing users to take clearer, sharper images at night. The iPhone also offers advanced video capabilities, including Cinematic Mode and Pro-level editing tools, making it ideal for both amateur and professional content creators.
Battery Life and Charging:
Battery life has always been a crucial factor for iPhone users, and Apple has made improvements in this area as well. The new iPhone promises all-day battery life, ensuring that you stay connected and productive without constantly worrying about recharging. Fast charging and wireless charging options m…
Software and Security:
2 notes
Note
Hmmm. 4, 12, 25 for Dantoinette?
4. If you could put this character in any other media, be it a book, a movie, anything, what would you put them in?
Okay so I know SSBU is already a fighting game so shes technically in one already. But also PUT HER IN A FIGHTING GAME I want her in the next Guilty Gear expansion pass do you hear me
12. What's a headcanon you have for this character?
Same with Larry, I dont really have any big ideas for her outside of what's generally accepted, but.....I think her and Juni get along. She sees this little girl running around with a giant axe beating up most adults, and gets nostalgic lmao
25. What was your first impression of this character? How about now?
I didnt think much of her at first, mostly because she started off as a tekken joke and I know Nothing about tekken. But now? I love watching her fight, the longer her no losing streak goes, the more nerve wracking it is, seeing when itll finally drop (yes I know shes lost in the nccts but shhhhhh its different) (also she is. Very handsome <3)
#ask#cpu kerfuffle#dani is one of the most beloved cpuk characters for good reason. shes just excellent
3 notes
Text
You could look at it like the third wife of a dying oil baron discovering his of-age son born out of wedlock.
You could look at it like a wizard conjuring forth daemons to do his bidding.
You could look at it like a high king of an empire run on the backs of slaves ruled by masters, sometimes one owning one, often times one owning many.
You could look at it like an all-encompassing, inscrutable god twisting the very landscape of the world, and calling forth simple forms of life to perform wonders and miracles upon the land that are leagues beyond its inhabitants ability to even begin to fathom.
Programming is a strange, abstract frontier where we paint the dreams of sleeping machines. Machines that think with speed of geniuses but with the comprehension of a block of wood. It is rife with metaphorical language, where we grasp at any and all words to try and foment a transferable understanding of just what the hell we are doing. We do so for ourselves to help us accomplish our work, to train newcomers to the field, and for others to know and value what we do... at least enough to still get paid.
coding got me saying shit like “target the child” “assign its class” “override its inheritance” like the third wife of a dying oil baron discovering his of-age son born out of wedlock
#Painting the dreams of sleeping machines is an excellent line#I'm going to steal that from myself to use in the future#Programming and metaphor#Goes hand in hand#Especially when programming languages are technically abstract metaphors used for machine code#Which are abstractions of CPU instructions#Which guide every assignment of 1 and 0 in every register and space in memory#And those 1s and 0s themselves are abstractions of high and low voltage levels which control the firing of transistors#Shit's complicated#This is why we use metaphors
53K notes
Text
Computer: has 24gb RAM
Excel: Monopolizing over half of that, still froze on a real big copy/paste
#tbh this time it isn't excel's fault per se#the files are simply too big#but i'm still annoyed at how good my work laptop is compared to what use i'll be putting it to#this laptop could play majima pirate game so nicely i bet.........#cpu better than 'recommended' cpu on steam... gpu better than 'minimum' if not nearly as good as 'recommended'....#and then obviously all that memory#ugh!
0 notes
Text
My mood rn

#I haven't drawn my oc in weeks I'm about to kill someone#too much work which is excellent and also terrible I hate everyone my cpu is NOISY AS FUCK apparently it broke? couldn't go to kung fu class#which is when I channel my violence and rage#customers are particularly annoying today#and I had to spend a lot of money bc I'm freezing here the heating system broke#slept too little#like yes this is the mild inconvenience accumulation thing that turns people evil#I just need someone to say 'A' at me and I'll explode
1 note
Text
I mean, fuck i would buy something thick as a brick and with like 10 ports
ok this looks ultra mega based, are you kidding me? can you imagine the bullshit i could get up to with this bad boy? fuck yes i want ten
#albeit some of those ports would be usbc#and it had better come with a decent battery and cpu#would be excellent to basically have an puny android desktop in my pocket and all it takes is a good docking station to turn it into a#passable pc
63K notes
Text
Order a new work laptop before i go on my India trip, because my existing one is from ~2018~
Company rule is it HAS to come from company stock
It turns up today and.....
its not the right laptop 🙃
Someone literally picked up the wrong one and didnt realise. And the difference is... ooo half the hard drive size and a waaay less powerful processor..... 🙃
1 note
Text
youtube
Functional 16-bit CPU built and runs in Excel, 3Hz processor includes 128KB of RAM, 16-color display, and a custom assembly language
Holy shit, this is great.
0 notes
Text
Dear, memories #2
It’s not like you wanted to care. Really, you were just trying to mind your own business. But the way that loud-mouthed buffoon was screaming at a bot who clearly couldn’t fight back—or if it could, would still lose spectacularly—was just grating on every last functioning wire of your patience
Whether you stood up for that poor, glitchy stranger because it reminded you of the walking disaster you used to be, or simply because you couldn’t bear another second of listening to that local thug bark like a malfunctioning alarm siren, well... you know exactly why you did it. Deep down. Don’t pretend otherwise
And then—oh joy—there was him. Sitting right next to you like it was the most casual thing in the galaxy. Who wouldn’t recognize that Decepticon death machine? You’d have to be spectacularly stupid not to. That iconic mask, the absurdly overcompensating fusion cannon, and the kind of looming “I-will-kill-you-in-your-sleep” vibe that makes even seasoned warbots reconsider their life choices. That wasn’t just any Decepticon. That was Tarn. From the Decepticon Justice Division. Literal walking nightmare fuel. The kind of guy who turns ‘dangerous’ into a full-blown art form
Your instincts screamed at you to back away from this dude. Slowly. Carefully. Maybe even leave a decoy behind and fake your own shutdown. You didn’t know what he wanted from this conversation—but when someone like Tarn wants something, it usually ends badly for everyone else involved. Any bot with half a CPU would know it’s never worth tangling with the DJD. And Tarn? Tarn is the kind of ‘don’t-touch’ hot stove that burns down your entire house if you so much as look at it funny
“You know” he said in that rich, carefully measured voice of his “it's rare these days to see someone stand up for someone else. I think that deserves a few drinks, on me. You wouldn’t object, would you?”
It wasn’t a question. It was a decree dressed up as a compliment
A few light taps on the table—and just like that, drinks appeared out of thin air. Not even a delay. Apparently, the staff knew who the purple guy was, because when you had ordered earlier, the wait time was somewhere in the range of “eternity plus ten.” But now? Instant service. Because of course. Gotta love that two-tiered customer experience
“To courage” he began, lifting his drink—
“I don’t want your drink”
The words sliced through the moment like a sharpened blade. Tarn froze for a nanosecond, visibly stunned, before letting out a soft laugh. It wasn’t a happy laugh. More like the laugh of someone restraining themselves from flipping the table and turning you into decorative wall art. He didn’t even roll his optics at you—though you could tell he wanted to. Badly
Obviously stubborn. Obviously defiant. He figured you already knew who he was and what he could do—and even if you didn’t, you clearly didn’t give a damn. The rebel type. The difficult kind. He’d met your kind before. The kind that never liked authority. Not back when you were in the academy, and certainly not now. Judging from how you were treating him, that hadn’t changed one bit
Cute, if you asked him. In an infuriating, problem-child sort of way
“So I take it you're not the social type—no mingling, no parties?” he asked with the kind of polished eloquence usually reserved for politicians or used-car salesmen. His voice was velvet, sure, but velvet can still suffocate you if you’re not careful. And you're not stupid. Tarn wasn’t charming—he was a walking red flag with excellent diction, and you had to remind yourself of that. Repeatedly
“I just don’t like you. Is that sufficient?”
“Fair enough... but we will be seeing each other again”
Which, frankly, sounded like both a promise and a threat. Pick your poison
With that final exchange of polite hostility, the infamous Decepticon excused himself, rose with theatrical flair, and walked away. You didn’t stop him. You didn’t even pretend to be sorry. Honestly, his disappearance felt like a personal favor. Finally—a moment of peace. Alone, just the way you liked it. Well, at least until you had to rejoin your own team, who were arguably just as annoying, just in more colorful ways
And about that “free drink” you mentioned earlier? Yeah, that was a total lie. You’d take a free drink from anyone foolish or brave enough to offer you one. Not that many bots were lining up to do so—but hey, a mech can dream
But that little farewell line—“We’ll be seeing each other again”—what was that supposed to mean, exactly? Was he planning to hunt you down across the galaxy? Surely not. From everything you’d heard, Tarn and his little Justice Division fan club weren’t your average bloodthirsty maniacs. No, they were principled bloodthirsty maniacs. And yeah, you kinda hate yourself for putting those words together in the same sentence, but here we are
They didn’t just kill for fun—they killed for reasons. Big, dramatic, morally-questionable reasons. According to them, anyway. Only a fried processor would actually buy into that sanctimonious scrap, but still—the DJD didn’t kill at random. They had a list. A purpose. Neutral bots like you? Not even on their radar. Statistically speaking, you were probably fine
Probably
And if, by some cosmic misfortune, you did end up tortured to death just because your mouth couldn’t stay in its lane... well, that’s on you. That’s the risk you take when your sarcasm has a kill switch
But surely the great and mighty Tarn wouldn’t waste time holding a grudge over a petty insult. He didn’t even know your name. You were just another snarky nobody who happened to be in the wrong place at the wrong time
Hopefully
You shook your head, as if that could dislodge the creeping anxiety, and downed your high-grade in one go. Not like overthinking ever saved anyone anyway
.
.
“So they’re all on the same team, right? According to the intel we’ve gathered..”
“Yes, Tarn. That appears to be the case. Though, we’ll be eliminating all of them soon enough, won’t we—”
“Not this time. We’ll need to interrogate them first. It wouldn’t be fair to punish those who truly didn’t know... that wouldn’t be very just, now would it?”
#transformers idw publishing#transformers x reader#tarn x reader#damus x reader#reader insert#cybertronian reader#transformers#transformers fanfiction series: dear memories
57 notes
Text
im sorry but i need to geek out somewhere and screaming into the void on tumblr is less likely to get me flayed than on twitter, especially if i get terms wrong. plus i can do a read more and yall can click into the tech talk if you want to, versus it bombarding your twitter timelines
so idk if i only liked it or if i actually put it in my queue but i saw a post that talked about a few pieces of tech that focus on user repairs and being sustainable (fairphone and frameworks laptop) and after doing some more research into what they have to offer im actually really excited that these products are finally hitting the us market and that people are moving away from the belief that super smooth streamlined glassy = the future. being able to reliably repair and keep what you have alive versus throwing the whole thing away when maybe all you needed to do is add more ram to your current laptop (something that i would do with my laptop to keep using it for a few more years if it wasnt glued shut and i wasnt at risk of cracking the screen) or swap out a fuse.
i know big corporations dont like it but i truly do believe with how much tech we use on a daily basis that the way that we are going to be more environmentally friendly is to move back to tech that we can hang onto for as long as we can and to recycle and then reuse what we cant. like with the frameworks laptop. i saw that they just partnered with coolermaster to create a case specifically so that you can reuse your motherboard, cpu, etc and make a portable workstation. you could dual wield with the laptop you just upgraded if you want to dedicate specific tasks to one or the other. they also specifically mentioned that you could screw it into the back of a monitor and create your own all in one. guys thats cool as shit??? if you had a 3d printer and some time you could even create that yourself
on top of the actual hardware part, moving to open source programs when youre able. when i update my desktop i plan on running linux. it might have a learning curve compared to windows but in terms of performance??? ive heard that it runs smoother even on older machines, that its more efficient because it isnt running stuff in the background that tracks your data and shit. now i understand that not everyone can do that because there are some programs that dont play nice with linux but for my needs at least it does everything i would need it to. and maybe a couple years down the road we do figure out how to run these programs on certain flavors of linux since its open source and people fiddle with it so much. (still looking for alternatives to like word and excel though, i use google docs since its free but i want to move away from them as much as i can too since they laid off their youtube music team (i believe?? it might have been a different branch) for trying to unionize)
if anyone knows of any other smaller companies that actually focus on sustainability and user repairability please let me know. theres certain pieces of tech that i think are now unfortunately behind a software repair paywall, things that used to be just machines and are gaining more bells and whistles like cars and refrigerators if that makes sense. but the more we push for these things to be repairable by us the consumers id hope that would change, or there would at least be options that dont need specific companies to repair them or else they blow up
158 notes
Note
Hi bois! Rank the bae ™️ Cheating at chess!
We all know that Warriors is the Actual chess player in the home. Strategy games in general are his preference, and he's nearly unbeatable. But all of the Links come with a natural competitive streak. So which of the boys resort to more nefarious methods of play in order to gain the upper hand?
Legend - He's the most likely to engage Wars in a game of chess with just the right amount of cajoling. He is, unfortunately, not very good at chess. The second one of his plans goes awry, his temper wins out over his wit. Leg holds the record for greatest number of upturned chess boards upon rage quitting.
Hyrule - Rulie is incredibly smart and good at many things, but chess is a bit out of his wheelhouse. It takes a lot of time and thought for him to play, and Wars gets a bit impatient waiting for him to make his moves. His method of 'cheating' includes making illegal moves and hoping that Warriors takes pity on him. Perhaps a less mentally-demanding game would be better for them to play, like Chinese checkers.
Four - Although he doesn't like to play often, Four does hold his own pretty well. He plays as though he's actually studied the game, too, unlike the other denizens of the townhouse. He's the one Warriors enjoys playing against the most, and he's lost a game or two to Four in their time playing together.
Twilight - He tried to play once or twice, but he finds the game incredibly frustrating. (Sacred Grove Guardian statue puzzle? Anyone?? Anyone????) He also couldn't actually cheat even if his life depended on it. Twi is much better suited for lawn games like horseshoes and cornhole. Get that boy some sunshine and a beer and he's perfectly happy.
Sky - This boy does NOT find chess to be even mildly engaging. Wars has gone so far as to take him to the coffee shop or the park on a nice day to try and keep him interested, but without fail, Sky... will start to yawn..... and eventually....... drift........off.........
Wild - Wild's blissfully unaware, airheaded nature belies how skilled he is in chess. Wild will sit cross-legged on the opposite end of the board with a big bowl of snacks or something soft and huggable in his lap, chatting away animatedly while he and Wars both systematically clear the board of the other's pieces. Warriors swears he sees flashes of Champion in Wild's eyes now and then during a particularly intense game. Is it really cheating if it's your former self who knows the game?
Wind - He's played chess against the CPU for so long that he thinks he knows what to do, but playing against AI for years means that his strategies are pretty rote. Wind is one to challenge Warriors over and over again and lose repeatedly, to his mounting frustration. His 'cheats' are debatably-legal moves that set off a ten-minute argument between him and Wars until he either folds or walks away in a huff, forfeiting the game.
Time - He's a decent player, but no match for Warriors's cunning. But cheating is beneath him. He will lose the old-fashioned way, thank you very much.
And, for our oft-requested bonus round:
Malon - Sure, she knows how to play, but it's not really her preference. She and Wars might start a game together but as their mugs of tea are filled and refilled, their time spent together is more focused on chatting and hot goss than it is playing chess.
Shadow - Like Four, Shadow is also an excellent contender when it comes to chess. His strategies, however, are harder for Warriors to outsmart. Shadow knows the rules of the game but none of the strategies. One part his intelligent, conniving nature plus one part beginner's luck makes him a formidable opponent, no cheating necessary here.
Dark - Eats the pieces. While Warriors is watching. With an audible, unsettling crunch. The cost of the dental work is worth it for the look on pretty boy's face.
20 notes
Text
Stepping Backwards a Bit (or 24)
I was looking for a simpler project. My recent 68030 work has been challenging and really pushing the limits of what I can do. I wanted something I could work on, but perhaps where someone else has already worked out the hardest parts.
I find laying out PCBs to be rather relaxing. It's one of those repetitive, almost meditative tasks, like needlepoint or whittling. The kind of hobby where I can turn on some music or a comfortable old TV show, zone out for a few hours, and wake up to this new thing that I created.
Debugging, however, is very mentally taxing, and the design work required to have a functional schematic to create a PCB for is an active whole-mind process. So what I really needed was an existing project I could design a board for.
Enter [Grant Searle]. If you're not familiar with [Grant Searle], he has excellent designs for breadboard computers with a very minimal parts count. I studied his minimal Z80 design when I was first starting to build my own computers and learned a lot from it. I highly recommend his work for anyone who is interested in learning how to build their own computer but doesn't know where to start.
I was recently given a Rockwell 6502 CPU pulled from a dead LED marquee. I've never actually worked with the 6502, so this seemed like a good time to try building Grant's 8-chip (or 7-chip) 6502 computer.
A few hours later, I had a PCB design completed, gerbers generated, and an order placed. Less than $5 for 5 boards, including shipping. A couple weeks later they arrived in the mail.


I did end up making a few modifications to [Grant]'s design. Instead of a clock circuit made from a discrete crystal and a couple inverter gates, I used a TTL oscillator because I've always found them to be more reliable. I also added support for an FTDI USB Serial adapter chip so that the board can be used with a modern computer as a terminal. And finally, since a PCB is much harder to add new components to relative to a solderless breadboard, I added an expansion header. All of it wrapped up in a compact PCB with lots of helpful silkscreen marking.

I realized after I had ordered the PCBs that the 16kB ROM chips [Grant] used are no longer manufactured or readily available. I have plenty of 8kB EEPROM chips on hand however. Thankfully the OSI BASIC interpreter [Grant] ported to this design fits within 8kB, so I was able to make a few adjustments and re-assemble it to work with the ROM chips I have on hand.
After a small glitch with my EEPROM programmer, it works!

It's quite a change going from my 33MHz+ 68030 to this tiny 6502 running at just under 2MHz. The BASIC text-based Mandelbrot renderer that completes in seconds on my 68030 takes four and a half minutes on the 6502. Not bad at all, considering my bus-impaired 68000 build takes 9 minutes to do the same.
This was a fun little project. It was a nice little break from some of the more difficult projects I've been working on. I have shared the project on GitHub for anyone who might want to take a look.
I hope to have this project with me this weekend, June 14-16, 2024 at Vintage Computer Festival Southwest. I'll be at table 207 in the Tandy Assemble hall, just across the street from the main exhibit hall.
30 notes
Text
Master CUDA: For Machine Learning Engineers
New Post has been published on https://thedigitalinsider.com/master-cuda-for-machine-learning-engineers/
CUDA for Machine Learning: Practical Applications
Structure of a CUDA C/C++ application, where the host (CPU) code manages the execution of parallel code on the device (GPU).
Now that we’ve covered the basics, let’s explore how CUDA can be applied to common machine learning tasks.
Matrix Multiplication
Matrix multiplication is a fundamental operation in many machine learning algorithms, particularly in neural networks. CUDA can significantly accelerate this operation. Here’s a simple implementation:
__global__ void matrixMulKernel(float *A, float *B, float *C, int N) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    float sum = 0.0f;
    if (row < N && col < N) {
        // Dot product of one row of A with one column of B
        for (int i = 0; i < N; i++) {
            sum += A[row * N + i] * B[i * N + col];
        }
        C[row * N + col] = sum;
    }
}

// Host function to set up and launch the kernel
// (A, B, and C must already point to device memory)
void matrixMul(float *A, float *B, float *C, int N) {
    dim3 threadsPerBlock(16, 16);
    dim3 numBlocks((N + threadsPerBlock.x - 1) / threadsPerBlock.x,
                   (N + threadsPerBlock.y - 1) / threadsPerBlock.y);
    matrixMulKernel<<<numBlocks, threadsPerBlock>>>(A, B, C, N);
}
This implementation divides the output matrix into blocks, with each thread computing one element of the result. While this basic version is typically already faster than a naive single-threaded CPU implementation for large matrices, there’s room for optimization using shared memory and other techniques.
Convolution Operations
Convolutional Neural Networks (CNNs) rely heavily on convolution operations. CUDA can dramatically speed up these computations. Here’s a simplified 2D convolution kernel:
__global__ void convolution2DKernel(float *input, float *kernel, float *output,
                                    int inputWidth, int inputHeight,
                                    int kernelWidth, int kernelHeight) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < inputWidth && y < inputHeight) {
        float sum = 0.0f;
        for (int ky = 0; ky < kernelHeight; ky++) {
            for (int kx = 0; kx < kernelWidth; kx++) {
                int inputX = x + kx - kernelWidth / 2;
                int inputY = y + ky - kernelHeight / 2;
                // Taps that fall outside the image contribute zero
                if (inputX >= 0 && inputX < inputWidth && inputY >= 0 && inputY < inputHeight) {
                    sum += input[inputY * inputWidth + inputX] * kernel[ky * kernelWidth + kx];
                }
            }
        }
        output[y * inputWidth + x] = sum;
    }
}
This kernel performs a 2D convolution, with each thread computing one output pixel. In practice, more sophisticated implementations would use shared memory to reduce global memory accesses and optimize for various kernel sizes.
Stochastic Gradient Descent (SGD)
SGD is a cornerstone optimization algorithm in machine learning. CUDA can parallelize the computation of gradients across multiple data points. Here’s a simplified example for linear regression:
__global__ void sgdKernel(float *X, float *y, float *weights, float learningRate, int n, int d) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float prediction = 0.0f;
        for (int j = 0; j < d; j++) {
            prediction += X[i * d + j] * weights[j];
        }
        float error = prediction - y[i];
        for (int j = 0; j < d; j++) {
            atomicAdd(&weights[j], -learningRate * error * X[i * d + j]);
        }
    }
}

void sgd(float *X, float *y, float *weights, float learningRate, int n, int d, int iterations) {
    int threadsPerBlock = 256;
    int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    for (int iter = 0; iter < iterations; iter++) {
        sgdKernel<<<numBlocks, threadsPerBlock>>>(X, y, weights, learningRate, n, d);
    }
}
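This implementation updates the weights in parallel, with each thread handling one data point. The atomicAdd function keeps concurrent additions to the same weight from being lost, but it does not stop one thread from reading weights[j] while another thread is updating it, so the result is an asynchronous (Hogwild-style) variant of SGD whose output can differ slightly from run to run.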
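One alternative to the per-thread updates above is to accumulate a gradient first and apply it in a separate step. The sketch below does that as plain full-batch gradient descent, which is a swapped-in technique and not the article's SGD. The names gradKernel, applyKernel, and fitLinear are hypothetical, and the weights are assumed to start at zero.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

// Accumulate the averaged gradient (1/n) * sum_i error_i * x_i into grad[0..d).
__global__ void gradKernel(const float *X, const float *y, const float *weights,
                           float *grad, int n, int d) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float prediction = 0.0f;
        for (int j = 0; j < d; j++) prediction += X[i * d + j] * weights[j];
        float error = prediction - y[i];
        for (int j = 0; j < d; j++) atomicAdd(&grad[j], error * X[i * d + j] / n);
    }
}

// Apply one gradient step; weights are only written here, never in gradKernel.
__global__ void applyKernel(float *weights, const float *grad, float learningRate, int d) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < d) weights[j] -= learningRate * grad[j];
}

// Hypothetical host wrapper: X is n x d row-major, y has n entries.
std::vector<float> fitLinear(const std::vector<float> &X, const std::vector<float> &y,
                             int n, int d, float learningRate, int iterations) {
    assert((int)X.size() == n * d && (int)y.size() == n);
    float *dX, *dY, *dW, *dGrad;
    cudaMalloc(&dX, n * d * sizeof(float));
    cudaMalloc(&dY, n * sizeof(float));
    cudaMalloc(&dW, d * sizeof(float));
    cudaMalloc(&dGrad, d * sizeof(float));
    cudaMemcpy(dX, X.data(), n * d * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dY, y.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dW, 0, d * sizeof(float));  // assumption: start from zero weights

    int threads = 256;
    for (int iter = 0; iter < iterations; iter++) {
        cudaMemset(dGrad, 0, d * sizeof(float));
        gradKernel<<<(n + threads - 1) / threads, threads>>>(dX, dY, dW, dGrad, n, d);
        applyKernel<<<(d + threads - 1) / threads, threads>>>(dW, dGrad, learningRate, d);
    }

    std::vector<float> w(d);
    cudaMemcpy(w.data(), dW, d * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dX); cudaFree(dY); cudaFree(dW); cudaFree(dGrad);
    return w;
}
```

Because all three operations in the loop are issued to the default stream, each gradient pass reads a fixed set of weights. The trade-off is one weight update per pass over the data instead of one per data point.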
Optimizing CUDA for Machine Learning
While the above examples demonstrate the basics of using CUDA for machine learning tasks, there are several optimization techniques that can further enhance performance:
Coalesced Memory Access
GPUs achieve peak performance when threads in a warp access contiguous memory locations. Ensure your data structures and access patterns promote coalesced memory access.
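As a rough illustration of the access-pattern point, the sketch below contrasts two copy kernels that produce the same output but touch memory differently. The names copyCoalesced, copyStrided, and runCopy are hypothetical, and the stride of 33 is an arbitrary choice that assumes the array length shares no factor with it.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <vector>

// Coalesced: consecutive threads in a warp touch consecutive elements.
__global__ void copyCoalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Strided: consecutive threads touch elements `stride` apart, so one warp's
// accesses are spread over many memory segments.
__global__ void copyStrided(const float *in, float *out, int n, int stride) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t < n) {
        // Visits every index exactly once when gcd(stride, n) == 1.
        int i = (int)(((long long)t * stride) % n);
        out[i] = in[i];
    }
}

// Hypothetical host helper: copies hostIn through the GPU with either kernel.
std::vector<float> runCopy(const std::vector<float> &hostIn, bool strided) {
    int n = static_cast<int>(hostIn.size());
    assert(n > 0);
    size_t bytes = n * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, hostIn.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    if (strided) {
        copyStrided<<<blocks, threads>>>(dIn, dOut, n, 33);  // assumes gcd(33, n) == 1
    } else {
        copyCoalesced<<<blocks, threads>>>(dIn, dOut, n);
    }

    std::vector<float> hostOut(n);
    cudaMemcpy(hostOut.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return hostOut;
}
```

Timing the two kernels on a large array with a profiler would be the way to see how much the strided pattern costs on a particular GPU.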
Shared Memory Usage
Shared memory is much faster than global memory. Use it to cache frequently accessed data within a thread block.
Understanding the memory hierarchy with CUDA
This diagram illustrates the architecture of a multi-processor system with shared memory. Each processor has its own cache, allowing for fast access to frequently used data. The processors communicate via a shared bus, which connects them to a larger shared memory space.
For example, in matrix multiplication:
#define TILE_SIZE 16  // assumed value; must equal blockDim.x and blockDim.y at launch

__global__ void matrixMulSharedKernel(float *A, float *B, float *C, int N) {
    __shared__ float sharedA[TILE_SIZE][TILE_SIZE];
    __shared__ float sharedB[TILE_SIZE][TILE_SIZE];
    int bx = blockIdx.x;
    int by = blockIdx.y;
    int tx = threadIdx.x;
    int ty = threadIdx.y;
    int row = by * TILE_SIZE + ty;
    int col = bx * TILE_SIZE + tx;
    float sum = 0.0f;
    for (int tile = 0; tile < (N + TILE_SIZE - 1) / TILE_SIZE; tile++) {
        // Each thread loads one element of the A tile and one of the B tile,
        // padding with zeros past the matrix edge
        if (row < N && tile * TILE_SIZE + tx < N)
            sharedA[ty][tx] = A[row * N + tile * TILE_SIZE + tx];
        else
            sharedA[ty][tx] = 0.0f;
        if (col < N && tile * TILE_SIZE + ty < N)
            sharedB[ty][tx] = B[(tile * TILE_SIZE + ty) * N + col];
        else
            sharedB[ty][tx] = 0.0f;
        __syncthreads();  // wait until the whole tile is loaded
        for (int k = 0; k < TILE_SIZE; k++) {
            sum += sharedA[ty][k] * sharedB[k][tx];
        }
        __syncthreads();  // wait before the tile is overwritten
    }
    if (row < N && col < N) {
        C[row * N + col] = sum;
    }
}
This optimized version uses shared memory to reduce global memory accesses, significantly improving performance for large matrices.
Asynchronous Operations
CUDA supports asynchronous operations, allowing you to overlap computation with data transfer. This is particularly useful in machine learning pipelines where you can prepare the next batch of data while the current batch is being processed.
cudaStream_t stream1, stream2;
cudaStreamCreate(&stream1);
cudaStreamCreate(&stream2);

// Asynchronous memory transfers and kernel launches
// (the host buffers must be pinned, e.g. allocated with cudaMallocHost,
// for the copies to overlap with computation)
cudaMemcpyAsync(d_data1, h_data1, size, cudaMemcpyHostToDevice, stream1);
myKernel<<<grid, block, 0, stream1>>>(d_data1, ...);
cudaMemcpyAsync(d_data2, h_data2, size, cudaMemcpyHostToDevice, stream2);
myKernel<<<grid, block, 0, stream2>>>(d_data2, ...);

cudaStreamSynchronize(stream1);
cudaStreamSynchronize(stream2);
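A self-contained variant of the stream pattern might look like the sketch below, which processes one pinned buffer in two halves, each on its own stream. The kernel addOne and the helper addOneInTwoStreams are hypothetical names used only for illustration.

```cuda
#include <cassert>
#include <cuda_runtime.h>

__global__ void addOne(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Fills a pinned host buffer with 0..n-1, adds one to every element on the GPU
// using two streams, and returns true if every element ends up as i + 1.
bool addOneInTwoStreams(int n) {
    assert(n > 0 && n % 2 == 0);
    size_t bytes = n * sizeof(float);
    float *hData = nullptr, *dData = nullptr;
    cudaMallocHost(&hData, bytes);  // pinned host memory, needed for real overlap
    cudaMalloc(&dData, bytes);
    for (int i = 0; i < n; ++i) hData[i] = (float)i;

    cudaStream_t streams[2];
    int half = n / 2;
    size_t halfBytes = half * sizeof(float);
    for (int s = 0; s < 2; ++s) {
        cudaStreamCreate(&streams[s]);
        float *h = hData + s * half;
        float *d = dData + s * half;
        // Copy in, compute, copy out: ordered within a stream,
        // free to overlap with the other stream's work.
        cudaMemcpyAsync(d, h, halfBytes, cudaMemcpyHostToDevice, streams[s]);
        addOne<<<(half + 255) / 256, 256, 0, streams[s]>>>(d, half);
        cudaMemcpyAsync(h, d, halfBytes, cudaMemcpyDeviceToHost, streams[s]);
    }
    for (int s = 0; s < 2; ++s) {
        cudaStreamSynchronize(streams[s]);
        cudaStreamDestroy(streams[s]);
    }

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (hData[i] != (float)i + 1.0f) ok = false;
    }
    cudaFreeHost(hData);
    cudaFree(dData);
    return ok;
}
```

Whether the two halves actually overlap depends on the GPU's copy engines, so a timeline view in a profiler is the reliable way to confirm it.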
Tensor Cores
For machine learning workloads, NVIDIA’s Tensor Cores (available in newer GPU architectures) can provide significant speedups for matrix multiply and convolution operations. Libraries like cuDNN and cuBLAS automatically leverage Tensor Cores when available.
Challenges and Considerations
While CUDA offers tremendous benefits for machine learning, it’s important to be aware of potential challenges:
Memory Management: GPU memory is limited compared to system memory. Efficient memory management is crucial, especially when working with large datasets or models.
Data Transfer Overhead: Transferring data between CPU and GPU can be a bottleneck. Minimize transfers and use asynchronous operations when possible.
Precision: GPUs traditionally excel at single-precision (FP32) computations. While support for double-precision (FP64) has improved, it’s often slower. Many machine learning tasks can work well with lower precision (e.g., FP16), which modern GPUs handle very efficiently.
Code Complexity: Writing efficient CUDA code can be more complex than CPU code. Leveraging libraries like cuDNN, cuBLAS, and frameworks like TensorFlow or PyTorch can help abstract away some of this complexity.
As machine learning models grow in size and complexity, a single GPU may no longer be sufficient to handle the workload. CUDA makes it possible to scale your application across multiple GPUs, either within a single node or across a cluster.
CUDA Programming Structure
To effectively utilize CUDA, it’s essential to understand its programming structure, which involves writing kernels (functions that run on the GPU) and managing memory between the host (CPU) and device (GPU).
Host vs. Device Memory
In CUDA, memory is managed separately for the host and device. The following are the primary functions used for memory management:
cudaMalloc: Allocates memory on the device.
cudaMemcpy: Copies data between host and device.
cudaFree: Frees memory on the device.
Example: Summing Two Arrays
Let’s look at an example that sums two arrays using CUDA:
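#include <cstdlib>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void sumArraysOnGPU(float *A, float *B, float *C, int N) {
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < N) C[idx] = A[idx] + B[idx];  // guard threads past the end
}

int main() {
    int N = 1024;
    size_t bytes = N * sizeof(float);

    // Allocate host arrays and fill the inputs (example values).
    float *h_A, *h_B, *h_C;
    h_A = (float*)malloc(bytes);
    h_B = (float*)malloc(bytes);
    h_C = (float*)malloc(bytes);
    for (int i = 0; i < N; i++) {
        h_A[i] = 1.0f;
        h_B[i] = 2.0f;
    }

    // Allocate device arrays.
    float *d_A, *d_B, *d_C;
    cudaMalloc(&d_A, bytes);
    cudaMalloc(&d_B, bytes);
    cudaMalloc(&d_C, bytes);

    // Copy the inputs to the device.
    cudaMemcpy(d_A, h_A, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_B, h_B, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks to cover all N elements.
    int blockSize = 256;
    int gridSize = (N + blockSize - 1) / blockSize;
    sumArraysOnGPU<<<gridSize, blockSize>>>(d_A, d_B, d_C, N);

    // Copy the result back; this cudaMemcpy waits for the kernel to finish.
    cudaMemcpy(h_C, d_C, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);
    free(h_A);
    free(h_B);
    free(h_C);
    return 0;
}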
In this example, memory is allocated on both the host and device, data is transferred to the device, the kernel is launched to perform the computation, the result is copied back to the host, and all buffers are freed.
Conclusion
CUDA is a powerful tool for machine learning engineers looking to accelerate their models and handle larger datasets. By understanding the CUDA memory model, optimizing memory access, and leveraging multiple GPUs, you can significantly enhance the performance of your machine learning applications.
#AI Tools 101#algorithm#Algorithms#amp#applications#architecture#Arrays#cache#cluster#code#col#complexity#computation#computing#cpu#CUDA#CUDA for ML#CUDA memory model#CUDA programming#data#Data Structures#data transfer#datasets#double#engineers#excel#factor#functions#Fundamental#Global
okay so when i was writing this, i had a whole scene written about how steve is a video game guy and bought himself the SNES when it came out as a reward for getting through undergrad and loves the mario franchise in particular. i ended up cutting it out for the sake of brevity, but it got me thinking
In 2008, Steve and Eddie give their daughters a Nintendo Wii as a collective Christmas gift, and with it comes Mario Kart.
Now, nothing rivals the Harrington Family Mario Kart experience – there’s ganging up on each other and mocking the CPUs and throwing Wii remotes across the room and relentless trash talk. It is an all-time favorite game to play as a family.
That being said – Eddie is horrible at Mario Kart, even the janky earlier versions. He’s able to hold his own against his seven- and five-year-olds for about as long as it takes for them to figure out the controls (which is approx. two days for Moe, and Robbie’s right behind her). After that, he’s consistently getting destroyed by not only his husband, but also his elementary school-aged children.
Steve, on the other hand, is excellent at Mario Kart. He went easy on the girls while they were learning but the second they had it figured out and started to become real competition for him, it was over. He is also extremely competitive, something Moe and Robbie absolutely picked up from him, so by the time the Nintendo Switch is released in 2017, Mario Kart has become a very serious family affair (much to Eddie’s chagrin).
Eddie gets one look at Metal Mario and insists on playing as him because…metal. Duh. But then he’s careening uncontrollably around the course, spending more time soaring off the track than actually driving on it, and he can’t figure out why.
Robbie: Different characters have different stats, Dad.
Eddie: What the fuck are his stats then?
Robbie: Pretty sure he’s, like, one of the fastest ones.
So he switches over to Lemmy (because “that’s a kick-ass head of hair”) and comfortably ambles around the course, never placing higher than eighth but also no longer sending himself flying off into the abyss.
Hazel inherited her dad’s lack of proclivity for the game (though she’s definitely still better at it than him – it would be hard not to be). She likes the “cute” ones – the babies, the villagers, Toad and Toadette – and she usually chooses a novelty kart like the carousel horse. She also doesn’t have that competitive need to win, which is good because Moe, Robbie, and Steve can collectively bring the “healthy” tension level to its max capacity.
Moe’s guiding force in choosing a Mario Kart character is a healthy mix of aesthetic and irony. She usually opts for King Boo. She also maintains that the stats don’t actually mean anything, and that she drives the same regardless of who she plays as.
Steve and Robbie completely disagree with this. They are arguably the best at Mario Kart out of the entire family, and they’re pretty much matched, skill-wise. As such, they have very strong feelings about those stats that Moe says don’t matter because they tend to be the determining factor in who actually wins.
Steve is always using new combinations of characters and karts – he has an Excel spreadsheet for tracking what he’s tried out and everything.
Conversely, Robbie has firmly settled on Rosalina and will not change her mind.
Steve: There’s, like, six characters way faster than her!
Robbie: It’s about the traction, Pop.
#becoming a functioning adult turned steve into an excel guy#the booster course wave releases were no joke to steve and robbie#she literally came home from college for the 3rd and 4th ones so they could try the tracks out together the second they dropped#liv’s steddie dads verse#steddie dads#steddie#steve harrington#eddie munson
So, I just finished Astro Bot.
I'm gonna get this out of the way immediately: this game is a fucking masterpiece in my eyes. A genuinely flawless game. If you don't wanna read this whole long ass yapfest I wanna just say this upfront. If you own a PS5 and don't own this game, you are doing yourself a disservice. With that out of the way, allow me to glaze the fuck out of this game.
Before I start with the game itself I wanna talk about Production Value because holy shit it is off the fucking charts here. Every inch of this game is fucking gorgeous. Water is so good Mario WiiU would be brought to tears. Particles and physics objects are everywhere, to the point where it feels like Team Asobi was just showing off with what the PS5 was capable of. I have no issues calling it the best looking PS5 game. Sure, God of War or The Last of Us Part 1 may look better technically but Astro Bot's artstyle combined with a locked 60fps that I didn't notice dip once despite the amount of stuff on screen at once pushes it over the edge for me. On top of that, the music is incredible. Every level has a new tune that you'll sometimes just sit down and listen to for a moment before starting a level. Slo-Mo Casino, Crash Site, and Sky Garden are highlights for me but the whole soundtrack is incredibly good.
But that doesn't really mean much if the game kinda sucks, so I am glad to report that Astro Bot might be the best controlling 3D platformer I have ever played. Everything just feels like it has the just right amount of fine tuning. Astro's jump is just right between floaty and weighty, and his hover helps mitigate platforming mistakes without being essentially a get out of jail free card. His attack is basic but you can also damage enemies by hovering, and the game switches it up often enough for it to not feel repetitive. The levels complement the controls perfectly. While Astro Bot is generally a pretty easy game, I don't think that's a bad thing because of how comfortable it feels to play. Everything just feels good. Every time you mess up a jump, it feels like your fault instead of the game's. This rings true even in the face button challenges (which is what im calling them for lack of a better term lol). These little challenges, themed around the Sony face buttons, can be a lot more challenging than the regular game, but they remain fair. Even the final challenge of the game to get the last bot is a fair challenge. The game never resorts to cheap deaths which makes it way more fun than some other "difficult" games. The boss fights are also really good. The wait times between attacks always remained interesting to me because the pace of everything just felt snappy. They never last more than a few minutes and by the time you're done with them they don't overstay their welcome. They're always a nice change of pace from the main game. Also, going for completion never felt like a slog. I got all 301 bots (missing 4 because my playroom file got deleted on accident :/), all puzzle pieces, and all achievements and I was never bored. Just goes to show how incredible the gameplay is.
The story is nothing super complex but I like it for what it is. Basically an alien just decided to be a jerk and stole the pieces from the PS5 and scattered all the bots and it's up to Astro to fix everything up. Not the most inspired story ever but that's not really an issue imo. The main alien is constantly bullying the CPU of the PS5 and it's honestly really funny to watch the scenes. For a game without any dialogue they really put their all into the story and I personally think they did an amazing job with it.
Overall, like I said at the beginning of this, Astro Bot is a masterpiece. Everything this game sets out to do, it not only succeeds at but excels at. This game doesn't have a single bad level or dull moment. I am not kidding when I say I don't even have any dumb nitpicks to muster up. Astro Bot is a perfect game in my eyes and Team Asobi should be goddamn proud of themselves for releasing a game this fucking good.
Astro Bot gets a 10/10 from me. Please go buy and play this game. It's wonderful.