#Distributed Computing
blubberquark · 10 months
Share Your Anecdotes: Multicore Pessimisation
I took a look at the specs of the new 7000-series Threadripper CPUs, and I really don't have any excuse to buy one, even if I had the money to spare. I thought long and hard about different workloads, but nothing came to mind.
Back in university, we had courses about map/reduce clusters, and I experimented with parallel interpreters for Prolog, and distributed computing systems. What I learned is that the potential performance gains from better data structures and algorithms trump the performance gains from fancy hardware, and that there is more to be gained from using the GPU or from re-writing the performance-critical sections in C and making sure your data structures take up less memory than from multi-threaded code. Of course, all this is especially important when you are working in pure Python, because of the GIL.
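A quick toy sketch of the GIL effect (the workload here is made up, and exact timings will vary by machine): a CPU-bound pure-Python function gets no speedup from threads, because only one thread can execute Python bytecode at a time.

```python
import time
from threading import Thread

def busy(n=2_000_000):
    # Pure-Python, CPU-bound work: exactly what the GIL serialises.
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
for _ in range(4):
    busy()
print(f"sequential: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
threads = [Thread(target=busy) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Typically no faster than the sequential run, and sometimes slower.
print(f"4 threads:  {time.perf_counter() - start:.2f}s")
```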
The performance penalty of parallelisation hits even harder when you try to distribute your computation between different computers over the network, and the overhead of serialisation, communication, and scheduling work can easily exceed the gains of parallel computation, especially for small to medium workloads. If you benchmark your Hadoop cluster on a toy problem, you may well find that it's faster to solve your toy problem on one desktop PC than on a whole cluster, because it's a toy problem, and the gains only kick in when your data set is too big to fit on a single computer.
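The same effect is easy to reproduce on a single machine with a toy problem (a made-up example, not a real benchmark): the work per item is so small that pickling arguments and results to and from worker processes costs more than the computation itself.

```python
import time
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    data = list(range(1_000_000))

    start = time.perf_counter()
    sequential = [square(x) for x in data]
    print(f"one process: {time.perf_counter() - start:.2f}s")

    # Serialising a million ints to the workers and back usually
    # costs more than just squaring them in place.
    start = time.perf_counter()
    with Pool(4) as pool:
        parallel = pool.map(square, data)
    print(f"4 processes: {time.perf_counter() - start:.2f}s")
```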
The new Threadripper got me thinking: Has this happened to somebody with just a multicore CPU? Is there software that performs better with 2 cores than with just one, and better with 4 cores than with 2, but substantially worse with 64? It could happen! Deadlocks, livelocks, weird inter-process communication issues where you have one process per core and every one of the 64 processes communicates with the other 63 via pipes? There could be software that has a badly optimised main thread, or a badly optimised work unit scheduler, and the limiting factor is single-thread performance of that scheduler that needs to distribute and integrate work units for 64 threads, to the point where the worker threads are mostly idling and only one core is at 100%.
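A back-of-the-envelope model of that scheduler bottleneck (the timings are invented for illustration): if a single scheduler thread needs some fixed time per work unit, throughput is capped at the reciprocal of that time no matter how many cores you add, and the extra workers just idle.

```python
t_sched = 0.001  # assumed: 1 ms for the scheduler to prepare/integrate one work unit
t_work  = 0.010  # assumed: 10 ms of actual work per unit

for cores in (2, 4, 16, 64):
    ideal  = cores / t_work            # units/s if scheduling were free
    actual = min(ideal, 1 / t_sched)   # the scheduler serialises dispatch
    busy   = actual * t_work / cores   # fraction of time each worker is busy
    print(f"{cores:3d} cores: {actual:6.0f} units/s, workers {busy:4.0%} busy")
```

With these made-up numbers, scaling stops well before 16 cores, and at 64 cores the workers are busy only about 16% of the time while the scheduler's core sits at 100%.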
I am not trying to blame any programmer if this happens. Most likely such software was developed back when quad-core CPUs were a new thing, or even back when multiple CPU sockets on the mainboard were the only way to get more cores, and the developer never imagined that one day there would be Threadrippers on the consumer market. Programs from back then, built for Windows XP, can still run on Windows 10 or 11.
In spite of all this, I suspect that this kind of problem is quite rare in practice. It requires software that spawns one thread or one process per core but is de-optimised for many cores, maybe written under the assumption that users would have two to six CPU cores; a user who can afford a Threadripper and actually needs one; and a workload where the problem is noticeable. You wouldn't get a Threadripper in the first place if it made your workflows slower, so that hypothetical user probably has one main workload that really benefits from the many cores, and another that doesn't.
So, has this happened to you? Do you have a Threadripper at work? Do you work in bioinformatics or visual effects? Do you encode a lot of video? Do you know a guy who does? Do you own a Threadripper or an Ampere just for the hell of it? Or have you tried to build a Hadoop/Beowulf/OpenMP cluster, only to have your code run slower?
I would love to hear from you.
Exploring Popular Machine Learning Tools and Their Impactful Case Studies
Hey friends! Check out this insightful blog on popular machine learning tools like #TensorFlow, #PyTorch, #ScikitLearn, #Keras, and #ApacheSparkMLlib. Explore their features, use cases, and how they enable us to build powerful machine learning models.
In recent years, the field of machine learning has witnessed remarkable growth and advancement, enabling transformative changes in various industries. One of the driving forces behind this progress is the availability of powerful machine learning tools. These tools facilitate the development and deployment of complex machine learning models, making it easier for researchers, data scientists, and…
techdriveplay · 21 hours
What Should You Know About Edge Computing?
As technology continues to evolve, so do the ways in which data is processed, stored, and managed. One of the most transformative innovations in this space is edge computing. But what should you know about edge computing? This technology shifts data processing closer to the source, reducing latency and improving efficiency, particularly in environments where immediate action or analysis is…
bobmueller · 2 months
Mower Decks And Cancer Research - How I Spent My Weekend
Spent the weekend tackling a riding mower deck replacement after endless repair woes. Also, here's how I’m using my computer to fight cancer.
All Decked Out I spent this weekend working on the riding mower. The deck has been badly torn up in the last couple of years, to the point where I couldn’t bang out the dents any more. I’d resorted to cutting out sections with an angle grinder, but in the end, I think that just made things worse. Even the guide wheels were splayed out at odd angles. Diana and I weighed repairing the deck vs.…
otiskeene · 2 months
The Difference Between Distributed Computing And Parallel Computing
When was the last time you cheered in a theater?
For us, it was during the epic final battle scene in Avengers: Endgame!
Picture this scenario for a moment (spoilers ahead – don’t hold it against us!):
Thanos isn't alone. Imagine ten versions of him attacking Earth simultaneously! To stop them, the Avengers need to work seamlessly as a team. Iron Man might defend California, Thor could battle in London, and Black Panther might rally troops in Wakanda. Each Avenger would use their unique strengths and resources. Seems feasible, right?
Now, imagine instead that a single Thanos attacks, and all the Avengers converge on one battlefield, striking at the same time and coordinating through one shared plan. He'd surely be defeated even faster.
In the first scenario, the Avengers work as independent agents in different locations, each with their own resources, tackling different threats. In the second, they work side by side in one place, sharing the same battlefield and plan. This analogy illustrates the difference between Distributed Computing and Parallel Computing.
So, let's delve deeper and explore these computing approaches and their distinctions. Read on!
"Computing is not about computers anymore. It is about living."
This quote by Nicholas Negroponte, co-founder of the Massachusetts Institute of Technology's Media Lab, highlights how deeply computing has integrated into our daily lives and work. Computing has revolutionized information processing, enhancing productivity across various industries. However, as we rely more on computing, our demands for speed and efficiency have also increased.
Two primary strategies have emerged to meet these demands: Distributed Computing and Parallel Computing. Each has its unique strengths, making it essential to choose the right approach depending on your specific needs and goals.
Join us as we explore the differences between distributed and parallel computing – but first, let’s understand both approaches.
Understanding Distributed Computing
Imagine your business hires a team of experts, each a specialist in their field. That would solve most of your problems and challenges, wouldn’t it?
Distributed Computing is similar, but for computing. Instead of a single computer handling everything, multiple smaller computers, called nodes, are connected by a network to work together.
By collaborating, these nodes can tackle complex tasks that a single computer couldn't. Each node handles the job it's best suited for. One might manage visual processing, another performs complex calculations, and the next excels at data storage. They communicate and share information to complete tasks quickly and efficiently.
For example, consider how this works for a weather forecasting service. One computer gathers data from satellites, another crunches numbers to simulate weather patterns, and a third displays the forecast on your phone. Now, that’s teamwork!
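Here's a toy sketch of that weather pipeline (a simulation on one machine, with processes standing in for networked nodes): each "node" has its own memory and cooperates only by passing messages, which is the defining trait of a distributed system.

```python
from multiprocessing import Process, Queue

def gather(out: Queue):
    # Node 1: pretend to collect satellite readings.
    out.put([21.5, 22.1, 19.8])

def simulate(inp: Queue, out: Queue):
    # Node 2: crunch the numbers into a forecast.
    readings = inp.get()
    out.put(sum(readings) / len(readings))

def display(inp: Queue):
    # Node 3: show the result to the user.
    print(f"Forecast: {inp.get():.1f} degrees")

if __name__ == "__main__":
    q1, q2 = Queue(), Queue()
    nodes = [Process(target=gather, args=(q1,)),
             Process(target=simulate, args=(q1, q2)),
             Process(target=display, args=(q2,))]
    for n in nodes:
        n.start()
    for n in nodes:
        n.join()
```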
Let’s explore the advantages of distributed computing!
Advantages Of Distributed Computing
Distributed Computing offers a powerful alternative to traditional computing by combining the resources of multiple computers. Here are the key advantages it offers:
Flexibility & Adaptability
Distributed Computing is like a team that can adjust on the fly. New computers (nodes) can be added or removed as needed, allowing the system to adapt to changing workloads. This makes it ideal for organizations with fluctuating demands, as resources can be easily scaled up or down.
Global Collaboration
Distributed Computing allows users in different locations to access and contribute to shared resources. This is perfect for multinational corporations where collaboration across geographic boundaries is essential.
Data Redundancy & Backup
Distributed Computing makes it easy to keep multiple copies of important data stored in different locations. Since information can be replicated across multiple nodes, the system remains available even if a single node experiences hardware or software failure.
Now that we understand Distributed Computing and its advantages, let’s look closely at Parallel Computing!
Understanding Parallel Computing
If you're hosting a giant feast at home, managing everything alone would be challenging, right? From cooking the food, placing decorations, making the house comfortable, and so on! How about a helping hand?
Parallel Computing follows a similar concept but for computers. Instead of one processor handling everything, it uses multiple processors working together. A big task, such as processing a ton of data, gets broken down into smaller chunks and each processor tackles its assigned chunk. Just as you would assign someone to set up decorations and someone else to serve the food.
Like your helpers, these processors work on their tasks at the same time while sharing a common space. This teamwork lets them finish the job quickly, just like your party crew gets everything ready swiftly!
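In code, the "helping hands" idea looks like this (a minimal made-up example): one big task is split into chunks, each worker processes its chunk at the same time, and the partial results are combined at the end.

```python
from multiprocessing import Pool

def chunk_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]  # four roughly equal slices
    with Pool(4) as pool:
        partials = pool.map(chunk_sum, chunks)  # all four run at once
    print(sum(partials))  # same answer as sum(data)
```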
Let’s get to the advantages of Parallel Computing.
Advantages Of Parallel Computing
Here are various advantages of Parallel Computing – our top picks are:
Enhanced Speed
It accelerates computations by processing instructions simultaneously on multiple processors. This directly reduces processing time and provides faster results, making it ideal for time-sensitive tasks.
Scalability
Additional processing power can be readily added or removed based on computational demands. This flexibility allows for dynamic resource allocation, adapting the system's capacity to meet fluctuating workloads.
Better Resource Utilization
Parallel Computing distributes the workload across available hardware resources and prevents overutilization or underutilization. This ensures optimal resource allocation and enhances the system’s overall efficiency.
Faster Decision-Making
The speed advantage translates to faster turnaround times for results, letting users act on fresh information with confidence and significantly reducing decision-making time.
Parallel Computing offers a compelling approach to high-performance computing, and its wide range of advantages makes it a valuable tool for various applications across industries.
Now that we have understood both Distributed and Parallel Computing, let’s understand the differences between them!
The Difference Between Distributed Computing And Parallel Computing
Both Parallel and Distributed Computing tackle complex tasks by dividing them into smaller chunks. However, how they achieve this teamwork differs!
Here are five key differences:
The Team Size
Parallel Computing works with a single computer that has multiple processors acting like a well-oiled team. Distributed Computing, on the other hand, utilizes a larger crowd - multiple independent computers working together on a network.
Communication
In Parallel Computing, all processors share a single memory space to communicate and access data. Distributed Computing approaches it by giving each computer its own memory and communicating with others over a network.
Synchronization
Parallel Computing systems utilize a single master clock to ensure all processors are in sync. This is similar to a team working together with a shared schedule or deadline. Distributed Computing systems, due to their reliance on network communication, require more complex synchronization algorithms to maintain consistency.
Scaling Up
Both systems can scale as needed, but Distributed Computing offers more flexibility. Adding new computers to the network is simpler than adding processors to a single machine, which can become limited by its internal memory.
Application-specific
While Parallel Computing is ideal for businesses with large, single-site workloads that benefit from fast communication and shared memory, Distributed Computing is perfect for businesses with geographically dispersed operations, massive datasets, or collaborative projects.
In essence, Parallel Computing is like a tightly knit team working within a single machine, while Distributed Computing leverages a network of independent, expert workers for large-scale tasks.
To Sum Up
The phrase "many hands make light work" perfectly captures the essence of both Parallel and Distributed Computing. By dividing complex tasks into smaller pieces, they achieve impressive results. While they share this core concept, they differ in their approach.
Parallel Computing utilizes a single powerful machine with multiple processors working together, while Distributed Computing leverages a network of independent computers. Understanding these differences allows you to choose the right tool for the job, whether it's tackling massive datasets or speeding up complex calculations within your business.
puppyeared · 6 months
littlest furth shop
@laikascomet
#i think i had a little too much fun with this lol #i also wanted to draw road boy and other characters but maybe when they actually get introduced #i do have a sketch of him with a lil chainsaw.. im not gonna be normal when he gets introduced man he looks so sillygoofy #if you squint laika's eye marking is a clover yue's is a crescent moon and mars' is a star ^_^ #i wanted to give laika an accessory too but i couldnt think of anything.. maybe a stack of pancakes?? #im curious to see the apocalypse side of the story too.. like so far we have an idea of the comet fucking everything up #and im assuming that lead to a ripple effect causing the apocalypse but exactly how bad?? i cant wait to find out #rn im kinda piecing stuff together.. larkspur delivers mail in a beat up van so that might mean all transportation is grounded #the buildings we've seen so far are intact like the observatory and turnip's house but idk if thats the same for big cities #laikas playlist only includes songs downloaded on yue's computer and there hasnt been internet in 20 years.. but radio signals might #still work.. if yue grows his own food we can assume that mass production and distribution also isnt a thing anymore #sorry im a sucker for worldbuilding.. and the furth puns are fun to me. i like to think toronto would be clawronto.. and vancouver wld #be nyancouver.. barktic circle.. mewfoundland and labrador.. canyada.... #christ i have so many drawing ideas. willow if youre reading this im so sorry youre probably gonna expect to see a lot of drawings frm me #like. i wanna draw laika in the akira bike pose so sosososo bad. IT WOULD BE SO AWESOMECOOL. ill teach myself to draw bikes if i have to #i also wanted to animate laika leekspin.. man #my art #myart #fanart #laika's comet #laikas comet #laika #mars #yue #furry art #fur #littlest pet shop #lps
nisint · 2 years
I’ve been thinking more about the distributed system idea. The hardest problem is how to containerise applications so each program sees the same operating system regardless of the computer.
Kubernetes fixes the problem by using literal containers which bring the whole filesystem along.
Slurm doesn’t. It assumes shared network storage.
Ansible copies the basic set of dependencies around.
Here I’ll introduce a variation on that idea with an interesting use case. Bank Python (as it’s named) is the use of Python by large investment banks to manage risk. I don’t know a huge amount about the financial side, but what’s interesting is how it handles distributed computing.
Bank Python treats everything like a database and stores all the code in a database as well. Not ideal from my perspective since it needs a new set of development tools. This approach has some advantages though.
For one, since everything is stored centrally, the scripts don’t assume any access to host resources. The database also makes it easy to deploy new versions or new scripts.
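A minimal sketch of the code-in-a-database idea (my toy illustration, not how Bank Python actually implements it): store a module’s source in SQLite, pull it out by name, and execute it; deploying a new version is just updating a row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE modules (name TEXT PRIMARY KEY, source TEXT)")
db.execute("INSERT INTO modules VALUES (?, ?)",
           ("greet", "def hello():\n    return 'hi from the database'"))

# "Checkout" is a SELECT; no filesystem is involved at all.
(source,) = db.execute(
    "SELECT source FROM modules WHERE name = ?", ("greet",)).fetchone()
namespace = {}
exec(source, namespace)
print(namespace["hello"]())
```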
I like containers conceptually but they lack a certain flexibility. I’m currently leaning towards a mesh networking system where each host publishes a set of directories which can be accessed like a network file system. Applications don’t see a network filesystem directly, though; instead, files are explicitly copied to a working directory, as in the sketch below.
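A toy sketch of that copy-to-working-directory model (the hostnames and paths are hypothetical, and it assumes each host exports its published directory over HTTP with something like `python -m http.server`): jobs never mount anything, they just fetch what they need before running.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path

PUBLISHED = {  # hypothetical mesh: host name -> base URL of its export
    "alpha": "http://alpha.local:8000",
    "beta":  "http://beta.local:8000",
}

def fetch(host: str, relpath: str, workdir: Path) -> Path:
    """Explicitly copy one published file into the job's working directory."""
    dest = workdir / Path(relpath).name
    with urllib.request.urlopen(f"{PUBLISHED[host]}/{relpath}") as resp, \
         open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest

workdir = Path(tempfile.mkdtemp(prefix="job-"))
config = fetch("alpha", "etc/model.cfg", workdir)  # copy first, then run
```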
sl33py-g4m3r · 2 months
ramble about FreeBSD and Unix~~
how out of my depth would I be trying to install FreeBSD?
would it even boot on my machine?
am I smart enough to go through the install for the system itself as well as get the GUI that I want?
I think you have to go through the command line for quite a bit of time before you get a GUI up and running....
I started off being really interested in BSD/Unix in high school, and tried to fiddle around with a BSD live disc thing in a book (that I don't remember the name of) and then only fiddled around with Linux.
I've been watching videos on youtube of people expressing how stable FreeBSD's modern release is~~
I want to use it on my own hardware; but I believe that's the problem with it, that it supports a somewhat limited range of hardware, as opposed to Linux, which you could run on even a toaster...
Is it really that much harder to deal with than Linux?
Of course I've only dealt with a few distros~~ the rundown of distros I've messed around with is:
Ubuntu (not anymore tho)
Debian (current OS being Linux Mint Debian Edition 6)
OpenSUSE briefly (tried to get my sibling to use it on their laptop, with them knowing next to nothing about Linux, sorry...)
Fedora back in high school, I ran it on a laptop for a while. I miss GNOME....
Mageia (I dual booted it on a computer running windows 7, also in or right after high school, so a long time ago)
attempted GhostBSD but it wouldn't boot after install from the live CD (also many years ago at this point)
I like to hop around and can't make up my mind which one I actually want to use permanently (hopefully now I've settled, yeah right...).
Linux Mint Debian edition is really good so far tho~~!!
Current PC is an ASUS ROG Strix that I bought on impulse many years ago~~ It was running Windows 10; fixed that issue, and now it runs the OS stated above~~
or maybe I should ditch Mint and run straight Debian... thought of that too, and it might have an easier time installing and actually booting on this machine than FreeBSD...
but then BSD, and by extension Unix, is meant to run on older hardware and to be efficient, both in execution and in space.
"do one thing and do it well" iirc was a bit of the unix philosophy...
yeah, no I HATE technology /heavy sarcasm/
qulizalfos · 2 months
every single fucking second of the new episode feels like one of these
Jurassic Park (1993, Steven Spielberg)
11/03/2024
Jurassic Park is a 1993 film directed by Steven Spielberg, based on the novel of the same name written by Michael Crichton.
Spielberg purchased the rights to the book before its publication in 1990, and Crichton was hired to write the film adaptation. David Koepp wrote the final screenplay, which dropped much of the novel's violence and exposition and made numerous changes to the characters. Spielberg hired Stan Winston Studios to create the animatronic dinosaurs that would interact on screen with Industrial Light & Magic's nascent computer-generated imagery. If Tron was the first Disney film to use the then-newborn technique of computer graphics, Jurassic Park is considered one of the first big-budget films to make extensive use of CGI.
Paleontologist Jack Horner helped the writers and the special-effects team make the dinosaurs as accurate as possible (although their appearance now turns out to be partly outdated due to later paleontological findings, particularly for Velociraptor and Dilophosaurus). Filming lasted from August 24 to November 30, 1992 on the Hawaiian islands of Kauai and Oahu, and in California, Costa Rica and the Dominican Republic.
Jurassic Park premiered on June 9, 1993 in Washington, D.C., and was released on June 11 in the United States. The film was a huge success with audiences: against a budget of $63 million, it grossed over $914 million worldwide in its first theatrical release, surpassing E.T. the Extra-Terrestrial and becoming the highest-grossing film of all time until the release of Titanic in 1997.
ofcowardiceandkings · 2 months
everything being reported as "AI" has killed the original meaning because it's shorthand for "computer language the layman would think sounds like word soup" and yes this includes chatgpt and midjourney and all that wank
Apache Spark in Machine Learning: Best Practices for Scalable Analytics
Hey friends! Check out this insightful blog on leveraging Apache Spark for machine learning. Discover best practices for scalable analytics with #ApacheSpark #MachineLearning #BigData #DataProcessing #MLlib #Scalability
Apache Spark is a powerful and popular open-source distributed computing framework that provides a unified analytics engine for big data processing. While Spark is widely used for various data processing tasks, it also offers several features and libraries that make it a valuable tool for machine learning (ML) applications.
Apache Spark’s usage in Machine Learning
Data Processing: Spark…
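For a flavour of what that looks like in practice, here's a minimal PySpark sketch (the toy dataset is made up, and it assumes a local Spark installation via `pip install pyspark`): MLlib's DataFrame API packs features into a vector column and fits a model on it.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("mllib-demo").getOrCreate()

# Toy dataset: two numeric features and a binary label.
df = spark.createDataFrame(
    [(0.0, 1.1, 0), (1.5, 0.3, 1), (0.2, 0.9, 0), (2.0, 0.1, 1)],
    ["f1", "f2", "label"],
)

# MLlib estimators expect all features packed into one vector column.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
model = LogisticRegression(maxIter=10).fit(assembler.transform(df))
print(model.coefficients)

spark.stop()
```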
eggy-tea · 11 months
My wish for the world is for everyone to learn the difference between a website and an app
coloursofaparadox · 6 months
im. nnnnnnnnnnnnnnnn.