sociavelly · 4 years
We are going to talk about batch jobs and what a batch job is.
Good morning, everyone. Let's get into it: today we are going to talk about batch jobs. So what's a batch job? Basically, the idea of a batch job is to do a very large amount of work, in computation terms, at a given point in time. Usually you do this on a schedule of some sort, and it's extremely important that you know about batch jobs if you want to build a system that is going to operate efficiently at scale. The thing is, you may not believe it, but there are tons and tons of things that you could be doing that take so long that we're talking hours. I'll give you an example. Let's say that you wanted to, I don't know, export your entire database, and let's say that your application has millions and millions of users. Moving all that data to a file, or streaming it to another database, will take a long time, and that's where a batch job comes into play.
So the trick to batch jobs is to figure out which computation is going to take a lot of time to execute. In general, if you have somebody requesting something from your server, or if you're doing some logging, or if your application is just running as normal, that's not a use case for a batch job. However, something like sending out a really large number of emails or, as I said, migrating a database, or something of that nature, is a perfect thing to do as a batch job.
I can give you an example from work, where a lot of heavy computation takes place on the user data in order to generate statistics. Statistics is another good one. If you've ever used Creator Studio, the app for YouTube, which I use myself, you will notice that there's always a delay on the information that you're getting, a day or two, something of that nature. That's because doing all the database accesses and reducing all the data down to statistics is such a heavy thing to do that it's not possible, or rather it's very inefficient, to try to do that in real time. Imagine YouTube: they have millions and millions of contributors who are all generating all this data, and it would be so expensive for them to do this all the time, in real time.
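To make the statistics idea a bit more concrete, here is a minimal sketch of the kind of reduction a nightly job might do. It is only an illustration: the event format, the `summarize_views` name, and the video IDs are made up for this example, and it says nothing about how YouTube actually computes its numbers.

```python
from collections import Counter
from datetime import date


def summarize_views(events):
    """Reduce raw (video_id, day) view events to one count per video per day."""
    return Counter((video_id, day) for video_id, day in events)


# Hypothetical raw events; a real job would read these from a database or log.
events = [
    ("video-a", date(2020, 1, 1)),
    ("video-a", date(2020, 1, 1)),
    ("video-b", date(2020, 1, 1)),
    ("video-a", date(2020, 1, 2)),
]
daily_stats = summarize_views(events)
```

The point is that the expensive pass over all the raw events happens once, in the background, and the dashboard only ever reads the small precomputed summary. That is also why the numbers lag behind by a day or so.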
So that's why batch jobs are so important to know about. I'm talking to you about this because one of my viewers was asking me about how to build things that scale, and I'm making a few videos now to talk about the different aspects, or facets, basically the things that you need to know about in order to manage a really, really large system. Now, I'm not talking about handling a few hundred users. Batch jobs are not something that you use for, as I said, light operations. A light operation is anything that the user perceives as almost instantaneous. In other words, if your computation takes less than a few seconds, that's not a good case for a batch job. It's really designed for some type of long-running process, which may take hours, or at least half an hour or something like that.
It basically takes too long for a user to sit there and wait for it, and it's something that you're going to have to face when you get up to larger companies, usually because they have too many users. That's the most common reason why you would use a batch job. What's beautiful about it is that you really only have to have one or a few computers that run this batch job and, as I said, it's usually on a schedule. So you might run this operation once or twice a day, or once every week, or something of that nature, and it makes the load on your system a lot lower.
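Running "on a schedule" can be sketched as working out when the next run is due. The helper below is a minimal illustration, assuming a once-a-day job that starts at 02:00; the `next_run` name and the start hour are assumptions for this example. In practice a scheduler such as cron would usually take care of this for you.

```python
from datetime import datetime, timedelta


def next_run(now, hour=2):
    """Return the next time a once-a-day job should start (default 02:00)."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so wait until tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

For example, calling `next_run(datetime(2020, 1, 1, 3, 0))` gives 02:00 on 2 January, because that day's 02:00 slot is already gone.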
Now, you may not think so right now if you're making a smaller system, because computers are actually so fast that you kind of get used to them always being almost instantaneous. But trust me, when you get up to the type of scale of Facebook, YouTube, Google and so forth, or in my case it was Ticketmaster, where you have so many users and so much data, the batch job is going to be a part of your daily life. Trust me, it's going to happen. So I think it's worth knowing that it's not always a requirement for you to do the work right away. Try to think about operations that you can push into the future, or operations that don't have to be instant. Those are usually very good candidates for a batch job.
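Pushing an operation into the future can be as simple as writing down what needs doing and letting a later batch run pick it up. Here is a rough sketch under that assumption. The `pending` queue, `handle_request`, and `run_batch` are hypothetical names, and a real system would keep the queue somewhere durable rather than in memory.

```python
from collections import deque

# Hypothetical in-memory queue of work that does not have to happen instantly.
pending = deque()


def handle_request(user_id):
    """Answer right away and leave the slow work for the batch run."""
    pending.append(user_id)
    return "accepted"


def run_batch():
    """Later, on a schedule, drain everything that has piled up, in order."""
    processed = []
    while pending:
        processed.append(pending.popleft())
    return processed
```

The user gets an answer immediately from `handle_request`, while the heavy lifting waits for the next `run_batch`.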
So, as I said, my favorite example is sending out a really large number of emails. Think about how Walmart, for example, sends out email advertisements for Black Friday. If you think about it, that's millions and millions of people who have to get those emails, so most likely that's not something that they just run and then everything is hunky-dory, right? It takes a lot of time to do something like that, so that's a good candidate for a batch job. And, as I said, database migrations and creating lots and lots of statistics based on your data are good candidates too, because these operations take a lot of time, but they don't require you to do them instantly. So that's basically what you should think about when it comes to a batch job. I mean, a batch job can be anything; it doesn't take a specific form. I think the rule of thumb for when you should consider a batch job is, as I said, the computation time: if the thing you're going to do is going to be measured in minutes or hours, then you should really think about having a single computer, or some process, that runs over the course of maybe several hours and just finishes that computation. You should do that as often as you need; sometimes it's just once, and sometimes it's once a day. I just wanted to let you know that a batch job is a perfect way of saving yourself a lot of hassle, and basically resources on your system, because not everything that you do with a computer needs to happen in real time. Some things can take a little bit of a delay, and some things, out of necessity, need to take a little bit longer, and that's when you should think batch job.
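The bulk email example could look roughly like this: one long-running process that works through the whole recipient list a batch at a time. This is only a sketch. The `send` callable stands in for whatever actually delivers mail, and the batch size of 500 is an arbitrary assumption.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_campaign(recipients, send, batch_size=500):
    """Send to every recipient, one batch at a time; return the batch count."""
    sent_batches = 0
    for group in batches(recipients, batch_size):
        for address in group:
            send(address)  # `send` is a hypothetical callable supplied by the caller
        sent_batches += 1
    return sent_batches


# Demo with a fake "mail server" that just records the addresses.
outbox = []
recipients = [f"shopper{i}@example.com" for i in range(1200)]
batch_count = send_campaign(recipients, outbox.append)
```

Working in batches also gives you a natural place to pause, retry, or record progress, which matters when the whole run takes hours.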