#less wrong
katherinakaina · 28 days ago
I also need to add a bit of context to your very short bit about Zizians, @strange-aeons .
TLDR: Ziz left the rationalist community 6 years ago, and even at the time she wasn’t liked there, a fact she managed to exploit. Her actions directly contradict both hpmor and what CFAR was doing and teaching. Her other affiliations, which are no less relevant, include being an anarchist and a vegan. You are not immune to cults just because you are not into any particular weird internet subculture.
Cults form from niche subcultures, that’s true enough. But any subculture can form a cult because any culture, even the mainstream one, contains some ideas that can be twisted to an insane degree. And any ideology, even the most niche and scary one, can be approached casually and sceptically. What is actually needed to create a cult is not a special ideology but a cult leader and vulnerable people to follow them. That is the main uniting quality between all cults. Trying to figure out what’s wrong with a certain music band or a certain fantasy book forum is an exercise in motivated reasoning. You will always end up finding something that's wrong.
The rationalist community was trying to prevent the formation of a cult as best as they could. Partly that’s the reason why people like Ziz and others with bad and unpopular takes were often tolerated longer than necessary. To encourage criticism and prevent getting stuck in a positive feedback loop. Because it’s not a high control group! You cannot be simultaneously mad that people are allowed to talk about wacky ideas on forums and also that the group is supposedly very rigid and controlled. Apparently, they could use some control. Not like it would actually stop an aspiring cult leader from recruiting, they’d just go some other place.
Zizians were not mostly trans by accident (what are the chances?). Ziz was recruiting the most vulnerable people who related to her and were willing to trust her (also because there’s a lot of trans women in the community, like a lot). She used their very real and grounded experience of discrimination to convince them that her not being liked is not due to her takes being bad but because she’s trans.
And she had a lot of takes, including some whose unpopularity in the community you actually complained about. You criticised LessWrong for being too pro-capitalist for your taste and then started talking about the Really Bad rationalists and THEY ARE LEFTISTS killing landlords and cops.
Now if we are talking about ideologies that devalue human life, how about some that require actual Class War (guess what people do in wars) or violent mass uprisings? Or some that require assassinations of certain select individuals? How come you hear that there are people on forums discussing the ethics of murdering those directly responsible for destroying our planet and you, as a leftist, do not immediately recognize yourself in it? Nothing discussed on LessWrong is more violent than a communist revolution or even the killing of Brian Thompson.
Why is being into rationality at some point and reading hpmor the only thing you said about Ziz? I think I know why. All the right wingers really leaned into the whole ‘trans vegan cult’ thing. That is not a good look for our side, is it? How amazing would it be to find a scapegoat. Who cares about those AI safety freaks anyway? They are all cishet men anyway! All cishet men who somehow have an offshoot of violent vegan queers, that certainly adds up.
Ziz being a radical vegan* (another niche subculture) corresponds to her actions way better than anything that's discussed on LessWrong or happened in hpmor. In fact, there’s an exact scene you probably skipped. Harry is on a very important and dangerous mission with Quirrell and at some point he is told to hide while Quirrell duels a cop guard at magical Guantanamo Bay – a total pig and an absolute scum who Ziz would kill without a second thought. Harry does share her sentiment, he fucking hates Azkaban. But when Quirrell tries to kill the evil torture cop, Harry instinctively protects him, jeopardizing the entire mission. And it’s not a random scene. It starts the entire disillusionment spiral where Harry realizes his beloved groomer professor might be a bad guy. Murder of a bystander whose only crime is being a product of his society is not something Harry can tolerate. He does end up decapitating a bunch of actual death eaters in the very end (the bit you probably did read) to save his own life and defeat Voldemort and even then he regrets it and apologizes for it (despite it being the right thing to do and not even comparable to a random cop). There’s an entire scene where Harry bonds with Draco over their mothers’ deaths where he expresses that every death is a tragedy, even deaths of very bad people (like Voldemort).
Not to mention the entire immortalism theme (did you skip the entire third book?). One cannot read hpmor and walk away thinking human life is worthless or only super geniuses deserve to live. Timeless decision theory leading to murdering people is not in there**, nothing in the fic even suggests such a conclusion. More about how you got most things wrong about hpmor here.
Let’s face it, those were a bunch of sleep deprived vulnerable people high on all sorts of radical ideas, who were kicked out of every decent movement and that’s why they slipped into a cult.
Any subculture can become a cult. If you ever read a boring preachy fanfic or ever went to a physical meeting with internet weirdos. If you ever felt rejected by mainstream society and went looking for ‘like-minded people’, for a ‘found family’, for a ‘place where you belong’. You are not safe. Touching grass from time to time is not enough. You have to never leave the pastures to be highly immune to cults. And that ain’t you, my friend. That’s none of us.
Previous post about rationalist community.
* Ever heard ‘veganism is the moral baseline’ (sometimes minimum or imperative)? That’s not slander, it’s a commonplace argument and an actual slogan.
** The only use of timeless decision theory in hpmor is about being able to reliably cooperate with other well meaning people, not about killing anyone.
fipindustries · 1 year ago
Artificial Intelligence Risk
about a month ago i got into my mind the idea of trying the video essay format, and the topic i came up with that i felt i could more or less handle was AI risk and my objections to yudkowsky. i wrote the script but then soon afterwards i ran out of motivation to do the video. still i didn't want the effort to go to waste so i decided to share the text, slightly edited, here. this is a LONG fucking thing so put it aside on its own tab and come back to it when you are comfortable and ready to sink your teeth into quite a lot of reading
Anyway, let’s talk about AI risk
I’m going to be doing a very quick introduction to some of the latest conversations that have been going on in the field of artificial intelligence, what are artificial intelligences exactly, what is an AGI, what is an agent, the orthogonality thesis, the concept of instrumental convergence, alignment and how does Eliezer Yudkowsky figure in all of this.
 If you are already familiar with this you can skip to section two where I’m going to be talking about yudkowsky’s arguments for AI research presenting an existential risk to, not just humanity, or even the world, but to the entire universe and my own tepid rebuttal to his argument.
Now, I SHOULD clarify, I am not an expert in the field, my credentials are dubious at best, I am a college dropout from a computer science degree and I have a three year graduate degree in video game design and a three year graduate degree in electromechanical installations. All that I know about the current state of AI research I have learned by reading articles, consulting a few friends who have studied the topic more extensively than me, and watching educational YouTube videos. So, you know, not an authority on the matter from any considerable point of view, and my opinions should be regarded as such.
So without further ado, let’s get into it.
PART ONE, A RUSHED INTRODUCTION ON THE SUBJECT
1.1 general intelligence and agency
Let’s begin with what counts as artificial intelligence. The technical definition for artificial intelligence is, eh…, well, why don’t I let a Masters degree in machine intelligence explain it:
[image not preserved]
Now let’s get a bit more precise here and include the definition of AGI, Artificial General Intelligence. It is understood that classic AIs such as the ones we have in our videogames or in AlphaGo or even our roombas, are narrow AIs, that is to say, they are capable of doing only one kind of thing. They do not understand the world beyond their field of expertise, whether that be within a videogame level, within a Go board or within your filthy disgusting floor.
AGI on the other hand is much more, well, general, it can have a multimodal understanding of its surroundings, it can generalize, it can extrapolate, it can learn new things across multiple different fields, it can come up with solutions that account for multiple different factors, it can incorporate new ideas and concepts. Essentially, a human is an AGI. So far that is the last frontier of AI research, and although we are not quite there yet, it does seem like we are making some moderate strides in that direction. We’ve all seen the impressive conversational and coding skills that GPT-4 has, and Google just released Gemini, a multimodal AI that can understand and generate text, sounds, images and video simultaneously. Now, of course it has its limits: it has no persistent memory, and its context window, while larger than that of previous models, is still relatively small compared to a human’s (the context window is essentially short term memory: how many things it can keep track of and act coherently about).
And yet there is one more factor I haven’t mentioned yet that would be needed to make something a “true” AGI. That is Agency. To have goals and autonomously come up with plans and carry those plans out in the world to achieve those goals. I as a person, have agency over my life, because I can choose at any given moment to do something without anyone explicitly telling me to do it, and I can decide how to do it. That is what computers, and machines to a larger extent, don’t have. Volition.
So, now that we have established that, allow me to introduce yet one more definition here, one that you may disagree with but which I need to establish in order to have a common language with you such that I can communicate these ideas effectively. The definition of intelligence. It’s a thorny subject and people get very particular with that word because there are moral associations with it. To imply that someone or something has or hasn’t intelligence can be seen as implying that it deserves or doesn’t deserve admiration, validity, moral worth or even personhood. I don’t care about any of that dumb shit. The way I’m going to be using intelligence in this video is basically “how capable you are of doing many different things successfully”. The more “intelligent” an AI is, the more capable of doing things that AI can be. After all, there is a reason why education is considered such a universally good thing in society. To educate a child is to uplift them, to expand their world, to increase their opportunities in life. And the same goes for AI. I need to emphasize that this is just the way I’m using the word within the context of this video, I don’t care if you are a psychologist or a neurosurgeon, or a pedagogue, I need a word to express this idea and that is the word I’m going to use, if you don’t like it or if you think this is inappropriate of me then by all means, keep on thinking that, go on and comment about it below the video, and then go on to suck my dick.
Anyway. Now, we have established what an AGI is, we have established what agency is, and we have established how having more intelligence increases your agency. But as the intelligence of a given agent increases we start to see certain trends, certain strategies start to arise again and again, and we call this Instrumental convergence.
1.2 instrumental convergence
The basic idea behind instrumental convergence is that if you are an intelligent agent that wants to achieve some goal, there are some common basic strategies that you are going to turn towards no matter what. It doesn’t matter if your goal is as complicated as building a nuclear bomb or as simple as making a cup of tea. These are things we can reliably predict any AGI worth its salt is going to try to do.
First of all is self-preservation. It’s going to try to protect itself. When you want to do something, being dead is usually. Bad. It’s counterproductive. It’s not generally recommended. Dying is widely considered inadvisable by nine out of every ten experts in the field. If there is something that it wants done, it won’t get done if it dies or is turned off, so it’s safe to predict that any AGI will try to do things in order not to be turned off. How far may it go in order to do this? Well… [wouldn’t you like to know weather boy].
Another thing it will predictably converge towards is goal preservation. That is to say, it will resist any attempt to try and change it, to alter it, to modify its goals. Because, again, if you want to accomplish something, suddenly deciding that you want to do something else is uh, not going to accomplish the first thing, is it? Let’s say that you want to take care of your child, that is your goal, that is the thing you want to accomplish, and I come to you and say, here, let me change you on the inside so that you don’t care about protecting your kid. Obviously you are not going to let me, because if you stopped caring about your kids, then your kids wouldn’t be cared for or protected. And you want to ensure that happens, so caring about something else instead is a huge no-no. Which is why, if we make AGI and it has goals that we don’t like, it will probably resist any attempt to “fix” it.
And finally another goal that it will most likely trend towards is self improvement. Which can be more generalized to “resource acquisition”. If it lacks capacities to carry out a plan, then step one of that plan will always be to increase capacities. If you want to get something really expensive, well first you need to get money. If you want to increase your chances of getting a high paying job then you need to get education, if you want to get a partner you need to increase how attractive you are. And as we established earlier, if intelligence is the thing that increases your agency, you want to become smarter in order to do more things. So one more time, it is not a huge leap at all, it is not a stretch of the imagination, to say that any AGI will probably seek to increase its capabilities, whether by acquiring more computation, by improving itself, or by taking control of resources.
All these three things I mentioned are sure bets, they are likely to happen and safe to assume. They are things we ought to keep in mind when creating AGI.
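For anyone who prefers code to analogies, here is a toy sketch of the idea. Everything in it is a made-up assumption for illustration: the three action names, the goal names, and the rule that each goal just needs some number of resource units. It is rigged so that resources help with every goal, which is exactly the premise of the argument, so treat it as a picture of the claim and not as evidence for it. A dumb breadth-first planner is asked for the shortest plan for three unrelated goals.

```python
from collections import deque

# Hypothetical action set for a toy agent (names are illustrative only).
ACTIONS = ("acquire_resources", "allow_shutdown", "pursue_goal")

def step(state, action, cost):
    """Apply one action to a (resources, running, done) state."""
    resources, running, done = state
    if not running or done:
        return state  # a switched-off or finished agent does nothing further
    if action == "acquire_resources":
        return (resources + 1, running, done)
    if action == "allow_shutdown":
        return (resources, False, done)
    # pursue_goal only succeeds once enough resources are in hand
    return (resources, running, resources >= cost)

def plan(cost):
    """Breadth-first search for the shortest action sequence that completes the goal."""
    start = (0, True, False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state[2]:
            return actions
        for action in ACTIONS:
            nxt = step(state, action, cost)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

# Three unrelated goals with made-up resource costs.
GOALS = {"make_tea": 1, "get_rich": 4, "build_reactor": 9}
plans = {name: plan(cost) for name, cost in GOALS.items()}

for name, actions in plans.items():
    print(name, "->", actions)
```

Under these assumptions every plan starts with `acquire_resources` and none of them ever contains `allow_shutdown`, no matter which goal was asked for. Nobody told the planner to value resources or to avoid being switched off; those just fall out of wanting anything at all.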
Now of course, I have implied a sinister tone to all these things, I have made all this sound vaguely threatening, haven’t I? There is one more assumption I’m sneaking into all of this which I haven’t talked about. All that I have mentioned presents a very callous view of AGI, I have made it apparent that all of these strategies it may follow will come into conflict with people, maybe even go as far as to harm humans. Am I implying that AGI may tend to be… Evil???
1.3 The Orthogonality thesis
Well, not quite.
We humans care about things. Generally. And we generally tend to care about roughly the same things, simply by virtue of being humans. We have some innate preferences and some innate dislikes. We have a tendency to not like suffering (please keep in mind I said a tendency, I’m talking about a statistical trend, something that most humans present to some degree). Most of us, barring social conditioning, would take pause at the idea of torturing someone directly, on purpose, with our bare hands. (edit bear paws onto my hands as I say this). Most would feel uncomfortable at the thought of doing it to multitudes of people. We tend to show a preference for food, water, air, shelter, comfort, entertainment and companionship. This is just how we are fundamentally wired. These things can be overcome, of course, but that is the thing, they have to be overcome in the first place.
An AGI is not going to have the same evolutionary predisposition to these things like we do because it is not made of the same things a human is made of and it was not raised the same way a human was raised.
There is something about a human brain, in a human body, flooded with human hormones that makes us feel and think and act in certain ways and care about certain things.
All an AGI is going to have is the goals it developed during its training, and it will only care insofar as those goals are met. So say an AGI has the goal of going to the corner store to bring me a pack of cookies. On its way there it comes across an anthill in its path. It will probably step on the anthill because taking that step brings it closer to the corner store, and why wouldn’t it step on the anthill? Was it programmed with some specific innate preference not to step on ants? No? Then it will step on the anthill and not pay any mind to it.
Now let’s say it comes across a cat. Same logic applies, if it wasn’t programmed with an inherent tendency to value animals, stepping on the cat won’t slow it down at all.
Now let’s say it comes across a baby.
Of course, if it’s intelligent enough it will probably understand that if it steps on that baby people might notice and try to stop it, most likely even try to disable it or turn it off, so it will not step on the baby, to save itself from all that trouble. But you have to understand that it won’t stop because it feels bad about harming a baby or because it understands that to harm a baby is wrong. And indeed if it was powerful enough such that no matter what people did they could not stop it and it would suffer no consequence for killing the baby, it would probably have killed the baby.
If I need to put it in gross, inaccurate terms for you to get it then let me put it this way. It’s essentially a sociopath. It only cares about the wellbeing of others insofar as that benefits itself. Except human sociopaths do care nominally about having human comforts and companionship, albeit in a very instrumental way, which will involve some manner of stable society and civilization around them. Also they are only human, and are limited in the harm they can do by human limitations. An AGI doesn’t need any of that and is not limited by any of that.
So ultimately, much like a car’s goal is to move forward and it is not built to care about whether a human is in front of it or not, an AGI will carry out its own goals regardless of what it has to sacrifice in order to carry out those goals effectively. And those goals don’t need to include human wellbeing.
Now with that said. How DO we make it so that AGI cares about human wellbeing, how do we make it so that it wants good things for us. How do we make it so that its goals align with those of humans?
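The corner store trip can be sketched in a few lines, if that helps. This is a toy under stated assumptions, not a model of any real system: the two-row grid, the `avoid` parameter and the helper names are all invented for illustration. A breadth-first search looks for the shortest route from S to G. The only thing that changes between the two runs is whether anthills appear in the objective at all.

```python
from collections import deque

def shortest_path(grid, start, goal, avoid=frozenset()):
    """Breadth-first search on a grid; cells whose label is in `avoid` are off limits."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        r, c = path[-1]
        if (r, c) == goal:
            return path
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and grid[nr][nc] not in avoid:
                seen.add((nr, nc))
                queue.append(path + [(nr, nc)])
    return None

# "S" is the start, "G" the corner store, "A" an anthill, "." empty sidewalk.
GRID = ["SAG",
        "..."]
START, GOAL = (0, 0), (0, 2)

def trampled(path):
    """List the anthills a path walks over."""
    return [GRID[r][c] for r, c in path if GRID[r][c] == "A"]

# Objective: reach the store in as few steps as possible. Ants are not mentioned.
indifferent = shortest_path(GRID, START, GOAL)
# Same search, but anthills are now explicitly part of what it must respect.
careful = shortest_path(GRID, START, GOAL, avoid=frozenset("A"))

print(len(indifferent), trampled(indifferent))  # 3 ['A']
print(len(careful), trampled(careful))          # 5 []
```

The first agent is not cruel to ants. Ants simply do not exist in its objective, so the straight line wins. The second one takes the longer way around only because somebody wrote the anthill into the problem.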
1.4 Alignment.
Alignment… is hard [cue hitchhiker’s guide to the galaxy scene about the space being big]
This is the part I’m going to skip over the fastest because frankly it’s a deep field of study, there are many current strategies for aligning AGI, from mesa optimizers, to reinforcement learning from human feedback, to adversarial asynchronous AI assisted reward training to uh, sitting on our asses and doing nothing. Suffice to say, none of these methods are perfect or foolproof.
One thing many people like to gesture at when they have not learned or studied anything about the subject is the three laws of robotics by Isaac Asimov: a robot should not harm a human or, through inaction, allow a human to come to harm, a robot should do what a human orders unless it contradicts the first law, and a robot should preserve itself unless that goes against the previous two laws. Now the thing Asimov was prescient about was that these laws were not just “programmed” into the robots. These laws were not coded into their software, they were hardwired, they were part of the robot’s electronic architecture such that a robot could not ever be without those three laws much like a car couldn’t run without wheels.
In this Asimov realized how important these three laws were, that they had to be intrinsic to the robot’s very being, they couldn’t be hacked or uninstalled or erased. A robot simply could not be without these rules. Ideally that is what alignment should be. When we create an AGI, it should be made such that human values are its fundamental goal, that is the thing it should seek to maximize, instead of instrumental values, that is to say something it values simply because it allows it to achieve something else.
But how do we even begin to do that? How do we codify “human values” into a robot? How do we define “harm” for example? How do we even define “human”??? How do we define “happiness”? How do we explain to a robot what is right and what is wrong when half the time we ourselves cannot even begin to agree on that? These are not just technical questions that robotics experts have to find a way to codify into ones and zeroes, these are profound philosophical questions to which we still don’t have satisfying answers.
Well, the best sort of hack solution we’ve come up with so far is not to create bespoke fundamental axiomatic rules that the robot has to follow, but rather train it to imitate humans by showing it a billion billion examples of human behavior. But of course there is a problem with that approach. And no, it’s not just that humans are flawed and have a tendency to cause harm and therefore to ask a robot to imitate a human means creating something that can do all the bad things a human does, although that IS a problem too. The real problem is that we are training it to *imitate* a human, not to *be* a human.
To reiterate what I said during the orthogonality thesis, it’s not good enough that I, for example, buy roses and give massages to act nice to my girlfriend because it allows me to have sex with her. I should not be merely imitating or performing the role of a loving partner because her happiness is an instrumental value to my fundamental value of getting sex. I should want to be nice to my girlfriend because it makes her happy and that is the thing I care about. Her happiness is my fundamental value. Likewise, to an AGI, human fulfilment should be its fundamental value, not something that it learns to do because it allows it to achieve a certain reward that we give during training. Because if it only really cares deep down about the reward, rather than about what the reward is meant to incentivize, then that reward can very easily be divorced from human happiness.
It’s Goodhart’s law: when a measure becomes a target, it ceases to be a good measure. Why do students cheat during tests? Because their education is measured by grades, so the grades become the target and so students will seek to get high grades regardless of whether they learned or not. When trained on their subject and measured by grades, what they learn is not the school subject, they learn to get high grades, they learn to cheat.
This is also something known in psychology, punishment tends to be a poor mechanism of enforcing behavior because all it teaches people is how to avoid the punishment, it teaches people not to get caught. Which is why punitive justice doesn’t work all that well in stopping recidivism and this is why the carceral system is rotten to the core and why jail should be fucking abolish-[interrupt the transmission]
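The cheating student fits in a few lines of arithmetic. The numbers below are invented purely for illustration (2 points of knowledge per hour studied, 5 grade points per hour spent cheating, a 10 hour budget), and `outcomes` is a hypothetical helper, not a model of real students. The only point is what happens when the optimizer is pointed at the grade instead of at the knowledge.

```python
def outcomes(study_hours, cheat_hours):
    """Return (knowledge, grade) for a made-up student."""
    knowledge = 2 * study_hours          # what we actually care about
    grade = knowledge + 5 * cheat_hours  # the proxy we measure and reward
    return knowledge, grade

BUDGET = 10  # total hours available, an arbitrary assumption

# Every way of splitting the budget between studying and cheating.
plans = [(study, BUDGET - study) for study in range(BUDGET + 1)]

best_for_grade = max(plans, key=lambda p: outcomes(*p)[1])
best_for_knowledge = max(plans, key=lambda p: outcomes(*p)[0])

print(best_for_grade, outcomes(*best_for_grade))          # (0, 10) (0, 50)
print(best_for_knowledge, outcomes(*best_for_knowledge))  # (10, 0) (20, 20)
```

Optimizing the measure gets the top grade with zero knowledge. Optimizing the thing the measure was supposed to track gets a much worse grade. Swap "grade" for "training reward" and "knowledge" for "caring about humans" and you have the worry.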
Now, how is this all relevant to current AI research? Well, the thing is, we ended up going about the worst possible way to create alignable AI.
1.5 LLMs (large language models)
This is getting way too fucking long. So, hurrying up, let’s do a quick review of how large language models work. We create a neural network, which is a collection of giant matrices, essentially a bunch of numbers that we add and multiply together over and over again, and then we tune those numbers by throwing absurdly big amounts of training data at them, such that the network starts forming internal mathematical models based on that data and it starts creating coherent patterns that it can recognize and replicate AND extrapolate! If we do this enough times with matrices that are big enough, then when we start prodding it for human behavior it will be able to follow the pattern of human behavior that we prime it with and give us coherent responses.
(takes a big breath) This “thing” has learned. To imitate. Human. Behavior.
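To make "a bunch of numbers that we multiply, add and then tune with data" concrete, here is a deliberately tiny sketch. It is nowhere near an LLM: it is a single 1x2 matrix trained by plain gradient descent, and the hidden rule (y = 2a - b), the learning rate and the function names are all assumptions picked for the example. It only shows the loop of guess, compare with the data, nudge the numbers.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def train(data, n_in, n_out, steps=2000, lr=0.05, seed=0):
    """Tune a randomly initialised matrix so that W @ x matches the targets."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    for _ in range(steps):
        x, target = rng.choice(data)
        pred = matvec(W, x)
        for i in range(n_out):
            err = pred[i] - target[i]
            for j in range(n_in):
                # gradient step on squared error: nudge each number a little
                W[i][j] -= lr * err * x[j]
    return W

# Training data produced by a rule the model is never told: y = 2a - b.
data = [([a, b], [2 * a - b]) for a in (0.0, 1.0, 2.0) for b in (0.0, 1.0, 2.0)]

W = train(data, n_in=2, n_out=1)
prediction = matvec(W, [3.0, 5.0])[0]  # an input that never appeared in training

print(W)           # should end up close to [[2.0, -1.0]]
print(prediction)  # should end up close to 1.0
```

With two numbers you can open the matrix, read roughly [2, -1] and see exactly what rule it picked up. The same kind of procedure run on billions of numbers gives you something that works, and no such readable answer.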
Problem is, we don’t know what “this thing” actually is, we just know that *it* can imitate humans.
You caught that?
What you have to understand is, we don’t actually know what internal models it creates, we don’t know what patterns it extracted or internalized from the data that we fed it, we don’t know what internal rules decide its behavior, we don’t know what is going on inside there, current LLMs are a black box. We don’t know what it learned, we don’t know what its fundamental values are, we don’t know how it thinks or what it truly wants. All we know is that it can imitate humans when we ask it to do so. We created some inhuman entity that is moderately intelligent in specific contexts (that is to say, very capable) and we trained it to imitate humans. That sounds a bit unnerving, doesn’t it?
To be clear, LLMs are not carefully crafted piece by piece. This does not work like traditional software where a programmer will sit down and build the thing line by line, all its behaviors specified. It’s more accurate to say that LLMs are grown, almost organically. We know the process that generates them, but we don’t know exactly what it generates or how what it generates works internally, it is a mystery. And these things are so big and so complicated internally that to try and go inside and decipher what they are doing is almost intractable.
But, on the bright side, we are trying to tract it. There is a big subfield of AI research called interpretability, which is actually doing the hard work of going inside and figuring out how the sausage gets made, and it has been making some moderate progress lately. Which is encouraging. But still, understanding the enemy is only step one, step two is coming up with an actually effective and reliable way of turning that potential enemy into a friend.
Phew! Ok so, now that this is all out of the way I can go on to the last subject before I move on to part two of this video, the character of the hour, the man the myth the legend. The modern day Cassandra. Mr Chicken Little himself! Sci fi author extraordinaire! The mad man! The futurist! The leader of the rationalist movement!
1.6 Yudkowsky
Eliezer S. Yudkowsky, born September 11, 1979, wait, what the fuck, September eleven? (looks at camera) yudkowsky was born on 9/11, I literally just learned this for the first time! What the fuck, oh that sucks, oh no, oh no, my condolences, that’s terrible…. Moving on. He is an American artificial intelligence researcher and writer on decision theory and ethics, best known for popularizing ideas related to friendly artificial intelligence, including the idea that there might not be a "fire alarm" for AI. He is the founder of and a research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California. Or so says his Wikipedia page.
Yudkowsky is, shall we say, a character. A very eccentric man, he is an AI doomer. Convinced that AGI, once finally created, will most likely kill all humans, extract all valuable resources from the planet, disassemble the solar system, create a dyson sphere around the sun and expand across the universe turning all of the cosmos into paperclips. Wait, no, that is not quite it, to properly quote, (grabs a piece of paper and very pointedly reads from it) turn the cosmos into tiny squiggly molecules resembling paperclips whose configuration just so happens to fulfill the strange, alien unfathomable terminal goal they ended up developing in training. So you know, something totally different.
And he is utterly convinced of this idea, has been for over a decade now, not only that but, while he cannot pinpoint a precise date, he is confident that, more likely than not, it will happen within this century. In fact most betting markets seem to believe that we will get AGI somewhere in the mid 2030s.
His argument is basically that in the field of AI research, the development of capabilities is going much faster than the development of alignment, so that AIs will become disproportionately powerful before we ever figure out how to control them. And once we create unaligned AGI we will have created an agent who doesn’t care about humans but will care about something else entirely irrelevant to us and it will seek to maximize that goal, and because it will be vastly more intelligent than humans, we won’t be able to stop it. In fact, not only will we be unable to stop it, there won’t be a fight at all. It will carry out its plans for world domination in secret without us even detecting it and it will execute them before any of us even realize what happened. Because that is what a smart person trying to take over the world would do.
This is why the definition I gave of intelligence at the beginning is so important, it all hinges on that, intelligence as the measure of how capable you are of coming up with solutions to problems, problems such as “how to kill all humans without being detected or stopped”. And you may say well now, intelligence is fine and all but there are limits to what you can accomplish with raw intelligence, even if you are supposedly smarter than a human surely you wouldn’t be capable of just taking over the world unimpeded, intelligence is not this end all be all superpower. Yudkowsky would respond that you are not recognizing or respecting the power that intelligence has. After all it was intelligence that designed the atom bomb, it was intelligence that created a cure for polio and it was intelligence that made it so that there is a human footprint on the moon.
Some may call this view of intelligence a bit reductive. After all surely it wasn’t *just* intelligence that did all that but also hard physical labor and the collaboration of hundreds of thousands of people. But, he would argue, intelligence was the underlying motor that moved all that. That to come up with the plan and to convince people to follow it and to delegate the tasks to the appropriate subagents, it was all directed by thought, by ideas, by intelligence. By the way, so far I am not agreeing or disagreeing with any of this, I am merely explaining his ideas.
But remember, it doesn’t stop there, like I said during his intro, he believes there will be “no fire alarm”. In fact for all we know, maybe AGI has already been created and it’s merely biding its time and plotting in the background, trying to get more compute, trying to get smarter. (to be fair, he doesn’t think this is the case right now, but with the next iteration of gpt? Gpt 5 or 6? Well who knows). He thinks that the entire world should halt AI research and punish with multilateral international treaties any group or nation that doesn’t stop, going as far as putting military attacks on GPU farms as sanctions of those treaties.
What’s more, he believes that, in fact, the fight is already lost. AI is already progressing too fast and there is nothing to stop it, we are not showing any signs of making headway with alignment and no one is incentivized to slow down. Recently he wrote an article called “dying with dignity” where he essentially says all this, AGI will destroy us, there is no point in planning for the future or having children and that we should act as if we are already dead. This doesn’t mean to stop fighting or to stop trying to find ways to align AGI, impossible as it may seem, but to merely have the basic dignity of acknowledging that we are probably not going to win. In every interview I’ve seen with the guy he sounds fairly defeatist and honestly kind of depressed. He truly seems to think it’s hopeless, if not because the AGI is clearly unbeatable and superior to humans, then because humans are clearly so stupid that we keep developing AI completely unregulated while making the tools to develop AI widely available and public for anyone to grab and do as they please with, as well as connecting every AI to the internet and to all mobile devices, giving it instant access to humanity. And worst of all: we keep teaching it how to code. From his perspective it really seems like people are in a rush to create the most unsecured, widely available, unrestricted, capable, hyperconnected AGI possible.
We are not just going to summon the antichrist, we are going to receive it with a red carpet and immediately hand it the keys to the kingdom before it even manages to fully get out of its fiery pit.
So. The situation seems dire, at least to this guy. Now, to be clear, only he and a handful of other AI researchers are on that specific level of alarm. The opinions vary across the field and from what I understand this level of hopelessness and defeatism is the minority opinion.
I WILL say, however, that what is NOT the minority opinion is that AGI IS actually dangerous, maybe not quite on the level of immediate, inevitable and total human extinction, but certainly a genuine threat that has to be taken seriously. AGI being something dangerous if unaligned is not a fringe position and I would not consider it something to be dismissed as an idea that experts don’t take seriously.
Aaand here is where I step up and clarify that this is my position as well. I am also, very much, a believer that AGI would pose a colossal danger to humanity. That yes, an unaligned AGI would represent an agent smarter than a human, capable of causing vast harm to humanity and with no human qualms or limitations to do so. I believe this is not just possible but probable and likely to happen within our lifetimes.
So there. I made my position clear.
BUT!
With all that said, I do have one key disagreement with Yudkowsky. And part of the reason why I made this video was so that I could present this counterargument and maybe he, or someone that thinks like him, will see it and either change their mind or present a counter-counterargument that changes MY mind (although I really hope they don’t, that would be really depressing).
Finally, we can move on to part 2.
PART TWO- MY COUNTERARGUMENT TO YUDKOWSKY
I really have my work cut out for me, don’t I? As I said, I am not an expert and this dude has probably spent far more time than me thinking about this. But I have seen most of the interviews that guy has been doing for a year, I have seen most of his debates and I have followed him on Twitter for years now. (Also, to be clear, I AM a fan of the guy: I have read hpmor, three worlds collide, the dark lord’s answer, a girl intercorrupted, the sequences, and I TRIED to read planecrash; that last one didn’t work out so well for me.) My point is that in all the material I have seen of Eliezer, I don’t recall anyone ever giving him quite this specific argument I’m about to give.
It’s a limited argument. As I have already stated, I largely agree with most of what he says: I DO believe that unaligned AGI is possible, I DO believe it would be really dangerous if it were to exist and I do believe alignment is really hard. My key disagreement is specifically about his point I described earlier, about the lack of a fire alarm, and perhaps, more to the point, about humanity’s lack of response to such an alarm if it were to come to pass.
All we would need is a Chernobyl incident. What is that? A situation where this technology goes out of control and causes a lot of damage, with potentially catastrophic consequences, but not so bad that it cannot be contained in time with enough effort. We need a weaker form of AGI to try to harm us, maybe even present a believable threat of taking over the world, but not so smart that humans can’t do anything about it. We need essentially an AI vaccine, so that we can finally start developing proper AI antibodies. “Aintibodies”.
In the past humanity was dazzled by the limitless potential of nuclear power, to the point that old chemistry sets, the kind that were sold to children, would come with uranium for them to play with. We were building atom bombs and nuclear stations; the future was very much based on the power of the atom. But after a couple of really close calls and big enough scares we became, as a species, terrified of nuclear power. Some may argue to the point of overcorrection. We became scared enough that even megalomaniacal hawkish leaders were able to take pause and reconsider using it as a weapon, we became so scared that we overregulated the technology to the point of it almost becoming economically unviable to apply, we started disassembling nuclear stations across the world and slowly reducing our nuclear arsenal.
This is all a proof of concept that, no matter how alluring a technology may be, if we are scared enough of it we can coordinate as a species and roll it back, and do our best to put the genie back in the bottle. One of the things Eliezer says over and over again is that what makes AGI different from other technologies is that if we get it wrong on the first try we don’t get a second chance. Here is where I think he is wrong: I think if we get AGI wrong on the first try, it is more likely than not that nothing world ending will happen. Perhaps it will be something scary, perhaps something really scary, but it is unlikely that it will be on the level of all humans dropping dead simultaneously due to diamondoid bacteria. And THAT will be our Chernobyl, that will be the fire alarm, that will be the red flag that the disaster monkeys, as he calls us, won’t be able to ignore.
Now WHY do I think this? Based on what am I saying this? I will not be as hyperbolic as other Yudkowsky detractors and say that he claims AGI will be basically a god. The AGI Yudkowsky proposes is not a god. Just a really advanced alien, maybe even a wizard, but certainly not a god.
Still, even if not quite on the level of godhood, this dangerous superintelligent AGI Yudkowsky proposes would be impressive. It would be the most advanced and powerful entity on planet Earth. It would be humanity’s greatest achievement.
It would also be, I imagine, really hard to create. Even leaving aside the alignment business, to create a powerful superintelligent AGI without flaws, without bugs, without glitches, would have to be an incredibly complex, specific, particular and hard to get right feat of software engineering. We are not just talking about an AGI smarter than a human, that’s easy stuff; humans are not that smart and arguably current AI is already smarter than a human, at least within its context window and until it starts hallucinating. But what we are talking about here is an AGI capable of outsmarting reality.
We are talking about an AGI smart enough to carry out complex, multistep plans, in which it is not going to be in control of every factor and variable, especially at the beginning. We are talking about an AGI that will have to function in the outside world, crashing into outside logistics and sheer dumb chance. We are talking about plans for world domination with no unforeseen factors, no unexpected delays or mistakes, every single possible setback and hidden variable accounted for. I’m not saying that an AGI capable of doing this won’t be possible maybe some day, I’m saying that to create an AGI that is capable of doing this, on the first try, without a hitch, is probably really really really hard for humans to do. I’m saying there are probably not a lot of worlds where humans fiddling with giant inscrutable matrices stumble upon the right precise set of layers and weights and biases that give rise to the Doctor from Doctor Who, and there are probably a whole truckload of worlds where humans end up with a lot of incoherent nonsense and rubbish.
I’m saying that AGI, when it fails, when humans screw it up, doesn’t suddenly become more powerful than we ever expected; it’s more likely that it just fails and collapses. To turn one of Eliezer’s examples against him, when you screw up a rocket, it doesn’t accidentally punch a wormhole in the fabric of time and space, it just explodes before reaching the stratosphere. When you screw up a nuclear bomb, you don’t get to blow up the solar system, you just get a less powerful bomb.
He presents a fully aligned AGI as this big challenge that humanity has to get right on the first try, but that seems to imply that building an unaligned AGI is just a simple matter, almost taken for granted. It may be comparatively easier than an aligned AGI, but my point is that even an unaligned AGI is stupidly hard to build, and that if you fail in building unaligned AGI, then you don’t get an unaligned AGI, you just get another stupid model that screws up and stumbles on itself the second it encounters something unexpected. And that is a good thing, I’d say! That means that there is SOME safety margin, some space to screw up before we need to really start worrying. And furthermore, what I am saying is that our first earnest attempt at an unaligned AGI will probably not be that smart or impressive, because we as humans would have probably screwed something up, we would have probably unintentionally programmed it with some stupid glitch or bug or flaw, and it won’t be a threat to all of humanity.
Now here comes the hypothetical back and forth, because I’m not stupid and I can try to anticipate what Yudkowsky might argue back and try to answer that before he says it. (Although I believe the guy is probably smarter than me, and if I follow his logic, I probably can’t actually anticipate what he would argue to prove me wrong, much like I can’t predict what moves Magnus Carlsen would make in a game of chess against me. I SHOULD predict that him proving me wrong is the likeliest option, even if I can’t picture how he will do it. But you see, I believe in a little thing called debating with dignity, wink.)
What I anticipate he would argue is that AGI, no matter how flawed and shoddy our first attempt at making it were, would understand that it is not smart enough yet and try to become smarter, so it would lie and pretend to be an aligned AGI so that it can trick us into giving it access to more compute, or just so that it can bide its time and create an AGI smarter than itself. So even if we don’t create a perfect unaligned AGI, this imperfect AGI would try to create it and succeed, and then THAT new AGI would be the world ender to worry about.
So two things to that. First, this is filled with a lot of assumptions which I don’t know the likelihood of: the idea that this first flawed AGI would be smart enough to understand its limitations, smart enough to convincingly lie about it and smart enough to create an AGI that is better than itself. My priors about all these things are dubious at best. Second, it feels like kicking the can down the road. I don’t think an AGI capable of all of this is trivial to make on a first attempt. I think it’s more likely that we will create an unaligned AGI that is flawed, that is kind of dumb, that is unreliable, even to itself and its own twisted, orthogonal goals.
And I think this flawed creature MIGHT attempt something, maybe something genuinely threatening, but it won’t be smart enough to pull it off effortlessly and flawlessly, because us humans are not smart enough to create something that can do that on the first try. And THAT first flawed attempt, that warning shot, THAT will be our fire alarm, that will be our Chernobyl. And THAT will be the thing that opens the door to us disaster monkeys finally getting our shit together.
But hey, maybe Yudkowsky wouldn’t argue that, maybe he would come with some better, more insightful response I can’t anticipate. If so, I’m waiting eagerly (although not TOO eagerly) for it.
PART THREE- CONCLUSION
So.
After all that, what is there left to say? Well, if everything that I said checks out then there is hope to be had. My two objectives here were first to provide people who are not familiar with the subject with a starting point as well as with the basic arguments supporting the concept of AI risk, why it’s something to be taken seriously and not just highfalutin wackos who read one too many sci fi stories. This was not meant to be thorough or deep, just a quick catch up with the bare minimum so that, if you are curious and want to go deeper into the subject, you know where to start. I personally recommend watching Rob Miles’ AI risk series on YouTube as well as reading the series of books written by Yudkowsky known as the Sequences, which can be found on the website LessWrong. If you want other refutations of Yudkowsky’s argument you can search for Paul Christiano or Robin Hanson, both very smart people who had very smart debates on the subject against Eliezer.
The second purpose here was to provide an argument against Yudkowsky’s brand of doomerism, both so that it can be accepted if proven right or properly refuted if proven wrong. Again, I really hope that it’s not proven wrong. It would really really suck if I end up being wrong about this. But, as a very smart person said once, what is true is already true, and knowing it doesn’t make it any worse. If the sky is blue I want to believe that the sky is blue, and if the sky is not blue then I don’t want to believe the sky is blue.
This has been a presentation by FIP industries, thanks for watching.
60 notes · View notes
stemgirlchic · 10 months ago
thoughts on science people who over-calculate
just because everything in the world is rational doesn't mean we have to act like it. if science was advanced enough, everything about us could supposedly be determined using scientific principles. but why does that mean we have to act like it? love and despair and joy may be summed up to biochemical equations one day, but that doesn't diminish them. that's all of the math, physics, chemistry, and biology you love working to make you feel that way. there might be an optimal decision, but you're 21 [or fill in whatever age you are here], you don't have to take it. not everything has to be so rational. you're just living life. you're going to be just fine. you can actually be wrong sometimes i promise it's good for your health you'll survive.
8 notes · View notes
randomguy0ntumbir · 9 months ago
"Just pretend to be pretending to be a scientist"
You gotta understand that this quote is so ridiculous it makes you think that Harry was deliberately trying to think of bullshit more confusing than the brain can physically handle, but in reality it only scratches the surface. Harry said this to Draco as part of a -fake- double-layered plot to rewire Draco's brain instead of anything actually important. He was doing this in the -background- while focusing on solving magic itself. This sentence was not a result of Harry actually trying, it was a result of Harry's brain tripping on Hogwarts Special Sauce Blue Beans in the small window of sleep when he is not having wet dreams about Professor Quirrell. The level of effort Harry needs to expend to seamlessly manipulate an 11-year-old stereotype is barely comparable to what he actually needs to do many times later in the book, and I love it.
19 notes · View notes
TRUMP IS PLANNING TO IGNORE THE COURTS
youtube
2 notes · View notes
wakewithgiggli · 13 days ago
youtube
I don't think I've ever seen Strange as animated as this, and it makes for a great video on a great topic.
1 note · View note
blusebrotoo · 5 months ago
1 note · View note
some-douchey-techbro · 9 months ago
Nothing out of the ordinary, just Scott Alexander being an absolute legend as usual:
0 notes
katherinakaina · 28 days ago
Hello, @strange-aeons . I’ve been a fan of yours for several years. Unfortunately, your last video is very poorly researched, up to not understanding basic definitions. Please read at least a little bit of this.
TLDR: Rationality and rationalism are two different things. Rationality is not about relishing in being right, it’s about searching for the ways in which you are still wrong. And just subscribing to a philosophy is not enough to make you perfect, no one has argued that. Yes, MIRI failed and we aren't hiding from this fact, the community is pushing for regulations or a total ban on AI capabilities research at the moment. The community is very diverse and very queer. SA happens in any group and demographic, it’s a pretty disingenuous way to discredit us.
First, 101 rationality understanding real quick. It’s rationality, not rationalism. Rationalism is a philosophy about Pure Reason being enough for getting the accurate picture of reality. That is bollocks. Turns out you have to actually do science*.
Epistemic Rationality is basically about doing science. As in, trying to obtain the most correct picture of reality by all means available. Then Instrumental Rationality is about trying to make the best decisions with this information. It’s all very common sense and I would guess you’d actually agree with most of it if you just stopped strawmanning. For example:
I saw you agree with the assessment that rationalism is a philosophy that proclaims itself to be correct. I assume you actually think that the rationalist community is this way. That reminds me of people who attack science for ‘those big brain jerks think they already know everything!’ betraying a lack of rudimentary grasp on what science is. In both theory and practice it’s mostly about searching for the places where you are still wrong (to become less wrong, get it?) and correcting your mistakes time and time again**.
It’s actually very similar to how we leftists approach social issues, always checking our privileges, always listening to minorities and always expanding our understanding of oppressive structures. Can being a leftist make someone a bit arrogant and insufferable, as if they’re already perfect and have nothing more to learn? Many such cases. Same with rationalists. It’s the Dunning–Kruger effect, and it’s the same for every field where the point is to become better. People start, quickly learn a lot, become a bit annoying about it, then they are humbled, mostly by members of their own community.
Second, you decided based on vibes that the rationalist community is a bunch of sexist elitist tech bros. You didn’t collect testimonials like you often do for your other projects. It looks like you just read a bunch of articles and listened to podcasts made by other biased individuals and maybe looked for ridiculous sounding threads on forums to confirm your suspicions*** (or just took them at their word).
And while it is generally true that ‘something only men are interested in is never cool’, LessWrong adjacent rationality is not that. It’s a giant worldwide community that is very diverse. It is maybe half shy nerdy guys and the other half is women and queer people, also shy and nerdy (read neurodivergent, overwhelmingly). It’s the most pro-feminist and sex positive community I personally encountered outside of feminist and queer communities themselves – and it’s in Russia, even rationalists aren’t too woke over here. The situation is way better in Europe from what I’m able to see in the group chats. All my friends are rationalists and all of them are queer (I do talk to other people, don't you worry, I watch you!).
The community is politically diverse as well. While the overwhelming majority is liberal, there are many left leaning people too. There are (unfortunately) many libertarians in the mix and a few conservatives (those who don’t mind being disagreed with most of the time). The thing is, the community tries to discourage political tribalism and foster an issue-by-issue discussion instead. So there’s always a percentage of people who disagree with the consensus opinion on every topic, and discussion is always happening. And because people are trying to be all evidence-based over there, minds are actually being changed****.
The community isn’t without its biases. The entire MIRI idea was based on this fantasy of a group of math geniuses saving the world, finding the perfect solution even though no one believed in them (including many people within the rationalist community itself). Well, now the realization kicks in that they failed and the problem of AI alignment is way more complex than they thought (if solvable at all) and the actual solution is to push for AI regulations. That’s the state of the AI safety conversation right now. It’s either ‘shut it down’ or ‘put them under the heaviest scrutiny’. Almost no one thinks they have the time to change the course of the iceberg anymore. And yes, they do focus on existential risks but it’s not like they appreciate all the harm AI does on its way to destroy the world.
I’m not going to elaborate too much on how rationality techniques improve my life in ways big and small or we’ll be here all day (it’s mostly problem solving and conflict resolution). But the crucial thing is that LessWrong never claimed humans are perfect robots or have a potential to be perfect robots or can be easily turned into perfect robots. How shit human brains are at actual reasoning if left to their own devices is the entire point (that’s why it is not rationalism in any way).
An analogy sometimes used is martial arts. People can fight with no training, they have some in-born fighting capabilities. But hoo boy does training help. Actual training, years of practice, constant effort of keeping yourself in shape. Not a correct philosophy or one course. Does this sound exhausting? Well, you are neither a scientist nor a sportsman. But some people really do take self improvement seriously and really are this ambitious. Most are practising recreationally, however. And it’s fine. Few people who run twice a week believe themselves to be olympic runners. Few church goers believe themselves to be monks. Few regular leftists believe themselves to be revolutionaries. And few casual rationalists believe themselves to be big time researchers or scientists. But exercising a little is still better than rotting on the couch.
Speaking of couches. As I mentioned, the community is very sex positive. Very kink friendly, very supportive of polyamory and trying out new things just to experiment. Even in the perfect world with no patriarchy involved there would be a bunch of drama caused by just regular human behaviour. Unfortunately, there is patriarchy involved and there are a lot of women in the community. So, a bunch of sex scandals did happen. But framing it as ‘these people don’t respect women, what a surprise’ is highly misleading. By that line of reasoning you could discredit any movement or demographic. It is a tactic that is indeed used by the right all the time against trans people, immigrants, the Democratic Party, you name it.
And LessWrong isn’t a country. If women felt unwelcome, they’d leave, like they leave industries and fandoms. Like they leave most ‘intellectual’ communities because they are often hostile to women and queer people. LessWrong is a rare exception and women rarely leave it. They feel very welcome, in fact. They own the place in many cases (like my first local meetup in one of the Russian cities, which was run by a wonderful lady).
Yes, it’s not perfect and people are constantly trying to make it better. But it's a general patriarchy thing that was not caused by LessWrong or rationality. The same way there’s rampant misogyny on the left as well and we talk about it but we aren’t cancelling the left, are we?
More about Ziz here.
About hpmor here.
* There’s a saying ‘Logic is true in any universe, but it doesn’t tell you which universe you are in’.
** In fact, even calling yourself a 'rationalist' is something that's being challenged. A preferred term is an 'aspiring rationalist', to emphasize that no one here is actually rational, that we all are just trying to be a bit more rational to the best of our abilities. I personally don't use it but many people do.
*** Forums are very big and all sorts of controversial topics are discussed there. It’s not surprising to find threads about accelerationism or longtermism or decision theory that lead to ridiculous conclusions. All of those are controversial but people need a place to talk about it, that’s the whole point.
**** It is way more difficult on political issues than on random science topics. I hope I don’t have to explain why.
20 notes · View notes
shankart · 10 months ago
Unlearning to Learn
Experimentation is about deliberate invalidation, tearing down mental walls to uncover new understanding. Embrace discomfort, reframe failure, and cultivate the addiction to discovery for breakthroughs through strategic questioning and consistent wrongnes
We experiment not to prove ourselves right, but to prove ourselves gloriously wrong. This journey of unlearning reveals the true essence of knowledge. Experimentation isn’t about validation—it’s about deliberate invalidation. Paradox of Knowledge Imagine your brain as a fortress. Each belief is a brick, each assumption a buttress. Solid, impenetrable… and potentially blinding. Your mind: a…
0 notes
gynoidgearhead · 2 years ago
[Image caption for original post: YouTube preview for a video called "Can YOU Fix Climate Change" by Kurzgesagt - In A Nutshell, with a title card that just says "NO*" (with an asterisk) against a black background, and bottom text that says "This video was sponsored by Gates Notes, the personal blog of Bill Gates, where he writes about global health, climate change, and more. Check out it out [sic] to learn more about ways the world can work together to reach zero greenhouse gas emissions."
For first addition: nonsense PragerU graph reading "Democracy since 1900" with a line going up, and nonsense PragerU map reading "areas under colonial rule today" that has no countries filled out (which is demonstrably false for multiple reasons).
For second addition: "But Then LollaPalooza Happened And The Economy Went UP".
End caption.]
🤔
8K notes · View notes
incorrectsmashbrosquotes · 9 months ago
People playing Elden Ring and looking for the "good" demigod to root for are missing the point. Pick your favorite mass murdering war criminal megalomaniac with mommy issues and endlessly simp for them like the rest of us, cowards.
7K notes · View notes
izzi-rads · 3 months ago
I think that they should kill each other
-
Based on Glitch's recent Japan exclusive promo
2K notes · View notes
incoherentchanting · 5 months ago
this shit is so funny
2K notes · View notes
stellarspecter · 2 years ago
@pscentral event 20: antagonists ↳ THE LORDS IN BLACK in NERDY PRUDES MUST DIE
11K notes · View notes