Text
In Which Archimedes is Abducted by Aliens
So in honor of pi day (it's always pi day somewhere, probably), let's talk about Archimedes' famous upper bound of pi of twenty-two sevenths and his less famous lower bound of two hundred and twenty-three seventy-firsts. These came from his use of two enneacontahexagons, one on the outside, one on the inside. However, what if you don't have access to geometry, what if you only know pi as the lowest strictly positive zero of the sine function (yes I have been on a certain website)? Well, then, you could approximate Archimedes' method by showing that that upper bound is greater than ninety-six times the tangent of a ninety-sixth of pi, and that that lower bound is less than ninety-six times the sine of a ninety-sixth of pi. But how do you establish that those two are on opposite sides of pi?
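(A quick numerical sanity check of that ordering, in Python rather than in geometry or in Greek - obviously circular, since it leans on a floating-point pi, but it at least says I haven't flipped an inequality:

```python
from math import pi, sin, tan

# For a circle of unit diameter, the inscribed and circumscribed 96-gon
# perimeters should bracket pi, and Archimedes' fractions should sit just outside:
#     223/71 < 96*sin(pi/96) < pi < 96*tan(pi/96) < 22/7
inscribed = 96 * sin(pi / 96)       # perimeter of the inscribed 96-gon
circumscribed = 96 * tan(pi / 96)   # perimeter of the circumscribed 96-gon

print(223 / 71, inscribed, pi, circumscribed, 22 / 7)
assert 223 / 71 < inscribed < pi < circumscribed < 22 / 7
```

Not part of the argument, just a sniff test.)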
Well, first, how does Archimedes? How does Archimedes even assert that a circle has an arclength? Well, how would we define it now? By the integral of the lengths between nearby points, essentially, treating this as a Riemann mesh. That’s pretty much what the inner polygons represent, but also they can be seen as an appropriate lower bound by the notion that a straight line is the shortest between two points, which is as good a first principle as any. It can then be shown that the outer polygons are a least upper bound since they get arbitrarily close and never less. Consistency’s not shown, but it never can be.
This raises the question of what exactly the Euclidean distance is, since even this notion of arclength depends on it. Of course, normally, the Pythagorean theorem is assumed as naturally as breathing. Those of a certain persuasion will tell you the Pythagoreans “stole it from brown people,” meaning that Egyptians and Babylonians had the use of it, although no extant proof. They can’t have cut it from whole cloth, though, so let’s ask ourselves - what exactly is the Pythagorean theorem?
Well, let’s look at what’s sometimes called the “Pythagorean proof,” one more visually obvious than Euclid’s - four equal right triangles, arranged first into two rectangles that leave two squares, then into a frame of a square on the hypotenuse. Note that both of these form a square of side equal to the sum of the two legs, which can be broken down by algebra. So in essence the proof is the proof of the existence of triangles; in the Euclidean distance, including in arclength, is hidden the abstract notion of the triangle.
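For the record, the algebra being waved at - legs a and b, hypotenuse c, both arrangements filling the same square of side a plus b:

```latex
\underbrace{(a+b)^2}_{\text{big square}}
 \;=\; 4\cdot\tfrac{ab}{2} + a^2 + b^2
 \;=\; 4\cdot\tfrac{ab}{2} + c^2
 \quad\Longrightarrow\quad a^2 + b^2 = c^2.
```

Nothing beyond what the picture already says.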
But returning to Archimedes, how to show from this alternate notion the boundaries on this zero of the sine? What's to be shown is that, just above zero, the sine is always less than the parameter and the tangent always greater. That the sine is always less can be seen from the series representation, at least for a sufficiently small parameter. That the tangent will be greater can be seen by multiplying the series expansion of the cosine by the parameter and subtracting that from the series for the sine, to see that the sine exceeds the product near zero - and since the cosine is positive there, dividing through by it turns that into the statement about the tangent.
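The same thing in series form, for small positive x:

```latex
x - \sin x = \tfrac{x^3}{6} - \tfrac{x^5}{120} + \cdots > 0,
\qquad
\sin x - x\cos x = \tfrac{x^3}{3} - \tfrac{x^5}{30} + \cdots > 0,
```

so sine below, tangent above, just past zero.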
From there you just need the half-angle formulas, although that’s not what he himself uses. For the tangent, he uses the fact that if you bisect an angle, the ratio of the lengths of the base are that of the sides. For the sine, he uses the same twice. Algebraic manipulation of the half-angle formulas will give a similar result probably and I hear birds.
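If you want to watch the half-angle bookkeeping actually run, here's a little Python sketch - my framing, not Archimedes', and not part of the argument - starting from a hexagon around a circle of unit diameter and doubling four times to ninety-six sides, using the harmonic-then-geometric-mean step that the half-angle formulas boil down to:

```python
from math import sqrt

# Perimeters of circumscribed (P) and inscribed (p) regular n-gons around a
# circle of unit diameter: P = n*tan(pi/n), p = n*sin(pi/n).  At n = 6 both
# are known exactly: P = 2*sqrt(3), p = 3.
P, p = 2 * sqrt(3), 3.0
n = 6
for _ in range(4):            # 6 -> 12 -> 24 -> 48 -> 96
    P = 2 * P * p / (P + p)   # harmonic mean gives the circumscribed 2n-gon
    p = sqrt(p * P)           # geometric mean gives the inscribed 2n-gon
    n *= 2

print(n, p, P)                # 96, about 3.14103 and 3.14271
print(223 / 71, 22 / 7)       # Archimedes' bounds sit just outside those
```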
0 notes
Note
Chemistry? Chemistry! The interactions of our petty universe, to understand something so beyond the gods it cannot be quantified! e is not some property of some chemical; e is e, and if you pull back across our entire 4-plane, e remains e.
It might be a sign of the times that I typed into Google "e is trans..." and the second suggestion was "e is trans or cis."
😂😂😂
For those who are wondering, this is chemistry related.
12 notes
Note
Is it a bad sign that I recently came across the notation "ωxω" (i.e., the Cartesian square of the naturals interpreted as cardinalities of finite sets rather than as a subset of the reals) and immediately thought "uwu"?
😂😂😂
18 notes
Text
Let'sh shay there ain't no 'nfinny!
Imagine there's no heaven, it's easy if you try... take a look at Carathéodory's notion of entropy and it'll tell you something about the nature of consciousness. Anyway.
Onto the greatest bugaboo of mathematics, infinity. Essentially, in the universe, there's no such thing as infinity, and yet so much mathematics today is about the various subtleties of it. Consider in particular the infamous axiom of choice - if all sets are finite, then the axiom of choice is trivial. Of course it's ridiculous to believe that all sets are finite, as ridiculous as it is to believe that the set of all sets does or does not contain itself.
So consider two paradoxes, Russell's and Galileo's. Galileo, who's worshipped by proddies for poking the holy bear (...well, I guess the holy bear per se'd be further East) until it bit, came up with the mindblowing paradox that there are infinitely many even numbers and infinitely many odd numbers, and came to the conclusion that the notions of greater than and less than make no sense for infinite sets. Should've been left there, but someone had to cant his diagonal Canticle over Canticles and make manifest the notion of multiple infinities and therefore of infinite sets.
"But wait!" you say, "weren't there always infinite sets? What about Euclid's proof that the primes were infinite? Hell, what about the very notion of natural number?" Well, yes, but are those sets? That's where Russell's paradox comes in. The notion of a set as "a bunch of shit" doesn't work, because then you could have all the shit that doesn't have its own shit, which both does and doesn't have its own shit. (Incidentally, while the doctor was a woman - both large and small D going into 2018 - the barber was definitely a man. Making the barber a woman misses the point.) Although... actually, maybe the barber sort of is a woman? Maybe.
Basically, what I'm saying is woman = class. And the first thing you need to know about classes, i.e., women, is that they don't exist. I mean, nothing in math quite exists, but these especially don't exist, and their nonexistence is critical to understanding them. These objects, such as the universe of discourse, are not in the universe of discourse, and therefore can't be discussed. What's the universe of discourse? It'd be the set of all sets, but that doesn't exist, so it's the class of all sets, which also doesn't exist, and since it doesn't exist, it's a class.
Why doesn't it exist? Because the only way you can make sense of sets is by saying that, for any set and coherently expressed property, if you have a set, you have a subset consisting of those elements that have that property (even if there aren't any, in which case it's the empty set, and that's fine). If the property is not containing itself, then the set of all sets would have that property iff it didn't have that property. That's Russell's paradox, and that means that a distinction has to be made between sets, which do exist, and classes, which don't. In particular, you cannot be allowed to make the statement, even in the negative, "this class is a member of..."
So why should infinite sets be allowed? Be rid of the axiom of choice and the axiom of infinity; replace the latter with an axiom of finiteness and an axiom of empty set. This will give you Peano arithmetic, and it will make the notion of countability nonsense. What Cantor diagonals would reveal has already been revealed by Galileo. But then what about sets on the real line? Because even restricted to rationals, let alone proper superclasses, that would fail the axiom of finiteness.
So how to define finiteness? The normal definition involves a bijection with a cardinal set, the normal definition of which (a sort of canonical set of a given natural size) essentially is the axiom of infinity, so that's right out. Easiest to axiomatize would be Tarski finiteness, which says that, given a set and a chain of subsets of that set - a set of subsets such that, of any pair, one contains the other - one of those subsets is contained in no other. (I'm really tempted to try to write that in symbolic language, but I will have behaved myself so well so far.) Normally, this doesn't imply finiteness in the first sense without the axiom of choice (which is obviously right out), but ZF alone - without infinity or regularity - is enough to show it implies Dedekind finiteness (no proper subset has a bijection onto the whole set, or equivalently, no subset has a bijection to the natural numbers), and that Dedekind finiteness of the set of subsets of the set of subsets is enough to imply natural-number finiteness; therefore, if every set is Tarski finite, then the power set of the power set stupid double subset thing I just mentioned of a given set must be Tarski finite, and therefore Dedekind finite, so every set is natural-number finite. (The definition of the natural numbers - which don't comprise a set - will be a bit ad hoc.)
(This is where that entire Banach-Tarski rant from the last post went.)
Of course the theme of this post gets rid of that paradox, so forget the entire last paragraph. In fact, it collapses any number of paradoxes into what amount to one mega-paradox with fewer metamathematical consequences. I don't know what those are, but what they are they really aren't, are they? After all, the probability of the destruction of mathematics is impossible to assess (any truly rigorous definition of the set of definable numbers would do it - although there are many contingent ones, even contingent ones that can be without contradiction established by fiat to be, per Skolem's paradox - but that lies behind a wall of universal Skolemization) - and perhaps all probabilities are impossible to assess, with such a recasting of set theory. I'm not the first person to suggest this - Kronecker being the most celebrated - although perhaps the first so incoherently (I expect this post, indeed, the entirety of this blog, is just the right combination of informed and erroneous to be infuriating to every level of mathematical acumen), and there are untold hurdles, one of which I've just mentioned. So let's go back to that thing I harp on all the time - PNT. Let's build from the ground up, sort of, in an unholy fusion of Euclidean geometry and set theory applied to modern analysis and number theory. That might be getting me where I'm going. The Lebesgue integral certainly relies on infinite sets as a part of measure theory, so best to stick with the Riemann.
Of course, that brings up the notion of how to define limit. In ZF, it's impossible to prove without countable choice that the epsilon-delta definition of limit is equivalent to the existence of a sequence that comes to the limit in question. In this universe, the latter definition is incoherent, since no infinite sequence exists at all. The former can be restricted harmlessly to the rational numbers, which raises the question of what exactly the limit is; the limit is the rule, which is finite. Similarly, the twofold - two is less than infinity ("less than" expressed as an ordering of the set of cardinalities with at least one infinite cardinality as an element) - composition of limits (one of those limits being a series) in the definition of the integral is a composition of rules. (This computational approach makes me essentially the set-theoretic equivalent of the "nullity" guy, by the way. Well, that's not fair; he got better grades.)
Now, recall the "evidence." Now, it might seem shocking to try to work with the Euler product, in either form, without infinite sets (the reals themselves - Dedekind cuts - are infinite sets), but an infinite sequence, speaking informally, need not imply an infinite set, as long as the rule is finite. What's generated, then, provided the series is convergent, is a Dedekind cut, which would be a proper class under the axiom of Tarski finiteness. However, that's not really important, but rather, what's important is the rule, and manipulations of this rule under the arithmetic operations by a sort of composition; a rigorous definition not found in this post would most likely come from the theory of computation.
I remember how much easier I found vector calc than I did linear algebra. This was because in vector calc, I had already guessed most of the basic operations from simply generalizing the rules from one-dimensional high school calculus. At that age, I couldn't wrap my head around the fact that this clearly wasn't enough. To some extent, modern set theory's treatment of analysis is a formalization of this misapprehension, so that it ceases to be a misapprehension. Let's take the alternative perspective, Turing standing on Peano's shoulders rather than Zermelo's for the hypercomputability hierarchy to replace Gödel's. Let's enhance our confusion to create a grand certainty.
Back to the point. To recap, the "evidence" is that if you take the logarithmic derivative of the zeta function you get (up to sign), for each prime in turn, the logarithm of that prime divided by that prime to the parameter, less one - which can alternatively be expressed, per Euclid, as the logarithm of that prime multiplied by the sum from one to infinity of the reciprocals of the powers of the prime in question to the parameter (remember, though, that this is a finite rule as opposed to an infinite set). Considering that, all terms being positive, this converges absolutely, the terms can be rearranged; note that the terms that are generated by the rule are exactly fractions with prime powers raised to the parameter in the denominator with the log of the base in the numerator, so let's put them in the order of the prime powers.
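In the usual (non-Greek, infinity-soaked) shorthand, that rule is, up to sign,

```latex
-\frac{\zeta'(s)}{\zeta(s)}
 \;=\; \sum_{p}\frac{\log p}{p^{s}-1}
 \;=\; \sum_{p}\log p\sum_{k\ge 1}p^{-ks}
 \;=\; \sum_{n\ge 1}\frac{\Lambda(n)}{n^{s}}
 \qquad(\Re s>1),
```

where Λ(n) is the log of p when n is a power of a prime p and zero otherwise - exactly those fractions with prime powers to the parameter downstairs and the log of the base upstairs.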
Before we go on, let's go back to that sentence, "considering that this converges absolutely, the terms can be rearranged." Remember when we all learned that in high school? But the concept of a permutation on an infinite set is taken as read there, so it's imperative to prove the equivalent principle again in this new framework. By the definition of limit, we can always run the partial long enough that it'll be within some given distance of the sum. Within that partial, provided we have some idea where the terms are going, there must be a maximum destination, and the rearranged partial run out to that destination must include all those terms, and more. If they're all positive, then this can only be larger. Therefore, it's true of the absolute value. With term-by-term summation following from that of the partials, and with an arbitrarily low upper bound on the tail, you get from there, plus the convergence of a subseries, to the case of absolute convergence.
Anyway, back to the "evidence." From here it's pretty clear. You can make this sum by layering up integrals that start at each prime power, which will each be the parameter times the log (when there is one) divided by the index to the power of the parameter plus one. So moving the sum inside the integral (since it's absolutely convergent - same argument as before, only this time with Riemann sums slid in) you'll get the second Chebyshev function divided by the parameter of the integral to the power of one plus the parameter of the zeta function, all multiplied by the parameter of the zeta function. If you plug in one, the derivative of the log of the zeta function blows up, and so this integral blows up as well - which is what you'd expect if the second Chebyshev function asymptotically approached identity, because then you'd be dividing identity by the square to get the reciprocal so that would blow up. You'd also expect, then, that subtracting the reciprocal would cause it not to blow up, and in fact this would imply the asymptotic approach. The suggestion of this comes from the fact that if you multiply the zeta function in the log by one less than the parameter, the derivative manifestly converges.
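Again in ordinary shorthand, what that paragraph describes (the layering of integrals and the blow-up at one) is

```latex
-\frac{\zeta'(s)}{\zeta(s)}
 \;=\; \sum_{n\ge 1}\frac{\Lambda(n)}{n^{s}}
 \;=\; s\int_{1}^{\infty}\frac{\psi(x)}{x^{s+1}}\,dx,
\qquad
\psi(x)=\sum_{n\le x}\Lambda(n),
```

and the hope is that subtracting the reciprocal of the parameter of integration from ψ(x)/x² at one - that is, comparing ψ(x) against x itself - leaves an integral that stays finite, which is what ψ(x) ~ x, i.e., the prime number theorem, amounts to.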
So that's the "evidence," but it's not the proof because we can't obviously move the limit inside, not even with the normal machinery of set theory/analysis. So why can we move the limit inside? The short and incoherent answer is because there are no zeroes on the edge of the critical strip. The proof of this I won't restate - it's just algebra and trig - but the connection still isn't inherently obvious. From there, what's left is the Wiener-Ikehara theorem, which it's even more imperative to view now in terms of Fourier analysis, that is to say, constructing a function from an uncountable accumulation of sinusoids. Might be some kinks to work out there in the absence of infinite sets.
So basically, as I said before, the actual proof is based on the notion of an "approximation of unity," a family of functions of integral one that approach zero bar an isolated point where they blow up. (Again, none of these concepts exist, but speaking in shorthand, that's what it is.) Multiplying this by offsets of another function allows you to show that that function approaches a constant, and in this case to show that the second Chebyshev function approaches the identity. The fact that this approaches a limit comes from the equation discussed before.
Now, so far, I'm just drunkenly ruminating on cud already so thoroughly chewed. What's interesting, though, is that not only are all these limits determined by finite rules, but that they themselves are their solutions, and these rules for arithmetic operations axiom schemata, rather than inferences. The question, then, isn't whether they follow, but whether they're eliminable in whatever context is important to our purposes. And that's what brings us back to Euclid.
Remember, the challenge there is to build from the sparse postulates of his geometry and "number theory," to which not even he can hold himself entirely, to get to such a radical conclusion regarding the prime numbers. In his terms, as before, it can be expressed only in terms of the harmonic series - at least without great difficulty. New chapters might be introduced bringing it to modern terms (i.e., at the very least, terms in which RH in prime-number form would be meaningful) without a modern notion of infinity, but these would be lengthy chapters. For now, let's go with the prime counting function and the harmonic series. I said then that we would need definite error bounds, but let's replace that with "as close as we like."
Let's work backwards. We first need to start with the fact that an increasing (non-strictly) function which, once you subtract identity and divide by the square, converges when you add it up (at integers, because no infinity - note that this sum is, in the ordinary paradigm, an integral of a function of countable range) must approach identity in ratio. This follows from the divergence of the harmonic series, in effect - if it didn't approach identity in ratio, then the ratio would either approach a number greater or less than one, or it would diverge, being unable to oscillate due to the function's monotonicity, and if it did any of those things, the sum wouldn't converge.
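Stated as a lemma in ordinary notation (my paraphrase of the claim just made, nothing new):

```latex
f \text{ nondecreasing and } \sum_{n\ge 1}\frac{f(n)-n}{n^{2}} \text{ convergent}
\;\Longrightarrow\; \frac{f(n)}{n}\to 1.
```

The harmonic-series point being: if f(n) ever exceeded (1+ε)n, monotonicity would keep it that large across the whole stretch from n to about (1+ε/2)n, and each such stretch drops a fixed-size chunk of the harmonic series into the sum, so it can only happen finitely often - and likewise from below.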
From there, two things remain to get us to "as close as we like." If only I could remember what they were. (I suppose I should mention on this point that my hippocampus is a pickled seahorse on a toothpick.) I suspect they're to show that the increasiness is finite for positive parameters and that it reflects the limit. The former is easy enough to show from series (to wit, by saying that it holds for any power, however small, the concept being expressed easily enough geometrically), the latter coming from Fourier analysis. In any case, really, there's no real logic to this other than what I mentioned before, only the relevance of it to this notion of mathematics without infinite sets.
So let's not fuck around. Fourier analysis, the analysis of periodic functions on the real line expressed as the integral of an uncountable set of sine waves, without infinite sets. How? Well, not at all, really, but basically by geometry, obviously. Sine waves, after all, come from sines, the string of a bow the chord, the musical sense of "chord" after all coming from the chord a string made against a lyre (as far as you know), and Euclid knew the law of cosines in a geometric, very Alexandrian form (propositions twelve and thirteen of book II).
So let's go back to the very beginning of this blog. The normal distribution. Now, the slovenly proof I gave then doesn't really show anything and isn't the normal (so to speak) approach anyway. A better notion of probability comes through set theory and measure theory, the latter of which certainly doesn't work as normally understood in a finitist paradigm.
So you'll remember I explained (ish) why a series of fair coinflips should approach the standard normal, but I didn't explain probability beyond that at all. So what is probability? Well, you take a set, a subset of that set's power set closed under complementation (relative to the base set) and countable union, and a function from the latter to [0, 1] such that the empty set maps to zero, the union of countably many disjoint sets maps to the sum of those sets' images (implying the union of countably many non-disjoint sets maps to something less than or equal to the sum of those sets' images), and the whole shebang maps to one. That's a probability space. Now, if the set has to be finite, you're golden to model, e.g., cards or dice, but you're SOL if you want to model a dartboard, unless you feel like working it out at the quantum level, but even there there may or may not be infinities where probability is concerned. Therefore, under this paradigm, the base set and space would have to be understood in terms of the rules that generate them - but aren't they anyway?
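In the finite, "golden" case the whole definition fits in a few lines - here's a toy sketch of a fair die as a probability space (names and framing mine):

```python
from fractions import Fraction
from itertools import chain, combinations

omega = frozenset(range(1, 7))          # sample space: the six faces

def powerset(base):
    """All subsets of the sample space - trivially closed under complement and union."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(sorted(base), r) for r in range(len(base) + 1))]

def prob(event):
    """Uniform measure: empty set -> 0, the whole shebang -> 1, disjoint unions add."""
    return Fraction(len(event), len(omega))

sigma = powerset(omega)
assert all(omega - e in sigma for e in sigma)            # closed under complement
assert prob(frozenset()) == 0 and prob(omega) == 1
evens, odds = frozenset({2, 4, 6}), frozenset({1, 3, 5})
assert prob(evens | odds) == prob(evens) + prob(odds)    # additivity on disjoint sets
print(prob(frozenset({1, 2})))                           # 1/3
```

The dartboard is exactly where this stops being writable-down and the "rule" talk has to take over.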
So with that in mind, the notion of approaching a probability space via a series of fair coinflips is analogous to the notion of approaching a real, computable or uncomputable, by a Turing-equivalent machine. Now, I didn't bring up Turing machines (exactly) above because even in this paradigm uncomputable numbers can be defined, such as the sum of two to the power of the negations of successive busy beaver numbers.
I've been writing this for months now, and I'm sure it's wicked self-contradictory and flows like tar, but fuck it, it's going up.
3 notes
Text
A quickie, only because I haven't written anything since my "alloposium."
So I've been working on a post for a while that has to do with the metamathematical/philosophical concept of finitism, i.e., that the notion expressed by Russell's paradox, which he expressed as the barber's paradox (which can't be solved by saying the barber was a woman, or maybe it can, but more on that later), should be solved, rather than by the axiom of regularity, by an axiom of finitism, where infinite sets are treated as proper classes. I'm rather dragging my feet on that, since maybe it turns out that trying to do math drunk is almost as foolish an endeavor as doing it sober. Consider the Achaemenids, blah blah blah.
Meanwhile, I have to write something, so how about I excerpt the rant on the Banach-Tarski paradox I was going to put in, since it's pretty random anyway?
Now, you probably know the name Tarski from the Banach-Tarski paradox, that you can chop a sphere into a finite number of pieces (nine) and reassemble them into two spheres, which I’m now going to explain in tiresome, prosaic detail for really no reason at all. First accept the axiom of choice, in the form that given a set of disjoint sets, you can infer the existence of a set containing exactly one member of each. Now draw two axes through a ball at a funny angle - pretty sure one radian will do - anything that leaves no finite, alternating string of half-turns around one axis and third-turns (in either direction) around the other that ends up getting you back to where you started. (I think one radian will work because its sine and cosine are transcendental - this I’m sure of - and they’d have to be the solutions to a finite polynomial of algebraic coefficients to fail this criterion - this I’m also fairly sure of but can’t be arsed to work out.) This guarantees that every such string corresponds to a unique rotation, and it’s easy to see that you can invert them by just going backwards (thus failing to alternate, I’m sure I don’t need to say). Now, first, let’s split the surface of the sphere into four sets - the fourth will be all the points that don’t move in one of these rotations, and you’ll see why I’m calling it the fourth very shortly. In fact, by the end of this sentence: every point belongs to what’s known as an “orbit” of all the points you can get to by a rotation in the set of rotations described above, which is a set since the relation is reflexive (the trivial rotation - zero degrees - is in the set), symmetric (the inverse of every rotation in the set is in the set), and transitive (the composition of two rotations in the set is in the set), which for every point not in the aforementioned fourth set (i.e., those not fixed in any rotation; note that the orbits of those in the fourth set will be entirely within the fourth set - in fact, I think but can’t prove the fourth set itself - if it were finite, it’d follow from the bijection, so stay tuned - since the operation of bringing it to the fixed point, giving it an appropriate turn, and bringing it back will keep it fixed while being a nontrivial element of the set of rotations) has a bijection to the set of rotations, and having CHOSEN (would that I could figure out how to add sparkles to that) one from each orbit you get a set that itself has an orbit that’s the whole sphere bar that fourth set, and that orbit we’ll divide into three unions of elements of the orbit by separating the rotations into three sets. Yes, I will call that a sentence, thank you very much. Anyway, for the rotations: the trivial rotation is in the first set; if you turn a rotation around the third-turn axis in one direction it goes from the first to the second to the third and back, and the other way it goes from the third to the second to the first and back, regardless of whether the instructions alternate or not. Likewise, if you flip a rotation in the second or third set, you get one in the first set, always. However - if you flip one in the first set, it depends. Remembering that each rotation has a unique alternating representation of flips and turns (let’s call ‘em that because why not?), if the last instruction in that representation is a flip, you obviously go back to whichever brought you here. If it’s a turn, you go to the second. So now you take these four sets on the surface and use them to divide the sphere into four sets plus the center.
First, take the second and third away. Now, you can flip the first set to get the second and third, break it up, turn both parts to get two copies of the first, and flip one to get the second and third again. Since that’s keeping the fourth and the center, you’ve now got the whole ball plus a nice chunk of it. (If you’re keeping score, only four pieces - the one corresponding to the fourth set and center, one for the preimage of each of the second and third sets under the mapping from the first - have actually been rearranged separately, and one for the points beneath the second and third sets themselves. Only the lattermost will be considered from here on out.) The next part’s tricky, inasmuch as it involves a hypothetical ball and its hypothetical rearrangement into a subset of the other two sets, but that’s not going to be what we’re doing in the end: to wit, hypothetically rearrange the hypothetical ball in the reverse of the way we built the last ball to get just the points under the first and fourth sets (and the center), then turn the first set to become the second, turn the fourth to be entirely in the other three (since the set of rotations we built was countable, the points of the fourth set on the sphere - i.e., the surface - are countable, and since only two rotations can connect a given two and there are uncountably many rotations since there are uncountably many angles, there being uncountably many reals between zero and tau, there must be such a rotation) and just like the entirety of the three sets before break it in two and rotate it totally into the third set, and shift the center to be somewhere else, anywhere else, in the third set (“chosen,” which didn’t actually make my phrasing, but pretend it did, gets no sparkles because it’s only one arbitrary choice, not infinite - more on that nasty little-big word soon). The next bit to break off is the points under the second and third sets of the sphere that you’ll never get to by doing this to the points outside the second and third, or to the points you get from doing that, or from doing that, ad inf. (Arguably a different sense, but really also sort of the same - Euclid’s sense, to the extent he had one.) These stay put, while the points that you will get you break into five according to where you get them, and send them there. Since each iteration is walked back one, you end up with another entire ball. So now, like the barber, you’ve got two balls.
(The "you probably know the name Tarski..." thing comes from a reference to Tarski-finiteness, which is the property that any chain of subsets, i.e., set of subsets in which of any two one contains the other, has an element that contains all others, which without the axiom of choice doesn't necessarily imply finiteness, unless you assume it for all objects in the domain of discourse.)
Anyway, slightly simpler: Vitali sets. So consider measure: if you have a finite number of disjoint intervals, the amalgamation has a "length" equal to their sum. If you have a sequence of lengths of disjoint intervals that converges (e.g., 1 to 1.5, 2 to 2.25, 3 to 3.125, &c.), that'll converge to the sum, and if you have such a sequence that diverges, we'll say that's infinite, like the real line itself. So let's consider just between zero and one, and divide up the numbers such that in each set, the difference between any two is rational. Now, every such set will be of measure zero, but there are "too many" to put into a sequence, so that's fine. Now let's CHOOSE one from every such set, and since they're disjoint, if you add a rational number, you'll get a different set that'll have the same measure, and if you take the part that's greater than one or less than zero, you'll get two disjoint sets whose sum has the same measure, and moving one will give you a set of the same measure between zero and one. These can be put into a sequence, since the rationals can be, and they take up the whole interval, and since they're disjoint, they add up to the whole interval, but since they have the same measure, that's impossible, so they have no measure, not even zero. The aliens who don't understand Euclidean geometry might have a different notion of "number," and in particular "infinitesimal," from us, but that would take more explanation, which I don't have on me.
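For the record, the arithmetic of that contradiction in the usual (non-wrap-around) presentation - same punchline, slightly different bookkeeping than above:

```latex
V\subset[0,1] \text{ picks one point from each coset } x+\mathbb{Q};
\qquad
[0,1]\;\subseteq\bigcup_{q\in\mathbb{Q}\cap[-1,1]}(V+q)\;\subseteq[-1,2],
```

so if V had a measure, countable additivity plus translation invariance would squeeze 1 ≤ Σ μ(V) ≤ 3 over the countably many translates - impossible whether μ(V) is zero (the sum is zero) or positive (the sum is infinite).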
So... as Izzard would say... yeah.
0 notes
Text
An alloposium in the classical sense, with no guests and something something primes
You’ll recall in the last post the mention of a bit of bourbon-soaked insanity of trying to write up a proof of the PNT that would have been accessible to the ancient Greeks. Well, the first, and I’m sure only hurdle to get over will be defining it. After all, let’s consider the simplest and most accessible statement: the prime counting function approaches its parameter over the parameter’s natural logarithm as the parameter goes to infinity, that is to say, the ratio of the parameter to the product of the prime counting function times the logarithm of the parameter goes to one as the parameter goes to infinity, that is to say, for every epsilon there’s an upper limit such that anywhere above that upper limit you’re within epsilon of one. Your hippie friend might tell you that the Greeks couldn’t comprehend infinity, but in fact they just didn’t play with it as readily as we do; we just need to make explicit the error bounds that would more typically be implied. More obviously there’s the natural logarithm. Consider the harmonic numbers, which differ from the natural logarithm by at most one, so since the numerator rises linearly, if the one ratio approaches one the other will approach one. And while harmonic numbers weren’t really a thing then, the harmonic mean certainly was.
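In modern shorthand, the swap being proposed is

```latex
\pi(x)\sim\frac{x}{\ln x}
\iff
\frac{\pi(x)\,H_{\lfloor x\rfloor}}{x}\to 1,
\qquad
H_n=\sum_{k=1}^{n}\frac{1}{k},\quad 0<H_n-\ln n\le 1,
```

so the natural logarithm can be traded for a harmonic number without changing whether the ratio goes to one.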
So anyway, first, the Euler product. Remember the three definitions I gave of the zeta function last post? I guess it’s time to show the equivalence of the first and third, and that’s the Euler product, which is essentially a modern version of the sieve of Eratosthenes. If you start with expression 1 of the zeta function, and multiply the denominator of each term by two to the parameter, then subtract that from what you had, essentially multiplying by one minus one half to the power of the parameter, what’ll be left is only odd numbers. Do the same with that new series but for three to the parameter, and you’ll be left with only odd numbers not divisible by three. Then five, seven, eleven, and so on… as you keep multiplying by the unit minus the reciprocal of each prime to the parameter, you’ll get closer and closer to one as more and more terms are sieved out. Now, the “canonical” source for the sieve of Eratosthenes, a book on the metaphysics of mathematics somewhere between Spirit Science and this blog, comes about three hundred years after he lived and presents it without formal proof. In fact, what formal proofs it does contain are mostly cribbed from Euclid (often badly; in fact, the vaunted Greek rigor Dodgson was harder for than pretrimmable trim is seen in so few of those Greek writers from the epoch of pretrimmable cock that it’ll be easier to just pretend there’s no one but Euclid), so we’ll just prove it ourselves.
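Here's a throwaway numerical look at that sieving (floating point, so the opposite of Greek, and only a sanity check):

```python
# Multiply a partial zeta sum by (1 - p**-s) for successive primes and watch
# what's left creep toward 1, the lone surviving term (plus a tail made of
# integers with no prime factor among those sieved out so far).
s = 1.5
total = sum(n ** -s for n in range(1, 2_000_001))   # stands in for the full series

for p in (2, 3, 5, 7, 11, 13):
    total *= 1 - p ** -s      # same as subtracting off the multiples of p
    print(p, total)
```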
(Oh, yeah, and for the second one, if at that first step instead of multiplying the original series by one half to the power of the parameter, you multiply by one half to the power of one less than the parameter, you get an alternating series, which converges for any parameter of real part strictly greater than zero, not just greater than one. Since that multiplier will be nonzero for any parameter not exactly one, you can just divide it back out to get another expression of the zeta function.)
So for the “Greek” version, the first step has to be to actually show that the sum of the reciprocals to some power strictly greater than one actually converges. Well, since we’re dealing with constructible numbers, let’s restrict it to a number that’s one plus the reciprocal of a power of two. So now you use binomial series, and since it’s so restricted, you can go through the binomial theorem for natural numbers, as first put forth in Chandah Sutra, the famous text on poetic meter written most likely some time before Ashoka. Indeed, Euclid himself starts the ball rolling in the West, with proposition II.4, which, iterated, can get you there. Well, if we can do a version of the integral test without limits, we can prove that the sum is between the chosen power of two and one greater. Then, from the fact that the sum up to a sufficiently large prime is less than one off from the square of that prime, we can sieve away to leave a difference less than one. Of course, no limits means no integral test, so how to do it? There’s something called a “telescoping series,” where the terms are expressed as differences that cancel, making the partials, and therefore the ultimate sum, relatively easy to work out. Consider a series where the terms are the power of two in question times the difference between the reciprocals of the index to the reciprocal of that power of two (the result, that is, not just the exponent) and the index’s successor to the same; add it up from one and you get the common multiplier that’s the power of two, so if it can be shown that each term is greater than the reciprocal of the index’s successor raised to the power of the input to the zeta function then you get the upper bound of one greater than the power of two in question. You can do this by pulling out the successor, so that for the subtrahend you just get one, and for the minuend the ratio of the successor to the index, said ratio taken to the power of the same reciprocal. If you had access to the modern binomial theorem, you could stick a minus sign on the exponent, flip the ratio, and you'd have shown that the upper bound is less than the series in question. However, lacking that, it'll instead be a matter of using Euclid's proposition IX.35 on what amount to formal power series and taking both sides to the power itself to show the inequality that the binomial theorem yields, that you can cancel out the one and the successor.
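A quick spot check of the bound the telescope is supposed to deliver, for one small power of two (numerics only, nothing Greek about it):

```python
# Claim being engineered: for s = 1 + 1/q with q a power of two,
#     q < sum over n of 1/n**s < q + 1,
# the upper half coming from comparing each term 1/(n+1)**s against the
# telescoping term q*(n**(-1/q) - (n+1)**(-1/q)), whose total is exactly q.
q = 4
s = 1 + 1 / q

partial = sum(n ** -s for n in range(1, 1_000_001))
print(partial)       # already past q = 4, still visibly short of q + 1 = 5

ok = all((n + 1) ** -s <= q * (n ** (-1 / q) - (n + 1) ** (-1 / q))
         for n in range(1, 10_000))
print(ok)            # True: the term-by-term comparison holds as far as checked
```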
FUCK YOU FUCKING EN-SPACERS FUCKING UP MY FUCKING POST WHEN DID YOUR FUCKING CULT BECOME FUCKING LAW YOUR FUCKING TEXT LOOKS LIKE A FUCKING RUN-ON FUCKING SENTENCE WITH A FEW FUCKING STRAY PRINTER MARKS IN IT SORT OF LIKE THIS BLOCK BUT WITH GNATS ON THE MONITOR IN FACT I'M NOTICING NOW THAT IT AUTOMATICALLY CORRECTS THE WORD WRAP WITH MULTIPLE SPACES UN-FUCKING-LESS THERE ARE EXACTLY TWO THE NARCISSISM OF SMALL DIFFERENCES TO FURTHER THEIR FUCKING CULT AND WHAT IS THIS FUCKING BULLSHIT WHERE THEY GO OUT OF THEIR WAY TO ADD AN EXTRA SPACE WHEN YOU TRY TO FIX THEIR HOLY EN-SPACES JUST TO DICK WITH PROPERLY EM-SPACING "FAIR GAME"
Now for the slightly trickier part. It can be shown similarly that the difference will always be less than the term as regards the index itself, but we need to put a lower bound on by how much. Since it’ll be lower for every term, just working it out for the first term will do, and since for any power of two the product of that power of two with the difference between one and one half raised to the reciprocal of that power of two will always be less than five sixths (in fact it approaches .693 from below, but five sixths is good enough), it’s just a matter of getting to a point where the remainder in the partial will be less than one sixth, which'd mean you have to get the telescope to collapse to that level, i.e., multiply the power of two by the index and take it to the power of the power of two (again, the whole power), so that when you take the relevant root, you get something that multiplied by the coefficient will be less than a sixth. Because the subtrahend comes from the successor, you can subtract one from that, which we'll see in the next paragraph'll come in wicked handy.
So it passes the sniff test, but does it pass the Wolfram test? Well, choosing 32 as the power of two, you get an index beyond 11,631,596,243,148,446,061,072,503,646,667,324,283,461,914,680,502,615,356,492,721,967,947,841,535, the sum to which point would be about 32.413. Just to be extra-sure, since it should in principle work with log half e rather than a sixth, let’s try the far more reasonable index (rounding up since indices are integers and since our method finds a minimum, only, since this isn't an integer, that's really rounding down due to our wicked handy subtracting) 38,285,606,278,150,664,580,703,573,515,392,543,734,581,840,373,218,238,427,281,826,340, and we get about 32.273. So… yay! It passes the Wolfram test. Probably don't let's make explicit we’re relying on such ludicrous boundaries when we’re aping Euclid later on.
That brings us to the compass-and-straightedge provenance of the sieve, which comes from the fundamental theorem of arithmetic, which is equivalent to Elements VII.30-32 and IX.14. You can get the Euler product from this by using the bounds above to establish that between the prime and the square of the prime you've got a difference less than one.
(Who the hell thought a blog called “drunk-math” should go hardcore compass-and-straightedge?)
Now, to start with, to define the logarithm, we'll use a limit that goes to the logarithm, namely, the full power minus one over the exponent alone as the exponent goes to zero from above, where the base is the parameter of the logarithm function. For the exponent, we'll use the reciprocal of the power of two, so as the power of two blows up, now we have a constructible analogue to the logarithm, which can be used for a constructible analogue to the first Chebyshev function. Since the denominator is the reciprocal of an integer that blows up, instead of dividing by it we can multiply by the zeta function of one plus the denominator by the argument above.
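The constructible stand-in, in symbols (my rendering of the paragraph above, not a new claim):

```latex
\ln x=\lim_{q\to\infty}q\bigl(x^{1/q}-1\bigr),\qquad q=2^{k},
```

so the first Chebyshev function gets approximated by summing q(p^{1/q} - 1) over the primes p up to the cutoff, and since ζ(1 + 1/q) differs from q by a bounded amount (it's trapped between q and q + 1 by the bound worked out earlier), swapping one for the other changes nothing in the limit defining the logarithm.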
Now for the hard part, to apply the Wiener-Ikehara theorem. As a reminder, the preconditions (for one version) are:
1. For any exponent of real part strictly greater than two, the integral of the function divided by that power of the parameter of integration from one to infinity converges to something complex-continuous with respect to the exponent.
2. The function also converges for nonreal exponents of real part exactly two.
3. The function itself is zero at one.
4. As the exponent approaches two from above, the difference between the integral and the reciprocal of two less than the exponent comes to a limit not equal to zero.
1 is obvious by the normal definition and not hard from the adapted one (although the notion of complex continuity could be a bit tricksy); 3 follows wicked fast from the fact that the lowest prime is two. 4 follows from the fact, as in the previous post, that in the upward neighborhood of two the difference plus something that comes to a finite limit comes to the limit of the derivative of the logarithm of the product of two less than the parameter times the zeta function of the parameter minus one. That'll take some finesse to get into our terms, but first, that which was shown by fiat last time, 2, that the function converges for nonreal exponents of real part exactly two.
This is basically equivalent to saying that the zeta function has no zeros with real part one, since recall from before that there's a finite difference between the derivative of the logarithm of the zeta function and the integral. So how to show that? Hadamard had a four-paragraph proof that was truly astounding for its time (compare the 25-page proof used by his contemporary de la Vallée Poussin), but involved some slightly annoying finickiness with the subset of the primes whose logarithms multiplied by the imaginary part of the zero are within a given distance of an odd multiple of pi. Mertens two years later streamlined this into something more elegant, and that's what's usually taught today: given a zero on that line, for some real number close to but greater than one, multiply the cube of the zeta function of that real number by the fourth power of the zeta function of that number plus (times i) the imaginary part of the zero, and as the real approaches one, this will approach zero (as a consequence of the analyticity of the zeta function that I'll still have to justify). Multiply that by the zeta function of the same real part plus twice the imaginary part and you should still get zero, since the zeta function stays finite everywhere on the line other than at one, but some trig'll show you you don't, so your hypotheses must be fucked, and unless all math is gibberish (which we accidentally did once and can never be sure we haven't again, or else computers wouldn't be a thing) that'd be the one about the zeta function having a zero of real part one. The trig in question would be the fact that the square of one greater than the cosine, which for real input is never negative, let alone negative infinity, is equal to one and a half, plus twice the cosine, plus half the cosine of twice the angle, which by the way imaginary exponentiation works would factor into the logarithm and thus go to negative infinity if you had a zero.
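The trig, written out (standard Mertens bookkeeping, not mine):

```latex
3+4\cos\theta+\cos 2\theta \;=\; 2(1+\cos\theta)^{2}\;\ge\;0
\quad\Longrightarrow\quad
\bigl|\zeta(\sigma)^{3}\,\zeta(\sigma+it)^{4}\,\zeta(\sigma+2it)\bigr|\;\ge\;1
\qquad(\sigma>1),
```

since the log of the left-hand product is a sum over prime powers of nonnegative coefficients times exactly that three-four-one combination of cosines; a zero at one plus it would drag the product to zero as σ comes down to one, which the inequality forbids.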
So anyway, the less tricky one. The above relies only on limits (which the ancients had a vague grip on) and two particular properties of the function that can be proven by other means (probably); the previous proof of precondition 4, on the other hand, relies on properties of the derivative and logarithm that, while second nature to the modern undergrad, are very tricky to adapt to geometer's terms. So what exactly is it that we're doing?
Well, consider, it's not hard to show that the denominator is nonzero, by the argument we've already made. The trick is showing that the derivative, at least from above, is nonzero. This is done normally by appealing to analyticity, so let's adapt the proof that it's analytic, that is to say that it agrees with its Taylor series, so consider that Taylor's theorem can be shown a consequence of integration by parts, which comes from the product rule, which is a limiting case of Euclid's "geometric algebra" (cf. "the Hetaera"). That'll give a general guideline.
So now we actually show that this follows, by taking both limits simultaneously, that is to say both the derivative and the limit that goes to the logarithm. This'd be using the second Chebyshev function in essence, or at least, putting a term of negative one in the denominator that would yield the second rather than the first Chebyshev function in the original integral by Elements IX.35. This is essentially trivia since it's easy to show the difference has a finite (and in fact quite low, less than one) limit. So then it would be a matter of iterating the product rule, but that leaves a lot of messy terms, so instead it would probably be better to apply the above directly to what we already know to be the derivative.
Now that the easy part's done with, it's time to actually show the Wiener-Ikehara theorem. I'm going to follow Müger, including the bit that involves integrals, but I'll have to change a few things to get it to fit with the description of the theorem above. Namely, you'll note that I went with an exponent of the parameter of real part strictly greater than two, whereas Müger goes with e raised to a coefficient strictly greater than some value times the parameter. The problem is that this would require a definition of e. That's going to cause some problems to following his proof exactly, since it relies on the use of the Fourier transform, and therefore directly on the properties of e. So instead, we'll work backwards.
Ultimately, Müger's argument rests on a convolution, i.e., the function produced by integral of the product of values of two functions equidistant from an input pivot, of the parameter of the integral from Müger's formulation (with e) and a family of functions equal to the quotient of the difference of one minus the cosine of the product of some natural number with the parameter over pi times that same number times the square of the parameter. As the pivot goes to infinity, this'll go to one, and from this you can get that the function itself goes to one at infinity by the fact that this function gets pulled increasingly tight to the origin as the coefficient inside the cosine shoots up. He justifies this equality in the limit by annihilating the difference, which he justifies through a fairly easy special case of a result called the Riemann-Lebesgue lemma, which states that the Fourier transform of a function whose absolute value is integrable vanishes at infinity. It'll probably be best to show the disappearance directly, but the first order of business is to show why it is that the difference can be expressed as such a Fourier transform.
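Here's a quick numerical look at that family, written the way I read the description above (crude midpoint sums standing in for integrals; the point is just its two relevant habits - total mass about one, and more and more of it piling up at the origin as the coefficient grows):

```python
from math import cos, pi

def kernel(x, lam):
    """The Fejer-type kernel (1 - cos(lam*x)) / (pi * lam * x**2)."""
    if x == 0.0:
        return lam / (2 * pi)              # its limiting value at the origin
    return (1 - cos(lam * x)) / (pi * lam * x * x)

def mass(lam, lo, hi, steps=200_000):
    """Crude midpoint-rule integral of the kernel over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(kernel(lo + (i + 0.5) * h, lam) for i in range(steps))

for lam in (5, 50, 500):
    total = mass(lam, -200, 200)           # stands in for the whole line
    near = mass(lam, -1, 1)                # the part sitting near zero
    print(lam, round(total, 3), round(near, 3))
# total stays near 1 while the near-zero share climbs toward 1
```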
That this can be expressed as a Fourier transform comes from the monotone convergence theorem, a statement similarly obvious to the restricted Wiener-Ikehara theorem of the previous post but with the benefit of being true, namely that if you have a sequence of everywhere positive integrable functions where every point is nondecreasing as the sequence goes on and the limit of the sequence is integrable then the limit of the integral is the integral of the limit. In this case, you'd use the difference and use the limit thereof at the line of real part 2. You get to that step by using Fubini's theorem, which is the same as rearrangement of an absolutely convergent series only applied to a more general integral (since a series is an integral, really, just a special kind), that if a two-dimensional integral is absolutely convergent you can shuffle it around however you please.
To streamline things a bit, you can break down the Fourier transform along the minimal abscissa of the product of the result of the integral by the tent function into the convolution at the parameter of the integrand and the Fourier transform of the tent function, minus the integral of the Fourier transform of the tent function from minus infinity to the parameter. This holds because the comparable statement holds for higher abscissae, when the integrands are divided by an appropriate function of e. A variable substitution will get you to the simpler version. Now it becomes a bit tricky in that the variable substitution is "behind the scenes," and we won't be using an integral of the function, but a series, which remember, is a kind of integral (recall also that the integrand doesn't have to be continuous, just the result), except that treating it that way fucks up the precondition that the function be nondecreasing, since now rather than the first/second Chebyshev function we're kind of using the first/second Chebyshev function multiplied by the parameter raised to the remainder, so we need to weaken the preconditions, and in fact I glossed over the use of that particular precondition enough that we'll be able to adapt it easily; specifically, you use the monotonicity to round to one end or another of a section plucked from the convolution, and so all you need to do is mark an offset of one.
So let's make a slight attempt to restore sanity. So let's consider what the hypothetical Greek would know from the arguments far above, in semi-modern terms:
The sum of the quotient of the (Greekified) first Chebyshev functions divided by the index to any exponent strictly greater than two will converge.
The sum of the quotient of the first Chebyshev function divided by the product of the square of the index by the cosine of the product of a given constant by the log of the index will converge, or at least, as the square is approached from a higher power, the sum will approach a limit.
And now let's summarize, from this, what we can prove and how, right after we've cleared one more obstacle. You'll note that all of the above is built on references to integrals. I want to avoid integrals, so instead, I'm going to try to go through a proof similar to Müger's using only sums.
We'll start by considering the sum of the sum we came up with that throws in the bit about the cosine, with some coefficient greater than zero, and "real part" 2. The integral will be zero, and it will also be a difference. I don't know about the sum, but by definition, the integral will be the limit of the Riemann sums, so I should be able to construct a bound.
Huh... so that was only one step, because I guess really that's the only obstacle. So let's put it all together (sort of).
Actually, one little thing first. The proof above relies very heavily on integrals, which could be adapted to this paradigm using Riemann sums, but that feels excessively artificial. So what about replacing the integrals with a simple finite sum? That would still leave the complex numbers, as well as e itself, which also must be e-liminated, but it's likely a good start. So let's think about this one.
Well, the idea is that we're trying to get the limit of the Fourier transform, or the equivalent, on the LHS, and show it equals zero, while on the right-hand ess, to get the difference between the convolution, or the equivalent, and a limit that goes to a constant. To that end, let's start by just swapping out the integrals for sums and see what happens. The first step, the application of Fubini's theorem, still works, since the series is still absolutely convergent. The Fourier transform isn't as "clean" anymore, but it still exists, and it's still real for integers in the parameter of the limit. The trick with the sign still works. The monotone convergence theorem still holds. Replacing the parameter still works.
So now the tricky part (maybe?): the Riemann-Lebesgue lemma. So in this form, I've got to show that a function from the integers to the complex numbers with compact support (i.e., a finite stretch where it ain't zero) has not a discrete Fourier transform, but something like a continuous Fourier transform in the discrete realm, that behaves similarly. The trick there is to find a series of functions that pull in toward zero as the series goes on.
So the question becomes, what properties of the Fourier transform, exactly, does he rely on? The definition plugs into Fubini's theorem. The fact that this particular transform is always positive allows the use of the monotone convergence theorem. The fact that both the function and its transform are even lets you get this into a form whose limit can be shown to be constant. The Riemann-Lebesgue lemma brings you to the final step, where a limit that goes to zero is equal to the difference between 1 and the limit we seek to show goes to 1.
Now, Fubini's theorem is just the continuous analogue of the inverse (which holds) of Riemann's paradox, so that can be brought in. The tricky part then becomes finding an analogue of the Fourier transform; the obvious answer would be the DFT, but how to sample? Maybe sort of like a Riemann sum, each function in the family sampling closer. The problem then becomes showing the Riemann-Lebesgue lemma.
Although the monotone convergence theorem is often stated as applying only to everywhere-positive functions, some very quick algebra will let you apply it to any function bounded on one side. What we'd need to apply it to in this case, though, would be both the Fourier transform and the product of the Fourier transform by the quotient of the second Chebyshev function and its parameter. That being the case, we might just need to bring back that statement from last time that the first Chebyshev function's growth is bounded, which it shouldn't be hard to extend to the second.
So just as a heuristic, let's use the integral over intervals. This presents a problem, in that the antiderivative of the Fourier transform Müger uses isn't expressible in closed form, at least not without using the exponential integral, which might be a little tricky. So instead, let's replace the square in the denominator with the exponential function of the absolute value, which keeps it even, positive, and an "approximate unit." (And in fact, we should get rid of the tau.) For this we get at zero the cosine of half the ordinal, minus the sine of half the ordinal, minus two, all divided by the exponential function of half the ordinal, with one added to the entire quotient. This approaches one as the ordinal gets large, causing the first term to approach zero. Elsewhere what we get is a difference between two values of the first term between three halves the ordinal and half of it, then five halves and three halves the ordinal, and so on, and those all go to zero as the ordinal gets large. That is to say, it's yet another telescoping series.
So now to work backwards. First things first, replace the e with a 2 and opt to telescope through whole rather than half ordinals. We have the slight problem, apparent right away, that trig functions of a nonzero algebraic number are always transcendental, but we'll cross that Kaliningrad bridge when we march to it. So now slip in the convolution and wrap it in the limit, bearing in mind that now it has to work for the entire family, has to go to 1 for the entire family. The proof from here to the goal is basically already laid out in that gibberish above, so let's keep moving forward (backward). Now the original proof got to 1 by getting to tau with the "approximate unit" times a factor of tau integrated from minus infinity to the parameter of the limit as the limit went to infinity. So for us, we'll ultimately be returning to the telescoping series, although to know what exactly we're working with we have to make it equal to the limit we already have.
This was accomplished in the original proof by putting their difference on one side, and on the other, a Fourier transform that could be shown to be equal to the difference, whose limit could be shown to be zero by the Riemann-Lebesgue lemma. So let's ourselves consider the difference between the limit of the "convolution" (bearing in mind that since it's no longer an actual convolution, it may take more finesse to consider the entire family) and the actual telescoping series. Müger links that to the Fourier transform of a function with compact support, which manifestly exists. We'll prefer to treat it as it is, subtracting one from the convoluend.
Anyway, I've completely lost the plot, to the extent I don't think I'll be able to parse this spaghetti drunk. Next post (or the one after): I'll try to make sense of this mess.
0 notes
Text
Talkin' bout magical computers
Immerman's theorem, named for Neil Immerman, whom I took a course with once, states that NL = co-NL. The fuck's that mean?
Well, say you have a magical computer and a budget airline; say that, if you're willing to wait around the terminal long enough, any two cities either are directly connected or they're not, but a flight from one city to another doesn't mean there's a return flight. The computer, like I said, is magic: if you give it a yes/no question, you can tell it to take a guess, and as long as it's possible, the computer will always give a series of guesses that lead to "yes." Now, let's say you want to see whether your cheap-ass airline can get from A to B in any number of layovers.
First, you give the computer A. You tell it to guess a destination. If you can't fly there directly, you tell it to answer "no." (Remember, it doesn't want to answer "no.") If you can fly there directly, and it's B, answer "yes." If you can fly there directly, and it's not B, increment a counter; if the counter is the number of cities you serve, answer "no," and if not, make this your new A.
You'll note that the computer at any one time only stores a starting city, a target city, and a count of the number of cities you have. You can put all of these down as indices of cities, so the space you need is proportional to the number of bits in the number of cities, i.e., the digits you need to write down how many cities you serve. In a word, you need logarithmic space. That's NL.
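Just to pin the hand-waving down, here's a sketch of that guessing walk in Python. To be clear about what's mine: the names (flights, magic_reachable) and the test graph are made up, and the magic guess is simulated by just trying every option, so this shows the shape of the algorithm, not an actual log-space machine.

def magic_reachable(flights, start, target):
    n = len(flights)                    # number of cities
    def walk(city, counter):
        if counter >= n:                # out of layovers: this branch answers "no"
            return False
        for guess in flights[city]:     # the magic guess: some branch says "yes" iff one exists
            if guess == target:
                return True
            if walk(guess, counter + 1):
                return True
        return False
    return start == target or walk(start, 0)

flights = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
print(magic_reachable(flights, "A", "C"))   # True
print(magic_reachable(flights, "A", "D"))   # False: no flight ever lands in D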
In fact, any problem in NL can be modeled as such a problem, since if a computer's memory is restricted to a constant multiple of the number of bits in the length of the input - logarithmic space, that is - you can consider a "city" to be the entirety of the possible states of the memory, plus the position in the code of the algorithm, and the magic "guess" operator is accounted for by the fact that planes can go to multiple cities. So the number of bits in the number of cities will just be the number of bits you needed before, plus a constant for the code, which is still within a constant multiple of the log of the input length for sufficiently large input.
Okay, so it'll magically guess the values that bring it to a "yes," if there is one. But what if it's the opposite? What if the magical computer wants at every turn to bring you to a "no"? Well, you should ask the opposite question - show there isn't a path. That's co-NL, and the fact that it's possible to show there isn't a path is Immerman's theorem.
Basically, if you can show that you can't get there in so many steps, then by putting in as the number of steps the number of cities, you can show you can't get there at all. So for no steps, the only city you can reach is the start, and for one step, you can just run through all the cities and see which ones you can fly to directly from the start. Since you can do that, you can also check (and store, since its value is at most the number of cities) how many cities you can get to in at most one step. For any other number of steps, assume you know how to check, and have stored, how many cities can be reached in one step fewer. To show a given city can't be reached in the new number of steps, run through every city; for each one, have the computer guess whether it's reachable in one step fewer, and if it guesses yes, make it prove it by guessing the path (as before, only switching "no" with "yes"), then check that it isn't the city in question and that there's no direct flight from it to the city in question, and add one to a count of cities so verified. At the end, check that the count matches the stored number reachable in one step fewer; if it doesn't, the computer has skipped something, so it has to say "yes" (which the computer doesn't want). The same trick lets you count how many cities can be reached in the new number of steps, so you can keep going, and once the number of steps is the number of cities, well, you can't get theah from heah.
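Here's the counting trick as a sketch, with the nondeterminism again faked by brute force - reachable_within plays the part of "some guessed path checks out," and all the names are mine - so it's illustrative rather than log-space. The thing to watch is that only a single count gets carried from one round to the next.

def reachable_within(flights, start, v, k):
    # is there a path of at most k flights from start to v?  (stands in for the path guess)
    frontier = {start}
    for _ in range(k):
        frontier |= {w for u in frontier for w in flights[u]}
    return v in frontier

def certified_unreachable(flights, start, target):
    cities = list(flights)
    n = len(cities)
    count = 1                        # in zero steps you can only be where you started
    for k in range(1, n):
        # re-derive the previous count; in the real proof this is the check the
        # magical computer can't cheat past
        assert sum(reachable_within(flights, start, u, k - 1) for u in cities) == count
        count = sum(reachable_within(flights, start, v, k) for v in cities)
    return not reachable_within(flights, start, target, n - 1)

flights = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
print(certified_unreachable(flights, "A", "D"))   # True: can't get theah from heah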
All right, so let's go on to the more famous one (which I hadn't been planning to do until I chose my title). Cook's theorem started the ball rolling on the question you've probably heard of, whether P = NP. What that means is whether there's anything that magical computer can do in polynomial time (i.e., a length of time that increases in proportion to a constant power of the length of the input) that an ordinary computer can't (not necessarily the same power between the two). Cook's theorem states that if you can, with an ordinary computer in polynomial time, say whether a gaggle of ands, ors, nots, boolean variables, and parentheses has a possible assignation to those variables that yields "true" for the whole gaggle, that magical computer can't do more than an ordinary one. Very quickly about a thousand things (at least twenty-one) were shown to be easy stepping-stones to solve this, and therefore only doable quickly if the magical computer weren't so magical after all. The most famous of these is the "travelling salesman problem," going back to that airline: what's the shortest possible way to get to every city exactly once and back to where you started? This is because it would allow you to determine whether a graph has a Hamiltonian cycle, that is to say, any way to get to every city exactly once and back, and from that you could figure out the smallest number of cities that together are one end or the other of every flight path, by... uh... hmm.
Okay, this is going to need its own paragraph. You're checking whether there's a subset of vertices, smaller than some input number, that touches every edge (that is, cities from which... fuck it, it means vertices that touch every edge). What you have to do is make from the graph you start with a graph that'll have a Hamiltonian cycle if and only if the graph you started with had a vertex cover less than that size. Well, the vertices of that graph: you'll have one special vertex for each unit of the allowable cover size (so, as many of these as the cover is allowed to contain), plus two sets of vertices, one vertex in each set for each adjacent pair of edge and vertex in the original graph. As for the edges, first, there's always a one-way edge between the corresponding vertices from the first set to the second. Within each set (but not between sets), there's a two-way edge between any two vertices whose pairs had the same edge on the original graph. There's a one-way edge from the second set back to the first between vertices representing edges adjacent to the same vertex on the original graph SO LONG AS in between the ordinals of the edges on the original graph (i.e., numbers assigned arbitrarily just to tell them apart) there isn't another edge that touches that vertex. There's also a one-way line from the vertex in the second set that represents the highest-numbered edge that touched a given vertex on the original graph to all of those special vertices mentioned at the beginning, and from all of those special vertices to the vertex in the first set that represents the lowest-numbered edge that touches a given vertex. This works because Karp says so.
So if you can compute the travelling salesman problem, you can find a directed Hamiltonian cycle, and from there you can find whether there exists a vertex cover below a given size. And from there? Well, if you look at the sketched proof up there, the formula you get from it can always be made into a single "and" of "ors." (Both of which may have unbounded fan-in, i.e., to use the natural language analogue, a shitload of clauses.) So you draw up a graph where the vertices are labelled with ordered pairs of a literal (a variable or its complement, counting the two separately) and a clause in which it occurs. Put an edge (either or both directions, it doesn't matter because of the nature of the vertex cover problem) between two vertices if one's literal is the negation of the other's or if they're in the same clause. The number we're shooting for is the number of vertices minus the number of clauses. If it's satisfiable, you can pick a vertex for each clause that's not part of the vertex cover and that'll do ya. If not, then whatever clause's-worth of vertices you try to leave out of the cover, either you're going to be leaving out a literal and its own negation, or you're going to be leaving out two from the same clause, and either way there's an edge you're not covering. If, and only if.
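If you want to poke at that last reduction, here's a toy version - the encoding of formulas, the names, and the brute-force checks are all mine, so it only proves anything for tiny examples, but satisfiable and has_vertex_cover should always answer the same way.

from itertools import combinations, product

def reduce_to_vertex_cover(cnf):
    # one vertex per (clause, literal) occurrence; a literal is (variable, is_positive)
    vertices = [(i, lit) for i, clause in enumerate(cnf) for lit in clause]
    edges = set()
    for a, b in combinations(range(len(vertices)), 2):
        (i, (va, pa)), (j, (vb, pb)) = vertices[a], vertices[b]
        if i == j or (va == vb and pa != pb):    # same clause, or contradictory literals
            edges.add((a, b))
    return vertices, edges, len(vertices) - len(cnf)   # the target cover size

def has_vertex_cover(vertices, edges, k):
    return any(all(a in cover or b in cover for a, b in edges)
               for cover in map(set, combinations(range(len(vertices)), k)))

def satisfiable(cnf):
    names = sorted({v for clause in cnf for v, _ in clause})
    return any(all(any(choice[names.index(v)] == p for v, p in clause) for clause in cnf)
               for choice in product([True, False], repeat=len(names)))

cnf = [[("x", True), ("y", True)], [("x", False), ("y", False)], [("x", True), ("y", False)]]
V, E, k = reduce_to_vertex_cover(cnf)
print(satisfiable(cnf), has_vertex_cover(V, E, k))   # the two answers match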
So that's the concept of "NP-completeness," which you'll hear sometimes used to mean, essentially, "wicked hard"; that's sort of Cobham's thesis, which is that P is roughly what's tractable, and since NP-complete problems aren't thought to be in P, there's the story. This brings to mind Anathem (the Neal Stephenson novel), in which the "travelling fraa problem" is solved by a "Saunt Grod's machine," which uses quantum effects. Now, it's not certain that this is a slightly-less-imaginary quantum computer, but assuming it is, again, there's no proof that a quantum computer could solve NP-complete problems, and Fraa Jad really doubles down by saying that he finds a code by picking numbers randomly.
...I guess I don't really have much else to say, although I feel I should mention Savitch's theorem, that if you have one of these imaginary computers, and you've programmed it to do something with only so much memory, if time is no object, you can get a more realistic computer to do it with (proportional in terms of the input to) the square of the memory. Basically, you ask, for every possible state in the middle, whether you can get from either end to it in half as many steps, and recurse; all you ever have to remember is the stack of midpoints you're currently trying.
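In sketch form (the function name and the step budget of two to the k are my framing), the midpoint trick looks like this, and the only memory that piles up is the recursion stack:

import math

def savitch_reach(flights, a, b, k):
    # can you get from a to b in at most 2**k flights?
    if k == 0:
        return a == b or b in flights[a]
    return any(savitch_reach(flights, a, m, k - 1) and savitch_reach(flights, m, b, k - 1)
               for m in flights)          # try every possible midpoint, half the budget each side

flights = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
budget = math.ceil(math.log2(len(flights)))   # 2**budget steps is enough for any simple path
print(savitch_reach(flights, "A", "C", budget))   # True
print(savitch_reach(flights, "A", "D", budget))   # False

Anyway, I'm posting this.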
4 notes
Text
What the fuck is a contour integral?
So I was writing some thoughts on the prime number theorem (the theorem, at least as I learned it, that the chance of a random number up to a given cap being prime approaches one over the l.n. of the cap, which is such a CS-ey way to state it that I couldn't stop myself from writing natural log "l.n." even in natural language), which I still plan to do, but I planned to do it with what bill wurtz might call a Greekification overload. I still hope to, although I had been thinking of giving up before my liver did, but with this post, I guess I'm going anemtheic. It came from a deep dive into the (mostly pirated; I currently endeavor to finish this before the stolen Hamilton that showed up in my suggestions on guess which video in the other window ends, since it will certainly have been pulled by then) literature following skimming up on a course on number theory, in particular how they proved the PNT, which I was familiar with using, but had never learned to prove. I was rather encouraged by the fact that at the time (they've changed them now, sadly) the lecture notes in question, the fifteenth in the series, opened with a definition of the derivative, then made it look easier than it really was. I'm not going to go into the details at this time, since that would really only deceive you - I'll gloss over them at the end of this post, as well as the reason I gloss over them. Basically, Newman found a relatively easy way to prove a restricted case of what's known as the "Wiener-Ikehara theorem" (which I can't help but read as "Wernicke-Korsakoff theorem"), that if you can show that if a function continuous in the complex sense (that is, the limit can come from any direction in the complex plane) for real part strictly greater than 1 can be expressed there as the integral from 1 to infinity of an everywhere positive, everywhere increasing function divided by the parameter of integration to the power of the parameter of the initial function, and the difference between that (initial) function and 1/(x-1) converges uniformly, meaning that in the usual "for every epsilon there is a delta," that single delta is for every imaginary component, not just that there is a delta for each imaginary component (to show it has an analytic extension will do), as the real part of x approaches 1 from above, then the function approaches the parameter in ratio as the parameter gets large. Newman came up with a simple proof of an "auxiliary theorem" of Ingham, the seemingly obvious fact that if a function is analytic for all values with strictly positive real parts, and can be expressed there as the sum (Newman has sum, and I would have used sum in my first concept, but the proof is easily adapted to the integral) from one to infinity of some sequence of coefficients of absolute value less than some upper limit, divided by the index to the power of the parameter of the function being defined, and this can be extended to an analytic function whose domain includes the imaginary line, then the function at zero is equal to the sum of the coefficients. This is enough to show that the Wiener-Ikehara theorem is true provided that the function in the integral increases no more quickly than a constant multiple of its parameter, basically by showing that the difference between the function you wish to put an asymptotic bound on divided by the square and 1/x (which doesn't converge per good old Nicole Oresme - no, not the space Randroid) converges.
...well, Hamilton (the Federalist, not the mathematician) has brought me to the massive copy-paste from the PNT post. Now I have an excuse to kill myself further.
(Intermezzo: In reality, around the same time I was about to give up, I found an unpublished paper, written long after I'd begun my idiotic endeavor, from a Dutch physics professor that, doing some poking around his site, was really less a "paper" than a very formal blog post, which was basically a hit piece on Newman. I disagree that Newman's proof is "not simpler," as he says, to modern eyes, but it's less suited to my purposes, so this paper has given me a fourth sheet to a second wind.)
(Several days later... but also weeks earlier... unlike the fictional Achaemenid Hemingway, I write drunk and edit drunk)
So the goal now is to explain to the nonexistent gremlins that might actually read this blog and yet not understand what the people who might read Newman's paper would call "basic math" how to prove the auxiliary theorem, and how it leads to the restricted version of the Wernicke-Korsakoff theorem. Well, let's start with how you might explain it to someone fresh off a complex analysis course. You choose two largish numbers, let's say natural numbers, a radius and a length of the partial. So let's come up with a D (by pure coincidence, the grade I got in complex analysis) shape, that is, a semicircle of the radius chosen to the right of the imaginary line, the ends connected by a curve that in between is to the left of the imaginary line except at a finite number of points (the actual shape hardly matters), and inside the contour there are no points where the function in question isn't analytic (if the function has an analytic extension to and including the imaginary line, for any finite length of the line, it must have such an extension to some finite length beyond, since there can only be finite singularities in a finite area, since by chopping that area into quadrants continuously, you can come to an arbitrarily small area with infinite singularities, which kind of fucks up the whole "singular(ity)" thing). Now, the function we're actually integrating on that contour isn't the function we get from the series; instead, it's that function multiplied by one plus the square of the point in question divided by the chosen radius, all multiplied by the length of the partial to the power of the point in question and divided by the point in question.
Now, it'd be clear to our hypothetical student that this would be equal to the value of (the limit of) that function times its own parameter at the origin times two times pi times i, because when you have a contour like that, where the function both on and everywhere inside the contour is analytic except at a finite number of points, you can express it as that imaginary coefficient multiplied by what's known as the "residue," which, as long as the limit I'm about to describe exists, will be the limit, as you approach the point you're looking at, of the product of the function and the complex difference between the variable and that point. In this case, the only such point is the origin, since you've divided a function that's everywhere analytic by the parameter itself. So what's to be proven is that that contour integral approaches the partials as their length grows large. For that, let's first consider each point on the semicircle - first consider the factor of one plus the square of the ratio, bearing in mind that on that semicircle the magnitude will be equal to the preset radius, so this is a point on the unit circle translated one unit, so what you've got is a side of a triangle of which the other two sides are of length 1 and 1 and the angle between them is pi minus twice the angle of x, and by the law of cosines and the fact that the cosine of pi minus a given angle is the negation of the cosine of that angle, you've got for the third side the square root of the sum of two and twice the cosine of twice the angle, which comes to twice the cosine of the angle. Then we'll consider the initial function. Since the function in the original statement of the theorem was bounded, the absolute values, around the semicircle, of the remainder - the function minus a sufficiently large partial - will be bounded by twice the original bound divided by the square of the chosen radius, by the argument above combined with some facts about power sums.
As for the left side of the contour, considering only the ultimate sum (that is, the sum of the partial and the remainder), that goes to zero as the length of the partial grows, all else kept constant, since the length is finite and it has that factor of the length raised to the parameter, which has a negative real part "almost everywhere." (A negative real part puts it on the denominator by the definition of a complex exponent.) Now considering only the partial, that doesn't have the singularity since it's only a partial, so the left side of the contour negates the right. Therefore, the difference is of absolute value bounded by the ratio of a constant to the chosen radius plus a term that can be made arbitrarily small by extending the partial. Therefore, by choosing a sufficiently high radius and length, an arbitrarily small bound can be placed on the remainder, so the series must converge, and to the right place - the function's value at zero.
So let's check the Greekification level. Greekification underwhelming. Let's fix that by breaking down the implicit train of thought of that student step by step.
Let's say there were no singularities in the region. Then, since we know there exists a derivative at every point, the Cauchy-Riemann equations hold - that is, since for the derivative to exist, the Newton quotient must converge from every approach, including those parallel to either axis, some quick algebra shows that the derivatives of either component wrt its own parameter are equal, and the derivatives of each component wrt the opposite parameter negate one another.
So now let's chop the region up into a bunch of regions each of which can be swept out by an unbroken line in either a horizontal or a vertical direction. This can get annoying (that wacky Jordan curve theorem) in some cases, but fortunately, all that comes up in this case is a D with a circle cut out of it. Showing it for each of these regions shows it for the whole region, since on the cuts, the contours go in opposite directions and negate one another.
So now let's consider what the contour integral actually is; the differential is between points an infinitesimal distance apart on the contour. So parameterizing it, you multiply the function by the sum of the derivative of the real part and i times the derivative of the imaginary part, all by an infinitesimal part of the parameter. (Only I or the fictitious Epimenides Paul constructs to escape his denunciation of the euhemerism necessary to his own monotheism could abuse notation in natural language!)
So to begin with, let's just consider the product of the real part and the real differential, or rather, the difference between that product's values at opposite ends of a cut in the imaginary direction. Per f.t.o.c. (the theorem that states the difference between two values of a thing is how much the thing's changed), you get that it's the integral of the partial of the function wrt the imaginary coefficient. Since the contour goes counterclockwise (1 -> i -> -1 -> -i), and positive imaginary is up, the subtrahend goes in the positive real direction and the minuend in the negative, so the sign is reversed. Thus the integral within can be expressed in terms of the integral around the contour, both dimensions now taken into account.
Likewise, let's consider just the difference in values of the product of the imaginary part and the imaginary differential at opposite ends of a cut in the real direction. Same argument to get to the partial, but this time, the subtrahend points in the negative real direction and the minuend in the positive, so the sign isn't reversed. Therefore, what you've got is the integral of the difference between the two partials, which is zero by the Cauchy-Riemann equations.
Now let's consider the imaginary parts of the product, that is to say, the real part by the imaginary differential and the imaginary part by the real differential. The former consider at opposite ends of a cut in the real direction, the latter in the imaginary. Unlike before, both the real and imaginary parts end up in the "wrong" order this time because of the extraneous i. Therefore, we get the sum, which is also zero by the Cauchy-Riemann equations.
So now that you've got that you get zero in any region where there are no singularities, what if there are isolated singularities? Then what you do is that you note that when you're very very close to the singularity, the value at every point, times the complex difference between that point and the singularity, is very very close to the residue. (This, the limit that is, is actually the definition of the residue; the fact above follows pretty plainly from this.)
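If you'd rather see that than take it on faith, it's cheap to check numerically: sample a small circle around a point, add up value times dz, and compare with two pi i times the function there. The function and the point below are arbitrary choices - anything analytic near the point should do - and the names are mine.

import numpy as np

def contour_integral(g, centre, radius, samples=20000):
    theta = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    z = centre + radius * np.exp(1j * theta)
    dz = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / samples)   # dz = z'(theta) dtheta
    return np.sum(g(z) * dz)

f = lambda z: np.exp(z) * (1 + z ** 2)
a = 0.3 + 0.2j
print(contour_integral(lambda z: f(z) / (z - a), a, 0.1))
print(2j * np.pi * f(a))       # the two agree to many digits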
I said before, to think geometrically, we have to cease to think geometrically. That is, this Cartesian paradigm we've been working in so far would have to be put aside, only recognizing the integrals as the Riemann sums they are. To that end, the first step would be to reframe this in terms of polar rather than Cartesian coordinates, which would require rewriting the Cauchy-Riemann equations. To that end, they have an equivalent in polar coordinates: i times the radial derivative of the function is equal to the angular derivative of the function divided by the radius, as can be seen by taking the equivalent equations in polar coordinates, still separating the function in Cartesian terms, the radial and tangential directions still being orthogonal enough to support the arguments from before (basically just a coordinate transformation, albeit one that would depend on which point you were considering), and adding them together.
To actually do the rewriting, it'd probably make sense to replace the left-hand side of the D with two lines radial from the origin running from the extension of the semicircle to the radius of the smaller circle from step 7, connected by an arc of the latter circle. Because this is still to the left of the imaginary line a.e., the argument that it goes to zero still holds. The advantage is that now we have one unbroken region. However, now steps 4-6 above have to be rewritten.
Well, the basic idea, obviously, is that instead of lines in the real and imaginary direction, it's lines in the radial direction and arcs concentric around the origin. The problem here is that you no longer have such a simple infinitesimal on the contour itself. Integrating along the radial parts of the contour (the bit left of the imaginary line), the infinitesimal part of the variable is equal to the infinitesimal of the radius, times the variable itself, divided by the radius itself. Likewise, integrating along the contours, the infinitesimal part of the variable is equal to i times the infinitesimal of the angle times the variable itself. Therefore, the basic idea is that instead of applying the Cauchy-Riemann equations to the function itself, you have to apply them to the function divided by its variable, which is no hurdle, since a differentiable function divided by its variable, so long as you're not at the origin, is still differentiable.
So for the Greek version, you would use the method of exhaustion, putting finite limits that can be made arbitrarily small on the difference by using what would be recognizable to the modern reader as Riemann sums. This would take a couple steps, but the same ideas allow bounds that can be made arbitrarily small. Of course, this would be considering only the real part, which since we're not actually considering geometry, wouldn't be that hard, switching what equate to the real and imaginary parts of various functions in and out of the actual calculations as necessary.
So I've convinced myself that, in theory, it would be possible to do this but it seems extremely artificial, unlike, say Ramanujan's approximate squaring of the circle using 355/113. Multiplying the logarithm of the zeta function by the exponent and the sum above leads to something so weird and unintuitive, and working out the Cauchy-Riemann equations in polar coordinates would be rough, but probably worse than that would be the fact that an angle would have to be chosen to put part of the sector to the left of the imaginary line, and it would have to be shown that a finite such angle existed. The basic idea is that there have to be finite zeros, since otherwise there would have to be a point with infinite zeros within any finite distance of it (chop it up into a finite but arbitrarily large number of little bits, and no matter how small those bits are, one of them must contain infinite zeros), and that would wipe out the Taylor series around that point. Yeah... that'd be tough to prove.
For that reason, I've given up on this tactic, but that friggin' Müger has convinced me of a better way to kill myself. Again, I find Newman's approach more elegant, but both of Müger's - his other being a "Selbergian" proof (i.e., in the vein of Selberg's mid-century proof of the PNT - in his case, the basic idea is showing that since the sum of the second Chebyshev function - which I won't explain later, but it's just like the first Chebyshev function which I will, only you pretend that prime numbers reappear whenever in reality they have a power (i.e., 1,2,3,2,5,6,7,2,3,10,...) - of a given number over each denominator in turn up to that number is the same as the sum of the natural logs up to that number, and from that and the fact that that's pretty close to what you get when you pretend it's identity you can show it must be pretty close to identity), which one might think should be my initial approach, but the original is far too convoluted to fit, and right now it seems even Müger's more elegant one doesn't meet my needs as well as his other - lend themselves better to my approach, and for that reason, that's (...that one, I mean, whichever one it is I do mean) the approach I'll be taking.
つづく…
Actually, right, before I つづく, I forgot to gloss over why it is that that seemingly obvious statement demonstrated above implies the theorem that went unproven for a century. First, that the prime number theorem is equivalent to the statement that the first Chebyshev function (which is the sum of the natural logarithms of the primes less than the input) increases at parity, and second, that the statement that the first Chebyshev function increases at parity is implied by the above.
As for the first, one bound comes from the fact that obviously the prime counting function times the log of the highest prime is going to be greater than the sum of the logs of the primes; the other bound can be established by expressing the prime counting function as the sum for all primes up to the parameter of the quotient of the log of the prime and the log of the prime (yes, the same thing twice - the point is to rewrite each 1 so that the numerators add up to the first Chebyshev function). For the denominator there, convert the fraction to the integral of the negation of the derivative from the index to infinity, and separate that into the integrals from the sum's index to the parameter of the initial function and from the parameter to infinity. The integral from the parameter to infinity will be the same for each term, so for that part of the sum you've just got the first Chebyshev function divided by the logarithm. The other part of the sum would come to the integral of the first Chebyshev function of the parameter of integration divided by the product of the parameter of integration and the square of its log, from two to the parameter, since as you break it down, you get that integral along the subsets beginning from each prime in turn. Accepting that the first Chebyshev function increases at parity, you can, to a finite error, cancel it with the parameter of integration in the denominator, leaving the reciprocal of the square of the logarithm. The derivative of the parameter divided by the square of the log of the parameter differs from the reciprocal of the square of the logarithm by twice the reciprocal of the cube of the logarithm, which vanishes in comparison, so they twiddle one another. Therefore, what you've got for the prime counting function is the parameter divided by the log of the parameter plus/minus the parameter divided by the square of the log of the parameter, and again, the latter disappears, so the former is twiddled.
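For whatever it's worth, none of that is hard to eyeball numerically; a throwaway sieve (all names mine) puts the prime counting function, the first Chebyshev function over the log, and the parameter over the log side by side, and the ratios creep toward one as the cap grows - glacially, which is the prime number theorem for you.

import math

def primes_up_to(n):             # a plain sieve, nothing clever
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(sieve[p * p::p])
    return [p for p in range(2, n + 1) if sieve[p]]

for x in (10 ** 3, 10 ** 5, 10 ** 6):
    ps = primes_up_to(x)
    print(x, len(ps),                                   # the prime counting function
          sum(math.log(p) for p in ps) / math.log(x),   # first Chebyshev function over the log
          x / math.log(x))                              # what both of them twiddle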
As for the second, let's consider what's known as the zeta function. Right now, I'm not going to go into excessive detail, since having committed to going after this lunacy, I'll be doing that soon anyway. So for now I'll state by fiat that, where the following exist, it can be expressed as:
The sum of the reciprocals of the counting numbers, starting with one, to the power of the input.
The alternating sum (i.e., every other term, starting with the second, is negative) of the same, all divided by one minus one-half to the power of one less than the input.
The product of the reciprocals of one minus the reciprocal of each prime to the power of the input.
These come to the same result where they converge; the first and third can be shown fairly quickly to converge for all inputs with real part greater than one, and the second converges for all inputs with real part greater than zero, unless the input is exactly 1. Moreover, if you multiply the second by one less than the parameter, the product will converge there as well (fiat again yay fiat! ...also l'Hôpital's rule combined with the alternating series test; the alternating harmonic series' convergence to log 2 means it'll converge to one). Not only that, they're complex-differentiable, since the terms are complex-differentiable, and where they're not zero, the same is true of the logarithm. What's important is the logarithm of the product of the parameter and the zeta function of one greater than the parameter. By the second definition above, at least the product exists and is complex-differentiable for all inputs of real part strictly greater than zero. So wherever the product exists and is nonzero, the logarithm will also exist and be complex-differentiable. So what's next is to show that the product is nonzero where the real part is one, which, again, I'll leave for the next bit, saying for now only that if expression 3 above blew up in the right-neighborhood of any point with real part one, expression 1 at nearby points would blow up at that point with real part one and double the imaginary part based on some trig identities, and due to expression 2 that's impossible.
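For the unconvinced, the three expressions can be compared numerically at some arbitrary point with real part greater than one. The point and the truncation limits below are pulled out of a hat, just big enough that the three printouts agree to a few decimal places.

import numpy as np

s = 2.5 + 1.0j
n = np.arange(1, 200001, dtype=float)
signs = np.where(np.arange(1, 200001) % 2 == 1, 1.0, -1.0)        # +, -, +, -, ...

direct = np.sum(n ** (-s))                                        # expression 1
alternating = np.sum(signs * n ** (-s)) / (1 - 2 ** (1 - s))      # expression 2
primes = [p for p in range(2, 2000) if all(p % q for q in range(2, int(p ** 0.5) + 1))]
euler = np.prod([1 / (1 - p ** (-s)) for p in primes])            # expression 3

print(direct, alternating, euler)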
Okay, so to fit this to the restricted version of the Wernicke-Korsakoff theorem, which is, to remind, that if the function is nondecreasing and grows at a rate bounded by a constant multiple of its input - which implies that the integral from one to infinity of it divided by its parameter to some power must converge when the real part of that power is strictly greater than two (as well as for complex numbers with real part two and nonzero imaginary part) - and difference between the integral from that parenthetical and one over two less than the power converges to a limit wherever the real part is two, including two itself, then the function twiddles its own parameter. The goal is to prove this for the first Chebyshev function. First, the boundedness of the growth; this I'll have to go into detail on, since it won't be necessary in the next post. It's enough to show the first Chebyshev function is at most the parameter by 1.39 (log 4 rounded up). I so ache to say that “Chebyshev said it, and I’ll say it again, there’s always a prime between n and 2n," but what I’m actually trying to show is almost the opposite - that the product of all the primes between n and 2n is less than four to the power n; since the natural logarithm of this would be difference in the first Chebyshev function at n and 2n, this inequality would imply that this difference is less than log 4, which combined with the facts that the value at one is zero and that the function is nondecreasing would suffice to show that the function will always be less than the parameter by log 4. So to show this, you only need to rely on the fact that the binomial coefficient 2n choose n is an integer less than four to the n; since this is the quotient of the factorial of 2n and the square of the factorial of n, it must be divisible by every prime in between, so it's greater than their product.
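That whole chain - the middle binomial coefficient is under four to the n, it's divisible by every prime between n and 2n, and the first Chebyshev function stays under the parameter times log 4 - is easy to spot-check; theta here is a deliberately dumb recomputation, and the choice of test values is arbitrary.

import math

def theta(x):        # the first Chebyshev function, by trial division
    return sum(math.log(p) for p in range(2, x + 1)
               if all(p % q for q in range(2, int(p ** 0.5) + 1)))

for n in (10, 50, 200):
    c = math.comb(2 * n, n)
    assert c < 4 ** n
    assert all(c % p == 0 for p in range(n + 1, 2 * n + 1)
               if all(p % q for q in range(2, int(p ** 0.5) + 1)))
    print(n, theta(2 * n) / (2 * n))     # stays comfortably under log 4, about 1.386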
So the first Chebyshev function grows at a rate bounded by a constant multiple of its input, so the next thing would be to show that the difference specified above converges. The way you do this is by noting that, by the definition of the first Chebyshev function, the integral from one to infinity of it divided by the power of the parameter of integration to something of real part greater than two would, from a similar argument as before to establish the function's asymptotic lower bound, yield the sum of the integrals of the logarithm of each prime divided by the product of one less than the power in question and the prime in question to that power one less. Consider that the derivative of the logarithm of the zeta function, by definition 3 above, will be the negation of the sum of the logarithm of each prime divided by one less than that prime to the power of the input; if rather than the log of the zeta function of the input you consider that of one plus the input, and consider the difference between that and the integral just laid out, you'll get a difference that converges as the difference goes to one, to the sum of the logarithm of each successive prime divided by the difference between the square of that prime and the prime itself, and since adding this up for all natural numbers two or greater converges (to one), it has to converge for just the primes. Now, since the derivative of the log is the reciprocal, the difference between the derivative of the log of the zeta function of one greater than the variable and one over the variable will be the derivative of the log of the product of the variable and the zeta function of one greater than the variable. By expression 2 of the zeta function above, that converges, and it's within a term that also converges of the difference we're trying to show converges, so that difference converges. (That it converges everywhere else on the line we've already seen.) So, as Eddie Izzard might wrongly say, yeah.
つづく…
#cuck#prime number theorem#selberg#contour integral#hamilton#euhermerism#epimenides#wiener-ikehara theorem#tauber#death note#bill wurtz#history of the entire world i guess#oh joy sex toy
0 notes
Text
Chaos reigns
Insolubility. An expression of order beyond the gods. The six-fingered girl cannot solve a quintic equation - give her logic gates, she cannot assemble an NC^1 program that can't be solved by a BWBP program. Can she find a Fermat prime greater than 65,537? Who knows?
But today, three numbers, in five digits (as with 65,537), come together, because of arbitrary measurements of time based on a god of false death whose death is dying's contradictorily infallible birth narratives' mean and various cosmological and biological quirks of where a particular ape happened to evolve.
Temporality - entropy - transcendent order expressed in chaos endemic to temporality. Three ways of saying the same thing.
"The fear of death is the beginning of slavery." Slavery of woman, slavery of Gaia, slavery of the fool who thinks the two are one.
youtube
Dying all dying soon nothing. Nirvana is our common eternal experience but incomprehensible because its opposite is impossible to internalize, solipsism self-evident. Alan Turing believed in a godless afterlife (easier to die), also conceived of a human mathematician, a human computer, programmable, with its limitations. The Turing machine gives our ultimate limitations, the ultimate limitations of algorithm. From looms and ancient orreries to the box you're reading this on you see your own death, your head torn open to programming.
Narrative structure, essay structure, neither here nor there. Lars von Trier is a pie anyway.
youtube
The words "Alan Turing believed in an afterlife" appear in red. I can't make red. Human credits screaming. The name becomes the thing; remember isomorphism lest you scream with them. Or should you? The name becomes the being? Certain far autists, functional level notwithstanding, cannot internalize the notion of qualia (at least Matt Dillahunty says so).
Communication underlies qualia, solipsism child of society. The Turing machine does not have natural faculty for communication. Brainfuck includes period and comma (is it?) but does not specify their implementation. Various functional schemes are developed. Merlin lies so that Arthur will answer true, Pilate tries to prove closure under complementation.
Ultimately no contingency. Is is incoherent; iff is absolute. Mathematics, only mathematics, is as could be.
#too many cooks#antichrist#philosophy#deadline#illuminatus#gibberish#drabble#numerology#lars von trier
0 notes
Video
youtube
This is relevant to my interests.
0 notes
Text
Happy backdated pi day! Now here's a racist dick joke.
I made a promise! Now here goes.
As I said before, the reason that the Chinese came to the famous estimate of pi as 355/113 was threefold. First, and the reason this is important will become clear over the next two, they used area rather than arc length (which wasn't rigorously defined then, even in the West). Second, rather than using a power series as we (from every culture) do today, they would come to better and better approximations by adding the numerator and denominator and seeing whether what they got was greater than or less than the real value. So since 4/1 (area of circle with half-edge 1) is too high, and 3/1 (area of 12-gon with circumradius 1) is too low, you go 7/2 is too high, 10/3 is too high, 13/4 is too high, 16/5 is too high, 19/6 is too high, 22/7 is too high, 25/8 is too low - now we're getting somewhere. The rest are all going to be too low, so you'll just be adding 22 to the numerator and 7 to the denominator until you get 355/113 (note, however, that two you'll get will be 157/50 = 3.14 and 311/99 = 3.141414...). Now, how did they know if it was above or below these numbers? More or less the same way Archimedes did, by continuously subdividing polygons, which brings me to the third reason: someone had the bright idea of doing something mathematically equivalent to drawing parabolae on the little polygons' faces.
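Pausing on that second reason for a second, because the add-the-numerators, add-the-denominators walk is short enough to just run. Here the "is it bigger or smaller than the real value" step is played by the floating-point pi, which is cheating - doing that comparison honestly is what the polygons are for - and the variable names are mine.

from fractions import Fraction
from math import pi

low, high = Fraction(3, 1), Fraction(4, 1)
for _ in range(22):
    mediant = Fraction(low.numerator + high.numerator, low.denominator + high.denominator)
    if mediant > pi:                      # standing in for the polygon comparison
        high = mediant
    else:
        low = mediant
    print(mediant, float(mediant))
# 7/2, 10/3, 13/4, ..., 22/7, 25/8, ..., 157/50, ..., 311/99, 333/106, 355/113

Now, back to the parabolae.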
What's striking about this idea is that it's so geometrically beautiful, and yet it came from the culture that was thinking in the closest thing then to algebra. They didn't seem to realize the connection to parabolae - I don't think that Eastern study of conic sections in those days was advanced enough to even feature parabolae. Rather, they used what's known as rod calculus, a method developed around the second century BCE (although 355/113 comes around the third century CE) that would evolve (fun fact: "evolve" comes from a Latin word invented to mean "to expand a polynomial") into the system popularized in the West and in the Arab world by al-Khwarizmi, who, although he's generally thought of as Persian, came originally from a territory in modern Uzbekistan.
Extremely interesting is what they did instead. Rather than use actual parabolae, what the great Liu Hui noticed was that each step, doubling the number of edges, added more than a quarter of that added by the previous step, and therefore you could more quickly approach them by going by thirds.
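Here's that shortcut in action. The square-root recurrence below is the standard one for doubling an inscribed polygon's sides, and the area of the doubled polygon is (number of sides of the old one) times (old side length) over two; I'm not claiming this is Liu Hui's exact bookkeeping, just the same arithmetic. Plain doubling crawls; the by-thirds extrapolation gets there much faster.

import math

side, n = 1.0, 6                          # start from the hexagon inscribed in a unit circle
areas = []
for _ in range(6):
    areas.append(n * side / 2.0)          # area of the 2n-gon, from the n-gon's side
    side = math.sqrt(2.0 - math.sqrt(4.0 - side * side))    # side of the 2n-gon
    n *= 2

for a_n, a_2n in zip(areas, areas[1:]):
    print(a_2n, a_2n + (a_2n - a_n) / 3.0)    # plain doubling vs. going by thirds
print(math.pi)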
Anyway. The basic arithmetic operations work pretty exactly like they're taught in modern schools, only you would physically modify one of the operands to reflect the answer, rather than leaving the operands and writing the answer separately as you do today. It went right down to having a table. It's these that are taught today, so feel free to tell anyone saying long division is useless in the modern world that they're an alt-righter trying to resurrect the Greek tradition. Even the square root algorithm is the same as the one that might have been taught to your parents, because it almost certainly wasn't taught to you, because they want to keep your head free for the date Cnut died, since you won't use this sort of math in later life.
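For the nostalgic, here's a sketch of that digit-by-digit square root - the pair-up-the-digits, biggest-digit-that-fits routine - assuming I'm remembering your parents' schooling right; the function name is mine and it only does integer square roots.

def digit_by_digit_sqrt(n):
    digits = str(n)
    if len(digits) % 2:                    # pair the digits up from the right
        digits = "0" + digits
    root, remainder = 0, 0
    for i in range(0, len(digits), 2):
        remainder = remainder * 100 + int(digits[i:i + 2])   # bring down the next pair
        d = 9
        while (20 * root + d) * d > remainder:               # biggest digit that fits
            d -= 1
        remainder -= (20 * root + d) * d
        root = root * 10 + d
    return root

print(digit_by_digit_sqrt(152399025))      # 12345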
This was eventually replaced by the abacus, which the West had had all along. Meanwhile, the methods used in rod calculus are the ones that are dying out today by the proud innumerate who laugh at the unwillingness to teach calculators.
(Take a wild f呃kin' guess which day I actually posted this.)
0 notes
Text
Oh, I will sleep when we reach Shor, and pray we get there soon...
A terrible pun for my man Friday through Thursday, that makes no fucking sense, but fuck it, you didn’t come to this blog to make sense. I recently heard your usual NPR wank about quantum computers, saying one may soon be real. That reminds me of all the brilliant idiots who don’t know the difference between quantum TMs and nondeterministic TMs. (So called because they were trademarked by the famed Pokemon trainer Satoshi Turing.) So let me be clear: a quantum computer, at least as the term is generally understood in academic circles would probably not be able to solve NP-complete problems efficiently. (Of course, it’s not proven, since that would imply that an ordinary computer couldn’t solve NP-complete problems efficiently, and you may have heard about that little hurdle.) In fact, it’s thought a nondeterministic machine can do things a quantum machine can’t.
So, first, what's a quantum computer? It might be that quantum effects can do more (hell, ordinary IC gates already rely on quantum effects) or less (although it has been implemented successfully on small scales) than this, but what I mean is a mathematical abstraction. It could be a QTM, but it's easier to use the circuit model. A circuit model is traditionally a bunch of and/or/not gates, but in this case the gates map all the possible amplitudes of configurations - think probabilities, except they can be negative or complex, so they can cancel each other out - to all the new ones. From there, you can get amplitudes that are entangled, which allows things that probably aren't possible with a probabilistic computer alone, chief among these being factorization.
Factorization's easy, obviously, if you have time to count up to the number's square root, but what about when you don't? What if the time you have is proportional to the bits, or the bits squared, or to the sixth power, or whatever constant power? There's certainly no known way, and may be demonstrably no way, to do it under a Turing-equivalent model. (That is, no way to do it that quickly - obviously you can do it if you have time proportional to the square root of the number itself, just like with pen and paper.)
Basically, if it's not a power of a prime (it's pretty easy to check if something's a power of a prime - you just need the ceiling and floor of the k-th roots for each k up to the log base 2 of the number... that and primality testing, but that's not too hard - I think there's a less idiotic way, but I don't happen to know it), then there'll be a square root of 1 mod the input that's neither 1 nor -1. So, for instance, for 21, there's 8. (Basically, break it down into two coprime factors and, by the Chinese remainder theorem, pick something that's 1 mod one factor and -1 mod the other: its square is 1 mod the whole thing, but it itself is neither 1 nor -1 - our 8 is 1 mod 7 and -1 mod 3. The way the algorithm actually stumbles onto one is through the "order" of a random number - how many times you multiply it by itself to get 1.) Once you have that, you'll always be one greater than a number with a common factor (in our case seven). The trick is getting it. You do that by taking Fourier transforms and spinning them around randomly, like tops. Then, with a well-defined probability, you'll have what you need.
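Here's the classical wrapper around all that, with the Fourier-transforms-spinning-like-tops part replaced by a brute-force order finder - which is, of course, exactly the part the quantum computer is supposed to do quickly, so this demonstrates the reduction, not the speedup. It assumes the input is odd, composite, and not a prime power, and the names are mine.

import math, random

def find_order(a, n):            # the quantum subroutine, faked by a dumb loop
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def factor(n):
    while True:
        a = random.randrange(2, n)
        g = math.gcd(a, n)
        if g > 1:                # lucky guess: a already shares a factor
            return g
        r = find_order(a, n)
        if r % 2:                # odd order: try another a
            continue
        x = pow(a, r // 2, n)    # a square root of 1 mod n
        if x == n - 1:           # the boring root, -1: try another a
            continue
        return math.gcd(x - 1, n)   # "one greater than a number with a common factor"

print(factor(21))                # 3 or 7, depending on the guesses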
Oh, and, of course, this would kill basically any sort of encryption that doesn't involve a physical key changing hands. Bitcoin would actually be fine, though, probably; the SHA hash wouldn't be killed by a factorization algorithm, at least not right away. I think. You probably shouldn't listen to me. I'm the "someone else" SMBC warned you about.
0 notes
Text
To make proselytes of gods!
So the aliens have returned, and the goal here is to convert them to our weird religion of “geometry,” to explain to them why that “pi” or “tau” thing is actually important.
One involves probability. The aliens understand Bernoulli trials, which we’ll simplify to be a fair coin, 50/50. Let’s go back to that definition of pi as an integral - namely, twice the integral of (1-x^2)^(0.5) from -1 to 1. If you could just replace 0.5 with an integer, it would be easy to work out, even before calculus as we know it was known - Cavalieri’s quadrature formula lets you do that. You can just multiply it out and play with all the integral exponents. Ten years before the annus mirabilis, John Wallis lamented this fact (he was not, as claimed here, “thinking about sine waves,” the real significance of which hadn’t been established in his day) and, by considering the relationship of each such integral to the next, came up with a workaround.
So what does this have to do with fair coins? Well, let’s bear in mind that the string (which I just got from random.org) THTHTTHTHT is exactly as likely as... um... that doesn’t look very random. Another one. THHTHHTTTT. Good enough. The point is that each of those are exactly as likely as ten heads or ten tails. The reason they feel more likely... if they do feel more likely... is that there are 210 ways to get four heads and six tails, and only one way each to get ten heads or ten tails.
Before generalizing this, let’s consider the fact that mathematicians can instantly identify fake randomness by the fact that it’s too close to alternating, or too close to being 50/50. I don’t know if de Moivre knew this, but he posed the question, what are the chances of getting exactly 50/50? Well, it’s easy enough to calculate directly for small numbers, for any ratio. As above, they all have an equal chance, so it’s only how many ways they can be arranged without changing the layout of heads and tails. If you put numbers on each of them, you obviously get the factorial, but then you can rearrange the heads and tails without affecting the arrangement without the numbers. So it’s the factorial of the whole divided by the product of the factorials of the number of heads and number of tails. So if we’re going to 50/50 (assuming an even number of flips), the number of possible combinations is the factorial of the whole over the square of the factorial of half (for ten flips, 252). Divide by the appropriate power of two. So what happens to this as the number of flips gets too large to work it out exactly?
So let’s go back to that relation Wallis discovered. Well, we’ll consider what happens when you actually write out that exact formula. Up top, you get the factorial of the total number of flips, so the number of flips by one less and so on all the way down. Then on the bottom you’ve got the square of the factorial of half that by the appropriate power of two, but there a funny thing happens - each factorial has half as many terms as there are twos, so sprinkling the twos in, you get the square of the product of just the even numbers up to the number of flips. Cancel the even numbers out of the fraction, and you get the odd numbers over the even numbers.
Wait, I think I just explained the wrong thing. The relation Wallis discovered, let’s consider what happens when you expand (1-x^2) to the power of some integer. Well, it’s basically Pascal’s triangle, since you get the part that’s multiplied by one, and add it to the part that’s multiplied by x^2, which is one over. It turns out that Pascal’s triangle is equal to the numbers we’ve been discussing for arrangements of coins, or as they’re also called, binomial coefficients. Consider that for one flip, there’s one way to get no heads and one way to get one head, so that’s 1, 1, which is a row of the triangle. Now just add up the fractions in the formula above, and you’ll see that the ways of getting a given number of heads in a given number of flips is the sum of the ways of getting the same number in one fewer flips and one fewer in one fewer. So that’s Pascal’s triangle. So that’s also what you get when you expand the integer powers of (1-x^2), i.e., (1 - 2*x^2 + x^4), (1 - 3*x^2 + 3*x^4 - x^6) and so on. And then you can use Cavalieri’s quadrature formula, that the area under the curve x^n is the difference between the values of x^(n+1)/(n+1) evaluated at the two points. So for even n, between 1 and negative one, what you have is the sum of twice the coefficients divided by one greater than the power, e.g., for the formulae above, 2-2/3 = 4/3, 2-4/3+2/5 = 16/15, 2-6/3+6/5-2/7 = 32/35.
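Those numbers check out if you make the machine do the expanding and integrating, with comb and Fraction doing the work Pascal's triangle did above:

from fractions import Fraction
from math import comb

def cavalieri_integral(n):       # integral of (1 - x^2)^n from -1 to 1, term by term
    return sum(Fraction(2 * (-1) ** k * comb(n, k), 2 * k + 1) for k in range(n + 1))

print([cavalieri_integral(n) for n in range(4)])    # 2, 4/3, 16/15, 32/35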
Now, with integration by parts, the chain rule, and the power rule (which is the same as Cavalieri’s quadrature formula only extending beyond integers), not only is it fairly easy to get this result, but it’s a more rigorous result than Wallis got. He didn’t really prove what he claimed, but I’ve baldly asserted there that it can be proven with things vaguely resembling those mathy words. I think you understand, human, but Wallis didn’t; regardless, where the integers were concerned, he recognized a pattern, that the integral of (1-x^2)^n is (2n)/(2n+1) times that of (1-x^2)^(n-1).
So the modern way to do that, again, would be by parts. In effect, you would differentiate x*(1-x^2)^n and note that it's (2n+1)*(1-x^2)^n - 2n*(1-x^2)^(n-1), and note that x*(1-x^2)^n, provided n is strictly positive, is going to be the same at 1 and -1, so the integrals over that interval of the two terms of the derivatives must cancel each other out. Wallis, who didn't have the product rule, couldn't have done that precisely (and I don't have it in me to plough through enough Latin to see how he actually did it), but again, beautiful concordance, thought itself - work out the right hand side term by term, and for each coefficient of x^(2k), k = 0 to n-1, you get (-1)^k*(2k+1) times the binomial coefficient, and for k = n, only the first term has such a coefficient, so you get (-1)^k*(2n+1), which takes the same form since n!/(k!*(n-k)!) = n!/(n!*0!) = 1. Integrate, and the (2k+1) disappears from each term, so 2*(-1)^k times the binomial coefficient. If n is odd, this obviously cancels (the terms pair off against their mirror images); if n is even, you can see this cancels by the pattern in Pascal's triangle and the associative property of addition.
So anyway, this led to what’s known as the Wallis product. Start with 2 - because obviously the area under a line of (1-x^2)^0 = 1 from 1 to -1 is 2 - and keep going to some extreme value, couple million maybe to warm up the old abacus, by using this formula. So you'll want to multiply by 2/3, 4/5, 6/7, etc. Then when you've got that massive tiny value, you shift half a step, and come back down. Since you're half a step off, the numbers you would multiply by would be 3/4, 5/6, 7/8, etc., but since you're coming back down, it's actually 4/3, 6/5, 8/7, etc. So starting with 2, to get to the area of the semicircle, you multiply by 2/3, 4/3, 4/5, 6/5, 6/7, etc., and then some error factor that goes to one - that is, an error that goes to zero - as n gets large, because (2n)/(2n+1) approaches 1 as n gets large.
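Taken literally, the recipe is a two-line loop; with a million round trips the interleaved product agrees with pi over two - the area of the unit semicircle - to about six decimal places. (The number of round trips is an arbitrary choice.)

import math

value = 2.0
for k in range(1, 1_000_000):
    value *= (2 * k) / (2 * k + 1)        # the way up: 2/3, 4/5, 6/7, ...
    value *= (2 * k + 2) / (2 * k + 1)    # the way back down: 4/3, 6/5, 8/7, ...
print(value, math.pi / 2)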
So let's take a tight, quartic uey back to ringin' a bell, John Wally be good, wasn't I saying something about aliens? Anyway. Remember we were at the product of the odds over the product of the evens. But let's look at that product up there - 2 * 2/3 * 4/3 * 4/5 * 6/5 - even in the first few terms, it's clear that it's the square of the evens over the square of the odds, with a caveat - the odds will always be one further along than the evens. So flip it over, stick a new even on the bottom, and take the square root, and you're going to the same place. Since the product generated by the integrals goes to pi/2, the area of a semicircle of radius 1, the product generated by the coins goes to sqrt(2/(pi*n)), where n is the total number of flips, since remember that was the highest even. So that's the approximate chance of getting exactly half heads for a very large even number of flips.
Okay, that incorporates the square root of pi, but, the humans ask, how does that come to a bell? Well, consider a number of heads not especially close to all or none. What's the quotient of that happening and of exactly half? Well, consider what happens when you divide the binomial coefficients: you get increasing from the halfway point in the denominator, decreasing therefrom in the numerator. The powers of two cancel, since it's still the same number of flips, so that's what you get. So, for two thousand flips, let's pretend we had, say (for 970 or 1030 heads), (1000*999*...*971)/(1030*1029*...*1001), and reframe it as (971/1030)*(972/1029)*...*(1000/1001). So consider the natural logarithm, i.e. log base e, pretending we're zoomed back far enough that the integral works as an approximation of the sum that arises from the log of the product. We'll want to express it so that there's a one-unit gap between the log of each fraction. So looking at it in terms of l, in this case thirty, you'll want to be integrating the natural log of the difference over the sum, or the natural log of (1-2t/(n/2+t)) from 0 to l, and to that end use the identity that the natural log of (1-x) is close to -x for very low x, which follows from the definition of e, expressed as (1+x)^(1/x) for small x (switch the sign and take the log). So we're looking for the integral of -2t/(n/2+t) = 2n/(n+2t) - 2, which is n*ln(1+2l/n) - 2l. (Consider the fundamental properties of the hyperbola, blah blah blah.) The same approximation as before would make this zero, but that means it's off by infinity percent (my old math professor just slapped me), so instead use that fuckin' hyperbola to express ln(1-x) as the negative of the integral of the reciprocal of one minus t, for t from 0 to x, and consider that you get this from a power series, e.g., one plus one third plus one ninth plus one twenty-seventh and so on will get you to three halves. Integrate that, plug in -x rather than x, and you get that ln(1+x) - x is close to -x^2/2, so the approximation we're using is -2l^2/n. That'll get you in this case .40657, as opposed to the true result of .40670.
Put it all together, and you get that for large l much smaller than n, the chance is approximately sqrt(2/(pi*n))*e^(-2l^2/n) (in the example above, 0.72537% as opposed to the true value of 0.72551%); this is as far as de Moivre got. Already you've got something that, for large n, looks quite a bit like a bell, but it also looks quite a bit like a straight line. For that reason, multiply the whole thing by the square root of n to keep it from vanishing, and to keep it the same area, multiply l by the same. (So the ranges you'll be considering are in proportion to the square root of the total, which you already knew.) What you get then is sqrt(2/pi)*e^(-2l^2), which is a normal distribution with a mean of 0 and a standard deviation of one half. Why this isn't the standard normal is another story, to be told another time.
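The figures quoted above are easy to check against the exact binomial count - Python's integers don't care how big the factorials get:

import math

n, l = 2000, 30
exact = math.comb(n, n // 2 + l) / 2 ** n
approx = math.sqrt(2 / (math.pi * n)) * math.exp(-2 * l * l / n)
print(exact, approx)      # about 0.0072551 and 0.0072537, i.e., the percentages above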
I point out to the now extremely bored aliens that this also means that the integral of e^(-x^2) over the whole number line must be the square root of pi, something that might be more interesting worked out backwards. So consider a family of functions with discontinuities at integers, differentiable elsewhere with a derivative of zero, such that the value at every point is the binomial coefficient of twice the function's ordinal - call the ordinal n - and the function's ordinal plus the next natural up from the absolute value, being zero once x > n. The integral of this on the whole number line will always be 2^(2n); redefine the function family by dividing by that and the integral will always be one. Now, obviously this isn't approaching a function whose y-intercept is 1! So multiply the whole thing by sqrt(pi*n/2) and change the discontinuities from the integers to the integers divided by sqrt(n/2); this will change the integral from 1 to sqrt(pi), and the tails will still vanish as n gets large.
Now let's look at the Riemann sums (defined for the sake of this proof as the sum of the values at the parameter closer to the origin times the length of the intervals) of e^(-x^2) from the negative square root of (pi*n) to the positive. As n increases, this will approach the improper integral by definition. Now, you've probably already sketched out the proof in your head and are wondering what's taking me so long. The answer is that I'm a drunk who flunked out of college and now rambles on about aliens who can't do geometry. So what's to be proven is that for every positive epsilon, there's both an n and a number of intervals to the Riemann sum such that increasing either will give a sum whose absolute difference from the nth function from the family above, plus the tails, is below epsilon. We'll go with an n that's double the number of intervals, so that the approximation can be considered as the integral of a function with the same discontinuities as function n from the family, although the values won't be the same. The value on either side of the y-axis would approach 1, as above, to a factor of at most 2n/(2n+1). Because the rest of the limits were found by bounding the proportions to this value, this applies to all others.
As for the other limits, the argument from the integral clearly isn't enough at the edges. So we have to put bounds on the edges as a whole as well as a bound in the middle, such that they all shrink. Well, you just take the series expansion of the natural log and then that of the fraction and fuck it you're aliens you can take it from here.
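Fine, aliens, here's the homework sketched numerically in Python (my discretization choices, not part of the proof): Riemann sums of e^(-x^2) over [-sqrt(pi*n), sqrt(pi*n)], using the endpoint closer to the origin as defined above, settling on sqrt(pi):

from math import exp, pi, sqrt

def riemann_inner(n, intervals):
    a, b = -sqrt(pi * n), sqrt(pi * n)
    width = (b - a) / intervals
    total = 0.0
    for i in range(intervals):
        lo, hi = a + i * width, a + (i + 1) * width
        x = lo if abs(lo) < abs(hi) else hi    # endpoint closer to the origin
        total += exp(-x * x) * width
    return total

for n in (1, 4, 16):
    print(n, riemann_inner(n, 100000), sqrt(pi))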
I'll probably also want to explain Fourier analysis, the connection between analytic geometry (involving the Cartesian products with which the aliens are familiar) and classical geometry, and maybe Bessel numbers, or contour integration, but all that can wait a bit.
0 notes
Text
Yet more thoughts on the Gauss-Wantzel Theorem, and beer.
I don’t really drink beer. Hurts my throat. The reason it hurts my throat has to do with the acid that surrounds the carbon dioxide bubbles. For this reason, I promised about two months ago to write a note concerning the mathematics of an unpublished paper that has to do with cinnamon breaking down that carbon. Unfortunately, I seem to have somehow managed to accidentally delete that paper from my hard drive, so what I’m going to do instead is ramble on blindly about what might be the mathematics of cinnamon and beer bubbles, based on my high school understanding of chemistry and undergraduate understanding of physics. But first, I’m going to finish what I started regarding the Gauss-Wantzel theorem.
That fifth plank of the last post, the one I abandoned, was how to actually apply all that wank. Let’s start by translating it into the language of high school algebra, which ostensibly is what I was trying to do in the first place. So let’s rewrite those first four planks:
1. Over the smallest possible set closed under addition and multiplication containing the rationals and all the compass points, the function given by raising the principal compass point to the power of a primitive root, while keeping all rational numbers fixed and preserving the operations of addition and multiplication, must by iterated composition ultimately reach a similar function for every power 1 through p-1.
2. This repeats every p-1 iterations.
3. One of these functions (other than the power 1) composed with itself yields the power 1, and when you compose any of the others with this one, you get pairs of functions for which composing any of the four combinations of two from separate pairs will land you in the same pair.
4. You can express any number in the set as the solution to a quadratic polynomial with coefficients in the subset that’s fixed under the function specified in step 3.
Moreover, the subset has similar properties: the pairs can be treated as a single function (they don’t differ on any of the fixed elements), and those pairs likewise form a cycle. Keep going and you’ll end up with only the trivial function, which means you must have the rationals. So all that’s left is to put it all together; a concrete p = 17 taste of the cycle-and-pairs business follows below.
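Since the planks are abstract to the point of vapor, here's a quick concrete sketch in Python (my own framing, not Gauss's): for p = 17 with primitive root 3, the function sigma_j sends the principal compass point to its (3^j)-th power; plank 1 says the 3^j hit every power, and plank 3's pairing is just j versus j+8 (mod 16).

p, g = 17, 3
powers = [pow(g, j, p) for j in range(p - 1)]        # 3^j mod 17 for j = 0..15
print(sorted(powers) == list(range(1, p)))           # plank 1: every power shows up

pair = lambda j: frozenset({j % 16, (j + 8) % 16})   # plank 3: {sigma_j, sigma_(j+8)}
print(all(pair(a + b) == pair(a + 8 + b) == pair(a + (b + 8))
          for a in range(16) for b in range(16)))    # composing lands in the same pair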
Now, let’s start out with the pentagon (hi, NSA!), known since Euclid, book IV, proposition XI. The roots of x^4 + x^3 + x^2 + x + 1. Like all quartics, this can be expressed as the product of two quadratic equations using only square and cube roots; unlike the general case, only a single square root will be necessary (in the coefficients of the quadratic equations, I should be clear; another will be needed to solve them). Of course, that’s not how he saw it. What he did was first construct an isosceles triangle with the angle at the vertex being half those at the base, and thus necessarily 36°. Put ten of these triangles around a common vertex and you’ll get a regular decagon, every other vertex of which will form a regular pentagon (not exactly how he did it, but never mind).
The way he made such a triangle (book IV, proposition X) was to cut one of the legs in the golden ratio (meaning, of course, the single square root mentioned above is that of 5), and draw the other so that the base is equal to the longer segment (i.e., that the ratio of the base to the leg is the golden mean). He saw that if you cut one of the legs by the golden mean, with the smaller part nearer the base, and drew a line from the cut to the far corner, this split the isosceles triangle into two smaller isosceles triangles; his reasoning was that the base would then be tangent to a circle with the cut, the vertex, and the far corner on it, based on the theorem that if you draw a line from a point outside a circle to the far side of a circle, and multiply just the part outside the circle by the whole line, for the same point outside you’ll always get the same result. By the properties of the golden mean, the whole leg (phi bases) by the part outside (phi-1 bases) will be equal to the square of the larger part (base^2). Since it’s the square of the base, and the base emanates from the same point, it must be tangent. Then he uses the corollary of Thales’ theorem that the angle from an arbitrary point on a circle to the endpoints of a particular chord will be the same for any such point on the same side of the chord, and that’ll be the same as one of the angles between the chord and the tangent. So the angle between the corner and the line to the cut is the same as the line at the vertex, and since the other corner is still the other corner, the triangle’s similar, so the line to the cut must be equal to the base, and therefore equal to the larger section of the leg, so both triangles are isosceles. Since the triangle containing the larger section is isosceles with the section as a leg, and the one containing the smaller section is similar with the section as base, the base angles of the larger triangle must be the sum of the vertex angle (since the one is isosceles) and the vertex angle (since the other is similar to the whole), so the base angles are each twice the vertex.
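Before moving on, a quick sanity check in Python (my factorization, not Euclid's and not Gauss's) that the pentagon's quartic really does split into two quadratics whose coefficients need nothing beyond that single square root of 5 - namely (x^2 + (1+sqrt5)/2 x + 1)(x^2 + (1-sqrt5)/2 x + 1):

from math import sqrt

def polymul(a, b):
    # multiply two polynomials given as coefficient lists, highest degree first
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

s5 = sqrt(5)
print(polymul([1.0, (1 + s5) / 2, 1.0], [1.0, (1 - s5) / 2, 1.0]))
# ~ [1, 1, 1, 1, 1], i.e. x^4 + x^3 + x^2 + x + 1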
So let’s throw all that out because I’m stupid. Remember that thing I said I wasn’t fucking doing? I’m still not fucking doing that. I am, however, digging up that old textbook to see how Gauss fucking did this, which means I basically am fucking doing that with just article 354. So he starts from the conclusion that the respective sums of the odd powers and the even powers of the primitive root must be the solutions to a quadratic equation over the rationals. That more or less follows from what I said above (when you consider that the sums must be fixed), and it’s a scalable algorithm, so yay. So he comes up with the formula x^2 + x - 4 for the two sums - by FIMO, the middle term is the negative of the sum of the two roots; that sum is the sum of all the compass points (bar +1), which is obviously -1 since with +1 they’d all add up to zero; multiply by -1 and the middle term is +1. The constant term, which is the product of the sums, is a bit trickier, and uses a technique he introduces in article 349.
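If you don't feel like trusting me (wise), here's a numeric check in Python - my arrangement, not Gauss's notation - that for p = 17 and primitive root 3 the two period sums really are the roots of x^2 + x - 4:

import cmath

p, g = 17, 3
zeta = cmath.exp(2j * cmath.pi / p)
powers = [pow(g, k, p) for k in range(p - 1)]                   # 3^k mod 17, k = 0..15
even = sum(zeta ** powers[k] for k in range(0, p - 1, 2))       # sum over even powers of 3
odd = sum(zeta ** powers[k] for k in range(1, p - 1, 2))        # sum over odd powers of 3
print((even + odd).real, (even * odd).real)                     # ~ -1 and ~ -4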
This technique involves the use of “Newton sums,” which you might remember from high school. (I don’t know. I never went to high school. I only heard of them when I had to look them up after their use in article 341.) Even if you don’t know the roots, you can find the sum of their nth powers using the coefficients and the lower such sums - assuming the coefficient of the highest-degree term is 1, the sum of the roots will, as before, always be the negative of the second coefficient. From there, the sum of the squares will be the negative of the sum of the second-highest-degree term’s coefficient times the first sum and twice the third-highest-degree term’s coefficient; the sum of the cubes will be the negative of the sum of the second-highest-degree term’s coefficient times the sum of the squares, the third-highest-degree term’s coefficient times the first sum, and thrice the fourth-highest-degree term’s coefficient; and so on. From here the coefficients can be worked out one by one using the previous ones and the fact that the power sums of the individual terms can be found by starting at the 2nd, 3rd, whateverth element of the Gaussian period (the sums).
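For the record, here's a small Python sketch of that recurrence (mine, not the Disquisitiones'): feed it a monic polynomial's coefficients, highest degree first, and it hands back the power sums of the roots without ever finding the roots.

def power_sums(coeffs, how_many):
    # coeffs = [1, c1, c2, ...] for x^n + c1*x^(n-1) + ...; returns p_1 .. p_how_many
    n = len(coeffs) - 1
    p = []
    for k in range(1, how_many + 1):
        s = -k * coeffs[k] if k <= n else 0
        for j in range(1, min(k - 1, n) + 1):
            s -= coeffs[j] * p[k - 1 - j]
        p.append(s)
    return p

print(power_sums([1, -6, 11, -6], 3))   # roots 1, 2, 3 -> [6, 14, 36]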
I really should keep going from there, but I realize now that I’ve completely failed in what was originally my mission in this meditation, if not before, certainly at the point where I linked five hundred pages of Latin. So I’ll go the whole hog and pseudocode this fucker here and now. Except Tumblr formatting makes it pretty much impossible to pseudocode in any quasilanguage other than ±LISP, so I’ll do that. But first, a bit of þe olde BotE.
Start by working out the primitive root, obvs. Not sure how exactly to do that, get to that later. Shouldn’t be that hard. (...oh. It’s always 3. Well, that was a piece of piss.) Then start work on the polynomials, which will all be quadratics starting with x^2. Add up the targets and multiply by -1 to get the second term. The third is the product of the targets, which you can work out by adding up the powers from the starting point. This incidentally can be seen by symmetry (and the fact that you’re not going to get to the unit by adding 1 to any odd power of 3) to always start out with x^2 + x - (p-1)/4. So I briefly considered “coding” it in human language, but I decided not to bother, and instead I’m going to skip straight to the invisible part where I write a ±LISP program.
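A quick check of the "it's always 3" crack, in Python rather than the promised ±LISP (the check is mine; the fact is old), along with the opening quadratic each Fermat prime would get:

def is_primitive_root(g, p):
    # g is a primitive root mod p iff its powers hit every nonzero residue
    x, seen = 1, set()
    for _ in range(p - 1):
        x = x * g % p
        seen.add(x)
    return len(seen) == p - 1

for p in (5, 17, 257, 65537):
    print(p, is_primitive_root(3, p), "opening quadratic: x^2 + x -", (p - 1) // 4)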
So the nightmarish pain in the ass turns out to be coming up with a way to encode it. I mean, it’s easy to do it as a floating-point number (it is pseudocode, isn’t it?) but doesn’t that sort of defeat the purpose? So let’s be as recursive as we’re being - Polish notation. Have the program write out a program that comes to a root. So for instance, a fifth root of unity might be:
(+ (/ (+ (sqrt 5) (- 1)) 4) (sqrt (/ (- (+ 5 (sqrt 5))) 8)))
For which we simply need to output, as well as any counting number, /, +, -, and sqrt. (Multiplication won’t be necessary; parentheses won’t be either, since they’re implied in the operators, which each admit a well-defined number of arguments. Note that - is unary.) Anyway, back to that BotE, you just add them to get the (negative) x term of the quadratic equation, and multiply to get the constant term; that’s not hard since as you dive down, you’re demonstrably never going to get outside the periods previously worked out. You do, however, need to work out the roots in advance, which isn’t hard, and keep them, which is a sin in LISP.
So that'd be it. Basically. You just do that. But I've been at this for months, and while an answer at this point seems pretty quick... eh... screw the pseudocode. So basically, you take all the periods, and you recognize that their products can be expressed in terms of other periods - take each term of the one, multiply it into the whole of the other, add everything up, and what comes out regroups into periods you've already worked out. So you get a quadratic equation where the "b" term is the sum, which we already know (since we just got it), and the "c" term, which isn't hard to get by that procedure. You can tell that the c term will always live one level up, because when you use the transformation one level up from where you're already fixed, the two periods get swapped around, so their product is fixed one level up, so it lives one level up. So yeah.
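Since I welched on the ±LISP, here's the numeric side of it as a Python sketch (my own framing of the bookkeeping, not Gauss's, and certainly not a constructibility proof): for p = 17 and primitive root 3, split the powers of 3 into nested halves, sum the corresponding compass points, and print the quadratic x^2 - (sum)x + (product) each sibling pair of periods satisfies.

import cmath

p, g = 17, 3
zeta = cmath.exp(2j * cmath.pi / p)
order = [pow(g, k, p) for k in range(p - 1)]       # exponents listed in "period order"

def period(start, step):
    return sum(zeta ** order[i] for i in range(start, p - 1, step))

def descend(start, step):
    if step == p - 1:
        return
    a, b = period(start, 2 * step), period(start + step, 2 * step)
    print(f"step {step}: x^2 - ({(a + b).real:.4f})x + ({(a * b).real:.4f})")
    descend(start, 2 * step)

descend(0, 1)   # first line is x^2 - (-1)x + (-4), i.e. x^2 + x - 4 from above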
Oh, right, beer. I'll get to that later, fuck it. Along with actually making sense of whatever the fuck I wrote here. This post is my Cujo, only shit. Arf!
0 notes
Text
A shape very near to my mouth.
Now, you’ve probably noticed, my avatar shows an approximately cylindrical tumbler. I might have used the rocks glass, but I’m glad I didn’t, since this right now is more interesting, and prompted me to hold in abeyance my Ghostbusters “review.”
Now, cinemas give me panic attacks, so I’m not planning on seeing it, despite the risk of Slate reporters and/or Patton Oswalt accosting me on the street. (Although the fact that Howard Taylor liked it tempts me to rent it - not buy it - as a digital download someday.) I was, however, going to write up a post entitled “I ain’t afraid of no gauge ghost!” regarding the connection between -1/12 and bosonic string theory and my attempts to create a video game, and I still plan to do so, but it turns out - and this shocked me - bosonic string theory is a bit complicated, so I decided to play some video games instead. While playing those games, I sipped from the tumbler seen in my avatar, noticed the shape I was making, and decided to talk about it.
At first glance, it looks a little like a cone, which is simple, right? The base times the height times a third. But that relies on the cross-section being proportional to the square of the distance from the base, and here, that’s not so certain. It seems obvious if we’re inside the diameter, but what seems obvious isn’t always right (hence my disclaimer). So what actually is the cross section? Well, first, let’s come up with the parameters.
So let’s say, height (that is to say, distance from base to my mouth), radius (...really?) (the radius will be the unit when not otherwise specified), and the extreme of the liquid as a multiple of the radius, let’s call it depth. So let’s go with the cross-section. Now, the depth is going to be linear, so once we’ve gotten the cross-section, the volume’ll be the integral of the cross-section with respect to the depth. So what’s the cross-section? Well, it would be the area bounded by the circle and a chord whose distance from a parallel tangent would be the depth. This would be the wedge minus a triangle. The wedge is the area of the circle times the angle over tau, and the angle would be twice the arccosine of the distance from the center to the chord, which is one minus the depth. Bearing in mind that the area of the circle is pi square radii, that comes to just the arccosine. Now, as we told the aliens, the arccosine is the integral of one over the square root of one minus t^2, with respect to t, from x to 1.
Okay, so subtract from that the triangle, which is going to be the distance from the center to the chord (one minus the depth) multiplied by the square root of one minus the square of that same distance. All that’s left now is to take the integral. The problem is that the integral of the arccosine is a pain to haul around (it does have an elementary form, by parts, but humor me), so let’s say briefly that the slope is 1/2, so that if we barely touched the far corner the height would be equal to the radius (which is the unit). Then the volume is the integral from zero to the base depth of the arccosine of one minus the parameter of integration, all plus a third the cube of the square root of one minus the square of the base depth, minus a third. Now, of course that only works if the height is less than half the radius. So if the depth of a slice is greater than that you’ll get that it’s pi less what the other side might have been, so for the part that crosses the diameter what you have is pi times the remaining depth minus the integral of the arccosine of the remaining space, all plus the cube of the square root of one minus the square of the remaining space, minus a third.
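In case I've mangled the cross-section (it's been known to happen), a quick numeric check in Python - mine, obviously - that wedge-minus-triangle agrees with just integrating the chord lengths, with the radius as the unit:

from math import acos, sqrt

def segment_formula(d):
    # arccos(1-d) minus the triangle: (1-d) * sqrt(1 - (1-d)^2)
    return acos(1 - d) - (1 - d) * sqrt(1 - (1 - d) ** 2)

def segment_numeric(d, steps=100000):
    # brute force: chord length 2*sqrt(1-x^2) integrated from 1-d to 1
    a, h = 1 - d, d / steps
    return sum(2 * sqrt(max(0.0, 1 - (a + (i + 0.5) * h) ** 2)) for i in range(steps)) * h

for d in (0.1, 0.5, 1.0):
    print(d, segment_formula(d), segment_numeric(d))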
Anyway, for different slopes, just stretch it. G’night.
(Oh, right, and you find the integral of the arccosine by doing a binomial expansion of the denominator and integrating term-by-term.)
0 notes
Text
I really should be uploading this at 8:15 tomorrow morning.
So Immanuel Kant (like me, a real pissant who was very rarely stable) used geometry - by which he meant Euclidean geometry - as his archetypal example of an entire class of knowledge. The earthshattering revelation that it’s possible to draw on a sphere put paid to this. In fact, Descartes - that thief to whom I intend to apologize at some future point - was instrumental in showing, even as he helped to create the duo of dualisms that plague us, that geometry is rather a subset of logic. The digits of tau, random as they seem - every one follows directly from first principles, and could not be otherwise. Of course, circles can be otherwise, so what’s a better definition?
Let’s go at it this way: tau is the lowest strictly positive real number x such that e^(i*x) = 1. Of course, let’s say I were to try to explain this to al-Khwarizmi: while I could easily explain e, two more pressing issues are raised. First, what the hell does it mean to raise something to an imaginary power; second, what the hell is an imaginary number? Every schoolchild knows the answer to the latter question, but it wouldn’t arise in any serious context until the sixteenth century, and to some any argument for its existence fell upon deaf ears. Educated modern people like to laugh at that fact, but the dirty little secret is that numbers don’t exist at all, natural numbers no more than complex.
The definitions of the numbers would take too long to go into, but to give an idea, the real numbers are most commonly defined by something called a “Dedekind cut,” two disjoint, non-empty sets of rationals that split them all entirely between them, such that every element of one set is greater than every element of the other, and either the upper or lower set (it doesn’t vary, but I can’t remember which is “canonical”; both work as a definition) has to be open (if they’re both open, it’s irrational). But more important is the famous formula on Broom Bridge: i^2 = j^2 = k^2 = ijk = -1. This is no more or less valid than complex numbers, so remember, while the utility of complex numbers is certain, when you hear someone angrily say without qualification “complex numbers exist!” in a way they’re dismissing William Hamilton, although I doubt he’d see it that way (fortunately, he’s dead, and there to date are no musicals about him that I know of).
So, then, if I’m to break it down, just by the traditional understanding, the exponential function (i.e., e^x) evaluated at an imaginary number is usually broken down into an infinite series, which can be seen as two infinite series, one real, one imaginary, and these happen to be the series for the cosine and the sine. Now, in my last update, I linked to a geometric proof of this that didn’t rely on infinite series, but I’m hoping not to rely on geometry at this stage. Of course, without invoking geometry, the only way I would have of defining the sine and cosine would be through the exponential function, which would be a circular argument, so to speak. So what am I going to do instead?
Well, first off, as I said before, I’m going to give in to the concession that imaginary numbers “don’t exist.” As threatened before, let’s think of this in terms of 2-vectors. So how to define the exponential function for such two-vectors? How to do it, that is, without implicitly assuming that the two-vector does indeed represent a number? Any such definition is, in essence, completely arbitrary. Well, the short answer is, we won’t. Instead, what we’ll do is that we’ll define a different function, from the real numbers to the Euclidean 2-plane, for which one component is equal to the even terms of the Maclaurin series (the successive natural powers divided by the factorials), and the other the odd, both with the signs of every other term flipped to negative. So, in short, I’m going to do exactly what I just said I wasn’t going to do, define the sine and cosine through the exponential function, only I’m not going to call it the exponential function, and I’m not going to call them the sine and cosine, but pretend they’re some arbitrary idea I shat out my forehead like Athena, and I’m going to pretend that this somehow makes it okay.
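For the terminally suspicious, here's that two-vector function as a Python sketch (the names are mine, nobody else's); the components had better come out as what we'd otherwise call the cosine and the sine:

from math import factorial, cos, sin

def two_vector(x, terms=40):
    c = sum((-1) ** k * x ** (2 * k) / factorial(2 * k) for k in range(terms))          # even terms
    s = sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1) for k in range(terms))  # odd terms
    return c, s

print(two_vector(1.0))        # ~ (0.5403, 0.8415)
print((cos(1.0), sin(1.0)))   # same pair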
So now that I’ve proven myself an unqualified shitlord (mendacious kyriarch!), let’s go back to geometry. Well, first, we have to define the circumference of a circle, or, in fact, the length of anything that isn’t a straight line. The naïve definition might be to lay a string along it and pull it taut, assuming no flexibility (fuck you Alan Davies) and nothing physical whatsoever. That’s not a very good definition, for reasons the befucked Alan Davies is damned for giving. Euclid did not bother to define it, only saying that “a line is breadthless length,” suggesting at a few points that there were lines other than straight lines, but not dealing with them. Few other than Archimedes concerned themselves with the lengths of curves that couldn’t be broken down into straight lines; he used the method of exhaustion, but this only works because the interior and exterior shapes he uses are both concave. You’ve all seen how abuse of this logic can make tau apparently equal to eight (or between 5.72 and 5.73), though, and what saves Archimedes wouldn’t work for an arbitrary curve. You have to use infinitesimals, and that’s a definition that’s ultimately arbitrary, and unknown to Archimedes (although, like ancient winemakers and distillation, many think he used it privately).
As always, the Chinese were better; they used the area, using the elegant formula tau*r^2/2. This scared me while I was writing that weird-ass story about a man paying a prostitute to integrate by parts that must have made me look like the unholy spawn of Blaise Pascal and Chris Claremont (there’s a reason they say “harder than Chinese algebra”... actually a very interesting one I hope to get to, but this time I meant Bishop Berkeley pitching a tent in the woods) because I thought they’d come up with a very good way of getting there at the time the story was set, but it turns out they’d only come up with a very good way of expressing it once you already had a formula. The formula that got them to the celebrated... er... 2*355/113... was found simply by approximating the arcs outside the inner polygons as parabolae, although I don’t think the mathematician who worked out that it would always be an underestimate realized the parabola connection (or that it was taking the Maclaurin series of the semicircle equation out a term); pretty sure the area of parabolae was just a Western thing back then.
The fact that it’s the same constant (or one by the other) comes from the idea of approximating it as a bunch of little isosceles triangles. These triangles will have area half the base by the height. For very small triangles, the difference between the height and the side (which is the radius) will be negligible. So add them up and you get that the whole circle will be half the product of the radius and the circumference.
And so we get to the more interesting question, what exactly is a circle? Well, let’s say that an alien were among us, ignorant of our laws, courts, customs, and perception of an approximately Euclidean universe, but they do get infinitesimals and infinite series. Let’s define tau, as before, as the point where those two functions up there get to (1, 0). How do we get from there to a more familiar definition, such as four times the integral of sqrt(1-x^2) from -1 to 1? It could be a very useful concept for linear algebra as well as Fourier analysis. So what’s to be proven is that the smallest strictly positive real number where those two functions reach (1, 0) is four times that integral.
Well, let’s start by naming these functions, defined by series, “cosine” and “sine” - hoping nonetheless to strip geometric preconceptions from your mind. From the series representations before, you can see that the derivative of the cosine is the minus sine, and that of the sine the cosine, so we get the familiar Celtic knot of functions. Still, don’t imagine geometry - that glyph has nothing to do with geometry! The aliens couldn’t parse such a glyph anyway, of course, but they understand what it represents, that is to say: sin -> cos -> -sin -> -cos -> sin. You can also come to the identity sin^2 + cos^2 = 1 by just working through the coefficients. So from there, let’s define the inverse function, x = sin f(x), so 1 = f’(x)*cos f(x), so f’(x) = 1/(cos f(x)), so the derivative of the inverse would be sign(cos f(x))/sqrt(1-x^2); a similar argument will get you to the derivative of the inverse cosine - call it g - being -sign(sin g(x))/sqrt(1-x^2). (The aliens look at me confused, and I quickly write instead 1/(1-x^2)^0.5.) So if arclength were a thing, it’d be a hop, skip, and a jump from here, but fuck you. Aliens.
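The "working through the coefficients" bit, done by machine because I'm lazy (a Python sketch, exact arithmetic, truncated at degree 11 - my truncation, nothing canonical about it):

from fractions import Fraction
from math import factorial

N = 12   # keep coefficients of x^0 through x^(N-1)
cos_c = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0) for k in range(N)]
sin_c = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0) for k in range(N)]

def square(c):
    # multiply the truncated series by itself, keeping degrees below N
    out = [Fraction(0)] * N
    for i in range(N):
        for j in range(N - i):
            out[i + j] += c[i] * c[j]
    return out

total = [a + b for a, b in zip(square(cos_c), square(sin_c))]
print(total == [Fraction(1)] + [Fraction(0)] * (N - 1))   # True: sin^2 + cos^2 = 1, coefficient by coefficient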
So let’s move on. Let’s try to move from arclength to area as we did before, with triangles, but without the benefit of geometry. First, we have to consider what an integral is - the Riemann sum, the limit, as n increases without bound, of the sum of the values at the points x = a + k*(b-a)/n for k from 1 to n, each times (b-a)/n. So how do we explain to these aliens who know nothing of geometry what we mean by “triangle”? Well, luckily, these aliens already know the trapezoidal rule, although of course they don’t know it by that name. Making a quick human huddle, you can see geometrically that this creates a shape that’s easier to convert to triangles (albeit not congruent ones) radiating from the origin. So now to separate it all into more of those “shape” things that add up to both the “trapezoids” and the “triangles.” We tell the aliens we’ll only be considering one half, and hope to prove it adds to an eighth of tau.
Okay, a slightly longer human huddle. The smallest trapezoid is already a triangle, so that’s easy. At the top of each trapezoid, likewise, will be a natural triangle once we’ve drawn the radiating triangles, and natural triangles will make up all of the leftmost trapezoid. As for the rest, we need quadrilaterals, which will have two parallel sides from the trapezoid. This gives every quadrilateral exactly two lines parallel to the y-axis, and taking that as the base we get a natural height of 1/n for all the triangles formed by the diagonals. So once we’ve figured out how to partition the vertical lines, the trapezoids part will be trivial. Tougher is the triangle part - each is an isosceles triangle with legs of length 1 and base equal to the differential arclength; the height approaches 1 for sufficiently high n, so this lets us get the arclength in. We need to convince the aliens that the triangles/quadrilaterals add up to both of these, and therefore that they’re equal. Dammit, I see them (if “see” is the right word for how we register their presence, something none of us can agree on) coming back from the Cinnabon now. BREAK!
So I explain to the aliens that we’re breaking up each term k (counting from 1) of the sum into (n-k+1) terms, let’s use the letter l to index them, again counting from 1. For l = (n-k+1), it’ll be (f((k-1)/n) - f(k/n)*(k-1)/k)/(2n); for the rest, it’ll be (f(1-l/n)*(2k-1)/(n-l) - f(1-(l-1)/n)*(2k-1)/(n-l+1))/(2n). Add up every l of the latter type for a given k and you’ll get f(k/n)*(2k-1)/(2n*k); add this to the final element to get the formula for the “trapezoid.” Now let’s add up every k, from one to n-l+1 (remember, k <= n-l+1 iff l <= n-k+1) for each l. The terms of the latter form add up to (n-l)^2*(f(1-l/n)/(n-l) - f(1-(l-1)/n)/(n-l+1))/(2n), whereas the former can be rewritten as (f(1-l/n) - f(1-(l-1)/n)*(n-l)/(n-l+1))/(2n). Therefore, adding them together gives ((n-l+1)*f(1-l/n) - (n-l)*f(1-(l-1)/n))/(2n). (Sorry, I try to avoid using this much notation, but, aliens.) So I ask the aliens to multiply both the numerator and the denominator by the completely arbitrary (wink wink) quantity of ((n*(f(1-l/n) - f(1-(l-1)/n)))^2 + 1)^0.5, and to factor out the same divided by n, leaving a remaining factor of ((n-l+1)*f(1-l/n) - (n-l)*f(1-(l-1)/n))/(2*((n*(f(1-l/n) - f(1-(l-1)/n)))^2 + 1)^0.5). That first factor looks suspiciously like a little piece of what we’d call the arclength, so what’s left is to show that that horrible mess comes to 1/2. Let’s first rewrite it as (1/2)*(f(1-l/n) - (n-l)*(f(1-(l-1)/n)-f(1-l/n)))/((n*(f(1-l/n) - f(1-(l-1)/n)))^2 + 1)^0.5. For very large n, this will approach (1/2)*(f(1-l/n) - (1-l/n)*f’(1-l/n))/((f’(1-l/n))^2 + 1)^0.5. So now we substitute in f(x) = (1-x^2)^0.5, f’(x) = -x/(1-x^2)^0.5 to get (1/2)*((1-(1-l/n)^2)^0.5 + (1-l/n)^2/(1-(1-l/n)^2)^0.5)/((1-l/n)^2/(1-(1-l/n)^2) + 1)^0.5 = (1/2)*(1-(1-l/n)^2 + (1-l/n)^2)/((1-l/n)^2 + 1 - (1-l/n)^2)^0.5 = 1/2.
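Human-huddle verification, since the aliens can't check my arithmetic and neither can I: a blunt numeric sketch in Python (entirely mine) for f(x) = sqrt(1-x^2) on [0, 1], showing the trapezoids and the origin-radiating triangles both creep toward tau/8 = pi/4:

from math import sqrt, pi

def f(x):
    return sqrt(max(0.0, 1 - x * x))

def trapezoids(n):
    return sum((f(k / n) + f((k + 1) / n)) / 2 * (1 / n) for k in range(n))

def origin_triangles(n):
    pts = [(k / n, f(k / n)) for k in range(n + 1)]
    # shoelace area of each triangle with vertices (0,0), p, q
    return sum(abs(x1 * y2 - x2 * y1) / 2 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

for n in (10, 100, 1000):
    print(n, trapezoids(n), origin_triangles(n), pi / 4)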
So after that hellish paragraph, time for the “hop, skip, and jump” mentioned before, inasmuch as it can be seen that the integral of the function is the integral of (1+(f’(x))^2)^0.5/2, which comes to 1/(2*(1-x^2)^0.5). Now, it can be seen from the series definitions that the sine of zero is zero and the cosine is one, and the sine is increasing as the cosine is peaking. The sine will continue to increase until the cosine hits its next zero, and if the sine is increasing from zero, it must be positive, so over this range, the inverse functions are monotone. Therefore, the sine of twice this integral will be one as we go from zero to one, and the cosine of twice the integral will be zero as we go from one to zero. So for twice the integral, the sine is one and the cosine is zero - what about four times the integral? Well, now that the cosine has hit zero, the sine has peaked - since the sine is positive and the cosine is becoming negative, both are decreasing. Therefore, following the inverse, we get that the sine of four times the integral is zero and that the cosine is minus one. Now that the sine is zero, the cosine’s reached its nadir, so the cosine is increasing while the sine is decreasing; therefore, six times the integral has a sine of minus one and a cosine of zero. And now, the cosine is increasing whereas the sine has passed its nadir, so eight times the integral will bring you back to one and zero, and thus is tau. And so, the aliens, although they still wonder at our strange “geometry” faith that makes it meaningful, now understand the connection, and THIS! SHALL BE KNOWN! AS OUR! INDEPENDENCE DAY!
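And because the aliens demand receipts, a last Python sketch (the step counts are mine, nothing sacred about them): take tau as four times the integral of sqrt(1-x^2) from -1 to 1, feed it back into the series-defined pair, and land on (1, 0):

from math import sqrt, factorial

def series_cos_sin(x, terms=60):
    c = sum((-1) ** k * x ** (2 * k) / factorial(2 * k) for k in range(terms))
    s = sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1) for k in range(terms))
    return c, s

n = 200000
tau = 4 * sum(sqrt(1 - (-1 + (i + 0.5) * (2 / n)) ** 2) * (2 / n) for i in range(n))
print(tau)                  # ~ 6.28319
print(series_cos_sin(tau))  # ~ (1.0, 0.0)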
So ultimately this is just a circuitous (so to speak) exercise in the obvious, but it was written in haste to make the tau day deadline (and maybe then set to private and edited a couple times).
0 notes