Classical E&M V: Electric Multipoles
Over the last few weeks, we derived exact expressions for the charge densities, electric fields, and electric forces of generic charge configurations. These results are often as complicated as the charge configurations that generate them: the resulting expressions can be computationally intensive and visually gnarled, obscuring the larger physical principles at play. We can regain a conceptual hold on these quantities by examining systems in certain limits, such as when our sources and targets are well-separated.
Today, we look at the electric force a point charge $q^\prime$ at $\vec{r}^{\text{ }\prime}$ (near the origin $\vec{0}$) exerts on a point charge $q$ at $\vec{r}$ when they are very far from each other. This naturally generates an expansion in $|\vec{r}^{\text{ }\prime}|/|\vec{r}|$:
We call this the electric multipole expansion of the electric force, and give each term a name:
This week we’ll derive the multipole expansion of a point charge, extend it to charge configurations, and explore the monopole term. Next week, we’ll delve into the dipole and quadrupole terms.
I. Significance Comparisons and the Name of the Game
The electric force which $q$ at $\vec{r}$ experiences due to $q^\prime$ at $\vec{r}^{\text{ }\prime}$ is given by Coulomb’s law:
$$\vec{F}_{q^\prime\rightarrow q}(\vec{r}) = u\,q\,q^\prime\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}$$
As mentioned in the intro, we’ll take $\vec{r}^{\text{ }\prime}$ close to the origin $\vec{0}$ and choose $\vec{r}$ to be extremely far away in comparison.
To use technical language, we assume by construction that $|\vec{r}^{\text{ }\prime}|$ is significantly less than $|\vec{r}|$, which we write symbolically as,
$$|\vec{r}^{\text{ }\prime}| \ll |\vec{r}|$$
We could equivalently say that $|\vec{r}|$ is significantly greater than $|\vec{r}^{\text{ }\prime}|$ and write,
$$|\vec{r}| \gg |\vec{r}^{\text{ }\prime}|$$
We only use “$\ll$” and “$\gg$” to compare positive numbers. For what values does the statement $|\vec{r}^{\text{ }\prime}| \ll |\vec{r}|$ hold true? Well, it depends on how accurate we’re aiming to be in our calculation. If we can only measure distance to a single digit, then $|\vec{r}^{\text{ }\prime}| = 0.1$ is significantly less than $|\vec{r}| = 1$:
$$1 + 0.1 = 1.1 \approx 1 \quad (\text{to one digit of accuracy}).$$
However, if we can measure two digits worth of accuracy, then $|\vec{r}^{\text{ }\prime}|=0.10$ is not significantly less than $|\vec{r}|=1.0$, because
$$1.0 + 0.10 = 1.1 \neq 1.0.$$
Whether a number is significantly less than (or greater than) another number depends on the accuracy of the calculation at hand.
Assuming that $|\vec{r}^{\text{ }\prime}| \ll |\vec{r}|$, we may define their ratio as a small parameter that I’ll label $\epsilon$:
$$\epsilon \equiv \frac{|\vec{r}^{\text{ }\prime}|}{|\vec{r}|} \ll 1$$
TODAY’S GOAL: write $\vec{F}_{q^\prime\rightarrow q}$ in terms of $\epsilon$, then expand $\vec{F}_{q^\prime\rightarrow q}$ as a sum of terms weighted by increasing powers of $\epsilon$. This is useful because the terms are organized by relative importance: the $\epsilon$ term contributes more to the total force than the $\epsilon^2$ term, which contributes more than the $\epsilon^3$ term, and so-on.
First, note that we may expand the denominator of Coulomb’s law like this:
By definition, the dot product between $\vec{r}$ and $\vec{r}^{\text{ }\prime}$ is proportional to $\cos\theta$, where $\theta$ is the smallest angle between the vectors:
$$\vec{r}\cdot\vec{r}^{\text{ }\prime} = |\vec{r}|\,|\vec{r}^{\text{ }\prime}|\cos\theta$$
Hence, the denominator equals,
$$|\vec{r}-\vec{r}^{\text{ }\prime}|^{3} = |\vec{r}|^{3}\left(1 - 2\,\frac{|\vec{r}^{\text{ }\prime}|}{|\vec{r}|}\cos\theta + \frac{|\vec{r}^{\text{ }\prime}|^{2}}{|\vec{r}|^{2}}\right)^{3/2}$$
Meanwhile, the quantity in parentheses equals,
When we combine these elements, we obtain the following expression for the force:
which equals, when expressed in terms of $\epsilon$,
$$\vec{F}_{q^\prime\rightarrow q}(\epsilon) = \frac{u\,q\,q^\prime}{|\vec{r}|^{2}}\left(1 - 2\epsilon\cos\theta + \epsilon^{2}\right)^{-3/2}\left(\hat{r} - \epsilon\,\hat{r}^{\prime}\right)$$
We’re going to power series expand this expression in terms of $\epsilon$ and generate the multipole expansion of the electric force in the process. In pursuit of this goal, let’s elaborate on how to handle power series.
II. Power Series, Approximations, and Big O Notation
A power series in $x$ is an infinite sum of increasing powers of $x$. All power series may be written in the following form:
$$f(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$
Many functions we’ll deal with in physics have power series representations. For example, $\sin x$ (which outputs a ratio of right triangle side lengths) may be expressed as a power series in the real variable $x$ (aka, a certain right triangle angle), like so:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
A function $f(x)$ that has a power series representation at every value of $x$ is called an analytic function. An analytic function might require multiple power series descriptions to accommodate all values of $x$.
Sometimes a function’s power series will be valid beyond the original domain of the function, allowing us to extend the function to a larger domain; this is called an analytic continuation. For example: $\sin x$ is initially defined only over real values of an angle $x$. However, the power series yields sensible results for complex values of $x$ too, and thereby allows us to analytically continue sine’s domain to $\mathbb{C}$.
We’ll worry about analytic continuations another day. For now, consider a generic power series expansion in real-valued $x$. If $|x|\ll 1$, then,
$$1 \gg |x| \gg |x^2| \gg |x^3| \gg \cdots$$
Oftentimes the coefficients $a_n$ in the power series either shrink, or else grow too slowly as $n\rightarrow \infty$ to combat the decreasing values of $|x^n|$, so that the terms in the power series satisfy a similar chain of conditions:
$$|a_1 x| \gg |a_2 x^2| \gg |a_3 x^3| \gg \cdots$$
This proves extremely useful for practical calculations. See, the $a_n$ provide infinitely many free parameters, allowing us to construct a cavalcade of extremely complicated functions. However, if $x$ is naturally small, then terms with larger powers of $x$ won’t contribute much to the final result, and we might be able to neglect all but a finite number of terms to good accuracy. This is called truncating the series. For example, maybe a function $f(x)$ is well-represented to first order in $x$, in which case we’d utilize the following approximation:
$$f(x) \approx a_0 + a_1 x$$
We’ve dropped the $a_2$ term and everything beyond it to get a simpler function, at the cost of some accuracy; an approximation is only as useful as it is precise enough for the task at hand. The function $\sin x$ is often well-approximated by its first order approximation ($\sin x \approx x$), as evidenced by their similarity when plotted:
If first order isn't accurate enough, then we can go to second order in $x$:
$$f(x) \approx a_0 + a_1 x + a_2 x^2$$
In general, we say we’ve approximated $f(x)$ to $n$th order in $x$ if we keep terms up to $a_n$:
$$f(x) \approx a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
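To see truncation in action, here’s a quick numerical sketch in Python (not part of the original derivation; the test value $x=0.3$ is an arbitrary choice) that truncates the sine series at successive orders and watches the error shrink:

```python
import math

def sin_truncated(x, order):
    """Partial sum of the sine power series, keeping terms up to x**order."""
    total, n = 0.0, 0
    while 2 * n + 1 <= order:
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        n += 1
    return total

x = 0.3   # an arbitrary smallish test value
for order in (1, 3, 5):
    approx = sin_truncated(x, order)
    print(f"order {order}: approx = {approx:.8f}, error = {abs(math.sin(x) - approx):.1e}")
# Each extra kept term slashes the error by a couple orders of magnitude.
```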
Constructing these approximations (approximately valid when $x$ is small) yields expressions that can be easier to work with than the full expressions (which are exactly valid everywhere), but cost us information as a result. For example, let’s compare $f_1(x) = x + x^2 + x^3$ and $f_2(x) = \sin x$.
Now, as the above plot makes clear, these functions are not equivalent.
However, these functions are indistinguishable to first order!
This is why the functions look so similar near $x=0$ in the plot: there simply isn’t enough information at first order to tell them apart. Their degeneracy is broken when we go to second order in $x$:
$$f_1(x) \approx x + x^2, \qquad f_2(x) \approx x + 0\cdot x^2$$
Whether or not the error from neglecting second order (and/or higher) terms matters to us is dependent on the details of the calculation at hand. We can precisely track our approximation errors via big O notation.
Big O notation provides a symbol $\mathcal{O}(x^n)$ that--when tacked onto an expression--says “this expression is exact up to $(n-1)$th order in $x$.” We might use big O notation to express the equivalence of $f_1$ and $f_2$ up to $x^2$-sized errors:
$$f_1(x) = x + \mathcal{O}(x^2), \qquad f_2(x) = x + \mathcal{O}(x^2),$$
whereas we could also use big O notation to demonstrate their subsequent nondegeneracy when the errors are reduced to about $x^3$ in size:
$$f_1(x) = x + x^2 + \mathcal{O}(x^3), \qquad f_2(x) = x + \mathcal{O}(x^3).$$
Conceptually, the big O symbol sweeps up all terms of the indicated order, giving a limit on how fast the approximation error grows with increasing $x$. That means, among other things,
Well, at least, this is typically the case, and works well for asymptotics--the study of functions as their arguments diverge (e.g. $x\rightarrow \pm \infty$). When we work with small values of $x$, this symbol possesses limitations closely related to the limitations of “$\ll$” and “$\gg$”. To be concrete, suppose we’re interested in $x\sim 0.01$. Then the quantity $100 x^n$ is actually an $(n-1)$th order term, because:
$$100\,x^n = (100\,x)\,x^{n-1} \sim x^{n-1} \quad \text{when } x \sim 0.01.$$
Then again, if we’re interested in $x\sim 0.001$, then $100x^n$ remains small enough to lump in with the $n$th order terms. Like most things in physics, big O notation requires a careful understanding of when we intend to use the approximation. When in doubt, plug in explicit values and see how numbers compare.
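In that spirit, here’s a tiny Python check (the coefficient of $100$ on the $x^3$ term is hypothetical, mirroring the example above):

```python
# Suppose a series has a term 100*x^3 (the coefficient 100 is hypothetical).
# Is that term second order or third order? Plug in numbers and compare.
for x in (0.01, 0.001):
    print(f"x = {x}: x^2 = {x**2:.1e}, 100*x^3 = {100 * x**3:.1e}, x^3 = {x**3:.1e}")
# At x = 0.01, 100*x^3 is exactly as big as x^2 (a second order term);
# at x = 0.001, it is ten times smaller than x^2, so counting it as third order is safer.
```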
The electric monopole, dipole, and quadrupole terms of the multipole expansion correspond to the $1$, $\epsilon$, and $\epsilon^2$ terms in the power series expansion of $\vec{F}_{q^\prime\rightarrow q}(\epsilon)$. Now that we’re armed with some useful power series notation, let’s get to expanding!
III. Binomial Series and Expanding the Denominator
I’ll prove it in a later post, but for now I simply state the acclaimed binomial series, a power series expansion of $(1+x)^N$ in $x$:
$$(1+x)^N = \sum_{n=0}^{\infty}\binom{N}{n}x^n = 1 + Nx + \frac{N(N-1)}{2!}x^2 + \frac{N(N-1)(N-2)}{3!}x^3 + \cdots$$
This expression is true for all complex $x$ such that $|x| < 1$ and all complex $N$. (It’s true for other values of $x$ as well, but those require caveats to their validity and we don’t need them today.) To be explicit, the $n$th term of the binomial series is given by,
$$\binom{N}{n}x^n = \frac{N(N-1)(N-2)\cdots(N-n+1)}{n!}\,x^n,$$
where the first term corresponds to $n=0$.
~Aside :: The Binomial Series~
I want to quickly mention three popular consequences of the binomial series.
1) Physicists often truncate the binomial series to first order in $x$, which yields the so-called binomial approximation,
$$(1+x)^N \approx 1 + Nx$$
This could be the most used approximation in all of physics. (We’ll sanity-check it numerically just after this aside.)
2) Any time $N$ is a nonnegative integer, the binomial series naturally truncates itself and yields the binomial theorem. The first few examples (which you can verify through multiplication!) are,
$$(1+x)^2 = 1 + 2x + x^2$$
$$(1+x)^3 = 1 + 3x + 3x^2 + x^3$$
$$(1+x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$$
You might recognize these coefficients as elements of Pascal’s Triangle--this is not a coincidence!
3) When $N=-1$ and we replace $x\rightarrow -x$, the binomial series becomes the so-called geometric series:
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$
I’ve seen this series used a lot in both directions: sometimes we approximate quantities that look like the LHS by using a truncated version of the RHS, and other times we obtain a series that can be rewritten like the RHS and subsequently expressed succinctly via the LHS.
~End Aside~
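As promised, here’s that quick numerical sanity check of the binomial approximation from point 1), sketched in Python (the exponent $N=-3/2$ is chosen because it’s the one we’re about to need):

```python
# How good is (1 + x)^N ~ 1 + N*x? Try N = -3/2, the exponent we'll need shortly.
N = -1.5
for x in (0.1, 0.01, 0.001):
    exact, approx = (1 + x) ** N, 1 + N * x
    print(f"x = {x}: exact = {exact:.6f}, 1 + N*x = {approx:.6f}, error = {abs(exact - approx):.1e}")
# Shrinking x by 10 shrinks the error by about 100: the error is the size of
# the first dropped term, N*(N - 1)/2 * x^2.
```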
We’ll use the binomial series to expand $\vec{F}_{q^\prime \rightarrow q}$’s denominator:
$$\frac{1}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}} = \frac{1}{|\vec{r}|^{3}}\left(1 - 2\epsilon\cos\theta + \epsilon^{2}\right)^{-3/2}$$
Comparing to $(1+x)^N$, we see that we should choose,
$$x = -2\epsilon\cos\theta + \epsilon^{2}, \qquad N = -\frac{3}{2}$$
To what order in $x$ do we need to expand this denominator to get all of the second order $\epsilon$ terms in the force? Note that $x^n = (-2\epsilon \cos\theta + \epsilon^2)^n$ contains terms with $\epsilon^n$ through $\epsilon^{2n}$:
which we plan to subsequently multiply by $(\hat{r}-\epsilon\hat{r}^\prime)$ to get terms containing $\epsilon^n$ through $\epsilon^{2n+1}$:
Thus, to get all of the $\mathcal{O}(\epsilon^2)$ terms of the force (as to obtain the monopole, dipole, and quadrupole terms), we need to expand to 2nd order in $x$, aka $n=2$. Higher order terms in $x$ will only yield contributions at third order and higher in $\epsilon$.
Now that we know how much we need to expand the denominator, let’s start working through the terms. Each term is coming directly from the definition of the binomial series.
The sum of these terms yields an approximation of the denominator that is good to 2nd order in $\epsilon$:
$$\left(1 - 2\epsilon\cos\theta + \epsilon^{2}\right)^{-3/2} = 1 + 3\epsilon\cos\theta + \frac{15\cos^2\theta - 3}{2}\,\epsilon^{2} + \mathcal{O}(\epsilon^3)$$
I swept all terms at $\epsilon^3$ and higher into $\mathcal{O}(\epsilon^3)$ because we’re missing other terms at that order and higher anyway (from $n\geq 3$ terms) and therefore anticipate approximation errors of that size. We haven't been precise enough to worry about $\epsilon^3$ terms.
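If you’d like to verify this expansion without trusting my algebra, here’s a short Python sketch comparing the exact denominator factor against the second-order truncation (the angle $\theta = 1$ radian is an arbitrary test value):

```python
import math

def exact(eps, theta):
    """The full denominator factor (1 - 2*eps*cos(theta) + eps^2)^(-3/2)."""
    return (1 - 2 * eps * math.cos(theta) + eps ** 2) ** (-1.5)

def second_order(eps, theta):
    """Its truncation at second order in eps, as derived above."""
    c = math.cos(theta)
    return 1 + 3 * eps * c + (15 * c ** 2 - 3) / 2 * eps ** 2

theta = 1.0   # an arbitrary test angle, in radians
for eps in (0.1, 0.05, 0.01):
    print(f"eps = {eps}: truncation error = {abs(exact(eps, theta) - second_order(eps, theta)):.2e}")
# Halving eps cuts the error by about 8, confirming the leftover is O(eps^3).
```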
IV. Substituting the Approximation into the Force
Let’s combine last section’s final result with the other $\epsilon$-dependent term from $\vec{F}_{q^\prime\rightarrow q}$. I multiply the terms out and then reorganize them according to common powers of $\epsilon$:
$$\left(1 + 3\epsilon\cos\theta + \frac{15\cos^2\theta - 3}{2}\,\epsilon^{2}\right)\left(\hat{r} - \epsilon\,\hat{r}^{\prime}\right) = \hat{r} + \epsilon\left(3\cos\theta\,\hat{r} - \hat{r}^{\prime}\right) + \epsilon^{2}\left(\frac{15\cos^2\theta - 3}{2}\,\hat{r} - 3\cos\theta\,\hat{r}^{\prime}\right) + \mathcal{O}(\epsilon^3)$$
Plugging this into the force, we obtain the multipole expansion of the electric force $\vec{F}_{q^\prime\rightarrow q}$ once and for all:
$$\vec{F}_{q^\prime\rightarrow q} = \frac{u\,q\,q^\prime}{|\vec{r}|^{2}}\left[\underbrace{\hat{r}}_{\text{monopole}} + \underbrace{\epsilon\left(3\cos\theta\,\hat{r} - \hat{r}^{\prime}\right)}_{\text{dipole}} + \underbrace{\epsilon^{2}\left(\frac{15\cos^2\theta - 3}{2}\,\hat{r} - 3\cos\theta\,\hat{r}^{\prime}\right)}_{\text{quadrupole}} + \mathcal{O}(\epsilon^3)\right]$$
I’ve labeled electric monopole, dipole, and quadrupole terms above, which were the objects of our desire. Note we could’ve extended this expansion to higher orders of $\epsilon$ if we wanted. In doing so, we would have found an octopole term ($\epsilon^3$), hexadecapole term ($\epsilon^4$), 32-pole term ($\epsilon^5$), and so-on.
Because we’ll be working with these terms individually, let’s give them symbols. I’ll label each term by the order of $\epsilon$ that appears in it:
$$\vec{F}_{q^\prime\rightarrow q} = \vec{F}^{(0)}_{q^\prime\rightarrow q} + \vec{F}^{(1)}_{q^\prime\rightarrow q} + \vec{F}^{(2)}_{q^\prime\rightarrow q} + \mathcal{O}(\epsilon^3)$$
By dividing $\vec{F}_{q^\prime\rightarrow q}$ by $q$, we obtain the multipole expansion of the electric field $\vec{E}_{q^\prime}$, which has monopole, dipole, and quadrupole terms analogous to those of the electric force:
$$\vec{E}_{q^\prime} = \frac{\vec{F}_{q^\prime\rightarrow q}}{q} = \vec{E}^{(0)}_{q^\prime} + \vec{E}^{(1)}_{q^\prime} + \vec{E}^{(2)}_{q^\prime} + \mathcal{O}(\epsilon^3)$$
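Here’s a hedged numerical sketch of this expansion in Python (the value of $u$, the charge, and both positions are arbitrary test values): it compares the exact field of a shifted point charge against successively longer multipole partial sums, using the coefficients we just derived.

```python
import numpy as np

u, qp = 1.0, 1.0                        # units constant and source charge (hypothetical)
r_src = np.array([0.1, 0.05, 0.0])      # source position r', near the origin
r_obs = np.array([3.0, 4.0, 0.0])       # distant observation point r

rp, r = np.linalg.norm(r_src), np.linalg.norm(r_obs)
rhat, rphat = r_obs / r, r_src / rp
eps = rp / r                            # the small parameter |r'|/|r|
c = rhat @ rphat                        # cos(theta)

prefac = u * qp / r ** 2
E0 = prefac * rhat                                                        # monopole
E1 = prefac * eps * (3 * c * rhat - rphat)                                # dipole
E2 = prefac * eps ** 2 * ((15 * c ** 2 - 3) / 2 * rhat - 3 * c * rphat)   # quadrupole

R = r_obs - r_src
E_exact = u * qp * R / np.linalg.norm(R) ** 3
for label, E in (("monopole", E0), ("+dipole", E0 + E1), ("+quadrupole", E0 + E1 + E2)):
    print(f"{label:12s} |error| = {np.linalg.norm(E_exact - E):.2e}")
# Each added multipole should shrink the error by roughly another factor of eps.
```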
We know now how to multipole expand a shifted point charge, but what about more complicated charge configurations? Earlier in this series, we showed we can construct the electric field of any generic charge configuration by adding the electric fields of infinitesimal point charges, courtesy of the superposition principle of electrostatics. For example, we may write the electric field $\vec{E}_{\rho^\prime}$ due to a volume charge density $\rho^\prime$ on $V^\prime$ as,
$$\vec{E}_{\rho^\prime}(\vec{r}) = u\int_{V^\prime} \rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime,$$
where each infinitesimal contribution has charge $dq^\prime = \rho^\prime(\vec{r}^{\text{ }\prime})\cdot d\tau^\prime$ and is located at some $\vec{r}^{\text{ }\prime}\in V^\prime$. The multipole expansion of the electric force required we assume our source charge was near the origin. The multidimensional equivalent of this assumption is as follows: we assume we can fit all of $\rho^\prime$ (or whatever charge configuration we’re interested in) in an origin-centric sphere of radius $a^\prime$, where $a^\prime$ is small relative to our point of observation $\vec{r}$.
Each point charge $dq^\prime$ will have its own value of the small parameter $\epsilon(\vec{r}^{\text{ }\prime}) =|\vec{r}^{\text{ }\prime}|/|\vec{r}|$, which makes for a lot of small parameters to juggle. Thankfully, by containing all of the $dq^\prime$ in a small sphere, we ensure every $|\vec{r}^{\text{ }\prime}| < a^\prime$, such that the (assumedly small) ratio $a^\prime/|\vec{r}|$ is larger than every $\epsilon(\vec{r}^{\text{ }\prime})$. This guarantees every $dq^\prime$ can be usefully multipole expanded:
We can then volume integrate these contributions (we use linearity to distribute the integral over the sum), thereby obtaining the electric multipole expansion of a volume charge density:
This process can be extended to any charge configuration (well, that fits in an origin-centric sphere anyway), so that we may discuss the electric multipoles of a generic charge configuration. We generally expect the $n$th term $\vec{E}_{CC^\prime}^{(n)}$ to generate an $(a^\prime/|\vec{r}|)^n$-sized contribution.
To conclude this week, we discuss the electric monopole term. Next week, we’ll discuss the dipole and quadrupole terms.
V. The Electric Monopole Term
The monopole terms of $\vec{F}_{q^\prime\rightarrow q}$ and $\vec{E}_{q^\prime}$ are:
$$\vec{F}^{(0)}_{q^\prime\rightarrow q}(\vec{r}) = \frac{u\,q\,q^\prime}{|\vec{r}|^{2}}\,\hat{r}, \qquad \vec{E}^{(0)}_{q^\prime}(\vec{r}) = \frac{u\,q^\prime}{|\vec{r}|^{2}}\,\hat{r}$$
The monopole contribution of a point charge $q^\prime$ at $\vec{r}^{\text{ }\prime}$ looks exactly like the point charge $q^\prime$, but shifted to the origin! From this, we immediately infer that the other terms in the multipole expansion must provide corrections that effectively shift the charge to $\vec{r}^{\text{ }\prime}$. Different choices of $\vec{r}^{\text{ }\prime}$ will require shifting different amounts, and thereby require a different dipole term, quadrupole term, etc. Because the monopole term always sends $q^\prime$ to the origin and the higher multipoles have to correct that, the higher multipoles are typically sensitive to where we place our origin.
As an extreme example, we could move our origin next to $\vec{r}$, except then $\epsilon \sim 1$ and our expansion becomes impractical! Terms containing large powers of $\epsilon$ could remain important to the calculation (they have to shift $q^\prime$ a ridiculous distance after all) and we’d be unable to justify truncating the series. However, even in this absurd case, we’d end up with the exact same monopole term! The monopole term is origin-independent. This property is sometimes referred to as translation invariance.
Why is the electric monopole translation invariant? Consider what a generic charge configuration $CC^\prime$ looks like according to $q$ as $|\vec{r}|$ grows larger and larger:
Qualitatively, the entire charge configuration looks increasingly like a point charge at the origin! As $|\vec{r}|\rightarrow\infty$, the charge configuration becomes indistinguishable from a point charge. This is simultaneously the limit when $\epsilon\rightarrow 0$ and the monopole dominates all higher-order terms. This is why the monopole term cannot resolve the position of source charges.
Let’s quantify the case of multidimensional charge configurations. As mentioned in the last section, a volume charge distribution $\rho^\prime$ on $V^\prime$ has the following monopole term:
$$\vec{E}^{(0)}_{\rho^\prime}(\vec{r}) = \int_{V^\prime} \frac{u\,\hat{r}}{|\vec{r}|^{2}}\,dq^\prime(\vec{r}^{\text{ }\prime})$$
The only $\vec{r}^{\text{ }\prime}$ dependence comes from $dq^\prime$, so we can pull everything else outside the volume integral:
$$\vec{E}^{(0)}_{\rho^\prime}(\vec{r}) = \frac{u\,\hat{r}}{|\vec{r}|^{2}}\left[\int_{V^\prime} \rho^\prime(\vec{r}^{\text{ }\prime})\,d\tau^\prime\right]$$
We call the quantity in square brackets the net charge (or total charge) $Q_{net}^\prime$ of $\rho^\prime$.
Therefore, the monopole term of $\rho^\prime$ looks exactly like a point charge of magnitude $Q_{net}^\prime$ at the origin! This makes sense: the monopole term shifts all of the $dq^\prime$ point charges to the origin, which we can then combine into a single charge via the decomposition corollary.
The monopole term defines the net charge of generic charge configurations too. Given point charges $q^\prime_{i_0}$ at $\vec{r}^{\text{ }\prime}_{i_0}$, line charges $\lambda^\prime_{i_1}$ on curves $C^\prime_{i_1}$, surface charges $\sigma^\prime_{i_2}$ on surfaces $S^\prime_{i_2}$, and volume charges $\rho^\prime_{i_3}$ on volumes $V^\prime_{i_3}$, their net charge is defined as,
$$Q^\prime_{net} = \sum_{i_0} q^\prime_{i_0} + \sum_{i_1}\int_{C^\prime_{i_1}} \lambda^\prime_{i_1}\,d\ell^\prime + \sum_{i_2}\int_{S^\prime_{i_2}} \sigma^\prime_{i_2}\,dA^\prime + \sum_{i_3}\int_{V^\prime_{i_3}} \rho^\prime_{i_3}\,d\tau^\prime,$$
and we can generically write the monopole term of $\vec{E}_{CC^\prime}(\vec{r})$ as,
$$\vec{E}^{(0)}_{CC^\prime}(\vec{r}) = \frac{u\,Q^\prime_{net}}{|\vec{r}|^{2}}\,\hat{r}.$$
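As a quick illustration, here’s a Python sketch (with a made-up configuration of four point charges; $u$ and every value below are hypothetical) checking that the monopole term alone reproduces the far field better and better as $|\vec{r}|$ grows:

```python
import numpy as np

u = 1.0   # units constant; hypothetical value
# A made-up configuration: four point charges crowded near the origin.
charges = [1.0, -0.5, 2.0, 0.25]
positions = [np.array([0.1, 0.0, 0.0]), np.array([0.0, -0.2, 0.1]),
             np.array([-0.1, 0.1, 0.0]), np.array([0.0, 0.0, 0.15])]

def E_exact(r):
    """Superpose the exact point-charge fields of the configuration."""
    total = np.zeros(3)
    for q, rp in zip(charges, positions):
        R = r - rp
        total += u * q * R / np.linalg.norm(R) ** 3
    return total

Q_net = sum(charges)
for dist in (1.0, 10.0, 100.0):
    r = np.array([dist, dist, 0.0]) / np.sqrt(2)   # observation point with |r| = dist
    E_mono = u * Q_net * r / np.linalg.norm(r) ** 3
    rel_err = np.linalg.norm(E_exact(r) - E_mono) / np.linalg.norm(E_exact(r))
    print(f"|r| = {dist:6.1f}: relative error of monopole term = {rel_err:.1e}")
# The relative error falls roughly like a'/|r|: the configuration looks
# ever more like a single point charge Q_net at the origin.
```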
Sometimes we get lucky and there’s exactly as much positive charge as negative charge in a configuration, and we end up with zero net charge. We call such configurations uncharged, net neutral, or simply neutral. As indicated in the latest formula, a neutral configuration has a vanishing monopole term.
It turns out that most bulk matter (e.g. rocks, houses, human beings, ...) falls into this category. Consequently, the monopole terms often vanish for macroscopic materials. Their electric behavior is commonly dominated by the dipole term instead. That's precisely where we’ll pick up next week!
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
~January 14th, 2017~
I drafted the second SineOfPsi post on classical electromagnetism.
Classical E&M IV: Generic Electrostatic Forces
Our goal today is to determine the force between two generic charge configurations. Last week we handled a generic source charge configuration by dissolving it into tiny point-like pieces, then adding up the contributions due to every piece. We can handle the target charge configuration identically.
This post is a stepping stone on our path to completing electrostatics. We’re getting there!
I. The Electrostatic Force between Volume Charge Densities
It helps that we already know the force a target point charge $q$ at $\vec{r}$ experiences due to an entire source charge configuration $CC^\prime$. Namely,
$$\vec{F}_{CC^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{CC^\prime}(\vec{r}),$$
where the electric field due to $CC^\prime$ is the appropriate superposition of volume, surface, line, and point charges, each providing an electric field according to the following equations:
$$\vec{E}_{\rho^\prime}(\vec{r}) = u\int_{V^\prime}\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime \qquad \vec{E}_{\sigma^\prime}(\vec{r}) = u\int_{S^\prime}\sigma^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,dA^\prime$$
$$\vec{E}_{\lambda^\prime}(\vec{r}) = u\int_{C^\prime}\lambda^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\ell^\prime \qquad \vec{E}_{q^\prime}(\vec{r}) = u\,q^\prime\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}$$
Suppose (for the sake of having a concrete example) we’re interested in the electric force between two volume charge configurations: $\rho^\prime$ occupying a volume $V^\prime$, and $\rho$ occupying a volume $V$. The electric field at $\vec{r}$ due to $\rho^\prime$ is given by $\vec{E}_{\rho^\prime}(\vec{r})$ above.
We dissolve $\rho$ into tiny point-like pieces, each possessing an infinitesimal amount of charge described by its volume charge density:
$$dq(\vec{r}) = \rho(\vec{r})\,d\tau$$
Because each $dq$ can only receive infinitesimal numbers of photons, the force on any $dq$ is also infinitesimal. We write this infinitesimal force as,
$$d\vec{F}_{\rho^\prime\rightarrow dq} = dq(\vec{r})\,\vec{E}_{\rho^\prime}(\vec{r}),$$
or, more explicitly,
$$d\vec{F}_{\rho^\prime\rightarrow dq} = dq(\vec{r})\left[u\int_{V^\prime}\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime\right]$$
Note that $\vec{r}$ labels which point-like piece of $V$ we’re talking about--it’s a variable that indexes our disintegration of the target $\rho$. Similarly, $\vec{r}^{\text{ }\prime}$ indexes the disintegration of the source $\rho^\prime$. As a result, $dq(\vec{r})$ depends only on $\vec{r}$ and is constant with respect to the $\vec{r}^{\text{ }\prime}$ integration. In other words, the disintegration of $\rho$ is independent of the disintegration of $\rho^\prime$.
We use the target $\rho$ to rewrite the force experienced by $dq(\vec{r})$ in terms of $d\tau$:
$$d\vec{F}_{\rho^\prime\rightarrow dq} = \rho(\vec{r})\left[u\int_{V^\prime}\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime\right]d\tau$$
Because the infinitesimal $d\tau$ is independent of $\vec{r}^{\text{ }\prime}$ in the same way that $dq(\vec{r})$ was independent of $\vec{r}^{\text{ }\prime}$, we may immediately integrate the $\vec{r}$-dependent function in square brackets with respect to $d\tau$.
Consequently, the electric force $\vec{F}_{\rho^\prime\rightarrow\rho}$ that a target $\rho$ experiences due to a source $\rho^\prime$ is,
$$\vec{F}_{\rho^\prime\rightarrow\rho} = u\int_{V}\int_{V^\prime}\rho(\vec{r})\,\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime\,d\tau$$
Note that this quantity has no explicit positional dependence. All positions have been integrated away! Once we’ve specified our charge distributions, the electric force between them is locked in. This might seem to contradict our expression for the force experienced by a point charge, which has an $\vec{r}$-dependence; however, specifying a point charge configuration technically requires fixing the point charge's position $\vec{r}$ (in the same way we'd specify the location and orientation of a volume charge), so there is no contradiction.
We also note that $\vec{F}_{\rho^\prime\rightarrow\rho}$ possesses a nice symmetry between $\rho$ and $\rho^\prime$. In fact, swapping primed and unprimed quantities yields the same result, but with a minus sign: $\vec{F}_{\rho^\prime\rightarrow\rho} = -\vec{F}_{\rho\rightarrow\rho^\prime}$. Even charge configurations obey Newton's Third Law. This implies that the net force $\rho$ experiences due to itself is zero: $\vec{F}_{\rho\rightarrow\rho} = -\vec{F}_{\rho\rightarrow\rho} = \vec{0}$
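To make this tangible, here’s a rough Monte Carlo sketch in Python (two hypothetical uniformly charged unit cubes; $u=1$ and the densities are arbitrary) that estimates the double volume integral and exhibits the antisymmetry numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
u = 1.0   # hypothetical units constant

def sample_cube(center, n):
    """Uniform samples from a unit cube centered at `center`."""
    return center + rng.uniform(-0.5, 0.5, size=(n, 3))

def force(src_center, rho_src, tgt_center, rho_tgt, n=20000):
    """Monte Carlo estimate of F_{src -> tgt} for two unit cubes of uniform density."""
    rp = sample_cube(src_center, n)      # source points r'
    r = sample_cube(tgt_center, n)       # target points r
    R = r - rp
    integrand = u * rho_tgt * rho_src * R / np.linalg.norm(R, axis=1, keepdims=True) ** 3
    return integrand.mean(axis=0)        # both volumes are 1, so the mean estimates the integral

c1, c2 = np.array([0.0, 0.0, 0.0]), np.array([3.0, 0.0, 0.0])
F_12 = force(c1, 1.0, c2, 2.0)           # rho' = 1 acting on rho = 2
F_21 = force(c2, 2.0, c1, 1.0)           # roles swapped
print("F_{rho' -> rho} =", F_12)
print("F_{rho -> rho'} =", F_21)         # approximately -F_12: Newton's third law
```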
II. The Electrostatic Force between Generic Charge Configurations
This procedure is readily generalizable. Given a generic source $CC^\prime$, the electric forces due to $CC^\prime$ on a volume charge $\rho$ on $V$, surface charge $\sigma$ on $S$, line charge $\lambda$ on $C$, and point charge $q$ at $\vec{r}$ are, respectively,
$$\vec{F}_{CC^\prime\rightarrow\rho} = \int_{V}\rho\,\vec{E}_{CC^\prime}\,d\tau, \quad \vec{F}_{CC^\prime\rightarrow\sigma} = \int_{S}\sigma\,\vec{E}_{CC^\prime}\,dA, \quad \vec{F}_{CC^\prime\rightarrow\lambda} = \int_{C}\lambda\,\vec{E}_{CC^\prime}\,d\ell, \quad \vec{F}_{CC^\prime\rightarrow q} = q\,\vec{E}_{CC^\prime}(\vec{r})$$
Finally, we write the force between generic charge configurations. If a charge configuration $CC$ is composed of point charges $q_{i_0}$ at $\vec{r}_{i_0}$, line charges $\lambda_{i_1}$ on curves $C_{i_1}$, surface charges $\sigma_{i_2}$ on surfaces $S_{i_2}$, and volume charges $\rho_{i_3}$ on volumes $V_{i_3}$, then superposition yields the force that a target charge configuration $CC$ experiences due to a source charge configuration $CC^\prime$:
$$\vec{F}_{CC^\prime\rightarrow CC} = \sum_{i_0}\vec{F}_{CC^\prime\rightarrow q_{i_0}} + \sum_{i_1}\vec{F}_{CC^\prime\rightarrow \lambda_{i_1}} + \sum_{i_2}\vec{F}_{CC^\prime\rightarrow \sigma_{i_2}} + \sum_{i_3}\vec{F}_{CC^\prime\rightarrow \rho_{i_3}}$$
This machinery enables us to calculate the force between any pair of charge configurations. This has important implications when combined with our arbitrary division between sources and targets. While we like to think of photons from a source being absorbed by a target, a source is simultaneously also absorbing its own photons. Similarly, the target (despite its nomenclature) is throwing photons at the source and itself. All of these exchanges generate forces.
To be explicit: we can imagine taking a volume charge density $\rho$ on $V$ and breaking it into two pieces, $V_1$ and $V_2$. The $V_1$ piece will have a volume charge density $\rho_1$ and the $V_2$ piece will have a density $\rho_2$.
Because we can regard $\rho_1$ as a source and $\rho_2$ as a target (or vice-versa), these two pieces will be exerting forces on one-another and thereby wanting to move about. To ensure the charge configuration remains static (as to stay in the realm of electrostatics), we must apply additional external forces to cancel out all of the electrostatic forces.
This perhaps illustrates how limited the realm of electrostatics truly is: nature desires a description more expansive than electrostatics, even according to the rules of electrostatics alone! Our days in electrostatics are inherently numbered.
Next week, we’ll describe special idealized charge configurations called multipoles. Soon we’ll be discussing electrostatics in materials, several Maxwell equations, and the speed of light that will provide us the bridge away from electrostatics once and for all.
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
Classical E&M III: The Electric Ambiance of Continuous Distributions
Multidimensional objects are more complex than points, but there is a method of relating the two. We can break multidimensional objects into tiny pieces that are qualitatively point-like. As the pieces become infinitesimally small, this relationship becomes exact:
In this way, multidimensional objects are translated into collections of points. We know the electric properties of point charges, and through that knowledge we'll derive the electric properties of generic charge configurations.
I. The Electric Field of a Line Charge, Infinitesimals
Consider a curve $C^\prime$ in $3$-space with total length $\ell^\prime$:
To begin, let’s assume this curve is uniformly charged.
“Charged” means it emits photons, whereas
“Uniformly” means it possesses only one type of charge (positive or negative) and each point of $C^\prime$ is equally likely to emit photons
For the time being, we restrict ourselves to positive-type charges. This means we don’t have to worry about positive and negative charges cancelling each other out. $C^\prime$ as a whole will be emitting photons at a certain rate $r_{C^\prime}$ proportional to its total charge $Q^\prime$.
Our intent--as mentioned in the intro--is to chop $C^\prime$ into tiny pieces and approximate each piece as a point charge. Thus, we partition $C^\prime$ into some large number $N^\prime$ of equal-length pieces:
Each partition piece has length $\Delta \ell_i^\prime = \ell^\prime /N^\prime$. In order to reproduce the total rate of photon emission (and because the charge is distributed uniformly over the curve), each piece needs to provide an emission rate equal to $r_{C^\prime}/N^\prime$. This implies a curious result: as $N^\prime$ gets larger, the number of pieces increases, and every piece contributes less to the total emission rate. Because emission rate is proportional to total charge (because there’s no positive/negative cancellations), any individual piece possesses less charge. In the limit that $N^\prime\rightarrow \infty$, the curve becomes equivalent to infinitely-many point charges, where each point has no charge!
Charge is evidently an inappropriate local quantity for a uniformly charged curve. This presents a roadblock on our path to continuous charge configurations. We can circumvent it by finding an alternative local quantity to work with. As usual, the trick lies in the problem: charge vanishes as the length of our partition pieces go to zero, so instead of considering charge outright let’s consider the linear charge density $\lambda_i^\prime$ per piece:
$$\lambda_i^\prime \equiv \frac{\Delta q_i^\prime}{\Delta \ell_i^\prime}$$
This has units of charge per length. For a uniformly charged curve cut into $N^\prime$ equal-length pieces, each piece has length $\Delta \ell^\prime_i = \ell^\prime /N^\prime$ and charge $\Delta q^\prime_i = Q^\prime /N^\prime$. Therefore,
$$\lambda_i^\prime = \frac{Q^\prime/N^\prime}{\ell^\prime/N^\prime} = \frac{Q^\prime}{\ell^\prime} \equiv \lambda_{C^\prime},$$
where $\lambda_{C^\prime}$ is the total linear charge density of $C^\prime$, a finite result! Linear charge density lets us rewrite the charge of each partition piece as,
$$\Delta q^\prime_i = \lambda^\prime_i\,\Delta \ell^\prime_i.$$
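A two-line numerical sketch (with hypothetical totals $Q^\prime = 2$ and $\ell^\prime = 4$) makes the point: the per-piece charge vanishes as the partition refines, while the density stays finite:

```python
# A uniformly charged curve with hypothetical totals Q' = 2 and l' = 4.
Q, ell = 2.0, 4.0
for N in (10, 1000, 100000):
    dq, dl = Q / N, ell / N     # per-piece charge and length: both vanish as N grows
    print(f"N = {N:6d}: piece charge = {dq:.1e}, piece density = {dq / dl}")
# The ratio dq/dl stays pinned at Q'/l' = 0.5, no matter how fine the partition.
```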
We found this expression by considering a uniformly-charged, positively-charged curve, but--like we did when developing the integral calculus--we can reweight the charge contribution of each piece, so that $\lambda^\prime_i$ is a function along $C^\prime$. In doing so, we can access nonuniform line charges which possess differing amounts of (positive and negative) charge along their lengths.
Specifically, as the partition becomes infinitely fine, each point $\vec{r}^{\text{ }\prime}$ along the curve can be uniquely identified with an infinitely-short piece of curve possessing a certain linear charge density. The linear charge density thereby becomes a real-valued function $\lambda^\prime(\vec{r}^{\text{ }\prime})$ along $C^\prime$.
Now is as good as any time to introduce infinitesimal quantities. As $N^\prime \rightarrow \infty$, the charge of each piece technically vanishes ($\Delta q^\prime_i\rightarrow 0$). However, we can imagine “stopping short” of infinity and--in doing so--obtain a point-by-point identification of curve pieces yet somehow without fully removing the charge of each piece. If this seems too good to be true, it’s because it is. We must be careful to finish off that $N^\prime\rightarrow \infty$ limit before calling our calculation complete.
Warning: Infinitesimals are used widely throughout physics, oftentimes when juggling multiple limits. For example, sometimes we’ll break time into infinitesimal intervals while simultaneously breaking shapes into infinitesimal pieces. These are technically two different limits, so mixing them can cause trouble (recall how cautious we had to be when manipulating the double-limit in the Dirac delta post). Exercise caution when encountering infinitesimals in the wild.
It’s in this spirit that we talk about the infinitesimal charge $dq^\prime$ and the infinitesimal length $d\ell^\prime$ of each curve piece, which are related through the line charge density function $\lambda^\prime(\vec{r}^{\text{ }\prime})$ along $C^\prime$:
$$dq^\prime(\vec{r}^{\text{ }\prime}) = \lambda^\prime(\vec{r}^{\text{ }\prime})\,d\ell^\prime$$
Hence, the infinitesimal limit associates every point $\vec{r}^{\text{ }\prime}$ with a point charge of strength $dq^\prime(\vec{r}^{\text{ }\prime})$. Of course, point charges produce electric fields, and $dq^\prime(\vec{r}^{\text{ }\prime})$ is no exception:
$$\vec{E}_{dq^\prime}(\vec{r}) = u\,dq^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}$$
Because the charge is infinitesimally small, the electric field is also infinitesimally small. This encourages us to write the electric field with the lowercase $d$ we associated with other infinitesimals, yielding the label $d\vec{E}$. We also change the subscript on $d\vec{E}$ to reflect our intent to add up all of the electric field contributions along the curve:
$$d\vec{E}_{C^\prime}(\vec{r}) = u\,dq^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}$$
We can rewrite this as a line integral contribution via the line charge density. In this case, an individual contribution becomes,
$$d\vec{E}_{C^\prime}(\vec{r}) = u\,\lambda^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\ell^\prime$$
Note that this is a vector-valued object. In our integral calculus series, we discussed integrals of real-valued functions, but not vector-valued functions. How do we proceed?
II. Integrating Vector-Valued Functions
It turns out integration of vector-valued functions follows without much trouble if we assume that linearity of integrals plays nice with $3$-space vector operations.
Particularly, any element of $3$-space can be uniquely associated with three real numbers, called rectangular coordinates:
$$\vec{r} \equiv (x, y, z) = x\,\hat{x} + y\,\hat{y} + z\,\hat{z}$$
These coordinates are useful because they allow us to write vector addition and scalar multiplication with ease. Given two $3$-vectors $\vec{r}_1\equiv (x_1,y_1,z_1)$ and $\vec{r}_2\equiv (x_2,y_2,z_2)$, we can use vector addition to build a new $3$-vector labeled $\vec{r}_1+\vec{r}_2$, defined as,
$$\vec{r}_1 + \vec{r}_2 \equiv (x_1 + x_2,\; y_1 + y_2,\; z_1 + z_2)$$
Furthermore, given a real number $\alpha$, we can use scalar multiplication to transform $\vec{r}=(x,y,z)$ into a $3$-vector $\alpha\vec{r}$ defined as,
$$\alpha\,\vec{r} \equiv (\alpha x,\; \alpha y,\; \alpha z)$$
We will assume that integrals behave nicely with these properties, and treat the unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$ as constants. (This is true of our rectangular coordinates; there exist other coordinate systems of $3$-space which aren’t so lucky.)
Given a function $\vec{f}$ that maps $3$-vectors to $3$-vectors, we may decompose $\vec{f}$ into components,
$$\vec{f}(\vec{r}) = f_x(\vec{r})\,\hat{x} + f_y(\vec{r})\,\hat{y} + f_z(\vec{r})\,\hat{z}$$
Then we define the integral over a vector-valued function $\vec{f}$ as a component-by-component operation, like so:
$$\int_V \vec{f}\,d\tau = \hat{x}\int_V f_x\,d\tau + \hat{y}\int_V f_y\,d\tau + \hat{z}\int_V f_z\,d\tau$$
Although I’ve written this as a volume integral, the sentiment holds equally true for surface integrals and line integrals.
With this definition in mind, we can now add up the electric field contributions due to all the point charges along $C^\prime$, so that the electric field at $\vec{r}$ due to a line charge $\lambda^\prime$ equals,
$$\vec{E}_{C^\prime}(\vec{r}) = u\int_{C^\prime}\lambda^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\ell^\prime$$
We then find the force experienced by a target charge $q$ at $\vec{r}$ due to $\lambda^\prime$ by multiplying $\vec{E}_{C^\prime}(\vec{r})$ by $q$, so that we find,
$$\vec{F}_{\lambda^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{C^\prime}(\vec{r}) = u\,q\int_{C^\prime}\lambda^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\ell^\prime$$
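As a concrete check, here’s a Python sketch (with $u=1$ and a hypothetical uniformly charged ring) that evaluates this line integral numerically and compares it against the standard on-axis result $E_z = u\,Q\,z/(a^2+z^2)^{3/2}$:

```python
import numpy as np

u, Q, a, z = 1.0, 1.0, 1.0, 2.0       # hypothetical values: ring radius a, total charge Q
N = 1000                               # partition pieces along the ring
lam = Q / (2 * np.pi * a)              # uniform linear charge density

phi = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
src = np.stack([a * np.cos(phi), a * np.sin(phi), np.zeros(N)], axis=1)  # points r'
dl = 2 * np.pi * a / N                 # arc length of each piece
r_obs = np.array([0.0, 0.0, z])        # on-axis observation point

R = r_obs - src
E = (u * lam * dl * R / np.linalg.norm(R, axis=1, keepdims=True) ** 3).sum(axis=0)
print("numerical E:", E)               # x and y components cancel by symmetry
print("analytic E_z:", u * Q * z / (a ** 2 + z ** 2) ** 1.5)
```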
III. The Electric Field of a Surface Charge
Let’s move on to the equivalent analysis of a surface charge. Again, by evenly distributing a finite amount of positive charge $Q^\prime$ over a surface $S^\prime$ we ensure that any point of $S^\prime$ contains only infinitesimal amounts of charge, thereby eliminating charge as a useful local metric of electric behavior.
Instead, a more appropriate local measure is the surface charge density $\sigma^\prime$. Its definition proceeds just like the line charge density: we divide $S^\prime$ into infinitely-many infinitely-small two-dimensional regions, so that each point $\vec{r}^{\text{ }\prime}$ on $S^\prime$ is uniquely identified with its own infinitesimal region. If the infinitesimal piece at $\vec{r}^{\text{ }\prime}$ has area $dA^\prime$ and charge $dq^\prime$, then the surface charge density $\sigma^\prime$ at $\vec{r}^{\text{ }\prime}$ is defined as,
$$\sigma^\prime(\vec{r}^{\text{ }\prime}) \equiv \frac{dq^\prime}{dA^\prime},$$
such that it has units of charge per area. This allows us to transform the charged surface $S^\prime$ into point charges, each contributing an electric field amount equal to,
$$d\vec{E}_{S^\prime}(\vec{r}) = u\,\sigma^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,dA^\prime,$$
which combine to yield the electric field at $\vec{r}$ due to a surface charge $\sigma^\prime$:
$$\vec{E}_{S^\prime}(\vec{r}) = u\int_{S^\prime}\sigma^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,dA^\prime,$$
as well as the force experienced by a target charge $q$ at $\vec{r}$ due to $\sigma^\prime$:
$$\vec{F}_{\sigma^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{S^\prime}(\vec{r}).$$
IV. The Electric Field of a Volume Charge
Finally, we distribute charge over a volume $V^\prime$. The appropriate local measure of charge is volume charge density, labelled $\rho^\prime(\vec{r}^{\text{ }\prime})$. To define volume charge density, we break $V^\prime$ into infinitesimals, each possessing some volume $d\tau^\prime$ and some charge $dq^\prime$. Subsequently, $\rho^\prime(\vec{r}^{\text{ }\prime})$ is defined as the multiplicative factor between the two:
$$\rho^\prime(\vec{r}^{\text{ }\prime}) \equiv \frac{dq^\prime}{d\tau^\prime}, \qquad\text{i.e.}\qquad dq^\prime = \rho^\prime(\vec{r}^{\text{ }\prime})\,d\tau^\prime.$$
Volume charge density has units of charge per volume. With this definition in place, each point charge contributes some infinitesimal amount of electric field,
$$d\vec{E}_{V^\prime}(\vec{r}) = u\,\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime,$$
which collectively generate the electric field at $\vec{r}$ due to a volume charge $\rho^\prime$:
$$\vec{E}_{V^\prime}(\vec{r}) = u\int_{V^\prime}\rho^\prime(\vec{r}^{\text{ }\prime})\,\frac{\vec{r}-\vec{r}^{\text{ }\prime}}{|\vec{r}-\vec{r}^{\text{ }\prime}|^{3}}\,d\tau^\prime,$$
and enable us to write the force experienced by a target charge $q$ at $\vec{r}$ due to $\rho^\prime$:
$$\vec{F}_{\rho^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{V^\prime}(\vec{r}).$$
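Here’s a hedged Monte Carlo sketch in Python (a hypothetical uniform ball with $u = \rho^\prime = a = 1$, all arbitrary) that estimates this volume integral and recovers a familiar fact: outside a uniformly charged ball, the field matches that of a point charge carrying the same total charge.

```python
import numpy as np

rng = np.random.default_rng(1)
u, rho, a = 1.0, 1.0, 1.0              # hypothetical uniform ball: density rho, radius a

# Rejection-sample points uniformly inside the ball.
pts = rng.uniform(-a, a, size=(200000, 3))
pts = pts[np.linalg.norm(pts, axis=1) <= a]
V_ball = 4 / 3 * np.pi * a ** 3

r_obs = np.array([0.0, 0.0, 3.0])      # observation point outside the ball
R = r_obs - pts
E = u * rho * V_ball * (R / np.linalg.norm(R, axis=1, keepdims=True) ** 3).mean(axis=0)

Q_net = rho * V_ball
E_point = u * Q_net * r_obs / np.linalg.norm(r_obs) ** 3
print("ball field:        ", E)
print("point-charge field:", E_point)  # the two agree outside the ball
```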
V. Combining Charge Densities into Generic Configurations
A fully generic charge density may be a combination of all of these objects. Suppose we have a charge configuration $CC^\prime$ made of point charges $q_{i_0}^\prime$ at $\vec{r}^{\text{ }\prime}_{i_0}$, line charges $\lambda^\prime_{i_1}$ on curves $C^\prime_{i_1}$, surface charges $\sigma^\prime_{i_2}$ on surfaces $S^\prime_{i_2}$, and volume charges $\rho^\prime_{i_3}$ on volumes $V^\prime_{i_3}$. The electric field generated by $CC^\prime$ must be, by the principle of superposition,
$$\vec{E}_{CC^\prime}(\vec{r}) = \sum_{i_0}\vec{E}_{q^\prime_{i_0}}(\vec{r}) + \sum_{i_1}\vec{E}_{C^\prime_{i_1}}(\vec{r}) + \sum_{i_2}\vec{E}_{S^\prime_{i_2}}(\vec{r}) + \sum_{i_3}\vec{E}_{V^\prime_{i_3}}(\vec{r})$$
If we place a point charge $q$ at a position $\vec{r}$, it will experience a force due to $CC^\prime$ equal to,
$$\vec{F}_{CC^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{CC^\prime}(\vec{r})$$
There we go: we now know how to calculate the electric properties of any distribution of charges throughout $3$-space. It’s worth noting that if we wanted, we could utilize Dirac deltas to express all charge densities as volume charge densities. Given a volume $V^\prime$ that encompasses a point charge $q^\prime$ at $\vec{r}^{\text{ }\prime}$, a line charge density $\lambda^\prime$ on $C^\prime$, and a surface charge density $\sigma^\prime$ on $S^\prime$, physicists will sometimes write,
Note that the units work out appropriately for these to be volume charge densities. However, these expressions are technically incorrect because Dirac deltas may only exist within integrals. Only in that context may these expressions be utilized without jeopardizing mathematical consistency.
And so, we’ve generalized the “source” aspect of electrostatics. Next week, we generalize our target charges and calculate the force between two generic charge configurations. See you then!
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
Classical E&M II: Superposition + Decomposition
In our first electromagnetism post, we established the behavior of stationary point charges. Now armed with the integral calculus, we’ll build spatially-extended charged objects known as charge configurations.
I. Charge Configurations; Target and Source Charges
We previously motivated Coulomb's Law, the force law between a pair of point charges in $3$-space. Namely, suppose we have a point charge $q^\prime$ at $\vec{r}^{\text{ }\prime}$ and a point charge $q$ at $\vec{r}$, so that they are spatially separated by $\vec{R}\equiv \vec{r}-\vec{r}^{\text{ }\prime}$. Then the target charge $q$ will experience a force $\vec{F}_{q^\prime\rightarrow q}$ as a result of the source charge $q^\prime$, and that force equals,
$$\vec{F}_{q^\prime\rightarrow q}(\vec{r}) = u\,q\,q^\prime\,\frac{\vec{R}}{|\vec{R}|^{3}} = u\,q\,q^\prime\,\frac{\hat{R}}{|\vec{R}|^{2}}.$$
Recall that $u$ is some constant that ensures we end up with units of force. I’m choosing to label source charges with primed symbols and target charges with unprimed symbols, enabling the mnemonic “primed quantities produce effects.”
Of course, which charge we call a source charge versus a target charge is a matter of perspective because really each charge produces a photon cloud that the other interacts with. The “source” versus “target” nomenclature only makes sense when we’re calculating something with respect to a specific charge’s experience.
While point charges are great, they’re limited in their usefulness. We’d like to generalize electric properties to more complicated arrangements of charge. A generic distribution of charge throughout space is called a charge configuration.
There’s no need to jump straight to the most generic charge configurations; let’s start with simpler examples before we fully extend our machinery. The simplest charge configuration (beyond empty space or a single point charge) is a finite collection of point charges:
Specifically, suppose we have $N$ point charges: $q_1^\prime$ at $\vec{r}^{\text{ }\prime}_1$, $q_2^\prime$ at $\vec{r}^{\text{ }\prime}_2$, and so-on. As a shorthand, we’ll call this entire configuration $Q^\prime$, so that $Q^\prime = \{q_1^\prime,\cdots,q^\prime_N\}$. Next, we introduce a target charge $q$ at $\vec{r}$:
Note that the separation from $q$ is different for each charge in $Q^\prime$, so we need to index their separation vectors: label the separation vector from $q^\prime_i$ to $q$ as $\vec{R}_i$, such that $\vec{R}_i \equiv \vec{r}-\vec{r}^{\text{ }\prime}_i$.
We want to answer the following question: what force $\vec{F}_{Q^\prime\rightarrow q}(\vec{r})$ does the charge configuration $Q^\prime$ exert on the point charge $q$?
II. The Superposition Principle and Electric Fields
The key to determining $\vec{F}_{Q^\prime\rightarrow q}(\vec{r})$ lies in the antisocial behavior of photons. By definition, photons only interact with charged objects, and photons are uncharged. Photons pass right through each other!
For simplicity, suppose $Q^\prime$ consists of only two source charges, $q_1^\prime$ and $q_2^\prime$. Both source charges possess their own photon clouds, but these clouds don’t mind sharing the same space because photons don’t interact with each other. So wherever we place our target charge, there’ll be photons originating from both charges for it to absorb. But because those photons are coming uninterrupted from their sources, their combined effect is the same as the effects due to the individual charges added together. This leads us to the superposition principle of electrostatics.
The superposition principle mathematically states that the net force experienced by a charge $q$ due to a charge configuration $Q^\prime=\{q_i^\prime\}$ equals the sum of the forces that $q$ would experience due to each charge $q_i^\prime$ individually. Notationally, we write
$$\vec{F}_{Q^\prime\rightarrow q}(\vec{r}) = \sum_{i=1}^{N}\vec{F}_{q_i^\prime\rightarrow q}(\vec{r}).$$
But we know how to write the force between two point charges, so we may express this more explicitly as,
$$\vec{F}_{Q^\prime\rightarrow q}(\vec{r}) = \sum_{i=1}^{N} u\,q\,q_i^\prime\,\frac{\vec{R}_i}{|\vec{R}_i|^{3}}.$$
Because the target charge $q$ is independent of the source charge index $i$, we can factor it out of the sum.
This factorization has a physical interpretation that goes all the way back to charge being the capacity for an object to absorb or emit photons. When we place $q$ in the presence of the configuration $Q^\prime$, it’s going to absorb photons from $Q^\prime$ at a rate proportional to $q$. Meanwhile, the contribution in square brackets tells us about the photons $Q^\prime$ is generating, and is independent of $q$. In this way, we disentangle the force into a purely $q$-dependent factor and a purely $Q^\prime$-dependent factor, such that,
$$\vec{F}_{Q^\prime\rightarrow q}(\vec{r}) = q\left[u\sum_{i=1}^{N} q_i^\prime\,\frac{\vec{R}_i}{|\vec{R}_i|^{3}}\right].$$
The target charge factor is the familiar electric charge of $q$, while the source charge factor is what we call the electric field $\vec{E}_{Q^\prime}$ due to $Q^\prime$:
$$\vec{E}_{Q^\prime}(\vec{r}) \equiv u\sum_{i=1}^{N} q_i^\prime\,\frac{\vec{R}_i}{|\vec{R}_i|^{3}}.$$
We sometimes refer to an electric field as an $\vec{E}$-field for short. Because the $\vec{E}$-field is related to the density of source photons and that density varies throughout space, its value depends on the position $\vec{r}$ of the target charge. From its definition, we note that
$$\vec{F}_{Q^\prime\rightarrow q}(\vec{r}) = q\,\vec{E}_{Q^\prime}(\vec{r}).$$
To better facilitate building an intuition about the electric field, let’s focus on the electric field of a single point charge $q^\prime$, labeled $\vec{E}_{q^\prime}$ and equal to,
$$\vec{E}_{q^\prime}(\vec{r}) = u\,q^\prime\,\frac{\vec{R}}{|\vec{R}|^{3}}, \qquad \vec{R} \equiv \vec{r}-\vec{r}^{\text{ }\prime}.$$
We first note that the electric field will point away from a positive source charge and inward for a negative source charge:
This is precisely the direction in which a positive target charge $q$ would experience a force!
In general, the $\vec{E}$-field generated by a charge distribution points in the direction that a positive point charge would experience a force.
Meanwhile, negative charges are pushed against the electric field.
If we imagine the electric field maps the flow of a fluid throughout space, then positive charges are pushed down the electric field flow, while negative charges are pushed up the electric field flow.
III. Electric Field versus Photon Cloud Density
For a single particle, the electric force and field strengths are proportional to the photon cloud density. This is not true of the electric field created by multiple charges.
As an extreme example, consider placing a positive target charge $q$ halfway between two positive source charges $q^\prime_1$ and $q^\prime_2$ of equal charge strength: $q^\prime = q^\prime_1 = q^\prime_2$. Call this halfway point $\vec{r}=\vec{r}_{1/2}$.
The number of photons at $\vec{r}=\vec{r}_{1/2}$ is double the number of photons due to either $q^\prime_1$ or $q^\prime_2$ alone. Simultaneously, the electric field $\vec{E}_{q_1^\prime,q_2^\prime}$ at $\vec{r}_{ 1/2 }$ due to this configuration equals
$$\vec{E}_{q_1^\prime,q_2^\prime}(\vec{r}_{1/2}) = u\,q^\prime\,\frac{\vec{R}_1}{|\vec{R}_1|^{3}} + u\,q^\prime\,\frac{\vec{R}_2}{|\vec{R}_2|^{3}}.$$
We can simplify this expression given the available geometric information. Because $\vec{r}_{ 1/2 }$ lies exactly halfway between $\vec{r}^{\text{ }\prime}_1$ and $\vec{r}^{\text{ }\prime}_2$ (see the illustration), $\vec{R}_1$ is exactly opposite $\vec{R}_2$:
$$\vec{R}_1 = -\vec{R}_2.$$
Hence, the net electric field at $\vec{r}=\vec{r}_{ 1/2 }$ is,
$$\vec{E}_{q_1^\prime,q_2^\prime}(\vec{r}_{1/2}) = u\,q^\prime\,\frac{\vec{R}_1 + \vec{R}_2}{|\vec{R}_1|^{3}} = \vec{0}.$$
Despite the existence of many photons at this point, the electric field vanishes! Following this through to the force, we find the force also vanishes:
$$\vec{F}_{Q^\prime\rightarrow q}(\vec{r}_{1/2}) = q\,\vec{E}_{q_1^\prime,q_2^\prime}(\vec{r}_{1/2}) = \vec{0}.$$
We interpret this physically as a sort of stalemate between the charges: photons from $q_1^\prime$ are trying to push $q$ towards $q_2^\prime$, but photons from $q_2^\prime$ are pushing $q$ just as hard towards $q_1^\prime$. We’re forced to conclude that electric field strength is not correlated with photon cloud density for generic charge configurations.
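This stalemate is easy to reproduce numerically; here’s a minimal Python sketch (with $u$ and the charges set to arbitrary values of $1$):

```python
import numpy as np

u, qp, q = 1.0, 1.0, 1.0    # hypothetical values for the constant and charges
r1, r2 = np.array([-1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])   # the two sources

def E_point(r, rp, q_src):
    """Field at r of a point charge q_src sitting at rp."""
    R = r - rp
    return u * q_src * R / np.linalg.norm(R) ** 3

midpoint = (r1 + r2) / 2
E_total = E_point(midpoint, r1, qp) + E_point(midpoint, r2, qp)
print("E at midpoint:", E_total)        # [0. 0. 0.] -- exact cancellation
print("F on q:       ", q * E_total)    # the force vanishes too
```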
IV. The Decomposition Corollary
The mathematics of classical E&M allow us to perform a trick: if we so desire, we can break a single source charge $q^\prime$ at a point $\vec{r}^{\text{ }\prime}$ into many source charges $q_i^\prime$ at $\vec{r}^{\text{ }\prime}$ so long as we’re careful to retain the same total amount of charge:
$$q^\prime = \sum_i q_i^\prime.$$
Because these all live at the same point of space, their separation vectors $\vec{R}_i$ are identical ($\vec{R}_i = \vec{R} = \vec{r}-\vec{r}^{\text{ }\prime} $) and the net electric field due to them equals,
$$u\sum_i q_i^\prime\,\frac{\vec{R}}{|\vec{R}|^{3}} = u\,q^\prime\,\frac{\vec{R}}{|\vec{R}|^{3}},$$
which is precisely the electric field of the original source charge $q^\prime$! In this sense, charge is locally additive.
This is another consequence of photons being unable to see each other: given a charge emitting some number of photons, we may partition those photons into several groups. We can then allocate each group an appropriate fraction of charge and thereby exactly replicate the effects of that group. We can even (mathematically) pull photons out of thin air so long as we’re careful to introduce additional photons that exactly cancel their effects. From the perspective of the charges that are producing these photons, this is automatically ensured if our decomposition has the same total charge as our original source charge.
This result is essentially an additional facet of the superposition principle. We’ll refer to this property as the decomposition corollary:
The decomposition corollary states that a target charge will be identically affected by any two source charge configurations that have point-for-point identical total charges.
The decomposition corollary allows us to (locally) reorganize charges at our convenience.
It’s important to note that we can apply the decomposition corollary to empty space. Because it neither absorbs nor emits photons, we can think of empty space at a point $\vec{r}^{\text{ }\prime}$ as a charge-without-charge, $q^\prime_{empty} =0$. As such, empty space cannot exert a force on target charges. This effect is indistinguishable from instead having a positive charge $+q^\prime$ and a negative charge $-q^\prime$ both at $\vec{r}^{\text{ }\prime}$. Because any forces that $+q^\prime$ would cause will be exactly cancelled by the forces caused by $-q^\prime$, it’s equivalent to empty space as far as target charges are concerned!
This concludes our first generalization towards the full classical electromagnetic theory. Today we handled how a target point charge $q$ is pushed and pulled by a collection of source point charges $Q^\prime =\{q_i^\prime\}$. Now that we know how to handle forces generated by generic zero-dimensional charge configurations, we’ll expand to higher dimensions. Next time on SineOfPsi: we’re looking at line charges, surface charges, and volume charges.
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
Calculus III: the Dirac Delta, a Bridge Between Dimensions
Today, I'm going to teach you a magic trick. This trick connects two very different tools from two very different worlds: the area integral of two-dimensional regions and the line integral of one-dimensional curves. The time has come to join these disparate items under a common framework.
Last week, we described how to find the length of a curve using a line integral. Today’s trick tells us how to find the length of a curve using a surface integral instead.
To accomplish this, we're going to need something unlike anything we've seen before: an ethereal mathematical object called the Dirac delta distribution, which possesses the ability to bridge dimensions.
Come one, come all! The show is about to begin...
I. The Dimensional Bridge Equation: Measuring Length Using Area
In order to discuss the length of a curve, we're going to need a curve to work with. Let's call it $C$:
Like any other curve, $C$ is infinitesimally thin. However, suppose we thicken $C$ so that we obtain a similarly-shaped region with a finite (but small) width $W$:
Let's call this region $\mathbf{C}_W$. Despite their dimensional differences, $C$ and $\mathbf{C}_W$ are aesthetically similar. We expect that we should somehow "recover" $C$ in the limit that $\mathbf{C}_W$ becomes infinitesimally thin, aka $W\rightarrow 0$. To maintain this aesthetic similarity, let's partition $C$ and $\mathbf{C}_W$ in analogous fashions:
The finer we make our partition along the length of $C$, the more its partition pieces look like line segments. Simultaneously, if we keep the width of $\mathbf{C}_W$ small, the partition pieces of $\mathbf{C}_W$ look increasingly like rectangles. Let’s pursue this by focusing on one piece of $C$ and the corresponding piece of $\mathbf{C}_W$:
If $C$'s piece has a length approximately equal to the line segment length $\Delta \ell_i$, then $\mathbf{C}_W$'s piece will have an area approximately equal to $\Delta A_i \approx W\cdot \Delta \ell_i$. By combining the contributions of every partition piece in this way, we discover an approximate relationship between lengths and areas:
$$\text{Area}(\mathbf{C}_W) \approx \sum_{i=1}^{N}\Delta A_i \approx \sum_{i=1}^{N} W\cdot\Delta\ell_i \approx W\cdot\text{Length}(C).$$
This relationship becomes exact when our partition is infinitely fine ($N\rightarrow\infty$) and $\mathbf{C}_W$ is infinitely thin ($W\rightarrow 0$), in which case we may write,
$$\lim_{W\rightarrow 0}\,\lim_{N\rightarrow\infty}\sum_{i=1}^{N}\Delta A_i = \lim_{W\rightarrow 0} W\cdot\text{Length}(C) = 0.$$
Now, while this is a true result, it's not very useful to us. The right-hand side always vanishes, so this equation simply says "the area of an infinitely-thin region equals zero." We'll find a more useful expression if we move all of the $W$-dependent information to one side before taking the limit. That is, let's take the limit as $N\rightarrow \infty$ and $W\rightarrow 0$ of the following expression instead:
$$\frac{1}{W}\sum_{i=1}^{N}\Delta A_i \approx \text{Length}(C).$$
Because the length of $C$ is entirely independent of the width of $\mathbf{C}_W$, it's unaffected as $W$ shrinks to zero. Therefore, taking the limits of both sides gives us what I'll call the Dimensional Bridge equation:
$$\text{Length}(C) = \lim_{W\rightarrow 0}\,\lim_{N\rightarrow\infty}\,\frac{1}{W}\sum_{i=1}^{N}\Delta A_i = \lim_{W\rightarrow 0}\frac{\text{Area}(\mathbf{C}_W)}{W}.$$
This is an awesome result! The Dimensional Bridge equation is an exact relationship between one-dimensional and two-dimensional objects.
Although we used a very specific partition to derive the Dimensional Bridge, all of the quantities appearing in the equation are partition-independent. As we continue to work with the Dimensional Bridge, we'd find our current partition cumbersome; therefore, let's develop a better partition that also simplifies the double-limit ($W\rightarrow 0$ and $N\rightarrow \infty$) buried in the right-hand side of the equation.
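The Dimensional Bridge is easy to test numerically. Here's a Python sketch (the unit circle is a hypothetical test curve with known length $2\pi$; the lattice sizes and widths are arbitrary stages) that approximates the area of the thickened curve on a lattice and divides by $W$. Notice that each stage refines the lattice faster than it shrinks the width, a point we'll return to shortly:

```python
import numpy as np

# Hypothetical test curve C: the unit circle, with known length 2*pi.
# Stage n thickens C to width W and lays an N_side x N_side lattice over
# a region R = [-1.5, 1.5]^2 that contains every thickened curve.
for n, (N_side, W) in enumerate([(100, 0.4), (400, 0.1), (1600, 0.025)]):
    xs = np.linspace(-1.5, 1.5, N_side)
    X, Y = np.meshgrid(xs, xs)
    cell_area = (3.0 / N_side) ** 2
    in_CW = np.abs(np.hypot(X, Y) - 1.0) < W / 2   # characteristic function of C_W
    area = in_CW.sum() * cell_area                  # lattice estimate of Area(C_W)
    print(f"stage {n}: W = {W:6.3f}, Area/W = {area / W:.4f}")
print("true length:", 2 * np.pi)                    # 6.2832...
```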
II. Switching to a Lattice Partition
Integration relies on a sequence of increasingly-accurate approximations of some desired quantity. We typically increase accuracy by gradually refining the partition of a relevant shape. Let's refer to each partition in this refinement sequence as a stage and let an index $n$ label each stage. We'll let $n=0$ correspond to the first partition, then let $n=1$ correspond to its subsequent refinement, and so-on.
Next, label the number of partition pieces in the $n$th stage as $N^{(n)}$. For example, that means our initial partition will divide the relevant shape into $N^{(0)}$ pieces, then the subsequent partition will utilize $N^{(1)}$ pieces, and so-on. Because it's a refinement, we'd make sure $N^{(n)}$ increases as $n$ increases. When $n\rightarrow \infty$, we obtain $N^{(n)}\rightarrow \infty$, just like before.
Now instead of dividing $\mathbf{C}_W$ into approximate rectangles along its length, let's surround $\mathbf{C}_W$ in a region $R$ and let's divide $R$ into increasingly-refined square lattices. (Note: this is the partition we used when we first defined the surface integral.)
We've seen that the Dimensional Bridge equation involves two limits: $W\rightarrow 0$ and $N\rightarrow \infty$. The $W\rightarrow 0$ limit leads to a series of progressively-thinner curves:
Meanwhile, the $N\rightarrow \infty$ limit yields a series of progressively-finer lattices:
We can combine these two procedures in a single grid, where moving downward decreases $W$ and moving rightward increases $N$:
When we calculate the right-hand side of the Dimensional Bridge equation, we ultimately want to travel down-and-right through the grid. We could do this by first heading forever rightward (in the $N \rightarrow \infty$ direction), and then heading forever downward (in the $W\rightarrow 0$ direction); this is the order implied by the equation.
Alternatively, if we're careful, we can combine the two limits into a single limit by traveling diagonally through the grid. This would mean that as $n$ increases we're simultaneous refining the partition AND thinning the shape whose area we're approximating. I've indicated such a sequence of stages in orange:
There are ways to mess this up, so we have to be very careful. As an extreme example, suppose we decide to take $W$ to zero first (so that we're travelling down the leftmost column of the grid). In this case, we'd be using the coarsest partition for every approximation. As $\mathbf{C}_W$ grows narrower, the number of squares we can fit in $\mathbf{C}_W$ will decrease until $\mathbf{C}_W$ contains no squares at all. In that case, even if we take $N\rightarrow \infty$ afterwards, the Dimensional Bridge equation would seem to imply our curve has no length! In other words, we'll break our math if the width $W$ decreases too quickly.
Therefore, we want to refine the partition quicker than we shrink $\mathbf{C}_W$. We might ensure this by choosing our widths at each stage so that $\mathbf{C}_W$ contains an increasing number of squares as $n$ increases. However we manage to do it, let's label the width we choose for the $n$th stage as $W^{(n)}$ and label the corresponding thickened curve as $C_{W}^{(n)}$.
From here on, we assume we've successfully collapsed the two limits $W\rightarrow 0$ and $N\rightarrow \infty$ into a single limit $n\rightarrow \infty$. Combining the limits like this allows us to perform a powerful reorganization of the Dimensional Bridge equation. In particular, we'll use it to augment how we write the area of $\mathbf{C}_W$. To streamline this manipulation, let's introduce the characteristic function of a set in the plane.
III. Characteristic Weight: Measuring the Area of Subsets
Given a set $A$ of points in the real plane, we define its characteristic function $\chi_A$ as a function on the real plane that equals $1$ for points in $A$ and $0$ for points outside of $A$:
$$\chi_A(x,y) = \begin{cases} 1 & (x,y)\in A \\ 0 & (x,y)\notin A \end{cases}$$
When $A$ is a two-dimensional region, $\chi_A$ sort of mimics a plateau sitting directly above $A$.
By construction, the surface integral of $\chi_A$ over $A$ gives us the area of $A$:
$$\int_A \chi_A\,dA = \text{Area}(A).$$
Now, suppose our two-dimensional region $A$ is contained within a larger two-dimensional region $B$, and consider integrating $\chi_A$ over $B$. Because $\chi_A$ gives zero weight to points outside of $A$, the pieces of $B$ that aren't also in $A$ contribute nothing to the integral, such that we end up with the area of $A$ again:
$$\int_B \chi_A\,dA = \text{Area}(A).$$
But what if instead $B$ contains only some of $A$? In that case, we'll only add up those pieces of $A$ that are contained in $B$, aka those pieces in the intersection of $A$ and $B$, denoted $A\cap B$:
$$\int_B \chi_A\,dA = \text{Area}(A\cap B).$$
We introduce the characteristic function because it lets us rewrite the area of $\mathbf{C}_W$. Particularly, at the $n$th stage of our partition procedure we're interested in the area of $\mathbf{C}_W^{(n)}$, which has width $W^{(n)}$ and will be approximated via a partition of $N^{(n)}$ squares. Like any other region, $\mathbf{C}_W^{(n)}$ has an associated characteristic function, which we’ll label $\chi^{(n)}_{\mathbf{C}_W}$. We may approximate the area of $\mathbf{C}_W^{(n)}$ via its characteristic function:
$$\text{Area}\left(\mathbf{C}^{(n)}_W\right) \approx \int_R \chi^{(n)}_{\mathbf{C}_W}\,dA,$$
where $R$ can be any region that contains $\mathbf{C}^{(0)}_W$ (containing $\mathbf{C}^{(0)}_W$ ensures $R$ contains all of the thickened curves $\mathbf{C}^{(n)}_W$ because $W$ only gets smaller). Let's then apply this expression to the Dimensional Bridge equation, which allows us to write the length of $C$ as,
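$$\text{length}(C) \;=\; \lim_{n\rightarrow\infty} \frac{1}{W^{(n)}} \iint_R \chi^{(n)}_{\mathbf{C}_W}\, dA \;=\; \lim_{n\rightarrow\infty} \iint_R \frac{\chi^{(n)}_{\mathbf{C}_W}}{W^{(n)}}\, dA.$$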
And just like that, we did it! We've managed to express a curve's length as a surface integral! This is the trick I alluded to in the intro. This trick (as well as its generalizations) is extremely powerful, and is used extensively throughout physics. Of course, like any other magic trick, there's more going on here than meets the eye.
IV. The Dirac Delta Distribution
To see what's up, recall how we defined the surface integral of a function $f$ over a region $R$, and compare it to our latest result:
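Side by side, with $(x_i,y_i)$ a sample point in the $i$th partition square of area $\Delta A_i$, the two expressions read,
$$\iint_R f\, dA \;=\; \lim_{n\rightarrow\infty}\sum_i f(x_i,y_i)\,\Delta A_i \qquad\text{vs.}\qquad \text{length}(C) \;=\; \lim_{n\rightarrow\infty}\sum_i \frac{\chi^{(n)}_{\mathbf{C}_W}(x_i,y_i)}{W^{(n)}}\,\Delta A_i.$$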
This definition suggests that whenever we want the length of a curve $C$, we should just weight a surface integral with the following function:
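$$\frac{\chi^{(n)}_{\mathbf{C}_W}(x,y)}{W^{(n)}}$$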
Except this weight is not a function at all!
First of all, this weight depends on $n$: in other words, it changes as we refine our partition. This makes sense (it results from thinning $\mathbf{C}_W$ as we increase $n$), but it also immediately hints that something weird is afoot. In our earlier examples of integrating weight functions, the weight functions remained the same from partition to partition.
Second, this weight outputs nonsense when our partition becomes infinitely fine. For points not on the curve $C$, things aren't so bad. Consider a specific point $(x,y)\not\in C$. At some stage of refinement, $\mathbf{C}_W^{(n)}$ will be too narrow to contain $(x,y)$, but its width $W^{(n)}$ will still be finite. That means that at this stage (and every stage beyond), $\chi^{(n)}_{\mathbf{C}_W}(x,y)=0$, such that the ratio $\chi^{(n)}_{\mathbf{C}_W}(x,y)/W^{(n)}$ vanishes, and
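$$\lim_{n\rightarrow\infty} \frac{\chi^{(n)}_{\mathbf{C}_W}(x,y)}{W^{(n)}} \;=\; 0 \qquad \text{whenever } (x,y)\notin C.$$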
Functions are allowed to output zero, so this isn't really a problem. Unfortunately, the circumstances are worse for points on $C$. The characteristic function $\chi^{(n)}_{\mathbf{C}_W}(x,y)$ always equals $1$ for points on $C$, even as the width $W^{(n)}$ becomes infinitesimally small. As a result, the ratio $\chi^{(n)}_{\mathbf{C}_W}(x,y)/W^{(n)}$ grows larger and larger, and the weight diverges!
Real-valued functions are not allowed to output infinities!
Therefore, this object we're integrating is definitely not a function. This is the secret at the heart of the trick: instead of a function, we've utilized a mathematical object called a distribution.
Distributions tell us how to weight partition pieces during our integration procedure. While all functions are distributions, many distributions are not functions. This is because distributions need only make sense when integrated. One consequence of this freedom is that weights from a distribution can vary between different partitions (as we saw earlier).
Particularly, the distribution we derived today is the Dirac delta distribution $\delta_C$ corresponding to the curve $C$. It's defined by the expression we found:
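$$\delta_C(x,y) \;=\; \lim_{n\rightarrow\infty} \frac{\chi^{(n)}_{\mathbf{C}_W}(x,y)}{W^{(n)}},$$
understood as a weight prescription for integration rather than as a pointwise equality.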
However, the utility of the Dirac delta comes from the Dimensional Bridge equation, which (when applied to our definition) yields a more popular expression of the Dirac delta distribution:
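$$\iint_R \delta_C \, dA \;=\; \int_{C\cap R} d\ell \;=\; \text{length}(C\cap R).$$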
Note that I’ve written the line integral over $C\cap R$ instead of simply $C$. This reflects the fact that (just as we saw with the characteristic functions) the only parts of a curve $C$ that can contribute to an integral are those parts that are included in the integration region $R$.
It is through that last equation that a Dirac delta connects objects of different dimensions. It is important to recognize that by definition the Dirac delta distribution only makes sense within integrals. It's literally meaningless otherwise. This fact unfortunately doesn't stop physicists from trying to manipulate Dirac deltas outside of integrals. It's a dangerous practice that can lead to nonsensical results if the physicist isn't careful. Here at SineOfPsi, we'll be more cautious than the average physicist.
Finally, I want to point out that all of our arguments above still work if we multiply the weight $\chi^{(n)}_{\mathbf{C}_W}$ by a function $f(x,y)$ (so long as it's defined on all of $R$), in which case we derive a relationship between generic line integrals and surface integrals:
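$$\iint_R f\,\delta_C \, dA \;=\; \int_{C\cap R} f\, d\ell.$$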
We'll be using this equation (and generalizations thereof) A LOT on SineOfPsi. Dirac deltas show up all over the place in physics. For example, in classical electromagnetism we often model point charges using Dirac deltas. But before we do that, we have to tie up a few loose ends regarding the integral calculus; that will be the topic of next week's post. Until then, best wishes!
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
#sineofpsi#sineofwhy#quantum field theory#particle physics#mathematics#electromagnetism#calculus#integral calculus#mathematical physics#studyblr#gradblr#researchblr#sciblr#scienceblr#physicsblr#mathblr#Dirac delta distribution#Dirac delta#distribution#dimensional bridge equation#dimensional-bridge equation#dimensional bridge#dimensional-bridge#bridge#dimension#partition#refinement#area#surface integral#length
68 notes
·
View notes
Text
Calculus IV: Integrals vs. The World
The physical world has three spatial dimensions as evidenced by three unique directions of movement: left-and-right, forward-and-back, and up-and-down.
We've spent the last few weeks (1 2 3) developing calculus in the plane, but we need to extend calculus to three dimensions if we’re going to apply it to physical situations. In our pursuit of three-dimensional calculus, we’ll learn about the bizarre zero-dimensional world of points and their zero-dimensional integrals.
If you're up for the challenge, join me for the dimension-destroying conclusion of our integral calculus series!
I. Volume, Volumes, and the Volume Integral
We model our three-dimensional physical environment with a mathematical equivalent known as real 3-space (or simply 3-space). It's essentially three copies of the real line $\mathbb{R}$ strung together. We express this symbolically by writing $\mathbb{R}^3=\mathbb{R}\times \mathbb{R}\times \mathbb{R}$. It’s standard practice in physics to call the first copy of $\mathbb{R}$ the $x$-axis, the second copy the $y$-axis, and the third copy the $z$-axis.
Consequently, any point $\vec{a}$ in 3-space is uniquely characterized by three real numbers, collectively called rectangular coordinates: an $x$-coordinate $a_x$, a $y$-coordinate $a_y$, and a $z$-coordinate $a_z$. Each coordinate selects a real number from its corresponding axis (so that, for example, the $x$-coordinate is a specific value on the $x$-axis). We often write $3$-space points with an overhead arrow, which is why I've been doing that. We also sometimes write points as a list of their coordinates, such as $\vec{a}=(a_x,a_y,a_z)$.
One of the simplest ways to make a three-dimensional shape in $3$-space is by restricting each of our coordinates to be in a certain interval, i.e. we look at the shape composed of points $(x,y,z)$ such that $x\in [x_L,x_R]$, $y\in [y_L,y_R]$, and $z\in [z_L,z_R]$. The resulting geometric object is called a right rectangular prism (abbreviated as RRP) or a $3$-cell.
Just as line segments have length and rectangles have area, $3$-cells possess a natural measure of size called volume, created by multiplying the lengths of each interval together:
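$$\text{vol}(3\text{-cell}) \;=\; |x_R-x_L|\cdot|y_R-y_L|\cdot|z_R-z_L|.$$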
In this context, each of these interval lengths is called a side-length of the $3$-cell. We like $3$-cells because we can approximate any three-dimensional shape as a union of $3$-cells. This enables us to extend the definition of volume to any three-dimensional shape.
Although generic $3$-cells will get the job done, for convenience we typically specialize to $3$-cells which have a uniform side-length, say $L = |x_R-x_L| = |y_R-y_L| = |z_R-z_L|$. We call such an object a cube. By definition,
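$$\text{vol}(\text{cube}) \;=\; L^3.$$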
Having discussed the building blocks, let's move on to three-dimensional shapes proper. A generic three-dimensional shape is (somewhat confusingly) also called a volume. As already mentioned, we can approximate generic volumes using cubes. In the same way that smaller squares make for better approximations of surfaces, smaller cubes make for better approximations of volumes.
Cutting to the chase, if we utilize an infinitely-fine partition of infinitely-small cubes, we obtain the volume integral over a volume $V$:
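$$\text{vol}(V) \;=\; \lim_{n\rightarrow\infty} \sum_{i=1}^{N^{(n)}} \Delta\tau_i.$$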
The quantity $\Delta \tau_i$ in this definition is the volume of the $i$th cube $\text{CUBE}_i$ contained in our approximation of $V$. Meanwhile, the index $n$ labels the stages of our partition refinement, so that as $n\rightarrow \infty$, the number of cubes used to approximate $V$ diverges as well: $N^{(n)}\rightarrow \infty$. Per usual, we can reweight the contribution of each cube in our partition according to a function defined on $V$, resulting in what we call the volume integral of a (real-valued) function $f$ over a volume $V$:
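$$\iiint_V f\, d\tau \;=\; \lim_{n\rightarrow\infty} \sum_{i=1}^{N^{(n)}} f(x_i,y_i,z_i)\,\Delta\tau_i,$$
with $(x_i,y_i,z_i)$ any sample point chosen in $\text{CUBE}_i$.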
Like we saw last week in the real plane, there are ways to weight cubes such that the weights can depend on the specific partition we use. These possibly partition-dependent weight prescriptions are called distributions. (Reminder: functions are a subset of distributions.) I bring up distributions because we’re just as interested in Dirac delta distributions in $\mathbb{R}^3$ as we were in $\mathbb{R}^2$.
Now that we know how to construct generic three-dimensional objects in a controlled manner, let’s discuss all of the other objects that live in $\mathbb{R}^3$. In doing so, we’ll derive a handful of Dimensional Bridges and obtain corresponding Dirac delta distributions. We’ll also learn about zero-dimensional objects and zero-dimensional integrals.
II. Bridging the Many Shapes of 3-Space
There exist four kinds of shapes in $\mathbb{R}^3$, distinguished by their dimensionality: volumes, surfaces, curves, and points.
Despite their different dimensionalities, any of these shapes can be reconstructed by a sequence of volumes. We proceed like we did last week: we create the appropriately thickened version of the desired object and then take some of its widths to zero. This is how we build Dimensional Bridges.
For example, suppose we would like to have a certain surface $S$ in $\mathbb{R}^3$, but are only allowed to build it using volumes. To do so, we could craft a volume $\mathbf{S}_{W_1}$ that resembles the desired surface but has a thickness $W_1$ to it:
While it’s technically three-dimensional, we can imagine making $\mathbf{S}_{W_1}$ thinner and thinner until--once its thickness $W_1$ goes to zero--we recover $S$. This procedure constructs a Dimensional Bridge:
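$$\text{area}(S) \;=\; \lim_{W_1\rightarrow 0} \frac{1}{W_1} \iiint_{\mathbf{S}_{W_1}} d\tau.$$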
In this case, we’ve eliminated precisely one of the volume’s three dimensions.
Alternatively, we could create a curve $C$ from a curve-like volume $\mathbf{C}_{W_1,W_2}$ by eliminating two dimensions:
This leads to another Dimensional Bridge equation:
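$$\text{length}(C) \;=\; \lim_{W_1,W_2\rightarrow 0} \frac{1}{W_1 W_2} \iiint_{\mathbf{C}_{W_1,W_2}} d\tau.$$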
We can even create a point $P$ from a point-like volume $\mathbf{P}_{W_1,W_2,W_3}$ by eliminating all three of the volume’s dimensions.
The Dimensional Bridge of this case is tricky. By following the pattern of the other cases, we can guess its right-hand side:
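$$\boldsymbol{?} \;=\; \lim_{W_1,W_2,W_3\rightarrow 0} \frac{1}{W_1 W_2 W_3} \iiint_{\mathbf{P}_{W_1,W_2,W_3}} d\tau,$$
with the question mark standing in for the yet-to-be-determined left-hand side.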
But what should the left-hand side of the equation be? If it follows the pattern of the previous cases, it should be a measure of zero-dimensional objects, which leads us to ask: what is the natural measure of a point?
III. Making Sum-Thing Out of One-Thing
Every point in $\mathbb{R}^3$ is a zero-dimensional object. This expresses the fact that points lack any spatial extent: they simply exist “at a point.”
To phrase it another way, imagine we scale-up all of our shapes by a factor of two. As a result, our curves become twice as long ($2=2^1$), our surfaces possess four times as much area ($4=2^2$), and our volumes encompass eight times as much volume ($8=2^3$)... but our points look exactly the same as before. Points are scale-invariant.
This gets to the heart of the problem with finding a natural measure of a point: there’s no such thing as a larger or smaller point. What does it mean to "measure" a point when all points are identical?
Let’s entirely ignore the fact that we don’t know what’s happening on the left-hand side of our unfinished Dimensional Bridge and try to evaluate its right-hand side instead. In doing so, we should approximate the point $\vec{P}$ with a point-like volume $\mathbf{P}_{W_1,W_2,W_3}$. Let’s use a (small) $3$-cell for this purpose. Specifically, let’s use a $3$-cell that contains the point $\vec{P}$ and has side-lengths $W_1=|x_R-x_L|$, $W_2= |y_R-y_L|$, and $W_3=|z_R-z_L|$.
Because our point-like $3$-cell has volume $W_1\cdot W_2\cdot W_3$, we can simplify the right-hand side of our unfinished equation:
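$$\lim_{W_1,W_2,W_3\rightarrow 0} \frac{1}{W_1 W_2 W_3} \iiint_{\mathbf{P}_{W_1,W_2,W_3}} d\tau \;=\; \lim_{W_1,W_2,W_3\rightarrow 0} \frac{W_1 W_2 W_3}{W_1 W_2 W_3} \;=\; 1.$$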
Whoa! According to this, whatever the natural measure of a point might be, a point always contributes an amount $1$.
Now we’re getting somewhere! To clear things up, let’s generalize a little bit. What if we instead had $N$ points?
Let’s label the points $\vec{P}_1$, $\vec{P}_2$, and so-on, and label their entire collection as $A = \{\vec{P}_1,\cdots,\vec{P}_N\}$. We’ll approximate each point with a $3$-cell with side-lengths $W_1$, $W_2$, and $W_3$. Let’s call the collection of these point-like volumes $\mathbf{A}_{W_1,W_2,W_3}$.
Because each point-like object has volume $W_1\cdot W_2\cdot W_3$, our conglomerate $\mathbf{A}_{W_1,W_2,W_3}$ has $N$-times as much volume, yielding a total volume of $N\cdot (W_1\cdot W_2\cdot W_3)$. That means the right-hand side of our unfinished Dimensional Bridge equals, in this case,
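$$\lim_{W_1,W_2,W_3\rightarrow 0} \frac{1}{W_1 W_2 W_3} \iiint_{\mathbf{A}_{W_1,W_2,W_3}} d\tau \;=\; \lim_{W_1,W_2,W_3\rightarrow 0} \frac{N\cdot (W_1 W_2 W_3)}{W_1 W_2 W_3} \;=\; N.$$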
Huh, look at that: we’ve recovered the number of points in $A$! We therefore surmise that the natural way to measure points is to count how many there are; we call this zero-dimensional measurement the set size of a finite set, aka size.
Note how size respects the scale-invariance of points because making everything twice as big doesn’t change how many points there are.
Following this thread to completion allows us to write the final Dimensional Bridge. Given a finite set $A$ of points, the Dimensional Bridge reads,
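$$\text{size}(A) \;=\; \lim_{W_1,W_2,W_3\rightarrow 0} \frac{1}{W_1 W_2 W_3} \iiint_{\mathbf{A}_{W_1,W_2,W_3}} d\tau.$$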
While this is great, we’ve introduced size via a three-dimensional integral, which is the opposite of how we treated other natural measures. In all of the previous cases, we first expressed each measure as an integral over an object of the same relevant dimensionality (e.g., expressing area as a surface integral, length as a line integral, and so-on). Let’s rectify this now by writing size as a zero-dimensional integral.
The zero-dimensional integral is unique because it’s the only integral for which partitions are irrelevant. See, points are the basic zero-dimensional building block, and every zero-dimensional object is just a finite set of points. As a result, there’s no need to make any approximations: we can exactly reconstruct zero-dimensional objects.
Like our previous integration routines, if we have $N$ points in a finite set $A=\{\vec{P}_1,\cdots,\vec{P}_N\}$, then we’ll find the size of $A$ by breaking $A$ into individual zero-dimensional pieces (points) and adding up the size of each piece (always $1$). This yields the zero-dimensional integral we desired, aka the point integral over a finite set $A$:
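$$\text{size}(A) \;=\; \sum_{\vec{P}_i \in A} 1 \;=\; N.$$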
But we can go further. We recovered the natural measure of our zero-dimensional object by associating each point with a value of $1$... which looks like a weight! Following the usual integration script, we can reweight each point in $A$ according to a function $f$ on $\mathbb{R}^3$, and in doing so we obtain the point integral of a function $f$ over a finite set $A$:
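$$\sum_{\vec{P}_i \in A} f \;=\; \sum_{i=1}^{N} f(x_i,y_i,z_i),$$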
where we’ve written the coordinates of $\vec{P}_i$ as $(x_i,y_i,z_i)$. We thereby arrive at a peculiar result: zero-dimensional integrals are finite sums. It’s for this reason I will probably never call zero-dimensional integrals “point integrals” again and instead opt to call them “sums” from here on out.
(Hey, this means you’re technically evaluating an integral every time you count or add two numbers together! If you’ve learned anything today, I hope it’s that--if you're being super technical--schools teach integral calculus to kindergartners.)
IV. Turning Bridges into Dirac Deltas
Last week, we saw how a Dimensional Bridge equation can yield a Dirac delta distribution. In particular, we saw that we could use a Dimensional Bridge to write line integrals as surface integrals instead:
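$$\int_{C\cap R} f\, d\ell \;=\; \iint_R f\,\delta^2_C \, dA.$$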
Notice I tweaked my notation slightly by writing the Dirac delta on $C$ as $\delta^2_C$ instead of $\delta_C$. This establishes that it’s meant to be integrated over a 2-dimensional surface as opposed to--for example--a 3-dimensional volume. It also reminds us that distributions are, by definition, objects intended for integration.
Aside: I should mention that the above equation holds true for surfaces and curves in $3$-space too, even though $3$-space admits stranger-looking surfaces than the real plane.
We built several additional Dimensional Bridges in $3$-space today. From every bridge we derive an analogous Dirac delta distribution:
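Schematically, one per bridge:
$$\iint_{S\cap R} f\, dA \;=\; \iiint_R f\,\delta^3_S\, d\tau, \qquad \int_{C\cap R} f\, d\ell \;=\; \iiint_R f\,\delta^3_C\, d\tau, \qquad \sum_{\vec{P}_i\in A\cap R} f(\vec{P}_i) \;=\; \iiint_R f\,\delta^3_A\, d\tau.$$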
The superscript $3$ on these Dirac deltas indicates they should be volume-integrated (as they are). The notation also allows us to quickly determine the physical units of any given Dirac delta. Let’s focus on $\delta_C^2$ and $\delta_C^3$ first. If you’ll allow me to slip into inexact physics notation for a moment, the weights generated by these Dirac deltas go like this:
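$$\delta^2_C \;\sim\; \frac{\chi_{\mathbf{C}_W}}{W}, \qquad \delta^3_C \;\sim\; \frac{\chi_{\mathbf{C}_{W_1,W_2}}}{W_1 W_2},$$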
where each $\chi$ is a characteristic function. Because characteristic functions only output $0$ or $1$, they’re unitless. Therefore, $\delta_C^2$ has units of width$^{-1}$, while $\delta_C^3$ has units of width$^{-2}$. Generally, if the Dimensional Bridge that generates a Dirac delta requires removing $n$ dimensions, then the Dirac delta will have units of width$^{-n}$.
V. Some Final Generic Properties of Integrals
I wouldn’t be able to forgive myself if I concluded this integral calculus series without mentioning certain important integral properties. In particular, because integrals are a special kind of weighted sum, they inherit all of the nice properties that sums have, including linearity.
By linearity, I mean that sums respect addition and scalar multiplication. Given two finite collections of real numbers labelled $\{a_i\}$ and $\{b_i\}$ and any real number $\kappa$, the following statements are true:
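$$\sum_i (a_i + b_i) \;=\; \sum_i a_i + \sum_i b_i, \qquad \sum_i (\kappa\, a_i) \;=\; \kappa \sum_i a_i.$$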
Both of these properties carry through to integrals in the form of weights, and therefore functions. A technical way to express this is to say that integrals are linear operators with respect to functions. Given two real functions $f$ and $g$ defined on a subset $\alpha$ of $\mathbb{R}^3$ and any real number $\kappa$, integrals satisfy
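$$\int_\alpha (f+g) \;=\; \int_\alpha f + \int_\alpha g, \qquad \int_\alpha (\kappa f) \;=\; \kappa \int_\alpha f,$$
with $\int_\alpha$ standing for whichever of our integrals matches the dimensionality of $\alpha$.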
The fact that integrals behave nicely with addition will be extremely useful for us next week when we discuss the superposition principle of classical electromagnetism. That’s right: after a month of calculus, we’re returning to physics!
No need to worry, calculus fans. It’s only a matter of time before the derivative calculus demands our attention again...
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
#sineofpsi#sineofwhy#quantum field theory#particle physics#mathematics#electromagnetism#calculus#integral calculus#mathematical physics#studyblr#gradblr#researchblr#sciblr#scienceblr#physicsblr#mathblr#real 3-space#3-space#axis#point#rectangular coordinates#x-coordinate#y-coordinate#z-coordinate#x-axis#y-axis#z-axis#right rectangular prism#rectangular prism#cuboid
12 notes
·
View notes
Photo
~January 12th, 2017~
I completed the integral calculus subarc on SineOfPsi.
0 notes
Photo

~January 9th, 2017~
I drafted a SineOfPsi post on 3-space and points.
0 notes
Photo
~January 7th, 2017~
I updated SineOfPsi and corrected my graviton unitarity code.
0 notes
Photo
~January 6th, 2017~
I finished drafting a SineOfPsi post on the Dirac delta distribution.
0 notes
Text
Calculus I: Integrals are Gonna Carry that Weight
Calculus is the branch of mathematics that tells us how to break down mathematical objects and put them back together. Whether you credit Isaac Newton or Gottfried Leibniz with its invention, there’s no denying physics is built upon a foundation of calculus.
We frequently break calculus into two subfields:
The derivative calculus, which–like breaking a staircase into individual steps–tells us how to deconstruct wholes into constituents, and
The integral calculus, which does the reverse: given an assortment of steps, it manages to build a staircase.
Today, we introduce the latter topic. We do so in the context of finding the area of a shape in the real plane, and then generalize this procedure to craft our real goal: the integral. The integral is a powerful machine, capable of feats akin to mathematical alchemy.
Are you ready? Okay, 3… 2… 1… Let’s jam!
Calculating Areas in the Real Plane: Rectangles
While integral calculus can be performed in the abstract, for now (in the interest of building intuition) let’s anchor our discussion in the real plane. Once we have a handle on this kind of integration, we’ll generalize.
The real plane $\mathbb{R}^2$ is the set of all points of the form $(x,y)$, where $x$ and $y$ are real numbers. Sometimes we write the real plane as $\mathbb{R}\times \mathbb{R}$ to emphasize that it’s more-or-less two copies of the real numbers packaged together. Perhaps the simplest shape we can define in the real plane is the rectangle $R$, which consists of all points that have an $x$-coordinate between two real numbers $x_L$ and $x_R$, and a $y$-coordinate between two real numbers $y_L$ and $y_R$. This subset is illustrated below in deep blue.
As typically defined, the real numbers come equipped with a nice notion of distance. For example, if I have a real number $x_1$ and a real number $x_2$, then I can say they’re $|x_2-x_1|$ apart. Furthermore, we can say the real interval $[x_1,x_2]$ (consisting of all real numbers between $x_1$ and $x_2$) has length $|x_2-x_1|$. We want to transmute this one-dimensional measure of intervals into a corresponding measure of two-dimensional shapes, and the way we’ll do it is via rectangles.
In the same way that the real plane can be viewed as a product of two copies of the real numbers ($\mathbb{R}^2 = \mathbb{R}\times\mathbb{R}$), a rectangle can be viewed as a product of two real intervals, e.g. $R = [x_L,x_R]\times [y_L,y_R]$. This notation is evocative, suggesting a two-dimensional extension of distance that we call area:
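$$\text{area}(R) \;=\; |x_R-x_L|\cdot|y_R-y_L|.$$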
You might recognize this as the usual “width times height” area formula for rectangles. Like distance, area is a nonnegative real number. We want our mathematical notion of area to match our intuition, which means if we have two non-overlapping rectangles $R_1$ and $R_2$, we should be able to define their total area to be the sum of their individual areas:
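$$\text{area}(R_1\cup R_2) \;=\; \text{area}(R_1) + \text{area}(R_2).$$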
Note that we write $R_1\cup R_2$ (that is, we take a union of $R_1$ and $R_2$) because we’ve defined rectangles as subsets of the real plane. We’ll use our ability to combine rectangles into bigger (non-rectangular) shapes as a means to calculate more complicated areas.
For our discussion of integration, let’s hone in on the subcollection of rectangles known as squares, which have sides of equal length: $s\equiv |x_R-x_L|=|y_R-y_L|$. Let’s denote a square by $S$, so that the area of a square is given by:
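$$\text{area}(S) \;=\; s^2.$$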
We’ll be covering a lot of ground with the help of squares.
Calculating Areas in the Real Plane: Everything Else
Now, most shapes are not rectangles, let alone squares. We can, however, build most shapes from rectangles or squares. Suppose I have the following shape in the real plane:
I’m going to be referring to this shape as BLOB. We can break BLOB up using squares with side-length $s=1$. Our ability to break BLOB into squares is a direct consequence of being able to do the same to the entire real plane: when performed, we end up with a partition over $\mathbb{R}^2$:
The game here is to measure the area of BLOB by adding up these squares (whose areas we already know). As you can see, BLOB doesn’t fit perfectly in the squares, so we’ll only be estimating BLOB’s area at the moment. Even with this specific partition, there are many ways of estimating BLOB’s area. I’ll be focusing on two, which will give us lower and upper estimates of its area.
The Lower Estimate: let’s only count those squares that fall 100% within BLOB.
A visual count says there are 10 squares fitting this criterion, each with area $s^2=1$. Although all of the area accounted for by these squares (colored green) definitely belongs to BLOB, we’re missing a lot of area (colored red) from partially-filled squares. Therefore, we’re underestimating the area, so all we can conclude is that BLOB’s area is larger than 10:
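$$\text{area}(\text{BLOB}) \;>\; 10.$$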
Next, the Upper Estimate: this time, let’s count every square that contains any amount of BLOB whatsoever.
This rule gives us 35 contributing squares. While we’ve accounted for the entirety of BLOB’s area (colored green), we’ve also included a lot of area outside of BLOB (colored red). This means we’re overestimating the area. Combined with the earlier underestimate, we’ve obtained a window within which BLOB’s area resides:
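$$10 \;<\; \text{area}(\text{BLOB}) \;<\; 35.$$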
This isn't a particularly narrow window, is it? Thankfully, we can do better! Using these same rules, we can reduce our estimation errors by using a finer partition: that is, by breaking our existing squares into smaller squares. This gives our estimates a better precision. By breaking each $s=1$ square into four $s=½$ squares, we obtain the following illustrations:
Because $s=½$ now, each square has area $s^2=(½)^2 = ¼$, and our area estimates are calculated by multiplying the appropriate number of squares by $¼$: that is,
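$$\text{estimate} \;=\; N_{squares}\cdot s^2 \;=\; \frac{N_{squares}}{4}.$$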
This finer partition grants us a narrower window; we now estimate BLOB’s area as somewhere between $17.5$ and $31.0$:
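$$17.5 \;\leq\; \text{area}(\text{BLOB}) \;\leq\; 31.0.$$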
You can literally see our estimation error decrease as the red regions shrink between the $s=1$ and $s=½$ cases.
I’ll include one more iteration of this procedure (getting us to squares with side-length $s=¼$) just to drive the idea home:
We calculate our area estimates via $N_{squares}/16$ this time, with which we find a narrower window for BLOB’s area:
This process can be continued indefinitely. Typically, our upper and lower estimates will grow closer and closer, and in the limit as $s\rightarrow 0$ (when our squares become infinitely small), our estimates will converge upon the area of the desired region. In this case, we’d write (using the lower estimate’s procedure),
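$$\text{area}(\text{BLOB}) \;=\; \lim_{N\rightarrow\infty} \sum_{S_i\subset\text{BLOB}} \text{area}(S_i).$$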
Let’s break down this notation piece-by-piece:
$N$ is the number of squares covering BLOB, and by $N\rightarrow\infty$ we mean “keep refining the partition” so that BLOB ends up covered in infinitely-many, infinitely-small squares. Equivalently, we could say $s\rightarrow 0$.
$i$ indexes our squares. There are lots of ways to do this. An easy choice is to label our squares one-by-one as we go left-to-right and up-to-down, writing $i=1$, $i=2$, $i=3$, and so-on, like so:
This makes $S_i$ the $i$th square as dictated by our choice of indexing. Because our collection of squares changes with each stage of the procedure, we’ll have to reindex during each stage.
Finally, the condition $S_i\subset\text{BLOB}$ on our sum ensures we only add the areas $area(S_i)$ of squares within BLOB.
With these notations in mind, the equation states “the area of BLOB equals the sum of the areas of squares contained in BLOB, when the partitioning of those squares is made infinitely fine.”
This area calculation is an example of integration, and the resulting area is called an integral.
Generalizing: Integration Done Properly
Integration would be boring (and not very useful) if it only found areas. One slight change to our procedure adds immense power:
Integration allows us to break a region into many small pieces, reweight the importance of each piece, and then recombine those pieces into a new whole.
In our previous example, we implicitly gave each piece of BLOB a weight of 1, which says, “Leave this area as you found it.” Thus, when we added all of the pieces together we got the total area of BLOB.
However, many physical scenarios that naturally live in the plane (or in the closely-related $\mathbb{R}^3$) exhibit a bias between different locations. For example, consider placing point charges in the plane. Point charges pull on each other with a force that grows stronger the closer the charges get to one-another. Therefore, if we’re interested in the force on a point charge $q$ due to a bunch of other charges, we can’t simply add up all of the other charges–we have to also weight their contributions by their relative distance to $q$! In other words, if we want to describe systems with many charges, we need an augmented version of our earlier area-exclusive integration.
To be clear about our terminology: when we multiply a small area by a real number during our integration procedure, we’ll call that real number a weight. Weights can be positive, zero, or even negative. Often weight is given by a scalar density function $w(x,y)$ which maps each point in the real plane to a weight.
Consider an alternate version of BLOB, this time equipped with a scalar density function $w(x,y)$ whose value ranges between $w=0$ and $w=2$ throughout the plane:
We’ve presented the changing weight via a contour plot, with deeper blues corresponding to larger weights. Like before, we can partition the grid into squares:
Unlike before, each square now carries a distribution of weights across it. Consider, for instance, the square we zoomed into in the above illustration.
Previously, because its weight was uniformly $w=1$, we could simply include the area of this square in our estimates as-is. Now we’ve got to multiply its area by a weight, but it’s unclear which of the many weights appearing throughout the square we should choose. $1$? $1.4$? Maybe $2$? How do we decide which weight we use?
Turns out, for our purposes? It doesn’t matter, because we’re always going to take the limit as the partition gets finer and finer.
As those squares get smaller, the weight in any single square converges to a specific value. Phrasing it another way: smaller squares have less room over which the weight can change, and infinitely small squares have no room for change at all–we end up with a uniform weight again! At any individual stage (e.g. $s=1$, or $s=½$, or $s=¼$, etc.) of our limiting procedure, all we need to do is choose some point in each square and evaluate the weight there. It might not make for a great approximation at that specific stage, but it won’t matter once $s\rightarrow 0$!
Let $(x_i,y_i)$ denote a point (again, any point will work) in the square $S_i$. Then the weighted version of our earlier expression is,
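$$\lim_{N\rightarrow\infty} \sum_{S_i\subset\text{BLOB}} w(x_i,y_i)\,\text{area}(S_i).$$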
In general, if we have any region $R$ of the plane and a function $f$ defined from $R$ into $\mathbb{R}$, then we define the surface integral of $f$ over $R$ to be,
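$$\iint_R f\, dA \;=\; \lim_{N\rightarrow\infty} \sum_{i=1}^{N} f(x_i,y_i)\,\Delta A_i.$$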
The function (aka integrand) $f$ is analogous to our weight function, and the region $R$ is analogous to BLOB. By writing $\Delta A_i$ for the area of our squares, I’m using a notation popularly used throughout physics; see, $\Delta$ is frequently used in physics for quantities that we intend to make infinitely small. In this way, “$\Delta A_i$” becomes the infinitesimal area “$dA$” in the limit that $s$ goes to $0$. This notation is especially useful when generalizing to other (non-rectangular) coordinate systems.
As alluded to earlier, we can recover the area of a region $R$ in the plane by setting $f(x,y)=1$ for all $(x,y)\in R$ and integrating $f$ over $R$:
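$$\iint_R 1\, dA \;=\; \text{area}(R).$$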
This is great! But still insufficient. We’re going to need a lot of integrals here at SineOfPsi, so we’ve got to extend our machinery further yet. We’ll pick up from here next week, when we'll learn how to integrate curves in the plane. See you then!
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
#sineofpsi#sineofwhy#quantum field theory#particle physics#physics#mathematics#electromagnetism#calculus#integral calculus#mathematical physics#studyblr#gradblr#researchblr#sciblr#scienceblr#physicsblr#derivative calculus#cowboy bebop#real plane#rectangle#distance#area#square#partition#estimate#estimation#riemann integration#riemann#refinement#partition refinement
83 notes
·
View notes
Text
Calculus II: Going Great Lengths to Integrate
Last week we learned how to find the area of a two-dimensional region in the real plane. We then generalized this procedure to allow for preferential treatment (“reweighting”) of different parts of that subset, resulting in a procedure called integration and a number called a surface integral.
However, not all shapes in the real plane are two-dimensional. For instance, we can draw one-dimensional curves in the real plane too! Just as area was natural to 2D regions, length is natural to curves.
This prompts the question: How do we find the length of a generic curve?
Calculating Length of a Curve in the Real Plane
Suppose you take a pencil and a piece of paper, and place the pencil’s tip somewhere on the page. Then, without lifting the pencil from the paper, you draw on the page, only picking it up once you’re satisfied with the image you’ve created. You might end up with something like this:
This continuous one-dimensional object is a curve. Curves can become extremely complicated as they twist and turn throughout the plane. This makes finding the length of a generic curve a challenging task.
However, there is a class of curves with easy-to-measure lengths: line segments! A line segment connects two points in the plane via a straight path. If a line segment $L$ begins at $\vec{x}_1 = (x_1,y_1)$ and ends at $\vec{x}_2 = (x_2,y_2)$, then it has length,
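$$\text{length}(L) \;=\; |\vec{x}_2 - \vec{x}_1| \;=\; \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}.$$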
In the real plane, a line segment looks like this:
We can (and will!) use our knowledge of line segment length to measure the lengths of generic curves. We do this by first breaking a generic curve into smaller pieces and then approximating each of those pieces by line segments. Adding up the lengths of those line segments yields an estimate of the curve’s length. As we'll see, by dividing the curve into infinitely-many pieces we can measure the curve’s length exactly.
Before we get into the details of the division process, let’s get a stronger mathematical grip on what it means to be a curve.
Consider our drawing analogy again. We choose an initial point $\vec{x}_{0}$, draw for a bit, and stop at a final point $\vec{x}_{f}$. Time ticks on as we draw, so that each point $\vec{x}(t)$ we draw on the curve corresponds exactly to a specific time $t$ on a clock. For our purposes, let’s measure time as a fraction of the total time we spend drawing the curve. That means our first point $\vec{x}_{0}$ occurs at $t=0$ and our final point $\vec{x}_f$ occurs at $t=1$:
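$$\vec{x}(0) \;=\; \vec{x}_0, \qquad \vec{x}(1) \;=\; \vec{x}_f.$$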
All other points on the curve are drawn between $t=0$ and $t=1$, such that $t$ satisfies $0\leq t\leq 1$. We say that $t$ parameterizes the curve, and call the range of values that $t$ can take on the parameterization interval, labeled $I_t \equiv [0,1]$. Technically, the curve is the mapping from $t$ to $\vec{x}(t)$, while the shape we drew in the plane (a subset of $\mathbb{R}^2$) is called the curve image... but we typically use the word curve for both and let context determine what definition we mean.
This curve parameterization is useful because it allows us to chop up the parameterization interval (a neat orderly subset of $\mathbb{R}$) instead of a curve (which can sprawl throughout the real plane in a disorganized fashion). Let’s try this procedure on the curve we drew earlier, which we name CURVE. We begin by dividing CURVE’s parameterization interval into three even pieces. This naturally breaks CURVE into three pieces as well:
In doing so, we’ve generated a partition of CURVE. There’s nothing special about dividing CURVE into specifically three pieces; any finite number of pieces works as a place to begin. Because we can rush through (or linger on) different portions of the curve as we draw, an evenly divided parameterization interval does not imply an evenly divided curve.
Dividing $I_t$ into three pieces defines four special values of $t$: the endpoints of $I_t$, plus the points between adjacent interval pieces. In this particular case, these $t$-values are $t_0=0$, $t_1=1/3$, $t_2=2/3$, and $t_3=1$. Our curve maps each of these to a point $\vec{x}(t)$ in $\mathbb{R}^2$; we’ll call each of these special points a division point. They’re denoted with black circles in the image above.
We approximate the curve’s length by drawing line segments between each adjacent pair of division points and adding up the lengths of all of these line segments.
We label our division points according to the order they appear along the curve (going from $t=0$ to $t=1$), so that the first point is $\vec{x}_0$ (which is consistent with our earlier definition), and then the second one is $\vec{x}_1$, and so-on, until we get to $\vec{x}_f$, which is our fourth division point and thus labelled $\vec{x}_3$.
Next, we draw line segments between adjacent division points. For instance, we draw a line segment from $\vec{x}_0$ to $\vec{x}_1$, label it $L_1$, and label its length as $\Delta \ell_1\equiv \text{length}(L_1) = |\vec{x}_1-\vec{x}_0|$. We move through every other adjacent pair of points in the same way, thereby drawing the line segments $L_2$ and $L_3$.
Finally, our approximation of the length is the sum of the lengths of these line segments:
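$$\text{length}(\text{CURVE}) \;\approx\; \Delta\ell_1 + \Delta\ell_2 + \Delta\ell_3.$$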
The illustration makes it clear that this approximation doesn’t work well for CURVE. This is apparent from the large gaps between our (green) line segments and the (dark blue) CURVE.
We can, however, reduce these gaps and make a better approximation by refining our partition. In this vein, let’s chop each existing piece of $I_t$ in half:
This makes for six pieces of CURVE (from splitting each piece in half) and seven division points (one more per piece we split). To commemorate this new stage of approximation, we reindex our division points so that they’re still labeled $\vec{x}_0$, $\vec{x}_1$, etc. as we move from $\vec{x}_0$ to $\vec{x}_f$ along the curve. According to this updated index, $\vec{x}_f=\vec{x}_6$. We proceed like before and draw line segments between adjacent division points, utilizing the same notations as before such that $\Delta \ell_i = \text{length}(L_{i}) = |\vec{x}_i-\vec{x}_{i-1}|$ for $i=1$, $2$, ..., $6$.
Our new approximation reads,
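$$\text{length}(\text{CURVE}) \;\approx\; \sum_{i=1}^{6} \Delta\ell_i.$$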
Again, this approximation isn’t necessarily close to the actual length of the curve, but the illustrations imply it’s better than our previous approximation because our line segments fall closer to the actual curve. As we further divide $I_t$, the distance between adjacent division points decreases, the line segments deviate less from CURVE, and we obtain an increasingly accurate length estimation.
We can measure the curve’s length exactly by performing an infinitely-fine partition of $I_t$, which yields the expression,
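$$\text{length}(\text{CURVE}) \;=\; \lim_{N\rightarrow\infty} \sum_{i=1}^{N} \Delta\ell_i,$$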
where $N\rightarrow \infty$ means “divide the curve into infinitely-many infinitely-small line segments.”
This process (like last week’s) is a form of integration and the number it produces is a line integral.
If two people draw the same shape but take their time on different portions of the drawing, they’ll have different parameterizations, but the curve will have the same length. That is, the length of the curve (and therefore, our integral) is independent of the specific parameterization we use. In fact, the parameter $t$--while we inspired it via time--does not have to be time at all! We can parameterize our curve using any map that associates each value in $I_t$ with a unique point on the curve so long as it preserves the flow of the curve from one end to the other.
Generalizing: Integration along Curves
Line integrals show up often in physics, such as when we follow the path of a particle through space. For example, suppose we arrange several point charges throughout space that are, by assumption, fixed in place. Let’s label them collectively as $Q$. Further suppose we introduce an additional point charge $q$ and begin dragging it throughout space. Our new charge will interact with the photon clouds of $Q$, and thus experience a force due to $Q$. But the density of the photon clouds changes as we move through space, so $q$ will experience different forces at different points in space. If we want to know the collective energy we spend against these forces as we drag $q$ around, then we have to perform an integral over $q$’s path that weights different infinitesimal lengths according to those changing forces.
In order to develop such a tool, we define a weight function (aka scalar density function) $w$ along CURVE, which maps each point $(x,y)$ on CURVE to a real number called a weight $w(x,y)$.
The weight tells us what each line segment in our approximation will be multiplied by. In our length calculation, we effectively used $w(x,y)=1$. Now CURVE has weights varying between $0$ and $2$ as it winds through the plane, as described by the plot’s legend.
In this context, our earlier three-piece partition now looks like,
If we follow the script from last week, we’d now choose a weight from within our approximated region to use as the weight in our approximate expression. However, because our curve is a 1D object living in a 2D space, we run into an issue we didn’t have before: our approximate shape hardly overlaps our actual curve. Our green line segments only intersect CURVE at their endpoints!
Thankfully, the parameterization interval saves us. While we use the line segments (which live in $\mathbb{R}^2$) to approximate the length of our curve between points, we should use $I_t$ to choose a weight for each of those line segments. For example: if the line segment $L_1$ connects the points corresponding to parameter values $t_0$ and $t_1$, we should choose some $t^\prime_1\in [t_0,t_1]$ and associate $\Delta \ell_1$ with the weight at $\vec{x}(t^\prime_1)$.
Note: the specific $t$ we choose in each subinterval doesn’t matter in the limit that our partition becomes infinitely fine. Any choice will do the job! In the interval corresponding to $L_i$, let’s label this choice of $t$ as $t_i^\prime$.
This procedure yields the desired weighted sum of length over CURVE:
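$$\lim_{N\rightarrow\infty} \sum_{i=1}^{N} w\big(\vec{x}(t^\prime_i)\big)\,\Delta\ell_i.$$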
More generally, given a curve $C$ and a function $f$ defined on enough of $\mathbb{R}^2$ to cover $C$, we define the line integral of $f$ along $C$ as,
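$$\int_C f\, d\ell \;=\; \lim_{N\rightarrow\infty} \sum_{i=1}^{N} f\big(\vec{x}(t^\prime_i)\big)\,\Delta\ell_i.$$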
By integrating $f=1$, we recover the length of $C$,
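$$\int_C 1\, d\ell \;=\; \text{length}(C).$$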
We now have two ways to integrate in the real plane: surface integrals and line integrals. While the basic principles of integration are the same, these integrals are fundamentally different. However, there does exist a means of unifying surface and line integrals: a mathematical object called the Dirac delta distribution. This is the topic of next week's post. See you then!
Thanks for reading today’s post! Follow sineofpsi.tumblr.com for new physics content every Friday. Have questions about anything we’ve talked about? Send me an ask. I’m wishing you the best!
#sineofpsi#sineofwhy#quantum field theory#particle physics#mathematics#electromagnetism#calculus#integral calculus#mathematical physics#studyblr#gradblr#researchblr#sciblr#scienceblr#physicsblr#curve#line segment#parameterize#parameterization#parameterization interval#curve image#curve parameterization#partition#division point#line integral#length#weight function#weight#scalar density function
33 notes
·
View notes
Text
Classical E&M I: Coulomb’s Photon Cloud
“Electrically charged” is an adjective meaning able to interact with photons.
By definition, an electrically charged material constantly emits and absorbs photons; in doing so, they pump a perpetual fog of photons throughout the surrounding space--a photon cloud. When the electrically charged material is also very small in size, we promote the word “charge” to a noun, and call the material an electric point charge, or simply a point charge.
Today, we're discussing how point charges can push and pull on each other from a distance. In doing so, we motivate Coulomb's Law, which describes the electric force.
Lucky for us, the photon distribution around any point charge is a simple shape with only one free parameter: a positive real number called charge strength. Charge strength measures how quickly a point charge can emit and absorb photons. By doubling the charge strength of a particle, you double the emission/absorption rate of that particle and consequently double the density of its photon cloud.
It’s conventional to label point charges by their charge strengths, e.g. a particle with charge strength $q$ (a number) gets called $q$ (a label).
Suppose we place a point charge with strength $q^\prime$ at a location $\vec{r}^\prime$. As a point charge, $q^\prime$ possesses a photon cloud. This cloud extends throughout all space, but grows less dense the further you are from the charge.
Say we set an additional point charge $q$ at a new location $\vec{r}$. Like any other point charge, $q$ wants to interact with photons and (special thanks to $q^\prime$) there’s plenty to go around. Thus, $q$ begins absorbing momentum-carrying photons originating from $q^\prime$. In this way, $q$ experiences a force due to $q^\prime$! This is the electric force.
A key element here is that photons are NOT charged. We might otherwise worry that the photons from $q^\prime$ would be blocked from reaching $q$ by its own photon cloud. But nope! Photon clouds pass right through one-another.
Like all forces, the electric force on $q$ due to $q^\prime$ is a vector; we’ll denote it as $\vec{F}_{q^\prime\rightarrow q}$. Our question becomes: what is the direction and magnitude of the electric force?
The Direction of the Electric Force:
Charges emit photons uniformly in all directions. This means the only physically-special direction available in our problem is the line of separation between $q^\prime$ and $q$. Mathematically, this is described by the separation vector:
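$$\vec{R} \;\equiv\; \vec{r} - \vec{r}^\prime.$$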
As the mnemonic “final minus initial” implies, our separation vector points from the source charge $q^\prime$ at $\vec{r}^\prime$ to the target charge $q$ at $\vec{r}$. At the moment we don’t care about the length of this vector, only its direction, so let’s divide by its length,
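$$\hat{R} \;\equiv\; \frac{\vec{R}}{|\vec{R}|}.$$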
The uniformity of emission and absorption (or, rather, the symmetries implied by that uniformity) imply our force vector must point along the line of separation:
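$$\vec{F}_{q^\prime\rightarrow q} \;=\; \pm\, F_{mag}\,\hat{R},$$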
where the “$+$” case indicates a repulsive force and the “$-$” case indicates an attractive force. Which is the correct choice? Well, that depends...
There are actually two kinds of electric charge in nature. They’re completely indistinguishable in regards to their own photon clouds; however, they are picky about where their photons come from. To ease the explanation, let’s call these two types of charge Type A and Type B.
Suppose our source charge is a Type A; label it $q_A^\prime$. By existing, $q_A^\prime$ emits a photon cloud. When we place a target charge in the vicinity of $q_A^\prime$, we have to choose whether we place a Type A charge $q_A$ or a Type B charge $q_B$, and as it turns out? That choice matters.
A $q_A$ placed near $q_A^\prime$ will absorb photons from $q_A^\prime$, and move away. Meanwhile, a $q_B$ placed near $q_A^\prime$ will behave in the opposite manner: $q_B$ will move towards $q_A^\prime$. In general,
Like-typed charges repel and different-typed charges attract.
Why are there two kinds of charge and why do they interact like this? The answer lies in quantum field theory and outside the scope of this post. Right now, we simply state the existence and behavior of those charge types as fact, so that our force equation looks like this:
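$$\vec{F}_{q^\prime\rightarrow q} \;=\; \begin{cases} +F_{mag}\,\hat{R} & \text{for like-typed charges} \\ -F_{mag}\,\hat{R} & \text{for different-typed charges,} \end{cases}$$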
where $F_{mag}$ is a nonnegative real number and our next topic of discussion.
The Magnitude of the Electric Force:
As you may have suspected, the force $q$ experiences due to $q^\prime$ is proportional to the number of photons that $q$ absorbs from $q^\prime$. This fact immediately tells us $\vec{F}_{q^\prime\rightarrow q}$ must be proportional to:
The source’s charge strength, $|q^\prime|$: this dictates the number of photons our source emits in the first place. More photons emitted means more photons for $q$ to absorb, and
The target’s charge strength, $|q|$: this controls our target’s capacity for interacting with photons, such that $q$ can absorb more photons if it has a larger charge strength.
I’ve only defined the charge strength as a nonnegative real number so far, so taking the absolute values of $q$ and $q^\prime$ like I did above is unnecessary. I do this now only because it’ll be useful later.
For the sake of a derivation, suppose that $q^\prime$ emits $N_{emit}$ photons in miscellaneous directions all at once. The photons emitted by $q^\prime$ go forever outward at a constant speed, such that these $N_{emit}$ photons travel together on the surface of a sphere:
When the photons are $R$ away from $q^\prime$, they occupy a sphere with surface area $4\pi R^2$. As time goes on, their distance $R$ from $q^\prime$ grows, and they become more spread out. The local density of photons goes like $N_{emit}/(4\pi R^2)$.
This spreading out with area is why a charge’s photon cloud becomes less dense the further we are from that charge. It also means that $q$ has fewer photons to absorb the further out we are, such that the number of photons that reach $q$ goes like $N_{abs}\propto N_{emit}/(4\pi R^2)$, or simply $N_{abs} \propto 1/R^2$.
Combining all of the pieces, I claim our force’s magnitude is this:
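$$F_{mag} \;=\; u\,\frac{|q|\cdot|q^\prime|}{4\pi R^2},$$
with $R=|\vec{R}|$ the distance between the charges,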
where $u$ is a constant that ensures our units work out. Hence, our force law is,
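$$\vec{F}_{q^\prime\rightarrow q} \;=\; \begin{cases} +\,u\,\dfrac{|q|\cdot|q^\prime|}{4\pi R^2}\,\hat{R} & \text{for like-typed charges} \\[1ex] -\,u\,\dfrac{|q|\cdot|q^\prime|}{4\pi R^2}\,\hat{R} & \text{for different-typed charges.} \end{cases}$$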
This case-by-case result is accurate, but lacks aesthetic and utility. Let’s change that.
Thus far, we’ve had two (nonnegative) scales of electric charge, one for Type A charge strengths and another for Type B charge strengths. We can combine these into one scale as follows: embed charge strength into a real number’s magnitude (e.g. $|q|$ instead of $q$) and embed the charge’s type into that real number’s sign. For instance, we might define Type A charges as positive numbers and Type B charges as negative numbers:
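$$q \;>\; 0 \;\text{ for Type A charges}, \qquad q \;<\; 0 \;\text{ for Type B charges}.$$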
Using this convention, when $q$ and $q^\prime$ are the same type, we have $|q|\cdot |q^\prime| = q\cdot q^\prime$, whether they’re both Type A or Type B. Similarly, when $q$ and $q^\prime$ are different types, we have $|q|\cdot|q^\prime| = -q\cdot q^\prime$. Note the difference in sign: there’s a “$+$” for like-typed charges and a “$-$” for different-typed charges. This mimics the case-dependent sign in our force law!
Aside: We could’ve just as easily defined Type A charges to be negative and Type B charges as positive. As far as the physical laws go, either version of this math trick works.
By embedding charge types in the sign of our charges, we’ve obtained a traditional form of Coulomb’s Law:
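$$\vec{F}_{q^\prime\rightarrow q} \;=\; u\,\frac{q\, q^\prime}{4\pi |\vec{R}|^2}\,\hat{R}.$$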
This is the electric force that a stationary charge $q$ at $\vec{r}$ experiences due to a stationary charge $q^\prime$ at $\vec{r}^\prime$. Because $\vec{R} = \vec{r}-\vec{r}^\prime$, Coulomb’s Law is also written,
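$$\vec{F}_{q^\prime\rightarrow q} \;=\; u\,\frac{q\, q^\prime}{4\pi}\,\frac{\vec{r}-\vec{r}^\prime}{|\vec{r}-\vec{r}^\prime|^3}.$$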
Note that Coulomb’s Law automatically satisfies Newton’s Third Law: by swapping the source and target, we find,
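$$\vec{F}_{q\rightarrow q^\prime} \;=\; u\,\frac{q^\prime\, q}{4\pi}\,\frac{\vec{r}^\prime-\vec{r}}{|\vec{r}^\prime-\vec{r}|^3} \;=\; -\,\vec{F}_{q^\prime\rightarrow q}.$$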
Therefore, our point charges experience equal and opposite forces! Not bad!
While Coulomb’s law for point charges is a powerful tool in solving classical E&M problems, we can make it more powerful by extending it to so-called charge distributions. But first, let's discuss how we can break the universe into an infinitude of tiny volumes and put it back together again. See you next week!
Thank you for reading! Follow sineofpsi.tumblr.com for more physics posts. Have questions about QFT or particle physics? Send me an ask! Best wishes.
#sineofpsi#sineofwhy#quantum field theory#particle physics#physics#mathematics#electromagnetism#E&M#classical mechanics#special relativity#mathematical physics#studyblr#gradblr#researchblr#sciblr#scienceblr#physicsblr#charge#electric charge#electrically charged#photon cloud#electric point charge#point charge#charge strength#electric force#separation vector#source charge#target charge#opposites attract#photon density
23 notes
·
View notes
Text
Luminosity: a Fine Line Between Order and Chaos
Particle physics relies on firing two high-energy beams at each other and seeing what comes flying out when they collide. By a beam, we mean a macroscopic collection of particles that travel together with the same energy. For example, the Large Hadron Collider (LHC) at CERN fires two proton beams at each other and uses various detectors to quantify the resulting chaos. Right now it operates at an energy of 13 TeV, which means that each proton in each beam has (13 TeV)/2 = 6.5 TeV worth of energy. This is a HUGE amount of energy! Each 6.5 TeV proton travels only several meters/second slower than the speed of light!
The LHC also reports another number: it's delivered "8359.72 inverse-picobarns" worth of "integrated luminosity" during 2016 so far. We're also told this number will go up as the LHC continues running. What is the integrated luminosity and why is it important?
Although we talk about beams as objects in their own right, they're really made up of particles (as illustrated above). Because we want to know how particles interact, we have to determine how to translate the intersection of (macroscopic) beams into the interactions of (microscopic) particles. This is precisely what the luminosity accomplishes. But how?
Consider what the crossing of these beams looks like for the individual particles in the beams. Particles from different beams will certainly have the opportunity to interact if we fire them directly at each other. This is recorded as a HIT in the following image:
Similarly, if two particles pass each other with a decent amount of distance between them, then they'll be too far away to interact and they will MISS each other.
Those two regimes are fairly well-defined. What's not well-defined is what happens in between: the ??? case. Consider two fridge magnets. If I place one on top of the other in the right way, I can create a massive repulsion between them. I can also put them on opposite sides of a room so that they don't influence each other at all. But if I then slowly bring the magnets together, the repulsion will smoothly increase from nonexistent to extremely strong. Physically, there is no clear division between the two regimes.
The same thing is true in the case of particles in beams: there is no clear division between HIT and MISS. Thankfully, there are ways to reduce the fuzziness, primarily by reducing the possibility for particles to interact more than once. For instance, we can aim our beams at a slight angle relative to each other, so that interactions only occur at the point where the two beams cross. We can also make our beams diffuse enough that particles only rarely get close enough to interact. Additionally, we can chop our beams into bunches (a bunch is a technical term for a short beam segment). Using bunches ensures not too many particles are colliding at once so there are fewer opportunities for interactions. All of this limits complications that might occur when separating the regimes of HIT and MISS.
Next, we define an interaction window relative to each particle. This is an area we define perpendicular to the particle's motion, outside of which no interactions occur. To be clear, consider a particle A. If a particle B enters particle A's interaction window, then particles A and B might interact (this corresponds to HIT or ??? from the previous figure). If particle B does not enter particle A's interaction window, then by definition they cannot interact (this corresponds to MISS). This concept is illustrated below:
Because the division between HIT and MISS is fuzzy, the interaction window is not uniquely defined. But that's okay! Physical arguments give us an idea of roughly how big it should be, and--if we define everything consistently--the interaction window drops out of physical quantities. We'll see an example where this happens in next week's post.
Each of the particles in the beam will have its own interaction window. That means as our beams cross there will be a lot of interaction opportunities overall. We quantify this via the luminosity:
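Schematically (in the units spelled out below),
$$\text{luminosity} \;=\; \frac{\text{number of interaction opportunities}}{\text{time}\,\times\,\text{interaction-window area}}.$$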
Luminosity is how many interaction opportunities occur overall per time per interaction window. In other words, luminosity is the flux of interaction opportunities per particle. This is why luminosity is important at the LHC: a higher luminosity = more interaction opportunities for each particle per time = data is generated more quickly! As its definition indicates, luminosity has units of $[\text{area}]^{-1}[\text{time}]^{-1}$.
It's possible to calculate the luminosity of an experiment by knowing how the experimentalists set up their beams and defining an appropriate interaction window. But wait! I said we're told the LHC's integrated luminosity. What's that? Integrated luminosity is the luminosity summed over time. We can write it as,

$$\mathcal{L}_{\text{int}} = \int \mathcal{L}\,\mathrm{d}t,$$

or, in a conceptually-equivalent form: integrated luminosity is the total number of interaction opportunities per interaction window, accumulated over the entire run of the experiment.
Integrated luminosity has units of $[\text{area}]^{-1}$, which is why the LHC reports it in inverse-picobarns (a "picobarn" is a unit of area). Because more interaction possibilities yield more data, we see that integrated luminosity is a measure of how much data a particle collider has produced so far. Therefore, integrated luminosity is a very important quantity!
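To unpack those units: a barn is defined as $10^{-24}\ \text{cm}^2$ (a standard definition, stated here as background), so

$$1\ \text{pb} = 10^{-12}\ \text{b} = 10^{-36}\ \text{cm}^2 \qquad\Longrightarrow\qquad 1\ \text{pb}^{-1} = 10^{36}\ \text{cm}^{-2}.$$

An integrated luminosity of, say, $5\ \text{pb}^{-1}$ therefore corresponds to $5\times 10^{36}$ interaction opportunities per $\text{cm}^2$ of interaction window.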
As a high-energy phenomenologist, I use the integrated luminosities reported by experiments rather than calculate them myself. Instead, I calculate other experimentally-relevant quantities, such as cross-sections. Cross-sections summarize the quantum-weirdness that happens when two particles share an interaction window. We'll be talking about that next time!
Thank you for reading! Follow sineofpsi.tumblr.com for more physics posts. Have questions about QFT or particle physics? Send me an ask! Best wishes.
#sineofwhy #sineofpsi #quantum field theory #luminosity #cross section #educational #physics #mathematics #particle physics #high energy physics #qft #particles #quantum mechanics #beam physics #physicist #particle collider
How can a Particle be “Virtual”?
Particle physicists sometimes talk about virtual particles. The word “virtual” evokes images of online avatars or holograms: things that are in some sense false, which causes virtual particles to sound fake. Yet physicists also say things like “Particle A and Particle B exchange a virtual particle and experience a force.” Well, if it makes a real force, then how is the exchanged particle virtual? Let’s get into it.
Modern particle physics is described via ~*~Quantum Field Theory~*~, which associates each kind of particle with a quantum field. For instance, an electron is a type of particle, and so there exists an electron field.
You can think of a quantum field like a sprawling network of guitar strings across all of space and time. Like guitar strings, you can pluck the quantum field and–if you do it just right–you’ll hit a note that resonates across the whole network of strings. (A note that resonates across all of space and time!) These metaphorical notes correspond to physical particles.
It turns out the field's resonant notes don't occur randomly. In fact, each field is associated with a number–called the mass m of that field–and the field's resonant notes all satisfy the following expression:

$$E^2 = (pc)^2 + (mc^2)^2,$$
where E and p are the energy and momentum of the resonant note/particle, and c is the speed of light. (You might recognize this formula as Einstein’s Mass-Energy Equivalence Formula. Quantum fields automatically encode Einstein’s theory of special relativity!)
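As a quick sanity check on that expression, set $p = 0$ (a particle at rest): the relation collapses to

$$E^2 = (mc^2)^2 \quad\Longrightarrow\quad E = mc^2,$$

which is the version of Einstein's formula most people have seen.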
These resonant notes are called real particles and are said to be on-shell because they satisfy Einstein’s Formula. If it can avoid interacting or decaying, a real particle has the potential to travel forever. Thus, real particles can travel far enough to reach our detectors and they’re Pretty Important as a result.
So real particles appear when we pluck the field and hit a resonant note. However, nothing prevents us from hitting a non-resonant note. Hitting these non-resonant notes results in particles that do not satisfy Einstein's Formula:

$$E^2 \neq (pc)^2 + (mc^2)^2.$$
That means these ill-formed particles can have any mass they’d like, unlike real particles. They also won’t travel forever. Instead, they quickly lose effectiveness and decay into other particles. Therefore, they don’t reach our detectors. These are what we call virtual particles. Because of their unrestricted masses, they’re said to be off-shell.
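Here's a tiny numerical sketch of the real/virtual distinction (Python, natural units with $c = 1$, and hypothetical numbers; this just restates Einstein's Formula rather than doing any actual QFT):

```python
def invariant_mass_sq(E, p):
    """E^2 - p^2 in units where c = 1; equals m^2 for an on-shell particle."""
    return E**2 - p**2

m = 0.511e-3  # electron mass in GeV
p = 1.0       # momentum in GeV (hypothetical)

# On-shell (real): energy chosen so that E^2 = p^2 + m^2 holds exactly.
E_real = (p**2 + m**2) ** 0.5
print(abs(invariant_mass_sq(E_real, p) - m**2) < 1e-12)  # True  -> real

# Off-shell (virtual): any other energy breaks the relation.
E_virtual = 1.3
print(abs(invariant_mass_sq(E_virtual, p) - m**2) < 1e-12)  # False -> virtual
```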
But just because we can’t detect virtual particles doesn’t mean they’re unphysical. See, our experiments often involve throwing real particles at each other (such as in a collider), and oftentimes when these particles make contact, they’ll pluck a field via an interaction. When they do this, they might hit a resonant note of that field, but there’s also a strong chance that they’ll hit a non-resonant note. Really, anything is fair game when particles collide!
Furthermore, any particle–that means real or virtual–can decay or interact with other particles. So, yeah, virtual particles won't live long enough to see our detector, but their energy and momentum have to go somewhere, and so they'll often decay into other particles. Those resulting particles might be real, and so might reach our detectors. Alternatively, a virtual particle might dump its energy/momentum into some other particle, and that lucky particle might be real enough to be detected. Either way, virtual particles yield physical consequences.
Therefore, despite their name, virtual particles are as real as any other particle. We just don’t directly observe them.
Thank you for reading! Follow sineofpsi.tumblr.com for more physics posts. Have questions about QFT or particle physics? Send me an ask! Best wishes.
#sineofpsi #qft #particle physics #real particle #virtual particle #sineofwhy #quantum field theory #particles #quantum mechanics #energy #momentum #einstein #interaction #physicist #physics #mathematics