How I learned to stop worrying and love the Floer C-epsilon space

The following is a thing I sometimes say to my students, or anyone else willing to listen to me:

People take the C_\varepsilon-space too seriously!

I’m going to try and explain in this post what I mean by that. In particular, I’m going to state a couple of rules for dealing with C_\varepsilon-spaces that I believe would make symplectic topology a slightly happier field if they were generally observed. My students should definitely read this post, because I promise to give them trouble if they submit a thesis that does not follow these rules. There are surely also people who are not my students and ought to read this post, but many of them won’t, and that’s probably not the end of the world.


What on Earth are you talking about, Chris?

I should remind you first what the Floer C_\varepsilon-space actually is and what it is for. Suppose you want to prove a statement of the following form, which I will refer to from now on simply as “the GTT”:

Generic transversality theorem (GTT): For generic J \in {\mathcal J}, the moduli space {\mathcal M}(J) is cut out transversely.

For concreteness, most of my readers will be happy to imagine that {\mathcal J} is a space of smooth tame/compatible almost complex structures on some symplectic manifold, {\mathcal M}(J) is a moduli space of J-holomorphic curves, and “cut out transversely” means that every curve in {\mathcal M}(J) is Fredholm regular, implying that {\mathcal M}(J) is a smooth manifold or (if there are symmetries) orbifold of the “expected” finite dimension derived from a Fredholm index. Results of this kind are also quite standard in other contexts involving nonlinear elliptic PDEs, e.g. in gauge theory. Whichever context you prefer to imagine, I want to assume in general that the set {\mathcal J} of geometric data on which the PDE is based consists of smooth sections of some fiber bundle over a compact manifold M, and is endowed with the C^\infty-topology. This is the most natural assumption to make in most geometric settings. (Given the name of this blog, I should add that one can also allow M to be noncompact if one requires all sections in {\mathcal J} to match some fixed section outside of a fixed compact subset — this is what one does for defining continuation or cobordism maps in any Floer-type theory. But for this discussion, I will assume M is compact just for simplicity.)

Now, up to thorny technical details that we’ll get to in a moment, there is a standard playbook for proving the GTT:

  1. Define a universal moduli space {\mathcal M}({\mathcal J}) := \left\{ (u,J)\ \big|\ J \in {\mathcal J},\ u \in {\mathcal M}(J) \right\}, and prove via the implicit function theorem that it is a smooth Banach manifold.
  2. Observe that the projection \pi : {\mathcal M}({\mathcal J}) \to {\mathcal J} : (u,J) \mapsto J is a smooth Fredholm map, so the Sard-Smale theorem gives a comeager subset {\mathcal J}^{\text{reg}} \subset {\mathcal J} such that {\mathcal M}(J) = \pi^{-1}(J) is cut out transversely for every J \in {\mathcal J}^{\text{reg}}.

There is an immediate impediment to implementing this strategy if {\mathcal J} is indeed a space of smooth objects with the C^\infty-topology: {\mathcal J} is not a Banach manifold, thus the universal moduli space {\mathcal M}({\mathcal J}) will not be one either if it is defined as described above. In the case I deal with most often, where {\mathcal J} is a space of smooth almost complex structures on a compact manifold, {\mathcal J} is at best a Fréchet manifold, not a Banach manifold, just as the space of smooth functions on a compact manifold is a Fréchet space with the C^\infty-topology, but not a Banach space. Unfortunately, there is no contraction mapping principle in Fréchet spaces, and thus no implicit function theorem (at least not unless one wants to use something overly complicated like Nash-Moser, which I don’t). In short, the naive definition of the universal moduli space {\mathcal M}({\mathcal J}) does not work; one needs a cleverer trick to define something that is useful.

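I’m aware of two such tricks that are popular among symplectic topologists:

  1. Replace {\mathcal J} with the space {\mathcal J}^m of objects of class C^m for some sufficiently large integer m, which is a Banach manifold.
  2. Replace {\mathcal J} with a Banach manifold {\mathcal J}_\varepsilon of smooth objects, defined via Floer’s C_\varepsilon-space.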

The disadvantage of Option 1 is that {\mathcal J}^m necessarily contains elements that have only finitely many derivatives, so the nonlinear operator defining {\mathcal M}(J) will no longer be smooth in general, and {\mathcal M}(J) itself becomes (at best) a manifold of class C^k for some large k, but not a smooth manifold. That is not the end of the world, but it is a major pain in the neck if at every step you have to keep track of how many derivatives of your finitely-differentiable functions you have not yet burned up. I personally prefer to avoid this whenever possible.

Option 2 avoids that problem because Floer’s C_\varepsilon-space contains only smooth objects. To sketch the idea, suppose first that we are interested in smooth sections of a finite-rank vector bundle E over the compact manifold M, and we make the usual choices (e.g. a finite covering by coordinate charts and local trivializations) so that the C^m-norm of such a section is well defined for each integer m \ge 0. Choose a sequence

\varepsilon = \{\varepsilon_m > 0\}_{m=0}^\infty such that \varepsilon_m \to 0,

and define the C_\varepsilon-norm of a smooth section \eta \in \Gamma(E) by

\|\eta\|_{C_\varepsilon} := \sum_{m=0}^\infty \varepsilon_m \|\eta\|_{C^m}.

It is easy to check that the space C_\varepsilon(E) of smooth sections \eta with \|\eta\|_{C_\varepsilon} < \infty is a Banach space with respect to this norm, and its obvious inclusion into the Fréchet space \Gamma(E) is continuous, i.e. C_\varepsilon-convergence implies C^\infty-convergence.
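To get a feeling for how strongly the C_\varepsilon-norm depends on the sequence \varepsilon, here is a minimal numerical sketch. It is purely illustrative and rests on assumptions not made above: M is the circle, E is the trivial line bundle, the C^m-norm is taken to be \max_{0 \le j \le m} \sup |f^{(j)}|, and all function names are invented for this example. For f(x) = \sin(kx) with an integer k \ge 1, that convention gives \|f\|_{C^m} = k^m, so partial sums of the series can be computed directly.

```python
import math


def cm_norm_sin(k: int, m: int) -> float:
    """C^m-norm of x -> sin(k*x) on the circle, for an integer k >= 1.

    Assumed convention: ||f||_{C^m} = max over 0 <= j <= m of sup|f^(j)|,
    and sup|f^(j)| = k**j when f = sin(k*x).
    """
    return float(max(k ** j for j in range(m + 1)))


def partial_ceps_norm(k, eps, terms):
    """Partial sum of eps(m) * ||sin(k*x)||_{C^m} over 0 <= m < terms."""
    return sum(eps(m) * cm_norm_sin(k, m) for m in range(terms))


def eps_factorial(m):
    """Hypothetical choice eps_m = 1/m! (very rapid decay)."""
    return 1.0 / math.factorial(m)


def eps_geometric(m):
    """Hypothetical choice eps_m = 2^{-m} (slower decay)."""
    return 2.0 ** (-m)


if __name__ == "__main__":
    for terms in (10, 20, 40):
        print(
            terms,
            partial_ceps_norm(3, eps_factorial, terms),
            partial_ceps_norm(3, eps_geometric, terms),
        )
```

With \varepsilon_m = 1/m!, the partial sums for k = 3 converge to e^3, while with \varepsilon_m = 2^{-m} they grow like (3/2)^m. The same smooth function thus has finite norm for one sequence and infinite norm for the other.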

For applications to the GTT, one typically fixes a “reference” object J_{\text{ref}} \in {\mathcal J} and defines the Banach manifold {\mathcal J}_\varepsilon \subset {\mathcal J} as something along the lines of a C_\varepsilon-small neighborhood of J_{\text{ref}} in {\mathcal J}. To say this more precisely, let us assume that {\mathcal J} really is a Fréchet manifold, so every J \in {\mathcal J} has a tangent space T_J{\mathcal J}, which is the Fréchet space of C^\infty-sections of some vector bundle, and one can define an “exponential map”

T_J{\mathcal J} \supset {\mathcal O} \stackrel{\text{exp}}{\longrightarrow} {\mathcal U} \subset {\mathcal J}

that sends a neighborhood {\mathcal O} \subset T_J{\mathcal J} of 0 homeomorphically to a neighborhood {\mathcal U} \subset {\mathcal J} of J. (There are typically easy direct ways to define \text{exp} so that it has this local homeomorphism property — one need not think about connections on infinite-dimensional Fréchet manifolds or anything so exotic.) One can then fix a sufficiently C^\infty-small neighborhood {\mathcal O} \subset T_{J_{\text{ref}}}{\mathcal J} of 0 and define

{\mathcal J}_\varepsilon := \left\{ \text{exp}_{J_{\text{ref}}}(Y) \ \big|\ Y \in {\mathcal O} \text{ with } \|Y\|_{C_\varepsilon} < \infty \right\}.

This can be regarded as a smooth Banach manifold in a trivial way: since the inclusion of C_\varepsilon-sections into smooth sections is continuous, the set of all Y \in {\mathcal O} with finite C_\varepsilon-norm is an open subset of a Banach space with the C_\varepsilon-norm, and Y \mapsto \text{exp}_{J_{\text{ref}}}(Y) is the inverse of a global chart identifying this subset with {\mathcal J}_\varepsilon. Since only one chart has been defined, there is no need to worry about transition maps.

It is important to understand however that since not every smooth section has finite C_\varepsilon-norm, {\mathcal J}_\varepsilon does not contain every smooth element in {\mathcal J}, and in fact it does not even contain a C^\infty-neighborhood of J_{\text{ref}}. In this sense, {\mathcal J}_\varepsilon is not a remotely natural space to work with; {\mathcal J}^m is much more natural by comparison. However, the obvious inclusion {\mathcal J}_{\varepsilon} \hookrightarrow {\mathcal J} is clearly continuous, and this means that {\mathcal J}_\varepsilon does contain (some, but not all) arbitrarily C^\infty-small perturbations of the particular element J_{\text{ref}}; moreover, there is nothing special about J_{\text{ref}}, as it can be chosen arbitrarily before defining {\mathcal J}_\varepsilon. As we will see below, this makes {\mathcal J}_\varepsilon a good enough space for use in proving the GTT, and — in my opinion at least — proving it this way is typically less painful than dealing with finitely-differentiable moduli spaces.

The Floer space is even less natural than it looks

I’m planning to say some very positive things about the C_\varepsilon-topology in a moment, but first, I want to make sure you’re fully aware of its flaws.

Recall that the C^m-norms on sections of the vector bundle E \to M are not canonically defined, but since M is compact, they are well defined up to equivalence, so the C^m-topology is canonical. Here’s a bit of bad news that may not have occurred to you yet: the C_\varepsilon-norm is a linear combination of infinitely many norms, each of which depends on choices. If we had only finitely many norms to add up, then we could easily prove that the C_\varepsilon-topology is similarly independent of choices. But if we pick \eta \in C_\varepsilon(E) and modify infinitely many of the C^m-norms within their individual equivalence classes, \|\eta\|_{C_\varepsilon} can easily become infinite. In other words, for any given sequence \varepsilon = \{\varepsilon_m\}_{m=0}^\infty, the space C_\varepsilon(E) and its topology are not canonically defined. They depend on further choices such as local charts and trivializations.

This leads me to the first of the two rules I’d like to propose:

Rule 1: The C_\varepsilon-space can play a starring role in lemmas, but should never, ever, be mentioned in the statement of a theorem.

Floer obeyed this rule, though several illustrious people since then have occasionally disregarded it, and some have even ended up with the impression that if one chooses to use {\mathcal J}_\varepsilon in a proof of the GTT, then {\mathcal J}_\varepsilon must also be mentioned in the statement. At the end of the day, though, you want to have a result about generic C^\infty-small perturbations of your geometric data, not generic perturbations in a much finer and completely unnatural topology that depends on arbitrary choices.

I’m here to tell you that this is possible.

The Floer space is tricky to get one’s hands on

In the standard proof of the GTT using C_\varepsilon-spaces, the main thing one needs to know about them is an easy lemma that was proved essentially by Floer — I will state it somewhat informally as follows:

Bump function lemma (see Lemma 5.1 in Floer’s paper): If the sequence \varepsilon_m has sufficiently rapid decay, then C_\varepsilon(E) contains sections with arbitrarily small support around any given point and arbitrary values at that point.

This lemma basically says that for a given J \in {\mathcal J}_\varepsilon and any point p \in M, one can without loss of generality assume there exist small perturbations J' \in {\mathcal J}_\varepsilon of J that are “pushed” in any desired direction near p but match J everywhere else. In practice, this makes {\mathcal J}_\varepsilon a “big enough” space of perturbations to prove the GTT.

In more general applications, however, the bump function lemma does not always suffice, and if you’re not thinking from the right perspective, working with C_\varepsilon-spaces can then start to seem harder than it actually is. The following is a slight simplification of a question that I recently found myself banging my head against:

Frustrating Question: Given a smooth submanifold \Sigma \subset M, a point p \in \Sigma and a linear map \lambda : T_p M \to {\mathbb R} that vanishes on T_p\Sigma, does there exist a function f : M \to {\mathbb R} of class C_\varepsilon that satisfies f|_\Sigma \equiv 0 and df(p) = \lambda?

This question arises unavoidably in the approach to equivariant transversality that I’ve recently been trying to promote via my paper on super-rigidity. The answer would be obviously “yes” if we only needed f to be smooth, but the ability to make its C_\varepsilon-norm finite while also choosing it to vanish on the submanifold \Sigma \subset M seems to depend on information about \Sigma that is not given. In the application I have in mind, \Sigma is the image of an arbitrary holomorphic curve in a moduli space that is completely unknown, so answering the question seems hopeless. I briefly had the terrible feeling that I was going to have to switch to Option 1 and rewrite large portions of my super-rigidity paper to accommodate finitely-differentiable almost complex structures.

But then I realized that I was thinking about it the wrong way around, and thus learned the second rule:

Rule 2: If you find yourself needing to prove that a given function is of class C_\varepsilon, then you are thinking backwards.

Let me explain.

Lack of naturality is not a bug, it’s a feature

There’s one thing about C_\varepsilon-spaces that you must never, ever forget: just as the topology of C_\varepsilon(E) depends on plenty of noncanonical choices, the sequence \varepsilon = \{\varepsilon_m\}_{m=0}^\infty is in itself a noncanonical choice, and you are free to change it. In particular, it can always be useful to make \varepsilon_m converge to 0 even faster. For crying out loud, that’s why it’s called \varepsilon!

Let’s formalize this idea a bit. Denote by \boldsymbol{\mathcal E} the set of all sequences of positive numbers that converge to 0, and define a pre-order \prec on \boldsymbol{\mathcal E} by

\displaystyle \varepsilon \prec \varepsilon' \quad\Longleftrightarrow\quad \limsup_{m \to \infty} \frac{\varepsilon_m}{\varepsilon_m'} < \infty.

Intuitively, \varepsilon \prec \varepsilon' means that the sequence \varepsilon_m decays to 0 at least as fast as \varepsilon_m'. One can now define statements of the form “X(\varepsilon) holds whenever \varepsilon has sufficiently rapid decay” to have the precise meaning, “There exists an \varepsilon_0 \in \boldsymbol{\mathcal E} such that X(\varepsilon) holds for every \varepsilon \prec \varepsilon_0”.
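As a quick illustration of the pre-order (the specific sequences are just examples chosen here), compare factorial decay with geometric decay:

```latex
\[
  \varepsilon_m = \frac{1}{m!}, \qquad \varepsilon_m' = 2^{-m}
  \qquad\Longrightarrow\qquad
  \limsup_{m \to \infty} \frac{\varepsilon_m}{\varepsilon_m'}
  = \limsup_{m \to \infty} \frac{2^m}{m!} = 0 < \infty ,
\]
so $\varepsilon \prec \varepsilon'$, whereas
\[
  \limsup_{m \to \infty} \frac{\varepsilon_m'}{\varepsilon_m}
  = \limsup_{m \to \infty} \frac{m!}{2^m} = \infty ,
\]
so $\varepsilon' \not\prec \varepsilon$.
```

On the other hand, the sequences \varepsilon and 5\varepsilon satisfy both \varepsilon \prec 5\varepsilon and 5\varepsilon \prec \varepsilon without being equal, which is why \prec is only a pre-order and not a partial order.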

Pre-order Lemma: Every countable subset of \boldsymbol{\mathcal E} has a lower bound with respect to the pre-order \prec. Moreover, the C_\varepsilon-spaces of smooth sections of E \to M have the following properties:

  1. There is a continuous inclusion C_{\varepsilon'}(E) \hookrightarrow C_\varepsilon(E) whenever \varepsilon \prec \varepsilon'.
  2. For any countable collection of smooth sections \eta_1,\eta_2,\eta_3,\ldots of E, one has \eta_k \in C_\varepsilon(E) for all k if \varepsilon has sufficiently rapid decay.

Proof: A lower bound for a countable collection of sequences \varepsilon^{(1)},\varepsilon^{(2)},\varepsilon^{(3)},\ldots \in \boldsymbol{\mathcal E} is given by \varepsilon = \{\varepsilon_m\}_{m=0}^\infty with \varepsilon_m := \min\{ \varepsilon_m^{(1)},\ldots,\varepsilon_m^{(m)}\} for m \ge 1 and \varepsilon_0 > 0 arbitrary, since this gives \varepsilon_m \le \varepsilon_m^{(k)} whenever m \ge k. The statement about inclusions C_{\varepsilon'}(E) \hookrightarrow C_\varepsilon(E) follows easily from the observation that \varepsilon \prec \varepsilon' if and only if there exist constants C > 0 and m_0 \in {\mathbb N} such that \varepsilon_m \le C \varepsilon_m' holds for all m \ge m_0. The last statement now follows from the first two after observing that any nonzero \eta \in \Gamma(E) has \|\eta\|_{C_\varepsilon} < \infty if \varepsilon_m \le 1 / (2^m \cdot \|\eta\|_{C^m}) for all m: apply this to each \eta_k separately to obtain a sequence \varepsilon^{(k)}, and then take a lower bound of the resulting countable collection.
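For readers who want the inclusion estimate spelled out, here is one way to write it, assuming \varepsilon_m \le C \varepsilon_m' for all m \ge m_0 as in the proof. The constant C' is notation introduced only for this computation, and if m_0 = 0 the first sum is empty and one can take C' = C.

```latex
\begin{align*}
  \|\eta\|_{C_\varepsilon}
  &= \sum_{m < m_0} \varepsilon_m \|\eta\|_{C^m}
   + \sum_{m \ge m_0} \varepsilon_m \|\eta\|_{C^m} \\
  &\le \Big( \max_{m < m_0} \frac{\varepsilon_m}{\varepsilon_m'} \Big)
       \sum_{m < m_0} \varepsilon_m' \|\eta\|_{C^m}
   + C \sum_{m \ge m_0} \varepsilon_m' \|\eta\|_{C^m} \\
  &\le C' \, \|\eta\|_{C_{\varepsilon'}},
  \qquad\text{where } C' := \max\Big\{ C ,\ \max_{m < m_0} \frac{\varepsilon_m}{\varepsilon_m'} \Big\}.
\end{align*}
```

In particular, finiteness of \|\eta\|_{C_{\varepsilon'}} implies finiteness of \|\eta\|_{C_\varepsilon}, and the inclusion is bounded with operator norm at most C'.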

The ability to choose \varepsilon_m decaying faster than any countably infinite collection of choices is a very useful bit of freedom that should not be underestimated. One can use this for instance to make C_\varepsilon(E) dense in Banach spaces such as C^m(E), L^p(E) or W^{k,p}(E) for p < \infty, since these are all separable and already contain \Gamma(E) as a dense subspace.
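The two constructions in the proof of the Pre-order Lemma are easy to experiment with numerically. The following sketch is an illustration under assumptions chosen only for this example: the family of sequences \varepsilon^{(k)}_m = (m+1)^{-k}, the functions \sin(kx) on the circle with \|\sin(kx)\|_{C^m} = k^m, and all function names are hypothetical and do not come from the text above.

```python
def lower_bound(family):
    """Diagonal lower bound for the sequences family(1), family(2), ...

    family(k) is a function m -> eps^{(k)}_m. At index m the result takes
    the minimum of the first max(m, 1) sequences, so m = 0 is covered too.
    """
    def eps(m):
        return min(family(k)(m) for k in range(1, max(m, 1) + 1))
    return eps


def polynomial_family(k):
    """Hypothetical example family: eps^{(k)}_m = (m + 1)**(-k)."""
    return lambda m: float(m + 1) ** (-k)


def cm_norm_sin(k, m):
    """||sin(k*x)||_{C^m} = k**m on the circle (assumed norm convention)."""
    return float(k ** m)


def eps_for_all_sines(m):
    """eps_m = 2^{-m} / max over 1 <= k <= m+1 of ||sin(k*x)||_{C^m}."""
    return 2.0 ** (-m) / max(cm_norm_sin(k, m) for k in range(1, m + 2))


def partial_ceps_norm(k, eps, terms):
    """Partial sum of eps(m) * ||sin(k*x)||_{C^m} over 0 <= m < terms."""
    return sum(eps(m) * cm_norm_sin(k, m) for m in range(terms))


if __name__ == "__main__":
    eps = lower_bound(polynomial_family)
    print([eps(m) for m in range(5)])
    for k in (1, 5, 10):
        print(k, partial_ceps_norm(k, eps_for_all_sines, 40))
```

For each fixed k, the terms of the series with m \ge k - 1 are bounded by 2^{-m}, so the partial sums stabilise: a single sequence \varepsilon accommodates the whole countable family at once.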

We can now resolve the question that I was recently banging my head against.

Answer to the Frustrating Question: This is the wrong question to ask.

Indeed, we already know how to construct the desired function f : M \to {\mathbb R} if it only needs to be smooth instead of belonging to a given C_\varepsilon-space. The solution is thus simply to construct a smooth function f, and then choose \varepsilon \in \boldsymbol{\mathcal E} so that \|f\|_{C_\varepsilon} < \infty. Now of course, there may actually be infinitely many such C_\varepsilon-functions we need to find for different choices of the data \Sigma, p, \lambda, however… in all situations I’m familiar with, one can get away with restricting to a countable set of such choices, in which case the Pre-order Lemma gives exactly what we need.

To show how this works, let’s work through a slightly novel take on the standard proof of the GTT.

Proving the Generic Transversality Theorem

To avoid excessive vagueness, we will be concrete now and assume M is a closed 2n-manifold, {\mathcal J} is the space of all smooth almost complex structures on M, and {\mathcal M}(J) is the space of (parametrized) somewhere injective J-holomorphic spheres u : (S^2,i) \to (M,J) with its natural C^\infty-topology. Equivalently, {\mathcal M}(J) is the zero-set of the nonlinear Cauchy-Riemann operator,

\bar{\partial}_J : {\mathcal B} \to {\mathcal E} : u \mapsto du + J(u) \circ du \circ i,

which we can regard as a smooth section of the Banach space bundle {\mathcal E} \to {\mathcal B} with fibers {\mathcal E}_u = L^p(\overline{\text{Hom}}_{\mathbb C}(T S^2 , u^*TM)) over the Banach manifold {\mathcal B} = W^{1,p}(S^2,M) for some p \in (2,\infty). (Note that the smoothness of the section \bar{\partial} depends on J being smooth — this is one of the nice things we’d have to give up if we were following Option 1.) Linearizing \bar{\partial}_J at u \in \bar{\partial}_J^{-1}(0) gives rise to a linear Fredholm operator

\mathbf{D}_u := D\bar{\partial}_J(u) : T_u{\mathcal B} \to {\mathcal E}_u,

and we call u Fredholm regular whenever this operator is surjective. This is equivalent to the condition that u is a transverse intersection of the section \bar{\partial}_J with the zero-section of the Banach space bundle {\mathcal E} \to {\mathcal B}. The goal is to find a comeager subset {\mathcal J}^{\text{reg}} \subset {\mathcal J} such that for every J \in {\mathcal J}^{\text{reg}}, the intersection of \bar{\partial}_J with the zero-section is everywhere transverse.


Let’s first quickly work through what I will call the “fairyland proof” of the theorem — in a fictional world where unicorns are real, sushi grows on trees, and {\mathcal J} is a smooth Banach manifold, this is how the proof would go.

Fairyland proof of the GTT: The universal moduli space {\mathcal M}({\mathcal J}) = \left\{ (u,J)\ \big|\ J \in {\mathcal J} \text{ and } u \in {\mathcal M}(J) \right\} is the zero-set of the smooth section

\bar{\partial} : {\mathcal B} \times {\mathcal J} \to {\mathcal E}' : (u,J) \mapsto \bar{\partial}_J(u),

where {\mathcal E}' denotes the obvious extension of {\mathcal E} \to {\mathcal B} to a bundle over {\mathcal B} \times {\mathcal J}. Its linearization at some (u,J) \in \bar{\partial}^{-1}(0) is then a bounded linear operator

\mathbf{L} := D\bar{\partial}(u,J) : T_u{\mathcal B} \oplus T_J{\mathcal J} \to {\mathcal E}'_{(u,J)} : (\eta,Y) \mapsto \mathbf{D}_u \eta + Y(u) \circ du \circ i,

where we are pretending T_J{\mathcal J} is a Banach space, and we recall for concreteness that

T_u{\mathcal B} = W^{1,p}(u^*TM) \qquad\text{ and }\qquad {\mathcal E}'_{(u,J)} = L^p(\overline{\text{Hom}}_{\mathbb C}((TS^2,i),(u^*TM,J)))

for some p \in (2,\infty). We claim that \mathbf{L} is always surjective. Since \mathbf{D}_u is Fredholm, a standard exercise in functional analysis implies that \mathbf{L} has closed image, so we only need to prove that its image is also dense. If it isn’t, then there exists a nontrivial section \lambda \in L^q(\overline{\text{Hom}}_{\mathbb C}(TS^2,u^*TM)) for \frac{1}{p} + \frac{1}{q} = 1 that annihilates the image of \mathbf{L}, implying the two conditions

\langle \mathbf{D}_u\eta,\lambda \rangle_{L^2} = 0 for all \eta\in T_u{\mathcal B}, \qquad\text{ and }\qquad \langle Y(u) \circ du \circ i , \lambda \rangle_{L^2} = 0 for all Y \in T_J{\mathcal J}.

The first condition means \lambda is a weak solution to the Cauchy-Riemann type equation \mathbf{D}_u^*\lambda = 0, so by elliptic regularity and the similarity principle, it is smooth and has only isolated zeroes. We can then pick an injective point z_0 \in S^2 of u at which \lambda is nonzero and find a smooth section Y \in T_J{\mathcal J} with support near u(z_0) to make \langle Y(u) \circ du \circ i, \lambda \rangle_{L^2} positive, producing a contradiction that proves the claim. The rest of the proof consists of standard applications of big theorems: the surjectivity of \mathbf{L} implies via the implicit function theorem that {\mathcal M}({\mathcal J}) is a smooth Banach manifold, and applying the Sard-Smale theorem to the projection {\mathcal M}({\mathcal J}) \to {\mathcal J} : (u,J) \mapsto J gives the desired comeager set of regular values {\mathcal J}^{\text{reg}} \subset {\mathcal J}, for which \bar{\partial}_J : {\mathcal B} \to {\mathcal E} intersects the zero-section transversely.

Departure from fairyland

The lazy way of transporting the proof above out of fairyland and into the real world is to replace {\mathcal J} with the Banach manifold {\mathcal J}_\varepsilon at every step. That produces a correct argument, but it proves a slightly different theorem than the one we really want, one that violates Rule 1 by providing generic C_\varepsilon-small perturbations instead of C^\infty-small perturbations. Floer’s way around this was to settle for a slightly weaker result: since {\mathcal J}_\varepsilon contains arbitrarily C^\infty-small perturbations of J_{\text{ref}}, and one could have chosen any element of {\mathcal J} to call J_{\text{ref}}, the argument does provide a dense set of almost complex structures in {\mathcal J} that achieve transversality. “Dense” is a weaker condition than “comeager”, but it is enough for most applications, e.g. it certainly suffices for defining Floer homology. On the other hand, sometimes one would like to intersect the set of regular data with some other generic subset and know that the intersection is still nonempty, in which case dense sets are not good enough, though comeager sets would be.

But there is also a subtler problem: not every step in the fairyland proof has a completely straightforward adaptation for {\mathcal J}_\varepsilon. The trickiest step is where one needs to find an element Y \in T_J{\mathcal J}_\varepsilon with support near u(z_0) such that \langle Y(u) \circ du \circ i , \lambda \rangle_{L^2} > 0. This was easy in fairyland because finding smooth bump functions with arbitrarily small support is easy; in the real world, this is where one needs to apply Floer’s bump function lemma, which requires \varepsilon to have sufficiently rapid decay. But the application of this lemma is also not so straightforward, for another reason that I haven’t mentioned yet: one of the irritating features of {\mathcal J}_\varepsilon as we’ve defined it is that for any given J \in {\mathcal J}_\varepsilon, it is not so easy to say precisely which sections Y \in T_J{\mathcal J} actually belong to T_J{\mathcal J}_\varepsilon and which do not. The exception is the case J = J_{\text{ref}}, since the construction clearly identifies T_{J_\text{ref}}{\mathcal J}_\varepsilon as the space of elements in T_{J_\text{ref}}{\mathcal J} that have finite C_\varepsilon-norm. You may have noticed in any case that by discussing this issue, we are running afoul of Rule 2. We are thinking backwards.

How to prove it without violating the rules

By this point, most of the necessary ideas for a correct but maximally stress-free proof have been mentioned; they just need to be assembled in the right way. Here we go.

Step 1: The universal “epsilon-regular” moduli space.

Fix arbitrary J_{\text{ref}} \in {\mathcal J} and \varepsilon \in \boldsymbol{\mathcal E} and use these to define the Banach manifold {\mathcal J}_\varepsilon in the usual way. Notice that while this space depends on the choice of sequence \varepsilon = \{\varepsilon_m\}_{m=0}^\infty, the “reference” almost complex structure J_{\text{ref}} belongs to {\mathcal J}_\varepsilon for every choice. One can now define the usual universal moduli space

{\mathcal M}({\mathcal J}_\varepsilon) := \left\{ (u,J) \ \big|\ J \in {\mathcal J}_\varepsilon \text{ and } u \in {\mathcal M}(J) \right\},

and present it as the zero-set of a smooth section \bar{\partial}(u,J) := \bar{\partial}_J(u) of the obvious extension of {\mathcal E} to a Banach space bundle over {\mathcal B} \times {\mathcal J}_\varepsilon. Let us denote the linearization of this section at (u,J) \in \bar{\partial}^{-1}(0) by

\mathbf{L}_\varepsilon := D\bar{\partial}(u,J) : T_u{\mathcal B} \oplus T_J{\mathcal J}_\varepsilon \to {\mathcal E}'_{(u,J)},

and notice that \mathbf{L}_\varepsilon is just the restriction of the operator \mathbf{L} :  T_u{\mathcal B} \oplus T_J{\mathcal J} \to {\mathcal E}'_{(u,J)} in the fairyland proof to a smaller domain. At this point, I find it useful to introduce the following bookkeeping device:

Definition: An element (u,J) \in {\mathcal M}({\mathcal J}_\varepsilon) is \varepsilon-regular if the operator \mathbf{L}_\varepsilon defined by linearizing \bar{\partial} : {\mathcal B} \times {\mathcal J}_\varepsilon \to {\mathcal E}' at (u,J) is surjective. Similarly, given any smooth almost complex structure J, a curve u \in {\mathcal M}(J) can be called \varepsilon-regular if J belongs to {\mathcal J}_\varepsilon and the pair (u,J) is \varepsilon-regular.

Note that a curve u \in {\mathcal M}(J) can be \varepsilon-regular without being Fredholm regular, as the former is a smoothness condition concerning the neighborhood of u (or more accurately of (u,J)) in the universal moduli space, rather than in {\mathcal M}(J). Clearly \varepsilon-regularity is also an open condition, so

{\mathcal M}^{\text{reg}}({\mathcal J}_\varepsilon) := \left\{ (u,J) \in  {\mathcal M}({\mathcal J}_\varepsilon) \ \big|\ (u,J) \text{ is } \varepsilon\text{-regular} \right\}

is an open subset of {\mathcal M}({\mathcal J}_\varepsilon). The implicit function theorem then implies that {\mathcal M}^{\text{reg}}({\mathcal J}_\varepsilon) is a smooth Banach manifold, and applying the Sard-Smale theorem in the usual way to the projection {\mathcal M}^{\text{reg}}({\mathcal J}_\varepsilon) \to {\mathcal J}_{\varepsilon} : (u,J) \mapsto J gives:

Lemma 1: For every \varepsilon \in \boldsymbol{\mathcal E}, there exists a comeager subset {\mathcal J}_\varepsilon^{\text{reg}} \subset {\mathcal J}_\varepsilon such that for each J \in {\mathcal J}_\varepsilon^{\text{reg}}, every \varepsilon-regular curve u \in {\mathcal M}(J) is Fredholm regular.

Notice that so far, we have not actually done any work; we just applied some standard theorems in a standard way. The lemma is, at this stage, correspondingly free of content: we have not yet shown that the set of \varepsilon-regular curves in {\mathcal M}(J) is ever nonempty. That is the next task.

Step 2: Proving epsilon-regularity for one curve

None of the ideas in our fairyland proof above were fundamentally wrong, they just were not applied in the right context. Salvaging the work done in the main technical step now leads to the following lemma. Note that in this statement, J = J_{\text{ref}}; since the tangent space T_J{\mathcal J}_\varepsilon for J \ne J_{\text{ref}} is not easy to describe, we shall avoid thinking about it altogether.

Lemma 2: For any given curve u \in {\mathcal M}(J_{\text{ref}}), u is \varepsilon-regular for all \varepsilon \in \boldsymbol{\mathcal E} with sufficiently rapid decay.

Proof: The fairyland proof contains a completely valid argument showing that the operator \mathbf{L} :  T_u{\mathcal B} \oplus T_{J_{\text{ref}}}{\mathcal J} \to {\mathcal E}'_{(u, J_{\text{ref}} )} has dense image. Since {\mathcal E}'_{(u, J_{\text{ref}} )} is a separable Banach space, every subset of it is also separable, so we can choose a sequence \xi_1,\xi_2,\xi_3,\ldots in the image of \mathbf{L} that is dense in {\mathcal E}'_{(u, J_{\text{ref}} )}, along with a sequence (\eta_k,Y_k) \in T_u{\mathcal B} \oplus T_{ J_{\text{ref}}}{\mathcal J} satisfying \mathbf{L}(\eta_k,Y_k) = \xi_k for all k=1,2,3,\ldots. By the Pre-order Lemma, all of the smooth sections Y_1,Y_2,Y_3,\ldots are of class C_\varepsilon for \varepsilon with sufficiently rapid decay, hence they are in T_{J_{\text{ref}}}{\mathcal J}_\varepsilon, and it follows in this case that the image of \mathbf{L}_\varepsilon is also dense. Since it is already known to be closed, the result follows.

Step 3: Every space you care about is second countable.

The next statement strengthens Lemma 2 via a change in the order of quantifiers.

Lemma 3: For all \varepsilon with sufficiently rapid decay, every curve in {\mathcal M}(J_{\text{ref}}) is \varepsilon-regular.

Proof: As a subset of the separable metrizable space {\mathcal B} = W^{1,p}(S^2,M), {\mathcal M}(J_{\text{ref}}) = \bar{\partial}_{J_{\text{ref}}}^{-1}(0) \subset {\mathcal B} is also separable, so in particular, it is a second-countable topological space, and thus has the property that every open cover has a countable subcover. Now since \varepsilon-regularity is an open condition, we can apply Lemma 2 and associate to each u \in {\mathcal M}(J_{\text{ref}}) some \varepsilon^{(u)} \in \boldsymbol{\mathcal E} and a neighborhood {\mathcal U}_u \subset {\mathcal M}(J_{\text{ref}}) of u such that

v \in {\mathcal U}_u \qquad\Longrightarrow \qquad  v \text{ is } \varepsilon^{(u)}\text{-regular}.

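Pick a sequence u_1,u_2,u_3,\ldots \in {\mathcal M}(J_{\text{ref}}) such that the open sets {\mathcal U}_{u_k} still cover {\mathcal M}(J_{\text{ref}}). The statement then holds if \varepsilon is chosen to be any lower bound for the countable set \varepsilon^{(u_1)}, \varepsilon^{(u_2)}, \varepsilon^{(u_3)},\ldots. Indeed, \varepsilon \prec \varepsilon^{(u_k)} gives an inclusion of the C_{\varepsilon^{(u_k)}}-space into the C_\varepsilon-space by the Pre-order Lemma, so passing from \varepsilon^{(u_k)} to \varepsilon only enlarges the domain of the linearized operator at J_{\text{ref}}, and every \varepsilon^{(u_k)}-regular curve in {\mathcal M}(J_{\text{ref}}) is therefore also \varepsilon-regular.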

Step 4: The Taubes trick

Traditionally, the so-called Taubes trick appears in the literature as a method for converting slightly inelegant statements about comeager subsets of {\mathcal J}^m or {\mathcal J}_\varepsilon into more elegant statements about comeager subsets of {\mathcal J}. Some authors don’t bother with it, and thus settle for slightly inelegant statements, especially if they don’t care about violating Rule 1. But in this version of the proof of the GTT, the Taubes trick plays a slightly more prominent role.

The Taubes trick depends on the ability to exhaust {\mathcal M}(J) by a countable collection of compact subsets that depend continuously in some sense on J. Let us assume in particular that for each N \in {\mathbb N} and J \in {\mathcal J}, a subset {\mathcal M}^N(J) \subset {\mathcal M}(J) can be defined that has the following properties:

  • (Exhaustion) \displaystyle \bigcup_{N \in {\mathbb N}} {\mathcal M}^N(J) = {\mathcal M}(J).
  • (Compactness) For any C^\infty-convergent sequence J_k\to J and any fixed N \in {\mathbb N}, every sequence u_k \in {\mathcal M}^N(J_k) has a subsequence C^\infty-convergent to an element of {\mathcal M}^N(J).

There are various ways to define {\mathcal M}^N(J) in general, e.g. by imposing uniform C^1-bounds to force compactness and other conditions to prevent the loss of injective points; such details are tangential to the present discussion, so we will omit them. We can now define

{\mathcal J}^N \subset {\mathcal J}

for each N \in {\mathbb N} as the set of all J \in {\mathcal J} for which every curve in {\mathcal M}^N(J) is Fredholm regular. By the Exhaustion property stated above, the countable intersection \bigcap_{N \in {\mathbb N}} {\mathcal J}^N is then precisely the set that we would like to show is comeager. This follows from the next two lemmas.
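In symbols, the reduction reads as follows. Here {\mathcal J}^{\text{reg}} is used, for this display only, to denote the set of all J \in {\mathcal J} such that every curve in {\mathcal M}(J) is Fredholm regular, and “comeager” is understood in the usual sense of containing a countable intersection of open dense sets.

```latex
\[
  {\mathcal J}^{\text{reg}}
  = \Big\{ J \in {\mathcal J} \ \Big|\ \text{every } u \in \bigcup_{N \in {\mathbb N}} {\mathcal M}^N(J) = {\mathcal M}(J) \text{ is Fredholm regular} \Big\}
  = \bigcap_{N \in {\mathbb N}} {\mathcal J}^N .
\]
```

It therefore suffices to show that each individual {\mathcal J}^N is both open and dense in the C^\infty-topology.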

Lemma 4a: For each N \in {\mathbb N}, {\mathcal J}^N is open.

Proof: If not, then there exists a J \in {\mathcal J}^N and a convergent sequence J_k \to J such that J_k \not\in {\mathcal J}^N for all k, meaning there also exists a sequence u_k \in {\mathcal M}^N(J_k) of curves that are not Fredholm regular. The Compactness property then provides a subsequence of u_k convergent to some u \in {\mathcal M}^N(J), and u must be Fredholm regular since J \in {\mathcal J}^N. That is a contradiction, as Fredholm regularity is an open condition.

Now comes the more interesting part.

Lemma 4b: For each N \in {\mathbb N}, {\mathcal J}^N is dense.

Proof: Since the “reference” almost complex structure J_{\text{ref}} in the definition of {\mathcal J}_\varepsilon can be chosen arbitrarily, it will suffice to prove that there exists a sequence J_k \in {\mathcal J}^N converging in the C^\infty-topology to J_{\text{ref}}. By Lemma 3, we can choose some \varepsilon \in \boldsymbol{\mathcal E} with sufficiently rapid decay so that every curve in {\mathcal M}(J_{\text{ref}}) is \varepsilon-regular. Lemma 1 then provides a comeager subset {\mathcal J}_\varepsilon^{\text{reg}} \subset {\mathcal J}_\varepsilon such that every \varepsilon-regular curve in {\mathcal M}(J) is also Fredholm regular for J \in {\mathcal J}_\varepsilon^{\text{reg}}. Comeager subsets are dense, so we can choose a sequence J_k \in {\mathcal J}_\varepsilon^{\text{reg}} that converges in the C_\varepsilon-topology to J_{\text{ref}}, in which case J_k also converges to J_{\text{ref}} in C^\infty. We claim that for each N \in {\mathbb N}, J_k belongs to {\mathcal J}^N for all k sufficiently large. If not, then after passing to a subsequence of J_k, there exists a sequence u_k \in {\mathcal M}^N(J_k) of curves that are not Fredholm regular, and the Compactness property allows us to pass to a further subsequence so that u_k converges to some u \in {\mathcal M}^N(J_{\text{ref}}). In particular, (u_k,J_k) now converges to (u,J_{\text{ref}}) in {\mathcal M}({\mathcal J}_\varepsilon). Since u \in {\mathcal M}(J_{\text{ref}}) is \varepsilon-regular and \varepsilon-regularity is an open condition, it follows that u_k is also \varepsilon-regular for all k sufficiently large, and is therefore Fredholm regular since J_k \in {\mathcal J}_\varepsilon^{\text{reg}}, contradicting our assumptions.
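
For readers who like to see the bookkeeping in one place, the way Lemmas 4a and 4b combine with the Exhaustion property can be summarized as follows. This is only a restatement of the argument above; the symbol {\mathcal J}^{\text{reg}} is shorthand assumed here for the set of all J \in {\mathcal J} such that every curve in {\mathcal M}(J) is Fredholm regular.

```latex
\mathcal{J}^{\text{reg}}
  := \bigl\{ J \in \mathcal{J} \ \big|\ \text{every } u \in \mathcal{M}(J) \text{ is Fredholm regular} \bigr\}
  = \bigcap_{N \in \mathbb{N}} \mathcal{J}^N
  \quad \text{(Exhaustion)},
\qquad
\mathcal{J}^N \subset \mathcal{J} \ \text{open (Lemma 4a) and dense (Lemma 4b) for every } N
\ \Longrightarrow\
\mathcal{J}^{\text{reg}} \ \text{is a countable intersection of open dense sets, hence comeager in } \mathcal{J}.
```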

Final word

Here’s what I want to point out about the proof we’ve just completed: every step where one actually has to do some work (rather than just quoting a big theorem) is carried out in the C^\infty-setting, and I would conjecture that one can get away with this in almost any transversality proof if one approaches it in the right way. One then just needs to know that the collection of all the C_\varepsilon-topologies for all choices of \varepsilon is a good enough “approximation” to the C^\infty-topology, in a sense that is made precise by the Pre-order Lemma.

If you’re curious to see how one might apply this strategy in more general situations, like the one that motivated the Frustrating Question stated above, you’ll find examples in Sections 5.4 and 6 of the newest revision of my super-rigidity paper, which appeared on the arXiv this week.

What one definitely should not do is try to understand what it “means” for a function to be of class C_\varepsilon. This question has no deeply meaningful answer, and one can do considerable harm to one’s own peace-of-mind by thinking about it. I know this from experience.


Ph.D. position available in Berlin

Thomas Walpuski, who recently arrived in my department at the HU Berlin, asked me to pass on the following advert for a Ph.D. studentship in his group:

For those who can’t read the German, this is for prospective Ph.D. students interested in working in the broad area of differential geometry and geometric analysis, with emphasis on manifolds with exceptional holonomy, gauge theory, and pseudoholomorphic curves. I will add to this that with a current total of three permanent professors plus at least 7 postdocs and 4 Ph.D. students working in these areas, new students in our group need not worry about running out of people to talk to.

The application deadline is December 9; any other questions, ask Thomas! (And please pass on this info to any suitable prospective students you may know.)


spinal open books

I’m writing this post in order to fulfill a promise I made to myself.

Many readers of this blog (I’ll just assume it still has readers) are probably at least vaguely aware of a joint paper of mine with Sam Lisi and Jeremy Van Horn-Morris that has been in preparation for, oh… quite some time. For several years, the only publicly available sources we could point to describing the contents of that paper were a blog post by Laura Starkston that was written long before I even decided to create this blog, and this Youtube video of a talk that Jeremy gave at the 2012 Georgia Topology Conference. At the time, we really did believe that the paper would be done soon. But suffice it to say that there is a photograph somewhere of Sam and me working on that paper next to an infant, and that infant is now approaching double digits.

A few years ago, we decided to split the paper in two pieces, and Part 1 finally appeared on the arXiv in October 2018. Unfortunately, Part 1 was the shorter and easier part. I have been consoling myself since then with the knowledge that if my incomplete paper was becoming something of a fiasco, it was at least a smaller fiasco than the notorious Berlin-Brandenburg International Airport (BER), which remains unfinished since its official opening in June 2012 (meant to coincide with the closure of Berlin’s two existing airports) was postponed on a few weeks’ notice, making it into the laughing stock of all German construction projects. My new goal became: we will finish the spinal open book paper before BER opens.

Well, BER opens tomorrow, pandemic notwithstanding, and I suspect that if that were going to change this time, I’d have heard about it by now. Part 2 of the spinal open book paper is also done, and will be announced on the arXiv Monday morning. Let this blog post serve as proof for all posterity that it really did get finished before the opening of BER.


Postdoc position in symplectic topology available in Berlin for 3 years starting in 2020 (Autumn or sooner)

Once again this year, there is a 3-year postdoc position available in my research group at the HU Berlin: see the announcement on or the (legally binding) German version on the HU website.

The starting date is planned for Autumn 2020 but can be sooner if desired by the candidate. This is a non-teaching position, so no knowledge of German is required, though some voluntary teaching is possible, even in English (for upper-level courses). Feel free to send me an e-mail if you have any questions!

* My standard practical note for those unfamiliar with the vagaries of public employment in Germany: the precise salaries for such positions are determined by a complicated formula that depends on various details, including your family circumstances, but in theory you can compute them more exactly (and also the after-tax version) using this salary calculator. The main thing you need to know is that this is a full-time position in Entgeltgruppe E13 of the TV-L. Please don’t ask me about Zusatzversorgung or Lohnsteuerklassen… if you want to know what these things mean, your best bet is to find an actual German, which, as you probably know, I am not.


Book on intersection theory (plus some developments in higher dimensions)

Trigger warning: In this post, I am not going to say anything about transversality.

Actually, I want to advertise a new book about intersection theory in 3-dimensional contact topology, but before I do that, I need to mention two “recent” developments in higher-dimensional contact topology that I am very excited about.

Contact geometry in dimensions five and higher

(1) This is not so recent anymore, but since it’s one of the topics I used to talk about a lot on this blog, it would be criminal of me not to mention that the symplectic capping problem has now been solved, in parallel work by Conway-Etnyre and Lazarev. As I discussed in one of the earliest posts on this blog, this is not unexpected, but the fact that it took so long for such a proof to appear makes it a major development. Both proofs are (as far as I can tell) fairly similar and not so hard to comprehend, but they are very much a part of the ongoing revolution that was triggered by Borman-Eliashberg-Murphy’s introduction of overtwistedness in higher dimensions, along with the criteria for overtwistedness subsequently established by Casals-Murphy-Presas. A nontrivial role is also played by Bowden-Crowley-Stipsicz’s topological work on Stein cobordisms via bordism theory. All of these results were published within the last five years, so I think it’s fair to say that the existence of symplectic caps is a fairly deep fact, despite having been expected for a long time. (Note: Lazarev’s paper also answers some questions that I asked on this blog a while back.)

(2) My former student Agustin Moreno has been doing some interesting work lately with Bowden and Gironella, resulting in a new preprint on Bourgeois contact structures that hit the arXiv today. Bourgeois proved in 2002 that for any given contact manifold (M,\xi), there exists a contact structure on M \times {\mathbb T}^2 that is determined by a choice of supporting open book for (M,\xi); this was one of the prototypical results suggesting the conjecture (proved in the meantime by BEM) that all manifolds with almost contact structures should also admit contact structures. The new paper by Bowden-Gironella-Moreno shows that in contrast to the overtwisted (and therefore flexible) contact structures produced by BEM, Bourgeois’s contact structures on M \times {\mathbb T}^2 are more rigid, and this is true independently of the choice of contact structure \xi on M. For example, they prove that for any contact 3-manifold (M,\xi), Bourgeois’s contact structure on M \times {\mathbb T}^2 will be tight. There are also some surprising results about symplectic fillings of such contact structures, including a theorem that for every n \ge 2, all symplectically aspherical strong fillings of the unit cotangent bundle of {\mathbb T}^n are diffeomorphic, making this in some ways a natural successor to my paper that proved the uniqueness of fillings of {\mathbb T}^3.

Each of those topics probably deserves a post of its own at some point… maybe I will find some time for that now that my daughter has started preschool.

…and dimension three

But I did actually want to say something today about intersection theory. You are probably aware that the intersection theory of holomorphic curves plays an important role in 4-dimensional symplectic topology, and you may also be aware that an extension of this theory for punctured holomorphic curves in the setting of symplectic field theory exists, and has interesting applications for contact 3-manifolds (e.g. the aforementioned classification of fillings of {\mathbb T}^3). If you’re like most people I know, you are also afraid to read Siefring’s original papers on this subject, which are, well… long. (Also well written, I should add, though perhaps not as user friendly as one might hope.)

I have been on something of a crusade[1] for several years to popularize this intersection theory, and the newest product of that crusade is a book to be published by Cambridge University Press, the latest draft of which has just been updated on the arXiv:

Contact 3-manifolds, holomorphic curves and intersection theory, arXiv:1706.05540

For anyone who already knows standard holomorphic curve theory and wants to learn the facts of Siefring’s intersection theory as efficiently as possible, my recommendation is to turn directly to Appendix C: this is meant as a quick reference guide that states the essential facts as concisely as possible, and I have already gotten into the habit of consulting it myself on a regular basis for various formulas that I sometimes need to use in my papers. If you also want to know why these concisely stated facts are true, you will find them explained in Lectures 3 and 4, though without the analytically intensive proofs of the relative asymptotic formulas that form the basis of the theory. (I suspect that most readers will consider that a feature rather than a bug.)

The book focuses on topological rather than analytical issues, and the main portion of it was written with a student audience in mind, so the amount of background it assumes in symplectic and contact topology is fairly light. Most of the necessary facts from holomorphic curve theory are summarized concisely without proofs in an appendix.

It does also include one thing for readers who specifically enjoy analysis: Appendix B contains a mostly self-contained proof of local positivity of intersections. When I say “self-contained,” I mean that instead of quoting analytically deep results from the late 1980’s by McDuff or Micallef-White, the appendix gives a complete proof of a “weak version” of the Micallef-White theorem to describe the local structure of critical points of holomorphic curves — this is something that one could equally well describe as a “non-asymptotic” variant of Siefring’s relative asymptotic formulas.[2] To understand the proof, you need to be comfortable with distributions and Sobolev spaces, and you need to be able to follow some of the standard arguments of elliptic regularity theory (e.g. the use of difference quotients and the Banach-Alaoglu theorem), but “self-contained” also means that I’ve avoided relying on certain (standard but…) difficult things like the Calderón-Zygmund inequality. This is possible due to some arguments explained to me by Jean-Claude Sikorav, which I’ve written about in previous posts.

You can read the book for free on the arXiv, but it should also be appearing in print sometime in 2020, so if you like it, please buy it! (Yes, that’s right, I get royalties… not much, but something. If you like, think of it as your modest contribution toward my daughter’s bilingual preschool tuition fees.)

[1] My own personal viewpoint on my career path includes the observation that I benefited early on from being one of at most five people to have read and understood Siefring’s thesis. This made it possible for me to pick a certain amount of low-hanging fruit that no one else at the time perceived as low hanging.

[2] The Micallef-White theorem says that in well-chosen coordinates near any critical point of a J-holomorphic curve, the curve looks like a holomorphic polynomial. This makes it possible to prove that if a critical point is present, then any immersed perturbation of the curve will have a well-defined and strictly positive count of double points in a neighborhood of the original critical point. But one doesn’t need the full Micallef-White theorem to prove the latter — it suffices to have a formula presenting the difference between two intersecting curves as a holomorphic polynomial plus a remainder term, and this is what the “weak version” I’m referring to does. The idea to prove positivity of intersections this way is something I originally learned from Hofer, and it is based on the same intuition as Siefring’s Ph.D. thesis which first introduced the punctured intersection theory. For full disclosure, I should mention that the “weak version” of Micallef-White also appears in Chapter 2 of my perpetually unfinished lecture notes on holomorphic curves, but the proof given there has some flaws and will need to be rewritten in a future revision.


Why Petri’s condition is generic

Part 1: Where it all went wrong

I would like to state a lemma, but it comes with a major caveat: the lemma is false. I guess this means that it’s “not a lemma in the sense of mathematics,” so perhaps I should call it something else, like… an emma? No, let’s call it a lemming.

(Lemmings, as you may have heard, do not generally jump off cliffs. But Lemming 1 did.)

Lemming 1. Suppose \mathbf{D} : \Gamma(E) \to \Gamma(F) is a real-linear Cauchy-Riemann type operator over a Riemann surface \Sigma, such that the bundle map E \to F defined by the complex-antilinear part of \mathbf{D} is invertible on some open subset {\mathcal U} \subset \Sigma. Then \mathbf{D} satisfies Petri’s condition on {\mathcal U}.

Recall from the previous post: the words “satisfies Petri’s condition on {\mathcal U}” mean that the natural map

\Pi_{\mathcal U} : \ker \mathbf{D} \otimes \ker\mathbf{D}^* \to \Gamma(E \otimes F|_{\mathcal U})

is injective, where \Pi_{\mathcal U} sends each \eta \otimes \xi \in \ker\mathbf{D} \otimes \ker\mathbf{D}^* to the section \Pi_{\mathcal U}(\eta \otimes \xi)(z) := \eta(z) \otimes \xi(z) of E \otimes F restricted to {\mathcal U}. Put another way, this means that if we fix bases \eta_1,\ldots,\eta_m \in \ker\mathbf{D} and \xi_1,\ldots,\xi_n \in \ker\mathbf{D}^*, then for every nontrivial set of real numbers \Psi_{ij} \in {\mathbb R}, the section

\sum_{i,j} \Psi_{ij} \eta_i \otimes \xi_j \in \Gamma(E \otimes F)

is guaranteed to be nonzero somewhere in {\mathcal U}. I tried to explain in the previous post why this is something you would want to be true if you are studying equivariant transversality problems. And given what we know about unique continuation for Cauchy-Riemann type equations, it certainly looks true on first glance. But as you might gather from the hypothesis in Lemming 1, reality is more complicated. I should emphasize at this point that even though the bundles E and F come with complex structures, the operators \mathbf{D} and \mathbf{D}^* are in general not complex linear, so all tensor products in this discussion must be understood to be real tensor products, even if \ker\mathbf{D} and \ker\mathbf{D}^* happen in some cases to be complex vector spaces. That makes the following an example in which Petri’s condition fails: take E to be the trivial line bundle over {\mathbb D} \subset {\mathbb C}, and \mathbf{D} = \bar{\partial} := \frac{\partial}{\partial \bar{z}} with formal adjoint \mathbf{D}^* = -\partial := -\frac{\partial}{\partial z}. Then

1 \otimes i\bar{z} - i \otimes \bar{z} - z \otimes i + iz \otimes 1 \in \ker\mathbf{D} \otimes \ker\mathbf{D}^*

is a nontrivial element in the kernel of \Pi_{\mathbb D} : \ker\mathbf{D} \otimes \ker\mathbf{D}^* \to \Gamma(E \otimes F). We see of course that this would not be a nontrivial element in the complex tensor product, and in fact: it is not hard to show that for complex-linear Cauchy-Riemann type operators, the complex analogue of Petri’s condition (using complex tensor products) always holds. This is essentially a consequence of unique continuation, together with the fact that, in local coordinates, all \eta \in \ker\mathbf{D} are power series in z while \xi \in \ker\mathbf{D}^* are power series in \bar{z}.
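
The vanishing of this element under \Pi_{\mathbb D} is easy to check by machine. The following Python sketch is an illustration only, not part of the original argument: it writes elements of the fiber {\mathbb C} \otimes_{\mathbb R} {\mathbb C} in the real basis 1 \otimes 1, 1 \otimes i, i \otimes 1, i \otimes i and evaluates the four terms at a given point. The helper names `real_tensor` and `petri_value` are made up for this example.

```python
def real_tensor(a: complex, b: complex):
    """Coordinates of a (x)_R b in the real basis
    1(x)1, 1(x)i, i(x)1, i(x)i of C (x)_R C."""
    return (a.real * b.real, a.real * b.imag,
            a.imag * b.real, a.imag * b.imag)


def petri_value(z: complex):
    """Pointwise value at z of
    1 (x) i*zbar  -  i (x) zbar  -  z (x) i  +  i*z (x) 1,
    where the left factors are holomorphic (in ker dbar) and the
    right factors are antiholomorphic (in ker of the adjoint)."""
    zb = z.conjugate()
    terms = [
        (+1, 1 + 0j, 1j * zb),
        (-1, 1j, zb),
        (-1, z, 1j),
        (+1, 1j * z, 1 + 0j),
    ]
    total = [0.0, 0.0, 0.0, 0.0]
    for sign, eta, xi in terms:
        for n, coord in enumerate(real_tensor(eta, xi)):
            total[n] += sign * coord
    return total


# Sample points in the unit disk; all four real coordinates should vanish.
sample_points = [0j, 0.5 + 0.25j, -0.3 + 0.7j, 0.9j]
values = [petri_value(z) for z in sample_points]
```

The element itself is nonzero in the real tensor product because 1, i, z, iz are linearly independent over {\mathbb R} (and likewise 1, i, \bar{z}, i\bar{z}), while its pointwise values computed above are zero.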

This fact about the complex-linear case was what originally misled me into believing that Lemming 1 should be true. The hypothesis of an invertible antilinear part ensures that all real-linearly independent sets in \ker\mathbf{D} or \ker\mathbf{D}^* are also complex-linearly independent, so that one might realistically hope for properties of \ker \mathbf{D} \otimes_{\mathbb C} \ker\mathbf{D}^* to carry over to \ker\mathbf{D} \otimes_{\mathbb R} \ker\mathbf{D}^*. After developing that intuition and then sitting down to work out the details, I was a bit too quick to believe I had succeeded, leading to the “proof” (or is it a “roof”, or a “proo”?) that appeared in 2016 and had to be withdrawn from the arXiv almost two years later. In reality, I had underestimated the difficulty of this detail, and at least part of my intuition on why it should work out was wrong.

I’ll show you an actual counterexample to Lemming 1 at the end of this post, but first I want to stop talking about things that are false, and say some more about what is actually true.

Part 2: What is right

The point of Lemming 1 was never supposed to be that there is something special about Cauchy-Riemann type operators with invertible antilinear part. The intended point was that Petri’s condition is generic: every Cauchy-Riemann type operator can be perturbed to make its antilinear part somewhere invertible. If Lemming 1 were true, then it would be a short step from there to proving that for generic almost complex structures J, every J-holomorphic curve has a normal Cauchy-Riemann operator that satisfies Petri’s condition… thus making equivariant transversality feasible.

It would therefore suffice if Lemming 1 is replaced with a statement that is less specific about the desirable class of Cauchy-Riemann operators, but still says that they are generic. Something like this:

Lemma 2. For any Cauchy-Riemann type operator \mathbf{D} : \Gamma(E) \to \Gamma(F) on a Riemann surface \Sigma and a fixed open subset {\mathcal U} \subset \Sigma with compact closure, there is a Baire subset {\mathcal A}^{\text{reg}}({\mathcal U}) in the space {\mathcal A}({\mathcal U}) of all smooth linear bundle maps A : E \to F supported in {\mathcal U}, consisting of perturbations A such that \mathbf{D}_A := \mathbf{D} + A satisfies Petri’s condition on {\mathcal U}.

This is essentially what Corollary 5.9 in the new version of my paper on super-rigidity says, and it is again a short step from there to proving that all normal Cauchy-Riemann operators of J-holomorphic curves satisfy Petri’s condition for generic J.

Part of the intuition here is that when you look at examples of operators for which Petri’s condition fails, the counterexamples look very special: the condition \Pi_{\mathcal U}(\sum_i \eta_i \otimes \xi_i) = 0 translates into nontrivial pointwise linear dependence relations among some linearly independent local solutions \eta_i \in \ker\mathbf{D} and \xi_i \in \ker\mathbf{D}^* over an open set, and it would seem surprising somehow for generic operators to admit such relations. Unique continuation also still plays a role, as must be expected since, if there were a nontrivial local solution to \mathbf{D}\eta = 0 that vanishes on some open set {\mathcal U} \subset \Sigma, then one could pair it with any nontrivial \xi \in \ker\mathbf{D}^* and call \eta \otimes \xi an easy counterexample to Petri’s condition. But on balance, I understand Lemma 2 mainly as a genericity result—unique continuation is still an important ingredient in the proof, but the main tool is actually Sard’s theorem.

Part 3: How do you prove a “local genericity result” anyway?

I don’t mind admitting that I was quite puzzled for a while as to how one might go about proving Lemma 2. In the first place, it isn’t immediately clear whether it should be understood analytically as a global or a local result. Calling it “global” in this case would mean that it depends on the global setup of the operator and, in all likelihood, makes use of the fact that \mathbf{D} and \mathbf{D}^* are Fredholm. That sounds good at first, because it seems much more likely for the “Petri map” \Pi_{\mathcal U} : \ker\mathbf{D} \otimes \ker\mathbf{D}^* \to \Gamma(E \otimes F|_{\mathcal U}) to be injective if its domain is finite dimensional. But the problem starts to seem a lot dicier if you imagine what happens to this domain under perturbations: \ker\mathbf{D} and \ker\mathbf{D}^* do not depend continuously on \mathbf{D} in a straightforward way, as their dimensions can jump suddenly downward. One cannot therefore just set up some kind of “universal moduli space”

U := \left\{ (A,t)\ \big|\ A \in {\mathcal A}({\mathcal U}),\ t \in \ker \Pi_{\mathcal U} \subset \ker \mathbf{D}_A \otimes \ker \mathbf{D}_A^* \right\}

and try to apply the Sard-Smale theorem to the obvious projection U \to {\mathcal A}({\mathcal U}) : (A,t) \mapsto A, because U does not closely resemble anything that could reasonably be called a manifold.

The second problem with viewing Lemma 2 globally is that since we want a result that applies to multiply covered holomorphic curves, we would also need a version of the lemma that considers operators \mathbf{D} which are equivariant under the action of some finite symmetry group G, so that the perturbations A are also required to be G-invariant. This makes the problem vulnerable to the same difficulty that this whole endeavor was designed to overcome: transversality and symmetry are not generally compatible with each other. One of the selling points of Lemming 1 had always been that since the condition required on the perturbation was fundamentally local, proving it for linearized Cauchy-Riemann operators along simple curves would immediately imply the same result for all multiple covers of those curves.

All this makes a pretty convincing argument for taking a local approach to Lemma 2: we should not assume any condition (such as compactness) on \Sigma, nor should we assume that \mathbf{D} is Fredholm… whatever can be proven should be provable by considering small zeroth-order perturbations of the standard Cauchy-Riemann operator \bar{\partial} = \partial_s + i\partial_t : C^\infty({\mathbb D},{\mathbb C}^m) \to C^\infty({\mathbb D},{\mathbb C}^m). This idea does not have the two drawbacks mentioned above—in particular, it is a standard result of local elliptic regularity theory that the infinite-dimensional space \ker \mathbf{D} \subset C^\infty({\mathbb D},{\mathbb C}^m) does vary smoothly with the operator \mathbf{D} in suitable functional-analytic settings. But now there is a new problem: nothing in the setup is Fredholm, and there is no Sard’s theorem for non-Fredholm maps between infinite-dimensional manifolds.

There does exist a local approach that doesn’t have this last drawback: one can consider the problem on jet spaces of sections at a point. In this way, everything becomes finite dimensional, and no actual functional analysis is needed.

Part 4: The jet space approach

I will now describe the setup for proving Lemma 2. I’ll focus specifically on Cauchy-Riemann operators, but it’s interesting to note that a large portion of the discussion makes sense for much more general classes of differential operators, for which one might conceivably be interested in studying equivariant transversality (see e.g. the preprint by Doan and Walpuski on this subject).

We are given a Riemann surface \Sigma and complex vector bundle E \to \Sigma, giving rise to the bundle F := \overline{\text{Hom}}_{\mathbb C}(T\Sigma,E) and the affine space of real-linear Cauchy-Riemann type operators {\mathcal CR}_{\mathbb R}(E), which map \Gamma(E) to \Gamma(F). Fix a point p \in \Sigma and let J^k_p E denote the vector space of k-jets of sections of E at p. Each \mathbf{D} \in {\mathcal CR}_{\mathbb R}(E) then descends to a linear map

\mathbf{D} : J^k_p E \to J^{k-1}_p F

for every k \in {\mathbb N}, and usefully, this map is always surjective. The latter can be deduced from standard local existence results for solutions to the equation \bar{\partial} u = f, but in the jet space context, it’s actually much easier than that: first, one can easily just write down a right-inverse for the operator \bar{\partial} : J^k_p E \to J^{k-1}_p F. The general case is then a consequence of the fact that surjectivity is an open condition, using the following observation:

Rescaling principle: Every Cauchy-Riemann type operator \mathbf{D} : J^k_p E \to J^{k-1}_p F is equivalent (via choices of local coordinates and trivializations near p) to an arbitrarily small perturbation of the standard operator \bar{\partial}.
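
To make the remark about writing down a right-inverse concrete, here is a minimal Python sketch under simplifying assumptions: sections are scalar-valued (a trivial line bundle in a fixed coordinate and trivialization), jets are stored as Taylor polynomials in z and \bar{z}, and \bar{\partial} is normalized as \partial/\partial\bar{z}. The particular right-inverse shown, z^a \bar{z}^b \mapsto z^a \bar{z}^{b+1}/(b+1), is one obvious choice and not necessarily the one used in the paper; the function names are hypothetical.

```python
from fractions import Fraction

# A polynomial in (z, zbar) is a dict {(a, b): coeff} meaning
# sum of coeff * z^a * zbar^b.  A k-jet is such a polynomial of degree <= k.


def degree(poly):
    """Total degree of a polynomial (-1 for the zero polynomial)."""
    degs = [a + b for (a, b), c in poly.items() if c != 0]
    return max(degs) if degs else -1


def dbar(poly):
    """d/d(zbar): z^a zbar^b -> b * z^a zbar^(b-1); maps k-jets to (k-1)-jets."""
    out = {}
    for (a, b), c in poly.items():
        if b > 0:
            out[(a, b - 1)] = out.get((a, b - 1), 0) + b * c
    return out


def right_inverse(poly):
    """z^a zbar^b -> z^a zbar^(b+1) / (b+1); maps (k-1)-jets to k-jets."""
    return {(a, b + 1): Fraction(c) / (b + 1) for (a, b), c in poly.items()}


# Example with k = 4: f is a 3-jet, g = right_inverse(f) is a 4-jet with dbar(g) = f.
f = {(0, 0): 1, (2, 1): 5, (0, 3): -2}
g = right_inverse(f)
```

Since `dbar(right_inverse(f))` returns `f` for every polynomial `f`, the induced map on jets is surjective, which is the statement needed for the standard operator before invoking the rescaling principle.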

Let {\mathcal CR}^k_{\mathbb R}(E) denote the space of linear maps J^k_p E \to J^{k-1}_p F that are induced by operators in {\mathcal CR}_{\mathbb R}(E). Since the (k-1)-jet of a section \mathbf{D}_A \eta = \mathbf{D}\eta + A\eta depends on the zeroth-order perturbation A only up to its (k-1)-jet, {\mathcal CR}^k_{\mathbb R}(E) is an affine space over the finite-dimensional vector space J^{k-1}_p\text{Hom}(E,F). We can now consider the k-jet Petri map

\Pi^k : J^k_p E \otimes J^k_p F \to J^k_p(E \otimes F),

defined by letting the natural map \Pi : \Gamma(E) \otimes \Gamma(F) \to \Gamma(E \otimes F) descend to quotient spaces. We will be interested particularly in the restriction of \Pi^k to the subspace \ker\mathbf{D} \otimes \ker\mathbf{D}^* \subset J^k_p E \otimes J^k_p F for each \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E). There is a trivial reason why this map will never actually be injective: if \eta \in J^k_p E vanishes to order q-1 \le k and \xi \in J^k_p F vanishes to order r-1 \le k with q+r > k, then their product vanishes to order at least k and is thus trivial in J^k_p(E \otimes F). The fancy way to say this is that jet spaces carry natural filtrations,

J^k_p E = (J^k_p E)^0 \supset (J^k_p E)^1 \supset \ldots \supset (J^k_p E)^k \supset (J^k_p E)^{k+1} = \{0\},

where we can identify k-jets with Taylor polynomials in coordinates to define (J^k_p E)^\ell as the space of Taylor polynomials that are O(|z|^\ell). Under the natural tensor product filtration that J^k_p E \otimes J^k_p F inherits from the filtrations of J^k_p E and J^k_p F, the Petri map \Pi^k preserves filtrations and thus vanishes on (J^k_p E \otimes J^k_p F)^{k+1}. This observation motivates considering for each \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E) and each k,\ell \in {\mathbb N} with 0 \le \ell \le k+1 the space

{\mathcal M}^k_\ell(\mathbf{D}) := \left\{ t \in \ker\mathbf{D} \otimes \ker\mathbf{D}^*\ \big|\ \Pi^k(t) = 0 \text{ and } t \not\in (J^k_p E \otimes J^k_p F)^\ell \right\}.

Notice how we’ve just quietly reinserted unique continuation into this discussion. If we can find sequences \ell_n,k_n \to \infty such that {\mathcal M}^{k_n}_{\ell_n}(\mathbf{D}) = \emptyset for a given operator \mathbf{D} \in {\mathcal CR}_{\mathbb R}(E), then we’ve proven that the only possible counterexamples to Petri’s condition for \mathbf{D} are nontrivial elements t \in \ker\mathbf{D} \otimes \ker\mathbf{D}^* that vanish to all orders at the point p. One can easily deduce from unique continuation that there are no such elements, so this would imply Petri’s condition.
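
The trivial kernel of \Pi^k described above is easy to see in a toy computation. The sketch below is an illustration under simplifying assumptions: jets are scalar-valued Taylor polynomials in z and \bar{z}, and the ordinary product of polynomials stands in for the pointwise tensor product, which is enough to observe the vanishing order. All names are hypothetical.

```python
# A polynomial in (z, zbar) is a dict {(a, b): coeff} for coeff * z^a * zbar^b.


def vanishing_degree(poly):
    """Lowest total degree of a nonzero term (None for the zero polynomial)."""
    degs = [a + b for (a, b), c in poly.items() if c != 0]
    return min(degs) if degs else None


def multiply_truncated(p, q, k):
    """Product of p and q as a k-jet: terms of total degree > k are dropped."""
    out = {}
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            key = (a1 + a2, b1 + b2)
            if key[0] + key[1] <= k:
                out[key] = out.get(key, 0) + c1 * c2
    return {key: c for key, c in out.items() if c != 0}


k = 4
eta = {(2, 0): 1, (3, 0): 1}   # z^2 + z^3, holomorphic, is O(|z|^2)
xi = {(0, 3): 1}               # zbar^3, antiholomorphic, is O(|z|^3)

# 2 + 3 > 4, so the product is invisible in the 4-jet.
trivial_product = multiply_truncated(eta, xi, k)

# With eta replaced by z, which is only O(|z|^1), we have 1 + 3 <= 4
# and the product survives in the 4-jet.
visible_product = multiply_truncated({(1, 0): 1}, xi, k)
```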

What I’m about to say will sound like bad news, but it leads to something good. One can easily compute the expected dimension of {\mathcal M}^k_\ell(\mathbf{D}) via a dimension count. My initial naive hope had been that this expected dimension would turn out to be negative, perhaps after choosing k sufficiently large, and one could then argue via Sard’s theorem that {\mathcal M}^k_\ell(\mathbf{D}) is empty for almost every \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E). But the expected dimension isn’t negative. In fact, for all choices of k, \ell and \mathbf{D}, {\mathcal M}^k_\ell(\mathbf{D}) turns out to be a nonempty open subset in a nontrivial vector space that depends smoothly on \mathbf{D}. There are good geometric reasons for this, which I can happily explain to anyone who’s curious, but I won’t get into them here; the point for now is just that the naive idea doesn’t work.

You get some interesting insight, however, if you then try to imagine (as I did) how the nonemptiness of {\mathcal M}^k_\ell(\mathbf{D}) might be used to disprove Lemma 2, e.g. to show that Petri’s condition fails for every Cauchy-Riemann operator. The nontriviality of every {\mathcal M}^k_\ell(\mathbf{D}) means that one can associate to every Cauchy-Riemann operator \mathbf{D} and integer \ell \in {\mathbb N} a sequence of tensor products of sections

\displaystyle t_k = \sum_{i=1}^{r_k} \eta_{k,i} \otimes \xi_{k,i} \in \Gamma(E) \otimes \Gamma(F)

such that for each k, t_k does not vanish to order \ell at p, but \mathbf{D}\eta_{k,i} and \mathbf{D}^*\xi_{k,i} vanish to order k and the Petri map takes t_k to a section of E \otimes F that also vanishes to order k at p. It is very far from obvious whether t_k can be made to converge to something as k \to \infty, though if it does, then it would be reasonable to expect that the limit is the infinity-jet of a counterexample to Petri’s condition. One of the big reasons why convergence is unclear is that the numbers r_k may be unbounded. One can rephrase this as follows: given two vector spaces V and W, say that an element t \in V \otimes W has rank r  if one can write t = \sum_{i=1}^r v_i \otimes w_i for two linearly-independent sets v_1,\ldots,v_r \in V and w_1,\ldots,w_r \in W. It is built into the definition of a tensor product of vector spaces that every element in it has finite rank. This is no longer true if one wishes to define a tensor product of infinite-dimensional Hilbert spaces—in that context, one needs to enlarge the algebraic tensor product to an analytical completion that includes elements of infinite rank. I find it conceivable that Petri’s condition really will fail at the local level for all Cauchy-Riemann operators if one replaces \Gamma(E) \otimes \Gamma(F) with a Hilbert space tensor product of local sections. But that is not what we are doing; the sequence t_k described above only has any chance of converging to a counterexample if the rank of t_k stays bounded.
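
In finite dimensions the rank just defined is something one can compute: after choosing bases v_1,\ldots,v_a of V and w_1,\ldots,w_b of W and writing t = \sum_{i,j} \Psi_{ij} v_i \otimes w_j, the rank of t is the rank of the coefficient matrix (\Psi_{ij}). The following Python sketch illustrates this with exact rational arithmetic; the function name `matrix_rank` and the examples are chosen for this illustration only.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a pivot in this column at or below the current row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


# v1(x)w1 + v2(x)w2 has rank 2.
rank_two = [[1, 0],
            [0, 1]]

# v1(x)w1 + v1(x)w2 + v2(x)w1 + v2(x)w2 = (v1+v2)(x)(w1+w2) has rank 1.
rank_one = [[1, 1],
            [1, 1]]

# The element 1(x)i*zbar - i(x)zbar - z(x)i + i*z(x)1 from Part 1, written in
# the real bases (1, i, z, iz) and (1, i, zbar, i*zbar).
part1_example = [[0, 0, 0, 1],
                 [0, 0, -1, 0],
                 [0, -1, 0, 0],
                 [1, 0, 0, 0]]
```

In particular, the element from Part 1 on which the Petri map for \bar{\partial} vanishes has rank 4 in this sense.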

With this in mind, let’s modify our definition of {\mathcal M^k_\ell}(\mathbf{D}): define for each r  \in {\mathbb N} the space

{\mathcal M}^k_{r,\ell}(\mathbf{D}) := \left\{ t \in \ker\mathbf{D} \otimes \ker\mathbf{D}^*\ \big|\ \Pi^k(t) = 0,\ \text{rank}(t) = r \text{ and } t \not\in (J^k_p E \otimes J^k_p F)^\ell \right\}

One should view this as a subset of

{\mathcal V}^k_{r,\ell}(\mathbf{D}) := \left\{ t \in \ker\mathbf{D} \otimes \ker\mathbf{D}^*\ \big|\ \text{rank}(t) = r \text{ and } t \not\in (J^k_p E \otimes J^k_p F)^\ell \right\},

which is a smooth submanifold of the vector space \ker\mathbf{D} \otimes \ker\mathbf{D}^* \subset J^k_p E \otimes J^k_p F, for the same reason that the space of matrices of a fixed rank is a submanifold in the space of all matrices. Its codimension depends on r and produces a general formula for the dimension of {\mathcal V}^k_{r,\ell}(\mathbf{D}) that grows linearly with k. On the other hand, the extra condition \Pi^k(t) = 0 \in J^k_p(E \otimes F) cuts out a subset whose expected codimension is the dimension of J^k_p(E \otimes F); that is the number of distinct Taylor polynomials up to degree k in z and \bar{z} with values in a fiber of E \otimes F, and it grows quadratically with k. As a result, the expected dimension of {\mathcal M}^k_{r,\ell}(\mathbf{D}) becomes negative as soon as k is sufficiently large. This is, in my opinion, the main reason why you should believe that Lemma 2 is true. It now becomes a consequence of the following more technical statement:

Lemma 3. For every r,\ell \in {\mathbb N}, there exists k_0 \in {\mathbb N} such that {\mathcal M}^k_{r,\ell}(\mathbf{D}) = \emptyset for all k \ge k_0 and almost every \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E).
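To get a feeling for the dimension count behind Lemma 3, here is a minimal sketch in a model case. It assumes (a simplification made here for illustration, not a formula quoted from the paper) that \mathbf{D} = \bar{\partial} on a trivial bundle of complex rank m, so that the k-jets of elements of \ker\mathbf{D} and \ker\mathbf{D}^* are polynomials of degree at most k in z and \bar{z} respectively, each forming a real vector space of dimension 2m(k+1). The function names are hypothetical.

```python
def jet_dim(m, k):
    """Real dimension of J^k_p(E ⊗ F) when E and F have complex rank m over a surface."""
    monomials = (k + 1) * (k + 2) // 2   # monomials z^a zbar^b with a + b <= k
    return (2 * m) ** 2 * monomials      # the fiber of E ⊗_R F has real dimension (2m)^2


def rank_locus_dim(r, a, b):
    """Dimension of the set of rank-r elements in V ⊗ W, where dim V = a and dim W = b."""
    if r > min(a, b):
        raise ValueError("rank cannot exceed min(dim V, dim W)")
    return r * (a + b - r)               # same count as for a-by-b matrices of rank r


def expected_dim(r, m, k):
    """Expected dimension of M^k_{r,l}(D) in the model case (ASSUMPTION: D = dbar)."""
    kernel = 2 * m * (k + 1)             # assumed real dimension of the k-jets in ker D and ker D^*
    return rank_locus_dim(r, kernel, kernel) - jet_dim(m, k)
```

Tabulating expected_dim(4, 1, k) for increasing k shows the linear growth of the rank locus losing to the quadratic growth of the jet space. The open condition involving \ell does not affect the count.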

I’ll add just a few comments about the proof of this lemma. To set it up for Sard’s theorem, one needs to consider a “universal” version of the space {\mathcal M}^k_{r,\ell}(\mathbf{D}), namely

{\mathcal M}^k_{r,\ell} := \left\{ (\mathbf{D},t)\ \big|\ \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E) \text{ and } t \in {\mathcal M}^k_{r,\ell}(\mathbf{D}) \right\},

which one might hope should be a smooth submanifold of the manifold

{\mathcal V}^k_{r,\ell} := \left\{ (\mathbf{D},t)\ \big|\ \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E) \text{ and } t \in {\mathcal V}^k_{r,\ell}(\mathbf{D}) \right\},

e.g. because the smooth map {\mathcal V}^k_{r,\ell} \to J^k_p(E \otimes F) : (\mathbf{D},t) \mapsto \Pi^k(t) is transverse to zero. That seems to be not quite true in general, but what can be proved is close enough to that statement that it gives the desired result: namely, one can show that the linearization of (\mathbf{D},t) \mapsto \Pi^k(t) with respect to changes in \mathbf{D} has its rank bounded below by some quadratic function of k. As a measure of plausibility for this claim, notice that since the space of perturbations {\mathcal CR}^k_{\mathbb R}(E) is an affine space over J^{k-1}_p\text{Hom}(E,F), its dimension is also a quadratic function of k. The bound on the rank does not prove that {\mathcal M}^k_{r,\ell} is a submanifold, but it does prove that it’s something I like to call a C^\infty-subvariety, which has the property that it locally is contained in (locally defined) submanifolds whose codimension is given by the lower bound on the rank. That is enough structure to apply Sard’s theorem and prove, given that the codimension will exceed the dimension of {\mathcal V}^k_{r,\ell}(\mathbf{D}) when k is large enough, that for almost every \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E), {\mathcal M}^k_{r,\ell}(\mathbf{D}) itself is locally contained in submanifolds that have negative dimension, meaning {\mathcal M}^k_{r,\ell}(\mathbf{D}) is empty.

This general picture reduces the proof of Lemma 2 to a linear algebra problem: in principle, one needs to write down the linearization of the map {\mathcal V}^k_{r,\ell} \to J^k_p(E \otimes F) : (\mathbf{D},t) \mapsto \Pi^k(t) with respect to variations in \mathbf{D} \in {\mathcal CR}^k_{\mathbb R}(E) at an arbitrary point (\mathbf{D},t) \in {\mathcal M}^k_{r,\ell}, and find a good lower bound on the rank of this linear map. My final remark about this is that due to the rescaling principle mentioned above, one does not really need to consider arbitrary (\mathbf{D},t) \in {\mathcal M}^k_{r,\ell}; it suffices instead to establish this bound only for the special case \mathbf{D} = \bar{\partial}, for which it is a bit tedious but not very hard in principle to write down \ker\Pi^k \subset \ker\mathbf{D} \otimes \ker\mathbf{D}^* and the linearized map explicitly. Once you’ve done that, the rank bound carries over to an open neighborhood of such pairs in {\mathcal M}^k_{r,\ell}, and since every Cauchy-Riemann operator is (up to choices of coordinates and trivializations) an arbitrarily small perturbation of \bar{\partial}, the result applies to all operators.

Epilogue: The fall of Lemming 1

By this point, no one is reading this post anymore except for the referees of my paper and possibly one or two stalkers, so just for amusement, I might as well tell you how to find a concrete counterexample to Lemming 1. Take E and F to be the trivial line bundle over {\mathbb D} \subset {\mathbb C} and consider the operators

\mathbf{D} := \bar{\partial} + \kappa,    \mathbf{D}^* := -\partial + \kappa^*,

where \kappa : E \to F and \kappa^* : F \to E both denote the real-linear bundle map defined by complex conjugation. This is the simplest Cauchy-Riemann operator with invertible antilinear part that one can possibly write down, but I was stuck for an embarrassingly long time on how to write down precise local solutions to \mathbf{D}\eta = 0 and \mathbf{D}^*\xi = 0. There’s an easy trick for this that will be familiar to anyone who knows about asymptotic formulas for punctured holomorphic curves. In that context, we often have occasion to consider operators of the form \partial_s + i\partial_t + S(t) for functions on a half-cylinder [0,\infty) \times S^1, with S(t) an S^1-family of real-linear transformations on {\mathbb C}^m, and the equation (\partial_s + i\partial_t + S)\eta = 0 then has a special solution of the form

\eta(s,t) = e^{\lambda s} f_\lambda(t)

whenever f_\lambda : S^1 \to {\mathbb C}^m is an eigenfunction of the operator -i\partial_t - S with eigenvalue \lambda. Now, \mathbf{D} = \partial_s + i\partial_t + \kappa can be viewed as such an operator on a half-cylinder, but if we are truly only interested in local solutions, then we can ignore the requirement for the eigenfunction f_\lambda(t) to be periodic in t, which makes arbitrary real numbers possible for the eigenvalue \lambda. Once you’ve thought of this, you can do some calculations and are led sooner or later to write down an example like the following: define local sections \eta_\lambda \in \ker\mathbf{D} and \xi_\lambda \in \ker\mathbf{D}^* for \lambda \in (-1,1) by

\eta_\lambda(s,t) := e^{\lambda s + \sqrt{1 - \lambda^2} t}\left( \sqrt{1-\lambda} + i \sqrt{1 + \lambda} \right),

\xi_\lambda(s,t) := e^{-\lambda s - \sqrt{1 - \lambda^2} t} \left( \sqrt{1 - \lambda} - i \sqrt{1 + \lambda} \right).

If we identify the fibers of E and F with {\mathbb R}^2 so that the fibers of E \otimes F become the space of real 2-by-2 matrices, then feeding \eta_\lambda \otimes \xi_\lambda into the Petri map \Pi : \Gamma(E) \otimes \Gamma(F) \to \Gamma(E \otimes F) gives constant sections,

\Pi(\eta_\lambda \otimes \xi_\lambda)(s,t) = \begin{pmatrix} 1 - \lambda & -\sqrt{1 - \lambda^2} \\ \sqrt{1 - \lambda^2} & -1 - \lambda \end{pmatrix}.

These all take values in the 3-dimensional vector space of matrices of the form \begin{pmatrix} a & b \\ -b & c \end{pmatrix}, thus any four such products must be linearly dependent, and the dependence relation yields counterexamples to Petri’s condition if you choose four distinct numbers for the eigenvalue \lambda \in (-1,1).

Shit happens.
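If you don't trust the algebra, the counterexample is easy to check numerically. The sketch below assumes the conventions \bar{\partial} = \partial_s + i\partial_t and \partial = \partial_s - i\partial_t (no factors of 1/2), which is how I read the formulas above, and it divides out the real exponential factors before testing the two equations. The function names are hypothetical.

```python
import math


def kernel_residuals(lam):
    """Return (|D eta|, |D^* xi|) for eta_lam, xi_lam after dividing by the exponentials."""
    mu = math.sqrt(1 - lam * lam)
    a, b = math.sqrt(1 - lam), math.sqrt(1 + lam)
    eta0, xi0 = complex(a, b), complex(a, -b)
    # (d_s + i d_t) multiplies e^{lam s + mu t} by (lam + i mu); kappa is complex conjugation.
    res_eta = (lam + 1j * mu) * eta0 + eta0.conjugate()
    # -(d_s - i d_t) multiplies e^{-lam s - mu t} by (lam - i mu).
    res_xi = (lam - 1j * mu) * xi0 + xi0.conjugate()
    return abs(res_eta), abs(res_xi)


def petri_matrix(lam):
    """Pi(eta_lam ⊗ xi_lam) as a flattened real 2x2 matrix (the exponentials cancel)."""
    a, b = math.sqrt(1 - lam), math.sqrt(1 + lam)
    eta, xi = (a, b), (a, -b)            # fibers identified with R^2
    return [eta[0] * xi[0], eta[0] * xi[1], eta[1] * xi[0], eta[1] * xi[1]]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for a 4x4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


lambdas = [-0.5, 0.0, 0.3, 0.7]          # any four distinct values in (-1, 1)
products = [petri_matrix(lam) for lam in lambdas]
dependence_defect = abs(det(products))   # vanishes up to rounding: the four products are dependent
```

The determinant vanishes because every row satisfies the linear relation that the two off-diagonal entries sum to zero, which is just another way of saying that the products live in a 3-dimensional space.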

Acknowledgement: A substantial proportion of what I understand about the subject of this post emerged from conversations with Aleksander Doan and Thomas Walpuski.


Super-rigidity is fixed

A new version of the paper Transversality and super-rigidity for multiply covered holomorphic curves has just been uploaded to my homepage, and will be replacing the previous (withdrawn) version on the arXiv within the next couple of days. Here’s the quick update for those who are keeping score but don’t have time for the details: the main theorems remain unchanged, and all gaps in their proofs have been filled.

For those who do have time for the details, my intention in this post is to review what the problem was and clarify why it was essential to fix it—I’ve come to view it as something more interesting and possibly more important than a mere technical difficulty, and I want to explain why. In the sequel I will then explain how the problem has been solved.

I’m not going to assume that everyone has read my series of previous posts on the super-rigidity paper and what went wrong in the proof. The main thing you need to know is this: the goal is to understand, in precise terms, when it is possible or impossible to establish transversality (or related conditions) for multiply covered J-holomorphic curves via the standard method of perturbing the almost complex structure generically. At the linearized level, this becomes an equivariant transversality problem: given a linear Cauchy-Riemann type operator that is invariant under a group action, when can you add generic zeroth-order perturbations to make the operator surjective/injective without breaking the symmetry?

Since I wasn’t the first person to have thought about such issues, I’ve been asked by several colleagues how my approach differs from earlier work by other authors… in particular the three or four previous attempted proofs (later withdrawn) that super-rigidity holds for generic J in Calabi-Yau 3-folds. Many elements in my approach have indeed appeared before: the twisted bundle decomposition for Cauchy-Riemann operators originated in work of Taubes, as did the idea (to be discussed below) of stratifying a moduli space via conditions on kernels and cokernels of Cauchy-Riemann operators. These two ideas later served as the basis for Eftekhary’s partial result on super-rigidity, and I’ve also seen the stratification idea appear in the wall-crossing argument in Ionel and Parker’s paper on the Gopakumar-Vafa formula. The main element in my approach that was not present in any of the others is a result that I used to call quadratic unique continuation, though for reasons that I’ll get into in the next post, I now find that to be a bad choice of words and am instead calling it Petri’s condition (thanks to Aleksander Doan and Thomas Walpuski for the terminology). The technical foundation of my paper is based on a result saying that Petri’s condition can be achieved locally under generic local perturbations of any Cauchy-Riemann type operator. That is the lemma that was wrong in the previous version, and has now been corrected.

Stratification and Petri’s condition

I want to explain a bit why Petri’s condition arises as an essential obstacle to overcome in equivariant transversality problems. This issue is quite general—as demonstrated in recent work by Doan and Walpuski, it pertains to more than just Cauchy-Riemann type operators or holomorphic curves, thus I will try to frame it in the generality that it deserves.

Consider a linear first-order partial differential operator \mathbf{D} : \Gamma(E) \to \Gamma(F) between two vector bundles over a smooth manifold M. We will assume that \mathbf{D} satisfies some nice condition such as ellipticity, so that it will be Fredholm when extended to suitable Banach space settings (which I won’t talk about here) and all local solutions to \mathbf{D}\eta = 0 are smooth. Fix also an open subset {\mathcal U} \subset M with compact closure and let

{\mathcal A}({\mathcal U}) \subset \Gamma(\text{Hom}(E,F))

denote the space of all smooth bundle maps E \to F with support in \mathcal{U}. These define compact perturbations of \mathbf{D} in the relevant Banach space setting, so that the perturbed operator

\mathbf{D}_A := \mathbf{D} + A : \Gamma(E) \to \Gamma(F)

for each A \in {\mathcal A}({\mathcal U}) is also Fredholm. The main idea of the stratification approach is now to consider subsets of the form

{\mathcal A}_{k,\ell}({\mathcal U}) := \left\{ A \in {\mathcal A}({\mathcal U})\ \big|\ \dim\ker(\mathbf{D}_A) = k \text{ and } \dim\text{coker}(\mathbf{D}_A) = \ell \right\} \subset {\mathcal A}({\mathcal U}).

If we are lucky, then this space will be a smooth finite-codimensional submanifold of {\mathcal A}({\mathcal U}), and its codimension in this particular setting should be k\ell. This is analogous to the fact that the space of all linear transformations {\mathbb R}^m \to {\mathbb R}^n of a fixed rank forms a smooth submanifold, and it can be proved in much the same way: one can associate to each A \in {\mathcal A}_{k,\ell}({\mathcal U}) a neighborhood {\mathcal O} \subset {\mathcal A}({\mathcal U}) and a smooth map

\Phi : {\mathcal O} \to \text{Hom}(\ker\mathbf{D}_{A} , \text{coker} \mathbf{D}_{A})

whose zero set is a neighborhood of A in {\mathcal A}_{k,\ell}({\mathcal U}), hence {\mathcal A}_{k,\ell}({\mathcal U}) is indeed a submanifold with the aforementioned codimension if we can arrange for the linearization of \Phi at A to be surjective. (For details on how to define \Phi, see the discussion of walls in the space of Fredholm operators in an earlier post.)

Surjectivity is the subtle part. The linearization in question takes the form

\mathbf{L} := d\Phi(A) : {\mathcal A}({\mathcal U}) \to \text{Hom}(\ker \mathbf{D}_{A} , \text{coker} \mathbf{D}_{A}),

\mathbf{L}(B) \eta := \pi(B\eta),

where \pi denotes the projection from the relevant Banach space of sections of F to the quotient \text{coker} \mathbf{D}_{A}, or equivalently, to the kernel of the formal adjoint operator \mathbf{D}_{A}^* with respect to some fixed choices of geometric data (i.e. bundle metrics and volume forms) on E, F and M. Let us fix such geometric data and denote the resulting L^2-inner product for sections of E or F by \langle\ ,\ \rangle_{L^2}. Choosing bases \eta_1,\ldots,\eta_k \in \ker\mathbf{D}_{A} and \xi_1,\ldots,\xi_\ell \in \ker\mathbf{D}_{A}^*, the difference between \mathbf{L}(B)\eta_i and B\eta_i is L^2-orthogonal to each \xi_j, thus the matrix elements that determine the linear map \mathbf{L}(B) : \ker \mathbf{D}_A \to \ker \mathbf{D}_A^* for each B \in {\mathcal A}({\mathcal U}) are

\langle \mathbf{L}(B) \eta_i , \xi_j \rangle_{L^2} = \langle B\eta_i , \xi_j \rangle_{L^2}.

The map \mathbf{L} then fails to be surjective onto \text{Hom}(\ker \mathbf{D}_A , \ker \mathbf{D}_A^*) if and only if there exists a nontrivial set of constants \Psi_{ij} \in {\mathbb R} that are “orthogonal” to the image of \mathbf{L} in the sense that for all B \in {\mathcal A}({\mathcal U}),

\sum_{i,j} \Psi_{ij} \langle B\eta_i , \xi_j \rangle_{L^2} = \int_{\mathcal U} \langle\cdot,\cdot\rangle_F \circ (B \otimes \text{Id}) \left( \sum_{i,j} \Psi_{ij} \eta_i \otimes \xi_j\right) \, d\text{vol} = 0.

The interesting term in this expression is the summation in parentheses: \sum_{i,j} \Psi_{ij} \eta_i \otimes \xi_j is a section of the tensor product bundle E \otimes F, which we are free to restrict to the subset {\mathcal U} \subset M since the support of B is contained there. In particular, \sum_{i,j} \Psi_{ij} \eta_i \otimes \xi_j is an element in the image of the natural linear map

\Pi_{\mathcal U} : \ker\mathbf{D}_A \otimes \ker\mathbf{D}_A^* \longrightarrow \Gamma(E \otimes F|_{\mathcal U})

which sends each product \eta \otimes \xi to the section \Pi_{\mathcal U}(\eta \otimes \xi)(z) := \eta(z) \otimes \xi(z) restricted to {\mathcal U}. It is an easy linear algebra exercise to show that if \sum_{i,j} \Psi_{ij} \eta_i \otimes \xi_j \in \Gamma(E \otimes F) is nonzero on some open set in {\mathcal U}, then one can find some B \in {\mathcal A}({\mathcal U}) to make sure that the integral above does not vanish. In other words, \mathbf{L} is guaranteed to be surjective if the following condition is achieved:

Definition. The operator \mathbf{D}_A : \Gamma(E) \to \Gamma(F) satisfies Petri’s condition on the subset {\mathcal U} \subset M if the natural map \Pi_{\mathcal U} : \ker\mathbf{D}_A \otimes \ker\mathbf{D}_A^* \to \Gamma(E \otimes F|_{\mathcal U}) is injective.


One of the beautiful things about this approach to transversality issues is that if the program I’ve just sketched can be carried out at all, then it can also be carried out equivariantly. In particular, if the operators \mathbf{D}_A arise as linearized operators for something like a multiply covered holomorphic curve, then they come with symmetry, e.g. there may be a finite group G acting on M and the two bundles such that \mathbf{D} is G-equivariant and we are only allowed to perturb within the space {\mathcal A}_G({\mathcal U}) \subset {\mathcal A}({\mathcal U}) of G-invariant zeroth-order perturbations. In this case, the map \Phi automatically takes values in the space of G-equivariant linear maps \ker\mathbf{D}_A \to \ker\mathbf{D}_A^*, so that the linearized problem becomes to show that the map

\mathbf{L} : {\mathcal A}_G({\mathcal U}) \to \text{Hom}_G(\ker\mathbf{D}_A , \ker\mathbf{D}_A^*)

given by the same formula as before is surjective. If we have Petri’s condition, then this is easy: given \Psi \in \text{Hom}_G(\ker\mathbf{D}_A,\ker\mathbf{D}_A^*), we can use the non-equivariant case to find a (not necessarily G-invariant) solution \widetilde{B} \in {\mathcal A}({\mathcal U}) to \mathbf{L}(\widetilde{B}) = \Psi, but then symmetrize it to produce a solution

B := \frac{1}{|G|} \sum_{g \in G} g^*\widetilde{B} \in {\mathcal A}_G({\mathcal U}), satisfying \mathbf{L}(B) = \Psi.
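The averaging step is easy to play with in a finite-dimensional toy model. The sketch below is only an analogy: it assumes G = {\mathbb Z}_n acting on {\mathbb R}^n by cyclic shifts, with n-by-n matrices standing in for zeroth-order perturbations and g^*\widetilde{B} modeled by conjugation with a permutation matrix. None of these names come from the paper.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def cyclic_shift(n):
    """Permutation matrix P with P e_j = e_{(j+1) mod n}, generating the toy action of Z_n."""
    return [[1 if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]


def pull_back(b, p):
    """Toy model of g^*B: conjugation P^{-1} B P (P is orthogonal, so P^{-1} = P^T)."""
    return matmul(matmul(transpose(p), b), p)


def symmetrize(b_tilde):
    """Average b_tilde over the whole cyclic group, using exact rational arithmetic."""
    n = len(b_tilde)
    p = cyclic_shift(n)
    current = [[Fraction(x) for x in row] for row in b_tilde]
    total = [[Fraction(0)] * n for _ in range(n)]
    for _ in range(n):                   # one term for each group element
        total = [[t + c for t, c in zip(t_row, c_row)]
                 for t_row, c_row in zip(total, current)]
        current = pull_back(current, p)
    return [[x / n for x in row] for row in total]
```

The output of symmetrize is invariant under pull_back by construction. Nothing prevents it from being zero, however: the input [[0, 1, 0], [0, 0, -1], [0, 0, 0]] averages to the zero matrix in this toy model.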

Here’s the punchline. In certain settings, depending on the overall goal, it may well be that you can get away with proving less than the statement that {\mathcal A}_{k,\ell}({\mathcal U}) is a smooth submanifold of the right codimension, in which case you might not need to know whether Petri’s condition holds. But for almost any such work-around you might choose, the equivariant case will not work—at least, not in as much generality as one would like. Let me expand on that a bit. The papers I mentioned above by Taubes, Eftekhary and Ionel-Parker all make use of this stratification idea, so some form of the operator that I’m calling \mathbf{L} appears in all of them. But in all three papers, it turns out that the main results do not really require {\mathcal A}_{k,\ell}({\mathcal U}) to be a submanifold of the predicted codimension—it suffices to prove that it’s some kind of “subvariety” that resembles a manifold and whose codimension can be bounded from below, which means not necessarily proving that \mathbf{L} is surjective, but establishing a good lower bound on its rank. Taubes, for instance, uses the following cute trick: if we fix a nontrivial element \eta_0 \in \ker\mathbf{D}_A, then we can associate to every \xi \in \ker\mathbf{D}_A^* a zeroth-order perturbation of the form

B_\xi := \langle \eta_0,\cdot \rangle_E \, \xi \in \Gamma(\text{Hom}(E,F)),

which then satisfies

\langle \mathbf{L}(B_\xi) \eta_0 , \xi \rangle_{L^2} = \langle B_\xi \eta_0 , \xi \rangle_{L^2} = \int_M \langle \eta_0,\eta_0 \rangle_E \cdot \langle \xi,\xi \rangle_F \, d\text{vol} > 0

due to unique continuation. One therefore obtains an injective linear map \ker\mathbf{D}_A^* \to \text{Hom}(\ker\mathbf{D}_A,\ker\mathbf{D}_A^*) : \xi \mapsto \mathbf{L}(B_\xi), which proves \text{rank} \mathbf{L} \ge \dim \text{coker} \mathbf{D}_A.

This argument suffices for certain applications, but outside of a very restrictive range of special cases (such as the regular double covers of tori in Taubes’s paper), it doesn’t give anything for the equivariant case: one can symmetrize the perturbations B_\xi constructed above, but there’s no guarantee that they won’t all become zero.

This is just one example; there are a few other tricks that I found in various other papers and attempted to implement as work-arounds when I wanted to prove Petri’s condition but didn’t know how to do it. None of them seemed sufficient to produce equivariant results in full generality. The conclusion I came to was that if you want to understand equivariant transversality for nonlinear PDEs, then Petri’s condition is one of the main ingredients, and it is absolutely necessary.

As you can imagine, I was therefore fairly distraught when my original proof of Petri’s condition for Cauchy-Riemann type operators broke down. I still believed that it was very likely to be a generic property, and I also suspected that someone in either geometric analysis or algebraic geometry must have thought about this before and could simply give me the solution, if I only knew whom to ask. But having now asked around quite a bit more, I’m left with the impression that, in fact, hardly anyone has thought very much about this before. Thus I decided to write this post, telling you why Petri’s condition is something worth thinking about. In the next one, I’ll tell you what I’ve learned in the effort to prove it.


One postdoc and two Ph.D. studentships in symplectic topology available in Berlin

The following jobs in my symplectic research group at the HU Berlin have just been advertised with an application deadline of January 1, 2019:

Starting dates are planned for Autumn 2019 but can be moved a bit if desired by the candidate. They are all non-teaching positions, so no knowledge of German is required, though some voluntary teaching is possible, even in English (for upper-level courses).

A note about the Ph.D. studentships: these positions are conceived so that the salary should be comparable to a standard Ph.D. fellowship as offered e.g. by the Berlin Mathematical School (BMS).* If you are interested in working with me as a Ph.D. student, then I recommend applying both for BMS Phase 2 admission and for one of these positions, as I might not be able to consider you for this funding unless you have specifically applied for it. (There are advantages to being a member of the BMS regardless.)

* My standard practical note for those unfamiliar with the vagaries of public employment in Germany: the precise salaries for such positions are determined by a complicated formula that depends on various details, including your family circumstances, but in theory you can compute them more exactly (and also the after-tax version) using this salary calculator. The main thing you need to know is that these positions are in Entgeltgruppe E13 of the TV-L. For the PhD studentships you need to input “50%” for Arbeitszeit, as the positions are officially half-time (because you spend the other half of the time learning?). Please don’t ask me about Zusatzversorgung or Lohnsteuerklassen… if you want to know what these things mean, your best bet is to find an actual German, which, as you probably know, I am not.


Super-rigidity is open again

Update 28.05.2019: The error that this post talks about has now been fixed. The details are explained in a pair of more recent posts.

I’ll start with the bad news: as of today, I am withdrawing the 2016 preprint Transversality and super-rigidity for multiply covered holomorphic curves. There is a mistake in that paper which makes the proofs of all four of its main results incomplete, and I cannot currently say with confidence that I know how to fix the mistake. As some readers will be aware, this is not the first time in history that an attempted proof of the super-rigidity conjecture for holomorphic curves has been withdrawn—in fact, it is not even the first time in history that such a proof has been withdrawn (or at least drastically downgraded) by me. The last time I did that, it was because I had come to believe the whole idea of my approach was unsuitable for the problem at hand.

The good news is that that is not the case this time: I strongly believe that the current situation is temporary, that the approach attempted in my 2016 paper is fundamentally correct, that 90% of the contents of the paper are correct and the remaining 10% is correctable. Moreover, I do currently know how to fill the gap to prove transversality results for multiple covers in certain cases, and there is good reason to believe that those cases are general enough for the applications to Embedded Contact Homology that originally motivated this project.

Nonetheless, since super-rigidity (Theorem A in my paper) is a well-known open problem that has been studied by several people in the Gromov-Witten community in the past, it is important to acknowledge publicly at this point that the problem is still open.

I would now like to explain a bit what goes wrong in my proof and what would be needed in order to fix it. I have explored a few ideas for a fix without making much progress, but it still seems quite plausible to me that a relatively easy fix may be possible, perhaps using ideas that are well-known among experts in either elliptic PDE theory or algebraic geometry (I would not call myself an expert in either). As I have recently learned from Thomas Walpuski and Aleksander Doan, the required lemma is not necessarily unique to the context of holomorphic curves, but can be formulated and has applications to a much wider class of equivariant transversality problems for elliptic PDEs (as discussed e.g. in Section 4 of their paper on associative submanifolds and monopoles). So, if you can prove it, you should write it up—it would be publishable as a stand-alone result!

(Acknowledgements: the error in my proof was found by Thomas Walpuski and Aleksander Doan, and my current understanding of the problem has been greatly influenced by discussions with them.)

What went wrong: quadratic unique continuation (AKA “Petri’s condition”)

I explained the main ideas behind my transversality approach in a series of three posts last December, and the vast majority of what it says in those posts is still correct as far as I’m aware. The detail that goes wrong is discussed in the penultimate section of the third post: it is a result I referred to at the time as a quadratic unique continuation lemma. Recent discussions with Doan and Walpuski have taught me some new terminology for it, which is borrowed from algebraic geometry.

Definition: Assume E and F are real Euclidean vector bundles over a manifold \Sigma with a fixed volume form, \mathbf{D} : \Gamma(E) \to \Gamma(F) is a real-linear partial differential operator and \mathbf{D}^* : \Gamma(F) \to \Gamma(E) denotes its formal adjoint. We say that \mathbf{D} satisfies Petri’s condition on some region {\mathcal U} \subset \Sigma if the natural map

\ker\mathbf{D} \otimes_{\mathbb R} \ker\mathbf{D}^* \stackrel{\iota}{\longrightarrow} \Gamma(E \otimes_{\mathbb R} F|_{\mathcal U})

defined by \iota(\eta \otimes \xi)(z) = \eta(z) \otimes \xi(z) is injective.

For reasons that I tried to explain in one of those posts last December, the main theorems on transversality and super-rigidity in my paper all follow if one restricts attention to J-holomorphic curves whose normal Cauchy-Riemann operators all satisfy Petri’s condition on some open subset where J can be perturbed. As I also explained in that post, it is not true that all Cauchy-Riemann type operators satisfy Petri’s condition, at least not locally:  for any complex-linear Cauchy-Riemann operator \mathbf{D} on a trivial line bundle over the disk, one can find complex-linearly independent sections \eta_1,\eta_2 \in \ker\mathbf{D} and \xi_1,\xi_2 \in \ker\mathbf{D}^* such that the section

\eta_1 \otimes_{\mathbb R} i\xi_2 - i\eta_1 \otimes_{\mathbb R} \xi_2 - \eta_2 \otimes_{\mathbb R} i\xi_1 + i\eta_2 \otimes_{\mathbb R} \xi_1 \in \Gamma(E \otimes_{\mathbb R} F)

is identically zero. Evidently, Petri’s condition can only hold for real Cauchy-Riemann type operators, and only under some condition preventing \mathbf{D} and \mathbf{D}^* from restricting to complex-linear operators on some subbundle. There are many generic conditions that prevent the latter: the simplest one I know is requiring the bundle map E \to F defined as the complex-antilinear part of \mathbf{D} to be invertible at some point z_0 \in \Sigma, a condition that holds for all normal Cauchy-Riemann operators of J-holomorphic curves if J is generic. When that condition holds, it forces \ker \mathbf{D} and \ker \mathbf{D}^* to be totally real subspaces of \Gamma(E) and \Gamma(F) respectively, and my claim in the paper was that whenever the latter is true, Petri’s condition holds.
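For the model case \mathbf{D} = \bar{\partial}, a relation of this kind can be checked by hand or by machine. The sketch below assumes the specific choice \eta_1 = 1, \eta_2 = z, \xi_1 = 1, \xi_2 = \bar{z} (a choice made here for illustration, with \ker\mathbf{D}^* taken to consist of antiholomorphic functions), and it identifies {\mathbb C} \otimes_{\mathbb R} {\mathbb C} with real 2-by-2 matrices written as flat lists. The function names are hypothetical.

```python
def real_tensor(v, w):
    """v ⊗_R w for complex numbers v, w: flattened outer product of their (re, im) vectors."""
    return [v.real * w.real, v.real * w.imag, v.imag * w.real, v.imag * w.imag]


def petri_combination(z):
    """Value at z of eta1⊗(i xi2) - (i eta1)⊗xi2 - eta2⊗(i xi1) + (i eta2)⊗xi1."""
    # ASSUMPTION: eta1 = 1, eta2 = z (holomorphic); xi1 = 1, xi2 = conj(z) (antiholomorphic).
    eta1, eta2 = complex(1), complex(z)
    xi1, xi2 = complex(1), complex(z).conjugate()
    terms = [
        (+1, eta1, 1j * xi2),
        (-1, 1j * eta1, xi2),
        (-1, eta2, 1j * xi1),
        (+1, 1j * eta2, xi1),
    ]
    total = [0.0] * 4
    for sign, v, w in terms:
        for idx, entry in enumerate(real_tensor(v, w)):
            total[idx] += sign * entry
    return total
```

Each of the four terms is nonzero on its own at a generic point, so the cancellation really does come from the interplay between z and \bar{z}.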

That might be true, but my proof of it had a careless mistake that, when discovered, called the whole idea behind my proof into question. It’s probably not worth telling you what the precise mistake was—suffice it to say it was a problem of linear algebra (I claimed some set of vectors was complex-linearly independent when I actually only knew they were real-linearly independent). In any case, the idea had been to carry out a mild generalization of the proof that complex-linear Cauchy-Riemann operators satisfy the complex analogue of Petri’s condition involving complex tensor products. The latter is easy to prove using Taylor series, due to the fact that sections in \ker\mathbf{D} are power series in z while sections in \ker\mathbf{D}^* are power series in \bar{z}. The idea was then to use the totally real condition as a bridge between the complex- and real-linear worlds, i.e. if we know that real-linearly independent sets in \ker\mathbf{D} and \ker\mathbf{D}^* are also complex-linearly independent, then a result on the complex Petri condition should imply a corresponding real result.

On closer inspection, it is not as easy as I thought at first to make a rigorous argument out of this intuition involving the totally real condition, and I would not bet anyone’s life on the original lemma being true the way I stated it. Nonetheless, it seems to me almost inconceivable for something like the following statement not to hold:

Conjecture: Suppose \mathbf{D} : \Gamma(E) \to \Omega^{0,1}(\Sigma,E) is a complex-linear Cauchy-Riemann type operator on a vector bundle E over a Riemann surface \Sigma, and z_0 \in \Sigma is a point. Then there exists a generic condition on the \infty-jet of a complex-antilinear bundle map A : E \to \Lambda^{0,1}T^*\Sigma \otimes_{\mathbb C} E at z_0 such that for all A satisfying this condition, the operator \mathbf{D} + A satisfies Petri’s condition on neighborhoods of z_0.

What we know

Petri’s condition makes sense for arbitrary linear PDEs on vector bundles, and it tells us something about the relationship between global and pointwise linear independence for solutions to those PDEs. Here are some situations in which we can easily say that it holds:

  1. It holds for any first-order operator over a 1-dimensional domain, e.g. for the asymptotic operators that arise by linearizing Reeb vector fields in contact manifolds along periodic orbits. The condition holds in this case just because the PDE is an ODE, so by local existence and uniqueness, global linear independence implies pointwise linear independence. This observation is actually useful, e.g. one can adapt the methods described in my super-rigidity paper to classify the bifurcations of closed Reeb orbits under generic deformations of a contact form.
  2. It also holds for any Cauchy-Riemann type operator that splits over a direct sum of trivial line bundles on a closed surface, or more generally, for any operator whose solutions are constrained (e.g. via the similarity principle and asymptotic winding estimates) to be nowhere zero. This is again because global linear independence in this case implies pointwise linear independence. In higher dimensions this condition is rather special and difficult to observe in nature, but it is quite a familiar phenomenon in dimension four, where normal Cauchy-Riemann operators are defined on line bundles: for instance, this includes the case of the multiply covered holomorphic tori that arose in Taubes’s work on the Gromov invariant, and it similarly appears to include the situations that are important for applications to Embedded Contact Homology.
  3. For any differential operator that is Fredholm, Petri’s condition is clearly an open condition on the operator. This makes the scenario in the previous bullet point seem slightly less special and more applicable.
  4. As mentioned above, the complex analogue of the Petri condition holds for all complex-linear Cauchy-Riemann operators. I do not currently know whether this observation is useful, since my original idea to produce a proof of the real Petri condition out of it has not panned out. I can still imagine clever ways that such an argument might work, though to be honest, I’d rather see a different proof—what makes me uneasy about appealing to complex analysis in this way is that it would be hard to imagine how such an argument might generalize to other elliptic operators beyond the Cauchy-Riemann case.
  5. There is also a weaker condition that does hold for all Cauchy-Riemann type operators: say that \mathbf{D} satisfies Petri’s condition “up to rank r” on a region {\mathcal U} \subset \Sigma if for all subspaces K \subset \ker \mathbf{D} and C \subset \ker \mathbf{D}^* of dimensions at most r, the natural map K \otimes_{\mathbb R} C \to \Gamma(E \otimes_{\mathbb R} F|_{\mathcal U}) is injective. The rank 1 Petri condition is an immediate consequence of the usual unique continuation results for Cauchy-Riemann operators, and it is not hard to extend this to ranks 2 and 3 using the fact that two real-linearly independent solutions to a Cauchy-Riemann type equation are always also pointwise real-linearly independent on an open and dense subset. (The latter follows from the Leibniz rule for Cauchy-Riemann operators together with the fact that real-valued holomorphic functions are all constant.) These partial results toward Petri’s condition appear to be the reason why Eftekhary’s partial proof of super-rigidity works. The counterexample mentioned above shows, however, that from rank 4 upward, Petri’s condition does not come for free.
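
The mechanism behind items 1 and 2 can be written out as a short worked argument. What follows is only a sketch, assuming that Petri’s condition on {\mathcal U} means injectivity of the natural map \ker \mathbf{D} \otimes_{\mathbb R} \ker \mathbf{D}^* \to \Gamma(E \otimes_{\mathbb R} F|_{\mathcal U}) (the map from item 5 without the rank restriction), with F denoting the target bundle of \mathbf{D}. The bases \eta_i, \xi_j and the coefficients c_{ij} are notation introduced here purely for illustration.

```latex
% Sketch: pointwise linear independence at a single point implies Petri's condition.
% Assumed notation (introduced for illustration only):
%   \eta_1,\dots,\eta_m  a basis of \ker\mathbf{D},
%   \xi_1,\dots,\xi_n    a basis of \ker\mathbf{D}^*,
%   c_{ij}               real coefficients.
\begin{align*}
  &\text{Suppose } t := \sum_{i,j} c_{ij}\, \eta_i \otimes \xi_j
     \in \ker\mathbf{D} \otimes_{\mathbb R} \ker\mathbf{D}^*
     \text{ maps to the zero section over } \mathcal{U}, \text{ i.e.} \\
  &\qquad \sum_{i,j} c_{ij}\, \eta_i(z) \otimes \xi_j(z) = 0
     \in E_z \otimes_{\mathbb R} F_z
     \quad \text{for all } z \in \mathcal{U}. \\
  &\text{If for some } z_0 \in \mathcal{U} \text{ the vectors }
     \eta_1(z_0),\dots,\eta_m(z_0) \in E_{z_0} \text{ are linearly independent,} \\
  &\text{and likewise } \xi_1(z_0),\dots,\xi_n(z_0) \in F_{z_0}, \text{ then the products }
     \eta_i(z_0) \otimes \xi_j(z_0) \\
  &\text{are linearly independent in } E_{z_0} \otimes_{\mathbb R} F_{z_0},
     \text{ so } c_{ij} = 0 \text{ for all } i,j. \\
  &\text{Hence } t = 0, \text{ and the natural map is injective on } \mathcal{U}.
\end{align*}
```

In item 1 the pointwise independence comes from uniqueness for linear ODEs (a solution that vanishes at one point vanishes identically), and in item 2 from the assumption that nontrivial solutions are nowhere zero; in both cases the same reasoning has to be applied to \mathbf{D} and to its formal adjoint.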

That’s all for the moment. My paper will reappear in some revised form in the future, though it remains to be seen whether that means fixing the original proof or downgrading the generality of the stated results. I’ll surely write more on this topic when I know more.


Postdoc and PhD studentship for symplectic topology in Berlin

I am happy to announce that my research group in symplectic and contact topology at the Humboldt-Universität in Berlin is hiring for Autumn 2018:

  • One 2-year postdoc position, salary approx. 44,000€/year (before taxes).* Here is the official advert in English and in German.
  • One 3-year PhD studentship, salary approx. 22,000€/year (before taxes).* Here is the official advert in English and in German.

Both positions are research-only; no teaching is required, so applicants need not know any German. (Opportunities to teach will nonetheless be available if desired.) They are both part of an ERC-funded project involving pseudoholomorphic curves in symplectic and contact topology, so postdoc applicants should ideally have some background in that subject, and PhD applicants should at least have a solid background in differential geometry and analysis. Applications should be e-mailed directly to me (including recommendation letters sent separately) by March 15, and we aim to make decisions as soon as possible after that date. The adverts specify what documents should be sent in the application: if you’ve already been applying for other postdoc or PhD positions, it is mostly the same stuff as usual. (Applicants for the PhD studentship: please be sure to include your university transcripts in addition to your CV and a statement of research experience/interests.)

If there’s anything else I can clarify, please feel free to contact me by e-mail!

* For those unfamiliar with the vagaries of public employment in Germany: the precise salaries are determined by a complicated formula that depends on various details, including your family circumstances, but in theory you can compute them more exactly (and also the after-tax version) using this salary calculator. The main thing you need to know is that both positions are in Entgeltgruppe E13 of the TV-L. For the PhD studentship you need to input “50%” for Arbeitszeit, as the position is officially half-time (because you spend the other half of the time learning?). Please don’t ask me about Zusatzversorgung or Lohnsteuerklassen… if you want to know what these things mean, your best bet is to find an actual German, which, as you probably know, I am not.
