Postdoc and PhD studentship for symplectic topology in Berlin

I am happy to announce that my research group in symplectic and contact topology at the Humboldt-Universität in Berlin is hiring for Autumn 2018:

  • One 2-year postdoc position, salary approx. 44,000€/year (before taxes).* Here is the official advert in English and in German.
  • One 3-year PhD studentship, salary approx. 22,000€/year (before taxes).* Here is the official advert in English and in German.

Both positions are research-only; no teaching is required, so applicants need not know any German. (Opportunities to teach will nonetheless be available if desired.) They are both part of an ERC-funded project involving pseudoholomorphic curves in symplectic and contact topology, so postdoc applicants should ideally have some background in that subject, and PhD applicants should at least have a solid background in differential geometry and analysis. Applications should be e-mailed directly to me (including recommendation letters sent separately) by March 15, and we aim to make decisions as soon as possible after that date. The adverts specify what documents should be sent in the application: if you’ve already been applying for other postdoc or PhD positions, it is mostly the same stuff as usual. (Applicants for the PhD studentship: please be sure to include your university transcripts in addition to your CV and a statement of research experience/interests).

If there’s anything else I can clarify, please feel free to contact me by e-mail!

* For those unfamiliar with the vagaries of public employment in Germany: the precise salaries are determined by a complicated formula that depends on various details, including your family circumstances, but in theory you can compute them more exactly (and also the after-tax version) using this salary calculator. The main thing you need to know is that both positions are in Entgeltgruppe E13 of the TV-L. For the PhD studentship you need to input “50%” for Arbeitszeit, as the position is officially half-time (because you spend the other half of the time learning?). Please don’t ask me about Zusatzversorgung or Lohnsteuerklassen… if you want to know what these things mean, your best bet is to find an actual German, which, as you probably know, I am not.


The transversality machine

In the last two posts, I have been describing a machine…

…well, no, not quite like that machine. I was speaking figuratively.

The machine is a stratification theorem, whose purpose is to demystify the transversality properties of multiply covered holomorphic curves. Its user interface consists mainly of a set of “twisted” Cauchy-Riemann operators associated to the normal operator \mathbf{D}_u^N of any multiply covered curve u = v \circ \varphi, producing a splitting

\mathbf{D}_u^N \cong (\mathbf{D}_u^{\boldsymbol{\theta}_1})^{\oplus k_1} \oplus \ldots \oplus (\mathbf{D}_u^{\boldsymbol{\theta}_N})^{\oplus k_N}

as described in the previous post. Operating the machine requires no specialized training beyond the ability to compute the indices of these operators—which are Cauchy-Riemann type operators defined on Sobolev spaces with exponential weight conditions over a punctured surface—plus a certain amount of patience with dimension-counting arguments. If you have this, then the machine gives you a stratification of the moduli space of multiple covers, with chambers in which transversality (or possibly the next best thing) is achieved, separated by smooth walls and further strata whose dimensions can all be computed.

Anyway, that’s what the instruction manual says it does. In this post I want to open up the machine and try to explain why it works.

The statement of the theorem

I am assuming you’ve read the original post in which all of the following notation was explained, but recall in particular that {\mathcal M}^d_G(J) is a moduli space of d-fold covered J-holomorphic curves u = v \circ \varphi with prescribed critical and branching behavior for v and \varphi respectively, and for \mathbf{k} = (k_1,\ldots,k_N) and \mathbf{c} = (c_1,\ldots,c_N) we consider the subset

\displaystyle {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) = \big\{ u \in {\mathcal M}^d_G(J)\ \big|\ \dim_{{\mathbb K}_i} \ker\mathbf{D}_u^{\boldsymbol{\theta}_i} = k_i \text{ and } \dim_{{\mathbb K}_i} \text{coker}\,\mathbf{D}_u^{\boldsymbol{\theta}_i} = c_i \text{ for } i=1,\ldots,N \big\}.

Theorem D. For generic J, {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) is a smooth submanifold of {\mathcal M}^d_G(J) with

\displaystyle \text{codim}\, {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) = \sum_{i=1}^N t_i k_i c_i,

where for each of the irreducible representations \boldsymbol{\theta}_i : G \to \text{Aut}_{\mathbb R}(W_i), the number t_i \in \{1,2,4\} is defined as the real dimension of {\mathbb K}_i := \text{End}_G(W_i).

The proof of this theorem follows the same general outline as most other theorems you’ve seen before that begin with the words “For generic J…”. The relevant generic set comes from applying the Sard-Smale theorem to a projection map

{\mathcal M}^d_G(\mathbf{k},\mathbf{c}) \to {\mathcal J} : (J,u) \mapsto J,

where {\mathcal J} is a suitable Banach manifold of perturbed almost complex structures and {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) is the resulting universal moduli space

{\mathcal M}^d_G(\mathbf{k},\mathbf{c}) = \big\{ (J,u) \ \big|\ J \in {\mathcal J} \text{ and } u \in {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) \big\}.

The main step is thus to prove that this universal moduli space is a smooth Banach manifold, which will follow from the implicit function theorem after proving that a certain linearized operator is surjective. The latter is, as usual, the hard part, and it resembles more familiar arguments in that it requires a unique continuation result for linear Cauchy-Riemann type equations, but the precise lemma we need is probably different than what you are used to. This so-called “quadratic” unique continuation lemma is the heart of the machine, and in itself it is not very hard to prove, but it is somewhat of a challenge to understand what it means and what it is good for, so one of my main goals for this post will be to explain that.

It should go without saying that when the kernel and cokernel conditions in the definition of {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) are dropped, the resulting universal moduli space

{\mathcal M}^d_G := \big\{ (J,u)\ \big|\ J \in {\mathcal J} \text{ and } u \in {\mathcal M}^d_G(J) \big\}

is indeed a smooth Banach manifold; this follows from standard arguments. Thus assuming you’re on board with what I said above about the Sard-Smale theorem, we can agree that the goal is to prove the following:

Main lemma. {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) \subset {\mathcal M}^d_G is a smooth finite-codimensional submanifold with codimension \sum_i t_i k_i c_i.

I cheated a bit in this statement by replacing {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) with a new object {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) that I have not yet defined. Its precise definition will come in a bit, but for now, suffice it to say that it is an open subset of {\mathcal M}^d_G(\mathbf{k},\mathbf{c}), defined via an extra condition without which I wouldn’t know how to prove the lemma—but this extra condition will have no impact on the rest of the discussion.

What follows is an extended sketch of the proof, with occasional references to specific results in the paper where the full details are carried out. This may turn out to be somewhat weightier material than you are used to reading on a blog, so maybe it will help to have some appropriately weighty music as accompaniment; I personally recommend Beethoven’s Große Fuge.

Reduction to the case of regular covers

For reasons that were hinted at near the end of the previous post, we can impose the following simplifying assumption without loss of generality:

Assumption (cf. the beginning of Section 3.5). The branched covers \varphi : (\Sigma',j') \to (\Sigma,j) for elements (J,u = v \circ \varphi) \in {\mathcal M}^d_G are regular (i.e. normal).

To see why this is not a loss of generality, you need to recall how the splitting of \mathbf{D}_u^N is defined in terms of a regular presentation of \varphi (see the previous post). Each branched cover \varphi : (\Sigma',j') \to (\Sigma,j) of degree d with generalized automorphism group G has a (canonical up to isomorphism) regular branched cover \pi : (\Sigma'',j'') \to (\Sigma,j) that factors through it, with degree \tilde{d} = |G| \ge d and \text{Aut}(\pi) = G. The holomorphic curve \tilde{u} := v \circ \pi then has a splitting

\mathbf{D}_{\tilde{u}}^N = (\mathbf{D}_{\tilde{u}}^{\boldsymbol{\theta}_1})^{\oplus \tilde{k}_1} \oplus \ldots \oplus (\mathbf{D}_{\tilde{u}}^{\boldsymbol{\theta}_N})^{\oplus \tilde{k}_N}

whose summands are the same twisted Cauchy-Riemann operators that appear in the splitting of \mathbf{D}_u^N, but with different multiplicities \tilde{k}_i \ge k_i; in particular, the regularity of \pi implies via a standard theorem in representation theory that all of the \tilde{k}_i must be nonzero. Since \pi and \varphi have exactly the same critical values, their respective neighborhoods in the space of branched covers with fixed branching data can be identified naturally, so that there are canonical identifications {\mathcal M}^{\tilde{d}}_G = {\mathcal M}^d_G and {\mathcal M}^{\tilde{d}}_G(\mathbf{k},\mathbf{c}) = {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) in the neighborhoods of these two curves.

With the preceding understood, the regularity assumption will be in effect from now on. It has the following important consequences:

  1. G = \text{Aut}(\varphi) acts by biholomorphic maps on the domain (\Sigma',j'), and therefore also on the kernel and cokernel of the normal operator \mathbf{D}_u^N, by reparametrization.
  2. The multiplicities k_i in the splitting of \mathbf{D}_u^N are all positive.

Notice now that if (J_0,u_0) \in {\mathcal M}^d_G(\mathbf{k},\mathbf{c}), then sufficiently nearby elements (J,u) \in {\mathcal M}^d_G will belong to {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) if and only if

\dim \ker \mathbf{D}_u^N = \dim \ker \mathbf{D}_{u_0}^N.

Indeed, the splittings of these operators vary continuously as u_0 moves toward u, and each summand is a Fredholm operator, so the dimension of its kernel can jump downward under small perturbations, but not upward. (If you are unfamiliar with this fact, it will follow easily from the discussion of the space of Fredholm operators below.) Since every twisted operator \mathbf{D}_u^{\boldsymbol{\theta}_i} appears with positive multiplicity in the splitting of \mathbf{D}_u^N, any downward jump in \dim \ker \mathbf{D}_u^{\boldsymbol{\theta}_i} will necessarily cause a downward jump in \dim \ker\mathbf{D}_u^N.

How to perturb a normal Cauchy-Riemann operator

To understand the local structure of {\mathcal M}^d_G(\mathbf{k},\mathbf{c}), we need to understand the effect that moving (J,u) through {\mathcal M}^d_G has on the normal operators \mathbf{D}_u^N.  This is fairly straightforward if we consider only variations in (J,u) that leave u fixed, which will suffice for our purposes. Fix a closed J-holomorphic curve u = v \circ \varphi, where v : \Sigma \to M is somewhere injective, and consider a 1-parameter family \{J_s\in {\mathcal J}\}_{s \in (-\epsilon,\epsilon)} with J_0 \equiv J such that J_s = J along the image of v, so u remains J_s-holomorphic for all s. Its normal bundle N_u also remains unchanged as the parameter moves, but the normal Cauchy-Riemann operator depends on J_s, so let us denote the resulting 1-parameter family of operators by

\mathbf{D}^N_{u,s} : \Gamma(N_u) \to \Omega^{0,1}(\Sigma',N_u).

Now Y := \partial_s J_s\big|_{s=0} vanishes along the image of v, but if z \in \Sigma is any injective point of v, then there is considerable freedom to choose the derivative of Y near v(z) in directions normal to v. Choose \eta \in (N_v)_z and write the normal derivative \nabla_\eta Y in terms of the tangent-normal splitting T_{v(z)} M = (T_v)_z \oplus (N_v)_z as

\nabla_\eta Y = \begin{pmatrix} \nabla_\eta^T Y & \nabla_\eta^{TN} Y \\ \nabla_\eta^{NT} Y & \nabla_\eta^N Y \end{pmatrix}.

One can now compute that if \nabla Y is chosen to vanish near all non-injective points of v, then the change in the normal Cauchy-Riemann operator is precisely

\partial_s\mathbf{D}_{u,s}^N \eta\big|_{s=0} = \nabla^{NT}_\eta Y \circ Tu \circ j'.

The point is that we can choose families J_s to make this perturbation more or less anything we want. If we’re working in a symplectic manifold (M,\omega), then of course we’d like to require all the J_s to be compatible with \omega, which gives a nontrivial relation between \nabla^{NT}_\eta Y and \nabla^{TN}_\eta Y, but this does not constrain \nabla^{NT}_\eta Y at all if we are willing to let it determine \nabla^{TN}_\eta Y. The one caveat is that if u = v \circ \varphi has \text{Aut}(\varphi) = G, then every perturbation of \mathbf{D}_u^N produced in this way will automatically be G-invariant. The important result can thus be stated as follows:

Lemma 1 (see Lemma 6.1). Let {\mathcal U} \subset \Sigma denote the open and dense set of injective points of v. Then given any smooth G-invariant zeroth-order perturbation A : N_u \to \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u) with support in \varphi^{-1}({\mathcal U}), there exists a smooth 1-parameter family of compatible almost complex structures J_s, satisfying J_0 = J and matching J along the image of u for all s, such that \partial_s \mathbf{D}_{u,s}^N\big|_{s=0} = A.

Generic J are nowhere integrable

It’s time to fill in the missing detail about the definition of the constrained universal moduli space {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}). For technical reasons that will become clear when we discuss unique continuation at the end of this post, I need to impose an extra open condition on pairs (J,u). It amounts to the requirement that J must not be integrable on any neighborhood of the image of u, though non-integrability as such is not really the point, but is more of a side-effect.

Recall that for any complex vector space (V,J), a real subspace W \subset V is called totally real if W \cap J(W) = \{0\}. (Sometimes one also requires W to have half the dimension of V, but I am not requiring that here.) Since \mathbf{D}_u^N is a real-linear operator between two complex Banach spaces, one can ask in particular whether its kernel and cokernel (meaning the kernel of its formal adjoint) are totally real. There is an easy criterion for this: recall that if we break up \mathbf{D}_u^N into its complex-linear part \mathbf{D}_u^{\mathbb C} and antilinear part \mathbf{D}_u^{\bar{\mathbb C}}, then the latter is a zeroth-order term, meaning a smooth bundle map N_u \to \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u). Using the standard unique continuation results for linear Cauchy-Riemann operators, one can easily show (cf. Lemma 3.11) that \ker \mathbf{D}_u^N and \text{coker}\, \mathbf{D}_u^N are guaranteed to be totally real whenever the antilinear bundle map \mathbf{D}_u^{\bar{\mathbb C}} is invertible on some fiber. This is manifestly an open condition, and we shall say that \mathbf{D}_u^N itself is totally real whenever it holds. Notice that if this holds for a given curve v, then it automatically also holds for all of its multiple covers u = v \circ \varphi.

Definition. Let {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) \subset {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) denote the open subset consisting of pairs (J,u) for which \mathbf{D}_u^N is totally real.

If J is integrable, then \mathbf{D}_u^N is always complex linear and the totally real condition can never be satisfied. Of course, this should not worry us very much, because we are trying to prove a theorem about generic J, which cannot be expected to be integrable. It turns out in fact that genericity is enough to guarantee that the totally real condition is always satisfied, and this is why establishing the smoothness of {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) will suffice for proving Theorem D.

Lemma 2 (cf. Lemma 6.2). For generic J, every closed J-holomorphic curve u has the property that \mathbf{D}_u^N is totally real.

The proof of this is not hard once you’ve absorbed the implications of Lemma 1. The point is to show that the universal moduli space of somewhere injective curves that fail to satisfy the totally real condition lives inside a submanifold of arbitrarily large codimension, namely the set of curves v : \Sigma \to M such that \mathbf{D}_v^{\bar{\mathbb C}} satisfies an incidence relation with the subvariety of noninvertible linear maps (N_v)_z \to \overline{\text{Hom}}_{\mathbb C}(T_z\Sigma,(N_v)_z) at some chosen point z \in \Sigma. Having the invertibility condition fail on an open neighborhood of z means that the incidence relation can be taken to involve jets of arbitrarily high order, cutting out submanifolds of arbitrarily large codimension.

Walls in the space of Fredholm operators

I would now like to tell you a lovely fact about the space of Fredholm operators that everyone ought to know: for any pair of real Banach spaces X and Y and any integers k,\ell \ge 0, the subset

V_{k,\ell} := \big\{ \mathbf{T} : X \to Y\ \big|\ \dim \ker \mathbf{T} = k \text{ and } \dim \text{coker}\, \mathbf{T} = \ell \big\}

is a smooth finite-codimensional submanifold in the space of bounded linear maps X \to Y, with

\text{codim}\, V_{k,\ell} = k\ell.

One can see it as follows. Given an operator \mathbf{T}_0 \in V_{k,\ell}, \mathbf{T}_0 is evidently Fredholm, so there exist splittings X = V \oplus K, Y = W \oplus C into closed linear subspaces, with K = \ker\mathbf{T}_0 and W = \text{im}\, \mathbf{T}_0, thus C \cong \text{coker}\, \mathbf{T}_0, and \mathbf{T}_0 restricts to V as a Banach space isomorphism V \to W. These produce a block decomposition for any bounded linear operator \mathbf{T} : X \to Y in the form

\mathbf{T} = \begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{pmatrix},

such that \mathbf{A} : V \to W will necessarily be invertible whenever \mathbf{T} lies in a sufficiently small neighborhood {\mathcal O} of \mathbf{T}_0. We can therefore define a smooth map

\boldsymbol{\Phi} : {\mathcal O} \to \text{Hom}(\ker \mathbf{T}_0,\text{coker}\, \mathbf{T}_0) : \mathbf{T} \mapsto \mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B},

whose derivative at \mathbf{T}_0 is

d\boldsymbol{\Phi}(\mathbf{T}_0) \begin{pmatrix} \mathbf{a} & \mathbf{b} \\ \mathbf{c} & \mathbf{d} \end{pmatrix} = \mathbf{d}

and is thus manifestly surjective. Now if we also associate to each \mathbf{T} \in {\mathcal O} the linear “coordinate change” on X defined in terms of the splitting X = V \oplus K by

\boldsymbol{\Psi} = \begin{pmatrix} \text{Id} & -\mathbf{A}^{-1}\mathbf{B} \\ 0 & \text{Id} \end{pmatrix},

we have \mathbf{T} \boldsymbol{\Psi} = \begin{pmatrix} \mathbf{A} & 0 \\ \mathbf{C} & \boldsymbol{\Phi}(\mathbf{T}) \end{pmatrix}, implying

\ker \mathbf{T} \cong \ker \boldsymbol{\Phi}(\mathbf{T}).

Since \boldsymbol{\Phi}(\mathbf{T}) is defined on the space K with dimension \dim \ker\mathbf{T}_0, this implies that \dim \ker\mathbf{T} \le \dim \ker\mathbf{T}_0 for all \mathbf{T} sufficiently close to \mathbf{T}_0, and equality is satisfied if and only if \boldsymbol{\Phi}(\mathbf{T}) = 0. As a consequence,

V_{k,\ell} \cap {\mathcal O} = \boldsymbol{\Phi}^{-1}(0),

and since d\boldsymbol{\Phi}(\mathbf{T}_0) is surjective, the implicit function theorem gives this zero set the structure of a smooth submanifold with codimension equal to \dim \text{Hom}(K,C) = k\ell.
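
To see this mechanism in the simplest possible setting, here is a small numerical sketch in Python with NumPy. It replaces X and Y by finite-dimensional spaces, which is an assumption made purely for illustration, and the helper names schur_map and kernel_dim are mine rather than anything from the paper. The sketch starts from an operator \mathbf{T}_0 with k = 2 and \ell = 1, perturbs it generically so that the kernel dimension drops, and then corrects the \mathbf{D}-block so that \boldsymbol{\Phi} vanishes and the kernel dimension is restored.

```python
import numpy as np

def schur_map(T, r):
    """Phi(T) = D - C A^{-1} B for the block splitting with A of size r x r."""
    A, B = T[:r, :r], T[:r, r:]
    C, D = T[r:, :r], T[r:, r:]
    return D - C @ np.linalg.solve(A, B)

def kernel_dim(T, tol=1e-9):
    """Dimension of the kernel of T as a map R^n -> R^m."""
    return T.shape[1] - np.linalg.matrix_rank(T, tol=tol)

# Toy model (assumption): X = R^5 = V + K, Y = R^4 = W + C,
# with dim V = dim W = r = 3, dim K = k = 2, dim C = ell = 1.
r, k, ell = 3, 2, 1
T0 = np.zeros((r + ell, r + k))
T0[:r, :r] = np.eye(r)          # T0 is an isomorphism V -> W and zero elsewhere

# A generic small perturbation leaves the wall: the kernel gets smaller.
rng = np.random.default_rng(0)
T_generic = T0 + 1e-2 * rng.standard_normal(T0.shape)

# Correct the D-block so that Phi vanishes: this puts us back on the wall.
T_wall = T_generic.copy()
T_wall[r:, r:] -= schur_map(T_generic, r)

for name, T in [("T0", T0), ("generic", T_generic), ("wall", T_wall)]:
    Phi = schur_map(T, r)
    # dim ker T should equal dim ker Phi(T) = k - rank Phi(T)
    print(name, kernel_dim(T), k - np.linalg.matrix_rank(Phi, tol=1e-9))
```

In this toy model \boldsymbol{\Phi} takes values in the 1-by-2 matrices, so the wall is cut out by k\ell = 2 independent equations, in agreement with the codimension formula above.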

Walls in the universal moduli space of multiple covers

We now proceed toward the proof of the main lemma. Given (J_0,u_0) \in {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}), we observed already that a sufficiently close element (J,u) \in {\mathcal M}^d_G will also belong to {\mathcal M}^d_G(\mathbf{k},\mathbf{c}) if and only if \dim \ker \mathbf{D}_u^N = \dim \ker \mathbf{D}_{u_0}^N, or equivalently if the two cokernels have matching dimensions. Plugging this into the above discussion about Fredholm operators in general, there is a neighborhood {\mathcal O} \subset {\mathcal M}^d_G of (J_0,u_0) and a smooth map

\boldsymbol{\Phi} : {\mathcal O} \to \text{Hom}(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N)

whose zero set is a neighborhood of (J_0,u_0) in {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}). It is important to notice moreover that in light of the natural action of G = \text{Aut}(\varphi), \boldsymbol{\Phi} can be arranged to have its image in the space of G-equivariant linear maps \ker\mathbf{D}_{u_0}^N \to \text{coker}\, \mathbf{D}_{u_0}^N, i.e.

\boldsymbol{\Phi} : {\mathcal O} \to \text{Hom}_G(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N).

Computing the dimension of the space on the right hand side requires some representation theory: both \ker\mathbf{D}_{u_0}^N and \text{coker}\, \mathbf{D}_{u_0}^N are now representations of G, and their decompositions into irreducible representations can be deduced from our splitting of \mathbf{D}_{u_0}^N. Schur’s lemma then breaks up the elements of \text{Hom}_G(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N) into blocks that always must vanish when they correspond to two non-isomorphic representations, and the result (cf. Equation (3.22)) is

\displaystyle \dim \text{Hom}_G(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N) = \sum_{i=1}^N \dim \text{End}_G(W_i) \cdot \dim_{{\mathbb K}_i} \text{Hom}(\ker \mathbf{D}_{u_0}^{\boldsymbol{\theta}_i} , \text{coker}\, \mathbf{D}_{u_0}^{\boldsymbol{\theta}_i})

= \sum_{i=1}^N t_i k_i c_i.
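
As a sanity check on this count, the following Python sketch computes the dimension of a space of equivariant maps by brute-force linear algebra in a toy case. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the group is G = {\mathbb Z}_3, whose real irreducible representations are the trivial one (t = 1) and the rotation of {\mathbb R}^2 by 2\pi/3 (t = 2, with endomorphism algebra {\mathbb C}), and the kernel and cokernel are modeled as direct sums of k_i and c_i copies of these representations.

```python
import numpy as np

def rotation(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def block_diag(blocks):
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    pos = 0
    for b in blocks:
        m = b.shape[0]
        out[pos:pos + m, pos:pos + m] = b
        pos += m
    return out

def generator_matrix(mult_trivial, mult_rotation):
    """Action of a generator of Z_3 on (trivial)^a + (rotation)^b."""
    blocks = [np.eye(1)] * mult_trivial + [rotation(2 * np.pi / 3)] * mult_rotation
    return block_diag(blocks)

def dim_equivariant_maps(rho_K, rho_C):
    """Real dimension of {M : M rho_K = rho_C M}, via vec(M) and Kronecker products."""
    nK, nC = rho_K.shape[0], rho_C.shape[0]
    constraint = np.kron(rho_K.T, np.eye(nC)) - np.kron(np.eye(nK), rho_C)
    return nK * nC - np.linalg.matrix_rank(constraint, tol=1e-9)

# Hypothetical multiplicities: k = (1, 2) for the kernel, c = (2, 1) for the cokernel.
t = (1, 2)
k = (1, 2)
c = (2, 1)

# Z_3 is cyclic, so commuting with one generator is the same as G-equivariance.
brute_force = dim_equivariant_maps(generator_matrix(*k), generator_matrix(*c))
formula = sum(ti * ki * ci for ti, ki, ci in zip(t, k, c))
print(brute_force, formula)
```

The brute-force count and the formula \sum_i t_i k_i c_i agree, and the blocks pairing the trivial representation with the rotation contribute nothing, as Schur’s lemma predicts.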

The implicit function theorem will now complete the proof of the main lemma if we can prove that the linearization of \boldsymbol{\Phi} at (J_0,u_0) is surjective. Let us write down the derivative of this map in directions of the form (Y,0) \in T_{(J_0,u_0)} {\mathcal M}^d_G that we considered in Lemma 1. Denoting by A_Y the zeroth-order perturbation of \mathbf{D}_{u_0}^N that results from varying (J_0,u_0) in the (Y,0) direction, the derivative in question gives a linear map

T_{J_0}{\mathcal J} \to \text{Hom}_G(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N) : Y \mapsto \mathbf{L}_Y

of the form

\mathbf{L}_Y \eta = \pi_C (A_Y \eta),

where \pi_C denotes the natural linear projection map from the relevant Sobolev space of sections of \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u) to \text{coker}\, \mathbf{D}_{u_0}^N.

Lemma 3 (cf. Lemmas 5.4 and 6.4). The operator Y \mapsto \mathbf{L}_Y described above is surjective.

This lemma is the main technical step. To prove it, one can restate the problem in light of Lemma 1 as follows. Let us choose suitable bundle metrics and area forms so that there are well-defined L^2-pairings on spaces of sections, and we can thus identify \text{coker}\,\mathbf{D}_u^N with the kernel of the formal adjoint of \mathbf{D}_u^N. This makes \text{coker}\, \mathbf{D}_{u_0}^N the L^2-orthogonal complement of \text{im}\, \mathbf{D}_{u_0}^N, so that the matrix elements of the linear transformation \mathbf{L}_Y for any given Y \in T_{J_0}{\mathcal J} take the form

\langle \xi , \mathbf{L}_Y \eta \rangle_{L^2} = \langle \xi , A_Y \eta \rangle_{L^2} for \eta \in \ker\mathbf{D}_{u_0}^N and \xi \in \text{coker}\,\mathbf{D}_{u_0}^N.

So we need to know that for any given G-equivariant linear map \Psi : \ker\mathbf{D}_{u_0}^N \to \text{coker}\, \mathbf{D}_{u_0}^N, we can find a G-invariant zeroth-order perturbation A : N_u \to \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u), with support away from the non-injective points of the underlying simple curve v, such that

\langle \xi, A \eta \rangle_{L^2} = \langle \xi , \Psi \eta \rangle_{L^2} for all \eta \in \ker\mathbf{D}_{u_0}^N and \xi \in \text{coker}\, \mathbf{D}_{u_0}^N.

Notice that if we can find a solution A to this problem that is not G-invariant, then we can always symmetrize it to produce one that is; this is possible due to the G-equivariance of \Psi. We are therefore free to ignore the G-symmetry from now on, and simply look for any zeroth-order perturbation A that is supported in a given open set and satisfies the above relation for a given linear map \Psi \in \text{Hom}(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N).

This problem does not sound unsolvable when you consider that \Psi is required to live in a finite-dimensional vector space, while we are free to choose A from a space that is infinite-dimensional. Arguing by contradiction, suppose there is no solution, or equivalently, that \text{Hom}(\ker \mathbf{D}_{u_0}^N,\text{coker}\, \mathbf{D}_{u_0}^N) contains a nontrivial element orthogonal to every element that can be produced by choices of zeroth-order perturbations A. This can be expressed more concretely in terms of the matrix elements with respect to orthonormal bases (\eta_i) of \ker\mathbf{D}_{u_0}^N and (\xi_j) of \text{coker}\, \mathbf{D}_{u_0}^N: if the solution we’re looking for does not exist, then there is a set of real numbers \Psi^{ij}, not all equal to zero, such that

\displaystyle \sum_{i,j} \Psi^{ij} \langle \xi_j , A \eta_i \rangle_{L^2} = 0

for every A in the space of allowed perturbations. This doesn’t sound very likely, but there is one conceivable situation where we would now be in big trouble: to see this, let us write the L^2-product more explicitly as an integral of a real bundle metric \langle\cdot,\cdot \rangle, which we can view as a fiberwise linear form on the tensor product of the bundle \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u) with itself. The above expression then becomes

\displaystyle \int_{\Sigma'} \langle\cdot,\cdot\rangle \circ (\text{Id} \otimes A) \left( \sum_{i,j} \Psi^{ij} \xi_j(z) \otimes \eta_i(z) \right) \, d\text{vol} = 0.

The summation in parentheses in this integral is a section of the tensor bundle \overline{\text{Hom}}_{\mathbb C}(T\Sigma',N_u) \otimes_{\mathbb R} N_u, defined as a linear combination of products \xi_j \otimes \eta_i where \eta_i satisfies a linear Cauchy-Riemann type equation and \xi_j satisfies its formal adjoint equation. It is not difficult to show (cf. Lemma 5.5) that if this linear combination is nonzero on some open set, then A can be chosen with support in that set to ensure that the integral is nonzero, thus giving a contradiction. We know that the \eta_i and \xi_j all satisfy unique continuation results, so we can easily find an open set on which all of them are nonvanishing. Is it really possible that a linear combination of this form could nonetheless vanish?

Quadratic unique continuation

Actually yes: if we’re not careful, such a linear combination certainly could vanish. Let’s put the problem in slightly more general terms: assume \mathbf{D} : \Gamma(E) \to \Gamma(F) is a real-linear Cauchy-Riemann type operator on some complex vector bundle E over a Riemann surface \Sigma, with F := \overline{\text{Hom}}_{\mathbb C}(T\Sigma,E), and \mathbf{D}^* : \Gamma(F) \to \Gamma(E) denotes its formal adjoint with respect to some chosen L^2-pairing. The question we need to consider is local, so we place no assumptions on the base \Sigma, i.e. it could be simply a disk.

Question: If K \subset \ker \mathbf{D} and C \subset \ker \mathbf{D}^* are finite-dimensional subspaces, must the natural map

C \otimes_{\mathbb R} K \stackrel{\iota}{\longrightarrow} \Gamma(F \otimes_{\mathbb R} E)

defined by \iota(\xi \otimes \eta)(z) = \xi(z) \otimes \eta(z) be injective?

Answer: In general, no. For example take E and F to be the trivial line bundle over \Sigma = {\mathbb C}, with \mathbf{D} = \bar{\partial} and \mathbf{D}^* = -\partial, so their kernels are the spaces of holomorphic and antiholomorphic functions respectively. Now define K \subset \ker \bar{\partial} to be the complex span of the functions \eta_1(z) = 1 and \eta_2(z) = z, while C \subset \ker \partial is the complex span of \xi_1(z) = 1 and \xi_2(z) = \bar{z}. Then

\xi_1 \otimes i \eta_2 - i \xi_1 \otimes \eta_2 - \xi_2 \otimes i \eta_1 + i \xi_2 \otimes \eta_1 \in C \otimes_{\mathbb R} K

defines a nontrivial element in the real tensor product of the vector spaces C and K, but it also defines the zero-section of the bundle F \otimes_{\mathbb R} E.
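
If you would rather not expand this by hand, here is a short Python check. It is only a sketch under the obvious identifications, which are my choices for illustration: a complex number is stored as a vector in {\mathbb R}^2, and the fiber {\mathbb C} \otimes_{\mathbb R} {\mathbb C} is stored as {\mathbb R}^4 via the Kronecker product. The function names are hypothetical.

```python
import numpy as np

def as_real(w):
    """View a complex number as a vector in R^2."""
    w = complex(w)
    return np.array([w.real, w.imag])

def fiber_tensor(a, b):
    """The element a (x) b of C (x)_R C, stored as a vector in R^4."""
    return np.kron(as_real(a), as_real(b))

def counterexample_at(z):
    """Evaluate the section defined by the element above at the point z."""
    z = complex(z)
    xi1, xi2 = 1.0, z.conjugate()   # antiholomorphic: 1 and z-bar
    eta1, eta2 = 1.0, z             # holomorphic: 1 and z
    return (fiber_tensor(xi1, 1j * eta2) - fiber_tensor(1j * xi1, eta2)
            - fiber_tensor(xi2, 1j * eta1) + fiber_tensor(1j * xi2, eta1))

rng = np.random.default_rng(0)
samples = rng.standard_normal(5) + 1j * rng.standard_normal(5)
values = np.array([counterexample_at(z) for z in samples])
print(np.allclose(values, 0.0))
```

The element itself is nontrivial because its four terms are distinct members of the product basis of C \otimes_{\mathbb R} K built from the real bases \{\xi_1, i\xi_1, \xi_2, i\xi_2\} and \{\eta_1, i\eta_1, \eta_2, i\eta_2\}.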

It turns out that the reason for the horror scenario in the above example is that we allowed the spaces K and C to be complex. This is where the extra condition in our definition of {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) comes into play, as we will see that the problem goes away if we require K and C to be totally real.

Lemma 4 (quadratic unique continuation, cf. Proposition 5.1). For any sets of complex-linearly independent sections \eta_1,\ldots,\eta_p \in \ker \mathbf{D} and \xi_1,\ldots,\xi_q \in \ker \mathbf{D}^* in the above setting, no linear combination

\displaystyle \sum_{i,j} c^{ij} \xi_j \otimes \eta_i \in \Gamma(F \otimes_{\mathbb C} E)

with coefficients c^{ij} \in {\mathbb C} satisfying c^{ij} \ne 0 for some i,j vanishes to infinite order at any point.

This is a statement about complex bases and a complex tensor product, but in light of the canonical surjective bundle map F \otimes_{\mathbb R} E \to F \otimes_{\mathbb C} E and the fact that any real basis of a totally real subspace is also complex-linearly independent, it has the following consequence:

Corollary. If the finite-dimensional subspaces K \subset \ker\mathbf{D} and C \subset \ker \mathbf{D}^* are both totally real, then the natural map

C \otimes_{\mathbb R} K \to \Gamma(F \otimes_{\mathbb R} E)

is injective, and the nontrivial sections in its image do not vanish to infinite order at any point.
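
For contrast with the counterexample above, one can test the injectivity claim numerically in the same toy setting, now taking K and C to be the real spans of \{1, z\} and \{1, \bar{z}\}, which are totally real. The Python sketch below is illustrative only: the sample points, the identification of {\mathbb C} \otimes_{\mathbb R} {\mathbb C} with {\mathbb R}^4 and the function names are all my assumptions, and checking a rank at finitely many points says nothing about the stronger statement on infinite-order vanishing.

```python
import numpy as np

def as_real(w):
    """View a complex number as a vector in R^2."""
    w = complex(w)
    return np.array([w.real, w.imag])

def fiber_tensor(a, b):
    """The element a (x) b of C (x)_R C, stored as a vector in R^4."""
    return np.kron(as_real(a), as_real(b))

def section_samples(xi, eta, points):
    """Values of the section z -> xi(z) (x) eta(z) at the sample points, flattened."""
    return np.concatenate([fiber_tensor(xi(z), eta(z)) for z in points])

rng = np.random.default_rng(2)
points = rng.standard_normal(10) + 1j * rng.standard_normal(10)

# Real bases of the totally real subspaces K (holomorphic) and C (antiholomorphic).
etas = [lambda z: 1.0, lambda z: z]
xis = [lambda z: 1.0, lambda z: np.conj(z)]

# One column per basis tensor xi_j (x) eta_i of the 4-dimensional space C (x)_R K.
columns = [section_samples(xi, eta, points) for eta in etas for xi in xis]
evaluation_matrix = np.stack(columns, axis=1)
rank = np.linalg.matrix_rank(evaluation_matrix)
print(evaluation_matrix.shape, rank)
```

Full column rank means that no nontrivial real linear combination of the four basis tensors vanishes at all of the sample points, so the natural map is injective in this example.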

You’ll gain some valuable intuition about Lemma 4 if you work out the case where \mathbf{D} = \bar{\partial} on a trivial vector bundle, so the \eta_i are all holomorphic functions and the \xi_j are antiholomorphic functions. The proof in this situation is not very hard: it hinges on the fact that all terms in the Taylor series of \eta_i are powers of z, while those in the Taylor series of \xi_j are powers of \bar{z}. Putting them together in a complex tensor product thus leaves them “decoupled” so that no nontrivial linear combination can kill them, so long as the original sets of Taylor coefficients are complex-linearly independent. For the general case, one cannot use Taylor series, but it is still useful to note that—as a corollary of the usual unique continuation results (based on the similarity principle or whatever else you prefer)—the first nonvanishing term in the Taylor expansion of each \eta_i is a power of z, and similarly for \xi_j with powers of \bar{z}. The proof thus proceeds following the same idea as in the (anti-)holomorphic case, but paying specific attention to the first nontrivial term in the Taylor expansion at each step.

A concluding remark on integrability

I’m sure no one will argue with me when I say that integrable complex structures are not “generic,” and indeed, some of the corollaries of Theorem D are known to be false in the integrable case. For instance, Bryan and Pandharipande have found examples in algebraic Calabi-Yau 3-folds for which super-rigidity fails. While I honestly don’t know whether the totally real condition in my definition of {\mathcal M}^{d,*}_G(\mathbf{k},\mathbf{c}) is essential, I find it amusing and slightly poetic that the generic almost complex structures provided by Theorem D are guaranteed to be non-integrable. That does not mean of course that results like super-rigidity cannot possibly hold in integrable settings: there are also known cases in which they do, but when it happens, it is for reasons completely unrelated to any of what I’ve been talking about here.

Is the Große Fuge over yet?

Update (19.12.2017): I have edited this post (and will shortly also be updating the paper on the arXiv) to correct a minor error in representation theory. The correction necessitated a change in the definition of the space {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}), so that some dimensions that used to be real are now dimensions over the endomorphism algebra {\mathbb K}_i = \text{End}_G(W_i). This change (fortunately) has no adverse impact on the main applications concerning super-rigidity and transversality for multiple covers. Many thanks to Thomas Walpuski and Aleksander Doan for catching the error.


Regular presentations and twisted Cauchy-Riemann operators

This is the first of two followup posts that I promised at the end of “Transversality for multiple covers, super-rigidity, and all that”. I want to fill in some details about the natural splitting

\mathbf{D}_u^N = \bigoplus_{i=1}^N (\mathbf{D}_u^{\boldsymbol{\theta}_i})^{\oplus k_i}

that exists for the normal Cauchy-Riemann operator \mathbf{D}_u^N : \Gamma(N_u) \to \Omega^{0,1}(\Sigma,N_u) of any multiply covered J-holomorphic curve u = v \circ \varphi. As I mentioned in the other post, the main thing you need to know before solving any given transversality problem for multiple covers is the Fredholm indices of these operators. So let’s start by understanding what the operators are.

The idea behind this splitting is presumably standard in some circles, but it was new to me when I read about it in Taubes’s 1996 paper Counting pseudo-holomorphic submanifolds in dimension 4. The starting point is the observation that if \varphi is a regular covering map, then sections of the pullback bundle N_u = \varphi^*N_v can be identified naturally with sections of some tensor product bundle N_v \otimes V determined by a permutation representation of the automorphism group of \varphi. The irreducible subrepresentations then give rise to subbundles of N_v \otimes V, which correspond to subspaces of \Gamma(N_u) and \Omega^{0,1}(\Sigma,N_u) that are respected by \mathbf{D}_u^N.

Since Taubes was mainly interested in multiply covered tori covering embedded tori, he did not need to consider cases where \varphi has branch points or fails to be regular (meaning |\text{Aut}(\varphi)| < \text{deg}(\varphi)), but we do. In the following, we will deal with curves u = v \circ \varphi where v : (\Sigma,j) \to (M,J) is any closed somewhere injective J-holomorphic curve and

\varphi : (\Sigma',j') \to (\Sigma,j)

is an arbitrary holomorphic branched cover of closed Riemann surfaces, with degree d, which we assume to be at least 2. Everything I say here can be generalized to punctured surfaces—which I expect to have some interesting applications in SFT and ECH—but I will not discuss that in this post. I should mention that a slightly different variation on Taubes’s splitting idea has been used in Eftekhary’s approach to the super-rigidity problem, though it appears to be limited to the case of regular branched covers.

To simplify notation, let us denote the Cauchy-Riemann type operator \mathbf{D}_v^N simply by \mathbf{D}, with domain and target bundles

E := N_v, F := \overline{\text{Hom}}_{\mathbb C}(T\Sigma,N_v).

We can regard \mathbf{D}_u^N as a pullback of \mathbf{D} and will thus denote it by \varphi^*\mathbf{D}. Its domain and target bundles are

E^\varphi := N_u = \varphi^*N_v and F^\varphi := \overline{\text{Hom}}_{\mathbb{C}}(T\Sigma',N_u),

and the two operators are related to each other by

(\varphi^*\mathbf{D})(\eta \circ \varphi) = \varphi^*(\mathbf{D}\eta) for all \eta \in \Gamma(E).
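As a sanity check, here is the pullback relation in the simplest local model. The setup is an assumption made only for illustration: E is trivial, \mathbf{D} = \bar{\partial}, w is a local holomorphic coordinate on \Sigma and z is one on \Sigma', so that \varphi is a holomorphic function w = \varphi(z). The chain rule then gives

```latex
\frac{\partial}{\partial \bar{z}} (\eta \circ \varphi)(z)
  = \frac{\partial \eta}{\partial \bar{w}}(\varphi(z)) \cdot \overline{\varphi'(z)},
\qquad
\varphi^*\left( f \, d\bar{w} \right) = (f \circ \varphi) \, \overline{\varphi'} \, d\bar{z}.
```

Taking f = \partial \eta / \partial \bar{w} shows that both sides of the relation agree in this model. Note that the pulled back form vanishes wherever \varphi' = 0, i.e. at the branch points.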

Branch points as punctures

Now is the moment to reveal a very small lie that I’ve already told you a few times: the splitting I’m going to define is not actually a decomposition of the operator \varphi^*\mathbf{D} : \Gamma(E^\varphi) \to \Gamma(F^\varphi), but is instead a decomposition of a different operator that is, for purposes of transversality questions, equivalent to it. The reason for this extra complication is that I would like to apply a certain amount of standard covering space theory to the branched cover \varphi : \Sigma' \to \Sigma, but before I can do that, I need to remove its branch points so that it becomes an honest covering map. Let

\Theta \subset \Sigma

denote a finite set that contains all the critical values of \varphi; in most situations we will simply take \Theta to be the set of critical values, but in some cases it is useful to have a bit more freedom. With this choice in place, define

\dot{\Sigma} := \Sigma \setminus \Theta, \Theta' := \varphi^{-1}(\Theta), \dot{\Sigma}' := \Sigma' \setminus \Theta',

so that \varphi now restricts to a degree d covering map of punctured surfaces

\dot{\Sigma}' \stackrel{\varphi}{\longrightarrow} \dot{\Sigma}.

We will denote the restrictions of the bundles E and F to the punctured domain \dot{\Sigma} by

\dot{E} := E|_{\dot{\Sigma}} and \dot{F} := F|_{\dot{\Sigma}},

and denote by \dot{\mathbf{D}} : \Gamma(\dot{E}) \to \Gamma(\dot{F}) the corresponding restriction of the operator \mathbf{D} : \Gamma(E) \to \Gamma(F). In order to get useful analytical results from such a restriction, we need to choose suitable Banach space completions so that \mathbf{D} and \dot{\mathbf{D}} end up having kernels and cokernels of the same dimension. The way to do this is straightforward, but it may strike some readers as odd at first glance.

Choose any k \in \mathbb{N} and p \in (1,\infty), so that \mathbf{D} becomes a bounded linear operator between Sobolev spaces

\mathbf{D} : W^{k,p}(E) \to W^{k-1,p}(F),

and elliptic regularity implies that \ker\mathbf{D} is a fixed finite-dimensional space of smooth sections independent of the choice of k and p; for similar reasons, the Fredholm index of \mathbf{D} also does not depend on these choices, hence neither does the dimension of its cokernel. The corresponding choice of Sobolev spaces for \dot{\mathbf{D}} requires exponential weights at the punctures. To define this, choose a trivialization of E over a neighborhood of each point z_0 \in \Theta, together with holomorphic cylindrical coordinates (s,t) identifying a punctured neighborhood of z_0 with the cylindrical end [0,\infty) \times S^1. Then for any \delta > 0, we define the Banach space

W^{k,p,-\delta}(\dot{E}) = \{ \eta \in W^{k,p}_{\text{loc}}(\dot{E})\ |\ e^{-\delta s} \eta \in W^{k,p}([0,\infty) \times S^1) on each cylindrical end near \Theta \}.

Pay careful attention to the signs here: sections in W^{k,p,-\delta}(\dot{E}) need not be bounded near the punctures, as they are allowed to grow exponentially at a rate bounded by \delta. A Banach space of this type will never arise as a tangent space to any reasonable Banach manifold of nonlinear maps, but it is the right space for our present purposes, due to the following result:

Lemma. For any \delta > 0 sufficiently small, the operators \mathbf{D} : W^{k,p}(E) \to W^{k-1,p}(F) and \dot{\mathbf{D}} : W^{k,p,-\delta}(\dot{E}) \to W^{k-1,p,-\delta}(\dot{F}) have the same Fredholm index, and the map \eta \mapsto \eta|_{\dot{\Sigma}} defines an isomorphism \ker\mathbf{D} \to \ker \dot{\mathbf{D}}.

The proof of this is a straightforward application of elliptic regularity results, including the asymptotic formulas for Cauchy-Riemann type equations on half-cylinders due to Hofer-Wysocki-Zehnder et al. (see the appendix of Siefring’s asymptotics paper). The main point is that if \eta \in W^{k,p,-\delta}(\dot{E}) satisfies \dot{\mathbf{D}}\eta = 0 and \delta > 0 is small enough so that a certain asymptotic operator has no eigenvalues between 0 and \delta, then asymptotic formulas imply that \eta does not actually grow exponentially, but is bounded well enough at infinity to deduce that it extends over the punctures as an element of W^{k,p}(E). In this context, it should not be very surprising that we impose an exponential growth condition instead of exponential decay: some exponential weight is necessary since \dot{\mathbf{D}} would otherwise fail to be Fredholm, but a decay condition would mean that we only see elements of \ker\mathbf{D} that happen to vanish at \Theta, which most of them will not.
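To see concretely why the sign of the weight matters, consider a section that is constant, \eta \equiv c, in the chosen trivialization on a cylindrical end. This is the model for a smooth section of E that does not vanish at the puncture. The normalization S^1 = {\mathbb R} / {\mathbb Z} below is my assumption; any other choice only changes the result by a constant factor.

```latex
\int_{[0,\infty) \times S^1} \left| e^{-\delta s} c \right|^p \, ds \, dt
  = |c|^p \int_0^\infty e^{-p \delta s} \, ds
  = \frac{|c|^p}{p \delta} < \infty,
\qquad \text{whereas} \qquad
\int_{[0,\infty) \times S^1} |c|^p \, ds \, dt = \infty
\quad \text{for } c \neq 0.
```

Since all derivatives of a constant vanish, \eta belongs to W^{k,p,-\delta} on the end for every \delta > 0, but it fails to be in the unweighted space, and it fails even more badly for a weight of the opposite sign.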

The above discussion applies just as well to the pulled back operator \varphi^*\mathbf{D} as it does to \mathbf{D}: everything is fine as long as the exponential weight \delta > 0 is sufficiently small. We will assume from now on that both \dot{\mathbf{D}} and \varphi^*\dot{\mathbf{D}} are defined on weighted Sobolev spaces of this form. Notice that since branch points have been removed, the domain and target bundles of the latter can both be identified with pullback bundles:

\varphi^*\dot{\mathbf{D}} : W^{k,p,-\delta}(\varphi^*\dot{E}) \to W^{k-1,p,-\delta}(\varphi^*\dot{F}).

Regular presentations of branched covers

The next step is to encode the symmetries of \varphi : \dot{\Sigma}' \to \dot{\Sigma} in group-theoretic terms. The most obvious group to look at for this purpose is the finite automorphism group \text{Aut}(\varphi), but unless \varphi happens to be regular (i.e. normal), this group might be too small to contain all of the information we need—it might even be trivial. A more useful notion is the following: one can always identify \varphi : \dot{\Sigma}' \to \dot{\Sigma} with a map of the form

\left( \dot{\Sigma}'' \times \{1,\ldots,d\} \right) \Big/ G \to \dot{\Sigma} : [(z,i)] \mapsto \pi(z),

for some connected regular covering map \pi : \dot{\Sigma}'' \to \dot{\Sigma}, with a finite group G = \text{Aut}(\pi) acting on \dot{\Sigma}'' by deck transformations and acting transitively on \{1,\ldots,d\} via some injective homomorphism to the symmetric group,

\rho : G \to S_d.

Such a presentation can be derived from the following standard picture of covering spaces. Let \tilde{\pi} : {\mathscr U} \to \dot{\Sigma} denote the universal cover of \dot{\Sigma}, with \text{Aut}(\tilde{\pi}) = \pi_1(\dot{\Sigma}). After choosing an ordering for the lifts in \dot{\Sigma}' of a base point in \dot{\Sigma}, every based loop in \dot{\Sigma} lifts to paths in \dot{\Sigma}' that permute the lifted base points, thus defining a homomorphism \tilde{\rho} : \pi_1(\dot{\Sigma}) \to S_d such that \varphi : \dot{\Sigma}' \to \dot{\Sigma} can now be identified with the covering map

\left( {\mathscr U} \times \{1,\ldots,d\} \right) \Big/ \pi_1(\dot{\Sigma}) \to \dot{\Sigma} : [(z,i)] \mapsto \tilde{\pi}(z).

The group G := \pi_1(\dot{\Sigma}) / \ker\tilde{\rho} is now finite (with order at most d!) since \tilde{\rho} descends to an injection \rho : G \to S_d, and \tilde{\pi} likewise descends to \dot{\Sigma}'' := {\mathscr U} / \ker\tilde{\rho} as a regular cover \pi : \dot{\Sigma}'' \to \dot{\Sigma} with automorphism group G. If \Theta is chosen to contain only the critical values of \varphi : \Sigma' \to \Sigma, then we refer to G as the generalized automorphism group of \varphi, and one can identify it with the quotient of \pi_1(\dot{\Sigma}) by the normal core of the subgroup \varphi_*\pi_1(\dot{\Sigma}'). Notice that G is always a nontrivial group, with order at least d and at most d!. The lower bound is attained if and only if \varphi_*\pi_1(\dot{\Sigma}') equals its normal core, which just means that \varphi is regular and G is in this case \text{Aut}(\varphi). One can show that \varphi : \dot{\Sigma}' \to \dot{\Sigma} and \pi : \dot{\Sigma}'' \to \dot{\Sigma} are isomorphic covers if \varphi is regular; more generally, \pi always factors through another cover that is isomorphic to \varphi.
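The construction of G can be tried out on small examples. The following is a minimal sketch, assuming we are simply handed the permutations that generate the image of \tilde{\rho}; the two sets of generators below are hypothetical monodromy data for degree 3 covers, and the function names are mine, not from the paper. It computes G as a subgroup of S_d and checks the bounds d \le |G| \le d!.

```python
from math import factorial

def compose(p, q):
    """Return the permutation 'p after q'; permutations are tuples on 0..d-1."""
    return tuple(p[q[i]] for i in range(len(q)))

def generated_group(generators):
    """Subgroup of S_d generated by the given permutations (closure under composition)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_transitive(group, d):
    """G acts transitively on {0,...,d-1} iff the orbit of 0 is everything."""
    return {g[0] for g in group} == set(range(d))

d = 3
# Hypothetical monodromy of a cyclic triple cover: every loop acts by a power of a 3-cycle.
G_cyclic = generated_group([(1, 2, 0)])
# Hypothetical monodromy of a simple triple cover: some loops act by transpositions.
G_simple = generated_group([(1, 0, 2), (0, 2, 1)])

for name, G in [("cyclic", G_cyclic), ("simple", G_simple)]:
    assert is_transitive(G, d) and d <= len(G) <= factorial(d)
    print(name, "|G| =", len(G), "regular cover:", len(G) == d)
```

In the cyclic example |G| = 3 = d, so the cover is regular and G = \text{Aut}(\varphi). In the second example G is all of S_3, so |G| = 6 > d and the cover is not regular.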

A regular presentation of \varphi : \Sigma' \to \Sigma is defined in general to mean a choice of finite set \Theta \subset \Sigma containing the critical values, plus an identification of \dot{\Sigma}' with (\dot{\Sigma}'' \times \{1,\ldots,d\}) / G as described above for some regular cover \pi : \dot{\Sigma}'' \to \dot{\Sigma} with a finite automorphism group G that acts transitively on \{1,\ldots,d\} via a homomorphism \rho : G \to S_d. Here we need not generally assume that \rho is injective, but if this holds and \Theta is chosen to contain only the critical values of \varphi, then we say the regular presentation is minimal; in this case it is necessarily isomorphic to the one we constructed above, with G as the generalized automorphism group of \varphi. It is not hard to show that for any regular presentation, the regular cover \pi : \dot{\Sigma}'' \to \dot{\Sigma} extends to a regular holomorphic branched cover of closed Riemann surfaces

\pi : (\Sigma'',j'') \to (\Sigma,j)

such that \dot{\Sigma}'' = \Sigma'' \setminus \Theta'' where \Theta'' := \pi^{-1}(\Theta).

The stratification theorem I stated in the previous post requires fixing the minimal regular presentation of \varphi : \Sigma' \to \Sigma, but non-minimal presentations can also be useful in certain inductive arguments, e.g. in the proof of super-rigidity. For the construction below, we fix any regular presentation.

The twisted bundle construction

We now come to the heart of the matter. Identifying \dot{\Sigma}' with (\dot{\Sigma}'' \times \{1,\ldots,d\}) / G as described above, suppose

\boldsymbol{\theta} : G \to \text{Aut}_{\mathbb R}(W)

is a real finite-dimensional representation of G. We can use this to associate with \dot{E} \to \dot{\Sigma} a new complex vector bundle

E^{\boldsymbol{\theta}} := \dot{E} \otimes W^{\boldsymbol{\theta}} \to \dot{\Sigma}

by defining W^{\boldsymbol{\theta}} \to \dot{\Sigma} to be the flat real vector bundle

W^{\boldsymbol{\theta}} := \left( \dot{\Sigma}'' \times W\right) \Big/ G,

with G acting on \dot{\Sigma}'' via deck transformations and on W via the representation \boldsymbol{\theta}. I am referring to W^{\boldsymbol{\theta}} as “flat” because the trivial connection on \dot{\Sigma}'' \times W \to \dot{\Sigma}'' descends naturally to a flat connection on W^{\boldsymbol{\theta}} \to \dot{\Sigma}. Defining F^{\boldsymbol{\theta}} := \dot{F} \otimes W^{\boldsymbol{\theta}} similarly, our Cauchy-Riemann operator \dot{\mathbf{D}} : \Gamma(\dot{E}) \to \Gamma(\dot{F}) now determines a “twisted” Cauchy-Riemann type operator

\mathbf{D}^{\boldsymbol{\theta}} : \Gamma(E^{\boldsymbol{\theta}}) \to \Gamma(F^{\boldsymbol{\theta}})

by setting \mathbf{D}^{\boldsymbol{\theta}}(\eta \otimes v) := (\dot{\mathbf{D}}\eta) \otimes v for any smooth local sections \eta of \dot{E} and v of W^{\boldsymbol{\theta}} such that v is flat.

The purpose of this construction becomes clearer when we consider the following choice of representation: the homomorphism \rho : G \to S_d determines a permutation representation

\boldsymbol{\rho} : G \to \text{GL}(d,{\mathbb R})

which acts on the standard basis vectors e_i \in {\mathbb R}^d for i=1,\ldots,d by \boldsymbol{\rho}(g) e_i := e_{\rho(g)(i)}. Using the obvious identification of E^{\boldsymbol{\rho}} with (\pi^*\dot{E} \otimes {\mathbb R}^d) / G, we can write global sections \eta \in \Gamma(E^{\boldsymbol{\rho}}) as G-equivariant sections \sum_i \eta^i \otimes e_i of \pi^*\dot{E} \otimes {\mathbb R}^d, where \eta^1,\ldots,\eta^d are now sections of \pi^*\dot{E} \to \dot{\Sigma}''. Denoting the action of G on \dot{\Sigma}'' by (g,z) \mapsto g \cdot z, the equivariance condition means

\eta^i(z) = \eta^{\rho(g)(i)}(g \cdot z)

for every z \in \dot{\Sigma}'', i \in \{1,\ldots,d\} and g \in G. It follows that we can now associate to \eta a section \hat{\eta} of \varphi^*\dot{E}, defined at [(z,i)] \in (\dot{\Sigma}'' \times \{1,\ldots,d\}) / G = \dot{\Sigma}' by

\hat{\eta}([(z,i)]) = \eta^i(z),

and the equivariance condition guarantees that this is well defined. Conversely, any section of \varphi^*\dot{E} written in this way gives rise to sections \eta^1,\ldots,\eta^d of \pi^*\dot{E} that automatically satisfy the equivariance condition, so that \eta = \sum_i \eta^i \otimes e_i forms a well-defined section of E^{\boldsymbol{\rho}}. Doing the same thing with the bundle F^{\boldsymbol{\rho}} and looking again at our Cauchy-Riemann operators, we conclude:

Proposition. The correspondence \sum_i \eta^i \otimes e_i \leftrightarrow \hat{\eta} described above defines natural bijections \Gamma(E^{\boldsymbol{\rho}}) \leftrightarrow \Gamma(\varphi^*\dot{E}) and \Gamma(F^{\boldsymbol{\rho}}) \leftrightarrow \Gamma(\varphi^*\dot{F}) which identify the Cauchy-Riemann type operators \mathbf{D}^{\boldsymbol{\rho}} and \varphi^*\dot{\mathbf{D}}.
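It may help to unpack this in the degree 2 case. Here I assume the minimal regular presentation, so G = {\mathbb Z}_2 = \{1,\psi\}, \dot{\Sigma}'' = \dot{\Sigma}', and \rho(\psi) is the transposition of \{1,2\}. The equivariance condition and the definition of \hat{\eta} then read

```latex
\eta^1(z) = \eta^2(\psi \cdot z), \qquad \eta^2(z) = \eta^1(\psi \cdot z),
\qquad
\hat{\eta}([(z,1)]) = \eta^1(z) = \eta^2(\psi \cdot z) = \hat{\eta}([(\psi \cdot z, 2)]).
```

Every equivalence class has exactly one representative of the form (z,1), so \hat{\eta} is determined by \eta^1, which can be an arbitrary section, while \eta^2 = \eta^1 \circ \psi carries no additional information.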

Identifying \varphi^*\dot{\mathbf{D}} with \mathbf{D}^{\boldsymbol{\rho}} makes the operator easy to decompose: any splitting of {\mathbb R}^d into G-invariant subspaces produces a splitting of W^{\boldsymbol{\rho}} into subbundles such that \mathbf{D}^{\boldsymbol{\rho}} will respect the resulting splittings of E^{\boldsymbol{\rho}} and F^{\boldsymbol{\rho}}. The restriction of \mathbf{D}^{\boldsymbol{\rho}} to one of these subbundles will then be the same as \mathbf{D}^{\boldsymbol{\theta}} if \boldsymbol{\theta} is the representation defined by restricting \boldsymbol{\rho} to the corresponding invariant subspace. To put it another way, if we fix a list \boldsymbol{\theta}_1,\ldots,\boldsymbol{\theta}_N of the distinct irreducible real representations of G, then

\boldsymbol{\rho} \cong \boldsymbol{\theta}_1^{\oplus k_1} \oplus \ldots \oplus \boldsymbol{\theta}_N^{\oplus k_N},

for uniquely determined integers k_1,\ldots,k_N \ge 0, and this produces splittings of the bundles E^{\boldsymbol{\rho}} and F^{\boldsymbol{\rho}} such that

\mathbf{D}^{\boldsymbol{\rho}} \cong (\mathbf{D}^{\boldsymbol{\theta}_1})^{\oplus k_1} \oplus \ldots \oplus (\mathbf{D}^{\boldsymbol{\theta}_N})^{\oplus k_N}.

It is often not easy to see directly which subspace of \Gamma(\varphi^*\dot{E}) a given subrepresentation of \boldsymbol{\rho} corresponds to, but the following example should be helpful to keep in mind. Since \boldsymbol{\rho} : G \to \text{GL}(d,{\mathbb R}) acts on {\mathbb R}^d by permuting coordinates, we can single out two subspaces that are always invariant: write

{\mathbb R}^d = W_+ \oplus W_-

where W_+ is the 1-dimensional subspace spanned by (1,\ldots,1), and W_- is its orthogonal complement, consisting of vectors whose coordinates add up to zero. This gives a decomposition \boldsymbol{\rho} = \boldsymbol{\theta}_+ \oplus \boldsymbol{\theta}_-, where \boldsymbol{\theta}_+ is the trivial representation. The subspace of \Gamma(\varphi^*\dot{E}) determined by \boldsymbol{\theta}_+ is the one that we’ve previously called \Gamma_+(\varphi^*\dot{E}) in the degree 2 case: it consists of all sections of the form \eta \circ \varphi for \eta \in \Gamma(\dot{E}). The subspace \Gamma_-(\varphi^*\dot{E}) corresponding to \boldsymbol{\theta}_- consists of all sections \eta with the property that for every \zeta \in \dot{\Sigma},

\displaystyle \sum_{z \in \varphi^{-1}(\zeta)} \eta(z) = 0.

For example, if u = v \circ \varphi is a degree two cover, then \boldsymbol{\rho} : {\mathbb Z}_2 \to \text{GL}(2,{\mathbb R}) is the unique nontrivial permutation representation of {\mathbb Z}_2, whose decomposition into irreducible subrepresentations is simply \boldsymbol{\theta}_+ \oplus \boldsymbol{\theta}_-, producing the familiar decomposition

\mathbf{D}_u^N = \mathbf{D}_u^+ \oplus \mathbf{D}_u^-.

In the general case, this decomposition still exists, but one must expect W_- to admit smaller invariant subspaces, so that \boldsymbol{\theta}_- is reducible and gives rise to a further splitting of \mathbf{D}_u^-. Notice that since G acts transitively on \{1,\ldots,d\}, W_+ is the largest subspace of {\mathbb R}^d on which \boldsymbol{\rho} acts trivially, hence if we list the irreducible representations \boldsymbol{\theta}_1,\ldots,\boldsymbol{\theta}_N with the trivial representation first, then the first term in our general splitting of \mathbf{D}^{\boldsymbol{\rho}} always satisfies k_1 = 1 and matches \mathbf{D}^{\boldsymbol{\theta}_+}. Moreover, this operator is equivalent to \dot{\mathbf{D}} under the obvious bundle isomorphism between \dot{E} and E^{\boldsymbol{\theta}_+} = \dot{E} \otimes W^{\boldsymbol{\theta}_+} resulting from the fact that W^{\boldsymbol{\theta}_+} is canonically trivial.
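The claims about W_+ and W_- are easy to test numerically. Below is a small sketch, assuming a hypothetical example with G = S_3 acting on three points (indexed from 0 for convenience); the helper names are mine. It checks that (1,\ldots,1) is fixed, that the sum-zero condition is preserved, and that the fixed subspace has dimension 1 for a transitive action, using Burnside's count of orbits as the average number of fixed points.

```python
from fractions import Fraction
from itertools import permutations

def act(g, vec):
    """Permutation representation rho(g) e_i = e_{g(i)}: coordinate i moves to slot g(i)."""
    out = [0] * len(vec)
    for i, x in enumerate(vec):
        out[g[i]] = x
    return out

def fixed_points(g):
    """Number of indices fixed by g, i.e. the trace of the permutation matrix rho(g)."""
    return sum(1 for i, gi in enumerate(g) if gi == i)

def trivial_multiplicity(group):
    """Dimension of the subspace fixed by the whole group (average of the traces)."""
    return Fraction(sum(fixed_points(g) for g in group), len(group))

S3 = list(permutations(range(3)))  # hypothetical choice of G, acting transitively
ones = [1, 1, 1]                   # spans W_+
w_minus = [2, -3, 1]               # an element of W_-: coordinates sum to zero

plus_invariant = all(act(g, ones) == ones for g in S3)
minus_invariant = all(sum(act(g, w_minus)) == 0 for g in S3)
k_trivial = trivial_multiplicity(S3)

# For contrast: a group that does not act transitively fixes a larger subspace.
not_transitive = [(0, 1, 2), (1, 0, 2)]
k_trivial_intransitive = trivial_multiplicity(not_transitive)

print(plus_invariant, minus_invariant, k_trivial, k_trivial_intransitive)
```

The transitive action gives k_1 = 1, while the intransitive group has two orbits and fixes a 2-dimensional subspace, which is why transitivity is needed for the statement k_1 = 1.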

Non-faithful representations

The equivalence of \mathbf{D}^{\boldsymbol{\theta}_+} with \dot{\mathbf{D}} is indicative of a more general phenomenon that is useful to be aware of: for a non-faithful representation \boldsymbol{\theta} : G \to \text{Aut}_{\mathbb R}(W), \mathbf{D}^{\boldsymbol{\theta}} can always be identified with a similar operator that comes from a branched cover of strictly smaller degree that \varphi factors through. The main idea is basically to take your regular presentation of \varphi and divide everything by the kernel of \boldsymbol{\theta}. I’ll refer you to the paper for details, rather than trying to explain them here, but the message is that for any given branched cover u = v \circ \varphi with its splitting of \mathbf{D}_u^N, the truly meaningful terms in the splitting are the ones that come from faithful representations. The rest are not actually telling you something about u, but rather about lower-degree covers of v that u factors through; or, in the case of the trivial representation, about v itself.

The regular case

I’ll conclude with one further observation about the splitting \mathbf{D}^{\boldsymbol{\rho}} \cong \bigoplus_{i=1}^N (\mathbf{D}^{\boldsymbol{\theta}_i})^{\oplus k_i}. Each integer k_i \ge 0 in this expression is the multiplicity with which the representation \boldsymbol{\theta}_i appears in \boldsymbol{\rho}, and while the trivial representation is always present with multiplicity k_1=1, some of the others can be zero in general. There is an important special case in which this cannot happen: if the cover \varphi is regular, then the permutation action of G on \{1,\ldots,d\} is not only transitive but also without fixed points, which means that \boldsymbol{\rho} : G \to \text{GL}(d,{\mathbb R}) is equivalent to the so-called regular representation. By a standard result in representation theory, every irreducible representation of G is a subrepresentation of the regular representation, so the integers k_i in this case are all positive. This has the consequence that everything one needs to know about the twisted operators \mathbf{D}^{\boldsymbol{\theta}_i} can be deduced by considering only regular covers. The proof of the stratification theorem—to be discussed in the next post—makes use of this.
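Here is a sketch of how the multiplicities k_i behave in a concrete case, computed from characters. The assumptions are mine: G = S_3, whose three irreducible representations (trivial, sign and the 2-dimensional standard one) are all realizable over {\mathbb R} with endomorphism algebra {\mathbb R}, so the usual character inner product computes the real multiplicities. For groups with irreducible real representations of complex or quaternionic type, the formula would need to be adjusted.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q'; permutations are tuples on 0..d-1."""
    return tuple(p[q[i]] for i in range(len(q)))

def fixed_points(g):
    return sum(1 for i, gi in enumerate(g) if gi == i)

def sign(g):
    """Sign of a permutation via its number of inversions."""
    inv = sum(1 for i in range(len(g)) for j in range(i + 1, len(g)) if g[i] > g[j])
    return -1 if inv % 2 else 1

S3 = list(permutations(range(3)))

# Characters of the irreducible representations of S_3.
irreps = {
    "trivial": lambda g: 1,
    "sign": sign,
    "standard": lambda g: fixed_points(g) - 1,  # 2-dimensional
}

def natural_character(g):
    """Character of the permutation representation for the action on 3 points (d = 3)."""
    return fixed_points(g)

def regular_character(g):
    """Character for G acting on itself by left multiplication (d = |G| = 6)."""
    return sum(1 for h in S3 if compose(g, h) == h)

def multiplicities(perm_character):
    return {name: Fraction(sum(perm_character(g) * chi(g) for g in S3), len(S3))
            for name, chi in irreps.items()}

natural_mults = multiplicities(natural_character)
regular_mults = multiplicities(regular_character)
print(natural_mults)
print(regular_mults)
```

For the action on three points, which models a non-regular cover of degree 3, the sign representation appears with multiplicity zero. For the free action of G on itself, every irreducible representation appears, with multiplicity equal to its dimension.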


Transversality for multiple covers, super-rigidity, and all that

Let’s talk some more about transversality.

“So… the inability of transversality to exist, or the inability to create the situation in which transversality is a potential is not unattainable. This is of course different than to say, transversality is possible or even to say, transversality is not impossible. And if all of this seems funny it is because it is, in fact, comedy, by definition, through the irreducibility of nothing.” Excerpted from a transhumanist(?) essay I found when I googled the word “transversality”.

Most of my readers know that transversality is a stressful topic in symplectic topology. Moduli spaces of pseudoholomorphic curves are generically nice smooth objects… except for when they aren’t, which is generally the case if they contain multiple covers, which they typically do. Various people have described various remedies for this over the years, usually involving words like “virtual”, “Kuranishi” or “polyfold”, and there is probably a theorem that for every proposed remedy, you can find at least one well-respected symplectic topologist willing to make denigrating and sarcastic remarks about it after a few drinks.

But that isn’t what this post is about. I want to talk about something more naive and concrete. I want to answer the following questions:

When is it possible for a multiply covered holomorphic curve to be regular? If it cannot be regular, then why not, and what is true instead?

I was surprised to learn sometime last year that these questions actually admit reasonable answers, and I think a lot of people who use holomorphic curves in their research will find the answers interesting and potentially useful. The details are written up in a preprint I put up last year called Transversality and super-rigidity for multiply covered holomorphic curves, which, among other things, answered an open question about the Gromov-Witten invariants of Calabi-Yau 3-folds. In this post I will try to explain some of the main ideas.

First, here are two sample theorems from the paper:

Theorem B. For generic J in a symplectic manifold of dimension at least 4, every unbranched cover of a closed somewhere injective J-holomorphic curve is Fredholm regular.[1]

Theorem C. Let {\mathcal M} denote a connected component of the moduli space of closed multiply covered J-holomorphic curves u = v \circ \varphi in a symplectic manifold, where v is a somewhere injective curve with a prescribed number of critical points with prescribed critical orders, and \varphi is a holomorphic branched cover with prescribed numbers of critical values and branch points with prescribed branching orders. Then if J is generic, we have the following alternative:

  1. No curve in {\mathcal M} is Fredholm regular.
  2. Regularity is achieved for all curves in an open and dense subset of {\mathcal M}.

The precise nature of the “moduli space of multiple covers” I’m referring to in this second result deserves more explanation, so I’ll come back to this below. The result says in effect that for each branched cover, transversality is either completely impossible for topological reasons or it is achieved with probability 1 for generic J, meaning there may be a nonempty subset in the space of multiply covered curves for which transversality fails, but this subset has measure 0. The first result is stronger since it does not have such a probabilistic caveat, but it requires the stronger hypothesis that the cover has no branch points.

Since I’m focusing on transversality in this post, I haven’t yet mentioned Theorem A in the paper, which states that in dimensions six and above, all somewhere injective index 0 curves are “super-rigid” for generic J. If you don’t already know what that means, then you don’t know why it was an exciting enough result to get top billing in my paper, but I’ll say a bit more about this below.

Why are we having this conversation?

Let’s clear up one thing before we continue: no matter how cleverly I can prove that certain multiple covers achieve transversality, there is no hope that results like this will ever fully replace virtual cycle or abstract perturbation methods for defining things like Gromov-Witten theory or SFT. That is not the goal. You might therefore ask: if more abstract methods are necessary anyway, what’s the point of struggling to understand transversality issues for actual holomorphic curves?

The answer is that it depends what kind of problem you want to solve. Abstractly perturbing the nonlinear Cauchy-Riemann equation destroys symmetry, which is good if achieving transversality is your only goal—transversality and symmetry are famously incompatible—but it also kills properties that sometimes carry interesting information. I’ll give you two examples:

  1. Most of the celebrated results about symplectic 4-manifolds or contact 3-manifolds based on holomorphic curve theory rely crucially on positivity of intersections, and so do some of the enumerative invariants in this context such as Taubes’s Gromov invariant and its 3-dimensional cousin, embedded contact homology (ECH). But positivity of intersections holds only for actual J-holomorphic curves, not for solutions to abstractly perturbed Cauchy-Riemann type equations.
  2. The Gopakumar-Vafa formula for Gromov-Witten invariants on Calabi-Yau 3-folds is often interpreted as a relation between the count of embedded curves and their “multiple cover contributions”. This notion ceases to have any meaning as soon as perturbations eliminate the distinction between simple curves and multiple covers.

My own motivation to think about this stuff came mainly from the first example, as I strongly suspect that the phenomena behind Theorems A, B and C will end up filling in a crucial missing piece of the analytical picture necessary for completing the definition of ECH, i.e. for defining cobordism maps and proving invariance without reference to Seiberg-Witten theory. Also, while it’s pretty clear that not all of the important transversality problems of SFT can be solved via these methods, I have a feeling they might at least go far enough to define something akin to the “semipositive case” of SFT, thus giving a rigorous version of the theory that works in some settings and is more concrete than the general case.


The proofs of Theorems B and C are based on a general result about the local structure of the space of multiply covered curves. Here is an outline of the idea.

The normal Cauchy-Riemann operator

You may be familiar with the fact that an immersed holomorphic curve u : (\Sigma,j) \to (M,J) is Fredholm regular if and only if a certain linear Cauchy-Riemann type operator on its normal bundle N_u \to \Sigma, the so-called normal Cauchy-Riemann operator

\mathbf{D}_u^N : \Gamma(N_u) \to \Omega^{0,1}(\Sigma,N_u),

is surjective. If you’re only reading this post for an explanation of super-rigidity, then this is all you need to know about \mathbf{D}_u^N and you can now skip to the next subsection. But to discuss Theorems B and C, we also need the generalization of this statement to non-immersed curves.

Recall that if u : (\Sigma,j) \to (M,J) is connected and non-constant, then it is necessarily immersed outside of a discrete set \text{Crit}(u) \subset \Sigma, and at each of the critical points z \in \text{Crit}(u), it has a well-defined critical order, which is a positive integer. For instance, the injective holomorphic curve u(z) = (z^{k+1},z^{k+2}) has critical order k at z=0. That this is well defined in general can be regarded as a consequence of the well-known theorem of Micallef and White on the structure of singularities for J-holomorphic curves, though there are also other ways to see it. A related fact is that there is a well-defined and smooth complex line bundle T_u \subset u^*TM whose fiber at each immersed point z \in \Sigma \setminus \text{Crit}(u) is simply \text{im}\, du(z). It is therefore natural to choose a complementary complex subbundle N_u \subset u^*TM and call it the generalized normal bundle of u.
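The critical order in the example above can be read off directly. Assuming the standard integrable complex structure on {\mathbb C}^2 (the simplest model in which the example makes sense), we have

```latex
du(z) = \left( (k+1) z^{k}, \ (k+2) z^{k+1} \right) dz
      = z^{k} \left( k+1, \ (k+2) z \right) dz .
```

The factor (k+1, (k+2)z) does not vanish at z=0, so du vanishes to order exactly k there, and the fiber of T_u at z=0 is the complex line spanned by (1,0), even though \text{im}\, du(0) is trivial.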

Now if \mathbf{D}_u : \Gamma(u^*TM) \to \Omega^{0,1}(\Sigma,u^*TM) denotes the usual linearized Cauchy-Riemann operator at u, the splitting u^*TM = T_u \oplus N_u decomposes it into block form as

\mathbf{D}_u = \begin{pmatrix} \mathbf{D}^T_u & \mathbf{D}^{TN}_u \\ \mathbf{D}_u^{NT} & \mathbf{D}_u^N \end{pmatrix}.

It is easy to check that the diagonal terms \mathbf{D}_u^T and \mathbf{D}_u^N are Cauchy-Riemann type operators on the bundles T_u and N_u respectively, while the off-diagonal terms are zeroth-order terms (in fact \mathbf{D}_u^{NT} vanishes identically). The Fredholm index of \mathbf{D}_u^N is the same as the virtual dimension of the moduli space if u is immersed, though in general they differ because c_1(N_u) differs from c_1(u^*TM) - \chi(\Sigma) by the algebraic count of critical points in u. What you may find more surprising is that even in the non-immersed case, \mathbf{D}_u^N fully characterizes transversality:

Lemma (see Theorem 3 in [W. 2010]). A non-constant J-holomorphic curve is Fredholm regular if and only if its normal Cauchy-Riemann operator is surjective. 

Cauchy-Riemann operators with symmetry

Abstract approaches to the transversality problem are typically based on the premise that transversality and symmetry are incompatible, therefore you need a perturbation that breaks the symmetry, e.g. by letting the complex structure j on \Sigma depend on points z \in \Sigma, or replacing the equation \bar{\partial}_J u = 0 with \bar{\partial}_J u = \nu or various similar ideas. But if you prefer to keep the symmetry instead of breaking it, then you have to make use of it somehow. Here we can use an important observation that I learned from a paper of Taubes.[2]

Assume u = v \circ \varphi : \widetilde{\Sigma} \to M is a multiply covered J-holomorphic curve, where v : \Sigma \to M is a simple curve and \varphi : \widetilde{\Sigma} \to \Sigma is a d-fold holomorphic branched cover. In this case, the symmetry of u endows its normal Cauchy-Riemann operator with a natural splitting

\mathbf{D}_u^N = \bigoplus_{i=1}^N (\mathbf{D}_u^{\boldsymbol{\theta}_i})^{\oplus k_i}.

Here \boldsymbol{\theta}_1,\ldots,\boldsymbol{\theta}_N are the real irreducible representations of some finite group G, each of which has an associated Fredholm operator \mathbf{D}_u^{\boldsymbol{\theta}_i} that is equivalent to a Cauchy-Riemann type operator on some bundle, and the k_i \ge 0 are integers. To understand what’s going on here, it’s instructive to start with the simplest nontrivial example: assume d=2, so \varphi : \widetilde{\Sigma} \to \Sigma is a branched double cover and therefore has automorphism group \text{Aut}(\varphi) = {\mathbb Z}_2. If \psi \in \text{Aut}(\varphi) denotes the generator of this group, then the space of sections of N_u has a natural splitting as \Gamma_+(N_u) \oplus \Gamma_-(N_u), where

\Gamma_\pm(N_u) := \{ \eta \in \Gamma(N_u)\ |\ \eta \circ \psi = \pm \eta \}.

The target bundle \overline{\text{Hom}}_{\mathbb C}(T\Sigma,N_u) for \mathbf{D}_u^N inherits a similar splitting, and the symmetry of u means that \mathbf{D}_u^N preserves the splitting, thus it decomposes into two pieces

\mathbf{D}_u^N = \mathbf{D}_u^+ \oplus \mathbf{D}_u^-.

The relation between this and the more general splitting above is as follows: G = {\mathbb Z}_2, \boldsymbol{\theta}_1 and \boldsymbol{\theta}_2 are its unique trivial and nontrivial irreducible representations respectively, \mathbf{D}_u^{\boldsymbol{\theta}_1} = \mathbf{D}_u^+, \mathbf{D}_u^{\boldsymbol{\theta}_2} = \mathbf{D}_u^-, and k_1 = k_2 = 1.

Notice that there is a canonical bijection between \Gamma_+(N_u) and \Gamma(N_v) sending any section \eta in the latter to the symmetric section \eta \circ \varphi. Under this identification, \mathbf{D}_u^+ is identified with \mathbf{D}_v^N, the normal Cauchy-Riemann operator for the underlying simple curve. This observation is obviously helpful since we already know how to prove that \mathbf{D}_v^N is surjective for generic J; the transversality problem for u has thus been reduced to a question about the operator \mathbf{D}_u^-.
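For readers who like to see such things in finite dimensions, here is a toy sketch of the {\mathbb Z}_2 splitting. It is only an illustration under assumptions of my own choosing, not the actual operator \mathbf{D}_u^N: "sections" are lists of even length, the involution psi is a cyclic shift by half the length, and the equivariant operator is a cyclic second difference. The names psi, split and operator are hypothetical.

```python
# Toy model of the Z_2 splitting (illustration only, not the real D_u^N).
# A "section" is a list of length 2*m; the deck transformation psi acts by
# the cyclic shift i -> i + m, which is an involution.

def psi(eta):
    """Pull back a 'section' by the involution (shift by half the length)."""
    m = len(eta) // 2
    return eta[m:] + eta[:m]

def split(eta):
    """Return (plus, minus) with psi(plus) == plus and psi(minus) == -minus."""
    shifted = psi(eta)
    plus = [(a + b) / 2 for a, b in zip(eta, shifted)]
    minus = [(a - b) / 2 for a, b in zip(eta, shifted)]
    return plus, minus

def operator(eta):
    """A linear operator commuting with psi: the cyclic second difference."""
    n = len(eta)
    return [eta[(i + 1) % n] - 2 * eta[i] + eta[(i - 1) % n] for i in range(n)]

eta = [1.0, 4.0, -2.0, 0.5, 3.0, 7.0]
plus, minus = split(eta)

# Because the operator commutes with psi, it maps symmetric sections to
# symmetric ones and antisymmetric sections to antisymmetric ones.
image_plus = operator(plus)
image_minus = operator(minus)
```

The symmetric part is determined by its values on half of the index set, which is the toy analogue of identifying \Gamma_+(N_u) with \Gamma(N_v).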

For covers of arbitrary degree d \ge 2, one can still define a natural splitting \mathbf{D}_u^N = \mathbf{D}_u^+ \oplus \mathbf{D}_u^- in which \mathbf{D}_u^+ is identified with \mathbf{D}_v^N, but the factor \mathbf{D}_u^- can be split further if d \ge 3. This is a somewhat longer story, so I will make it the subject of a separate post, but the picture in general is as sketched above: there is a finite group G whose real irreducible representations \boldsymbol{\theta}_i give rise to so-called twisted Cauchy-Riemann operators \mathbf{D}_u^{\boldsymbol{\theta}_i}, which are all Fredholm. These should be regarded as the fundamental building blocks of transversality theory for multiple covers. Here we can always assume \boldsymbol{\theta}_1 is the trivial representation of G and k_1 = 1, so the first factor (\mathbf{D}_u^{\boldsymbol{\theta}_1})^{\oplus k_1} in the splitting can be identified with \mathbf{D}_u^+ \cong \mathbf{D}_v^N, and the rest is a splitting of \mathbf{D}_u^-.  The group G always has order at least d, as it is the automorphism group of some normal[3] branched cover of \Sigma that factors through \varphi. Such normal covers always exist, and in fact there is a canonical one up to isomorphism, with degree at most d!, whose automorphism group I refer to as the generalized automorphism group of \varphi. If \varphi is already normal, then the canonical normal cover factoring through it is isomorphic to \varphi itself, so we can take G = \text{Aut}(\varphi).

Moduli spaces of multiple covers

Given a tame almost complex structure J, an integer d \ge 2, a finite group G and some additional combinatorial data to be specified below, we now consider a moduli space

{\mathcal M}^d_G(J) = \{ u = v \circ \varphi \},

where:

  • v is a somewhere injective J-holomorphic curve with some prescribed nonnegative number of marked points, each of which is constrained to be a critical point with a prescribed critical order, and v is immersed everywhere else.
  • \varphi is a holomorphic map of closed Riemann surfaces with degree d and generalized automorphism group isomorphic to G, with some prescribed nonnegative number of marked points such that every branch point is one of the marked points and each has a prescribed branching order. Moreover, we also prescribe whether any given pair of branch points are mapped to the same point, thus determining the total number of critical values.

By forgetting the marked points, we can regard {\mathcal M}^d_G(J) as a subset of the usual moduli space of smooth unparametrized J-holomorphic curves (including both simple curves and multiple covers). It should be clear that every d-fold multiply covered J-holomorphic curve u = v \circ \varphi belongs to {\mathcal M}^d_G(J) for appropriate choices of the finite group G and the combinatorial data that prescribes the critical points of v and branching behavior of \varphi. Note that the set of all possible choices of this combinatorial data is countable. It is also easy to see that {\mathcal M}^d_G(J) is a smooth manifold for generic J, as the prescribed critical points determine a smooth submanifold of the usual space of simple curves, with codimension depending on the critical orders, while for any fixed v, the moduli space of branched covers \varphi with fixed branching data is a manifold of real dimension twice the number of critical values of \varphi. This is all more or less standard.

The point of prescribing critical and branching behavior in this way is that as v and \varphi move around to produce various multiple covers u = v \circ \varphi \in {\mathcal M}^d_G, the splitting

\mathbf{D}_u^N = \bigoplus_{i=1}^N (\mathbf{D}_u^{\boldsymbol{\theta}_i})^{\oplus k_i}

changes continuously. In particular, prescribing the critical points of v makes the generalized normal bundle N_v into a continuous family of bundles parametrized by v; this would not be true if v were allowed to move arbitrarily through the space of somewhere injective curves, since its critical points could then disappear, changing the topology of N_v. Similarly, prescribing the branching data of \varphi ensures that every nearby branched cover with the same branching data has the same generalized automorphism group.

A stratification theorem

We can now state the theorem that makes everything else in this story work.

For the moduli space {\mathcal M}^d_G described above, let \boldsymbol{\theta}_i : G \to \text{Aut}_{\mathbb R}(W_i) for i=1,\ldots,N denote the distinct real irreducible representations of G, and denote

{\mathbb K}_i := \text{End}_G(W_i), \qquad t_i := \dim {\mathbb K}_i.

By standard results in representation theory, the endomorphism algebra {\mathbb K}_i is always isomorphic to either {\mathbb R}, {\mathbb C} or the quaternions {\mathbb H}, hence t_i \in \{1,2,4\}. This alternative depends on whether the complexification of \boldsymbol{\theta}_i is also irreducible (as a complex representation) or is the direct sum of a complex irreducible representation \boldsymbol{\lambda}_i with its dual (which is either isomorphic to \boldsymbol{\lambda}_i or not).[4] It is useful to note that the endomorphisms endow W_i and consequently the domain and target spaces of \mathbf{D}_u^{\boldsymbol{\theta}_i} with {\mathbb K}_i-module structures such that \mathbf{D}_u^{\boldsymbol{\theta}_i} is {\mathbb K}_i-linear.
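As a quick sanity check on the three possibilities, one can use the standard fact that for a real representation W with character \chi, the real dimension of \text{End}_G(W) equals the average of \chi(g)^2 over the group. The sketch below applies this to three examples that I picked for illustration (they are not taken from the discussion above): the sign representation of {\mathbb Z}_2, the rotation representation of {\mathbb Z}_3 on {\mathbb R}^2, and the quaternion group Q_8 acting on {\mathbb H} = {\mathbb R}^4 by left multiplication. The function name endomorphism_dim is hypothetical.

```python
from fractions import Fraction

def endomorphism_dim(character_values):
    """Real dimension of End_G(W) for a real representation W.

    character_values must list chi(g) for every group element g
    (one entry per element, not per conjugacy class).
    """
    order = len(character_values)
    return Fraction(sum(x * x for x in character_values), order)

# Sign representation of Z_2 on R: chi = (1, -1).
t_sign = endomorphism_dim([1, -1])

# Z_3 acting on R^2 by rotations through multiples of 120 degrees:
# chi(j) = 2*cos(2*pi*j/3), i.e. (2, -1, -1).
t_rot = endomorphism_dim([2, -1, -1])

# Q_8 acting on H = R^4 by left multiplication: the trace is 4 at 1,
# -4 at -1, and 0 at the six elements +-i, +-j, +-k.
t_quat = endomorphism_dim([4, -4, 0, 0, 0, 0, 0, 0])
```

The three values come out as 1, 2 and 4, matching the cases {\mathbb R}, {\mathbb C} and {\mathbb H} respectively.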

Now for any tuples of nonnegative integers \mathbf{k} = (k_1,\ldots,k_N) and \mathbf{c} = (c_1,\ldots,c_N), we consider the subset

{\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) = \{ u \in {\mathcal M}^d_G(J)\ |\ \dim_{{\mathbb K}_i} \ker \mathbf{D}_u^{\boldsymbol{\theta}_i} = k_i \text{ and } \dim_{{\mathbb K}_i} \text{coker}\, \mathbf{D}_u^{\boldsymbol{\theta}_i} = c_i \text{ for all } i \}.

Note that since each of the operators \mathbf{D}_u^{\boldsymbol{\theta}_i} is Fredholm, this subset will be automatically empty unless \mathbf{k} and \mathbf{c} are chosen so that k_i - c_i is the index (with respect to {\mathbb K}_i) of \mathbf{D}_u^{\boldsymbol{\theta}_i} for every i=1,\ldots,N.

Theorem D. For generic J, {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) is a smooth submanifold of {\mathcal M}^d_G(J) with

\text{codim }{\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) = \sum_{i=1}^N t_i k_i c_i.
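To get a feeling for the codimension formula, here is a small sketch that simply evaluates it. The function names and the sample values of t_i, k_i, c_i are hypothetical choices for illustration. The second function lists, for a single factor with a given index over {\mathbb K}_i, the pairs (k_i, c_i) compatible with k_i - c_i = index and the contribution each one makes to the codimension.

```python
def stratum_codimension(t, k, c):
    """Codimension sum_i t_i * k_i * c_i from Theorem D."""
    if not (len(t) == len(k) == len(c)):
        raise ValueError("t, k and c must have the same length")
    return sum(ti * ki * ci for ti, ki, ci in zip(t, k, c))

def strata_for_index(t_i, index, max_coker=3):
    """List (k_i, c_i, t_i*k_i*c_i) for one factor with k_i - c_i = index."""
    rows = []
    for c_i in range(max_coker + 1):
        k_i = c_i + index
        if k_i >= 0:  # kernel dimensions cannot be negative
            rows.append((k_i, c_i, t_i * k_i * c_i))
    return rows

# Hypothetical sample data with one real, one complex and one
# quaternionic factor.
codim_example = stratum_codimension(t=(1, 2, 4), k=(0, 1, 1), c=(0, 1, 2))

# A real factor of index -2: the kernel can vanish (codimension 0), and
# each extra kernel dimension costs additional codimension.
rows_index_minus_2 = strata_for_index(t_i=1, index=-2)
```

Notice that a factor contributes zero codimension exactly when its kernel or its cokernel vanishes, that is, when the operator is injective or surjective.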

The theorem can also be stated in a more general form with J replaced by any smooth finite-dimensional family of almost complex structures, so that the definitions of {\mathcal M}^d_G(J) and {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) become slightly more general, but the formula for the codimension remains the same. This means that in addition to proving transversality results, Theorem D can be used as the starting point of a general bifurcation theory for multiply covered curves under generic deformations of the almost complex structure.

I’ll wait until a followup post before making any attempt to explain why Theorem D is true, but I’d now like to discuss a few of its implications. The first is Theorem C above: the splitting of \mathbf{D}_u^N already tells us that there is no hope for a multiply covered curve u = v \circ \varphi to be Fredholm regular unless

\text{ind}\,\mathbf{D}_u^{\boldsymbol{\theta}_i} \ge 0

for every i=1,\ldots,N such that k_i > 0, as \mathbf{D}_u^N cannot be surjective unless every operator in the splitting is surjective. In principle, the indices of these twisted operators can be computed, and they give us a topological obstruction to the regularity of u. But Theorem D now supplements this with the following insight: if the topological obstruction vanishes, then almost every element of {\mathcal M}^d_G(J) will definitely be regular, as the non-regular curves all live in substrata {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) that have strictly positive codimension!

Why unbranched double covers are regular

With Theorem D in hand, various generic transversality results can now be proved via dimension-counting arguments. To illustrate this, let’s prove a special case of Theorem B: we claim that for generic J, all unbranched double covers u = v \circ \varphi of immersed somewhere injective curves v are Fredholm regular. Let \text{ind}(v) and \text{ind}(u) denote the virtual dimensions of the moduli spaces containing v and u respectively, and note that since both are immersed, these dimensions match the Fredholm indices of the respective normal Cauchy-Riemann operators. Combining the Riemann-Hurwitz formula for branched covers with the Riemann-Roch formula for Fredholm indices, we find

\text{ind}(u) = 2\, \text{ind}(v).

Note that \text{ind}(v) \ge 0 without loss of generality since J is generic. Then \text{ind}(v) is also the dimension of the space {\mathcal M}^2_{{\mathbb Z}_2}(J) in which any immersed double cover of v lives, as there are no critical points to lower the dimension of the space of simple curves, and the space of unbranched covers does not have any moduli of its own. The splitting of \mathbf{D}_u^N takes the simple form \mathbf{D}_u^+ \oplus \mathbf{D}_u^-, with \mathbf{D}_u^+ \cong \mathbf{D}_v^N, and the relevant representations in this picture are the unique trivial and nontrivial irreducible representations of {\mathbb Z}_2, both of which remain irreducible after complexifying them, so the dimensions of the corresponding {\mathbb Z}_2-equivariant endomorphism algebras are

t_+ = t_- = 1.

The simplicity of the splitting \mathbf{D}_u^N = \mathbf{D}_u^+ \oplus \mathbf{D}_u^- also provides enough information to deduce the index of \mathbf{D}_u^-:

\text{ind}\,\mathbf{D}_u^- = \text{ind}\,\mathbf{D}_u^N - \text{ind}\,\mathbf{D}_u^+ = \text{ind}(u) - \text{ind}(v) = \text{ind}(v).

We can assume \mathbf{D}_u^+ is surjective since v is Fredholm regular, so if u is not regular, it can only be because \dim \text{coker}\, \mathbf{D}_u^- > 0, meaning that u belongs to a space of the form {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) with \mathbf{k} = (k_+,k_-) and \mathbf{c} = (c_+,c_-), where c_+ = 0, c_- > 0, k_+ = c_+ + \text{ind}\,\mathbf{D}_u^+ = \text{ind}(v), and k_- = c_- + \text{ind}\,\mathbf{D}_u^- = c_- + \text{ind}(v). Putting all of this information together, u now lives in a smooth manifold {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}) with dimension

\text{ind}(v) - t_+ k_+ c_+ - t_- k_- c_- = \text{ind}(v) - c_- (c_- + \text{ind}(v)) = - c_-^2 - (c_- - 1) \text{ind}(v),

and this is strictly negative since c_- \ge 1 by assumption and \text{ind}(v) \ge 0. This is a contradiction.
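If you don't trust the algebra, the dimension count is easy to check by brute force. The following sketch just evaluates the formula above over a range of values of \text{ind}(v) and c_-; the function name and the ranges tested are arbitrary choices of mine, and of course a finite check is no substitute for the one-line argument.

```python
def nonregular_stratum_dim(ind_v, c_minus, t_minus=1):
    """Dimension of the stratum containing a non-regular unbranched double cover.

    Uses dim M = ind(v), c_+ = 0, and k_- = c_- + ind D_u^- with
    ind D_u^- = ind(v), as in the text.
    """
    k_minus = c_minus + ind_v
    return ind_v - t_minus * k_minus * c_minus

dims = {
    (ind_v, c_minus): nonregular_stratum_dim(ind_v, c_minus)
    for ind_v in range(0, 6)
    for c_minus in range(1, 6)
}

# Every entry agrees with the closed form -c^2 - (c - 1) * ind(v).
closed_form_ok = all(
    d == -c * c - (c - 1) * ind_v for (ind_v, c), d in dims.items()
)
all_negative = all(d < 0 for d in dims.values())
```

The least negative case is c_- = 1, where the dimension is -1 regardless of \text{ind}(v).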

In this example we were lucky because the index of \mathbf{D}_u^- could be deduced without any further computation. In more complicated situations, it is often necessary to compute the indices of the twisted operators \mathbf{D}_u^{\boldsymbol{\theta}_i} directly, but this can be done. The values of these indices are typically what determines whether transversality is achievable or not in any given situation.


One famous setting where transversality is clearly impossible is when the symplectic manifold is 6-dimensional with vanishing first Chern class, sometimes called a symplectic Calabi-Yau 3-fold. The usual formula for virtual dimensions then gives

\text{ind}(u) = (n-3) (2-2g) + 2 c_1([u]) = 0

for all holomorphic curves u, whether multiply covered or not. For covers with a nonempty set of branch points, this is clearly not the desired answer, as the freedom to move branch points around produces nontrivial moduli in the space of branched covers, so the actual moduli space is guaranteed not to be 0-dimensional. The best one can hope for in this setting is a kind of “Morse-Bott” or “clean intersection” condition, saying that the linearized deformation operator for a branched cover u has kernel of dimension only as large as it manifestly must be, given the dimension of the moduli space of branched covers. This condition turns out to be equivalent to

\dim \ker \mathbf{D}_u^N = 0,

and we say that a simple curve v is super-rigid if it is immersed and all of its multiple covers have injective normal Cauchy-Riemann operators. If this condition holds for all simple curves, then it prevents any sequence of simple curves from converging to any multiple cover, so that for each genus and homology class, only finitely many simple curves exist. Moreover, the moduli spaces of branched covers over these simple curves have well-defined obstruction bundles, whose Euler classes determine the Gromov-Witten invariants (see e.g. this paper by Zinger).

Theorem D gives rise to a proof that for generic J in any symplectic manifold of dimension 2n \ge 6, all simple curves of index zero are indeed super-rigid. Looking again at our splitting of \mathbf{D}_u^N, you can see that the first step in proving this must be to check that whenever u = v \circ \varphi for an immersed simple curve v of index zero, all the twisted operators satisfy

\text{ind}\,\mathbf{D}_u^{\boldsymbol{\theta}_i} \le 0,

as \mathbf{D}_u^N clearly could not be injective without this. Actually, the necessary dimension-counting argument requires a stricter upper bound for the case when the representation \boldsymbol{\theta}_i is faithful. To illustrate this, let’s focus again on the case of a degree 2 cover and try to prove that \mathbf{D}_u^N will be injective for generic J if \text{ind}(v) = 0 and n \ge 3. Assume \varphi is a 2-fold branched cover with r \ge 0 branch points, which in this case is the same as the number of critical values. Then assuming our simple curve v is immersed (which is always true for simple index 0 curves if J is generic), the cover u = v \circ \varphi lives in a space of real dimension

\dim {\mathcal M}^2_{{\mathbb Z}_2}(J) = 2r.

Since J is generic, we can assume v is Fredholm regular, hence the index zero operator \mathbf{D}_v^N \cong \mathbf{D}_u^+ is an isomorphism. If \mathbf{D}_u^N is not injective, it therefore means that \mathbf{D}_u^- is not injective, so u belongs to a stratum {\mathcal M}^2_{{\mathbb Z}_2}(J;\mathbf{k},\mathbf{c}) \subset {\mathcal M}^2_{{\mathbb Z}_2}(J) with \mathbf{k} = (k_+,k_-) and \mathbf{c} = (c_+,c_-) where k_+ = c_+ = 0, k_- > 0 and c_- = k_- - \text{ind}\,\mathbf{D}_u^-. To make full use of this, we need more precise information about \text{ind}\,\mathbf{D}_u^-. This is the factor in the splitting that corresponds to a faithful representation of {\mathbb Z}_2; the other factor \mathbf{D}_u^+ corresponds to the trivial (and therefore unfaithful) representation of {\mathbb Z}_2, but we do not need to care about it since \mathbf{D}_u^+ is identified with \mathbf{D}_v^N, so that the standard analysis for simple curves has already told us everything we wanted to know. To compute \text{ind}\,\mathbf{D}_u^-, note that by the Riemann-Hurwitz formula, the count r \ge 0 of branch points of the cover \varphi : \widetilde{\Sigma} \to \Sigma satisfies

r = -\chi(\widetilde{\Sigma}) + d \chi(\Sigma),

with d = 2 in the present situation. Combining this with the Riemann-Roch formula then gives the relation

\text{ind}\,\mathbf{D}_u^N = d \cdot \text{ind}\,\mathbf{D}_v^N - (n-1)r,

thus \text{ind}\,\mathbf{D}_u^N = 2\, \text{ind}(v) - (n-1) r = -(n-1)r. This implies

\text{ind}\,\mathbf{D}_u^- = \text{ind}\,\mathbf{D}_u^N - \text{ind}\,\mathbf{D}_u^+ = -(n-1)r,

and plugging this in to compute c_-, Theorem D now implies that u lives in a substratum {\mathcal M}^2_{{\mathbb Z}_2}(J;\mathbf{k},\mathbf{c}) with dimension

2r - t_+ k_+ c_+ - t_- k_- c_- = 2r - k_- [k_- + (n-1)r] \le -k_-^2 - [k_- (n-1) - 2]r,

and this is strictly negative since we assumed n \ge 3. This is of course a contradiction, and thus proves that the super-rigidity condition is satisfied for branched covers of degree two.[5]
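Here is the same computation as a brute-force sketch, in case you want to play with the numbers. It evaluates the index of \mathbf{D}_u^- and the stratum dimension from the formulas above for double covers of an index zero curve; the function names and tested ranges are my own arbitrary choices. It also shows concretely how the count stops producing a contradiction when n = 2.

```python
def twisted_index(n, r):
    """ind D_u^- for a double cover with r branch points of an index 0 curve.

    ind D_u^N = 2*ind(v) - (n-1)*r and ind D_u^+ = ind(v) = 0.
    """
    return -(n - 1) * r

def noninjective_stratum_dim(n, r, k_minus):
    """Dimension 2r - t_- * k_- * c_- of the stratum, with t_- = 1, k_+ = c_+ = 0."""
    c_minus = k_minus - twisted_index(n, r)
    return 2 * r - k_minus * c_minus

# For n >= 3 the dimension is negative in every case tested.
negative_for_n_ge_3 = all(
    noninjective_stratum_dim(n, r, k) < 0
    for n in range(3, 7)
    for r in range(0, 11)
    for k in range(1, 6)
)

# For n = 2 (dimension four) the count no longer gives a contradiction.
dim_four_example = noninjective_stratum_dim(n=2, r=3, k_minus=1)
```

With n = 3 and k_- = 1, the dimension is exactly -1 for every r, so the argument only just works in dimension six.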

I should be careful about attributions here: the partial super-rigidity result I just sketched is not due to me, but was proved first in a paper by Eftekhary, using a quite similar approach based on ideas of Taubes. Theorem D, however, is strong enough to prove super-rigidity for all branched covers of index zero curves. As in the transversality problems discussed above, the main additional piece of input one needs for this is a computation of the indices of the twisted operators \mathbf{D}_u^{\boldsymbol{\theta}_i}, which comes more or less for free when d=2 but is more involved in the general case.

This post is more than long enough already, so I’ll conclude it with an I-owe-you. In order to convince skeptics that this whole story isn’t just wishful thinking, there are two important things I need to explain:

  1. How to define the splitting of \mathbf{D}_u^N for general multiple covers u = v \circ \varphi, in particular in cases where \varphi is a non-normal branched cover;
  2. Why Theorem D is true.

I intend to devote a separate post to each of these topics sometime in the near future. Not sure when, but soon.

Update: Both of the followup posts I promised now exist. See “Regular presentations and twisted Cauchy-Riemann operators” and “The transversality machine”.

Update 2 (19.12.2017): I have edited this post (and will shortly also be updating the paper on the arXiv) to correct a minor error in representation theory. The correction necessitated a change in the definition of the space {\mathcal M}^d_G(J;\mathbf{k},\mathbf{c}), so that some dimensions that used to be real are now dimensions over the endomorphism algebra {\mathbb K}_i = \text{End}_G(W_i). This change (fortunately) has no adverse impact on Theorems A, B and C. Many thanks to Thomas Walpuski and Aleksander Doan for catching the error.


[1] I use the term Fredholm regular to refer to the condition that is usually meant when we say that a holomorphic curve is “transversely cut out,” i.e. it implies via the implicit function theorem that a neighborhood of the curve in its moduli space is a smooth manifold (or orbifold if it has nontrivial automorphism group). In the present context, the word “regular” also arises with a different meaning, referring to a branched cover whose degree matches the order of its automorphism group. We must be careful to distinguish between these two meanings.

[2] Taubes’s paper Counting pseudo-holomorphic submanifolds in dimension 4, which explains all of the technical details needed for defining the Gromov invariant, has been something of an anomaly in the symplectic literature for 20 years. I can think of hardly any other papers that address the multiple cover problem by actually proving that transversality generically holds (as indeed it does for the doubly covered tori that the Gromov invariant counts; Theorem B is a generalization of this statement). My own work on this subject began in earnest when I started asking the question, “how did Taubes prove that those tori are regular?”, and it turned out that none of the people I would have expected to know the answer actually did. I suspect most people believed it to be a phenomenon that only happens in dimension four, but Theorem B tells you that that is false. A version of Taubes’s splitting of Cauchy-Riemann operators also appears in Eftekhary’s paper that proved some cases of super-rigidity.

[3] Recall that a covering map p : Y \to X is called normal (or also regular) if for every x \in X and every pair y_1,y_2 \in p^{-1}(x), there exists a deck transformation sending y_1 to y_2. This definition extends in an obvious way to a branched cover of Riemann surfaces \varphi : \widetilde{\Sigma} \to \Sigma, so \varphi is normal if and only if the order of its automorphism group \text{Aut}(\varphi) is the same as its degree.  In general, |\text{Aut}(\varphi)| \le \text{deg}(\varphi). (I personally prefer the term “regular” in place of “normal”, but I’m avoiding using it in this post so as to prevent confusion with the term “Fredholm regular”.)

[4] Readers unfamiliar with the real representation theory of finite groups will find these details explained in the classic textbook by Serre.

[5] My dimension-counting argument for super-rigidity conspicuously fails in dimension four, and I do not know whether the result is true in that case. I haven’t thought about it very much, since I don’t immediately know what it would be useful for, but what I do know is the following: one can use uniquely low-dimensional methods to prove that super-rigidity holds in dimension four for index zero curves of genus zero or one. The genus zero case is more or less an example of “automatic transversality,” and can be deduced from the classic paper by Hofer-Lizan-Sikorav on that subject. For genus one curves, the proof uses a method known as the Hutchings magic trick, thanks to the blog post in which Hutchings first described it. (His description was phrased in terms of the ECH index inequality, but for closed curves in symplectic 4-manifolds, it basically reduces to the standard adjunction formula.)


New SFT lecture notes / book on the arXiv

Dear readers (assuming I still have any), I would like to draw your attention to something that I uploaded to the arXiv this week:

Lectures on Symplectic Field Theory

This is the expanded version of some lecture notes I wrote for a 2-term graduate course on SFT that I taught last year in London, and it is due to be published as a book in the next year. I would be grateful for any useful comments or corrections that readers may choose to send my way before it goes to press!

The book is mainly intended to cover what I regard as the “basics” of SFT at a level suitable for PhD students and researchers from other fields. But I also used the opportunity to add a few things to the literature that haven’t appeared in writing before, at least not quite in this form, such as:

  • Lecture 3 includes a self-contained explanation and proof of the fact that between any two (trivialized) asymptotic operators associated to nondegenerate Reeb orbits there is a well-defined spectral flow, and related facts involving the Conley-Zehnder index and winding numbers of eigenfunctions. By “self-contained,” I mainly mean that you don’t need to read Kato or understand a lot about the spectral theory of unbounded self-adjoint operators before reading this; you just need to know the basic facts about Fredholm operators.
  • Lecture 5 implements a novel approach that was suggested 20 years ago by Taubes for proving the Riemann-Roch formula and its generalization to surfaces with cylindrical ends. The standard reference for the latter has traditionally been Schwarz’s thesis, which does it by using a linear gluing argument to break up arbitrary surfaces into simpler pieces on which the index can be computed explicitly. (This is analogous to the proof that McDuff and Salamon give for compact surfaces with boundary.) Taubes’s alternative approach is unusual in the symplectic context but familiar from gauge theory: the idea is to use a simple Weitzenböck-type formula for Cauchy-Riemann operators to show that if you deform the operator by a sufficiently large zeroth-order term that is complex antilinear, it forces sections in the kernel and cokernel to concentrate around the zero-set of the perturbation. The index calculation then reduces to a signed count of zeroes of the perturbation, in other words, a relative first Chern number of the appropriate vector bundle. This idea was sketched in a 2-page section that Taubes labeled a “non-sequitur” at the end of his paper defining the Gromov invariant of symplectic 4-manifolds; Lecture 5 works out the details.
  • Lecture 8 contains what is meant to be a definitive proof of the standard theorem about transversality for somewhere injective holomorphic curves u : \dot{\Sigma} \to {\mathbb R} \times M with generic {\mathbb R}-invariant almost complex structures on symplectizations (originally due to Dragnev), together with the requisite lemmata on injective points of the projection \dot{\Sigma} \to M and the nonvanishing of \pi_\xi \circ du \in \text{Hom}(T\dot{\Sigma},u^*\xi). As I’ve discussed before on this blog, this fundamental result has been badly understood for a long time. The proof in Lecture 8 is essentially the one I sketched in my earlier blog post Some good news about the forgetful map in SFT, which is a generalization of an argument by Bourgeois.
  • Lecture 10 illustrates one of the simplest nontrivial applications of SFT by constructing a rigorous version of cylindrical contact homology in certain specialized settings in order to distinguish the various tight contact structures on the 3-torus. It’s fair to say that the outlines of this argument are standard, e.g. you can read a succinct account of it in Bourgeois’s lecture notes on contact homology from 2003; however, the latter uses the Morse-Bott methods from Bourgeois’s thesis, and since I didn’t want to devote a whole lecture to Morse-Bott methods in the class, I ended up doing something different.(*) The standard approach to proving uniqueness of the relevant holomorphic cylinders for nondegenerate asymptotic data would be by viewing that data as a small perturbation of Morse-Bott data, for which the required uniqueness result is more or less obvious. Instead of that, I view the nondegenerate contact data as a small perturbation of a nondegenerate but integrable stable Hamiltonian structure, for which the cylinders in question can be regarded as solutions to the Floer equation and standard results from Hamiltonian Floer homology can be applied. This argument isn’t completely original either—I have some vague memory of reading something similar in Eliashberg-Kim-Polterovich several years ago—but it’s a nice example of the practical applicability of stable Hamiltonian structures and might be quite useful in other contexts, so I wanted to give it some attention.
  • Some readers might be grateful for Appendix B, which proves the basic properties of Floer’s C_\varepsilon space, e.g. that it is a separable Banach space which contains enough bump functions to prove nice transversality results via the Sard-Smale theorem. Funny story: the reason I ended up writing this appendix is that in the original version of these notes, I made at least one catastrophic technical error in my presentation of spectral flow in Lecture 3. I doubt whether anyone who read that version of the notes noticed, as the error was hidden behind a lemma that I stated without proof because the proof would have required the Sard-Smale theorem (which we only covered four lectures later). Long story short, last summer I finally sat down to write up a proof of that lemma and found out that it was wrong, it couldn’t possibly work the way I’d envisioned it, because I was trying to apply the Sard-Smale theorem with a Banach space of perturbations that was not separable.(**) I wouldn’t have noticed if I hadn’t started asking myself fundamental questions like “why is the C_\varepsilon space separable anyway?”, but I did, and in the effort to answer them, I both revised Lecture 3 and wrote Appendix B.

Unlike most of the existing references, I also made an effort in these notes to include general stable Hamiltonian structures in the setting of all technical results whenever possible. This is not always possible—the result on transversality in symplectizations for instance requires some extra condition on either the hyperplane distribution \xi \subset TM or the curve u : \dot{\Sigma} \to {\mathbb R} \times M in order to exclude certain pathological counterexamples. It is also not strictly necessary if all you want to do is set up SFT as a framework for invariants of contact manifolds; assuming of course that the usual trouble with transversality for multiple covers can be solved somehow, it should suffice for that purpose to work with “contact-type” stable Hamiltonian structures. But of course we don’t just want to define the theory, we’d also like to be able to compute it, and e.g. for the computations on {\mathbb T}^3 in Lecture 10 and in many other situations that have arisen in my own research, it proves extremely useful to be able to relate the usual contact data to something more general and exploit the technical apparatus in that more general setting. In the effort to present things this way, I found out that a lot of basic lemmas that many of us have been taking for granted for years were actually never proved in their properly general setting, and the proofs sometimes require slightly new ideas. This ended up making the first half of Lecture 9 in particular (on asymptotic results) a lot longer than I expected it to be.

Speaking of Lecture 9, the second half of it discusses the SFT compactness theorem, and I tried to illustrate the main ideas behind the proof but made no attempt to make this discussion complete or self-contained. I did not want to end up writing a whole book about the SFT compactness theorem, and anyway, such a book already exists.

Another thing that is not included in the uploaded draft is… well, the entirety of Lectures 14, 15 and 16. There are two practical reasons for this: (1) it’s pretty easy to convince a publisher to let you keep the manuscript of your book freely available online in perpetuity if you tell them you won’t include the last three chapters. More immediately, (2) I haven’t typed them up yet. But they will appear in the published version, so I’ll post news about that as soon as there is news to post (sometime in the next year).

(*) The wisdom of not discussing Morse-Bott methods is of course highly debatable, as Morse-Bott methods are indisputably useful. The truth is: I had one lecture left before the Christmas vacation and I wanted to use it for proving a big theorem, not just more technical lemmas. That’s why the 3-torus discussion is in Lecture 10 and not Lecture 11.

(**) We all learned at some point that the Banach space \ell^\infty of bounded sequences is not separable. I was never impressed by this since I’ve never seen \ell^\infty arise naturally in any problem I cared about. But here’s another Banach space that isn’t separable: the space {\mathcal L}(H) of bounded linear maps on a Hilbert space H. It doesn’t matter if H itself is separable, because {\mathcal L}(H) contains an isometric embedding of \ell^\infty. This means you have to be extremely careful in any discussion of “generic families of operators H \to H,” as the Sard-Smale theorem doesn’t apply, so e.g. you can’t assume that a smooth Fredholm map defined on {\mathcal L}(H) or (I mention this next example for no reason at all) C_\varepsilon([0,1],{\mathcal L}(H)) has an abundance of regular values. As my former PhD advisor would say: shit happens.



Some personal news (and two postdoc positions)

This blog has been sadly neglected in recent months, and there are several posts that I’ve been meaning to write when I find more free time. But rather than writing any of those right now, I would like to make two announcements:

  1. I will be leaving my current job at UCL in Spring/Summer of this year to start a new one as Professor for Differential Geometry and Global Analysis at Humboldt University in Berlin. Relatedly:
  2. I have two postdoc positions to offer.

Here are some details on the second point. The first thing I should make clear is that if you cannot either speak German or imagine yourself learning enough of it to teach problem classes in German after a year, then you probably shouldn’t get your hopes up. I’ve had a couple of inquiries already from people who don’t know any German — what I say in these cases is that if you’re a sufficiently good fit and are enthusiastic about coming to Berlin and learning enough German to teach in your second year, then we may be able to find enough English-language teaching duties to get you through the first year, though I can’t make any promises. If you’re not too discouraged yet, then read on.

The two jobs are Wissenschaftlicher Mitarbeiter positions attached to my new research group in symplectic topology at the HU, with fixed-term contracts for 5 or 6 years. Both have a start date of August 1, or as soon as possible thereafter, and the application deadline for both is March 29. Both also have light teaching duties, which can vary along a spectrum between teaching one 90-minute problem class (Übung) per week and teaching lecture courses on geometry or topology for undergraduates, or more specialized courses for Master’s students.

There is a slight difference between the two positions:

  • Position 1 (reference number AN/026/16) is a 2/3-position for up to 6 years, which is the standard type of postdoc position at the HU. (Being 2/3 means it is paid a bit less than a “full” position, but this also makes the teaching duties lighter, and unlike the place I’m moving from, living costs in Berlin are still affordable.)
  • Position 2 (reference number AN/025/16) is a “full” position for up to 5 years, so the pay is higher than a 2/3 position, and the teaching load is also slightly higher (though a lot less than a regular faculty position). In theory, this one is intended for someone at a slightly more advanced level in their career, though we can be flexible on this detail.

I would encourage anyone who thinks they might be suitable for either of these positions to submit identical applications for both. Below are English versions of the official job adverts, including links to the (legally binding) German versions.

Update 29/2/2016: Since I’ve been asked about this, I should clarify that sending the application materials in English is perfectly fine, even though the official job adverts are in German.


Position 1 (2/3 part time, fixed-term for up to 6 years, reference number AN/026/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/026/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:


Position 2 (full time, fixed-term for up to 5 years, reference number AN/025/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology, preferably with an established track record of research in these fields; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/025/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:


Signs (or how to annoy a symplectic topologist)

In this post, I will finally address that most pressing question of our times:

Wrong Way - Do Not Enter

For ****’s sake, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

My original motivation to write this post was actually a slightly different sign question, which I hadn’t realized until recently is quite closely related to this one. If you’ve seen me at any conferences recently, you may already be able to guess what I’m referring to, because I’ve developed a habit of interrupting other people’s talks to make the following seemingly frivolous observation:

The symplectization of M is {\mathbb R} \times M, not M \times {\mathbb R}.

I don’t mean to sound dogmatic, and I don’t bring up this point just to annoy people — I bring it up because most people don’t realize there is an issue here that goes beyond a matter of arbitrary convention, and as anyone who’s ever tried to understand Floer-type theories with something other than {\mathbb Z}_2-coefficients will tell you, being careless with orientations can easily lead to trouble. This kind of trouble:

cartoon about sign errors

© Bob Krohmer

Still with me? Good.

So is it R times M or M times R?

The issue with the symplectization of a contact manifold is very simple. In the literature it appears most often in the following form: assume (M,\xi) is a contact manifold with contact form \alpha. The symplectization of (M,\xi) can then be described as a manifold diffeomorphic to {\mathbb R} \times M with an exact symplectic form \omega = d(e^t\alpha); or sometimes one presents the symplectization instead as (0,\infty) \times M with \omega = d(t\alpha), which is fine since these two constructions are obviously symplectomorphic. What is not fine, but is nonetheless often done in papers by quite prominent authors (including the paper that gave this blog its name), is to write \omega in one of the above forms but write the manifold itself as M \times {\mathbb R} or M \times (0,\infty). Why isn’t this fine? Well, I assume we can all agree on the following:

  1. If X and Y are two oriented manifolds, then X \times Y inherits a canonical orientation, with a positively oriented local coordinate system given by (x_1,\ldots,x_m,y_1,\ldots,y_n) for any choice of positively oriented local coordinates (x_1,\ldots,x_m) on X and (y_1,\ldots,y_n) on Y.
  2. Any symplectic manifold (W,\omega) carries a canonical orientation, namely the one defined by the volume form \omega \wedge \ldots \wedge \omega.
  3. Any contact manifold (M,\xi) with a co-oriented contact structure also carries a canonical orientation, namely the one defined by \alpha \wedge d\alpha \wedge \ldots \wedge d\alpha for any choice of contact form \alpha compatible with the co-orientation of \xi.

The second point is the reason why, for example, it’s important in symplectic topology to make the distinction between {\mathbb C} P^2 (carrying its natural orientation as a complex manifold) and \overline{{\mathbb C} P}^2, the same manifold with reversed orientation. The first admits a symplectic structure but the second does not, since we know from de Rham cohomology that there is no closed 2-form \omega on \overline{{\mathbb C} P}^2 satisfying \omega \wedge \omega > 0.
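
The de Rham argument can be sketched in a few lines. The closed 2-form h below is notation introduced only for this computation; it stands for any representative of a generator of the second de Rham cohomology of {\mathbb C} P^2, normalized as stated.

```latex
% Notation (assumed here): h is a closed 2-form representing a generator of
% H^2_{dR}(\mathbb{CP}^2) \cong \mathbb{R}, normalized so that
\int_{\mathbb{CP}^2} h \wedge h = 1 ,
\qquad\text{hence}\qquad
\int_{\overline{\mathbb{CP}}^2} h \wedge h = -1 .
% Any closed 2-form \omega is cohomologous to c\,h for some c \in \mathbb{R}, so by Stokes' theorem
\int_{\overline{\mathbb{CP}}^2} \omega \wedge \omega
  = c^2 \int_{\overline{\mathbb{CP}}^2} h \wedge h
  = -c^2 \le 0 ,
% whereas \omega \wedge \omega > 0 would force this integral to be strictly positive.
```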

Similarly, if (M,\xi) is a co-oriented contact manifold, then M inherits a canonical orientation and therefore so does {\mathbb R} \times M — and it is easy to check that the latter matches the canonical orientation determined by the symplectic form d(e^t\alpha).  But since {\mathbb R} and M are both odd-dimensional, M \times {\mathbb R} has the opposite orientation. The wrong one. There is no arbitrary convention involved.
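
The "easy to check" part is a short computation, sketched here under the assumption that \dim M = 2n-1 and that t denotes the coordinate on {\mathbb R}.

```latex
% Assumption: \dim M = 2n-1, and t is the coordinate on \mathbb{R}.
\omega = d(e^t\alpha) = e^t \left( dt \wedge \alpha + d\alpha \right) .
% The 2-forms dt \wedge \alpha and d\alpha commute, (dt \wedge \alpha)^2 = 0, and
% (d\alpha)^n = 0 because it is a 2n-form pulled back from the (2n-1)-dimensional M. Hence
\omega^n = n \, e^{nt} \, dt \wedge \alpha \wedge (d\alpha)^{n-1} ,
% which is positive for the product orientation of \mathbb{R} \times M. On the other hand,
\alpha \wedge (d\alpha)^{n-1} \wedge dt
  = (-1)^{2n-1} \, dt \wedge \alpha \wedge (d\alpha)^{n-1}
  = - \, dt \wedge \alpha \wedge (d\alpha)^{n-1} ,
% so the product orientation of M \times \mathbb{R} is the opposite one.
```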

How well do you know the cotangent bundle, really?

If I were really so dogmatic, I would tell you that that’s the end of the story, but of course it isn’t. I did make one choice in the above discussion: I chose to write the symplectic form on the symplectization as d(e^t\alpha) or d(t\alpha). These two conventions are indeed equivalent, and up to the choice of writing the {\mathbb R}-coordinate as t and the contact form as \alpha, every paper I’m aware of in the symplectic/contact literature follows one of these two conventions. (Please feel free to point out exceptions in the comments, if you know any!) However, these are not the only two conventions that might conceivably make sense: one could reasonably choose to write the symplectic form differently, and depending on this choice, one might be forced to write M \times {\mathbb R} instead of {\mathbb R} \times M. Let me explain.

We need to recall quickly why the symplectization of (M,\xi) — despite appearances to the contrary in the formula \omega = d(e^t\alpha) — doesn’t actually depend on the choice of contact form. The canonical definition of the symplectization is as a particular symplectic submanifold of T^*M, namely

S_\xi M \subset T^*M

is the submanifold consisting of all \lambda \in T^*M such that \lambda|_\xi = 0 and \lambda > 0 in the direction positively transverse to \xi. In other words, S_\xi M is a fiber bundle over M whose sections are precisely the contact forms for \xi. A choice of contact form \alpha trivializes this fiber bundle and defines a diffeomorphism

{\mathbb R} \times M \to S_\xi M : (t,q) \mapsto e^t \alpha_q,

such that the canonical 1-form on T^*M pulls back to {\mathbb R} \times M as e^t \alpha. The contact condition for \xi thus turns out to be equivalent to the condition that S_\xi M is a symplectic submanifold of T^*M, and according to a standard convention, the above diffeomorphism identifies S_\xi M with ({\mathbb R} \times M,d(e^t\alpha)).
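
The pullback claim follows directly from the definition of the canonical 1-form. In the sketch below, the names \pi (the bundle projection) and \Phi (the diffeomorphism displayed above) are introduced only for this computation.

```latex
% Notation (assumed here): \pi : T^*M \to M is the bundle projection, and
% \Phi : \mathbb{R} \times M \to S_\xi M, \ \Phi(t,q) = e^t \alpha_q, is the diffeomorphism above.
% The canonical 1-form is defined by
(\lambda_{\text{can}})_\lambda (v) = \lambda \left( d\pi(v) \right)
  \quad\text{for } \lambda \in T^*M, \ v \in T_\lambda (T^*M) .
% Since \pi \circ \Phi (t,q) = q, we have d(\pi \circ \Phi)(\tau \, \partial_t + w) = w
% for \tau \in \mathbb{R} and w \in T_q M, and therefore
(\Phi^* \lambda_{\text{can}})_{(t,q)} (\tau \, \partial_t + w) = e^t \alpha_q (w) ,
\qquad\text{i.e.}\qquad
\Phi^* \lambda_{\text{can}} = e^t \alpha .
```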

Wait, wait… did you say “convention”?

Yes, there was exactly one semi-arbitrary convention involved in what I just said. Did you see it? I’ll give you a moment. Once you’re ready for the answer, scroll past this video of a dog lamenting the horrors of sign errors:

So here’s the thing:

The symplectic form on T^*M is not canonical.

Cotangent bundles do of course have canonical Liouville forms. As we all learned in Symplectic Geometry 101, there is a 1-form \lambda_{\text{can}} defined on T^*M such that if you choose any local coordinates (q_1,\ldots,q_n) on a neighborhood in M and let (p_1,\ldots,p_n) denote the induced coordinates on the fibers over that neighborhood, then

\lambda_{\text{can}} = \sum_{j=1}^n p_j \, d q_j.

Since it’s obvious from the formula that d\lambda_{\text{can}} is symplectic, we often assume that d\lambda_{\text{can}} is the “canonical” symplectic form on T^*M. But why should it be? Why shouldn’t the symplectic form be -d\lambda_{\text{can}}?

If this strikes you as a silly question, keep reading.

(Update 24/08/2015: Patrick Massot makes a very good point below in the comments, that “canonical” is perhaps the wrong word to be using here — \lambda_{\text{can}} can more accurately be called a tautological 1-form, and d\lambda_{\text{can}} can just as accurately be called a “tautological 2-form” on T^*M. This reinforces my opinion that d\lambda_{\text{can}} is the “best” choice for a symplectic form on T^*M, though it is not the only reasonable choice.)

What, you haven’t asked Isaac Newton’s opinion?

One could argue in various ways that d\lambda_{\text{can}} and -d\lambda_{\text{can}} are equally good choices of symplectic forms on T^*M; for instance, the canonical Liouville vector field (pointing outward in the fibers) is Liouville with respect to both of them. In fact, there are situations in which one must take -d\lambda_{\text{can}} instead of d\lambda_{\text{can}}. This leads us back to the question that started this post, the question that has caused countless headaches to graduate students attempting to start their first research projects in Floer homology and related subjects:

I ask you for the last ****ing time, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

The symplectic literature is pretty evenly split in its opinion about the definition of a Hamiltonian vector field, but there’s a basic rule of thumb that I would say must always be (and usually is) observed. Whatever sign conventions you choose, they must lead to a version of Hamilton’s equations that physicists would recognize.

An undergraduate physics student would write Hamilton’s equations as follows:

\displaystyle \dot{q}_j = \frac{\partial H}{\partial p_j}, \qquad \dot{p}_j = - \frac{\partial H}{\partial q_j},

where q_1,\ldots,q_n are the “position” variables (moving in M) and p_1,\ldots,p_n are the “momentum” variables (moving in the fibers of T^*M). In the special case where M = {\mathbb R}^n and we’re talking about motion in a Newtonian potential, that same physics student will define H by

\displaystyle H(q,p) = \sum_{j=1}^n \frac{p_j^2}{2 m_j} + V(q),

where V(q) is the potential energy, and the positive term in front of it (depending on some constant masses m_1,\ldots,m_n > 0) is the kinetic energy. To make sure you’ve gotten the signs right in Hamilton’s equations, all you have to do is plug in this formula and compute \dot{q}_j = p_j / m_j, which is really what \dot{q}_j had better be if you’re going to refer to p_1,\ldots,p_n as “momentum” variables. If you end up defining momentum as minus mass times velocity, then you’ve clearly done something wrong.
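
Spelled out, the sanity check looks like this; nothing is assumed beyond the Hamiltonian and the form of Hamilton's equations given above.

```latex
% Plugging H(q,p) = \sum_j p_j^2 / (2 m_j) + V(q) into Hamilton's equations:
\dot{q}_j = \frac{\partial H}{\partial p_j} = \frac{p_j}{m_j} ,
\qquad
\dot{p}_j = - \frac{\partial H}{\partial q_j} = - \frac{\partial V}{\partial q_j} .
% Combining the two recovers Newton's second law, with force equal to minus the gradient of V:
m_j \, \ddot{q}_j = - \frac{\partial V}{\partial q_j} .
```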

So if you accept what I’ve just said, then it forces upon us the following dichotomy:

Option 1: You can define Hamiltonian vector fields by \omega(X_H,\cdot) = -dH, and then you get the correct local version of Hamilton’s equations if the symplectic structure on T^*M is

\omega = d\lambda_{\text{can}} = \sum_{j=1}^n d p_j \wedge d q_j.

In this case, the symplectization of (M,\xi) can be written as ({\mathbb R} \times M,d(e^t\alpha)), but not as M \times {\mathbb R} since the latter has the wrong orientation.
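
The claim about Hamilton's equations in Option 1 can be verified in local coordinates. The coefficient names a_j and b_j below are introduced only for this computation.

```latex
% Notation (assumed here): X_H = \sum_j \left( a_j \, \partial_{q_j} + b_j \, \partial_{p_j} \right) .
% With \omega = \sum_j dp_j \wedge dq_j :
\omega(X_H, \cdot) = \sum_{j=1}^n \left( b_j \, dq_j - a_j \, dp_j \right) ,
\qquad
- dH = - \sum_{j=1}^n \left( \frac{\partial H}{\partial q_j} \, dq_j + \frac{\partial H}{\partial p_j} \, dp_j \right) .
% Comparing coefficients of dq_j and dp_j :
a_j = \dot{q}_j = \frac{\partial H}{\partial p_j} ,
\qquad
b_j = \dot{p}_j = - \frac{\partial H}{\partial q_j} .
```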

Option 2: If you prefer to write \omega(X_H,\cdot) = dH, then you get the correct local expression for Hamilton’s equations if the symplectic structure on T^*M is

\omega = - d\lambda_{\text{can}} = \sum_{j=1}^n d q_j \wedge d p_j.

I have seen papers that conform to this convention, but most of them either don’t deal at all with contact geometry, or they do so but get some of the orientations wrong. Assuming \dim M = 2n-1, one would have to write the symplectization of (M,\xi) in this case as

({\mathbb R} \times M, - d(e^t\alpha)) if n is even,

(M \times {\mathbb R}, - d(e^t\alpha)) if n is odd.
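
Both claims in Option 2 can be checked by hand. The sketch below uses the coefficient names a_j and b_j for the components of X_H, which are introduced only for this computation, and assumes \dim M = 2n-1 as above.

```latex
% Notation (assumed here): X_H = \sum_j \left( a_j \, \partial_{q_j} + b_j \, \partial_{p_j} \right) .
% With \omega = \sum_j dq_j \wedge dp_j and \omega(X_H, \cdot) = dH :
\sum_{j=1}^n \left( a_j \, dp_j - b_j \, dq_j \right)
  = \sum_{j=1}^n \left( \frac{\partial H}{\partial q_j} \, dq_j + \frac{\partial H}{\partial p_j} \, dp_j \right)
\quad\Longrightarrow\quad
a_j = \frac{\partial H}{\partial p_j} , \quad b_j = - \frac{\partial H}{\partial q_j} .
% Orientation: since d(e^t\alpha) = e^t ( dt \wedge \alpha + d\alpha ) and (d\alpha)^n = 0 ,
\left( - d(e^t\alpha) \right)^n = (-1)^n \, n \, e^{nt} \, dt \wedge \alpha \wedge (d\alpha)^{n-1} ,
% which is positive on \mathbb{R} \times M when n is even, and on M \times \mathbb{R} when n is odd.
```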

For reasons that should by now be obvious, I prefer the first option. I have never seen the second option implemented in a consistent way in any paper; if I did, I would certainly find it a bit perverse, but I could not call it wrong.

(Acknowledgement: Thanks to Yankı Lekili for a conversation that helped me greatly in getting my thoughts on this topic in order. The correct order, not the wrong order.)
