Some personal news (and two postdoc positions)

This blog has been sadly neglected in recent months, and there are several posts that I’ve been meaning to write when I find more free time. But rather than writing any of those right now, I would like to make two announcements:

  1. I will be leaving my current job at UCL in Spring/Summer of this year to start a new one as Professor for Differential Geometry and Global Analysis at Humboldt University in Berlin. Relatedly:
  2. I have two postdoc positions to offer.

Here are some details on the second point. The first thing I should make clear is that if you cannot either speak German or imagine yourself learning enough of it to teach problem classes in German after a year, then you probably shouldn’t get your hopes up. I’ve had a couple of inquiries already from people who don’t know any German — what I say in these cases is that if you’re a sufficiently good fit and are enthusiastic about coming to Berlin and learning enough German to teach in your second year, then we may be able to find enough English-language teaching duties to get you through the first year, though I can’t make any promises. If you’re not too discouraged yet, then read on.

The two jobs are Wissenschaftlicher Mitarbeiter positions attached to my new research group in symplectic topology at the HU, with fixed-term contracts for 5 or 6 years. Both have a start date of August 1, or as soon as possible thereafter, and the application deadline for both is March 29. Both also have light teaching duties, which can vary along a spectrum between teaching one 90-minute problem class (Übung) per week and teaching lecture courses on geometry or topology for undergraduates, or more specialized courses for Master’s students.

There is a slight difference between the two positions:

  • Position 1 (reference number AN/026/16) is a 2/3-position for up to 6 years, which is the standard type of postdoc position at the HU. (Being 2/3 means it is paid a bit less than a “full” position, but this also makes the teaching duties lighter, and unlike the place I’m moving from, living costs in Berlin are still affordable.)
  • Position 2 (reference number AN/025/16) is a “full” position for up to 5 years, so the pay is higher than a 2/3 position, and the teaching load is also slightly higher (though a lot less than a regular faculty position). In theory, this one is intended for someone at a slightly more advanced level in their career, though we can be flexible on this detail.

I would encourage anyone who thinks they might be suitable for either of these positions to submit identical applications for both. Below are English versions of the official job adverts, including links to the (legally binding) German versions.

Update 29/2/2016: Since I’ve been asked about this, I should clarify that sending the application materials in English is perfectly fine, even though the official job adverts are in German.


Position 1 (2/3 part time, fixed-term for up to 6 years, reference number AN/026/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/026/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:


Position 2 (full time, fixed-term for up to 5 years, reference number AN/025/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology, preferably with an established track record of research in these fields; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/025/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:

Posted in Uncategorized | Leave a comment

Signs (or how to annoy a symplectic topologist)

In this post, I will finally address that most pressing question of our times:

Wrong Way - Do Not Enter

For ****’s sake, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

My original motivation to write this post was actually a slightly different sign question, which I hadn’t realized until recently is quite closely related to this one. If you’ve seen me at any conferences recently, you may already be able to guess what I’m referring to, because I’ve developed a habit of interrupting other people’s talks to make the following seemingly frivolous observation:

The symplectization of M is {\mathbb R} \times M, not M \times {\mathbb R}.

I don’t mean to sound dogmatic, and I don’t bring up this point just to annoy people — I bring it up because most people don’t realize there is an issue here that goes beyond a matter of arbitrary convention, and as anyone who’s ever tried to understand Floer-type theories with something other than {\mathbb Z}_2-coefficients will tell you, being careless with orientations can easily lead to trouble. This kind of trouble:

cartoon about sign errors

© Bob Krohmer

Still with me? Good.

So is it R times M or M times R?

The issue with the symplectization of a contact manifold is very simple. In the literature it appears most often in the following form: assume (M,\xi) is a contact manifold with contact form \alpha. The symplectization of (M,\xi) can then be described as a manifold diffeomorphic to {\mathbb R} \times M with an exact symplectic form \omega = d(e^t\alpha); or sometimes one presents the symplectization instead as (0,\infty) \times M with \omega = d(t\alpha), which is fine since these two constructions are obviously symplectomorphic. What is not fine, but is nonetheless often done in papers by quite prominent authors (including the paper that gave this blog its name), is to write \omega in one of the above forms but write the manifold itself as M \times {\mathbb R} or M \times (0,\infty). Why isn’t this fine? Well, I assume we can all agree on the following:

  1. If X and Y are two oriented manifolds, then X \times Y inherits a canonical orientation, with a positively oriented local coordinate system given by (x_1,\ldots,x_m,y_1,\ldots,y_n) for any choice of positively oriented local coordinates (x_1,\ldots,x_m) on X and (y_1,\ldots,y_n) on Y.
  2. Any symplectic manifold (W,\omega) carries a canonical orientation, namely the one defined by the volume form \omega \wedge \ldots \wedge \omega.
  3. Any contact manifold (M,\xi) with a co-oriented contact structure also carries a canonical orientation, namely the one defined by \alpha \wedge d\alpha \wedge \ldots \wedge d\alpha for any choice of contact form \alpha compatible with the co-orientation of \xi.

The second point is the reason why, for example, it’s important in symplectic topology to make the distinction between {\mathbb C} P^2 (carrying its natural orientation as a complex manifold) and \overline{{\mathbb C} P}^2, the same manifold with reversed orientation. The first admits a symplectic structure but the second does not, since we know from de Rham cohomology that there is no closed 2-form \omega on \overline{{\mathbb C} P}^2 satisfying \omega \wedge \omega > 0.
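For the record, here is the de Rham argument spelled out. This is only a sketch, and the generator h and the constant c are notation introduced just for this computation. The cohomology of {\mathbb C} P^2 in degree 2 is spanned by a class h whose square integrates to 1, and reversing the orientation flips the sign of that integral:

```latex
\begin{align*}
H^2_{\mathrm{dR}}(\overline{{\mathbb C} P}^2) &= {\mathbb R}\, h,
  \qquad \int_{{\mathbb C} P^2} h \wedge h = 1
  \quad\Longrightarrow\quad \int_{\overline{{\mathbb C} P}^2} h \wedge h = -1, \\
[\omega] = c\, h \ \text{ for some } c \in {\mathbb R}
  &\quad\Longrightarrow\quad
  \int_{\overline{{\mathbb C} P}^2} \omega \wedge \omega = -c^2 \le 0.
\end{align*}
```

A closed 2-form with \omega \wedge \omega > 0 everywhere would have strictly positive integral, so no such form can exist.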

Similarly, if (M,\xi) is a co-oriented contact manifold, then M inherits a canonical orientation and therefore so does {\mathbb R} \times M — and it is easy to check that the latter matches the canonical orientation determined by the symplectic form d(e^t\alpha).  But since {\mathbb R} and M are both odd-dimensional, M \times {\mathbb R} has the opposite orientation. The wrong one. There is no arbitrary convention involved.
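In case the "easy to check" part is not quite as easy as advertised, here is the computation, assuming \dim M = 2n-1 so that every 2n-form pulled back from M vanishes:

```latex
\begin{align*}
d(e^t\alpha) &= e^t \left( dt \wedge \alpha + d\alpha \right), \\
\left( d(e^t\alpha) \right)^n
  &= e^{nt} \left( n \, dt \wedge \alpha \wedge (d\alpha)^{n-1} + (d\alpha)^n \right)
   = n \, e^{nt} \, dt \wedge \alpha \wedge (d\alpha)^{n-1}.
\end{align*}
```

Here (dt \wedge \alpha)^2 = 0 kills all the higher-order terms in the binomial expansion, and (d\alpha)^n = 0 for dimensional reasons. What remains is dt wedged with a positive volume form on M, with the dt factor coming first, and that is precisely a positive volume form for the product orientation of {\mathbb R} \times M.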

How well do you know the cotangent bundle, really?

If I were really so dogmatic, I would tell you that that’s the end of the story, but of course it isn’t. I did make one choice in the above discussion: I chose to write the symplectic form on the symplectization as d(e^t\alpha) or d(t\alpha). These two conventions are indeed equivalent, and up to the choice of writing the {\mathbb R}-coordinate as t and the contact form as \alpha, every paper I’m aware of in the symplectic/contact literature follows one of these two conventions. (Please feel free to point out exceptions in the comments, if you know any!) However, these are not the only two conventions that might conceivably make sense: one could reasonably choose to write the symplectic form differently, and depending on this choice, one might be forced to write M \times {\mathbb R} instead of {\mathbb R} \times M. Let me explain.

We need to recall quickly why the symplectization of (M,\xi) — despite appearances to the contrary in the formula \omega = d(e^t\alpha) — doesn’t actually depend on the choice of contact form. The canonical definition of the symplectization is as a particular symplectic submanifold of T^*M, namely

S_\xi M \subset T^*M

is the submanifold consisting of all \lambda \in T^*M such that \lambda|_\xi = 0 and \lambda > 0 in the direction positively transverse to \xi. In other words, S_\xi M is a fiber bundle over M whose sections are precisely the contact forms for \xi. A choice of contact form \alpha trivializes this fiber bundle and defines a diffeomorphism

{\mathbb R} \times M \to S_\xi M : (t,q) \mapsto e^t \alpha_q,

such that the canonical 1-form on T^*M pulls back to {\mathbb R} \times M as e^t \alpha. The contact condition for \xi thus turns out to be equivalent to the condition that S_\xi M is a symplectic submanifold of T^*M, and according to a standard convention, the above diffeomorphism identifies S_\xi M with ({\mathbb R} \times M,d(e^t\alpha)).
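One way to see the independence of the contact form concretely is the following sketch, where the function f and the map \varphi are notation introduced only for this remark. Any other contact form for the same co-oriented \xi can be written as \alpha' = e^f \alpha for some smooth f : M \to {\mathbb R}, and the two resulting trivializations differ by a shift in the {\mathbb R}-coordinate:

```latex
\begin{align*}
\varphi : {\mathbb R} \times M &\to {\mathbb R} \times M, \qquad \varphi(t,q) := (t + f(q), q), \\
e^t \alpha'_q &= e^{t + f(q)} \alpha_q
  \quad\Longrightarrow\quad
  \varphi^*\left( e^t \alpha \right) = e^t \alpha'.
\end{align*}
```

So changing the contact form amounts to composing with the diffeomorphism \varphi, which matches up the two pulled-back 1-forms exactly.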

Wait, wait… did you say “convention”?

Yes, there was exactly one semi-arbitrary convention involved in what I just said. Did you see it? I’ll give you a moment. Once you’re ready for the answer, scroll past this video of a dog lamenting the horrors of sign errors:

So here’s the thing:

The symplectic form on T^*M is not canonical.

Cotangent bundles do of course have canonical Liouville forms. As we all learned in Symplectic Geometry 101, there is a 1-form \lambda_{\text{can}} defined on T^*M such that if you choose any local coordinates (q_1,\ldots,q_n) on a neighborhood in M and let (p_1,\ldots,p_n) denote the induced coordinates on the fibers over that neighborhood, then

\lambda_{\text{can}} = \sum_{j=1}^n p_j \, d q_j.

Since it’s obvious from the formula that d\lambda_{\text{can}} is symplectic, we often assume that d\lambda_{\text{can}} is the “canonical” symplectic form on T^*M. But why should it be? Why shouldn’t the symplectic form be -d\lambda_{\text{can}}?

If this strikes you as a silly question, keep reading.

(Update 24/08/2015: Patrick Massot makes a very good point below in the comments, that “canonical” is perhaps the wrong word to be using here — \lambda_{\text{can}} can more accurately be called a tautological 1-form, and d\lambda_{\text{can}} can just as accurately be called a “tautological 2-form” on T^*M. This reinforces my opinion that d\lambda_{\text{can}} is the “best” choice for a symplectic form on T^*M, though it is not the only reasonable choice.)

What, you haven’t asked Isaac Newton’s opinion?

One could argue in various ways that d\lambda_{\text{can}} and -d\lambda_{\text{can}} are equally good choices of symplectic forms on T^*M; for instance, the canonical Liouville vector field (pointing outward in the fibers) is Liouville with respect to both of them. In fact, there are situations in which one must take -d\lambda_{\text{can}} instead of d\lambda_{\text{can}}. This leads us back to the question that started this post, the question that has caused countless headaches to graduate students attempting to start their first research projects in Floer homology and related subjects:

I ask you for the last ****ing time, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

The symplectic literature is pretty evenly split in its opinion about the definition of a Hamiltonian vector field, but there’s a basic rule of thumb that I would say must always be (and usually is) observed. Whatever sign conventions you choose, they must lead to a version of Hamilton’s equations that physicists would recognize.

An undergraduate physics student would write Hamilton’s equations as follows:

\displaystyle \dot{q}_j = \frac{\partial H}{\partial p_j},        \displaystyle \dot{p}_j = - \frac{\partial H}{\partial q_j},

where q_1,\ldots,q_n are the “position” variables (moving in M) and p_1,\ldots,p_n are the “momentum” variables (moving in the fibers of T^*M). In the special case where M = {\mathbb R}^n and we’re talking about motion in a Newtonian potential, that same physics student will define H by

\displaystyle H(q,p) = \sum_{j=1}^n \frac{p_j^2}{2 m_j} + V(q),

where V(q) is the potential energy, and the positive term in front of it (depending on some constant masses m_1,\ldots,m_n > 0) is the kinetic energy. To make sure you’ve gotten the signs right in Hamilton’s equations, all you have to do is plug in this formula and compute \dot{q}_j = p_j / m_j, which is really what \dot{q}_j had better be if you’re going to refer to p_1,\ldots,p_n as “momentum” variables. If you end up defining momentum as minus mass times velocity, then you’ve clearly done something wrong.
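Concretely, plugging this H into Hamilton’s equations gives the following (nothing is used here beyond the two displayed formulas above):

```latex
\begin{align*}
\dot{q}_j &= \frac{\partial H}{\partial p_j} = \frac{p_j}{m_j}, &
\dot{p}_j &= -\frac{\partial H}{\partial q_j} = -\frac{\partial V}{\partial q_j},
\end{align*}
\[
  \text{hence} \qquad m_j \ddot{q}_j = -\frac{\partial V}{\partial q_j},
\]
```

which is Newton’s second law for the force -\nabla V. Note that flipping both signs in Hamilton’s equations would still reproduce Newton’s law, but with p_j = -m_j \dot{q}_j, which is exactly the “minus mass times velocity” scenario.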

So if you accept what I’ve just said, then it forces upon us the following dichotomy:

Option 1: You can define Hamiltonian vector fields by \omega(X_H,\cdot) = -dH, and then you get the correct local version of Hamilton’s equations if the symplectic structure on T^*M is

\omega = d\lambda_{\text{can}} = \sum_{j=1}^n d p_j \wedge d q_j.

In this case, the symplectization of (M,\xi) can be written as ({\mathbb R} \times M,d(e^t\alpha)), but not as M \times {\mathbb R} since the latter has the wrong orientation.
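As a sanity check on Option 1, here is the routine computation in the local coordinates (q_1,\ldots,q_n,p_1,\ldots,p_n). The coefficient names a_j and b_j are introduced only for this check: write X_H = \sum_j (a_j \, \partial_{q_j} + b_j \, \partial_{p_j}) and solve for the coefficients.

```latex
\begin{align*}
\omega(X_H,\cdot) &= \sum_{j=1}^n \left( dp_j(X_H) \, dq_j - dq_j(X_H) \, dp_j \right)
  = \sum_{j=1}^n \left( b_j \, dq_j - a_j \, dp_j \right), \\
-dH &= - \sum_{j=1}^n \left( \frac{\partial H}{\partial q_j} \, dq_j + \frac{\partial H}{\partial p_j} \, dp_j \right), \\
\Longrightarrow \quad a_j &= \frac{\partial H}{\partial p_j}, \qquad b_j = - \frac{\partial H}{\partial q_j}.
\end{align*}
```

Since a_j = \dot{q}_j and b_j = \dot{p}_j along flow lines of X_H, these are exactly the equations the physics student wrote down.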

Option 2: If you prefer to write \omega(X_H,\cdot) = dH, then you get the correct local expression for Hamilton’s equations if the symplectic structure on T^*M is

\omega = - d\lambda_{\text{can}} = \sum_{j=1}^n d q_j \wedge d p_j.

I have seen papers that conform to this convention, but most of them either don’t deal at all with contact geometry, or they do so but get some of the orientations wrong. Assuming \dim M = 2n-1, one would have to write the symplectization of (M,\xi) in this case as

({\mathbb R} \times M, - d(e^t\alpha)) if n is even,

(M \times {\mathbb R}, - d(e^t\alpha)) if n is odd.
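The dependence on the parity of n comes from a single sign. Here is a sketch, using only that (dt \wedge \alpha)^2 = 0 and that (d\alpha)^n = 0 on the (2n-1)-dimensional manifold M:

```latex
\[
  \left( - d(e^t\alpha) \right)^n
  = (-1)^n \left( d(e^t\alpha) \right)^n
  = (-1)^n \, n \, e^{nt} \, dt \wedge \alpha \wedge (d\alpha)^{n-1}.
\]
```

For n even this is a positive volume form on {\mathbb R} \times M. For n odd it is negative there, hence positive on M \times {\mathbb R}, whose product orientation is the opposite one.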

For reasons that should by now be obvious, I prefer the first option. I have never seen the second option implemented in a consistent way in any paper; if I did, I would certainly find it a bit perverse, but I could not call it wrong.

(Acknowledgement: Thanks to Yankı Lekili for a conversation that helped me greatly in getting my thoughts on this topic in order. The correct order, not the wrong order.)


Non-contact wormholes in all (higher) dimensions


I have an update on the subject of a post I wrote several months ago, in which I used the word “wormhole” together with the following graphic as a shameless attention-getting device:

A wormhole, obviously

The belt sphere of a connected sum, obviously.

(It worked then, so why not use it now?)

Anyway, topologists know that when I say wormhole, I mean connected sum: the topic of that earlier post was the fact that the prime decomposition theorem for tight contact 3-manifolds cannot be extended to dimension five, or to put it another way, nonprime 5-manifolds can admit strictly more tight contact structures than what you would get just by performing contact connected sums. Examples of this were observed in the second version of my preprint with Paolo Ghiggini and Klaus Niederkrüger, inspired in part by a result of Bowden, Crowley and Stipsicz, which I’ll have more to say about below.

The new development is that we can now prove the same is true in all higher dimensions, not just dimension five:

Theorem (Ghiggini-Niederkrüger-W. ’15).  Suppose n \ge 3, and (M,\Xi) is a closed almost contact manifold that is not a homotopy sphere but is diffeomorphic to an S^{n-1}-bundle over S^n. Let (-M,\overline{\Xi}) denote the same manifold with reversed orientation, carrying the same almost contact structure with reversed co-orientation. Then M \# (-M) admits a Stein fillable contact structure that is homotopic to \Xi \# \overline{\Xi} but is not isotopic to \xi_1 \# \xi_2 for any contact structures \xi_1 and \xi_2 on M and -M respectively.

The reason our result was initially restricted to dimension five was that we were trying to prove it as a corollary of the main theorem in our paper, concerning symplectic fillings of subcritical surgeries — the above statement is not a corollary of our main theorem when n \ge 4, but one can view it nonetheless as a corollary of our proof. The credit for this realization goes to Paolo Ghiggini, or possibly to the warm sea air of Gökova that inspired him (I have never been to Gökova, but I hear it’s very nice). The proof that is now in the third version of our preprint, which appeared on the arXiv about two weeks ago, is slightly different than the one that Paolo explained at the Gökova Geometry/Topology Conference, but the idea is the same. (In case the referee is reading this, please accept our apologies for not having thought up this improvement to the paper before we actually submitted it…)

As in the 5-dimensional version, the class of examples we use is borrowed from the paper of Bowden-Crowley-Stipsicz in which they exhibit “topological counterexamples” to a higher-dimensional extension of Eliashberg’s theorem on fillings of connected sums. They proved:

Theorem (Bowden-Crowley-Stipsicz ’14). Let M := ST^*S^{2k+1} denote the unit cotangent bundle of the (2k+1)-dimensional sphere for some odd number k \ge 5. Then M admits an almost contact structure \Xi such that some contact structure on M \# (-M) homotopic to \Xi \# \overline{\Xi} is Stein fillable, but no contact structure on M or -M homotopic to \Xi or \overline{\Xi} respectively is Stein fillable.

The corresponding statement is notably false for M = ST^*S^2, or for that matter when M is any 3-manifold, due to the combination of two well-known theorems:

  1. If M_1 and M_2 are closed oriented 3-manifolds, then all tight contact structures on M_1 \# M_2 are of the form \xi_1 \# \xi_2.
  2. If (M_1,\xi_1) and (M_2,\xi_2) are closed contact 3-manifolds, then every Stein filling of (M_1 \# M_2, \xi_1 \# \xi_2) is obtained by attaching a 1-handle to Stein fillings of (M_1,\xi_1) and (M_2,\xi_2).

The first statement is a weak version of the contact prime decomposition theorem, due mainly to Colin (see e.g. Section 4.12 of Geiges’ book), and the second is one of the main results in Eliashberg’s holomorphic disk-filling paper (the book by Cieliebak and Eliashberg contains a more complete proof). When the Bowden-Crowley-Stipsicz result first appeared, many of us interpreted it as evidence that Eliashberg’s theorem cannot be extended to higher dimensions, but this interpretation is false: according to our theorem, the reason one shouldn’t expect the fillable contact structures of Bowden-Crowley-Stipsicz to have fillable summands is that they are the wrong contact structures, i.e. they are not the ones that arise from contact connected sums. The prime decomposition theorem thus fails in higher dimensions, but Eliashberg’s theorem on fillings of connected sums could still be true! (I will refrain from expressing an opinion as to whether it actually is, as I’m not sure I would expect this question to be answerable in my lifetime — suffice it to say that the main result of the paper with Ghiggini and Niederkrüger provides some very weak evidence in favor, but it’s arguably so weak as to be hardly worth mentioning.)

I don’t want to make this post longer than necessary by explaining the proof of our theorem in detail, but I can give a reasonable sketch of the idea. As I’ve expressed it above, the stated hypotheses guarantee three essential properties of M:

  1. M is (2n-1)-dimensional with n \ge 3;
  2. M has an almost contact structure \Xi;
  3. M admits a Morse function f : M \to {\mathbb R} that has a unique local minimum and a unique local maximum, and otherwise only critical points of indices n-1 and n.

We do not actually require M to be an S^{n-1}-bundle over S^n in general, as any M with these three properties will do; it’s immediate at least that M could be the unit cotangent bundle of any sphere, so our examples subsume those of Bowden-Crowley-Stipsicz. We now examine the same Stein domain that they do: let M^* denote the complement of an open ball in M, and consider the compact manifold with boundary and corners defined by

W := [-1,1] \times M^*.

After smoothing the corners, we can regard W as a compact smooth manifold with boundary M \# (-M), and moreover, the almost contact structure on M determines (up to homotopy) an almost complex structure J on W which induces the almost contact structure \Xi \# \overline{\Xi} on the boundary. Similarly, we can assume the Morse function on M has its unique local maximum in the disk that was removed to create M^*, thus it induces on W a Morse function with outward gradient at the boundary and critical points of index 0, n-1 and n. Since n \ge 3, Eliashberg’s topological characterization of Stein structures now gives W a Stein structure homotopic to J, inducing on the boundary a contact structure \xi homotopic to \Xi \# \overline{\Xi}.

W := [-1,1] \times M^* with the belt sphere of the connected sum on the boundary

Now if \xi is isotopic to \xi_1 \# \xi_2 for some contact structures \xi_1 and \xi_2 on M and -M respectively, it means that after an isotopy of \xi, the belt sphere

S := \{0\} \times \partial M^* \subset \partial W

of the connected sum is a particular kind of coisotropic submanifold in (\partial W,\xi), namely the kind that arises as the boundary of the co-core of a Weinstein 1-handle. One of the main things we show in our paper is that spheres of this type can be used as boundary conditions for holomorphic disks in the filling, as they decompose into families of totally real submanifolds. The details are significantly more complicated than in Eliashberg’s 1990 paper, but the outcome is quite similar: after a suitable choice of compatible almost complex structure J on W, one can define a moduli space {\mathcal M}(J) of J-holomorphic disks

u : ({\mathbb D}^2,\partial{\mathbb D}^2) \to (W,S)

with one marked point, and this moduli space is a compact (2n-1)-dimensional manifold which, due to the marked point, has the form of a trivial disk bundle \Sigma \times {\mathbb D}^2. Moreover, it has a continuous evaluation map

\text{ev} : {\mathcal M}(J) = \Sigma \times {\mathbb D}^2 \to W

which takes \partial {\mathcal M}(J) to the belt sphere S and defines a map of degree one \partial {\mathcal M}(J) \to S; in fact, \text{ev} is a diffeomorphism on some open subset. It’s easy to see that the image of \text{ev} cannot have much interesting topology, e.g. \text{ev}_* sends H_k({\mathcal M}(J)) to zero for every k > 0. Indeed, since {\mathcal M}(J) \cong \Sigma \times {\mathbb D}^2, every k-cycle in {\mathcal M}(J) can be pushed to the boundary by moving it in {\mathbb D}^2, thus \text{ev} maps it to a cycle in the (2n-2)-dimensional sphere S, which is trivial unless k=2n-2 (in which case the cycle must be trivial in H_k({\mathcal M}(J)) to begin with since it lives in the boundary).

Our original goal in this paper had been to use the topological information provided by the moduli space {\mathcal M}(J) to prove in much more general situations that [S] = 0 \in \pi_*(W), implying the lack of any homotopy-theoretic obstruction to W containing a handle with S as its belt sphere. We still don’t know how to prove that, except in a limited set of cases, such as when n=3. In the specific situation at hand, however, one gets more information by composing \text{ev} with the obvious projection W = [-1,1] \times M^* \to M^*, producing a continuous map

f := \text{pr}_2 \circ \text{ev} : ({\mathcal M}(J),\partial{\mathcal M}(J)) \to (M^*,\partial M^*)

which evidently has degree 1. In fact, the topological information we have about f now closely resembles the evaluation map in the proof of the Eliashberg-Floer-McDuff theorem on fillings of spheres! In that setting, one starts with an unknown symplectically aspherical filling X of a standard contact sphere S^{2n-1} and constructs a moduli space with an evaluation map

\text{ev} : ({\mathcal M}(J),\partial{\mathcal M}(J)) \to (X,S^{2n-1}),

which necessarily has degree 1 and various other properties that suffice to deduce that \pi_1(X) = 0 and H_k(X) = 0 for all k > 0. This implies via the Hurewicz theorem that X is weakly contractible, hence it is contractible by Whitehead’s theorem, and finally the h-cobordism theorem implies that it is diffeomorphic to a ball. The same topological arguments work in the present case: they imply that M^* must be contractible and thus M must be a homotopy sphere, contradicting the assumptions of our theorem. (You don’t actually need to apply the h-cobordism theorem in this case, though you are free to do so if you like it.)



Some good news about the forgetful map in SFT

This post is, unsurprisingly, a sequel to “Some bad news about the forgetful map in SFT,” in which I gave an example of a stable Hamiltonian structure (\lambda,\omega) for which the forgetful map cannot be made transverse by choosing J generically in {\mathcal J}(\lambda,\omega). In that example, the hyperplane distribution \xi = \ker\lambda was integrable, and that will turn out to be a significant detail: the goal of this post is to explain why no such example can occur if \xi is contact.

The universal moduli space

Recall the question we considered in the previous post.

Question 1: Given a smooth submanifold X \subset {\mathcal M}_{g,m+p+q}, do generic perturbations of J \in {\mathcal J}(\lambda,\omega) suffice to ensure that the moduli space of somewhere injective curves {\mathcal M}^*_{g,m,p,q}(J) is cut out transversely and the forgetful map {\mathcal M}^*_{g,m,p,q}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M}_{g,m+p+q} is transverse to X?

Using the usual approach to generic transversality results, one can rephrase this question in terms of a universal moduli space, i.e. a space {\mathcal M}_{g,m,p,q}^*({\mathcal J}_\epsilon) of pairs (u,J) where J belongs to an infinite-dimensional Banach manifold {\mathcal J}_\epsilon of perturbed almost complex structures in {\mathcal J}(\lambda,\omega), and u \in {\mathcal M}_{g,m,p,q}^*(J). Recall that locally, this space can be identified with the zero set of a smooth section of a Banach space bundle {\mathcal E} \to {\mathcal T} \times {\mathcal B} \times {\mathcal J}_\epsilon, namely

\bar{\partial} : {\mathcal T} \times {\mathcal B} \times {\mathcal J}_\epsilon \to {\mathcal E} : (j,u,J) \mapsto Tu + J \circ Tu \circ j,

where {\mathcal B} is a Banach manifold of asymptotically cylindrical maps u : \dot{\Sigma} \to {\mathbb R} \times M and {\mathcal T} is a Teichmüller slice, i.e. a finite-dimensional smooth family of complex structures on \dot{\Sigma} parametrizing an open subset of Teichmüller space. The main step in any generic transversality proof is to show that the linearization

D\bar{\partial}(j,u,J) : T_j{\mathcal T} \oplus T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)}

D\bar{\partial}(j,u,J)(y,\eta,Y) = J \circ Tu \circ y + \mathbf{D}_u \eta + Y \circ Tu \circ j

is surjective, so that \bar{\partial}^{-1}(0) is a Banach manifold, and a Baire set of regular almost complex structures is found by applying the Sard-Smale theorem to \bar{\partial}^{-1}(0) \to {\mathcal J}_\epsilon : (j,u,J) \mapsto J. In this picture, the forgetful map takes the form

\Phi : \bar{\partial}^{-1}(0) \to {\mathcal T} : (j,u,J) \mapsto j,

and Question 1 is now equivalent to either of the following:

Question 2: Is the map \bar{\partial}^{-1}(0) \stackrel{\Phi}{\longrightarrow} {\mathcal T} : (j,u,J) \mapsto j a submersion?

Question 2′: Is the linear map T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)} : (\eta,Y) \mapsto \mathbf{D}_u \eta + Y \circ Tu \circ j surjective?

Exercise 1: Assuming D\bar{\partial}(j,u,J) is surjective, convince yourself that Questions 2 and 2′ are equivalent.
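For readers who would rather not do the exercise, here is one way the argument can go. This is a sketch: I abbreviate the linear map in Question 2′ by \mathbf{L}, and use the fact that the tangent space of \bar{\partial}^{-1}(0) is the kernel of D\bar{\partial}(j,u,J) whenever the latter is surjective.

```latex
\begin{align*}
T_{(j,u,J)} \bar{\partial}^{-1}(0) &= \ker D\bar{\partial}(j,u,J)
  = \left\{ (y,\eta,Y) \ \middle|\ J \circ Tu \circ y + \mathbf{L}(\eta,Y) = 0 \right\},
  \qquad d\Phi(y,\eta,Y) = y, \\
\Phi \text{ is a submersion at } (j,u,J)
  &\iff J \circ Tu \circ y \in \operatorname{im} \mathbf{L} \ \text{ for all } y \in T_j{\mathcal T} \\
  &\iff \operatorname{im} \mathbf{L} = \operatorname{im} D\bar{\partial}(j,u,J) = {\mathcal E}_{(j,u,J)}.
\end{align*}
```

The first equivalence says that every y can be completed to an element of the kernel. The second uses that the image of D\bar{\partial}(j,u,J) is the sum of \operatorname{im} \mathbf{L} and the finite-dimensional space of all J \circ Tu \circ y, together with the assumed surjectivity of D\bar{\partial}(j,u,J).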

If the answer to either question is yes, then for any submanifold X \subset {\mathcal T}, \Phi^{-1}(X) \subset \bar{\partial}^{-1}(0) is also a manifold, and applying the Sard-Smale theorem to \Phi^{-1}(X) \to {\mathcal J}_\epsilon : (j,u,J) \mapsto J produces a Baire set for which the forgetful map {\mathcal M}^*_{g,m,p,q}(J) \to {\mathcal M}_{g,m+p+q} is transverse to X.

It depends on the contact condition!

So here’s the good news. Given a stable Hamiltonian structure (\lambda,\omega) on M, denote by \pi_\xi : T({\mathbb R} \times M) \to \xi the projection along the complex subbundle \text{Span}_{\mathbb R}(\partial_t,R); recall that d\lambda(R,\cdot) = 0, thus d\lambda(v,\cdot) = d\lambda(\pi_\xi v,\cdot) for all v \in T({\mathbb R} \times M). Let

{\mathcal M}_{g,m,p,q}^!(J) \subset {\mathcal M}_{g,m,p,q}^*(J)

denote the open subset defined by the condition that u : \dot{\Sigma} \to {\mathbb R} \times M has an injective point z_0 \in \dot{\Sigma} satisfying

\text{im}\left( \pi_\xi \circ du(z_0)\right) \cap \ker (d\lambda|_\xi) = \{0\}.

This condition similarly defines an open subset {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) \subset {\mathcal M}_{g,m,p,q}^*({\mathcal J}_\epsilon) of the universal moduli space. We can make two immediate observations about this condition:

  1. It is never satisfied if \xi is integrable, as d\lambda|_\xi = 0 in this case. This applies in particular to the counterexample in the previous post.
  2. If \xi is contact, then the condition is satisfied for all somewhere injective curves other than trivial cylinders: indeed, d\lambda|_\xi is nondegenerate in this case, and \pi_\xi \circ du vanishes only at isolated points (the latter is always true unless \pi_\xi \circ du vanishes identically; there’s a simple proof of this in another earlier post).

Theorem. The space {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) is a smooth Banach manifold, and the forgetful map \Phi : {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) \to {\mathcal M}_{g,m+p+q} is a submersion.

Using the usual Sard-Smale argument plus the Taubes trick to replace {\mathcal J}_\epsilon by {\mathcal J}(\lambda,\omega) (see part 1 of the transversality post for more details), this implies:

Corollary. Given any submanifold X \subset {\mathcal M}_{g,m+p+q}, there exists a Baire subset {\mathcal J}^{\text{reg}}(X) \subset {\mathcal J}(\lambda,\omega) such that for all J \in {\mathcal J}^{\text{reg}}(X), {\mathcal M}_{g,m,p,q}^!(J) is a smooth manifold of the predicted dimension and the forgetful map \Phi : {\mathcal M}_{g,m,p,q}^!(J) \to {\mathcal M}_{g,m+p+q} is transverse to X.

As outlined above, the key step in proving the theorem is to show that the linear map

\mathbf{L} : T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)} : (\eta,Y) \mapsto \mathbf{D}_u \eta + Y \circ Tu \circ j

is surjective. The argument for this begins in a standard way: if \mathbf{L} is not surjective, then there exists a nontrivial (0,1)-form \theta \in \Omega^{0,1}(\dot{\Sigma},u^*TW) which is L^2-orthogonal to the image of \mathbf{L}, implying

  1. \langle \mathbf{D}_u \eta , \theta \rangle_{L^2} = 0 for all \eta \in T_u{\mathcal B};
  2. \langle Y \circ Tu \circ j, \theta \rangle_{L^2} = 0 for all Y \in T_J{\mathcal J}_\epsilon.

The first condition implies as usual that \theta is a weak solution to a Cauchy-Riemann type equation and is thus (by elliptic regularity and the similarity principle) smooth with isolated zeroes. We would then like to argue that one can choose Y \in T_J{\mathcal J}_\epsilon using a bump function near the image of the injective point z_0 so that the second condition makes \theta vanish near z_0, giving a contradiction. But this is not obvious, because Y : T({\mathbb R} \times M) \to T({\mathbb R} \times M) only acts nontrivially on the subbundle \xi rather than on the entirety of T({\mathbb R} \times M). In order to deal with this, we need to examine \mathbf{L} more carefully in relation to the natural splitting of T({\mathbb R} \times M) produced by the stable Hamiltonian structure. The following is a slight repackaging of the argument given by Bourgeois.
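
To spell out the first step in symbols, here is a minimal sketch. The symbol \mathbf{D}_u^* for the formal adjoint of \mathbf{D}_u with respect to the chosen L^2-pairings is an assumption of this sketch and is not notation used above.

```latex
% Condition 1, tested against smooth compactly supported sections \eta:
\langle \mathbf{D}_u \eta , \theta \rangle_{L^2} = 0
\ \text{ for all } \eta \in C_0^\infty(u^*TW)
\quad \Longleftrightarrow \quad
\langle \eta , \mathbf{D}_u^* \theta \rangle_{L^2} = 0
\ \text{ for all such } \eta
\quad \Longleftrightarrow \quad
\mathbf{D}_u^* \theta = 0 \ \text{ in the sense of distributions.}
```

Since the formal adjoint of a Cauchy-Riemann type operator is again of Cauchy-Riemann type up to conjugation by bundle isomorphisms, this is the sense in which \theta is a weak solution.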

Abbreviate W := {\mathbb R} \times M and note that for any J \in {\mathcal J}_\epsilon, we have a splitting of complex vector bundles

(TW,J) = (\Lambda,i) \oplus (\xi,J), where \Lambda := \text{Span}_{\mathbb R}(\partial_t,R).

This gives a splitting u^*TW = u^*\Lambda \oplus u^*\xi and thus breaks down the Cauchy-Riemann type operator \mathbf{D}_u in block form as

\mathbf{D}_u = \begin{pmatrix} \mathbf{D}_u^\Lambda & \mathbf{D}_u^{\xi\Lambda} \\ \mathbf{D}_u^{\Lambda\xi} & \mathbf{D}_u^\xi \end{pmatrix}.

It is easy to check that \mathbf{D}_u^\Lambda and \mathbf{D}_u^\xi are Cauchy-Riemann type operators on u^*\Lambda and u^*\xi respectively, while \mathbf{D}_u^{\Lambda\xi} : u^*\Lambda \to u^*\xi and \mathbf{D}_u^{\xi\Lambda} : u^*\xi \to u^*\Lambda are tensorial, i.e. they are smooth bundle maps. The perturbation term Y \in T_J{\mathcal J}_\epsilon can likewise be written in block form as

Y = \begin{pmatrix} 0 & 0 \\ 0 & Y_\xi \end{pmatrix},

where Y_\xi \in \Gamma(\overline{\text{End}}_{\mathbb C}(\xi,J)), and \theta has components (\theta_\Lambda,\theta_\xi) with respect to the splitting

\overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*TW) = \overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*\Lambda) \oplus \overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*\xi).

Now choosing Y_\xi via a bump function near u(z_0), one can use the second orthogonality condition above to prove that \theta_\xi vanishes identically near z_0. It remains to show that the same is true for \theta_\Lambda, and this is the step that will turn out to depend on the condition \text{im}\left(\pi_\xi \circ du(z_0)\right) \cap \ker (d\lambda|_\xi) = \{0\}. Notice that if we choose \eta = (\eta_\Lambda,\eta_\xi) with \eta_\Lambda \equiv 0 and \eta_\xi supported in the region near z_0 where \theta_\xi is known to vanish, then the orthogonality conditions reduce to

\langle \mathbf{D}_u^{\xi\Lambda} \eta_\xi , \theta_\Lambda \rangle_{L^2} = 0.

We’re going to need an explicit formula for the bundle map \mathbf{D}_u^{\xi\Lambda} : u^*\xi \to u^*\Lambda. For this purpose, it will help to think of (W,J) as something resembling a Stein manifold: notice that the coordinate projection function t : {\mathbb R} \times M \to {\mathbb R} satisfies

-dt \circ J = \lambda

for every J \in {\mathcal J}(\lambda,\omega), so its level sets are J-convex if and only if \lambda is contact. Choose holomorphic local coordinates \sigma+i\tau near z_0 and, for a section \eta_\xi \in \Gamma(u^*\xi) supported in this coordinate neighborhood, let us compute \lambda(\mathbf{D}_u \eta_\xi(\partial_\sigma)). By the definition of the linearized Cauchy-Riemann operator, we can write

\mathbf{D}_u \eta_\xi(\partial_\sigma) = \left.\nabla_s \left( \partial_\sigma u_s + J(u_s) \partial_\tau u_s \right)\right|_{s=0}

for any smooth family of maps u_s : \dot{\Sigma} \to W with u_0 = u and \partial_s u_s|_{s=0} = \eta_\xi and any connection \nabla on W. Then since \partial_\sigma u + J(u) \partial_\tau u = 0, we find

\lambda(\mathbf{D}_u \eta_\xi(\partial_\sigma)) = \lambda \left( \left. \nabla_s (\partial_\sigma u_s + J(u_s) \partial_\tau u_s)\right|_{s=0} \right) = \left.\partial_s \left[ \lambda(\partial_\sigma u_s + J(u_s) \partial_\tau u_s)\right]\right|_{s=0}

= \left.\partial_s [\lambda(\partial_\sigma u_s)]\right|_{s=0} + \left.\partial_s [(\lambda \circ J)(\partial_\tau u_s)]\right|_{s=0} = d\lambda(\eta_\xi,\partial_\sigma u) + d(\lambda \circ J)(\eta_\xi,\partial_\tau u)

= d\lambda(\eta_\xi,\pi_\xi \partial_\sigma u),

where we’ve used the formula

d\lambda(X,Y) = {\mathcal L}_X[\lambda(Y)] - {\mathcal L}_Y[\lambda(X)] - \lambda([X,Y])

and eliminated several terms using the fact that \lambda(\eta_\xi) = \lambda(J\eta_\xi) = 0 since \eta_\xi is valued in \xi, plus d(\lambda \circ J) = 0 since \lambda \circ J = dt is exact. A similar computation gives

dt(\mathbf{D}_u \eta_\xi(\partial_\sigma)) = -d\lambda(\eta_\xi, \pi_\xi \partial_\tau u) = -d\lambda(\eta_\xi, J \pi_\xi \partial_\sigma u),

so removing the local coordinates from the picture and writing sections of u^*\Lambda with respect to the obvious complex trivialization, we have

\mathbf{D}^{\xi\Lambda}_u \eta_\xi = - d\lambda(\eta_\xi, J \pi_\xi du(\cdot)) + i \, d\lambda(\eta_\xi, \pi_\xi du(\cdot)).
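
For readers who want the dt-component written out, here is a sketch of that similar computation. It uses only the relation -dt \circ J = \lambda (equivalently dt \circ J = -\lambda), the identity d\lambda(v,\cdot) = d\lambda(\pi_\xi v,\cdot), and the fact that J preserves the splitting \Lambda \oplus \xi. Any misstep in the bookkeeping here is mine.

```latex
\begin{aligned}
dt(\mathbf{D}_u \eta_\xi(\partial_\sigma))
&= \left.\partial_s \left[ dt(\partial_\sigma u_s) + (dt \circ J)(\partial_\tau u_s) \right]\right|_{s=0} \\
&= \left.\partial_s [dt(\partial_\sigma u_s)]\right|_{s=0}
 - \left.\partial_s [\lambda(\partial_\tau u_s)]\right|_{s=0}
 && \text{since } dt \circ J = -\lambda \\
&= d(dt)(\eta_\xi,\partial_\sigma u) - d\lambda(\eta_\xi,\partial_\tau u)
 && \text{since } dt(\eta_\xi) = \lambda(\eta_\xi) = 0 \\
&= -d\lambda(\eta_\xi,\pi_\xi \partial_\tau u)
 && \text{since } d(dt) = 0 \text{ and } d\lambda(\cdot,v) = d\lambda(\cdot,\pi_\xi v) \\
&= -d\lambda(\eta_\xi, J \pi_\xi \partial_\sigma u)
 && \text{since } \partial_\tau u = J \partial_\sigma u \text{ and } \pi_\xi J = J \pi_\xi .
\end{aligned}
```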

The following exercise in symplectic linear algebra shows that this bundle map u^*\xi \to u^*\Lambda is surjective on all fibers in some neighborhood of z_0. (If you have no patience for the exercise, just convince yourself that it’s true if d\lambda|_\xi is nondegenerate and tames J|_\xi.)
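
Here is a sketch of the easy case mentioned in parentheses. It assumes a point z where X := \pi_\xi \partial_\sigma u(z) is nonzero and d\lambda|_\xi tames J|_\xi; the names X and c are introduced only for this illustration.

```latex
% Evaluate the (0,1)-form \mathbf{D}_u^{\xi\Lambda} \eta_\xi on \partial_\sigma at z,
% with X := \pi_\xi \partial_\sigma u(z) \neq 0 and c := d\lambda(X, JX) > 0 (taming):
\begin{aligned}
\eta_\xi = X: \qquad
& -d\lambda(X, JX) + i \, d\lambda(X, X) = -c, \\
\eta_\xi = JX: \qquad
& -d\lambda(JX, JX) + i \, d\lambda(JX, X) = -i c .
\end{aligned}
% The values -c and -ic are linearly independent over {\mathbb R},
% so they span {\mathbb C} and the fiber map is surjective at z.
```

Since a complex-antilinear map on T_z\dot{\Sigma} is determined by its value on \partial_\sigma, this gives surjectivity onto the whole fiber of \overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*\Lambda) at z.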

Exercise 2: Assume V is a finite-dimensional vector space, X, Y \in V are linearly independent vectors, and \Omega is an alternating bilinear form. Show that the real-linear map

A : V \to {\mathbb C} : v \mapsto \Omega(v,X) + i \Omega(v,Y)

is surjective if and only if \text{Span}(X,Y) \cap \ker \Omega = \{0\}. Hint: Under the latter condition, one loses no generality by replacing V with a subspace that is complementary to \ker \Omega and contains \text{Span}(X,Y), in which case (V,\Omega) becomes a symplectic vector space. Now consider the restriction of A to a 2-dimensional subspace transverse to the symplectic complement of \text{Span}(X,Y).

In light of the exercise, we can choose \eta_\xi with support near z_0 so that \mathbf{D}_u^{\xi\Lambda} \eta_\xi = \theta_\Lambda near z_0, thus the condition \langle \mathbf{D}_u^{\xi\Lambda} \eta_\xi , \theta_\Lambda \rangle_{L^2} = 0 implies that \theta_\Lambda must indeed vanish near z_0. We conclude that \theta itself vanishes near z_0, contradicting the fact that it has isolated zeroes and thus completing the proof.

Exercise 3: Generalize the theorem above to the setting of a Weinstein manifold (W,\lambda,f), i.e. assume (W,\lambda) is a Liouville manifold whose dual Liouville vector field V_\lambda is gradient-like with respect to f : W \to {\mathbb R}. Notice that away from critical points of f, \lambda = -df \circ J is satisfied for any almost complex structure J that preserves the subbundle \xi := \ker df \cap \ker \lambda and takes V_\lambda to df(V_\lambda) R_\lambda, with R_\lambda denoting the “level-wise Reeb vector field” defined by the conditions

df(R_\lambda) = 0,     \lambda(R_\lambda) = 1,     and     d\lambda(R_\lambda,\cdot)|_\xi = 0.

For what class of J-holomorphic curves can you achieve transversality only by perturbing J on \xi?

Epilogue: what of the normal Cauchy-Riemann operator?

Compared with the argument I explained in my earlier 2-part post on this subject, the one given above achieves a stronger result at the cost of a stronger hypothesis. The weakening of hypotheses in the previous result is achieved by focusing on the normal bundle and the corresponding normal Cauchy-Riemann operator \mathbf{D}_u^N; this makes it reasonable to consider local perturbations of J only in directions normal to u (hence the condition that u must not be everywhere tangent to \xi), and the contact condition plays no role. This approach suffices because transversality in the usual moduli space of unparametrized holomorphic curves is equivalent to the surjectivity of \mathbf{D}_u^N. The latter fact is not hard to grasp in the case where u is immersed: then every other J-holomorphic curve near u admits a unique parametrization in the form

u' = \exp_u \eta

for some section \eta of the normal bundle. Such curves may be considered pseudoholomorphic if and only if their tangent spaces are J-invariant, and the linearization of the nonlinear problem detecting J-invariant immersions of the form \exp_u \eta then reduces to \mathbf{D}_u^N. (This alternative perspective on the nonlinear problem is explained more precisely in the paper by Hofer-Lizan-Sikorav.) From this point of view, however, notice that the domain complex structure of the nearby solution u' cannot be prescribed: it is fully determined by j' := (u')^*J, so the problem of finding nearby pseudoholomorphic maps u' : (\dot{\Sigma},j) \to (W,J) of the form u' = \exp_u \eta with j prescribed and \eta normal to u is overdetermined. This is why looking at \mathbf{D}_u^N does not give us transversality of the forgetful map, and we had to take a different approach in the proof above. For the same reasons, there are (as far as I know) no useful “automatic transversality” results in dimension four for moduli spaces with fixed conformal structures on the domain. Such results are typically proved by showing that \mathbf{D}_u^N is surjective, but this only gives a meaningful transversality condition if the conformal structure of the domain is allowed complete freedom of movement. The forgetful map is never automatically transverse.


Some bad news about the forgetful map in SFT

This post and its sequel (fittingly titled “Some good news about the forgetful map in SFT”) are meant as addenda to my 2-part post from last winter on generic transversality in symplectizations. The result I tried to explain in that post might be called the fundamental transversality theorem of SFT: it states that for generic choices of J in the usual space of translation-invariant almost complex structures on the symplectization {\mathbb R} \times M of a contact manifold (M,\xi), every somewhere injective J-holomorphic curve in {\mathbb R} \times M is Fredholm regular. In fact, the proof I explained works in the more general setting where instead of a contact structure, M is endowed with a stable Hamiltonian structure (\lambda,\omega), the caveat being that one must exclude from consideration any holomorphic curves that are everywhere tangent to \xi := \ker \lambda, a scenario that can never happen in the contact case.

In this and the next post, I want to discuss a slightly subtle detail about the smoothness of these moduli spaces: the forgetful map. Readers may be familiar with this notion from Gromov-Witten theory, e.g. it appears in the book by McDuff and Salamon as a continuous map from the moduli space {\mathcal M}(J) of unparametrized J-holomorphic curves to the moduli space {\mathcal M} of Riemann surfaces, associating to each equivalence class of holomorphic curves u : (\Sigma,j) \to (W,J) the isomorphism class of complex structures (or equivalently, conformal structures) on its domain; in symbols,

{\mathcal M}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M} : [(j,u)] \mapsto [j].

This map plays a key role in defining many of the more interesting algebraic structures in Gromov-Witten theory, and in principle it can play a similar role in Symplectic Field Theory. But in order to make use of it, one needs to know not just that {\mathcal M}(J) is smooth, but also that \Phi can be made transverse to given cycles in {\mathcal M} for generic choices. This transversality condition is not so obvious, and as I’ll show in this post, there are situations where it is not even true. That is the bad news. The corresponding good news will be that it is true in the setting that SFT really cares about, namely in the symplectization of any contact manifold.

The impetus for this post was a minicourse on transversality techniques that I gave last week at the 2015 Summer School on Moduli Problems in Symplectic Geometry at IHES, in which I explained the transversality proof that is the subject of the earlier post and referenced this blog in lieu of lecture notes. In the minicourse, I mentioned (but did not give a very good explanation of) the following curious detail: unlike other proofs of generic transversality that one commonly sees, my proof in the symplectization setting depended crucially on the fact that we are considering holomorphic curves whose domain conformal structures are allowed to vary. In particular, my proof does not imply that one can make moduli spaces of J-holomorphic curves with constrained conformal structures on the domain smooth by choosing J generically — or to say it the fancy way, it does not make the forgetful map transverse. There are other proofs in the literature that do not have this limitation, e.g. a different proof for holomorphic cylinders in the contact setting appears in the appendix of a paper by Bourgeois. Strictly speaking, the moduli space of conformal structures is irrelevant in Bourgeois’s proof since the conformal structure of a cylinder is unique, but as I’ll sketch in the sequel to this post, his argument can easily be generalized to arbitrary punctured Riemann surfaces without the moduli space of conformal structures playing any role, so it does imply transversality of the forgetful map.

I hadn’t seriously thought about this detail before, but when it came up in the minicourse, it got me wondering whether this discrepancy was just a defect of my approach or an actual mathematical phenomenon. The answer turned out to be the latter, and it could conceivably have some nontrivial implications for computation problems in SFT: in the general setting of stable Hamiltonian structures, the forgetful map is not generically transverse, and I will give a counterexample below. The fact that Bourgeois’s proof does give transversality of the forgetful map is therefore a distinctly contact phenomenon. Let me explain what I mean.

The forgetful map in SFT

We will work in the same setting as in the earlier post: M is a closed oriented (2n-1)-dimensional manifold carrying a stable Hamiltonian structure (\lambda,\omega), which induces a co-oriented hyperplane distribution \xi = \ker\lambda \subset TM and a Reeb vector field R, and also determines a space {\mathcal J}(\lambda,\omega) of smooth translation-invariant almost complex structures on the symplectization {\mathbb R} \times M with J(\partial_t) = R and J|_{\xi} : \xi \to \xi compatible with \omega|_\xi.

Fix nonnegative integers g, m, p and q. Given J \in {\mathcal J}(\lambda,\omega), the moduli space {\mathcal M}_{g,m,p,q}(J) of unparametrized J-holomorphic curves in {\mathbb R} \times M with genus g, m marked points, p positive and q negative punctures consists of equivalence classes of tuples (\Sigma,j,\Theta,\Gamma,u) where (\Sigma,j) is a closed Riemann surface of genus g, \Theta \subset \Sigma is an ordered set of m points, \Gamma \subset \Sigma is partitioned into two ordered subsets \Gamma^+ and \Gamma^- of p and q points respectively, u : (\dot{\Sigma} := \Sigma \setminus \Gamma,j) \to ({\mathbb R} \times M,J) is J-holomorphic and positively/negatively asymptotic to trivial cylinders over closed Reeb orbits at each of the punctures in \Gamma^\pm \subset \Sigma, and two such tuples are considered equivalent if they are related by a diffeomorphism of their respective domains. Let

{\mathcal M}^*_{g,m,p,q}(J) \subset {\mathcal M}_{g,m,p,q}(J)

denote the open subset defined by the condition that u : \dot{\Sigma} \to {\mathbb R} \times M is somewhere injective. The forgetful map is defined by

\Phi : {\mathcal M}_{g,m,p,q}(J) \to {\mathcal M}_{g,m+p+q} : [(\Sigma,j,\Theta,\Gamma,u)] \mapsto [(\Sigma,j,\Theta \cup \Gamma)],

where {\mathcal M}_{g,k} denotes the moduli space of (marked) Riemann surfaces, consisting of equivalence classes of tuples (\Sigma,j,\Theta') with \Theta' \subset \Sigma an ordered set of k points.

It is a classical fact that {\mathcal M}_{g,k} is smooth, though in general the presence of biholomorphic automorphisms makes it an orbifold rather than a manifold. For the purposes of this discussion, I’m going to ignore automorphisms and pretend {\mathcal M}_{g,k} is a manifold wherever convenient. Then it is natural to ask the following:

Question 1: Given a smooth submanifold X \subset {\mathcal M}_{g,m+p+q}, do generic perturbations of J \in {\mathcal J}(\lambda,\omega) suffice to ensure that the moduli space of somewhere injective curves {\mathcal M}^*_{g,m,p,q}(J) is cut out transversely and the forgetful map {\mathcal M}^*_{g,m,p,q}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M}_{g,m+p+q} is transverse to X?

Here the words “cut out transversely” are used as a synonym for what I usually call Fredholm regularity of the elements in {\mathcal M}^*_{g,m,p,q}(J), so in particular {\mathcal M}^*_{g,m,p,q}(J) is a smooth manifold of the “correct” dimension as predicted by the usual index formula, but in addition to this, we obtain a smooth submanifold

\Phi^{-1}(X) \subset {\mathcal M}^*_{g,m,p,q}(J)

whose codimension matches that of X \subset {\mathcal M}_{g,m+p+q}. For example, if X is defined to be a single element [(\Sigma,j,\Theta \cup \Gamma)] with trivial automorphism group, then \Phi^{-1}(X) can be identified with the space of parametrized somewhere injective J-holomorphic maps u : (\dot{\Sigma},j) \to ({\mathbb R} \times M,J), where j, \Theta and \Gamma are regarded as fixed data on a fixed surface \Sigma, and transversality implies that this space of maps is a manifold.

A counterexample in the integrable case

The following example shows that the answer to Question 1 is sometimes no. Suppose (X,\Omega) is a closed symplectic manifold of dimension 2n-2, and endow M := S^1 \times X with the stable Hamiltonian structure (\lambda,\omega) := (d\theta,\Omega), where \theta denotes the coordinate on S^1. The Reeb vector field is then R = \partial_{\theta}, so every J \in {\mathcal J}(\lambda,\omega) on {\mathbb R} \times M = {\mathbb R} \times S^1 \times X is of the form

J(t,\theta,x) = i \oplus \hat{J}_\theta(x),

where i denotes the standard complex structure i \partial_t = \partial_\theta on {\mathbb R} \times S^1 and \{\hat{J}_\theta\}_{\theta \in S^1} is a smooth S^1-family of compatible almost complex structures on (X,\Omega). A map

u = (f,g,v) : (\dot{\Sigma},j) \to ({\mathbb R} \times S^1 \times X,J)

is then J-holomorphic if and only if \varphi := (f,g) : (\dot{\Sigma},j) \to ({\mathbb R} \times S^1,i) is holomorphic and v : (\dot{\Sigma},j) \to (X,\hat{J}^g) is pseudoholomorphic for the domain-dependent almost complex structure \hat{J}^g on X defined by

\hat{J}^g(z,x) := \hat{J}_{g(z)}(x)

for (z,x) \in \dot{\Sigma} \times X. Notice that such a map will be somewhere injective whenever v : \dot{\Sigma} \to X is somewhere injective, even if \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 is a multiple cover.
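
Written out, the splitting of the nonlinear Cauchy-Riemann equation under J = i \oplus \hat{J}_\theta looks as follows. This is only a restatement of the claim above in equation form.

```latex
du + J(u) \circ du \circ j = 0
\quad \Longleftrightarrow \quad
\begin{cases}
d\varphi + i \circ d\varphi \circ j = 0, \\
dv(z) + \hat{J}_{g(z)}(v(z)) \circ dv(z) \circ j = 0
\quad \text{for all } z \in \dot{\Sigma}.
\end{cases}
```

The first equation does not involve v at all, while the second depends on \varphi only through its S^1-component g.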

Let us fix the domain (\dot{\Sigma} = \Sigma \setminus \Gamma,j), assuming \Sigma has genus g and that (\Sigma,j,\Gamma) has no automorphisms. Then if u = (f,g,v) : (\dot{\Sigma},j) \to {\mathbb R} \times S^1 \times X is J-holomorphic and v is somewhere injective, let {\mathcal M}_u denote a small neighborhood of u in the space of parametrized J-holomorphic maps (\dot{\Sigma},j) \to ({\mathbb R} \times S^1 \times X,J); without loss of generality we may assume

{\mathcal M}_u = {\mathcal M}_\varphi \times {\mathcal M}_v,

where {\mathcal M}_\varphi and {\mathcal M}_v similarly denote small neighborhoods of \varphi = (f,g) and v in their respective moduli spaces. Each of these spaces has virtual dimension equal to the Fredholm index of the relevant linearized Cauchy-Riemann operator on the pulled back tangent bundle, that is,

\text{vir-dim } {\mathcal M}_u = \text{ind } \mathbf{D}_u = \text{ind } \mathbf{D}_\varphi + \text{ind } \mathbf{D}_v.

Since v is somewhere injective, standard transversality arguments imply that for generic choices of the family \{\hat{J}_\theta\}_{\theta \in S^1} and hence for generic J \in {\mathcal J}(\lambda,\omega), {\mathcal M}_v will be a smooth manifold of dimension \text{ind } \mathbf{D}_v. On the other hand, {\mathcal M}_\varphi is always a manifold, but its dimension will usually be larger than \text{ind } \mathbf{D}_\varphi, implying \dim {\mathcal M}_u > \text{vir-dim } {\mathcal M}_u. To be precise, I claim that

\dim {\mathcal M}_\varphi = 2     but     \text{ind } \mathbf{D}_\varphi = 2 - 2g.

It’s easy to see where the two dimensions of {\mathcal M}_\varphi come from: one can compose \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 with two dimensions of holomorphic translations on {\mathbb R} \times S^1. The rest of the claim can be proven using the punctured version of the Riemann-Roch formula and the fact that \varphi^*T({\mathbb R} \times S^1) \to \dot{\Sigma} is canonically a trivial bundle; in the interest of brevity, I will leave the details as an exercise for the reader.
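
One hedged way to organize the index count is sketched below. It assumes the usual functional-analytic setup, which is not spelled out above: \mathbf{D}_\varphi acts on sections with small exponential decay at the punctures, together with a 2-dimensional space of asymptotically constant sections at each of the p + q punctures. The per-puncture contributions depend on these conventions.

```latex
% Trivial complex line bundle, so the relative first Chern number vanishes.
% Each puncture contributes -1 on the exponentially decaying part
% and +2 from the asymptotically constant sections.
\text{ind } \mathbf{D}_\varphi
= \underbrace{\chi(\dot{\Sigma}) - (p+q)}_{\text{decaying sections}}
+ \underbrace{2(p+q)}_{\text{asymptotic constants}}
= \chi(\dot{\Sigma}) + (p+q) = \chi(\Sigma) = 2 - 2g,
% and therefore, for generic J (so that \dim {\mathcal M}_v = \text{ind } \mathbf{D}_v):
\qquad
\dim {\mathcal M}_u - \text{vir-dim } {\mathcal M}_u = 2 - (2 - 2g) = 2g .
```

In this form the excess dimension 2g is visibly zero exactly when the domain has genus zero.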

This example shows that unless (\dot{\Sigma},j) has genus zero, one can never choose J \in {\mathcal J}(\lambda,\omega) generically enough for {\mathcal M}_u to be cut out transversely. It’s worth noting however that we wouldn’t have had this problem if {\mathcal M}_u had been defined without fixing the complex structure on the domain. If j is allowed to vary, then the moduli space of unparametrized holomorphic curves (\dot{\Sigma},j) \to ({\mathbb R} \times S^1,i) is always smooth and cut out transversely (cf. Example 3.16 in my automatic transversality paper). And indeed, the usual generic transversality theorem is true for these curves — as long as \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 is not constant, u = (\varphi,v) satisfies the hypothesis of being not everywhere tangent to \xi, so the proof in my earlier post applies, but says nothing about the forgetful map.

As a general rule, most of the analytical phenomena needed to make SFT work as advertised — compactness, asymptotic formulas and generic transversality, for example — hold just as well for arbitrary stable Hamiltonian structures as for contact structures, but there are occasional exceptions and the issue described above with the forgetful map is one of them. An even simpler issue is that for general choices of (\lambda,\omega) and J \in {\mathcal J}(\lambda,\omega), nontrivial curves u : (\dot{\Sigma},j) \to ({\mathbb R} \times M,J) need not always have a positive puncture, though a maximum principle implies that they do when \xi is contact, and this fact is crucial for the basic algebraic structure of SFT. Thus SFT was never meant to be valid for completely arbitrary stable Hamiltonian structures — nonetheless, non-contact examples can be useful tools in a variety of problems within the SFT context, particularly for computations (this has been the case in a few papers of mine, where e.g. certain stable Hamiltonian structures appear as limits of degenerating families of contact structures). Thus it’s valuable to see how far the analysis can be pushed, and where it stops working.

So much for the bad news. I’ll discuss the good news in the next post.


The similarity principle without Calderón-Zygmund

In my L^2 vs. L^p post a few weeks ago, I sketched a more or less standard proof of the similarity principle, and then wrote:

I defy the reader to come up with any alternative version of the above proof that does not use properties of the operator \bar{\partial} : W^{1,p}({\mathbb D}) \to L^p({\mathbb D}) for some p > 2.

Two readers responded to this challenge: they were Jean-Claude Sikorav and Patrick Massot, and in this post I’m going to explain (as I did last week on the topic of regularity and bubbling) my reinterpretation of the proof that they sent me. It should be said that after I’d managed to understand this proof, I still felt rather surprised that it works, and while I can’t speak for anyone else, it strikes me as something that I would never come up with if I had not first seen the standard proof in L^p.

The problem

Recall the statement: the most useful version of the similarity principle can be viewed as saying that if \mathbf{D} is a real-linear Cauchy-Riemann type operator on a smooth complex vector bundle E over a Riemann surface \Sigma, and \eta \in \Gamma(E) satisfies \mathbf{D}\eta \equiv 0 and \eta(z_0) = 0, then on some neighborhood of z_0 in \Sigma, E admits a continuous trivialization that identifies \eta with a holomorphic function. This is useful because it implies a unique continuation result: either \eta \equiv 0 or it has an isolated zero at z_0 (which is also of positive order if E is a line bundle).

As I outlined in L^p or not L^p, that is the question, the similarity principle is a corollary of the following local existence result for solutions of linear Cauchy-Riemann type equations. Let’s fix the usual notation: {\mathbb D}, {\mathbb D}_r \subset {\mathbb C} will denote the open disks of radius 1 and r respectively, write the standard coordinate on {\mathbb C} as z = s + it, and the standard Cauchy-Riemann operator as \bar{\partial} := \partial_s + i \partial_t.

Lemma 1. Suppose n \in {\mathbb N}, p > 2, and A : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is a function of class L^p. Then for any \epsilon > 0 sufficiently small, there exists a linear map {\mathbb C}^n \to C^0({\mathbb D}_\epsilon,{\mathbb C}^n) associating to each v \in {\mathbb C}^n a function u : {\mathbb D}_\epsilon \to {\mathbb C}^n that satisfies (\bar{\partial} + A) u = 0 in the sense of distributions and u(0) = v.

One remark before we get into the proof. The regularity assumption on the zeroth order term A may strike you as absurdly weak — normally geometers are only interested in smooth Cauchy-Riemann type operators. Recall however that the first step in proving the similarity principle is to replace a real-linear Cauchy-Riemann operator with one that is complex linear but still annihilates the given section \eta: this can always be done by changing the zeroth order term, but since we do not know a priori what the zero set of \eta looks like, the price we pay is that A can at best be assumed to be of class L^\infty after this change. The above statement weakens the hypothesis to L^p for p > 2 just because that will turn out to be what we need in the proof. The similarity principle then follows because we can use the local existence result to construct each column of a continuous matrix-valued function \Phi on {\mathbb D}_\epsilon that satisfies \mathbf{D}\Phi = 0 and \Phi(0) = I, and since \mathbf{D} is now complex linear, the Leibniz rule implies that if \mathbf{D}\eta = 0 and \eta = \Phi f, then \bar{\partial} f = 0.
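
The Leibniz rule step at the end of this remark can be written as a formal one-line computation. It assumes \mathbf{D} = \bar{\partial} + A with A complex linear, and is to be read in the sense of distributions.

```latex
0 = \mathbf{D}\eta = \mathbf{D}(\Phi f)
= (\bar{\partial}\Phi + A\Phi) f + \Phi \, \bar{\partial} f
= (\mathbf{D}\Phi) f + \Phi \, \bar{\partial} f
= \Phi \, \bar{\partial} f
\quad \Longrightarrow \quad
\bar{\partial} f = 0 \ \text{ wherever } \Phi \text{ is invertible.}
```

Since \Phi is continuous with \Phi(0) = I, it is invertible on a neighborhood of 0, which is where f = \Phi^{-1}\eta is holomorphic.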

There’s a fairly straightforward way to prove Lemma 1 if you’re willing to accept the fact — essentially equivalent to the Calderón-Zygmund inequality — that

\bar{\partial} : W^{1,p}({\mathbb D}) \to L^p({\mathbb D})

has a bounded right inverse for p > 2. The idea is to look for solutions u \in W^{1,p}({\mathbb D}) to the equation (\bar{\partial} + A_\epsilon) u = 0, where A_\epsilon : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is defined to match A on {\mathbb D}_\epsilon and to vanish everywhere else. By the Sobolev embedding theorem, we have

\| A_\epsilon u \|_{L^p} \le \| A_\epsilon \|_{L^p} \| u \|_{C^0} \le c \| A \|_{L^p({\mathbb D}_\epsilon)} \| u \|_{W^{1,p}},

hence \bar{\partial} + A_\epsilon \to \bar{\partial} + A_0 := \bar{\partial} in the space of bounded linear operators W^{1,p} \to L^p as \epsilon \to 0. Now consider the continuous family of bounded linear maps

\Psi_\epsilon : W^{1,p}({\mathbb D}) \to L^p({\mathbb D}) \times {\mathbb C}^n : u \mapsto ((\bar{\partial} + A_\epsilon) u , u(0))

for \epsilon \in [0,1], and notice that since constant functions are holomorphic, one can use the bounded right inverse of \bar{\partial} to construct a bounded right inverse for \Psi_0. The existence of bounded right inverses is an open condition, so it follows that \Psi_\epsilon also admits such an inverse for all \epsilon > 0 sufficiently small; call it T_\epsilon : L^p({\mathbb D}) \times {\mathbb C}^n \to W^{1,p}({\mathbb D}). After embedding W^{1,p} into C^0, the linear map {\mathbb C}^n \to C^0({\mathbb D}_\epsilon) promised by Lemma 1 can now be written as

v \mapsto T_\epsilon(0,v)|_{{\mathbb D}_\epsilon},

and this completes the proof.
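
The two functional-analytic steps in this proof can be made explicit as follows. The sketch assumes T denotes a bounded right inverse of \bar{\partial} : W^{1,p}({\mathbb D}) \to L^p({\mathbb D}); the names T and T_0 are introduced only for this illustration.

```latex
% Right inverse of \Psi_0: correct Tg by a constant (holomorphic) function.
T_0(g,v) := Tg + \left( v - (Tg)(0) \right),
\qquad
\Psi_0 T_0(g,v) = \left( \bar{\partial} Tg , \ (Tg)(0) + v - (Tg)(0) \right) = (g,v).

% Openness: if \| \Psi_\epsilon - \Psi_0 \| \cdot \| T_0 \| < 1, then
\Psi_\epsilon T_0 = \mathbb{1} + (\Psi_\epsilon - \Psi_0) T_0
\ \text{ is invertible by a Neumann series, and } \
T_\epsilon := T_0 \left( \Psi_\epsilon T_0 \right)^{-1}
\ \text{ satisfies } \ \Psi_\epsilon T_\epsilon = \mathbb{1}.
```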

We used the assumption p > 2 quite a few times in the above argument: without it, the map \Psi_\epsilon would not be continuous because W^{1,p} \to {\mathbb C}^n : u \mapsto u(0) is not continuous, and even if we could obtain solutions of class W^{1,p} in the end, they might not be in C^0. For these reasons, I previously could not imagine how it might be possible to prove such a local existence result without relying on the elliptic estimates for \bar{\partial} : W^{1,p} \to L^p with p > 2.

But it is possible.

The solution

Here’s a slightly different kind of local existence result.

Lemma 2. Suppose n \in {\mathbb N}, p > 2, A : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is a function of class L^p, and f_0 : {\mathbb D}_r \to {\mathbb C}^n is a holomorphic function on a disk of some radius r \le 1. Then for any number \delta > 0, there exists a number \epsilon \in (0,r] and a continuous function f : {\mathbb D}_\epsilon \to {\mathbb C}^n such that \| f \|_{C^0} \le \delta and (\bar{\partial} + A)(f_0 + f) = 0 in the sense of distributions.

In this lemma we’ve dropped the requirement that our solution take a prescribed value at the origin, instead just asking for it to be C^0-close to some prescribed function. Nonetheless, it’s not too hard to see that Lemma 2 implies Lemma 1: one can use Lemma 2 to construct the columns of a continuous matrix-valued function \Phi : {\mathbb D}_\epsilon \to \text{End}({\mathbb R}^{2n}) that satisfies (\bar{\partial} + A)\Phi = 0 and is C^0-close to the identity, hence everywhere invertible. Solutions with prescribed values at 0 can then be constructed in the form \Phi f where f is constant.
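
In symbols, the passage from Lemma 2 to Lemma 1 can be sketched like this. It assumes the columns of \Phi are obtained by applying Lemma 2 with f_0 equal to each constant real basis vector of {\mathbb R}^{2n} = {\mathbb C}^n and with \delta small enough.

```latex
\| \Phi - \mathbb{1} \|_{C^0({\mathbb D}_\epsilon)} < 1
\ \Longrightarrow \
\Phi(z) \text{ is invertible for all } z \in {\mathbb D}_\epsilon,
\qquad
u := \Phi \, \Phi(0)^{-1} v
\ \Longrightarrow \
(\bar{\partial} + A) u = 0, \quad u(0) = v .
```

The map v \mapsto u is real linear because \Phi(0)^{-1} v is a constant vector and \bar{\partial} + A is real linear.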

So how can we prove Lemma 2 using only L^2 estimates? We shall again look for continuous functions f : {\mathbb D} \to {\mathbb C}^n satisfying (\bar{\partial} + A_\epsilon) (f_0 + f) = 0. Notice that it is still true that

\lim_{\epsilon \to 0} (\bar{\partial} + A_\epsilon) = \bar{\partial}

in the space of bounded linear operators W^{1,2} \to L^2, though for different reasons than in the p > 2 case: W^{1,2}({\mathbb D}) is a Sobolev borderline case and embeds continuously into L^q({\mathbb D}) for every q \ge 1, so picking q > 1 such that 1/q + 2/p = 1 and using Hölder’s inequality, we have

\| A_\epsilon u \|_{L^2}^2 \le \int_{{\mathbb D}_\epsilon} | A |^2 |u|^2 \le \| |A|^2 \|_{L^{p/2}({\mathbb D}_\epsilon)} \cdot \| |u|^2 \|_{L^q({\mathbb D}_\epsilon)} = \| A \|_{L^p({\mathbb D}_\epsilon)}^2 \| u \|_{L^{2q}}^2

\le c \| A \|_{L^p({\mathbb D}_\epsilon)}^2 \| u \|_{W^{1,2}}^2.
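
The exponent bookkeeping behind this application of Hölder’s inequality is short enough to record. The sample values of p are chosen here only for illustration.

```latex
\frac{2}{p} + \frac{1}{q} = 1
\ \Longleftrightarrow \
q = \frac{p}{p-2},
\qquad
\| |A|^2 \|_{L^{p/2}} = \| A \|_{L^p}^2,
\qquad
\| |u|^2 \|_{L^q} = \| u \|_{L^{2q}}^2 .
% Examples: p = 4 gives q = 2 and 2q = 4; p = 3 gives q = 3 and 2q = 6.
```

As p decreases toward 2, the exponent 2q grows without bound, which is why the embedding of W^{1,2} into L^q for every finite q is needed rather than for a single fixed q.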

It follows that since \bar{\partial} : W^{1,2}({\mathbb D}) \to L^2({\mathbb D}) has a bounded right inverse, so does \bar{\partial} + A_\epsilon for \epsilon > 0 sufficiently small: denote this right inverse by

T_\epsilon : L^2({\mathbb D}) \to W^{1,2}({\mathbb D}).

It should now at least seem plausible that any solution f_0 \in W^{1,2}({\mathbb D}) to the equation \bar{\partial} f_0 = 0 admits a W^{1,2}-close perturbation f_0 + f satisfying (\bar{\partial} + A_\epsilon) (f_0 + f) = 0: indeed, the latter is equivalent to the equation

(\bar{\partial} + A_\epsilon) f = - A_\epsilon f_0,

so an obvious solution presents itself in the form

f := - T_\epsilon (A_\epsilon f_0).
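As a sanity check, the following lines verify that this f really solves the equation and record the resulting estimate. The sketch assumes that f_0 is bounded on {\mathbb D}_\epsilon (automatic if \epsilon is strictly smaller than the radius on which f_0 is holomorphic) and writes \| T_\epsilon \| for the operator norm L^2 \to W^{1,2}.

```latex
\begin{align*}
(\bar{\partial} + A_\epsilon)(f_0 + f)
  &= \bar{\partial} f_0 + A_\epsilon f_0 - (\bar{\partial} + A_\epsilon) T_\epsilon (A_\epsilon f_0)
   = 0 + A_\epsilon f_0 - A_\epsilon f_0 = 0, \\
\| f \|_{W^{1,2}({\mathbb D})}
  &\le \| T_\epsilon \| \cdot \| A_\epsilon f_0 \|_{L^2({\mathbb D})}
   \le \| T_\epsilon \| \cdot \| A \|_{L^2({\mathbb D}_\epsilon)} \, \| f_0 \|_{C^0({\mathbb D}_\epsilon)}, \\
\| A \|_{L^2({\mathbb D}_\epsilon)}
  &\le (\pi \epsilon^2)^{\frac{1}{2} - \frac{1}{p}} \, \| A \|_{L^p({\mathbb D}_\epsilon)}
  \qquad \text{(Hölder, using } p > 2 \text{)}.
\end{align*}
```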

Since A_\epsilon f_0 is L^2-small for small \epsilon and T_\epsilon is close to T_0 (the right inverse of \bar{\partial}) in the operator norm, our solution f is evidently W^{1,2}-small. This is nice, of course, but it’s not good enough. We also need f to be continuous, and C^0-small. Is it?

A nice little fact about the right inverse of \bar{\partial}

As it turns out, yes: the solution we just found is continuous and C^0-small. This is easy to see if you don’t mind using Calderón-Zygmund, because A_\epsilon f_0 is also small in L^p and the right inverse of \bar{\partial} restricts to L^p \subset L^2 as a continuous operator L^p \to W^{1,p}, which is then continuous into C^0 by the Sobolev embedding theorem. But actually, this C^0-bound on f admits a much more direct proof that is orders of magnitude easier than either Calderón-Zygmund or the Sobolev embedding theorem.

Proposition. The standard Cauchy-Riemann operator \bar{\partial} : W^{1,2}({\mathbb D}) \to L^2({\mathbb D}) admits a bounded right inverse T : L^2({\mathbb D}) \to W^{1,2}({\mathbb D}) such that for each p \in (2,\infty), T restricts to L^p({\mathbb D}) \subset L^2({\mathbb D}) as a bounded linear operator L^p({\mathbb D}) \to C^0({\mathbb D}).

In fact, with a little bit more effort one can prove that T maps L^p continuously to the Hölder space C^{0,1-2/p}, but the C^0-bound will be plenty sufficient for our purposes. To see why it is true, let us quickly recall how T : L^2 \to W^{1,2} is constructed (cf. Section 2.6 in the current version of my book in progress on holomorphic curves). The Cauchy-Riemann operator has a fundamental solution K \in L^1_{\text{loc}}({\mathbb C}), defined by

K(z) = \frac{1}{2\pi z}.

Being a fundamental solution means that K satisfies \bar{\partial} K = \delta in the sense of distributions, so the equation \bar{\partial} u = f can be solved for sufficiently nice functions f by writing u as the convolution

u(z) = K * f(z) = \int_{\mathbb C} K(z - \zeta) f(\zeta) \, d\mu(\zeta),

where d \mu(\zeta) denotes the Lebesgue measure for functions of \zeta \in {\mathbb C}. This is well defined in particular whenever f is smooth with compact support in {\mathbb D}, and in this case one can prove a straightforward variation on Young’s inequality to bound \| K * f \|_{L^p({\mathbb D})} in terms of \| f \|_{L^p} for any p \ge 1, so f \mapsto K * f extends to a bounded linear map L^p({\mathbb D}) \to L^p({\mathbb D}). Since \bar{\partial}(K*f) = f, one obtains a W^{1,p}-bound on K * f if one can also bound \| \partial (K*f) \|_{L^p({\mathbb D})} in terms of \| f \|_{L^p} for all f \in C_0^\infty({\mathbb D}). This is always possible if p > 1, and in the p \ne 2 case this is the essence of the Calderón-Zygmund inequality. But for p = 2 there is an easy proof by Fourier transforms: observe that the Fourier transform of the relation \bar{\partial} K = \delta gives

2\pi i \zeta \widehat{K}(\zeta) = 1,

where \widehat{K}(\zeta) denotes the Fourier transform of K(z) in the sense of tempered distributions. Similarly, taking the Fourier transform of the equation u = K * f gives \hat{u} = \widehat{K} \hat{f}, hence 2\pi i \zeta \hat{u}(\zeta) = \hat{f}(\zeta), and we can now use Plancherel’s theorem to compute

\| \partial u \|_{L^2}^2 = \| \widehat{\partial u} \|_{L^2}^2 = \| 2\pi i \bar{\zeta} \hat{u} \|_{L^2}^2 = \int_{\mathbb C} \left| \frac{\bar{\zeta}}{\zeta} 2\pi i \zeta \hat{u}(\zeta) \right|^2 \, d\mu(\zeta) = \int_{\mathbb C} | \hat{f}(\zeta) |^2 \, d\mu(\zeta)

= \| f \|_{L^2}^2.
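In case it is not obvious why control of \partial u and \bar{\partial} u suffices for a bound on all first derivatives, here is the short computation. It assumes the conventions \bar{\partial} = \partial_s + i \partial_t and \partial = \partial_s - i \partial_t, which are the ones consistent with the Fourier multipliers 2\pi i \zeta and 2\pi i \bar{\zeta} used above.

```latex
\begin{align*}
\partial_s u &= \tfrac{1}{2} \big( \bar{\partial} u + \partial u \big), \qquad
\partial_t u = \tfrac{1}{2i} \big( \bar{\partial} u - \partial u \big), \\
\| \partial_s u \|_{L^2} &\le \tfrac{1}{2} \big( \| \bar{\partial} u \|_{L^2} + \| \partial u \|_{L^2} \big)
  = \tfrac{1}{2} \big( \| f \|_{L^2} + \| f \|_{L^2} \big) = \| f \|_{L^2}, \\
\| \partial_t u \|_{L^2} &\le \tfrac{1}{2} \big( \| \bar{\partial} u \|_{L^2} + \| \partial u \|_{L^2} \big)
  = \| f \|_{L^2}.
\end{align*}
```

These norms are taken over all of {\mathbb C}; restricting to {\mathbb D} can only make the left hand sides smaller.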

This proves that the map f \mapsto K*f extends to a bounded linear operator L^2({\mathbb D}) \to W^{1,2}({\mathbb D}), and we define T to be this extension. With this explicit formula in hand, the proof of the proposition is very quick: notice in particular that K \in L^q_{\text{loc}}({\mathbb C}) for every q \in [1,2), so if p > 2 and 1 / q + 1 / p = 1, then for every f \in C_0^\infty({\mathbb D}) and every z \in {\mathbb D},

| Tf(z) | = \left| \int_{\mathbb C} K(z-\zeta) f(\zeta) \, d\mu(\zeta) \right| \le \int_{\mathbb D} | K(z - \cdot) f | \le \| K(z - \cdot) \|_{L^q({\mathbb D})} \| f \|_{L^p({\mathbb D})}

\le C \| f \|_{L^p},

where C > 0 is the supremum of all L^q-norms of K restricted to disks of unit radius in {\mathbb C}. Since the convolution maps smooth functions to smooth functions and the L^\infty-closure of C^\infty is C^0, this proves the proposition.
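For readers who like explicit constants, the L^q-norm of K on a disk can be computed in polar coordinates. The radius \rho and the crude bound by the disk of radius 2 are choices made for this sketch; a sharper argument would use the unit disk centered at the origin.

```latex
\begin{align*}
\| K \|_{L^q({\mathbb D}_\rho)}^q
  &= \int_0^{2\pi} \! \int_0^\rho \frac{1}{(2\pi r)^q} \, r \, dr \, d\theta
   = (2\pi)^{1-q} \int_0^\rho r^{1-q} \, dr
   = \frac{(2\pi)^{1-q} \, \rho^{2-q}}{2-q} \qquad (1 \le q < 2), \\
\| K(z - \cdot) \|_{L^q({\mathbb D})}
  &\le \| K \|_{L^q({\mathbb D}_2)} \quad \text{for all } z \in {\mathbb D},
  \text{ since } z - \zeta \in {\mathbb D}_2 \text{ whenever } z, \zeta \in {\mathbb D}, \\
q = \frac{p}{p-1}
  &\ \Longrightarrow\ 2 - q = \frac{p-2}{p-1}, \quad \text{which is positive exactly when } p > 2.
\end{align*}
```

The integral of r^{1-q} diverges at the origin as soon as q \ge 2, which is the elementary reason the argument needs p > 2.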

There remains just one niggling detail: we’ve shown that T_0 := T maps L^p to C^0, but in our proof of Lemma 2, we need to know that this is also true for T_\epsilon, the right inverse of the perturbed operator \bar{\partial} + A_\epsilon. To see this, it will help to give a slightly more precise definition of T_\epsilon. Notice that

(\bar{\partial} + A_\epsilon) T = 1 + A_\epsilon T

is a bounded linear operator on L^2 and is close to the identity in the operator norm since T : L^2 \to W^{1,2} is continuous and A_\epsilon : W^{1,2} \to L^2 is small. But for slightly different reasons, this operator is also close to the identity in the space of bounded linear operators on L^p: indeed, we showed above that T : L^p \to C^0 is continuous, and A_\epsilon : C^0 \to L^p is also small since

\| A_\epsilon u \|_{L^p} \le \| A_\epsilon \|_{L^p} \| u \|_{C^0} = \| A \|_{L^p({\mathbb D}_\epsilon)} \| u \|_{C^0}.

Thus if we choose \epsilon > 0 small enough, 1 + A_\epsilon T defines an isomorphism on both L^2 and L^p, so defining

T_\epsilon := T (1 + A_\epsilon T)^{-1}

gives a right inverse of \bar{\partial} + A_\epsilon that is continuous both from L^2 to W^{1,2} and from L^p to C^0. The proof of Lemma 2 is now complete, and though we appealed to Calderón-Zygmund once or twice for intuition, we never actually used it.
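The two claims packed into the definition of T_\epsilon can be checked in a couple of lines. In this sketch \| \cdot \| denotes an operator norm on either L^2 or L^p, and the bound \| A_\epsilon T \| < 1 is the smallness assumption on \epsilon.

```latex
\begin{align*}
(\bar{\partial} + A_\epsilon) \, T_\epsilon
  &= (\bar{\partial} + A_\epsilon) \, T \, (1 + A_\epsilon T)^{-1}
   = (1 + A_\epsilon T)(1 + A_\epsilon T)^{-1} = 1, \\
(1 + A_\epsilon T)^{-1}
  &= \sum_{k=0}^\infty (- A_\epsilon T)^k, \qquad
  \big\| (1 + A_\epsilon T)^{-1} \big\| \le \frac{1}{1 - \| A_\epsilon T \|}
  \quad \text{if } \| A_\epsilon T \| < 1, \\
\| A_\epsilon T \|_{L^p \to L^p}
  &\le \| A \|_{L^p({\mathbb D}_\epsilon)} \cdot \| T \|_{L^p \to C^0}.
\end{align*}
```

Since the same Neumann series computes the inverse on L^2 and on L^p, the inverse on L^p is simply the restriction of the inverse on L^2, so T_\epsilon is unambiguously defined.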


This will be my last post on the L^p vs. L^2 debate for a while, as I’m sure it’s clear to everyone by now that I’ve been thinking about this far too much lately. The evidence currently available to me suggests that it might very well be possible to develop the entire theory of pseudoholomorphic curves using only L^2 estimates — useful perhaps if you want to feel honest without taking the time to read the proof of Calderón-Zygmund, or if you’re one of those strange people with an aversion to the axiom of choice (it’s needed for the Hahn-Banach theorem, which is needed for regularity and transversality arguments in W^{k,p} for p \ne 2, but you can avoid it if you only work with Hilbert spaces).

But just as proving the L^p estimates requires effort, avoiding them also requires effort, and some of the resulting proofs become arguably less straightforward and less elegant. In the end, it’s a matter of taste.


Bubbling without Calderón-Zygmund

A few weeks ago in L^p or not L^p, that is the question, I presented two examples of important technical lemmas about pseudoholomorphic curves that I couldn’t see how to prove without relying on the Calderón-Zygmund inequality, that is, the p > 2 case of the standard L^p elliptic estimates for the Cauchy-Riemann operator (which are hard to prove, whereas the p=2 case is an easy exercise in Stokes’ theorem). As it turns out, both of those results can be proved using only L^2 estimates; it was only lack of imagination — or perhaps lack of good analytical intuition — that prevented me from seeing how to do it. I’m not exactly sure where this leaves my opinion on the “L^2 vs. L^p” issue: one might still make an argument that many results which are provable without the use of p>2 estimates can be proved more easily and elegantly with them — at some stage the law of “conservation of effort” takes over. And it must also be said that the theory of pseudoholomorphic curves contains many technical arguments that I (and surely also most of the readers of this blog) have never delved into deeply enough to develop an opinion on precisely which estimates are absolutely necessary. In any case, I have learned something from the effort to understand alternative proofs of these two results, so I’m going to explain them: bubbling in this post, and the similarity principle in the next one.

Acknowledgement: the proofs I’m going to explain in this and the next post are heavily inspired by arguments that originated with Jean-Claude Sikorav and were communicated to me by Patrick Massot. I would like to thank both of them warmly for recent discussions that have been most illuminating.

An L^p-neutral regularity statement

Recall the fundamental engine behind bubbling arguments for holomorphic curves: we need to know that if u_k : ({\mathbb D},i) \to (M,J) is a sequence of J-holomorphic disks satisfying a uniform C^1-bound, then it has a subsequence that converges in C^\infty_{\text{loc}} in the interior of the disk. In the previous post, I mentioned that one can deduce this from a standard local regularity lemma saying that if J is a smooth almost complex structure on the unit ball B^{2n} in {\mathbb C}^n and u_k : ({\mathbb D},i) \to (B^{2n},J) are smooth J-holomorphic maps satisfying a uniform W^{1,p}-bound for some p > 2, then they also satisfy uniform bounds in W^{m,p}_{\text{loc}} for every m \ge 1. (A similar result holds with J and u_k assumed to have only finitely many derivatives, but let’s make our lives easier and leave that detail aside for now.) This is the way it is done in standard references such as McDuff-Salamon or the (unpublished but somewhat widely circulated) manuscript by Abbas and Hofer; the proofs in those two sources differ in several key details, but both make crucial use of the fact that W^{1,p} for p > 2 embeds continuously into C^0. A rather different approach to the same problem appears in Sikorav’s article in “the green book”, but it relies more on estimates of \bar{\partial} in Hölder spaces, which are also not so simple to prove. It is easy to see moreover that the corresponding statement about W^{1,2}-bounds is not true: in any closed symplectic manifold with a tame almost complex structure, the relation between symplectic area and “energy” implies that a sequence of closed J-holomorphic curves in a fixed homology class is always uniformly bounded in W^{1,2}, but it certainly is not true that every such sequence is compact.

Though it doesn’t appear in any of the references I just mentioned, most readers will probably not be too surprised to learn that one can prove a similar local regularity statement about curves bounded in W^{m,p}, where p \le 2 is allowed so long as m is large enough to ensure mp > 2 so that the usual nice consequences of the Sobolev embedding theorem hold. Of course m needs to be at least 2 in this case, so the following result doesn’t immediately seem to help with the “gradient bounds imply C^\infty-bounds” problem, but have patience, I’m only just getting started.

Proposition 1. Assume p \in (1,\infty) and m \in {\mathbb N} with mp > 2, J is a smooth almost complex structure on the 2n-dimensional unit ball B^{2n}, and u_k : ({\mathbb D},i) \to (B^{2n},J) is a sequence of smooth J-holomorphic maps satisfying a uniform bound \| u_k \|_{W^{m,p}} \le c. Then u_k also satisfies uniform W^{m+1,p}-bounds on every compact subset of the interior of {\mathbb D}.

Here is the plan for the rest of this post. I shall:

  • Explain a proof of Proposition 1, paying particular attention to where the hypothesis mp > 2 is used. The proof will rely on the elliptic estimate for the particular values of m \in {\mathbb N} and p \in (1,\infty) that we’ve chosen, that is,

\| f \|_{W^{m,p}} \le c \| \bar{\partial} f \|_{W^{m-1,p}} for all f \in W^{m,p}_0({\mathbb D}).

  • Adapt the same argument to prove a similar result for the case m=1 and p \le 2, under the additional hypothesis that u_k is bounded in C^1.

A rescaling trick

I always find the following bit of general intuition useful: the nonlinear Cauchy-Riemann equation is locally an arbitrarily small perturbation of a linear Cauchy-Riemann type equation. One way to make this precise is as follows.

If u : ({\mathbb D},i) \to (B^{2n},J) is J-holomorphic, then after a suitable change of coordinates, we can assume without loss of generality that J(0) = i and u(0) = 0. Assume now that we have, as in the statement of Proposition 1, a sequence of J-holomorphic maps u_k : ({\mathbb D},i) \to (B^{2n},J) satisfying a uniform W^{m,p}-bound with mp > 2. In order to establish a uniform W^{m+1,p}_{\text{loc}}-bound, it suffices to prove that every subsequence of u_k has a further subsequence that satisfies such a bound. We can thus make use of the Sobolev embedding theorem and start by extracting a subsequence that converges in C^0, so we lose no generality in assuming from the beginning that u_k \stackrel{C^0}{\to} u for some continuous map u : {\mathbb D} \to B^{2n} with u(0) = 0. It will then suffice to prove a bound on \| u_k \|_{W^{m+1,p}({\mathbb D}_r)}, where {\mathbb D}_r denotes the open disk in {\mathbb D} of some radius r < 1. Equivalently, given R \ge 1 and \epsilon \in (0,1), it suffices to prove that the maps

u_k' : {\mathbb D} \to {\mathbb C}^n : z \mapsto R u_k(\epsilon z)

satisfy a uniform bound in W^{m+1,p}({\mathbb D}_r) for some r > 0. These maps are J_R-holomorphic, where J_R is an almost complex structure defined on a suitably sized ball in {\mathbb C}^n by

J_R(p) = J(p / R),

thus since J(0) = i, we can assume J_R is arbitrarily C^\infty-close to i on B^{2n} by taking R sufficiently large. In order to make use of this, we should also choose \epsilon > 0 sufficiently small to make sure u_k'({\mathbb D}) \subset B^{2n}, and this is clearly possible since u_k is converging uniformly to a map u with u(0) = 0. But as we will see in the proof below, we actually need a bit more: working in W^{m,p} means that we will sometimes encounter terms of the form \| J_R(u'_k) - i \|_{W^{m,p}} that need to be bounded, and in fact, we’ll need to be able to assume they are small so that the behavior of our solutions does not differ too much from solutions to the linear equation \bar{\partial} u = 0. This means we’d like to be able to make \| u'_k \|_{W^{m,p}({\mathbb D})} arbitrarily small by choosing \epsilon sufficiently small. That sounds reasonable, but the details depend on the space we’re working in: for instance if f_k \in W^{m,p}({\mathbb D}) is a bounded sequence and we define f^\epsilon_k \in W^{m,p}({\mathbb D}) by f^\epsilon_k(z) := f_k(\epsilon z), then change of variables implies

\| D^m f_k^\epsilon \|_{L^p({\mathbb D})}^p = \epsilon^{mp-2} \| D^m f_k \|_{L^p({\mathbb D}_\epsilon)}^p \le \epsilon^{mp-2} \| D^m f_k \|_{L^p({\mathbb D})}^p,

so e.g. for p=2 and m=1, we can argue without too much trouble that the norms \| f_k^\epsilon \|_{W^{1,2}} are uniformly bounded, but there is no guarantee that they can be made small by making \epsilon smaller. Indeed, there is evidently no hope of this being true in general unless mp > 2. It turns out that this does always work when mp > 2:

Lemma. Suppose m \in {\mathbb N} and p \in (1,\infty), and associate to each f \in W^{m,p}({\mathbb D}) and \epsilon \in (0,1) the function f^\epsilon \in W^{m,p}({\mathbb D}) defined by f^\epsilon(z) := f(\epsilon z), where {\mathbb D} as usual denotes the open unit disk in {\mathbb C}. If mp > 2, then there exists a continuous function C(\epsilon) > 0 such that for every \ell = 1,\ldots,m,

\| D^\ell f^\epsilon \|_{L^p({\mathbb D})} \le C(\epsilon) \| f \|_{W^{m,p}({\mathbb D})},

and \lim_{\epsilon \to 0} C(\epsilon) = 0.

I suspect this lemma is considered standard in certain circles, but I’m hard pressed to provide a reference for it other than “page 82 of the notebook I was working in last week”. The required estimates for the derivatives of order less than m follow via some combination of the Sobolev embedding theorems and Hölder’s inequality.
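For the top-order derivative the computation is just the chain rule plus a change of variables, and it may be worth seeing where the exponent comes from. The substitution variable w = \epsilon z is notation for this sketch only, and the lower-order derivatives are not treated here.

```latex
\begin{align*}
D^m f^\epsilon(z) &= \epsilon^m \, (D^m f)(\epsilon z), \\
\int_{\mathbb D} | D^m f^\epsilon(z) |^p \, d\mu(z)
  &= \epsilon^{mp} \int_{\mathbb D} | (D^m f)(\epsilon z) |^p \, d\mu(z)
   = \epsilon^{mp - 2} \int_{{\mathbb D}_\epsilon} | D^m f(w) |^p \, d\mu(w)
   \qquad (w = \epsilon z, \ d\mu(w) = \epsilon^2 \, d\mu(z)), \\
\| D^m f^\epsilon \|_{L^p({\mathbb D})}
  &\le \epsilon^{m - 2/p} \, \| D^m f \|_{L^p({\mathbb D})}.
\end{align*}
```

So for \ell = m one can take C(\epsilon) = \epsilon^{m - 2/p}, which tends to zero precisely when mp > 2. For example, m = 1 and p = 2 gives the exponent 0 (no decay), while m = 2 and p = 2 gives a factor of \epsilon.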

The main message of this discussion is that when mp > 2, we are free to assume without loss of generality that J is C^\infty-close to the standard complex structure i on B^{2n}, while our W^{m,p}-bounded curves u_k : {\mathbb D} \to B^{2n} are also W^{m,p}-close to 0. This implies in particular that \| J \circ u_k - i \|_{W^{m,p}({\mathbb D})} and some related quantities may be assumed arbitrarily small, and we will make use of this assumption in the proof below.
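Two facts were used without comment in the rescaling trick: the rescaled maps are J_R-holomorphic, and J_R approaches i as R grows. Both follow from the chain rule, as the following sketch shows. The point of the ball is written x here (rather than p) to avoid a clash with the Sobolev exponent, and \ell \ge 1 is an arbitrary order of differentiation.

```latex
\begin{align*}
\partial_s u_k'(z) &= R\epsilon \, (\partial_s u_k)(\epsilon z), \qquad
\partial_t u_k'(z) = R\epsilon \, (\partial_t u_k)(\epsilon z), \qquad
J_R\big( u_k'(z) \big) = J\big( u_k(\epsilon z) \big), \\
\partial_s u_k' + J_R(u_k') \, \partial_t u_k'
  &= R\epsilon \, \big[ \partial_s u_k + J(u_k) \, \partial_t u_k \big](\epsilon z) = 0, \\
\| J_R - i \|_{C^0(B^{2n})} &= \sup_{|x| < 1/R} | J(x) - J(0) | \longrightarrow 0
  \quad \text{as } R \to \infty, \\
D^\ell J_R(x) &= R^{-\ell} \, (D^\ell J)(x / R)
  \ \Longrightarrow\ \| D^\ell J_R \|_{C^0(B^{2n})} \le R^{-\ell} \, \| D^\ell J \|_{C^0(B^{2n})} \longrightarrow 0.
\end{align*}
```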

Proof of Proposition 1

Assume as in the discussion above that u_k converges uniformly to some u \in C^0({\mathbb D},B^{2n}) with u(0) = 0, and J(0) = i. Let us rewrite the nonlinear Cauchy-Riemann equation \partial_s u_k + J(u_k) \partial_t u_k = 0 in the form

\bar{\partial} u_k - Q(u_k) \partial_t u_k = 0,

where Q := i - J is assumed (via the rescaling trick described above) to be C^\infty-small. Choose a smooth bump function \beta : {\mathbb D} \to [0,1] that has compact support in the interior of the disk and is identically equal to 1 on {\mathbb D}_r for some r < 1. Given bounds on \| u_k \|_{W^{m,p}({\mathbb D})}, we will then try to derive a bound on \| u_k \|_{W^{m+1,p}({\mathbb D}_r)} by bounding \| \beta \partial_j u_k \|_{W^{m,p}({\mathbb D})} for each of the partial derivatives \partial_j, j=1,2. Since \beta \partial_j u_k \in W^{m,p}_0({\mathbb D}), the standard elliptic estimate gives

\| \beta \partial_j u_k \|_{W^{m,p}} \le c \| \bar{\partial} (\beta \partial_j u_k) \|_{W^{m-1,p}}

for some c > 0, so we need to understand \bar{\partial} (\beta \partial_j u_k). For this, we can apply \partial_j to the equation that u_k satisfies, giving

\bar{\partial} \partial_j u_k - Q(u_k) \partial_t \partial_j u_k = DQ(u_k)(\partial_j u_k, \partial_t u_k),

hence, after multiplying by \beta and applying the Leibniz rule to \bar{\partial}(\beta \partial_j u_k) and \partial_t(\beta \partial_j u_k),

\bar{\partial} (\beta \partial_j u_k) - Q(u_k) \partial_t (\beta \partial_j u_k) = \beta DQ(u_k)(\partial_j u_k, \partial_t u_k) + (\bar{\partial}\beta) \partial_j u_k - Q(u_k) (\partial_t \beta) \partial_j u_k

and thus

\| \bar{\partial} (\beta \partial_j u_k) \|_{W^{m-1,p}} \le \| Q(u_k) \partial_t(\beta \partial_j u_k) \|_{W^{m-1,p}} + \| DQ(u_k)(\beta \partial_j u_k, \partial_t u_k) \|_{W^{m-1,p}} + \| (\bar{\partial}\beta) \partial_j u_k \|_{W^{m-1,p}} + \| Q(u_k) (\partial_t \beta) \partial_j u_k \|_{W^{m-1,p}}.

Let’s see if we can bound each of the terms on the right hand side of this uniformly in k. The last two terms are easy since \| u_k \|_{W^{m,p}} is bounded uniformly, thus so are \| \partial_j u_k \|_{W^{m-1,p}} and \| Q \circ u_k \|_{W^{m,p}}; the latter observation uses the assumption mp > 2, and we need to use it again to handle the product Q(u_k) (\partial_t \beta) \partial_j u_k, namely the fact that there is a continuous pairing

W^{m,p} \times W^{m-1,p} \to W^{m-1,p} : (f,g) \mapsto fg.

For the first term, we can write

\| Q(u_k) \partial_t(\beta \partial_j u_k) \|_{W^{m-1,p}} \le c \| Q \circ u_k \|_{W^{m,p}} \| \partial_t(\beta \partial_j u_k) \|_{W^{m-1,p}} \le c \| Q \circ u_k \|_{W^{m,p}} \| \beta \partial_j u_k \|_{W^{m,p}}

for some universal constant c > 0, where we’ve again used the continuous product pairing. For the second term, we can use this continuous pairing in a similar manner and write

\| DQ(u_k)(\beta \partial_j u_k, \partial_t u_k) \|_{W^{m-1,p}} \le c \| DQ \circ u_k \|_{W^{m,p}} \| \beta \partial_j u_k \|_{W^{m,p}} \| \partial_t u_k \|_{W^{m-1,p}} \le c \| DQ \circ u_k \|_{W^{m,p}} \| u_k \|_{W^{m,p}} \| \beta \partial_j u_k \|_{W^{m,p}}

where c > 0 is again a universal constant. Putting all these estimates together, we’ve shown that

\| \beta \partial_j u_k \|_{W^{m,p}} \le C + C' \left( \| Q \circ u_k \|_{W^{m,p}} + \| DQ \circ u_k \|_{W^{m,p}} \right) \| \beta \partial_j u_k \|_{W^{m,p}},

where C, C' > 0 are constants arising from various Sobolev estimates and the elliptic estimate \| f \|_{W^{m,p}} \le c \| \bar{\partial} f \|_{W^{m-1,p}}, but they do not depend on any choices we’ve made. Thus after using the rescaling trick to make \| Q \circ u_k \|_{W^{m,p}} and \| DQ \circ u_k \|_{W^{m,p}} sufficiently small for all k, we can move the \| \beta \partial_j u_k \|_{W^{m,p}} term from the right to the left hand side and this inequality implies the desired uniform bound on \| \beta \partial_j u_k \|_{W^{m,p}}.
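The absorption step at the end can be spelled out as follows. The abbreviations X_k and \eta_k are introduced only for this sketch, and the threshold 1/(2C') is one convenient choice of what "sufficiently small" means.

```latex
\begin{align*}
X_k &:= \| \beta \partial_j u_k \|_{W^{m,p}}, \qquad
\eta_k := \| Q \circ u_k \|_{W^{m,p}} + \| DQ \circ u_k \|_{W^{m,p}}, \\
X_k &\le C + C' \eta_k X_k
  \quad \text{and} \quad \eta_k \le \frac{1}{2C'}
  \ \Longrightarrow\ \tfrac{1}{2} X_k \le C
  \ \Longrightarrow\ X_k \le 2C, \\
\| \partial_j u_k \|_{W^{m,p}({\mathbb D}_r)} &\le X_k \le 2C
  \qquad \text{since } \beta \equiv 1 \text{ on } {\mathbb D}_r.
\end{align*}
```

Subtracting C' \eta_k X_k from both sides is legitimate because X_k is finite for each k: the maps u_k are smooth and \beta has compact support in the interior of the disk.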

The case m=1, p \le 2

If m=1 and 1 < p \le 2, then the proof above fails already at the first step, where we extracted a C^0-convergent subsequence via the Sobolev embedding theorem. It then fails again at the step where we have to bound \| Q(u_k) (\partial_t \beta) \partial_j u_k \|_{L^p}, \| Q(u_k) \partial_t (\beta \partial_j u_k) \|_{L^p} and \| DQ(u_k)(\beta \partial_j u_k, \partial_t u_k) \|_{L^p}. One problem here is that if p \le 2 then there is no continuous product pairing W^{1,p} \times L^p \to L^p; even worse, Q \circ u_k and DQ \circ u_k might not even be bounded in W^{1,p}, and if they are, then they cannot be assumed to be small since the rescaling trick fails for mp \le 2. None of this should be surprising, as we know Proposition 1 cannot hold in this case: uniform W^{1,p}-bounds for p \le 2 do not imply anything useful about J-holomorphic curves in general.

Notice however that the above problems all disappear if we are willing to strengthen our hypothesis and assume a bound on \| u_k \|_{C^1}. Indeed, the C^0-convergent subsequence now comes from Arzelà-Ascoli, and we have

\| Q(u_k) (\partial_t \beta) \partial_j u_k \|_{L^p} \le \| Q \|_{C^0} \| \beta \|_{C^1} \| u_k \|_{W^{1,p}} \le c\| Q \|_{C^0} \| \beta \|_{C^1} \| u_k \|_{C^1},

\| DQ(u_k)(\beta \partial_j u_k, \partial_t u_k) \|_{L^p} \le \| Q \|_{C^1} \| u_k \|_{C^1} \| u_k \|_{W^{1,p}} \le c\| Q \|_{C^1} \| u_k \|_{C^1}^2,


\| Q(u_k) \partial_t (\beta \partial_j u_k) \|_{L^p} \le \| Q \|_{C^0} \| \partial_t (\beta \partial_j u_k) \|_{L^p} \le \| Q \|_{C^0} \| \beta \partial_j u_k \|_{W^{1,p}},

where the constant \| Q \|_{C^0} in front of \| \beta \partial_j u_k \|_{W^{1,p}} in this last estimate can again be assumed arbitrarily small by a rescaling trick (there is no need to assume \| u_k \|_{W^{1,p}} is small). We’ve thus proved:

Proposition 2. Assume p \in (1,\infty), J is a smooth almost complex structure on the 2n-dimensional unit ball B^{2n}, and u_k : ({\mathbb D},i) \to (B^{2n},J) is a sequence of smooth J-holomorphic maps satisfying a uniform bound \| u_k \|_{C^1} \le c. Then u_k also satisfies uniform W^{2,p}-bounds on every compact subset of the interior of {\mathbb D}.
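To see how the three estimates combine into Proposition 2, here is the absorption argument in the m = 1 case. The symbols X_k, B_k and c_0 (the constant in the elliptic estimate for m = 1) are notation for this sketch, the constant c absorbs the area of the disk and the choice of norm conventions, and the term involving \bar{\partial}\beta is bounded in the same way as the one involving \partial_t \beta.

```latex
\begin{align*}
X_k &:= \| \beta \partial_j u_k \|_{W^{1,p}}, \qquad
B_k := c \, \| Q \|_{C^1} \| u_k \|_{C^1}^2
     + c \, \big( 1 + \| Q \|_{C^0} \big) \| \beta \|_{C^1} \| u_k \|_{C^1}, \\
X_k &\le c_0 \, \| \bar{\partial}(\beta \partial_j u_k) \|_{L^p}
  \le c_0 \, \| Q \|_{C^0} \, X_k + c_0 B_k, \\
c_0 \, \| Q \|_{C^0} \le \tfrac{1}{2}
  &\ \Longrightarrow\ X_k \le 2 c_0 B_k.
\end{align*}
```

Only \| Q \|_{C^0} needs to be small here, and B_k is bounded uniformly in k by the C^1-bound on u_k for the fixed rescaling parameters.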

The proof again relies only on the L^p elliptic estimates for the chosen value of p, thus if we insist on having a Calderón-Zygmund-free proof that “gradient bounds imply C^\infty-bounds”, we can now do it as follows:

  1. Use Proposition 2 to prove that any C^1-bounded sequence of J-holomorphic disks is also bounded in W^{2,2} on compact subsets;
  2. Show inductively via Proposition 1 that the sequence is also bounded in W^{m,2} on compact subsets for every integer m \ge 2. By the Sobolev embedding theorem, this implies bounds in C^\infty_{\text{loc}}.
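Schematically, the two steps chain together as follows. The radii r_1 > r_2 > \ldots are auxiliary choices for this sketch: for a given target order one picks finitely many of them between 1 and the radius r of interest, and each application of a proposition on a smaller disk tacitly reparametrizes that disk as the unit disk, which preserves J-holomorphicity and changes the bounds only by fixed factors.

```latex
\begin{align*}
\| u_k \|_{C^1({\mathbb D})} \le c
  &\ \Longrightarrow\ \| u_k \|_{W^{2,2}({\mathbb D}_{r_1})} \le c_2
  && \text{(Proposition 2 with } p = 2 \text{)} \\
  &\ \Longrightarrow\ \| u_k \|_{W^{3,2}({\mathbb D}_{r_2})} \le c_3
  && \text{(Proposition 1 with } m = 2, \ p = 2, \ mp = 4 > 2 \text{)} \\
  &\ \Longrightarrow\ \| u_k \|_{W^{4,2}({\mathbb D}_{r_3})} \le c_4
  && \text{(Proposition 1 with } m = 3, \ p = 2 \text{)} \\
  &\ \Longrightarrow\ \cdots \\
W^{m,2}({\mathbb D}_r) &\hookrightarrow C^{m-2}(\overline{{\mathbb D}}_r)
  \quad \text{for } m \ge 2
  && \text{(Sobolev embedding in dimension 2)}.
\end{align*}
```

Every step uses only the p = 2 elliptic estimate, which is the point of the exercise.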

At some point soon I will be updating my book-in-progress on holomorphic curves to include the above proof and its analogue showing (via difference quotients) that J-holomorphic curves of class W^{m,p} are smooth whenever mp > 2; this proof is both more general and a bit nicer than the one currently in the manuscript. But it may be a couple of months before I find the time to do that.

I’ll talk about the similarity principle in the next post.
