New SFT lecture notes / book on the arXiv

Dear readers (assuming I still have any), I would like to draw your attention to something that I uploaded to the arXiv this week:

Lectures on Symplectic Field Theory

This is the expanded version of some lecture notes I wrote for a 2-term graduate course on SFT that I taught last year in London, and it is due to be published as a book in the next year. I would be grateful for any useful comments or corrections that readers may choose to send my way before it goes to press!

The book is mainly intended to cover what I regard as the “basics” of SFT at a level suitable for PhD students and researchers from other fields. But I also used the opportunity to add a few things to the literature that haven’t appeared in writing before, at least not quite in this form, such as:

  • Lecture 3 includes a self-contained explanation and proof of the fact that between any two (trivialized) asymptotic operators associated to nondegenerate Reeb orbits there is a well-defined spectral flow, and related facts involving the Conley-Zehnder index and winding numbers of eigenfunctions. By “self-contained,” I mainly mean that you don’t need to read Kato or understand a lot about the spectral theory of unbounded self-adjoint operators before reading this; you just need to know the basic facts about Fredholm operators.
  • Lecture 5 implements a novel approach that was suggested 20 years ago by Taubes for proving the Riemann-Roch formula and its generalization to surfaces with cylindrical ends. The standard reference for the latter has traditionally been Schwarz’s thesis, which does it by using a linear gluing argument to break up arbitrary surfaces into simpler pieces on which the index can be computed explicitly. (This is analogous to the proof that McDuff and Salamon give for compact surfaces with boundary.) Taubes’s alternative approach is unusual in the symplectic context but familiar from gauge theory: the idea is to use a simple Weitzenböck-type formula for Cauchy-Riemann operators to show that if you deform the operator by a sufficiently large zeroth-order term that is complex antilinear, it forces sections in the kernel and cokernel to concentrate around the zero-set of the perturbation. The index calculation then reduces to a signed count of zeroes of the perturbation, in other words, a relative first Chern number of the appropriate vector bundle. This idea was sketched in a 2-page section that Taubes labeled a “non-sequitur” at the end of his paper defining the Gromov invariant of symplectic 4-manifolds; Lecture 5 works out the details.
  • Lecture 8 contains what is meant to be a definitive proof of the standard theorem about transversality for somewhere injective holomorphic curves u : \dot{\Sigma} \to {\mathbb R} \times M with generic {\mathbb R}-invariant almost complex structures on symplectizations (originally due to Dragnev), together with the requisite lemmata on injective points of the projection \dot{\Sigma} \to M and the nonvanishing of \pi_\xi \circ du \in \text{Hom}(T\dot{\Sigma},u^*\xi). As I’ve discussed before on this blog, this fundamental result has been badly understood for a long time. The proof in Lecture 8 is essentially the one I sketched in my earlier blog post Some good news about the forgetful map in SFT, which is a generalization of an argument by Bourgeois.
  • Lecture 10 illustrates one of the simplest nontrivial applications of SFT by constructing a rigorous version of cylindrical contact homology in certain specialized settings in order to distinguish the various tight contact structures on the 3-torus. It’s fair to say that the outlines of this argument are standard, e.g. you can read a succinct account of it in Bourgeois’s lecture notes on contact homology from 2003; however, the latter uses the Morse-Bott methods from Bourgeois’s thesis, and since I didn’t want to devote a whole lecture to Morse-Bott methods in the class, I ended up doing something different.(*) The standard approach to proving uniqueness of the relevant holomorphic cylinders for nondegenerate asymptotic data would be by viewing that data as a small perturbation of Morse-Bott data, for which the required uniqueness result is more or less obvious. Instead of that, I view the nondegenerate contact data as a small perturbation of a nondegenerate but integrable stable Hamiltonian structure, for which the cylinders in question can be regarded as solutions to the Floer equation and standard results from Hamiltonian Floer homology can be applied. This argument isn’t completely original either—I have some vague memory of reading something similar in Eliashberg-Kim-Polterovich several years ago—but it’s a nice example of the practical applicability of stable Hamiltonian structures and might be quite useful in other contexts, so I wanted to give it some attention.
  • Some readers might be grateful for Appendix B, which proves the basic properties of Floer’s C_\varepsilon space, e.g. that it is a separable Banach space which contains enough bump functions to prove nice transversality results via the Sard-Smale theorem. Funny story: the reason I ended up writing this appendix is that in the original version of these notes, I made at least one catastrophic technical error in my presentation of spectral flow in Lecture 3. I doubt whether anyone who read that version of the notes noticed, as the error was hidden behind a lemma that I stated without proof because the proof would have required the Sard-Smale theorem (which we only covered four lectures later). Long story short, last summer I finally sat down to write up a proof of that lemma and found out that it was wrong, it couldn’t possibly work the way I’d envisioned it, because I was trying to apply the Sard-Smale theorem with a Banach space of perturbations that was not separable.(**) I wouldn’t have noticed if I hadn’t started asking myself fundamental questions like “why is the C_\varepsilon space separable anyway?”, but I did, and in the effort to answer them, I both revised Lecture 3 and wrote Appendix B.

Unlike most of the existing references, I also made an effort in these notes to include general stable Hamiltonian structures in the setting of all technical results whenever possible. This is not always possible—the result on transversality in symplectizations for instance requires some extra condition on either the hyperplane distribution \xi \subset TM or the curve u : \dot{\Sigma} \to {\mathbb R} \times M in order to exclude certain pathological counterexamples. It is also not strictly necessary if all you want to do is set up SFT as a framework for invariants of contact manifolds; assuming of course that the usual trouble with transversality for multiple covers can be solved somehow, it should suffice for that purpose to work with “contact-type” stable Hamiltonian structures. But of course we don’t just want to define the theory, we’d also like to be able to compute it, and e.g. for the computations on {\mathbb T}^3 in Lecture 10 and in many other situations that have arisen in my own research, it proves extremely useful to be able to relate the usual contact data to something more general and exploit the technical apparatus in that more general setting. In the effort to present things this way, I found out that a lot of basic lemmas that many of us have been taking for granted for years were actually never proved in their properly general setting, and the proofs sometimes require slightly new ideas. This ended up making the first half of Lecture 9 in particular (on asymptotic results) a lot longer than I expected it to be.

Speaking of Lecture 9, the second half of it discusses the SFT compactness theorem, and I tried to illustrate the main ideas behind the proof but made no attempt to make this discussion complete or self-contained. I did not want to end up writing a whole book about the SFT compactness theorem, and anyway, such a book already exists.

Another thing that is not included in the uploaded draft is… well, the entirety of Lectures 14, 15 and 16. There are two practical reasons for this: (1) it’s pretty easy to convince a publisher to let you keep the manuscript of your book freely available online in perpetuity if you tell them you won’t include the last three chapters. More immediately, (2) I haven’t typed them up yet. But they will appear in the published version, so I’ll post news about that as soon as there is news to post (sometime in the next year).

(*) The wisdom of not discussing Morse-Bott methods is of course highly debatable, as Morse-Bott methods are indisputably useful. The truth is: I had one lecture left before the Christmas vacation and I wanted to use it for proving a big theorem, not just more technical lemmas. That’s why the 3-torus discussion is in Lecture 10 and not Lecture 11.

(**) We all learned at some point that the Banach space \ell^\infty of bounded sequences is not separable. I was never impressed by this since I’ve never seen \ell^\infty arise naturally in any problem I cared about. But here’s another Banach space that isn’t separable: the space {\mathcal L}(H) of bounded linear maps on a Hilbert space H. It doesn’t matter if H itself is separable, because {\mathcal L}(H) contains an isometric embedding of \ell^\infty. This means you have to be extremely careful in any discussion of “generic families of operators H \to H,” as the Sard-Smale theorem doesn’t apply, so e.g. you can’t assume that a smooth Fredholm map defined on {\mathcal L}(H) or (I mention this next example for no reason at all) C_\varepsilon([0,1],{\mathcal L}(H)) has an abundance of regular values. As my former PhD advisor would say: shit happens.
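The embedding of \ell^\infty mentioned above can be written down explicitly. What follows is a minimal sketch, assuming H is infinite-dimensional and separable with a fixed orthonormal basis (e_n); the symbols D_a and a^A are notation introduced here for illustration only.

```latex
% Diagonal operators give an isometric embedding of \ell^\infty into \mathcal{L}(H).
\[
  \ell^\infty \to \mathcal{L}(H) : a = (a_n)_{n \in \mathbb{N}} \mapsto D_a,
  \qquad D_a e_n := a_n e_n,
  \qquad \| D_a \| = \sup_{n} |a_n| = \| a \|_{\ell^\infty}.
\]
% Non-separability: for a subset A \subset \mathbb{N}, let a^A denote its indicator sequence.
\[
  \| D_{a^A} - D_{a^B} \| = \| a^A - a^B \|_{\ell^\infty} = 1
  \quad \text{whenever } A \neq B.
\]
% The subsets of \mathbb{N} thus give an uncountable family of operators at
% pairwise distance 1, so no countable subset of \mathcal{L}(H) can be dense.
```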



Some personal news (and two postdoc positions)

This blog has been sadly neglected in recent months, and there are several posts that I’ve been meaning to write when I find more free time. But rather than writing any of those right now, I would like to make two announcements:

  1. I will be leaving my current job at UCL in Spring/Summer of this year to start a new one as Professor for Differential Geometry and Global Analysis at Humboldt University in Berlin. Relatedly:
  2. I have two postdoc positions to offer.

Here are some details on the second point. The first thing I should make clear is that if you cannot either speak German or imagine yourself learning enough of it to teach problem classes in German after a year, then you probably shouldn’t get your hopes up. I’ve had a couple of inquiries already from people who don’t know any German — what I say in these cases is that if you’re a sufficiently good fit and are enthusiastic about coming to Berlin and learning enough German to teach in your second year, then we may be able to find enough English-language teaching duties to get you through the first year, though I can’t make any promises. If you’re not too discouraged yet, then read on.

The two jobs are Wissenschaftlicher Mitarbeiter positions attached to my new research group in symplectic topology at the HU, with fixed-term contracts for 5 or 6 years. Both have a start date of August 1, or as soon as possible thereafter, and the application deadline for both is March 29. Both also have light teaching duties, which can vary along a spectrum between teaching one 90-minute problem class (Übung) per week and teaching lecture courses on geometry or topology for undergraduates, or more specialized courses for Master’s students.

There is a slight difference between the two positions:

  • Position 1 (reference number AN/026/16) is a 2/3-position for up to 6 years, which is the standard type of postdoc position at the HU. (Being 2/3 means it is paid a bit less than a “full” position, but this also makes the teaching duties lighter, and unlike the place I’m moving from, living costs in Berlin are still affordable.)
  • Position 2 (reference number AN/025/16) is a “full” position for up to 5 years, so the pay is higher than a 2/3 position, and the teaching load is also slightly higher (though a lot less than a regular faculty position). In theory, this one is intended for someone at a slightly more advanced level in their career, though we can be flexible on this detail.

I would encourage anyone who thinks they might be suitable for either of these positions to submit identical applications for both. Below are English versions of the official job adverts, including links to the (legally binding) German versions.

Update 29/2/2016: Since I’ve been asked about this, I should clarify that sending the application materials in English is perfectly fine, even though the official job adverts are in German.


Position 1 (2/3 part time, fixed-term for up to 6 years, reference number AN/026/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/026/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:


Position 2 (full time, fixed-term for up to 5 years, reference number AN/025/16):

Job description: Research and teaching in the research group for differential geometry and global analysis, in particular in the field of symplectic topology; collaboration on research projects

Requirements: PhD in mathematics; expertise in symplectic or contact topology, preferably with an established track record of research in these fields; ability to teach in German is also required

Applications (including CV, publications list, description of current and planned research, details of previous teaching experience, and at least two letters of recommendation) should be sent electronically under reference number AN/025/16 to Prof. Wendl at the Institute for Mathematics, Faculty of Mathematics and Natural Sciences, Humboldt-Universität zu Berlin,

Application deadline: March 29, 2016

Please visit our website for access to the legally binding German version:


Signs (or how to annoy a symplectic topologist)

In this post, I will finally address that most pressing question of our times:

Wrong Way - Do Not Enter

For ****’s sake, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

My original motivation to write this post was actually a slightly different sign question, which I hadn’t realized until recently is quite closely related to this one. If you’ve seen me at any conferences recently, you may already be able to guess what I’m referring to, because I’ve developed a habit of interrupting other people’s talks to make the following seemingly frivolous observation:

The symplectization of M is {\mathbb R} \times M, not M \times {\mathbb R}.

I don’t mean to sound dogmatic, and I don’t bring up this point just to annoy people — I bring it up because most people don’t realize there is an issue here that goes beyond a matter of arbitrary convention, and as anyone who’s ever tried to understand Floer-type theories with something other than {\mathbb Z}_2-coefficients will tell you, being careless with orientations can easily lead to trouble. This kind of trouble:

cartoon about sign errors

© Bob Krohmer

Still with me? Good.

So is it R times M or M times R?

The issue with the symplectization of a contact manifold is very simple. In the literature it appears most often in the following form: assume (M,\xi) is a contact manifold with contact form \alpha. The symplectization of (M,\xi) can then be described as a manifold diffeomorphic to {\mathbb R} \times M with an exact symplectic form \omega = d(e^t\alpha); or sometimes one presents the symplectization instead as (0,\infty) \times M with \omega = d(t\alpha), which is fine since these two constructions are obviously symplectomorphic. What is not fine, but is nonetheless often done in papers by quite prominent authors (including the paper that gave this blog its name), is to write \omega in one of the above forms but write the manifold itself as M \times {\mathbb R} or M \times (0,\infty). Why isn’t this fine? Well, I assume we can all agree on the following:

  1. If X and Y are two oriented manifolds, then X \times Y inherits a canonical orientation, with a positively oriented local coordinate system given by (x_1,\ldots,x_m,y_1,\ldots,y_n) for any choice of positively oriented local coordinates (x_1,\ldots,x_m) on X and (y_1,\ldots,y_n) on Y.
  2. Any symplectic manifold (W,\omega) carries a canonical orientation, namely the one defined by the volume form \omega \wedge \ldots \wedge \omega.
  3. Any contact manifold (M,\xi) with a co-oriented contact structure also carries a canonical orientation, namely the one defined by \alpha \wedge d\alpha \wedge \ldots \wedge d\alpha for any choice of contact form \alpha compatible with the co-orientation of \xi.

The second point is the reason why, for example, it’s important in symplectic topology to make the distinction between {\mathbb C} P^2 (carrying its natural orientation as a complex manifold) and \overline{{\mathbb C} P}^2, the same manifold with reversed orientation. The first admits a symplectic structure but the second does not, since we know from de Rham cohomology that there is no closed 2-form \omega on \overline{{\mathbb C} P}^2 satisfying \omega \wedge \omega > 0.
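For completeness, here is a sketch of that de Rham argument. It assumes only the standard fact that H^2_{dR}({\mathbb C} P^2) is one-dimensional, generated by a class h with \int_{{\mathbb C} P^2} h \wedge h = 1; the names h and c are introduced here for illustration.

```latex
% Any closed 2-form \omega on the underlying manifold satisfies [\omega] = c h for some real c.
% Reversing the orientation flips the sign of every top-degree integral, so
\[
  \int_{\overline{\mathbb{C}P}^2} \omega \wedge \omega
  = - \int_{\mathbb{C}P^2} \omega \wedge \omega
  = - c^2 \int_{\mathbb{C}P^2} h \wedge h
  = - c^2 \le 0,
\]
% whereas \omega \wedge \omega > 0 on \overline{\mathbb{C}P}^2 would force this integral to be positive.
```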

Similarly, if (M,\xi) is a co-oriented contact manifold, then M inherits a canonical orientation and therefore so does {\mathbb R} \times M — and it is easy to check that the latter matches the canonical orientation determined by the symplectic form d(e^t\alpha).  But since {\mathbb R} and M are both odd-dimensional, M \times {\mathbb R} has the opposite orientation. The wrong one. There is no arbitrary convention involved.
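The "easy to check" claim can be spelled out as follows. This is a sketch under the assumption that \dim M = 2n-1, so that {\mathbb R} \times M has dimension 2n and the canonical orientation of M is given by \alpha \wedge (d\alpha)^{n-1}.

```latex
% Top power of the symplectic form on R x M:
\[
  \omega = d(e^t\alpha) = e^t \left( dt \wedge \alpha + d\alpha \right),
  \qquad
  \omega^n = n \, e^{nt} \, dt \wedge \alpha \wedge (d\alpha)^{n-1} .
\]
% Here (dt \wedge \alpha)^2 = 0, and (d\alpha)^n = 0 because it is a 2n-form
% pulled back from the (2n-1)-dimensional manifold M.
% So \omega^n is a positive multiple of dt \wedge \alpha \wedge (d\alpha)^{n-1},
% which is the product orientation of R x M.
% Moving the 1-form dt past the (2n-1)-form \alpha \wedge (d\alpha)^{n-1} costs a sign:
\[
  \alpha \wedge (d\alpha)^{n-1} \wedge dt
  = (-1)^{2n-1} \, dt \wedge \alpha \wedge (d\alpha)^{n-1}
  = - \, dt \wedge \alpha \wedge (d\alpha)^{n-1} ,
\]
% so the product orientation of M x R is opposite to the one defined by \omega^n.
```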

How well do you know the cotangent bundle, really?

If I were really so dogmatic, I would tell you that that’s the end of the story, but of course it isn’t. I did make one choice in the above discussion: I chose to write the symplectic form on the symplectization as d(e^t\alpha) or d(t\alpha). These two conventions are indeed equivalent, and up to the choice of writing the {\mathbb R}-coordinate as t and the contact form as \alpha, every paper I’m aware of in the symplectic/contact literature follows one of these two conventions. (Please feel free to point out exceptions in the comments, if you know any!) However, these are not the only two conventions that might conceivably make sense: one could reasonably choose to write the symplectic form differently, and depending on this choice, one might be forced to write M \times {\mathbb R} instead of {\mathbb R} \times M. Let me explain.

We need to recall quickly why the symplectization of (M,\xi) — despite appearances to the contrary in the formula \omega = d(e^t\alpha) — doesn’t actually depend on the choice of contact form. The canonical definition of the symplectization is as a particular symplectic submanifold of T^*M, namely

S_\xi M \subset T^*M

is the submanifold consisting of all \lambda \in T^*M such that \lambda|_\xi = 0 and \lambda > 0 in the direction positively transverse to \xi. In other words, S_\xi M is a fiber bundle over M whose sections are precisely the contact forms for \xi. A choice of contact form \alpha trivializes this fiber bundle and defines a diffeomorphism

{\mathbb R} \times M \to S_\xi M : (t,q) \mapsto e^t \alpha_q,

such that the canonical 1-form on T^*M pulls back to {\mathbb R} \times M as e^t \alpha. The contact condition for \xi thus turns out to be equivalent to the condition that S_\xi M is a symplectic submanifold of T^*M, and according to a standard convention, the above diffeomorphism identifies S_\xi M with ({\mathbb R} \times M,d(e^t\alpha)).
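The pullback claim is a one-line computation from the definition of the canonical 1-form. In the sketch below, \pi : T^*M \to M is the bundle projection, \Phi denotes the diffeomorphism displayed above, and \text{pr}_M : {\mathbb R} \times M \to M is the projection; the names \Phi and \text{pr}_M are introduced here for illustration.

```latex
% Definition of the canonical 1-form at a point \lambda \in T^*M:
\[
  (\lambda_{\text{can}})_{\lambda}(v) = \lambda\big( d\pi(v) \big)
  \quad \text{for } v \in T_\lambda(T^*M) .
\]
% Since \pi \circ \Phi = \text{pr}_M, for any tangent vector w at (t,q) we get
\[
  (\Phi^*\lambda_{\text{can}})_{(t,q)}(w)
  = e^t \alpha_q\big( d(\pi \circ \Phi)(w) \big)
  = e^t \alpha_q\big( d\,\text{pr}_M(w) \big) ,
\]
% which is e^t \alpha evaluated on w, with \alpha regarded as a 1-form on R x M via \text{pr}_M.
```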

Wait, wait… did you say “convention”?

Yes, there was exactly one semi-arbitrary convention involved in what I just said. Did you see it? I’ll give you a moment. Once you’re ready for the answer, scroll past this video of a dog lamenting the horrors of sign errors:

So here’s the thing:

The symplectic form on T^*M is not canonical.

Cotangent bundles do of course have canonical Liouville forms. As we all learned in Symplectic Geometry 101, there is a 1-form \lambda_{\text{can}} defined on T^*M such that if you choose any local coordinates (q_1,\ldots,q_n) on a neighborhood in M and let (p_1,\ldots,p_n) denote the induced coordinates on the fibers over that neighborhood, then

\lambda_{\text{can}} = \sum_{j=1}^n p_j \, d q_j.

Since it’s obvious from the formula that d\lambda_{\text{can}} is symplectic, we often assume that d\lambda_{\text{can}} is the “canonical” symplectic form on T^*M. But why should it be? Why shouldn’t the symplectic form be -d\lambda_{\text{can}}?

If this strikes you as a silly question, keep reading.

(Update 24/08/2015: Patrick Massot makes a very good point below in the comments, that “canonical” is perhaps the wrong word to be using here — \lambda_{\text{can}} can more accurately be called a tautological 1-form, and d\lambda_{\text{can}} can just as accurately be called a “tautological 2-form” on T^*M. This reinforces my opinion that d\lambda_{\text{can}} is the “best” choice for a symplectic form on T^*M, though it is not the only reasonable choice.)

What, you haven’t asked Isaac Newton’s opinion?

One could argue in various ways that d\lambda_{\text{can}} and -d\lambda_{\text{can}} are equally good choices of symplectic forms on T^*M; for instance, the canonical Liouville vector field (pointing outward in the fibers) is Liouville with respect to both of them. In fact, there are situations in which one must take -d\lambda_{\text{can}} instead of d\lambda_{\text{can}}. This leads us back to the question that started this post, the question that has caused countless headaches to graduate students attempting to start their first research projects in Floer homology and related subjects:

I ask you for the last ****ing time, is it \omega(X_H,\cdot) = dH or \omega(X_H,\cdot) = -dH?

The symplectic literature is pretty evenly split in its opinion about the definition of a Hamiltonian vector field, but there’s a basic rule of thumb that I would say must always be (and usually is) observed. Whatever sign conventions you choose, they must lead to a version of Hamilton’s equations that physicists would recognize.

An undergraduate physics student would write Hamilton’s equations as follows:

\displaystyle \dot{q}_j = \frac{\partial H}{\partial p_j},        \displaystyle \dot{p}_j = - \frac{\partial H}{\partial q_j},

where q_1,\ldots,q_n are the “position” variables (moving in M) and p_1,\ldots,p_n are the “momentum” variables (moving in the fibers of T^*M). In the special case where M = {\mathbb R}^n and we’re talking about motion in a Newtonian potential, that same physics student will define H by

\displaystyle H(q,p) = \sum_{j=1}^n \frac{p_j^2}{2 m_j} + V(q),

where V(q) is the potential energy, and the positive term in front of it (depending on some constant masses m_1,\ldots,m_n > 0) is the kinetic energy. To make sure you’ve gotten the signs right in Hamilton’s equations, all you have to do is plug in this formula and compute \dot{q}_j = p_j / m_j, which is really what \dot{q}_j had better be if you’re going to refer to p_1,\ldots,p_n as “momentum” variables. If you end up defining momentum as minus mass times velocity, then you’ve clearly done something wrong.
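Carrying out that substitution explicitly also recovers Newton's second law, which is another way to confirm the signs. This is a direct computation from the two displayed formulas above.

```latex
\[
  \dot{q}_j = \frac{\partial H}{\partial p_j} = \frac{p_j}{m_j},
  \qquad
  \dot{p}_j = - \frac{\partial H}{\partial q_j} = - \frac{\partial V}{\partial q_j}
  \quad \Longrightarrow \quad
  m_j \ddot{q}_j = \dot{p}_j = - \frac{\partial V}{\partial q_j} ,
\]
% i.e. mass times acceleration equals the force -\nabla V.
```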

So if you accept what I’ve just said, then it forces upon us the following dichotomy:

Option 1: You can define Hamiltonian vector fields by \omega(X_H,\cdot) = -dH, and then you get the correct local version of Hamilton’s equations if the symplectic structure on T^*M is

\omega = d\lambda_{\text{can}} = \sum_{j=1}^n d p_j \wedge d q_j.

In this case, the symplectization of (M,\xi) can be written as ({\mathbb R} \times M,d(e^t\alpha)), but not as M \times {\mathbb R} since the latter has the wrong orientation.

Option 2: If you prefer to write \omega(X_H,\cdot) = dH, then you get the correct local expression for Hamilton’s equations if the symplectic structure on T^*M is

\omega = - d\lambda_{\text{can}} = \sum_{j=1}^n d q_j \wedge d p_j.

I have seen papers that conform to this convention, but most of them either don’t deal at all with contact geometry, or they do so but get some of the orientations wrong. Assuming \dim M = 2n-1, one would have to write the symplectization of (M,\xi) in this case as

({\mathbb R} \times M, - d(e^t\alpha)) if n is even,

(M \times {\mathbb R}, - d(e^t\alpha)) if n is odd.

For reasons that should by now be obvious, I prefer the first option. I have never seen the second option implemented in a consistent way in any paper; if I did, I would certainly find it a bit perverse, but I could not call it wrong.
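The dichotomy can be sanity-checked mechanically in the simplest case. The sketch below works on T^*{\mathbb R} with coordinates (q,p) and solves \omega(X_H,\cdot) = \pm dH for each pairing of conventions. All function names, and the sample potential V(q) = q^2/2, are hypothetical choices made for illustration.

```python
# Sanity check of the two sign conventions on T^*R with coordinates (q, p).
# Vectors and 1-forms are both stored as pairs (q-component, p-component).

E_Q = (1.0, 0.0)  # coordinate vector field d/dq
E_P = (0.0, 1.0)  # coordinate vector field d/dp


def dp_wedge_dq(u, v):
    """omega = dp ^ dq (= d lambda_can), evaluated on the vectors u, v."""
    return u[1] * v[0] - u[0] * v[1]


def dq_wedge_dp(u, v):
    """omega = dq ^ dp (= -d lambda_can), evaluated on the vectors u, v."""
    return u[0] * v[1] - u[1] * v[0]


def hamiltonian_vector_field(omega, dH, sign):
    """Solve omega(X, .) = sign * dH for X = (qdot, pdot).

    Writing X = a*E_Q + b*E_P and using antisymmetry of omega:
    omega(X, E_Q) = b * omega(E_P, E_Q) and omega(X, E_P) = a * omega(E_Q, E_P).
    """
    dH_q, dH_p = dH
    a = sign * dH_p / omega(E_Q, E_P)
    b = sign * dH_q / omega(E_P, E_Q)
    return (a, b)


def dH_newtonian(q, p, m, dV):
    """Differential of H = p^2/(2m) + V(q), returned as (dH/dq, dH/dp)."""
    return (dV(q), p / m)


# Sample point and mass; V(q) = q^2/2, so V'(q) = q (an illustrative choice).
m, q, p = 2.0, 0.5, 3.0
dH = dH_newtonian(q, p, m, dV=lambda x: x)

option1 = hamiltonian_vector_field(dp_wedge_dq, dH, sign=-1)  # omega(X_H, .) = -dH
option2 = hamiltonian_vector_field(dq_wedge_dp, dH, sign=+1)  # omega(X_H, .) = +dH
mixed = hamiltonian_vector_field(dp_wedge_dq, dH, sign=+1)    # inconsistent pairing

print("Option 1:", option1)
print("Option 2:", option2)
print("Mixed:   ", mixed)
```

Both consistent options return (\dot{q},\dot{p}) = (p/m, -V'(q)), while the mixed pairing returns the negative of that, i.e. momentum equal to minus mass times velocity.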

(Acknowledgement: Thanks to Yankı Lekili for a conversation that helped me greatly in getting my thoughts on this topic in order. The correct order, not the wrong order.)


Non-contact wormholes in all (higher) dimensions


I have an update on the subject of a post I wrote several months ago, in which I used the word “wormhole” together with the following graphic as a shameless attention-getting device:

A wormhole, obviously

The belt sphere of a connected sum, obviously.

(It worked then, so why not use it now?)

Anyway, topologists know that when I say wormhole, I mean connected sum: the topic of that earlier post was the fact that the prime decomposition theorem for tight contact 3-manifolds cannot be extended to dimension five, or to put it another way, nonprime 5-manifolds can admit strictly more tight contact structures than what you would get just by performing contact connected sums. Examples of this were observed in the second version of my preprint with Paolo Ghiggini and Klaus Niederkrüger, inspired in part by a result of Bowden, Crowley and Stipsicz, which I’ll have more to say about below.

The new development is that we can now prove the same is true in all higher dimensions, not just dimension five:

Theorem (Ghiggini-Niederkrüger-W. ’15).  Suppose n \ge 3, and (M,\Xi) is a closed almost contact manifold that is not a homotopy sphere but is diffeomorphic to an S^{n-1}-bundle over S^n. Let (-M,\overline{\Xi}) denote the same manifold with reversed orientation, carrying the same almost contact structure with reversed co-orientation. Then M \# (-M) admits a Stein fillable contact structure that is homotopic to \Xi \# \overline{\Xi} but is not isotopic to \xi_1 \# \xi_2 for any contact structures \xi_1 and \xi_2 on M and -M respectively.

The reason our result was initially restricted to dimension five was that we were trying to prove it as a corollary of the main theorem in our paper, concerning symplectic fillings of subcritical surgeries — the above statement is not a corollary of our main theorem when n \ge 4, but one can view it nonetheless as a corollary of our proof. The credit for this realization goes to Paolo Ghiggini, or possibly to the warm sea air of Gökova that inspired him (I have never been to Gökova, but I hear it’s very nice). The proof that is now in the third version of our preprint, which appeared on the arXiv about two weeks ago, is slightly different than the one that Paolo explained at the Gökova Geometry/Topology Conference, but the idea is the same. (In case the referee is reading this, please accept our apologies for not having thought up this improvement to the paper before we actually submitted it…)

As in the 5-dimensional version, the class of examples we use is borrowed from the paper of Bowden-Crowley-Stipsicz in which they exhibit “topological counterexamples” to a higher-dimensional extension of Eliashberg’s theorem on fillings of connected sums. They proved:

Theorem (Bowden-Crowley-Stipsicz ’14). Let M := ST^*S^{2k+1} denote the unit cotangent bundle of the (2k+1)-dimensional sphere for some odd number k \ge 5. Then M admits an almost contact structure \Xi such that some contact structure on M \# (-M) homotopic to \Xi \# \overline{\Xi} is Stein fillable, but no contact structure on M or -M homotopic to \Xi or \overline{\Xi} respectively is Stein fillable.

The corresponding statement is notably false for M = ST^*S^2, or for that matter when M is any 3-manifold, due to the combination of two well-known theorems:

  1. If M_1 and M_2 are closed oriented 3-manifolds, then all tight contact structures on M_1 \# M_2 are of the form \xi_1 \# \xi_2.
  2. If (M_1,\xi_1) and (M_2,\xi_2) are closed contact 3-manifolds, then every Stein filling of (M_1 \# M_2, \xi_1 \# \xi_2) is obtained by attaching a 1-handle to Stein fillings of (M_1,\xi_1) and (M_2,\xi_2).

The first statement is a weak version of the contact prime decomposition theorem, due mainly to Colin (see e.g. Section 4.12 of Geiges’ book), and the second is one of the main results in Eliashberg’s holomorphic disk-filling paper (the book by Cieliebak and Eliashberg contains a more complete proof). When the Bowden-Crowley-Stipsicz result first appeared, many of us interpreted it as evidence that Eliashberg’s theorem cannot be extended to higher dimensions, but this interpretation is false: according to our theorem, the reason one shouldn’t expect the fillable contact structures of Bowden-Crowley-Stipsicz to have fillable summands is that they are the wrong contact structures, i.e. they are not the ones that arise from contact connected sums. The prime decomposition theorem thus fails in higher dimensions, but Eliashberg’s theorem on fillings of connected sums could still be true! (I will refrain from expressing an opinion as to whether it actually is, as I’m not sure I would expect this question to be answerable in my lifetime — suffice it to say that the main result of the paper with Ghiggini and Niederkrüger provides some very weak evidence in favor, but it’s arguably so weak as to be hardly worth mentioning.)

I don’t want to make this post longer than necessary by explaining the proof of our theorem in detail, but I can give a reasonable sketch of the idea. As I’ve expressed it above, the stated hypotheses guarantee three essential properties of M:

  1. M is (2n-1)-dimensional with n \ge 3;
  2. M has an almost contact structure \Xi;
  3. M admits a Morse function f : M \to {\mathbb R} that has unique local minima and maxima and otherwise only critical points of indices n-1 and n.

We do not actually require M to be an S^{n-1}-bundle over S^n in general, as any M with these three properties will do; it’s immediate at least that M could be the unit cotangent bundle of any sphere, so our examples subsume those of Bowden-Crowley-Stipsicz. We now examine the same Stein domain that they do: let M^* denote the complement of an open ball in M, and consider the compact manifold with boundary and corners defined by

W := [-1,1] \times M^*.

After smoothing the corners, we can regard W as a compact smooth manifold with boundary M \# (-M), and moreover, the almost contact structure on M determines (up to homotopy) an almost complex structure J on W which induces the almost contact structure \Xi \# \overline{\Xi} on the boundary. Similarly, we can assume the Morse function on M has its unique local maximum in the disk that was removed to create M^*, thus it induces on W a Morse function with outward gradient at the boundary and critical points of index 0, n-1 and n. Since n \ge 3, Eliashberg’s topological characterization of Stein structures now gives W a Stein structure homotopic to J, inducing on the boundary a contact structure \xi homotopic to \Xi \# \overline{\Xi}.

W := [-1,1] \times M^* with the belt sphere of the connected sum on the boundary

Now if \xi is isotopic to \xi_1 \# \xi_2 for some contact structures \xi_1 and \xi_2 on M and -M respectively, it means that after an isotopy of \xi, the belt sphere

S := \{0\} \times \partial M^* \subset \partial W

of the connected sum is a particular kind of coisotropic submanifold in (\partial W,\xi), namely the kind that arises as the boundary of the co-core of a Weinstein 1-handle. One of the main things we show in our paper is that spheres of this type can be used as boundary conditions for holomorphic disks in the filling, as they decompose into families of totally real submanifolds. The details are significantly more complicated than in Eliashberg’s 1990 paper, but the outcome is quite similar: after a suitable choice of compatible almost complex structure J on W, one can define a moduli space {\mathcal M}(J) of J-holomorphic disks

u : ({\mathbb D}^2,\partial{\mathbb D}^2) \to (W,S)

with one marked point, and this moduli space is a compact (2n-1)-dimensional manifold which, due to the marked point, has the form of a trivial disk bundle \Sigma \times {\mathbb D}^2. Moreover, it has a continuous evaluation map

\text{ev} : {\mathcal M}(J) = \Sigma \times {\mathbb D}^2 \to W

which takes \partial {\mathcal M}(J) to the belt sphere S and defines a map of degree one \partial {\mathcal M}(J) \to S; in fact, \text{ev} is a diffeomorphism on some open subset. It’s easy to see that the image of \text{ev} cannot have much interesting topology, e.g. \text{ev}_* sends H_k({\mathcal M}(J)) to zero for every k > 0. Indeed, since {\mathcal M}(J) \cong \Sigma \times {\mathbb D}^2, every k-cycle in {\mathcal M}(J) can be pushed to the boundary by moving it in {\mathbb D}^2, thus \text{ev} maps it to a cycle in the (2n-2)-dimensional sphere S, which is trivial unless k=2n-2 (in which case the cycle must be trivial in H_k({\mathcal M}(J)) to begin with since it lives in the boundary).
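For readers who like to see the bookkeeping, the homological argument in the previous paragraph can be summarized as follows. This is only a sketch: the symbols \iota, \iota_S and \text{ev}_\partial are shorthand introduced here and do not appear in the paper.

```latex
% Sketch: why ev_* vanishes on H_k(M(J)) for k > 0.
% Notation introduced for this sketch only (hypothetical names):
%   \iota   : \partial\mathcal{M}(J) \hookrightarrow \mathcal{M}(J)   (inclusion of the boundary)
%   \iota_S : S \hookrightarrow W                                     (inclusion of the belt sphere)
%   \mathrm{ev}_\partial := \mathrm{ev}|_{\partial\mathcal{M}(J)} : \partial\mathcal{M}(J) \to S
\begin{align*}
 &\mathrm{ev} \circ \iota = \iota_S \circ \mathrm{ev}_\partial
   && \text{(ev maps the boundary into } S\text{)} \\
 &\iota_* : H_k(\partial\mathcal{M}(J)) \to H_k(\mathcal{M}(J)) \ \text{is surjective}
   && \text{(}\Sigma \times \{\mathrm{pt}\} \hookrightarrow \Sigma \times \mathbb{D}^2
      \text{ is a homotopy equivalence, with } \mathrm{pt} \in \partial\mathbb{D}^2\text{)} \\
 &\mathrm{ev}_*(\iota_* c) = (\iota_S)_* (\mathrm{ev}_\partial)_* c, \qquad
   (\mathrm{ev}_\partial)_* c \in H_k(S) = 0
   && \text{for } 0 < k < 2n-2, \text{ since } S \cong S^{2n-2} \\
 &H_k(\mathcal{M}(J)) \cong H_k(\Sigma) = 0
   && \text{for } k \ge 2n-2, \text{ since } \dim \Sigma = 2n-3
\end{align*}
```

The first three lines handle the range 0 < k < 2n-2, and the last line covers the remaining degrees, where there is nothing to prove.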

Our original goal in this paper had been to use the topological information provided by the moduli space {\mathcal M}(J) to prove in much more general situations that [S] = 0 \in \pi_*(W), implying the lack of any homotopy-theoretic obstruction to W containing a handle with S as its belt sphere. We still don’t know how to prove that, except in a limited set of cases, such as when n=3. In the specific situation at hand, however, one gets more information by composing \text{ev} with the obvious projection W = [-1,1] \times M^* \to M^*, producing a continuous map

f := \text{pr}_2 \circ \text{ev} : ({\mathcal M}(J),\partial{\mathcal M}(J)) \to (M^*,\partial M^*)

which evidently has degree 1. In fact, the topological information we have about f now closely resembles the evaluation map in the proof of the Eliashberg-Floer-McDuff theorem on fillings of spheres! In that setting, one starts with an unknown symplectically aspherical filling X of a standard contact sphere S^{2n-1} and constructs a moduli space with an evaluation map

\text{ev} : ({\mathcal M}(J),\partial{\mathcal M}(J)) \to (X,S^{2n-1}),

which necessarily has degree 1 and various other properties that suffice to deduce that \pi_1(X) = 0 and H_k(X) = 0 for all k > 0. This implies via the Hurewicz theorem that X is weakly contractible, hence it is contractible by Whitehead’s theorem, and finally the h-cobordism theorem implies that it is diffeomorphic to a ball. The same topological arguments work in the present case: they imply that M^* must be contractible and thus M must be a homotopy sphere, contradicting the assumptions of our theorem. (You don’t actually need to apply the h-cobordism theorem in this case, though you are free to do so if you like it.)


Some good news about the forgetful map in SFT

This post is, unsurprisingly, a sequel to “Some bad news about the forgetful map in SFT,” in which I gave an example of a stable Hamiltonian structure (\lambda,\omega) for which the forgetful map cannot be made transverse by choosing J generically in {\mathcal J}(\lambda,\omega). In that example, the hyperplane distribution \xi = \ker\lambda was integrable, and that will turn out to be a significant detail: the goal of this post is to explain why no such example can occur if \xi is contact.

The universal moduli space

Recall the question we considered in the previous post.

Question 1: Given a smooth submanifold X \subset {\mathcal M}_{g,m+p+q}, do generic perturbations of J \in {\mathcal J}(\lambda,\omega) suffice to ensure that the moduli space of somewhere injective curves {\mathcal M}^*_{g,m,p,q}(J) is cut out transversely and the forgetful map {\mathcal M}^*_{g,m,p,q}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M}_{g,m+p+q} is transverse to X?

Using the usual approach to generic transversality results, one can rephrase this question in terms of a universal moduli space, i.e. a space {\mathcal M}_{g,m,p,q}^*({\mathcal J}_\epsilon) of pairs (u,J) where J belongs to an infinite-dimensional Banach manifold {\mathcal J}_\epsilon of perturbed almost complex structures in {\mathcal J}(\lambda,\omega), and u \in {\mathcal M}_{g,m,p,q}^*(J). Recall that locally, this space can be identified with the zero set of a smooth section of a Banach space bundle {\mathcal E} \to {\mathcal T} \times {\mathcal B} \times {\mathcal J}_\epsilon, namely

\bar{\partial} : {\mathcal T} \times {\mathcal B} \times {\mathcal J}_\epsilon \to {\mathcal E} : (j,u,J) \mapsto Tu + J \circ Tu \circ j,

where {\mathcal B} is a Banach manifold of asymptotically cylindrical maps u : \dot{\Sigma} \to {\mathbb R} \times M and {\mathcal T} is a Teichmüller slice, i.e. a finite-dimensional smooth family of complex structures on \dot{\Sigma} parametrizing an open subset of Teichmüller space. The main step in any generic transversality proof is to show that the linearization

D\bar{\partial}(j,u,J) : T_j{\mathcal T} \oplus T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)}

D\bar{\partial}(j,u,J)(y,\eta,Y) = J \circ Tu \circ y + \mathbf{D}_u \eta + Y \circ Tu \circ j

is surjective, so that \bar{\partial}^{-1}(0) is a Banach manifold, and a Baire set of regular almost complex structures is found by applying the Sard-Smale theorem to \bar{\partial}^{-1}(0) \to {\mathcal J}_\epsilon : (j,u,J) \mapsto J. In this picture, the forgetful map takes the form

\Phi : \bar{\partial}^{-1}(0) \to {\mathcal T} : (j,u,J) \mapsto j,

and Question 1 is now equivalent to either of the following:

Question 2: Is the map \bar{\partial}^{-1}(0) \stackrel{\Phi}{\longrightarrow} {\mathcal T} : (j,u,J) \mapsto j a submersion?

Question 2′: Is the linear map T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)} : (\eta,Y) \mapsto \mathbf{D}_u \eta + Y \circ Tu \circ j surjective?

Exercise 1: Assuming D\bar{\partial}(j,u,J) is surjective, convince yourself that Questions 2 and 2′ are equivalent.
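In case it helps to get started on the exercise, here is one way the linear algebra might be organized. Treat it as a sketch rather than a complete solution; the symbol \mathbf{L} is used for the linear map in Question 2′, and the identification of the tangent space with a kernel relies on the assumed surjectivity of D\bar{\partial}(j,u,J).

```latex
% Sketch for Exercise 1, assuming D\bar{\partial}(j,u,J) is surjective.
% \mathbf{L}(\eta,Y) := \mathbf{D}_u \eta + Y \circ Tu \circ j is the map in Question 2'.
\begin{align*}
 T_{(j,u,J)} \bar{\partial}^{-1}(0)
   &= \ker D\bar{\partial}(j,u,J)
    = \{ (y,\eta,Y) \ |\ J \circ Tu \circ y + \mathbf{L}(\eta,Y) = 0 \}, \\
 d\Phi(j,u,J)(y,\eta,Y) &= y, \\
 d\Phi(j,u,J) \text{ is surjective}
   &\iff J \circ Tu \circ y \in \operatorname{im} \mathbf{L}
      \ \text{ for all } y \in T_j{\mathcal T} \\
   &\iff \operatorname{im} \mathbf{L} = \operatorname{im} D\bar{\partial}(j,u,J)
      = {\mathcal E}_{(j,u,J)}.
\end{align*}
% The last equivalence uses
%   \operatorname{im} D\bar{\partial}(j,u,J)
%     = \{ J \circ Tu \circ y \ |\ y \in T_j{\mathcal T} \} + \operatorname{im} \mathbf{L}.
```

Since {\mathcal T} is finite dimensional, surjectivity of d\Phi is all that needs to be checked for \Phi to be a submersion.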

If the answer to either question is yes, then for any submanifold X \subset {\mathcal T}, \Phi^{-1}(X) \subset \bar{\partial}^{-1}(0) is also a manifold, and applying the Sard-Smale theorem to \Phi^{-1}(X) \to {\mathcal J}_\epsilon : (j,u,J) \mapsto J produces a Baire set for which the forgetful map {\mathcal M}_{g,m,p,q}(J) \to {\mathcal M}_{g,m+p+q} is transverse to X.

It depends on the contact condition!

So here’s the good news. Given a stable Hamiltonian structure (\lambda,\omega) on M, denote by \pi_\xi : T({\mathbb R} \times M) \to \xi the projection along the complex subbundle \text{Span}_{\mathbb R}(\partial_t,R); recall that d\lambda(R,\cdot) = 0, thus d\lambda(v,\cdot) = d\lambda(\pi_\xi v,\cdot) for all v \in T({\mathbb R} \times M). Let

{\mathcal M}_{g,m,p,q}^!(J) \subset {\mathcal M}_{g,m,p,q}^*(J)

denote the open subset defined by the condition that u : \dot{\Sigma} \to {\mathbb R} \times M has an injective point z_0 \in \dot{\Sigma} satisfying

\text{im}\left( \pi_\xi \circ du(z_0)\right) \cap \ker (d\lambda|_\xi) = \{0\}.

This condition similarly defines an open subset {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) \subset {\mathcal M}_{g,m,p,q}^*({\mathcal J}_\epsilon) of the universal moduli space. We can make two immediate observations about this condition:

  1. It is never satisfied if \xi is integrable, as d\lambda|_\xi = 0 in this case. This applies in particular to the counterexample in the previous post.
  2. If \xi is contact, then the condition is satisfied for all somewhere injective curves other than trivial cylinders: indeed, d\lambda|_\xi is nondegenerate in this case, and \pi_\xi \circ du vanishes only at isolated points (the latter is always true unless \pi_\xi \circ du vanishes identically; there’s a simple proof of this in another earlier post).

Theorem. The space {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) is a smooth Banach manifold, and the forgetful map \Phi : {\mathcal M}_{g,m,p,q}^!({\mathcal J}_\epsilon) \to {\mathcal M}_{g,m+p+q} is a submersion.

Using the usual Sard-Smale argument plus the Taubes trick to replace {\mathcal J}_\epsilon by {\mathcal J}(\lambda,\omega) (see part 1 of the transversality post for more details), this implies:

Corollary. Given any submanifold X \subset {\mathcal M}_{g,m+p+q}, there exists a Baire subset {\mathcal J}^{\text{reg}}(X) \subset {\mathcal J}(\lambda,\omega) such that for all J \in {\mathcal J}^{\text{reg}}(X), {\mathcal M}_{g,m,p,q}^!(J) is a smooth manifold of the predicted dimension and the forgetful map \Phi : {\mathcal M}_{g,m,p,q}^!(J) \to {\mathcal M}_{g,m+p+q} is transverse to X.

As outlined above, the key step in proving the theorem is to show that the linear map

\mathbf{L} : T_u{\mathcal B} \oplus T_J{\mathcal J}_\epsilon \to {\mathcal E}_{(j,u,J)} : (\eta,Y) \mapsto \mathbf{D}_u \eta + Y \circ Tu \circ j

is surjective. The argument for this begins in a standard way: if \mathbf{L} is not surjective, then there exists a nontrivial (0,1)-form \theta \in \Omega^{0,1}(\dot{\Sigma},u^*TW) which is L^2-orthogonal to the image of \mathbf{L}, implying

  1. \langle \mathbf{D}_u \eta , \theta \rangle_{L^2} = 0 for all \eta \in T_u{\mathcal B};
  2. \langle Y \circ Tu \circ j, \theta \rangle_{L^2} = 0 for all Y \in T_J{\mathcal J}_\epsilon.

The first condition implies as usual that \theta is a weak solution to a Cauchy-Riemann type equation and is thus (by elliptic regularity and the similarity principle) smooth with isolated zeroes. We would then like to argue that one can choose Y \in T_J{\mathcal J}_\epsilon using a bump function near the image of the injective point z_0 so that the second condition makes \theta vanish near z_0, giving a contradiction. But this is not obvious, because Y : T({\mathbb R} \times M) \to T({\mathbb R} \times M) only acts nontrivially on the subbundle \xi rather than on the entirety of T({\mathbb R} \times M). In order to deal with this, we need to examine \mathbf{L} more carefully in relation to the natural splitting of T({\mathbb R} \times M) produced by the stable Hamiltonian structure. The following is a slight repackaging of the argument given by Bourgeois.

Abbreviate W := {\mathbb R} \times M and note that for any J \in {\mathcal J}_\epsilon, we have a splitting of complex vector bundles

(TW,J) = (\Lambda,i) \oplus (\xi,J), where \Lambda := \text{Span}_{\mathbb R}(\partial_t,R).

This gives a splitting u^*TW = u^*\Lambda \oplus u^*\xi and thus breaks down the Cauchy-Riemann type operator \mathbf{D}_u in block form as

\mathbf{D}_u = \begin{pmatrix} \mathbf{D}_u^\Lambda & \mathbf{D}_u^{\xi\Lambda} \\ \mathbf{D}_u^{\Lambda\xi} & \mathbf{D}_u^\xi \end{pmatrix}.

It is easy to check that \mathbf{D}_u^\Lambda and \mathbf{D}_u^\xi are Cauchy-Riemann type operators on u^*\Lambda and u^*\xi respectively, while \mathbf{D}_u^{\Lambda\xi} : u^*\Lambda \to u^*\xi and \mathbf{D}_u^{\xi\Lambda} : u^*\xi \to u^*\Lambda are tensorial, i.e. they are smooth bundle maps. The perturbation term Y \in T_J{\mathcal J}_\epsilon can likewise be written in block form as

Y = \begin{pmatrix} 0 & 0 \\ 0 & Y_\xi \end{pmatrix},

where Y_\xi \in \Gamma(\overline{\text{End}}_{\mathbb C}(\xi,J)), and \theta has components (\theta_\Lambda,\theta_\xi) with respect to the splitting

\overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*TW) = \overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*\Lambda) \oplus \overline{\text{Hom}}_{\mathbb C}(T\dot{\Sigma},u^*\xi).

Now choosing Y_\xi via a bump function near u(z_0), one can use the second orthogonality condition above to prove that \theta_\xi vanishes identically near z_0. It remains to show that the same is true for \theta_\Lambda, and this is the step that will turn out to depend on the condition \text{im}\left(\pi_\xi \circ du(z_0)\right) \cap \ker (d\lambda|_\xi) = \{0\}. Notice that if we choose \eta = (\eta_\Lambda,\eta_\xi) with \eta_\Lambda \equiv 0 and \eta_\xi supported in the region near z_0 where \theta_\xi is known to vanish, then the orthogonality conditions reduce to

\langle \mathbf{D}_u^{\xi\Lambda} \eta_\xi , \theta_\Lambda \rangle_{L^2} = 0.

We’re going to need an explicit formula for the bundle map \mathbf{D}_u^{\xi\Lambda} : u^*\xi \to u^*\Lambda. For this purpose, it will help to think of (W,J) as something resembling a Stein manifold: notice that the coordinate projection function t : {\mathbb R} \times M \to {\mathbb R} satisfies

-dt \circ J = \lambda

for every J \in {\mathcal J}(\lambda,\omega), so its level sets are J-convex if and only if \lambda is contact. Choose holomorphic local coordinates \sigma+i\tau near z_0 and, for a section \eta_\xi \in \Gamma(u^*\xi) supported in this coordinate neighborhood, let us compute \lambda(\mathbf{D}_u \eta_\xi(\partial_\sigma)). By the definition of the linearized Cauchy-Riemann operator, we can write

\mathbf{D}_u \eta_\xi(\partial_\sigma) = \left.\nabla_s \left( \partial_\sigma u_s + J(u_s) \partial_\tau u_s \right)\right|_{s=0}

for any smooth family of maps u_s : \dot{\Sigma} \to W with u_0 = u and \partial_s u_s|_{s=0} = \eta_\xi and any connection \nabla on W. Then since \partial_\sigma u + J(u) \partial_\tau u = 0, we find

\lambda(\mathbf{D}_u \eta_\xi(\partial_\sigma)) = \lambda \left( \left. \nabla_s (\partial_\sigma u_s + J(u_s) \partial_\tau u_s)\right|_{s=0} \right) = \left.\partial_s \left[ \lambda(\partial_\sigma u_s + J(u_s) \partial_\tau u_s)\right]\right|_{s=0}

= \left.\partial_s [\lambda(\partial_\sigma u_s)]\right|_{s=0} + \left.\partial_s [(\lambda \circ J)(\partial_\tau u_s)]\right|_{s=0} = d\lambda(\eta_\xi,\partial_\sigma u) + d(\lambda \circ J)(\eta_\xi,\partial_\tau u)

= d\lambda(\eta_\xi,\pi_\xi \partial_\sigma u),

where we’ve used the formula

d\lambda(X,Y) = {\mathcal L}_X[\lambda(Y)] - {\mathcal L}_Y[\lambda(X)] - \lambda([X,Y])

and eliminated several terms using the fact that \lambda(\eta_\xi) = \lambda(J\eta_\xi) = 0 since \eta_\xi is valued in \xi, plus d(\lambda \circ J) = 0 since \lambda \circ J = dt is exact. A similar computation gives

dt(\mathbf{D}_u \eta_\xi(\partial_\sigma)) = -d\lambda(\eta_\xi, \pi_\xi \partial_\tau u) = -d\lambda(\eta_\xi, J \pi_\xi \partial_\sigma u),

so removing the local coordinates from the picture and writing sections of u^*\Lambda with respect to the obvious complex trivialization, we have

\mathbf{D}^{\xi\Lambda}_u \eta_\xi = - d\lambda(\eta_\xi, J \pi_\xi du(\cdot)) + i \, d\lambda(\eta_\xi, \pi_\xi du(\cdot)).
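To make the elimination of terms in the computation of \lambda(\mathbf{D}_u \eta_\xi(\partial_\sigma)) more explicit, here is a sketch of the intermediate step. It applies the formula for d\lambda to the commuting coordinate vector fields \partial_s, \partial_\sigma, \partial_\tau on the parameter space of the family u_s, so that the Lie bracket terms drop out.

```latex
% Intermediate step (sketch). The vector fields \partial_s, \partial_\sigma, \partial_\tau commute,
% so the bracket term in the formula for d\lambda vanishes.
\begin{align*}
 \partial_s \left[ \lambda(\partial_\sigma u_s) \right]
   &= d\lambda(\partial_s u_s, \partial_\sigma u_s)
      + \partial_\sigma \left[ \lambda(\partial_s u_s) \right], \\
 \partial_s \left[ (\lambda \circ J)(\partial_\tau u_s) \right]
   &= d(\lambda \circ J)(\partial_s u_s, \partial_\tau u_s)
      + \partial_\tau \left[ (\lambda \circ J)(\partial_s u_s) \right].
\end{align*}
% At s = 0 we have \partial_s u_s = \eta_\xi, so
%   \lambda(\partial_s u_s) = \lambda(\eta_\xi) = 0   and
%   (\lambda \circ J)(\partial_s u_s) = \lambda(J \eta_\xi) = 0
% as functions of (\sigma,\tau); hence the second term in each line vanishes.
% The term d(\lambda \circ J)(\eta_\xi, \partial_\tau u) vanishes because \lambda \circ J is exact,
% and d\lambda(\eta_\xi, \partial_\sigma u) = d\lambda(\eta_\xi, \pi_\xi \partial_\sigma u)
% because d\lambda(v,\cdot) = d\lambda(\pi_\xi v,\cdot) for all v.
```

The dt-component works the same way with the roles of \lambda and \lambda \circ J exchanged, and the two real components then combine into the complex-valued bundle map u^*\xi \to u^*\Lambda written above.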

The following exercise in symplectic linear algebra shows that this bundle map u^*\xi \to u^*\Lambda is surjective on all fibers in some neighborhood of z_0. (If you have no patience for the exercise, just convince yourself that it’s true if d\lambda|_\xi is nondegenerate and tames J|_\xi.)

Exercise 2: Assume V is a finite-dimensional vector space, X, Y \in V are linearly independent vectors, and \Omega is an alternating bilinear form on V. Show that the real-linear map

A : V \to {\mathbb C} : v \mapsto \Omega(v,X) + i \Omega(v,Y)

is surjective if and only if \text{Span}(X,Y) \cap \ker \Omega = \{0\}. Hint: Under the latter condition, one loses no generality by replacing V with a subspace that is complementary to \ker \Omega and contains \text{Span}(X,Y), in which case (V,\Omega) becomes a symplectic vector space. Now consider the restriction of A to a 2-dimensional subspace transverse to the symplectic complement of \text{Span}(X,Y).
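As a warm-up for the exercise, here is a sketch of the easier direction together with the simplest model case. The coefficients c_1, c_2 and the basis e_1, e_2 are notation introduced only for this illustration.

```latex
% (a) "Only if" direction: suppose c_1 X + c_2 Y \in \ker\Omega with (c_1,c_2) \ne (0,0). Then
\begin{align*}
 c_1 \, \Omega(v,X) + c_2 \, \Omega(v,Y) = \Omega(v, c_1 X + c_2 Y) = 0
 \quad \text{for all } v \in V,
\end{align*}
% so the image of A lies in the real line \{ x + iy \ |\ c_1 x + c_2 y = 0 \} \subset {\mathbb C},
% and A is not surjective.
%
% (b) Model case: V = {\mathbb R}^2 with basis (e_1,e_2), \Omega(e_1,e_2) = 1, X = e_1, Y = e_2.
\begin{align*}
 A(a e_1 + b e_2)
   = \Omega(a e_1 + b e_2, e_1) + i \, \Omega(a e_1 + b e_2, e_2)
   = -b + i a,
\end{align*}
% which is surjective, consistent with \text{Span}(X,Y) \cap \ker\Omega = \{0\}.
```

The hint in the exercise amounts to reducing the general case to a situation like (b).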

In light of the exercise, we can choose \eta_\xi with support near z_0 so that \mathbf{D}_u^{\xi\Lambda} \eta_\xi = \theta_\Lambda near z_0, thus the condition \langle \mathbf{D}_u^{\xi\Lambda} \eta_\xi , \theta_\Lambda \rangle_{L^2} = 0 implies that \theta_\Lambda must indeed vanish near z_0. We conclude that \theta itself vanishes near z_0, contradicting the fact that it has isolated zeroes and thus completing the proof.
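The choice of \eta_\xi in this last step can be made concrete with a cutoff function. The following is a sketch in which \beta is a hypothetical bump function, not part of the original argument, supported in the neighborhood of z_0 where \theta_\xi vanishes and where \mathbf{D}_u^{\xi\Lambda} is fiberwise surjective.

```latex
% \beta : \dot{\Sigma} \to [0,1] is a smooth bump function (introduced for this sketch)
% with \beta(z_0) > 0 and support in the neighborhood of z_0 described above.
% Fiberwise surjectivity gives a smooth \eta_\xi, supported in \text{supp}(\beta), with
%   \mathbf{D}_u^{\xi\Lambda} \eta_\xi = \beta \, \theta_\Lambda.
\begin{align*}
 0 = \langle \mathbf{D}_u^{\xi\Lambda} \eta_\xi , \theta_\Lambda \rangle_{L^2}
   = \langle \beta \, \theta_\Lambda , \theta_\Lambda \rangle_{L^2}
   = \int_{\dot{\Sigma}} \beta \, |\theta_\Lambda|^2 \, d\text{vol}.
\end{align*}
% The integrand is nonnegative, so \theta_\Lambda = 0 wherever \beta > 0,
% in particular on a neighborhood of z_0.
```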

Exercise 3: Generalize the theorem above to the setting of a Weinstein manifold (W,\lambda,f), i.e. assume (W,\lambda) is a Liouville manifold whose dual Liouville vector field V_\lambda is gradient-like with respect to f : W \to {\mathbb R}. Notice that away from critical points of f, \lambda = -df \circ J is satisfied for any almost complex structure J that preserves the subbundle \xi := \ker df \cap \ker \lambda and takes V_\lambda to df(V_\lambda) R_\lambda, with R_\lambda denoting the “level-wise Reeb vector field” defined by the conditions

df(R_\lambda) = 0,     \lambda(R_\lambda) = 1,     and     d\lambda(R_\lambda,\cdot)|_\xi = 0.

For what class of J-holomorphic curves can you achieve transversality only by perturbing J on \xi?

Epilogue: what of the normal Cauchy-Riemann operator?

Compared with the argument I explained in my earlier 2-part post on this subject, the one given above achieves a stronger result at the cost of a stronger hypothesis. The weakening of hypotheses in the previous result is achieved by focusing on the normal bundle and the corresponding normal Cauchy-Riemann operator \mathbf{D}_u^N; this makes it reasonable to consider local perturbations of J only in directions normal to u (hence the condition that u must not be everywhere tangent to \xi), and the contact condition plays no role. This approach suffices because transversality in the usual moduli space of unparametrized holomorphic curves is equivalent to the surjectivity of \mathbf{D}_u^N. The latter fact is not hard to grasp in the case where u is immersed: then every other J-holomorphic curve near u admits a unique parametrization in the form

u' = \exp_u \eta

for some section \eta of the normal bundle. Such curves may be considered pseudoholomorphic if and only if their tangent spaces are J-invariant, and the linearization of the nonlinear problem detecting J-invariant immersions of the form \exp_u \eta then reduces to \mathbf{D}_u^N. (This alternative perspective on the nonlinear problem is explained more precisely in the paper by Hofer-Lizan-Sikorav.) From this point of view, however, notice that the domain complex structure of the nearby solution u' cannot be prescribed: it is fully determined by j' := (u')^*J, so the problem of finding nearby pseudoholomorphic maps u' : (\dot{\Sigma},j) \to (W,J) of the form u' = \exp_u \eta with j prescribed and \eta normal to u is overdetermined. This is why looking at \mathbf{D}_u^N does not give us transversality of the forgetful map, and we had to take a different approach in the proof above. For the same reasons, there are (as far as I know) no useful “automatic transversality” results in dimension four for moduli spaces with fixed conformal structures on the domain. Such results are typically proved by showing that \mathbf{D}_u^N is surjective, but this only gives a meaningful transversality condition if the conformal structure of the domain is allowed complete freedom of movement. The forgetful map is never automatically transverse.


Some bad news about the forgetful map in SFT

This post and its sequel (fittingly titled “Some good news about the forgetful map in SFT”) are meant as addenda to my 2-part post from last winter on generic transversality in symplectizations. The result I tried to explain in that post might be called the fundamental transversality theorem of SFT: it states that for generic choices of J in the usual space of translation-invariant almost complex structures on the symplectization {\mathbb R} \times M of a contact manifold (M,\xi), every somewhere injective J-holomorphic curve in {\mathbb R} \times M is Fredholm regular. In fact, the proof I explained works in the more general setting where instead of a contact structure, M is endowed with a stable Hamiltonian structure (\lambda,\omega), the caveat being that one must exclude from consideration any holomorphic curves that are everywhere tangent to \xi := \ker \lambda, a scenario that can never happen in the contact case.

In this and the next post, I want to discuss a slightly subtle detail about the smoothness of these moduli spaces: the forgetful map. Readers may be familiar with this notion from Gromov-Witten theory, e.g. it appears in the book by McDuff and Salamon as a continuous map from the moduli space {\mathcal M}(J) of unparametrized J-holomorphic curves to the moduli space {\mathcal M} of Riemann surfaces, associating to each equivalence class of holomorphic curves u : (\Sigma,j) \to (W,J) the isomorphism class of complex structures (or equivalently, conformal structures) on its domain; in symbols,

{\mathcal M}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M} : [(j,u)] \mapsto [j].

This map plays a key role in defining many of the more interesting algebraic structures in Gromov-Witten theory, and in principle it can play a similar role in Symplectic Field Theory. But in order to make use of it, one needs to know not just that {\mathcal M}(J) is smooth, but also that \Phi can be made transverse to given cycles in {\mathcal M} for generic choices. This transversality condition is not so obvious, and as I’ll show in this post, there are situations where it is not even true. That is the bad news. The corresponding good news will be that it is true in the setting that SFT really cares about, namely in the symplectization of any contact manifold.

The impetus for this post was a minicourse on transversality techniques that I gave last week at the 2015 Summer School on Moduli Problems in Symplectic Geometry at IHES, in which I explained the transversality proof that is the subject of the earlier post and referenced this blog in lieu of lecture notes. In the minicourse, I mentioned (but did not give a very good explanation of) the following curious detail: unlike other proofs of generic transversality that one commonly sees, my proof in the symplectization setting depended crucially on the fact that we are considering holomorphic curves whose domain conformal structures are allowed to vary. In particular, my proof does not imply that one can make moduli spaces of J-holomorphic curves with constrained conformal structures on the domain smooth by choosing J generically — or to say it the fancy way, it does not make the forgetful map transverse. There are other proofs in the literature that do not have this limitation, e.g. a different proof for holomorphic cylinders in the contact setting appears in the appendix of a paper by Bourgeois. Strictly speaking, the moduli space of conformal structures is irrelevant in Bourgeois’s proof since the conformal structure of a cylinder is unique, but as I’ll sketch in the sequel to this post, his argument can easily be generalized to arbitrary punctured Riemann surfaces without the moduli space of conformal structures playing any role, so it does imply transversality of the forgetful map.

I hadn’t seriously thought about this detail before, but when it came up in the minicourse, it got me wondering whether this discrepancy was just a defect of my approach or an actual mathematical phenomenon. The answer turned out to be the latter, and it could conceivably have some nontrivial implications for computation problems in SFT: in the general setting of stable Hamiltonian structures, the forgetful map is not generically transverse, and I will give a counterexample below. The fact that Bourgeois’s proof does give transversality of the forgetful map is therefore a distinctly contact phenomenon. Let me explain what I mean.

The forgetful map in SFT

We will work in the same setting as in the earlier post: M is a closed oriented (2n-1)-dimensional manifold carrying a stable Hamiltonian structure (\lambda,\omega), which induces a co-oriented hyperplane distribution \xi = \ker\lambda \subset TM and a Reeb vector field R, and also determines a space {\mathcal J}(\lambda,\omega) of smooth translation-invariant almost complex structures on the symplectization {\mathbb R} \times M with J(\partial_t) = R and J|_{\xi} : \xi \to \xi compatible with \omega|_\xi.

Fix nonnegative integers g, m, p and q. Given J \in {\mathcal J}(\lambda,\omega), the moduli space {\mathcal M}_{g,m,p,q}(J) of unparametrized J-holomorphic curves in {\mathbb R} \times M with genus g, m marked points, p positive and q negative punctures consists of equivalence classes of tuples (\Sigma,j,\Theta,\Gamma,u) where (\Sigma,j) is a closed Riemann surface of genus g, \Theta \subset \Sigma is an ordered set of m points, \Gamma \subset \Sigma is partitioned into two ordered subsets \Gamma^+ and \Gamma^- of p and q points respectively, u : (\dot{\Sigma} := \Sigma \setminus \Gamma,j) \to ({\mathbb R} \times M,J) is J-holomorphic and positively/negatively asymptotic to trivial cylinders over closed Reeb orbits at each of the punctures in \Gamma^\pm \subset \Sigma, and two such tuples are considered equivalent if they are related by a diffeomorphism of their respective domains. Let

{\mathcal M}^*_{g,m,p,q}(J) \subset {\mathcal M}_{g,m,p,q}(J)

denote the open subset defined by the condition that u : \dot{\Sigma} \to {\mathbb R} \times M is somewhere injective. The forgetful map is defined by

\Phi : {\mathcal M}_{g,m,p,q}(J) \to {\mathcal M}_{g,m+p+q} : [(\Sigma,j,\Theta,\Gamma,u)] \mapsto [(\Sigma,j,\Theta \cup \Gamma)],

where {\mathcal M}_{g,k} denotes the moduli space of (marked) Riemann surfaces, consisting of equivalence classes of tuples (\Sigma,j,\Theta') with \Theta' \subset \Sigma an ordered set of k points.

It is a classical fact that {\mathcal M}_{g,k} is smooth, though in general the presence of biholomorphic automorphisms makes it an orbifold rather than a manifold. For the purposes of this discussion, I’m going to ignore automorphisms and pretend {\mathcal M}_{g,k} is a manifold wherever convenient. Then it is natural to ask the following:

Question 1: Given a smooth submanifold X \subset {\mathcal M}_{g,m+p+q}, do generic perturbations of J \in {\mathcal J}(\lambda,\omega) suffice to ensure that the moduli space of somewhere injective curves {\mathcal M}^*_{g,m,p,q}(J) is cut out transversely and the forgetful map {\mathcal M}^*_{g,m,p,q}(J) \stackrel{\Phi}{\longrightarrow} {\mathcal M}_{g,m+p+q} is transverse to X?

Here the words “cut out transversely” are used as a synonym for what I usually call Fredholm regularity of the elements in {\mathcal M}^*_{g,m,p,q}(J), so in particular {\mathcal M}^*_{g,m,p,q}(J) is a smooth manifold of the “correct” dimension as predicted by the usual index formula, but in addition to this, we obtain a smooth submanifold

\Phi^{-1}(X) \subset {\mathcal M}^*_{g,m,p,q}(J)

whose codimension matches that of X \subset {\mathcal M}_{g,m+p+q}. For example, if X is defined to be a single element [(\Sigma,j,\Theta \cup \Gamma)] with trivial automorphism group, then \Phi^{-1}(X) can be identified with the space of parametrized somewhere injective J-holomorphic maps u : (\dot{\Sigma},j) \to ({\mathbb R} \times M,J), where j, \Theta and \Gamma are regarded as fixed data on a fixed surface \Sigma, and transversality implies that this space of maps is a manifold.

A counterexample in the integrable case

The following example shows that the answer to Question 1 is sometimes no. Suppose (X,\Omega) is a closed symplectic manifold of dimension 2n-2, and endow M := S^1 \times X with the stable Hamiltonian structure (\lambda,\omega) := (d\theta,\Omega), where \theta denotes the coordinate on S^1. The Reeb vector field is then R = \partial_{\theta}, so every J \in {\mathcal J}(\lambda,\omega) on {\mathbb R} \times M = {\mathbb R} \times S^1 \times X is of the form

J(t,\theta,x) = i \oplus \hat{J}_\theta(x),

where i denotes the standard complex structure i \partial_t = \partial_\theta on {\mathbb R} \times S^1 and \{\hat{J}_\theta\}_{\theta \in S^1} is a smooth S^1-family of compatible almost complex structures on (X,\Omega). A map

u = (f,g,v) : (\dot{\Sigma},j) \to ({\mathbb R} \times S^1 \times X,J)

is then J-holomorphic if and only if \varphi := (f,g) : (\dot{\Sigma},j) \to ({\mathbb R} \times S^1,i) is holomorphic and v : (\dot{\Sigma},j) \to (X,\hat{J}^g) is pseudoholomorphic for the domain-dependent almost complex structure \hat{J}^g on X defined by

\hat{J}^g(z,x) := \hat{J}_{g(z)}(x)

for (z,x) \in \dot{\Sigma} \times X. Notice that such a map will be somewhere injective whenever v : \dot{\Sigma} \to X is somewhere injective, even if \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 is a multiple cover.
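Written out, the splitting of the holomorphic curve equation looks as follows. This is a sketch using the convention that u is J-holomorphic when Tu + J \circ Tu \circ j = 0.

```latex
% With u = (\varphi, v), we have Tu = (T\varphi, Tv) and J(u(z)) = i \oplus \hat{J}_{g(z)}(v(z)),
% so the nonlinear Cauchy-Riemann equation decouples into two components:
\begin{align*}
 Tu + J \circ Tu \circ j = 0
 \quad \Longleftrightarrow \quad
 \begin{cases}
   T\varphi + i \circ T\varphi \circ j = 0, \\
   Tv + \hat{J}^g(\cdot,v) \circ Tv \circ j = 0.
 \end{cases}
\end{align*}
% The first equation does not involve v at all; the second depends on \varphi = (f,g)
% only through the S^1-valued component g, which selects the almost complex structure at each z.
```

In particular, translating \varphi in the {\mathbb R}-direction does not affect the equation for v, while shifting it in the S^1-direction amounts to rotating the parameter in the family \{\hat{J}_\theta\}_{\theta \in S^1}.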

Let us fix the domain (\dot{\Sigma} = \Sigma \setminus \Gamma,j), assuming \Sigma has genus g and that (\Sigma,j,\Gamma) has no automorphisms. Then if u = (f,g,v) : (\dot{\Sigma},j) \to {\mathbb R} \times S^1 \times X is J-holomorphic and v is somewhere injective, let {\mathcal M}_u denote a small neighborhood of u in the space of parametrized J-holomorphic maps (\dot{\Sigma},j) \to ({\mathbb R} \times S^1 \times X,J); without loss of generality we may assume

{\mathcal M}_u = {\mathcal M}_\varphi \times {\mathcal M}_v,

where {\mathcal M}_\varphi and {\mathcal M}_v similarly denote small neighborhoods of \varphi = (f,g) and v in their respective moduli spaces. Each of these spaces has virtual dimension equal to the Fredholm index of the relevant linearized Cauchy-Riemann operator on the pulled back tangent bundle, that is,

\text{vir-dim } {\mathcal M}_u = \text{ind } \mathbf{D}_u = \text{ind } \mathbf{D}_\varphi + \text{ind } \mathbf{D}_v.

Since v is somewhere injective, standard transversality arguments imply that for generic choices of the family \{\hat{J}_\theta\}_{\theta \in S^1} and hence for generic J \in {\mathcal J}(\lambda,\omega), {\mathcal M}_v will be a smooth manifold of dimension \text{ind } \mathbf{D}_v. On the other hand, {\mathcal M}_\varphi is always a manifold, but its dimension will usually be larger than \text{ind } \mathbf{D}_\varphi, implying \dim {\mathcal M}_u > \text{vir-dim } {\mathcal M}_u. To be precise, I claim that

\dim {\mathcal M}_\varphi = 2     but     \text{ind } \mathbf{D}_\varphi = 2 - 2g.

It’s easy to see where the two dimensions of {\mathcal M}_\varphi come from: one can compose \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 with two dimensions of holomorphic translations on {\mathbb R} \times S^1. The rest of the claim can be proven using the punctured version of the Riemann-Roch formula and the fact that \varphi^*T({\mathbb R} \times S^1) \to \dot{\Sigma} is canonically a trivial bundle; in the interest of brevity, I will leave the details as an exercise for the reader.
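Taking the claim for granted, the resulting discrepancy between actual and virtual dimension is a one-line count. In the sketch below, g denotes the genus of \Sigma (not the S^1-component of u), and J is assumed generic so that \dim {\mathcal M}_v = \text{ind } \mathbf{D}_v.

```latex
% Dimension count, assuming the claim \dim{\mathcal M}_\varphi = 2 and
% \text{ind } \mathbf{D}_\varphi = 2 - 2g, with g the genus of \Sigma.
\begin{align*}
 \dim {\mathcal M}_u
   &= \dim {\mathcal M}_\varphi + \dim {\mathcal M}_v
    = 2 + \text{ind } \mathbf{D}_v, \\
 \text{vir-dim } {\mathcal M}_u
   &= \text{ind } \mathbf{D}_\varphi + \text{ind } \mathbf{D}_v
    = (2 - 2g) + \text{ind } \mathbf{D}_v, \\
 \dim {\mathcal M}_u - \text{vir-dim } {\mathcal M}_u &= 2g.
\end{align*}
```

So the excess dimension is exactly 2g, which is positive as soon as the genus is at least one.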

This example shows that unless (\dot{\Sigma},j) has genus zero, one can never choose J \in {\mathcal J}(\lambda,\omega) generically enough for {\mathcal M}_u to be cut out transversely. It’s worth noting however that we wouldn’t have had this problem if {\mathcal M}_u had been defined without fixing the complex structure on the domain. If j is allowed to vary, then the moduli space of unparametrized holomorphic curves (\dot{\Sigma},j) \to ({\mathbb R} \times S^1,i) is always smooth and cut out transversely (cf. Example 3.16 in my automatic transversality paper). And indeed, the usual generic transversality theorem is true for these curves — as long as \varphi : \dot{\Sigma} \to {\mathbb R} \times S^1 is not constant, u = (\varphi,v) satisfies the hypothesis of being not everywhere tangent to \xi, so the proof in my earlier post applies, but says nothing about the forgetful map.

As a general rule, most of the analytical phenomena needed to make SFT work as advertised (compactness, asymptotic formulas and generic transversality, for example) hold just as well for arbitrary stable Hamiltonian structures as for contact structures, but there are occasional exceptions, and the issue described above with the forgetful map is one of them. An even simpler issue is that for general choices of (\lambda,\omega) and J \in {\mathcal J}(\lambda,\omega), nontrivial curves u : (\dot{\Sigma},j) \to ({\mathbb R} \times M,J) need not always have a positive puncture, though a maximum principle implies that they do when \xi is contact, and this fact is crucial for the basic algebraic structure of SFT. So SFT was never meant to be valid for completely arbitrary stable Hamiltonian structures. Nonetheless, non-contact examples can be useful tools in a variety of problems within the SFT context, particularly for computations (this has been the case in a few papers of mine, where e.g. certain stable Hamiltonian structures appear as limits of degenerating families of contact structures). It is therefore valuable to see how far the analysis can be pushed, and where it stops working.

So much for the bad news. I’ll discuss the good news in the next post.


The similarity principle without Calderón-Zygmund

In my L^2 vs. L^p post a few weeks ago, I sketched a more or less standard proof of the similarity principle, and then wrote:

I defy the reader to come up with any alternative version of the above proof that does not use properties of the operator \bar{\partial} : W^{1,p}({\mathbb D}) \to L^p({\mathbb D}) for some p > 2.

Two readers responded to this challenge: they were Jean-Claude Sikorav and Patrick Massot, and in this post I’m going to explain (as I did last week on the topic of regularity and bubbling) my reinterpretation of the proof that they sent me. It should be said that after I’d managed to understand this proof, I still felt rather surprised that it works, and while I can’t speak for anyone else, it strikes me as something that I would never have come up with if I had not first seen the standard proof in L^p.

The problem

Recall the statement: the most useful version of the similarity principle can be viewed as saying that if \mathbf{D} is a real-linear Cauchy-Riemann type operator on a smooth complex vector bundle E over a Riemann surface \Sigma, and \eta \in \Gamma(E) satisfies \mathbf{D}\eta \equiv 0 and \eta(z_0) = 0, then on some neighborhood of z_0 in \Sigma , E admits a continuous trivialization that identifies \eta with a holomorphic function. This is useful because it implies a unique continuation result: either \eta \equiv 0 or it has an isolated zero at z_0 (which is also of positive order if E is a line bundle).

As I outlined in “L^p or not L^p, that is the question”, the similarity principle is a corollary of the following local existence result for solutions of linear Cauchy-Riemann type equations. Let’s fix the usual notation: {\mathbb D}, {\mathbb D}_r \subset {\mathbb C} will denote the open disks of radius 1 and r respectively, write the standard coordinate on {\mathbb C} as z = s + it, and the standard Cauchy-Riemann operator as \bar{\partial} := \partial_s + i \partial_t.

Lemma 1. Suppose n \in {\mathbb N}, p > 2, and A : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is a function of class L^p. Then for any \epsilon > 0 sufficiently small, there exists a linear map {\mathbb C}^n \to C^0({\mathbb D}_\epsilon,{\mathbb C}^n) associating to each v \in {\mathbb C}^n a function u : {\mathbb D}_\epsilon \to {\mathbb C}^n that satisfies (\bar{\partial} + A) u = 0 in the sense of distributions and u(0) = v.

One remark before we get into the proof. The regularity assumption on the zeroth order term A may strike you as absurdly weak — normally geometers are only interested in smooth Cauchy-Riemann type operators. Recall however that the first step in proving the similarity principle is to replace a real-linear Cauchy-Riemann operator with one that is complex linear but still annihilates the given section \eta: this can always be done by changing the zeroth order term, but since we do not know a priori what the zero set of \eta looks like, the price we pay is that A can at best be assumed to be of class L^\infty after this change. The above statement weakens the hypothesis to L^p for p > 2 just because that will turn out to be what we need in the proof. The similarity principle then follows because we can use the local existence result to construct each column of a continuous matrix-valued function \Phi on {\mathbb D}_\epsilon that satisfies \mathbf{D}\Phi = 0 and \Phi(0) = I, and since \mathbf{D} is now complex linear, the Leibniz rule implies that if \mathbf{D}\eta = 0 and \eta = \Phi f, then \bar{\partial} f = 0.
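The Leibniz rule step at the end of this remark can be written out as follows. This is only a formal sketch under the assumptions just stated: \mathbf{D} = \bar{\partial} + A with A complex linear, \mathbf{D}\Phi = 0, \Phi(0) = I, and f := \Phi^{-1}\eta on a small disk where \Phi is invertible. With A merely of class L^\infty, each line should be read in the sense of distributions.

```latex
% Formal computation, assuming \mathbf{D} = \bar{\partial} + A with A complex linear,
% \mathbf{D}\Phi = 0 and \eta = \Phi f.
% Complex linearity of A is what allows A(\Phi f) = (A\Phi) f for complex-valued f.
\begin{align*}
0 = \mathbf{D}\eta = (\bar{\partial} + A)(\Phi f)
  &= (\bar{\partial}\Phi) f + \Phi \, \bar{\partial} f + (A\Phi) f \\
  &= (\mathbf{D}\Phi) f + \Phi \, \bar{\partial} f \\
  &= \Phi \, \bar{\partial} f.
\end{align*}
% \Phi is continuous with \Phi(0) = I, hence invertible near 0,
% so \bar{\partial} f = 0 on a neighborhood of 0.
```

The computation also shows why the first step of replacing \mathbf{D} by a complex-linear operator matters: for a merely real-linear A, the term A(\Phi f) cannot be rewritten as (A\Phi) f.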

There’s a fairly straightforward way to prove Lemma 1 if you’re willing to accept the fact — essentially equivalent to the Calderón-Zygmund inequality — that

\bar{\partial} : W^{1,p}({\mathbb D}) \to L^p({\mathbb D})

has a bounded right inverse for p > 2. The idea is to look for solutions u \in W^{1,p}({\mathbb D}) to the equation (\bar{\partial} + A_\epsilon) u = 0, where A_\epsilon : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is defined to match A on {\mathbb D}_\epsilon and to vanish everywhere else. By the Sobolev embedding theorem, we have

\| A_\epsilon u \|_{L^p} \le \| A_\epsilon \|_{L^p} \| u \|_{C^0} \le c \| A \|_{L^p({\mathbb D}_\epsilon)} \| u \|_{W^{1,p}},

hence \bar{\partial} + A_\epsilon \to \bar{\partial} + A_0 := \bar{\partial} in the space of bounded linear operators W^{1,p} \to L^p as \epsilon \to 0. Now consider the continuous family of bounded linear maps

\Psi_\epsilon : W^{1,p}({\mathbb D}) \to L^p({\mathbb D}) \times {\mathbb C}^n : u \mapsto ((\bar{\partial} + A_\epsilon) u , u(0))

for \epsilon \in [0,1], and notice that since constant functions are holomorphic, one can use the bounded right inverse of \bar{\partial} to construct a bounded right inverse for \Psi_0. The existence of bounded right inverses is an open condition, so it follows that \Psi_\epsilon also admits such an inverse for all \epsilon > 0 sufficiently small; call it T_\epsilon : L^p({\mathbb D}) \times {\mathbb C}^n \to W^{1,p}({\mathbb D}). After embedding W^{1,p} into C^0, the linear map {\mathbb C}^n \to C^0({\mathbb D}_\epsilon) promised by Lemma 1 can now be written as

v \mapsto T_\epsilon(0,v)|_{{\mathbb D}_\epsilon},

and this completes the proof.
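For readers who want the two functional-analytic steps spelled out, here is one possible way to do it. It is a sketch rather than the only route: T denotes a bounded right inverse of \bar{\partial} : W^{1,p} \to L^p, and R_0 is a name introduced here purely for illustration.

```latex
% Sketch. T is a bounded right inverse of \bar{\partial} : W^{1,p} \to L^p with p > 2;
% R_0 (a name introduced here) is the resulting right inverse of \Psi_0.
\begin{align*}
R_0(g,v) &:= Tg + \bigl( v - (Tg)(0) \bigr), \\
\Psi_0 R_0(g,v) &= \bigl( \bar{\partial}(Tg) , \, (Tg)(0) + v - (Tg)(0) \bigr) = (g,v).
\end{align*}
% The constant v - (Tg)(0) is holomorphic, so it does not change \bar{\partial}(Tg) = g.
% R_0 is bounded since |(Tg)(0)| \le \| Tg \|_{C^0} \le c \| Tg \|_{W^{1,p}}.
%
% Openness: \Psi_\epsilon - \Psi_0 = (A_\epsilon \,\cdot\, , 0) is small in operator norm, so
\begin{align*}
\Psi_\epsilon R_0 &= 1 + (\Psi_\epsilon - \Psi_0) R_0, \\
(\Psi_\epsilon R_0)^{-1} &= \sum_{k=0}^\infty \bigl( - (\Psi_\epsilon - \Psi_0) R_0 \bigr)^k
  \quad \text{whenever } \| \Psi_\epsilon - \Psi_0 \| \cdot \| R_0 \| < 1, \\
T_\epsilon &:= R_0 (\Psi_\epsilon R_0)^{-1}
  \quad \Longrightarrow \quad \Psi_\epsilon T_\epsilon = 1.
\end{align*}
```

In particular u := T_\epsilon(0,v) satisfies (\bar{\partial} + A_\epsilon) u = 0 and u(0) = v, and it depends linearly on v.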

We used the assumption p > 2 quite a few times in the above argument: without it, the map \Psi_\epsilon would not even be well defined, since functions of class W^{1,p} need not be continuous for p \le 2 and the evaluation u \mapsto u(0) therefore does not define a continuous map W^{1,p} \to {\mathbb C}^n, and even if we could obtain solutions of class W^{1,p} in the end, they might not be in C^0. For these reasons, I previously could not imagine how it might be possible to prove such a local existence result without relying on the elliptic estimates for \bar{\partial} : W^{1,p} \to L^p with p > 2.

But it is possible.

The solution

Here’s a slightly different kind of local existence result.

Lemma 2. Suppose n \in {\mathbb N}, p > 2, A : {\mathbb D} \to \text{End}_{\mathbb R}({\mathbb C}^n) is a function of class L^p, and f_0 : {\mathbb D}_r \to {\mathbb C}^n is a holomorphic function on a disk of some radius r \le 1. Then for any number \delta > 0, there exists a number \epsilon \in (0,r] and a continuous function f : {\mathbb D}_\epsilon \to {\mathbb C}^n such that \| f \|_{C^0} \le \delta and (\bar{\partial} + A)(f_0 + f) = 0 in the sense of distributions.

In this lemma we’ve dropped the requirement that our solution take a prescribed value at the origin, instead just asking for it to be C^0-close to some prescribed function. Nonetheless, it’s not too hard to see that Lemma 2 implies Lemma 1: one can use Lemma 2 to construct the columns of a continuous matrix-valued function \Phi : {\mathbb D}_\epsilon \to \text{End}({\mathbb R}^{2n}) that satisfies (\bar{\partial} + A)\Phi = 0 and is C^0-close to the identity, hence everywhere invertible. Solutions with prescribed values at 0 can then be constructed in the form \Phi f where f is constant.
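The reduction from Lemma 1 to Lemma 2 can be sketched in a few lines. The notation e_k, f_k, \epsilon_k and F below is introduced here for illustration, and the threshold \delta < 1/(2n) is one convenient choice rather than an optimal one.

```latex
% Sketch. Let e_1, \ldots, e_{2n} be the standard real basis of \mathbb{C}^n = \mathbb{R}^{2n}.
% Apply Lemma 2 to each constant (hence holomorphic) function f_0 \equiv e_k with r = 1,
% obtaining \epsilon_k > 0 and f_k with \| f_k \|_{C^0} \le \delta. Set \epsilon := \min_k \epsilon_k.
\begin{align*}
\Phi(z) e_k &:= e_k + f_k(z), \qquad \Phi = 1 + F, \\
| F(z) c | &= \Bigl| \sum_{k=1}^{2n} c_k f_k(z) \Bigr| \le \delta \sum_{k=1}^{2n} | c_k | \le 2n \delta \, | c |
  \quad \text{for } c = \sum_k c_k e_k \in \mathbb{R}^{2n}.
\end{align*}
% If \delta < 1/(2n), then \| F(z) \| < 1 and \Phi(z) is invertible for every z \in \mathbb{D}_\epsilon.
% For v \in \mathbb{C}^n, put c := \Phi(0)^{-1} v and u := \Phi c. Since the c_k are real
% constants and \bar{\partial} + A is real linear,
\begin{align*}
(\bar{\partial} + A) u &= \sum_{k=1}^{2n} c_k \, (\bar{\partial} + A)(e_k + f_k) = 0,
  \qquad u(0) = \Phi(0) \Phi(0)^{-1} v = v.
\end{align*}
```

This is also why \Phi is taken with values in real endomorphisms of {\mathbb R}^{2n}: the operator \bar{\partial} + A is only real linear, so one needs 2n real columns rather than n complex ones.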

So how can we prove Lemma 2 using only L^2 estimates? We shall again look for continuous functions f : {\mathbb D} \to {\mathbb C}^n satisfying (\bar{\partial} + A_\epsilon) (f_0 + f) = 0. Notice that it is still true that

\lim_{\epsilon \to 0} (\bar{\partial} + A_\epsilon) = \bar{\partial}

in the space of bounded linear operators W^{1,2} \to L^2, though for different reasons than in the p > 2 case: W^{1,2}({\mathbb D}) is a Sobolev borderline case and embeds continuously into L^q({\mathbb D}) for every q \ge 1, so picking q > 1 such that 1/q + 2/p = 1 and using Hölder’s inequality, we have

\| A_\epsilon u \|_{L^2}^2 \le \int_{{\mathbb D}_\epsilon} | A |^2 |u|^2 \le \| |A|^2 \|_{L^{p/2}({\mathbb D}_\epsilon)} \cdot \| |u|^2 \|_{L^q({\mathbb D}_\epsilon)} = \| A \|_{L^p({\mathbb D}_\epsilon)}^2 \| u \|_{L^{2q}}^2

\le c \| A \|_{L^p({\mathbb D}_\epsilon)}^2 \| u \|_{W^{1,2}}^2.
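To make the exponents concrete and to record why the right-hand side actually goes to zero, here is a short bookkeeping sketch. The sample value p = 4 is chosen only for illustration.

```latex
% Exponent bookkeeping for the Hölder step.
\begin{align*}
\frac{1}{q} + \frac{2}{p} = 1 \iff q = \frac{p}{p-2},
  \qquad 2q = \frac{2p}{p-2} < \infty \quad \text{for every } p > 2.
\end{align*}
% Sample value: p = 4 gives q = 2 and 2q = 4, so the estimate uses W^{1,2} \hookrightarrow L^4.
%
% Why the operator norm tends to zero: |A|^p is integrable on \mathbb{D}, so by
% dominated convergence
\begin{align*}
\| A \|_{L^p(\mathbb{D}_\epsilon)}^p = \int_{\mathbb{D}_\epsilon} |A|^p \, d\mu \longrightarrow 0
  \quad \text{as } \epsilon \to 0,
  \qquad \text{hence} \qquad
  \| A_\epsilon \|_{W^{1,2} \to L^2} \le \sqrt{c} \, \| A \|_{L^p(\mathbb{D}_\epsilon)} \longrightarrow 0.
\end{align*}
```

Note that 2q blows up as p \to 2, which is consistent with the fact that W^{1,2}({\mathbb D}) does not embed into L^\infty({\mathbb D}).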

It follows that since \bar{\partial} : W^{1,2}({\mathbb D}) \to L^2({\mathbb D}) has a bounded right inverse, so does \bar{\partial} + A_\epsilon for \epsilon > 0 sufficiently small: denote this right inverse by

T_\epsilon : L^2({\mathbb D}) \to W^{1,2}({\mathbb D}).

It should now at least seem plausible that any solution f_0 \in W^{1,2}({\mathbb D}) to the equation \bar{\partial} f_0 = 0 admits a W^{1,2}-close perturbation f_0 + f satisfying (\bar{\partial} + A_\epsilon) (f_0 + f) = 0: indeed, the latter is equivalent to the equation

(\bar{\partial} + A_\epsilon) f = - A_\epsilon f_0,

so an obvious solution presents itself in the form

f := - T_\epsilon (A_\epsilon f_0).

Since A_\epsilon f_0 is L^2-small for small \epsilon and T_\epsilon is close to T_0 (the right inverse of \bar{\partial}) in the operator norm, our solution f is evidently W^{1,2}-small. This is nice, of course, but it’s not good enough. We also need f to be continuous, and C^0-small. Is it?

A nice little fact about the right inverse of \bar{\partial}

As it turns out, yes: the solution we just found is continuous and C^0-small. This is easy to see if you don’t mind using Calderón-Zygmund, because A_\epsilon f_0 is also small in L^p and the right inverse of \bar{\partial} restricts to L^p \subset L^2 as a continuous operator L^p \to W^{1,p}, which is then continuous into C^0 by the Sobolev embedding theorem. But actually, this C^0-bound on f admits a much more direct proof that is orders of magnitude easier than either Calderón-Zygmund or the Sobolev embedding theorem.

Proposition. The standard Cauchy-Riemann operator \bar{\partial} : W^{1,2}({\mathbb D}) \to L^2({\mathbb D}) admits a bounded right inverse T : L^2({\mathbb D}) \to W^{1,2}({\mathbb D}) such that for each p \in (2,\infty), T restricts to L^p({\mathbb D}) \subset L^2({\mathbb D}) as a bounded linear operator L^p({\mathbb D}) \to C^0({\mathbb D}).

In fact, with a little bit more effort one can prove that T maps L^p continuously to the Hölder space C^{0,1-2/p}, but the C^0-bound will be plenty sufficient for our purposes. To see why it is true, let us quickly recall how T : L^2 \to W^{1,2} is constructed (cf. Section 2.6 in the current version of my book in progress on holomorphic curves). The Cauchy-Riemann operator has a fundamental solution K \in L^1_{\text{loc}}({\mathbb C}), defined by

K(z) = \frac{1}{2\pi z}.

Being a fundamental solution means that K satisfies \bar{\partial} K = \delta in the sense of distributions, so the equation \bar{\partial} u = f can be solved for sufficiently nice functions f by writing u as the convolution

u(z) = K * f(z) = \int_{\mathbb C} K(z - \zeta) f(\zeta) \, d\mu(\zeta),

where d \mu(\zeta) denotes the Lebesgue measure for functions of \zeta \in {\mathbb C}. This is well defined in particular whenever f is smooth with compact support in {\mathbb D}, and in this case one can prove a straightforward variation on Young’s inequality to bound \| K * f \|_{L^p({\mathbb D})} in terms of \| f \|_{L^p} for any p \ge 1, so f \mapsto K * f extends to a bounded linear map L^p({\mathbb D}) \to L^p({\mathbb D}). Since \bar{\partial}(K*f) = f, one obtains a W^{1,p}-bound on K * f if one can also bound \| \partial (K*f) \|_{L^p({\mathbb D})} in terms of \| f \|_{L^p} for all f \in C_0^\infty({\mathbb D}). This is always possible if 1 < p < \infty, and in the p \ne 2 case this is the essence of the Calderón-Zygmund inequality. But for p = 2 there is an easy proof by Fourier transforms: observe that the Fourier transform of the relation \bar{\partial} K = \delta gives

2\pi i \zeta \widehat{K}(\zeta) = 1,

where \widehat{K}(\zeta) denotes the Fourier transform of K(z) in the sense of tempered distributions. Similarly, taking the Fourier transform of the equation u = K * f gives \hat{u} = \widehat{K} \hat{f}, hence 2\pi i \zeta \hat{u}(\zeta) = \hat{f}(\zeta), and we can now use Plancherel’s theorem to compute

\| \partial u \|_{L^2}^2 = \| \widehat{\partial u} \|_{L^2}^2 = \| 2\pi i \bar{\zeta} \hat{u} \|_{L^2}^2 = \int_{\mathbb C} \left| \frac{\bar{\zeta}}{\zeta} 2\pi i \zeta \hat{u}(\zeta) \right|^2 \, d\mu(\zeta) = \int_{\mathbb C} | \hat{f}(\zeta) |^2 \, d\mu(\zeta)

= \| f \|_{L^2}^2.

This proves that the map f \mapsto K*f extends to a bounded linear operator L^2({\mathbb D}) \to W^{1,2}({\mathbb D}), and we define T to be this extension. With this explicit formula in hand, the proof of the proposition is very quick: notice in particular that K \in L^q_{\text{loc}}({\mathbb C}) for every q \in [1,2), so if p > 2 and 1 / q + 1 / p = 1, then for every f \in C_0^\infty({\mathbb D}) and every z \in {\mathbb D},

| Tf(z) | = \left| \int_{\mathbb C} K(z-\zeta) f(\zeta) \, d\mu(\zeta) \right| \le \int_{\mathbb D} | K(z - \cdot) f | \le \| K(z - \cdot) \|_{L^q({\mathbb D})} \| f \|_{L^p({\mathbb D})}

\le C \| f \|_{L^p},

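where C > 0 is the supremum of all L^q-norms of K restricted to disks of unit radius in {\mathbb C}. Since the convolution maps smooth functions to smooth functions, this bounds \| Tf \|_{C^0} by C \| f \|_{L^p} for all f \in C_0^\infty({\mathbb D}). As C_0^\infty({\mathbb D}) is dense in L^p({\mathbb D}) and uniform limits of continuous functions are continuous, the bound extends to all f \in L^p({\mathbb D}), and this proves the proposition.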
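The constant C can be computed explicitly in polar coordinates. The following is a sketch of that calculation, using the fact that |K|^q is radially decreasing, so that among all disks of unit radius the one centered at the origin carries the largest integral.

```latex
% Explicit value of the constant, with 1/q + 1/p = 1, i.e. q = p/(p-1) \in (1,2).
\begin{align*}
\int_{\mathbb{D}} | K |^q \, d\mu
  &= \int_0^{2\pi} \int_0^1 \frac{1}{(2\pi r)^q} \, r \, dr \, d\theta
   = (2\pi)^{1-q} \int_0^1 r^{1-q} \, dr
   = \frac{(2\pi)^{1-q}}{2 - q}, \\
C &= \left( \frac{(2\pi)^{1-q}}{2-q} \right)^{1/q},
  \qquad 2 - q = \frac{p-2}{p-1}.
\end{align*}
% The integral of r^{1-q} converges at r = 0 exactly because q < 2, i.e. because p > 2.
% As p \to 2 we have q \to 2 and C \to \infty, so the bound degenerates at the borderline.
```

This makes visible where the hypothesis p > 2 enters: it is precisely the condition for K to lie in the dual space L^q_{\text{loc}}.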

There remains just one niggling detail: we’ve shown that T_0 := T maps L^p to C^0, but in our proof of Lemma 2, we need to know that this is also true for T_\epsilon, the right inverse of the perturbed operator \bar{\partial} + A_\epsilon. To see this, it will help to give a slightly more precise definition of T_\epsilon. Notice that

(\bar{\partial} + A_\epsilon) T = 1 + A_\epsilon T

is a bounded linear operator on L^2 and is close to the identity in the operator norm since T : L^2 \to W^{1,2} is continuous and A_\epsilon : W^{1,2} \to L^2 is small. But for slightly different reasons, this operator is also close to the identity in the space of bounded linear operators on L^p: indeed, we showed above that T : L^p \to C^0 is continuous, and A_\epsilon : C^0 \to L^p is also small since

\| A_\epsilon u \|_{L^p} \le \| A_\epsilon \|_{L^p} \| u \|_{C^0} = \| A \|_{L^p({\mathbb D}_\epsilon)} \| u \|_{C^0}.

Thus if we choose \epsilon > 0 small enough, 1 + A_\epsilon T defines an isomorphism on both L^2 and L^p, so defining

T_\epsilon := T (1 + A_\epsilon T)^{-1}

gives a right inverse of \bar{\partial} + A_\epsilon that is continuous both from L^2 to W^{1,2} and from L^p to C^0. The proof of Lemma 2 is now complete, and though we appealed to Calderón-Zygmund once or twice for intuition, we never actually used it.
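For completeness, here is a sketch of the inversion step and of how it yields the C^0-smallness required in Lemma 2. It assumes \epsilon is small enough that A_\epsilon T has operator norm less than 1 on both L^2 and L^p, and that f_0 is bounded on {\mathbb D}_\epsilon (which holds, for instance, once \epsilon < r).

```latex
% Sketch, assuming \| A_\epsilon T \| < 1 as an operator on X, for both X = L^2 and X = L^p.
\begin{align*}
(1 + A_\epsilon T)^{-1} &= \sum_{k=0}^\infty ( - A_\epsilon T )^k
  \quad \text{(convergent in the operator norm on } X \text{)}, \\
(\bar{\partial} + A_\epsilon) T_\epsilon
  &= (1 + A_\epsilon T)(1 + A_\epsilon T)^{-1} = 1, \\
\| T_\epsilon \|_{L^p \to C^0}
  &\le \frac{ \| T \|_{L^p \to C^0} }{ 1 - \| A_\epsilon T \|_{L^p \to L^p} }.
\end{align*}
% The same series defines the inverse on L^2 and on L^p, so the two inverses agree on L^p \subset L^2.
% C^0-smallness of f := - T_\epsilon (A_\epsilon f_0):
\begin{align*}
\| f \|_{C^0}
  \le \| T_\epsilon \|_{L^p \to C^0} \, \| A_\epsilon f_0 \|_{L^p}
  \le \| T_\epsilon \|_{L^p \to C^0} \, \| A \|_{L^p(\mathbb{D}_\epsilon)} \, \| f_0 \|_{C^0(\mathbb{D}_\epsilon)}
  \longrightarrow 0 \quad \text{as } \epsilon \to 0.
\end{align*}
```

Since \| T_\epsilon \|_{L^p \to C^0} stays bounded as \epsilon \to 0, choosing \epsilon small enough gives \| f \|_{C^0} \le \delta, as Lemma 2 requires.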


This will be my last post on the L^p vs. L^2 debate for a while, as I’m sure it’s clear to everyone by now that I’ve been thinking about this far too much lately. The evidence currently available to me suggests that it might very well be possible to develop the entire theory of pseudoholomorphic curves using only L^2 estimates — useful perhaps if you want to feel honest without taking the time to read the proof of Calderón-Zygmund, or if you’re one of those strange people with an aversion to the axiom of choice (it’s needed for the Hahn-Banach theorem, which is needed for regularity and transversality arguments in W^{k,p} for p \ne 2, but you can avoid it if you only work with Hilbert spaces).

But just as proving the L^p estimates requires effort, avoiding them also requires effort, and some of the resulting proofs become arguably less straightforward and less elegant. In the end, it’s a matter of taste.
