74

An official spec for CommonMark with reference implementation was released recently.

Will this spec (with optional additions like MathJax on sites that already have it) be adopted? There are some places where the current implementation deviates from it, for instance example 371, [link](foo(and(bar))), among others.

Or will it break too many posts?

7
  • 3
    is [link](foo(and(bar))) something you find you need to do very often?
    – JonW
    Commented Sep 4, 2014 at 10:27
  • 4
    @JonW not really but it's more about following the common standard than needing the special features. Commented Sep 4, 2014 at 10:32
  • 2
    Whether it gets implemented or not, I think it's better for features to fulfill an actual demand than to add in loads of features that nobody really has any use for. That just makes it more confusing to be able to do the standard tasks (looking at you Microsoft Ribbon).
    – JonW
    Commented Sep 4, 2014 at 10:34
  • 17
    @JonW: yet properly supporting a Markdown spec means that more tools can be developed, which can lead to all sorts of interesting features. Content from SE sites can be re-used by such tools with confidence that no special-casing is required. GitHub and Stack Exchange can pool resources on parsing the stuff, meaning that they can both devote more time on other, more pressing tasks. Markdown is core to Stack Exchange, which is why Ben was part of the group to hash out the spec. Commented Sep 4, 2014 at 10:43
  • 1
    @Stijn The CommonMark spec still hasn't been finalized. It's unlikely that you'd get a full push of something like this over the Q&A product before it is final.
    – animuson StaffMod
    Commented Aug 17, 2016 at 2:20
  • @animuson That's true, and I don't expect to see this implemented in the short term. But the remaining issues that must be resolved don't pose a significant impact in a way that stops any preparation/research to an eventual switch.
    – Stijn
    Commented Aug 17, 2016 at 7:48
  • 2
    The latest announcement on this topic: We're switching to CommonMark
    – V2Blast
    Commented Jun 1, 2020 at 17:41

3 Answers

50

Regarding the question "will it break too many posts?", what Adam Davis says is correct. So when we switch (and at this point I'm fairly certain that it's "when", not "if"), the most important thing to know is that this won't suddenly change old posts.

Here are some thoughts I currently have about the switch. I'm just thinking aloud; we haven't decided anything yet, so take it with a grain of salt.

Impact

For the vast majority of posts, it won't make any visible difference. The point of CommonMark is to break as few existing documents as possible. Yes, there will be posts that (purposefully or not) rely on edge cases that CommonMark handles differently than we currently do, and thus when an edit is made to an old post, some things may have to be manually changed.

That said, when comparing MarkdownSharp/PageDown behavior to CommonMark on a lot of Meta posts, there are three differences that impact a pretty big number of posts.

Two of them are continuation of list items and block quotes.

List items

Currently, this will create a list item with two paragraphs:

1. This is the first paragraph

 And this is the second one.

With CommonMark (and even in a significant number of other Markdown implementations), the "second one" will not be part of the list item, but a stand-alone paragraph after the list. To make it part of the list item, you have to indent it to the same margin as the first paragraph like this:

1. This is the first paragraph

   And this is the second one.
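A minimal sketch of how such under-indented continuation paragraphs could be re-indented automatically. The function name `reindent_continuations` and the pattern are hypothetical, and the sketch assumes any indented line following a list item is meant to belong to it; it ignores code blocks, nested lists and other edge cases.

```python
import re

# Hypothetical pattern: optional indent, a list marker, spaces, then text.
_LIST_ITEM = re.compile(r"^( {0,3})(\d{1,9}[.)]|[-+*])( +)\S")


def reindent_continuations(markdown: str) -> str:
    """Indent under-indented follow-up lines to the list item's text column."""
    out = []
    content_col = None  # column where the current list item's text starts
    for line in markdown.split("\n"):
        m = _LIST_ITEM.match(line)
        if m:
            # Width of leading indent + marker + spaces after the marker.
            content_col = sum(len(g) for g in m.groups())
        elif line.strip():
            indent = len(line) - len(line.lstrip(" "))
            if content_col is not None and 0 < indent < content_col:
                # Indented, but not far enough for CommonMark: fix it up.
                line = " " * content_col + line.lstrip(" ")
            elif indent == 0:
                # An unindented line ends the list item we were tracking.
                content_col = None
        out.append(line)
    return "\n".join(out)
```

Applied to the first example above, this moves "And this is the second one." from one space of indentation to three, matching the second example.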

Block quotes

As for block quotes, at the moment when you write a post on Stack Exchange, this:

> Roses are red

> Violets are blue

will create a single blockquote with two paragraphs. CommonMark (and again, various other implementations) turns it into two separate blockquotes. To keep the paragraphs connected, the empty line needs a quote character as well:

> Roses are red
> 
> Violets are blue
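As a rough sketch, the blank line between two quoted lines could be filled in like this. `join_blockquotes` is a hypothetical helper, not anything from MarkdownSharp or PageDown, and it only handles a single blank line between two lines that start with a quote character.

```python
import re

# A blockquote line: up to three spaces of indent, then '>'.
_QUOTE = re.compile(r"^ {0,3}>")


def join_blockquotes(markdown: str) -> str:
    """Replace a blank line between two quoted lines with a bare '>'."""
    lines = markdown.split("\n")
    out = []
    for i, line in enumerate(lines):
        between_quotes = (
            line.strip() == ""
            and 0 < i < len(lines) - 1
            and _QUOTE.match(lines[i - 1]) is not None
            and _QUOTE.match(lines[i + 1]) is not None
        )
        out.append(">" if between_quotes else line)
    return "\n".join(out)
```

Since the current renderer already merges such quotes, this keeps the rendering the author saw in the preview at the time of writing.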

Headings

The third difference that affects a fair number of posts is with ATX headers like these:

# Introduction

## Impact ##

###Further Research

The third one will no longer work in CommonMark, because unlike what our current Markdown version does, CommonMark requires the space between the # characters and the text.
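A minimal sketch of a fix for the missing space, assuming a plain regular expression is good enough. The name `add_atx_space` is hypothetical, and the sketch does not try to skip code blocks or lines that merely start with a hashtag.

```python
import re

# One to six '#' at the start of a line, directly followed by a character
# that is neither another '#' nor whitespace.
_ATX_NO_SPACE = re.compile(r"^(#{1,6})(?=[^#\s])", re.MULTILINE)


def add_atx_space(markdown: str) -> str:
    """Insert the space CommonMark requires after the '#' run of a heading."""
    return _ATX_NO_SPACE.sub(r"\1 ", markdown)
```

With this, `###Further Research` becomes `### Further Research`, while `# Introduction` and `## Impact ##` are left alone.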

Note that all three are related to typing as few characters as possible to achieve the desired formatting, which is probably why they are so common – why would you continue adding characters once you see in the preview that it looks as you want it to look?

Detection

My current thought is that we should, when someone edits a post that was created or last edited under the MarkdownSharp regime, check for these three issues and, if any of them is present in the post, offer to auto-correct them.

It is unfeasible to create a complete old-to-new converter and run it over all the posts, but at the time of editing (when it matters) I think those three major cases should be handled semi-automatically.

Diffs

This would be nice, but I haven't completely thought through all the implications or figured out solutions for all the issues: I'm considering keeping MarkdownSharp running on the server for revision diffs (and suggested edits). So if you have a post that was created in 2014 but edited after the CommonMark switch, then for the purpose of showing the revision diff, we could render the old version with MarkdownSharp and the new version with CommonMark.

For some background info: Only the current version of a post is stored as a rendered version (see Adam Davis' point again); for older revisions we only store the Markdown source. The cached rendered version is what's displayed when you look at a question page, but once you click "edited by…" to go to the revision list, the diffs are created by rendering the revisions' Markdown sources on the fly and comparing the resulting HTML.

If you used CommonMark to render both the old and the new version, you would hide any major changes that were caused by the new Markdown engine and that should be handled.
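To illustrate the idea, here is a sketch that picks the rendering engine per revision. Everything in it is an assumption made for illustration: the `Revision` type, the `switch_date` parameter and the two renderer callables stand in for whatever the real system would use.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Tuple

Renderer = Callable[[str], str]  # Markdown source in, HTML out


@dataclass
class Revision:
    source: str        # Markdown source of this revision
    created: datetime  # when the revision was saved


def render_pair_for_diff(
    old: Revision,
    new: Revision,
    switch_date: datetime,
    legacy: Renderer,
    commonmark: Renderer,
) -> Tuple[str, str]:
    """Render each revision with the engine that was live when it was saved."""

    def render(rev: Revision) -> str:
        engine = legacy if rev.created < switch_date else commonmark
        return engine(rev.source)

    return render(old), render(new)
```

The two HTML strings can then be compared as they are today, so a change that comes only from the engine switch still shows up in the diff.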

For our incremental improvements to MarkdownSharp and PageDown, we have been living with that (we certainly don't want to keep each and every version around that has ever been used to render a post), but for the huge break that is the switch to CommonMark, it may make sense. (To be clear, there will be no option on question/answer submission to "render with the legacy Markdown version"; we would keep MarkdownSharp around only for diffs against old revisions).

Disclaimers

As I said above, nothing is set in stone yet. Everything above is just me thinking aloud. Also keep in mind that CommonMark is still evolving, so details in the spec may still change.

13
  • It would be interesting to survey maybe the top 1% (in terms of views) posts and find out how many actually would have a problem with the three changes mentioned. My guess is that the impact would still be very small, and while the corrective actions would be nice, they probably aren't necessary. Editors will understand the markdown diff won't look right past a certain date, and regular users won't care - they aren't editing. A simple check: if editing and the last edit is older than the markdown change date then render both old and new. If different results, then post note
    – Pollyanna
    Commented Jun 12, 2015 at 13:23
  • Where note would be, "In the transition to CommonMark, this post may require formatting updates. During your edit, please review the entire post for formatting issues." - I don't think you'll necessarily need to keep both around, render both for every post or edit when reviewing, etc. Sure, it would be nice, but it sounds like a lot of effort. See what the impact is - if it's not terribly large, make the switch, add note where there's a difference, and let the users complain if it does, in fact, make a bigger difference than thought - at that point you can implement actual needs.
    – Pollyanna
    Commented Jun 12, 2015 at 13:26
  • Oh, and you don't have to keep the old renderer around - just run the new renderer on the old markdown and compare to the stored HTML.
    – Pollyanna
    Commented Jun 12, 2015 at 13:27
  • It's not the old posts but old posters' habits that have to be changed. The ATX headers change (space after #) was the first behavioral barrier for me when I started using CommonMark-based renderers. A bit of unlearning is in order, I'm afraid. Commented Jun 12, 2015 at 16:47
  • 5
    Oh, and CommonMark tables would be awesome. :) Commented Jun 12, 2015 at 16:48
  • @AdamDavis that requires keeping the old rendered HTML for all posts (over a few million) currently rendered; it's cheaper storage-wise to just keep the old renderer around. Commented Jun 12, 2015 at 17:08
  • 1
    @ratchetfreak No it doesn't. Once it's edited, then you no longer need to display the note.
    – Pollyanna
    Commented Jun 12, 2015 at 17:15
  • 2
    @DeerHunter FYI, I'm quoting you here.
    – balpha StaffMod
    Commented Jun 15, 2015 at 13:54
  • 2
    @balpha Is this still being explored?
    – Stijn
    Commented Aug 10, 2016 at 22:30
  • 1
    @balpha I too want to know what's going on with this. Markdown support was asked about 2 years ago, has 85 upvotes and has zero response from SE team that I can find. SE should put up a blog post about this.
    – jcollum
    Commented Dec 4, 2016 at 19:45
  • @balpha: So, where exactly is SE on this? Is the problem in CommonMark being standardized or is the problem with implementing it? Commented Jun 9, 2017 at 16:41
  • 3
    @NicolBolas we investigated the feasibility of doing the work in May, and concluded that it's doable. Right now it's in the hands of our Q&A platform team to prioritize and schedule. We want to do it, but we expect some delays due to finite resources and pressing goals, not to mention complicated implementation (do we rebake all posts or use a flag or date to determine which editor? Is that confusing for users? etc.). In short we're hoping for this year, but I can't make any promises yet.
    – Haney Staff
    Commented Jun 9, 2017 at 16:46
  • Do ordered lists still rewrite the numbers? So if you try to make a list of items 4, 5, 8, 9, it rewrites it to 1, 2, 3, 4?
    – endolith
    Commented Jun 6, 2020 at 15:16
35

6 years later: Stack Exchange is switching to CommonMark! Yay


Original answer:

I'd expect so, yes, because Balpha, aka Benjamin Dumke-von der Ehe, an SE employee, is one of the authors of the specification. And commenting on this post he has confirmed there are plans to support the standard:

I hope to eventually switch Stack Exchange over to this Markdown version. That's still a bit in the future though.

You may also notice another Stack Exchange luminary on the list of authors, although he has now moved on to other things. He blogged about CommonMark (originally announced as Standard Markdown) today.

Currently handling [link](foo(and(bar))) is kinda broken, you get link) instead of the expected output in the standard, so it is not like handling that properly in the future will be so terribly bad.

4
  • 25
    You're giving me too much credit; I wouldn't count myself as an author. John MacFarlane pretty much single-handedly wrote the spec. But yes, I was and am involved a bit, and I hope to eventually switch Stack Exchange over to this Markdown version. That's still a bit in the future though.
    – balpha StaffMod
    Commented Sep 4, 2014 at 10:40
  • 3
    @balpha: the spec page gives you that credit. :-) Commented Sep 4, 2014 at 10:44
  • 5
    @balpha: besides, you are listed second in the repository contributors list. Sure, the stats are a little skewed, but still. :-P Commented Sep 4, 2014 at 10:46
  • 2
    Stack Exchange developers are so humble! It's why y'all are so likable! That and you built this great platform that makes us awesome at our jobs!
    – jmort253
    Commented Sep 6, 2014 at 1:23
21

will it break too many posts?

Posts are rendered into HTML when they are submitted and again whenever an edit is submitted. The HTML is then saved in the database and served.

Thus, a change to CommonMark will not result in post breakage except in the case where one is editing an older post with a conflicting syntax. The preview, however, will show the discrepancy, and they should be able to notice it and fix it before saving the edit.

It's possible inattentive editors will miss this, and some posts will become broken. Undoubtedly others will fix them, and if it becomes a major source of problems I expect Stack Exchange will solve it with a little code.

1
  • 8
    True. In addition, CommonMark is highly compatible. For the most part, it just clearly specifies edge cases. The biggest change, impact-wise, is probably the new double-newline rule. Before we switch, I'll make sure to look how big that impact actually is.
    – balpha StaffMod
    Commented Sep 13, 2014 at 6:26
