Many programming languages allow trailing commas in lists.
C, C++, C# (and probably other languages) permit a trailing comma after the last enumerator:
enum Color
{
    Red,
    Blue,
    Green,
//       ^ trailing comma
};
They also allow a trailing comma in list initializers.
// C, C++
Thing a[] = {
    { 1, 2 },
    { 3, 4 },
    { 5, 6 },
//          ^ trailing comma
};

// C#
Thing[] a = new[] {
    new Thing {
        Name = "Bob",
        Id = 31415,
//                ^ trailing comma
    },
    new Thing {
        Name = "Alice",
        Id = 2718,
//               ^ trailing comma
    },
//   ^ trailing comma
};

Dictionary<string, Thing> d = new Dictionary<string, Thing>() {
    ["Bob"] = new Thing("Bob") { Id = 31415 },
    ["Alice"] = new Thing("Alice", 2718),
//                                      ^ trailing comma
};
These trailing commas are convenient when you arrange for each element to appear on its own line, as we did in the examples above. They let you rearrange the items by moving lines around without having to add a comma to an element when it moves out of the final position, or remove a comma from the element that moves into the final position.
It also reduces merge risk when people modify the list. For example, if somebody adds a new color “Black” to the end, they won’t have to touch any of the other lines, which means that a change from “Blue” to “LightBlue” won’t result in a merge conflict.
And even when there is a merge conflict due to two simultaneous adds, you can easily resolve it by accepting both.
enum Color
{
    Red,
    Blue,
    Green,
<<< VERSION 1
    Black,
|||
    White,
<<< VERSION 2
};
To resolve this, you can just delete all the conflict markers.
enum Color
{
    Red,
    Blue,
    Green,
    Black,
    White,
};
If your code didn’t use trailing commas, the merge would be messier:
enum Color
{
    Red,
    Blue,
<<< VERSION 1
    Green,
    Black
|||
    Green,
    White
<<< VERSION 2
};
And if you have a lot of these merges to deal with, you might forget to insert a comma after “Black”:
enum Color
{
    Red,
    Blue,
    Green,
    Black // ⇐ oops, forgot a comma
    White
};
Since the trailing comma reduces the number of lines of code that have to be modified when the list is extended, it also makes git blame more accurate. Without the trailing comma, a git blame on enum Color would blame the person who added “Black” for also being the last person to modify the “Green” line. If you’re investigating a problem with “Green”, you might ask that person for help, and they’ll say, “Oh no, I didn’t add ‘Green’. I added ‘Black’. You’ll have to dig further back into the history to figure out who added ‘Green’.”
Bonus chatter: The trailing comma also makes it easier for code generators, since they can just emit a comma after each element and not have to worry about suppressing the final comma.
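To make the code generator point concrete, here is a minimal sketch of a generator that emits an enum declaration. The function name `GenerateEnum` and its parameters are made up for illustration. Because every enumerator is followed by a comma, the loop has no special case for the last element.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from any real tool): returns the source text
// of an enum declaration. Each enumerator gets a trailing comma, so the
// loop treats every element the same way.
std::string GenerateEnum(const std::string& name,
                         const std::vector<std::string>& enumerators)
{
    std::string out = "enum " + name + "\n{\n";
    for (const auto& e : enumerators) {
        out += "    " + e + ",\n";  // always emit the comma
    }
    out += "};\n";
    return out;
}
```

A generator targeting a language that rejects the trailing comma would instead need to track whether it is on the last element, or emit the separator before every element except the first.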
Bonus bonus chatter: But why not go all the way and allow a trailing comma in parameter lists?
SomeFunction(1, 2, );
//               ^ trailing comma not allowed
I suspect the primary reason is “nobody asked for it.” Variadic functions are relatively uncommon, so this is not something that code generators stumble across. Also, that extra comma just plain looks weird.
Overloaded functions could pose a parsing problem. If there are 2-parameter and 3-parameter overloads of SomeFunction, is this a call to the two-parameter overload, or is it a call to the three-parameter overload with some sort of default?
Bonus bonus bonus chatter: JavaScript, Rust, and Ruby allow a trailing comma in parameter lists.
Bonus bonus bonus bonus chatter: In the Pascal programming language, the semicolon is a statement separator, not a statement terminator, so you can write
begin
  i := 1;
  j := 2 (* no trailing semicolon *)
end
In practice, everybody puts a semicolon just before the end. Imagine rearranging two lines of code and having to adjust semicolons.
Zig also supports a trailing comma in parameter lists. That may appear useless at first glance, but there is a neat side effect: if you pass a file to `zig fmt`, it will format that file for you. For parameter lists it puts them all on one line. But if the parameter list has a trailing comma, it will instead put them one on each line. So without a trailing comma:
<code>
With trailing comma:
<code>
This is one of the three reasons I’m baffled that .NET Core switched from XML to JSON for config files (the other two being retro-compatibility and vanilla JSON disallowing comments).
IIRC, PL/1 was also a ‘semi-colon is statement separator’ language. Made porting from C to PL/1 slightly harder, as you had semicolons to eliminate.
JSON5 allows the trailing commas, etc. It’s available as a library/package/etc for most of your favorite programming languages.
Note that Pascal doesn’t allow a semicolon before else. The nearest equivalent in C would be, say, trying to write do { break; }; while (1);
Which is in fact entirely consistent with the logic that the semicolon is a delimiter, not an end-of-statement mark, since
<code>
Nothing to separate here. Just like in your do…while example, before while part, where there’s nothing to mark end of, yes.
Although for C it’s better to treat {} as a language construct separate from statements like if, while or for, since the rules for semicolon are not as consistent as in Pascal.
P.S. Thanks, blog engine, for breaking EBNF.
The do-while loop in C is also consistent with if-else and other control statements:
<code>
the can be any statement, including a simple statement that ends in semicolon:
<code>
the only places I can think of where braces are required are function bodies and switch statements. In those cases, you can consider the tokens and to directly belong to the grammars for and respectively, and then becomes completely consistent...
Well, not really. I mean, if we say that the semicolon is a statement terminator (which it mostly is) and call {} a compound statement (but still a statement, just like if or while), then there’s a problem:
<code>
If braces are a compound statement and statements are terminated with a semicolon, then we should put semicolons after both closing braces. That would work for the else-branch (although for a completely different reason called “empty statement”) but something goes wrong with the...
Yes, the real inconsistency is that some statements end in a semicolon while others don’t (most notably { ... }, but things like if (...) ... else ... also don’t themselves have the semicolon, for that matter) 🙂
And the chad F# allows you to write lists and argument lists with no commas at all!
In C++ you can use trailing commas in brace initialization, and the extra comma does not change how overload resolution works. In C#, if you have a normal variadic (i.e., params array, not varargs) method and you find yourself constantly changing the number of arguments or shuffling them, then it’s better to call it with an explicitly initialized array, for which you can use a trailing comma.
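A small sketch of the C++ half of that claim, using a made-up type `Widget` whose two constructors differ only in the number of parameters. The trailing comma inside the braces is not an extra argument, so both helpers below select the two-parameter constructor.

```cpp
// Hypothetical type for illustration: each constructor records how
// many arguments it was called with.
struct Widget
{
    int arity;
    Widget(int, int) : arity(2) {}
    Widget(int, int, int) : arity(3) {}
};

// Brace initialization with a trailing comma.
inline int ArityWithTrailingComma()
{
    Widget w{
        1,
        2,
    };
    return w.arity;
}

// Same initialization without the trailing comma, for comparison.
inline int ArityWithoutTrailingComma()
{
    Widget w{1, 2};
    return w.arity;
}
```

Both functions return the same value, which is what lets you put one argument per line and reorder them freely.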
Not only does Go support the trailing comma, it _requires_ it in the situations shown here.
JSON – Not you 🙂
The worst consequence of not using a trailing comma is that you can make a mistake when merging two versions or rearranging the lines, and the compiler won't warn you. For example:
<code>
Here, Green and Blue were added in two different branches. Adjacent string literals are concatenated by the compiler, so the effective code will be:
<code>
which is a bug.