September 9th, 2019

PowerShell programming puzzle: Convert snake_case to PascalCase in one line

A friend posed this PowerShell programming puzzle (P³), which represents an actual problem he had to solve:

Given a string in $t in snake_case or SHOUTY_SNAKE_CASE, return the corresponding PascalCase string. You may assume no consecutive underscores.

The initial version went like this:

$uc = $true; 
[string]::Concat(($t.ToCharArray() | % { 
  if ($uc) { ($_ -as [string]).ToUpper(); $uc = $false; } 
  elseif ($_ -eq '_') { $uc = $true; } 
  else { ($_ -as [string]).ToLower();} 
}))

This is a straightforward translation of the requirements: Walk through the string one character at a time. Capitalize the first letter and any letter that follows an underscore. Everything else becomes lowercase.

I countered with this crazy one-liner:

(Get-Culture).TextInfo.ToTitleCase(($t.ToLower() -replace "_", " ")) -replace " ", ""

This version cheats like crazy, but hey, we’re code-golfing. It relies on the capitalization rules of the English language, and it assumes that every word starts with a letter.

The idea is to take advantage of ToTitleCase, which capitalizes the first letter of each space-separated word. All we have to do is transform the string into something that we can feed into ToTitleCase.

So start with the string, convert it entirely to lowercase (to avoid the feature of ToTitleCase that preserves all-capitalized words), and change the underscores to spaces. Now we can ask ToTitleCase to capitalize the first letter of each word. Then we compress out the spaces with a final -replace.
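
To make the pipeline easier to follow, here is a sketch of the same one-liner unrolled into its four steps. The sample value of $t and the intermediate variable names ($lower, $spaced, $titled, $result) are made up for illustration; only the calls themselves come from the one-liner.

```powershell
# Hypothetical sample input; any snake_case or SHOUTY_SNAKE_CASE string works.
$t = 'SHOUTY_SNAKE_CASE'

# Step 1: lowercase everything so that ToTitleCase does not treat the
# all-uppercase words as acronyms and leave them alone.
$lower = $t.ToLower()                                   # shouty_snake_case

# Step 2: turn underscores into spaces to get space-separated words.
$spaced = $lower -replace '_', ' '                      # shouty snake case

# Step 3: capitalize the first letter of each word.
$titled = (Get-Culture).TextInfo.ToTitleCase($spaced)   # Shouty Snake Case

# Step 4: squeeze out the spaces.
$result = $titled -replace ' ', ''                      # ShoutySnakeCase
$result
```

Skipping step 1 is what breaks SHOUTY_SNAKE_CASE input: the words would reach ToTitleCase still fully capitalized and come back unchanged.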

A less crazy version would be something like this:

[regex]::replace($t.ToLower(), '(^|_)(.)', { $args[0].Groups[2].Value.ToUpper()})

First, we convert the entire string to lowercase. Then we search for each character that comes after an underscore (considering the first character of the string to be after an imaginary leading underscore) and capitalize it. The underscore itself is not returned, which causes it to vanish.
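
If you want to try the regex version on a few inputs, a minimal sketch is to wrap it in a function. The name ConvertTo-PascalCase and its $Text parameter are hypothetical; they are not built into PowerShell. The body is the same expression as above.

```powershell
# ConvertTo-PascalCase is a hypothetical wrapper name, not a built-in cmdlet.
function ConvertTo-PascalCase {
    param([Parameter(Mandatory)][string]$Text)

    # Group 1 matches either the start of the string or an underscore.
    # Group 2 is the single character that follows it.
    # The script block returns only group 2, uppercased, so the
    # underscore matched by group 1 is dropped from the result.
    [regex]::Replace($Text.ToLower(), '(^|_)(.)', {
        $args[0].Groups[2].Value.ToUpper()
    })
}

ConvertTo-PascalCase 'snake_case'          # SnakeCase
ConvertTo-PascalCase 'SHOUTY_SNAKE_CASE'   # ShoutySnakeCase
```

PowerShell converts the script block to the MatchEvaluator delegate that Regex.Replace expects, and the Match object arrives as $args[0].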

Author

Raymond has been involved in the evolution of Windows for more than 30 years. In 2003, he began a Web site known as The Old New Thing which has grown in popularity far beyond his wildest imagination, a development which still gives him the heebie-jeebies. The Web site spawned a book, coincidentally also titled The Old New Thing (Addison Wesley 2007). He occasionally appears on the Windows Dev Docs Twitter account to tell stories which convey no useful information.

15 comments

  • Nick

    Can’t you get away with skipping the underscore to space step?  Some quick testing seems to show that ToTitleCase will also treat underscores as word separators:

    (Get-Culture).TextInfo.ToTitleCase($t.ToLower()).Replace("_","")

    seems to work correctly.  Is there a corner case I’m missing?

    Also, “shouty snake case” is a new favorite; though it seems like the alternative should then be “sneaky snake case”.

    • Raymond Chen (Microsoft employee, Author)

      Awesome! Underscore is a “punctuation connector” which ToTitleCase treats as a word separator, so this works.

  • Mystery Man

    Regex was my first thought.

  • Sacha Roscoe

    Are there any potential issues here with culture settings? How do ToLower and ToTitleCase interact in other cultures? Are we going to muck up the Turkish i and/or ı, as we so often like to do?

    • Andrew Cook

      `ToTitleCase` internally uses `ToLower` and `ToUpper` as needed, so it has the same cultural implications as the other two examples. The documentation calls out that *currently* it uses naïve English casing rules only, but reserves the right to use culture-specific ones in the future.
      `ToTitleCase` gives different results than the other two, however, if there are any embedded non-alphanumerics in the string.

      • cheong00

        And if your script may run on systems with a different culture, I think you can use System.Globalization.CultureInfo.InvariantCulture instead of Get-Culture.

  • cheong00

    Here’s one case where using RegEx is actually better.

  • John Wiltshire

    The real question is whether you can do it with cmd.exe!

    • David Streeter

      @ECHO OFF
      SET t=SHOUTY_SNAKE_CASE
      FOR %%i IN ("A=a" "B=b" "C=c" "D=d" "E=e" "F=f" "G=g" "H=h" "I=i" "J=j" "K=k" "L=l" "M=m" "N=n" "O=o" "P=p" "Q=q" "R=r" "S=s" "T=t" "U=u" "V=v" "W=w" "X=x" "Y=y" "Z=z") DO CALL SET "t=%%t:%%~i%%"
      SET t1=%t:~0,1%
      FOR %%i IN ("a=A" "b=B" "c=C" "d=D" "e=E" "f=F" "g=G" "h=H" "i=I" "j=J" "k=K" "l=L" "m=M" "n=N" "o=O" "p=P" "q=Q" "r=R" "s=S" "t=T" "u=U" "v=V" "w=W" "x=X" "y=Y" "z=Z") DO CALL SET "t1=%%t1:%%~i%%"
      SET t2=%t:~1%
      SET...

      • Neil Rashbrook

        Overkill. Try this:
        @set t=_%1
        @for %%l in (a b c d e f g h i j k l m n o p q r s t u v w x y z)do @call set t=%%t:%%l=%%l%%
        @for %%l in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)do @call set t=%%t:_%%l=%%l%%
        @echo %t%
        Edit: the formatting looked...

    • Me Gusta

      Well, you can run powershell.exe from cmd.exe, so given this post, I would imagine something based around powershell.exe -Command.

  • Paulo Morgado

    $uc = $true;
    [string]::new(($t.ToCharArray() | % {
      if ($uc) { [CultureInfo]::CurrentCulture.TextInfo.ToUpper($_); $uc = $false; }
      elseif ($_ -eq '_') { $uc = $true}
      else { [CultureInfo]::CurrentCulture.TextInfo.ToLower($_); }
    }))

  • Petri Oksanen

    Kids today with their fancy PowerShell, am I right? 🙂
    perl -wnle "my$l;for(split'_'){$l.=ucfirst lc}print$l"

    • Raymond Chen (Microsoft employee, Author)

      If you’re going to go perl, you may as well go all the way: perl -ne "for(split _){print ucfirst lc}". But that requires the snake_case to be passed as stdin rather than on the command line. perl -e "for(split _,shift){print ucfirst lc}" takes it from the command line.

      • Petri Oksanen

        I’m assuming the data is from stdin or from the contents of a file provided as an argument. 🙂
        I wonder if this can be beat: perl -ne "print ucfirst lc for split _"