Homework 3

FIND:
(\s{2,})

REPLACE:
,

RESULT:
First String,Second,1.22,3.4
Second,More Text,1.555555,2.2220
Third,x,3,124

I captured each instance of 2 or more spaces and replaced each with a comma.

FIND:
(\w+),(\s)(\w+),\s(.+)

REPLACE:
\3\2\1 (\4)

RESULT:
Bryan Ballif (University of Vermont)
Aaron Ellison (Harvard Forest)
Sydne Record (Bryn Mawr)

First I captured everything I wanted to keep: the last name with (\w+), space between names with (\s), first name with (\w+), and name of school with (.+). I also listed the comma between names and comma/space ,(\s) between names and school, but did not capture them. Then I replaced the sequence with the \3 (first name), \2 (space) \1 (last name), and \4 (the school, with parentheses around it).

FIND:
(.mp3) 

REPLACE:
\1\n

RESULT:
0001 Georgia Horseshoe.mp3 
0002 Billy In The Lowground.mp3 
0003 Cherokee Shuffle.mp3 
0004 Walking Cane.mp3

I captured all instances of .mp3 and included a space outside of the parentheses above. Then I kept the .mp3 using \1 and inserted a line break in place of the space by using \n.

FIND:
(\d{4}) (.+)(.mp3)

REPLACE:
\2_\1\3

RESULT:
Georgia Horseshoe.mp3_0001
Billy In The Lowground.mp3_0002
Cherokee Shuffle.mp3_0003
Walking Cane.mp3_0004

I captured the 4-digit number, typed a space (to avoid a space before each final file name), and then captured the rest of the text with (.+) and (.mp3) at the end. Then I simply re-ordered the components by placing the song name (\2) first, underscore, then file number (\1), and .mp3 (\3) at the end.

FIND:
(\w)\w+,(\w+),\d+.\d(,\d+)

REPLACE:
\1_\2\3

RESULT:
C_pennsylvanicus,44
C_herculeanus,3
M_punctiventris,4
L_neoniger,55

I captured the first letter of the genus name with (\w), included the rest of the genus name and comma with \w+,, captured the whole species name with (\w+), and then included the comma and first number with ,\d+.\d, and then captured the second comma and second number with (,d\d+). Then I replaced this with just the first letter, underscore, species name, comma, and second number.

FIND:
(\w)\w+,(\w{4})\w+,\d+.\d(,\d+)

REPLACE:
\1_\2\3

RESULT:
C_penn,44
C_herc,3
M_punc,4
L_neon,55

I used the same expression as in #5 except I only captured 4 letters of the species name using the curly brackets, and had to add an additional \w+ right after that to recognize the rest of the species name.

FIND:
(\w{3})\w+,(\w{3})\w+(,)(\d+.\d)(,)(\d+)

REPLACE:
\1\2\3 \6\5 \4

RESULT:
Campen, 44, 10.2
Camher, 3, 10.5
Myrpun, 4, 12.2
Lasneo, 55, 3.3

I captured the first three letters of the each genus and species, again using the curly brackets, and also captured the comma after those (,). I included but did not capture the rest of the genus and species names with two \w+ wildcards after each capture with the curly brackets. Then I captured the numbers and other comma, and re-ordered them to appear in the right order in the result, listing \6 first and \5 second, which correspond to the second and first numbers respectively.

Homework 3

Hannah Shafer

2/2/2022