I will use recurrent neural networks to generate original guitar music in the style of Metallica, from guitar pro tabs scraped from the internet.
I decided to use guitar pro files because they are quite popular, so a lot of data is accessible, and because they are reasonably accurate: they contain a lot of fine detail that can be missing from conventional guitar tablature, an example of which is given below.
Below is an example of typical guitar tablature from ultimate-guitar.com
    Band- Metallica
    Song- TuesdAYS gONE

    A5           E5           F#5          D
    x 0 2 2 x x  0 2 2 x x x  2 4 4 x x x  x x 0 2 3 2

    Dsus4        Dsus2        G            F#m
    x x 0 2 3 3  x x 0 2 3 0  3 2 0 0 3 3  2 4 4 2 2 2

    GIII
    3 5 5 4 3 3

    Gtr I (Eb Ab Db Gb Bb Eb) - 'acoustic'
    Gtr II (Eb Ab Db Gb Bb Eb) - 'dobro'
    Gtr III (Eb Ab Db Gb Bb Eb) - 'acoustic'
    Gtr IV (Eb Ab Db Gb Bb Eb) - 'acoustic'

    Intro
    Slowly H.=50
    A5 E5
    3/4
    Gtr I
    | | | ||
    / / / //
    Gtr II
    ~~
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-14--------|-14------14b16r=(14)--|-12---------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    | |
    Gtr III | ~~
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-11--------|-11b13r=(11)----------|--9---------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|----------------------|------------|
    | |
    Gtr IV
    |-------|-------||-----------|----------------------|------------|
    |-------|-------||-----------|--2-------------------|------------|
    |-------|-------||--------2--|----------2-----------|------------|
    |-------|-------||-----2-----|------------------2---|--------2---|
    |-------|-------||--0--------|----------------------|-----2------|
    |-------|-------||-----------|----------------------|--0---------|

    F#5 D
    | | | | || | | | | | || |
    / / / / // / / / / / // /
    ~~~~~~~ ~
    |-----------------|----------------|--------------------|----------|
    |----12b13r=(12)--|-12-10-(10)-----|----10b12r====(10)--|-7--------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    | |
    ~~ ~
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    |-----9b10r==(9)--|--7-------------|----------7b9r=(7)--|-4--------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|----------|
    | |
    PM---------------------------|
    |-----------------|----------------|--------------------|----------|
    |-----------------|----------------|--------------------|-------3--|
    |-1---------------|----------------|-2------------------|----2-----|
    |-----2-----------|-------------4--|-----4--------------|-0--------|
    |-------------2---|---------4------|----------------4---|----------|
    |-----------------|--2-------------|--------------------|----------|
This is an example of the kind of tabs I scraped in previous work on country music, which can be found here. While the guitar tab may seem quite orderly, the tabs are user-submitted, so the layout can differ from tab to tab. Also, additional data, such as whether the guitar is palm-muted, is often omitted.
Guitar tabs can be easily converted to sheet music, since every fret on each string corresponds to a note. Different frets on different strings can sound the same note, so it is a many-to-one mapping, but if we restrict ourselves to the open strings and the first 4 frets it is essentially a one-to-one mapping (in standard tuning the only overlap is the fourth fret of the G string, which sounds the same B as the open B string). The image below may be helpful for understanding: the top shows a guitar tablature, and the bottom shows the corresponding sheet music.
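To make the fret-to-note mapping concrete, here is a minimal sketch that assumes standard tuning and MIDI pitch numbers. The names `OPEN_STRINGS`, `fret_to_midi`, `positions_for` and `duplicated_pitches` are made up for this illustration and are not part of the scripts in this post.

```python
# Open-string MIDI pitches in standard tuning, string 1 (high E) to string 6 (low E).
# Assumption: standard tuning; a tab in Eb tuning would shift every value down by 1.
OPEN_STRINGS = {1: 64, 2: 59, 3: 55, 4: 50, 5: 45, 6: 40}
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def fret_to_midi(string, fret):
    """Return the MIDI pitch sounded at a given fret on a given string."""
    return OPEN_STRINGS[string] + fret


def midi_to_name(pitch):
    """Name a MIDI pitch using the convention that MIDI 60 is C4."""
    return NOTE_NAMES[pitch % 12] + str(pitch // 12 - 1)


def positions_for(pitch, max_fret):
    """List every (string, fret) pair up to max_fret that sounds this pitch."""
    return [(s, pitch - o) for s, o in sorted(OPEN_STRINGS.items())
            if 0 <= pitch - o <= max_fret]


def duplicated_pitches(max_fret):
    """Pitches that can be played in more than one position within max_fret."""
    pitches = {fret_to_midi(s, f)
               for s in OPEN_STRINGS for f in range(max_fret + 1)}
    return sorted(p for p in pitches if len(positions_for(p, max_fret)) > 1)


# The open high E string can be played in six places on a 24-fret neck.
print(midi_to_name(64), positions_for(64, 24))
# Within frets 0-4 only one pitch has two positions.
print([midi_to_name(p) for p in duplicated_pitches(4)])
```

Going from tab to notes is a simple lookup, while going from notes back to tab requires choosing one of several positions, which is why restricting the fret range makes the mapping easier to invert.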
Guitar pro files are created using specialized software (Guitar Pro), which has a built-in MIDI editor, so much of the fine data is both available and standardized, and the files in general more accurately resemble the songs they transcribe.
To obtain the guitar pro files I scraped the website ultimate-guitar.com, as it has the most comprehensive repository of guitar tabs on the internet. I scrape the website for 5-star rated tabs only, filter for guitar pro files, and remove duplicate songs (for example, multiple versions) by keeping the tab with the most ratings.
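The filtering and de-duplication steps can be sketched in plain Python as follows. This is only an illustration of the idea, not the scraper itself: the field names (`song`, `type`, `stars`, `num_ratings`) and the sample rows are invented for the example.

```python
def select_tabs(tabs):
    """Keep 5-star guitar pro tabs, one per song: the version with the most ratings."""
    best = {}
    for tab in tabs:
        # Filter step: only 5-star rated guitar pro files are considered.
        if tab["type"] != "guitar pro" or tab["stars"] != 5:
            continue
        # De-duplication step: keep the version with the most ratings.
        current = best.get(tab["song"])
        if current is None or tab["num_ratings"] > current["num_ratings"]:
            best[tab["song"]] = tab
    return list(best.values())


# Made-up rows standing in for scraped search results.
tabs = [
    {"song": "One", "type": "guitar pro", "stars": 5, "num_ratings": 45},
    {"song": "One", "type": "guitar pro", "stars": 5, "num_ratings": 120},
    {"song": "One", "type": "chords", "stars": 5, "num_ratings": 300},
    {"song": "Battery", "type": "guitar pro", "stars": 4, "num_ratings": 80},
    {"song": "Battery", "type": "guitar pro", "stars": 5, "num_ratings": 30},
]
selected = select_tabs(tabs)
for tab in selected:
    print(tab["song"], tab["num_ratings"])
```

Using a dictionary keyed by song name keeps the selection to a single pass over the results, regardless of how many versions of a song exist.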
Below is a Python script for grabbing guitar pro files from ultimate-guitar.com.
    # -*- coding: utf-8 -*-
    """
    Created on July 30th

    @author: matt
    """
    # = Import packages
    import requests
    import re
    from bs4 import BeautifulSoup
    import pandas as pd
    import cgi
    import shutil
    import os

    # = Helper Functions =========================================================
    def removeTags(string):
        '''
        Function to remove html tags
        '''
        return re.sub('<[^<]+?>', '', string)

    def getBandTree(band, page):
        '''
        Function to get xml tree given the band name
        '''
        if type(page) == int:
            page = str(page)
        theURL = 'https://www.ultimate-guitar.com/search.php?band_name=' + band + \
        '&type%5B4%5D=500&rating%5B4%5D=5&approved%5B1%5D=1&page=' + page + \
        '&view_state=advanced&tab_type_group=text&app_name=ugt&order=myweight'
        pageBand = requests.get(theURL)
        return BeautifulSoup(pageBand.content)

    def download_url(url, directory):
        """Download file from url to directory

        URL is expected to have a Content-Disposition header telling us what
        filename to use.

        Returns filename of downloaded file.
        """
        response = requests.get(url, stream=True)
        if response.status_code != 200:
            raise ValueError('Failed to download')

        params = cgi.parse_header(
            response.headers.get('Content-Disposition', ''))[-1]
        if 'filename' not in params:
            raise ValueError('Could not find a filename')

        # Strip bracketed suffixes such as "(ver 2)" and spaces from the filename
        filename = re.sub('([\(\[]).*?([\)\]])', '', os.path.basename(params['filename']))
        filename = re.sub(' ','', filename)
        abs_path = os.path.join(directory, filename)
        with open(abs_path, 'wb') as target:
            response.raw.decode_content = True
            shutil.copyfileobj(response.raw, target)

        return filename

    # ============================================================================
    def main():
        band = 'metallica'
        page = 1
        bandURL = getBandTree(band, page)

        #Get max pages
        pages = bandURL.find_all('div', { "class" : "paging" })
        # NOTE: the substring counted here is empty in the source; it presumably
        # matched the page links inside the paging block
        maxPages = str(pages).count('')
        dfs = []
        for page in range(1, maxPages + 1):
            bandURL = getBandTree(band, page)
            mybs = bandURL.find_all('b', { "class" : "ratdig" })
            mybs2 = [removeTags(str(rating)) for rating in mybs]

            songname = bandURL.find_all('a', { "class" : "song result-link" })
            songname2 = [re.sub('([\(\[]).*?([\)\]])', '', removeTags(str(song)).strip()).strip() for song in songname]

            tabType = bandURL.find_all('strong')
            tabType2 = [removeTags(str(tab)) for tab in tabType]

            df1 = pd.DataFrame({'Rating': mybs2, 'Type': tabType2})
            df2 = df1[df1.Type == 'guitar pro']
            df2.loc[:,'Song_Name'] = songname2
            links = []
            for a in songname:
                links.append(a['href'])
            df2.loc[:,'Song_Links'] = links
            # Keep one row per song name
            df3 = df2.loc[df2.groupby(['Song_Name'], sort=False)['Rating'].idxmax()]
            dfs.append(df3)
        tot_df = pd.concat(dfs)
        # print(tot_df)
        song_links_list = list(tot_df.Song_Links)
        for i in range(len(tot_df)):
            webPage = requests.get(song_links_list[i])
            soup = BeautifulSoup(webPage.content)
            tab_id = soup.find_all("input", {"type" : "hidden", "name" : "id", "id" : "tab_id"})
            the_id = tab_id[0].get('value')
            mydir = os.path.dirname('out/' +band + '/')
            if not os.path.exists(mydir):
                os.makedirs(mydir)
            download_url('https://tabs.ultimate-guitar.com/tabs/download?id='+str(the_id), mydir)

    if __name__=='__main__':
        main()
The scraper grabs 123 songs, which seems pretty good, since there are a total of 151 songs in Metallica's catalog.
I can use the same recurrent neural network that I used to generate country music lyrics, which seems to work pretty well. To use this model, the guitar pro files need to be in text format. I do this by extracting all the pertinent information from the guitar pro files and writing it to a txt file. This method has to be reversible, since the neural network will output text in the same format.
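The reversibility requirement can be illustrated with a small round trip. This is a simplified sketch, not the conversion script itself: it handles only the `vst:`, `S:` and `V:` fields, and the function names `notes_to_text` and `text_to_notes` are invented for the example. The parser skips malformed lines, on the assumption that generated text will not always be well formed.

```python
def notes_to_text(beats):
    """Write beats as text: a 'vst:<start>' line, then one 'S:<string> V:<fret>' line per note."""
    lines = []
    for start, notes in beats:
        lines.append("vst:%d" % start)
        for string, value in notes:
            lines.append("S:%d V:%d" % (string, value))
    return "\n".join(lines) + "\n"


def text_to_notes(text):
    """Parse the text format back into a list of (start, [(string, fret), ...]) beats."""
    beats = []
    for line in text.splitlines():
        try:
            if line.startswith("vst:"):
                beats.append((int(line[4:]), []))
            elif line.startswith("S:") and beats:
                s_part, v_part = line.split()
                beats[-1][1].append((int(s_part[2:]), int(v_part[2:])))
        except ValueError:
            # Malformed line (for example a truncated number): skip it.
            continue
    return beats


# A made-up pair of beats: a two-note chord followed by a single note.
beats = [(960, [(6, 0), (5, 2)]), (1920, [(4, 2)])]
text = notes_to_text(beats)
print(text)
assert text_to_notes(text) == beats
```

If the round trip reproduces the original beats, then any text the network produces in the same format can be turned back into notes.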
Below is a Python script showing how guitar pro files are converted to a txt file for the neural network to read.
    # -*- coding: utf-8 -*-
    """
    Created on Sat Jul 30 14:26:22 2016

    @author: matt-666
    """
    # = Import packages
    import guitarpro
    from os import listdir

    # = Helper functions =========================================================
    def unfold_tracknumber(tracknumber, tracks):
        """Substitute '*' with all track numbers except for percussion tracks."""
        if tracknumber == '*':
            for number, track in enumerate(tracks, start=1):
                if not track.isPercussionTrack:
                    yield number
        else:
            yield tracknumber

    def transpose(myfile, track):
        '''
        Get pertinent information from guitar pro files and write to text file
        '''
        myfile.write("%s" % 'strgs: ' +' '.join([str(string) for string in track.strings]) + ' ')
        myfile.write("%s" % 'fc: ' +str(track.fretCount) + ' ')
        measure1 = track.measures[0]
        myfile.write("%s" % str(measure1.keySignature) + ' ')
        myfile.write("%s" % 'len:' + str(measure1.length) + ' ')
        myfile.write("%s" % 'tmpo:' + str(measure1.tempo) + '\n')
        i = 1
        for measure in track.measures:
            myfile.write("%s" % 'Num:' + str(i) + ' ')
            myfile.write("%s" % 'mst:' + str(measure.start) + '\n')
            for voice in measure.voices:
                for beat in voice.beats:
                    myfile.write("%s" % 'vst:' + str(beat.start) + '\n')
                    for note in beat.notes:
                        myfile.write("%s" % 'S:' + str(note.string) + ' ')
                        myfile.write("%s" % 'V:' + str(note.value) + '\n')
            i += 1

    # =================================================================
    def main():
        band = 'metallica'
        mydir = 'out/' + band
        files = listdir(mydir)
        myfile = open('allTabs.txt', 'w')
        for gpfile in files:
            curl = guitarpro.parse(mydir + '/'+ gpfile)
            transpose(myfile, curl.tracks[0])
            myfile.write('\n\r\n\r')
        myfile.close()

    if __name__== '__main__':
        main()
Now that all the guitar pro files are converted to one large text file, it can be fed into the recurrent neural network. The model comes from Andrej Karpathy’s great char-rnn library for Lua/Torch. A recurrent neural network feeds its state from one step back in as input to the next step, which lets it learn patterns in sequences such as the characters of a text file.
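The recurrence can be sketched in a few lines of plain Python. This is an illustration of a basic (non-LSTM) recurrent step with hand-picked toy weights, not the char-rnn code, which is written in Lua/Torch and also offers LSTM cells. The names `rnn_step`, `one_hot` and `run` are invented for the example.

```python
import math


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h_t = tanh(W_xh * x_t + W_hh * h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def one_hot(index, size):
    """Encode a character index as a vector with a single 1.0 entry."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def run(text, vocab, W_xh, W_hh, b):
    """Feed text one character at a time, carrying the hidden state forward."""
    h = [0.0] * len(b)
    for ch in text:
        h = rnn_step(one_hot(vocab.index(ch), len(vocab)), h, W_xh, W_hh, b)
    return h


# Toy setup: a 3-character vocabulary and 2 hidden units (arbitrary weights).
vocab = "SV:"
W_xh = [[0.5, -0.3, 0.1], [0.2, 0.4, -0.6]]
W_hh = [[0.1, 0.7], [-0.4, 0.2]]
b = [0.0, 0.0]

# The same characters in a different order give a different final state,
# because each step depends on the state left by the previous one.
print(run("SV", vocab, W_xh, W_hh, b))
print(run("VS", vocab, W_xh, W_hh, b))
```

In a trained model the weights are learned, and a final layer turns the hidden state into a probability for each possible next character.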
The model takes a few days to run on my poor laptop with 400 nodes and 3 layers in the network, which corresponds to about 5 million parameters in the network.
The output of the model is in the same text format as its input. To turn it back into a guitar pro file, I take an existing guitar pro file and overwrite its measures, notes, and metadata. Creating a guitar pro file from scratch in Python is quite difficult because of the number of settings that have to be included for the file to save properly, so this method is a compromise.
Below is a Python script showing how the txt output is converted back to a guitar pro file.
    # -*- coding: utf-8 -*-
    """
    Created on Sat Aug 6 15:53:06 2016

    @author: matt-666
    """
    # = Import packages
    import guitarpro
    import time

    # = Helper Functions ====================================================
    def transpose2GP5file(track, totaldict):
        '''
        Function to take an empty gp5 song and fill with song information
        from dictionary generated from txt2songDict function
        '''
        meas = 1
        # Sentinel: keep every measure unless the dictionary runs out earlier
        # (the original used an undefined name, Inf, here)
        breakMeas = len(track.measures) + 1
        for measure in track.measures:
            #measure.keySignature = totaldict['measure_'+str(meas)]['key']
            #measure.length = totaldict['measure_'+str(meas)]['mlen']
            #measure.start = totaldict['measure_'+str(meas)]['mstart']
            for voice in measure.voices:
                try:
                    beats_list = totaldict['measure_'+str(meas)]['beats']
                    thebeat = 0
                except KeyError:
                    # No generated data for this measure: remember where to cut
                    if meas < breakMeas:
                        breakMeas = meas
                    break
                for beat in voice.beats:
                    beat.start = beats_list[thebeat]
                    try:
                        strings = totaldict['measure_'+str(meas)]['strings'+str(beats_list[thebeat])]
                        realValues = totaldict['measure_'+str(meas)]['Notes'+str(beats_list[thebeat])]
                    except KeyError:
                        break
                    notes = []
                    for string, realValue in zip(strings, realValues):
                        #print(string)
                        note = guitarpro.base.Note()
                        note.string = string
                        note.value = realValue
                        print(meas, string, note.value)
                        notes.append(note)
                    beat.notes = notes
                    thebeat += 1
            meas += 1
        # Drop the template measures that received no generated notes
        track.measures = track.measures[:breakMeas-1]
        return(track)

    # ================================================================================
    def main():
        def txt2songDict(track):
            '''
            Read txt file and go through line by line
            convert all information into dictionary of measures
            '''
            # myfile =open('xyz.txt', 'r')
            with open(track) as f:
                total_dict = {}
                measures_list = []
                measNum = 0
                measStart = 0
                key = 'CMajor'
                song_len = 3840
                for line in f:
                    if line[:5] == 'strgs':
                        # Header line: strings, fret count, key, length, tempo
                        fc1 = line.find('fc')
                        try:
                            strings = line[6:fc1].split()
                        except ValueError:
                            break
                        key1 = line.find('Key')
                        fc = line[fc1+4:key1]
                        len1 = line.find('len')
                        key = line[key1+13:len1-1]
                        tempo1 = line.find('tmpo')
                        song_len = line[len1+4:tempo1]
                        try:
                            tempo = line[tempo1+5:tempo1+7]
                        except ValueError:
                            tempo = line[tempo1+5:tempo1+6]
                        init_dict = {}
                        init_dict['tempo'] = tempo
                        gpStrings = [guitarpro.base.GuitarString(string) for string in strings]
                        init_dict['string'] = gpStrings
                        init_dict['fretCount'] = fc
                        total_dict['init_dict'] = init_dict
                    elif line[:3] == 'Num':
                        # Start of a new measure
                        measStart1 = line.find('mst')
                        if measNum > 0:
                            total_dict['measure_'+str(measNum)]['beats'] = beats
                        measNum += 1
                        measStart += 3840
                        meas_dict = {}
                        meas_dict['key'] = key
                        meas_dict['mlen'] = song_len
                        meas_dict['mstart'] = measStart
                        measures_list.append(meas_dict)
                        total_dict['measure_'+str(measNum)] = meas_dict
                        beats = []
                    elif line[:3] == 'vst':
                        # Start of a new beat
                        beatStart1 = measStart + 221820-int(line[4:11])
                        beats.append(beatStart1)
                        strings = []
                        notes = []
                    elif line[:1] == 'S':
                        # A note: string number and fret value
                        valStart = line.find('V')
                        string = int(line[2])
                        realVal = int(line[valStart+2:valStart+5])
                        strings.append(string)
                        notes.append(realVal)
                        total_dict['measure_'+str(measNum)]['strings' + str(beatStart1)] = strings
                        total_dict['measure_'+str(measNum)]['Notes' + str(beatStart1)]= notes
            return(total_dict)

        # Use an existing guitar pro file as a template and clear its notes
        curl = guitarpro.parse('Serenade.gp5')
        track = curl.tracks[0]
        for measure in track.measures:
            for voice in measure.voices:
                for beat in voice.beats:
                    beat.notes = []
        curl.tracks[0] = track

        songDict = txt2songDict('genMetallica2.txt')
        track = curl.tracks[0]
        track = transpose2GP5file(track, songDict)
        curl.tracks[0] = track
        curl.artist = 'R. N. Net'
        curl.album = time.strftime("%d %B %Y")
        curl.title = 'Metallica Style Song'
        guitarpro.write(curl, 'genMetallica.gp5')

    if __name__ == '__main__':
        main()
The model works well: the training error decreases as the model runs, and the validation error stays only slightly higher than the training error, a sign that the model parameters are reasonable and the model is not badly overfitting. An example output of the model is shown below and can be played by hitting the play button. The tab is played using AlphaTab, which can play guitar pro files online. This example is an output of 7,000 characters, and the length can be changed as desired; it seems that a whole song may consist of 15,000-20,000 characters in the text format the model requires, or about 70 musical measures. Works best on Google Chrome!
The model is able to pick up on the song structure, and even power chords, which are often played in Metallica's music. Though the song is a little out-of-order we can see that it starts with a solo or interlude, and continues to a chorus or verse.
This model could be used to generate original music in the style of other artists, depending on the input files. For example, Pearl Jam or Nirvana songs could be used, or even a combination of both to get songs in the style of 90's alternative. To take this even further, guitar pro files exist for bass guitar, drum, and piano music, as well as lyrics, so it is conceivable that whole songs could be generated using this model. It would take a very long time though, with training time scaling linearly with the size of the input file. I would recommend having at least 75 songs to accurately train the model on the various characteristic features of the songs, especially if multiple artists are used, where common features may be more subtle.
For the meantime I'll be: "Master of puppets, I’m pulling your strings. Twisting your mind and smashing your dreams."