Chapter 6. Correcting Data Automatically

Table of Contents

1. Remote Correct
1.1. What are MusicBrainz and MusicIP?
1.2. How Does Jaikoz Use MusicBrainz and MusicIP?
1.3. Retrieve Acoustic Ids
1.4. Autocorrect Tags from MusicBrainz
1.4.1. Match Preferences
1.4.2. Match Score Preferences
1.4.3. AutoMatch Options
1.4.4. Format Options
1.4.5. Format 2 Options
1.5. ManualCorrect Tags from MusicBrainz
1.5.1. Manual Match Preferences
1.6. Cluster Albums
1.7. Correct Lyrics
1.8. Update Tags from Existing MusicBrainz Id
1.9. Update Tags from Discogs
1.9.1. Format Preferences
1.10. Submissions to Musicbrainz
1.10.1. Submit MusicBrainz/PUID Pair
1.10.2. Submit MusicBrainz Genres
1.10.3. Submit Musicbrainz Collection
2. Local Correct
2.1. Correct Artwork
2.1.1. Maximum number of images that should be added automatically to one file
2.1.2. Maximum size of images to be added (MB)
2.2. Delete Duplicates
2.3. Correct Track Nos
2.3.1. Preserve Track/Total Setting
2.3.2. Zeroise Track
2.4. Cross Referencing Correct - Correct Artists/Albums/Titles/Genres/Recording Times/Comments/Composers
2.4.1. How it Works
2.4.2. AutoMatch Preferences
2.4.2.1. Match words that appear misspelt
2.4.2.2. Ignore Word Order when matching
2.4.2.3. Ignore Case when matching
2.4.2.4. Ignore words in this list when matching
2.4.2.5. Split into Words using these values
2.4.3. AutoFormat Preferences
2.4.3.1. Remove whitespace at start or end of value
2.4.3.2. Remove Widespace
2.4.3.3. Remove undisplayable characters
2.4.3.4. Capitalization
2.4.3.5. Replace words by another word
2.4.3.6. Remove these Punctuation Characters
3. File and Folder Correct
3.1. Shift Base Folder to Sub Folder
3.2. Shift Sub Folder to Base Folder
3.3. Correct Tags from Filename
3.3.1. Example
3.3.2. Introduction
3.3.3. How it Works
3.3.4. Split the Filename using these words
3.4. Correct Filenames from Tags
3.5. Correct Sub Folders from Tags
4. Delete Duplicates
5. Auto Correcter

Auto Correction allows your Metadata to be corrected without you having to edit each value by hand. This is a much quicker and more accurate way of sorting out your music. Jaikoz can perform 'Local Correct', which does not require Internet access and is very quick, but can only work with the Metadata that you already have in your files. Jaikoz can also perform 'MusicBrainz/MusicIP Correct' (listed under Remote Correct), which uses the MusicBrainz/MusicIP database to perform matches; this is much more accurate but takes longer. 'File and Folder Correct' is another form of Local Correct that modifies your folders and filenames, rather than just your Metadata. The 'Autocorrecter' allows you to batch up a number of these tasks into a single task. Usually you would use all of these methods to clean up your Music library.

1. Remote Correct

Remote Correct groups together the tasks that require Internet access in order to run.

1.1. What are MusicBrainz and MusicIP?

MusicBrainz is an online database of information on more than 5 million songs. It is a community-based database with contributions from over 200,000 people, and a system of moderation keeps the data extremely accurate. Additionally, many of these songs have associated Acoustic Ids provided by MusicIP, which allow a song to be identified from the actual music, so a match can be made even if you have no Metadata. MusicIP also provides an online database and a number of Music Mixing and Music Identification services. This is the very latest technology and is much more powerful than systems that only match by Metadata, or that only match album by album instead of uniquely per track.

1.2. How Does Jaikoz Use MusicBrainz and MusicIP?

Jaikoz lets you look up your songs by both the acoustic id and the Metadata, which makes matching very accurate. Nothing has to be done by you, so you can go away and do something more interesting! Jaikoz compares your track acoustically with the MusicIP database and, if it finds a match, retrieves the acoustic id. The acoustic id can then be used to contact the MusicBrainz Server to find matching tracks and Metadata. Sometimes multiple tracks are returned, but unlike other systems Jaikoz can use its AutoMatch algorithm to determine which is the correct match in most cases. It does this by comparing the Metadata in the records returned from MusicBrainz with the record it is trying to match. As a user this means you can run Jaikoz against MusicBrainz and the corrections will be made without any further intervention from you. This is very useful because creating Acoustic Ids and looking them up in the MusicBrainz Database can take a while, so correcting a large number of files is best run unattended. If a match is found Jaikoz will always write a record to the Unique File Identifier, which provides a link back to the MusicBrainz Website. It also writes values to many of the other fields, such as the Music Id (PUID), Artist Id, Album Id, Disc Id, Amazon Id, artist, album, title, year, album artist, release status, release type, release country and track number, if they do not already exist on your file or if they are allowed to be overwritten based on your AutoFormat Preferences.

1.3. Retrieve Acoustic Ids

This attempts to find an acoustic match for your track in the MusicIP database. To get a successful match the track must exist in the MusicIP database, and your track must be of sufficient quality that it sounds similar enough to the track in the MusicIP database. If an Acoustic Id is found it is added to the MusicIP Id field. Acoustic matching takes a few seconds per track, but because the Acoustic Id for a particular file never changes it only has to be retrieved once, and it is automatically saved to the file as soon as it is retrieved. By default Jaikoz does not submit unknown tracks to MusicIP because analysing unknown tracks takes significantly longer, but this option can be enabled. When you submit an unknown track to MusicIP it will not give you an Acoustic Id immediately; you have to wait 24 hours and then retry retrieving acoustic ids for the unknown track. By then the track should have been added to the MusicIP database and you will normally be able to retrieve an acoustic id for it. You can change these options in the Preferences/MusicBrainz/MusicIP window.

1.4. Autocorrect Tags from MusicBrainz

This attempts to find a match in the MusicBrainz Database. If an Acoustic Id has been found it is used as the main key; otherwise Metadata such as artist, album and title is used. How Jaikoz determines the best match depends on the options you have selected in the Match and Match Score Preferences. The defaults will provide a good match for most customers, but you can fine-tune the matching as required. If a match is found, Jaikoz populates as many fields as it can from the MusicBrainz database, provided they are enabled in your AutoFormat options.

1.4.1. Match Preferences

The MusicBrainz Server field specifies which MusicBrainz server to use. Currently there is only one, but it is expected that mirror servers will be available in the future.

The Do not match online if already have a MusicBrainz Unique id option allows you to skip over tracks that already have a Unique id. This is very useful if you reload records that you have already analysed, because re-analysing them could take some time if you are loading many records. It also allows you to skip over records you have just analysed if the online MusicBrainz match has timed out and you need to rerun it. This option is not enabled by default because you may want to re-analyse records after improving your Metadata, which may give a more accurate match.

When Jaikoz finds potential matches for a track, it does not have all the album information required to calculate the exact rating for a match unless the album is already in the Jaikoz album cache. For example, it will have the album name but not the list of countries the album has been released in. To get this information it has to run another query for each album, which slows down matching because MusicBrainz only allows one query per second. By default, therefore, Jaikoz only retrieves the full album information for the track with the best rating, although it caches album information so that it does not have to look up the same album more than once in a Jaikoz session. Enabling the Retrieve extended release details for more accurate ratings option retrieves the extra album information for all potential matches; this makes matching slower but more accurate.
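
The trade-off above can be illustrated with a minimal Python sketch of a per-session release cache combined with a one-query-per-second limit. This is not Jaikoz's actual code: the class name, the fetch callback and the injectable clock and sleep functions are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional


class CachedReleaseLookup:
    """Fetch release details at most once per id, spacing out real queries.

    Hypothetical helper: 'fetch' stands in for whatever call retrieves the
    extended release details from the server.
    """

    def __init__(self,
                 fetch: Callable[[str], dict],
                 min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._fetch = fetch
        self._min_interval = min_interval  # seconds between real queries
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, dict] = {}
        self._last_query: Optional[float] = None

    def get(self, release_id: str) -> dict:
        # A cached release costs no query at all.
        if release_id in self._cache:
            return self._cache[release_id]
        # Otherwise wait until the minimum interval has passed.
        if self._last_query is not None:
            wait = self._min_interval - (self._clock() - self._last_query)
            if wait > 0:
                self._sleep(wait)
        self._last_query = self._clock()
        details = self._fetch(release_id)
        self._cache[release_id] = details
        return details
```

With a cache like this, looking up the same release for every track on an album costs one query rather than one per track, which is why retrieving extended details for all candidates is slower mainly when the candidates are spread over many different releases.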

If Prefer releases that have been used by other tracks is enabled, releases that have been used by other tracks will have their rating increased by the Release has been used by other tracks score in the Match Score preference. Enabling this option helps minimize the number of albums tracks are spread over.

MusicBrainz categorizes every release as Official, Promo, Bootleg or Pseudo-Release; most releases are Official. If Prefer official releases is enabled, Official releases will have their rating increased by the Release is an Official Release score in the Match Score preference. Enabling this option helps prevent incorrect matching to live bootleg tracks.

If Prefer releases of the following types is enabled, tracks whose release is one of the types checked below it will have their rating increased by the Release is one of the preferred release types score in the Match Score preference. This option can be used to favour original albums, singles or compilations.

Tracks with a release in the selected Preferred Country of release will have their rating increased by the Release released in preferred country score in the Match Score preference.

The Track Duration must be within this number of seconds option lets you specify that the duration of any matching track must be within x seconds of the actual duration of this track. Matches can be up to x seconds shorter or longer, to account for slight differences in encoding or possibly in song versions.

If the Possible matches must be within track duration option is enabled, no track will be returned based on a metamatch unless its track duration is within the range specified. If it is disabled, such tracks will still be returned as long as their total metamatch score is high enough. This option does not affect tracks matched based on their PUID.
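
As a rough illustration of the two duration options, the following Python sketch shows one way such a filter could behave. The function and parameter names are hypothetical, and the sketch models only the duration check, not the rating.

```python
def passes_duration_filter(candidate_secs: float,
                           actual_secs: float,
                           tolerance_secs: float,
                           must_be_within: bool,
                           matched_by_puid: bool) -> bool:
    """Return True if a candidate survives the duration check.

    candidate_secs  -- duration of the candidate track from the database
    actual_secs     -- duration of the file being matched
    tolerance_secs  -- the "within this number of seconds" setting (x)
    must_be_within  -- the "Possible matches must be within track duration" option
    matched_by_puid -- True if the candidate was found by its acoustic id
    """
    # PUID matches are not affected by the option, and neither is any
    # candidate when the option is switched off.
    if matched_by_puid or not must_be_within:
        return True
    # The candidate may be up to x seconds shorter or longer.
    return abs(candidate_secs - actual_secs) <= tolerance_secs
```

For example, with a tolerance of 10 seconds a 245 second candidate is accepted for a 240 second file, while a 260 second candidate is rejected unless it was matched by PUID or the option is disabled.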

1.4.2. Match Score Preferences

Jaikoz finds the best few possible matches for a track (up to ten), then calculates a rating out of 100 for each one. The highest-rated track is selected by the Autocorrecter if its rating is above the minimum rating accepted. You can decide how the rating is calculated by adjusting how the rating is allocated between the individual scores. Some ratings are not used unless the associated checkbox is enabled in the Match Preferences; for example, the Release is an Official Release rating is only used if you have enabled Prefer official releases in the Match Preferences.
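
The following Python sketch shows the general shape of such a rating scheme: a base score plus bonuses that only apply when the matching preference is enabled, capped at 100. It is an illustration only. The base_rating field, the bonus values and all names are hypothetical and are not Jaikoz's real scores or code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Candidate:
    name: str
    base_rating: int                 # hypothetical metadata similarity score
    official: bool = False
    release_type: str = "album"
    country: str = ""
    used_by_other_tracks: bool = False


@dataclass
class Preferences:
    prefer_official: bool = True
    prefer_used_releases: bool = True
    preferred_types: Set[str] = field(default_factory=set)
    preferred_country: str = ""
    # Hypothetical bonus values, adjustable like the Match Score preferences.
    official_score: int = 10
    used_score: int = 10
    type_score: int = 5
    country_score: int = 5


def rate(c: Candidate, p: Preferences) -> int:
    """Add a bonus only when the matching preference is enabled."""
    rating = c.base_rating
    if p.prefer_official and c.official:
        rating += p.official_score
    if p.prefer_used_releases and c.used_by_other_tracks:
        rating += p.used_score
    if c.release_type in p.preferred_types:
        rating += p.type_score
    if p.preferred_country and c.country == p.preferred_country:
        rating += p.country_score
    return min(rating, 100)          # ratings are out of 100


def best_match(candidates: List[Candidate], p: Preferences,
               min_rating: int) -> Optional[Candidate]:
    """Pick the highest rated candidate, or None if it is below the minimum."""
    best = max(candidates, key=lambda c: rate(c, p), default=None)
    # Assumption: a rating equal to the minimum is accepted.
    if best is None or rate(best, p) < min_rating:
        return None
    return best
```

In this sketch a live bootleg with a base rating of 75 loses to an official release with a base rating of 70 once the official and preferred country bonuses are added, which is the kind of outcome the Prefer official releases option is meant to produce.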

1.4.3. AutoMatch Options

The Do not match if unable to Acoustic Id Match option will only match tracks that have an acoustic id. This increases accuracy but reduces the number of matches.

The Minimum rating required if meta match only option allows you to balance the number of matches against accuracy. If your track has no acoustic id the match is done purely on the Metadata. The higher this setting, the more likely it is that any match made will be correct, but fewer tracks will get matched at all. If your track has an acoustic id this setting is ignored, because acoustic ids are so accurate that an acoustic match is very likely to be for the correct track (even if it is not for the exact version), and forcing a check on Metadata as well would throw away accurate matches for tracks with poor or incorrect Metadata. The default value is 70.

If the Acoustic Id match must also have minimum meta rating option is enabled, tracks matched by their acoustic id must also have a Metadata rating higher than the value in Minimum rating required if have an Acoustic Id Match. This option is not enabled by default.

If the Prefer to use acoustic id if exists even if rating is less than metamatch option is enabled, Jaikoz will always use a match based on the MusicIP Id over a metamatch, regardless of which has the better score. Using a MusicIP Id is safer than a metamatch for identifying the correct title because it does not rely on your file having correct Metadata. However, when a track appears on multiple releases, metamatches can be better at ensuring that all tracks from one release are tagged against the same release.
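
How the four AutoMatch options interact can be sketched in Python as below. This is one possible reading of the descriptions above, not Jaikoz's algorithm. The names are hypothetical, the 50 and the prefer_acoustic default are invented values, and the sketch assumes the meta-only minimum applies to every candidate found by Metadata alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Match:
    track_id: str
    rating: int            # Metadata rating out of 100
    by_acoustic_id: bool   # True if found via the acoustic id


@dataclass
class AutoMatchOptions:
    require_acoustic_id: bool = False        # "Do not match if unable to Acoustic Id Match"
    min_meta_rating: int = 70                # documented default
    acoustic_needs_min_rating: bool = False  # not enabled by default
    min_acoustic_rating: int = 50            # hypothetical value
    prefer_acoustic: bool = True             # hypothetical default


def choose_match(candidates: List[Match],
                 opts: AutoMatchOptions) -> Optional[Match]:
    acoustic = [m for m in candidates if m.by_acoustic_id]
    meta = [m for m in candidates if not m.by_acoustic_id]

    # Optionally demand a minimum Metadata rating from acoustic matches too.
    if opts.acoustic_needs_min_rating:
        acoustic = [m for m in acoustic if m.rating > opts.min_acoustic_rating]

    # Meta-only matches are dropped entirely, or filtered by their minimum.
    if opts.require_acoustic_id:
        meta = []
    else:
        meta = [m for m in meta if m.rating >= opts.min_meta_rating]

    best_acoustic = max(acoustic, key=lambda m: m.rating, default=None)
    best_meta = max(meta, key=lambda m: m.rating, default=None)

    if best_acoustic is None:
        return best_meta
    if best_meta is None or opts.prefer_acoustic:
        return best_acoustic
    # Otherwise the higher rating wins; ties go to the acoustic match.
    return best_acoustic if best_acoustic.rating >= best_meta.rating else best_meta
```

The sketch makes the trade-off visible: with prefer_acoustic enabled a weakly rated acoustic match beats a strongly rated metamatch, which protects files with bad Metadata but can split an album across releases.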

1.4.4. Format Options

This lists most of the values that can be populated from MusicBrainz upon a successful match, including text fields and Artwork. By default all fields are populated on a MusicBrainz match if MusicBrainz has a value, but you can choose to populate a field only if it is empty, or never to populate it.
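
The three per-field choices (always, only if empty, never) can be sketched in Python as follows. The enum, the function and the dictionary-based tag representation are assumptions made for illustration and do not reflect how Jaikoz stores tags.

```python
from enum import Enum
from typing import Dict


class Populate(Enum):
    ALWAYS = "always"
    IF_EMPTY = "if empty"
    NEVER = "never"


def apply_match(tags: Dict[str, str],
                remote: Dict[str, str],
                policy: Dict[str, Populate],
                default: Populate = Populate.ALWAYS) -> Dict[str, str]:
    """Return a copy of 'tags' updated from 'remote' according to 'policy'."""
    updated = dict(tags)
    for name, value in remote.items():
        if not value:
            continue                      # the database has no value for this field
        rule = policy.get(name, default)
        if rule is Populate.NEVER:
            continue
        if rule is Populate.IF_EMPTY and updated.get(name):
            continue                      # keep the value already in the file
        updated[name] = value
    return updated
```

Returning a copy rather than changing the input mirrors the behaviour that nothing is written to the file until you save changes.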

When matching cover art Jaikoz will attempt to find the highest quality image it can from a variety of sources. The image is added to the audio file automatically but, as with all tasks, it will not be saved until you save changes.

A release may have multiple release events, these consist of a Country and a Release Date. The When selecting year and country of release prefer to use option lets you specify what is more important - the country or the date of release.
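
A minimal Python sketch of choosing between release events is shown below. It assumes that preferring the date means taking the earliest event and that preferring the country means taking the earliest event in your preferred country when one exists. Both rules, and the dictionary layout, are assumptions; the manual does not spell out the exact behaviour.

```python
from typing import Dict, List, Optional


def pick_release_event(events: List[Dict[str, str]],
                       preferred_country: str,
                       prefer: str = "country") -> Optional[Dict[str, str]]:
    """Choose one release event from a list of {"country", "date"} dicts.

    Dates are ISO style strings ("1999" or "1999-05-10"), which sort
    correctly as plain text.
    """
    if not events:
        return None
    if prefer == "country":
        local = [e for e in events if e["country"] == preferred_country]
        if local:
            return min(local, key=lambda e: e["date"])
    # prefer == "date", or no event in the preferred country: earliest wins.
    return min(events, key=lambda e: e["date"])
```

For a release issued in the US in March 1999 and in the UK in May 1999, a UK user who prefers country would get the May date and GB, while preferring date would give the March date and US.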

The Translate foreign artist names to English where possible option uses the English (Latin script) version of an artist's name where one is available. This is useful if your system is English/Latin based but you have some tracks by non-Latin artists, such as Chinese or Japanese artists, which would otherwise be difficult to use.

This is the current list of fields that can be auto populated:

  • MusicBrainz Unique Id
  • Artist
  • Album
  • Album Artist
  • Sort Artist
  • Sort Album Artist
  • Title
  • Year
  • Track No
  • is Compilation
  • MusicBrainz Artist Id
  • MusicBrainz Release Id
  • MusicBrainz Release Artist Id
  • MusicBrainz Disc Id
  • Release Type
  • Release Status
  • Release Country
  • Amazon Id
  • Artwork
  • Composer

1.4.5. Format 2 Options

This lists additional values that can possibly be populated by MusicBrainz upon a successful match.

This is the current list of fields that can be auto populated:

  • Label
  • Barcode
  • Catalog No
  • Media
  • Lyricist
  • Conductor
  • Remixer
  • Release Discogs Url
  • Release Wikipedia Url
  • Release Official Url
  • Sort Composer

1.5. ManualCorrect Tags from MusicBrainz

For each (selected) track this finds up to ten potential matches in the MusicBrainz Database. The same algorithm is used as for the autocorrect, but instead of automatically selecting the best match, up to ten matches are displayed and it is your decision whether to select a match or not. Matches by Acoustic Id are shown first, then matches by Metadata, sorted by their rating. Each choice is displayed on a separate row for each release event, so that you can choose the Release Date, Record Label and CatalogNo when there are multiple choices. You can modify which fields are displayed and in what order, and these changes are preserved. The last column contains either Search for the master record or View for the choices. If you select View, Jaikoz opens the release in MusicBrainz so that you can look at it in more detail. If you select Search, it opens MusicBrainz Search; you can then select one of the results or search again to find the track that you want, then select the tagger button on the webpage and Jaikoz will use this selection.

By default records are processed in batches of ten. After the first ten songs are processed the matches are displayed in a dialog. While you are reviewing the options, the next batch of songs is processed in the background. If you select Ok the songs are updated to your matches. This continues until there are no more records to process, unless you select Cancel. You can select Reset to undo any changes you have made in the batch that you are currently reviewing.

1.5.1. Manual Match Preferences

The Number of records to process before showing dialog sets how many records are processed before displaying the potential matches in a dialog when running ManualCorrect Tags from MusicBrainz.
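
The batch-and-review flow described in this section can be sketched in Python as below. It is an illustration, not Jaikoz's implementation: find_matches and review are hypothetical callbacks, and a single background worker stands in for whatever Jaikoz uses internally. Reset is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batches(records: Iterable, size: int = 10) -> Iterator[List]:
    """Yield consecutive lists of at most 'size' records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def review_in_batches(records: Iterable,
                      find_matches: Callable,
                      review: Callable[[List], bool],
                      size: int = 10) -> None:
    """Look up one batch in the background while the previous one is reviewed.

    'review' shows a batch of results to the user and returns False on Cancel.
    """
    def process(chunk: List) -> List:
        return [find_matches(record) for record in chunk]

    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for chunk in batches(records, size):
            upcoming = pool.submit(process, chunk)    # starts in the background
            if pending is not None and not review(pending.result()):
                upcoming.cancel()                     # user cancelled: stop here
                return
            pending = upcoming
        if pending is not None:
            review(pending.result())                  # final batch
```

A smaller batch size shows the first dialog sooner, while a larger one means fewer interruptions; that is the balance the Number of records to process before showing dialog setting controls.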

1.6. Cluster Albums

Many albums exist within MusicBrainz as multiple releases, usually because the album was released with an extra track in a particular country. Because of a flaw in the MusicBrainz schema, all the track ids for an album are unique to that album, so even if an identical track appears on both releases it will have a different track id on each. When a PUID is created for a song it is likely to be associated with only one of these track ids. What typically happens when you look up some tracks from one album in Jaikoz is that it finds matches, but some are for tracks in one version of the release and some in another. This means the MusicBrainz Release Id varies, and things like the Find Missing Tracks Report do not work so well because they compare release ids rather than album names. The Cluster Albums task analyses all the (selected) tracks that have been matched in MusicBrainz, groups them by artist and album, and tries to reduce the number of release ids the tracks are spread over.
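
The idea behind clustering can be sketched in Python as follows: group tracks by artist and album, then move each track to the release id most used within its group, provided that release is one the track could belong to. The field names ("path", "release_id", "candidates") and the tie-breaking rule are hypothetical; this is not the algorithm Jaikoz uses.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def cluster_albums(tracks: List[dict]) -> Dict[str, str]:
    """Map each track's path to the release id it should use.

    Each track is a dict with "path", "artist", "album", its current
    "release_id", and "candidates": other release ids the same recording
    is known to appear on.
    """
    groups = defaultdict(list)
    for track in tracks:
        key = (track["artist"].casefold(), track["album"].casefold())
        groups[key].append(track)

    chosen: Dict[str, str] = {}
    for group in groups.values():
        # How many tracks in this group currently point at each release.
        popularity = Counter(t["release_id"] for t in group)
        for t in group:
            options = sorted(set(t["candidates"]) | {t["release_id"]})
            # Prefer the most used release; on a tie keep the current one.
            chosen[t["path"]] = max(
                options,
                key=lambda r: (popularity[r], r == t["release_id"]),
            )
    return chosen
```

If two tracks of an album were matched to release A and a third to release B, and the third track also appears on A, the sketch moves it to A so that all three share one release id.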

1.7. Correct Lyrics

Jaikoz can retrieve the lyrics for many of your songs using an online database.

1.8. Update Tags from Existing MusicBrainz Id

For each (selected) track this checks whether it already has a MusicBrainz Id. If it does, Jaikoz retrieves the latest information from MusicBrainz for this Id and updates the relevant fields in the record if they are enabled in your Format options. If the track does not have a MusicBrainz Id but does have a MusicBrainz Release Id, Jaikoz tries to find the correct track on the release by comparing the track name, track number and track duration.

This has a number of uses:

  • Ensuring that you have the latest information from MusicBrainz because MusicBrainz may have improved/corrected/increased the information held for this track since you originally matched it.

  • Jaikoz may not have been able to find a MusicBrainz Match itself, so you can find the correct track using the MusicBrainz Website yourself and then enter the id into the UniqueId field, then use this task to update all the fields based on this Id.

  • Even easier, just find the Release Id and then paste the MusicBrainz Release Id for every track into the Release Id field.
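
When a track has only a Release Id, the correct track on that release has to be picked by comparing track name, track number and duration. A minimal Python sketch of such a comparison follows. The weights, the 10 second tolerance and the dictionary fields are assumptions for illustration, and the title comparison uses the standard library's difflib rather than whatever Jaikoz uses.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def find_track_on_release(local: dict,
                          release_tracks: List[dict],
                          duration_tolerance: int = 10) -> Optional[dict]:
    """Return the release track that best matches the local file.

    'local' and each release track are dicts with "title", "track_no"
    and "duration" (in seconds).
    """
    def score(remote: dict) -> float:
        # Title similarity between 0.0 and 1.0, ignoring case.
        s = SequenceMatcher(None,
                            local["title"].casefold(),
                            remote["title"].casefold()).ratio()
        if local.get("track_no") == remote["track_no"]:
            s += 0.5                      # hypothetical weight
        if abs(local["duration"] - remote["duration"]) <= duration_tolerance:
            s += 0.5                      # hypothetical weight
        return s

    return max(release_tracks, key=score, default=None)
```

Combining three signals means a file with a slightly different title, such as a remastered suffix, can still be paired with the right track when its number and length agree.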

1.9. Update Tags from Discogs

For each (selected) song this checks whether it already has a Release Discogs Url. If it does, Jaikoz retrieves the latest information from Discogs for this Release Id and tries to find the correct track on the release by comparing the track name, track number and track duration of your track with the Discogs release information. Jaikoz then updates the relevant fields in the record if they are enabled in your Format options. The Release Discogs Url can be added after a successful MusicBrainz lookup, but if your song does not have one you can search for it on Discogs yourself and add it manually.

Whether or not you have already tagged your track from MusicBrainz, Discogs may contain additional information that is not included in the MusicBrainz Database. You can also use this to help identify information that could be added to MusicBrainz. Discogs Cover Art is generally of higher quality than that available from MusicBrainz.

1.9.1. Format Preferences

This lists the values that can be populated from Discogs upon a successful match, including text fields and Artwork. By default fields are only populated if they are currently empty, but you can choose always to populate a field, or never to populate it.

This is the current list of fields that can be auto populated:

  • Artist
  • Album
  • Album Artist
  • Release Type
  • Release Status
  • Title
  • Year
  • Track No
  • Is Compilation
  • Release Country
  • Artwork
  • Composer
  • Label
  • Barcode
  • Cat No
  • ISRC
  • Lyricist
  • Conductor
  • Remixer
  • Genre

1.10. Submissions to Musicbrainz

In order to submit data to MusicBrainz you need to have a user account. Accounts are completely free and can be created at the MusicBrainz Website. If you have a user account, specify your username and password in this tab in the MusicBrainz Username and MusicBrainz password fields. By submitting data to MusicBrainz you are helping to improve this open source database and maintain the most accurate music resource in existence. By submitting information for songs that you own you create an archive of information that will never be lost, even if your own files get damaged. Currently the majority of information contributed to MusicBrainz is submitted via their website, but you can contribute some information automatically via Jaikoz. If you do this, please ensure the information you submit is accurate and of high quality.

1.10.1. Submit MusicBrainz/PUID Pair

If you have successfully created an Acoustic Id (MusicIP PUID) for a track, have successfully found a match on MusicBrainz, and your track has a MusicBrainz Unique File Id, you can submit the pairing to the MusicBrainz database. You should only submit a pairing if you have confirmed that it has been correctly identified, and you should not submit the same pairing more than once.

1.10.2. Submit MusicBrainz Genres

MusicBrainz has a Folksonomy cloud that can be used to tag artists, releases and tracks with anything you like. Jaikoz uses this to add genres at release and track level. Because genres are subjective, this feature will become more useful as more people submit their genres, so please give it a go. Jaikoz also uses the folksonomy cloud to fix the genre field when correcting songs from MusicBrainz.

1.10.3. Submit Musicbrainz Collection

MusicBrainz can store a list of the releases you own and inform you of new releases by your favourite artists. Submitting releases can be done via the website, but Submit Musicbrainz Collection provides an easier way to submit all the releases loaded in Jaikoz in one go.