[Logo] Jaikoz and SongKong Forums
Messages posted by: Wabbit
Profile for Wabbit -> Messages posted by Wabbit [27]
What I'd like to see when it pops up "MusicBrainz server busy" is a countdown to re-establish the connection and continue (without losing its current place), with the OPTION to quit.

As it stands now, my only choice is to restart tagging from the beginning.
Even though I'm only doing 2,000-3,000 files at a time, this adds many extra hours of having to restart constantly.

This isn't a trivial problem...
What I need is for it to distinguish a network dropout and wait until it's connected, then resume, or at least the option to alter the number of retries...
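Just to illustrate the kind of retry behaviour I mean, here's a minimal sketch in Python (all the names here are made up for illustration; this isn't how Jaikoz is actually built):

```python
import time

def fetch_with_retry(fetch, max_retries=10, wait_seconds=30):
    """Retry a flaky network call instead of aborting the whole batch.

    `fetch` is any zero-argument callable that raises on network failure.
    Hypothetical sketch only; Jaikoz's real internals are not public.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == max_retries:
                raise  # give up only after the configured number of retries
            # countdown before re-establishing the connection and resuming
            time.sleep(wait_seconds)
```

The point is that `max_retries` and `wait_seconds` would be user preferences, so a momentary dropout never forces a restart from the beginning.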
When selecting folders and/or files to load, it would help if I could click and drag to highlight the folders and/or files I want, as opposed to holding down the Ctrl key...

paultaylor wrote:
On the todo list, but it's of limited use if we can't load the whole library into Jaikoz 


On a large collection, it would be impossible to load the entire collection anyway. You would be able to sort and pre-process (or clean up) the data, then run it row by row retrieving acoustic IDs, loading and releasing each file as it went, writing that acoustic ID into the database.
Once all the acoustic IDs are written, you can turn it loose retrieving MusicBrainz data and let it go autonomously until it's finished.

It's just a matter of picking a strategy that leaves no holes, and gets the job done as quickly and thoroughly as possible, without constant user intervention.

With a database, it isn't necessary to commit ANY changes to the files until after the entire database has been processed; it can simply run autonomously. The user would then run operations on the database until he/she was happy with the results. The collection doesn't have to be loaded all at once into Jaikoz. It does have to be written in its entirety into the database, but not all at once... it can be done in logical stages that make the best use of the processing speed and the connection to MusicBrainz.
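A tiny sketch of that staging idea using SQLite (the table and column names here are invented purely for illustration):

```python
import sqlite3

# Stage proposed changes in a database; touch the actual files only at the end.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE tracks (
    path TEXT PRIMARY KEY,
    proposed_artist TEXT,      -- filled in by the autonomous lookup pass
    committed INTEGER DEFAULT 0)""")

# Population pass: record every file, no tag data yet.
for path in ["/music/a.mp3", "/music/b.mp3"]:
    con.execute("INSERT INTO tracks (path) VALUES (?)", (path,))

# Lookup pass: write proposed values row by row (stubbed out here).
con.execute("UPDATE tracks SET proposed_artist = 'Pink Floyd'")

# Commit pass: only now would the files themselves be rewritten.
rows = con.execute(
    "SELECT path, proposed_artist FROM tracks WHERE committed = 0").fetchall()
for path, artist in rows:
    # write_tag(path, artist)  -- real file I/O would happen here
    con.execute("UPDATE tracks SET committed = 1 WHERE path = ?", (path,))
```

Until that last pass runs, nothing on disk has changed, so a crash or a server dropout mid-way costs nothing.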

I'm sorry for being so pushy, Paul. This is the best tagging software I've found so far... Just want this to be the best it can be, as I'm sure you do as well...

I still don't understand where this table speed issue comes from...

The database is responsible for the speed. Tying (or more correctly, binding) the database to the user interface leaves you with choices as to how close to real time you want the scroll bars to work. It all depends on your strategy. You have a choice of how large a dataset to load into memory: too small, and you get a performance hit when you run out of rows; too large, and you're eating up resources on something that can't be displayed anyway.
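For example, that memory trade-off could be handled with a generic page cache along these lines (a hypothetical sketch, not Jaikoz code):

```python
class PageCache:
    """Keep only a window of rows in memory; fetch a page on demand."""

    def __init__(self, fetch_page, page_size=200):
        self.fetch_page = fetch_page  # fetch_page(offset, limit) -> list of rows
        self.page_size = page_size
        self.pages = {}               # page index -> cached rows

    def row(self, index):
        page = index // self.page_size
        if page not in self.pages:    # cache miss: the performance hit you feel
            self.pages[page] = self.fetch_page(page * self.page_size,
                                               self.page_size)
        return self.pages[page][index % self.page_size]
```

`page_size` is exactly the knob I mean: small pages keep memory low but stall more often while scrolling; big pages do the opposite.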

Given the number of tasks that are performed on the data, I can see converting these to database operations being a formidable task, especially if you don't specialize in database programming...
Ever considered using XML? I have a DJ program that uses an XML-based library...

Might be worth looking into, since it's a standard most, if not all, databases can import and export.
When using "Save and Move", I get a popup message saying that it's unable to move some files because a file with the same name already exists at the destination.
It says these files have been left with a status of "changed", but then the list goes blank, so I have no way to track which ones they are (hence why I suggested implementing a "problem" files folder to move these files to, so they aren't processed over and over and over again...)

If it says they've been left with a status of "changed", then they should REMAIN in the list after the "Save and Move" so I can deal with them appropriately...
Perhaps a way to mark these files or rename them so they don't get targeted again, wasting valuable time and resources...
Nope, I'm here in the U.S., but I think it's happening because I'm on a wireless connection to my router via a USB wireless-G stick. It drops out momentarily, and has done so for as long as I can remember, so persisting would overcome this...
I'd rather see it persist and wait for MusicBrainz (or my network connection) to reacquire, rather than have to start over... If it pops up that a connection is lost (for whatever reason), a CHOICE to terminate...
would be far more useful than displaying an error message listing the files it couldn't do something with (and only the first 30 at that...).

I've seen files with offset problems, corrupted files, files unable to move because "a file already exists with that name", and so on...

Rather than leave them in my collection, scattered through folders that I have to hunt down, it would be far easier to manage if there were at least an option to move any files that Jaikoz encountered problems with to a designated folder.
I've had this happen quite frequently lately. The problem is that Jaikoz interrupts its current operation when it encounters this. When you've been waiting a long time for it to finish a batch and it stops unexpectedly somewhere in the middle, that gets pretty frustrating.

I'm letting Jaikoz process about 500-750 files at a time, autocorrecting and then using "Save and Move" to separate tagged files from untagged ones.

Any chance Jaikoz can be set to keep trying MusicBrainz rather than give up?
This interrupts the current operation and keeps making me start over...

I can't use Find and Replace to filter out songs by file extension, so I can deal with those separately...

I thought this newest version fully supported WMA; was I mistaken?

Why should there even be a separate filename formatter? If it can name one, it can name them all...

Tagging is a different story.
It's listed under the "Store" tab in preferences and has a checkbox for "Download missing artwork".

I don't know if iTunes reads embedded artwork, or if it considers artwork missing even though it's embedded...

paultaylor wrote:

Wabbit wrote:
a suggestion would be to let the program assume a blank "Find" textbox means "select all" only when working on individual columns.
 


Wabbit wrote:

a combobox dropdown list (of unique entries in that column) instead of the "Replace" textbox would be very helpful in this mode....assuming it operates only on the current "filtered" view.... 


Both good ideas, I'll do this 


Perhaps a hotkey combo that replaces all row entries in a column with the "highlighted" entry. It would require less navigating and allow more speed. You'd just have to be careful if using it without filters...
You have to turn off that feature in iTunes if the files already have artwork embedded...
Maybe you could make it so the "Find and Replace" (which can search and affect all entries in a column) will allow just entering the desired genre in the "Replace" textbox and then choosing "Replace All".

Is there something I can enter in the "Find" textbox to make sure all rows in the selected column are found (regardless of whether a row is empty or has multiple genres)?

A suggestion would be to let the program assume a blank "Find" textbox means "select all", only when working on individual columns.
A combobox dropdown list (of unique entries in that column) instead of the "Replace" textbox would be very helpful in this mode... assuming it operates only on the current "filtered" view...
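The blank-Find-means-select-all idea is simple to express; something like this hypothetical sketch:

```python
def find_matches(column_values, find_text):
    """Return indices of matching rows in one column.

    An empty Find box selects every row, regardless of whether a cell
    is empty or holds multiple genres. (Illustrative sketch only.)
    """
    if find_text == "":
        return list(range(len(column_values)))
    return [i for i, value in enumerate(column_values) if find_text in value]
```

Replace All would then just write the chosen genre into every returned index.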
Even with 50 columns (which for most users would require horizontal scrolling), there is still the fact that only a finite number of rows can display on a screen, and vertical scrolling (in real time), especially when you drag the scroll bar quickly, may not seem seamless, because the database engine has to respond to the scroll command (i.e., smooth scrolling stops and it just updates when you stop).

Microsoft Access does this right out of the box...

You can gradually phase in the database by using it as a pointer to files that have been processed by Jaikoz (or not).

The real performance hit is that Jaikoz can't remember where it started or where it left off, so one has to pick the largest block of files (where you can remember where it leaves off) that doesn't lock up your computer.

The only other way I can imagine would be to process files one at a time, with a way to separate processed vs. unprocessed (those that returned acceptable results) so the user can easily keep track of files that need serious attention...
The database (whichever you choose to use) does that for you. With a 50 x 100,000 grid, only a small fraction is visible at once; the database responds to your scroll, insert, and other SQL commands and returns data accordingly. Each type of database may have configurations for how much to cache to remain responsive, but a sort on that number of records could take from 1 to 5 seconds, depending on which one you pick, how many tables are in it, and how complexly the tables are related. But that's why there's the hourglass... I guess you need to figure out which types would work best for your programming language and patience...
Now, programmatically interfacing with the database, as well as with the grid itself, I can imagine is no small task... My hat's off to you, Sir...
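For instance, with SQLite the grid would only ever ask the database for the rows currently on screen, already sorted (a rough sketch; the table and names are invented):

```python
import sqlite3

# A toy "tracks" table standing in for the 100,000-row grid.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tracks (title TEXT)")
con.executemany("INSERT INTO tracks VALUES (?)",
                [(f"track {i:05d}",) for i in range(1000)])

def visible_rows(first_row, row_count):
    # The grid asks only for the rows on screen; the database does the sort.
    return con.execute(
        "SELECT title FROM tracks ORDER BY title LIMIT ? OFFSET ?",
        (row_count, first_row)).fetchall()
```

Dragging the scroll bar just changes `first_row`; memory use stays at one screenful no matter how big the table gets.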

I've got about 200 GB worth of junk to sort out and clean up. I don't really care how long it takes, so long as it gets there eventually...
Local year correct isn't in the list...

paultaylor wrote:

Wabbit wrote:
Was hoping there was a quick and dirty way to do this. Do I need to add this to your wish list?
 

I've added it to my todo list, though really I don't see that it's much effort to do one copy and one paste, even if you are pasting into a few hundred rows. 


Probably easier to implement after a database is incorporated.
Second example, taking into account physical hard drive access limits:

Population phase:
Main app = Thread 0

Thread 1 reads the directory tree and filenames and writes them to the database (in cached blocks) as it progresses, until complete.

Generation phase (which can start shortly after Thread 1 has started populating the database):

Thread 2 reads the database and loads blocks of files at a time (writing existing tag data and generating MB IDs), releasing files from memory when finished writing to the database. The entries in use by this thread are locked.

Thread 1 queries the database for entries that aren't locked and have null MB ID values (up to the max block size) and begins generating MB IDs and writing tag values back to the database. These entries are also locked while in use.

The two threads go on processing entries in interlocked blocks until complete. The database is populated with all existing info and ready to go online to retrieve data...

So long as Threads 1 and 2 lock out access to the locations they're working on, the time lag in generating the MB IDs should be enough to keep hard drive activity from producing a bottleneck...

Retrieval phase: MB IDs can be read in interlocked blocks by Threads 1 and 2, each writing back results as they go.

The database is now ready to be manipulated as desired.
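A bare-bones sketch of the interlocked-block idea in Python (everything here is a stand-in; I don't know Jaikoz's internals):

```python
import threading

# Two workers claim non-overlapping blocks of "database rows" under a lock,
# so they never work on the same entries at once.
rows = list(range(1000))       # stand-ins for database entries
block_size = 100
next_block = 0
claim_lock = threading.Lock()
results = {}

def worker():
    global next_block
    while True:
        with claim_lock:       # claiming a block "locks" it to one thread
            start = next_block
            next_block += block_size
        if start >= len(rows):
            return
        for r in rows[start:start + block_size]:
            results[r] = r * 2  # stand-in for "generate the MB ID for this entry"

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because a block is claimed atomically before any work starts, no two threads ever write the same entry, which is the whole point of the interlocking.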
Just for example: the main app is Thread 0.

Population phase: staggered thread start
Thread 1 reads/writes the folder/subfolder directory into the database
Thread 2 reads database entries and goes to each folder/subfolder, writing filenames into the database
Thread 3 gets the file location from the database, then reads tag info from each file and writes it back to the database

Generation phase:
By now Thread 1 is done reading the folder structure and starts generating MB IDs, writing to each database entry as it goes (this is where most memory is used and released).
Thread 2 starts the lengthy process of online MB ID data retrieval, writing results as it goes.
Thread 3 finishes writing tag info and generates MB IDs starting halfway down the folder tree.

I don't know if two threads can individually access MB online data at the same time, but if they can, then either or both of Threads 1 and 3 can go online to retrieve MB tag data after finishing MB ID generation.

To avoid data corruption you have to keep threads from writing to the same entry or file at the same time (this you already know), so the idea is to stagger the load across multiple threads at multiple locations in the database and the filesystem simultaneously... Thread 0 could monitor progress and divide database sections for each thread to chew on.

To ease up on read/write access to the database, you could cache reads and writes to memory and do them in blocks... In any case, the file itself only remains in memory long enough to generate the MB ID (and again to commit changes later); everything else is database access, which can be done with SQL.

For multi-core systems this would scream compared to the top-down approach... For an older single core, I guess it depends on how well it can multi-thread.

Food for thought... wish I knew how to do it myself...
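For what it's worth, the staggered multi-thread idea could be sketched with a thread pool like this (purely illustrative; the file names and the `process` step are made up):

```python
from concurrent.futures import ThreadPoolExecutor

files = [f"/music/{i}.mp3" for i in range(12)]   # hypothetical file list

def process(path):
    # Stand-in for: read tags, generate the MB ID, write results to the database.
    return (path, len(path))

# A pool keeps several files in flight at once: on multi-core it runs them in
# parallel, and even on one core it can overlap network waits with local work.
with ThreadPoolExecutor(max_workers=4) as pool:
    processed = dict(pool.map(process, files))
```

The pool hands out work as threads free up, which is the "divide sections for each thread to chew on" job I imagined Thread 0 doing.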

Is there a setting somewhere to store only the year (vs. the full date) when it performs autocorrect?
I found the setting for local correct, but that's not an option I can use for the automatic portion... so I have to make a second pass with local correct every time...
Many files end up with more than two album art pics. How can I choose which one to display for all files belonging to that album?
Thanks for the reply...

Was hoping there was a quick and dirty way to do this. Do I need to add this to your wish list?

MusicBrainz records aren't consistent with genres from one album to the next for the same artist. This feature would help a lot, since artists typically stay within a genre... For example, I have a ton of albums by Pink Floyd; that's a whole lot of pasting...
It does sound like a database is necessary to gather and retain current and proposed changes (between sessions), with committing to the files being done last...

For 50,000-plus values it's going to be a bit sluggish anyway, but a database is the best tool for that job... I guess it all depends on which database system you choose, since each has its own strengths and weaknesses. Isn't sorting built into the database itself? I have a karaoke songbook database in Access with well over 100,000 entries (master listing) that sorts rather well... though album art would seem to pose the biggest headache, unless you store a path to the album art in the database.

I guess the trick here is the sequencing used to get the database populated, and whether you want to multi-thread while the database is populating. I'm running a quad core with 4 GB of RAM; have you done work optimizing for multi-core processors?
 
Powered by JForum 2.1.6 © JForum Team