mjb.listing.clear.UNKNOWN should clear both "UNKNOWN" and "UNDEFINED" #1787

Closed
Omertron opened this issue Mar 15, 2015 · 20 comments
Comments

@Omertron
Member

Original issue 1788 created by Omertron on 2011-01-16T15:33:12.000Z:

Hi,

I find lots of UNDEFINED subtitles (see capture). It would be nice if mjb.listing.clear.UNKNOWN could clear both UNDEFINED and UNKNOWN subtitles.

Thanks for your help.
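
A minimal sketch of the requested behaviour, assuming the listing step handles subtitle values as plain strings. The class and method names (`ListingCleaner`, `clearPlaceholder`) are hypothetical and not taken from the YAMJ source; only the two placeholder strings come from this report. It needs Java 9 or later for `Set.of`.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not YAMJ code: blanks out placeholder values in listing output.
final class ListingCleaner {
    // The two placeholder strings named in this issue.
    private static final Set<String> PLACEHOLDERS = Set.of("UNKNOWN", "UNDEFINED");

    private ListingCleaner() {
    }

    /**
     * Returns "" for null input, and "" when clearing is enabled and the value
     * is one of the placeholders; otherwise returns the value unchanged.
     */
    static String clearPlaceholder(String value, boolean clearEnabled) {
        if (value == null) {
            return "";
        }
        if (clearEnabled && PLACEHOLDERS.contains(value.trim().toUpperCase(Locale.ROOT))) {
            return "";
        }
        return value;
    }
}
```

With clearing enabled, both "UNKNOWN" and "UNDEFINED" come back empty, while a real value such as "French" passes through untouched.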

@Omertron
Member Author

Comment #1 originally posted by Omertron on 2011-01-16T19:35:49.000Z:

This issue was closed by revision r2063.

@Omertron
Member Author

Comment #2 originally posted by Omertron on 2011-01-18T17:30:42.000Z:

Can't index with this release (r2063): "GC overhead limit exceeded"!

{18:12:03 Thread-9} Finished: ça commence aujourd'hui (1999) -- Bertrand Tavernier - Philippe Torreton, Maria Pitarresi, Nadia Kaci (2266/2266)
{18:12:03 main} Memory - Total: 981 MB, Free: 903 MB
{18:12:03 main} Indexing libraries...
{18:12:03 Thread-13} Indexing Genres...
{18:12:03 Thread-14} Indexing Year...
{18:12:03 Thread-15} Indexing Director...
{18:12:03 Thread-18} Indexing Other...
{18:12:03 Thread-16} Indexing Cast...
{18:12:03 Thread-17} Indexing Country...
{18:12:03 Thread-15} Indexing Title...
{18:12:03 Thread-13} Indexing Library ...
{18:12:04 main} Sorting Indexes ...
{18:12:05 main} Memory - Total: 981 MB, Free: 777 MB
{18:12:05 main} Indexing masters...
{18:12:07 main} Memory - Total: 981 MB, Free: 860 MB
{18:12:07 main} Writing movie data...
{18:12:29 main} Memory - Total: 2,24 GB, Free: 637 MB
{18:12:29 main} Writing Indexes XML...
{18:12:29 main} Indexing Director (1/8) contains 1047 indexes
{18:12:34 main} Indexing Genres (2/8) contains 27 indexes
{18:12:35 main} Indexing Year (3/8) contains 13 indexes
{18:12:35 main} Indexing Country (4/8) contains 60 indexes
{18:12:35 main} Indexing Title (5/8) contains 27 indexes
{18:12:39 main} Indexing Other (6/8) contains 5 indexes
{18:12:39 main} Indexing Cast (7/8) contains 32742 indexes
{18:14:10 main} Indexing Set (8/8) contains 51 indexes
{18:14:12 main} Writing Category XML...
{18:14:12 main} Indexing Categories
{18:14:13 main} Writing Indexes HTML...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded

I have to revert to r2009 to make it work.
Windows 7 64 bits
java -Xms1024m -Xmx5500m -classpath .;resources;lib/* com.moviejukebox.MovieJukebox %*
This is the maximum I can do with my 6Mb machine.

Thanks for your help!
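
For readers comparing the `-Xms`/`-Xmx` flags with the "Memory - Total/Free" lines in the log, here is a small sketch that prints the same kind of figures using the standard `Runtime` API. The class name `HeapReport` and the exact message format are assumptions made to mirror the log; this is not the code YAMJ uses.

```java
// Hypothetical helper: reports heap figures in roughly the shape of the log lines above.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    private HeapReport() {
    }

    /** Formats byte counts as whole megabytes (integer division, so values round down). */
    static String format(long totalBytes, long freeBytes) {
        return "Memory - Total: " + (totalBytes / MB) + " MB, Free: " + (freeBytes / MB) + " MB";
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved by the JVM; it can grow up to maxMemory().
        System.out.println(format(rt.totalMemory(), rt.freeMemory()));
        // maxMemory() is the ceiling the JVM will try to use, which -Xmx controls.
        System.out.println("Max heap: " + (rt.maxMemory() / MB) + " MB");
    }
}
```

Running it with the same `-Xms`/`-Xmx` values as the command line above shows how much heap the JVM has actually reserved versus the ceiling it is allowed to reach.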

@Omertron
Member Author

Comment #3 originally posted by Omertron on 2011-01-18T17:32:40.000Z:

...sorry - I meant 6Gb machine...

@Omertron
Member Author

Comment #4 originally posted by Omertron on 2011-01-18T22:01:31.000Z:

This issue was updated by revision r2076.

Added more memory GC points during indexing

@Omertron
Member Author

Comment #5 originally posted by Omertron on 2011-01-19T16:43:20.000Z:

Sorry, no change with r2076:

{17:29:46 Thread-11} Finished: Zelig - Woody Allen (2266/2268)
{17:29:46 Thread-10} Finished: ça commence aujourd'hui (1999) -- Bertrand Tavernier - Philippe Torreton, Maria Pitarresi, Nadia Kaci (2268/2268)
{17:29:46 Thread-10} Memory - Total: 981 MB, Free: 900 MB
{17:29:46 Thread-12} Finished: Zorba le Grec (1964) -- Anthony Quinn, Alan Bates, Irene Papas (2267/2268)
{17:29:46 main} Memory - Total: 981 MB, Free: 918 MB
{17:29:46 main} Indexing libraries...
{17:29:46 Thread-14} Memory - Total: 981 MB, Free: 921 MB
{17:29:47 Thread-15} Memory - Total: 981 MB, Free: 919 MB
{17:29:47 Thread-16} Memory - Total: 981 MB, Free: 920 MB
{17:29:47 Thread-17} Memory - Total: 981 MB, Free: 914 MB
{17:29:47 Thread-18} Memory - Total: 981 MB, Free: 918 MB
{17:29:47 Thread-19} Memory - Total: 981 MB, Free: 915 MB
{17:29:47 Thread-14} Indexing Genres...
{17:29:47 Thread-19} Indexing Other...
{17:29:47 Thread-18} Indexing Country...
{17:29:47 Thread-17} Indexing Cast...
{17:29:47 Thread-16} Indexing Director...
{17:29:47 Thread-15} Indexing Year...
{17:29:47 Thread-18} Memory - Total: 981 MB, Free: 916 MB
{17:29:47 Thread-16} Memory - Total: 981 MB, Free: 915 MB
{17:29:47 Thread-18} Indexing Title...
{17:29:47 Thread-16} Indexing Library ...
{17:29:48 main} Memory - Total: 981 MB, Free: 737 MB
{17:29:48 main} Memory - Total: 981 MB, Free: 836 MB
{17:29:48 main} Sorting Indexes ...
{17:29:50 main} Memory - Total: 981 MB, Free: 799 MB
{17:29:50 main} Memory - Total: 981 MB, Free: 888 MB
{17:29:50 main} Memory - Total: 981 MB, Free: 907 MB
{17:29:50 main} Indexing masters...
{17:29:51 main} Memory - Total: 981 MB, Free: 870 MB
{17:29:52 main} Writing movie data...
{17:30:13 main} Memory - Total: 2,13 GB, Free: 1,08 GB
{17:30:14 main} Writing Indexes XML...
{17:30:14 main} Indexing Country (1/8) contains 61 indexes
{17:30:14 main} Indexing Director (2/8) contains 1048 indexes
{17:30:22 main} Indexing Year (3/8) contains 13 indexes
{17:30:22 main} Indexing Genres (4/8) contains 27 indexes
{17:30:22 main} Indexing Title (5/8) contains 27 indexes
{17:30:22 main} Indexing Other (6/8) contains 5 indexes
{17:30:22 main} Indexing Cast (7/8) contains 32754 indexes
{17:31:50 main} Indexing Set (8/8) contains 51 indexes
{17:31:51 main} Writing Category XML...
{17:31:51 main} Indexing Categories
{17:31:52 main} Writing Indexes HTML...
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded

Please find attached a capture of the amount of memory used by java just after the error message appeared.

@Omertron
Member Author

Comment #6 originally posted by Omertron on 2011-01-19T16:59:48.000Z:

Results with r2009:

{17:50:05 Thread-10} Finished: Zelig - Woody Allen (2266/2268)
{17:50:05 Thread-8} Finished: Zorba le Grec (1964) -- Anthony Quinn, Alan Bates, Irene Papas (2267/2268)
{17:50:05 Thread-9} Finished: Ça commence aujourd'hui (1999) -- Bertrand Tavernier - Philippe Torreton, Maria Pitarresi, Nadia Kaci (2268/2268)
{17:50:05 Thread-9} Memory - Total: 981 MB, Free: 904 MB
{17:50:05 main} Memory - Total: 981 MB, Free: 923 MB
{17:50:06 main} Indexing libraries...
{17:50:06 Thread-14} Indexing Genres...
{17:50:06 Thread-16} Indexing Director...
{17:50:06 Thread-15} Indexing Year...
{17:50:06 Thread-18} Indexing Country...
{17:50:06 Thread-19} Indexing Other...
{17:50:06 Thread-17} Indexing Cast...
{17:50:06 Thread-18} Indexing Library ...
{17:50:06 Thread-16} Indexing Title...
{17:50:07 main} Sorting Indexes ...
{17:50:08 main} Memory - Total: 981 MB, Free: 771 MB
{17:50:08 main} Indexing masters...
{17:50:09 main} Memory - Total: 1,21 GB, Free: 1,09 GB
{17:50:09 main} Writing movie data...
{17:50:21 main} Memory - Total: 1,21 GB, Free: 769 MB
{17:50:21 main} Writing Indexes XML...
{17:50:21 main} Indexing Director (1/8) contains 1048 indexes
{17:50:22 main} Indexing Country (2/8) contains 61 indexes
{17:50:22 main} Indexing Year (3/8) contains 13 indexes
{17:50:22 main} Indexing Genres (4/8) contains 27 indexes
{17:50:22 main} Indexing Title (5/8) contains 27 indexes
{17:50:23 main} Indexing Other (6/8) contains 5 indexes
{17:50:23 main} Indexing Cast (7/8) contains 32754 indexes
{17:51:10 main} Indexing Set (8/8) contains 51 indexes
{17:51:12 main} Writing Category XML...
{17:51:12 main} Indexing Categories
{17:51:13 main} Writing Indexes HTML...
{17:52:09 main} Writing Category HTML...
{17:52:11 main} Memory - Total: 4,64 GB, Free: 1,27 GB
{17:52:13 main} Copying new files to Jukebox directory...
Copying directory C:\YAMJ\temp\Jukebox (4684/4684)
{17:56:23 main} Skin copying skipped.
{17:56:23 main} Memory - Total: 4,62 GB, Free: 1,86 GB
{17:56:27 main} Jukebox cleaning skipped
{17:56:27 main} Generating listing output...
{17:56:30 main} Copying to: A:/Work/\MovieJukebox-listing.csv
{17:56:30 main} Clean up temporary files
{17:56:32 main}
{17:56:32 main} MovieJukebox process completed at Wed Jan 19 17:56:32 CET 2011
{17:56:32 main} Processing took 00:12:19

@Omertron
Member Author

Comment #7 originally posted by Omertron on 2011-01-22T20:21:53.000Z:

Could anyone help me with this issue? I can't upgrade further than r2009!

@Omertron
Member Author

Comment #8 originally posted by Omertron on 2011-01-23T08:03:41.000Z:

I thought I'd mentioned reducing -Xms to a smaller value, such as -Xms256m.

@Omertron
Member Author

Comment #9 originally posted by Omertron on 2011-01-23T17:06:57.000Z:

I tried with -Xms256m and r2076 (java -Xms256m -Xmx5500m -XX:-UseGCOverheadLimit -classpath .;resources;lib/* com.moviejukebox.MovieJukebox %*) and had to cancel the process after waiting more than 20 minutes for HTML indexing to complete (with r2009 it takes approximately 1 minute).
Java seems to loop (it keeps consuming CPU) and uses 5 GB of memory (as per attachment).

Thanks for your help!

@Omertron
Member Author

Comment #10 originally posted by Omertron on 2011-01-23T18:27:01.000Z:

There are a lot of indexes there.
I don't know what's changed in the last 70 revisions that might cause this issue.
The only thing I can suggest is to run YAMJ with "-i" to just generate the base XML files and then run normally to generate the indexes.

Either that, or increase the minimum cast/director index counts to reduce the number of indexes generated and the memory usage.

@Omertron
Member Author

Comment #11 originally posted by Omertron on 2011-01-25T15:10:35.000Z:

FYI, with the new AllocineAPI plugin, the number of actors, directors, etc. retrieved has increased enormously. Perhaps this is a side effect.

@Omertron
Member Author

Comment #12 originally posted by Omertron on 2011-01-25T15:18:17.000Z:

We need to find a better way to index the movies than we currently do.
There are several copies held in memory during indexing which is wasteful and slow.

Perhaps we should look at indexing as videos are added, then indexing would just be about writing the indexes to disk.
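
The idea of indexing as videos are added could look roughly like the sketch below. It is only an illustration under stated assumptions: `IncrementalIndex`, its methods, and the use of plain string ids are all hypothetical, and nothing here reflects how YAMJ's library classes are actually structured.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: one index (e.g. Cast) that is updated as each movie is scanned,
// so the later "indexing" phase only has to write the entries out.
final class IncrementalIndex {
    // index value (e.g. an actor name) -> ids of the movies carrying that value
    private final Map<String, Set<String>> entries = new TreeMap<>();

    /** Registers one movie under every value it carries; stores ids only, not movie copies. */
    void add(String movieId, Iterable<String> values) {
        for (String value : values) {
            entries.computeIfAbsent(value, k -> new TreeSet<>()).add(movieId);
        }
    }

    /** Returns the movie ids recorded for a value, or an empty set if there are none. */
    Set<String> moviesFor(String value) {
        return entries.getOrDefault(value, Collections.emptySet());
    }

    /** Number of distinct index values, comparable to the "contains N indexes" log lines. */
    int size() {
        return entries.size();
    }
}
```

Keeping only ids in each entry is one way to avoid holding several copies of the movie data during indexing; whether that fits the existing code is an open question.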

@Omertron
Member Author

Comment #13 originally posted by Omertron on 2011-01-26T18:05:49.000Z:

I think the main problem is that xml files are generated with all actors available for one movie, although I have set actors.max=6 in skins.properties.

For example, here's what I find in the xml for the movie "2 days in Paris":

Adam Goldberg Julie Delpy Daniel Brühl Marie Pillet Albert Delpy Aleksia Landeau Adan Jodorowsky Alexandre Nahon Charlotte Maury Sentier Vanessa Seward De Thibault Lussy Chick Ortega Patrick Chupin Antar Boudache Ludovic Berthillot Hubert Toint Sandra Berrebi Arnaud Beunaiche Claude Harold Benjamin Baroche Jean-Baptiste Puech Clément Rouault Nanou Benahmou

I understand that max.actors is used to display THAT number of actors in the detail page (see attachment).

But I think it would be useful if we could limit the number of actors written to the xml files. It would reduce the number of indexes to generate for casts.

For example:
max.actors=3 would display 3 actors in detail pages
max.xml.actors=6 would save only 6 actors in xml files, reducing the number of indexes to create
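
A sketch of what such a cap might do when the cast is written out, assuming the cast is available as an ordered list of names. `CastLimiter` and `limit` are hypothetical names, the proposed property does not exist in YAMJ at the time of this thread, and treating a non-positive value as "no limit" is an assumption made here. It needs Java 10 or later for `List.copyOf`.

```java
import java.util.List;

// Hypothetical sketch of the proposed cap on actors written to the XML files.
final class CastLimiter {
    private CastLimiter() {
    }

    /**
     * Returns the first maxActors names in their original order.
     * A non-positive maxActors is treated as "no limit" (an assumption of this sketch).
     */
    static List<String> limit(List<String> cast, int maxActors) {
        if (maxActors <= 0 || cast.size() <= maxActors) {
            return List.copyOf(cast);
        }
        return List.copyOf(cast.subList(0, maxActors));
    }
}
```

Applied with a cap of 6 to the "2 days in Paris" cast above, only the first six names would reach the XML, and only those would feed the Cast index.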

I tried your suggestion to run YAMJ with -i, but it didn't change a thing. All xml were regenerated with too many actors...

@Omertron
Member Author

Comment #14 originally posted by Omertron on 2011-01-27T12:59:32.000Z:

I think it would be better to optimize index creation than to add another property. What do you think, Stuart?

@Omertron
Member Author

Comment #15 originally posted by Omertron on 2011-01-27T13:44:12.000Z:

Yes, the indexing is very inefficient at the moment (lots of wasted memory) and needs to be optimised

@Omertron
Member Author

Comment #16 originally posted by Omertron on 2011-01-27T13:45:10.000Z:

Besides, chopping off the list of actors for each movie will give incorrect results when you add all the actors up to decide where to cut the list.
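
To illustrate the concern, here is a small sketch that counts how many movies each actor appears in, with and without a per-movie cap. It assumes an index entry is only created for people who reach some minimum appearance count; `CastCounter` and its parameter names are hypothetical and not part of YAMJ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counts appearances per actor across the whole library.
final class CastCounter {
    private CastCounter() {
    }

    /**
     * Counts appearances using only the first perMovieLimit names of each cast.
     * A non-positive perMovieLimit means the full cast is counted.
     */
    static Map<String, Integer> count(List<List<String>> casts, int perMovieLimit) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> cast : casts) {
            int end = perMovieLimit > 0 ? Math.min(perMovieLimit, cast.size()) : cast.size();
            for (String actor : cast.subList(0, end)) {
                counts.merge(actor, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

An actor listed third in one movie and second in another is counted twice with full casts but only once with a cap of two, so a minimum-count threshold applied after truncation can drop people who would otherwise qualify.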

@Omertron
Member Author

Comment #17 originally posted by Omertron on 2011-01-27T17:57:03.000Z:

I agree with you. The best solution is to improve the indexing process. When do you think it could be done, Stuart?

@Omertron
Member Author

Comment #18 originally posted by Omertron on 2011-01-27T18:04:37.000Z:

If I knew how to do it, it would be done already :)

@Omertron
Member Author

Comment #19 originally posted by Omertron on 2011-01-27T18:12:06.000Z:

In that case, can't you implement the solution I suggest (xml.max.actors) until you find how to do it? I would be grateful!

@Omertron
Member Author

Comment #20 originally posted by Omertron on 2011-01-30T18:59:38.000Z:

You can close this one. I'll open a new issue.
